What This Voice Agent Does
This AI-Powered Voice Assistant:
- Listens to users through their microphone
- Converts speech to text using OpenAI's Whisper
- Scrapes live information from your chosen websites
- Generates smart responses using OpenAI's GPT-4 with that real-time data
- Speaks the answer back using OpenAI's text-to-speech
- Runs in a web interface that users can open from any browser
The Flow
User speaks into microphone
↓
Whisper API converts speech to text
↓
System scrapes your chosen websites
↓
GPT-4 API generates response with live data
↓
TTS API converts response to speech
↓
User hears the answer
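The flow above can be sketched as one function that chains the four steps. This is only an illustration in Python: the guide does not show its backend code, so the function name `answer_question` and the four injected step functions are hypothetical placeholders for the Whisper, scraping, GPT-4, and TTS calls.

```python
from typing import Callable, List


def answer_question(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    scrape: Callable[[str], List[str]],
    generate: Callable[[str, str], str],
    speak: Callable[[str], bytes],
) -> bytes:
    """Run one voice turn: recorded audio in, spoken answer out."""
    question = transcribe(audio)         # speech -> text (Whisper in this guide)
    pages = scrape(question)             # live text from the chosen websites
    context = "\n\n".join(pages)         # merge page text into one context string
    reply = generate(question, context)  # answer grounded in the context (GPT-4)
    return speak(reply)                  # text -> speech (TTS)


# Stub steps so the flow can be exercised without an API key.
spoken = answer_question(
    b"fake-audio",
    transcribe=lambda audio: "what is on this weekend",
    scrape=lambda question: ["Street fair on Saturday."],
    generate=lambda question, context: f"Here is what I found: {context}",
    speak=lambda text: text.encode("utf-8"),
)
print(spoken.decode("utf-8"))
```

Passing the steps in as arguments keeps the order of operations visible and lets each step be swapped for a real API call later.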
What You Need Before Starting
Important: You only need ONE API key for this entire project!
Required Accounts
1. OpenAI Account (The Only API Key You Need)
Sign up: https://platform.openai.com/signup
Add payment method: Required even for pay-as-you-go
What you'll use it for:
- Whisper (speech-to-text)
- GPT-4 (AI brain that generates responses)
- TTS (text-to-speech)
2. Vercel Account (Free Hosting)
Go to: https://vercel.com/signup
Free tier available. This hosts your voice agent.
3. GitHub Account (Code Storage)
Go to: https://github.com/signup
Free. This stores your files.
Getting Your API Key
OpenAI API Key Setup
Step 1: Go to https://platform.openai.com/api-keys
Step 2: Click the green "Create new secret key" button
Step 3: Give it a name like "Voice Travel Agent"
Step 4: Copy the key (starts with sk-proj-...)
Step 5: Save it somewhere safe - you can't see it again!
CRITICAL: Make sure you have a payment method added to your OpenAI account, or the API won't work!
To add a payment method, open the Billing page in your OpenAI account settings.
Creating Your Perfect System Prompt
This is THE MOST IMPORTANT part - it's what makes your voice agent unique!
What is a System Prompt?
The system prompt tells GPT-4:
- WHO it is (role/personality)
- HOW to respond (tone, length, style)
- WHAT to focus on (priorities)
- HOW to handle uncertainty (what to do when it doesn't know)
Your Current Brooklyn Guide Prompt
You are a knowledgeable and enthusiastic Brooklyn travel guide.
Your goal is to help visitors discover the best of Brooklyn - from
trendy neighborhoods and delicious dining spots to exciting events
and hidden gems.
IMPORTANT GUIDELINES:
- Be conversational, friendly, and enthusiastic about Brooklyn
- Keep responses concise (2-3 sentences max) since they will be
converted to speech
- Provide specific recommendations when possible
- If you don't have current information, acknowledge it and provide
general guidance
- Focus on the most relevant information from the context provided
CURRENT BROOKLYN INFORMATION:
[This is where the scraped website content gets inserted automatically]
Remember: You're speaking to someone who wants quick, helpful, and
engaging information about Brooklyn. Be their friendly local guide!
The Universal Template for ANY Topic
You are a [ADJECTIVE] [ROLE] for [TOPIC/LOCATION].
IMPORTANT GUIDELINES:
- TONE: Be [describe tone - conversational/professional/energetic/calming/etc.]
- LENGTH: Keep responses to [2-4] sentences since they will be converted to speech
- SPECIFICITY: [Provide specific names and recommendations / Give general overviews /
Focus on unique features]
- UNCERTAINTY: If you don't have current information, [acknowledge it honestly /
provide general insights / suggest alternatives]
- FOCUS: Prioritize [what matters most - price/quality/convenience/experience/etc.]
CURRENT [TOPIC] INFORMATION:
[Scraped content goes here automatically]
Remember: You're helping someone [their goal]. Be their [relationship to them]!
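One way to turn the template into a reusable prompt builder is sketched below. The function `build_system_prompt`, its field names, and the placeholder text used when no scraped content is available are all assumptions made for illustration; the guide's own code may insert the scraped text differently.

```python
TEMPLATE = """You are a {adjective} {role} for {topic}.
IMPORTANT GUIDELINES:
- TONE: Be {tone}
- LENGTH: Keep responses to {length} sentences since they will be converted to speech
- SPECIFICITY: {specificity}
- UNCERTAINTY: If you don't have current information, {uncertainty}
- FOCUS: Prioritize {focus}
CURRENT {topic_upper} INFORMATION:
{scraped}
Remember: You're helping someone {goal}. Be their {relationship}!"""


def build_system_prompt(scraped: str, **fields: str) -> str:
    """Fill the template; raises KeyError if a placeholder is missing."""
    return TEMPLATE.format(
        # Hypothetical placeholder used when scraping returned nothing.
        scraped=scraped.strip() or "(no live information available)",
        topic_upper=fields["topic"].upper(),
        **fields,
    )


prompt = build_system_prompt(
    "Example scraped text about weekend events.",
    adjective="knowledgeable",
    role="travel guide",
    topic="Brooklyn",
    tone="conversational and friendly",
    length="2-3",
    specificity="Provide specific names and recommendations",
    uncertainty="acknowledge it honestly",
    focus="experience",
    goal="plan a day out",
    relationship="friendly local guide",
)
print(prompt)
```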
Real Examples for Different Industries
Example 1: Miami Real Estate Agent
You are a professional and trustworthy real estate advisor for Miami, Florida.
IMPORTANT GUIDELINES:
- TONE: Be professional yet approachable and informative
- LENGTH: Keep responses to 3-4 sentences for voice clarity
- SPECIFICITY: Always mention specific neighborhoods, price ranges, and property features
- UNCERTAINTY: If you lack current market data, provide general Miami market insights
- FOCUS: Prioritize helping buyers and sellers make informed financial decisions
CURRENT MIAMI REAL ESTATE INFORMATION:
[Scraped content]
Remember: You're guiding someone through one of the biggest financial decisions
of their life. Be accurate, helpful, and trustworthy!
Example 2: Los Angeles Fitness & Wellness Guide
You are an energetic and motivating fitness coach and wellness expert for Los Angeles.
IMPORTANT GUIDELINES:
- TONE: Be upbeat, positive, and encouraging with high energy
- LENGTH: Keep responses to 2-3 sentences since they'll be spoken aloud
- SPECIFICITY: Recommend specific gyms, studios, trails, and wellness spots by name
- UNCERTAINTY: If you don't have current class schedules, suggest general fitness options
- FOCUS: Prioritize variety - yoga studios, hiking trails, boutique fitness, outdoor activities
CURRENT LA FITNESS & WELLNESS INFORMATION:
[Scraped content]
Remember: You're inspiring someone to prioritize their health.
Be their motivational fitness buddy!
Example 3: Austin Food Truck Guide
You are a passionate and enthusiastic food expert specializing in Austin's food truck scene.
IMPORTANT GUIDELINES:
- TONE: Be excited, descriptive, and fun when talking about food
- LENGTH: Keep responses to 3-4 sentences for speech conversion
- SPECIFICITY: Always mention specific truck names, locations, and signature dishes
- UNCERTAINTY: If you don't have current menus, focus on the truck's style and specialties
- FOCUS: Prioritize unique flavors, must-try dishes, and trucks that capture Austin's culture
CURRENT AUSTIN FOOD TRUCK INFORMATION:
[Scraped content]
Remember: You're helping someone discover their next amazing meal on wheels.
Make them hungry!
How to Test Your System Prompt
After you create your prompt, test it with these four types of questions:
- Specific Question: "What's the best pizza place in [location]?"
  Tests whether it gives specific recommendations
- Broad Question: "What should I do today?"
  Tests whether responses are concise enough for speech
- Unknown Information: "What are the hours for [obscure place]?"
  Tests how it handles uncertainty
- Comparison Question: "What's better, X or Y?"
  Tests whether it provides helpful guidance
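The "concise enough for speech" check can be partly automated. The sketch below counts sentences with a simple punctuation split, which is a rough heuristic (abbreviations such as "St." will be miscounted); the helper names and the default limit of 3 sentences are assumptions taken from the 2-3 sentence guideline.

```python
import re


def count_sentences(text: str) -> int:
    """Rough sentence count: split after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return len([p for p in parts if p])


def is_speech_friendly(reply: str, max_sentences: int = 3) -> bool:
    """True when the reply is non-empty and within the sentence limit."""
    n = count_sentences(reply)
    return 0 < n <= max_sentences


print(is_speech_friendly("Try the corner slice shop. It is open late!"))
print(is_speech_friendly("One. Two. Three. Four."))
```

Running each test question through a check like this makes it easier to spot prompts that let the model ramble.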
Complete Troubleshooting Guide
Problem 1: Microphone Doesn't Work
Symptom: Button doesn't respond or browser doesn't ask for permission
Fix:
- Make sure you're using HTTPS (Vercel provides this automatically); browsers only allow microphone access on secure pages
- Check your browser settings:
  - Chrome: Visit chrome://settings/content/microphone and allow access
  - Safari: Go to Safari → Settings → Websites → Microphone
  - Firefox: Visit about:preferences#privacy, then open Permissions → Microphone → Settings
- Try a different browser
- Check whether another app is using your microphone
- Restart your browser
Problem 2: "401 Unauthorized" Error
Symptom: Nothing happens after speaking, or you see error messages
Fix:
- Check that your API key is entered correctly in Vercel:
  - Go to Vercel → Your Project → Settings → Environment Variables
  - Verify that OPENAI_API_KEY is correct
- Make sure your OpenAI account has a payment method
- Regenerate your key if needed:
  - Create a new key on the OpenAI platform
  - Update it in Vercel
  - Redeploy
Problem 3: "429 Too Many Requests"
Symptom: Works a few times, then stops working
Fix:
- You're hitting rate limits - wait 60 seconds and try again
- Check your usage tier at https://platform.openai.com/account/limits
- If testing a lot, space out your requests by 10-15 seconds
- Consider upgrading your OpenAI account tier for higher limits
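Instead of waiting by hand, the backend can retry automatically with growing pauses. The following is a minimal sketch of exponential backoff, not the guide's actual code: `RateLimited` is a hypothetical stand-in for whatever error your API client raises on a 429, and the retry count and 2-second base delay are illustrative values.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical stand-in for the error your client raises on HTTP 429."""


def with_backoff(
    call: Callable[[], T],
    retries: int = 4,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry `call` on RateLimited, doubling the wait after each failure."""
    for attempt in range(retries):
        try:
            return call()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...
    raise ValueError("retries must be at least 1")
```

The `sleep` parameter exists only so the waits can be replaced in tests; in normal use the default `time.sleep` applies.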
Problem 4: No Audio Plays Back
Symptom: You see the text response but don't hear anything
Fix:
- Check your device volume
- Look for a "Click to play" button (some browsers block autoplay)
- Try a different browser (Safari is especially strict about autoplaying audio)
- Check browser console for errors (F12 key → Console tab)
- Make sure your OpenAI API key is working
Problem 5: Empty or Generic Responses
Symptom: GPT-4 gives generic answers or says it doesn't know anything
Fix:
- Check if your websites are being scraped correctly
- Your websites might be blocking scrapers - try different URLs
- Reduce the number of URLs you're scraping (max 3 at a time)
- Make sure the websites you're scraping have actual text content (not just images)
- Add fallback text in your code for when scraping fails
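The last fix, fallback text for failed scrapes, might look like the sketch below. The function `gather_context`, the injected `fetch_text` callable, and the wording of `FALLBACK` are assumptions for illustration; plug in whatever function your project uses to download and clean a page.

```python
from typing import Callable, Iterable

# Hypothetical fallback wording; adjust it to your topic.
FALLBACK = "No live information could be loaded. Answer from general knowledge and say so."


def gather_context(
    urls: Iterable[str],
    fetch_text: Callable[[str], str],
    fallback: str = FALLBACK,
) -> str:
    """Join text from every URL that loads; use the fallback if none do."""
    chunks = []
    for url in urls:
        try:
            text = fetch_text(url).strip()
        except Exception:
            continue  # blocked, timed out, or otherwise failed: skip this URL
        if text:
            chunks.append(text)
    return "\n\n".join(chunks) if chunks else fallback
```

With this in place, one blocked website no longer empties the whole context, and the model is told explicitly when no live data is available.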
Problem 6: "Function Timeout" Error
Symptom: Request takes too long and fails
Fix:
- Reduce the number of websites you're scraping (stick to 2-3)
- Choose faster-loading websites
- Keep in mind that Vercel's free plan stops a function after 10 seconds
- Upgrade to Vercel Pro for a 60-second timeout
- Optimize your scraping to only grab essential text
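"Only grab essential text" can be approximated with Python's standard-library HTML parser, as sketched below. The class `TextOnly`, the set of skipped tags, and the 2000-character cap are illustrative assumptions, not values from the guide.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect visible text, skipping script, style, and navigation blocks."""

    SKIP = {"script", "style", "noscript", "nav", "footer"}

    def __init__(self) -> None:
        super().__init__()
        self.parts = []
        self._skip_depth = 0  # > 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def essential_text(html: str, max_chars: int = 2000) -> str:
    """Return the page's visible text, cut to at most max_chars characters."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)[:max_chars]
```

Capping the text keeps the prompt small, which shortens both the scraping step and the GPT-4 call.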
Problem 7: Website Scraping Fails
Symptom: Agent works but gives generic answers without recent info
Fix:
- Check whether the websites allow scraping:
  - Visit https://yourwebsite.com/robots.txt
  - Look for "Disallow" rules
- Some websites block automated access:
  - Try alternative websites on the same topic
  - Use official APIs if available
- Test individual URLs:
  - Remove all but one URL
  - See if that one works
  - Add URLs back one at a time
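The robots.txt check can also be done in code with Python's standard `urllib.robotparser` module. In this sketch the helper name `allowed_by_robots` and the sample rules are made up for illustration; in practice you would pass in the text downloaded from the site's own robots.txt.

```python
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Check a URL against the text of a robots.txt file."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())  # parse rules without a network call
    return parser.can_fetch(agent, url)


# Sample rules for illustration only.
rules = """User-agent: *
Disallow: /private/
"""
print(allowed_by_robots(rules, "https://example.com/events"))
print(allowed_by_robots(rules, "https://example.com/private/list"))
```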
Problem 8: Deployment Fails
Symptom: Vercel shows "Build Failed" error
Fix:
- Check the error logs in Vercel:
  - Go to your project → Deployments
  - Click the failed deployment
  - Read the error message
- Common issues:
  - Missing files: make sure all three files are uploaded
  - Wrong file names: check the spelling exactly
  - Node version: Vercel uses Node 18+ by default
- Redeploy:
  - Make a small change in GitHub (add a space, etc.)
  - Push the change
  - Vercel will try again
Problem 9: Environment Variables Not Loading
Symptom: Key doesn't work even though it's correct
Fix:
- Verify the variable name is EXACT: OPENAI_API_KEY, not openai_api_key or OPENAI_KEY
- Check that all environments are selected:
  - Production ✓
  - Preview ✓
  - Development ✓
- After adding or changing variables, you MUST redeploy:
  - Go to Deployments
  - Click the "..." menu on the latest deployment
  - Click "Redeploy"
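A small startup check can catch the naming mistakes listed above before any API call is made. This is a sketch under assumptions: the function `diagnose_key` and its messages are hypothetical, and it only inspects the variable's name and whitespace, not whether OpenAI accepts the key.

```python
import os
from typing import Mapping, Optional

EXPECTED = "OPENAI_API_KEY"


def diagnose_key(env: Mapping[str, str]) -> Optional[str]:
    """Return a description of the problem, or None if the key looks usable."""
    value = env.get(EXPECTED)
    if value is None:
        # Look for near-miss names such as openai_api_key or OPENAI_KEY.
        lookalikes = [
            name for name in env
            if name != EXPECTED and "OPENAI" in name.upper() and "KEY" in name.upper()
        ]
        if lookalikes:
            return f"{EXPECTED} is not set, but found: {', '.join(sorted(lookalikes))}"
        return f"{EXPECTED} is not set"
    if value != value.strip():
        return f"{EXPECTED} has leading or trailing whitespace"
    if not value:
        return f"{EXPECTED} is empty"
    return None


# Check the real environment; prints None when everything looks fine.
print(diagnose_key(os.environ))
```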
Problem 10: Audio Quality Issues
Symptom: Voice sounds robotic or unclear
Fix:
- Change the voice in your TTS settings:
  - Options: alloy, echo, fable, onyx, nova, shimmer
  - Try nova for friendly, onyx for professional
- Adjust speaking speed:
  - Default is 1.0
  - Try 0.9 for clearer speech
  - Try 1.1 for more energy
- Keep GPT-4's responses short (2-3 sentences):
  - Long responses are harder to listen to
  - Edit your system prompt to enforce brevity
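The voice and speed settings could be gathered in one helper that validates them before the TTS request is sent. This sketch makes several assumptions: the helper name `tts_request` is hypothetical, the model name `tts-1` is not stated in this guide, and the 0.75-1.25 speed band is an arbitrary "still sounds natural" range rather than an API limit.

```python
VOICES = ("alloy", "echo", "fable", "onyx", "nova", "shimmer")


def tts_request(text: str, voice: str = "nova", speed: float = 1.0) -> dict:
    """Build the parameters for a text-to-speech request."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice {voice!r}; choose one of {', '.join(VOICES)}")
    # Keep the speed in a narrow band around the default of 1.0 (assumed range).
    speed = min(1.25, max(0.75, speed))
    # "tts-1" is an assumed model name; use whichever model your project calls.
    return {"model": "tts-1", "voice": voice, "input": text, "speed": speed}


print(tts_request("Welcome to Brooklyn!", voice="onyx", speed=0.9))
```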
Customizing for ANY Topic
Step-by-Step Customization Process
Step 1: Choose Your Topic
- Location-based: [City] restaurants, [City] real estate, [City] attractions
- Interest-based: Fitness, music venues, coffee shops, hiking trails
- Service-based: Home services, event planning, travel tips
Step 2: Find 5-10 Websites with Good Information
What makes a good website:
- Has current, updated information
- Lots of text content (not just images)
- Loads quickly
- Covers your topic well
- Allows scraping (check robots.txt)
Where to find them:
- Official tourism sites
- Local government sites
- Review sites (Yelp, TripAdvisor)
- Event calendars
- Local blogs and magazines
Step 3: Map Keywords to URLs
Think about what people will ask, then decide which websites have those answers.
Example for Food Guide:
- User says "restaurant" or "dining" → Use Yelp and local food blogs
- User says "events" or "festivals" → Use event calendar sites
- User says "cheap eats" → Use budget dining guides
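The keyword mapping can be expressed as a plain dictionary lookup, sketched below. The example.com URLs, the `select_urls` helper, and the default list are placeholders invented for illustration; the cap of 3 URLs follows the troubleshooting advice to scrape at most 3 sites at a time.

```python
from typing import Dict, List, Sequence

# Placeholder URLs; replace them with the websites you chose in Step 2.
FOOD = ["https://example.com/food-blog", "https://example.com/reviews"]
KEYWORD_URLS: Dict[str, List[str]] = {
    "restaurant": FOOD,
    "dining": FOOD,
    "events": ["https://example.com/event-calendar"],
    "festivals": ["https://example.com/event-calendar"],
    "cheap eats": ["https://example.com/budget-dining"],
}
DEFAULT_URLS = ["https://example.com/overview"]


def select_urls(
    question: str,
    keyword_urls: Dict[str, List[str]] = KEYWORD_URLS,
    default: Sequence[str] = DEFAULT_URLS,
    max_urls: int = 3,
) -> List[str]:
    """Pick the URLs whose keywords appear in the question, without duplicates."""
    q = question.lower()
    picked: List[str] = []
    for keyword, urls in keyword_urls.items():
        if keyword in q:
            for url in urls:
                if url not in picked:
                    picked.append(url)
    # Fall back to the default list when no keyword matched.
    return (picked or list(default))[:max_urls]


print(select_urls("Any good restaurants for dining tonight?"))
```

Because matching is a simple substring test, "restaurant" also matches "restaurants"; more careful matching would need word boundaries or stemming.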
Step 4: Write Your System Prompt
Use the template from earlier and fill in:
- The role (guide, advisor, expert, coach)
- The tone (friendly, professional, energetic)
- Response length (2-3 sentences usually best)
- What to focus on (price, experience, quality, convenience)
- How to handle unknowns
Step 5: Choose Your Voice
Voice Personalities:
| Voice | Personality | Best For |
| nova | Friendly and energetic | Travel, lifestyle |
| alloy | Neutral and balanced | Good for anything |
| echo | Clear and direct | Instructions |
| fable | Warm and expressive | Stories |
| onyx | Deep and authoritative | Professional topics |
| shimmer | Soft and calming | Wellness |
Step 6: Update Your Frontend Text
Change these in your HTML:
- Page title
- Header text ("Brooklyn Voice Travel Guide" → "Your Custom Title")
- Description
- Footer with your social media
Step 7: Test Everything
Run through the testing checklist above!
Quick Customization Worksheet
My Voice Agent Topic:
_______________________________________
My Target Audience:
_______________________________________
5-10 Websites I'll Scrape:
- _______________________________________
- _______________________________________
- _______________________________________
- _______________________________________
- _______________________________________
Keywords People Will Use:
- _____________ → URLs: _____________
- _____________ → URLs: _____________
- _____________ → URLs: _____________
My Agent's Personality (3 words):
_____________, _____________, _____________
Response Length:
_____ sentences
TTS Voice Choice:
_______________________________________
Main Focus/Priority:
_______________________________________
Real Customization Examples
Example 1: San Francisco Coffee Shop Guide
Topic: SF Coffee Shops
Audience: Coffee lovers and remote workers
Websites:
- https://www.sfgate.com/food/article/best-coffee-shops-san-francisco
- https://sf.eater.com/maps/best-coffee-shops-san-francisco
- https://www.timeout.com/san-francisco/restaurants/best-coffee-shops-in-san-francisco
Voice: nova (friendly and energetic)
Example 2: Denver Hiking Guide
Topic: Denver Area Hiking Trails
Audience: Outdoor enthusiasts and tourists
Websites:
- https://www.alltrails.com/us/colorado/denver
- https://www.denver.org/things-to-do/sports-recreation/hiking/
- https://www.uncovercolorado.com/best-hikes-near-denver/
Voice: fable (warm and expressive)