Vibe Coding: Instructions

MongoDB Airbnb AI Arena (Vibe Coding Edition)

Welcome to the MongoDB Airbnb AI Arena! 🚀

Your mission is to create the best backend possible for this Airbnb application. The frontend is already built and waiting for you, and all API endpoints are clearly defined below. Now it’s time to vibe code your backend and bring this application to life!

Your Challenge:

✨ Implement robust API endpoints that power the frontend
🏗️ Build efficient MongoDB queries and data operations
🔍 Create powerful search and filtering capabilities
💡 Add creative features that make your backend stand out
🎯 Focus on performance, scalability, and clean code

What’s Provided:

🎨 Frontend: Fully functional React application
📋 API Spec: Complete endpoint definitions and schemas
🗄️ Database: MongoDB Atlas with sample Airbnb data
🛠️ Tools: All the development tools you need

Your Goal: Make the frontend work flawlessly by implementing these API endpoints. Show off your skills, get creative, and build something amazing! 💪

📋 Requirements

Technical Specifications:

🌐 Server Port: Must run on http://localhost:5000
🌐 Public Link: Run on https://<username>.<customer>.mongogameday.com/backend
📋 Documentation Format: OpenAPI 3.0
🛠️ Language: Build in any programming language you prefer
🎯 Framework: Use any web framework (Express.js, FastAPI, Spring Boot, etc.)
🗄️ Database: MongoDB Atlas (connection details provided)

Implementation Freedom:

✨ Your Choice: Node.js, Python, Java
🏗️ Your Framework: Express, SpringBoot, FastAPI, Flask
💡 Your Style: RESTful APIs following the provided OpenAPI specification
🚀 Your Creativity: Add bonus features and optimizations

📋 API Documentation Preview

🚀 Interactive Documentation

📖 Open in Swagger Editor
1. Go to Swagger Editor.
2. Click on File > Import URL.
3. Paste this URL:
```
https://raw.githubusercontent.com/simonegaiera/mongodb-airbnb-workshop/main/docs/assets/files/swagger.json
```
4. The API documentation will load automatically.
💾 Download swagger.json – Local file

🎯 API Endpoints Overview

🏠 Listings & Analytics (Comprehensive Listing Operations)

🔧 Method	🌐 Endpoint	📝 Description
GET	/api/listingsAndReviews	📄 Get all listings with pagination
POST	/api/listingsAndReviews	✨ Create a new listing
GET	/api/listingsAndReviews/{id}	🎯 Get specific listing by ID
PATCH	/api/listingsAndReviews/{id}	🔧 Update listing field
DELETE	/api/listingsAndReviews/{id}	🗑️ Delete listing
POST	/api/listingsAndReviews/{id}/reviews	💬 Add review to listing
GET	/api/listingsAndReviews/distinct	🔍 Get distinct field values
POST	/api/listingsAndReviews/filter	🎛️ Filter listings with complex criteria
GET	/api/listingsAndReviews/statistics	📊 Get price statistics

🏠 Comprehensive Listing Operations: This unified section includes CRUD operations for listings, reviews management, advanced filtering using MongoDB aggregation pipelines, and statistical analysis with aggregated data operations. All core listing functionality is consolidated here for better organization.

🔎 Atlas Search (Advanced Lexical Search Features)

🔧 Method	🌐 Endpoint	📝 Description
POST	/api/listingsAndReviews/autocomplete	⚡ Search autocomplete
POST	/api/listingsAndReviews/facet	🔎 Faceted search
POST	/api/listingsAndReviews/search	🔍 Full-text search

💡 Atlas Search Integration: These endpoints are designed to leverage MongoDB Atlas Search capabilities for advanced lexical search functionality including full-text search, autocomplete, and faceted search. You’re expected to implement these features using Atlas Search indexes and operators for lexical search operations.

🧠 AI & Vector Search (Advanced AI-Powered Functionality)

🔧 Method	🌐 Endpoint	📝 Description
POST	/api/listingsAndReviews/vectorsearch	🧠 Vector-based semantic search
POST	/api/chat	💬 Send chat message to AI chatbot
POST	/api/chat/clear	🧹 Clear chat history and memory

🧠 Advanced AI-Powered Features: This unified section combines semantic search operations using MongoDB Atlas Vector Search with automated embeddings and knnBeta operator, plus AI chatbot capabilities with RAG (Retrieval-Augmented Generation), AWS Bedrock LLM integration via LangChain, and MongoDB-based conversation memory storage. All AI and vector search functionality is consolidated here for better organization.

📈 Results

🔧 Method	🌐 Endpoint	📝 Description
GET	/api/results	📊 Get section results
GET	/api/results/participants	👥 Get all participants
GET	/api/results/whoami	🙋‍♂️ Get current participant info

🗄️ MongoDB Collection Schema & Performance

Understanding the `listingsAndReviews` Collection

Database Usage:

🏠 Primary Collection: All listing-related endpoints (/api/listingsAndReviews/*) use the listingsAndReviews collection
🔍 Atlas Search: Search endpoints (/autocomplete, /facet, /search) operate on the listingsAndReviews collection
🧠 Vector Search: The /vectorsearch endpoint uses the listingsAndReviews collection with automated embeddings
💬 Chat System: Chat endpoints (/api/chat/*) use the listingsAndReviews collection for RAG operations
📊 Results Data: Only the Results endpoints (/api/results/*) use data from the airbnb_arena database

MCP Integration Available: There is a Model Context Protocol (MCP) available that can help you understand the structure and schema of the two collections. This MCP provides insights into:

Field types and structures
Data patterns and relationships
Sample document formats
Nested object schemas

Recommendation: Leverage the MCP to properly understand the MongoDB collection schema before implementing your API endpoints. This will ensure accurate field mappings, proper data validation, and efficient query construction.

Key Collection Features:

🏠 Rich Listing Data: Property details, amenities, location information
⭐ Embedded Reviews: Review arrays with dates, comments, and reviewer info
📍 Geospatial Data: Location coordinates for mapping and proximity searches
🏷️ Categorical Fields: Property types, room types, amenities for filtering
💰 Pricing Information: Nightly rates and pricing structures

📋 Data Schemas

{
  "_id": "string",
  "listing_url": "string",
  "name": "string",
  "summary": "string",
  "space": "string",
  "description": "string",
  "neighborhood_overview": "string",
  "notes": "string",
  "transit": "string",
  "access": "string",
  "interaction": "string",
  "house_rules": "string",
  "property_type": "string",
  "room_type": "string",
  "bed_type": "string",
  "minimum_nights": "string",
  "maximum_nights": "string",
  "cancellation_policy": "string",
  "last_scraped": "Date",
  "calendar_last_scraped": "Date",
  "accommodates": "Number",
  "bedrooms": "Number",
  "beds": "Number",
  "number_of_reviews": "Number",
  "bathrooms": "Decimal128",
  "amenities": ["string"],
  "price": "Decimal128",
  "weekly_price": "Decimal128",
  "monthly_price": "Decimal128",
  "cleaning_fee": "Decimal128",
  "extra_people": "Decimal128",
  "guests_included": "Decimal128",
  "security_deposit": "Decimal128",
  "images": {
    "thumbnail_url": "string",
    "medium_url": "string",
    "picture_url": "string",
    "xl_picture_url": "string"
  },
  "host": {
    "host_id": "string",
    "host_url": "string",
    "host_name": "string",
    "host_location": "string",
    "host_about": "string",
    "host_thumbnail_url": "string",
    "host_picture_url": "string",
    "host_neighbourhood": "string",
    "host_is_superhost": "Boolean",
    "host_has_profile_pic": "Boolean",
    "host_identity_verified": "Boolean",
    "host_listings_count": "Number",
    "host_total_listings_count": "Number",
    "host_verifications": ["string"],
    "host_response_time": "string",
    "host_response_rate": "Number"
  },
  "address": {
    "street": "string",
    "suburb": "string",
    "government_area": "string",
    "market": "string",
    "country": "string",
    "country_code": "string",
    "location": {
      "type": "string",
      "coordinates": ["Number"],
      "is_location_exact": "Boolean"
    }
  },
  "availability": {
    "availability_30": "Number",
    "availability_60": "Number",
    "availability_90": "Number",
    "availability_365": "Number"
  },
  "review_scores": {
    "review_scores_accuracy": "Number",
    "review_scores_cleanliness": "Number",
    "review_scores_checkin": "Number",
    "review_scores_communication": "Number",
    "review_scores_location": "Number",
    "review_scores_value": "Number",
    "review_scores_rating": "Number"
  },
  "reviews": [
    {
      "_id": "string",
      "date": "Date",
      "listing_id": "string",
      "reviewer_id": "string",
      "reviewer_name": "string",
      "comments": "string"
    }
  ],
  "first_review": "Date",
  "last_review": "Date",
  "updated_at": "Date"
}

Important Data Types:

Decimal128: Used for precise monetary values (price, weekly_price, monthly_price, cleaning_fee, extra_people, guests_included, security_deposit, bathrooms)
Number: Used for integers (accommodates, bedrooms, beds, number_of_reviews, host counts, availability counts, review scores)
Date: Used for timestamps (last_scraped, calendar_last_scraped, first_review, last_review, updated_at, review dates)
Boolean: Used for true/false values (host verification flags, location exactness)
Array: Used for lists (amenities, host_verifications, reviews, coordinates)
Document: Used for nested objects (images, host, address, availability, review_scores)

⚡ Performance Requirements & Index Creation

Performance is Key: Your application must be fast and responsive. Slow queries will hurt user experience and your ranking in the competition!

Index Strategy: Ask your LLM to create comprehensive index definitions that cover:

🔍 Query Performance: Indexes for all filtering and sorting operations
📊 Aggregation Support: Indexes optimized for statistics and analytics
🌐 Atlas Search: Full-text search indexes for autocomplete, facet, and text search
🧠 Vector Search: Vector indexes with automated embeddings for semantic search
📍 Geospatial: 2dsphere indexes for location-based queries

Index Creation Options:

Via MCP (Preferred if available):

Use the MCP to create indexes directly in your MongoDB cluster.
This is the fastest way to get your indexes deployed automatically.

Manual Creation (Fallback):

Request your LLM to generate a comprehensive mongodb_indexes.json file
containing all required index definitions. You'll need to create these
indexes manually in MongoDB Atlas or via your application startup code.

Required Index Types:

✅ Standard Database Indexes: For basic CRUD operations and filtering
✅ Atlas Search Indexes: For text search, autocomplete, and faceted search
✅ Vector Search Indexes: For semantic search with automated embeddings
✅ Compound Indexes: For complex queries with multiple filter criteria
✅ Geospatial Indexes: For location-based searches and proximity queries

Performance Tips:

🎯 Index Coverage: Ensure all your queries are covered by appropriate indexes
📈 Monitor Performance: Use MongoDB Atlas Performance Advisor
🔄 Query Optimization: Design queries that leverage your indexes efficiently
📊 Aggregation Pipelines: Optimize pipelines with proper index support

💡 Pro Tip: Ask your LLM to analyze your API endpoints and automatically generate the optimal index strategy. A well-indexed application can be 100x faster than one without proper indexes!

🚀 Enhanced Implementation Guide

MongoDB Collection Understanding

Use the MCP: Leverage the available Model Context Protocol to understand the listingsAndReviews collection structure
Schema Exploration: Use the MCP insights to implement proper field mappings and data validation
Query Optimization: Build efficient MongoDB queries based on the actual collection schema

Atlas Search Implementation

For the Atlas Search endpoints (/autocomplete, /facet, /search):

Search Indexes: Create appropriate Atlas Search indexes for lexical search
Search Operators: Use Atlas Search operators like text, autocomplete, and facet
Performance: Optimize search queries for fast response times
Relevance: Implement proper scoring and ranking for search results

Field Mappings:

name (for autocomplete): Enable autocomplete search on listing names
amenities (for facet): Support faceted filtering by available amenities
property_type (for facet): Allow filtering by property type categories
beds (for numeric facet): Enable range filtering on number of beds

📋 Index Requirements:

Lexical Search Index: Create a single Atlas Search index that supports all lexical search operations (/search, /facet, /autocomplete) using appropriate operators and analyzers

Atlas Vector Search Implementation

For the Vector Search endpoint (/vectorsearch):

Vector Index: Create Atlas Vector Search index with automated embeddings
Auto Embedding: Leverage MongoDB’s automated embedding feature for seamless vector operations
Embedding Field: Configure automated embeddings on the description field for semantic search capabilities

📋 Index Requirements:

Vector Search Index: Create a separate vector search index specifically for the /vectorsearch endpoint with automated embeddings on the description field

Automated Embeddings: Reference documentation at http://mongodb.com/docs/atlas/atlas-vector-search/automated-embedding/

🎯 Vector Search Parameters:

numCandidates: Set to 100 for optimal performance and accuracy

limit: Return top 10 most relevant results

💡 Important: Ensure your MongoDB Atlas cluster is properly configured for Vector Search, including the creation of a dedicated vector search index with automated embeddings on the description field. Reference the latest documentation for detailed setup instructions.

Chat System Implementation

For the Chat endpoints (/chat, /chat/clear):

Vector Search: Use Atlas Vector Search with automated embeddings to find relevant listings based on user queries
LLM Integration: Connect to AWS Bedrock using LangChain for natural language processing
Memory Management: Use LangChain to store the chat history in MongoDB collections
RAG Architecture: Implement Retrieval-Augmented Generation for accurate, context-aware responses
Session Handling: Maintain conversation continuity across multiple interactions
Embedding Strategy: Use automated embeddings for both document indexing and query processing in vector operations

Development Workflow

Explore: Use the MCP to understand the collection schema
Index Planning: Ask LLM to create comprehensive index definitions for performance
Index Creation: Deploy indexes via MCP or manually from generated JSON file
Plan: Design your API endpoints based on the schema insights
Implement: Build the REST API following the OpenAPI specification
Test: Use the provided test cases in the rest-lab/ folder
Optimize: Fine-tune queries and add performance improvements
Monitor: Check query performance and adjust indexes if needed

Bonus Features (Get Creative!)

🚀 Caching: Implement Redis or in-memory caching for frequently accessed data
📊 Analytics: Add advanced analytics and reporting endpoints
🔒 Rate Limiting: Implement API rate limiting for production readiness
📝 Logging: Add comprehensive logging and monitoring
🎯 Validation: Implement robust input validation and error handling
🔍 Advanced Search: Create innovative lexical search features using Atlas Search capabilities
🧠 Smart Memory: Store chat conversation history in MongoDB for personalized experiences
🤖 Intelligent Responses: Use automated vector embeddings for contextually relevant listing recommendations
⚡ Auto Embedding: Leverage MongoDB’s automated embedding capabilities for vector search operations

🛠️ Testing

Postman Collection

Import the OpenAPI specification directly into Postman:

Open Postman
Click “Import” and provide the collection file
All endpoints will be automatically imported

VS Code Extensions

REST Client: Create .http files for testing

Testing with curl

All endpoints can be tested using curl commands. See examples above or use the “Try it out” feature in Swagger UI to generate curl commands automatically.

🔒 Authentication

Currently, the API does not require authentication.