AI-powered platform for diagnosing and resolving real-world maintenance issues
Homeowners and renters waste hours researching maintenance issues across Reddit, forums, YouTube, and retail sites. They struggle to diagnose problems, understand risks, find the right parts, and decide whether to DIY or hire a professional.
Build an AI-powered platform that automates the entire research and decision-making process for real-world maintenance and repair issues.
Built multimodal ingestion pipeline supporting voice notes (any language), photos, and video (dissected into frames)
Integrated OpenAI Vision and Whisper for analyzing diagnostic media
Integrated Firecrawl to crawl Reddit, forums, and retail sites for solutions and in-stock products
Created budget/income input system that weighs opportunity cost and time for personalized recommendations
Built email drafting and sending feature for contacting contractors
Implemented end-to-end encryption for all diagnostic media and personal data
Reduced time-to-decision by ~70% through automated diagnostics
Eliminated manual research across multiple sites
Enabled users to make informed fix-now vs. defer decisions based on their financial context
MVP, stretch goals, and future vision
Development phases and milestones
Photo-based issue detection and recommendations
Upload photos and detect maintenance issues using AI
OpenAI-powered analysis with actionable recommendations
Describe issues verbally for AI analysis
Location mapping and cost intelligence
Mapbox integration for property visualization
Budget-aware recommendations
Automated contractor communication
Mobile app, marketplace, and community
Connect homeowners with vetted contractors
Native mobile app for on-site diagnostics
Community platform for sharing solutions
Predictive maintenance scheduling based on home age and conditions
Connect with IoT devices for real-time monitoring
Common questions about this project, answered in STAR format
How did you approach building an AI-powered diagnostic system?
Key Takeaway: AI works best when you constrain it with domain knowledge rather than letting it hallucinate freely.
Tell me about a time you had to make a difficult technical decision.
Key Takeaway: The best technical decision is often the one that lets you learn faster, not the theoretically optimal one.
How did you handle the DIY vs. professional recommendation feature?
Key Takeaway: When building AI systems with real-world consequences, explicit guardrails are more important than model sophistication.
Quick answers to 'Why did you choose X?' questions
For MVP validation, free tiers suffice. OpenWeatherMap's free tier allows 1,000 calls/day - enough with caching. If the product gains traction, upgrading is trivial; the architecture (caching, error handling) stays the same. Do not pay for scale you do not have.
A more generous free tier (50k vs. 28k map loads), better custom styling, vector tiles that render faster than raster, and better React integration via react-map-gl. The trade-off is less familiarity, but for showing property locations Mapbox is sufficient.
Single database for all data - no sync between systems. For thousands of vectors, pgvector is fast enough. Keeps everything in Postgres - simpler architecture. Trade-off is scaling ceiling at millions of vectors, but that is a good problem to have.
Unit tests for parsing logic with fixture HTML files - test that selectors extract correct data. Integration tests hit a local mock server returning known HTML. E2E tests verify full pipeline from URL to database entry. Edge cases: malformed HTML, missing fields, rate limiting responses. Monitoring in production for scraper health.
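For example, a parsing test against a saved fixture could look like this minimal sketch - assuming Vitest and Cheerio; the selectors, file paths, and parseProduct helper are hypothetical:

```typescript
// scrapers/productParser.test.ts (hypothetical) - fixture-based test for the parsing logic
import { readFileSync } from "node:fs";
import { describe, it, expect } from "vitest";
import * as cheerio from "cheerio";

// Parser under test: extracts a name and price from a product page.
// The selectors are illustrative, not the real ones.
function parseProduct(html: string) {
  const $ = cheerio.load(html);
  return {
    name: $("h1.product-title").text().trim(),
    price: Number($('[data-testid="price"]').text().replace(/[^0-9.]/g, "")),
  };
}

describe("parseProduct", () => {
  it("extracts name and price from a saved fixture page", () => {
    const html = readFileSync("fixtures/product-page.html", "utf8");
    const result = parseProduct(html);
    expect(result.name).not.toBe("");
    expect(result.price).toBeGreaterThan(0);
  });

  it("handles missing fields without throwing", () => {
    const result = parseProduct("<html><body></body></html>");
    expect(result.name).toBe("");
    expect(result.price).toBe(0);
  });
});
```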
Created evaluation dataset with query-answer pairs. Test that relevant documents appear in top-k results. Measure retrieval accuracy and relevance scores. A/B test different chunking strategies. Monitor in production: log queries and which chunks were retrieved for manual review.
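A rough sketch of that evaluation loop - the EvalCase shape and the retrieve() signature are assumptions for illustration, not the actual pipeline:

```typescript
// rag/evalRetrieval.ts (hypothetical) - hit rate @ k over a labeled query/answer set
type EvalCase = { query: string; relevantDocIds: string[] };

// `retrieve` stands in for whatever the RAG pipeline uses to fetch ranked document IDs.
type Retriever = (query: string, k: number) => Promise<string[]>;

export async function hitRateAtK(cases: EvalCase[], retrieve: Retriever, k = 5): Promise<number> {
  let hits = 0;
  for (const c of cases) {
    const retrieved = await retrieve(c.query, k);
    // Count a hit if any known-relevant document shows up in the top-k results.
    if (c.relevantDocIds.some((id) => retrieved.includes(id))) hits++;
  }
  return hits / cases.length;
}

// Usage: run before and after changing chunking or ranking and compare the numbers.
// const score = await hitRateAtK(evalSet, retrieve, 5);
// console.log(`hit rate @5: ${(score * 100).toFixed(1)}%`);
```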
Every decision has costs — here's how I thought through them
Needed deep control over how product information, forum posts, and pricing were extracted. Generic APIs would not capture the specific data needed for accurate recommendations.
Training a custom model would require thousands of labeled images. OpenAI Vision works well enough for the diagnostic use case and allowed faster iteration.
For diagnosing static issues like cracks or damage, individual frames capture what is needed. Motion analysis would be overkill and more expensive.
Users are sharing photos and videos of their homes. Trust is essential. The privacy-first approach builds that trust even if it limits data collection.
A $500 repair means different things to different people. Financial context makes the "fix now vs. defer" recommendation actually useful.
Home improvement advice changes constantly - new products, updated prices, seasonal recommendations. RAG ensures recommendations are always current without costly model retraining.
For thousands of product embeddings, pgvector is fast enough. Keeping everything in Postgres eliminates sync complexity and reduces costs. If I hit millions of vectors, I can migrate - but that is a problem for later.
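The kind of query this enables, as a minimal sketch - the table and column names are illustrative, and it assumes the pg client with a pgvector embedding column:

```typescript
// search/similarProducts.ts (hypothetical) - cosine-distance search with pgvector
import { Pool } from "pg";

const pool = new Pool({ connectionString: process.env.DATABASE_URL });

export async function similarProducts(queryEmbedding: number[], limit = 5) {
  // pgvector accepts the vector as a string literal like '[0.1,0.2,...]'.
  const vector = `[${queryEmbedding.join(",")}]`;
  const { rows } = await pool.query(
    `SELECT id, name, 1 - (embedding <=> $1) AS similarity
       FROM products
      ORDER BY embedding <=> $1
      LIMIT $2`,
    [vector, limit]
  );
  return rows;
}
```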
Users often hold camera still or pan slowly - 30 frames may contain only 8-10 unique views. Deduplication is essential for cost-effective video analysis at scale.
A 30-second video takes ~45 seconds to fully analyze. Blocking the UI that long is a terrible experience. Async processing with notification means users can continue their day and get results when ready.
For a 0-to-1 product, I need to validate the idea fast, not manage infrastructure. Supabase lets me focus on the product. If I hit scale limits, that means the product is working and I can afford to migrate.
For a web-only product with no mobile app, the simplicity of a unified Next.js app outweighs the flexibility of separate frontend/backend. Ship faster, refactor later if needed.
For home improvement product recommendations, the small model captures enough semantic meaning. The cost savings compound - every scraped page and every user query needs embeddings.
At scale, running Puppeteer for every page would be extremely expensive. 70% of pages are static HTML (forums, articles) and can use the fast path.
Scraping 10,000+ pages reliably requires proper job queue infrastructure. A cron job would fail silently, retry incorrectly, or overwhelm target sites. The queue adds reliability at the cost of complexity.
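A sketch of what that queue setup could look like, assuming BullMQ on Redis - queue names, retry counts, and rate limits are illustrative:

```typescript
// scraping/queue.ts (hypothetical) - scraping jobs with retries and rate limiting
import { Queue, Worker } from "bullmq";

const connection = { host: "localhost", port: 6379 };

export const scrapeQueue = new Queue("scrape", { connection });

// Enqueue a page with exponential backoff instead of silent, unbounded retries.
export function enqueuePage(url: string) {
  return scrapeQueue.add(
    "page",
    { url },
    { attempts: 3, backoff: { type: "exponential", delay: 5_000 }, removeOnComplete: true }
  );
}

// Bounded concurrency plus a rate limit so target sites are not overwhelmed.
new Worker(
  "scrape",
  async (job) => {
    const { url } = job.data as { url: string };
    // The real scraping entry point would be called here.
    console.log(`scraping ${url}`);
  },
  { connection, concurrency: 5, limiter: { max: 10, duration: 1_000 } }
);
```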
The hardest problems I solved on this project
Analyzed cost breakdown. Each video was being split into frames and every frame sent to the API. A 30-second video at 30fps = 900 API calls.
Implemented perceptual hashing (pHash) to detect similar frames. Extract frames at lower rate (2fps instead of 30fps). Compare each frame hash to previous - only send to API if significantly different. Result: 60-70% reduction in API calls per video while maintaining analysis quality.
Lesson: Before sending data to expensive APIs, ask what can be filtered locally. Deduplication and sampling can dramatically reduce costs without sacrificing quality.
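A simplified sketch of the dedup step - it uses an 8x8 average-hash variant built on sharp rather than a full pHash library, and the threshold is illustrative:

```typescript
// video/frameDedup.ts (hypothetical) - drop near-duplicate frames before calling the Vision API
import sharp from "sharp";

// 8x8 greyscale average hash: coarse, but enough to catch "camera held still" duplicates.
async function frameHash(frame: Buffer): Promise<bigint> {
  const pixels = await sharp(frame).resize(8, 8, { fit: "fill" }).greyscale().raw().toBuffer();
  const avg = pixels.reduce((sum, p) => sum + p, 0) / pixels.length;
  let hash = 0n;
  for (const p of pixels) hash = (hash << 1n) | (p > avg ? 1n : 0n);
  return hash;
}

function hammingDistance(a: bigint, b: bigint): number {
  let x = a ^ b;
  let count = 0;
  while (x) {
    count += Number(x & 1n);
    x >>= 1n;
  }
  return count;
}

// Keep a frame only if it differs enough from the last frame we kept.
export async function dedupFrames(frames: Buffer[], threshold = 10): Promise<Buffer[]> {
  const kept: Buffer[] = [];
  let lastHash: bigint | null = null;
  for (const frame of frames) {
    const hash = await frameHash(frame);
    if (lastHash === null || hammingDistance(hash, lastHash) > threshold) {
      kept.push(frame);
      lastHash = hash;
    }
  }
  return kept;
}
```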
Evaluated refresh strategies: full re-scrape (expensive), incremental updates (complex), or smarter caching.
Implemented tiered refresh strategy. High-traffic products refresh daily. Medium-traffic weekly. Low-traffic monthly. Price-sensitive data (deals, sales) gets priority refresh. Added staleness indicator in UI so users know data freshness. Background job queue handles refreshes without blocking user requests.
Lesson: Not all data needs the same freshness. Tiered caching based on access patterns and business importance saves resources while keeping important data current.
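A sketch of how the tiers could be expressed - the tier thresholds and intervals are illustrative, not the production values:

```typescript
// refresh/tiers.ts (hypothetical) - map access patterns to refresh intervals
type RefreshTier = "high" | "medium" | "low";

const REFRESH_INTERVAL_MS: Record<RefreshTier, number> = {
  high: 24 * 60 * 60 * 1000,       // daily for high-traffic products
  medium: 7 * 24 * 60 * 60 * 1000, // weekly
  low: 30 * 24 * 60 * 60 * 1000,   // monthly
};

type Product = { id: string; views30d: number; hasActiveDeal: boolean; lastScrapedAt: Date };

function tierFor(product: Product): RefreshTier {
  // Price-sensitive data (deals) and popular products get the fastest refresh.
  if (product.hasActiveDeal || product.views30d > 1_000) return "high";
  if (product.views30d > 100) return "medium";
  return "low";
}

export function isStale(product: Product, now = new Date()): boolean {
  const age = now.getTime() - product.lastScrapedAt.getTime();
  return age > REFRESH_INTERVAL_MS[tierFor(product)];
}
```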
Analyzed retrieval results. Found that semantic similarity was matching on surface-level words but missing context. Also chunks were too large, mixing multiple topics.
Improved chunking strategy: smaller chunks (500 tokens) with overlap. Added metadata to chunks (source, category, date). Implemented hybrid search: semantic similarity + keyword matching. Added re-ranking step using a smaller model to filter irrelevant results before sending to LLM.
Lesson: RAG quality depends heavily on chunking and retrieval strategy. Smaller chunks with metadata, hybrid search, and re-ranking dramatically improve relevance. Garbage in, garbage out still applies to AI.
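One way to merge the semantic and keyword result lists is reciprocal rank fusion; this sketch shows the idea, not necessarily the exact scoring used here:

```typescript
// rag/hybridSearch.ts (hypothetical) - reciprocal rank fusion over two ranked result lists
type Ranked = { id: string; score: number };

// Each input list is ranked best-first; RRF rewards documents that rank well in either list.
export function reciprocalRankFusion(lists: string[][], k = 60): Ranked[] {
  const scores = new Map<string, number>();
  for (const list of lists) {
    list.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Usage: the fused top results go to the re-ranker, then to the LLM.
// const fused = reciprocalRankFusion([semanticIds, keywordIds]).slice(0, 20);
```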
The map component was imported at page level, causing the entire Mapbox library to load before the page was interactive.
Dynamic import of map component with ssr: false (Mapbox requires window). Added loading skeleton while map loads. Lazy loaded map only when scrolled into view using Intersection Observer. Implemented tile caching and reduced initial zoom level to load fewer tiles. Result: page becomes interactive 2 seconds faster.
Lesson: Heavy third-party libraries like maps should be dynamically imported and lazy loaded. Do not block initial render with components that need the full library. Load on demand when user needs the feature.
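The dynamic-import pattern looks roughly like this - assuming the Next.js app router and the react-intersection-observer hook; the component and prop names are illustrative:

```tsx
// components/PropertyMapSection.tsx (hypothetical) - load Mapbox only when the user scrolls to it
"use client";

import dynamic from "next/dynamic";
import { useInView } from "react-intersection-observer";

// ssr: false because Mapbox needs `window`; show a skeleton while the chunk loads.
const PropertyMap = dynamic(() => import("./PropertyMap"), {
  ssr: false,
  loading: () => <div style={{ height: 300 }} aria-busy="true" />,
});

export function PropertyMapSection(props: { lat: number; lng: number }) {
  // triggerOnce: once the map has loaded, keep it mounted.
  const { ref, inView } = useInView({ triggerOnce: true, rootMargin: "200px" });

  return (
    <section ref={ref}>
      {inView ? <PropertyMap {...props} /> : <div style={{ height: 300 }} />}
    </section>
  );
}
```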
Each user request was calling the OpenWeatherMap API directly. Popular locations caused repeated identical calls.
Implemented server-side caching layer. Cache weather data by location with 30-minute TTL (weather does not change that fast). Used Redis for cache storage. First request hits API, subsequent requests for same location serve from cache. Added stale-while-revalidate - serve stale data immediately, refresh in background. Reduced API calls by 90%.
Lesson: External APIs should almost always have a caching layer. Identify what data can be shared across users and cache aggressively. Weather is a perfect example - same for everyone in a location.
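A sketch of that caching layer, assuming ioredis - the key format, TTLs, and fetch function are illustrative:

```typescript
// weather/cache.ts (hypothetical) - 30-minute weather cache with stale-while-revalidate
import Redis from "ioredis";

const redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");
const FRESH_TTL_S = 30 * 60;      // serve without refreshing for 30 minutes
const STALE_TTL_S = 24 * 60 * 60; // keep a stale copy around for a day

async function fetchWeatherFromApi(location: string): Promise<unknown> {
  // The real OpenWeatherMap call goes here; omitted in this sketch.
  return { location, fetchedAt: Date.now() };
}

export async function getWeather(location: string) {
  const key = `weather:${location.toLowerCase()}`;
  const cached = await redis.get(key);

  if (cached) {
    const { data, storedAt } = JSON.parse(cached);
    if ((Date.now() - storedAt) / 1000 > FRESH_TTL_S) {
      // Stale: return it immediately, refresh in the background.
      void fetchWeatherFromApi(location)
        .then((fresh) =>
          redis.set(key, JSON.stringify({ data: fresh, storedAt: Date.now() }), "EX", STALE_TTL_S)
        )
        .catch(() => { /* keep serving stale data if the refresh fails */ });
    }
    return data;
  }

  const fresh = await fetchWeatherFromApi(location);
  await redis.set(key, JSON.stringify({ data: fresh, storedAt: Date.now() }), "EX", STALE_TTL_S);
  return fresh;
}
```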
Recommendation results are personalized (based on user input) but also need to be shareable/bookmarkable. Pure CSR would hurt SEO and shareability. Pure SSR would mean no caching.
Hybrid approach: SSR the page shell and layout, CSR the personalized recommendations. URL contains encoded query params so results are shareable. Server renders a loading skeleton that hydrates with actual recommendations. This way shared links work, SEO gets the page structure, but recommendations are always fresh and personalized.
Lesson: Personalized content does not mean you cannot use SSR. Render the static parts server-side, hydrate personalized parts client-side. URL state makes personalized pages shareable.
Each recommendation needed data from weather API, product API, and pricing API. Sequential calls were taking 3+ seconds.
Parallelized API calls using Promise.all. For APIs that did not depend on each other, fire all requests simultaneously. Added timeout handling - if one API is slow, return partial results rather than waiting forever. Implemented background enrichment - show basic recommendation immediately, enhance with additional data as it arrives (streaming).
Lesson: Never make sequential API calls when you can parallelize. Use Promise.all for independent requests. Design for partial results when some data sources are slow or unavailable.
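A sketch of the parallel fetch with per-call timeouts and partial results - the fetch functions and timeout values are placeholders:

```typescript
// recommendations/gather.ts (hypothetical) - parallel, timeout-bounded calls with partial results
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return Promise.race([
    promise,
    new Promise<T>((_, reject) =>
      setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms)
    ),
  ]);
}

// Placeholders for the real API clients (weather, products, pricing).
declare function fetchWeather(location: string): Promise<unknown>;
declare function fetchProducts(ids: string[]): Promise<unknown>;
declare function fetchPricing(ids: string[]): Promise<unknown>;

export async function gatherRecommendationData(location: string, productIds: string[]) {
  // Independent calls run concurrently; allSettled lets a slow or failing source yield partial results.
  const [weather, products, pricing] = await Promise.allSettled([
    withTimeout(fetchWeather(location), 2_000),
    withTimeout(fetchProducts(productIds), 2_000),
    withTimeout(fetchPricing(productIds), 2_000),
  ]);

  return {
    weather: weather.status === "fulfilled" ? weather.value : null,
    products: products.status === "fulfilled" ? products.value : null,
    pricing: pricing.status === "fulfilled" ? pricing.value : null,
  };
}
```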
Key sections I'd walk through in a code review
src/lib/ingestion/multimodal.ts
This module handles voice, photo, and video input. Voice goes through Whisper for transcription, photos go directly to Vision, and videos get dissected into frames first. Each path normalizes the output into a common DiagnosticInput type that the analysis engine consumes.
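The common shape the three paths normalize into could look roughly like this - the field names are illustrative, not the actual type definition:

```typescript
// lib/ingestion/types.ts (hypothetical) - common output of the voice, photo, and video paths
type DiagnosticMedia =
  | { kind: "transcript"; text: string; language?: string }      // Whisper output for voice notes
  | { kind: "image"; description: string; frameIndex?: number }; // Vision output for photos and video frames

export interface DiagnosticInput {
  source: "voice" | "photo" | "video";
  media: DiagnosticMedia[]; // one entry per photo, transcript, or kept video frame
  capturedAt: Date;
  userId: string;
}
```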
src/lib/decision/riskEngine.ts
Takes the diagnostic result, user's budget/income, and urgency to calculate a recommendation score. Uses a weighted formula that considers: safety risk (highest weight), cost of delay, DIY feasibility, and financial impact. Returns a structured recommendation with confidence intervals.
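An illustrative sketch of that weighted scoring - the weights, field names, and normalization are stand-ins, not the production formula:

```typescript
// lib/decision/scoreSketch.ts (hypothetical) - weighted "fix now vs. defer" score
interface DecisionFactors {
  safetyRisk: number;      // 0-1, from the diagnostic result
  costOfDelay: number;     // 0-1, how quickly the problem worsens if deferred
  diyFeasibility: number;  // 0-1, 1 = easy DIY fix
  financialImpact: number; // 0-1, normalized here so 1 = affordable now given the user's budget
}

// Safety risk carries the highest weight; the values here are illustrative.
const WEIGHTS: Record<keyof DecisionFactors, number> = {
  safetyRisk: 0.4,
  costOfDelay: 0.25,
  diyFeasibility: 0.15,
  financialImpact: 0.2,
};

export function fixNowScore(f: DecisionFactors): number {
  // Higher score = stronger "fix now" recommendation.
  return (Object.keys(WEIGHTS) as (keyof DecisionFactors)[]).reduce(
    (sum, key) => sum + WEIGHTS[key] * f[key],
    0
  );
}
```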
Want to discuss this project?