What it is
A generalized platform that uses agentic LLM workflows to extract structured product data from unstructured sources, generate embeddings, and sync the results to consumer-facing sites. Currently deployed across two verticals: artisanal foods (tastemongers.com) and chef's knives (edgemongers.com).
Why I built it
Product reviews live in scattered, unstructured text: blog posts, forums, retailer descriptions. Most ratings sites rely either on hand-curated data or on shallow averages of star ratings. I wanted to see whether agentic LLM extraction could produce rigorous, domain-aware ratings, and then whether the same pipeline could generalize across verticals without rewriting the model layer.
How it works
The pipeline runs in stages:
- Source ingestion: product references and unstructured text (descriptions, reviews) are collected per niche.
- Agentic extraction: a LangChain workflow drives an OpenAI model to extract domain-specific fields (e.g., "edge sharpness" for knives, "flavor complexity" for cheese) using rubrics defined per niche. The agent reasons about which evidence supports which rating.
- Embedding + persistence: extracted fields and source text are embedded and persisted to a Neon Postgres database alongside the structured rating record.
- Sync to consumer sites: the public Next.js sites read directly from the same Postgres instance via a serverless connection.
The platform's generalizability comes from treating the rating rubric as data, not code. Adding a new vertical means inserting a niche row and its rubric definitions; the extraction prompts are assembled from the rubric rather than written per vertical.
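To make "rubric as data" concrete, here is a minimal sketch of how a prompt could be assembled from rubric rows. It is an illustration, not the code in the repository: the `RubricField` and `Niche` shapes, the `build_extraction_prompt` helper, the 1-5 scale, and the field descriptions are all assumptions. Only the two field names come from the examples above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RubricField:
    """One scored dimension of a rubric (hypothetical shape)."""
    name: str
    description: str
    scale_min: int = 1  # assumed scale, not taken from the real rubrics
    scale_max: int = 5


@dataclass(frozen=True)
class Niche:
    """A vertical plus its rubric; stands in for the niche and rubric rows."""
    slug: str
    product_noun: str
    rubric: tuple[RubricField, ...]


def build_extraction_prompt(niche: Niche, source_text: str) -> str:
    """Assemble an extraction prompt from rubric data alone, with no per-niche code."""
    lines = [
        f"You are rating a {niche.product_noun} from the source text below.",
        "For each field, give a score and quote the evidence that supports it.",
        "Use null when the text gives no evidence for a field.",
        "",
        "Fields:",
    ]
    for field in niche.rubric:
        lines.append(
            f"- {field.name} ({field.scale_min}-{field.scale_max}): {field.description}"
        )
    lines += ["", "Source text:", source_text]
    return "\n".join(lines)


# Illustrative rubrics; the descriptions are placeholders written for this sketch.
KNIVES = Niche(
    slug="knives",
    product_noun="chef's knife",
    rubric=(RubricField("edge sharpness", "How keen the cutting edge is."),),
)
CHEESE = Niche(
    slug="cheese",
    product_noun="cheese",
    rubric=(RubricField("flavor complexity", "How layered the flavor is."),),
)

if __name__ == "__main__":
    print(build_extraction_prompt(KNIVES, "Arrived sharp enough to shave with."))
```

In this shape, adding a vertical is one more `Niche` value (or, in the database, one more set of rows); `build_extraction_prompt` itself never changes.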
What I learned
- Structured extraction with LLMs works best when the model has to justify its scores. Asking for a rating without the supporting evidence produces less stable results across runs.
- Generalization across domains doesn't come for free, even with an LLM. Knives and cheese share zero vocabulary. The reusable abstraction turned out to be the rubric structure, not the extraction prompts themselves.
- Persisting source provenance matters. Every rating in the database links back to the text that produced it, which is non-negotiable for auditability when LLMs are in the loop.
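The first and third lessons can be combined into one small check: accept a score only if it arrives with evidence, and record which source that evidence came from. The sketch below is an assumed shape rather than the repository's schema. `FieldRating`, `parse_ratings`, the JSON layout of the model response, and the verbatim-quote rule are all hypothetical.

```python
from __future__ import annotations

import json
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRating:
    """One score plus the evidence and source it came from (hypothetical shape)."""
    field: str
    score: int
    evidence: str   # quote the model cited in support of the score
    source_id: str  # provenance: identifier of the source text


def parse_ratings(raw_json: str, source_id: str, source_text: str) -> list[FieldRating]:
    """Keep only ratings whose evidence appears verbatim in the source text.

    Assumes the model returns a JSON list of objects with "field", "score",
    and "evidence" keys; that format is an assumption made for this sketch.
    """
    kept: list[FieldRating] = []
    for item in json.loads(raw_json):
        evidence = (item.get("evidence") or "").strip()
        if not evidence or evidence not in source_text:
            continue  # no quotable support in the source: do not persist the score
        kept.append(FieldRating(item["field"], int(item["score"]), evidence, source_id))
    return kept


if __name__ == "__main__":
    source = "Arrived sharp enough to shave with. Handle feels cheap."
    response = json.dumps([
        {"field": "edge sharpness", "score": 5,
         "evidence": "Arrived sharp enough to shave with."},
        {"field": "balance", "score": 4, "evidence": "Perfectly balanced."},
    ])
    for rating in parse_ratings(response, "src-1", source):
        print(rating)
```

An exact substring match is a deliberately strict rule for illustration; a real pipeline might normalize whitespace or allow near matches. The point is that every persisted score carries both a quote and a `source_id`, so it can be audited later.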
Links
- Source: github.com/dleon86/axiomatiq_ratings_db
- Live (knives): edgemongers.com/ratings
- Live (cheese, v0 platform): tastemongers.com/ratings