PubEngine: Scaling Programmatic Local SEO Using AI-Driven Content Architecture

An end-to-end automated local search engine built for the UK pub ecosystem.

The Challenge: The Content Gap Blindspot

In traditional SEO, if a keyword tool reports “0 search volume,” content teams usually ignore the term. But people don’t search in tidy keywords. They search with highly specific, localized intents (e.g., “quiet pub with a fireplace” or “dog-friendly pub but the music is too loud for pets”).

My current focus is bypassing these traditional SEO tools entirely. I am building a programmatic engine designed to read tens of thousands of real customer reviews, understand the underlying human experiences, and automatically extract hyper-specific content gaps that competitors are completely missing.

The Strategy: Math Over Keywords

Instead of guessing what pub-goers want, I am architecting an automated pipeline that lets the raw data speak for itself.

  1. Data Ingestion: I am automating the continuous collection of over 59,000 real-world pub reviews using n8n workflows and third-party scraping APIs.
  2. Semantic Translation: Instead of relying on exact word matches (which are brittle and miss paraphrases), the system converts every review into a high-dimensional embedding vector. This lets the engine measure that “Great crackling fire” and “Loved reading my book away from the loud music” express a similar cozy intent, even though they share no words.
  3. Unsupervised Clustering: I am placing these review vectors in a shared vector space and grouping them with density-based clustering (HDBSCAN) rather than manual tags, so categories emerge wherever many reviews sit close together.
  4. AI Synthesis: Once the dense “clusters” of human intent are identified, the pipeline feeds those specific groups to a Large Language Model to synthesize clean, punchy SEO labels for brand-new categories.
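
Steps 2 to 4 can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration: the three-dimensional vectors are hand-written stand-ins for real embeddings, every review other than the two quoted above is invented, the clustering is a simplified DBSCAN-style routine standing in for HDBSCAN, and the function names and thresholds are hypothetical. No API is called; the last function only assembles the prompt text that would be sent to the LLM.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 3-dimensional "embeddings" (hypothetical values, not model output).
TOY_EMBEDDINGS = {
    "Great crackling fire": [0.90, 0.10, 0.05],
    "Loved reading my book away from the loud music": [0.85, 0.20, 0.10],
    "Snug corner by the fireplace": [0.88, 0.15, 0.02],
    "Brought the dog, staff gave him treats": [0.10, 0.90, 0.05],
    "Very dog friendly, water bowls everywhere": [0.05, 0.95, 0.10],
    "Our spaniel was welcomed at the bar": [0.12, 0.88, 0.08],
    "Card machine was broken": [0.10, 0.05, 0.95],
}


def density_cluster(vectors, min_similarity=0.95, min_neighbours=2):
    """Simplified DBSCAN-style clustering (a stand-in for HDBSCAN).

    Returns one label per vector; -1 marks noise (no dense neighbourhood).
    """
    n = len(vectors)
    neighbours = [
        [j for j in range(n)
         if j != i and cosine_similarity(vectors[i], vectors[j]) >= min_similarity]
        for i in range(n)
    ]
    labels = [-1] * n
    cluster_id = 0
    for i in range(n):
        # Only an unassigned "core" point (enough close neighbours) starts a cluster.
        if labels[i] != -1 or len(neighbours[i]) < min_neighbours:
            continue
        labels[i] = cluster_id
        frontier = list(neighbours[i])
        while frontier:
            j = frontier.pop()
            if labels[j] != -1:
                continue
            labels[j] = cluster_id
            if len(neighbours[j]) >= min_neighbours:
                frontier.extend(neighbours[j])  # keep growing through core points
        cluster_id += 1
    return labels


def build_label_prompt(reviews):
    """Assemble the prompt text for one cluster (no LLM call is made here)."""
    bullet_list = "\n".join(f"- {review}" for review in reviews)
    return (
        "These pub reviews were grouped together by semantic similarity.\n"
        "Suggest one short category label (max 5 words) for their shared intent.\n\n"
        f"{bullet_list}"
    )


texts = list(TOY_EMBEDDINGS)
labels = density_cluster(list(TOY_EMBEDDINGS.values()))
first_cluster = [text for text, label in zip(texts, labels) if label == 0]
print(labels)
print(build_label_prompt(first_cluster))
```

With these toy values the three fireplace reviews land in one cluster, the three dog reviews in another, and the card-machine complaint is left as noise. Treating sparse reviews as noise rather than forcing them into a category is the main reason to prefer a density-based method here.
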
Data Collection Phase

The Tech Stack

  • Languages: Python, SQL, JavaScript
  • Database & Storage: PostgreSQL, Supabase, pgvector
  • Automation & Pipelines: n8n, Third-party scraping APIs
  • AI & Machine Learning: OpenAI API (text-embedding-3-large, gpt-4o-mini), Scikit-learn, HDBSCAN (Density Clustering), UMAP (Dimensionality Reduction)
  • Visualization: JSON data mapped to frontend charting libraries (Chart.js / D3.js)

The Current Impact & Next Steps

The engine avoids the “garbage in, garbage out” problem of standard keyword matching. By reducing the 3,072-dimensional embeddings produced by text-embedding-3-large to a 256-dimension database column, the system can now search and map 59,000 rows in milliseconds without database timeouts.
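
The text above does not say how the reduction to 256 dimensions is performed. One plausible route, sketched here as an assumption, is to keep the leading components of each embedding and rescale the result to unit length. OpenAI documents a `dimensions` request parameter for its text-embedding-3 models for this purpose, and UMAP from the stack would be a different route. The function name and the fake input vector below are hypothetical.

```python
import math

FULL_DIMS = 3072    # default output size of text-embedding-3-large
STORED_DIMS = 256   # column width named in the text


def shorten_embedding(vector, dims=STORED_DIMS):
    """Keep the first `dims` components, then rescale to unit length.

    Re-normalising matters because cosine and inner-product search
    assume comparable vector lengths after truncation.
    """
    head = vector[:dims]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalise a zero vector")
    return [x / norm for x in head]


# Stand-in for a real embedding: deterministic pseudo-values, not model output.
fake_embedding = [math.sin(i + 1) for i in range(FULL_DIMS)]
short = shorten_embedding(fake_embedding)
print(len(short))
```

A 256-dimension vector is twelve times smaller per row than the full embedding, which is where the storage and query-time savings come from. How much retrieval quality is lost depends on the data and would need to be measured on the review set itself.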

Instead of building generic landing pages for “Dog Friendly Pubs,” the ultimate goal of this platform is to programmatically generate highly targeted directory pages based on data-backed customer pain points and praise, capturing high-converting, long-tail traffic that traditional SEO tools cannot see.
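
As a rough illustration of that last step, the sketch below turns one cluster label into a directory page record. The URL scheme, the field names, the function names, and the pub names are all hypothetical; the real platform may structure its pages quite differently.

```python
import re


def slugify(text):
    """Lower-case the text and collapse anything non-alphanumeric into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", text.lower())
    return slug.strip("-")


def build_page(label, town, pub_names):
    """Build a page record for one (cluster label, town) pair."""
    return {
        "path": f"/{slugify(town)}/{slugify(label)}",
        "title": f"{label} in {town}",
        "pubs": sorted(pub_names),
    }


page = build_page(
    "Quiet Pubs with a Fireplace",
    "York",
    ["The Example Arms", "The Placeholder Inn"],  # invented names
)
print(page["path"])
```

Keeping the page record as plain data means the same structure can feed a template, a sitemap, or the JSON consumed by the charting front end.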