Opportunity Intelligence &
workflow automation.
A local-first AI pipeline that automates the judgment-heavy parts of a real job search: finding roles, grading fit honestly, and writing application documents that stay defensible. Built for a market where applying online has become a volume game that no existing tool handled end to end.
Applying online for jobs has become a volume game, where the effort you put into each one is filtered out by a machine before a person ever sees it.
From the search that started this · 2026
The problem.
Section 02 / Why it exists
When I was searching for jobs full time, it stopped feeling like my time was respected. Tailoring a résumé to a role is real work, fifteen or twenty minutes each, and you do it only to watch a posting hit "100 applicants" before you submit. The tools that existed solved only fragments of the problem: scrapers that just tracked postings, paid apps that capped you at a handful of auto-applies. None did the whole chain end to end.
Before a recruiter ever sees a résumé, a machine does. An applicant tracking system parses and filters every application, and that filtering is uneven and opaque. It tends to disadvantage new and recent grads, who are capable on paper but light on the exact keywords and years of experience the system scores for. Getting to the next stage can feel like a lottery.
It is not universal. Mid-senior candidates, or grads with stacked internships, may never feel it. But many early-career applicants do, and I was one of them. So I built the reasoning in the open, where a human controls what the AI is allowed to claim and every score can be audited.
Spray-and-pray.
Mass-apply bots optimize for conversion, not truth. They inflate, they generalize, and they leave you defending claims you never made.
The link graveyard.
Trackers and spreadsheets store postings without understanding them. Fit, salary, and positioning are still left entirely to you.
The application grind.
Upload a résumé to Workday, watch it "auto-fill," then paste every section back in by hand and click every skill one by one, sometimes a personality assessment on top, all while dodging fake postings and scams.
How it works, six stages.
Section 03 / The pipeline
Each stage produces a human-readable artifact. You can open any intermediate output and see exactly what happened and why. That inspectability was a constraint from the start, not a feature added later. The middle of the system runs as a streaming 2A → 2B conveyor: as soon as structured intake lands for one role, grading can start on it without waiting for the rest of the scrape batch. Every stage persists to a single SQLite database keyed by job, a deliberate data layer rather than a log file, and the whole funnel is driven from an operator dashboard rather than a status screen.
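The 2A → 2B conveyor can be pictured as a producer and a consumer joined by a queue. This is a minimal sketch, not the real orchestration (which is private and persists to SQLite): `intake`, `grade`, and the in-memory queue are hypothetical stand-ins, and the score is a placeholder rather than a model output.

```python
import queue
import threading

SENTINEL = None  # marks the end of the scrape batch


def intake(raw_posting: str) -> dict:
    """Hypothetical stage 2A: turn a raw posting into structured intake."""
    return {"title": raw_posting.strip().title()}


def grade(job: dict) -> dict:
    """Hypothetical stage 2B: placeholder score, not the real grading model."""
    return {**job, "score": len(job["title"]) % 11}


def run_conveyor(raw_postings: list[str]) -> list[dict]:
    handoff: queue.Queue = queue.Queue()
    graded: list[dict] = []

    def producer() -> None:
        for raw in raw_postings:
            handoff.put(intake(raw))  # 2B can pick this up immediately
        handoff.put(SENTINEL)

    def consumer() -> None:
        # Grade each role as it arrives instead of waiting for the batch.
        while (job := handoff.get()) is not SENTINEL:
            graded.append(grade(job))

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return graded
```

The point of the shape is that grading starts on the first role while later roles are still being structured; nothing in 2B depends on the batch being complete.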
What sets it apart.
Section 04 / Deliberate decisions
Not a feature pile. A serious engineering project for a real use case, with a deliberate call at every layer.
Not a job bot, an opportunity radar.
This is the feature that most sets the project apart, and the reason it is called Opportunity Intelligence. Beyond open roles, a self-contained module scrapes upcoming networking events, info sessions, and career fairs, judges each for relevance with the same dual-worker local model that grades jobs, geocodes them, and plots them on a live map with filters and Register / View links. Jobs and networking become one opportunity surface. Working end to end today, still maturing.
Fully local AI, at low cost.
Ollama-first across local GPUs, using free open-weight models such as Qwen 3 and Google Gemma 4. No cloud API bills, no data leaving the machine, and unlimited experimentation against real prompts.
Dual-GPU AI inferencing.
Both GPUs run AI inference in parallel and can work the same module at once without interfering. Models load by script, per card and sometimes several on the same card, each tuned for its stage. When a card finishes its task, it moves on to the next one so neither sits idle.
Honest, not inflated.
Scores are built to be defensible, not conversion-optimized. A dedicated near-miss band surfaces borderline roles for review instead of silently discarding them.
Evidence-grounded.
The Profile Hub is a curated evidence contract, not a config file. Writing rules, guardrails, resume variants, and proof points define what the AI may claim. A deterministic QA pass flags anything overclaimed.
Four stages, not a prompt.
Document generation is company-context extraction → a planner that decides positioning → generation → QA, compiled to PDF and DOCX.
An operator workspace.
Pipeline controls, collected jobs, per-stage model routing, GPU telemetry, application tracking, and a profile editor all live in one workspace, not a status screen.
The data layer.
Data layer · SQLite
The pipeline is only as trustworthy as the record behind it. Every stage writes to one local SQLite database, with the schema and the access rules chosen deliberately rather than left to a framework.
Why SQLite.
Local-first and single-candidate, so an embedded ACID database with zero operations beat running a server beside the models. The honest ceiling: multi-user moves this to Postgres, which is on the roadmap.
One table per stage.
Scrape, structured intake, grading, and the application history each own a table, keyed by job, so every generated document maps back to its exact posting and system-regenerable data stays separate from work that cannot be recreated.
Two GPUs, no collisions.
WAL journaling plus a status-guarded claim lets two GPU workers run at once without ever taking the same job, and a timed-out job returns to the queue instead of being lost.
Indexed for the dashboard.
Indexes follow the queries the dashboard actually runs, filtering and sorting by status, category, source, and date, so the operator views stay fast as the tables grow.
UPDATE jobs_raw SET status='grading'
WHERE job_id=? AND status='resolved'
The row is only taken if it is still resolved, so no two GPU workers grade the same job.
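A status-guarded claim like this can be wrapped in a small helper that reports whether the worker actually won the row. A minimal sketch using Python's standard `sqlite3` module; the `jobs_raw` table, `job_id`, and the two status values come from the statement shown here, while `open_db`, `claim_job`, and the five-second busy timeout are assumptions made for illustration.

```python
import sqlite3


def open_db(path: str) -> sqlite3.Connection:
    # The timeout makes a worker wait briefly on a locked database
    # instead of failing immediately.
    conn = sqlite3.connect(path, timeout=5.0)
    conn.execute("PRAGMA journal_mode=WAL")  # readers do not block the writer
    return conn


def claim_job(conn: sqlite3.Connection, job_id: int) -> bool:
    """Return True only if this worker moved the row from 'resolved' to 'grading'."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE jobs_raw SET status='grading' "
            "WHERE job_id=? AND status='resolved'",
            (job_id,),
        )
    # rowcount is 0 when another worker already changed the status.
    return cur.rowcount == 1
```

A worker would call `claim_job` and run inference only when it returns `True`. Because the check and the update happen in one statement, there is no window between "read the status" and "write the status" for a second worker to slip into.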
Sequential to parallel.
Section 05 / The architecture
The first build ran sequentially. Every module ran one after another, M1 → M2 → M3. It worked, but I could see the wasted time: the stages were not bound by the same resource, yet each one waited on the last. Scraping is CPU work; the judgment and writing are GPU AI inferencing. So I mapped each module to the hardware it actually needs and rebuilt the pipeline to run concurrently.
Dual-GPU AI inferencing.
The hard part
Two cards running in parallel is one thing. Getting two local language models to work the same module at once without interfering is the real feat.
Both cards infer at once.
Each GPU runs AI inference simultaneously. It is independent work in parallel, not one card waiting on the other. Throughput roughly doubles where the work is GPU-bound.
Two models, one module.
Both GPUs can work the same module concurrently without stepping on each other. Two language models on one task, isolated so they don't collide on memory or output.
Each stage, its own model.
Models load by script: different models per card, and sometimes several models on the same card. M2 (grading) and M3 (document generation) run distinct language models, each configured with its own tuning parameters before it sees any job prompt. When a card finishes its task it moves on to the next, so neither sits idle.
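Per-stage routing of this kind can be expressed as plain configuration. The sketch below is illustrative only: it assumes one Ollama server per GPU on separate local ports, and the model tags, temperatures, and the `StageRoute` and `build_request` names are hypothetical rather than the author's actual setup. It only builds the request for Ollama's `/api/generate` endpoint and does not send it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StageRoute:
    base_url: str  # assumed layout: one Ollama server per GPU
    model: str     # illustrative tag, not the author's actual choice
    options: dict = field(default_factory=dict)


# Hypothetical routing table: each stage gets its own card, model, and settings.
ROUTES = {
    "M2_grading": StageRoute("http://127.0.0.1:11434", "qwen3:8b", {"temperature": 0.1}),
    "M3_documents": StageRoute("http://127.0.0.1:11435", "qwen3:14b", {"temperature": 0.6}),
}


def build_request(stage: str, prompt: str) -> tuple[str, dict]:
    """Return the URL and JSON payload for one generation call."""
    route = ROUTES[stage]
    payload = {
        "model": route.model,
        "prompt": prompt,
        "stream": False,
        "options": route.options,
        "keep_alive": -1,  # ask Ollama to keep the model loaded between calls
    }
    return f"{route.base_url}/api/generate", payload
```

Keeping the routing in data rather than in code is what lets a stage be pointed at a different model or card without touching the stage itself.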
Why Inference Lab exists
Running two models on the same task without collisions meant I had to see what each card was doing: which model was loaded, VRAM in use, whether they were interfering. To make the concurrency visible and tunable while I ran test after test, I built a separate control plane for it.
Built to keep running.
Section 06 / Reliability engineering
A local two-GPU pipeline running unattended needs boring, practical failure handling. The system treats retries, stale work, and GPU hiccups as normal operating conditions, not exceptional surprises.
Atomic job claiming.
Workers claim rows before inference, so two GPU workers never grade or generate against the same job at the same time.
Retry without losing work.
Transient endpoint or GPU failures retry with backoff; timed-out grading work returns to the queue instead of disappearing.
Stale rows age out.
Old unresolved rows move to a stale state automatically, keeping dashboard reads and future runs focused on live work.
No reload churn.
Models stay resident between calls, avoiding repeated load/unload cycles that waste time and create VRAM failure modes.
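Retrying transient failures with backoff might look like the following. This is a sketch under stated assumptions: `TransientError`, the attempt count, and the delays are hypothetical, and the real pipeline's failure classification is not published.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for endpoint or GPU failures worth retrying."""


def retry_with_backoff(
    call: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    for attempt in range(attempts):
        try:
            return call()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of retries; the caller can requeue the job
            delay = base_delay * (2 ** attempt)  # 0.5s, 1s, 2s, ...
            sleep(delay + random.uniform(0, delay / 2))  # jitter avoids lockstep retries
    raise ValueError("attempts must be at least 1")
```

Letting the final failure propagate, rather than swallowing it, is what allows the calling worker to put the job back in the queue instead of losing it.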
Documents, four passes.
Section 07 / Generation
A generated document is a function of the Profile Hub plus a template, not a one-shot rewrite. The Profile Hub is the source-of-truth contract for what the model can say; the QA pass is deterministic and flags any claim that isn't grounded in that evidence.
Company context
Extract what the role and the company actually care about.
Planner
Decide positioning: which evidence leads, what the angle is.
Generation
Write the resume + cover letter, grounded only in evidence-bank facts.
QA
Deterministic check flags overclaimed or misaligned content.
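To make "deterministic" concrete, here is one way such a QA check could work: plain string rules, no model call, same input always yields the same flags. It is a simplified sketch, not the project's actual QA pass; the `qa_flags` name, the watchlist idea, and the years-of-experience rule are all assumptions for illustration.

```python
import re


def qa_flags(
    text: str,
    evidence_skills: set[str],
    watchlist: set[str],
    max_years: int,
) -> list[str]:
    """Flag skills and experience claims that the evidence does not support.

    Skill names are assumed to be lowercase plain words; names with
    punctuation would need a different matching rule.
    """
    flags = []
    lowered = text.lower()
    # Any watched skill that appears in the text but not in the evidence is a flag.
    for skill in sorted(watchlist - evidence_skills):
        if re.search(rf"\b{re.escape(skill)}\b", lowered):
            flags.append(f"ungrounded skill: {skill}")
    # Any "N years" claim above the evidenced maximum is a flag.
    for match in re.finditer(r"(\d+)\+?\s+years", lowered):
        if int(match.group(1)) > max_years:
            flags.append(f"overclaimed experience: {match.group(0)}")
    return flags
```

Because the check is rule-based, a flagged document can be traced to the exact phrase that triggered it, which is what makes the output auditable.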
Grades, and the near misses.
Section 08 / Illustrative data
A sanitized slice of the tracker. Grades are the honest 0-10 fit score; the near-miss band holds roles that fall just under the apply threshold. They are kept for human review, not discarded.
Illustrative · sanitized fixture · not a real applicant ledger
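The near-miss band amounts to a three-way triage on the 0-10 score. A minimal sketch: the apply threshold of 7.0 and the band width of 1.0 are hypothetical values chosen for the example, not the project's real cutoffs.

```python
def triage(score: float, apply_at: float = 7.0, near_miss_width: float = 1.0) -> str:
    """Sort a 0-10 fit score into apply, near-miss review, or below the band."""
    if not 0 <= score <= 10:
        raise ValueError("score must be on the 0-10 scale")
    if score >= apply_at:
        return "apply"
    if score >= apply_at - near_miss_width:
        return "near_miss"  # kept for human review, not discarded
    return "below_band"
```

With these example thresholds, a role graded 6.5 lands in the review band rather than being silently dropped.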
The dashboard, up close.
Section 09 / Gallery
The operator surfaces, captured from the running app in demo mode. Real screenshots drop into these slots; every shot uses sanitized data, never real personal information.
Beyond the job board, the opportunity radar.
Section 10 / Opportunity Radar
Opportunity isn't just open reqs. A self-contained module, with its own database and roughly a dozen per-platform extractors across finance, business, tech, and university sources, surfaces upcoming networking events, info sessions, and career fairs. The same dual-worker local model that grades roles judges each event for relevance, pass or fail with a one-line reason, against the same interest profile, then geocodes and plots them on the map beside the roles. Volunteering events are a near-term extension of the same surface.
Working · still maturing
Illustrative · sample events · scraping, local-model filtering, and the map are live today; JS-heavy sources + user-managed source editing still hardening
Built to extend.
Section 11 / Extensibility · baked in
Every stage reads and writes a human-readable contract, so a new input plugs into one stage without touching the rest. Adding capability is configuration, not a rewrite. Extensibility is a property of the architecture, not a feature bolted on.
New platform sources.
Add a source adapter; ATS resolution, dedupe, grading, and generation downstream are unchanged.
New resume variants.
Drop a variant into the bank; matching picks the best one per role automatically.
New category searches.
Point it at a new role vertical; the same understand → grade → generate flow applies, no new code path.
New local models.
Register a worker in Inference Lab and route any stage to it. Swap or add a model by config.
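The "add a source adapter" idea can be sketched as a small interface plus a registry. Everything named here (`SourceAdapter`, `REGISTRY`, `register`, `collect`, `StaticBoard`, and the example URL) is hypothetical; the real scrapers and their contracts stay private.

```python
from typing import Iterable, Protocol


class SourceAdapter(Protocol):
    """Hypothetical contract: a source only has to yield posting dicts."""

    name: str

    def fetch(self) -> Iterable[dict]: ...


REGISTRY: dict[str, SourceAdapter] = {}


def register(adapter: SourceAdapter) -> None:
    REGISTRY[adapter.name] = adapter


class StaticBoard:
    """Stand-in source that returns one fixed posting."""

    name = "static_board"

    def fetch(self) -> Iterable[dict]:
        yield {"source": self.name, "title": "Data Analyst", "url": "https://example.com/jobs/1"}


def collect() -> list[dict]:
    """Pull from every registered source, dropping postings with a URL already seen."""
    seen: set[str] = set()
    out: list[dict] = []
    for adapter in REGISTRY.values():
        for posting in adapter.fetch():
            if posting["url"] not in seen:
                seen.add(posting["url"])
                out.append(posting)
    return out
```

Downstream code only ever calls `collect()`, so a new source is one new class and one `register` call; dedupe and everything after it stay unchanged.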
What's next.
Section 12 / Roadmap
Framed honestly. The auto-apply engine is fully architected but benched from the default pipeline today. The work in progress is making a cloud model drive the actual applying.
Opportunity Radar: hardening the surface.
The radar already works end to end: events scrape, get filtered by the local model, geocode, and render on the map. What's left is hardening, not building: JS-gated event sources, user-managed source editing, and adding volunteering events as a near-term extension. A key pillar, not a side feature, and what makes this Opportunity Intelligence rather than a job bot.
Cloud-driven auto-apply.
Platform-specific handlers (Indeed, Workday, LinkedIn, jobs.ca) and an accessibility-tree agent for tricky forms are built. The direction in progress: a cloud model's browser control creates accounts and submits using stored candidate data, logging each result back to the tracker.
Multi-user.
Single-candidate today by design. Multi-user support, with separate evidence banks, profiles, and trackers per person, is a planned direction, not a current claim.
Deeper observability.
More of what Inference Lab already shows: per-stage latency, model-swap history, and run-over-run grading drift, surfaced directly in the operator dashboard.
- Architecture, this case study, and the operator UI concepts
- Honest grading rationale + the four-stage document contract
- Pipeline orchestration · prompts · scrapers · credentials stay private
- Real candidate data & the evidence bank's contents are never published