Background jobs
Budget: 4-5 days.
Why this matters
Solves can take minutes. Today solve_vrp blocks the request until the solve
finishes. That is fine for a demo with one user, but it ties up a worker for
the whole solve, it loses any in-flight solve when the server restarts, and
it has no retry path when FastVRP fails transiently. A queue fixes all three.
What to learn
| Topic | Time |
|---|---|
| Queue concepts: producers, consumers, visibility timeouts, dead letters | 0.5 day |
| Pick one of Celery, RQ, or arq and read its quickstart (arq fits this async codebase best) | 0.5 day |
| Redis as the broker: install, run, connect | 0.5 day |
| Progress reporting pattern: worker writes to Postgres, SSE handler reads | 1 day |
| Move solve_vrp from tools.py into an arq job | 1-2 days |
| Retry policy with exponential backoff for FastVRP timeouts | 0.5 day |
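
The progress reporting row is the least familiar pattern in the table, so here is a minimal sketch of its shape. It assumes a few things that are not in this codebase: sqlite3 stands in for Postgres so the snippet runs anywhere, the `solutions` schema and the status names (`running`, `done`, `failed`) are hypothetical, and `write_progress`, `read_progress`, `sse_frame`, and `stream_progress` are illustrative names only.

```python
import asyncio
import json
import sqlite3
from typing import Optional

# Hypothetical terminal statuses; use whatever the real solutions table defines.
TERMINAL = {"done", "failed"}


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema; the real solutions table will have more columns.
    conn.execute(
        "CREATE TABLE solutions ("
        "job_id TEXT PRIMARY KEY, status TEXT NOT NULL, progress INTEGER NOT NULL)"
    )


def write_progress(conn: sqlite3.Connection, job_id: str, status: str, progress: int) -> None:
    # Worker side: overwrite the latest state for this job.
    conn.execute(
        "INSERT OR REPLACE INTO solutions (job_id, status, progress) VALUES (?, ?, ?)",
        (job_id, status, progress),
    )
    conn.commit()


def read_progress(conn: sqlite3.Connection, job_id: str) -> Optional[dict]:
    # Handler side: fetch the latest state, or None if the job has not written yet.
    row = conn.execute(
        "SELECT status, progress FROM solutions WHERE job_id = ?", (job_id,)
    ).fetchone()
    if row is None:
        return None
    return {"job_id": job_id, "status": row[0], "progress": row[1]}


def sse_frame(payload: dict) -> str:
    # One Server-Sent Events message: a data line followed by a blank line.
    return f"data: {json.dumps(payload)}\n\n"


async def stream_progress(conn: sqlite3.Connection, job_id: str, interval: float = 0.5):
    # Poll the table and yield a frame whenever the state changes.
    # Stops after emitting a terminal status.
    last = None
    while True:
        state = read_progress(conn, job_id)
        if state is not None and state != last:
            yield sse_frame(state)
            last = state
            if state["status"] in TERMINAL:
                return
        await asyncio.sleep(interval)
```

Polling keeps the worker and the web process decoupled: the worker only needs a database connection, and the SSE handler can reconnect after a restart and resume from the latest row. In the real handler the synchronous sqlite3 calls would be replaced by whatever async Postgres access the codebase already uses.
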
Resources
- arq docs: async-native task queue, closest fit to this codebase.
- Celery docs: heavier, battle-tested alternative.
- RQ docs: simple sync worker alternative.
- FastAPI background tasks: read this to understand why they are not a substitute for a real queue.
- Redis docs: you only need the basics to use it as a broker.
Exercise
Move the body of solve_vrp in tools.py into an arq job. The tool
invocation should enqueue the job and return a job ID. Store status and
result in the solutions table added in the previous chapter. Update
the SSE chat handler so that, while the agent is waiting on a solve, it
streams progress updates as they land in Postgres. Add a retry policy that
retries FastVRP timeouts up to three times with exponential backoff.
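
As a starting point for the retry part, the sketch below shows one way to express "up to three retries with exponential backoff" in plain asyncio, independent of any queue library. `FastVRPTimeout`, `backoff_delay`, and `call_with_retries` are hypothetical names, and the 2 second base and 60 second cap are assumed values, not something this codebase defines.

```python
import asyncio


class FastVRPTimeout(Exception):
    """Hypothetical stand-in for whatever the FastVRP client raises on a timeout."""


def backoff_delay(retry: int, base: float = 2.0, cap: float = 60.0) -> float:
    # retry is 1 for the first retry. With the defaults: 2s, 4s, 8s, capped at 60s.
    return min(cap, base * 2 ** (retry - 1))


async def call_with_retries(fn, *, max_retries: int = 3, base: float = 2.0, sleep=asyncio.sleep):
    # One initial attempt plus up to max_retries retries.
    # Only FastVRPTimeout is retried; any other exception propagates immediately.
    for attempt in range(max_retries + 1):
        try:
            return await fn()
        except FastVRPTimeout:
            if attempt == max_retries:
                raise  # retries exhausted: let the caller mark the job as failed
            await sleep(backoff_delay(attempt + 1, base=base))
```

Inside an arq job you would probably not sleep in the worker. As far as I know, arq lets a job raise `Retry(defer=...)` and passes the current attempt number as `ctx["job_try"]`, so `backoff_delay` could supply the `defer` value instead. Confirm both against the arq docs before relying on them.
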