Why I'm Planning to Migrate from Redis Queue to Temporal in My Homelab
My RQ setup works, but Temporal's workflow orchestration promises better handling of complex multi-step jobs. Here's my migration plan and the real tradeoffs.
The Current State: RQ Actually Works Pretty Well
Let me be honest upfront: my Redis Queue (RQ) setup isn’t broken. It’s running a media data ETL pipeline right now, processing TMDB API calls, populating MongoDB, and reporting job status in real-time via SocketIO. I’ve built something reasonably sophisticated:
# My actual CustomWorker implementation
from rq import Worker
from rq.job import JobStatus

class CustomWorker(Worker):
    def execute_job(self, job, queue):
        print(f"Starting job execution: {job.id}")
        result = super().execute_job(job, queue)
        if job.get_status() == JobStatus.FINISHED:
            report_to_server({
                "job_id": job.id,
                "status": "completed",
                "result": job.return_value()
            })
        elif job.get_status() == JobStatus.FAILED:
            report_to_server({
                "job_id": job.id,
                "status": "failed",
                "error": job.latest_result()
            })
        return result
The architecture separates concerns cleanly:
- Redis DB 6: API response caching (14-day TTL)
- Redis DB 7: RQ job queue for background tasks
- Redis DB 10: SocketIO pub/sub for real-time updates
Workers connect via SocketIO client and push status updates that broadcast to all connected web clients. It’s not bad.
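For reference, the wiring behind that split is just a few clients pointed at different logical databases. A minimal sketch (the host, port, and SocketIO manager URL are assumptions; the DB numbers and 14-day TTL are my real config):

# Rough sketch of the Redis split described above
import redis
import socketio
from rq import Queue

CACHE_TTL = 14 * 24 * 3600  # 14-day TTL for cached TMDB responses

cache = redis.Redis(host="localhost", port=6379, db=6)                    # DB 6: API response cache
queue = Queue(connection=redis.Redis(host="localhost", port=6379, db=7))  # DB 7: RQ job queue
sio_manager = socketio.RedisManager("redis://localhost:6379/10")          # DB 10: SocketIO pub/sub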
So Why Consider Temporal?
The problems emerge when workflows get complex. My ETL pipeline has dependencies:
1. Fetch movie lists from TMDB (500+ pages)
2. For each movie, fetch detailed info (cast, crew, videos)
3. Process and denormalize data
4. Update search indexes
5. Notify frontend of new content
In RQ, I’m coordinating this manually:
# Current approach - manual orchestration
def run_all_tasks(refresh: bool = False):
    if refresh:
        cache.flush()
    create_indexes()
    for i in range(1, TOTAL_PAGES + 1):
        queue.enqueue(
            fetch_build_tmdb,
            "movie_popular",
            "/movie/popular",
            i,
            region="US"
        )
This works for simple fan-out, but when I need:
- Step 2 to wait for step 1 to complete
- Parallel execution of independent tasks
- Retry logic that picks up from failure points
- Audit trails of what ran when
…the code gets ugly fast.
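For the record, RQ does have a depends_on parameter, so chaining is possible; it's the fan-out-then-fan-in shape that turns into bookkeeping. A rough sketch of where it heads (fetch_movie_list, process_movie_details, and update_search_index are hypothetical task names, not my actual code):

# Manual fan-out/fan-in with RQ's depends_on: every downstream step
# has to carry the full list of upstream job handles
list_jobs = [queue.enqueue(fetch_movie_list, page) for page in range(1, TOTAL_PAGES + 1)]
details_job = queue.enqueue(process_movie_details, depends_on=list_jobs)
index_job = queue.enqueue(update_search_index, depends_on=details_job)
# If one list_jobs entry fails, there's no built-in way to resume just that branch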
What RQ Does Well
Credit where it’s due. My current setup handles:
Real-time visibility via SocketIO:
# Workers report status to web UI in real-time
import socketio

sio = socketio.Client()
sio.connect("http://localhost:7010")

def report_to_server(data):
    sio.emit("report", data)
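On the server side there's a matching handler that rebroadcasts whatever a worker reports to every connected browser. A minimal sketch with python-socketio's async server (the outgoing event name is an assumption):

# Relay worker "report" events to all connected web clients
import socketio

sio = socketio.AsyncServer(async_mode="asgi", cors_allowed_origins="*")
app = socketio.ASGIApp(sio)

@sio.on("report")
async def handle_report(sid, data):
    await sio.emit("job_update", data)  # broadcast to every client ("job_update" is illustrative)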
Worker management via REST API:
@router.post("/start")
async def start_workers(num_workers: int = 1):
    for _ in range(num_workers):
        subprocess.Popen(["gnome-terminal", "--", "bash", "-c",
                          f"python {worker_script}; exec bash"])
    return {"message": f"Started {num_workers} worker(s)"}
Hand-rolled retry with backoff for TMDB rate limits:
for attempt in range(MAX_RETRIES):
    try:
        response = httpx.get(url, headers=headers, params=params)
        response.raise_for_status()
        # ... process data
        break
    except httpx.HTTPStatusError as e:
        if e.response.status_code == 429 and attempt < MAX_RETRIES - 1:
            time.sleep(RATE_LIMIT_DELAY * (attempt + 1))
            continue
        raise
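RQ also ships a Retry helper for whole-job retries, which I could lean on instead of the loop above for anything that's safe to re-run end to end. A sketch reusing the earlier enqueue call:

# RQ's built-in per-job retry: re-runs the whole job with increasing delays
from rq import Retry

queue.enqueue(
    fetch_build_tmdb,
    "movie_popular",
    "/movie/popular",
    1,
    region="US",
    retry=Retry(max=3, interval=[10, 30, 60]),  # seconds to wait between attempts
)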
For simple fire-and-forget tasks with manual retry logic, RQ is genuinely good enough.
Where RQ Falls Short
The Durability Problem
Redis is in-memory. Yes, RDB/AOF persistence exists, but RQ doesn’t checkpoint job progress. If a worker crashes mid-execution:
- Simple jobs? Re-run from scratch
- Multi-step jobs? Hope your code handles partial state
- Jobs calling external APIs? Pray for idempotency
I haven’t lost data yet, but I’ve come close.
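My main defense today is making the MongoDB writes idempotent, so a from-scratch re-run just overwrites the same documents. The pattern, sketched with illustrative collection and field names:

# Idempotent write: keyed on the TMDB id, so a re-run upserts instead of duplicating
from pymongo import MongoClient

movies = MongoClient("mongodb://localhost:27017")["media"]["movies"]

def save_movie(doc: dict) -> None:
    movies.update_one({"tmdb_id": doc["tmdb_id"]}, {"$set": doc}, upsert=True)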
The Orchestration Problem
My current UNIFIED_BACKEND_PLAN.md describes the target architecture:
backend/
├── etl/
│   ├── orchestrators/           # ETL orchestration scripts
│   │   ├── movies.py            # Movie list orchestration
│   │   ├── movie_details.py     # Movie details processing
“Orchestrators” sounds fancy, but it’s really just Python scripts calling queue.enqueue() in loops. Real workflow orchestration - dependencies, parallelization, error recovery - requires Temporal or similar.
The Visibility Problem
My SocketIO dashboard shows job status, but:
- No workflow-level view (what percentage complete?)
- No historical analysis (how long did step 3 take last week?)
- No easy replay of failed workflows
The Temporal Proposition
Temporal reframes background jobs as durable workflows. The mental model shift:
| RQ | Temporal |
|---|---|
| “Enqueue a job” | “Start a workflow” |
| “Job failed, retry from scratch” | “Activity failed, retry from checkpoint” |
| “Poll for status” | “Subscribe to workflow events” |
| “Manual dependency management” | “Workflow code is the dependency graph” |
Here’s what my ETL could look like:
import asyncio
from datetime import timedelta

from temporalio import workflow
from temporalio.common import RetryPolicy

@workflow.defn
class MovieETLWorkflow:
    @workflow.run
    async def run(self, region: str = "US") -> dict:
        # Step 1: Create indexes
        await workflow.execute_activity(
            create_indexes,
            start_to_close_timeout=timedelta(minutes=5),
        )
        # Step 2: Fetch all movie lists in parallel
        movie_tasks = []
        for page in range(1, MAX_PAGES + 1):
            movie_tasks.append(
                workflow.execute_activity(
                    fetch_movie_list,
                    args=[page, region],
                    start_to_close_timeout=timedelta(minutes=30),
                    retry_policy=RetryPolicy(maximum_attempts=3),
                )
            )
        movie_ids = await asyncio.gather(*movie_tasks)
        # Step 3: Fetch details for each movie
        detail_tasks = []
        for movie_id in flatten(movie_ids):
            detail_tasks.append(
                workflow.execute_activity(
                    fetch_movie_details,
                    args=[movie_id],
                    start_to_close_timeout=timedelta(minutes=5),
                )
            )
        await asyncio.gather(*detail_tasks)
        return {"movies_processed": len(detail_tasks)}
If a worker dies mid-execution? Temporal replays the workflow from the last completed activity. Not “re-runs everything” - literally continues from where it stopped.
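For that workflow to actually run, the activities and a worker process have to exist too. Roughly like this (a sketch: the task queue name and activity bodies are placeholders, not my real ETL code, and MovieETLWorkflow is the class defined above):

# Sketch: activity definitions plus the worker process that executes them
import asyncio

from temporalio import activity
from temporalio.client import Client
from temporalio.worker import Worker

@activity.defn
async def create_indexes() -> None:
    ...  # placeholder: create MongoDB indexes

@activity.defn
async def fetch_movie_list(page: int, region: str) -> list[int]:
    return []  # placeholder: call TMDB, store the raw page, return movie ids

@activity.defn
async def fetch_movie_details(movie_id: int) -> None:
    ...  # placeholder: fetch cast/crew/videos and upsert the document

async def main():
    client = await Client.connect("localhost:7233")
    worker = Worker(
        client,
        task_queue="media-etl",  # assumed name; workflows must be started on the same queue
        workflows=[MovieETLWorkflow],
        activities=[create_indexes, fetch_movie_list, fetch_movie_details],
    )
    await worker.run()

if __name__ == "__main__":
    asyncio.run(main())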
The Comparison That Actually Matters
| Feature | My Current RQ Setup | Temporal |
|---|---|---|
| Setup complexity | Already done | 2-3 hours |
| State durability | No job-level checkpoints (Redis in-memory) | Complete (persisted in PostgreSQL) |
| Workflow visibility | Custom SocketIO dashboard | Full workflow viewer built-in |
| Retry from failure point | No | Yes |
| Long-running workflows | Works but fragile | Native support |
| Multi-step orchestration | Manual coordination | Built into workflow code |
| Resource overhead | ~150MB (Redis + workers) | ~800MB (Temporal stack) |
The overhead is significant. Temporal runs multiple services:
- Frontend service
- History service
- Matching service
- PostgreSQL (or MySQL/Cassandra)
- Web UI
For a homelab, Docker Compose makes this manageable, but it’s not trivial.
My Migration Plan
I’m not ripping out RQ tomorrow. The plan:
Phase 1: Parallel Testing (Current)
- Keep RQ running for existing workloads
- Deploy Temporal stack alongside
- Port one workflow (media processing) as proof of concept
Phase 2: Incremental Migration
- Move complex multi-step workflows to Temporal
- Keep simple fire-and-forget tasks on RQ
- Run both systems for 4-6 weeks
Phase 3: Full Migration
- Move all workflows to Temporal
- Decommission RQ
- Repurpose Redis for pure caching
Docker Compose for Temporal
version: "3.8"

services:
  postgresql:
    image: postgres:15
    environment:
      POSTGRES_USER: temporal
      POSTGRES_PASSWORD: temporal
      POSTGRES_DB: temporal
    volumes:
      - temporal_db:/var/lib/postgresql/data
    networks:
      - temporal

  temporal:
    image: temporalio/auto-setup:latest
    environment:
      - DB=postgres12
      - DB_PORT=5432
      - POSTGRES_USER=temporal
      - POSTGRES_PWD=temporal
      - POSTGRES_SEEDS=postgresql
    depends_on:
      - postgresql
    ports:
      - "7233:7233"
    networks:
      - temporal

  temporal-ui:
    image: temporalio/ui:latest
    environment:
      - TEMPORAL_ADDRESS=temporal:7233
    depends_on:
      - temporal
    ports:
      - "8088:8080"
    networks:
      - temporal

volumes:
  temporal_db:

networks:
  temporal:
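Once the stack is up, kicking off a run is a short client script against localhost:7233. A sketch (the workflow id and task queue are assumptions that just need to match the worker; MovieETLWorkflow is the class from earlier):

# Start the ETL workflow against the compose stack above
import asyncio

from temporalio.client import Client

async def main():
    client = await Client.connect("localhost:7233")
    result = await client.execute_workflow(
        MovieETLWorkflow.run,
        "US",                    # region argument
        id="movie-etl-run",      # assumed workflow id
        task_queue="media-etl",  # must match the worker's task queue
    )
    print(result)

asyncio.run(main())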
When to Actually Make the Switch
Stick with RQ if:
- Tasks are independent and stateless
- You’ve already built monitoring (like I have)
- Failures just mean “retry the whole thing”
- You don’t need workflow-level visibility
- Resource constraints matter
Consider Temporal if:
- Workflows have multiple dependent steps
- Long-running jobs must survive crashes
- You need audit trails and workflow history
- Business logic requires exactly-once execution
- You’re orchestrating external services with complex failure modes
Honest Assessment
My RQ setup works. The SocketIO real-time reporting, the worker management API, the retry logic - it’s production-ready for what it does. I’ve spent considerable time making it robust.
Temporal promises to solve problems I occasionally have, not problems I constantly have. The migration is about future-proofing and reducing the custom code I maintain, not fixing something broken.
Is the 5x resource overhead worth it? For complex media pipelines and backup orchestration, probably yes. For simple cron-style jobs, absolutely not.
The migration starts next month. I’ll report back on whether Temporal lives up to the promise, or whether I’m just trading one set of problems for another.
Currently running: RQ with custom SocketIO monitoring. Planning: Temporal migration. The homelab never stops evolving.