
Symposion

AI debate framework with Goodhart's Law detection

Status: Active | Version: v0.6.0 | License: MIT

[vm] symposion.homelab :3000

Tech Stack

Go, Temporal, PostgreSQL, pgvector, Mattermost, Fiber, Claude CLI, Gemini CLI

Requirements

• Go 1.24+
• Temporal Server
• PostgreSQL 15+ with pgvector
• Mattermost
• Claude CLI

Features

✓ Multi-agent debates (Claude, Gemini, GPT)
✓ Goodhart's Law detection metrics
✓ Temporal workflow orchestration
✓ Vector memory for cross-topic learning
✓ Human-in-the-loop approval gates
✓ Mattermost channel integration
✓ Implementation plan generation

The Problem: AI Echo Chambers

When multiple AI agents work together, they can fall into consensus traps:

• Agents agree too quickly without genuine reasoning
• Debate quality degrades as agents optimize for agreement
• Goodhart’s Law manifests: agents game consensus metrics
• Implementation plans lack diverse perspectives

Symposion tackles this by orchestrating structured debates with explicit consensus quality measurement.

Architecture

A 10-step Temporal workflow coordinates the research process:

┌─────────────────────────────────────────────────────────────┐
│                    TEMPORAL WORKFLOW                         │
│  1. Parse Input → 2. Create Channels → 3. RAG Recall        │
│  4. Initial Positions → 5. Debate Rounds → 6. Consensus     │
│  7. Human Gate → 8. Implementation → 9. Build → 10. Report  │
└─────────────────────────────────────────────────────────────┘
                            │
        ┌───────────────────┼───────────────────┐
        ▼                   ▼                   ▼
   ┌─────────┐        ┌─────────┐        ┌─────────┐
   │  Agent  │        │  Agent  │        │  Agent  │
   │ (Claude)│        │ (Gemini)│        │  (GPT)  │
   └─────────┘        └─────────┘        └─────────┘
        │                   │                   │
        └───────────────────┼───────────────────┘
                            ▼
                    ┌───────────────┐
                    │  Mattermost   │
                    │  (4 channels) │
                    └───────────────┘

Channel	Purpose
debate	Agent discussions and position statements
planning	Implementation plan drafts
build	Code scaffolding and artifacts
reviews	Human feedback and approvals
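
The step order and channel names above can be written down as plain Go values. This is a minimal sketch, not Symposion's actual code: the `Step` type, the `channelFor` helper, and in particular the mapping of steps to channels are assumptions made for illustration, since the source lists the channels but does not say which step posts where.

```go
package main

import "fmt"

// Step is one stage of the 10-step workflow, numbered as in the diagram.
type Step int

const (
	ParseInput Step = iota + 1
	CreateChannels
	RAGRecall
	InitialPositions
	DebateRounds
	Consensus
	HumanGate
	Implementation
	Build
	Report
)

var stepNames = map[Step]string{
	ParseInput:       "Parse Input",
	CreateChannels:   "Create Channels",
	RAGRecall:        "RAG Recall",
	InitialPositions: "Initial Positions",
	DebateRounds:     "Debate Rounds",
	Consensus:        "Consensus",
	HumanGate:        "Human Gate",
	Implementation:   "Implementation",
	Build:            "Build",
	Report:           "Report",
}

// String returns the step's label from the diagram.
func (s Step) String() string {
	if name, ok := stepNames[s]; ok {
		return name
	}
	return fmt.Sprintf("Step(%d)", int(s))
}

// channelFor returns the Mattermost channel a step is assumed to post to.
// The routing is hypothetical; an empty string means no channel is assumed.
func channelFor(s Step) string {
	switch s {
	case InitialPositions, DebateRounds, Consensus:
		return "debate"
	case Implementation:
		return "planning"
	case Build:
		return "build"
	case HumanGate:
		return "reviews"
	default:
		return ""
	}
}
```

For example, `channelFor(DebateRounds)` returns `"debate"` under this assumed routing.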

Goodhart’s Law Detection

The system calculates metrics to detect when agents are optimizing for agreement rather than reaching genuine consensus:

Consensus Quality Metrics:

• Position Diversity: How different are initial positions?
• Argument Depth: Are agents engaging with specifics?
• Concession Authenticity: Do position changes cite evidence?
• Dissent Preservation: Are minority views maintained?

When metrics drop below thresholds, the system flags potential Goodhart gaming and can inject adversarial prompts to restore genuine debate.
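
A threshold check over the four metrics might look like the sketch below. It assumes each metric is already scored on a 0 to 1 scale where higher is healthier. The struct, the function names, and any threshold values are hypothetical, because the source does not describe how the scores are computed or what the real thresholds are.

```go
package main

// ConsensusMetrics holds the four consensus quality metrics.
// Assumption: each value is normalized to [0, 1], higher meaning healthier debate.
type ConsensusMetrics struct {
	PositionDiversity      float64
	ArgumentDepth          float64
	ConcessionAuthenticity float64
	DissentPreservation    float64
}

// BelowThreshold returns the names of the metrics in m that fall below
// the corresponding minimum in floor. The names are illustrative labels.
func BelowThreshold(m, floor ConsensusMetrics) []string {
	checks := []struct {
		name       string
		value, min float64
	}{
		{"position_diversity", m.PositionDiversity, floor.PositionDiversity},
		{"argument_depth", m.ArgumentDepth, floor.ArgumentDepth},
		{"concession_authenticity", m.ConcessionAuthenticity, floor.ConcessionAuthenticity},
		{"dissent_preservation", m.DissentPreservation, floor.DissentPreservation},
	}
	var flagged []string
	for _, c := range checks {
		if c.value < c.min {
			flagged = append(flagged, c.name)
		}
	}
	return flagged
}

// SuspectedGaming reports whether any metric is below its threshold,
// which is the point at which an adversarial prompt could be injected.
func SuspectedGaming(m, floor ConsensusMetrics) bool {
	return len(BelowThreshold(m, floor)) > 0
}
```

Returning the names of the failing metrics, rather than a single boolean, lets a caller pick an adversarial prompt that targets the specific weakness, for instance asking for a dissenting position when diversity is low.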

Vector Memory

PostgreSQL with pgvector stores debate outcomes for cross-topic learning:

• 1536-dimension embeddings via OpenAI
• HNSW indexing for fast similarity search
• RAG recall brings relevant past debates into context
• Agents learn from previous research sessions

Human-in-the-Loop

Critical decision points require human approval:

• Research Direction: Approve debate topic framing
• Implementation Plan: Sign off before code generation
• Build Artifacts: Review scaffolded repositories

Approval signals can time out after 7 days, at which point a configurable default decision applies.

Current Status

Component	Status
Temporal orchestration	Complete
Mattermost integration	Complete
CLI agent providers	Complete
Vector memory/RAG	Complete
Goodhart metrics	Complete
API providers (native)	Stubbed
UI dashboard	In progress

Summary

Benefit	Description
Diversity	Multiple AI perspectives prevent echo chambers
Quality	Goodhart detection ensures genuine reasoning
Memory	Vector store enables cross-topic learning
Control	Human gates at critical decision points