CLAUDE.md - Technical Notes for LLM Council

This file contains technical details, architectural decisions, and important implementation notes for future development sessions.

Project Overview

LLM Council is a 3-stage deliberation system where multiple LLMs collaboratively answer user questions. The key innovation is anonymized peer review in Stage 2, preventing models from playing favorites.

Architecture

Backend Structure (backend/)

config.py

  • Contains COUNCIL_MODELS (list of OpenRouter model identifiers)
  • Contains CHAIRMAN_MODEL (model that synthesizes final answer)
  • Uses environment variable OPENROUTER_API_KEY from .env
  • Backend runs on port 8001 (NOT 8000 - user had another app on 8000)
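
A minimal sketch of the shape of config.py, assuming python-dotenv for .env loading; the model identifiers are illustrative placeholders, not the committed defaults:

```python
# backend/config.py (sketch; model IDs are illustrative placeholders)
import os
from dotenv import load_dotenv  # assumes python-dotenv is installed

load_dotenv()  # reads OPENROUTER_API_KEY from .env

OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY")

# OpenRouter model identifiers for the council (placeholders).
COUNCIL_MODELS = [
    "openai/gpt-5.1",
    "anthropic/claude-sonnet-4.5",
    "x-ai/grok-4",
]

# Synthesizes the final answer in Stage 3; Gemini per the default
# noted under Model Configuration below (exact ID is a placeholder).
CHAIRMAN_MODEL = "google/gemini-2.5-pro"
```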

openrouter.py

  • query_model(): Single async model query
  • query_models_parallel(): Parallel queries using asyncio.gather()
  • Returns dict with 'content' and optional 'reasoning_details'
  • Graceful degradation: returns None on failure, continues with successful responses
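
A hedged sketch of both helpers, assuming httpx as the async HTTP client and OpenRouter's chat-completions endpoint; the names and return shape match the notes above, the internals are illustrative:

```python
# backend/openrouter.py (sketch; request details are illustrative)
import asyncio
import httpx

from .config import OPENROUTER_API_KEY

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"

async def query_model(model: str, messages: list[dict]) -> dict | None:
    """Single async model query; returns None on failure."""
    try:
        async with httpx.AsyncClient(timeout=120) as client:
            resp = await client.post(
                OPENROUTER_URL,
                headers={"Authorization": f"Bearer {OPENROUTER_API_KEY}"},
                json={"model": model, "messages": messages},
            )
            resp.raise_for_status()
            message = resp.json()["choices"][0]["message"]
            return {
                "content": message["content"],
                # Present only for some reasoning models.
                "reasoning_details": message.get("reasoning_details"),
            }
    except Exception:
        return None  # the real code logs here; never crash the whole request

async def query_models_parallel(models: list[str], messages: list[dict]) -> dict[str, dict]:
    """Query all models concurrently; drop failures, keep successes."""
    results = await asyncio.gather(*(query_model(m, messages) for m in models))
    return {m: r for m, r in zip(models, results) if r is not None}
```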

council.py - The Core Logic

  • stage1_collect_responses(): Parallel queries to all council models
  • stage2_collect_rankings():
    • Anonymizes responses as "Response A, B, C, etc."
    • Creates label_to_model mapping for de-anonymization
    • Prompts models to evaluate and rank (with strict format requirements)
    • Returns tuple: (rankings_list, label_to_model_dict)
    • Each ranking includes both raw text and parsed_ranking list
  • stage3_synthesize_final(): Chairman synthesizes from all responses + rankings
  • parse_ranking_from_text(): Extracts the "FINAL RANKING:" section, handling both numbered lists and the plain format
  • calculate_aggregate_rankings(): Computes the average rank position across all peer evaluations (both helpers are sketched below)
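
A sketch of those two helpers (illustrative, not the exact source), including the fallback behavior mentioned under Common Gotchas:

```python
# Sketch of the ranking helpers in council.py (illustrative).
import re
from collections import defaultdict

def parse_ranking_from_text(text: str) -> list[str]:
    """Extract labels like 'Response C' from the 'FINAL RANKING:' section."""
    _, _, tail = text.partition("FINAL RANKING:")
    # Fallback: if the model skipped the header, scan the whole
    # evaluation for "Response X" patterns in order.
    section = tail if tail else text
    # Matches both numbered ("1. Response C") and plain ("Response C") lines.
    return re.findall(r"Response [A-Z]", section)

def calculate_aggregate_rankings(rankings: list[list[str]]) -> list[tuple[str, float, int]]:
    """Average rank position per label across all peer evaluations."""
    positions = defaultdict(list)
    for ranking in rankings:
        for pos, label in enumerate(ranking, start=1):
            positions[label].append(pos)
    # (label, average position, vote count), sorted best-first.
    aggregated = [(lbl, sum(p) / len(p), len(p)) for lbl, p in positions.items()]
    return sorted(aggregated, key=lambda t: t[1])
```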

storage.py

  • JSON-based conversation storage in data/conversations/
  • Each conversation: {id, created_at, messages[]}
  • Assistant messages contain: {role, stage1, stage2, stage3}
  • Note: metadata (label_to_model, aggregate_rankings) is NOT persisted to storage, only returned via API
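
The persisted shape written out as a Python literal; field names come from the notes above, values are illustrative:

```python
# One file per conversation in data/conversations/<id>.json (sketch).
conversation = {
    "id": "c1a2b3",                             # illustrative
    "created_at": "2025-11-22T14:27:53-08:00",  # illustrative timestamp
    "messages": [
        {"role": "user", "content": "..."},
        {
            "role": "assistant",
            "stage1": [],  # individual model responses
            "stage2": [],  # raw evaluations + parsed rankings
            "stage3": {},  # chairman's synthesized answer
            # label_to_model and aggregate_rankings are NOT stored here;
            # that metadata is returned via the API only.
        },
    ],
}
```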

main.py

  • FastAPI app with CORS enabled for localhost:5173 and localhost:3000
  • POST /api/conversations/{id}/message returns metadata in addition to stages
  • Metadata includes: label_to_model mapping and aggregate_rankings
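
A sketch of the CORS setup and the endpoint's response shape; the middleware configuration is standard FastAPI, while the council function signatures inside the handler are assumptions:

```python
# backend/main.py (sketch; handler internals are illustrative)
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

from . import council

app = FastAPI()
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:5173", "http://localhost:3000"],
    allow_methods=["*"],
    allow_headers=["*"],
)

@app.post("/api/conversations/{conversation_id}/message")
async def send_message(conversation_id: str, payload: dict):
    question = payload["content"]  # request field name assumed
    stage1 = await council.stage1_collect_responses(question)
    rankings, label_to_model = await council.stage2_collect_rankings(question, stage1)
    stage3 = await council.stage3_synthesize_final(question, stage1, rankings)
    # Stages are persisted via storage.py; metadata is returned but NOT stored.
    return {
        "stage1": stage1,
        "stage2": rankings,
        "stage3": stage3,
        "metadata": {
            "label_to_model": label_to_model,
            "aggregate_rankings": council.calculate_aggregate_rankings(
                [r["parsed_ranking"] for r in rankings]
            ),
        },
    }
```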

Frontend Structure (frontend/src/)

App.jsx

  • Main orchestration: manages conversations list and current conversation
  • Handles message sending and metadata storage
  • Important: metadata is stored in the UI state for display but not persisted to backend JSON

components/ChatInterface.jsx

  • Multiline textarea (3 rows, resizable)
  • Enter to send, Shift+Enter for new line
  • User messages wrapped in markdown-content class for padding

components/Stage1.jsx

  • Tab view of individual model responses
  • ReactMarkdown rendering with markdown-content wrapper

components/Stage2.jsx

  • Critical Feature: Tab view showing RAW evaluation text from each model
  • De-anonymization happens CLIENT-SIDE for display (models receive anonymous labels)
  • Shows "Extracted Ranking" below each evaluation so users can validate parsing
  • Aggregate rankings shown with average position and vote count
  • Explanatory text clarifies that boldface model names are for readability only

components/Stage3.jsx

  • Final synthesized answer from chairman
  • Green-tinted background (#f0fff0) to highlight conclusion

Styling (*.css)

  • Light mode theme (not dark mode)
  • Primary color: #4a90e2 (blue)
  • Global markdown styling in index.css with .markdown-content class
  • 12px padding on all markdown content to prevent cluttered appearance

Key Design Decisions

Stage 2 Prompt Format

The Stage 2 prompt is very specific to ensure parseable output:

1. Evaluate each response individually first
2. Provide "FINAL RANKING:" header
3. Numbered list format: "1. Response C", "2. Response A", etc.
4. No additional text after ranking section

This strict format allows reliable parsing while still getting thoughtful evaluations.
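
A plausible rendering of that prompt as a Python template; the exact wording is illustrative, but the four requirements above are the load-bearing part:

```python
# Illustrative Stage 2 prompt template (not the verbatim source prompt).
STAGE2_PROMPT = """You are evaluating anonymized answers to a user's question.

Question: {question}

{anonymized_responses}

First, evaluate each response individually for accuracy and insight.
Then provide your ranking under the exact header below, best first:

FINAL RANKING:
1. Response <letter>
2. Response <letter>
...

Do not write anything after the ranking section."""
```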

De-anonymization Strategy

  • Models receive: "Response A", "Response B", etc.
  • Backend creates mapping: {"Response A": "openai/gpt-5.1", ...}
  • Frontend displays model names in bold for readability
  • Users see explanation that original evaluation used anonymous labels
  • This prevents bias while maintaining transparency
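
The anonymization round-trip in miniature (model identifiers and response text are illustrative):

```python
# Build anonymous labels for Stage 2 plus the mapping the frontend
# later uses for client-side de-anonymization (IDs illustrative).
from string import ascii_uppercase

responses_by_model = {
    "openai/gpt-5.1": "First answer text...",
    "anthropic/claude-sonnet-4.5": "Second answer text...",
}

label_to_model = {}
anonymized_blocks = []
for letter, (model, response) in zip(ascii_uppercase, responses_by_model.items()):
    label = f"Response {letter}"
    label_to_model[label] = model  # e.g. {"Response A": "openai/gpt-5.1"}
    anonymized_blocks.append(f"{label}:\n{response}")

# Models see only anonymized_blocks; the frontend uses label_to_model
# to render real model names (in bold) next to each evaluation.
```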

Error Handling Philosophy

  • Continue with successful responses if some models fail (graceful degradation)
  • Never fail the entire request due to single model failure
  • Log errors but don't expose to user unless all models fail

UI/UX Transparency

  • All raw outputs are inspectable via tabs
  • Parsed rankings shown below raw text for validation
  • Users can verify system's interpretation of model outputs
  • This builds trust and allows debugging of edge cases

Important Implementation Details

Relative Imports

All backend modules use relative imports (e.g., from .config import ...) not absolute imports. This is critical for Python's module system to work correctly when running as python -m backend.main.
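
For example:

```python
# backend/council.py -- relative imports resolve correctly when the
# package is launched as `python -m backend.main` from the project root.
from .config import COUNCIL_MODELS, CHAIRMAN_MODEL
from .openrouter import query_models_parallel
```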

Port Configuration

  • Backend: 8001 (changed from 8000 to avoid conflict)
  • Frontend: 5173 (Vite default)
  • Update both backend/main.py and frontend/src/api.js if changing either port
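
Assuming a standard uvicorn entrypoint (not confirmed in these notes), the backend side of that change looks like:

```python
# backend/main.py entrypoint (sketch; uvicorn invocation is assumed)
import uvicorn

if __name__ == "__main__":
    uvicorn.run("backend.main:app", port=8001, reload=True)
```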

Markdown Rendering

All ReactMarkdown components must be wrapped in <div className="markdown-content"> for proper spacing. This class is defined globally in index.css.

Model Configuration

Models are hardcoded in backend/config.py. The chairman can be the same model as a council member or a different one. The current default is Gemini as chairman, per user preference.

Common Gotchas

  1. Module Import Errors: Always run backend as python -m backend.main from project root, not from backend directory
  2. CORS Issues: Frontend must match allowed origins in main.py CORS middleware
  3. Ranking Parse Failures: If models don't follow format, fallback regex extracts any "Response X" patterns in order
  4. Missing Metadata: Metadata is ephemeral (not persisted), only available in API responses

Future Enhancement Ideas

  • Configurable council/chairman via UI instead of config file
  • Streaming responses instead of batch loading
  • Export conversations to markdown/PDF
  • Model performance analytics over time
  • Custom ranking criteria (not just accuracy/insight)
  • Support for reasoning models (o1, etc.) with special handling

Testing Notes

Use test_openrouter.py to verify API connectivity and test different model identifiers before adding to council. The script tests both streaming and non-streaming modes.

Data Flow Summary

User Query
    ↓
Stage 1: Parallel queries → [individual responses]
    ↓
Stage 2: Anonymize → Parallel ranking queries → [evaluations + parsed rankings]
    ↓
Aggregate Rankings Calculation → [sorted by avg position]
    ↓
Stage 3: Chairman synthesis with full context
    ↓
Return: {stage1, stage2, stage3, metadata}
    ↓
Frontend: Display with tabs + validation UI

The entire flow is async/parallel where possible to minimize latency.