# SwiftDBAI — Product Requirements Document

> **SwiftDBAI** is the umbrella name for AI-powered SQLite database tooling. v1 ships `SwiftDBAI` (chat + SQL engine). Future versions may add `SwiftDBAIMCP` (MCP server mode).

**Version:** 0.2 (Revised — post-pivot from SwiftDataAI)
**Date:** 2026-04-04
**Author:** Krishna Kumar

---

## 1. Problem Statement

Developers building apps with SQLite databases have no natural-language interface to query, explore, or mutate their data. Debugging, prototyping, and building AI-powered features all require hand-writing SQL — even for simple questions like "show me all overdue tasks" or "how many users signed up this week."

There is no drop-in Swift package that lets a user (or an LLM) **chat with any SQLite database** using plain English.

---

## 2. Vision

**SwiftDBAI** is a Swift package that gives any SQLite-backed app a conversational interface to its data. Developers embed it in minutes; end users ask questions and get answers from their own data.

The data layer is **all SQL via GRDB** — no SwiftData APIs, no `#Predicate`, no `FetchDescriptor`. SwiftDBAI works with **any SQLite database**, not just SwiftData stores. Schema discovery is automatic via `sqlite_master` introspection — zero configuration required. The developer passes their own GRDB `DatabasePool` or `DatabaseQueue`; SwiftDBAI never manages the connection lifecycle.

Built on [**AnyLanguageModel**](https://github.com/huggingface/AnyLanguageModel) from Hugging Face — a unified Swift LLM abstraction that supports OpenAI, Anthropic, Gemini, Ollama, CoreML, MLX, and llama.cpp through a single API. SwiftDBAI generates SQL from natural language, validates it against a developer-configured operation allowlist, executes it via GRDB, and renders results as text, data tables, or Swift Charts.

---

## 3. Target Users

| Persona | Need |
|---|---|
| **iOS/macOS Developer** | Drop-in chat UI + engine to add "talk to your data" features to any SQLite-backed app without building NLP pipelines |
| **AI/LLM App Builder** | SQL generation layer that lets an LLM read/write any SQLite database through validated, allowlisted operations |
| **Power User / Debugger** | In-app console to inspect and mutate SQLite data during development |

---

## 4. Goals & Non-Goals

### Goals
- Natural-language querying of any SQLite database via GRDB
- **LLM-agnostic** via [AnyLanguageModel](https://github.com/huggingface/AnyLanguageModel) — works with OpenAI, Anthropic, Gemini, Ollama, CoreML, MLX, llama.cpp out of the box
- Drop-in SwiftUI chat view that "just works" with zero configuration — provide a database path and a model
- Schema-aware: automatically introspects tables, columns, types, primary keys, foreign keys, and indexes from `sqlite_master` and PRAGMA statements
- Read **and** write support (SELECT, INSERT, UPDATE, DELETE) with developer-configured operation allowlist and confirmation guards
- All SQL validation via allowlist check — no SQL parser for safety, no `#Predicate` generation
- UI rendering: text summaries + scrollable data tables + Swift Charts (bar, line, pie) — all in v1
- Swift 6 concurrency safe, structured concurrency throughout (Swift 6 language mode, Swift 6.1 toolchain)
- Works on iOS 17+, macOS 14+, visionOS 1+

### Non-Goals (v1)
- Replacing or wrapping an ORM such as Core Data or SwiftData (SwiftDBAI works with raw SQLite only)
- Building a general-purpose chat framework (data-scoped only)
- Full SQL parsing for safety (v1 relies on the allowlist check; whether that is sufficient is tracked in Open Questions)
- Training or fine-tuning models
- Cloud sync of chat history
- Managing database connections (developer owns the GRDB connection)

---

## 5. Architecture Overview

```
┌──────────────────────────────────────────────────────────┐
│                       SwiftDBAI                          │
├──────────┬───────────┬──────────────┬────────────────────┤
│ Chat UI  │  Engine   │ Schema       │  SQL Pipeline      │
│ (SwiftUI)│           │ Introspector │                    │
└────┬─────┴─────┬─────┴──────┬───────┴───────┬────────────┘
     │           │            │               │
     ▼           ▼            ▼               ▼
  ChatView   ChatEngine   sqlite_master    SQLQueryParser
  DataChat   PromptBuilder  PRAGMA         OperationAllowlist
  View       TextSummary    table_info     MutationPolicy
             Renderer       foreign_keys   QueryValidator
                            index_list
┌──────────────────────────────────────────────────────────┐
│            Rendering Layer                               │
├──────────┬──────────────┬────────────────────────────────┤
│  Text    │  DataTable   │  Swift Charts                  │
│ Summary  │  (scrollable)│  (Bar, Line, Pie)              │
└──────────┴──────────────┴────────────────────────────────┘
┌──────────────────────────────────────────────────────────┐
│                    GRDB.swift 7.0+                        │
│              DatabasePool / DatabaseQueue                 │
└──────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────┐
│              AnyLanguageModel (HuggingFace)              │
├──────┬──────┬────────┬───────┬───────┬──────┬────────────┤
│OpenAI│Claude│ Gemini │Ollama │CoreML │ MLX  │ llama.cpp  │
└──────┴──────┴────────┴───────┴───────┴──────┴────────────┘
```

### 5.1 Core Modules

| Module | Responsibility |
|---|---|
| **SchemaIntrospector** | Queries `sqlite_master`, `PRAGMA table_info`, `PRAGMA foreign_key_list`, and `PRAGMA index_list` to auto-discover all tables, columns (name, type, nullability, defaults), primary keys, foreign keys, and indexes. Produces a `DatabaseSchema` model the LLM uses as context. Zero configuration — no annotations or model definitions needed. |
| **SQLQueryParser** | Extracts SQL from the raw LLM response, detects the operation type (SELECT/INSERT/UPDATE/DELETE), validates it against the `OperationAllowlist`, enforces `MutationPolicy` table restrictions, and flags destructive operations that require confirmation. |
| **OperationAllowlist** | Developer-configured set of permitted SQL operations. Presets: `.readOnly` (SELECT only, the default), `.standard` (SELECT + INSERT + UPDATE), `.unrestricted` (all including DELETE). |
| **MutationPolicy** | Builds on `OperationAllowlist` with per-table restrictions. Controls which mutations are allowed on which tables. DELETE requires confirmation by default. |
| **QueryValidator** | Extensible protocol for custom pre-execution validation rules (e.g., `TableAllowlistValidator`, `MaxRowLimitValidator`). Developers implement `QueryValidator` to add domain-specific checks. |
| **ChatEngine** | Orchestrates the full pipeline: schema introspection (once, lazily) -> system prompt with schema context -> LLM generates SQL -> `SQLQueryParser` validates -> GRDB executes -> `TextSummaryRenderer` summarizes -> response. Supports multi-turn conversation with configurable context window. |
| **PromptBuilder** | Constructs the LLM system prompt including the introspected schema description, allowlist rules, and optional developer-provided context. |
| **TextSummaryRenderer** | Uses the LLM to generate natural-language summaries of query results. Configurable max rows for summarization. |
| **ChatView / DataChatView** | Drop-in SwiftUI views. `DataChatView` is the zero-config entry point (database path + model). `ChatView` accepts a `ChatViewModel` for full control. Renders message bubbles, scrollable data tables, Swift Charts (bar/line/pie via `ChartDataDetector`), and error states. |
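
How `ChartDataDetector` decides that a result is chart-eligible is not specified in this document. As a rough illustration only, the sketch below assumes a simple two-column heuristic: a text label column plus a numeric value column. Date-like labels suggest a line chart, a small all-positive set suggests a pie chart, and everything else falls back to a bar chart. Every name here (`SketchValue`, `SketchChart`, `suggestChart`, `looksLikeISODate`) and the six-slice threshold are hypothetical.

```swift
/// Hypothetical cell value; the package's real result type may differ.
enum SketchValue { case text(String), number(Double), null }
enum SketchChart: Equatable { case bar, line, pie }

/// True when the first ten characters have the shape YYYY-MM-DD.
/// The calendar date itself is not validated.
func looksLikeISODate(_ text: String) -> Bool {
    let chars = Array(text.prefix(10))
    guard chars.count == 10 else { return false }
    return chars.enumerated().allSatisfy { pair in
        if pair.offset == 4 || pair.offset == 7 { return pair.element == "-" }
        return pair.element.isASCII && pair.element.isNumber
    }
}

/// Suggests a chart for rows shaped as (label, value), or nil if the
/// result does not fit that shape.
func suggestChart(rows: [[SketchValue]]) -> SketchChart? {
    // Need at least two data points, each with exactly two cells.
    guard rows.count >= 2, rows.allSatisfy({ $0.count == 2 }) else { return nil }
    var labels: [String] = []
    var values: [Double] = []
    for row in rows {
        guard case .text(let label) = row[0], case .number(let value) = row[1] else { return nil }
        labels.append(label)
        values.append(value)
    }
    // Date-like labels read best as a time series.
    if labels.allSatisfy(looksLikeISODate) { return .line }
    // A few non-negative slices with a non-zero total can be shown as a pie.
    if rows.count <= 6, values.allSatisfy({ $0 >= 0 }), values.contains(where: { $0 > 0 }) {
        return .pie
    }
    return .bar
}
```

A heuristic of this kind will misclassify some results, which is the concern raised under Open Questions.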

### 5.2 Data Flow

```
User types: "Show me all tasks due this week"
       │
       ▼
ChatEngine ensures schema is introspected (via SchemaIntrospector)
  - Queries sqlite_master, PRAGMA table_info, foreign_key_list, index_list
  - Caches DatabaseSchema for subsequent queries
       │
       ▼
PromptBuilder constructs system prompt with:
  - Full schema description (tables, columns, types, keys, indexes)
  - OperationAllowlist rules
  - Optional developer context
  - Conversation history (within context window)
       │
       ▼
LanguageModelSession.respond(to: userMessage)
  → AnyLanguageModel routes to configured provider (OpenAI / Anthropic / Ollama / ...)
       │
       ▼
LLM returns raw SQL: "SELECT * FROM tasks WHERE dueDate >= date('now', 'weekday 0', '-6 days') AND dueDate < date('now', 'weekday 0', '+1 day') ORDER BY dueDate ASC"
       │
       ▼
SQLQueryParser:
  1. Extracts SQL from LLM response (strips markdown fences, etc.)
  2. Detects operation type → SELECT
  3. Validates against OperationAllowlist → allowed
  4. Checks MutationPolicy table restrictions (if applicable)
  5. Runs custom QueryValidators
       │
       ▼
GRDB executes SQL via DatabasePool/DatabaseQueue
  → Returns rows as [[String: Value]] with column names
       │
       ▼
TextSummaryRenderer asks LLM to summarize results in natural language
ChartDataDetector checks if results are chart-eligible
       │
       ▼
ChatView renders: text summary + scrollable DataTable + Swift Charts (if applicable)
```
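
The first two `SQLQueryParser` steps in the flow above (extracting SQL from the reply and detecting the operation type) can be sketched roughly as follows. This is not the package's implementation: `SketchOperation`, `extractSQL`, and `detectOperation` are hypothetical names, and the sketch assumes the reply holds at most one fenced block whose fence line carries only an optional language tag.

```swift
import Foundation

/// Hypothetical stand-in for the package's operation type.
enum SketchOperation: String {
    case select = "SELECT", insert = "INSERT", update = "UPDATE", delete = "DELETE"
}

/// Pulls the SQL out of a raw LLM reply: if the reply contains a fenced
/// block, keep only its contents; then trim whitespace and one trailing ";".
func extractSQL(from reply: String) -> String {
    let fence = String(repeating: "`", count: 3)  // markdown code fence
    var text = reply
    if let open = text.range(of: fence) {
        // Drop everything up to and including the opening fence.
        text = String(text[open.upperBound...])
        // Drop the rest of the fence line when it is empty or a language tag.
        if let newline = text.firstIndex(of: "\n"),
           text[..<newline].allSatisfy({ $0.isLetter }) {
            text = String(text[text.index(after: newline)...])
        }
        // Keep only what precedes the closing fence.
        if let close = text.range(of: fence) {
            text = String(text[..<close.lowerBound])
        }
    }
    text = text.trimmingCharacters(in: .whitespacesAndNewlines)
    if text.hasSuffix(";") { text.removeLast() }
    return text
}

/// Classifies a statement by its first keyword. Returns nil for anything
/// else, including statements that start with WITH.
func detectOperation(in sql: String) -> SketchOperation? {
    let firstWord = sql
        .trimmingCharacters(in: .whitespacesAndNewlines)
        .prefix(while: { $0.isLetter })
    return SketchOperation(rawValue: firstWord.uppercased())
}
```

Classifying by the first keyword is cheap, but it cannot see a second statement appended after a semicolon. That limitation is what the SQL injection item under Open Questions is about.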

---

## 6. Key APIs (Implemented)

### 6.1 Setup (Minimal — Zero Config)

```swift
import SwiftUI
import SwiftDBAI
import AnyLanguageModel

struct ContentView: View {
    var body: some View {
        // Just a database path and a model — that's it
        DataChatView(
            databasePath: "/path/to/mydata.sqlite",
            model: OllamaLanguageModel(model: "llama3")
        )
    }
}
```

### 6.2 Choosing a Provider (via AnyLanguageModel)

```swift
import AnyLanguageModel

// Construct whichever provider you need. Distinct names keep the
// snippet valid as a single scope.

// OpenAI
let openAI = OpenAILanguageModel(apiKey: "sk-...", model: "gpt-4o")

// Anthropic
let anthropic = AnthropicLanguageModel(apiKey: "sk-ant-...", model: "claude-sonnet-4-20250514")

// Ollama (local)
let ollama = OllamaLanguageModel(model: "llama3")

// Gemini
let gemini = GeminiLanguageModel(apiKey: "...", model: "gemini-2.0-flash")

// Pass the chosen model to DataChatView with options
DataChatView(
    databasePath: "/path/to/db.sqlite",
    model: openAI,
    allowlist: .standard,
    additionalContext: "This database stores a recipe app's data."
)
```

### 6.3 Bringing Your Own GRDB Connection

```swift
import GRDB
import SwiftDBAI

// Developer manages their own connection
let dbPool = try DatabasePool(path: "/path/to/mydata.sqlite")

// Option A: DataChatView with existing connection
DataChatView(
    database: dbPool,
    model: model,
    allowlist: .readOnly
)

// Option B: Headless / programmatic use via ChatEngine
let engine = ChatEngine(
    database: dbPool,
    model: model,
    allowlist: .standard
)

let response = try await engine.send("How many tasks are overdue?")
print(response.summary)     // "You have 12 overdue tasks."
print(response.sql)         // "SELECT COUNT(*) FROM tasks WHERE dueDate < date('now')"
print(response.queryResult) // QueryResult with columns, rows, execution time
```

### 6.4 Schema Introspection (Auto — Zero Config)

```swift
// Schema is introspected automatically on first query.
// Or pre-warm it explicitly:
let schema = try await engine.prepareSchema()

// schema.tableNames → ["tasks", "projects", "users"]
// schema.tables["tasks"]?.columns → [ColumnSchema(name: "id", type: "INTEGER", isPrimaryKey: true), ...]
// schema.tables["tasks"]?.foreignKeys → [ForeignKeySchema(fromColumn: "projectId", toTable: "projects", ...)]
// schema.schemaDescription → Compact text for LLM prompts

// No @Model annotations, no #Predicate, no FetchDescriptor.
// Just sqlite_master + PRAGMA introspection.
```
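
The exact format of `schemaDescription` is not shown in this document. As a hedged illustration of what a compact text form for LLM prompts could look like, the sketch below renders one line per table. `SketchColumn`, `SketchForeignKey`, `SketchTable`, and `describe` are hypothetical stand-ins, not the package's `ColumnSchema` or `ForeignKeySchema` types.

```swift
// Hypothetical, simplified schema model for illustration only.
struct SketchColumn {
    let name: String
    let type: String
    let isPrimaryKey: Bool
    let isNotNull: Bool
}

struct SketchForeignKey {
    let fromColumn: String
    let toTable: String
    let toColumn: String
}

struct SketchTable {
    let name: String
    let columns: [SketchColumn]
    let foreignKeys: [SketchForeignKey]
}

/// Renders each table as `name(col TYPE [flags], ..., FOREIGN KEY ...)`,
/// one table per line, so the whole schema fits in few prompt tokens.
func describe(_ tables: [SketchTable]) -> String {
    tables.map { table -> String in
        let columns = table.columns.map { column -> String in
            var parts = [column.name, column.type]
            if column.isPrimaryKey { parts.append("PRIMARY KEY") }
            if column.isNotNull { parts.append("NOT NULL") }
            return parts.joined(separator: " ")
        }
        let keys = table.foreignKeys.map { key -> String in
            "FOREIGN KEY (\(key.fromColumn)) REFERENCES \(key.toTable)(\(key.toColumn))"
        }
        return "\(table.name)(" + (columns + keys).joined(separator: ", ") + ")"
    }
    .joined(separator: "\n")
}

// Usage: one table with a primary key and a foreign key.
let tasks = SketchTable(
    name: "tasks",
    columns: [
        SketchColumn(name: "id", type: "INTEGER", isPrimaryKey: true, isNotNull: false),
        SketchColumn(name: "projectId", type: "INTEGER", isPrimaryKey: false, isNotNull: true)
    ],
    foreignKeys: [SketchForeignKey(fromColumn: "projectId", toTable: "projects", toColumn: "id")]
)
print(describe([tasks]))
```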

### 6.5 Operation Allowlist (Safety)

```swift
// Presets
let readOnly = OperationAllowlist.readOnly          // SELECT only (default)
let standard = OperationAllowlist.standard          // SELECT + INSERT + UPDATE
let unrestricted = OperationAllowlist.unrestricted  // All including DELETE

// Custom
let custom = OperationAllowlist([.select, .insert]) // Only SELECT and INSERT

// Pass to ChatEngine or DataChatView
let engine = ChatEngine(
    database: dbPool,
    model: model,
    allowlist: .standard
)
```
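
Conceptually, the allowlist check is a set-membership test on the detected operation type. The sketch below shows that idea under stated assumptions: `SketchOperation`, `SketchAllowlist`, `SketchDisallowed`, and `check` are hypothetical stand-ins that mirror the presets described above, and the real `OperationAllowlist` may be structured differently.

```swift
/// Hypothetical stand-ins; the package's real types may differ.
enum SketchOperation: Hashable, Sendable { case select, insert, update, delete }

struct SketchAllowlist: Sendable {
    let allowed: Set<SketchOperation>

    // Presets matching the three levels described in this section.
    static let readOnly = SketchAllowlist(allowed: [.select])
    static let standard = SketchAllowlist(allowed: [.select, .insert, .update])
    static let unrestricted = SketchAllowlist(allowed: [.select, .insert, .update, .delete])

    func permits(_ operation: SketchOperation) -> Bool {
        allowed.contains(operation)
    }
}

/// Error thrown when an operation is outside the allowlist.
struct SketchDisallowed: Error { let operation: SketchOperation }

/// Throws unless the operation type is in the allowlist. Only the
/// operation type is inspected; the SQL text itself is never parsed.
func check(_ operation: SketchOperation, against allowlist: SketchAllowlist) throws {
    guard allowlist.permits(operation) else {
        throw SketchDisallowed(operation: operation)
    }
}
```

Because the check sees only the operation type, its strength depends on how reliably that type is detected from the generated SQL.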

### 6.6 Mutation Policy (Table-Level Control)

```swift
// Read-only (default)
let readOnly = MutationPolicy.readOnly

// Allow INSERT and UPDATE on specific tables only
let restricted = MutationPolicy(
    allowedOperations: [.insert, .update],
    allowedTables: ["orders", "order_items"]
)

// Full access — DELETE requires confirmation by default
let full = MutationPolicy.unrestricted

let engine = ChatEngine(
    database: dbPool,
    model: model,
    mutationPolicy: restricted
)
```
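
One way to picture the per-table decision is a function from (operation, table) to allow, deny, or ask for confirmation. The sketch below is an assumption-laden illustration, not the package's `MutationPolicy`: all `Sketch…` names are hypothetical, SELECT is assumed to bypass the mutation policy, and the target table is assumed to have been identified already.

```swift
enum SketchOperation: Hashable, Sendable { case select, insert, update, delete }

enum SketchDecision: Equatable { case allow, requireConfirmation, deny }

/// Hypothetical model of a per-table mutation policy.
struct SketchMutationPolicy: Sendable {
    var allowedOperations: Set<SketchOperation>
    var allowedTables: Set<String>?                      // nil means every table
    var confirmBefore: Set<SketchOperation> = [.delete]  // destructive by default

    func decide(_ operation: SketchOperation, on table: String) -> SketchDecision {
        // Assumption: reads are not governed by the mutation policy.
        if operation == .select { return .allow }
        guard allowedOperations.contains(operation) else { return .deny }
        // SQLite table names are case-insensitive, so compare lowercased.
        if let tables = allowedTables,
           !tables.contains(where: { $0.lowercased() == table.lowercased() }) {
            return .deny
        }
        return confirmBefore.contains(operation) ? .requireConfirmation : .allow
    }
}

// Usage: mirrors the `restricted` policy shown above.
let restrictedSketch = SketchMutationPolicy(
    allowedOperations: [.insert, .update],
    allowedTables: ["orders", "order_items"]
)
let fullSketch = SketchMutationPolicy(
    allowedOperations: [.insert, .update, .delete],
    allowedTables: nil
)
```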

### 6.7 Custom Query Validators

```swift
import Foundation  // for String.range(of:options:)

// Built-in: restrict queries to specific tables
let tableValidator = TableAllowlistValidator(
    allowedTables: ["tasks", "projects"]
)

// Built-in: enforce row limits on SELECT queries
let limitValidator = MaxRowLimitValidator(maxRows: 1000)

// Custom: implement QueryValidator protocol
struct NoJoinValidator: QueryValidator {
    func validate(sql: String, operation: SQLOperation) throws {
        // Match JOIN as a whole word so identifiers such as "joined_at" pass.
        // A JOIN inside a string literal is still rejected.
        if sql.range(of: #"\bJOIN\b"#, options: [.regularExpression, .caseInsensitive]) != nil {
            throw QueryValidationError.rejected("JOIN queries are not allowed.")
        }
    }
}

let config = ChatEngineConfiguration(
    validators: [tableValidator, limitValidator, NoJoinValidator()]
)

let engine = ChatEngine(
    database: dbPool,
    model: model,
    allowlist: .readOnly,
    configuration: config
)
```

### 6.8 Tool Execution Delegate (Destructive Operation Confirmation)

```swift
let engine = ChatEngine(
    database: dbPool,
    model: model,
    allowlist: .unrestricted,
    delegate: MyDelegate()
)

actor MyDelegate: ToolExecutionDelegate {
    func confirmDestructiveOperation(_ context: DestructiveOperationContext) async -> Bool {
        // Show confirmation UI, inspect context.sql, context.targetTable, etc.
        return true  // or false to reject
    }

    func willExecuteSQL(_ sql: String, classification: SQLClassification) async {
        // Observe before execution
    }

    func didExecuteSQL(_ sql: String, success: Bool) async {
        // Observe after execution
    }
}
```

---

## 7. Feature Requirements

### P0 — Must Have (v1.0) — All Implemented

| # | Feature | Description | Status |
|---|---|---|---|
| F1 | **Schema Discovery** | Auto-introspect all tables, columns (name, type, nullability, defaults), primary keys, foreign keys, and indexes from `sqlite_master` and PRAGMA statements. Zero config — no annotations needed. | Done |
| F2 | **Natural Language to SQL** | Convert NL queries to SQL via LLM. The LLM generates raw SQL; no `#Predicate` or `FetchDescriptor` — pure SQL throughout. | Done |
| F3 | **Result Rendering — Text** | `TextSummaryRenderer` uses the LLM to produce natural-language summaries of query results. | Done |
| F4 | **Result Rendering — Data Tables** | `ScrollableDataTableView` renders query results as scrollable, structured tables in SwiftUI. | Done |
| F5 | **Result Rendering — Swift Charts** | `ChartDataDetector` auto-detects chart-eligible results. `BarChartView`, `LineChartView`, `PieChartView` render via Swift Charts. | Done |
| F6 | **Drop-in ChatView** | `DataChatView` (zero-config: path + model) and `ChatView` (full control via `ChatViewModel`). Message bubbles, loading states, error display. | Done |
| F7 | **AnyLanguageModel Integration** | Uses HuggingFace's AnyLanguageModel for the LLM layer. `LanguageModelSession` for SQL generation and result summarization. | Done |
| F8 | **SQL Safety — Operation Allowlist** | `OperationAllowlist` with presets (`.readOnly`, `.standard`, `.unrestricted`) and custom sets. Allowlist check only — no SQL parser for safety. | Done |
| F9 | **SQL Safety — Mutation Policy** | `MutationPolicy` adds per-table restrictions on top of the allowlist. DELETE requires confirmation by default. | Done |
| F10 | **SQL Safety — Custom Validators** | `QueryValidator` protocol with built-in `TableAllowlistValidator` and `MaxRowLimitValidator`. Extensible for domain-specific rules. | Done |
| F11 | **Mutation Support** | INSERT, UPDATE, DELETE via SQL with allowlist validation and optional confirmation via `ToolExecutionDelegate`. | Done |
| F12 | **Conversation Context** | Multi-turn support with configurable context window size. "Show overdue tasks" -> "Now sort them by priority" maintains history. | Done |
| F13 | **Error Handling** | Typed `SwiftDBAIError` enum covering schema introspection failures, empty schemas, invalid SQL, disallowed operations, confirmation required, database errors, LLM failures, and query timeouts. | Done |
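
For F12, the context window can be pictured as a bounded message buffer that drops the oldest entries first. The sketch below is a minimal illustration under that assumption. `SketchMessage` and `SketchHistory` are hypothetical names, and the package may well measure the window differently (for example in tokens rather than messages).

```swift
struct SketchMessage: Equatable {
    enum Role { case user, assistant }
    let role: Role
    let text: String
}

/// Keeps at most `maxMessages` of the most recent messages.
struct SketchHistory {
    let maxMessages: Int
    private(set) var messages: [SketchMessage] = []

    init(maxMessages: Int) {
        self.maxMessages = max(0, maxMessages)
    }

    mutating func append(_ message: SketchMessage) {
        messages.append(message)
        // Drop the oldest messages once the window is exceeded.
        if messages.count > maxMessages {
            messages.removeFirst(messages.count - maxMessages)
        }
    }
}

// Usage: a window of two keeps only the latest exchange.
var history = SketchHistory(maxMessages: 2)
history.append(SketchMessage(role: .user, text: "Show overdue tasks"))
history.append(SketchMessage(role: .assistant, text: "You have 12 overdue tasks."))
history.append(SketchMessage(role: .user, text: "Now sort them by priority"))
```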

### P1 — Should Have (v1.x)

| # | Feature | Description |
|---|---|---|
| F14 | **On-Device Providers** | Guide for using Ollama, CoreML, MLX, or llama.cpp via AnyLanguageModel for fully offline / privacy-sensitive deployments |
| F15 | **Chat History Persistence** | Optionally persist chat history to SQLite via GRDB |
| F16 | **Theming API** | Customize colors, fonts, bubble styles, dark/light mode in ChatView |
| F17 | **Streaming Responses** | Token-by-token display for cloud LLM providers |
| F18 | **Export Results** | Copy/share query results as CSV, JSON, or formatted text |

### P2 — Nice to Have (v2.0+)

| # | Feature | Description |
|---|---|---|
| F19 | **Voice Input** | Speech-to-text for hands-free data queries |
| F20 | **MCP Server Mode** | Expose any SQLite database as an MCP server so external LLM clients can query it |
| F21 | **Suggested Questions** | Auto-generate starter questions based on introspected schema |
| F22 | **Audit Log** | Log all mutations with timestamp, before/after values |
| F23 | **Multi-Database** | Support querying across multiple SQLite databases simultaneously |

---

## 8. Privacy & Security

| Concern | Approach |
|---|---|
| **Provider choice is yours** | Use Ollama or a self-hosted model to keep data off third-party servers |
| **No telemetry** | The package collects nothing |
| **API key handling** | Cloud provider keys are never persisted by the kit; developer is responsible for secure storage |
| **SQL safety** | Developer-configured `OperationAllowlist` controls which SQL operations SwiftDBAI will execute. The check looks at the operation type only; it does not parse SQL structure or attempt injection prevention. The developer is responsible for choosing an appropriate allowlist level. |
| **Mutation safety** | `MutationPolicy` provides per-table restrictions. DELETE requires explicit confirmation by default via `ToolExecutionDelegate`. |
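| **Data handling** | Query results are held in memory and are not written to disk by SwiftDBAI. The schema description and the result rows passed to `TextSummaryRenderer` are sent to the configured LLM provider, so with a cloud provider that data leaves the device. Use a local provider (e.g., Ollama) to keep it on-device. |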
| **Connection ownership** | When the developer passes a GRDB `DatabasePool`/`DatabaseQueue`, SwiftDBAI uses it as-is and never closes or migrates it. The `databasePath` convenience initializers necessarily open a connection on the developer's behalf. |

---

## 9. Technical Constraints

- **Swift Package Manager** only (no CocoaPods/Carthage)
- **Minimum deployments:** iOS 17.0, macOS 14.0, visionOS 1.0
- **Swift 6 language mode** (Swift 6.1 toolchain) with strict concurrency checking
- **Dependencies:** GRDB.swift 7.0+ and AnyLanguageModel (branch: main)
- **No UIKit dependency** — pure SwiftUI for the view layer
- **No SwiftData dependency** — pure GRDB/SQL throughout. Works with any SQLite database regardless of how it was created.
- **No Core Data dependency** — no ORM layer of any kind
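
Taken together, these constraints imply a package manifest along the following lines. This is a sketch rather than the project's actual `Package.swift`: the single-target layout and the target name are assumptions, while the platform versions and dependency requirements come from the list above.

```swift
// swift-tools-version: 6.1
import PackageDescription

let package = Package(
    name: "SwiftDBAI",
    // Minimum deployments from the constraints above.
    platforms: [.iOS(.v17), .macOS(.v14), .visionOS(.v1)],
    products: [
        .library(name: "SwiftDBAI", targets: ["SwiftDBAI"])
    ],
    dependencies: [
        .package(url: "https://github.com/groue/GRDB.swift", from: "7.0.0"),
        // Tracks a moving branch; see the maturity item under Open Questions.
        .package(url: "https://github.com/huggingface/AnyLanguageModel", branch: "main")
    ],
    targets: [
        .target(
            name: "SwiftDBAI",  // assumed single library target
            dependencies: [
                .product(name: "GRDB", package: "GRDB.swift"),
                .product(name: "AnyLanguageModel", package: "AnyLanguageModel")
            ]
        )
    ],
    // Swift 6 language mode enables strict concurrency checking.
    swiftLanguageModes: [.v6]
)
```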

---

## 10. Implementation Status

| Metric | Current |
|---|---|
| Source files | 30 |
| Test files | 19 |
| Tests passing | 352 |
| Swift language mode | 6 (Swift 6.1 toolchain) |
| Dependencies | GRDB.swift 7.0+, AnyLanguageModel |

---

## 11. Success Metrics

| Metric | Target |
|---|---|
| Integration time | < 5 minutes for basic "chat with my data" — provide a database path and a model |
| Query accuracy | > 90% of common queries (SELECT with filters, sorting, aggregates) produce correct SQL on first attempt |
| Latency (kit overhead) | < 500 ms for schema introspection + SQL validation on a typical 20-table database (excludes LLM response time) |
| Package size | < 2 MB added to app binary (excluding LLM model weights) |
| Crash rate | 0 crashes from kit code in production |

---

## 12. Open Questions

1. **AnyLanguageModel maturity**: The library is relatively new, so we need to track API stability and pin to a specific version or commit. What's our fallback if breaking changes land? (The dependency currently tracks `branch: main`, which is not a pin.)
2. **SQL injection surface** — The allowlist check validates operation type but does not parse SQL structure. Should we add a lightweight SQL tokenizer for additional safety, or is the allowlist sufficient given the LLM is the only SQL author?
3. **Schema change detection**: `SchemaIntrospector` caches the schema after first introspection. If the database schema changes at runtime (migrations, etc.), the cache becomes stale. Should we check `PRAGMA schema_version` before each query, or add a manual invalidation API?
4. **Large schema handling** — For databases with many tables (100+), the schema description in the LLM system prompt may be very large. Should we add table filtering or relevance ranking?
5. **Chart auto-detection accuracy** — `ChartDataDetector` heuristically determines if results are chart-eligible. How do we handle false positives/negatives?
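
For question 3, one possible shape of a version-keyed cache is sketched below. It is only an illustration of the idea. `SketchSchemaCache` is a hypothetical type, and the two closures stand in for reading SQLite's `PRAGMA schema_version` and for running the introspector, neither of which is implemented here.

```swift
/// Caches a schema together with the version it was read at.
struct SketchSchemaCache<Schema> {
    private var cached: (version: Int, schema: Schema)?

    init() {}

    /// Returns the cached schema while the version is unchanged; otherwise
    /// re-runs introspection and stores the fresh result.
    mutating func schema(
        currentVersion: () throws -> Int,
        introspect: () throws -> Schema
    ) rethrows -> Schema {
        let version = try currentVersion()
        if let cached, cached.version == version {
            return cached.schema
        }
        let fresh = try introspect()
        cached = (version: version, schema: fresh)
        return fresh
    }
}

// Usage: introspection runs again only when the version changes.
var cache = SketchSchemaCache<[String]>()
var introspectionRuns = 0
let first = cache.schema(currentVersion: { 1 }, introspect: { introspectionRuns += 1; return ["tasks"] })
let second = cache.schema(currentVersion: { 1 }, introspect: { introspectionRuns += 1; return ["tasks"] })
let third = cache.schema(currentVersion: { 2 }, introspect: { introspectionRuns += 1; return ["tasks", "projects"] })
```

The trade-off is one extra lightweight read per query in exchange for never serving a stale schema. A manual invalidation API avoids that read but relies on the developer calling it.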

---

## 13. Milestones

| Milestone | Scope | Status |
|---|---|---|
| **M1: Foundation** | SchemaIntrospector + SQLQueryParser + headless ChatEngine | Done |
| **M2: Safety** | OperationAllowlist + MutationPolicy + QueryValidator + ToolExecutionDelegate | Done |
| **M3: Chat UI** | DataChatView + ChatView + ChatViewModel + MessageBubbleView + ErrorMessageView | Done |
| **M4: Rendering** | TextSummaryRenderer + ScrollableDataTableView + ChartDataDetector + Bar/Line/Pie charts | Done |
| **M5: Multi-turn** | ConversationHistory + context window + PromptBuilder with history | Done |
| **M6: Polish & Ship** | Error handling (SwiftDBAIError), 352 tests, documentation | Done |

---

## 14. References

- [GRDB.swift](https://github.com/groue/GRDB.swift) — SQLite toolkit for Swift
- [AnyLanguageModel (HuggingFace)](https://github.com/huggingface/AnyLanguageModel) — Unified Swift LLM abstraction
- [Swift Charts](https://developer.apple.com/documentation/charts) — Apple's declarative charting framework
- [Model Context Protocol (MCP)](https://modelcontextprotocol.io) — For future MCP server mode
- [Swift Package Manager](https://www.swift.org/documentation/package-manager/)
- [SQLite PRAGMA Statements](https://www.sqlite.org/pragma.html) — Used for schema introspection