Krishna Kumar b1724fe7ca Initial implementation of SwiftDBAI

Chat with any SQLite database using natural language. Built on
AnyLanguageModel (HuggingFace) for LLM-agnostic provider support
and GRDB for SQLite access.

Core features:
- Auto schema introspection from sqlite_master (zero config)
- NL → SQL generation via any AnyLanguageModel provider
- Three rendering modes: text summary, data table, Swift Charts
- Drop-in DataChatView (SwiftUI) and headless ChatEngine
- Operation allowlist with read-only default
- Mutation policy with per-table control
- ToolExecutionDelegate for destructive operation confirmation
- Multi-turn conversation context
- 352 tests across 24 suites, all passing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-04 09:30:56 -05:00

SwiftDBAI — Product Requirements Document

SwiftDBAI is the umbrella name for AI-powered SQLite database tooling. v1 ships SwiftDBAI (chat + SQL engine). Future versions may add SwiftDBAIMCP (MCP server mode).

Version: 0.2 (Revised, post-pivot from SwiftDataAI)
Date: 2026-04-04
Author: Krishna Kumar

1. Problem Statement

Developers building apps with SQLite databases have no natural-language interface to query, explore, or mutate their data. Debugging, prototyping, and building AI-powered features all require hand-writing SQL — even for simple questions like "show me all overdue tasks" or "how many users signed up this week."

There is no drop-in Swift package that lets a user (or an LLM) chat with any SQLite database using plain English.

2. Vision

SwiftDBAI is a Swift package that gives any SQLite-backed app a conversational interface to its data. Developers embed it in minutes; end users ask questions and get answers from their own data.

The data layer is all SQL via GRDB — no SwiftData APIs, no #Predicate, no FetchDescriptor. SwiftDBAI works with any SQLite database, not just SwiftData stores. Schema discovery is automatic via sqlite_master introspection — zero configuration required. The developer passes their own GRDB DatabasePool or DatabaseQueue; SwiftDBAI never manages the connection lifecycle.

Built on AnyLanguageModel from Hugging Face — a unified Swift LLM abstraction that supports OpenAI, Anthropic, Gemini, Ollama, CoreML, MLX, and llama.cpp through a single API. SwiftDBAI generates SQL from natural language, validates it against a developer-configured operation allowlist, executes it via GRDB, and renders results as text, data tables, or Swift Charts.

3. Target Users

Persona	Need
iOS/macOS Developer	Drop-in chat UI + engine to add "talk to your data" features to any SQLite-backed app without building NLP pipelines
AI/LLM App Builder	SQL generation layer that lets an LLM read/write any SQLite database through validated, allowlisted operations
Power User / Debugger	In-app console to inspect and mutate SQLite data during development

4. Goals & Non-Goals

Goals

Natural-language querying of any SQLite database via GRDB
LLM-agnostic via AnyLanguageModel — works with OpenAI, Anthropic, Gemini, Ollama, CoreML, MLX, llama.cpp out of the box
Drop-in SwiftUI chat view that "just works" with zero configuration — provide a database path and a model
Schema-aware — automatically introspects tables, columns, types, primary keys, foreign keys, and indexes from sqlite_master
Read and write support (SELECT, INSERT, UPDATE, DELETE) with developer-configured operation allowlist and confirmation guards
All SQL validation via allowlist check — no SQL parser for safety, no #Predicate generation
UI rendering: text summaries + scrollable data tables + Swift Charts (bar, line, pie) — all in v1
Swift 6 concurrency safe, with structured concurrency throughout (Swift 6 language mode, Swift 6.1 toolchain)
Works on iOS 17+, macOS 14+, visionOS 1+

Non-Goals (v1)

Replacing or wrapping an ORM such as Core Data or SwiftData (SwiftDBAI is not tied to any ORM and works with raw SQLite)
Building a general-purpose chat framework (data-scoped only)
Full SQL parsing for safety (allowlist check is sufficient)
Training or fine-tuning models
Cloud sync of chat history
Managing database connections (developer owns the GRDB connection)

5. Architecture Overview

┌──────────────────────────────────────────────────────────┐
│                       SwiftDBAI                          │
├──────────┬───────────┬──────────────┬────────────────────┤
│ Chat UI  │  Engine   │ Schema       │  SQL Pipeline      │
│ (SwiftUI)│           │ Introspector │                    │
└────┬─────┴─────┬─────┴──────┬───────┴───────┬────────────┘
     │           │            │               │
     ▼           ▼            ▼               ▼
  ChatView   ChatEngine   sqlite_master    SQLQueryParser
  DataChat   PromptBuilder  PRAGMA         OperationAllowlist
  View       TextSummary    table_info     MutationPolicy
             Renderer       foreign_keys   QueryValidator
                            index_list
┌──────────────────────────────────────────────────────────┐
│            Rendering Layer                               │
├──────────┬──────────────┬────────────────────────────────┤
│  Text    │  DataTable   │  Swift Charts                  │
│ Summary  │  (scrollable)│  (Bar, Line, Pie)              │
└──────────┴──────────────┴────────────────────────────────┘
┌──────────────────────────────────────────────────────────┐
│                    GRDB.swift 7.0+                        │
│              DatabasePool / DatabaseQueue                 │
└──────────────────────────────────────────────────────────┘
┌──────────────────────────────────────────────────────────┐
│              AnyLanguageModel (HuggingFace)              │
├──────┬──────┬────────┬───────┬───────┬──────┬────────────┤
│OpenAI│Claude│ Gemini │Ollama │CoreML │ MLX  │ llama.cpp  │
└──────┴──────┴────────┴───────┴───────┴──────┴────────────┘

5.1 Core Modules

Module	Responsibility
SchemaIntrospector	Queries `sqlite_master`, `PRAGMA table_info`, `PRAGMA foreign_key_list`, and `PRAGMA index_list` to auto-discover all tables, columns (name, type, nullability, defaults), primary keys, foreign keys, and indexes. Produces a `DatabaseSchema` model the LLM uses as context. Zero configuration — no annotations or model definitions needed.
SQLQueryParser	Extracts SQL from the raw LLM response, detects the operation type (SELECT/INSERT/UPDATE/DELETE), validates it against the `OperationAllowlist`, enforces `MutationPolicy` table restrictions, and flags destructive operations that require confirmation.
OperationAllowlist	Developer-configured set of permitted SQL operations. Presets: `.readOnly` (SELECT only, the default), `.standard` (SELECT + INSERT + UPDATE), `.unrestricted` (all including DELETE).
MutationPolicy	Builds on `OperationAllowlist` with per-table restrictions. Controls which mutations are allowed on which tables. DELETE requires confirmation by default.
QueryValidator	Extensible protocol for custom pre-execution validation rules (e.g., `TableAllowlistValidator`, `MaxRowLimitValidator`). Developers implement `QueryValidator` to add domain-specific checks.
ChatEngine	Orchestrates the full pipeline: schema introspection (once, lazily) -> system prompt with schema context -> LLM generates SQL -> `SQLQueryParser` validates -> GRDB executes -> `TextSummaryRenderer` summarizes -> response. Supports multi-turn conversation with configurable context window.
PromptBuilder	Constructs the LLM system prompt including the introspected schema description, allowlist rules, and optional developer-provided context.
TextSummaryRenderer	Uses the LLM to generate natural-language summaries of query results. Configurable max rows for summarization.
ChatView / DataChatView	Drop-in SwiftUI views. `DataChatView` is the zero-config entry point (database path + model). `ChatView` accepts a `ChatViewModel` for full control. Renders message bubbles, scrollable data tables, Swift Charts (bar/line/pie via `ChartDataDetector`), and error states.
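
As a rough illustration of the allowlist check described for `SQLQueryParser` and `OperationAllowlist`, the sketch below detects the operation from the first keyword and tests it against a set of permitted operations. The type and function names (`SQLOperationSketch`, `AllowlistSketch`, `detectOperation`) are hypothetical stand-ins, not the package's actual API, and the keyword detection is deliberately naive.

```swift
import Foundation

// Hypothetical stand-ins for the package's types; names are illustrative only.
enum SQLOperationSketch: String, CaseIterable {
    case select = "SELECT"
    case insert = "INSERT"
    case update = "UPDATE"
    case delete = "DELETE"
}

struct AllowlistSketch {
    let permitted: Set<SQLOperationSketch>

    // Presets mirroring the three levels named in the table above.
    static let readOnly = AllowlistSketch(permitted: [.select])
    static let standard = AllowlistSketch(permitted: [.select, .insert, .update])
    static let unrestricted = AllowlistSketch(permitted: Set(SQLOperationSketch.allCases))

    func permits(_ operation: SQLOperationSketch) -> Bool {
        permitted.contains(operation)
    }
}

/// Detects the operation from the first whitespace-delimited keyword.
/// Returns nil for anything unrecognized (DROP, PRAGMA, WITH, empty input),
/// which a caller should treat as "reject".
func detectOperation(in sql: String) -> SQLOperationSketch? {
    let trimmed = sql.trimmingCharacters(in: .whitespacesAndNewlines)
    guard let firstWord = trimmed.split(whereSeparator: { $0.isWhitespace }).first else {
        return nil
    }
    return SQLOperationSketch(rawValue: firstWord.uppercased())
}
```

Returning `nil` for unknown keywords keeps the default conservative: a statement is only executed when its operation is both recognized and permitted.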

5.2 Data Flow

User types: "Show me all tasks due this week"
       │
       ▼
ChatEngine ensures schema is introspected (via SchemaIntrospector)
  - Queries sqlite_master, PRAGMA table_info, foreign_key_list, index_list
  - Caches DatabaseSchema for subsequent queries
       │
       ▼
PromptBuilder constructs system prompt with:
  - Full schema description (tables, columns, types, keys, indexes)
  - OperationAllowlist rules
  - Optional developer context
  - Conversation history (within context window)
       │
       ▼
LanguageModelSession.respond(to: userMessage)
  → AnyLanguageModel routes to configured provider (OpenAI / Anthropic / Ollama / ...)
       │
       ▼
LLM returns raw SQL: "SELECT * FROM tasks WHERE dueDate >= date('now', 'weekday 0', '-7 days') ORDER BY dueDate ASC"
       │
       ▼
SQLQueryParser:
  1. Extracts SQL from LLM response (strips markdown fences, etc.)
  2. Detects operation type → SELECT
  3. Validates against OperationAllowlist → allowed
  4. Checks MutationPolicy table restrictions (if applicable)
  5. Runs custom QueryValidators
       │
       ▼
GRDB executes SQL via DatabasePool/DatabaseQueue
  → Returns rows as [[String: Value]] with column names
       │
       ▼
TextSummaryRenderer asks LLM to summarize results in natural language
ChartDataDetector checks if results are chart-eligible
       │
       ▼
ChatView renders: text summary + scrollable DataTable + Swift Charts (if applicable)
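
Step 1 of the `SQLQueryParser` stage (extracting SQL from the LLM response) could look roughly like the following. This is a minimal sketch assuming the model either returns bare SQL or wraps it in a single Markdown fence; `extractSQL(from:)` is a hypothetical helper, not the package's actual function.

```swift
import Foundation

/// Pulls the SQL text out of a raw model reply (hypothetical helper).
func extractSQL(from reply: String) -> String {
    var text = reply.trimmingCharacters(in: .whitespacesAndNewlines)

    // A Markdown code fence is three backticks.
    let fence = String(repeating: "`", count: 3)

    // If the reply contains a fenced block, keep only what is inside the first one.
    if let open = text.range(of: fence),
       let close = text.range(of: fence, range: open.upperBound..<text.endIndex) {
        var inner = String(text[open.upperBound..<close.lowerBound])
        // Drop the rest of the opening fence line when it is empty or a known tag.
        if let newline = inner.firstIndex(of: "\n") {
            let tag = inner[inner.startIndex..<newline]
                .trimmingCharacters(in: .whitespaces)
                .lowercased()
            if ["", "sql", "sqlite"].contains(tag) {
                inner = String(inner[inner.index(after: newline)...])
            }
        }
        text = inner.trimmingCharacters(in: .whitespacesAndNewlines)
    }

    // Remove a single trailing semicolon so callers see one bare statement.
    if text.hasSuffix(";") {
        text.removeLast()
        text = text.trimmingCharacters(in: .whitespacesAndNewlines)
    }
    return text
}
```

Prose before or after the fence is discarded, so a reply such as "Here you go:" followed by a fenced query yields only the query.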

6. Key APIs (Implemented)

6.1 Setup (Minimal — Zero Config)

import SwiftUI
import SwiftDBAI
import AnyLanguageModel

struct ContentView: View {
    var body: some View {
        // Just a database path and a model — that's it
        DataChatView(
            databasePath: "/path/to/mydata.sqlite",
            model: OllamaLanguageModel(model: "llama3")
        )
    }
}

6.2 Choosing a Provider (via AnyLanguageModel)

import AnyLanguageModel

// Choose one provider. Distinct names avoid redeclaring `model` in one scope.

// OpenAI
let openAIModel = OpenAILanguageModel(apiKey: "sk-...", model: "gpt-4o")

// Anthropic
let anthropicModel = AnthropicLanguageModel(apiKey: "sk-ant-...", model: "claude-sonnet-4-20250514")

// Ollama (local)
let ollamaModel = OllamaLanguageModel(model: "llama3")

// Gemini
let geminiModel = GeminiLanguageModel(apiKey: "...", model: "gemini-2.0-flash")

// Pass the chosen model to DataChatView with options
DataChatView(
    databasePath: "/path/to/db.sqlite",
    model: ollamaModel,
    allowlist: .standard,
    additionalContext: "This database stores a recipe app's data."
)

6.3 Bringing Your Own GRDB Connection

import GRDB
import SwiftDBAI

// Developer manages their own connection
let dbPool = try DatabasePool(path: "/path/to/mydata.sqlite")

// Option A: DataChatView with existing connection
DataChatView(
    database: dbPool,
    model: model,
    allowlist: .readOnly
)

// Option B: Headless / programmatic use via ChatEngine
let engine = ChatEngine(
    database: dbPool,
    model: model,
    allowlist: .standard
)

let response = try await engine.send("How many tasks are overdue?")
print(response.summary)     // "You have 12 overdue tasks."
print(response.sql)         // "SELECT COUNT(*) FROM tasks WHERE dueDate < date('now')"
print(response.queryResult) // QueryResult with columns, rows, execution time

6.4 Schema Introspection (Auto — Zero Config)

// Schema is introspected automatically on first query.
// Or pre-warm it explicitly:
let schema = try await engine.prepareSchema()

// schema.tableNames → ["tasks", "projects", "users"]
// schema.tables["tasks"]?.columns → [ColumnSchema(name: "id", type: "INTEGER", isPrimaryKey: true), ...]
// schema.tables["tasks"]?.foreignKeys → [ForeignKeySchema(fromColumn: "projectId", toTable: "projects", ...)]
// schema.schemaDescription → Compact text for LLM prompts

// No @Model annotations, no #Predicate, no FetchDescriptor.
// Just sqlite_master + PRAGMA introspection.
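
To make the idea of a compact `schemaDescription` concrete, here is a small sketch of how introspected tables might be flattened into prompt text. The structs and the output format are assumptions for illustration; the package's real `DatabaseSchema` model and description format may differ.

```swift
// Hypothetical, simplified stand-ins for the introspected schema model.
struct ColumnSketch {
    let name: String
    let type: String
    var isPrimaryKey = false
    var isNotNull = false
}

struct ForeignKeySketch {
    let fromColumn: String
    let toTable: String
    let toColumn: String
}

struct TableSketch {
    let name: String
    let columns: [ColumnSketch]
    var foreignKeys: [ForeignKeySketch] = []
}

/// Renders one line per table, followed by one indented line per foreign key.
func describeSchema(_ tables: [TableSketch]) -> String {
    tables.map { table -> String in
        let columns = table.columns.map { column -> String in
            var parts = [column.name, column.type]
            if column.isPrimaryKey { parts.append("PRIMARY KEY") }
            if column.isNotNull { parts.append("NOT NULL") }
            return parts.joined(separator: " ")
        }
        var lines = ["\(table.name)(\(columns.joined(separator: ", ")))"]
        for key in table.foreignKeys {
            lines.append("  \(key.fromColumn) -> \(key.toTable).\(key.toColumn)")
        }
        return lines.joined(separator: "\n")
    }.joined(separator: "\n")
}
```

A terse format like this keeps the system prompt small, which matters for databases with many tables.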

6.5 Operation Allowlist (Safety)

// Presets
let readOnly = OperationAllowlist.readOnly          // SELECT only (default)
let standard = OperationAllowlist.standard          // SELECT + INSERT + UPDATE
let unrestricted = OperationAllowlist.unrestricted  // All including DELETE

// Custom
let custom = OperationAllowlist([.select, .insert]) // Only SELECT and INSERT

// Pass to ChatEngine or DataChatView
let engine = ChatEngine(
    database: dbPool,
    model: model,
    allowlist: .standard
)

6.6 Mutation Policy (Table-Level Control)

// Read-only (default)
let readOnly = MutationPolicy.readOnly

// Allow INSERT and UPDATE on specific tables only
let restricted = MutationPolicy(
    allowedOperations: [.insert, .update],
    allowedTables: ["orders", "order_items"]
)

// Full access — DELETE requires confirmation by default
let full = MutationPolicy.unrestricted

let engine = ChatEngine(
    database: dbPool,
    model: model,
    mutationPolicy: restricted
)
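
The per-table decision a mutation policy has to make can be sketched as below. `MutationPolicySketch`, `MutationKind`, and `Decision` are hypothetical names used only to illustrate the rule "operation allowed, table allowed, DELETE needs confirmation"; they are not the package's types.

```swift
// Hypothetical mutation kinds; SELECT is out of scope for a mutation policy.
enum MutationKind: Hashable { case insert, update, delete }

struct MutationPolicySketch {
    var allowedOperations: Set<MutationKind>
    var allowedTables: Set<String>?                 // nil means "any table"
    var confirmationRequired: Set<MutationKind> = [.delete]

    enum Decision: Equatable { case allow, needsConfirmation, deny }

    /// Decides whether `operation` may run against `table`.
    func decide(_ operation: MutationKind, on table: String) -> Decision {
        guard allowedOperations.contains(operation) else { return .deny }
        if let tables = allowedTables {
            // SQLite table names are case-insensitive, so compare lowercased.
            let wanted = table.lowercased()
            guard tables.contains(where: { $0.lowercased() == wanted }) else {
                return .deny
            }
        }
        return confirmationRequired.contains(operation) ? .needsConfirmation : .allow
    }
}
```

Modeling the outcome as three states rather than a Boolean lets the engine route DELETE through a confirmation step instead of rejecting it outright.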

6.7 Custom Query Validators

// Built-in: restrict queries to specific tables
let tableValidator = TableAllowlistValidator(
    allowedTables: ["tasks", "projects"]
)

// Built-in: enforce row limits on SELECT queries
let limitValidator = MaxRowLimitValidator(maxRows: 1000)

// Custom: implement QueryValidator protocol
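struct NoJoinValidator: QueryValidator {
    func validate(sql: String, operation: SQLOperation) throws {
        // Match JOIN as a whole word so identifiers such as "joined_at" pass.
        // Uses Foundation's regular-expression search.
        if sql.range(of: #"\bJOIN\b"#, options: [.regularExpression, .caseInsensitive]) != nil {
            throw QueryValidationError.rejected("JOIN queries are not allowed.")
        }
    }
}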

let config = ChatEngineConfiguration(
    validators: [tableValidator, limitValidator, NoJoinValidator()]
)

let engine = ChatEngine(
    database: dbPool,
    model: model,
    allowlist: .readOnly,
    configuration: config
)

6.8 Tool Execution Delegate (Destructive Operation Confirmation)

let engine = ChatEngine(
    database: dbPool,
    model: model,
    allowlist: .unrestricted,
    delegate: MyDelegate()
)

actor MyDelegate: ToolExecutionDelegate {
    func confirmDestructiveOperation(_ context: DestructiveOperationContext) async -> Bool {
        // Show confirmation UI, inspect context.sql, context.targetTable, etc.
        return true  // or false to reject
    }

    func willExecuteSQL(_ sql: String, classification: SQLClassification) async {
        // Observe before execution
    }

    func didExecuteSQL(_ sql: String, success: Bool) async {
        // Observe after execution
    }
}

7. Feature Requirements

P0 — Must Have (v1.0) — All Implemented

#	Feature	Description	Status
F1	Schema Discovery	Auto-introspect all tables, columns (name, type, nullability, defaults), primary keys, foreign keys, and indexes from `sqlite_master` and PRAGMA statements. Zero config — no annotations needed.	Done
F2	Natural Language to SQL	Convert NL queries to SQL via LLM. The LLM generates raw SQL; no `#Predicate` or `FetchDescriptor` — pure SQL throughout.	Done
F3	Result Rendering — Text	`TextSummaryRenderer` uses the LLM to produce natural-language summaries of query results.	Done
F4	Result Rendering — Data Tables	`ScrollableDataTableView` renders query results as scrollable, structured tables in SwiftUI.	Done
F5	Result Rendering — Swift Charts	`ChartDataDetector` auto-detects chart-eligible results. `BarChartView`, `LineChartView`, `PieChartView` render via Swift Charts.	Done
F6	Drop-in ChatView	`DataChatView` (zero-config: path + model) and `ChatView` (full control via `ChatViewModel`). Message bubbles, loading states, error display.	Done
F7	AnyLanguageModel Integration	Uses HuggingFace's AnyLanguageModel for the LLM layer. `LanguageModelSession` for SQL generation and result summarization.	Done
F8	SQL Safety — Operation Allowlist	`OperationAllowlist` with presets (`.readOnly`, `.standard`, `.unrestricted`) and custom sets. Allowlist check only — no SQL parser for safety.	Done
F9	SQL Safety — Mutation Policy	`MutationPolicy` adds per-table restrictions on top of the allowlist. DELETE requires confirmation by default.	Done
F10	SQL Safety — Custom Validators	`QueryValidator` protocol with built-in `TableAllowlistValidator` and `MaxRowLimitValidator`. Extensible for domain-specific rules.	Done
F11	Mutation Support	INSERT, UPDATE, DELETE via SQL with allowlist validation and optional confirmation via `ToolExecutionDelegate`.	Done
F12	Conversation Context	Multi-turn support with configurable context window size. "Show overdue tasks" -> "Now sort them by priority" maintains history.	Done
F13	Error Handling	Typed `SwiftDBAIError` enum covering schema introspection failures, empty schemas, invalid SQL, disallowed operations, confirmation required, database errors, LLM failures, and query timeouts.	Done
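
For F12, one simple way to bound multi-turn context is a sliding window that keeps only the most recent turns. The sketch below assumes a turn is a question plus the SQL generated for it; `TurnSketch` and `ConversationWindowSketch` are hypothetical names, and the real context window may be measured differently (for example in messages or tokens).

```swift
// Hypothetical record of one exchange.
struct TurnSketch: Equatable {
    let question: String
    let sql: String
}

struct ConversationWindowSketch {
    let maxTurns: Int
    private(set) var turns: [TurnSketch] = []

    init(maxTurns: Int) {
        // Negative sizes are clamped to zero (no history kept).
        self.maxTurns = max(0, maxTurns)
    }

    /// Appends a turn and drops the oldest ones beyond the window size.
    mutating func record(_ turn: TurnSketch) {
        turns.append(turn)
        if turns.count > maxTurns {
            turns.removeFirst(turns.count - maxTurns)
        }
    }

    /// History lines that a prompt builder could append to the system prompt.
    var promptLines: [String] {
        turns.flatMap { ["User: \($0.question)", "SQL: \($0.sql)"] }
    }
}
```

With the earlier SQL retained in the window, a follow-up such as "Now sort them by priority" can be resolved against the previous query.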

P1 — Should Have (v1.x)

#	Feature	Description
F14	On-Device Providers	Guide for using Ollama, CoreML, MLX, or llama.cpp via AnyLanguageModel for fully offline / privacy-sensitive deployments
F15	Chat History Persistence	Optionally persist chat history to SQLite via GRDB
F16	Theming API	Customize colors, fonts, bubble styles, dark/light mode in ChatView
F17	Streaming Responses	Token-by-token display for cloud LLM providers
F18	Export Results	Copy/share query results as CSV, JSON, or formatted text

P2 — Nice to Have (v2.0+)

#	Feature	Description
F19	Voice Input	Speech-to-text for hands-free data queries
F20	MCP Server Mode	Expose any SQLite database as an MCP server so external LLM clients can query it
F21	Suggested Questions	Auto-generate starter questions based on introspected schema
F22	Audit Log	Log all mutations with timestamp, before/after values
F23	Multi-Database	Support querying across multiple SQLite databases simultaneously

8. Privacy & Security

Concern	Approach
Provider choice is yours	Use Ollama or a self-hosted model to keep data off third-party servers
No telemetry	The package collects nothing
API key handling	Cloud provider keys are never persisted by the kit; developer is responsible for secure storage
SQL safety	The developer-configured `OperationAllowlist` controls which generated SQL operations may execute. Validation is an allowlist check on the operation type only; the package does not parse SQL structure to prevent injection. The developer is responsible for choosing an appropriate allowlist level.
Mutation safety	`MutationPolicy` provides per-table restrictions. DELETE requires explicit confirmation by default via `ToolExecutionDelegate`.
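Data stays in-process	Query results are held in memory and are not written to disk by the package. The schema description and the rows passed to `TextSummaryRenderer` (up to its configured maximum) are sent to the configured LLM provider, so choose a local provider such as Ollama when data must not leave the device.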
Connection ownership	Developer manages their own GRDB `DatabasePool`/`DatabaseQueue`. SwiftDBAI never opens, closes, or migrates the database on its own.

9. Technical Constraints

Swift Package Manager only (no CocoaPods/Carthage)
Minimum deployments: iOS 17.0, macOS 14.0, visionOS 1.0
Swift 6 language mode (Swift 6.1 toolchain) with strict concurrency checking
Dependencies: GRDB.swift 7.0+ and AnyLanguageModel (branch: main)
No UIKit dependency — pure SwiftUI for the view layer
No SwiftData dependency — pure GRDB/SQL throughout. Works with any SQLite database regardless of how it was created.
No Core Data dependency — no ORM layer of any kind

10. Implementation Status

Metric	Current
Source files	30
Test files	19
Tests passing	352
Swift language mode	6 (Swift 6.1 toolchain)
Dependencies	GRDB.swift 7.0+, AnyLanguageModel

11. Success Metrics

Metric	Target
Integration time	< 5 minutes for basic "chat with my data" — provide a database path and a model
Query accuracy	> 90% of common queries (SELECT with filters, sorting, aggregates) produce correct SQL on first attempt
Latency (kit overhead)	< 500ms for schema introspection + SQL validation on a typical 20-table database (excludes LLM response time)
Package size	< 2 MB added to app binary (excluding LLM model weights)
Crash rate	0 crashes from kit code in production

12. Open Questions

AnyLanguageModel maturity: The library is relatively new, so we need to track API stability and pin to a specific version or commit. What is our fallback if breaking changes land? (The dependency currently tracks branch: main, which is not a pin.)
SQL injection surface — The allowlist check validates operation type but does not parse SQL structure. Should we add a lightweight SQL tokenizer for additional safety, or is the allowlist sufficient given the LLM is the only SQL author?
Schema change detection — SchemaIntrospector caches the schema after first introspection. If the database schema changes at runtime (migrations, etc.), the cache becomes stale. Should we add a schema_version PRAGMA check or a manual invalidation API?
Large schema handling — For databases with many tables (100+), the schema description in the LLM system prompt may be very large. Should we add table filtering or relevance ranking?
Chart auto-detection accuracy — ChartDataDetector heuristically determines if results are chart-eligible. How do we handle false positives/negatives?
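
For the schema change detection question, one possible shape is a cache keyed by a version number, with a manual invalidation hook alongside it. This is only a sketch under assumptions: the `currentVersion` closure would in practice read `PRAGMA schema_version` through the developer's GRDB connection, `introspect` would run the full introspection, and `SchemaCacheSketch` is a hypothetical type, not part of the package.

```swift
/// Caches a schema value and reloads it when the reported version changes.
struct SchemaCacheSketch<Schema> {
    private var cached: (version: Int, schema: Schema)?

    init() {}

    /// Returns the cached schema when the version is unchanged,
    /// otherwise runs `introspect` and stores the fresh result.
    mutating func schema(
        currentVersion: () throws -> Int,
        introspect: () throws -> Schema
    ) rethrows -> Schema {
        let version = try currentVersion()
        if let hit = cached, hit.version == version {
            return hit.schema
        }
        let fresh = try introspect()
        cached = (version: version, schema: fresh)
        return fresh
    }

    /// Manual invalidation, for callers that know a migration just ran.
    mutating func invalidate() {
        cached = nil
    }
}
```

The trade-off is one extra cheap query per request in exchange for never prompting the LLM with a stale schema.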

13. Milestones

Milestone	Scope	Status
M1: Foundation	SchemaIntrospector + SQLQueryParser + headless ChatEngine	Done
M2: Safety	OperationAllowlist + MutationPolicy + QueryValidator + ToolExecutionDelegate	Done
M3: Chat UI	DataChatView + ChatView + ChatViewModel + MessageBubbleView + ErrorMessageView	Done
M4: Rendering	TextSummaryRenderer + ScrollableDataTableView + ChartDataDetector + Bar/Line/Pie charts	Done
M5: Multi-turn	ConversationHistory + context window + PromptBuilder with history	Done
M6: Polish & Ship	Error handling (SwiftDBAIError), 352 tests, documentation	Done

14. References

GRDB.swift — SQLite toolkit for Swift
AnyLanguageModel (HuggingFace) — Unified Swift LLM abstraction
Swift Charts — Apple's declarative charting framework
Model Context Protocol (MCP) — For future MCP server mode
Swift Package Manager
SQLite PRAGMA Statements — Used for schema introspection
