# Data Directory

This directory contains the SQLite databases used by the Watch Finished system.

## Database Schema Overview

```mermaid
erDiagram
    FILES ||--o{ TASKS : "processed-by"
    SETTINGS ||--o{ DATASETS : "configures"

    FILES {
        string dataset PK
        string input PK
        string output
        string status
        string date
    }
    TASKS {
        integer id PK
        string dataset
        string input
        string output
        string preset
        string status
        integer progress
        string created_at
        string updated_at
    }
    SETTINGS {
        string key PK
        string value
    }
    DATASETS {
        string name PK
        boolean enabled
        string destination
        string exts
        string ext
        string preset
        string clean
    }
```

## Files

- `database.db` - Main application database containing:
  - `files` table: Processed video files with metadata
  - `tasks` table: Video processing queue and task status
  - `settings` table: Application configuration and dataset settings
- `database.db.bak` - Backup of the main database (created during migrations)

## Database Schema

### Files Table

```sql
CREATE TABLE files (
    dataset TEXT,       -- Dataset name (e.g., 'movies', 'tvshows')
    input TEXT,         -- Original file path
    output TEXT,        -- Processed file path
    status TEXT,        -- Processing status ('pending', 'processing', 'success', 'failed')
    date TEXT,          -- ISO timestamp of last update
    PRIMARY KEY (dataset, input)
);
```

### Tasks Table

```sql
CREATE TABLE tasks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    dataset TEXT,       -- Target dataset
    input TEXT,         -- Input file path
    output TEXT,        -- Output file path
    preset TEXT,        -- HandBrake preset used
    status TEXT,        -- Task status
    progress INTEGER,   -- Processing progress (0-100)
    created_at TEXT,    -- Creation timestamp
    updated_at TEXT     -- Last update timestamp
);
```

### Settings Table

```sql
CREATE TABLE settings (
    key TEXT PRIMARY KEY,
    value TEXT          -- JSON-encoded setting value
);
```

## Dataset Configuration

Dataset settings are stored in the `settings` table with keys like:

- `datasets/kids` - Kids movies dataset configuration
- `datasets/pr0n` - Adult content dataset configuration
- `datasets/tvshows` - TV shows dataset configuration

Each dataset configuration includes:

- `enabled`: Whether the dataset is active for watching
- `destination`: Output directory for processed files
- `exts`: File extensions to process
- `ext`: Output file extension
- `preset`: HandBrake encoding preset
- `clean`: Filename cleaning rules

## Backup and Recovery

- Always back up `database.db` before major changes.
- The system creates `database.db.bak` during migrations.
- To restore from a backup: `cp database.db.bak database.db`

## Migration System

The application uses a database migration system to manage schema changes, so that changes can be versioned and applied consistently across environments.

### Migration Files

Migration files are stored in `data/migrations/` and are named with timestamps: `YYYY-MM-DDTHH-MM-SS_migration_name.sql`.

### Running Migrations

Migrations are applied automatically when the service starts. You can also run them manually:

```bash
# Check migration status
pnpm run migrate:status

# Apply pending migrations
pnpm run migrate:up

# Create a new migration
pnpm run migrate:create
```

### Creating Migrations

When you need to make schema changes:

1. Create a new migration file:

   ```bash
   pnpm run migrate:create add_new_table
   ```

2. Edit the generated SQL file in `data/migrations/` with your schema changes.

3. Test the migration by running it:

   ```bash
   pnpm run migrate:up
   ```

4. Commit both the migration file and any code changes that depend on the new schema.

### Migration Best Practices

- Never modify existing migration files after they've been committed.
- If you need to change a migration, create a new one that undoes and redoes the changes.
- Test migrations on a copy of production data before applying them to production.
- Keep migrations small and focused on a single change.
- Use descriptive names for migration files.
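## Examples

The composite primary key on `files` (`dataset`, `input`) means re-processing the same input within a dataset should update the existing row rather than insert a duplicate. A minimal sketch of that pattern using SQLite's `ON CONFLICT` upsert (Python's stdlib `sqlite3` is used here purely for illustration; the file paths and status values are hypothetical):

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stands in for data/database.db
db.execute("""CREATE TABLE files (
    dataset TEXT, input TEXT, output TEXT, status TEXT, date TEXT,
    PRIMARY KEY (dataset, input))""")

def record_file(dataset, input_path, output_path, status, date):
    # ON CONFLICT targets the (dataset, input) composite primary key,
    # so re-recording the same input updates its row in place.
    db.execute(
        """INSERT INTO files (dataset, input, output, status, date)
           VALUES (?, ?, ?, ?, ?)
           ON CONFLICT(dataset, input) DO UPDATE SET
               output = excluded.output,
               status = excluded.status,
               date = excluded.date""",
        (dataset, input_path, output_path, status, date),
    )

record_file("tvshows", "/in/a.mkv", "/out/a.mp4", "pending", "2024-01-01T00:00:00Z")
record_file("tvshows", "/in/a.mkv", "/out/a.mp4", "success", "2024-01-01T01:00:00Z")
rows = db.execute("SELECT status FROM files").fetchall()
print(rows)  # one row, with the updated status
```

`INSERT OR REPLACE` would also work, but `ON CONFLICT ... DO UPDATE` lets you update only the columns that changed.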
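The `tasks` table acts as the processing queue. One plausible way a worker could claim work (ordering by `created_at` and the specific status transitions are assumptions for illustration, not the application's actual code) is to select the oldest pending task and mark it `processing` inside a single transaction:

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stands in for data/database.db
db.execute("""CREATE TABLE tasks (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    dataset TEXT, input TEXT, output TEXT, preset TEXT,
    status TEXT, progress INTEGER, created_at TEXT, updated_at TEXT)""")
db.execute("INSERT INTO tasks (dataset, input, status, progress, created_at) "
           "VALUES ('movies', '/in/b.mkv', 'pending', 0, '2024-01-01T00:00:00Z')")

def claim_next_task(conn):
    # Select the oldest pending task and flip it to 'processing' within
    # one transaction, so two workers cannot claim the same row.
    with conn:
        row = conn.execute(
            "SELECT id FROM tasks WHERE status = 'pending' "
            "ORDER BY created_at LIMIT 1").fetchone()
        if row is None:
            return None
        conn.execute(
            "UPDATE tasks SET status = 'processing', "
            "updated_at = datetime('now') WHERE id = ?", (row[0],))
        return row[0]

task_id = claim_next_task(db)
status = db.execute("SELECT status FROM tasks WHERE id = ?", (task_id,)).fetchone()[0]
print(task_id, status)
```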
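Since `settings.value` is a JSON-encoded string, a dataset configuration can be round-tripped through the table like this (the concrete field values below are made up; the field names come from the dataset schema above):

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")  # stands in for data/database.db
db.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)")

# Hypothetical configuration values for the 'tvshows' dataset.
config = {
    "enabled": True,
    "destination": "/media/tvshows",
    "exts": [".mkv", ".avi"],
    "ext": ".mp4",
    "preset": "Fast 1080p30",
    "clean": "",
}

# Store the whole configuration as one JSON blob under its dataset key.
db.execute(
    "INSERT OR REPLACE INTO settings (key, value) VALUES (?, ?)",
    ("datasets/tvshows", json.dumps(config)),
)

raw = db.execute(
    "SELECT value FROM settings WHERE key = ?", ("datasets/tvshows",)
).fetchone()[0]
loaded = json.loads(raw)
print(loaded["destination"])
```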
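`cp` is fine while the service is stopped. If you ever need a backup while the database is open, SQLite's online backup API copies a consistent snapshot even under concurrent use; a small sketch (in-memory databases stand in for the real files):

```python
import sqlite3

src = sqlite3.connect(":memory:")  # stands in for data/database.db
src.execute("CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT)")
src.execute("INSERT INTO settings VALUES ('schema_version', '3')")
src.commit()

# Connection.backup() uses SQLite's online backup API, which yields a
# consistent copy without requiring the source database to be closed.
dst = sqlite3.connect(":memory:")  # stands in for database.db.bak
src.backup(dst)

copied = dst.execute(
    "SELECT value FROM settings WHERE key = 'schema_version'"
).fetchone()[0]
print(copied)
```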
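The timestamped file names make migrations sort lexicographically into application order, which is what lets `migrate:up` apply them deterministically. The actual runner isn't shown here, but the core idea can be sketched as follows (the `_migrations` tracking table, the `MIGRATIONS` dict, and the migration contents are all hypothetical; the real system reads `.sql` files from `data/migrations/`):

```python
import sqlite3

# Hypothetical migrations keyed by timestamped filename.
MIGRATIONS = {
    "2024-01-01T00-00-00_create_settings.sql":
        "CREATE TABLE settings (key TEXT PRIMARY KEY, value TEXT);",
    "2024-02-01T00-00-00_create_files.sql":
        "CREATE TABLE files (dataset TEXT, input TEXT, output TEXT, "
        "status TEXT, date TEXT, PRIMARY KEY (dataset, input));",
}

def migrate_up(conn, migrations):
    # Record applied migrations so each one runs exactly once; the
    # timestamped names sort into the order they should be applied.
    conn.execute("CREATE TABLE IF NOT EXISTS _migrations (name TEXT PRIMARY KEY)")
    applied = {r[0] for r in conn.execute("SELECT name FROM _migrations")}
    for name in sorted(migrations):
        if name in applied:
            continue
        with conn:
            conn.executescript(migrations[name])
            conn.execute("INSERT INTO _migrations (name) VALUES (?)", (name,))

db = sqlite3.connect(":memory:")
migrate_up(db, MIGRATIONS)
migrate_up(db, MIGRATIONS)  # re-running is a no-op
names = [r[0] for r in db.execute("SELECT name FROM _migrations ORDER BY name")]
print(names)
```

This is also why the best practices above forbid editing committed migrations: the tracking table only records names, so a changed file would never be re-applied to databases that already ran it.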