This document describes the user interfaces (Web UI and CLI) for the optimized duplicate detection system.
Location: /indexing
Features:
Index Destination Directory
Index Statistics
Duplicate Statistics
Navigation:
Location: /duplicates
New Features:
watch-finished-cli duplicates:scan [options]
Options:
--reset: Reset existing duplicate groups
Example:
watch-finished-cli duplicates:scan
watch-finished-cli duplicates:scan --reset
watch-finished-cli duplicates:list [options]
Options:
--status <status>: Filter by status (pending/reviewed/purged)
--dataset <dataset>: Filter by dataset
Example:
watch-finished-cli duplicates:list
watch-finished-cli duplicates:list --status pending --dataset movies
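The status filter accepts only the three documented values, so a script may want to check its input before calling the CLI. The following is a minimal sketch, not part of the tool itself: the function name `list_by_status` and the `CLI` override variable are hypothetical, and only the `--status` and `--dataset` options shown above are used.

```shell
# Hypothetical override so the command can be swapped out; defaults to the documented name.
CLI="${CLI:-watch-finished-cli}"

# List duplicates for one status, rejecting values outside the documented set.
# $1 = status (pending, reviewed or purged), $2 = optional dataset name.
list_by_status() {
  case "$1" in
    pending|reviewed|purged) ;;
    *)
      echo "unknown status: $1 (expected pending, reviewed or purged)" >&2
      return 2
      ;;
  esac
  if [ -n "${2:-}" ]; then
    "$CLI" duplicates:list --status "$1" --dataset "$2"
  else
    "$CLI" duplicates:list --status "$1"
  fi
}
```

For example, `list_by_status pending movies` runs the same command as the second example above, while a misspelled status fails before the CLI is invoked.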
watch-finished-cli index:destination --dataset <dataset> --destination <path> [options]
Required:
--dataset <dataset>: Dataset name
--destination <path>: Destination directory path
Options:
--reindex: Clear and rebuild the index
--batch-size <size>: Number of files to process at once (default: 100)
Example:
# Index a destination
watch-finished-cli index:destination \
--dataset movies \
--destination /media/movies
# Re-index (clear and rebuild)
watch-finished-cli index:destination \
--dataset movies \
--destination /media/movies \
--reindex \
--batch-size 200
watch-finished-cli index:stats [options]
Options:
--dataset <dataset>: Filter by dataset
Example:
watch-finished-cli index:stats
watch-finished-cli index:stats --dataset movies
watch-finished-cli index:count --dataset <dataset> [options]
Required:
--dataset <dataset>: Dataset name
Options:
--destination <path>: Filter by destination path
Example:
watch-finished-cli index:count --dataset movies
watch-finished-cli index:count --dataset movies --destination /media/movies
watch-finished-cli index:clear --dataset <dataset> [options]
Required:
--dataset <dataset>: Dataset name
Options:
--destination <path>: Filter by destination path
Example:
watch-finished-cli index:clear --dataset movies
watch-finished-cli index:clear --dataset movies --destination /media/movies
# 1. Index destination
watch-finished-cli index:destination \
--dataset movies \
--destination /media/movies
# Output: ✅ Indexed: 1234, Skipped: 5, Errors: 0
# 2. Check index count
watch-finished-cli index:count --dataset movies
# Output: 📈 Indexed files for movies: 1234
# 3. View duplicate statistics
watch-finished-cli index:stats --dataset movies
# Output: Shows duplicate groups with details
# 4. Scan for duplicates (uses database)
watch-finished-cli duplicates:scan
# Output: ✅ Scan complete
# 5. List duplicates
watch-finished-cli duplicates:list --dataset movies
# Output: Shows detailed list of duplicate groups
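Steps 2 and 4 of the workflow above can be combined so that the scan runs only when the index is non-empty. This is a sketch under one stated assumption: that the `index:count` output ends with `: <number>`, as in the sample output of step 2. The helper names `indexed_count` and `scan_if_indexed` and the `CLI` variable are hypothetical.

```shell
# Hypothetical override so the command can be swapped out; defaults to the documented name.
CLI="${CLI:-watch-finished-cli}"

# Print the indexed-file count for dataset $1.
# Assumes the output ends with ": <number>", e.g. "Indexed files for movies: 1234".
indexed_count() {
  out=$("$CLI" index:count --dataset "$1") || return 1
  count=${out##*: }          # drop everything up to the last ": "
  case "$count" in
    ''|*[!0-9]*) return 1 ;; # not a plain number: the format assumption failed
  esac
  printf '%s\n' "$count"
}

# Run the duplicate scan only when dataset $1 has indexed files.
scan_if_indexed() {
  count=$(indexed_count "$1") || {
    echo "could not read index count for $1" >&2
    return 1
  }
  if [ "$count" -eq 0 ]; then
    echo "index for $1 is empty; run index:destination first" >&2
    return 1
  fi
  "$CLI" duplicates:scan
}
```

Calling `scan_if_indexed movies` then either runs the scan or stops with a message on stderr. If the real output format differs from the sample, only `indexed_count` needs adjusting.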
Issue: Index count is 0 after indexing
Issue: Duplicates not showing after scan
Issue: Command not found
Run pnpm install in the apps/cli directory
Or invoke the built entry point directly: node apps/cli/dist/index.js
Issue: Connection error
Issue: Slow indexing
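For the "Command not found" case, a script can pick whichever invocation is available instead of failing outright. The sketch below uses only the two forms mentioned in the troubleshooting notes (the `watch-finished-cli` command and `node apps/cli/dist/index.js`). The function name `resolve_cli` is hypothetical, and the relative path assumes the script runs from the repository root.

```shell
# Print a command line that can be used to invoke the CLI.
# Prefers the installed command; falls back to the built entry point via node.
resolve_cli() {
  if command -v watch-finished-cli >/dev/null 2>&1; then
    echo "watch-finished-cli"
  elif [ -f apps/cli/dist/index.js ]; then
    echo "node apps/cli/dist/index.js"
  else
    echo "CLI not found: run pnpm install in apps/cli first" >&2
    return 1
  fi
}
```

A caller might write `cli=$(resolve_cli) && $cli index:stats`, leaving `$cli` unquoted on purpose so that the `node` form splits into command and argument.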
#!/bin/bash
# Index all datasets
DATASETS=("movies" "tvshows" "music")
DESTINATIONS=(
"/media/movies"
"/media/tvshows"
"/media/music"
)
for i in "${!DATASETS[@]}"; do
dataset="${DATASETS[$i]}"
destination="${DESTINATIONS[$i]}"
echo "Indexing $dataset..."
watch-finished-cli index:destination \
--dataset "$dataset" \
--destination "$destination" \
--batch-size 150
done
echo "Running duplicate scan..."
watch-finished-cli duplicates:scan
echo "Getting duplicate stats..."
watch-finished-cli index:stats
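The script above runs the duplicate scan even if one of the indexing calls fails. A variant that records failures might look like the sketch below. The function name `index_all`, the `dataset=destination` argument format, and the `CLI` variable are hypothetical; the CLI options are the ones used in the script above.

```shell
# Hypothetical override so the command can be swapped out; defaults to the documented name.
CLI="${CLI:-watch-finished-cli}"

# Index every "dataset=destination" pair given as an argument.
# Keeps going after a failure, but returns non-zero if any pair failed.
index_all() {
  failed=0
  for pair in "$@"; do
    dataset=${pair%%=*}      # text before the first "="
    destination=${pair#*=}   # text after the first "="
    echo "Indexing $dataset..."
    if ! "$CLI" index:destination \
        --dataset "$dataset" \
        --destination "$destination" \
        --batch-size 150; then
      echo "indexing failed for $dataset" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

With this in place, `index_all movies=/media/movies tvshows=/media/tvshows music=/media/music && "$CLI" duplicates:scan` scans only when every dataset was indexed without error.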
# Re-index daily at 2 AM
0 2 * * * /path/to/watch-finished-cli index:destination --dataset movies --destination /media/movies --reindex
# Scan for duplicates daily at 3 AM
0 3 * * * /path/to/watch-finished-cli duplicates:scan
# Weekly stats email
0 8 * * 1 /path/to/watch-finished-cli index:stats | mail -s "Weekly Duplicate Stats" admin@example.com
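The schedule above starts the scan one hour after the re-index and therefore relies on the re-index finishing within that hour. One alternative is a single wrapper that runs both steps in order. This is a sketch only: the function name `nightly` and the `CLI` variable are hypothetical, and `/path/to/watch-finished-cli` is the same placeholder used in the crontab entries.

```shell
# Placeholder path as in the crontab entries above; override CLI to change it.
CLI="${CLI:-/path/to/watch-finished-cli}"

# Re-index dataset $1 at destination $2, then scan for duplicates
# only if the re-index succeeded.
nightly() {
  "$CLI" index:destination --dataset "$1" --destination "$2" --reindex || {
    echo "re-index failed for $1; skipping duplicate scan" >&2
    return 1
  }
  "$CLI" duplicates:scan
}
```

Saved as a script that ends with `nightly movies /media/movies`, it could replace the separate 2 AM and 3 AM entries with a single cron line.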