M3U Snapshot Feature - Usage Guide
Status: Active
Last Updated: 2025-11-02
Related Docs: Command Reference, EPG Generation
Code Location: backend/epgoat/domain/parsers.py, backend/epgoat/cli/run_provider.py
Overview
The M3U snapshot feature captures the exact M3U file used during EPG generation, saving it with a timestamp for benchmarking purposes. This is critical when M3U files change dynamically and you need to test against identical data.
Quick Start
Save a snapshot during your next run:
python3 backend/epgoat/run_provider.py \
--provider tps \
--save-m3u-snapshot \
--max-channels 5 \
--force-refresh \
--verbose
Output:
INFO: ✓ M3U snapshot saved: /Users/you/repo/dist/snapshots/tps-snapshot-20251030-143052.m3u
INFO: Snapshot size: 524288 bytes (1234 lines)
Features
✅ Exact capture - Saves M3U immediately after fetch, before any processing
✅ Timestamped - Format: {provider}-snapshot-{YYYYMMDD-HHMMSS}.m3u
✅ Provider-tagged - Includes provider name in filename
✅ Works with URLs and local files - Handles both HTTP(S) and file:// inputs
✅ Auto-creates directory - dist/snapshots/ created automatically
✅ Git-ignored - Snapshots excluded from version control
File Locations
- Snapshots directory: dist/snapshots/
- Naming pattern: {provider}-snapshot-{timestamp}.m3u
- Example: tps-snapshot-20251030-143052.m3u
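The naming pattern above can be sketched in a few lines of Python. This is only an illustration: `snapshot_path` is a hypothetical helper, not a function from the codebase, and the real logic inside `parse_m3u()` may differ.

```python
from datetime import datetime
from pathlib import Path


# Hypothetical helper (not part of the codebase): builds a snapshot path
# following the documented pattern {provider}-snapshot-{YYYYMMDD-HHMMSS}.m3u.
def snapshot_path(provider: str, when: datetime,
                  root: Path = Path("dist/snapshots")) -> Path:
    stamp = when.strftime("%Y%m%d-%H%M%S")
    return root / f"{provider}-snapshot-{stamp}.m3u"


path = snapshot_path("tps", datetime(2025, 10, 30, 14, 30, 52))
print(path.name)  # prints tps-snapshot-20251030-143052.m3u
```

Because the timestamp is zero-padded and ordered from year to second, snapshot filenames for one provider sort chronologically when sorted by name.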
Usage Scenarios
1. Baseline for API Matching Project
# Save baseline before starting API overhaul
python3 backend/epgoat/run_provider.py \
--provider tps \
--save-m3u-snapshot \
--max-channels 1000
# Result: dist/snapshots/tps-snapshot-20251030-120000.m3u
# Use this exact file for benchmarking after Phase 7
2. Compare Runs Over Time
# Monday run
python3 backend/epgoat/run_provider.py --provider tps --save-m3u-snapshot
# → tps-snapshot-20251027-090000.m3u
# Friday run
python3 backend/epgoat/run_provider.py --provider tps --save-m3u-snapshot
# → tps-snapshot-20251031-090000.m3u
# Compare changes
diff dist/snapshots/tps-snapshot-20251027-090000.m3u \
dist/snapshots/tps-snapshot-20251031-090000.m3u
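A raw `diff` can be noisy when stream URLs or attributes change. As a rough complement, the sketch below compares two snapshots by channel display name only. The helpers `channel_names` and `diff_channels` are hypothetical, and the code assumes each channel is introduced by an `#EXTINF` line whose display name is the text after the last comma, which may not hold for every playlist.

```python
def channel_names(m3u_text: str) -> set[str]:
    """Collect display names from #EXTINF lines."""
    names = set()
    for line in m3u_text.splitlines():
        if line.startswith("#EXTINF"):
            # Assumption: the display name follows the last comma.
            _, sep, name = line.rpartition(",")
            if sep:
                names.add(name.strip())
    return names


def diff_channels(old_text: str, new_text: str) -> tuple[set[str], set[str]]:
    """Return (added, removed) channel names between two snapshots."""
    old, new = channel_names(old_text), channel_names(new_text)
    return new - old, old - new


# Inline sample data; in practice, read the two snapshot files instead.
earlier = '#EXTM3U\n#EXTINF:-1 tvg-id="a",Alpha\nhttp://example.com/a\n#EXTINF:-1,Beta\nhttp://example.com/b\n'
later = '#EXTM3U\n#EXTINF:-1,Beta\nhttp://example.com/b\n#EXTINF:-1,Gamma\nhttp://example.com/g\n'

added, removed = diff_channels(earlier, later)
print("added:", sorted(added), "removed:", sorted(removed))
# prints added: ['Gamma'] removed: ['Alpha']
```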
3. Debug Specific Failures
# Save snapshot when failure occurs
python3 backend/epgoat/run_provider.py \
--provider tps \
--save-m3u-snapshot \
--verbose
# Later: replay with exact same data
python3 backend/epgoat/application/epg_generator.py \
--m3u dist/snapshots/tps-snapshot-20251030-143052.m3u \
--tz "America/Chicago"
Implementation Details
Code Changes
- backend/epgoat/domain/parsers.py (lines 329-397)
  - Added save_snapshot and provider parameters to parse_m3u()
  - Saves M3U content immediately after fetch
  - Creates timestamped files in dist/snapshots/
- backend/epgoat/cli/run_provider.py (lines 550, 268, 272)
  - Added --save-m3u-snapshot CLI flag
  - Passes flag through to EPG generator
  - Passes provider name for filename tagging
- backend/epgoat/application/epg_generator.py (lines 501-503)
  - Reads save_m3u_snapshot flag from args
  - Passes to parse_m3u() during parsing
- .gitignore (line 61)
  - Added dist/snapshots/ to exclusions
What Gets Saved
- Raw M3U content - Exactly as received from URL or file
- No modifications - Before any parsing, filtering, or processing
- Full file - Complete and untruncated (unlike clone_m3u, which writes a modified copy with tvg-ids added)
- UTF-8 encoding - Standard text encoding
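The properties listed above (raw content, no modifications, UTF-8) can be illustrated with a small sketch. `save_raw_snapshot` is a hypothetical helper written for this guide, not the code in `parsers.py`; it assumes the fetched playlist is already available as a string.

```python
from pathlib import Path


# Hypothetical helper: writes already-fetched M3U text to disk unchanged.
def save_raw_snapshot(content: str, path: Path) -> int:
    """Write content as UTF-8 and return the number of bytes on disk."""
    # Create the target directory if it is missing.
    path.parent.mkdir(parents=True, exist_ok=True)
    # newline="" disables newline translation, so \r\n and \n in the
    # source are written exactly as received.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        handle.write(content)
    return path.stat().st_size
```

Writing before any parsing step is what keeps the snapshot byte-for-byte comparable with what the provider served, which is the point of using it as a benchmark input.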
Performance Impact
- Negligible - Only adds ~1-5ms per run
- Disk usage - Typical M3U: 500KB-2MB per snapshot
- No API calls - Uses already-fetched content
Comparison with clone_m3u
| Feature | --save-m3u-snapshot | backend/epgoat/utilities/clone_m3u.py |
|---|---|---|
| Timing | Immediately after fetch | After EPG generation |
| Content | Raw, unmodified | Modified (tvg-ids added) |
| Re-fetch | No | Yes (re-fetched content may differ) |
| Purpose | Benchmarking/debugging | Stable IDs for players |
| When to use | Need exact source data | Need stable channel IDs |
Best Practices
For Benchmarking (Your Use Case)
# 1. Save baseline snapshot
python3 backend/epgoat/run_provider.py \
--provider tps \
--save-m3u-snapshot \
--max-channels 1000 \
--force-refresh
# 2. Note the snapshot filename from logs
# → tps-snapshot-20251030-120000.m3u
# 3. Copy to baselines directory for safe keeping
mkdir -p dist/baselines
cp dist/snapshots/tps-snapshot-20251030-120000.m3u \
dist/baselines/tps-baseline-phase0.m3u
# 4. After Phase 7, test against same data
python3 backend/epgoat/application/epg_generator.py \
--m3u dist/baselines/tps-baseline-phase0.m3u \
--tz "America/Chicago"
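Since the value of a baseline depends on it staying identical to the original snapshot, it can help to record and compare a checksum. The following is a minimal sketch using the standard library; `sha256_of` is a hypothetical helper, and the file paths in the comment are the ones from the steps above.

```python
import hashlib
from pathlib import Path


# Hypothetical helper: streams a file through SHA-256 in chunks so that
# large playlists do not need to be loaded into memory at once.
def sha256_of(path: Path) -> str:
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example usage (paths taken from the steps above):
# snapshot = Path("dist/snapshots/tps-snapshot-20251030-120000.m3u")
# baseline = Path("dist/baselines/tps-baseline-phase0.m3u")
# assert sha256_of(snapshot) == sha256_of(baseline), "baseline has drifted"
```

Storing the hex digest alongside benchmark results makes it possible to confirm later that two runs really used the same input file.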
Cleanup Old Snapshots
# Keep only last 7 days
find dist/snapshots -name "*.m3u" -mtime +7 -delete
# Keep only last 10 snapshots per provider
ls -t dist/snapshots/tps-*.m3u | tail -n +11 | xargs rm -f
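The `ls`-based command above handles one provider at a time and relies on modification times. As an alternative sketch, the Python below keeps the newest N snapshots for every provider in one pass, ordering by the timestamp embedded in the filename. `prune_snapshots` is a hypothetical helper and assumes all files follow the `{provider}-snapshot-{timestamp}.m3u` pattern.

```python
from collections import defaultdict
from pathlib import Path


# Hypothetical helper: keeps the newest `keep` snapshots per provider.
def prune_snapshots(root: Path, keep: int = 10, dry_run: bool = True) -> list[Path]:
    """Return the snapshots beyond the newest `keep`; delete them unless dry_run."""
    by_provider = defaultdict(list)
    for path in root.glob("*-snapshot-*.m3u"):
        # Everything before the final "-snapshot-" is the provider name.
        provider = path.name.rsplit("-snapshot-", 1)[0]
        by_provider[provider].append(path)

    doomed = []
    for paths in by_provider.values():
        # Zero-padded timestamps sort chronologically by name; newest first.
        paths.sort(key=lambda p: p.name, reverse=True)
        doomed.extend(paths[keep:])

    if not dry_run:
        for path in doomed:
            path.unlink()
    return sorted(doomed)
```

Running with the default `dry_run=True` first lists what would be removed without touching the disk, which is a safer default for a directory that may hold benchmark inputs.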
Troubleshooting
Snapshot not created?
Check the logs for:
ERROR: Failed to fetch M3U from URL: ...
If the fetch or validation fails, no snapshot is saved.
Wrong provider name in filename?
Ensure --provider is passed:
python3 backend/epgoat/run_provider.py --provider tps --save-m3u-snapshot
#                                      ^^^^^^^^^^^^^^
Snapshot directory not created?
The directory is created automatically. If creation fails, check permissions:
mkdir -p dist/snapshots
chmod 755 dist/snapshots
Examples
Minimal Command
python3 backend/epgoat/run_provider.py \
--provider tps \
--save-m3u-snapshot
Full Featured
python3 backend/epgoat/run_provider.py \
--provider tps \
--save-m3u-snapshot \
--max-channels 1000 \
--force-refresh \
--verbose \
--debug-matching \
--date 2025-10-30
With Custom M3U URL
python3 backend/epgoat/run_provider.py \
--provider custom \
--save-m3u-snapshot \
--m3u "https://example.com/playlist.m3u"
FAQ
Q: Does this slow down the run?
A: No, negligible impact (~1-5ms). The M3U is already fetched; we just save it.
Q: Can I use this with local M3U files?
A: Yes! It will copy the file with a timestamp.
Q: Do I need this every run?
A: No, only when you need to capture the exact data for later replay/benchmarking.
Q: How much disk space do snapshots use?
A: Typical M3U: 500KB-2MB. 100 snapshots ≈ 50-200MB.
Q: Are snapshots committed to git?
A: No, dist/snapshots/ is in .gitignore.
Pro Tip: For your API matching project, save a snapshot NOW before any changes. This will be your "Phase 0 baseline" to compare against after Phase 7 completion!