Pattern Addition Guide
Status: Active
Last Updated: 2025-11-02
Related Docs: EPG Matching Pipeline, Command Reference
Code Location: backend/epgoat/domain/patterns.py, backend/config/sport_emojis.yml, backend/config/sport_categories.yml
Quick reference for adding new sport leagues and channel patterns
Quick Start
Adding support for a new sport league takes four steps, followed by a verification run:
- Add regex pattern → backend/epgoat/domain/patterns.py
- Add emoji mapping → backend/config/sport_emojis.yml
- Add category mapping → backend/config/sport_categories.yml
- Add tests → test_patterns.py
Step-by-Step Example
Let's add support for a new league called "XFL" (Extreme Football League).
Step 1: Add Pattern to patterns.py
Edit backend/epgoat/domain/patterns.py and add to ALLOWED_CHANNEL_PATTERNS:
ALLOWED_CHANNEL_PATTERNS = [
    # ... existing patterns ...
    # XFL - Extreme Football League
    (r'^XFL\s+\d+\s*:', 'XFL'),
]
Pattern Breakdown:
- ^ - Start of string (required)
- XFL - League name (case-insensitive due to IGNORECASE flag)
- \s+ - One or more whitespace characters
- \d+ - One or more digits (channel number)
- \s* - Optional whitespace
- : - Colon (required for this pattern)
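The breakdown above can be checked directly with Python's re module. This is a minimal sketch: the helper name matched_prefix and the sample channel names are made up for illustration, and compiling with re.IGNORECASE is assumed from the note above rather than taken from patterns.py.

```python
import re

# Same regex as the XFL entry; IGNORECASE is assumed from the breakdown above.
XFL_RX = re.compile(r'^XFL\s+\d+\s*:', re.IGNORECASE)


def matched_prefix(name):
    """Return the matched prefix text, or None when the pattern does not match."""
    m = XFL_RX.match(name)
    return m.group() if m else None


samples = [
    "XFL 01: Dragons vs Guardians",  # standard form
    "xfl 7 : Replay",                # lowercase, space before the colon
    "XFL 01 Dragons",                # colon missing
    "The XFL 01: Show",              # league name not at the start
]

for name in samples:
    print(f"{name!r} -> {matched_prefix(name)!r}")
```

The first two samples match and the last two do not, which is the behaviour the anchor and the required colon are meant to enforce.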
Pattern Variations:
For streaming services where colon is optional:
(r'^XFL\s+\d+\s*:?', 'XFL'), # Colon optional
For patterns with pipe separator:
(r'^XFL\s*\|\s*\d+', 'XFL |'), # Note the space in family name
For patterns with special names:
(r'^XFL\s+Game\s+Pass\s+\d+', 'XFL Game Pass'),
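To see that the three variations do not overlap, they can be run against one sample name each. The sketch below assumes case-insensitive matching from the start of the string; families_matching and the sample names are hypothetical and not part of patterns.py.

```python
import re

# The three variations from this section, as (regex, family) pairs.
VARIANTS = [
    (r'^XFL\s+\d+\s*:?', 'XFL'),
    (r'^XFL\s*\|\s*\d+', 'XFL |'),
    (r'^XFL\s+Game\s+Pass\s+\d+', 'XFL Game Pass'),
]


def families_matching(name):
    """Return every family whose regex matches the start of name."""
    return [family for pattern, family in VARIANTS
            if re.match(pattern, name, re.IGNORECASE)]


for name in ("XFL 01 Dragons", "XFL | 03: Game", "XFL Game Pass 2: Final"):
    print(name, "->", families_matching(name))
```

Each sample is claimed by exactly one family, so the order of these three entries would not change the result.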
Step 2: Add Emoji Mapping
Edit backend/config/sport_emojis.yml:
# ... existing mappings ...
# Extreme Football
xfl: '🏈'
Important:
- Key must be lowercase
- Value must be a valid unicode emoji
- Use existing emoji if appropriate (e.g., '🏈' for football)
Common Emojis:
- 🏀 Basketball
- 🏈 American Football
- ⚽ Soccer
- 🏒 Hockey
- ⚾ Baseball
- 🥊 Combat Sports
- 🎾 Tennis
- 🏉 Rugby
- 🏎️ Motorsports
- 🔴 Generic/Streaming
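A quick pre-commit check for the emoji rules might look like the sketch below. check_emoji_mapping is a hypothetical helper, not the project's validator, and the emoji test is only a rough heuristic (first code point in Unicode category "So") because the standard library has no direct emoji check.

```python
import unicodedata


def check_emoji_mapping(mapping):
    """Return a list of problems found in a {key: emoji} mapping (hypothetical helper)."""
    problems = []
    for key, value in mapping.items():
        if key != key.lower():
            problems.append(f"{key}: key must be lowercase")
        # Rough heuristic: the first code point is in category "So" (Symbol, other).
        if not value or unicodedata.category(value[0]) != "So":
            problems.append(f"{key}: value does not look like an emoji")
    return problems


print(check_emoji_mapping({"xfl": "🏈"}))
print(check_emoji_mapping({"XFL": "football"}))
```

The first call reports no problems; the second reports both the uppercase key and the non-emoji value.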
Step 3: Add Category Mapping
Edit backend/config/sport_categories.yml:
# ... existing mappings ...
# Extreme Football
xfl: 'Sports / American Football / XFL'
Format Rules:
- Must start with Sports
- Parts separated by / (space-slash-space)
- Follow hierarchy: Sports / <Sport Type> / <League>
- No empty parts
Common Sport Types:
- Basketball
- American Football
- Soccer
- Ice Hockey
- Baseball
- Combat Sports
- Tennis
- Motorsports
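The format rules can be expressed as a small checker. This is a sketch under the rules listed above, not the validation code in the repository; check_category is a hypothetical name, and it does not enforce an exact number of hierarchy levels.

```python
def check_category(category):
    """Return a list of rule violations for one category string (hypothetical helper)."""
    problems = []
    parts = category.split(" / ")
    if parts[0] != "Sports":
        problems.append("must start with 'Sports'")
    if any(part.strip() == "" for part in parts):
        problems.append("empty part")
    # A slash left inside a part means it was not surrounded by single spaces.
    if any("/" in part for part in parts):
        problems.append("separator must be ' / ' (space-slash-space)")
    return problems


for value in ("Sports / American Football / XFL",
              "Basketball / NBA",
              "Sports/Basketball/NBA"):
    print(repr(value), "->", check_category(value))
```

Note that a string without spaces around the slashes fails two rules at once, because the whole string is then treated as a single part.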
Step 4: Add Tests
Edit test_patterns.py and add test cases:
def test_xfl_patterns(self):
    """Test XFL patterns."""
    test_cases = [
        ("XFL 01: Dragons vs Guardians", True, "XFL"),
        ("XFL 05: Championship Game", True, "XFL"),
        ("XFL 10: FHD", True, "XFL"),
    ]
    for channel_name, should_match, expected_family in test_cases:
        matched, family, _ = match_prefix_and_shell(channel_name)
        assert matched == should_match, f"Failed for: {channel_name}"
        if should_match:
            assert family == expected_family
Add the test method to the appropriate class in test_patterns.py:
- TestSportLeaguePatterns for sport leagues
- TestStreamingServicePatterns for streaming services
Step 5: Verify
Run tests to verify:
# Run pattern tests
pytest test_patterns.py::TestSportLeaguePatterns::test_xfl_patterns -v
# Run all pattern tests
pytest test_patterns.py -v
# Test manually
python -c "
from patterns import match_prefix_and_shell
from config import get_sport_emoji, get_sport_category
matched, family, match_obj = match_prefix_and_shell('XFL 01: Dragons vs Guardians')
print(f'Matched: {matched}')
print(f'Family: {family}')
print(f'Emoji: {get_sport_emoji(family)}')
print(f'Category: {get_sport_category(family)}')
"
Expected output:
Matched: True
Family: XFL
Emoji: 🏈
Category: Sports / American Football / XFL
Pattern Examples
Standard League Pattern
Channel names like: NBA 01: Lakers vs Celtics
(r'^NBA\s+\d+\s*:', 'NBA'),
Streaming Service (Optional Colon)
Channel names like: ESPN+ 01 or ESPN+ 01:
(r'^ESPN\+\s+\d+\s*:?', 'ESPN+'),
Note: Escape + with \+
League with Pipe Separator
Channel names like: NFL | 03: Game
(r'^NFL\s*\|\s*\d+', 'NFL |'),
Note: Family name includes the space and pipe
League with Sub-Brand
Channel names like: NFL Game Pass 1: RedZone
(r'^NFL\s+Game\s+Pass\s+\d+', 'NFL Game Pass'),
International Services
Channel names like: DAZN CA 01: Boxing
(r'^DAZN\s+CA\s+\d+\s*:?', 'DAZN CA'),
Multi-Word Leagues
Channel names like: UEFA Champions League 01: Final
(r'^UEFA\s+Champions\s+League\s+\d+', 'UEFA Champions League'),
Special Characters
Channel names like: SEC+ 03: Game
(r'^SEC\+\s+\d+\s*:?', 'SEC+'),
Escape special regex characters: + . * ? [ ] ( ) { } ^ $ | \
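Instead of escaping by hand, re.escape from the standard library can escape the league name for you. The sketch below builds a colon-optional prefix regex for a single-word name; prefix_pattern is a hypothetical helper, and multi-word names would still need \s+ written between the words.

```python
import re


def prefix_pattern(league):
    """Build a 'LEAGUE NN:' style regex with the league name escaped (hypothetical helper)."""
    return r'^' + re.escape(league) + r'\s+\d+\s*:?'


print(prefix_pattern("SEC+"))

# The unescaped form treats + as "one or more N", so it misses the real name.
print(bool(re.match(r'^ESPN+\s+\d+', "ESPN+ 01: Game")))
print(bool(re.match(prefix_pattern("ESPN+"), "ESPN+ 01: Game")))
```

The unescaped regex fails on "ESPN+ 01: Game" while the escaped one matches it.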
Pattern Testing Checklist
Before committing your pattern:
- [ ] Pattern compiles without errors
- [ ] Pattern matches expected channel names
- [ ] Pattern doesn't match unrelated channels (test negative cases)
- [ ] Emoji is valid unicode character
- [ ] Category follows hierarchical format
- [ ] Unit tests added and passing
- [ ] Manual verification completed
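The first three checklist items can be automated with a few lines. This is a minimal sketch, assuming case-insensitive matching from the start of the string; check_pattern and the sample names are hypothetical.

```python
import re


def check_pattern(pattern, should_match, should_not_match):
    """Return a list of failures for one regex (hypothetical helper)."""
    try:
        rx = re.compile(pattern, re.IGNORECASE)
    except re.error as exc:
        return [f"does not compile: {exc}"]
    failures = [f"missed: {name}" for name in should_match if not rx.match(name)]
    failures += [f"false positive: {name}"
                 for name in should_not_match if rx.match(name)]
    return failures


print(check_pattern(
    r'^XFL\s+\d+\s*:',
    should_match=["XFL 01: Dragons vs Guardians"],
    should_not_match=["NFL 01: Game", "XFLX 01: Game"],
))
```

An empty list means the pattern compiled, matched every positive sample, and rejected every negative one.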
Common Mistakes
❌ Missing Anchor
(r'NBA\s+\d+\s*:', 'NBA'), # Wrong - could match mid-string
✅ Correct:
(r'^NBA\s+\d+\s*:', 'NBA'), # Anchored to start
❌ Not Escaping Special Characters
(r'^ESPN+\s+\d+', 'ESPN+'), # Wrong - + is a regex operator
✅ Correct:
(r'^ESPN\+\s+\d+', 'ESPN+'), # Escaped with \+
❌ Family Name Doesn't Match Pattern
(r'^NBA\s+\d+\s*:', 'Basketball'), # Wrong - use 'NBA'
✅ Correct:
(r'^NBA\s+\d+\s*:', 'NBA'), # Family matches league name
❌ Category Doesn't Start with Sports
nba: 'Basketball / NBA' # Wrong
✅ Correct:
nba: 'Sports / Basketball / NBA'
❌ Incorrect Category Separator
nba: 'Sports/Basketball/NBA' # Wrong - no spaces
✅ Correct:
nba: 'Sports / Basketball / NBA' # Space-slash-space
Pattern Priority
Patterns are checked in order, so more specific patterns should come before generic ones:
ALLOWED_CHANNEL_PATTERNS = [
    # Specific patterns first
    (r'^NFL\s+Game\s+Pass\s+\d+', 'NFL Game Pass'),
    (r'^NFL\s+Multi\s+Screen', 'NFL Multi Screen'),
    # Generic pattern last
    (r'^NFL\s+\d+\s*:', 'NFL'),
]
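Ordering matters whenever a generic pattern can also match a more specific channel name. The strict generic pattern above requires digits directly after "NFL", so it would not match "NFL Game Pass 1" by itself, but a looser generic such as ^NFL\s+ would claim that channel if it came first. Keeping specific patterns first guards against this.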
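The effect of ordering can be reproduced with a first-match-wins loop. The sketch assumes that is how the pattern list is consumed; first_family and the deliberately loose LOOSE pattern are hypothetical and exist only to show the failure mode.

```python
import re


def first_family(name, patterns):
    """Return the family of the first matching pattern, or None (first match wins)."""
    for pattern, family in patterns:
        if re.match(pattern, name, re.IGNORECASE):
            return family
    return None


SPECIFIC = (r'^NFL\s+Game\s+Pass\s+\d+', 'NFL Game Pass')
LOOSE = (r'^NFL\s+', 'NFL')  # hypothetical loose pattern, for illustration only

name = "NFL Game Pass 1: RedZone"
print(first_family(name, [SPECIFIC, LOOSE]))  # specific entry is checked first
print(first_family(name, [LOOSE, SPECIFIC]))  # loose entry shadows the specific one
```

With the specific entry first the channel resolves to 'NFL Game Pass'; with the loose entry first it is swallowed by 'NFL'.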
Advanced Patterns
Optional Time in Channel Name
Some channels include time: NBA 01 @ 7:30 PM: Lakers vs Celtics
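The strict colon pattern does not match this name, because "@" rather than a colon follows the channel number. Use the colon-optional form so that only the prefix is matched:
(r'^NBA\s+\d+\s*:?', 'NBA'), # Matches "NBA 01 "; the colon is optional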
The rest (@ 7:30 PM: Lakers vs Celtics) becomes the "shell" that's parsed for event info.
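The prefix and shell split can be sketched with the match object's end position. split_prefix_and_shell below is a hypothetical stand-in, not the real match_prefix_and_shell, and it uses the colon-optional form of the NBA pattern; the strict colon form returns None for the name that includes a time.

```python
import re


def split_prefix_and_shell(name, pattern):
    """Return (prefix, shell), or None when the pattern does not match (hypothetical helper)."""
    m = re.match(pattern, name, re.IGNORECASE)
    if m is None:
        return None
    # Everything after the matched prefix is the shell.
    return m.group().strip(), name[m.end():].strip()


OPTIONAL_COLON = r'^NBA\s+\d+\s*:?'
STRICT_COLON = r'^NBA\s+\d+\s*:'

name = "NBA 01 @ 7:30 PM: Lakers vs Celtics"
print(split_prefix_and_shell(name, OPTIONAL_COLON))
print(split_prefix_and_shell(name, STRICT_COLON))
```

The first call yields the prefix "NBA 01" and the shell "@ 7:30 PM: Lakers vs Celtics"; the second yields None.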
Channels with Multiple Formats
If a league uses inconsistent naming, add multiple patterns:
ALLOWED_CHANNEL_PATTERNS = [
    (r'^MLS\s+\d+\s*:', 'MLS'),  # "MLS 01: Game"
    (r'^MLS\s*\|\s*\d+', 'MLS |'),  # "MLS | 01"
    (r'^MLS\s+Espanol\s+\d+', 'MLS Espanol'),  # "MLS Espanol 01"
]
Different family names allow different configurations while keeping the base league recognizable.
Testing Your Pattern
Unit Test Template
def test_your_league_patterns(self):
    """Test YOUR-LEAGUE patterns."""
    test_cases = [
        # Format: (channel_name, should_match, expected_family)
        ("YOUR-LEAGUE 01: Event Name", True, "YOUR-LEAGUE"),
        ("YOUR-LEAGUE 05: Another Event", True, "YOUR-LEAGUE"),
        ("YOUR-LEAGUE 99:", True, "YOUR-LEAGUE"),
        ("NOT-YOUR-LEAGUE 01:", False, None),  # Negative case
    ]
    for channel_name, should_match, expected_family in test_cases:
        matched, family, _ = match_prefix_and_shell(channel_name)
        assert matched == should_match, f"Failed for: {channel_name}"
        if should_match:
            assert family == expected_family
Integration Test
Verify the complete workflow:
def test_your_league_workflow(self):
    """Test complete workflow for YOUR-LEAGUE."""
    channel_name = "YOUR-LEAGUE 01: Team A vs Team B"
    # Step 1: Pattern matching
    matched, family, match_obj = match_prefix_and_shell(channel_name)
    assert matched is True
    assert family == "YOUR-LEAGUE"
    # Step 2: Classification
    classif = classify_channel(channel_name, family, match_obj)
    assert classif.classification == "event"
    assert "Team A vs Team B" in classif.payload
    # Step 3: Config lookups
    emoji = get_sport_emoji(family)
    category = get_sport_category(family)
    assert emoji == '🏈'  # Your expected emoji
    assert "YOUR-LEAGUE" in category
Validation
After adding your pattern, validate the configuration:
# Validate emoji config
python -c "from schemas import validate_sport_emojis; from config import load_sport_config; validate_sport_emojis(load_sport_config('sport_emojis.yml', validate=False))"
# Validate category config
python -c "from schemas import validate_sport_categories; from config import load_sport_config; validate_sport_categories(load_sport_config('sport_categories.yml', validate=False))"
# Run all validation tests
pytest test_schemas.py -v
Debugging Tips
Pattern Not Matching
import re
pattern = r'^YOUR-LEAGUE\s+\d+\s*:'
channel = "YOUR-LEAGUE 01: Event"
rx = re.compile(pattern, re.IGNORECASE)
match = rx.match(channel)
if match:
    print(f"Matched: {match.group()}")
    print(f"Match object: {match}")
else:
    print("No match - check pattern")
Check Existing Patterns
from patterns import ALLOWED_CHANNEL_PATTERNS
# List all patterns
for pattern, family in ALLOWED_CHANNEL_PATTERNS:
    print(f"{family:30} {pattern}")
Verify Config Loading
from config import SPORT_EMOJIS, SPORT_CATEGORIES
print(f"Emoji for 'your-league': {SPORT_EMOJIS.get('your-league')}")
print(f"Category for 'your-league': {SPORT_CATEGORIES.get('your-league')}")
Resources
- Python Regex Documentation
- Regex Testing Tool - Use "Python" flavor
- Unicode Emoji List
- XMLTV Category Guidelines
Getting Help
If you're stuck:
- Check existing patterns for similar examples
- Run pytest test_patterns.py -v to see all test cases
- Use regex101.com to test your pattern
- Check the test files for usage examples
Last Updated: 2025-10-24
See Also: README.md for full documentation