7 Commits

Author SHA1 Message Date
shoko
c348d6daa1 tests: Add unit tests for browse_events, fetch_all_pages, filter_events, is_match_market, get_ml_market, get_ml_volume, sort_events
New test classes:
- TestIsMatchMarket: 5 tests for is_match_market() classification
- TestGetMlMarket: 5 tests for get_ml_market() and get_ml_volume()
- TestFilterEvents: 5 tests for filter_events() and sort_events()
- TestFetchAllPages: 4 tests for fetch_all_pages() early-exit logic
- TestBrowseEvents: 5 tests for browse_events() sort_by parameter

Total: 24 new tests (62 total, all passing)
2026-03-25 19:08:36 +00:00
shoko
764c75e712 Fix: Switch fetch_page from subprocess to urllib, add early-exit to fetch_all_pages, add sort_by to browse_events
- fetch_page: replace subprocess.run(curl) with urllib (stdlib, cleaner)
- fetch_all_pages: add matches_max/non_matches_max params for early-exit.
  When both are set, stop fetching once quotas are satisfied.
- browse_events: add sort_by param (None='fast' early-exit, 'volume'=full fetch+sort).
  Early-exit only used when sort_by=None (no client-side sort needed).
- Remove subprocess import (no longer needed after migration)
2026-03-25 18:53:11 +00:00
shoko
3a9f8fb365 Fix #14: Refactor print_browse/send_to_telegram into single pipeline
Replace duplicate inline formatting with unified format+render pipeline.

New functions:
- format_match_event(e) — canonical dict for match events
- format_non_match_event(e) — canonical dict for non-match events
- render_match_lines(event_dict, i, mode) — text/HTML renderer
- render_non_match_lines(event_dict, i, mode) — text/HTML renderer
- send_chunked(...) — extracted Telegram chunking logic

Also fixed send_chunked() chunking bug: the original '. ' in line
check never matched event lines (period is followed by '</b>' not space).

Tests: 38 total, all passing.

Fixes: #14
2026-03-25 17:50:54 +00:00
shoko
a7837cec0f Merge #15: Unify duplicate time functions 2026-03-25 14:34:05 +00:00
shoko
8cde441996 Fix #15: Unify duplicate time functions into _get_time_data()
Replace three duplicated time parsing functions with a single
_get_time_data(e, tz) helper returning {time_status, time_urgency, abs_time}.

Deleted functions:
- get_match_time_status(e)  — urgency + status string
- get_match_time_str(e)    — status string only
- get_start_time_wib(e)    — (abs_time, rel_str) tuple

New unified helper:
- _get_time_data(e, tz=None) returns {time_status, time_urgency, abs_time}
- tz defaults to WIB (UTC+7, Indonesia)
- canonical rel_str format: 'LIVE', 'In 6h', '12h ago', etc.
- time_urgency: 0-3 (higher=livelier)

All call sites updated to use _get_time_data():
- format_event(), format_detail_event()
- print_browse(), print_detail()
- send_to_telegram()

Also: removed dead code in print_detail() that called get_match_time_str()
but never used the result.

Tests: 9 new tests for _get_time_data() covering TBD, future, live,
and past event scenarios. 19 tests total, all passing.

Fixes: #15
2026-03-25 13:59:54 +00:00
b2180a4a34 Merge pull request 'Fix #5: HTML injection in Telegram messages' (#20) from fix/issue-5-html-injection-telegram into master 2026-03-25 13:13:52 +01:00
shoko
d0534aedbf Fix #5: HTML injection in Telegram messages
Add escape_html() function to prevent HTML injection in Telegram
parse_mode=HTML messages. Apply escaping to event titles inserted
into <a> tags in send_to_telegram().

- Add escape_html() using stdlib html.escape()
- Escape match event titles (line 648) and non-match titles (line 676)
- Add TestHtmlInjection with 2 tests proving fix:
  - <script> tags escaped as &lt;script&gt;
  - & ampersands escaped as &amp;
- Fixes HIGH severity: titles from Polymarket API were inserted
  without escaping, allowing malformed HTML in Telegram messages
2026-03-25 11:42:42 +00:00
2 changed files with 1462 additions and 265 deletions

View File

@@ -4,6 +4,7 @@ Polymarket Event Browser
Browse tradeable Polymarket events by game category.
"""
import html
import json
import time
import argparse
@@ -18,6 +19,7 @@ from urllib.request import urlopen, Request
PAGE_SIZE = 50
MAX_RETRIES = 5
INITIAL_RETRY_DELAY = 2 # exponential backoff starts at 2s
WIB = timezone(timedelta(hours=7)) # UTC+7 for Indonesian users
GAME_CATEGORIES = {
"All Esports": "Esports",
@@ -43,49 +45,65 @@ def fetch_page(q, page=1, max_retries=MAX_RETRIES, initial_delay=INITIAL_RETRY_D
delay = initial_delay
for attempt in range(max_retries):
time.sleep(delay)
r = subprocess.run(
["curl", "-s", url, "--max-time", "10", "-H", "User-Agent: curl/7.88.1"],
capture_output=True
)
if r.returncode == 0 and len(r.stdout) > 0:
try:
return json.loads(r.stdout.decode('utf-8'))
except json.JSONDecodeError:
if attempt < max_retries - 1:
delay *= 2 # Exponential backoff
continue
return None
else:
# Rate limit or other error - exponential backoff
if attempt > 0:
time.sleep(delay)
try:
req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
with urlopen(req, timeout=10) as r:
return json.loads(r.read())
except Exception:
if attempt < max_retries - 1:
delay *= 2
continue
return None
return None
def fetch_all_pages(q, max_pages=100):
def fetch_all_pages(q, matches_max=None, non_matches_max=None):
"""
Fetch ALL pages until pagination ends.
max_pages is a safety cap to prevent infinite loops.
Fetch pages until pagination ends, or until quotas are satisfied.
Args:
q: search query
matches_max: stop early once we have this many match events (None = no limit)
non_matches_max: stop early once we have this many non-match events (None = no limit)
Returns:
{"events": [...], "total_raw": N, "partial": bool}
"""
all_events = []
total_raw = 0
for page in range(1, max_pages + 1):
time.sleep(0.2) # small delay between pages (API rate limit is generous)
match_count = 0
non_match_count = 0
page = 0
while True:
page += 1
time.sleep(0.2)
data = fetch_page(q, page)
if data is None:
break
events = data.get("events", [])
total_raw = data.get("pagination", {}).get("totalResults", 0)
all_events.extend(events)
# Stop when we get 0 events (no more pages),
# OR when we've fetched >= total results
# Count matches/non-matches in this page
for e in events:
if is_match_market(e):
match_count += 1
else:
non_match_count += 1
# Stop if we got what we wanted (only when caps are set)
if matches_max is not None and non_matches_max is not None:
if match_count >= matches_max and non_match_count >= non_matches_max:
break
# Stop when we get 0 events (no more pages)
if len(events) == 0:
break
# Stop when we've fetched all known results
if len(all_events) >= total_raw:
break
partial = (total_raw > 0 and len(all_events) < total_raw)
return {"events": all_events, "total_raw": total_raw, "partial": partial}
@@ -220,94 +238,79 @@ def format_spread(bid, ask):
spread = ask - bid
return f"{prob_to_cents(spread)}c"
def get_match_time_status(e):
def _get_time_data(e, tz=None):
"""
Return a human-readable match time status.
Returns (status_str, urgency) where urgency is 0-3 (higher = more urgent/live).
Uses startTime for actual match start time.
Displays times in WIB (UTC+7 for Indonesian users).
Unified time data extraction for event timestamps.
Uses startTime (preferred) or startDate as the event start time.
Datetime parsing and all relative calculations are UTC-based.
The tz parameter only affects the abs_time formatting.
Args:
e: Event dict with 'startTime' or 'startDate' key.
tz: datetime.timezone for abs_time formatting.
Defaults to WIB (UTC+7).
Returns:
{
"time_status": str, # e.g. "LIVE", "In 6h", "12h ago"
"time_urgency": int, # 0-3 (higher = more urgent/live)
"abs_time": str, # e.g. "Mar 25, 19:00 WIB" or "TBD"
}
"""
# Use startTime for actual match start, not startDate (which is market creation time)
tz = tz or WIB
start_str = e.get("startTime") or e.get("startDate", "")
if not start_str:
return "TBD", 0
return {"time_status": "TBD", "time_urgency": 0, "abs_time": "TBD"}
try:
start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
now_utc = datetime.now(timezone.utc)
utc7 = timezone(timedelta(hours=7))
now = now_utc.astimezone(utc7)
start_utc7 = start_dt.astimezone(utc7)
delta = start_dt - now_utc
total_sec = delta.total_seconds()
if delta.total_seconds() < 0:
# Started already
hours_ago = abs(delta.total_seconds()) / 3600
if total_sec < 0:
# Event is in the past
hours_ago = abs(total_sec) / 3600
if hours_ago < 1:
return "LIVE", 3
time_status = "LIVE"
time_urgency = 3
elif hours_ago < 4:
return f"LIVE {int(hours_ago)}h", 3
time_status = f"LIVE {int(hours_ago)}h"
time_urgency = 3
elif hours_ago < 24:
return f"Started {int(hours_ago)}h ago", 1
time_status = f"{int(hours_ago)}h ago"
time_urgency = 1
else:
days = int(hours_ago / 24)
return f"{days}d ago", 0
time_status = f"{days}d ago"
time_urgency = 0
else:
# Starts in future
hours_until = delta.total_seconds() / 3600
if hours_until <= 0:
return "LIVE", 3
elif hours_until < 1:
mins = int(delta.total_seconds() / 60)
return f"In {mins}m", 3
elif hours_until < 24:
return f"In {int(hours_until)}h", 2
# Event is in the future
if total_sec < 3600:
mins = int(total_sec / 60)
time_status = f"In {mins}m"
time_urgency = 3
elif total_sec < 86400:
hours_until = int(total_sec / 3600)
time_status = f"In {hours_until}h"
time_urgency = 2
else:
days = int(hours_until / 24)
return f"In {days}d", 1
except:
return "", 0
days = int(total_sec / 86400)
time_status = f"In {days}d"
time_urgency = 1
def get_match_time_str(e):
"""
Return just the time status string (e.g. 'LIVE', 'In 6h', 'In 1d').
Uses startTime for actual match start time.
"""
start_str = e.get("startTime") or e.get("startDate", "")
if not start_str:
return "TBD"
try:
start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
now_utc = datetime.now(timezone.utc)
delta = start_dt - now_utc
if delta.total_seconds() < 0:
hours_ago = abs(delta.total_seconds()) / 3600
if hours_ago < 1:
return "LIVE"
elif hours_ago < 4:
return f"LIVE {int(hours_ago)}h"
elif hours_ago < 24:
return f"{int(hours_ago)}h ago"
else:
days = int(hours_ago / 24)
return f"{days}d ago"
abs_time = start_dt.astimezone(tz).strftime("%b %d, %H:%M ")
if tz == WIB:
abs_time += "WIB"
else:
hours_until = delta.total_seconds() / 3600
if hours_until <= 0:
return "LIVE"
elif hours_until < 1:
mins = int(delta.total_seconds() / 60)
return f"In {mins}m"
elif hours_until < 24:
return f"In {int(hours_until)}h"
else:
days = int(hours_until / 24)
return f"In {days}d"
except:
return ""
abs_time += start_dt.astimezone(tz).strftime("%Z")
return {"time_status": time_status, "time_urgency": time_urgency, "abs_time": abs_time}
except Exception:
return {"time_status": "", "time_urgency": 0, "abs_time": "TBD"}
def filter_events(events, tradeable_only=True):
"""
@@ -326,6 +329,7 @@ def filter_events(events, tradeable_only=True):
return match_events, non_match_events
def sort_events(events):
return sorted(events, key=get_ml_volume, reverse=True)
@@ -333,24 +337,214 @@ def sort_events(events):
# BROWSE
# ============================================================
def browse_events(q, matches_max=10, non_matches_max=10, tradeable_only=True):
result = fetch_all_pages(q)
def browse_events(q, matches_max=10, non_matches_max=10, tradeable_only=True, sort_by=None):
"""
Browse Polymarket events.
Args:
q: search query
matches_max: max number of match markets to return
non_matches_max: max number of non-match markets to return
tradeable_only: filter to tradeable events only
sort_by: None (fast, API order) or "volume" (full fetch, sort by volume desc)
"""
# Pass quotas to fetch_all_pages for early-exit optimization.
# Only use early-exit when sort_by is None (no client-side sort needed).
use_early_exit = (sort_by is None)
fetch_matches_max = matches_max if use_early_exit else None
fetch_non_matches_max = non_matches_max if use_early_exit else None
result = fetch_all_pages(q, matches_max=fetch_matches_max, non_matches_max=fetch_non_matches_max)
events = result["events"]
match_events, non_match_events = filter_events(events, tradeable_only)
sorted_match = sort_events(match_events)
# Sort if requested; otherwise preserve API order
if sort_by == "volume":
match_events = sort_events(match_events)
non_match_events = sort_events(non_match_events)
return {
"query": q,
"total_raw": result["total_raw"],
"total_fetched": len(events),
"total_match": len(match_events),
"total_non_match": len(non_match_events),
"match_events": sorted_match[:matches_max],
"match_events": match_events[:matches_max],
"non_match_events": non_match_events[:non_matches_max],
"partial": result.get("partial", False),
}
# ============================================================
# FORMAT
# FORMAT — EVENT
# ============================================================
def format_match_event(e):
"""
Format a match event into a canonical dict for rendering.
All computing done here; renderers just template.
Returns:
{
"title": str, # raw title
"title_clean": str, # "Team A vs Team B"
"tournament": str, # "Tournament Name" or ""
"url": str,
"time_status": str, # "LIVE", "In 6h", "12h ago"
"time_urgency": int, # 0-3
"abs_time": str, # "Mar 25, 19:00 WIB"
"team_a": str,
"team_b": str,
"odds_a": str, # "55c"
"odds_b": str,
"vol": int,
}
"""
ml = get_ml_market(e)
outcomes = json.loads(ml.get("outcomes", "[]")) if ml else []
prices = json.loads(ml.get("outcomePrices", "[]")) if ml else []
td = _get_time_data(e)
title = e.get("title", "")
team_a = outcomes[0] if len(outcomes) > 0 else "?"
team_b = outcomes[1] if len(outcomes) > 1 else "?"
odds_a = format_odds(float(prices[0])) if len(prices) > 0 else "?"
odds_b = format_odds(float(prices[1])) if len(prices) > 1 else "?"
if " - " in title:
title_clean = title.split(" - ")[0].strip()
else:
title_clean = title
tournament = get_tournament(title)
return {
"title": title,
"title_clean": title_clean,
"tournament": tournament,
"url": get_event_url(e),
"time_status": td["time_status"],
"time_urgency": td["time_urgency"],
"abs_time": td["abs_time"],
"team_a": team_a,
"team_b": team_b,
"odds_a": odds_a,
"odds_b": odds_b,
"vol": get_ml_volume(e),
}
def format_non_match_event(e):
"""
Format a non-match event into a canonical dict for rendering.
Returns:
{
"title": str,
"url": str,
"time_status": str,
"time_urgency": int,
"abs_time": str,
"market_count": int,
"total_vol": int,
}
"""
td = _get_time_data(e)
total_vol = sum(float(m.get("volume", 0)) for m in e.get("markets", []))
market_count = len(e.get("markets", []))
return {
"title": e.get("title", "?"),
"url": get_event_url(e),
"time_status": td["time_status"],
"time_urgency": td["time_urgency"],
"abs_time": td["abs_time"],
"market_count": market_count,
"total_vol": int(total_vol),
}
# ============================================================
# FORMAT — RENDER
# ============================================================
def render_match_lines(event_dict, i, mode):
"""
Render a formatted match event dict into lines of text.
Args:
event_dict: canonical dict from format_match_event()
i: 1-based index for the event number
mode: "text" for plain text/Markdown, "html" for Telegram HTML
Returns:
List[str], one line per element (no trailing blank line).
Caller adds the blank line separator between events.
"""
title_clean = event_dict["title_clean"]
url = event_dict["url"]
abs_time = event_dict["abs_time"]
time_status = event_dict["time_status"]
vol = event_dict["vol"]
tournament = event_dict["tournament"]
team_a = event_dict["team_a"]
team_b = event_dict["team_b"]
odds_a = event_dict["odds_a"]
odds_b = event_dict["odds_b"]
lines = []
if mode == "html":
lines.append(
f"<b>{i}.</b> <a href=\"{url}\">{escape_html(title_clean)}</a>"
)
else:
lines.append(f"{i}. [{title_clean}]({url})")
lines.append(f" {abs_time} | {time_status}")
lines.append(f" Vol: ${vol:,.0f}")
if tournament:
lines.append(f" Tournament: {tournament}")
lines.append(f" Odds: {team_a} {odds_a} | {odds_b} {team_b}")
return lines
def render_non_match_lines(event_dict, i, mode):
"""
Render a formatted non-match event dict into lines of text.
Args:
event_dict: canonical dict from format_non_match_event()
i: 1-based index for the event number
mode: "text" for plain text/Markdown, "html" for Telegram HTML
Returns:
List[str], one line per element (no trailing blank line).
"""
title = event_dict["title"]
url = event_dict["url"]
abs_time = event_dict["abs_time"]
time_status = event_dict["time_status"]
market_count = event_dict["market_count"]
total_vol = event_dict["total_vol"]
lines = []
if mode == "html":
lines.append(f"<b>{i}.</b> <a href=\"{url}\">{escape_html(title)}</a>")
else:
lines.append(f"{i}. [{title}]({url})")
lines.append(f" {abs_time} | {time_status}")
lines.append(f" Markets: {market_count} | Total Vol: ${total_vol:,.0f}")
return lines
# ============================================================
# FORMAT — LEGACY
# ============================================================
def format_event(e):
@@ -360,12 +554,12 @@ def format_event(e):
best_bid = float(ml.get("bestBid", 0)) if ml else 0
best_ask = float(ml.get("bestAsk", 0)) if ml else 0
vol = get_ml_volume(e)
time_status, urgency = get_match_time_status(e)
td = _get_time_data(e)
return {
"title": e.get("title", ""),
"time_status": time_status,
"time_urgency": urgency,
"time_status": td["time_status"],
"time_urgency": td["time_urgency"],
"url": get_event_url(e),
"livestream": e.get("resolutionSource"),
"outcomes": outcomes,
@@ -384,11 +578,12 @@ def format_detail_event(e):
]
active_markets = sorted(active_markets, key=lambda m: float(m.get("volume", 0)), reverse=True)
time_status, urgency = get_match_time_status(e)
td = _get_time_data(e)
return {
"title": e.get("title", ""),
"time_status": time_status,
"time_status": td["time_status"],
"abs_time": td["abs_time"],
"url": get_event_url(e),
"livestream": e.get("resolutionSource"),
"outcomes": json.loads(ml.get("outcomes", "[]")) if ml else [],
@@ -415,48 +610,6 @@ def format_detail_event(e):
# DISPLAY
# ============================================================
def get_start_time_wib(e):
"""Return (date_time_str, relative_str) for display."""
start_str = e.get("startTime") or e.get("startDate", "")
if not start_str:
return "TBD", ""
try:
start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
now_utc = datetime.now(timezone.utc)
utc7 = timezone(timedelta(hours=7))
start_utc7 = start_dt.astimezone(utc7)
# Absolute: "Mar 25, 19:00 WIB"
abs_str = start_utc7.strftime("%b %d, %H:%M WIB")
# Relative: "In 5h", "In 10h", "LIVE", etc.
delta = start_dt - now_utc
if delta.total_seconds() < 0:
hours_ago = abs(delta.total_seconds()) / 3600
if hours_ago < 1:
rel_str = "LIVE"
elif hours_ago < 24:
rel_str = f"{int(hours_ago)}h ago"
else:
days = int(hours_ago / 24)
rel_str = f"{days}d ago"
else:
hours_until = delta.total_seconds() / 3600
if hours_until <= 0:
rel_str = "LIVE"
elif hours_until < 1:
mins_until = int(delta.total_seconds() / 60)
rel_str = f"In {mins_until}m"
elif hours_until < 24:
rel_str = f"In {int(hours_until)}h"
else:
days = int(hours_until / 24)
rel_str = f"In {days}d"
return abs_str, rel_str
except:
return "TBD", ""
def get_header_date():
"""Return current date string like 'Mar 25, 2026'"""
now_utc = datetime.now(timezone.utc)
@@ -487,9 +640,8 @@ def print_browse(match_events, non_match_events, category, total_raw, total_fetc
if partial:
print(f"WARNING: Partial fetch (API error or timeout) — data may be incomplete")
# --- MATCH MARKETS ---
# Determine sections to show
if not matches_only and not non_matches_only:
# Default: show both
show_matches = True
show_non_matches = True
elif matches_only:
@@ -499,68 +651,31 @@ def print_browse(match_events, non_match_events, category, total_raw, total_fetc
show_matches = False
show_non_matches = True
# Match events
if show_matches:
print(f"\nMATCH MARKETS")
print("\nMATCH MARKETS")
if not match_events:
print(" No match markets found.")
else:
for i, e in enumerate(match_events, 1):
f = format_event(e)
ml = get_ml_market(e)
outcomes = json.loads(ml.get("outcomes", "[]")) if ml else []
prices = json.loads(ml.get("outcomePrices", "[]")) if ml else []
vol = f["volume"]
title = f["title"]
url = f["url"]
start_time_wib, rel_time = get_start_time_wib(e)
fd = format_match_event(e)
for line in render_match_lines(fd, i, mode="text"):
print(line)
team_a = outcomes[0] if len(outcomes) > 0 else "?"
team_b = outcomes[1] if len(outcomes) > 1 else "?"
odds_a = format_odds(float(prices[0])) if len(prices) > 0 else "?"
odds_b = format_odds(float(prices[1])) if len(prices) > 1 else "?"
if " - " in title:
title_clean = title.split(" - ")[0].strip()
else:
title_clean = title
tournament = get_tournament(title)
print(f"\n {i}. [{title_clean}]({url})")
print(f" {start_time_wib} | {rel_time}")
print(f" Vol: ${vol:,.0f}")
if tournament:
print(f" Tournament: {tournament}")
print(f" Odds: {team_a} {odds_a} | {odds_b} {team_b}")
# --- NON-MATCH MARKETS ---
# Non-match events
if show_non_matches and non_match_events:
print(f"\nNON-MATCH MARKETS")
print("\nNON-MATCH MARKETS")
for i, e in enumerate(non_match_events[:non_matches_max], 1):
title = e.get("title", "?")
url = get_event_url(e)
start_time_wib, rel_time = get_start_time_wib(e)
total_vol = sum(float(m.get("volume", 0)) for m in e.get("markets", []))
market_count = len(e.get("markets", []))
print(f"\n {i}. [{title}]({url})")
print(f" {start_time_wib} | {rel_time}")
print(f" Markets: {market_count} | Total Vol: ${total_vol:,.0f}")
fd = format_non_match_event(e)
for line in render_non_match_lines(fd, i, mode="text"):
print(line)
def print_detail(e, detail):
from datetime import datetime, timezone, timedelta
now_utc = datetime.now(timezone.utc)
utc7 = timezone(timedelta(hours=7))
now_utc7 = now_utc.astimezone(utc7)
print(f"\n{detail['title']}")
print(f"URL: {detail['url']}")
print(f"Livestream: {detail['livestream']}")
spread_str = format_spread(detail["best_bid"], detail["best_ask"]) if detail["best_bid"] and detail["best_ask"] else "N/A"
time_str = get_match_time_str(e)
print(f"\n{detail['time_status']}")
print(f"ML: {detail['outcomes'][0]} {format_odds(float(detail['prices'][0]))} vs {detail['outcomes'][1]} {format_odds(float(detail['prices'][1]))}")
print(f"ML Vol: ${detail['volume']:,.0f} | {spread_str}")
@@ -577,6 +692,15 @@ def print_detail(e, detail):
# TELEGRAM
# ============================================================
def escape_html(text):
"""Escape HTML-sensitive characters for Telegram parse_mode=HTML."""
return (text
.replace("&", "&amp;")
.replace("<", "&lt;")
.replace(">", "&gt;")
.replace('"', "&quot;"))
def send_telegram_message(bot_token, chat_id, text, timeout=10):
"""Send a message via Telegram bot API. Returns the message ID on success.
@@ -621,78 +745,68 @@ def send_to_telegram(match_events, non_match_events, category, matches_only=Fals
msg_id = send_telegram_message(bot_token, chat_id, text)
print(f" Sent msg {msg_id}")
# Build sections
lines = [f"<b>{category.upper()}</b> | {header_date}"]
lines.append("")
# Build lines
lines = [f"<b>{category.upper()}</b> | {header_date}", ""]
if show_matches:
lines.append("MATCH MARKETS")
lines.append("")
lines += ["MATCH MARKETS", ""]
if not match_events:
lines.append(" No match markets found.")
else:
for i, e in enumerate(match_events, 1):
ml = get_ml_market(e)
outcomes = json.loads(ml.get("outcomes", "[]")) if ml else []
prices = json.loads(ml.get("outcomePrices", "[]")) if ml else []
vol = get_ml_volume(e)
title = e.get("title", "?")
url = get_event_url(e)
start_time_wib, rel_time = get_start_time_wib(e)
team_a = outcomes[0] if len(outcomes) > 0 else "?"
team_b = outcomes[1] if len(outcomes) > 1 else "?"
odds_a = format_odds(float(prices[0])) if len(prices) > 0 else "?"
odds_b = format_odds(float(prices[1])) if len(prices) > 1 else "?"
tournament = get_tournament(title)
title_clean = title.split(" - ")[0].strip() if " - " in title else title
lines.append(f"<b>{i}.</b> <a href=\"{url}\">{title_clean}</a>")
lines.append(f" {start_time_wib} | {rel_time}")
lines.append(f" Vol: ${vol:,.0f}")
if tournament:
lines.append(f" Tournament: {tournament}")
lines.append(f" Odds: {team_a} {odds_a} | {odds_b} {team_b}")
fd = format_match_event(e)
lines += render_match_lines(fd, i, mode="html")
lines.append("")
lines.append("")
if show_non_matches:
lines.append("NON-MATCH MARKETS")
lines.append("")
lines += ["NON-MATCH MARKETS", ""]
if not non_match_events:
lines.append(" No non-match markets found.")
else:
for i, e in enumerate(non_match_events, 1):
title = e.get("title", "?")
url = get_event_url(e)
start_time_wib, rel_time = get_start_time_wib(e)
total_vol = sum(float(m.get("volume", 0)) for m in e.get("markets", []))
market_count = len(e.get("markets", []))
lines.append(f"<b>{i}.</b> <a href=\"{url}\">{title}</a>")
lines.append(f" {start_time_wib} | {rel_time}")
lines.append(f" Markets: {market_count} | Total Vol: ${total_vol:,.0f}")
fd = format_non_match_event(e)
lines += render_non_match_lines(fd, i, mode="html")
lines.append("")
lines.append("")
# Chunk by 10 items (events), respecting 4096 char Telegram limit
text = "\n".join(lines)
# Chunk and send
send_chunked(lines, send, category, header_date, show_matches, show_non_matches)
def send_chunked(all_lines, send_fn, category, header_date, show_matches, show_non_matches):
"""
Split already-built lines into Telegram-safe chunks and send them.
Telegram messages are capped at 4096 chars. Chunks are grouped by
section header so no event is split across messages.
Args:
all_lines: Full message lines list (built by caller).
send_fn: Closure that sends a single string and prints confirmation.
category: Category name for header.
header_date: Date string for header.
show_matches: Whether MATCH MARKETS section is present.
show_non_matches: Whether NON-MATCH MARKETS section is present.
"""
text = "\n".join(all_lines)
if len(text) <= 4096:
send(text)
send_fn(text)
return
# Split into chunks of 10 events
# Split into chunks of 10 events, respecting section headers
all_items = []
in_match = True
for line in lines:
for line in all_lines:
if line == "MATCH MARKETS":
in_match = True
elif line == "NON-MATCH MARKETS":
in_match = False
elif line.startswith("<b>") and ". " in line and "</a>" in line:
elif line.startswith("<b>") and "</a>" in line:
# Event title line: <b>1.</b> <a href="...">Title</a>
all_items.append((in_match, line))
chunk = []
chunk_len = 0
chunk_num = 1
# Header is always first
header = f"<b>{category.upper()}</b> | {header_date}\n"
if show_matches:
header += "\nMATCH MARKETS\n\n"
@@ -703,9 +817,8 @@ def send_to_telegram(match_events, non_match_events, category, matches_only=Fals
test_chunk = chunk + [item_line, ""]
test_text = header + "\n".join(chunk) + "\n".join(test_chunk)
if len(test_text) > 4096 or len(chunk) >= 10:
# Send current chunk
msg = header + "\n".join(chunk)
send(msg)
send_fn(msg)
chunk = [item_line, ""]
header = f"<b>{category.upper()}</b> (cont.) | {header_date}\n"
if show_matches and is_match:
@@ -717,7 +830,7 @@ def send_to_telegram(match_events, non_match_events, category, matches_only=Fals
if chunk:
msg = header + "\n".join(chunk)
send(msg)
send_fn(msg)
# ============================================================

File diff suppressed because it is too large Load Diff