7 Commits

Author SHA1 Message Date
shoko
c348d6daa1 tests: Add unit tests for browse_events, fetch_all_pages, filter_events, is_match_market, get_ml_market, get_ml_volume, sort_events
New test classes:
- TestIsMatchMarket: 5 tests for is_match_market() classification
- TestGetMlMarket: 5 tests for get_ml_market() and get_ml_volume()
- TestFilterEvents: 5 tests for filter_events() and sort_events()
- TestFetchAllPages: 4 tests for fetch_all_pages() early-exit logic
- TestBrowseEvents: 5 tests for browse_events() sort_by parameter

Total: 24 new tests (62 total, all passing)
2026-03-25 19:08:36 +00:00
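The volume-descending ordering exercised by TestFilterEvents can be sketched as a self-contained pytest-style check. `sort_events` below is a simplified stand-in that takes an injected `get_volume` getter, since `get_ml_volume` lives elsewhere in the module; the test name and the event dicts are hypothetical.

```python
def sort_events(events, get_volume=lambda e: e["volume"]):
    # Descending-volume sort, mirroring sort_events(); get_volume is an
    # injected stand-in for get_ml_volume (an assumption for testability).
    return sorted(events, key=get_volume, reverse=True)

def test_sort_events_orders_by_volume_desc():
    events = [{"id": "a", "volume": 10},
              {"id": "b", "volume": 500},
              {"id": "c", "volume": 42}]
    assert [e["id"] for e in sort_events(events)] == ["b", "c", "a"]
```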
shoko
764c75e712 Fix: Switch fetch_page from subprocess to urllib, add early-exit to fetch_all_pages, add sort_by to browse_events
- fetch_page: replace subprocess.run(curl) with urllib (stdlib, cleaner)
- fetch_all_pages: add matches_max/non_matches_max params for early-exit.
  When both are set, stop fetching once quotas are satisfied.
- browse_events: add sort_by param (None='fast' early-exit, 'volume'=full fetch+sort).
  Early-exit only used when sort_by=None (no client-side sort needed).
- Remove subprocess import (no longer needed after migration)
2026-03-25 18:53:11 +00:00
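The urllib migration above follows a standard retry-with-exponential-backoff shape. A minimal self-contained sketch, not the project's exact `fetch_page`: the `fetch_json` name and the injectable `opener` parameter are assumptions added here so the loop is testable without a network.

```python
import json
import time
from urllib.request import Request, urlopen

def fetch_json(url, max_retries=5, initial_delay=2, opener=urlopen):
    """GET a URL and parse JSON, retrying with exponential backoff.

    `opener` defaults to urllib's urlopen; a fake can be injected in tests.
    """
    delay = initial_delay
    for attempt in range(max_retries):
        if attempt > 0:
            time.sleep(delay)  # back off before each retry
            delay *= 2         # 2s, 4s, 8s, ...
        try:
            req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
            with opener(req, timeout=10) as r:
                return json.loads(r.read())
        except Exception:
            continue  # network or JSON error: retry until attempts run out
    return None
```

Skipping the sleep on the first attempt (as the commit does) means a healthy API call pays no delay at all, while failures still back off exponentially.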
shoko
3a9f8fb365 Fix #14: Refactor print_browse/send_to_telegram into single pipeline
Replace duplicate inline formatting with unified format+render pipeline.

New functions:
- format_match_event(e) — canonical dict for match events
- format_non_match_event(e) — canonical dict for non-match events
- render_match_lines(event_dict, i, mode) — text/HTML renderer
- render_non_match_lines(event_dict, i, mode) — text/HTML renderer
- send_chunked(...) — extracted Telegram chunking logic

Also fixed a chunking bug in send_chunked(): the original "'. ' in line"
check never matched event lines, because the period is followed by '</b>' rather than a space.

Tests: 38 total, all passing.

Fixes: #14
2026-03-25 17:50:54 +00:00
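The format+render split can be illustrated with a stripped-down pair of functions. `format_item` and `render_item_lines` are hypothetical names modeled on `format_non_match_event` and `render_non_match_lines`; the dict keys mirror those in the commit, but the input shape here is simplified.

```python
def format_item(e):
    """Canonical dict: all computation happens here; renderers just template."""
    markets = e.get("markets", [])
    return {
        "title": e.get("title", "?"),
        "url": e.get("url", ""),
        "market_count": len(markets),
        "total_vol": int(sum(m.get("volume", 0) for m in markets)),
    }

def render_item_lines(d, i, mode="text"):
    """Render one event dict as lines; 'html' and 'text' differ only in the title line."""
    if mode == "html":
        head = f'<b>{i}.</b> <a href="{d["url"]}">{d["title"]}</a>'
    else:
        head = f'{i}. [{d["title"]}]({d["url"]})'
    return [head, f' Markets: {d["market_count"]} | Total Vol: ${d["total_vol"]:,.0f}']
```

Because both output modes consume the same dict, a formatting change (say, a new field) is made once in the format function instead of twice in duplicated inline loops.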
shoko
a7837cec0f Merge #15: Unify duplicate time functions 2026-03-25 14:34:05 +00:00
shoko
8cde441996 Fix #15: Unify duplicate time functions into _get_time_data()
Replace three duplicated time parsing functions with a single
_get_time_data(e, tz) helper returning {time_status, time_urgency, abs_time}.

Deleted functions:
- get_match_time_status(e)  — urgency + status string
- get_match_time_str(e)    — status string only
- get_start_time_wib(e)    — (abs_time, rel_str) tuple

New unified helper:
- _get_time_data(e, tz=None) returns {time_status, time_urgency, abs_time}
- tz defaults to WIB (UTC+7, Indonesia)
- canonical time_status format: 'LIVE', 'In 6h', '12h ago', etc.
- time_urgency: 0-3 (higher = more urgent/live)

All call sites updated to use _get_time_data():
- format_event(), format_detail_event()
- print_browse(), print_detail()
- send_to_telegram()

Also: removed dead code in print_detail() that called get_match_time_str()
but never used the result.

Tests: 9 new tests for _get_time_data() covering TBD, future, live,
and past event scenarios. 19 tests total, all passing.

Fixes: #15
2026-03-25 13:59:54 +00:00
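The relative-time buckets described above (LIVE within an hour, hour granularity under a day, day granularity beyond) can be sketched as a pure function. `rel_status` is a hypothetical stand-in that takes `now` explicitly so the buckets are testable; the real `_get_time_data` reads the clock itself and also returns the formatted abs_time.

```python
from datetime import datetime, timedelta, timezone

def rel_status(start_dt, now):
    """Bucket an event start time into (status_str, urgency 0-3),
    mirroring the buckets _get_time_data() uses."""
    sec = (start_dt - now).total_seconds()
    if sec < 0:  # already started
        hours_ago = abs(sec) / 3600
        if hours_ago < 1:
            return "LIVE", 3
        if hours_ago < 4:
            return f"LIVE {int(hours_ago)}h", 3
        if hours_ago < 24:
            return f"{int(hours_ago)}h ago", 1
        return f"{int(hours_ago // 24)}d ago", 0
    if sec < 3600:  # starts within the hour
        return f"In {int(sec // 60)}m", 3
    if sec < 86400:  # starts today
        return f"In {int(sec // 3600)}h", 2
    return f"In {int(sec // 86400)}d", 1
```

Keeping the comparisons in UTC seconds, with the timezone applied only when formatting the absolute time, is what lets one tz parameter replace the three duplicated WIB conversions.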
b2180a4a34 Merge pull request 'Fix #5: HTML injection in Telegram messages' (#20) from fix/issue-5-html-injection-telegram into master 2026-03-25 13:13:52 +01:00
shoko
d0534aedbf Fix #5: HTML injection in Telegram messages
Add escape_html() function to prevent HTML injection in Telegram
parse_mode=HTML messages. Apply escaping to event titles inserted
into <a> tags in send_to_telegram().

- Add escape_html() using stdlib html.escape()
- Escape match event titles (line 648) and non-match titles (line 676)
- Add TestHtmlInjection with 2 tests proving fix:
  - <script> tags escaped as &lt;script&gt;
  - & ampersands escaped as &amp;
- Fixes HIGH severity: titles from Polymarket API were inserted
  without escaping, allowing malformed HTML in Telegram messages
2026-03-25 11:42:42 +00:00
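Since the commit message says escape_html() is built on the stdlib, the fix reduces to a thin wrapper over `html.escape`. A minimal sketch; note that `html.escape` with its default `quote=True` also escapes single quotes, a superset of the four manual replacements shown in the diff on this page.

```python
import html

def escape_html(text):
    """Escape &, <, > and quotes for Telegram parse_mode=HTML."""
    # html.escape(quote=True) escapes & < > " and also ' (as &#x27;).
    return html.escape(text, quote=True)
```

Escaping must happen at render time, on the title inserted between the `<a>` tags, never on the surrounding markup the code generates itself.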
2 changed files with 1462 additions and 265 deletions


@@ -4,6 +4,7 @@ Polymarket Event Browser
 Browse tradeable Polymarket events by game category.
 """
+import html
 import json
 import time
 import argparse
@@ -18,6 +19,7 @@ from urllib.request import urlopen, Request
 PAGE_SIZE = 50
 MAX_RETRIES = 5
 INITIAL_RETRY_DELAY = 2  # exponential backoff starts at 2s
+WIB = timezone(timedelta(hours=7))  # UTC+7 for Indonesian users

 GAME_CATEGORIES = {
     "All Esports": "Esports",
@@ -40,52 +42,68 @@ def fetch_page(q, page=1, max_retries=MAX_RETRIES, initial_delay=INITIAL_RETRY_D
     url = (f"{base}?q={q.replace(' ', '%20')}&limit={PAGE_SIZE}&page={page}"
            f"&search_profiles=false&search_tags=false"
            f"&keep_closed_markets=0&events_status=active&cache=false")
     delay = initial_delay
     for attempt in range(max_retries):
-        time.sleep(delay)
-        r = subprocess.run(
-            ["curl", "-s", url, "--max-time", "10", "-H", "User-Agent: curl/7.88.1"],
-            capture_output=True
-        )
-        if r.returncode == 0 and len(r.stdout) > 0:
-            try:
-                return json.loads(r.stdout.decode('utf-8'))
-            except json.JSONDecodeError:
-                if attempt < max_retries - 1:
-                    delay *= 2  # Exponential backoff
-                    continue
-                return None
-        else:
-            # Rate limit or other error - exponential backoff
+        if attempt > 0:
+            time.sleep(delay)
+        try:
+            req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
+            with urlopen(req, timeout=10) as r:
+                return json.loads(r.read())
+        except Exception:
             if attempt < max_retries - 1:
                 delay *= 2
                 continue
             return None
     return None

-def fetch_all_pages(q, max_pages=100):
+def fetch_all_pages(q, matches_max=None, non_matches_max=None):
     """
-    Fetch ALL pages until pagination ends.
-    max_pages is a safety cap to prevent infinite loops.
+    Fetch pages until pagination ends, or until quotas are satisfied.
+
+    Args:
+        q: search query
+        matches_max: stop early once we have this many match events (None = no limit)
+        non_matches_max: stop early once we have this many non-match events (None = no limit)
+
+    Returns:
+        {"events": [...], "total_raw": N, "partial": bool}
     """
     all_events = []
     total_raw = 0
-    for page in range(1, max_pages + 1):
-        time.sleep(0.2)  # small delay between pages (API rate limit is generous)
+    match_count = 0
+    non_match_count = 0
+    page = 0
+    while True:
+        page += 1
+        time.sleep(0.2)
         data = fetch_page(q, page)
         if data is None:
            break
         events = data.get("events", [])
         total_raw = data.get("pagination", {}).get("totalResults", 0)
         all_events.extend(events)
-        # Stop when we get 0 events (no more pages),
-        # OR when we've fetched >= total results
+
+        # Count matches/non-matches in this page
+        for e in events:
+            if is_match_market(e):
+                match_count += 1
+            else:
+                non_match_count += 1
+
+        # Stop if we got what we wanted (only when caps are set)
+        if matches_max is not None and non_matches_max is not None:
+            if match_count >= matches_max and non_match_count >= non_matches_max:
+                break
+
+        # Stop when we get 0 events (no more pages)
         if len(events) == 0:
             break
+        # Stop when we've fetched all known results
         if len(all_events) >= total_raw:
             break
     partial = (total_raw > 0 and len(all_events) < total_raw)
     return {"events": all_events, "total_raw": total_raw, "partial": partial}
@@ -220,94 +238,79 @@ def format_spread(bid, ask):
     spread = ask - bid
     return f"{prob_to_cents(spread)}c"

-def get_match_time_status(e):
-    """
-    Return a human-readable match time status.
-    Returns (status_str, urgency) where urgency is 0-3 (higher = more urgent/live).
-    Uses startTime for actual match start time.
-    Displays times in WIB (UTC+7 for Indonesian users).
-    """
-    # Use startTime for actual match start, not startDate (which is market creation time)
-    start_str = e.get("startTime") or e.get("startDate", "")
-    if not start_str:
-        return "TBD", 0
-    try:
-        start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
-        now_utc = datetime.now(timezone.utc)
-        utc7 = timezone(timedelta(hours=7))
-        now = now_utc.astimezone(utc7)
-        start_utc7 = start_dt.astimezone(utc7)
-        delta = start_dt - now_utc
-        if delta.total_seconds() < 0:
-            # Started already
-            hours_ago = abs(delta.total_seconds()) / 3600
-            if hours_ago < 1:
-                return "LIVE", 3
-            elif hours_ago < 4:
-                return f"LIVE {int(hours_ago)}h", 3
-            elif hours_ago < 24:
-                return f"Started {int(hours_ago)}h ago", 1
-            else:
-                days = int(hours_ago / 24)
-                return f"{days}d ago", 0
-        else:
-            # Starts in future
-            hours_until = delta.total_seconds() / 3600
-            if hours_until <= 0:
-                return "LIVE", 3
-            elif hours_until < 1:
-                mins = int(delta.total_seconds() / 60)
-                return f"In {mins}m", 3
-            elif hours_until < 24:
-                return f"In {int(hours_until)}h", 2
-            else:
-                days = int(hours_until / 24)
-                return f"In {days}d", 1
-    except:
-        return "", 0
-
-def get_match_time_str(e):
-    """
-    Return just the time status string (e.g. 'LIVE', 'In 6h', 'In 1d').
-    Uses startTime for actual match start time.
-    """
-    start_str = e.get("startTime") or e.get("startDate", "")
-    if not start_str:
-        return "TBD"
-    try:
-        start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
-        now_utc = datetime.now(timezone.utc)
-        delta = start_dt - now_utc
-        if delta.total_seconds() < 0:
-            hours_ago = abs(delta.total_seconds()) / 3600
-            if hours_ago < 1:
-                return "LIVE"
-            elif hours_ago < 4:
-                return f"LIVE {int(hours_ago)}h"
-            elif hours_ago < 24:
-                return f"{int(hours_ago)}h ago"
-            else:
-                days = int(hours_ago / 24)
-                return f"{days}d ago"
-        else:
-            hours_until = delta.total_seconds() / 3600
-            if hours_until <= 0:
-                return "LIVE"
-            elif hours_until < 1:
-                mins = int(delta.total_seconds() / 60)
-                return f"In {mins}m"
-            elif hours_until < 24:
-                return f"In {int(hours_until)}h"
-            else:
-                days = int(hours_until / 24)
-                return f"In {days}d"
-    except:
-        return ""
+def _get_time_data(e, tz=None):
+    """
+    Unified time data extraction for event timestamps.
+
+    Uses startTime (preferred) or startDate as the event start time.
+    Datetime parsing and all relative calculations are UTC-based.
+    The tz parameter only affects the abs_time formatting.
+
+    Args:
+        e: Event dict with 'startTime' or 'startDate' key.
+        tz: datetime.timezone for abs_time formatting.
+            Defaults to WIB (UTC+7).
+
+    Returns:
+        {
+            "time_status": str,   # e.g. "LIVE", "In 6h", "12h ago"
+            "time_urgency": int,  # 0-3 (higher = more urgent/live)
+            "abs_time": str,      # e.g. "Mar 25, 19:00 WIB" or "TBD"
+        }
+    """
+    tz = tz or WIB
+    start_str = e.get("startTime") or e.get("startDate", "")
+    if not start_str:
+        return {"time_status": "TBD", "time_urgency": 0, "abs_time": "TBD"}
+    try:
+        start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
+        now_utc = datetime.now(timezone.utc)
+        delta = start_dt - now_utc
+        total_sec = delta.total_seconds()
+
+        if total_sec < 0:
+            # Event is in the past
+            hours_ago = abs(total_sec) / 3600
+            if hours_ago < 1:
+                time_status = "LIVE"
+                time_urgency = 3
+            elif hours_ago < 4:
+                time_status = f"LIVE {int(hours_ago)}h"
+                time_urgency = 3
+            elif hours_ago < 24:
+                time_status = f"{int(hours_ago)}h ago"
+                time_urgency = 1
+            else:
+                days = int(hours_ago / 24)
+                time_status = f"{days}d ago"
+                time_urgency = 0
+        else:
+            # Event is in the future
+            if total_sec < 3600:
+                mins = int(total_sec / 60)
+                time_status = f"In {mins}m"
+                time_urgency = 3
+            elif total_sec < 86400:
+                hours_until = int(total_sec / 3600)
+                time_status = f"In {hours_until}h"
+                time_urgency = 2
+            else:
+                days = int(total_sec / 86400)
+                time_status = f"In {days}d"
+                time_urgency = 1
+
+        abs_time = start_dt.astimezone(tz).strftime("%b %d, %H:%M ")
+        if tz == WIB:
+            abs_time += "WIB"
+        else:
+            abs_time += start_dt.astimezone(tz).strftime("%Z")
+        return {"time_status": time_status, "time_urgency": time_urgency, "abs_time": abs_time}
+    except Exception:
+        return {"time_status": "", "time_urgency": 0, "abs_time": "TBD"}

 def filter_events(events, tradeable_only=True):
     """
@@ -316,16 +319,17 @@ def filter_events(events, tradeable_only=True):
     """
     match_events = []
     non_match_events = []
     for e in events:
         if is_match_market(e):
             if not tradeable_only or is_tradeable_event(e):
                 match_events.append(e)
         else:
             non_match_events.append(e)
     return match_events, non_match_events

 def sort_events(events):
     return sorted(events, key=get_ml_volume, reverse=True)
@@ -333,24 +337,214 @@ def sort_events(events):
 # ============================================================
 # BROWSE
 # ============================================================

-def browse_events(q, matches_max=10, non_matches_max=10, tradeable_only=True):
-    result = fetch_all_pages(q)
+def browse_events(q, matches_max=10, non_matches_max=10, tradeable_only=True, sort_by=None):
+    """
+    Browse Polymarket events.
+
+    Args:
+        q: search query
+        matches_max: max number of match markets to return
+        non_matches_max: max number of non-match markets to return
+        tradeable_only: filter to tradeable events only
+        sort_by: None (fast, API order) or "volume" (full fetch, sort by volume desc)
+    """
+    # Pass quotas to fetch_all_pages for early-exit optimization.
+    # Only use early-exit when sort_by is None (no client-side sort needed).
+    use_early_exit = (sort_by is None)
+    fetch_matches_max = matches_max if use_early_exit else None
+    fetch_non_matches_max = non_matches_max if use_early_exit else None
+    result = fetch_all_pages(q, matches_max=fetch_matches_max, non_matches_max=fetch_non_matches_max)
     events = result["events"]
     match_events, non_match_events = filter_events(events, tradeable_only)
-    sorted_match = sort_events(match_events)
+
+    # Sort if requested; otherwise preserve API order
+    if sort_by == "volume":
+        match_events = sort_events(match_events)
+        non_match_events = sort_events(non_match_events)
+
     return {
         "query": q,
         "total_raw": result["total_raw"],
         "total_fetched": len(events),
         "total_match": len(match_events),
         "total_non_match": len(non_match_events),
-        "match_events": sorted_match[:matches_max],
+        "match_events": match_events[:matches_max],
         "non_match_events": non_match_events[:non_matches_max],
         "partial": result.get("partial", False),
     }

 # ============================================================
-# FORMAT
+# FORMAT — EVENT
+# ============================================================
+
+def format_match_event(e):
+    """
+    Format a match event into a canonical dict for rendering.
+    All computing done here; renderers just template.
+
+    Returns:
+        {
+            "title": str,         # raw title
+            "title_clean": str,   # "Team A vs Team B"
+            "tournament": str,    # "Tournament Name" or ""
+            "url": str,
+            "time_status": str,   # "LIVE", "In 6h", "12h ago"
+            "time_urgency": int,  # 0-3
+            "abs_time": str,      # "Mar 25, 19:00 WIB"
+            "team_a": str,
+            "team_b": str,
+            "odds_a": str,        # "55c"
+            "odds_b": str,
+            "vol": int,
+        }
+    """
+    ml = get_ml_market(e)
+    outcomes = json.loads(ml.get("outcomes", "[]")) if ml else []
+    prices = json.loads(ml.get("outcomePrices", "[]")) if ml else []
+    td = _get_time_data(e)
+    title = e.get("title", "")
+    team_a = outcomes[0] if len(outcomes) > 0 else "?"
+    team_b = outcomes[1] if len(outcomes) > 1 else "?"
+    odds_a = format_odds(float(prices[0])) if len(prices) > 0 else "?"
+    odds_b = format_odds(float(prices[1])) if len(prices) > 1 else "?"
+    if " - " in title:
+        title_clean = title.split(" - ")[0].strip()
+    else:
+        title_clean = title
+    tournament = get_tournament(title)
+    return {
+        "title": title,
+        "title_clean": title_clean,
+        "tournament": tournament,
+        "url": get_event_url(e),
+        "time_status": td["time_status"],
+        "time_urgency": td["time_urgency"],
+        "abs_time": td["abs_time"],
+        "team_a": team_a,
+        "team_b": team_b,
+        "odds_a": odds_a,
+        "odds_b": odds_b,
+        "vol": get_ml_volume(e),
+    }
+
+def format_non_match_event(e):
+    """
+    Format a non-match event into a canonical dict for rendering.
+
+    Returns:
+        {
+            "title": str,
+            "url": str,
+            "time_status": str,
+            "time_urgency": int,
+            "abs_time": str,
+            "market_count": int,
+            "total_vol": int,
+        }
+    """
+    td = _get_time_data(e)
+    total_vol = sum(float(m.get("volume", 0)) for m in e.get("markets", []))
+    market_count = len(e.get("markets", []))
+    return {
+        "title": e.get("title", "?"),
+        "url": get_event_url(e),
+        "time_status": td["time_status"],
+        "time_urgency": td["time_urgency"],
+        "abs_time": td["abs_time"],
+        "market_count": market_count,
+        "total_vol": int(total_vol),
+    }
+
+# ============================================================
+# FORMAT — RENDER
+# ============================================================
+
+def render_match_lines(event_dict, i, mode):
+    """
+    Render a formatted match event dict into lines of text.
+
+    Args:
+        event_dict: canonical dict from format_match_event()
+        i: 1-based index for the event number
+        mode: "text" for plain text/Markdown, "html" for Telegram HTML
+
+    Returns:
+        List[str], one line per element (no trailing blank line).
+        Caller adds the blank line separator between events.
+    """
+    title_clean = event_dict["title_clean"]
+    url = event_dict["url"]
+    abs_time = event_dict["abs_time"]
+    time_status = event_dict["time_status"]
+    vol = event_dict["vol"]
+    tournament = event_dict["tournament"]
+    team_a = event_dict["team_a"]
+    team_b = event_dict["team_b"]
+    odds_a = event_dict["odds_a"]
+    odds_b = event_dict["odds_b"]
+    lines = []
+    if mode == "html":
+        lines.append(
+            f"<b>{i}.</b> <a href=\"{url}\">{escape_html(title_clean)}</a>"
+        )
+    else:
+        lines.append(f"{i}. [{title_clean}]({url})")
+    lines.append(f" {abs_time} | {time_status}")
+    lines.append(f" Vol: ${vol:,.0f}")
+    if tournament:
+        lines.append(f" Tournament: {tournament}")
+    lines.append(f" Odds: {team_a} {odds_a} | {odds_b} {team_b}")
+    return lines
+
+def render_non_match_lines(event_dict, i, mode):
+    """
+    Render a formatted non-match event dict into lines of text.
+
+    Args:
+        event_dict: canonical dict from format_non_match_event()
+        i: 1-based index for the event number
+        mode: "text" for plain text/Markdown, "html" for Telegram HTML
+
+    Returns:
+        List[str], one line per element (no trailing blank line).
+    """
+    title = event_dict["title"]
+    url = event_dict["url"]
+    abs_time = event_dict["abs_time"]
+    time_status = event_dict["time_status"]
+    market_count = event_dict["market_count"]
+    total_vol = event_dict["total_vol"]
+    lines = []
+    if mode == "html":
+        lines.append(f"<b>{i}.</b> <a href=\"{url}\">{escape_html(title)}</a>")
+    else:
+        lines.append(f"{i}. [{title}]({url})")
+    lines.append(f" {abs_time} | {time_status}")
+    lines.append(f" Markets: {market_count} | Total Vol: ${total_vol:,.0f}")
+    return lines
+
+# ============================================================
+# FORMAT — LEGACY
 # ============================================================

 def format_event(e):
@@ -360,12 +554,12 @@ def format_event(e):
     best_bid = float(ml.get("bestBid", 0)) if ml else 0
     best_ask = float(ml.get("bestAsk", 0)) if ml else 0
     vol = get_ml_volume(e)
-    time_status, urgency = get_match_time_status(e)
+    td = _get_time_data(e)
     return {
         "title": e.get("title", ""),
-        "time_status": time_status,
-        "time_urgency": urgency,
+        "time_status": td["time_status"],
+        "time_urgency": td["time_urgency"],
         "url": get_event_url(e),
         "livestream": e.get("resolutionSource"),
         "outcomes": outcomes,
@@ -383,12 +577,13 @@ def format_detail_event(e):
         if float(m.get("volume", 0)) > 0 and is_tradeable_market(m)
     ]
     active_markets = sorted(active_markets, key=lambda m: float(m.get("volume", 0)), reverse=True)
-    time_status, urgency = get_match_time_status(e)
+    td = _get_time_data(e)
     return {
         "title": e.get("title", ""),
-        "time_status": time_status,
+        "time_status": td["time_status"],
+        "abs_time": td["abs_time"],
         "url": get_event_url(e),
         "livestream": e.get("resolutionSource"),
         "outcomes": json.loads(ml.get("outcomes", "[]")) if ml else [],
@@ -415,48 +610,6 @@ def format_detail_event(e):
 # DISPLAY
 # ============================================================

-def get_start_time_wib(e):
-    """Return (date_time_str, relative_str) for display."""
-    start_str = e.get("startTime") or e.get("startDate", "")
-    if not start_str:
-        return "TBD", ""
-    try:
-        start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
-        now_utc = datetime.now(timezone.utc)
-        utc7 = timezone(timedelta(hours=7))
-        start_utc7 = start_dt.astimezone(utc7)
-        # Absolute: "Mar 25, 19:00 WIB"
-        abs_str = start_utc7.strftime("%b %d, %H:%M WIB")
-        # Relative: "In 5h", "In 10h", "LIVE", etc.
-        delta = start_dt - now_utc
-        if delta.total_seconds() < 0:
-            hours_ago = abs(delta.total_seconds()) / 3600
-            if hours_ago < 1:
-                rel_str = "LIVE"
-            elif hours_ago < 24:
-                rel_str = f"{int(hours_ago)}h ago"
-            else:
-                days = int(hours_ago / 24)
-                rel_str = f"{days}d ago"
-        else:
-            hours_until = delta.total_seconds() / 3600
-            if hours_until <= 0:
-                rel_str = "LIVE"
-            elif hours_until < 1:
-                mins_until = int(delta.total_seconds() / 60)
-                rel_str = f"In {mins_until}m"
-            elif hours_until < 24:
-                rel_str = f"In {int(hours_until)}h"
-            else:
-                days = int(hours_until / 24)
-                rel_str = f"In {days}d"
-        return abs_str, rel_str
-    except:
-        return "TBD", ""

 def get_header_date():
     """Return current date string like 'Mar 25, 2026'"""
     now_utc = datetime.now(timezone.utc)
@@ -478,18 +631,17 @@ def print_browse(match_events, non_match_events, category, total_raw, total_fetc
     utc7 = timezone(timedelta(hours=7))
     now_utc7 = now_utc.astimezone(utc7)
     header_date = get_header_date()

     print(f"\n=== {category.upper()}{' [RAW]' if raw_mode else ''} ===")
     print(f"Current time (WIB): {now_utc7.strftime('%H:%M WIB')} | {header_date}")
     if raw_mode:
         print(f"Fetched: {total_fetched} / Total API: {total_raw} | Match: {total_match} | Non-match: {total_non_match}")
     if partial:
         print(f"WARNING: Partial fetch (API error or timeout) — data may be incomplete")

-    # --- MATCH MARKETS ---
+    # Determine sections to show
     if not matches_only and not non_matches_only:
-        # Default: show both
         show_matches = True
         show_non_matches = True
     elif matches_only:
@@ -498,69 +650,32 @@ def print_browse(match_events, non_match_events, category, total_raw, total_fetc
     else:
         show_matches = False
         show_non_matches = True

+    # Match events
     if show_matches:
-        print(f"\nMATCH MARKETS")
+        print("\nMATCH MARKETS")
         if not match_events:
             print(" No match markets found.")
         else:
             for i, e in enumerate(match_events, 1):
-                f = format_event(e)
-                ml = get_ml_market(e)
-                outcomes = json.loads(ml.get("outcomes", "[]")) if ml else []
-                prices = json.loads(ml.get("outcomePrices", "[]")) if ml else []
-                vol = f["volume"]
-                title = f["title"]
-                url = f["url"]
-                start_time_wib, rel_time = get_start_time_wib(e)
-                team_a = outcomes[0] if len(outcomes) > 0 else "?"
-                team_b = outcomes[1] if len(outcomes) > 1 else "?"
-                odds_a = format_odds(float(prices[0])) if len(prices) > 0 else "?"
-                odds_b = format_odds(float(prices[1])) if len(prices) > 1 else "?"
-                if " - " in title:
-                    title_clean = title.split(" - ")[0].strip()
-                else:
-                    title_clean = title
-                tournament = get_tournament(title)
-                print(f"\n {i}. [{title_clean}]({url})")
-                print(f" {start_time_wib} | {rel_time}")
-                print(f" Vol: ${vol:,.0f}")
-                if tournament:
-                    print(f" Tournament: {tournament}")
-                print(f" Odds: {team_a} {odds_a} | {odds_b} {team_b}")
+                fd = format_match_event(e)
+                for line in render_match_lines(fd, i, mode="text"):
+                    print(line)

-    # --- NON-MATCH MARKETS ---
+    # Non-match events
     if show_non_matches and non_match_events:
-        print(f"\nNON-MATCH MARKETS")
+        print("\nNON-MATCH MARKETS")
         for i, e in enumerate(non_match_events[:non_matches_max], 1):
-            title = e.get("title", "?")
-            url = get_event_url(e)
-            start_time_wib, rel_time = get_start_time_wib(e)
-            total_vol = sum(float(m.get("volume", 0)) for m in e.get("markets", []))
-            market_count = len(e.get("markets", []))
-            print(f"\n {i}. [{title}]({url})")
-            print(f" {start_time_wib} | {rel_time}")
-            print(f" Markets: {market_count} | Total Vol: ${total_vol:,.0f}")
+            fd = format_non_match_event(e)
+            for line in render_non_match_lines(fd, i, mode="text"):
+                print(line)

 def print_detail(e, detail):
-    from datetime import datetime, timezone, timedelta
-    now_utc = datetime.now(timezone.utc)
-    utc7 = timezone(timedelta(hours=7))
-    now_utc7 = now_utc.astimezone(utc7)
     print(f"\n{detail['title']}")
     print(f"URL: {detail['url']}")
     print(f"Livestream: {detail['livestream']}")
     spread_str = format_spread(detail["best_bid"], detail["best_ask"]) if detail["best_bid"] and detail["best_ask"] else "N/A"
-    time_str = get_match_time_str(e)
     print(f"\n{detail['time_status']}")
     print(f"ML: {detail['outcomes'][0]} {format_odds(float(detail['prices'][0]))} vs {detail['outcomes'][1]} {format_odds(float(detail['prices'][1]))}")
     print(f"ML Vol: ${detail['volume']:,.0f} | {spread_str}")
@@ -577,6 +692,15 @@ def print_detail(e, detail):
 # TELEGRAM
 # ============================================================

+def escape_html(text):
+    """Escape HTML-sensitive characters for Telegram parse_mode=HTML."""
+    return (text
+            .replace("&", "&amp;")
+            .replace("<", "&lt;")
+            .replace(">", "&gt;")
+            .replace('"', "&quot;"))
+
 def send_telegram_message(bot_token, chat_id, text, timeout=10):
     """Send a message via Telegram bot API. Returns the message ID on success.
@@ -612,100 +736,89 @@ def send_to_telegram(match_events, non_match_events, category, matches_only=Fals
     utc7 = timezone(timedelta(hours=7))
     now_utc7 = now_utc.astimezone(utc7)
     header_date = now_utc7.strftime("%b %d, %Y")

     # Determine sections to show
     show_matches = (not matches_only and not non_matches_only) or matches_only
     show_non_matches = (not matches_only and not non_matches_only) or non_matches_only

     def send(text):
         msg_id = send_telegram_message(bot_token, chat_id, text)
         print(f" Sent msg {msg_id}")

-    # Build sections
-    lines = [f"<b>{category.upper()}</b> | {header_date}"]
-    lines.append("")
+    # Build lines
+    lines = [f"<b>{category.upper()}</b> | {header_date}", ""]
     if show_matches:
-        lines.append("MATCH MARKETS")
-        lines.append("")
+        lines += ["MATCH MARKETS", ""]
         if not match_events:
             lines.append(" No match markets found.")
         else:
             for i, e in enumerate(match_events, 1):
-                ml = get_ml_market(e)
-                outcomes = json.loads(ml.get("outcomes", "[]")) if ml else []
-                prices = json.loads(ml.get("outcomePrices", "[]")) if ml else []
-                vol = get_ml_volume(e)
-                title = e.get("title", "?")
-                url = get_event_url(e)
-                start_time_wib, rel_time = get_start_time_wib(e)
-                team_a = outcomes[0] if len(outcomes) > 0 else "?"
-                team_b = outcomes[1] if len(outcomes) > 1 else "?"
-                odds_a = format_odds(float(prices[0])) if len(prices) > 0 else "?"
-                odds_b = format_odds(float(prices[1])) if len(prices) > 1 else "?"
-                tournament = get_tournament(title)
-                title_clean = title.split(" - ")[0].strip() if " - " in title else title
-                lines.append(f"<b>{i}.</b> <a href=\"{url}\">{title_clean}</a>")
-                lines.append(f" {start_time_wib} | {rel_time}")
-                lines.append(f" Vol: ${vol:,.0f}")
-                if tournament:
-                    lines.append(f" Tournament: {tournament}")
-                lines.append(f" Odds: {team_a} {odds_a} | {odds_b} {team_b}")
+                fd = format_match_event(e)
+                lines += render_match_lines(fd, i, mode="html")
                 lines.append("")
         lines.append("")

     if show_non_matches:
-        lines.append("NON-MATCH MARKETS")
-        lines.append("")
+        lines += ["NON-MATCH MARKETS", ""]
         if not non_match_events:
             lines.append(" No non-match markets found.")
         else:
             for i, e in enumerate(non_match_events, 1):
-                title = e.get("title", "?")
-                url = get_event_url(e)
-                start_time_wib, rel_time = get_start_time_wib(e)
-                total_vol = sum(float(m.get("volume", 0)) for m in e.get("markets", []))
-                market_count = len(e.get("markets", []))
-                lines.append(f"<b>{i}.</b> <a href=\"{url}\">{title}</a>")
-                lines.append(f" {start_time_wib} | {rel_time}")
-                lines.append(f" Markets: {market_count} | Total Vol: ${total_vol:,.0f}")
+                fd = format_non_match_event(e)
+                lines += render_non_match_lines(fd, i, mode="html")
                 lines.append("")
-        lines.append("")

-    # Chunk by 10 items (events), respecting 4096 char Telegram limit
-    text = "\n".join(lines)
+    # Chunk and send
+    send_chunked(lines, send, category, header_date, show_matches, show_non_matches)

+def send_chunked(all_lines, send_fn, category, header_date, show_matches, show_non_matches):
+    """
+    Split already-built lines into Telegram-safe chunks and send them.
+
+    Telegram messages are capped at 4096 chars. Chunks are grouped by
+    section header so no event is split across messages.
+
+    Args:
+        all_lines: Full message lines list (built by caller).
+        send_fn: Closure that sends a single string and prints confirmation.
+        category: Category name for header.
+        header_date: Date string for header.
+        show_matches: Whether MATCH MARKETS section is present.
+        show_non_matches: Whether NON-MATCH MARKETS section is present.
+    """
+    text = "\n".join(all_lines)
     if len(text) <= 4096:
-        send(text)
+        send_fn(text)
         return

-    # Split into chunks of 10 events
+    # Split into chunks of 10 events, respecting section headers
     all_items = []
     in_match = True
-    for line in lines:
+    for line in all_lines:
         if line == "MATCH MARKETS":
             in_match = True
         elif line == "NON-MATCH MARKETS":
             in_match = False
-        elif line.startswith("<b>") and ". " in line and "</a>" in line:
+        elif line.startswith("<b>") and "</a>" in line:
+            # Event title line: <b>1.</b> <a href="...">Title</a>
             all_items.append((in_match, line))

     chunk = []
-    chunk_len = 0
-    chunk_num = 1
-    # Header is always first
     header = f"<b>{category.upper()}</b> | {header_date}\n"
     if show_matches:
         header += "\nMATCH MARKETS\n\n"
     if show_non_matches:
         header += "\nNON-MATCH MARKETS\n\n"

     for is_match, item_line in all_items:
         test_chunk = chunk + [item_line, ""]
         test_text = header + "\n".join(chunk) + "\n".join(test_chunk)
         if len(test_text) > 4096 or len(chunk) >= 10:
-            # Send current chunk
             msg = header + "\n".join(chunk)
-            send(msg)
+            send_fn(msg)
             chunk = [item_line, ""]
             header = f"<b>{category.upper()}</b> (cont.) | {header_date}\n"
             if show_matches and is_match:
@@ -714,10 +827,10 @@ def send_to_telegram(match_events, non_match_events, category, matches_only=Fals
             header += "\nNON-MATCH MARKETS\n\n"
         else:
             chunk.extend([item_line, ""])

     if chunk:
         msg = header + "\n".join(chunk)
-        send(msg)
+        send_fn(msg)

 # ============================================================
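The chunking idea send_chunked implements (accumulate event lines, flush a message when the 4096-char Telegram limit or a 10-event cap would be exceeded) can be sketched standalone. `chunk_lines` is a simplified, hypothetical version that ignores the per-section headers the real function re-attaches to each message.

```python
def chunk_lines(event_lines, limit=4096, max_items=10):
    """Group pre-rendered event lines into message-sized chunks.

    Flush the current chunk before adding a line that would push the
    joined text past `limit` chars or the chunk past `max_items` events.
    """
    chunks, chunk = [], []
    for line in event_lines:
        candidate = chunk + [line, ""]  # each event is followed by a blank line
        too_big = len("\n".join(candidate)) > limit
        too_many = len(chunk) // 2 >= max_items  # two list entries per event
        if chunk and (too_big or too_many):
            chunks.append("\n".join(chunk))
            chunk = []
        chunk.extend([line, ""])
    if chunk:
        chunks.append("\n".join(chunk))
    return chunks
```

Checking the size before appending, as here, is what guarantees no single event is ever split across two messages.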

File diff suppressed because it is too large