7 Commits

Author SHA1 Message Date
shoko
c348d6daa1 tests: Add unit tests for browse_events, fetch_all_pages, filter_events, is_match_market, get_ml_market, get_ml_volume, sort_events
New test classes:
- TestIsMatchMarket: 5 tests for is_match_market() classification
- TestGetMlMarket: 5 tests for get_ml_market() and get_ml_volume()
- TestFilterEvents: 5 tests for filter_events() and sort_events()
- TestFetchAllPages: 4 tests for fetch_all_pages() early-exit logic
- TestBrowseEvents: 5 tests for browse_events() sort_by parameter

Total: 24 new tests (62 total, all passing)
2026-03-25 19:08:36 +00:00
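The volume-descending ordering exercised by TestFilterEvents can be sketched as a self-contained pytest-style check. `sort_events` below is a simplified stand-in that takes an injected `get_volume` getter, since `get_ml_volume` lives elsewhere in the module; the test name and the event dicts are hypothetical.

```python
def sort_events(events, get_volume=lambda e: e["volume"]):
    # Descending-volume sort, mirroring sort_events(); get_volume is an
    # injected stand-in for get_ml_volume (an assumption for testability).
    return sorted(events, key=get_volume, reverse=True)

def test_sort_events_orders_by_volume_desc():
    events = [{"id": "a", "volume": 10},
              {"id": "b", "volume": 500},
              {"id": "c", "volume": 42}]
    assert [e["id"] for e in sort_events(events)] == ["b", "c", "a"]
```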
shoko
764c75e712 Fix: Switch fetch_page from subprocess to urllib, add early-exit to fetch_all_pages, add sort_by to browse_events
- fetch_page: replace subprocess.run(curl) with urllib (stdlib, cleaner)
- fetch_all_pages: add matches_max/non_matches_max params for early-exit.
  When both are set, stop fetching once quotas are satisfied.
- browse_events: add sort_by param (None='fast' early-exit, 'volume'=full fetch+sort).
  Early-exit only used when sort_by=None (no client-side sort needed).
- Remove subprocess import (no longer needed after migration)
2026-03-25 18:53:11 +00:00
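The urllib migration above follows a standard retry-with-exponential-backoff shape. A minimal self-contained sketch, not the project's exact `fetch_page`: the `fetch_json` name and the injectable `opener` parameter are assumptions added here so the loop is testable without a network.

```python
import json
import time
from urllib.request import Request, urlopen

def fetch_json(url, max_retries=5, initial_delay=2, opener=urlopen):
    """GET a URL and parse JSON, retrying with exponential backoff.

    `opener` defaults to urllib's urlopen; a fake can be injected in tests.
    """
    delay = initial_delay
    for attempt in range(max_retries):
        if attempt > 0:
            time.sleep(delay)  # back off before each retry
            delay *= 2         # 2s, 4s, 8s, ...
        try:
            req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
            with opener(req, timeout=10) as r:
                return json.loads(r.read())
        except Exception:
            continue  # network or JSON error: retry until attempts run out
    return None
```

Skipping the sleep on the first attempt (as the commit does) means a healthy API call pays no delay at all, while failures still back off exponentially.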
shoko
3a9f8fb365 Fix #14: Refactor print_browse/send_to_telegram into single pipeline
Replace duplicate inline formatting with unified format+render pipeline.

New functions:
- format_match_event(e) — canonical dict for match events
- format_non_match_event(e) — canonical dict for non-match events
- render_match_lines(event_dict, i, mode) — text/HTML renderer
- render_non_match_lines(event_dict, i, mode) — text/HTML renderer
- send_chunked(...) — extracted Telegram chunking logic

Also fixed a chunking bug in send_chunked(): the original "'. ' in line"
check never matched event lines, because the period is followed by '</b>' rather than a space.

Tests: 38 total, all passing.

Fixes: #14
2026-03-25 17:50:54 +00:00
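The format+render split can be illustrated with a stripped-down pair of functions. `format_item` and `render_item_lines` are hypothetical names modeled on `format_non_match_event` and `render_non_match_lines`; the dict keys mirror those in the commit, but the input shape here is simplified.

```python
def format_item(e):
    """Canonical dict: all computation happens here; renderers just template."""
    markets = e.get("markets", [])
    return {
        "title": e.get("title", "?"),
        "url": e.get("url", ""),
        "market_count": len(markets),
        "total_vol": int(sum(m.get("volume", 0) for m in markets)),
    }

def render_item_lines(d, i, mode="text"):
    """Render one event dict as lines; 'html' and 'text' differ only in the title line."""
    if mode == "html":
        head = f'<b>{i}.</b> <a href="{d["url"]}">{d["title"]}</a>'
    else:
        head = f'{i}. [{d["title"]}]({d["url"]})'
    return [head, f' Markets: {d["market_count"]} | Total Vol: ${d["total_vol"]:,.0f}']
```

Because both output modes consume the same dict, a formatting change (say, a new field) is made once in the format function instead of twice in duplicated inline loops.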
shoko
a7837cec0f Merge #15: Unify duplicate time functions 2026-03-25 14:34:05 +00:00
shoko
8cde441996 Fix #15: Unify duplicate time functions into _get_time_data()
Replace three duplicated time parsing functions with a single
_get_time_data(e, tz) helper returning {time_status, time_urgency, abs_time}.

Deleted functions:
- get_match_time_status(e)  — urgency + status string
- get_match_time_str(e)    — status string only
- get_start_time_wib(e)    — (abs_time, rel_str) tuple

New unified helper:
- _get_time_data(e, tz=None) returns {time_status, time_urgency, abs_time}
- tz defaults to WIB (UTC+7, Indonesia)
- canonical time_status format: 'LIVE', 'In 6h', '12h ago', etc.
- time_urgency: 0-3 (higher = more urgent/live)

All call sites updated to use _get_time_data():
- format_event(), format_detail_event()
- print_browse(), print_detail()
- send_to_telegram()

Also: removed dead code in print_detail() that called get_match_time_str()
but never used the result.

Tests: 9 new tests for _get_time_data() covering TBD, future, live,
and past event scenarios. 19 tests total, all passing.

Fixes: #15
2026-03-25 13:59:54 +00:00
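The relative-time buckets described above (LIVE within an hour, hour granularity under a day, day granularity beyond) can be sketched as a pure function. `rel_status` is a hypothetical stand-in that takes `now` explicitly so the buckets are testable; the real `_get_time_data` reads the clock itself and also returns the formatted abs_time.

```python
from datetime import datetime, timedelta, timezone

def rel_status(start_dt, now):
    """Bucket an event start time into (status_str, urgency 0-3),
    mirroring the buckets _get_time_data() uses."""
    sec = (start_dt - now).total_seconds()
    if sec < 0:  # already started
        hours_ago = abs(sec) / 3600
        if hours_ago < 1:
            return "LIVE", 3
        if hours_ago < 4:
            return f"LIVE {int(hours_ago)}h", 3
        if hours_ago < 24:
            return f"{int(hours_ago)}h ago", 1
        return f"{int(hours_ago // 24)}d ago", 0
    if sec < 3600:  # starts within the hour
        return f"In {int(sec // 60)}m", 3
    if sec < 86400:  # starts today
        return f"In {int(sec // 3600)}h", 2
    return f"In {int(sec // 86400)}d", 1
```

Keeping the comparisons in UTC seconds, with the timezone applied only when formatting the absolute time, is what lets one tz parameter replace the three duplicated WIB conversions.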
b2180a4a34 Merge pull request 'Fix #5: HTML injection in Telegram messages' (#20) from fix/issue-5-html-injection-telegram into master 2026-03-25 13:13:52 +01:00
shoko
d0534aedbf Fix #5: HTML injection in Telegram messages
Add escape_html() function to prevent HTML injection in Telegram
parse_mode=HTML messages. Apply escaping to event titles inserted
into <a> tags in send_to_telegram().

- Add escape_html() using stdlib html.escape()
- Escape match event titles (line 648) and non-match titles (line 676)
- Add TestHtmlInjection with 2 tests proving fix:
  - <script> tags escaped as &lt;script&gt;
  - & ampersands escaped as &amp;
- Fixes HIGH severity: titles from Polymarket API were inserted
  without escaping, allowing malformed HTML in Telegram messages
2026-03-25 11:42:42 +00:00
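Since the commit message says escape_html() is built on the stdlib, the fix reduces to a thin wrapper over `html.escape`. A minimal sketch; note that `html.escape` with its default `quote=True` also escapes single quotes, a superset of the four manual replacements shown in the diff on this page.

```python
import html

def escape_html(text):
    """Escape &, <, > and quotes for Telegram parse_mode=HTML."""
    # html.escape(quote=True) escapes & < > " and also ' (as &#x27;).
    return html.escape(text, quote=True)
```

Escaping must happen at render time, on the title inserted between the `<a>` tags, never on the surrounding markup the code generates itself.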
2 changed files with 1462 additions and 265 deletions


@@ -4,6 +4,7 @@ Polymarket Event Browser
 Browse tradeable Polymarket events by game category.
 """
+import html
 import json
 import time
 import argparse
@@ -18,6 +19,7 @@ from urllib.request import urlopen, Request
 PAGE_SIZE = 50
 MAX_RETRIES = 5
 INITIAL_RETRY_DELAY = 2  # exponential backoff starts at 2s
+WIB = timezone(timedelta(hours=7))  # UTC+7 for Indonesian users

 GAME_CATEGORIES = {
     "All Esports": "Esports",
@@ -40,52 +42,68 @@ def fetch_page(q, page=1, max_retries=MAX_RETRIES, initial_delay=INITIAL_RETRY_D
     url = (f"{base}?q={q.replace(' ', '%20')}&limit={PAGE_SIZE}&page={page}"
            f"&search_profiles=false&search_tags=false"
            f"&keep_closed_markets=0&events_status=active&cache=false")
     delay = initial_delay
     for attempt in range(max_retries):
-        time.sleep(delay)
-        r = subprocess.run(
-            ["curl", "-s", url, "--max-time", "10", "-H", "User-Agent: curl/7.88.1"],
-            capture_output=True
-        )
-        if r.returncode == 0 and len(r.stdout) > 0:
-            try:
-                return json.loads(r.stdout.decode('utf-8'))
-            except json.JSONDecodeError:
-                if attempt < max_retries - 1:
-                    delay *= 2  # Exponential backoff
-                    continue
-                return None
-        else:
-            # Rate limit or other error - exponential backoff
+        if attempt > 0:
+            time.sleep(delay)
+        try:
+            req = Request(url, headers={"User-Agent": "Mozilla/5.0"})
+            with urlopen(req, timeout=10) as r:
+                return json.loads(r.read())
+        except Exception:
             if attempt < max_retries - 1:
                 delay *= 2
                 continue
             return None
     return None

-def fetch_all_pages(q, max_pages=100):
+def fetch_all_pages(q, matches_max=None, non_matches_max=None):
     """
-    Fetch ALL pages until pagination ends.
-    max_pages is a safety cap to prevent infinite loops.
+    Fetch pages until pagination ends, or until quotas are satisfied.
+
+    Args:
+        q: search query
+        matches_max: stop early once we have this many match events (None = no limit)
+        non_matches_max: stop early once we have this many non-match events (None = no limit)
+
+    Returns:
+        {"events": [...], "total_raw": N, "partial": bool}
     """
     all_events = []
     total_raw = 0
-    for page in range(1, max_pages + 1):
-        time.sleep(0.2)  # small delay between pages (API rate limit is generous)
+    match_count = 0
+    non_match_count = 0
+    page = 0
+    while True:
+        page += 1
+        time.sleep(0.2)
         data = fetch_page(q, page)
         if data is None:
            break
         events = data.get("events", [])
         total_raw = data.get("pagination", {}).get("totalResults", 0)
         all_events.extend(events)
-        # Stop when we get 0 events (no more pages),
-        # OR when we've fetched >= total results
+
+        # Count matches/non-matches in this page
+        for e in events:
+            if is_match_market(e):
+                match_count += 1
+            else:
+                non_match_count += 1
+
+        # Stop if we got what we wanted (only when caps are set)
+        if matches_max is not None and non_matches_max is not None:
+            if match_count >= matches_max and non_match_count >= non_matches_max:
+                break
+
+        # Stop when we get 0 events (no more pages)
         if len(events) == 0:
             break
+        # Stop when we've fetched all known results
         if len(all_events) >= total_raw:
             break
     partial = (total_raw > 0 and len(all_events) < total_raw)
     return {"events": all_events, "total_raw": total_raw, "partial": partial}
@@ -220,94 +238,79 @@ def format_spread(bid, ask):
     spread = ask - bid
     return f"{prob_to_cents(spread)}c"

-def get_match_time_status(e):
-    """
-    Return a human-readable match time status.
-    Returns (status_str, urgency) where urgency is 0-3 (higher = more urgent/live).
-    Uses startTime for actual match start time.
-    Displays times in WIB (UTC+7 for Indonesian users).
-    """
-    # Use startTime for actual match start, not startDate (which is market creation time)
-    start_str = e.get("startTime") or e.get("startDate", "")
-    if not start_str:
-        return "TBD", 0
-    try:
-        start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
-        now_utc = datetime.now(timezone.utc)
-        utc7 = timezone(timedelta(hours=7))
-        now = now_utc.astimezone(utc7)
-        start_utc7 = start_dt.astimezone(utc7)
-        delta = start_dt - now_utc
-        if delta.total_seconds() < 0:
-            # Started already
-            hours_ago = abs(delta.total_seconds()) / 3600
-            if hours_ago < 1:
-                return "LIVE", 3
-            elif hours_ago < 4:
-                return f"LIVE {int(hours_ago)}h", 3
-            elif hours_ago < 24:
-                return f"Started {int(hours_ago)}h ago", 1
-            else:
-                days = int(hours_ago / 24)
-                return f"{days}d ago", 0
-        else:
-            # Starts in future
-            hours_until = delta.total_seconds() / 3600
-            if hours_until <= 0:
-                return "LIVE", 3
-            elif hours_until < 1:
-                mins = int(delta.total_seconds() / 60)
-                return f"In {mins}m", 3
-            elif hours_until < 24:
-                return f"In {int(hours_until)}h", 2
-            else:
-                days = int(hours_until / 24)
-                return f"In {days}d", 1
-    except:
-        return "", 0
-
-def get_match_time_str(e):
-    """
-    Return just the time status string (e.g. 'LIVE', 'In 6h', 'In 1d').
-    Uses startTime for actual match start time.
-    """
-    start_str = e.get("startTime") or e.get("startDate", "")
-    if not start_str:
-        return "TBD"
-    try:
-        start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
-        now_utc = datetime.now(timezone.utc)
-        delta = start_dt - now_utc
-        if delta.total_seconds() < 0:
-            hours_ago = abs(delta.total_seconds()) / 3600
-            if hours_ago < 1:
-                return "LIVE"
-            elif hours_ago < 4:
-                return f"LIVE {int(hours_ago)}h"
-            elif hours_ago < 24:
-                return f"{int(hours_ago)}h ago"
-            else:
-                days = int(hours_ago / 24)
-                return f"{days}d ago"
-        else:
-            hours_until = delta.total_seconds() / 3600
-            if hours_until <= 0:
-                return "LIVE"
-            elif hours_until < 1:
-                mins = int(delta.total_seconds() / 60)
-                return f"In {mins}m"
-            elif hours_until < 24:
-                return f"In {int(hours_until)}h"
-            else:
-                days = int(hours_until / 24)
-                return f"In {days}d"
-    except:
-        return ""
+def _get_time_data(e, tz=None):
+    """
+    Unified time data extraction for event timestamps.
+
+    Uses startTime (preferred) or startDate as the event start time.
+    Datetime parsing and all relative calculations are UTC-based.
+    The tz parameter only affects the abs_time formatting.
+
+    Args:
+        e: Event dict with 'startTime' or 'startDate' key.
+        tz: datetime.timezone for abs_time formatting.
+            Defaults to WIB (UTC+7).
+
+    Returns:
+        {
+            "time_status": str,   # e.g. "LIVE", "In 6h", "12h ago"
+            "time_urgency": int,  # 0-3 (higher = more urgent/live)
+            "abs_time": str,      # e.g. "Mar 25, 19:00 WIB" or "TBD"
+        }
+    """
+    tz = tz or WIB
+    start_str = e.get("startTime") or e.get("startDate", "")
+    if not start_str:
+        return {"time_status": "TBD", "time_urgency": 0, "abs_time": "TBD"}
+    try:
+        start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
+        now_utc = datetime.now(timezone.utc)
+        delta = start_dt - now_utc
+        total_sec = delta.total_seconds()
+
+        if total_sec < 0:
+            # Event is in the past
+            hours_ago = abs(total_sec) / 3600
+            if hours_ago < 1:
+                time_status = "LIVE"
+                time_urgency = 3
+            elif hours_ago < 4:
+                time_status = f"LIVE {int(hours_ago)}h"
+                time_urgency = 3
+            elif hours_ago < 24:
+                time_status = f"{int(hours_ago)}h ago"
+                time_urgency = 1
+            else:
+                days = int(hours_ago / 24)
+                time_status = f"{days}d ago"
+                time_urgency = 0
+        else:
+            # Event is in the future
+            if total_sec < 3600:
+                mins = int(total_sec / 60)
+                time_status = f"In {mins}m"
+                time_urgency = 3
+            elif total_sec < 86400:
+                hours_until = int(total_sec / 3600)
+                time_status = f"In {hours_until}h"
+                time_urgency = 2
+            else:
+                days = int(total_sec / 86400)
+                time_status = f"In {days}d"
+                time_urgency = 1
+
+        abs_time = start_dt.astimezone(tz).strftime("%b %d, %H:%M ")
+        if tz == WIB:
+            abs_time += "WIB"
+        else:
+            abs_time += start_dt.astimezone(tz).strftime("%Z")
+        return {"time_status": time_status, "time_urgency": time_urgency, "abs_time": abs_time}
+    except Exception:
+        return {"time_status": "", "time_urgency": 0, "abs_time": "TBD"}

 def filter_events(events, tradeable_only=True):
     """
@@ -316,16 +319,17 @@ def filter_events(events, tradeable_only=True):
     """
     match_events = []
     non_match_events = []
     for e in events:
         if is_match_market(e):
             if not tradeable_only or is_tradeable_event(e):
                 match_events.append(e)
         else:
             non_match_events.append(e)
     return match_events, non_match_events

 def sort_events(events):
     return sorted(events, key=get_ml_volume, reverse=True)
@@ -333,24 +337,214 @@ def sort_events(events):
 # ============================================================
 # BROWSE
 # ============================================================

-def browse_events(q, matches_max=10, non_matches_max=10, tradeable_only=True):
-    result = fetch_all_pages(q)
+def browse_events(q, matches_max=10, non_matches_max=10, tradeable_only=True, sort_by=None):
+    """
+    Browse Polymarket events.
+
+    Args:
+        q: search query
+        matches_max: max number of match markets to return
+        non_matches_max: max number of non-match markets to return
+        tradeable_only: filter to tradeable events only
+        sort_by: None (fast, API order) or "volume" (full fetch, sort by volume desc)
+    """
+    # Pass quotas to fetch_all_pages for early-exit optimization.
+    # Only use early-exit when sort_by is None (no client-side sort needed).
+    use_early_exit = (sort_by is None)
+    fetch_matches_max = matches_max if use_early_exit else None
+    fetch_non_matches_max = non_matches_max if use_early_exit else None
+    result = fetch_all_pages(q, matches_max=fetch_matches_max, non_matches_max=fetch_non_matches_max)
     events = result["events"]
     match_events, non_match_events = filter_events(events, tradeable_only)
-    sorted_match = sort_events(match_events)
+
+    # Sort if requested; otherwise preserve API order
+    if sort_by == "volume":
+        match_events = sort_events(match_events)
+        non_match_events = sort_events(non_match_events)
+
     return {
         "query": q,
         "total_raw": result["total_raw"],
         "total_fetched": len(events),
         "total_match": len(match_events),
         "total_non_match": len(non_match_events),
-        "match_events": sorted_match[:matches_max],
+        "match_events": match_events[:matches_max],
         "non_match_events": non_match_events[:non_matches_max],
         "partial": result.get("partial", False),
     }

 # ============================================================
-# FORMAT
+# FORMAT — EVENT
+# ============================================================
+
+def format_match_event(e):
+    """
+    Format a match event into a canonical dict for rendering.
+    All computing done here; renderers just template.
+
+    Returns:
+        {
+            "title": str,         # raw title
+            "title_clean": str,   # "Team A vs Team B"
+            "tournament": str,    # "Tournament Name" or ""
+            "url": str,
+            "time_status": str,   # "LIVE", "In 6h", "12h ago"
+            "time_urgency": int,  # 0-3
+            "abs_time": str,      # "Mar 25, 19:00 WIB"
+            "team_a": str,
+            "team_b": str,
+            "odds_a": str,        # "55c"
+            "odds_b": str,
+            "vol": int,
+        }
+    """
+    ml = get_ml_market(e)
+    outcomes = json.loads(ml.get("outcomes", "[]")) if ml else []
+    prices = json.loads(ml.get("outcomePrices", "[]")) if ml else []
+    td = _get_time_data(e)
+    title = e.get("title", "")
+    team_a = outcomes[0] if len(outcomes) > 0 else "?"
+    team_b = outcomes[1] if len(outcomes) > 1 else "?"
+    odds_a = format_odds(float(prices[0])) if len(prices) > 0 else "?"
+    odds_b = format_odds(float(prices[1])) if len(prices) > 1 else "?"
+    if " - " in title:
+        title_clean = title.split(" - ")[0].strip()
+    else:
+        title_clean = title
+    tournament = get_tournament(title)
+    return {
+        "title": title,
+        "title_clean": title_clean,
+        "tournament": tournament,
+        "url": get_event_url(e),
+        "time_status": td["time_status"],
+        "time_urgency": td["time_urgency"],
+        "abs_time": td["abs_time"],
+        "team_a": team_a,
+        "team_b": team_b,
+        "odds_a": odds_a,
+        "odds_b": odds_b,
+        "vol": get_ml_volume(e),
+    }
+
+def format_non_match_event(e):
+    """
+    Format a non-match event into a canonical dict for rendering.
+
+    Returns:
+        {
+            "title": str,
+            "url": str,
+            "time_status": str,
+            "time_urgency": int,
+            "abs_time": str,
+            "market_count": int,
+            "total_vol": int,
+        }
+    """
+    td = _get_time_data(e)
+    total_vol = sum(float(m.get("volume", 0)) for m in e.get("markets", []))
+    market_count = len(e.get("markets", []))
+    return {
+        "title": e.get("title", "?"),
+        "url": get_event_url(e),
+        "time_status": td["time_status"],
+        "time_urgency": td["time_urgency"],
+        "abs_time": td["abs_time"],
+        "market_count": market_count,
+        "total_vol": int(total_vol),
+    }
+
+# ============================================================
+# FORMAT — RENDER
+# ============================================================
+
+def render_match_lines(event_dict, i, mode):
+    """
+    Render a formatted match event dict into lines of text.
+
+    Args:
+        event_dict: canonical dict from format_match_event()
+        i: 1-based index for the event number
+        mode: "text" for plain text/Markdown, "html" for Telegram HTML
+
+    Returns:
+        List[str], one line per element (no trailing blank line).
+        Caller adds the blank line separator between events.
+    """
+    title_clean = event_dict["title_clean"]
+    url = event_dict["url"]
+    abs_time = event_dict["abs_time"]
+    time_status = event_dict["time_status"]
+    vol = event_dict["vol"]
+    tournament = event_dict["tournament"]
+    team_a = event_dict["team_a"]
+    team_b = event_dict["team_b"]
+    odds_a = event_dict["odds_a"]
+    odds_b = event_dict["odds_b"]
+    lines = []
+    if mode == "html":
+        lines.append(
+            f"<b>{i}.</b> <a href=\"{url}\">{escape_html(title_clean)}</a>"
+        )
+    else:
+        lines.append(f"{i}. [{title_clean}]({url})")
+    lines.append(f" {abs_time} | {time_status}")
+    lines.append(f" Vol: ${vol:,.0f}")
+    if tournament:
+        lines.append(f" Tournament: {tournament}")
+    lines.append(f" Odds: {team_a} {odds_a} | {odds_b} {team_b}")
+    return lines
+
+def render_non_match_lines(event_dict, i, mode):
+    """
+    Render a formatted non-match event dict into lines of text.
+
+    Args:
+        event_dict: canonical dict from format_non_match_event()
+        i: 1-based index for the event number
+        mode: "text" for plain text/Markdown, "html" for Telegram HTML
+
+    Returns:
+        List[str], one line per element (no trailing blank line).
+    """
+    title = event_dict["title"]
+    url = event_dict["url"]
+    abs_time = event_dict["abs_time"]
+    time_status = event_dict["time_status"]
+    market_count = event_dict["market_count"]
+    total_vol = event_dict["total_vol"]
+    lines = []
+    if mode == "html":
+        lines.append(f"<b>{i}.</b> <a href=\"{url}\">{escape_html(title)}</a>")
+    else:
+        lines.append(f"{i}. [{title}]({url})")
+    lines.append(f" {abs_time} | {time_status}")
+    lines.append(f" Markets: {market_count} | Total Vol: ${total_vol:,.0f}")
+    return lines
+
+# ============================================================
+# FORMAT — LEGACY
 # ============================================================

 def format_event(e):
@@ -360,12 +554,12 @@ def format_event(e):
     best_bid = float(ml.get("bestBid", 0)) if ml else 0
     best_ask = float(ml.get("bestAsk", 0)) if ml else 0
     vol = get_ml_volume(e)
-    time_status, urgency = get_match_time_status(e)
+    td = _get_time_data(e)
     return {
         "title": e.get("title", ""),
-        "time_status": time_status,
-        "time_urgency": urgency,
+        "time_status": td["time_status"],
+        "time_urgency": td["time_urgency"],
         "url": get_event_url(e),
         "livestream": e.get("resolutionSource"),
         "outcomes": outcomes,
@@ -383,12 +577,13 @@ def format_detail_event(e):
         if float(m.get("volume", 0)) > 0 and is_tradeable_market(m)
     ]
     active_markets = sorted(active_markets, key=lambda m: float(m.get("volume", 0)), reverse=True)
-    time_status, urgency = get_match_time_status(e)
+    td = _get_time_data(e)
     return {
         "title": e.get("title", ""),
-        "time_status": time_status,
+        "time_status": td["time_status"],
+        "abs_time": td["abs_time"],
         "url": get_event_url(e),
         "livestream": e.get("resolutionSource"),
         "outcomes": json.loads(ml.get("outcomes", "[]")) if ml else [],
@@ -415,48 +610,6 @@ def format_detail_event(e):
 # DISPLAY
 # ============================================================

-def get_start_time_wib(e):
-    """Return (date_time_str, relative_str) for display."""
-    start_str = e.get("startTime") or e.get("startDate", "")
-    if not start_str:
-        return "TBD", ""
-    try:
-        start_dt = datetime.fromisoformat(start_str.replace('Z', '+00:00'))
-        now_utc = datetime.now(timezone.utc)
-        utc7 = timezone(timedelta(hours=7))
-        start_utc7 = start_dt.astimezone(utc7)
-        # Absolute: "Mar 25, 19:00 WIB"
-        abs_str = start_utc7.strftime("%b %d, %H:%M WIB")
-        # Relative: "In 5h", "In 10h", "LIVE", etc.
-        delta = start_dt - now_utc
-        if delta.total_seconds() < 0:
-            hours_ago = abs(delta.total_seconds()) / 3600
-            if hours_ago < 1:
-                rel_str = "LIVE"
-            elif hours_ago < 24:
-                rel_str = f"{int(hours_ago)}h ago"
-            else:
-                days = int(hours_ago / 24)
-                rel_str = f"{days}d ago"
-        else:
-            hours_until = delta.total_seconds() / 3600
-            if hours_until <= 0:
-                rel_str = "LIVE"
-            elif hours_until < 1:
-                mins_until = int(delta.total_seconds() / 60)
-                rel_str = f"In {mins_until}m"
-            elif hours_until < 24:
-                rel_str = f"In {int(hours_until)}h"
-            else:
-                days = int(hours_until / 24)
-                rel_str = f"In {days}d"
-        return abs_str, rel_str
-    except:
-        return "TBD", ""

 def get_header_date():
     """Return current date string like 'Mar 25, 2026'"""
     now_utc = datetime.now(timezone.utc)
@@ -478,18 +631,17 @@ def print_browse(match_events, non_match_events, category, total_raw, total_fetc
     utc7 = timezone(timedelta(hours=7))
     now_utc7 = now_utc.astimezone(utc7)
     header_date = get_header_date()

     print(f"\n=== {category.upper()}{' [RAW]' if raw_mode else ''} ===")
     print(f"Current time (WIB): {now_utc7.strftime('%H:%M WIB')} | {header_date}")
     if raw_mode:
         print(f"Fetched: {total_fetched} / Total API: {total_raw} | Match: {total_match} | Non-match: {total_non_match}")
     if partial:
         print(f"WARNING: Partial fetch (API error or timeout) — data may be incomplete")

-    # --- MATCH MARKETS ---
+    # Determine sections to show
     if not matches_only and not non_matches_only:
-        # Default: show both
         show_matches = True
         show_non_matches = True
     elif matches_only:
@@ -498,69 +650,32 @@ def print_browse(match_events, non_match_events, category, total_raw, total_fetc
     else:
         show_matches = False
         show_non_matches = True

+    # Match events
     if show_matches:
-        print(f"\nMATCH MARKETS")
+        print("\nMATCH MARKETS")
         if not match_events:
             print(" No match markets found.")
         else:
             for i, e in enumerate(match_events, 1):
-                f = format_event(e)
-                ml = get_ml_market(e)
-                outcomes = json.loads(ml.get("outcomes", "[]")) if ml else []
-                prices = json.loads(ml.get("outcomePrices", "[]")) if ml else []
-                vol = f["volume"]
-                title = f["title"]
-                url = f["url"]
-                start_time_wib, rel_time = get_start_time_wib(e)
-                team_a = outcomes[0] if len(outcomes) > 0 else "?"
-                team_b = outcomes[1] if len(outcomes) > 1 else "?"
-                odds_a = format_odds(float(prices[0])) if len(prices) > 0 else "?"
-                odds_b = format_odds(float(prices[1])) if len(prices) > 1 else "?"
-                if " - " in title:
-                    title_clean = title.split(" - ")[0].strip()
-                else:
-                    title_clean = title
-                tournament = get_tournament(title)
-                print(f"\n {i}. [{title_clean}]({url})")
-                print(f" {start_time_wib} | {rel_time}")
-                print(f" Vol: ${vol:,.0f}")
-                if tournament:
-                    print(f" Tournament: {tournament}")
-                print(f" Odds: {team_a} {odds_a} | {odds_b} {team_b}")
+                fd = format_match_event(e)
+                for line in render_match_lines(fd, i, mode="text"):
+                    print(line)

-    # --- NON-MATCH MARKETS ---
+    # Non-match events
     if show_non_matches and non_match_events:
-        print(f"\nNON-MATCH MARKETS")
+        print("\nNON-MATCH MARKETS")
         for i, e in enumerate(non_match_events[:non_matches_max], 1):
-            title = e.get("title", "?")
-            url = get_event_url(e)
-            start_time_wib, rel_time = get_start_time_wib(e)
-            total_vol = sum(float(m.get("volume", 0)) for m in e.get("markets", []))
-            market_count = len(e.get("markets", []))
-            print(f"\n {i}. [{title}]({url})")
-            print(f" {start_time_wib} | {rel_time}")
-            print(f" Markets: {market_count} | Total Vol: ${total_vol:,.0f}")
+            fd = format_non_match_event(e)
+            for line in render_non_match_lines(fd, i, mode="text"):
+                print(line)

 def print_detail(e, detail):
-    from datetime import datetime, timezone, timedelta
-    now_utc = datetime.now(timezone.utc)
-    utc7 = timezone(timedelta(hours=7))
-    now_utc7 = now_utc.astimezone(utc7)
     print(f"\n{detail['title']}")
     print(f"URL: {detail['url']}")
     print(f"Livestream: {detail['livestream']}")
     spread_str = format_spread(detail["best_bid"], detail["best_ask"]) if detail["best_bid"] and detail["best_ask"] else "N/A"
-    time_str = get_match_time_str(e)
     print(f"\n{detail['time_status']}")
     print(f"ML: {detail['outcomes'][0]} {format_odds(float(detail['prices'][0]))} vs {detail['outcomes'][1]} {format_odds(float(detail['prices'][1]))}")
     print(f"ML Vol: ${detail['volume']:,.0f} | {spread_str}")
@@ -577,6 +692,15 @@ def print_detail(e, detail):
 # TELEGRAM
 # ============================================================

+def escape_html(text):
+    """Escape HTML-sensitive characters for Telegram parse_mode=HTML."""
+    return (text
+            .replace("&", "&amp;")
+            .replace("<", "&lt;")
+            .replace(">", "&gt;")
+            .replace('"', "&quot;"))
+
 def send_telegram_message(bot_token, chat_id, text, timeout=10):
     """Send a message via Telegram bot API. Returns the message ID on success.
@@ -612,100 +736,89 @@ def send_to_telegram(match_events, non_match_events, category, matches_only=Fals
     utc7 = timezone(timedelta(hours=7))
     now_utc7 = now_utc.astimezone(utc7)
     header_date = now_utc7.strftime("%b %d, %Y")

     # Determine sections to show
     show_matches = (not matches_only and not non_matches_only) or matches_only
     show_non_matches = (not matches_only and not non_matches_only) or non_matches_only

     def send(text):
         msg_id = send_telegram_message(bot_token, chat_id, text)
         print(f" Sent msg {msg_id}")

-    # Build sections
-    lines = [f"<b>{category.upper()}</b> | {header_date}"]
-    lines.append("")
+    # Build lines
+    lines = [f"<b>{category.upper()}</b> | {header_date}", ""]
     if show_matches:
-        lines.append("MATCH MARKETS")
-        lines.append("")
+        lines += ["MATCH MARKETS", ""]
         if not match_events:
             lines.append(" No match markets found.")
         else:
             for i, e in enumerate(match_events, 1):
-                ml = get_ml_market(e)
-                outcomes = json.loads(ml.get("outcomes", "[]")) if ml else []
-                prices = json.loads(ml.get("outcomePrices", "[]")) if ml else []
-                vol = get_ml_volume(e)
-                title = e.get("title", "?")
-                url = get_event_url(e)
-                start_time_wib, rel_time = get_start_time_wib(e)
-                team_a = outcomes[0] if len(outcomes) > 0 else "?"
-                team_b = outcomes[1] if len(outcomes) > 1 else "?"
-                odds_a = format_odds(float(prices[0])) if len(prices) > 0 else "?"
-                odds_b = format_odds(float(prices[1])) if len(prices) > 1 else "?"
-                tournament = get_tournament(title)
-                title_clean = title.split(" - ")[0].strip() if " - " in title else title
-                lines.append(f"<b>{i}.</b> <a href=\"{url}\">{title_clean}</a>")
-                lines.append(f" {start_time_wib} | {rel_time}")
-                lines.append(f" Vol: ${vol:,.0f}")
-                if tournament:
-                    lines.append(f" Tournament: {tournament}")
-                lines.append(f" Odds: {team_a} {odds_a} | {odds_b} {team_b}")
+                fd = format_match_event(e)
+                lines += render_match_lines(fd, i, mode="html")
                 lines.append("")
         lines.append("")

     if show_non_matches:
-        lines.append("NON-MATCH MARKETS")
-        lines.append("")
+        lines += ["NON-MATCH MARKETS", ""]
         if not non_match_events:
             lines.append(" No non-match markets found.")
         else:
             for i, e in enumerate(non_match_events, 1):
-                title = e.get("title", "?")
-                url = get_event_url(e)
-                start_time_wib, rel_time = get_start_time_wib(e)
-                total_vol = sum(float(m.get("volume", 0)) for m in e.get("markets", []))
-                market_count = len(e.get("markets", []))
-                lines.append(f"<b>{i}.</b> <a href=\"{url}\">{title}</a>")
-                lines.append(f" {start_time_wib} | {rel_time}")
-                lines.append(f" Markets: {market_count} | Total Vol: ${total_vol:,.0f}")
+                fd = format_non_match_event(e)
+                lines += render_non_match_lines(fd, i, mode="html")
                 lines.append("")
-        lines.append("")

-    # Chunk by 10 items (events), respecting 4096 char Telegram limit
-    text = "\n".join(lines)
+    # Chunk and send
+    send_chunked(lines, send, category, header_date, show_matches, show_non_matches)

+def send_chunked(all_lines, send_fn, category, header_date, show_matches, show_non_matches):
+    """
+    Split already-built lines into Telegram-safe chunks and send them.
+
+    Telegram messages are capped at 4096 chars. Chunks are grouped by
+    section header so no event is split across messages.
+
+    Args:
+        all_lines: Full message lines list (built by caller).
+        send_fn: Closure that sends a single string and prints confirmation.
+        category: Category name for header.
+        header_date: Date string for header.
+        show_matches: Whether MATCH MARKETS section is present.
+        show_non_matches: Whether NON-MATCH MARKETS section is present.
+    """
+    text = "\n".join(all_lines)
     if len(text) <= 4096:
-        send(text)
+        send_fn(text)
         return

-    # Split into chunks of 10 events
+    # Split into chunks of 10 events, respecting section headers
     all_items = []
     in_match = True
-    for line in lines:
+    for line in all_lines:
         if line == "MATCH MARKETS":
             in_match = True
         elif line == "NON-MATCH MARKETS":
             in_match = False
-        elif line.startswith("<b>") and ". " in line and "</a>" in line:
+        elif line.startswith("<b>") and "</a>" in line:
+            # Event title line: <b>1.</b> <a href="...">Title</a>
             all_items.append((in_match, line))

     chunk = []
-    chunk_len = 0
-    chunk_num = 1
-    # Header is always first
     header = f"<b>{category.upper()}</b> | {header_date}\n"
     if show_matches:
         header += "\nMATCH MARKETS\n\n"
     if show_non_matches:
         header += "\nNON-MATCH MARKETS\n\n"

     for is_match, item_line in all_items:
         test_chunk = chunk + [item_line, ""]
         test_text = header + "\n".join(chunk) + "\n".join(test_chunk)
         if len(test_text) > 4096 or len(chunk) >= 10:
-            # Send current chunk
             msg = header + "\n".join(chunk)
-            send(msg)
+            send_fn(msg)
             chunk = [item_line, ""]
             header = f"<b>{category.upper()}</b> (cont.) | {header_date}\n"
             if show_matches and is_match:
@@ -714,10 +827,10 @@ def send_to_telegram(match_events, non_match_events, category, matches_only=Fals
             header += "\nNON-MATCH MARKETS\n\n"
         else:
             chunk.extend([item_line, ""])

     if chunk:
         msg = header + "\n".join(chunk)
-        send(msg)
+        send_fn(msg)

 # ============================================================
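The chunking idea send_chunked implements (accumulate event lines, flush a message when the 4096-char Telegram limit or a 10-event cap would be exceeded) can be sketched standalone. `chunk_lines` is a simplified, hypothetical version that ignores the per-section headers the real function re-attaches to each message.

```python
def chunk_lines(event_lines, limit=4096, max_items=10):
    """Group pre-rendered event lines into message-sized chunks.

    Flush the current chunk before adding a line that would push the
    joined text past `limit` chars or the chunk past `max_items` events.
    """
    chunks, chunk = [], []
    for line in event_lines:
        candidate = chunk + [line, ""]  # each event is followed by a blank line
        too_big = len("\n".join(candidate)) > limit
        too_many = len(chunk) // 2 >= max_items  # two list entries per event
        if chunk and (too_big or too_many):
            chunks.append("\n".join(chunk))
            chunk = []
        chunk.extend([line, ""])
    if chunk:
        chunks.append("\n".join(chunk))
    return chunks
```

Checking the size before appending, as here, is what guarantees no single event is ever split across two messages.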

File diff suppressed because it is too large