diff --git a/.github/ISSUE_TEMPLATE/clarifying-questions.md b/.github/ISSUE_TEMPLATE/clarifying-questions.md new file mode 100644 index 0000000..6957891 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/clarifying-questions.md @@ -0,0 +1,129 @@ +# JIGAIDO — Clarifying Questions + +## Context + +Pre-implementation review of JIGAIDO Telegram bounty bot. Below are questions gathered during code review that need answers before proceeding. + +--- + +## Questions + +### 1. `__init__.py` is empty +File `apps/telegram-bot/__init__.py` is 0 bytes. It's a Python package marker but serves no function. +- **Can we remove it?** Does anything import from `apps/telegram-bot/` as a package? +- Or should it contain `__version__`, `__app_name__`, or similar? + +### 2. `jigaido-bot.service` as a template +`jigaido-bot.service` currently has hardcoded values: +- `User=shoko` +- `Group=shoko` +- `WorkingDirectory=/home/shoko/repositories/jigaido/apps/telegram-bot/` + +**Proposal:** Make it a template (`jigaido-bot.service.j2` or similar). `setup.sh` detects the actual user/group/path and generates the real `.service` file. + +- Is this the intended approach? +- Should `setup.sh` also handle env file generation? + +### 3. Logging separation +Currently logs go to stdout (handled by systemd's `journalctl`). + +**Proposal:** +- `jigaido.log` — INFO level +- `jigaido-error.log` — ERROR level +- Path: `/var/log/jigaido/` or `~/.local/share/jigaido/logs/` +- Files should NOT be tracked by git (add to `.gitignore`) + +**Questions:** +- Preferred log directory? +- Log rotation needed (e.g., `maxsize=10MB, rotate=3`)? +- Should we use Python's `logging` module with `RotatingFileHandler`, or shell redirect in the service file? + +### 4. What does `/start` actually do? + +SPEC.md says: +> "First admin assignment is automatic when the bot detects a new group." + +Code (`cmd_start`): +- Group: `ensure_group()` → upserts group + adds creator as admin +- DM: `ensure_user()` → upserts user record + +**Questions:** +- Should `/start` in a group re-trigger admin assignment if creator already exists? (Currently yes — `add_group_admin` is idempotent via `UNIQUE` constraint, so it's safe.) +- Should `/start` message be idempotent (same message every time) or conditional? + +### 5. `db.init_db()` — safe to run multiple times? + +`schema.sql` uses `CREATE TABLE IF NOT EXISTS ...`. `db.init_db()` calls `executescript()`. + +**Clarification needed:** +- Is it safe to call `init_db()` every time the bot starts? (Seems yes — tables only created if absent.) +- Should there be a schema version table for future migrations? + +### 6. `TELEGRAM_BOT_USERNAME` hardcoded + +`commands.py:13`: +```python +TELEGRAM_BOT_USERNAME = "your_bot_username" +``` + +This is never used in the current code (searched — no references). The `post_init` in `bot.py` sets commands without using it. + +**Question:** Should this be removed entirely, or should it be set from `JIGAIDO_BOT_USERNAME` env var? + +### 7. `REMAINDER_WINDOW_DAYS` duplicated + +Defined in both: +- `commands.py:15` — `REMINDER_WINDOW_DAYS = 7` +- `cron.py:27` — `REMINDER_WINDOW_DAYS = 7` + +**Decision needed:** User said "drop cron for now." Confirm: +- Remove `cron.py` entirely? +- Or keep for future use? + +### 8. Link dedup — can we clear a link? + +`update_bounty` uses `COALESCE(?, link)`: +```python +SET text = COALESCE(?, text), + link = COALESCE(?, link), + due_date_ts = COALESCE(?, due_date_ts) +``` + +Passing `None` as `link` does NOT clear it — `COALESCE` returns the existing value. + +**Question:** Is this intentional? Should there be a way to clear/reset a link or due date? + +### 9. Git as update mechanism + +Proposal: `git pull` in the deployment directory as the update mechanism. + +**Questions:** +- Is the `.service` file and env file generated by `setup.sh` stored outside the git-tracked directory? +- Or are they gitignored files inside the repo? +- What's the update procedure? `git pull && systemctl restart jigaido-bot`? + +### 10. Cron dropped + +User explicitly said drop cron for now. + +**Action needed:** +- Remove `cron.py` and related code from `bot.py`? +- Or keep but disable? +- What about `reminder_log` table — keep or remove? + +--- + +## Priority + +| # | Question | Priority | +|---|---|---| +| 1 | `__init__.py` remove? | Low | +| 2 | `.service` as template | Medium | +| 3 | Logging separation | Medium | +| 4 | `/start` behavior clarification | Medium | +| 5 | `db.init_db()` safety | Low (already safe) | +| 6 | `TELEGRAM_BOT_USERNAME` | Low | +| 7 | `REMAINDER_WINDOW_DAYS` duplication | Low | +| 8 | Can we clear link/due_date? | Medium | +| 9 | Git as update mechanism | High | +| 10 | Cron dropped | Medium | diff --git a/.github/ISSUE_TEMPLATE/v2-simplify-storage.md b/.github/ISSUE_TEMPLATE/v2-simplify-storage.md new file mode 100644 index 0000000..be2f46b --- /dev/null +++ b/.github/ISSUE_TEMPLATE/v2-simplify-storage.md @@ -0,0 +1,177 @@ +# Simplify Storage: Replace SQLite with Per-User JSON Files + +## Status +Proposed + +## Background + +### What happened + +The SQLite-based storage layer (`db.py`) introduced several categories of complexity that outweigh its benefits at this stage: + +1. **Connection management bugs** — SQLite Python's `row_factory` disables implicit transaction handling. Combined with `PRAGMA foreign_keys = ON`, this caused `ON CONFLICT UPDATE` statements to silently fail to commit. The fix required setting `conn.isolation_level = None` directly on the connection object after creation. These are not obvious behaviors and took significant debugging time. + +2. **Test fragility** — The `fresh_db` fixture patches `DB_PATH` but the SQLite connection is a module-level singleton with connection-level state. Tests passed in isolation but failed under pytest's caching, and the root cause was subtle enough to require multiple iterations. + +3. **Tracking table complexity** — The `user_bounty_tracking` + `reminder_log` tables with dedup logic add non-trivial query complexity for what is essentially a "bookmark" feature. + +4. **Schema migrations** — Any schema change requires a migration script. For a personal bot with 2 users and 50 bounties, this overhead is disproportionate. + +5. **Cron/reminder system** — The daily reminder cron (`cron.py`) requires a separate process, scheduler (cron), and `reminder_log` table to prevent duplicate notifications. This is a significant operational surface for a v1. + +### Why it happened + +The current design was over-engineered for the actual usage pattern: +- Most commands are stateless (one request → one response) +- The user is the primary (and likely only) user +- Scale target is 10-100 users, not 10,000+ +- The bot is a personal project, not a production service + +SQLite was chosen for "correctness" but at this scale, the correctness guarantees are irrelevant while the complexity is real. + +### Current state + +The bot works and 53/53 tests pass. But `db.py` is ~300 lines with subtle connection semantics, `schema.sql` defines 7 tables, `cron.py` is a separate process, and the command layer (`commands.py`) is entangled with the DB layer. + +--- + +## Proposal + +**Replace SQLite with a per-user JSON file storage system.** + +### Storage Design + +``` +data/ +└── users/ + └── {telegram_user_id}.json # one file per user +``` + +**File structure (`users/{id}.json`):** +```json +{ + "user_id": 123, + "username": "alice", + "personal_bounties": [ + { + "id": 1, + "text": "Fix login bug", + "link": "https://github.com/example/repo/issues/1", + "due_date_ts": 1735689600, + "created_at": 1735603200 + } + ], + "tracked_bounties": [ + {"bounty_id": 5, "group_id": -1001, "created_at": 1735600000}, + {"bounty_id": 3, "group_id": null, "created_at": 1735590000} + ] +} +``` + +### Key design decisions + +1. **Single file per user** — No group-level files. Personal bounties live in the creator's file. Group bounties live in the creator's file with `group_id` set. + +2. **Bounty IDs are sequential integers per file** — Not global. Each user's file has its own `next_id` counter. This avoids coordination between users at the cost of non-global IDs (acceptable for personal use). + +3. **Cross-group tracking** — When Alice (in Group A) tracks a bounty created by Bob in Group B, Alice's file stores `{bounty_id: X, group_id: -100B}`. To display it, the bot loads Bob's file and finds bounty `X`. + +4. **No reminders in v1** — Drop the cron/reminder system entirely. The `reminder_log` table and `cron.py` are removed. Reminders can be added back as a v2 feature with a simpler design (e.g., just a "due soon" filter on `/my`). + +5. **No admin model in v1** — Drop `group_admins` table. Group bounties are open to anyone in the group to add/edit/delete. The creator can be the only one who can modify (enforced by `created_by_user_id` check). + +### Deleted components + +- `db.py` — removed entirely +- `schema.sql` — removed entirely +- `cron.py` — removed entirely +- `reminder_log` table — removed +- `user_bounty_tracking` table — replaced by `tracked_bounties` in user JSON +- `groups` table — removed (group_id stored directly in bounty objects) +- `group_admins` table — removed (simplified permission model) + +### Retained components + +- `bot.py` — minimal entrypoint +- `commands.py` — command parsing and reply logic (simplified) +- `tests/` — simplified to match new data model + +--- + +## Implementation Plan + +### Phase 1: Data model + storage layer + +1. Create `storage.py` with: + - `get_user_path(user_id)` — returns `Path` to user's JSON + - `load_user(user_id)` — reads and parses JSON, returns dict, creates file if missing + - `save_user(user_id, data)` — writes JSON atomically (temp file + rename) + - `next_bounty_id(user_id)` — increments and returns next ID for that user's file + +2. No locking needed at v1 scale. `tempfile` + `rename` gives atomic writes. + +### Phase 2: Rewrite commands.py + +Simplified command set: + +| Command | Where | Who | Description | +|---|---|---|---| +| `/bounty` | Group / DM | Anyone | List all bounties (group-scoped in group, personal in DM) | +| `/add [link] [due>` | Group | Anyone | Add bounty to group | +| `/add [link] [due>` | DM | Anyone | Add personal bounty | +| `/edit [text] [link] [due>` | Group | Creator only | Edit bounty | +| `/edit [text] [link] [due>` | DM | Creator only | Edit personal bounty | +| `/delete ` | Group | Creator only | Delete bounty | +| `/delete ` | DM | Creator only | Delete personal bounty | +| `/track ` | Group | Anyone | Track a group bounty | +| `/untrack ` | Group | Anyone | Untrack a bounty | +| `/my` | Group | Anyone | Show tracked group bounties | +| `/my` | DM | Anyone | Show tracked personal bounties | +| `/start` | Anywhere | Anyone | Re-initialize user | +| `/help` | Anywhere | Anyone | Show help | + +**Removed commands:** +- `/admin_add`, `/admin_remove` — no admin model in v1 +- Reminder-related logic — no cron in v1 + +### Phase 3: Simplify bot.py + +- Remove `Application.post_init` setup (no DB init needed) +- Bot starts instantly — JSON files created on first use +- No migration logic + +### Phase 4: Rewrite tests + +- `test_commands.py` — keep (parsing is unchanged) +- `test_storage.py` — new, tests `load_user`, `save_user`, `next_bounty_id` +- Remove all DB-dependent tests (`test_db.py` deleted) + +### Phase 5: Cleanup + +- Delete `db.py`, `schema.sql`, `cron.py`, `test_db.py` +- Delete `requirements-dev.txt` (dev deps in `pyproject.toml`) +- Update README to reflect simplified commands + +--- + +## Estimated effort + +- Storage layer: ~80 lines +- Commands rewrite: ~200 lines (simpler than current) +- Tests: ~100 lines +- Cleanup: trivial + +Total: ~1 day of work for one person. + +--- + +## When to revert to SQLite + +If any of these become true, SQLite is the right choice: +- Multiple concurrent users with write conflicts +- Need for complex queries (across all users, aggregations, etc.) +- Reminder system with proper deduplication +- Scale target > 1,000 users +- Need for ACID guarantees on concurrent writes + +For a personal bot with < 100 users, JSON files are the right default.