Add issue template for v2 storage simplification
This commit is contained in:
129
.github/ISSUE_TEMPLATE/clarifying-questions.md
vendored
Normal file
129
.github/ISSUE_TEMPLATE/clarifying-questions.md
vendored
Normal file
@@ -0,0 +1,129 @@
|
|||||||
|
# JIGAIDO — Clarifying Questions
|
||||||
|
|
||||||
|
## Context
|
||||||
|
|
||||||
|
Pre-implementation review of JIGAIDO Telegram bounty bot. Below are questions gathered during code review that need answers before proceeding.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Questions
|
||||||
|
|
||||||
|
### 1. `__init__.py` is empty
|
||||||
|
File `apps/telegram-bot/__init__.py` is 0 bytes. It's a Python package marker but serves no function.
|
||||||
|
- **Can we remove it?** Does anything import from `apps/telegram-bot/` as a package?
|
||||||
|
- Or should it contain `__version__`, `__app_name__`, or similar?
|
||||||
|
|
||||||
|
### 2. `jigaido-bot.service` as a template
|
||||||
|
`jigaido-bot.service` currently has hardcoded values:
|
||||||
|
- `User=shoko`
|
||||||
|
- `Group=shoko`
|
||||||
|
- `WorkingDirectory=/home/shoko/repositories/jigaido/apps/telegram-bot/`
|
||||||
|
|
||||||
|
**Proposal:** Make it a template (`jigaido-bot.service.j2` or similar). `setup.sh` detects the actual user/group/path and generates the real `.service` file.
|
||||||
|
|
||||||
|
- Is this the intended approach?
|
||||||
|
- Should `setup.sh` also handle env file generation?
|
||||||
|
|
||||||
|
### 3. Logging separation
|
||||||
|
Currently logs go to stdout (handled by systemd's `journalctl`).
|
||||||
|
|
||||||
|
**Proposal:**
|
||||||
|
- `jigaido.log` — INFO level
|
||||||
|
- `jigaido-error.log` — ERROR level
|
||||||
|
- Path: `/var/log/jigaido/` or `~/.local/share/jigaido/logs/`
|
||||||
|
- Files should NOT be tracked by git (add to `.gitignore`)
|
||||||
|
|
||||||
|
**Questions:**
|
||||||
|
- Preferred log directory?
|
||||||
|
- Log rotation needed (e.g., `maxsize=10MB, rotate=3`)?
|
||||||
|
- Should we use Python's `logging` module with `RotatingFileHandler`, or shell redirect in the service file?
|
||||||
|
|
||||||
|
### 4. What does `/start` actually do?
|
||||||
|
|
||||||
|
SPEC.md says:
|
||||||
|
> "First admin assignment is automatic when the bot detects a new group."
|
||||||
|
|
||||||
|
Code (`cmd_start`):
|
||||||
|
- Group: `ensure_group()` → upserts group + adds creator as admin
|
||||||
|
- DM: `ensure_user()` → upserts user record
|
||||||
|
|
||||||
|
**Questions:**
|
||||||
|
- Should `/start` in a group re-trigger admin assignment if creator already exists? (Currently yes — `add_group_admin` is idempotent via `UNIQUE` constraint, so it's safe.)
|
||||||
|
- Should `/start` message be idempotent (same message every time) or conditional?
|
||||||
|
|
||||||
|
### 5. `db.init_db()` — safe to run multiple times?
|
||||||
|
|
||||||
|
`schema.sql` uses `CREATE TABLE IF NOT EXISTS ...`. `db.init_db()` calls `executescript()`.
|
||||||
|
|
||||||
|
**Clarification needed:**
|
||||||
|
- Is it safe to call `init_db()` every time the bot starts? (Seems yes — tables only created if absent.)
|
||||||
|
- Should there be a schema version table for future migrations?
|
||||||
|
|
||||||
|
### 6. `TELEGRAM_BOT_USERNAME` hardcoded
|
||||||
|
|
||||||
|
`commands.py:13`:
|
||||||
|
```python
|
||||||
|
TELEGRAM_BOT_USERNAME = "your_bot_username"
|
||||||
|
```
|
||||||
|
|
||||||
|
This is never used in the current code (searched — no references). The `post_init` in `bot.py` sets commands without using it.
|
||||||
|
|
||||||
|
**Question:** Should this be removed entirely, or should it be set from `JIGAIDO_BOT_USERNAME` env var?
|
||||||
|
|
||||||
|
### 7. `REMAINDER_WINDOW_DAYS` duplicated
|
||||||
|
|
||||||
|
Defined in both:
|
||||||
|
- `commands.py:15` — `REMINDER_WINDOW_DAYS = 7`
|
||||||
|
- `cron.py:27` — `REMINDER_WINDOW_DAYS = 7`
|
||||||
|
|
||||||
|
**Decision needed:** User said "drop cron for now." Confirm:
|
||||||
|
- Remove `cron.py` entirely?
|
||||||
|
- Or keep for future use?
|
||||||
|
|
||||||
|
### 8. Link dedup — can we clear a link?
|
||||||
|
|
||||||
|
`update_bounty` uses `COALESCE(?, link)`:
|
||||||
|
```python
|
||||||
|
SET text = COALESCE(?, text),
|
||||||
|
link = COALESCE(?, link),
|
||||||
|
due_date_ts = COALESCE(?, due_date_ts)
|
||||||
|
```
|
||||||
|
|
||||||
|
Passing `None` as `link` does NOT clear it — `COALESCE` returns the existing value.
|
||||||
|
|
||||||
|
**Question:** Is this intentional? Should there be a way to clear/reset a link or due date?
|
||||||
|
|
||||||
|
### 9. Git as update mechanism
|
||||||
|
|
||||||
|
Proposal: `git pull` in the deployment directory as the update mechanism.
|
||||||
|
|
||||||
|
**Questions:**
|
||||||
|
- Is the `.service` file and env file generated by `setup.sh` stored outside the git-tracked directory?
|
||||||
|
- Or are they gitignored files inside the repo?
|
||||||
|
- What's the update procedure? `git pull && systemctl restart jigaido-bot`?
|
||||||
|
|
||||||
|
### 10. Cron dropped
|
||||||
|
|
||||||
|
User explicitly said drop cron for now.
|
||||||
|
|
||||||
|
**Action needed:**
|
||||||
|
- Remove `cron.py` and related code from `bot.py`?
|
||||||
|
- Or keep but disable?
|
||||||
|
- What about `reminder_log` table — keep or remove?
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Priority
|
||||||
|
|
||||||
|
| # | Question | Priority |
|
||||||
|
|---|---|---|
|
||||||
|
| 1 | `__init__.py` remove? | Low |
|
||||||
|
| 2 | `.service` as template | Medium |
|
||||||
|
| 3 | Logging separation | Medium |
|
||||||
|
| 4 | `/start` behavior clarification | Medium |
|
||||||
|
| 5 | `db.init_db()` safety | Low (already safe) |
|
||||||
|
| 6 | `TELEGRAM_BOT_USERNAME` | Low |
|
||||||
|
| 7 | `REMAINDER_WINDOW_DAYS` duplication | Low |
|
||||||
|
| 8 | Can we clear link/due_date? | Medium |
|
||||||
|
| 9 | Git as update mechanism | High |
|
||||||
|
| 10 | Cron dropped | Medium |
|
||||||
177
.github/ISSUE_TEMPLATE/v2-simplify-storage.md
vendored
Normal file
177
.github/ISSUE_TEMPLATE/v2-simplify-storage.md
vendored
Normal file
@@ -0,0 +1,177 @@
|
|||||||
|
# Simplify Storage: Replace SQLite with Per-User JSON Files
|
||||||
|
|
||||||
|
## Status
|
||||||
|
Proposed
|
||||||
|
|
||||||
|
## Background
|
||||||
|
|
||||||
|
### What happened
|
||||||
|
|
||||||
|
The SQLite-based storage layer (`db.py`) introduced several categories of complexity that outweigh its benefits at this stage:
|
||||||
|
|
||||||
|
1. **Connection management bugs** — SQLite Python's `row_factory` disables implicit transaction handling. Combined with `PRAGMA foreign_keys = ON`, this caused `ON CONFLICT UPDATE` statements to silently fail to commit. The fix required setting `conn.isolation_level = None` directly on the connection object after creation. These are not obvious behaviors and took significant debugging time.
|
||||||
|
|
||||||
|
2. **Test fragility** — The `fresh_db` fixture patches `DB_PATH` but the SQLite connection is a module-level singleton with connection-level state. Tests passed in isolation but failed under pytest's caching, and the root cause was subtle enough to require multiple iterations.
|
||||||
|
|
||||||
|
3. **Tracking table complexity** — The `user_bounty_tracking` + `reminder_log` tables with dedup logic add non-trivial query complexity for what is essentially a "bookmark" feature.
|
||||||
|
|
||||||
|
4. **Schema migrations** — Any schema change requires a migration script. For a personal bot with 2 users and 50 bounties, this overhead is disproportionate.
|
||||||
|
|
||||||
|
5. **Cron/reminder system** — The daily reminder cron (`cron.py`) requires a separate process, scheduler (cron), and `reminder_log` table to prevent duplicate notifications. This is a significant operational surface for a v1.
|
||||||
|
|
||||||
|
### Why it happened
|
||||||
|
|
||||||
|
The current design was over-engineered for the actual usage pattern:
|
||||||
|
- Most commands are stateless (one request → one response)
|
||||||
|
- The user is the primary (and likely only) user
|
||||||
|
- Scale target is 10-100 users, not 10,000+
|
||||||
|
- The bot is a personal project, not a production service
|
||||||
|
|
||||||
|
SQLite was chosen for "correctness" but at this scale, the correctness guarantees are irrelevant while the complexity is real.
|
||||||
|
|
||||||
|
### Current state
|
||||||
|
|
||||||
|
The bot works and 53/53 tests pass. But `db.py` is ~300 lines with subtle connection semantics, `schema.sql` defines 7 tables, `cron.py` is a separate process, and the command layer (`commands.py`) is entangled with the DB layer.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Proposal
|
||||||
|
|
||||||
|
**Replace SQLite with a per-user JSON file storage system.**
|
||||||
|
|
||||||
|
### Storage Design
|
||||||
|
|
||||||
|
```
|
||||||
|
data/
|
||||||
|
└── users/
|
||||||
|
└── {telegram_user_id}.json # one file per user
|
||||||
|
```
|
||||||
|
|
||||||
|
**File structure (`users/{id}.json`):**
|
||||||
|
```json
|
||||||
|
{
|
||||||
|
"user_id": 123,
|
||||||
|
"username": "alice",
|
||||||
|
"personal_bounties": [
|
||||||
|
{
|
||||||
|
"id": 1,
|
||||||
|
"text": "Fix login bug",
|
||||||
|
"link": "https://github.com/example/repo/issues/1",
|
||||||
|
"due_date_ts": 1735689600,
|
||||||
|
"created_at": 1735603200
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"tracked_bounties": [
|
||||||
|
{"bounty_id": 5, "group_id": -1001, "created_at": 1735600000},
|
||||||
|
{"bounty_id": 3, "group_id": null, "created_at": 1735590000}
|
||||||
|
]
|
||||||
|
}
|
||||||
|
```
|
||||||
|
|
||||||
|
### Key design decisions
|
||||||
|
|
||||||
|
1. **Single file per user** — No group-level files. Personal bounties live in the creator's file. Group bounties live in the creator's file with `group_id` set.
|
||||||
|
|
||||||
|
2. **Bounty IDs are sequential integers per file** — Not global. Each user's file has its own `next_id` counter. This avoids coordination between users at the cost of non-global IDs (acceptable for personal use).
|
||||||
|
|
||||||
|
3. **Cross-group tracking** — When Alice (in Group A) tracks a bounty created by Bob in Group B, Alice's file stores `{bounty_id: X, group_id: -100B}`. To display it, the bot loads Bob's file and finds bounty `X`.
|
||||||
|
|
||||||
|
4. **No reminders in v1** — Drop the cron/reminder system entirely. The `reminder_log` table and `cron.py` are removed. Reminders can be added back as a v2 feature with a simpler design (e.g., just a "due soon" filter on `/my`).
|
||||||
|
|
||||||
|
5. **No admin model in v1** — Drop `group_admins` table. Group bounties are open to anyone in the group to add/edit/delete. The creator can be the only one who can modify (enforced by `created_by_user_id` check).
|
||||||
|
|
||||||
|
### Deleted components
|
||||||
|
|
||||||
|
- `db.py` — removed entirely
|
||||||
|
- `schema.sql` — removed entirely
|
||||||
|
- `cron.py` — removed entirely
|
||||||
|
- `reminder_log` table — removed
|
||||||
|
- `user_bounty_tracking` table — replaced by `tracked_bounties` in user JSON
|
||||||
|
- `groups` table — removed (group_id stored directly in bounty objects)
|
||||||
|
- `group_admins` table — removed (simplified permission model)
|
||||||
|
|
||||||
|
### Retained components
|
||||||
|
|
||||||
|
- `bot.py` — minimal entrypoint
|
||||||
|
- `commands.py` — command parsing and reply logic (simplified)
|
||||||
|
- `tests/` — simplified to match new data model
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Implementation Plan
|
||||||
|
|
||||||
|
### Phase 1: Data model + storage layer
|
||||||
|
|
||||||
|
1. Create `storage.py` with:
|
||||||
|
- `get_user_path(user_id)` — returns `Path` to user's JSON
|
||||||
|
- `load_user(user_id)` — reads and parses JSON, returns dict, creates file if missing
|
||||||
|
- `save_user(user_id, data)` — writes JSON atomically (temp file + rename)
|
||||||
|
- `next_bounty_id(user_id)` — increments and returns next ID for that user's file
|
||||||
|
|
||||||
|
2. No locking needed at v1 scale. `tempfile` + `rename` gives atomic writes.
|
||||||
|
|
||||||
|
### Phase 2: Rewrite commands.py
|
||||||
|
|
||||||
|
Simplified command set:
|
||||||
|
|
||||||
|
| Command | Where | Who | Description |
|
||||||
|
|---|---|---|---|
|
||||||
|
| `/bounty` | Group / DM | Anyone | List all bounties (group-scoped in group, personal in DM) |
|
||||||
|
| `/add <text> [link] [due>` | Group | Anyone | Add bounty to group |
|
||||||
|
| `/add <text> [link] [due>` | DM | Anyone | Add personal bounty |
|
||||||
|
| `/edit <id> [text] [link] [due>` | Group | Creator only | Edit bounty |
|
||||||
|
| `/edit <id> [text] [link] [due>` | DM | Creator only | Edit personal bounty |
|
||||||
|
| `/delete <id>` | Group | Creator only | Delete bounty |
|
||||||
|
| `/delete <id>` | DM | Creator only | Delete personal bounty |
|
||||||
|
| `/track <id>` | Group | Anyone | Track a group bounty |
|
||||||
|
| `/untrack <id>` | Group | Anyone | Untrack a bounty |
|
||||||
|
| `/my` | Group | Anyone | Show tracked group bounties |
|
||||||
|
| `/my` | DM | Anyone | Show tracked personal bounties |
|
||||||
|
| `/start` | Anywhere | Anyone | Re-initialize user |
|
||||||
|
| `/help` | Anywhere | Anyone | Show help |
|
||||||
|
|
||||||
|
**Removed commands:**
|
||||||
|
- `/admin_add`, `/admin_remove` — no admin model in v1
|
||||||
|
- Reminder-related logic — no cron in v1
|
||||||
|
|
||||||
|
### Phase 3: Simplify bot.py
|
||||||
|
|
||||||
|
- Remove `Application.post_init` setup (no DB init needed)
|
||||||
|
- Bot starts instantly — JSON files created on first use
|
||||||
|
- No migration logic
|
||||||
|
|
||||||
|
### Phase 4: Rewrite tests
|
||||||
|
|
||||||
|
- `test_commands.py` — keep (parsing is unchanged)
|
||||||
|
- `test_storage.py` — new, tests `load_user`, `save_user`, `next_bounty_id`
|
||||||
|
- Remove all DB-dependent tests (`test_db.py` deleted)
|
||||||
|
|
||||||
|
### Phase 5: Cleanup
|
||||||
|
|
||||||
|
- Delete `db.py`, `schema.sql`, `cron.py`, `test_db.py`
|
||||||
|
- Delete `requirements-dev.txt` (dev deps in `pyproject.toml`)
|
||||||
|
- Update README to reflect simplified commands
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Estimated effort
|
||||||
|
|
||||||
|
- Storage layer: ~80 lines
|
||||||
|
- Commands rewrite: ~200 lines (simpler than current)
|
||||||
|
- Tests: ~100 lines
|
||||||
|
- Cleanup: trivial
|
||||||
|
|
||||||
|
Total: ~1 day of work for one person.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## When to revert to SQLite
|
||||||
|
|
||||||
|
If any of these become true, SQLite is the right choice:
|
||||||
|
- Multiple concurrent users with write conflicts
|
||||||
|
- Need for complex queries (across all users, aggregations, etc.)
|
||||||
|
- Reminder system with proper deduplication
|
||||||
|
- Scale target > 1,000 users
|
||||||
|
- Need for ACID guarantees on concurrent writes
|
||||||
|
|
||||||
|
For a personal bot with < 100 users, JSON files are the right default.
|
||||||
Reference in New Issue
Block a user