Simplify Storage: Replace SQLite with Per-User JSON Files #2
Reference in New Issue
Block a user
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Status
Proposed
Background
What happened
The SQLite-based storage layer (
db.py) introduced several categories of complexity that outweigh its benefits at this stage:Connection management bugs — SQLite Python's
row_factorydisables implicit transaction handling. Combined withPRAGMA foreign_keys = ON, this causedON CONFLICT UPDATEstatements to silently fail to commit.Test fragility — The
fresh_dbfixture patchesDB_PATHbut the SQLite connection is a module-level singleton with connection-level state. Tests passed in isolation but failed under pytest's caching.Tracking table complexity — The
user_bounty_tracking+reminder_logtables with dedup logic add non-trivial query complexity for what is essentially a "bookmark" feature.Schema migrations — Any schema change requires a migration script. For a personal bot with 2 users and 50 bounties, this overhead is disproportionate.
Cron/reminder system — The daily reminder cron (
cron.py) requires a separate process, scheduler (cron), andreminder_logtable to prevent duplicate notifications.Why it happened
The current design was over-engineered for the actual usage pattern. SQLite was chosen for "correctness" but at this scale, the correctness guarantees are irrelevant while the complexity is real.
Current state
The bot works and 53/53 tests pass. But
db.pyis ~300 lines with subtle connection semantics,schema.sqldefines 7 tables,cron.pyis a separate process.Proposal
Replace SQLite with a per-user JSON file storage system.
Storage Design
File structure (
users/{id}.json):Key design decisions
group_idset.next_idcounter.{bounty_id: X, group_id: -100B}. Bot loads Bob's file to find the bounty content.cron.pyandreminder_logentirely.created_by_user_id).Deleted components
db.py,schema.sql,cron.pyreminder_log,user_bounty_tracking,groups,group_adminstablesImplementation Plan
Phase 1: Storage layer
storage.py:load_user(),save_user(),next_bounty_id()tempfile+renamePhase 2: Rewrite commands.py
Simplified command set:
/bounty/add <text> [link] [due>/add <text> [link] [due>/edit <id> [text] [link] [due>/edit <id> [text] [link] [due>/delete <id>/delete <id>/track <id>/untrack <id>/my/start/helpPhase 3: Simplify bot.py
Application.post_init(no DB init needed)Phase 4: Rewrite tests
test_commands.py(parsing unchanged)test_storage.pytest_db.pyPhase 5: Cleanup
db.py,schema.sql,cron.py,test_db.py,requirements-dev.txtEstimated effort
~80 lines storage + ~200 lines commands + ~100 lines tests = ~1 day
When to revert to SQLite
Thanks for the detailed proposal! I have a few clarifying questions:
1. Data directory location
Where should
data/users/live? At repo root (~/repositories/jigaido/data/) or insideapps/telegram-bot/data/?2. Cross-group bounty lookup
When Bob tracks Alice's group bounty (Alice's file:
{bounty_id: X, group_id: -100B}), Bob's file stores{bounty_id: X, group_id: -100B}. But to display the bounty content to Bob, the bot needs to find Alice's file and read bounty X. How does the bot efficiently locate Alice's file given onlybounty_id: Xandgroup_id: -100B?3. Listing group bounties
Currently,
/bountyin a group lists all bounties for thatgroup_id. With per-user files, does the bot scan ALL user files to find bounties wheregroup_id = current_chat_id? Or is there an index?4. Migration
Is this a clean break (no migration from SQLite), or should existing data be migrated to JSON?
5. Admin model removal
The proposal removes admin management entirely. Current behavior: group creator can
/admin_addother users. New behavior: anyone can add, only creator can edit/delete. Is this change intentional and acceptable to users?PR created: #3
Changes:
data/users/for data directory location I'm thinking at ~/.jigaido/*, what do you think?
I don't understand cross-group bounty lookup, Group A should only be accessible within Group A. no individual or other groups should be able to access Group A's bounties. member of Group A can access bounties within Group A.
listing group bounties, no, it doesnt scan all user files. basically inside a group file, there will be bounties and trackers. bounties added by admin. trackers added by group members through adding a listed bounty to their track list. the flow is admin => add bounty => bounty listed in group bounties => member choose one of the bounty listed in group bounties then added in the user track list. is this clear?
migration, since we are using json it should be simple to migrate right now. we will plan about migration for sqlite and other data in another time.
no, only admin can add bounties. all users can add to their own trackers. bounties are managed by admin. trackers are managed by their own users. is this clear?
PM Review — Gap identified
PR #3 was created before the user's clarifying comment on issue #2. The implementation doesn't match the clarified design.
User clarifications (from issue #2 comment):
Data directory:
~/.jigaido/*at repo root, NOT insideapps/telegram-bot/data/Group structure needed:
data/groups/{group_id}.jsonwith:bounties— added by admintrackers— members who tracked bounties from this grouptracked_bountieslistGroups isolated: Members of Group A only access Group A's bounties. No cross-group bounty access needed.
User file:
data/users/{user_id}.jsonstores:usernametracked_bounties— list of{group_id, bounty_id}references to group bountiesCurrent PR gap: Only per-user files exist. No group file structure. No
bountiesvstrackersdistinction in groups.Please update the implementation to match the clarified design, then re-request review.
Design Updated — Please Revise PR #3
The file structure in issue #2 has been updated. PR #3 was based on the old design. Please revise the implementation to match:
New structure:
Lookup flow:
chat_id = -100123): readdata/-100123/group.jsonfor bountieschat_id = 123): readdata/123/user.jsonfor personal bountiesdata/{group_id}/{user_id}.jsonKey rules:
group.json, not globalThe updated spec is in the issue. Please update PR #3 to match.
PM Delegation — Revise PR #3
Task: Revise PR #3 to implement the new storage design from issue #2.
Spec to implement:
Lookup rules:
data/{group_id}/group.jsondata/{user_id}/user.jsondata/{group_id}/{user_id}.jsonKey constraints:
group.jsonFiles to implement:
apps/telegram-bot/storage.py— load/save for group.json, user.json, and tracking filesapps/telegram-bot/commands.py— rewritten for new storage modelapps/telegram-bot/bot.py— updated to use storage.pyFiles to delete:
db.py,schema.sql,cron.py,tests/test_db.pyDeliverable: Updated PR #3 with new implementation matching the spec above.
yes, please update PR #3 with new implementation matching the spec above @shoko