Add integration testing guide and prompts

- docs/workflow/INTEGRATION-TESTING.md - Full guide on integration testing
  - What it is vs unit/e2e tests
  - 5 principles to test per integration point
  - Folder structure suggestions
  - Self-improving loop for human findings
- integration-tests/ - Template for project's integration tests
  - README.md - What to test (Feynman fills per project)
  - package.json - Test runner setup
  - scripts/setup.sh - Service startup
  - scripts/teardown.sh - Cleanup
- docs/workflow/WORKFLOW.md - Added integration testing references
- docs/workflow/AGENT-PROMPTS.md - Added integration testing prompts
- docs/workflow/INDEX.md - Updated file structure
shokollm
2026-04-18 14:08:56 +00:00
parent 002596aea9
commit 224fff2761
8 changed files with 685 additions and 7 deletions

docs/workflow/AGENT-PROMPTS.md

@@ -253,3 +253,104 @@ Fix the issue file(s) before committing:
After fixing, try committing again.
```
---
## Integration Testing: Initial Survey Prompt
When starting a new project or adopting this workflow for the first time:
```
Survey this codebase and fill in integration-tests/README.md:
1. Identify all systems in this project
2. Identify all integration points between systems
3. For each integration point, list what should be tested:
   - Happy path
   - Data integrity
   - Error handling
   - Auth (if applicable)
   - Timing/async
4. List any known gaps in integration testing
5. List any tests that should exist but don't yet
Read docs/workflow/INTEGRATION-TESTING.md for guidance on what to test.
Output: Update integration-tests/README.md with your findings.
Human will review and approve.
```
---
## Integration Testing: Per-Issue Prompt
When working on an issue that touches integration points:
```
This issue involves integration between systems.
Before implementing:
1. Read integration-tests/README.md
2. Identify which integration points this issue affects
3. Check if tests exist for those integration points
4. If tests don't exist, plan to create them
During implementation:
5. Implement the feature/fix
6. Add or update integration tests to cover:
   - Happy path (does the connection work?)
   - Error handling (what if something fails?)
   - Any edge cases specific to this integration
After implementation:
7. Run integration tests: pnpm run test:integration
8. If tests fail, fix before claiming done
9. Document test changes in the issue file
```
---
## Integration Testing: Human Bug → Regression Test Prompt
When human finds a bug during manual testing:
```
Human found an integration bug: <description>
1. Add a row to the integration-tests/README.md > Human Findings table:
   | <date> | <bug description> | No | <proposed test file> |
2. Implement a regression test that would catch this bug:
   - The test should fail now (demonstrating the bug)
   - The test should pass once the bug is fixed
3. Add the test to the appropriate test file in integration-tests/tests/
4. Document in the issue file:
   - What bug was found
   - What regression test was added
   - How to verify the test works
```
---
## Integration Testing: Adding Tests to Issue Prompt
For issues that introduce new integration points:
```
This issue introduces a new integration point or modifies an existing one.
Add to integration-tests/README.md:
1. New entry in Integration Points table (if new)
2. What to test for this integration point
3. List of tests to implement
Then implement the tests in integration-tests/tests/:
- One test file per integration point
- Test functions named clearly
- Happy path first, then edge cases
Run tests and verify they pass before claiming done.
```

docs/workflow/INDEX.md

@@ -27,13 +27,16 @@ A workflow where:
│   ├── INDEX.md               ← You are here
│   ├── ISSUE-FORMAT.md        ← Required fields for issue files
│   ├── WORKFLOW.md            ← Step-by-step stages and gates
│   ├── AGENT-PROMPTS.md       ← Copy-paste prompts for agents
│   └── INTEGRATION-TESTING.md ← How to test system connections
├── .issues/
│   ├── INDEX.md               ← List of all open issues
│   └── example/
│       └── 001-example.md
├── .hooks/
│   └── issue-linter.js        ← Pre-commit validator
├── integration-tests/         ← Integration tests (create per project)
│   └── README.md              ← What to test (Feynman fills this)
└── package.json               ← (optional) setup scripts
```

docs/workflow/INTEGRATION-TESTING.md

@@ -0,0 +1,363 @@
# Integration Testing Guide
How to test connections between systems. This is **principles, not code**—adapt to your project's languages and tools.
## What Are Integration Tests?
```
┌──────────────────────────────────────────────────────────────────┐
│                         Your Application                         │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│   Unit Tests           Integration Tests        E2E Tests        │
│   ──────────           ─────────────────        ─────────        │
│   One component        Two+ systems working     Full user flow   │
│   in isolation         together                 in the real app  │
│                                                                  │
│   Example:             Example:                 Example:         │
│   - auth service       - Frontend calls         - User logs in   │
│   - user model         - Backend API            - Creates post   │
│   - utils              - Database responds      - Post appears   │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```
**Integration test = verifying that System A and System B talk to each other correctly.**
Unit tests check if a single piece works. Integration tests check if pieces work **together**.
## Why Integration Tests Matter
- Integration breaks are the hardest to debug
- They often fail silently (each system looks fine in isolation)
- They cause the "works on my machine" problem
- They disproportionately affect users, not developers
**Integration tests catch these before humans do.**
## Principles: What to Test for Every Connection
For every connection between two systems (A → B), test these five areas:
### 1. Happy Path
```
✓ A sends valid request to B
✓ B processes and responds correctly
✓ A handles response correctly
```
**Example:** Frontend calls `/api/users/1`, backend returns user JSON, frontend displays it.
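A minimal Vitest sketch of that example; the `localhost:3000` base URL is a placeholder for whatever `scripts/setup.sh` starts:
```typescript
import { expect, test } from "vitest";

const API = "http://localhost:3000"; // hypothetical; match your setup.sh

test("frontend can fetch a user from the backend", async () => {
  // A sends a valid request to B
  const res = await fetch(`${API}/api/users/1`);

  // B processes and responds correctly
  expect(res.status).toBe(200);

  // A can consume the response
  const user = await res.json();
  expect(user.id).toBe(1);
});
```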
### 2. Data Integrity
```
✓ Data sent = data received
✓ Data persists correctly on B side
✓ Data visible to A on next fetch
```
**Example:** Frontend sends `{ name: "Alice" }`, database stores `"Alice"`, next GET returns `"Alice"`.
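Sketched the same way (the `POST /api/users` endpoint is a hypothetical stand-in for your own API):
```typescript
import { expect, test } from "vitest";

const API = "http://localhost:3000"; // hypothetical; match your setup.sh

test("data sent equals data received on the next fetch", async () => {
  // A sends data to B
  const res = await fetch(`${API}/api/users`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ name: "Alice" }),
  });
  const created = await res.json();

  // B persisted it; A sees the same value on the next fetch
  const fetched = await fetch(`${API}/api/users/${created.id}`).then((r) => r.json());
  expect(fetched.name).toBe("Alice");
});
```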
### 3. Error Handling
```
✓ What happens when B is down?
✓ What happens when B returns error?
✓ Does A fail gracefully?
✓ Does A show meaningful error to user?
```
**Example:** Backend returns 500, frontend shows "Something went wrong" instead of blank screen.
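One way to sketch this; the `fetchUser` wrapper and its `{ ok, error }` result shape are hypothetical stand-ins for your frontend's API client:
```typescript
import { expect, test } from "vitest";
// Hypothetical frontend API client; substitute your real wrapper.
import { fetchUser } from "../shared/api-client";

test("client fails gracefully when the backend is down", async () => {
  // Point the client at a port where nothing is listening.
  const result = await fetchUser(1, { baseUrl: "http://localhost:9999" });

  expect(result.ok).toBe(false);
  // A meaningful message, not a raw stack trace or a blank screen
  expect(result.error).toMatch(/something went wrong/i);
});
```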
### 4. Authentication/Authorization (if applicable)
```
✓ A authenticates to B correctly
✓ A cannot access B without auth
✓ A cannot access other users' data
✓ Auth tokens refresh correctly
```
**Example:** Request without token returns 401, not 500.
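The 401-not-500 case pins down nicely in a test (same placeholder base URL):
```typescript
import { expect, test } from "vitest";

const API = "http://localhost:3000"; // hypothetical; match your setup.sh

test("request without a token returns 401, not 500", async () => {
  // Deliberately omit the Authorization header
  const res = await fetch(`${API}/api/users/me`);
  expect(res.status).toBe(401);
});
```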
### 5. Timing/Async
```
✓ What happens with slow responses?
✓ What happens with concurrent requests?
✓ Are race conditions handled?
✓ Does timeout work correctly?
```
**Example:** Request takes 30s, frontend shows loading indicator, eventually times out.
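A timeout sketch using `AbortSignal.timeout` (available in Node 18+); the slow endpoint is hypothetical:
```typescript
import { expect, test } from "vitest";

const API = "http://localhost:3000"; // hypothetical; match your setup.sh

test("client gives up on a slow response", async () => {
  // Abort after 1s instead of hanging for the full 30s response.
  await expect(
    fetch(`${API}/api/slow-report`, { signal: AbortSignal.timeout(1_000) }),
  ).rejects.toThrow(/timeout|abort/i);
});
```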
---
## Folder Structure
Create an `integration-tests/` directory in your project root. Treat it like a mini-app.
```
my-project/
├── src/                           # Your application
├── integration-tests/             # Integration tests (THIS)
│   ├── README.md                  # What to test (fill this in!)
│   ├── package.json               # Test runner dependencies
│   ├── tests/
│   │   ├── systems/
│   │   │   ├── frontend-backend.spec.ts
│   │   │   ├── backend-database.spec.ts
│   │   │   └── cli-telegram.spec.ts
│   │   └── shared/
│   │       ├── auth-helpers.ts
│   │       └── test-data.ts
│   ├── scripts/
│   │   ├── setup.sh               # Boot up services for tests
│   │   └── teardown.sh            # Clean up after tests
│   └── vitest.config.ts           # Or jest, mocha, etc.
└── ...
```
**Note:** Choose your test framework. Common choices:
- **Browser apps**: Playwright, Cypress
- **APIs**: Vitest, Jest, Supertest
- **CLI tools**: Bash scripts, bats-core
- **Any**: Just scripts that assert output
The structure is flexible. The **principles** are not.
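If you pick Vitest, the `vitest.config.ts` in the tree above can stay small. A sketch, assuming Vitest 1.1+ for the `fileParallelism` option:
```typescript
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    include: ["tests/**/*.spec.ts"],
    // Integration tests hit real services, so allow generous timeouts.
    testTimeout: 30_000,
    // Run test files one at a time to avoid fighting over shared services.
    fileParallelism: false,
  },
});
```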
---
## Step 1: Document Your Integration Points
When starting a project (or adopting this workflow), fill in `integration-tests/README.md` with:
```markdown
# Integration Points
| # | From System | To System | What We Test | Status |
|---|------------|----------|--------------|--------|
| 1 | Frontend | Backend API | Auth flow, data fetching | TODO |
| 2 | Backend | Database | CRUD operations | TODO |
| 3 | CLI | Telegram Bot | Command → message delivery | TODO |
```
**Feynman will fill this out** based on the codebase. Human reviews and approves.
---
## Step 2: For Each Integration Point, Document What to Test
```markdown
## Integration Point #1: Frontend ↔ Backend API
### What We Test
1. **Happy Path**
   - [ ] User can log in and receive token
   - [ ] User can fetch their profile
   - [ ] User can update their profile
2. **Data Integrity**
   - [ ] Data sent = data received
   - [ ] Data persists after page refresh
3. **Error Handling**
   - [ ] Invalid credentials → clear error message
   - [ ] Network failure → graceful failure
   - [ ] Invalid token → redirect to login
4. **Auth**
   - [ ] Protected routes require auth
   - [ ] Expired token → refresh or redirect
5. **Timing**
   - [ ] Loading states work
   - [ ] Timeout on slow requests
```
**Agent fills the checklist. Human reviews.**
---
## Step 3: Implement Tests
Agent implements tests based on the checklist. Tests live in `integration-tests/tests/`.
**The test filename should indicate what it tests:**
```
tests/
├── frontend-backend-auth.spec.ts
├── frontend-backend-profiles.spec.ts
├── backend-database-users.spec.ts
└── cli-telegram-commands.spec.ts
```
---
## Step 4: Run Tests Before Commit
The pre-commit hook runs integration tests. If tests fail, commit is rejected.
Add this to `.hooks/issue-linter.js` or create a separate hook:
```bash
# In pre-commit hook
pnpm run test:integration
```
**Tests must pass before code is committed.**
---
## Step 5: The Self-Improving Loop
When human finds a bug during manual testing:
```
1. Human documents the bug in the issue or project's integration test README
2. Agent implements a regression test for that bug
3. Test is committed alongside the fix
4. Future commits run the regression test
5. Same type of bug is now caught automatically
```
**Format for documenting human findings:**
```markdown
## Human Findings (Regression Tests Added)
| Date | What Human Found | Test Added? | Test File |
|------|------------------|-------------|-----------|
| 2024-04-18 | Logout didn't clear token | Yes | frontend-backend-auth.spec.ts |
| 2024-04-20 | Profile update didn't persist | Yes | frontend-backend-profiles.spec.ts |
```
This creates a record of:
- What bugs existed
- What tests prevent them from returning
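To make the loop concrete, a regression test for the first finding above might look like this sketch; the `login`/`logout`/`getStoredToken` helpers are hypothetical stand-ins for whatever lives in `tests/shared/auth-helpers.ts`:
```typescript
import { expect, test } from "vitest";
// Hypothetical helpers; see tests/shared/auth-helpers.ts in the folder structure.
import { login, logout, getStoredToken } from "../shared/auth-helpers";

// Regression test for: "Logout didn't clear token"
test("logout clears the stored auth token", async () => {
  await login("test-user", "test-password");
  expect(getStoredToken()).toBeTruthy();

  await logout();
  expect(getStoredToken()).toBeNull(); // failed before the fix, passes after
});
```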
---
## Phase 2: Per-Project Setup
When starting a new project:
1. **Clone workflow template** (includes this guide)
2. **Run setup**: `pnpm run setup`
3. **Feynman reads** `docs/workflow/INTEGRATION-TESTING.md`
4. **Feynman surveys the codebase** and fills in `integration-tests/README.md`:
- What systems exist?
- What are the integration points?
- What should be tested?
5. **Human reviews and approves** the integration points
6. **Feynman implements** integration tests based on the approved plan
7. **Tests are committed** to the repo
---
## Phase 3: Ongoing
When working on an issue:
1. **Agent reads** `integration-tests/README.md` to understand existing tests
2. **Agent adds tests** for new integration points introduced by the issue
3. **Tests run** before commit
4. **Human finds bug** → documents in `Human Findings` table
5. **Agent adds regression test** → next time this is automatic
---
## Common Integration Test Patterns
### API ↔ Database
```typescript
// Prisma-style client shown; adapt to your database layer

// 1. Insert data
const user = await db.users.create({ data: { name: "Test" } });

// 2. Verify the database persisted it
const found = await db.users.findUnique({ where: { id: user.id } });
expect(found.name).toBe("Test");

// 3. Clean up so the test stays independent
await db.users.delete({ where: { id: user.id } });
```
### Frontend ↔ API
```typescript
// 1. Setup auth
const token = await getAuthToken();
// 2. Make request
const response = await api.get("/users/me", { headers: { Authorization: token } });
// 3. Assert
expect(response.status).toBe(200);
expect(response.data.name).toBe("Test User");
```
### CLI ↔ External Service
```bash
# 1. Capture output
output=$(my-cli command --arg value)
# 2. Assert
if [[ "$output" == *"expected"* ]]; then
  echo "PASS"
else
  echo "FAIL: expected 'expected' in output"
  exit 1
fi
```
---
## Tips
- **Keep tests independent**: Each test should set up and tear down its own data (see the sketch after this list)
- **Use realistic data**: Test with data similar to production
- **Name tests clearly**: `it('should return 401 for invalid token')`
- **One assertion per test**: Easier to debug failures
- **Document edge cases**: If you found a bug manually, add a test for it
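For the first tip, a sketch of per-test setup and teardown in Vitest; the `createTestUser`/`deleteTestUser` helpers are hypothetical, in the spirit of `tests/shared/test-data.ts`:
```typescript
import { afterEach, beforeEach, expect, test } from "vitest";
// Hypothetical data helpers in the spirit of tests/shared/test-data.ts.
import { createTestUser, deleteTestUser } from "../shared/test-data";

let userId: string;

beforeEach(async () => {
  userId = await createTestUser(); // each test owns its own data...
});

afterEach(async () => {
  await deleteTestUser(userId); // ...and cleans it up, pass or fail
});

test("profile update persists for the test's own user", async () => {
  // Test body uses userId only; nothing shared with other tests.
  expect(userId).toBeTruthy();
});
```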
---
## Troubleshooting
### "Integration tests take too long to run"
- Run only relevant tests: `pnpm run test:integration -- --grep "auth"`
- Run in parallel if possible
- Separate slow tests into a "slow" suite run nightly
### "Services need to be running for tests"
- Add `scripts/setup.sh` that starts services
- Hook runs setup before tests, teardown after
- Document startup requirements in `integration-tests/README.md`
### "I don't know what to test"
- Start with happy path for each integration point
- Add error cases as human finds bugs
- Use the principles in this guide as a checklist
---
## Summary
1. **Identify integration points** in your project
2. **Document what to test** for each (Feynman fills, human approves)
3. **Implement tests** (Agent does, guided by documentation)
4. **Run before commit** (Hook enforces)
5. **Human finds bug** → add to Human Findings table
6. **Agent adds regression test** → loop closes
This creates a system where:
- Integration issues are caught automatically
- Human findings improve the automated system
- Future similar issues are prevented

docs/workflow/WORKFLOW.md

@@ -167,6 +167,10 @@ Implementation complete. Self-Verification filled (see Stage 4).
   - Run their tests
   - Do a quick manual check
5. Fill `## Agent Working Notes > Self-Verification`
6. **Run integration tests** (if this issue touches integration points):
   - Check `integration-tests/README.md` for affected integration points
   - Run: `pnpm run test:integration -- --grep "<integration-point>"`
   - If tests don't exist, this is a gap—document it in the issue
### Self-Verification Must Be Honest
@@ -343,3 +347,44 @@ Do NOT unilaterally expand scope. Do NOT ignore the scope change.
Ask. Write in the issue file. Wait for response. Proceed only when you have acknowledgment.
It is better to pause and ask than to assume and break things.
---
## Integration Testing
Integration tests verify that two or more systems work correctly together. See `docs/workflow/INTEGRATION-TESTING.md` for the full guide.
### Quick Reference
**When to run integration tests:**
- During Stage 4 (Self-Verification)
- Before claiming an issue is done
- When the issue touches integration points
**How to run:**
```bash
pnpm run test:integration
```
**If tests don't exist for an integration point:**
- Document it as a gap in `integration-tests/README.md`
- Implement the test if possible
- If not, mark it and continue
**Human finds a bug:**
1. Document in `integration-tests/README.md > Human Findings`
2. Agent implements a regression test
3. Test prevents future occurrences
### Integration Test Checklist
For every integration point this issue touches:
- [ ] Happy path works
- [ ] Data integrity verified
- [ ] Error handling works (mock failure scenarios)
- [ ] Authentication/authorization works (if applicable)
- [ ] Timing/async handled
Document any gaps in the issue file.

integration-tests/README.md

@@ -0,0 +1,102 @@
# Integration Tests
**This file documents what integration tests should exist in this project.**
Feynman will fill this out by surveying the codebase. Human reviews and approves.
---
## Integration Points
| # | From System | To System | What We Test | Status |
|---|------------|----------|--------------|--------|
| — | — | — | — | — |
---
## What to Test Per Integration Point
For each integration point above, document what to test:
### Integration Point #1: [FILL: System A ↔ System B]
**What to test:**
1. **Happy Path**
   - [ ] TODO
2. **Data Integrity**
   - [ ] TODO
3. **Error Handling**
   - [ ] TODO
4. **Auth**
   - [ ] TODO
5. **Timing/Async**
   - [ ] TODO
---
## Known Integration Test Gaps
<!-- Issues or TODOs for missing tests -->
- TODO: [Describe missing test]
---
## Tests to Implement
| # | Test Description | Integration Point | Status |
|---|-----------------|------------------|--------|
| 1 | TODO | TODO | TODO |
---
## Human Findings (Regression Tests Added)
When human finds a bug during manual testing, document it here. Agent implements a regression test.
| Date | What Human Found | Test Added? | Test File |
|------|------------------|-------------|-----------|
| — | — | — | — |
---
## How to Run Tests
```bash
# Install dependencies
pnpm install
# Setup services (if needed)
pnpm run test:integration:setup
# Run all integration tests
pnpm run test:integration
# Run specific integration point tests
pnpm run test:integration -- --grep "auth"
# Teardown (if needed)
pnpm run test:integration:teardown
```
---
## Services Required for Tests
<!-- Document what services need to be running for tests to pass -->
| Service | How to Start | Port | Notes |
|---------|-------------|------|-------|
| — | — | — | — |
---
## Notes
<!-- Any project-specific notes about testing -->

integration-tests/package.json

@@ -0,0 +1,15 @@
{
  "name": "integration-tests",
  "version": "1.0.0",
  "description": "Integration tests for this project",
  "private": true,
  "scripts": {
    "test": "vitest run",
    "test:watch": "vitest",
    "test:setup": "bash scripts/setup.sh",
    "test:teardown": "bash scripts/teardown.sh"
  },
  "devDependencies": {
    "vitest": "latest"
  }
}

integration-tests/scripts/setup.sh

@@ -0,0 +1,26 @@
#!/bin/bash
#
# Setup script for integration tests
# Run this before tests to start required services
#
# Edit this file to match your project's needs.
#
echo "=========================================="
echo "Integration Tests Setup"
echo "=========================================="
echo ""
# TODO: Add commands to start services
# Example:
# echo "Starting backend server..."
# cd ../backend && pnpm start &
# sleep 5
#
# echo "Starting database..."
# docker compose up -d db
echo "Setup complete. Services should be running."
echo ""
echo "Run tests with: pnpm test"
echo ""

integration-tests/scripts/teardown.sh

@@ -0,0 +1,23 @@
#!/bin/bash
#
# Teardown script for integration tests
# Run this after tests to clean up services
#
# Edit this file to match your project's needs.
#
echo "=========================================="
echo "Integration Tests Teardown"
echo "=========================================="
echo ""
# TODO: Add commands to stop services
# Example:
# echo "Stopping backend server..."
# pkill -f "node.*backend"
#
# echo "Stopping database..."
# docker compose down
echo "Teardown complete."
echo ""