My Cursor + Claude Engineering Workflow: A Real Project Walkthrough

Every “AI productivity” article I’ve read is either vague advice (“use AI for boilerplate!”) or cherry-picked demos that don’t look like real work. This is neither.

This is a concrete walkthrough of a single feature — adding TOTP 2FA to CitizenApp — showing exactly where Cursor and Claude helped, where they didn’t, and what the actual time savings looked like.

The Feature: TOTP 2FA with Recovery Codes

The requirement: users can enable TOTP-based two-factor authentication (Google Authenticator-compatible). If they lose their device, they can use one of 10 one-time recovery codes.

This is a non-trivial feature: it touches the auth flow, the user model, the session model, email sending, and the frontend in at least 4 places.

Estimated time without AI: 2–3 days. Actual time: ~6 hours.

Here’s how.

Step 1: Architecture Decision (Claude)

Before opening my editor, I opened Claude with context:

I'm adding TOTP 2FA to a FastAPI app. Current auth:
- bcrypt password hashing
- JWT access tokens (15min)
- HTTP-only refresh token cookies (30 days)
- PostgreSQL + SQLAlchemy async

Requirements:
- Users can enable/disable 2FA
- TOTP compatible with Google Authenticator, Authy
- 10 recovery codes on setup, each single-use
- Login flow: password → if 2FA enabled → TOTP prompt

Questions:
1. Where to store TOTP secret? Encrypted at rest?
2. How to handle the "pending 2FA" state during login — 
   issue a limited token or use session?
3. Recovery code format — hash them or store plain?

Libraries: pyotp seems obvious. Anything better?

Claude’s response clarified three things I hadn’t fully thought through:

1. Yes, encrypt the TOTP secret at rest, using the same Fernet key as other PII. If your DB is breached, unencrypted TOTP secrets let an attacker bypass 2FA permanently.
2. Use a short-lived “pending 2FA” JWT rather than a server-side session. It is stateless: there is no pending-login record to store or clean up between the password step and the TOTP step. Include {"sub": user_id, "2fa_pending": true} in the payload with a 5-minute expiry.
3. Hash recovery codes, as you would passwords, but with a fast hash. Bcrypt is overkill (slow by design); SHA-256 is fine because recovery codes are long random strings that cannot be brute-forced the way human-chosen passwords can.
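To make the recovery-code advice concrete, here is a minimal sketch using only the standard library. The function names, the 80-bit code length, and the five-character grouping are my own assumptions for illustration, not what ended up in CitizenApp.

```python
import hashlib
import hmac
import secrets
from typing import List

CODE_COUNT = 10   # from the requirements: 10 recovery codes per user
CODE_BYTES = 10   # assumption: 80 bits of randomness per code


def generate_recovery_codes(count: int = CODE_COUNT) -> List[str]:
    """Return raw codes to show the user once; only their hashes get stored."""
    codes = []
    for _ in range(count):
        raw = secrets.token_hex(CODE_BYTES)  # 20 hex characters
        # Group into blocks of five so the code is easier to type.
        codes.append("-".join(raw[i:i + 5] for i in range(0, len(raw), 5)))
    return codes


def normalize(code: str) -> str:
    """Ignore case, dashes, and spaces so formatting slips do not matter."""
    return code.strip().lower().replace("-", "").replace(" ", "")


def hash_recovery_code(code: str) -> str:
    """SHA-256 hex digest of the normalized code (64 characters)."""
    return hashlib.sha256(normalize(code).encode("utf-8")).hexdigest()


def matches(code: str, stored_hash: str) -> bool:
    """Constant-time comparison of a submitted code against a stored hash."""
    return hmac.compare_digest(hash_recovery_code(code), stored_hash)
```

A fast hash is acceptable here only because each code is randomly generated with enough entropy; the same shortcut would not be safe for user-chosen passwords.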

Total time: 20 minutes. I had a clear, validated architecture before writing any code.

Step 2: Database Schema (Cursor)

In Cursor, I described the schema to Agent mode:

Add to models/user.py:
- totp_secret field (encrypted string, nullable)
- totp_enabled bool (default false)
- totp_verified_at datetime (nullable)

Create new file models/recovery_code.py:
- id, user_id FK, code_hash (sha256, unique), used bool, used_at datetime
- 10 codes per user, generated on 2FA setup

Cursor generated both files. I reviewed:

✅ Correct SQLAlchemy 2.0 mapped_column syntax
✅ Proper ForeignKey with cascade delete
✅ Index on code_hash for lookup performance
⚠️ Missing: used_at timezone handling — I added timezone=True to the datetime column

One manual fix. Two files I didn’t have to write from scratch.
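Why the timezone fix matters, as a small standard-library sketch: Python refuses to do arithmetic between timezone-aware and naive datetimes, so a naive used_at value breaks as soon as it meets an aware "now". The helper name is hypothetical, and the comments about what the column returns assume a PostgreSQL driver that hands back aware values for timezone=True columns.

```python
from datetime import datetime, timedelta, timezone


def seconds_since_used(used_at: datetime, now: datetime) -> float:
    """Elapsed seconds between a code's used_at timestamp and now."""
    return (now - used_at).total_seconds()


now = datetime.now(timezone.utc)                    # aware, as the app would produce it
aware_used_at = now - timedelta(minutes=5)          # roughly what a timezone=True column returns
naive_used_at = aware_used_at.replace(tzinfo=None)  # same wall-clock value, offset stripped

print(seconds_since_used(aware_used_at, now))       # 300.0

try:
    seconds_since_used(naive_used_at, now)
except TypeError as exc:
    # Mixing naive and aware datetimes is an error, not a silent wrong answer.
    print(f"naive vs aware: {exc}")
```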

Then Alembic migration:

alembic revision --autogenerate -m "add_totp_2fa"

I reviewed the generated migration, confirmed the column types were correct, ran it.

Step 3: The TOTP Service (Cursor + Manual)

The TOTP logic is self-contained enough for Cursor to handle most of it:

Create services/totp.py with:
- generate_totp_secret() → encrypted secret
- get_totp_uri(user_email, secret) → otpauth:// URI for QR code
- verify_totp(encrypted_secret, code, window=1) → bool
- generate_recovery_codes(user_id, db) → list of raw codes (store hashes)
- verify_recovery_code(user_id, code, db) → bool (mark used)
Use pyotp. Decrypt secret before passing to pyotp.
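For readers who have not seen one, the otpauth:// URI in that prompt has a simple shape. The sketch below builds it with the standard library only; in the real service pyotp does this work, and the function names and the "CitizenApp" issuer default are assumptions for illustration. Encryption of the secret is left out.

```python
import base64
import secrets
from urllib.parse import quote, urlencode


def generate_secret_b32(num_bytes: int = 20) -> str:
    """Random base32 secret; 20 bytes encodes to 32 characters with no padding."""
    return base64.b32encode(secrets.token_bytes(num_bytes)).decode("ascii")


def build_totp_uri(user_email: str, secret_b32: str, issuer: str = "CitizenApp") -> str:
    """Build the otpauth:// URI that authenticator apps read from a QR code."""
    # The label is "Issuer:account", with each part percent-encoded.
    label = f"{quote(issuer)}:{quote(user_email)}"
    query = urlencode({"secret": secret_b32, "issuer": issuer})
    return f"otpauth://totp/{label}?{query}"


secret = generate_secret_b32()
print(build_totp_uri("alice@example.com", secret))
```

The resulting string is what gets rendered as a QR code on the settings page, and the bare secret is what the user types if scanning fails.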

Generated code was ~90% correct. What I fixed manually:

# Cursor generated this:
totp = pyotp.TOTP(secret)
return totp.verify(code)

# I changed to:
totp = pyotp.TOTP(secret)
# valid_window=1 also accepts the previous and next 30s step, which covers clock drift on phones
return totp.verify(code, valid_window=1)

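Small but important: pyotp defaults to valid_window=0, which accepts only the code for the current 30-second step. valid_window=1 also accepts the step before and the step after (±30 seconds). Without it, users with slightly drifted clocks fail verification constantly.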
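To show what the window actually does, here is a hand-rolled sketch of TOTP verification following RFC 6238 (HMAC-SHA1, 30-second steps, 6 digits). It is not pyotp's code, and the `at` parameter is my own addition so the behaviour can be checked at a fixed timestamp.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional

STEP_SECONDS = 30
DIGITS = 6


def hotp(key: bytes, counter: int) -> str:
    """RFC 4226 HOTP value for one counter, zero-padded to DIGITS."""
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** DIGITS).zfill(DIGITS)


def verify_totp(secret_b32: str, code: str, valid_window: int = 0,
                at: Optional[float] = None) -> bool:
    """Accept the code for the current step, plus valid_window steps either side."""
    key = base64.b32decode(secret_b32, casefold=True)
    now = time.time() if at is None else at
    counter = int(now // STEP_SECONDS)
    for drift in range(-valid_window, valid_window + 1):
        if counter + drift < 0:
            continue  # no time step exists before the epoch
        candidate = hotp(key, counter + drift)
        if hmac.compare_digest(candidate.encode("ascii"), code.encode("utf-8")):
            return True
    return False
```

With valid_window=0 a code stops working the moment its 30-second step ends; with valid_window=1 it stays valid for one more step, and a code from a phone running slightly ahead is accepted one step early.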

Step 4: The Auth Flow Changes (Manual + Claude Review)

This was the part I wrote mostly by hand — the login flow change is architectural, not boilerplate:

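# routes/auth.py: modified login endpoint
from datetime import timedelta

from fastapi import Depends, HTTPException
from sqlalchemy.ext.asyncio import AsyncSession

# router, the schemas, get_async_db and the auth helpers used below
# (authenticate_user, create_access_token, issue_full_tokens, ...) already exist in the codebase.
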
@router.post("/login")
async def login(credentials: LoginSchema, db: AsyncSession = Depends(get_async_db)):
    user = await authenticate_user(db, credentials.username, credentials.password)

    if not user:
        raise HTTPException(status_code=401, detail="Invalid credentials")

    # If 2FA is enabled, issue a limited pending token instead of full auth
    if user.totp_enabled:
        pending_token = create_access_token(
            {"sub": str(user.id), "2fa_pending": True},
            expires_delta=timedelta(minutes=5)
        )
        return {"requires_2fa": True, "pending_token": pending_token}

    # Normal flow — no 2FA
    return await issue_full_tokens(user)


@router.post("/verify-2fa")
async def verify_2fa(
    payload: Verify2FASchema,
    db: AsyncSession = Depends(get_async_db),
):
    # Validate the pending token
    token_data = decode_pending_token(payload.pending_token)
    if not token_data or not token_data.get("2fa_pending"):
        raise HTTPException(status_code=401, detail="Invalid pending token")

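    user = await get_user(db, int(token_data["sub"]))
    # The account may have been deleted, or 2FA disabled, since the pending token was issued
    if not user or not user.totp_enabled:
        raise HTTPException(status_code=401, detail="Invalid pending token")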

    # Try TOTP first, then recovery code
    if not (
        verify_totp(user.totp_secret, payload.code)
        or await verify_recovery_code(user.id, payload.code, db)
    ):
        raise HTTPException(status_code=401, detail="Invalid 2FA code")

    return await issue_full_tokens(user)
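One thing the snippet leaves implicit: the pending token is minted by the same create_access_token helper as a normal access token, so whatever dependency guards the regular routes needs to reject tokens that carry 2fa_pending, if it does not already. A minimal sketch of both checks on already-decoded claims follows; the function names are hypothetical, and signature verification is assumed to have happened before these are called.

```python
from typing import Any, Mapping, Optional

PENDING_CLAIM = "2fa_pending"  # claim name taken from the login snippet


def _live_subject(claims: Mapping[str, Any], now: float) -> Optional[str]:
    """Return `sub` if the claims carry a string subject and an unexpired `exp`."""
    exp = claims.get("exp")
    sub = claims.get("sub")
    if not isinstance(exp, (int, float)) or exp <= now:
        return None
    return sub if isinstance(sub, str) and sub else None


def pending_subject(claims: Mapping[str, Any], now: float) -> Optional[str]:
    """User id for a live pending-2FA token, else None. For /verify-2fa only."""
    if claims.get(PENDING_CLAIM) is not True:
        return None
    return _live_subject(claims, now)


def full_access_subject(claims: Mapping[str, Any], now: float) -> Optional[str]:
    """User id for a normal access token; pending tokens are refused outright."""
    if PENDING_CLAIM in claims:
        return None
    return _live_subject(claims, now)
```

Keeping the two checks separate means a stolen pending token is only good for attempting the second factor, never for calling the API directly.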

After writing this, I pasted it to Claude with: “Review for security issues, particularly around the pending token and the TOTP/recovery code branch.”

Claude flagged one issue: the or short-circuit means if TOTP passes, the recovery code check is never run — which is correct behaviour, but I should log the authentication method used in the audit trail.

I added it. Three lines. Worth catching.
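The change amounts to making the branch report which factor succeeded instead of collapsing it into a boolean. A sketch of that idea, assuming the two checks are passed in as callables; the function name, logger name, and log format are hypothetical, not the actual CitizenApp code.

```python
import asyncio
import logging
from typing import Awaitable, Callable, Optional

logger = logging.getLogger("auth.audit")  # hypothetical logger name


async def second_factor_method(
    user_id: int,
    code: str,
    check_totp: Callable[[str], bool],
    check_recovery: Callable[[str], Awaitable[bool]],
) -> Optional[str]:
    """Return "totp" or "recovery_code" for the check that passed, else None."""
    method: Optional[str] = None
    if check_totp(code):
        method = "totp"  # recovery codes are not touched when TOTP passes
    elif await check_recovery(code):
        method = "recovery_code"

    if method is None:
        logger.warning("2fa_failed user_id=%s", user_id)
    else:
        logger.info("2fa_ok user_id=%s method=%s", user_id, method)
    return method
```

The endpoint then raises the 401 when the result is None and issues full tokens otherwise, so the short-circuit behaviour stays the same while the audit trail records how the user got in.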

Step 5: Tests (Cursor + Review)

I wrote a brief test spec in a comment, then let Cursor generate the test cases:

# Tests needed:
# - login without 2FA → full tokens
# - login with 2FA → pending token + requires_2fa flag
# - verify 2FA with valid TOTP → full tokens
# - verify 2FA with invalid TOTP → 401
# - verify 2FA with valid recovery code → full tokens, code marked used
# - verify 2FA with used recovery code → 401
# - verify 2FA with expired pending token → 401

Cursor generated 7 tests. I reviewed each one, fixed 2 issues:

Mock for pyotp.TOTP.verify wasn’t isolated — tests were dependent on each other
Recovery code test wasn’t checking used=True after verification

Fixed, ran pytest. All 7 passed. Added to the existing 107-test suite.
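On the isolation fix: one common cause of order-dependent tests is a mock installed by plain assignment and never restored. Scoping the patch with a context manager avoids that. The sketch below uses a stand-in class instead of pyotp.TOTP so it runs with the standard library alone; FakeTOTP and check_code are hypothetical names, and the real tests may have failed for a different reason.

```python
from unittest.mock import patch


class FakeTOTP:
    """Stand-in for pyotp.TOTP so the example needs no third-party install."""

    def __init__(self, secret: str) -> None:
        self.secret = secret

    def verify(self, code: str, valid_window: int = 0) -> bool:
        return False  # the unpatched stand-in rejects everything


def check_code(secret: str, code: str) -> bool:
    """Code under test: delegates to the TOTP object's verify method."""
    return FakeTOTP(secret).verify(code, valid_window=1)


def test_valid_code_is_accepted() -> None:
    # The patch lives only inside the with-block and is undone on exit.
    with patch.object(FakeTOTP, "verify", return_value=True) as mock_verify:
        assert check_code("SECRET", "123456") is True
        mock_verify.assert_called_once_with("123456", valid_window=1)


def test_patch_does_not_leak() -> None:
    # Passes in any order because the previous test restored verify().
    assert check_code("SECRET", "123456") is False
```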

Step 6: Frontend (Cursor, mostly autonomous)

The React side: Cursor handled it with minimal guidance.

Add to pages/Login.tsx:
- If login response has requires_2fa: true, show 2FA input form
- Submit pending_token + code to /auth/verify-2fa
- Handle recovery code toggle (same input, different UX label)
- Show loading state during verification

Add to pages/AccountSettings.tsx:
- Enable 2FA flow: show QR code + secret + "verify to confirm" step
- Disable 2FA flow: require password confirmation
- Show recovery codes after setup (copy-to-clipboard, one-time display)
- Regenerate recovery codes option

Frontend took ~90 minutes. Cursor did most of the repetitive React + TypeScript. I reviewed for accessibility (focusable inputs, keyboard navigation on the code entry) and UX edge cases (what happens if QR scan fails → show manual entry option).

The Honest Time Breakdown

Task	Without AI	With AI	Saved
Architecture decision	2h (research + thinking)	20min (Claude review)	~1.5h
DB schema + migration	30min	10min	20min
TOTP service	1h	25min	35min
Auth flow changes	1.5h	1.5h (manual)	0
Tests	1h	30min	30min
Frontend	3h	1.5h	1.5h
Total	~9h	~4.4h	~4.6h

That’s roughly 2× for this feature — not 3×. The architectural work (login flow changes, security review) wasn’t AI-accelerated much.

The 3× number shows up on features with more boilerplate relative to architecture: CRUD endpoints, schema migrations, test fixtures, UI forms.

What I’d Never Let AI Do Unreviewed

Any security-sensitive logic: the valid_window issue is a perfect example of plausible-but-wrong AI output. The generated code ran and would have passed a happy-path test.
DB migrations: autogenerate sees a renamed column as a drop plus an add, so a generated migration can silently lose data if you don’t read it.
Auth flow: I wrote the pending token flow by hand, then had Claude review it.
Error handling: AI usually handles the happy path and misses edge cases.

The rule I follow: AI drafts, I ship. Everything AI generates gets read line by line before it goes into main.

Cursor Config That Helps

A few project-level rules that materially improve Cursor’s output quality:

// .cursor/rules (project-level)
{
  "rules": [
    "Always use SQLAlchemy 2.0 mapped_column syntax, not Column()",
    "Use async/await throughout — no sync database calls",
    "Follow existing patterns in the codebase before inventing new ones",
    "Always include error handling for database operations",
    "Use Pydantic v2 model_validator not root_validator"
  ]
}

These prevent the most common errors in AI-generated FastAPI code.
