Part 5: Authentication Hell - The 401 Marathon
The Story So Far: The pivot to open source required building a Settings page where users paste their own API keys. Simple, right?
Narrator voice: It was not simple.
The Innocent Beginning
December 20, 2025. I tasked Claude Code with building Phase 0:
Build a Settings page for user API key management:
- User enters:
  - Fly.io API Token
  - OpenRouter API Key
  - Encryption Key (optional, auto-generated if empty)
- Store in PostgreSQL
- Encrypt sensitive values (Fernet)
- Backend API: CRUD for settings
- Frontend: Vercel-style .env paste UI
Claude Code generated:
commit 068dc28
Date: 2025-12-20
feat: Implement user settings for API key management with encryption,
centralized application configuration, and new authentication modules.
What it created:
- backend/app/api/settings.py - CRUD endpoints
- backend/app/models/user_settings.py - Settings model
- frontend/app/settings/page.tsx - Settings UI
- backend/app/services/user_api_keys.py - Encryption service
I deployed to Fly.io. Frontend loaded beautifully. I clicked "Settings."
HTTP 401 Unauthorized
The First Debug: No JWT Token
I checked the browser console:
Failed to fetch https://catwalk-backend.fly.dev/api/settings
Response: 401 Unauthorized
{
"detail": "Not authenticated"
}
The Settings endpoint requires authentication:
# backend/app/api/settings.py
from fastapi import Depends
from app.core.auth import get_current_user

@router.get("/api/settings")
async def get_settings(user: User = Depends(get_current_user)):
    # This dependency requires a valid JWT token
    pass
But the frontend wasn't sending any token.
I checked the API client:
// frontend/lib/api.ts
export async function getSettings() {
  const response = await fetch(`${API_URL}/api/settings`)
  // No Authorization header!
  return response.json()
}
AI mistake #1: Generated authenticated endpoints but didn't generate the token creation logic.
I prompted: "Add JWT token generation to the frontend auth flow"
Claude Code added:
// frontend/lib/auth.ts
import { getServerSession } from 'next-auth'
import { SignJWT } from 'jose'

export async function createBackendAccessToken() {
  const session = await getServerSession()
  if (!session?.user) return null
  // Sign JWT token with AUTH_SECRET
  const token = await new SignJWT({ sub: session.user.email })
    .setProtectedHeader({ alg: 'HS256' })
    .setExpirationTime('24h')
    .sign(new TextEncoder().encode(process.env.AUTH_SECRET))
  return token
}
Updated API client:
export async function getSettings() {
  const token = await createBackendAccessToken()
  const response = await fetch(`${API_URL}/api/settings`, {
    headers: {
      'Authorization': `Bearer ${token}`
    }
  })
  return response.json()
}
Redeployed.
Still: 401 Unauthorized
The Secret Mismatch Mystery
I added logging to the backend:
# backend/app/core/auth.py
async def verify_jwt_token(token: str):
    try:
        payload = jwt.decode(
            token,
            settings.AUTH_SECRET,  # ← What secret is this?
            algorithms=["HS256"]
        )
        return payload
    except jwt.InvalidSignatureError:
        logger.error("JWT signature verification failed")
        raise HTTPException(status_code=401)
Backend logs showed:
JWT signature verification failed
The signature didn't match. That meant:
- Frontend signed the token with one secret
- Backend verified with a different secret
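To see why two different secrets can only ever produce this error, here is a minimal sketch of HS256 signing and verification using just the Python standard library. It is not the code from this project: sign_hs256 and verify_hs256 are hypothetical helpers, and the two secret strings are placeholders.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, the encoding JWTs use."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_hs256(payload: dict, secret: str) -> str:
    """Build a compact HS256 token: header.payload.signature."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    digest = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return f"{signing_input}.{b64url(digest)}"


def verify_hs256(token: str, secret: str) -> bool:
    """Recompute the signature with `secret` and compare in constant time."""
    signing_input, _, signature = token.rpartition(".")
    expected = hmac.new(
        secret.encode("utf-8"), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return hmac.compare_digest(b64url(expected), signature)


# Placeholder secrets standing in for the frontend and backend values
token = sign_hs256({"sub": "test@example.com"}, "frontend-secret")
print(verify_hs256(token, "frontend-secret"))  # True: same secret on both sides
print(verify_hs256(token, "backend-secret"))   # False: the secrets differ
```

The signature is an HMAC over the header and payload, so the verifier has to recompute it with exactly the same bytes the signer used. A secret that is undefined on one side behaves like any other mismatched value.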
I checked my .env files:
Frontend (.env.local):
NEXTAUTH_SECRET=abc123...
Backend (Fly.io secrets):
fly secrets list --app catwalk-backend
AUTH_SECRET=xyz789...
They were different.
But wait - where was the frontend getting AUTH_SECRET?
I checked frontend/lib/auth.ts again:
.sign(new TextEncoder().encode(process.env.AUTH_SECRET))
process.env.AUTH_SECRET was undefined because I had set NEXTAUTH_SECRET, not AUTH_SECRET.
NextAuth.js uses NEXTAUTH_SECRET by default. But my JWT signing code was looking for AUTH_SECRET.
AI mistake #2: The generated code referenced AUTH_SECRET but never checked that the variable existed, and the name did not match what the NextAuth documentation uses.
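A fail-fast check at startup would have turned this silent undefined into an immediate, readable error. The sketch below shows the idea in Python for the backend side. require_env and MissingSecretError are hypothetical names, not part of this project, and the check treats an empty string the same as an unset variable.

```python
import os
from typing import Iterable, Mapping


class MissingSecretError(RuntimeError):
    """Raised when a required environment variable is unset or empty."""


def require_env(names: Iterable[str], env: Mapping[str, str] = os.environ) -> dict:
    """Return the requested variables, or raise with every missing name listed."""
    values = {}
    missing = []
    for name in names:
        value = env.get(name)
        # Unset and blank values are both treated as missing
        if value is None or value.strip() == "":
            missing.append(name)
        else:
            values[name] = value
    if missing:
        raise MissingSecretError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return values


# Reproduces the bug from this section: only NEXTAUTH_SECRET was set
try:
    require_env(["AUTH_SECRET"], {"NEXTAUTH_SECRET": "abc123"})
except MissingSecretError as error:
    print(error)
```

Calling something like this once at process start means a misnamed variable stops the deploy instead of surfacing later as a 401.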
The Fix Attempt #1: Align Secrets
I updated .env.local:
# Frontend .env.local
AUTH_SECRET=xyz789... # Match backend
NEXTAUTH_SECRET=xyz789... # Keep for NextAuth
Updated Fly.io backend:
fly secrets set AUTH_SECRET=xyz789... --app catwalk-backend
Redeployed both.
Still: 401 Unauthorized
What!?
The Debugging Rabbit Hole
I added more logging:
Frontend:
const token = await createBackendAccessToken()
console.log('Token generated:', token)
console.log('Secret used:', process.env.AUTH_SECRET?.slice(0, 10))
Backend:
@router.get("/api/settings")
async def get_settings(authorization: str = Header(None)):
    logger.info(f"Received token: {authorization[:50]}...")
    logger.info(f"Secret: {settings.AUTH_SECRET[:10]}...")
Frontend output:
Token generated: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
Secret used: xyz789abcd
Backend logs:
Received token: Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
Secret: xyz789abcd
Secrets matched. Token was being sent. Backend received it.
But still: JWT signature verification failed
The JWT Debugging Tool
I created a debug script to decode the token manually:
// debug-jwt.mjs
import jwt from 'jsonwebtoken'

const token = process.argv[2]
const secret = process.argv[3]
try {
  const decoded = jwt.verify(token, secret)
  console.log('✅ Token valid:', decoded)
} catch (error) {
  console.log('❌ Token invalid:', error.message)
}
Tested locally:
node debug-jwt.mjs "eyJhbGci..." "xyz789..."
❌ Token invalid: invalid signature
The token was genuinely invalid. But why?
I tried signing and verifying in the same script:
const secret = 'xyz789abcd'
const token = jwt.sign({ sub: 'test@example.com' }, secret)
console.log('Token:', token)
const decoded = jwt.verify(token, secret)
console.log('Decoded:', decoded)
Token: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9...
Decoded: { sub: 'test@example.com', iat: 1703098765 }
It worked. Same secret for signing and verification = valid token.
Then why was the frontend → backend flow failing?
The Breakthrough: User Not Synced
I added a database query to the backend:
@router.get("/api/settings")
async def get_settings(user: User = Depends(get_current_user)):
    logger.info(f"Authenticated user: {user.email}")
    # Fetch settings...
But get_current_user was failing before reaching this line.
I checked the dependency:
# backend/app/core/auth.py
async def get_current_user(token: str = Depends(oauth2_scheme)):
    # verify_jwt_token is async, so it must be awaited
    payload = await verify_jwt_token(token)  # This worked now
    email = payload.get("sub")
    # Look up user in database
    user = await db.query(User).filter(User.email == email).first()
    if not user:
        raise HTTPException(status_code=401, detail="User not found")
    return user
The user didn't exist in the database.
The JWT token was valid, but the user it referenced (test@example.com) had never been synced to the PostgreSQL database.
AI mistake #3: Generated authentication flow but didn't implement the user sync between NextAuth (frontend session) and PostgreSQL (backend users).
The Missing Piece: User Sync Endpoint
I needed an endpoint that NextAuth calls after sign-in to create/update users in PostgreSQL.
I prompted Claude Code:
Add a user sync endpoint:
- POST /api/auth/sync-user
- Called by NextAuth signIn callback
- Creates or updates user in PostgreSQL
- Secured with AUTH_SYNC_SECRET (different from AUTH_SECRET)
Claude Code generated:
# backend/app/api/auth.py
@router.post("/api/auth/sync-user")
async def sync_user(
    user_data: dict,
    x_auth_secret: str = Header(None)
):
    # Verify request is from our frontend
    if x_auth_secret != settings.AUTH_SYNC_SECRET:
        raise HTTPException(status_code=403, detail="Invalid sync secret")
    # Create or update user
    user = await db.query(User).filter(User.email == user_data["email"]).first()
    if not user:
        user = User(email=user_data["email"], name=user_data.get("name"))
        db.add(user)
    else:
        user.name = user_data.get("name")
        user.updated_at = datetime.utcnow()
    await db.commit()
    return {"status": "synced"}
Frontend (frontend/auth.ts):
import { AuthOptions } from 'next-auth'
import GoogleProvider from 'next-auth/providers/google'

export const authOptions: AuthOptions = {
  providers: [
    GoogleProvider({
      clientId: process.env.GOOGLE_CLIENT_ID,
      clientSecret: process.env.GOOGLE_CLIENT_SECRET
    })
  ],
  callbacks: {
    async signIn({ user }) {
      // Sync user to backend
      await fetch(`${process.env.NEXT_PUBLIC_BACKEND_URL}/api/auth/sync-user`, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'X-Auth-Secret': process.env.AUTH_SYNC_SECRET
        },
        body: JSON.stringify({
          email: user.email,
          name: user.name
        })
      })
      return true
    }
  }
}
Two secrets now:
- AUTH_SECRET: Signs/verifies JWT tokens (must match frontend ↔ backend)
- AUTH_SYNC_SECRET: Secures the sync endpoint (prevents unauthorized user creation)
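One detail worth hardening in a shared-secret check like the sync endpoint's: a plain != comparison lets a request through when the header is absent and the configured secret is also unset, because None != None is False. The sketch below shows a stricter comparison. sync_secret_matches is a hypothetical helper, not code from this project.

```python
import hmac
from typing import Optional


def sync_secret_matches(provided: Optional[str], expected: Optional[str]) -> bool:
    """Return True only when both values are present and identical.

    A missing header or an unconfigured secret always fails, and the
    comparison itself runs in constant time.
    """
    if not provided or not expected:
        return False
    return hmac.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))


print(sync_secret_matches("abc123", "abc123"))  # True
print(sync_secret_matches("wrong", "abc123"))   # False
print(sync_secret_matches(None, None))          # False: unset on both sides is rejected
```

In the endpoint, the 403 branch would then be taken whenever this helper returns False.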
I set both in .env.local and Fly.io:
# Frontend
AUTH_SECRET=xyz789...
AUTH_SYNC_SECRET=abc123...
# Backend
fly secrets set AUTH_SECRET=xyz789... --app catwalk-backend
fly secrets set AUTH_SYNC_SECRET=abc123... --app catwalk-backend
Redeployed.
Signed in with Google.
Finally: Settings page loaded ✅
The Hidden Gotcha: Environment Variable Timing
Everything worked... on the first deploy after setting secrets.
But when I redeployed the backend (for unrelated changes), 401 errors returned.
Logs showed:
AUTH_SYNC_SECRET: None
The secret disappeared.
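The problem: Fly.io secrets exist only on the app you set them on, and only once you have set them with fly secrets set. They are not stored in fly.toml or in any other file in the repository, so nothing in the codebase can restore them.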
If you:
- Set secret: fly secrets set AUTH_SECRET=...
- Deploy code
- Destroy app
- Recreate app
- Deploy code
The secret is gone. You must re-set it.
The fix: I created a deployment script:
#!/bin/bash
# deploy-backend.sh
APP=catwalk-backend
# Check that every required secret exists before deploying
SECRETS=$(fly secrets list --app "$APP")
for name in AUTH_SECRET AUTH_SYNC_SECRET; do
  if ! echo "$SECRETS" | grep -qw "$name"; then
    echo "❌ $name not set! Run:"
    echo "fly secrets set $name=<value> --app $APP"
    exit 1
  fi
done
# Deploy
fly deploy --app "$APP"
Lesson: AI-generated infrastructure code assumes secrets exist. You must validate environment setup before deployment.
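The same pre-deploy check can be written as a small, testable function. This is a sketch under assumptions: missing_secrets is a hypothetical helper, and it assumes only that each secret name appears as its own whitespace-separated token somewhere in the text printed by fly secrets list. The sample listing is made up for illustration and is not real CLI output.

```python
from typing import Iterable, List


def missing_secrets(listing: str, required: Iterable[str]) -> List[str]:
    """Return the required secret names that do not appear in the listing text."""
    # Assumption: every secret name shows up as a standalone token
    present = set(listing.split())
    return [name for name in required if name not in present]


# Made-up listing for illustration only
sample_listing = """\
NAME          DIGEST
AUTH_SECRET   0123456789abcdef
"""

print(missing_secrets(sample_listing, ["AUTH_SECRET", "AUTH_SYNC_SECRET"]))
# ['AUTH_SYNC_SECRET']
```

Matching whole tokens rather than substrings keeps the check precise when one secret name is contained in another.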
Creating AUTH_TROUBLESHOOTING.md
After 2 days of debugging, I had learned:
- JWT secret mismatches cause invalid signatures
- User sync is critical for authentication to work
- Multiple secrets serve different purposes
- Environment variable timing matters
I documented everything:
commit a8dfde6
Date: 2025-12-21
feat(auth): Fix 401 errors and add comprehensive authentication
troubleshooting
Created: context/AUTH_TROUBLESHOOTING.md
# Authentication Troubleshooting Guide
## Quick Diagnosis
**Symptom**: 401 Unauthorized on /api/settings
**Check**:
1. Is AUTH_SECRET set on both frontend and backend?
2. Do the values match EXACTLY?
3. Has the user been synced to the database?
4. Is AUTH_SYNC_SECRET set?
## Common Issues
### Issue 1: JWT Signature Invalid
**Cause**: AUTH_SECRET mismatch between frontend and backend
**Fix**:
```bash
# Generate matching secret
SECRET=$(openssl rand -base64 32)
# Frontend
echo "AUTH_SECRET=\"$SECRET\"" >> .env.local
# Backend
fly secrets set AUTH_SECRET="$SECRET" --app catwalk-backend
```
### Issue 2: User Not Found
**Cause**: User never synced to PostgreSQL
**Check**:
SELECT * FROM users WHERE email = 'your@email.com';
If empty, sync is broken.
**Fix**: Verify AUTH_SYNC_SECRET is set and signIn callback is firing.
### Issue 3: Secrets Disappeared After Redeploy
**Cause**: Fly.io secrets don't persist in config files
**Fix**: Always check secrets before deploying:
fly secrets list --app catwalk-backend
This became the most referenced document when onboarding contributors.
PR #10: The Big Authentication Fix
All the fixes were bundled into Pull Request #10:
commit 945b055
Date: 2025-12-21
Fix code review issues: auth safety, migration defaults, modal URLs
What it included:
- ✅ Consolidated auth modules (removed duplicate middleware)
- ✅ Fixed JWT secret handling (AUTH_SECRET everywhere)
- ✅ User sync endpoint (with AUTH_SYNC_SECRET)
- ✅ Database timestamp fixes (func.now() instead of Python datetime)
- ✅ Alembic migration updates (proper foreign key constraints)
- ✅ AUTH_TROUBLESHOOTING.md guide
Test results after PR #10:
pytest
51 tests passed ✅
0 warnings
Frontend type check:
bun run typecheck
0 errors ✅
Authentication flow:
- User signs in with Google ✅
- NextAuth signIn callback fires ✅
- Frontend calls /api/auth/sync-user with AUTH_SYNC_SECRET ✅
- Backend creates/updates user in PostgreSQL ✅
- Frontend creates JWT token signed with AUTH_SECRET ✅
- Backend verifies JWT with AUTH_SECRET ✅
- Settings page loads ✅
Total time debugging auth: ~16 hours over 2 days
AI contribution: Generated base authentication code, but missed:
- User sync logic
- Secret validation
- Environment variable checks
- Timing issues (secrets not persisting)
Human contribution: All of the debugging. This was pure manual work.
What AI Can't Debug
1. Environment Configuration
AI can generate:
secret = os.getenv("AUTH_SECRET")
AI cannot:
- Verify the secret is set in your .env.local
- Check if Fly.io secrets are configured
- Validate secrets match across environments
- Detect when secrets are undefined vs empty string
Why? AI only sees code, not runtime environment.
2. Cross-System Integration
AI can generate:
- NextAuth configuration ✅
- FastAPI authentication middleware ✅
- JWT signing and verification ✅
AI cannot:
- Validate the flow works end-to-end
- Detect missing sync between NextAuth session and PostgreSQL
- Understand timing (when callbacks fire, when secrets are available)
Why? Integration bugs span multiple systems AI can't execute.
3. Secret Lifecycle
AI can generate:
fly secrets set AUTH_SECRET=...
AI cannot:
- Remember you need to run this command
- Detect when secrets are missing
- Warn that secrets don't persist in config files
- Validate secrets are set before deployment
Why? Infrastructure state is invisible to AI.
What You Must Bring
As an AI orchestrator, debugging environment issues requires:
- Infrastructure knowledge:
  - How Fly.io secrets work
  - Environment variable precedence (.env.local vs runtime)
  - When secrets are loaded (build time vs runtime)
- Systematic debugging:
  - Add logging at every layer
  - Test each component in isolation
  - Validate assumptions (secrets exist, values match, user synced)
- Documentation discipline:
  - Write troubleshooting guides as you debug
  - Make implicit knowledge explicit
  - Help future you (and contributors) avoid repeating this
- Cross-system thinking:
  - Trace the full request flow (frontend → backend → database)
  - Identify where data transforms (session → JWT → user lookup)
  - Verify every handoff point
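One cheap way to inspect the session → JWT handoff is to read a token's claims without verifying it, so you can see which sub the backend will look up. The sketch below uses only the standard library. peek_claims is a hypothetical debugging helper and the sample token is built inline for illustration. It must never be used for authentication, because it skips the signature check entirely.

```python
import base64
import json


def peek_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (debugging only)."""
    if token.startswith("Bearer "):
        token = token[len("Bearer "):]
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload_b64 = parts[1]
    # Restore the base64 padding that JWTs strip
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64url(obj: dict) -> str:
    """Encode a dict the way a JWT segment is encoded."""
    raw = json.dumps(obj, separators=(",", ":")).encode("utf-8")
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


# Illustrative token; the signature segment is a placeholder and is never checked
sample = "Bearer " + ".".join(
    [
        _b64url({"alg": "HS256", "typ": "JWT"}),
        _b64url({"sub": "test@example.com"}),
        "signature-not-checked",
    ]
)
print(peek_claims(sample)["sub"])  # test@example.com
```

Comparing the decoded sub against the users table separates "the token is wrong" from "the user was never synced".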
AI helps you write the code. You debug the world it runs in.
Key Metrics: The Authentication Gauntlet
Time Debugging: 16 hours (Dec 20-21)
- Secret mismatch troubleshooting: 4 hours
- User sync investigation: 6 hours
- Environment variable issues: 3 hours
- Testing and validation: 2 hours
- Writing AUTH_TROUBLESHOOTING.md: 1 hour
Commits:
- 068dc28 - Initial Settings implementation (AI-generated, broken)
- 89478a0 - Fix auth sync secret
- 46d8576 - Harden auth binding
- 945b055 - Final auth safety fixes (PR #10)
AI Contribution: ~30% (generated base code)
Human Contribution: ~70% (debugging, environment setup, testing)
Lines of Code:
- Added: ~800 (auth modules, Settings API, frontend UI)
- Changed: ~300 (secret handling, user sync)
- Deleted: ~200 (duplicate auth middleware, broken code)
Documentation:
- AUTH_TROUBLESHOOTING.md: 600 words
- Updated AGENTS.md with auth protocols: 200 words
- Updated CURRENT_STATUS.md: 150 words
The Lesson
AI is phenomenal at generating code patterns it's seen before.
- JWT authentication? ✅ Training data has millions of examples.
- FastAPI middleware? ✅ Common pattern.
- NextAuth configuration? ✅ Well-documented.
AI is terrible at:
- Debugging environment-specific failures
- Validating cross-system integration
- Detecting missing configuration
- Understanding secret lifecycle
Your job as orchestrator:
- Generate code with AI (fast)
- Test in real environments (slow)
- Debug integration issues (manual)
- Document solutions (essential)
The ratio: For every 1 hour AI saves on coding, expect to spend 0.5-1 hour debugging environment and integration issues.
Still a massive productivity gain - but not "push button, receive working system."
Coming Next
In Part 6, security enters the spotlight:
- Automated PR review agents (CodeRabbit, Qodo, Gemini Code Assist)
- Command injection risks AI missed
- Package validation (npm/PyPI registry checks)
- Comprehensive test suite expansion
- Access token rotation for security
- PR #13: Security hardening
Spoiler: AI wrote code with a critical security flaw that would've allowed arbitrary code execution. Only automated security review agents caught it.
Commit References:
- 068dc28 - Initial Settings implementation
- a8dfde6 - Auth troubleshooting guide
- 945b055 - Auth safety fixes (PR #10)
- efbac5c - JWT auth implementation
Documents Created:
- context/AUTH_TROUBLESHOOTING.md
- Updated: AGENTS.md, CURRENT_STATUS.md
Debugging Tools:
- debug-jwt.mjs - Manual JWT verification
- test-jwt-match.sh - Secret validation script
This is Part 5 of 7. Authentication works, but AI missed something critical...
Previous: ← Part 4: The Pivot Next: Part 6: Security Awakening →
Jordan Hindo
Full-stack Developer & AI Engineer building in public. Exploring the future of agentic coding and AI-generated assets.