Comprehensive Troubleshooting Guide
Overview
This guide consolidates common issues, debugging procedures, and solutions for the Sojorn platform. It covers authentication, push notifications, E2EE chat, backend services, media uploads, performance, and deployment.
Authentication Issues
JWT Algorithm Mismatch (ES256 vs HS256)
Problem: 401 Unauthorized errors due to JWT algorithm mismatch between client and server.
Symptoms:
- Edge Functions rejecting JWT with 401 errors
- Authentication working in development but not production
- Cached sessions appearing to fail
Root Cause: Supabase project issuing ES256 JWTs while backend expects HS256.
Diagnosis:
- Decode JWT at https://jwt.io
- Check header algorithm:
{
  "alg": "ES256",  // Problem: backend expects HS256
  "kid": "b66bc58d-34b8-4..."
}
Solutions:
Option A: Update Backend to Accept ES256
// In your JWT validation middleware
token, err := jwt.Parse(tokenString, func(token *jwt.Token) (interface{}, error) {
    if _, ok := token.Method.(*jwt.SigningMethodECDSA); !ok {
        return nil, fmt.Errorf("unexpected signing method: %v", token.Header["alg"])
    }
    return publicKey, nil
})
Option B: Configure Supabase to Use HS256
- Go to Supabase Dashboard → Settings → API
- Change JWT signing algorithm to HS256
- Regenerate API keys if needed
Verification:
# Test JWT validation
curl -H "Authorization: Bearer <token>" https://api.sojorn.net/health
FCM/Push Notification Issues
Web Notifications Not Working
Symptoms:
- "Web push is missing FIREBASE_WEB_VAPID_KEY" error
- No notification permission prompt
- Token registration fails
Diagnostics:
// Check the browser console; a successful registration logs:
FCM token registered (web): d2n2ELGKel7yzPL3wZLGSe...
Solutions:
1. Check VAPID Key Configuration
File: sojorn_app/lib/config/firebase_web_config.dart
static const String _vapidKey = 'BNxS7_your_actual_vapid_key_here';
2. Verify Service Worker
Check DevTools > Application > Service Workers for firebase-messaging-sw.js
3. Test Permission Status
// In browser console
Notification.permission === 'granted'
Android Notifications Not Working
Symptoms:
- Web notifications work, Android doesn't
- No FCM token generated on Android
- "Token is null after getToken()" error
Diagnostics:
# Windows
adb logcat | findstr "FCM"
# macOS/Linux
adb logcat | grep "FCM"
Expected Logs:
[FCM] Initializing for platform: android
[FCM] Token registered (android): eXaMpLe...
[FCM] Token synced with Go Backend successfully
Solutions:
1. Verify google-services.json
ls sojorn_app/android/app/google-services.json
Check package name matches: "package_name": "com.gosojorn.app"
2. Check Build Configuration
File: sojorn_app/android/app/build.gradle.kts
applicationId = "com.gosojorn.app"
plugins {
    id("com.google.gms.google-services")
}
3. Verify Permissions
File: AndroidManifest.xml
<uses-permission android:name="android.permission.POST_NOTIFICATIONS" />
4. Reinstall App
adb uninstall com.gosojorn.app
flutter run
Backend Push Service Issues
Symptoms:
- "Failed to initialize PushService" error
- Notifications not being sent
Diagnostics:
# Check service account file
ls -la /opt/sojorn/firebase-service-account.json
# Check .env configuration
sudo cat /opt/sojorn/.env | grep FIREBASE
# Validate JSON
cat /opt/sojorn/firebase-service-account.json | jq .
# Check logs
sudo journalctl -u sojorn-api -f | grep -i push
Solutions:
- Ensure service account JSON exists and is valid
- Verify file permissions (600)
- Check Firebase project configuration
E2EE Chat Issues
Key Generation Problems
Symptoms:
- 208-bit keys instead of 256-bit
- Zero signatures
- Key upload failures
Diagnostics:
# Check database for keys
sudo -u postgres psql sojorn -c "SELECT user_id, LEFT(identity_key, 20) FROM profiles WHERE identity_key IS NOT NULL;"
Common Issues & Solutions:
1. 208-bit Key Bug
Problem: String-based KDF instead of byte-based
Solution: Update _kdf method to use SHA-256 on byte arrays
2. Fake Zero Signatures
Problem: Manual upload using fake signatures
Solution: Generate real Ed25519 signatures in key upload
3. Database Constraint Errors
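Problem: SQLSTATE 42P10 (invalid_column_reference): the ON CONFLICT target does not match any unique constraint or index on the table
Solution: Use a conflict target that matches the existing unique constraint: ON CONFLICT (user_id, key_id)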
Message Encryption/Decryption Failures
Symptoms:
- Messages not decrypting
- MAC verification failures
- "Cannot decrypt own messages" issue
Diagnostics:
# Check message headers
sudo -u postgres psql sojorn -c "SELECT LEFT(message_header, 50) FROM encrypted_messages LIMIT 5;"
Expected Header Format:
{
  "epk": "<base64 sender ephemeral public key>",
  "n": "<base64 nonce>",
  "m": "<base64 MAC>",
  "v": 1
}
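A header pulled from the database can be checked against this format programmatically. The following Go sketch parses the four fields and confirms that each binary field decodes. It assumes standard (padded) base64 and that version 1 is the only valid version; neither is confirmed by this guide, and the type and function names are hypothetical.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
)

// messageHeader mirrors the expected header format shown above.
type messageHeader struct {
	EPK     string `json:"epk"`
	Nonce   string `json:"n"`
	MAC     string `json:"m"`
	Version int    `json:"v"`
}

// parseMessageHeader validates the structure of a stored header. It does
// not decrypt anything or check the MAC itself.
func parseMessageHeader(raw []byte) (messageHeader, error) {
	var h messageHeader
	if err := json.Unmarshal(raw, &h); err != nil {
		return h, fmt.Errorf("header is not valid JSON: %w", err)
	}
	if h.Version != 1 { // assumption: only v1 headers exist
		return h, fmt.Errorf("unsupported header version %d", h.Version)
	}
	fields := []struct{ name, value string }{
		{"epk", h.EPK}, {"n", h.Nonce}, {"m", h.MAC},
	}
	for _, f := range fields {
		if f.value == "" {
			return h, errors.New("missing field " + f.name)
		}
		// Assumption: standard base64; swap in URLEncoding if the client uses it.
		if _, err := base64.StdEncoding.DecodeString(f.value); err != nil {
			return h, fmt.Errorf("field %s is not valid base64: %w", f.name, err)
		}
	}
	return h, nil
}

func main() {
	b64 := base64.StdEncoding.EncodeToString
	sample := fmt.Sprintf(`{"epk":%q,"n":%q,"m":%q,"v":1}`,
		b64([]byte("ephemeral-key")), b64([]byte("nonce")), b64([]byte("mac")))

	if _, err := parseMessageHeader([]byte(sample)); err != nil {
		fmt.Println("invalid header:", err)
		return
	}
	fmt.Println("header is well-formed")
}
```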
Solutions:
1. Verify Key Bundle Format
Identity Key Format: Ed25519:X25519 (base64 concatenated with colon)
2. Check Signature Verification
Ensure both users enforce signature verification (no legacy asymmetry)
3. Validate OTK Management
Check one-time prekeys are being generated and deleted properly
Backend Service Issues
CORS Problems
Symptoms:
- "Failed to fetch" errors
- CORS policy errors in browser console
- Pre-flight request failures
Diagnostics:
# Check Nginx configuration
sudo nginx -t
# Check Go CORS logs
sudo journalctl -u sojorn-api -f | grep -i cors
Solutions:
1. Dynamic Origin Matching
allowedOrigins := strings.Split(cfg.CORSOrigins, ",")
allowAllOrigins := false
allowedOriginSet := make(map[string]struct{})
for _, origin := range allowedOrigins {
    trimmed := strings.TrimSpace(origin)
    if trimmed == "*" {
        allowAllOrigins = true
        break
    }
    allowedOriginSet[trimmed] = struct{}{}
}
2. Nginx CORS Headers
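# Caution: echoing $http_origin together with credentials allows every origin;
# restrict it (for example with a map block) before using this in production
add_header 'Access-Control-Allow-Origin' '$http_origin';
add_header 'Access-Control-Allow-Credentials' 'true';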
Database Connection Issues
Symptoms:
- Database connection timeouts
- "Unable to connect to database" errors
- Connection pool exhaustion
Diagnostics:
# Check PostgreSQL status
sudo systemctl status postgresql
# Check connection count
sudo -u postgres psql -c "SELECT count(*) FROM pg_stat_activity;"
# Check Go backend logs
sudo journalctl -u sojorn-api -f | grep -i database
Solutions:
1. Verify Connection String
# Check .env file
sudo cat /opt/sojorn/.env | grep DATABASE_URL
2. Adjust Connection Pool
// In database connection setup
config, err := pgxpool.ParseConfig(databaseURL)
if err != nil {
    log.Fatalf("parse DATABASE_URL: %v", err)
}
config.MaxConns = 20
config.MinConns = 5
3. Check Database Resources
# Check available connections
sudo -u postgres psql -c "SELECT max_connections FROM pg_settings;"
Service Startup Issues
Symptoms:
- Service fails to start
- Port already in use errors
- Configuration file errors
Diagnostics:
# Check service status
sudo systemctl status sojorn-api
# Check port usage
sudo netstat -tlnp | grep :8080
# Check logs
sudo journalctl -u sojorn-api -n 50
Solutions:
1. Fix Port Conflicts
# Kill process using port 8080
sudo fuser -k 8080/tcp
# Or change port in .env
PORT=8081
2. Verify Configuration
# Test configuration
cd /opt/sojorn/go-backend
go run ./cmd/api/main.go
Media Upload Issues
File Upload Failures
Symptoms:
- Upload timeouts
- File size limit errors
- Permission denied errors
Diagnostics:
# Check upload directory
ls -la /opt/sojorn/uploads/
# Check Nginx limits
grep client_max_body_size /etc/nginx/nginx.conf
# Check disk space
df -h /opt/sojorn/
Solutions:
1. Fix Directory Permissions
sudo chown -R patrick:patrick /opt/sojorn/uploads/
sudo chmod -R 755 /opt/sojorn/uploads/
2. Increase Upload Limits
# In Nginx config
client_max_body_size 50M;
3. Configure Go Limits
// In main.go
// Caps the memory used while parsing a multipart form; larger parts spill to temp files.
r.MaxMultipartMemory = 32 << 20 // 32 MiB
R2/Cloud Storage Issues
Symptoms:
- R2 upload failures
- Authentication errors
- CORS issues with direct uploads
Diagnostics:
# Check R2 configuration
sudo cat /opt/sojorn/.env | grep R2
# Test R2 connection
curl -I https://<your-r2-domain>.r2.cloudflarestorage.com
Solutions:
1. Verify R2 Credentials
- Check R2 token permissions
- Verify bucket exists
- Test API access
2. Fix CORS for Direct Uploads
Configure CORS in R2 bucket settings for direct browser uploads.
Performance Issues
Slow API Response Times
Symptoms:
- Requests taking > 2 seconds
- Database query timeouts
- High CPU usage
Diagnostics:
# Check system resources
top
htop
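# Check database queries (requires the pg_stat_statements extension;
# on PostgreSQL 13+ the column is mean_exec_time instead of mean_time)
sudo -u postgres psql -c "SELECT query, mean_time, calls FROM pg_stat_statements ORDER BY mean_time DESC LIMIT 10;"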
# Check Go goroutines
curl "http://localhost:8080/debug/pprof/goroutine?debug=1"
Solutions:
1. Database Optimization
-- Add indexes
CREATE INDEX CONCURRENTLY idx_posts_created_at ON posts(created_at DESC);
CREATE INDEX CONCURRENTLY idx_posts_author_id ON posts(author_id);
2. Connection Pool Tuning
config.MaxConns = 25
config.MaxConnLifetime = time.Hour
config.HealthCheckPeriod = time.Minute * 5
3. Enable Query Logging
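// Add to database config (pgx v4; config.ConnConfig.Logger must also be set.
// pgx v5 removed these fields in favor of config.ConnConfig.Tracer.)
config.ConnConfig.LogLevel = pgx.LogLevelInfo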
Memory Leaks
Symptoms:
- Memory usage increasing over time
- Out of memory errors
- Service crashes
Diagnostics:
# Monitor memory usage
watch -n 1 'ps aux | grep sojorn-api'
# Save a heap profile (binary output) for later analysis with go tool pprof
curl -o heap.pprof http://localhost:8080/debug/pprof/heap
Solutions:
1. Profile Memory Usage
go tool pprof http://localhost:8080/debug/pprof/heap
2. Fix Goroutine Leaks
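// Ensure proper cleanup. Deferred calls run in reverse order, so cancel()
// fires first and wg.Wait() then waits for the goroutines to exit.
defer wg.Wait()
defer cancel()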
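The order of the deferred calls matters: Go runs them last-in, first-out, so `cancel()` must be deferred after `wg.Wait()` to run before it. Otherwise the function waits on goroutines that have not yet been told to stop. The sketch below shows the pattern in a self-contained form; `runWorkers` is a hypothetical function, not part of the Sojorn backend.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// runWorkers starts n goroutines that block until the context is cancelled,
// then returns how many of them exited.
func runWorkers(n int) (exited int64) {
	ctx, cancel := context.WithCancel(context.Background())
	var wg sync.WaitGroup
	var count int64

	// Deferred calls run in reverse order of registration:
	// 1. cancel()  signals every worker to stop
	// 2. wg.Wait() blocks until all workers have returned
	// 3. the closure records the final count in the named result
	defer func() { exited = atomic.LoadInt64(&count) }()
	defer wg.Wait()
	defer cancel()

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// A real worker would select on ctx.Done() alongside its work.
			<-ctx.Done()
			atomic.AddInt64(&count, 1)
		}()
	}
	return 0 // overwritten by the deferred closure above
}

func main() {
	fmt.Println("workers exited cleanly:", runWorkers(5))
}
```

Swapping the two defers so that `wg.Wait()` runs before `cancel()` would make this example hang, which is the same symptom a leaked goroutine produces at shutdown.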
Deployment Issues
SSL/TLS Certificate Problems
Symptoms:
- Certificate expired errors
- SSL handshake failures
- Mixed content warnings
Diagnostics:
# Check certificate status
sudo certbot certificates
# Test SSL configuration
sudo nginx -t
# Check certificate expiry
openssl x509 -in /etc/letsencrypt/live/api.sojorn.net/cert.pem -text -noout | grep "Not After"
Solutions:
1. Renew Certificates
sudo certbot renew --dry-run
sudo certbot renew
sudo systemctl reload nginx
2. Fix Nginx SSL Config
ssl_certificate /etc/letsencrypt/live/api.sojorn.net/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/api.sojorn.net/privkey.pem;
DNS Propagation Issues
Symptoms:
- Domain not resolving
- Domain pointing to the wrong IP
- DNS changes still propagating (old records cached until their TTL expires)
Diagnostics:
# Check DNS resolution
nslookup api.sojorn.net
dig api.sojorn.net
# Check propagation
for i in {1..10}; do echo "Attempt $i:"; dig api.sojorn.net +short; sleep 30; done
Solutions:
1. Verify DNS Records
# Check A record
dig api.sojorn.net A
# Check with multiple DNS servers
dig @8.8.8.8 api.sojorn.net
dig @1.1.1.1 api.sojorn.net
2. Reduce TTL Before Changes
Set TTL to 300 seconds before making DNS changes.
Debugging Tools & Commands
Essential Commands
# Service Management
sudo systemctl status sojorn-api
sudo systemctl restart sojorn-api
sudo journalctl -u sojorn-api -f
# Database
sudo -u postgres psql sojorn
sudo -u postgres psql sojorn -c "SELECT count(*) FROM users;"
# Network
sudo netstat -tlnp | grep :8080
curl -I https://api.sojorn.net/health
# Logs
sudo tail -f /var/log/nginx/access.log
sudo tail -f /var/log/nginx/error.log
# File System
ls -la /opt/sojorn/
df -h /opt/sojorn/
Monitoring Scripts
#!/bin/bash
# monitor.sh - Basic health check
echo "=== Service Status ==="
sudo systemctl is-active sojorn-api
echo "=== Database Connections ==="
sudo -u postgres psql -c "SELECT count(*) FROM pg_stat_activity;"
echo "=== Disk Space ==="
df -h /opt/sojorn/
echo "=== Memory Usage ==="
free -h
echo "=== Recent Errors ==="
sudo journalctl -u sojorn-api --since "1 hour ago" | grep -i error
Emergency Procedures
Service Recovery
- Immediate Response:
sudo systemctl restart sojorn-api
sudo systemctl restart nginx
sudo systemctl restart postgresql
- Check Logs:
sudo journalctl -u sojorn-api -n 100
sudo journalctl -u nginx -n 100
- Verify Health:
curl https://api.sojorn.net/health
Database Recovery
- Check Database Status:
sudo systemctl status postgresql
sudo -u postgres psql -c "SELECT 1;"
- Restore from Backup:
sudo -u postgres psql sojorn < backup.sql
- Verify Data Integrity:
sudo -u postgres psql sojorn -c "SELECT COUNT(*) FROM users;"
Contact & Support
Information to Gather
When reporting issues, include:
- Environment Details:
  - OS version
  - Service versions
  - Configuration files (redacted)
- Error Messages:
  - Full error messages
  - Stack traces
  - Log entries
- Reproduction Steps:
  - What triggers the issue
  - Frequency
  - Impact assessment
- Diagnostic Output:
  - Service status
  - Resource usage
  - Network tests
Escalation Procedures
- Level 1: Check this guide and run basic diagnostics
- Level 2: Collect detailed logs and metrics
- Level 3: Contact infrastructure provider if needed
Last Updated: January 30, 2026
Version: 1.0
Next Review: February 15, 2026