Scanner PWA Technical Runbook
System Architecture Overview
The Black Canyon Tickets Scanner PWA is an offline-first Progressive Web App that gate staff use to scan tickets. It includes abuse prevention (client- and server-side rate limiting, device tracking) and is optimized for mobile devices.
Core Components
┌─────────────────────────────────────────────────────────────────┐
│                    Scanner PWA Architecture                     │
├─────────────────────────────────────────────────────────────────┤
│ Frontend (React PWA)            │ Backend APIs                  │
│ ├── Camera/QR Detection         │ ├── /api/tickets/verify       │
│ ├── Offline Queue (IndexedDB)   │ ├── /api/scans/log            │
│ ├── Background Sync (SW)        │ ├── /api/scanner/sync         │
│ ├── Rate Limiting Client        │ └── /api/scanner/conflicts    │
│ └── Abuse Prevention UI         │                               │
├─────────────────────────────────────────────────────────────────┤
│ Infrastructure                  │ Monitoring                    │
│ ├── Supabase/Firebase DB        │ ├── Sentry Error Tracking     │
│ ├── CDN (Vercel/Netlify)        │ ├── Performance Monitoring    │
│ ├── SSL/HTTPS (Required)        │ ├── Real-time Alerts          │
│ └── Service Worker Caching      │ └── Usage Analytics           │
└─────────────────────────────────────────────────────────────────┘
Environment Setup
Staging Environment Configuration
Environment Variables
# Database Configuration
VITE_SUPABASE_URL=https://staging-scanner.supabase.co
VITE_SUPABASE_ANON_KEY=eyJ...staging-key
SUPABASE_SERVICE_ROLE_KEY=eyJ...service-key
# Scanner API Configuration
VITE_SCANNER_API_URL=https://staging-api.blackcanyontickets.com
VITE_SCANNER_RATE_LIMIT=8
VITE_SCANNER_DEBOUNCE_MS=2000
# PWA Configuration
VITE_PWA_NAME=BCT Scanner (Staging)
VITE_PWA_SHORT_NAME=BCT Scanner
VITE_PWA_THEME_COLOR=#1e40af
# Monitoring Configuration
VITE_SENTRY_DSN=https://staging@sentry.io/project
SENTRY_ENVIRONMENT=staging
SENTRY_RELEASE=$VERCEL_GIT_COMMIT_SHA
# Feature Flags
VITE_SCANNER_OFFLINE_ENABLED=true
VITE_ABUSE_PREVENTION_ENABLED=true
VITE_DEVICE_TRACKING_ENABLED=true
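These variables reach the client as strings, so the numeric limits and feature flags need parsing before use. The following is a minimal sketch, not the project's actual loader: `parseScannerConfig` is a hypothetical helper, and its fallback defaults simply mirror the staging values listed above. In a Vite build the argument would typically be `import.meta.env`.

```javascript
// Hypothetical helper: converts the string-valued env object into typed settings.
// The fallback defaults mirror the staging values above and are assumptions.
function parseScannerConfig(env) {
  const toInt = (value, fallback) => {
    const n = Number.parseInt(value, 10);
    return Number.isFinite(n) && n > 0 ? n : fallback;
  };
  // Only the literal string "true" (any case) enables a flag.
  const toBool = (value) => String(value).toLowerCase() === 'true';

  return {
    apiUrl: env.VITE_SCANNER_API_URL,
    rateLimitPerSecond: toInt(env.VITE_SCANNER_RATE_LIMIT, 8),
    debounceMs: toInt(env.VITE_SCANNER_DEBOUNCE_MS, 2000),
    offlineEnabled: toBool(env.VITE_SCANNER_OFFLINE_ENABLED),
    abusePreventionEnabled: toBool(env.VITE_ABUSE_PREVENTION_ENABLED),
    deviceTrackingEnabled: toBool(env.VITE_DEVICE_TRACKING_ENABLED),
  };
}

// Example with a plain object standing in for import.meta.env.
const config = parseScannerConfig({
  VITE_SCANNER_API_URL: 'https://staging-api.blackcanyontickets.com',
  VITE_SCANNER_RATE_LIMIT: '8',
  VITE_SCANNER_DEBOUNCE_MS: '2000',
  VITE_SCANNER_OFFLINE_ENABLED: 'true',
});
console.log(config.rateLimitPerSecond, config.offlineEnabled);
```

Unset flags resolve to `false` in this sketch, so a missing variable disables a feature rather than enabling it by accident.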
Deployment Configuration
Vercel (Recommended)
{
"buildCommand": "npm run build",
"outputDirectory": "dist",
"installCommand": "npm ci",
"framework": "vite",
"functions": {
"api/**/*.ts": {
"runtime": "nodejs18.x"
}
},
"headers": [
{
"source": "/sw.js",
"headers": [
{
"key": "Cache-Control",
"value": "public, max-age=0, must-revalidate"
}
]
}
]
}
Netlify Alternative
[build]
command = "npm run build"
publish = "dist"
[[headers]]
for = "/sw.js"
[headers.values]
Cache-Control = "public, max-age=0, must-revalidate"
[[headers]]
for = "/manifest.json"
[headers.values]
Cache-Control = "public, max-age=86400"
Production Environment
SSL/HTTPS Requirements
- Camera API: Requires HTTPS for getUserMedia() access
- Service Workers: HTTPS required for PWA functionality
- Geolocation: HTTPS required for location services
- Background Sync and Web Push: both depend on a service worker, so HTTPS is required
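A quick way to confirm these requirements on a device is to inspect the browser globals at startup. The sketch below is illustrative only: `checkScannerPrerequisites` is a hypothetical name, and the `win` parameter stands for `window` so the check can also run against a stub object.

```javascript
// Reports which HTTPS-gated browser features are usable.
// `win` is the global `window` in a browser; it is passed in so the check
// can be exercised with a stub object outside a browser.
function checkScannerPrerequisites(win) {
  const nav = win.navigator || {};
  const result = {
    secureContext: win.isSecureContext === true,
    // navigator.mediaDevices is undefined on insecure origins
    camera: typeof nav.mediaDevices?.getUserMedia === 'function',
    // navigator.serviceWorker is only exposed in secure contexts
    serviceWorker: 'serviceWorker' in nav,
    // the geolocation object may exist on HTTP pages, but requests fail there
    geolocation: 'geolocation' in nav,
  };
  result.readyToScan = result.secureContext && result.camera && result.serviceWorker;
  return result;
}

// Stub standing in for a page served over plain HTTP.
const insecurePage = { isSecureContext: false, navigator: { geolocation: {} } };
console.log(checkScannerPrerequisites(insecurePage));
```

In the app this would be called as `checkScannerPrerequisites(window)`; a `readyToScan` of `false` usually points to a non-HTTPS URL or an unsupported browser.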
Database Configuration
-- Scanner-specific tables
CREATE TABLE scanner_logs (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
device_id VARCHAR(255) NOT NULL,
event_id UUID NOT NULL,
qr_code VARCHAR(500) NOT NULL,
scan_result VARCHAR(50) NOT NULL, -- 'valid', 'invalid', 'already_scanned'
scan_timestamp TIMESTAMPTZ DEFAULT NOW(),
zone VARCHAR(100),
sync_status VARCHAR(50) DEFAULT 'synced',
created_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE TABLE scan_conflicts (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
device_id VARCHAR(255) NOT NULL,
qr_code VARCHAR(500) NOT NULL,
offline_result VARCHAR(50) NOT NULL,
server_result VARCHAR(50) NOT NULL,
event_id UUID NOT NULL,
conflict_timestamp TIMESTAMPTZ DEFAULT NOW(),
resolution_status VARCHAR(50) DEFAULT 'pending'
);
-- Indexes for performance
CREATE INDEX idx_scanner_logs_device_event ON scanner_logs(device_id, event_id);
CREATE INDEX idx_scanner_logs_timestamp ON scanner_logs(scan_timestamp DESC);
CREATE INDEX idx_scan_conflicts_unresolved ON scan_conflicts(resolution_status)
WHERE resolution_status = 'pending';
API Endpoints and Integration
Core Scanner APIs
1. Ticket Verification Endpoint
POST /api/tickets/verify
Content-Type: application/json
Authorization: Bearer {jwt_token}
{
"qr": "TICKET_UUID_OR_CODE",
"eventId": "event_uuid",
"deviceId": "device_fingerprint",
"zone": "Gate A"
}
// Responses
// Success (200)
{
"valid": true,
"ticket": {
"eventTitle": "Sample Event",
"ticketTypeName": "General Admission",
"customerEmail": "customer@example.com",
"seatNumber": "A-15" // if assigned seating
}
}
// Already Scanned (200)
{
"valid": false,
"reason": "already_scanned",
"scannedAt": "2024-01-01T18:30:00Z",
"scannedBy": "device_abc123",
"zone": "Main Entrance"
}
// Invalid (200)
{
"valid": false,
"reason": "invalid", // or "expired", "cancelled", "locked"
"message": "Ticket not found or has been cancelled"
}
// Rate Limited (429)
{
"error": "rate_limit_exceeded",
"retryAfter": 5,
"message": "Too many requests from device"
}
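On the client, the four response shapes above have to be collapsed into something the gate UI can display. The sketch below shows one way to do that under stated assumptions: the function name and the state labels (`admit`, `deny`, `retry`, `error`) are illustrative and not part of the API.

```javascript
// Maps an HTTP status and parsed JSON body from /api/tickets/verify
// to a display state. State names are illustrative, not part of the API.
function interpretVerifyResponse(status, body) {
  if (status === 429) {
    // Rate limited: wait for the server-provided delay (seconds)
    return { state: 'retry', retryAfterSeconds: body.retryAfter ?? 1 };
  }
  if (status !== 200) {
    return { state: 'error', message: `Unexpected status ${status}` };
  }
  if (body.valid) {
    return { state: 'admit', ticket: body.ticket };
  }
  if (body.reason === 'already_scanned') {
    return {
      state: 'deny',
      message: `Already scanned at ${body.zone} (${body.scannedAt})`,
    };
  }
  // Remaining reasons: 'invalid', 'expired', 'cancelled', 'locked'
  return { state: 'deny', message: body.message ?? body.reason };
}

const duplicate = interpretVerifyResponse(200, {
  valid: false,
  reason: 'already_scanned',
  scannedAt: '2024-01-01T18:30:00Z',
  zone: 'Main Entrance',
});
console.log(duplicate.state, duplicate.message);
```

Note that every ticket-level outcome arrives with HTTP 200, so the client must branch on `valid` and `reason` rather than on the status code alone.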
2. Scan Logging Endpoint
POST /api/scans/log
Content-Type: application/json
Authorization: Bearer {jwt_token}
{
"deviceId": "device_fingerprint",
"eventId": "event_uuid",
"qr": "TICKET_CODE",
"result": "valid", // valid, invalid, already_scanned
"zone": "Gate A",
"timestamp": "2024-01-01T18:30:00Z",
"latency": 245, // ms
"offline": false
}
// Response (202 Accepted)
{
"logged": true,
"scanId": "scan_uuid"
}
3. Offline Sync Endpoint
POST /api/scanner/sync
Content-Type: application/json
Authorization: Bearer {jwt_token}
{
"deviceId": "device_fingerprint",
"scans": [
{
"qr": "TICKET_CODE_1",
"eventId": "event_uuid",
"result": "valid",
"timestamp": "2024-01-01T18:30:00Z",
"zone": "Gate A"
}
// ... more scans
]
}
// Response (200)
{
"synced": 15,
"conflicts": 2,
"failed": 0,
"conflictDetails": [
{
"qr": "TICKET_CODE_X",
"offlineResult": "valid",
"serverResult": "already_scanned",
"conflictId": "conflict_uuid"
}
]
}
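When a device comes back online with a long queue, the scans are usually sent in several requests rather than one. The following sketch assumes a batch size of 25 and uses hypothetical helper names (`buildSyncRequests`, `mergeSyncResults`); it only shapes data and leaves the actual HTTP call to the caller.

```javascript
// Splits queued scans into request bodies for POST /api/scanner/sync.
function buildSyncRequests(deviceId, queuedScans, batchSize = 25) {
  const requests = [];
  for (let i = 0; i < queuedScans.length; i += batchSize) {
    requests.push({ deviceId, scans: queuedScans.slice(i, i + batchSize) });
  }
  return requests;
}

// Combines several sync responses into one summary for display or logging.
function mergeSyncResults(responses) {
  const summary = { synced: 0, conflicts: 0, failed: 0, conflictDetails: [] };
  for (const r of responses) {
    summary.synced += r.synced;
    summary.conflicts += r.conflicts;
    summary.failed += r.failed;
    summary.conflictDetails.push(...(r.conflictDetails ?? []));
  }
  return summary;
}

// 60 queued scans become three requests of 25, 25 and 10 scans.
const queued = Array.from({ length: 60 }, (_, i) => ({
  qr: `TICKET_CODE_${i + 1}`,
  eventId: 'event_uuid',
  result: 'valid',
  timestamp: '2024-01-01T18:30:00Z',
  zone: 'Gate A',
}));
const requests = buildSyncRequests('device_fingerprint', queued);
console.log(requests.map((r) => r.scans.length));
```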
Server-Side Rate Limiting
Redis-Based Rate Limiting
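// Rate limiting implementation (fixed window per device)
// Assumes `redis` is a connected client and RateLimitError is defined elsewhere
const rateLimit = async (deviceId, windowMs = 1000, maxRequests = 8) => {
  const key = `rate_limit:${deviceId}`;
  const current = await redis.incr(key);
  if (current === 1) {
    // First request in this window: start the expiry clock (whole seconds)
    await redis.expire(key, Math.ceil(windowMs / 1000));
  }
  if (current > maxRequests) {
    // TTL is negative if the key has already expired or has no expiry
    const ttl = await redis.ttl(key);
    throw new RateLimitError('Rate limit exceeded', {
      retryAfter: Math.max(ttl, 1)
    });
  }
  return { remaining: maxRequests - current, resetTime: Date.now() + windowMs };
};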
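The fixed-window logic above can be exercised without Redis. The sketch below is an in-memory stand-in with an injectable clock, intended for unit tests or for the client-side limiter; `createFixedWindowLimiter` is a hypothetical name and this is not the production implementation.

```javascript
// In-memory fixed-window limiter: at most maxRequests per device per window.
// `now` is injectable so tests can control time.
function createFixedWindowLimiter({ windowMs = 1000, maxRequests = 8, now = Date.now } = {}) {
  const windows = new Map(); // deviceId -> { start, count }
  return function check(deviceId) {
    const t = now();
    let w = windows.get(deviceId);
    if (!w || t - w.start >= windowMs) {
      // No window yet, or the previous one has elapsed: start a new one
      w = { start: t, count: 0 };
      windows.set(deviceId, w);
    }
    w.count += 1;
    if (w.count > maxRequests) {
      return { allowed: false, retryAfterMs: w.start + windowMs - t };
    }
    return { allowed: true, remaining: maxRequests - w.count };
  };
}

// Nine requests in the same instant: the ninth is rejected.
let fakeTime = 0;
const check = createFixedWindowLimiter({ now: () => fakeTime });
const results = [];
for (let i = 0; i < 9; i++) results.push(check('device_abc123').allowed);
fakeTime = 1000; // one window later
const afterReset = check('device_abc123');
console.log(results, afterReset);
```

A fixed window permits short bursts across a window boundary (up to twice the limit); a sliding window or token bucket would smooth that out at the cost of more state.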
Device Abuse Tracking
const trackDeviceAbuse = async (deviceId, violation) => {
const key = `abuse:${deviceId}`;
const violations = await redis.get(key) || '[]';
const parsed = JSON.parse(violations);
parsed.push({
type: violation.type, // 'rate_limit', 'invalid_qr_spam'
timestamp: Date.now(),
details: violation.details
});
// Keep only last 24 hours of violations
const oneDayAgo = Date.now() - 86400000;
const recent = parsed.filter(v => v.timestamp > oneDayAgo);
await redis.setex(key, 86400, JSON.stringify(recent));
// Calculate escalating penalty
const penalty = calculatePenalty(recent);
if (penalty > 0) {
await redis.setex(`penalty:${deviceId}`, penalty, '1');
}
};
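`calculatePenalty` is referenced above but not defined in this runbook. One possible shape, offered purely as an assumption, is to ignore the first few violations and then double the lockout with each further one up to a cap. The return value is in seconds because it is used as a Redis TTL; all thresholds and durations below are illustrative.

```javascript
// Hypothetical escalation policy; every number here is illustrative only.
// Returns a lockout duration in seconds (0 means no penalty).
function calculatePenalty(
  recentViolations,
  { freeViolations = 3, baseSeconds = 30, maxSeconds = 3600 } = {}
) {
  const excess = recentViolations.length - freeViolations;
  if (excess <= 0) return 0;
  // 30s, 60s, 120s, ... capped at maxSeconds
  return Math.min(baseSeconds * 2 ** (excess - 1), maxSeconds);
}

const fiveViolations = Array.from({ length: 5 }, () => ({ type: 'rate_limit' }));
console.log(calculatePenalty(fiveViolations)); // 60
```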
Monitoring and Alerting
Sentry Configuration
Error Monitoring Setup
// sentry.client.ts
import * as Sentry from '@sentry/react';
Sentry.init({
  // Vite exposes client-side variables on import.meta.env, not process.env
  dsn: import.meta.env.VITE_SENTRY_DSN,
  environment: import.meta.env.MODE,
  integrations: [
    new Sentry.BrowserTracing({
      tracingOrigins: [/^https:\/\/.*\.blackcanyontickets\.com\/api/],
    }),
  ],
  tracesSampleRate: import.meta.env.PROD ? 0.1 : 1.0,
  beforeSend(event) {
    // Filter out rate limiting errors as they're expected
    if (event.exception?.values?.[0]?.type === 'RateLimitError') {
      return null;
    }
    return event;
  }
});
Performance Monitoring
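// Performance tracking in scanner
const trackScanPerformance = async (scanResult) => {
  Sentry.addBreadcrumb({
    category: 'scanner',
    message: `Scan ${scanResult.result}`,
    level: 'info',
    data: {
      latency: scanResult.latency,
      offline: scanResult.offline,
      zone: scanResult.zone,
      // performance.memory is non-standard and only present in Chromium browsers
      deviceMemory: performance.memory?.usedJSHeapSize || 0
    }
  });
  // The Battery Status API is navigator.getBattery(), which returns a Promise
  // and is missing in some browsers; navigator.battery no longer exists
  const battery = await navigator.getBattery?.().catch(() => undefined);
  // Track performance metrics
  Sentry.setTag('scanning_session', true);
  Sentry.setContext('device_info', {
    memory: performance.memory?.usedJSHeapSize,
    connection: navigator.connection?.effectiveType,
    battery: battery?.level
  });
};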
Alert Configuration
Critical Alerts (Immediate Response)
# Sentry Alert Rules
- name: "Scanner API Errors"
  conditions:
    - error_count > 10 in 5 minutes
  actions:
    - slack: "#incidents"
    - email: "oncall@blackcanyontickets.com"
- name: "High Rate Limiting"
  conditions:
    - event.type = "RateLimitError"
    - count > 50 in 10 minutes
  actions:
    - slack: "#scanner-ops"
- name: "Sync Failures"
  conditions:
    - event.message contains "sync_failed"
    - count > 25 in 15 minutes
  actions:
    - slack: "#scanner-ops"
    - pagerduty: "P1"
Performance Thresholds
// Client-side performance monitoring
const performanceMonitor = {
scanLatency: {
warning: 1000, // ms
critical: 3000
},
memoryUsage: {
warning: 50 * 1024 * 1024, // 50MB
critical: 100 * 1024 * 1024 // 100MB
},
syncQueueSize: {
warning: 50,
critical: 100
},
offlineTime: {
warning: 300000, // 5 minutes
critical: 900000 // 15 minutes
}
};
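Thresholds like these are typically consumed by a small classifier that turns a raw reading into a status. The sketch below repeats a subset of the values so it is self-contained; `classifyMetric` is a hypothetical helper, and treating a reading equal to a limit as having crossed it is an assumption.

```javascript
// Subset of the thresholds above, repeated so the sketch is self-contained.
const thresholds = {
  scanLatency: { warning: 1000, critical: 3000 }, // ms
  syncQueueSize: { warning: 50, critical: 100 },  // queued scans
};

// Classifies a reading against { warning, critical } limits.
// Assumption: a value equal to a limit counts as having reached it.
function classifyMetric(limitsByName, name, value) {
  const limits = limitsByName[name];
  if (!limits) throw new Error(`Unknown metric: ${name}`);
  if (value >= limits.critical) return 'critical';
  if (value >= limits.warning) return 'warning';
  return 'ok';
}

console.log(classifyMetric(thresholds, 'scanLatency', 245));   // 'ok'
console.log(classifyMetric(thresholds, 'scanLatency', 1500));  // 'warning'
console.log(classifyMetric(thresholds, 'syncQueueSize', 100)); // 'critical'
```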
Dashboard URLs and Metrics
Grafana Dashboards
- Scanner Operations: https://grafana.bct.com/d/scanner-ops
  - Real-time scan rates by device/zone
  - Network latency and error rates
  - Offline queue sizes and sync status
- Performance Metrics: https://grafana.bct.com/d/scanner-perf
  - Memory usage trends
  - Battery life estimates
  - Camera initialization times
- Business Metrics: https://grafana.bct.com/d/scanner-business
  - Entry throughput by gate
  - Peak scanning times
  - Duplicate/invalid ticket rates
Real-Time Monitoring
// Health check endpoint
GET /api/scanner/health
{
"status": "healthy",
"checks": {
"database": "ok",
"redis": "ok",
"camera_api": "ok"
},
"metrics": {
"active_devices": 15,
"scans_per_minute": 245,
"pending_syncs": 12,
"conflict_rate": 0.02
}
}
Network Requirements and Quality
WiFi Configuration
Recommended WiFi Setup
Network Requirements:
- SSID: "BCT-Staff" (dedicated for scanners)
- Security: WPA3-Enterprise (or WPA2-Personal minimum)
- Bandwidth: 10 Mbps minimum, 50 Mbps recommended
- Latency: <100ms to API servers
- Coverage: -65 dBm minimum signal at all gate locations
Quality of Service (QoS):
- Scanner traffic priority: High
- API requests: TCP/443 (HTTPS)
- Sync operations: TCP/443 (HTTPS)
- Background sync: TCP/443 (HTTPS)
Network Monitoring
#!/bin/bash
# WiFi quality testing script
WIFI_SSID="BCT-Staff"
API_ENDPOINT="https://api.blackcanyontickets.com/health"
echo "Testing WiFi Quality for Scanner Operations"
echo "==========================================="
# Signal strength test (iwconfig reports e.g. "Signal level=-58 dBm")
SIGNAL=$(iwconfig wlan0 | grep 'Signal level' | awk '{print $4}' | cut -d= -f2)
echo "Signal Strength: $SIGNAL dBm"
# Warn when the signal is weaker than the -65 dBm coverage minimum
if [ -n "$SIGNAL" ] && [ "$SIGNAL" -lt -65 ]; then
  echo "⚠️ Warning: Signal strength may affect scanning"
fi
# Latency test (average of 5 pings, taken from the min/avg/max summary line)
PING=$(ping -c 5 api.blackcanyontickets.com | tail -1 | awk '{print $4}' | cut -d/ -f2)
echo "Average Latency: ${PING}ms"
if (( $(echo "$PING > 200" | bc -l) )); then
  echo "⚠️ Warning: High latency may slow scanning"
fi
# API response time test
echo "Testing API response time..."
API_TIME=$(curl -o /dev/null -s -w "%{time_total}" "$API_ENDPOINT")
echo "API Response Time: ${API_TIME}s"
if (( $(echo "$API_TIME > 2.0" | bc -l) )); then
  echo "❌ Critical: API too slow for real-time scanning"
fi
Cellular Fallback Configuration
Carrier Requirements
Cellular Backup:
- Carriers: Verizon, AT&T, T-Mobile (multi-carrier preferred)
- Data Plans: Unlimited or 10GB+ per device per event
- Coverage: -85 dBm minimum LTE signal
- Speeds: 5 Mbps down, 1 Mbps up minimum
Network Priority:
1. WiFi (primary)
2. 5G/LTE (automatic fallback)
3. Offline mode (complete network failure)
Network Handoff Testing
// Network quality detection
const detectNetworkQuality = () => {
const connection = navigator.connection || navigator.mozConnection || navigator.webkitConnection;
return {
type: connection?.effectiveType || 'unknown', // '4g', '3g', 'slow-2g'
downlink: connection?.downlink || 0, // Mbps
rtt: connection?.rtt || 0, // ms round trip time
saveData: connection?.saveData || false
};
};
// Adaptive behavior based on connection
// Expects the object returned by detectNetworkQuality(), whose `type` field
// holds the effectiveType string ('slow-2g', '2g', '3g', '4g' or 'unknown')
const adaptToNetworkQuality = (quality) => {
  if (quality.type === 'slow-2g' || quality.type === '2g') {
    // Reduce sync frequency, smaller batches
    return { syncInterval: 30000, batchSize: 5 };
  } else if (quality.type === '3g') {
    return { syncInterval: 10000, batchSize: 10 };
  } else {
    // 4G or better (or unknown) - full speed
    return { syncInterval: 5000, batchSize: 25 };
  }
};
Performance Baselines and SLAs
Expected Performance Metrics
Scanning Operations
Scan Processing:
- QR Detection Time: <500ms (camera to recognition)
- API Verification Time: <1000ms (network request)
- UI Response Time: <100ms (result display)
- Total Scan Time: <3 seconds (end-to-end)
Throughput:
- Peak Rate: 8 scans/second (rate limit)
- Sustained Rate: 4-6 scans/second per device
- Concurrent Devices: 20+ devices per event
- Entry Processing: <10 seconds per attendee (including interaction)
Device Performance
Memory Usage:
- Initial Load: <15MB heap size
- After 100 scans: <25MB heap size
- After 1000 scans: <40MB heap size
- Memory Growth Rate: <20MB per hour
Battery Life:
- Continuous Scanning: 4+ hours minimum
- Standby with App Open: 8+ hours
- Background Sync Impact: <10% additional drain
- Flashlight Usage: -25% battery life
CPU Performance:
- Camera Processing: <30% CPU usage
- QR Detection: <50% CPU burst (brief)
- Background Sync: <10% CPU usage
- Thermal Throttling: Graceful degradation
Network Performance
Sync Operations:
- Single Scan Sync: <500ms
- Batch Sync (25 scans): <2 seconds
- Full Queue Sync (100 scans): <10 seconds
- Conflict Resolution: <1 second per conflict
Offline Capabilities:
- Queue Capacity: 1000+ scans per device
- Storage Persistence: Survives app/browser restart
- Sync Success Rate: >99% when network restored
- Data Integrity: Zero scan loss during offline operation
Service Level Agreements (SLAs)
Availability Targets
System Availability: 99.9% during event hours
- API Uptime: 99.95%
- Database Uptime: 99.99%
- CDN Availability: 99.9%
- SSL Certificate: 99.99%
Response Time Targets:
- Scanner API: <1 second (95th percentile)
- Health Checks: <200ms (99th percentile)
- Sync Operations: <5 seconds (99th percentile)
Error Rate Targets:
- False Rejections: <0.5% (valid tickets rejected)
- False Acceptances: <0.1% (invalid tickets accepted)
- Sync Failures: <1% of total sync operations
- Conflict Rate: <2% of offline scans
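To make the availability percentages above concrete, it helps to convert them into allowed downtime. The sketch below does that arithmetic; the 6-hour event length is an assumption chosen for illustration, not a figure from this runbook.

```javascript
// Downtime permitted by an availability target over a given period.
function allowedDowntimeSeconds(availabilityPercent, periodHours) {
  const unavailableFraction = 1 - availabilityPercent / 100;
  return periodHours * 3600 * unavailableFraction;
}

// Assumption for illustration: a 6-hour event.
const eventHours = 6;
const targets = [['System', 99.9], ['API', 99.95], ['Database', 99.99]];
for (const [name, target] of targets) {
  const seconds = allowedDowntimeSeconds(target, eventHours);
  console.log(`${name}: ${seconds.toFixed(1)} s of downtime allowed`);
}
```

For a 6-hour event this works out to roughly 21.6 s for the system target, 10.8 s for the API, and 2.2 s for the database, which shows how little slack the targets leave during a single event.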
Performance Degradation Response
Performance Tiers:
Tier 1 (Green - Normal):
- Scan latency: <1 second
- Memory usage: <40MB
- Battery life: >4 hours
- Action: Monitor only
Tier 2 (Yellow - Degraded):
- Scan latency: 1-3 seconds
- Memory usage: 40-70MB
- Battery life: 2-4 hours
- Action: Alert on-call, prepare mitigation
Tier 3 (Red - Critical):
- Scan latency: >3 seconds
- Memory usage: >70MB
- Battery life: <2 hours
- Action: Immediate response, activate backup procedures
Escalation Procedures and Contacts
Technical Escalation Matrix
Severity Levels
P1 - Business Critical (Response: 5 minutes):
- Complete scanner system failure
- Database corruption affecting scan verification
- Security breach or data leak
- >50% of devices unable to scan
P2 - High Impact (Response: 15 minutes):
- Single API endpoint failure
- Performance degradation affecting >25% devices
- Network sync failures >10 minutes
- Rate limiting preventing legitimate scans
P3 - Medium Impact (Response: 1 hour):
- Individual device issues (hardware/software)
- Non-critical API errors
- Monitoring/alerting issues
- Minor UI/UX problems
P4 - Low Impact (Response: 4 hours):
- Enhancement requests
- Documentation updates
- Non-urgent performance optimization
- Cosmetic UI issues
Contact Directory
Primary On-Call (24/7 During Events)
Platform Engineering Lead:
- Name: [REDACTED]
- Phone: [REDACTED]
- Email: oncall-platform@blackcanyontickets.com
- Slack: @platform-oncall
- Responsibilities: API issues, database problems, sync failures
DevOps Engineer:
- Name: [REDACTED]
- Phone: [REDACTED]
- Email: oncall-devops@blackcanyontickets.com
- Slack: @devops-oncall
- Responsibilities: Infrastructure, networking, deployment issues
Frontend Engineering Lead:
- Name: [REDACTED]
- Phone: [REDACTED]
- Email: oncall-frontend@blackcanyontickets.com
- Slack: @frontend-oncall
- Responsibilities: PWA issues, camera problems, UI/UX bugs
Secondary/Escalation Contacts
Engineering Manager:
- Phone: [REDACTED]
- When: P1 incidents >30 minutes, team coordination needed
CTO:
- Phone: [REDACTED]
- When: P1 incidents >60 minutes, business decisions needed
CEO:
- Phone: [REDACTED]
- When: Business-critical failures, external communication needed
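The time-based escalation rules for P1 incidents can be expressed as a small lookup, which is useful for paging automation or an incident bot. This is a sketch under assumptions: `p1EscalationContacts` is a hypothetical function, and the CEO is left out because that escalation depends on a business judgment rather than elapsed time.

```javascript
// Who should be involved in an open P1, based on minutes since it started.
// Thresholds come from the "When" lines above; the function is illustrative.
function p1EscalationContacts(minutesOpen) {
  const contacts = ['Primary On-Call'];
  if (minutesOpen > 30) contacts.push('Engineering Manager');
  if (minutesOpen > 60) contacts.push('CTO');
  return contacts;
}

console.log(p1EscalationContacts(10)); // [ 'Primary On-Call' ]
console.log(p1EscalationContacts(45)); // [ 'Primary On-Call', 'Engineering Manager' ]
```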
Vendor/External Support
Supabase Support:
- Contact: support@supabase.com
- SLA: 4 hours (Pro Plan)
- Phone: Emergency hotline available
Sentry Support:
- Contact: support@sentry.io
- SLA: 8 hours (Business Plan)
- Documentation: docs.sentry.io
Vercel Support:
- Contact: support@vercel.com
- SLA: 24 hours (Pro Plan)
- Status Page: vercel-status.com
Escalation Procedures
P1 Incident Response
Step 1 (0-5 minutes):
- Acknowledge incident in #incidents Slack channel
- Start incident bridge call/video chat
- Assign incident commander (usually platform lead)
- Begin status page updates
Step 2 (5-15 minutes):
- Assess scope and impact
- Implement immediate mitigation (fallback procedures)
- Escalate to Engineering Manager if needed
- Communicate with venue/operations team
Step 3 (15-30 minutes):
- Deploy fixes or activate backup systems
- Monitor recovery and impact reduction
- Update stakeholders every 10 minutes
- Document timeline and actions taken
Step 4 (Post-resolution):
- Conduct immediate hot wash (15 minutes)
- Schedule full post-mortem within 48 hours
- Update runbooks based on lessons learned
- Communicate resolution to all stakeholders
Common Escalation Scenarios
Complete Scanner Failure
Symptoms: All devices unable to scan, API returning errors
Immediate Actions:
1. Check system status dashboard
2. Verify database connectivity
3. Activate manual entry procedures at gates
4. Estimate impact and communicate timeline
Escalation Triggers:
- Issue not resolved in 15 minutes → Engineering Manager
- Manual entry activated → Operations Manager
- Expected duration >1 hour → CTO
Mass Device Issues
Symptoms: >10 devices experiencing problems simultaneously
Immediate Actions:
1. Check for CDN/deployment issues
2. Verify service worker updates aren't causing problems
3. Roll back recent deployments if necessary
4. Distribute backup devices
Escalation Triggers:
- >50% devices affected → Immediate escalation to Engineering Manager
- Hardware-related issues → Venue operations team
- Software issues persisting >20 minutes → CTO notification
Database/API Performance Issues
Symptoms: Slow scan response times, sync delays, timeouts
Immediate Actions:
1. Check database performance metrics
2. Review API response times and error rates
3. Scale database resources if possible
4. Enable aggressive client-side caching
Escalation Triggers:
- Database CPU >90% for >5 minutes → DevOps immediate response
- API latency >5 seconds → Platform Engineering lead
- Unable to scale resources → CTO (budget approval needed)
Security Considerations
Data Protection and Privacy
Local Data Storage
IndexedDB Contents:
- Scan Queue: QR codes, timestamps, results (temporary)
- Device Settings: Zone, preferences (non-sensitive)
- Conflict Log: Sync discrepancies (temporary)
- No Storage: Customer PII, payment info, passwords
Data Retention:
- Scan Queue: Auto-purged after successful sync
- Settings: Persisted until manual reset
- Conflicts: Retained 24 hours for review
- Error Logs: Purged every 7 days
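The retention rules above amount to a filter over locally stored records. The following sketch shows that filter on plain objects; the record shape (`kind`, `createdAt`, `synced`) and the function name are assumptions, and a real implementation would run the equivalent logic against IndexedDB.

```javascript
const HOUR_MS = 60 * 60 * 1000;

// Retention windows from the list above; null means no time-based purge.
const RETENTION_MS = {
  conflict: 24 * HOUR_MS,
  errorLog: 7 * 24 * HOUR_MS,
  setting: null,
};

// Returns the records that should be kept.
// Record shape ({ kind, createdAt, synced }) is an assumption for this sketch.
function purgeExpired(records, now = Date.now()) {
  return records.filter((r) => {
    if (r.kind === 'scan') return !r.synced; // scans are dropped once synced
    const limit = RETENTION_MS[r.kind];
    return limit == null || now - r.createdAt < limit;
  });
}

const now = 10 * 24 * HOUR_MS;
const kept = purgeExpired(
  [
    { id: 1, kind: 'scan', synced: true, createdAt: now - HOUR_MS },
    { id: 2, kind: 'scan', synced: false, createdAt: now - HOUR_MS },
    { id: 3, kind: 'conflict', createdAt: now - 25 * HOUR_MS },
    { id: 4, kind: 'conflict', createdAt: now - HOUR_MS },
    { id: 5, kind: 'errorLog', createdAt: now - 8 * 24 * HOUR_MS },
    { id: 6, kind: 'setting', createdAt: 0 },
  ],
  now
);
console.log(kept.map((r) => r.id)); // [ 2, 4, 6 ]
```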
Network Security
API Communication:
- Protocol: HTTPS only (TLS 1.3 preferred)
- Authentication: JWT tokens with 2-hour expiration
- Rate Limiting: 8 requests/second per device
- Input Validation: All QR codes validated server-side
Content Security Policy:
- script-src: 'self' cdn.sentry.io
- connect-src: 'self' *.supabase.co *.sentry.io
- img-src: 'self' data: https:
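- Permissions-Policy: camera=(self) (required for QR scanning; camera access is controlled by the Permissions-Policy header, not by a CSP directive)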
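The directives above can be assembled into response headers programmatically, which keeps the policy in one place. This is a sketch with a hypothetical `buildCsp` helper; note that camera access is not expressible in CSP, so it is shown as a separate Permissions-Policy header.

```javascript
// Serializes directives into a Content-Security-Policy header value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ');
}

const csp = buildCsp({
  'script-src': ["'self'", 'cdn.sentry.io'],
  'connect-src': ["'self'", '*.supabase.co', '*.sentry.io'],
  'img-src': ["'self'", 'data:', 'https:'],
});

// Camera access is granted through Permissions-Policy, not CSP.
const securityHeaders = {
  'Content-Security-Policy': csp,
  'Permissions-Policy': 'camera=(self)',
};
console.log(securityHeaders);
```

If the scanner API is served from a different origin than the PWA, that origin would also need to appear in `connect-src`.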
Device Security
Client-Side Protection:
- No secret API keys in client code (only the public Supabase anon key ships in the bundle; the service-role key stays server-side)
- Device fingerprinting (non-PII) for abuse tracking
- Local encryption of sensitive scan data
- Secure session management
Physical Security:
- Device lock screen during breaks
- Remote wipe capabilities if stolen
- Tamper detection for unusual usage patterns
- Secure storage when not in use
Vulnerability Management
Regular Security Assessments
Automated Scanning:
- Daily: npm audit for dependency vulnerabilities
- Weekly: OWASP ZAP scan of staging environment
- Monthly: Full penetration test of production
Manual Review:
- Code Review: Required for all scanner-related changes
- Security Review: Required for API changes
- Access Review: Quarterly review of admin permissions
Incident Response Plan
Security Incident Types:
1. Data Breach (customer PII exposed)
2. Unauthorized Access (admin account compromise)
3. Service Disruption (DDoS, malicious traffic)
4. Device Compromise (stolen/hacked scanner device)
Response Timeline:
- Detection to Acknowledgment: <15 minutes
- Initial Assessment: <30 minutes
- Containment Actions: <1 hour
- Customer Notification: <24 hours (if PII involved)
Maintenance and Updates
Regular Maintenance Tasks
Daily Operations
#!/bin/bash
# Daily health check script
echo "BCT Scanner Daily Health Check - $(date)"
# Check API health
curl -f https://api.blackcanyontickets.com/health || echo "❌ API health check failed"
# Check database connections
psql "$DATABASE_URL" -c "SELECT 1" || echo "❌ Database connection failed"
# Check error rates in Sentry
# (This would integrate with Sentry API to get error counts)
# Check device online status
# (Query database for devices that haven't synced in >24 hours)
Weekly Maintenance
System Updates:
- Review and apply npm dependency updates
- Update browser compatibility testing
- Review performance metrics and trends
- Clean up old scan logs and conflict data
Monitoring Review:
- Review Sentry error trends
- Analyze performance degradation patterns
- Update alert thresholds based on data
- Test escalation procedures
Monthly Maintenance
Security Updates:
- Apply security patches to all dependencies
- Review and rotate API keys/secrets
- Update SSL certificates if needed
- Review access logs for unusual patterns
Performance Optimization:
- Analyze database query performance
- Review and optimize API endpoints
- Update caching strategies
- Test load balancing configuration
Deployment and Rollback Procedures
Production Deployment
Deployment Process:
1. Code Review: Minimum 2 approvals required
2. Staging Testing: Full test suite + manual QA
3. Blue-Green Deployment: Zero downtime deployment
4. Gradual Rollout: 10% → 50% → 100% traffic
5. Monitoring: 24-hour observation period
Rollback Triggers:
- Error rate >5% increase from baseline
- Response time >2x normal latency
- Critical functionality broken (scanning failure)
- Security vulnerability discovered
Rollback Process:
1. Immediate: Switch traffic to previous version (2 minutes)
2. Communication: Alert all stakeholders (5 minutes)
3. Investigation: Root cause analysis (30 minutes)
4. Fix Forward: Rapid bug fix deployment if possible
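The rollback triggers above can be checked automatically against post-deploy metrics. The sketch below is one possible encoding, with two loud assumptions: "error rate >5% increase" is read as five percentage points above baseline (it could also mean a relative increase), and the metric field names are hypothetical.

```javascript
// Returns the list of rollback triggers that have fired.
// Assumption: the error-rate trigger means more than 5 percentage points
// above baseline. Field names are hypothetical.
function rollbackReasons(baseline, current) {
  const reasons = [];
  if (current.errorRatePct - baseline.errorRatePct > 5) reasons.push('error_rate');
  if (current.latencyMs > 2 * baseline.latencyMs) reasons.push('latency');
  if (current.scanningBroken) reasons.push('critical_functionality');
  if (current.securityVulnerability) reasons.push('security');
  return reasons;
}

const baseline = { errorRatePct: 0.5, latencyMs: 400 };
const afterDeploy = { errorRatePct: 1.0, latencyMs: 950, scanningBroken: false };
console.log(rollbackReasons(baseline, afterDeploy)); // [ 'latency' ]
```

Any non-empty result would start the rollback process; an empty array means the deployment stays within the stated limits.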
Emergency Hotfixes
Hotfix Criteria:
- Security vulnerability (any severity)
- Data corruption or loss
- Complete service outage
- Critical business function broken
Hotfix Process:
1. Create hotfix branch from production
2. Make minimal fix with tests
3. Emergency code review (single approver)
4. Direct to production deployment
5. Post-deployment verification
6. Full retrospective within 24 hours
Appendices
A. Browser Compatibility Matrix
Recommended Browsers:
- Chrome 88+: ✅ Full support, best performance
- Safari 14+: ✅ iOS support, some PWA limitations
- Firefox 85+: ✅ Good fallback, slower QR detection
- Edge 88+: ✅ Windows support, Chromium-based
Required APIs:
- getUserMedia (Camera): All supported browsers
- IndexedDB: All supported browsers
- Service Workers: All supported browsers
- BarcodeDetector: Chromium-based browsers only (availability varies by platform); ZXing fallback for others
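Choosing between the native detector and the ZXing fallback can be done with feature detection at startup. The sketch below assumes a hypothetical `pickQrDecoder` helper that returns a label; wiring the label to an actual decoder is left out. `globalObj` stands for `globalThis` so the function can be tried with stub objects.

```javascript
// Chooses a QR decoding strategy. `globalObj` is `globalThis` in a browser.
async function pickQrDecoder(globalObj) {
  const Detector = globalObj.BarcodeDetector;
  if (typeof Detector === 'function' && typeof Detector.getSupportedFormats === 'function') {
    // The native detector may exist without supporting QR codes
    const formats = await Detector.getSupportedFormats();
    if (formats.includes('qr_code')) return 'native';
  }
  return 'zxing'; // fall back to the JavaScript decoder
}

// Stub that mimics a browser with native QR support.
function FakeDetector() {}
FakeDetector.getSupportedFormats = async () => ['qr_code', 'ean_13'];

pickQrDecoder({ BarcodeDetector: FakeDetector }).then((choice) => console.log(choice));
pickQrDecoder({}).then((choice) => console.log(choice));
```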
B. Database Schema Reference
-- Complete scanner database schema
-- See SCANNER_DATABASE_SCHEMA.sql for full implementation
C. Monitoring Queries
-- Active scanning sessions
SELECT
COUNT(DISTINCT device_id) as active_devices,
COUNT(*) as total_scans,
AVG(EXTRACT(EPOCH FROM (NOW() - scan_timestamp))) as avg_age_seconds
FROM scanner_logs
WHERE scan_timestamp > NOW() - INTERVAL '1 hour';
-- Conflict analysis
SELECT
offline_result,
server_result,
COUNT(*) as conflicts
FROM scan_conflicts
WHERE conflict_timestamp > NOW() - INTERVAL '24 hours'
GROUP BY offline_result, server_result;
D. Performance Benchmarking Scripts
# Device performance test script
# See SCANNER_PERFORMANCE_TEST.sh for full implementation
This technical runbook provides comprehensive guidance for IT administrators and technical staff to successfully deploy, monitor, and maintain the Scanner PWA system. Regular updates to this document should reflect operational lessons learned and system evolution.