
dzinesco 5edaaf0651 docs(scanner): add comprehensive staging rollout documentation

Add three critical documents for Scanner PWA production deployment:

1. STAGING_ROLLOUT_CHECKLIST.md - Main operational checklist
   - Pre-event setup procedures for IT/admin team
   - Staff device setup with PWA installation steps
   - Day-of operations and gate management protocols
   - Post-event data sync and cleanup procedures
   - Emergency fallback procedures and escalation contacts

2. STAFF_TRAINING_MATERIALS.md - Gate staff training resources
   - Step-by-step device setup for iOS/Android
   - Scanner operation guide with result interpretation
   - Troubleshooting guide for common issues
   - Professional smartphone usage tips for all-day events
   - Quick reference cards and emergency procedures

3. SCANNER_TECHNICAL_RUNBOOK.md - IT administrator guide
   - Complete system architecture and API documentation
   - Environment setup for staging/production deployment
   - Monitoring, alerting, and performance baseline configuration
   - Network requirements and quality management
   - Security considerations and vulnerability management
   - Escalation procedures and maintenance schedules

These documents provide complete operational readiness for Scanner PWA
deployment, ensuring smooth gate operations with minimal day-of issues.
Staff preparation procedures are designed for temporary/volunteer workers
with clear, simple instructions and comprehensive emergency protocols.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>

2025-08-20 10:48:16 -06:00


Scanner PWA Technical Runbook

System Architecture Overview

The Black Canyon Tickets Scanner PWA is an offline-first Progressive Web App that gate staff use to scan tickets. It includes abuse prevention (rate limiting and device tracking) and is optimized for mobile devices.

Core Components

┌─────────────────────────────────────────────────────────────────┐
│                     Scanner PWA Architecture                    │
├─────────────────────────────────────────────────────────────────┤
│  Frontend (React PWA)          │  Backend APIs                  │
│  ├── Camera/QR Detection       │  ├── /api/tickets/verify       │
│  ├── Offline Queue (IndexedDB) │  ├── /api/scans/log            │
│  ├── Background Sync (SW)      │  ├── /api/scanner/sync         │
│  ├── Rate Limiting Client      │  └── /api/scanner/conflicts    │
│  └── Abuse Prevention UI       │                                │
├─────────────────────────────────────────────────────────────────┤
│  Infrastructure                │  Monitoring                    │
│  ├── Supabase/Firebase DB      │  ├── Sentry Error Tracking     │
│  ├── CDN (Vercel/Netlify)      │  ├── Performance Monitoring    │
│  ├── SSL/HTTPS (Required)      │  ├── Real-time Alerts          │
│  └── Service Worker Caching    │  └── Usage Analytics           │
└─────────────────────────────────────────────────────────────────┘

Environment Setup

Staging Environment Configuration

Environment Variables

# Database Configuration
VITE_SUPABASE_URL=https://staging-scanner.supabase.co
VITE_SUPABASE_ANON_KEY=eyJ...staging-key
SUPABASE_SERVICE_ROLE_KEY=eyJ...service-key

# Scanner API Configuration  
VITE_SCANNER_API_URL=https://staging-api.blackcanyontickets.com
VITE_SCANNER_RATE_LIMIT=8
VITE_SCANNER_DEBOUNCE_MS=2000

# PWA Configuration
VITE_PWA_NAME=BCT Scanner (Staging)
VITE_PWA_SHORT_NAME=BCT Scanner
VITE_PWA_THEME_COLOR=#1e40af

# Monitoring Configuration
VITE_SENTRY_DSN=https://staging@sentry.io/project
SENTRY_ENVIRONMENT=staging
SENTRY_RELEASE=$VERCEL_GIT_COMMIT_SHA

# Feature Flags
VITE_SCANNER_OFFLINE_ENABLED=true
VITE_ABUSE_PREVENTION_ENABLED=true
VITE_DEVICE_TRACKING_ENABLED=true
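
The `VITE_SCANNER_DEBOUNCE_MS=2000` setting implies that repeated reads of the same QR code within two seconds are suppressed on the client. The runbook does not show that logic, so the following is only a minimal sketch of one way it could work. The helper name `createScanDebouncer` and its injectable clock are assumptions for illustration, not part of the scanner codebase.

```javascript
// Hypothetical helper: ignore repeat reads of the same QR code inside the debounce window.
// `now` is injectable so the logic can be exercised without real timers.
function createScanDebouncer(debounceMs = 2000, now = () => Date.now()) {
  const lastAccepted = new Map(); // qr -> timestamp (ms) of the last accepted read

  return function shouldProcess(qr) {
    const t = now();
    const previous = lastAccepted.get(qr);
    if (previous !== undefined && t - previous < debounceMs) {
      return false; // same code seen too recently, skip it
    }
    lastAccepted.set(qr, t);
    return true;
  };
}
```

In a Vite build the window would typically be read with `Number(import.meta.env.VITE_SCANNER_DEBOUNCE_MS)`. Note that this sketch never evicts old entries from the map; a long-running session would want periodic cleanup.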

Deployment Configuration

Vercel (Recommended)

{
  "buildCommand": "npm run build",
  "outputDirectory": "dist",
  "installCommand": "npm ci",
  "framework": "vite",
  "functions": {
    "api/**/*.ts": {
      "runtime": "nodejs18.x"
    }
  },
  "headers": [
    {
      "source": "/sw.js",
      "headers": [
        {
          "key": "Cache-Control",
          "value": "public, max-age=0, must-revalidate"
        }
      ]
    }
  ]
}

Netlify Alternative

[build]
  command = "npm run build"
  publish = "dist"

[[headers]]
  for = "/sw.js"
  [headers.values]
    Cache-Control = "public, max-age=0, must-revalidate"

[[headers]]
  for = "/manifest.json"  
  [headers.values]
    Cache-Control = "public, max-age=86400"

Production Environment

SSL/HTTPS Requirements

Camera API: getUserMedia() is only available in secure contexts (HTTPS)
Service Workers: HTTPS required for PWA installation and offline caching
Geolocation: HTTPS required for location services
Background Sync and Web Push: HTTPS required, since both depend on a registered service worker
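
A quick way to confirm these requirements on a staff device is to probe for the relevant browser features at startup. The sketch below is illustrative only: the function name `findMissingScannerFeatures` and the labels it returns are assumptions, while `isSecureContext`, `navigator.mediaDevices.getUserMedia`, `navigator.serviceWorker`, and `indexedDB` are standard browser globals.

```javascript
// Hypothetical startup check: list the scanner prerequisites that this environment lacks.
// `env` defaults to the global object and can be replaced with a stub in tests.
function findMissingScannerFeatures(env = globalThis) {
  const missing = [];
  const nav = env.navigator ?? {};

  // Pages served over plain HTTP (other than localhost) are not secure contexts.
  if (env.isSecureContext !== true) missing.push('secure-context');
  // navigator.mediaDevices is undefined outside secure contexts.
  if (typeof nav.mediaDevices?.getUserMedia !== 'function') missing.push('camera');
  if (!('serviceWorker' in nav)) missing.push('service-worker');
  if (!('indexedDB' in env)) missing.push('indexeddb');

  return missing;
}
```

An empty array means the device passes the basic checks; anything else can be shown to staff before the gates open.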

Database Configuration

-- Scanner-specific tables
CREATE TABLE scanner_logs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  device_id VARCHAR(255) NOT NULL,
  event_id UUID NOT NULL,
  qr_code VARCHAR(500) NOT NULL,
  scan_result VARCHAR(50) NOT NULL, -- 'valid', 'invalid', 'already_scanned'
  scan_timestamp TIMESTAMPTZ DEFAULT NOW(),
  zone VARCHAR(100),
  sync_status VARCHAR(50) DEFAULT 'synced',
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE scan_conflicts (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  device_id VARCHAR(255) NOT NULL,
  qr_code VARCHAR(500) NOT NULL,
  offline_result VARCHAR(50) NOT NULL,
  server_result VARCHAR(50) NOT NULL,
  event_id UUID NOT NULL,
  conflict_timestamp TIMESTAMPTZ DEFAULT NOW(),
  resolution_status VARCHAR(50) DEFAULT 'pending'
);

-- Indexes for performance
CREATE INDEX idx_scanner_logs_device_event ON scanner_logs(device_id, event_id);
CREATE INDEX idx_scanner_logs_timestamp ON scanner_logs(scan_timestamp DESC);
CREATE INDEX idx_scan_conflicts_unresolved ON scan_conflicts(resolution_status) 
  WHERE resolution_status = 'pending';

API Endpoints and Integration

Core Scanner APIs

1. Ticket Verification Endpoint

POST /api/tickets/verify
Content-Type: application/json
Authorization: Bearer {jwt_token}

{
  "qr": "TICKET_UUID_OR_CODE",
  "eventId": "event_uuid",
  "deviceId": "device_fingerprint",
  "zone": "Gate A"
}

// Responses
// Success (200)
{
  "valid": true,
  "ticket": {
    "eventTitle": "Sample Event",
    "ticketTypeName": "General Admission", 
    "customerEmail": "customer@example.com",
    "seatNumber": "A-15" // if assigned seating
  }
}

// Already Scanned (200)
{
  "valid": false,
  "reason": "already_scanned",
  "scannedAt": "2024-01-01T18:30:00Z",
  "scannedBy": "device_abc123",
  "zone": "Main Entrance"
}

// Invalid (200)
{
  "valid": false,
  "reason": "invalid", // or "expired", "cancelled", "locked"
  "message": "Ticket not found or has been cancelled"
}

// Rate Limited (429)
{
  "error": "rate_limit_exceeded",
  "retryAfter": 5,
  "message": "Too many requests from device"
}
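
The four response shapes above map naturally onto a small set of gate outcomes. A minimal client-side sketch follows; the function name `interpretVerifyResponse` and the outcome labels are hypothetical, and `retryAfter` is assumed to be in seconds (the runbook does not state the unit).

```javascript
// Hypothetical mapping from the /api/tickets/verify response to a gate outcome.
function interpretVerifyResponse(status, body) {
  if (status === 429) {
    // Assumption: retryAfter is expressed in seconds.
    return { outcome: 'retry', retryAfterMs: (body?.retryAfter ?? 1) * 1000 };
  }
  if (status !== 200 || !body) {
    return { outcome: 'error' };
  }
  if (body.valid === true) {
    return { outcome: 'admit', ticket: body.ticket };
  }
  if (body.reason === 'already_scanned') {
    return { outcome: 'duplicate', scannedAt: body.scannedAt, zone: body.zone };
  }
  // Covers "invalid", "expired", "cancelled", and "locked".
  return { outcome: 'reject', reason: body.reason ?? 'invalid' };
}
```

Keeping this mapping in one pure function makes it easy to test every documented response without a network connection.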

2. Scan Logging Endpoint

POST /api/scans/log
Content-Type: application/json
Authorization: Bearer {jwt_token}

{
  "deviceId": "device_fingerprint",
  "eventId": "event_uuid", 
  "qr": "TICKET_CODE",
  "result": "valid", // valid, invalid, already_scanned
  "zone": "Gate A",
  "timestamp": "2024-01-01T18:30:00Z",
  "latency": 245, // ms
  "offline": false
}

// Response (202 Accepted)
{
  "logged": true,
  "scanId": "scan_uuid"
}

3. Offline Sync Endpoint

POST /api/scanner/sync
Content-Type: application/json
Authorization: Bearer {jwt_token}

{
  "deviceId": "device_fingerprint",
  "scans": [
    {
      "qr": "TICKET_CODE_1",
      "eventId": "event_uuid",
      "result": "valid",
      "timestamp": "2024-01-01T18:30:00Z",
      "zone": "Gate A"
    }
    // ... more scans
  ]
}

// Response (200)
{
  "synced": 15,
  "conflicts": 2,
  "failed": 0,
  "conflictDetails": [
    {
      "qr": "TICKET_CODE_X",
      "offlineResult": "valid", 
      "serverResult": "already_scanned",
      "conflictId": "conflict_uuid"
    }
  ]
}
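
To illustrate how a device might prepare its offline queue for this endpoint, the sketch below splits queued scans into request payloads and picks out the scans that the response flags as conflicts. Both function names are hypothetical, and the default batch size of 25 is an assumption borrowed from the batch sizes mentioned elsewhere in this runbook.

```javascript
// Hypothetical helper: turn the offline queue into /api/scanner/sync request bodies.
function buildSyncRequests(deviceId, queuedScans, batchSize = 25) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const requests = [];
  for (let i = 0; i < queuedScans.length; i += batchSize) {
    requests.push({ deviceId, scans: queuedScans.slice(i, i + batchSize) });
  }
  return requests;
}

// Hypothetical helper: find the scans in a batch that the server reported as conflicts.
function findConflictedScans(batch, syncResponse) {
  const conflictedQrs = new Set((syncResponse.conflictDetails ?? []).map((c) => c.qr));
  return batch.filter((scan) => conflictedQrs.has(scan.qr));
}
```

The response shown above does not identify which scans failed, so a client would likely need to keep a whole batch queued whenever `failed` is greater than zero.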

Server-Side Rate Limiting

Redis-Based Rate Limiting

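// Rate limiting implementation (fixed window per device).
// Assumes `redis` is an ioredis-style client with incr/pexpire/pttl.
class RateLimitError extends Error {
  constructor(message, { retryAfter }) {
    super(message);
    this.name = 'RateLimitError';
    this.retryAfter = retryAfter; // seconds until the window resets
  }
}

const rateLimit = async (deviceId, windowMs = 1000, maxRequests = 8) => {
  const key = `rate_limit:${deviceId}`;
  const current = await redis.incr(key);

  // The first request in a window starts the expiry clock
  if (current === 1) {
    await redis.pexpire(key, windowMs);
  }

  // Time left in the current window; pttl is negative if the key has no expiry
  const ttlMs = Math.max(await redis.pttl(key), 0);

  if (current > maxRequests) {
    throw new RateLimitError('Rate limit exceeded', {
      retryAfter: Math.ceil(ttlMs / 1000)
    });
  }

  // resetTime reflects the actual end of the window, not now + windowMs
  return { remaining: maxRequests - current, resetTime: Date.now() + ttlMs };
};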

Device Abuse Tracking

const trackDeviceAbuse = async (deviceId, violation) => {
  const key = `abuse:${deviceId}`;
  const violations = await redis.get(key) || '[]';
  const parsed = JSON.parse(violations);
  
  parsed.push({
    type: violation.type, // 'rate_limit', 'invalid_qr_spam'  
    timestamp: Date.now(),
    details: violation.details
  });
  
  // Keep only last 24 hours of violations
  const oneDayAgo = Date.now() - 86400000;
  const recent = parsed.filter(v => v.timestamp > oneDayAgo);
  
  await redis.setex(key, 86400, JSON.stringify(recent));
  
  // Calculate escalating penalty
  const penalty = calculatePenalty(recent);
  if (penalty > 0) {
    await redis.setex(`penalty:${deviceId}`, penalty, '1');
  }
};
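
`calculatePenalty` is referenced above but not defined in this runbook. One plausible shape for an escalating penalty is exponential backoff with a cap, sketched below. Every number here (three free violations, a 30-second base, a one-hour cap) is an assumption for illustration, not the production policy.

```javascript
// Hypothetical escalating penalty: returns a block duration in whole seconds.
// The first `freeViolations` in the 24-hour window carry no penalty; after that
// the penalty doubles with each additional violation, up to `maxSeconds`.
function calculatePenalty(
  recentViolations,
  { freeViolations = 3, baseSeconds = 30, maxSeconds = 3600 } = {}
) {
  const excess = recentViolations.length - freeViolations;
  if (excess <= 0) {
    return 0;
  }
  return Math.min(baseSeconds * 2 ** (excess - 1), maxSeconds);
}
```

Returning an integer number of seconds keeps the result usable directly as the TTL argument of `redis.setex`.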

Monitoring and Alerting

Sentry Configuration

Error Monitoring Setup

// sentry.client.ts
import * as Sentry from '@sentry/react';

Sentry.init({
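  // Vite exposes VITE_-prefixed variables on import.meta.env, not process.env
  dsn: import.meta.env.VITE_SENTRY_DSN,
  environment: import.meta.env.MODE,
  integrations: [
    // Newer @sentry/react releases deprecate tracingOrigins in favor of
    // tracePropagationTargets; adjust to match the installed SDK version
    new Sentry.BrowserTracing({
      tracingOrigins: [/^https:\/\/.*\.blackcanyontickets\.com\/api/],
    }),
  ],
  tracesSampleRate: import.meta.env.PROD ? 0.1 : 1.0,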
  beforeSend(event) {
    // Filter out rate limiting errors as they're expected
    if (event.exception?.values?.[0]?.type === 'RateLimitError') {
      return null;
    }
    return event;
  }
});

Performance Monitoring

// Performance tracking in scanner
const trackScanPerformance = async (scanResult) => {
  // performance.memory is non-standard (Chromium only), so guard the access
  const heapUsed = performance.memory?.usedJSHeapSize ?? 0;

  // The Battery Status API is promise-based (navigator.getBattery()) and is
  // not available in every browser; navigator.battery does not exist
  const battery = await navigator.getBattery?.().catch(() => null);

  Sentry.addBreadcrumb({
    category: 'scanner',
    message: `Scan ${scanResult.result}`,
    level: 'info',
    data: {
      latency: scanResult.latency,
      offline: scanResult.offline,
      zone: scanResult.zone,
      deviceMemory: heapUsed
    }
  });

  // Track performance metrics
  Sentry.setTag('scanning_session', true);
  Sentry.setContext('device_info', {
    memory: heapUsed,
    connection: navigator.connection?.effectiveType,
    battery: battery?.level
  });
};

Alert Configuration

Critical Alerts (Immediate Response)

# Sentry Alert Rules
- name: "Scanner API Errors"
  conditions:
    - error_count > 10 in 5 minutes
  actions:
    - slack: "#incidents"
    - email: "oncall@blackcanyontickets.com"
  
- name: "High Rate Limiting"  
  conditions:
    - event.type = "RateLimitError"
    - count > 50 in 10 minutes
  actions:
    - slack: "#scanner-ops"

- name: "Sync Failures"
  conditions: 
    - event.message contains "sync_failed"
    - count > 25 in 15 minutes
  actions:
    - slack: "#scanner-ops"
    - pagerduty: "P1"

Performance Thresholds

// Client-side performance monitoring
const performanceMonitor = {
  scanLatency: {
    warning: 1000, // ms
    critical: 3000
  },
  memoryUsage: {
    warning: 50 * 1024 * 1024, // 50MB
    critical: 100 * 1024 * 1024 // 100MB
  },
  syncQueueSize: {
    warning: 50,
    critical: 100
  },
  offlineTime: {
    warning: 300000, // 5 minutes
    critical: 900000  // 15 minutes  
  }
};
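
The thresholds above only become useful once something compares live readings against them. A minimal sketch of that comparison follows; the function names `classifyMetric` and `classifyReadings` are hypothetical, and the choice to treat a value equal to a threshold as having reached it is an assumption.

```javascript
// Hypothetical classifier: compare one reading against a { warning, critical } pair.
function classifyMetric(value, { warning, critical }) {
  if (value >= critical) return 'critical';
  if (value >= warning) return 'warning';
  return 'ok';
}

// Classify every reading that has a matching threshold entry.
// `thresholds` has the same shape as the performanceMonitor object.
function classifyReadings(readings, thresholds) {
  const levels = {};
  for (const [name, value] of Object.entries(readings)) {
    if (thresholds[name]) {
      levels[name] = classifyMetric(value, thresholds[name]);
    }
  }
  return levels;
}
```

For example, `classifyReadings({ scanLatency: 1200, syncQueueSize: 10 }, performanceMonitor)` would flag scan latency as a warning while leaving the sync queue at `ok`.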

Dashboard URLs and Metrics

Grafana Dashboards

Scanner Operations: https://grafana.bct.com/d/scanner-ops
- Real-time scan rates by device/zone
- Network latency and error rates
- Offline queue sizes and sync status
Performance Metrics: https://grafana.bct.com/d/scanner-perf
- Memory usage trends
- Battery life estimates
- Camera initialization times
Business Metrics: https://grafana.bct.com/d/scanner-business
- Entry throughput by gate
- Peak scanning times
- Duplicate/invalid ticket rates

Real-Time Monitoring

// Health check endpoint
GET /api/scanner/health
{
  "status": "healthy",
  "checks": {
    "database": "ok",
    "redis": "ok", 
    "camera_api": "ok"
  },
  "metrics": {
    "active_devices": 15,
    "scans_per_minute": 245,
    "pending_syncs": 12,
    "conflict_rate": 0.02
  }
}

Network Requirements and Quality

WiFi Configuration

Recommended WiFi Setup

Network Requirements:
  - SSID: "BCT-Staff" (dedicated for scanners)
  - Security: WPA3-Enterprise (or WPA2-Personal minimum)
  - Bandwidth: 10 Mbps minimum, 50 Mbps recommended
  - Latency: <100ms to API servers
  - Coverage: -65 dBm minimum signal at all gate locations

Quality of Service (QoS):
  - Scanner traffic priority: High
  - API requests: TCP/443 (HTTPS)  
  - Sync operations: TCP/443 (HTTPS)
  - Background sync: TCP/443 (HTTPS)

Network Monitoring

#!/bin/bash
# WiFi quality testing script
WIFI_SSID="BCT-Staff"
API_ENDPOINT="https://api.blackcanyontickets.com/health"

echo "Testing WiFi Quality for Scanner Operations"
echo "==========================================="

# Signal strength test (dBm values are negative; anything below -65 is too weak)
SIGNAL=$(iwconfig wlan0 | grep 'Signal level' | awk '{print $4}' | cut -d= -f2)
echo "Signal Strength: $SIGNAL dBm"
if [ "$SIGNAL" -lt -65 ]; then
  echo "⚠️ Warning: Signal strength may affect scanning"
fi

# Latency test  
PING=$(ping -c 5 api.blackcanyontickets.com | tail -1 | awk '{print $4}' | cut -d/ -f2)
echo "Average Latency: ${PING}ms"
if (( $(echo "$PING > 200" | bc -l) )); then
  echo "⚠️ Warning: High latency may slow scanning"
fi

# Throughput test
echo "Testing API response time..."
API_TIME=$(curl -o /dev/null -s -w "%{time_total}" $API_ENDPOINT)
echo "API Response Time: ${API_TIME}s"
if (( $(echo "$API_TIME > 2.0" | bc -l) )); then
  echo "❌ Critical: API too slow for real-time scanning"
fi

Cellular Fallback Configuration

Carrier Requirements

Cellular Backup:
  - Carriers: Verizon, AT&T, T-Mobile (multi-carrier preferred)
  - Data Plans: Unlimited or 10GB+ per device per event
  - Coverage: -85 dBm minimum LTE signal
  - Speeds: 5 Mbps down, 1 Mbps up minimum

Network Priority:
  1. WiFi (primary)
  2. 5G/LTE (automatic fallback)
  3. Offline mode (complete network failure)

Network Handoff Testing

// Network quality detection
const detectNetworkQuality = () => {
  const connection = navigator.connection || navigator.mozConnection || navigator.webkitConnection;
  
  return {
    type: connection?.effectiveType || 'unknown', // '4g', '3g', 'slow-2g'
    downlink: connection?.downlink || 0, // Mbps
    rtt: connection?.rtt || 0, // ms round trip time
    saveData: connection?.saveData || false
  };
};

// Adaptive behavior based on connection.
// `quality` is the object returned by detectNetworkQuality(), which stores
// the effective connection type under `type`.
const adaptToNetworkQuality = (quality) => {
  if (quality.type === 'slow-2g' || quality.type === '2g') {
    // Reduce sync frequency, smaller batches
    return { syncInterval: 30000, batchSize: 5 };
  } else if (quality.type === '3g') {
    return { syncInterval: 10000, batchSize: 10 };
  } else {
    // 4G or better (or unknown) - full speed
    return { syncInterval: 5000, batchSize: 25 };
  }
};

Performance Baselines and SLAs

Expected Performance Metrics

Scanning Operations

Scan Processing:
  - QR Detection Time: <500ms (camera to recognition)
  - API Verification Time: <1000ms (network request)
  - UI Response Time: <100ms (result display)
  - Total Scan Time: <3 seconds (end-to-end)

Throughput:
  - Peak Rate: 8 scans/second (rate limit)
  - Sustained Rate: 4-6 scans/second per device
  - Concurrent Devices: 20+ devices per event
  - Entry Processing: <10 seconds per attendee (including interaction)

Device Performance

Memory Usage:
  - Initial Load: <15MB heap size
  - After 100 scans: <25MB heap size
  - After 1000 scans: <40MB heap size
  - Memory Growth Rate: <20MB per hour

Battery Life:
  - Continuous Scanning: 4+ hours minimum
  - Standby with App Open: 8+ hours
  - Background Sync Impact: <10% additional drain
  - Flashlight Usage: -25% battery life

CPU Performance:
  - Camera Processing: <30% CPU usage
  - QR Detection: <50% CPU burst (brief)
  - Background Sync: <10% CPU usage
  - Thermal Throttling: Graceful degradation

Network Performance

Sync Operations:
  - Single Scan Sync: <500ms
  - Batch Sync (25 scans): <2 seconds
  - Full Queue Sync (100 scans): <10 seconds  
  - Conflict Resolution: <1 second per conflict

Offline Capabilities:
  - Queue Capacity: 1000+ scans per device
  - Storage Persistence: Survives app/browser restart
  - Sync Success Rate: >99% when network restored
  - Data Integrity: Zero scan loss during offline operation

Service Level Agreements (SLAs)

Availability Targets

System Availability: 99.9% during event hours
- API Uptime: 99.95%  
- Database Uptime: 99.99%
- CDN Availability: 99.9%
- SSL Certificate: 99.99%

Response Time Targets:
- Scanner API: <1 second (95th percentile)
- Health Checks: <200ms (99th percentile)  
- Sync Operations: <5 seconds (99th percentile)

Error Rate Targets:
- False Rejections: <0.5% (valid tickets rejected)
- False Acceptances: <0.1% (invalid tickets accepted)
- Sync Failures: <1% of total sync operations
- Conflict Rate: <2% of offline scans
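
To make the availability targets concrete, it helps to convert a percentage into an allowed-downtime budget for a given event. For a 6-hour event at 99.9% availability the budget is 6 × 3600 × 0.001 = 21.6 seconds. The helper below, with the hypothetical name `allowedDowntimeSeconds`, performs the same arithmetic.

```javascript
// Convert an availability target into a downtime budget for a time window.
// Hypothetical helper for planning; not part of the scanner codebase.
function allowedDowntimeSeconds(availabilityPercent, windowHours) {
  if (availabilityPercent < 0 || availabilityPercent > 100) {
    throw new RangeError('availabilityPercent must be between 0 and 100');
  }
  return windowHours * 3600 * (1 - availabilityPercent / 100);
}
```

Because of floating-point rounding the result may differ from the hand calculation in the last decimal places, so round it before displaying.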

Performance Degradation Response

Performance Tiers:
  Tier 1 (Green - Normal):
    - Scan latency: <1 second
    - Memory usage: <40MB
    - Battery life: >4 hours
    - Action: Monitor only

  Tier 2 (Yellow - Degraded):  
    - Scan latency: 1-3 seconds
    - Memory usage: 40-70MB
    - Battery life: 2-4 hours
    - Action: Alert on-call, prepare mitigation

  Tier 3 (Red - Critical):
    - Scan latency: >3 seconds
    - Memory usage: >70MB
    - Battery life: <2 hours  
    - Action: Immediate response, activate backup procedures

Escalation Procedures and Contacts

Technical Escalation Matrix

Severity Levels

P1 - Business Critical (Response: 5 minutes):
  - Complete scanner system failure
  - Database corruption affecting scan verification
  - Security breach or data leak
  - >50% of devices unable to scan

P2 - High Impact (Response: 15 minutes):
  - Single API endpoint failure
  - Performance degradation affecting >25% devices
  - Network sync failures >10 minutes
  - Rate limiting preventing legitimate scans

P3 - Medium Impact (Response: 1 hour):
  - Individual device issues (hardware/software)
  - Non-critical API errors
  - Monitoring/alerting issues
  - Minor UI/UX problems

P4 - Low Impact (Response: 4 hours):  
  - Enhancement requests
  - Documentation updates
  - Non-urgent performance optimization
  - Cosmetic UI issues

Contact Directory

Primary On-Call (24/7 During Events)

Platform Engineering Lead:
  - Name: [REDACTED]
  - Phone: [REDACTED]  
  - Email: oncall-platform@blackcanyontickets.com
  - Slack: @platform-oncall
  - Responsibilities: API issues, database problems, sync failures

DevOps Engineer:
  - Name: [REDACTED]
  - Phone: [REDACTED]
  - Email: oncall-devops@blackcanyontickets.com  
  - Slack: @devops-oncall
  - Responsibilities: Infrastructure, networking, deployment issues

Frontend Engineering Lead:
  - Name: [REDACTED]
  - Phone: [REDACTED]
  - Email: oncall-frontend@blackcanyontickets.com
  - Slack: @frontend-oncall  
  - Responsibilities: PWA issues, camera problems, UI/UX bugs

Secondary/Escalation Contacts

Engineering Manager:
  - Phone: [REDACTED]
  - When: P1 incidents >30 minutes, team coordination needed

CTO:  
  - Phone: [REDACTED]
  - When: P1 incidents >60 minutes, business decisions needed

CEO:
  - Phone: [REDACTED]  
  - When: Business-critical failures, external communication needed

Vendor/External Support

Supabase Support:
  - Contact: support@supabase.com
  - SLA: 4 hours (Pro Plan)
  - Phone: Emergency hotline available

Sentry Support:
  - Contact: support@sentry.io
  - SLA: 8 hours (Business Plan)
  - Documentation: docs.sentry.io

Vercel Support:
  - Contact: support@vercel.com
  - SLA: 24 hours (Pro Plan)
  - Status Page: vercel-status.com

Escalation Procedures

P1 Incident Response

Step 1 (0-5 minutes):
  - Acknowledge incident in #incidents Slack channel
  - Start incident bridge call/video chat
  - Assign incident commander (usually platform lead)
  - Begin status page updates

Step 2 (5-15 minutes):
  - Assess scope and impact
  - Implement immediate mitigation (fallback procedures)
  - Escalate to Engineering Manager if needed
  - Communicate with venue/operations team

Step 3 (15-30 minutes):  
  - Deploy fixes or activate backup systems
  - Monitor recovery and impact reduction
  - Update stakeholders every 10 minutes
  - Document timeline and actions taken

Step 4 (Post-resolution):
  - Conduct immediate hot wash (15 minutes)
  - Schedule full post-mortem within 48 hours
  - Update runbooks based on lessons learned
  - Communicate resolution to all stakeholders

Common Escalation Scenarios

Complete Scanner Failure

Symptoms: All devices unable to scan, API returning errors
Immediate Actions:
  1. Check system status dashboard
  2. Verify database connectivity
  3. Activate manual entry procedures at gates
  4. Estimate impact and communicate timeline

Escalation Triggers:
  - Issue not resolved in 15 minutes → Engineering Manager
  - Manual entry activated → Operations Manager  
  - Expected duration >1 hour → CTO

Mass Device Issues

Symptoms: >10 devices experiencing problems simultaneously
Immediate Actions:
  1. Check for CDN/deployment issues
  2. Verify service worker updates aren't causing problems
  3. Roll back recent deployments if necessary
  4. Distribute backup devices

Escalation Triggers:
  - >50% devices affected → Immediate escalation to Engineering Manager
  - Hardware-related issues → Venue operations team
  - Software issues persisting >20 minutes → CTO notification

Database/API Performance Issues

Symptoms: Slow scan response times, sync delays, timeouts
Immediate Actions:
  1. Check database performance metrics
  2. Review API response times and error rates
  3. Scale database resources if possible
  4. Enable aggressive client-side caching

Escalation Triggers:
  - Database CPU >90% for >5 minutes → DevOps immediate response
  - API latency >5 seconds → Platform Engineering lead
  - Unable to scale resources → CTO (budget approval needed)

Security Considerations

Data Protection and Privacy

Local Data Storage

IndexedDB Contents:
  - Scan Queue: QR codes, timestamps, results (temporary)
  - Device Settings: Zone, preferences (non-sensitive)
  - Conflict Log: Sync discrepancies (temporary)
  - No Storage: Customer PII, payment info, passwords

Data Retention:
  - Scan Queue: Auto-purged after successful sync
  - Settings: Persisted until manual reset
  - Conflicts: Retained 24 hours for review
  - Error Logs: Purged every 7 days

Network Security

API Communication:
  - Protocol: HTTPS only (TLS 1.3 preferred)
  - Authentication: JWT tokens with 2-hour expiration
  - Rate Limiting: 8 requests/second per device
  - Input Validation: All QR codes validated server-side

Content Security Policy:
  - script-src: 'self' cdn.sentry.io
  - connect-src: 'self' *.supabase.co *.sentry.io
  - img-src: 'self' data: https:
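
Permissions Policy:
  - camera: Allowed for 'self' (required for QR scanning; set via the Permissions-Policy header, since camera is not a CSP directive)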
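
With JWTs that expire after two hours, a device in use all day needs to refresh its token before requests start failing. The sketch below shows one way a client might check the `exp` claim locally. The function names and the five-minute safety margin are assumptions; the check only reads the payload and does not verify the signature, which remains the server's job.

```javascript
// Decode the payload segment of a JWT (base64url-encoded JSON). No signature check.
function decodeJwtPayload(token) {
  const segment = token.split('.')[1];
  const base64 = segment.replace(/-/g, '+').replace(/_/g, '/');
  const padded = base64.padEnd(Math.ceil(base64.length / 4) * 4, '=');
  return JSON.parse(atob(padded));
}

// Hypothetical helper: true when the token is malformed, expired, or within
// `skewSeconds` of expiring. `exp` is seconds since the Unix epoch.
function tokenNeedsRefresh(token, nowMs = Date.now(), skewSeconds = 300) {
  try {
    const { exp } = decodeJwtPayload(token);
    if (typeof exp !== 'number') return true;
    return nowMs / 1000 >= exp - skewSeconds;
  } catch {
    return true; // treat anything unparseable as needing a fresh token
  }
}
```

Calling `tokenNeedsRefresh` before each sync batch would let the client renew its token proactively instead of reacting to a 401 at the gate.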

Device Security

Client-Side Protection:
  - No API keys stored in client code
  - Device fingerprinting (non-PII) for abuse tracking
  - Local encryption of sensitive scan data
  - Secure session management

Physical Security:
  - Device lock screen during breaks
  - Remote wipe capabilities if stolen
  - Tamper detection for unusual usage patterns
  - Secure storage when not in use

Vulnerability Management

Regular Security Assessments

Automated Scanning:
  - Daily: npm audit for dependency vulnerabilities
  - Weekly: OWASP ZAP scan of staging environment  
  - Monthly: Full penetration test of production

Manual Review:
  - Code Review: Required for all scanner-related changes
  - Security Review: Required for API changes
  - Access Review: Quarterly review of admin permissions

Incident Response Plan

Security Incident Types:
  1. Data Breach (customer PII exposed)
  2. Unauthorized Access (admin account compromise)  
  3. Service Disruption (DDoS, malicious traffic)
  4. Device Compromise (stolen/hacked scanner device)

Response Timeline:
  - Detection to Acknowledgment: <15 minutes
  - Initial Assessment: <30 minutes
  - Containment Actions: <1 hour
  - Customer Notification: <24 hours (if PII involved)

Maintenance and Updates

Regular Maintenance Tasks

Daily Operations

#!/bin/bash
# Daily health check script
echo "BCT Scanner Daily Health Check - $(date)"

# Check API health
curl -f https://api.blackcanyontickets.com/health || echo "❌ API health check failed"

# Check database connections
psql $DATABASE_URL -c "SELECT 1" || echo "❌ Database connection failed"

# Check error rates in Sentry
# (This would integrate with Sentry API to get error counts)

# Check device online status
# (Query database for devices that haven't synced in >24 hours)

Weekly Maintenance

System Updates:
  - Review and apply npm dependency updates
  - Update browser compatibility testing
  - Review performance metrics and trends
  - Clean up old scan logs and conflict data

Monitoring Review:
  - Review Sentry error trends
  - Analyze performance degradation patterns
  - Update alert thresholds based on data
  - Test escalation procedures

Monthly Maintenance

Security Updates:
  - Apply security patches to all dependencies
  - Review and rotate API keys/secrets
  - Update SSL certificates if needed
  - Review access logs for unusual patterns

Performance Optimization:
  - Analyze database query performance
  - Review and optimize API endpoints
  - Update caching strategies
  - Test load balancing configuration

Deployment and Rollback Procedures

Production Deployment

Deployment Process:
  1. Code Review: Minimum 2 approvals required
  2. Staging Testing: Full test suite + manual QA
  3. Blue-Green Deployment: Zero downtime deployment
  4. Gradual Rollout: 10% → 50% → 100% traffic
  5. Monitoring: 24-hour observation period

Rollback Triggers:  
  - Error rate >5% increase from baseline
  - Response time >2x normal latency
  - Critical functionality broken (scanning failure)
  - Security vulnerability discovered

Rollback Process:
  1. Immediate: Switch traffic to previous version (2 minutes)
  2. Communication: Alert all stakeholders (5 minutes)  
  3. Investigation: Root cause analysis (30 minutes)
  4. Fix Forward: Rapid bug fix deployment if possible

Emergency Hotfixes

Hotfix Criteria:
  - Security vulnerability (any severity)
  - Data corruption or loss
  - Complete service outage
  - Critical business function broken

Hotfix Process:
  1. Create hotfix branch from production
  2. Make minimal fix with tests
  3. Emergency code review (single approver)
  4. Direct to production deployment
  5. Post-deployment verification
  6. Full retrospective within 24 hours

Appendices

A. Browser Compatibility Matrix

Recommended Browsers:
  - Chrome 88+: ✅ Full support, best performance
  - Safari 14+: ✅ iOS support, some PWA limitations  
  - Firefox 85+: ✅ Good fallback, slower QR detection
  - Edge 88+: ✅ Windows support, Chromium-based

Required APIs:
  - getUserMedia (Camera): All supported browsers
  - IndexedDB: All supported browsers
  - Service Workers: All supported browsers  
  - BarcodeDetector: Chromium-based browsers only (availability varies by platform), ZXing fallback for others
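
Choosing between the native detector and the ZXing fallback comes down to feature detection at startup. A minimal sketch follows, in which the function name `pickQrDecoder` and its return labels are hypothetical. `BarcodeDetector.getSupportedFormats()` is the standard static method for listing supported formats, and `'qr_code'` is the format name it reports for QR codes.

```javascript
// Hypothetical decoder selection: prefer the native BarcodeDetector when it
// exists and reports QR support, otherwise fall back to a JS decoder such as ZXing.
// `env` defaults to the global object and can be stubbed in tests.
async function pickQrDecoder(env = globalThis) {
  const Detector = env.BarcodeDetector;
  if (typeof Detector === 'function' && typeof Detector.getSupportedFormats === 'function') {
    try {
      const formats = await Detector.getSupportedFormats();
      if (formats.includes('qr_code')) {
        return 'native';
      }
    } catch {
      // Fall through to the fallback if the query itself fails.
    }
  }
  return 'zxing-fallback';
}
```

Running this once at startup and caching the result avoids repeating the check on every camera frame.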

B. Database Schema Reference

-- Complete scanner database schema
-- See SCANNER_DATABASE_SCHEMA.sql for full implementation

C. Monitoring Queries

-- Active scanning sessions
SELECT 
  COUNT(DISTINCT device_id) as active_devices,
  COUNT(*) as total_scans,
  AVG(EXTRACT(EPOCH FROM (NOW() - scan_timestamp))) as avg_age_seconds
FROM scanner_logs 
WHERE scan_timestamp > NOW() - INTERVAL '1 hour';

-- Conflict analysis  
SELECT 
  offline_result,
  server_result, 
  COUNT(*) as conflicts
FROM scan_conflicts 
WHERE conflict_timestamp > NOW() - INTERVAL '24 hours'
GROUP BY offline_result, server_result;

D. Performance Benchmarking Scripts

# Device performance test script
# See SCANNER_PERFORMANCE_TEST.sh for full implementation

This technical runbook provides comprehensive guidance for IT administrators and technical staff to successfully deploy, monitor, and maintain the Scanner PWA system. Regular updates to this document should reflect operational lessons learned and system evolution.
