Add three critical documents for Scanner PWA production deployment:

1. STAGING_ROLLOUT_CHECKLIST.md - Main operational checklist
   - Pre-event setup procedures for IT/admin team
   - Staff device setup with PWA installation steps
   - Day-of operations and gate management protocols
   - Post-event data sync and cleanup procedures
   - Emergency fallback procedures and escalation contacts

2. STAFF_TRAINING_MATERIALS.md - Gate staff training resources
   - Step-by-step device setup for iOS/Android
   - Scanner operation guide with result interpretation
   - Troubleshooting guide for common issues
   - Professional smartphone usage tips for all-day events
   - Quick reference cards and emergency procedures

3. SCANNER_TECHNICAL_RUNBOOK.md - IT administrator guide
   - Complete system architecture and API documentation
   - Environment setup for staging/production deployment
   - Monitoring, alerting, and performance baseline configuration
   - Network requirements and quality management
   - Security considerations and vulnerability management
   - Escalation procedures and maintenance schedules

These documents provide complete operational readiness for Scanner PWA deployment, ensuring smooth gate operations with minimal day-of issues. Staff preparation procedures are designed for temporary/volunteer workers with clear, simple instructions and comprehensive emergency protocols.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
1037 lines
28 KiB
Markdown
# Scanner PWA Technical Runbook

## System Architecture Overview

The Black Canyon Tickets Scanner PWA is an offline-first Progressive Web App designed for gate staff to scan tickets with comprehensive abuse prevention and mobile optimization.

### Core Components
```
┌─────────────────────────────────────────────────────────────────┐
│                    Scanner PWA Architecture                     │
├─────────────────────────────────────────────────────────────────┤
│  Frontend (React PWA)           │  Backend APIs                 │
│  ├── Camera/QR Detection        │  ├── /api/tickets/verify      │
│  ├── Offline Queue (IndexedDB)  │  ├── /api/scans/log           │
│  ├── Background Sync (SW)       │  ├── /api/scanner/sync        │
│  ├── Rate Limiting Client       │  └── /api/scanner/conflicts   │
│  └── Abuse Prevention UI        │                               │
├─────────────────────────────────────────────────────────────────┤
│  Infrastructure                 │  Monitoring                   │
│  ├── Supabase/Firebase DB       │  ├── Sentry Error Tracking    │
│  ├── CDN (Vercel/Netlify)       │  ├── Performance Monitoring   │
│  ├── SSL/HTTPS (Required)       │  ├── Real-time Alerts         │
│  └── Service Worker Caching     │  └── Usage Analytics          │
└─────────────────────────────────────────────────────────────────┘
```

## Environment Setup

### Staging Environment Configuration

#### Environment Variables
```bash
# Database Configuration
VITE_SUPABASE_URL=https://staging-scanner.supabase.co
VITE_SUPABASE_ANON_KEY=eyJ...staging-key
SUPABASE_SERVICE_ROLE_KEY=eyJ...service-key

# Scanner API Configuration
VITE_SCANNER_API_URL=https://staging-api.blackcanyontickets.com
VITE_SCANNER_RATE_LIMIT=8
VITE_SCANNER_DEBOUNCE_MS=2000

# PWA Configuration
# Values containing spaces, parentheses, or "#" are quoted so they parse
# the same way in a shell and in a .env file.
VITE_PWA_NAME="BCT Scanner (Staging)"
VITE_PWA_SHORT_NAME="BCT Scanner"
VITE_PWA_THEME_COLOR="#1e40af"

# Monitoring Configuration
VITE_SENTRY_DSN=https://staging@sentry.io/project
SENTRY_ENVIRONMENT=staging
SENTRY_RELEASE=$VERCEL_GIT_COMMIT_SHA

# Feature Flags
VITE_SCANNER_OFFLINE_ENABLED=true
VITE_ABUSE_PREVENTION_ENABLED=true
VITE_DEVICE_TRACKING_ENABLED=true
```
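The variables above reach the client as plain strings, so a small validation step at startup can catch a missing or malformed value before devices are handed to staff. The following is a minimal sketch rather than the project's actual loader: `ScannerConfig`, `parseScannerConfig`, and `requirePositiveInt` are hypothetical names, and only the variable names are taken from the block above.

```typescript
// Hypothetical typed view of the scanner settings listed above.
interface ScannerConfig {
  apiUrl: string;
  rateLimitPerSecond: number;
  debounceMs: number;
  offlineEnabled: boolean;
}

type EnvSource = Record<string, string | undefined>;

// Reads one variable and insists on a positive integer.
function requirePositiveInt(env: EnvSource, key: string): number {
  const raw = env[key];
  const value = Number(raw);
  if (raw === undefined || !Number.isInteger(value) || value <= 0) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return value;
}

function parseScannerConfig(env: EnvSource): ScannerConfig {
  const apiUrl = env["VITE_SCANNER_API_URL"];
  // The camera and service worker only work over HTTPS, so reject anything else early.
  if (apiUrl === undefined || !apiUrl.startsWith("https://")) {
    throw new Error("VITE_SCANNER_API_URL must be an https:// URL");
  }
  return {
    apiUrl,
    rateLimitPerSecond: requirePositiveInt(env, "VITE_SCANNER_RATE_LIMIT"),
    debounceMs: requirePositiveInt(env, "VITE_SCANNER_DEBOUNCE_MS"),
    // Env values are strings, so only the literal "true" enables the flag.
    offlineEnabled: env["VITE_SCANNER_OFFLINE_ENABLED"] === "true",
  };
}
```

In a Vite build the argument would typically be `import.meta.env`; failing fast here is a design choice so that a misconfigured staging build is noticed at load time rather than at the gate.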

#### Deployment Configuration

**Vercel (Recommended)**
```json
{
  "buildCommand": "npm run build",
  "outputDirectory": "dist",
  "installCommand": "npm ci",
  "framework": "vite",
  "headers": [
    {
      "source": "/sw.js",
      "headers": [
        {
          "key": "Cache-Control",
          "value": "public, max-age=0, must-revalidate"
        }
      ]
    }
  ]
}
```

The Node.js version for the `api/**/*.ts` functions is not set in this file. Vercel reads it from the `engines.node` field in `package.json` (or the project settings); the `functions.*.runtime` key is intended for versioned community runtimes, and a value such as `nodejs18.x` is rejected at build time.

**Netlify Alternative**
```toml
[build]
  command = "npm run build"
  publish = "dist"

[[headers]]
  for = "/sw.js"
  [headers.values]
    Cache-Control = "public, max-age=0, must-revalidate"

[[headers]]
  for = "/manifest.json"
  [headers.values]
    Cache-Control = "public, max-age=86400"
```

### Production Environment

#### SSL/HTTPS Requirements
- **Camera API:** `getUserMedia()` is only available in secure contexts, so camera access requires HTTPS
- **Service Workers:** Registration requires HTTPS, and all PWA functionality depends on it
- **Geolocation:** HTTPS required for location services
- **Background Sync / Web Push:** Both run through the service worker and therefore also require HTTPS

#### Database Configuration
```sql
-- Scanner-specific tables
CREATE TABLE scanner_logs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  device_id VARCHAR(255) NOT NULL,
  event_id UUID NOT NULL,
  qr_code VARCHAR(500) NOT NULL,
  scan_result VARCHAR(50) NOT NULL, -- 'valid', 'invalid', 'already_scanned'
  scan_timestamp TIMESTAMPTZ DEFAULT NOW(),
  zone VARCHAR(100),
  sync_status VARCHAR(50) DEFAULT 'synced',
  created_at TIMESTAMPTZ DEFAULT NOW()
);

CREATE TABLE scan_conflicts (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  device_id VARCHAR(255) NOT NULL,
  qr_code VARCHAR(500) NOT NULL,
  offline_result VARCHAR(50) NOT NULL,
  server_result VARCHAR(50) NOT NULL,
  event_id UUID NOT NULL,
  conflict_timestamp TIMESTAMPTZ DEFAULT NOW(),
  resolution_status VARCHAR(50) DEFAULT 'pending'
);

-- Indexes for performance
CREATE INDEX idx_scanner_logs_device_event ON scanner_logs(device_id, event_id);
CREATE INDEX idx_scanner_logs_timestamp ON scanner_logs(scan_timestamp DESC);
CREATE INDEX idx_scan_conflicts_unresolved ON scan_conflicts(resolution_status)
  WHERE resolution_status = 'pending';
```

## API Endpoints and Integration

### Core Scanner APIs

#### 1. Ticket Verification Endpoint
```typescript
POST /api/tickets/verify
Content-Type: application/json
Authorization: Bearer {jwt_token}

{
  "qr": "TICKET_UUID_OR_CODE",
  "eventId": "event_uuid",
  "deviceId": "device_fingerprint",
  "zone": "Gate A"
}

// Responses
// Success (200)
{
  "valid": true,
  "ticket": {
    "eventTitle": "Sample Event",
    "ticketTypeName": "General Admission",
    "customerEmail": "customer@example.com",
    "seatNumber": "A-15" // if assigned seating
  }
}

// Already Scanned (200)
{
  "valid": false,
  "reason": "already_scanned",
  "scannedAt": "2024-01-01T18:30:00Z",
  "scannedBy": "device_abc123",
  "zone": "Main Entrance"
}

// Invalid (200)
{
  "valid": false,
  "reason": "invalid", // or "expired", "cancelled", "locked"
  "message": "Ticket not found or has been cancelled"
}

// Rate Limited (429)
{
  "error": "rate_limit_exceeded",
  "retryAfter": 5,
  "message": "Too many requests from device"
}
```
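Because every ticket outcome except rate limiting comes back as HTTP 200, the client has to branch on the response body rather than on the status code. The sketch below shows one way to model that with a discriminated union; the type names, `toGateAction`, and the label wording are illustrative assumptions, while the field names follow the payloads documented above.

```typescript
interface VerifyAccepted {
  valid: true;
  ticket: { eventTitle: string; ticketTypeName: string; customerEmail: string; seatNumber?: string };
}
interface VerifyRejected {
  valid: false;
  reason: string; // "already_scanned", "invalid", "expired", "cancelled", "locked"
  message?: string;
  scannedAt?: string;
  scannedBy?: string;
  zone?: string;
}
interface RateLimitedBody { error: string; retryAfter: number; message: string }

type GateAction =
  | { kind: "admit"; label: string }
  | { kind: "deny"; label: string }
  | { kind: "retry"; afterSeconds: number };

// Maps an API response to what the gate UI should do next (hypothetical helper).
function toGateAction(httpStatus: number, body: VerifyAccepted | VerifyRejected | RateLimitedBody): GateAction {
  if ("retryAfter" in body) {
    return { kind: "retry", afterSeconds: body.retryAfter };
  }
  if (httpStatus !== 200) {
    throw new Error(`Unexpected status ${httpStatus}`);
  }
  if (body.valid) {
    const seat = body.ticket.seatNumber ? `, seat ${body.ticket.seatNumber}` : "";
    return { kind: "admit", label: `${body.ticket.ticketTypeName}${seat}` };
  }
  if (body.reason === "already_scanned") {
    return { kind: "deny", label: `Already scanned at ${body.zone ?? "unknown gate"}` };
  }
  // Fall back to the server message, then to the raw reason code.
  return { kind: "deny", label: body.message ?? body.reason };
}
```

Keeping the mapping in one pure function makes it straightforward to test every documented response shape without a camera or a network connection.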

#### 2. Scan Logging Endpoint
```typescript
POST /api/scans/log
Content-Type: application/json
Authorization: Bearer {jwt_token}

{
  "deviceId": "device_fingerprint",
  "eventId": "event_uuid",
  "qr": "TICKET_CODE",
  "result": "valid", // valid, invalid, already_scanned
  "zone": "Gate A",
  "timestamp": "2024-01-01T18:30:00Z",
  "latency": 245, // ms
  "offline": false
}

// Response (202 Accepted)
{
  "logged": true,
  "scanId": "scan_uuid"
}
```

#### 3. Offline Sync Endpoint
```typescript
POST /api/scanner/sync
Content-Type: application/json
Authorization: Bearer {jwt_token}

{
  "deviceId": "device_fingerprint",
  "scans": [
    {
      "qr": "TICKET_CODE_1",
      "eventId": "event_uuid",
      "result": "valid",
      "timestamp": "2024-01-01T18:30:00Z",
      "zone": "Gate A"
    }
    // ... more scans
  ]
}

// Response (200)
{
  "synced": 15,
  "conflicts": 2,
  "failed": 0,
  "conflictDetails": [
    {
      "qr": "TICKET_CODE_X",
      "offlineResult": "valid",
      "serverResult": "already_scanned",
      "conflictId": "conflict_uuid"
    }
  ]
}
```
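When a device comes back online, its queued scans need to be split into request bodies of the shape shown above. The following sketch illustrates one possible batching step; `QueuedScan`, `SyncRequest`, and `buildSyncRequests` are hypothetical names, and it assumes all timestamps are ISO-8601 UTC strings in the same format, so that plain string comparison orders them chronologically.

```typescript
interface QueuedScan {
  qr: string;
  eventId: string;
  result: "valid" | "invalid" | "already_scanned";
  timestamp: string; // ISO-8601 UTC, e.g. "2024-01-01T18:30:00Z"
  zone: string;
}

interface SyncRequest {
  deviceId: string;
  scans: QueuedScan[];
}

// Splits the offline queue into sync payloads of at most batchSize scans each.
function buildSyncRequests(deviceId: string, queue: QueuedScan[], batchSize: number): SyncRequest[] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error("batchSize must be a positive integer");
  }
  // Copy before sorting so the caller's queue is left untouched.
  // Oldest first, so the server sees scans in the order they happened.
  const ordered = queue.slice().sort((a, b) =>
    a.timestamp < b.timestamp ? -1 : a.timestamp > b.timestamp ? 1 : 0
  );
  const requests: SyncRequest[] = [];
  for (let i = 0; i < ordered.length; i += batchSize) {
    requests.push({ deviceId, scans: ordered.slice(i, i + batchSize) });
  }
  return requests;
}
```

Sending the oldest scans first is a design choice: it keeps the server's view of "who scanned first" as close as possible to reality, which matters when two devices scanned the same ticket while offline.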

### Server-Side Rate Limiting

#### Redis-Based Rate Limiting
```javascript
// Fixed-window rate limiter.
// Assumes `redis` is an ioredis-style client and RateLimitError is defined elsewhere.
const rateLimit = async (deviceId, windowMs = 1000, maxRequests = 8) => {
  const key = `rate_limit:${deviceId}`;
  const current = await redis.incr(key);

  // Start the window on the first hit. Also repair a key left without a TTL
  // (PTTL returns -1) if an earlier call failed between INCR and PEXPIRE.
  let ttlMs = await redis.pttl(key);
  if (current === 1 || ttlMs === -1) {
    await redis.pexpire(key, windowMs);
    ttlMs = windowMs;
  }

  if (current > maxRequests) {
    throw new RateLimitError('Rate limit exceeded', {
      retryAfter: Math.ceil(ttlMs / 1000) // seconds until the window resets
    });
  }

  // resetTime reflects the time left in the current window, not a full new window
  return { remaining: maxRequests - current, resetTime: Date.now() + ttlMs };
};
```
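The fixed-window logic above can be exercised without Redis by swapping the counter for an in-memory map and passing the clock in explicitly. This is only a sketch for reasoning about the behavior (for example in unit tests); `createFixedWindowLimiter` and `LimitDecision` are hypothetical names and not part of the service.

```typescript
interface LimitDecision {
  allowed: boolean;
  remaining: number;
  retryAfterMs: number;
}

// In-memory stand-in for the Redis counter; the caller supplies the current time.
function createFixedWindowLimiter(maxRequests: number, windowMs: number) {
  const windows = new Map<string, { count: number; startedAt: number }>();

  return function check(deviceId: string, nowMs: number): LimitDecision {
    let current = windows.get(deviceId);
    // Open a fresh window on the first request or once the old one has expired.
    if (current === undefined || nowMs - current.startedAt >= windowMs) {
      current = { count: 0, startedAt: nowMs };
      windows.set(deviceId, current);
    }
    current.count += 1;
    const allowed = current.count <= maxRequests;
    return {
      allowed,
      remaining: Math.max(0, maxRequests - current.count),
      retryAfterMs: allowed ? 0 : current.startedAt + windowMs - nowMs,
    };
  };
}
```

With the documented defaults (8 requests per 1000 ms), the ninth request inside a window is refused and told how long remains until the window resets. A fixed window can admit up to twice the limit across a window boundary, which is a known trade-off of this simple scheme.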

#### Device Abuse Tracking
```javascript
// Records a violation for a device and applies an escalating lockout.
// calculatePenalty() is defined elsewhere and is expected to return seconds (0 = no penalty).
const trackDeviceAbuse = async (deviceId, violation) => {
  const key = `abuse:${deviceId}`;
  const violations = await redis.get(key) || '[]';
  const parsed = JSON.parse(violations);

  parsed.push({
    type: violation.type, // 'rate_limit', 'invalid_qr_spam'
    timestamp: Date.now(),
    details: violation.details
  });

  // Keep only last 24 hours of violations
  const oneDayAgo = Date.now() - 86400000;
  const recent = parsed.filter(v => v.timestamp > oneDayAgo);

  await redis.setex(key, 86400, JSON.stringify(recent));

  // Calculate escalating penalty
  const penalty = calculatePenalty(recent);
  if (penalty > 0) {
    // The penalty key expires by itself after `penalty` seconds
    await redis.setex(`penalty:${deviceId}`, penalty, '1');
  }
};
```
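The runbook does not define `calculatePenalty`. Purely as an illustration of what an "escalating penalty" could look like, the sketch below doubles the lockout for each violation beyond a small grace allowance and caps it at one hour. The grace count, base duration, and cap are all assumptions chosen for the example, not values from the production system.

```typescript
interface Violation {
  type: string; // e.g. "rate_limit", "invalid_qr_spam"
  timestamp: number;
  details?: string;
}

// All three constants are illustrative assumptions.
const FREE_VIOLATIONS = 2;        // violations tolerated before any lockout
const BASE_PENALTY_SECONDS = 30;  // lockout for the first penalized violation
const MAX_PENALTY_SECONDS = 3600; // never lock a gate device for more than an hour

// Returns the lockout in seconds for the violations seen in the last 24 hours.
function calculatePenalty(recent: Violation[]): number {
  const excess = recent.length - FREE_VIOLATIONS;
  if (excess <= 0) {
    return 0;
  }
  // 30s, 60s, 120s, ... doubling with each further violation.
  const penalty = BASE_PENALTY_SECONDS * Math.pow(2, excess - 1);
  return Math.min(penalty, MAX_PENALTY_SECONDS);
}
```

The return value is in seconds because the caller passes it straight to `setex` as the TTL of the penalty key. Capping the penalty matters operationally: a misbehaving camera should not take a gate device out of service for the rest of the event.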

## Monitoring and Alerting

### Sentry Configuration

#### Error Monitoring Setup
```javascript
// sentry.client.ts
import * as Sentry from '@sentry/react';

Sentry.init({
  // Vite exposes VITE_-prefixed variables on import.meta.env;
  // process.env is not defined in the browser bundle.
  dsn: import.meta.env.VITE_SENTRY_DSN,
  environment: import.meta.env.MODE,
  integrations: [
    // Older SDK API; newer @sentry/react releases replace this with
    // Sentry.browserTracingIntegration() and the tracePropagationTargets option.
    new Sentry.BrowserTracing({
      tracingOrigins: [/^https:\/\/.*\.blackcanyontickets\.com\/api/],
    }),
  ],
  tracesSampleRate: import.meta.env.PROD ? 0.1 : 1.0,
  beforeSend(event) {
    // Filter out rate limiting errors as they're expected
    if (event.exception?.values?.[0]?.type === 'RateLimitError') {
      return null;
    }
    return event;
  }
});
```

#### Performance Monitoring
```javascript
// Performance tracking in scanner
const trackScanPerformance = async (scanResult) => {
  // performance.memory is non-standard (Chromium only), so fall back to 0
  const usedHeap = performance.memory?.usedJSHeapSize ?? 0;

  Sentry.addBreadcrumb({
    category: 'scanner',
    message: `Scan ${scanResult.result}`,
    level: 'info',
    data: {
      latency: scanResult.latency,
      offline: scanResult.offline,
      zone: scanResult.zone,
      deviceMemory: usedHeap
    }
  });

  // Battery Status API: navigator.getBattery() returns a promise and is not
  // available in every browser, so the level is treated as optional.
  const battery = navigator.getBattery ? await navigator.getBattery() : null;

  // Track performance metrics
  Sentry.setTag('scanning_session', true);
  Sentry.setContext('device_info', {
    memory: usedHeap,
    connection: navigator.connection?.effectiveType,
    battery: battery?.level
  });
};
```

### Alert Configuration

#### Critical Alerts (Immediate Response)
```yaml
# Sentry Alert Rules
- name: "Scanner API Errors"
  conditions:
    - error_count > 10 in 5 minutes
  actions:
    - slack: "#incidents"
    - email: "oncall@blackcanyontickets.com"

# Note: the client SDK drops RateLimitError in beforeSend, so this rule only
# sees events reported from other sources (for example, the API servers).
- name: "High Rate Limiting"
  conditions:
    - event.type = "RateLimitError"
    - count > 50 in 10 minutes
  actions:
    - slack: "#scanner-ops"

- name: "Sync Failures"
  conditions:
    - event.message contains "sync_failed"
    - count > 25 in 15 minutes
  actions:
    - slack: "#scanner-ops"
    - pagerduty: "P1"
```

#### Performance Thresholds
```javascript
// Client-side performance monitoring
const performanceMonitor = {
  scanLatency: {
    warning: 1000, // ms
    critical: 3000
  },
  memoryUsage: {
    warning: 50 * 1024 * 1024, // 50MB
    critical: 100 * 1024 * 1024 // 100MB
  },
  syncQueueSize: {
    warning: 50,
    critical: 100
  },
  offlineTime: {
    warning: 300000, // 5 minutes
    critical: 900000 // 15 minutes
  }
};
```

### Dashboard URLs and Metrics

#### Grafana Dashboards
- **Scanner Operations**: `https://grafana.bct.com/d/scanner-ops`
  - Real-time scan rates by device/zone
  - Network latency and error rates
  - Offline queue sizes and sync status

- **Performance Metrics**: `https://grafana.bct.com/d/scanner-perf`
  - Memory usage trends
  - Battery life estimates
  - Camera initialization times

- **Business Metrics**: `https://grafana.bct.com/d/scanner-business`
  - Entry throughput by gate
  - Peak scanning times
  - Duplicate/invalid ticket rates

#### Real-Time Monitoring
```javascript
// Health check endpoint
GET /api/scanner/health
{
  "status": "healthy",
  "checks": {
    "database": "ok",
    "redis": "ok",
    "camera_api": "ok"
  },
  "metrics": {
    "active_devices": 15,
    "scans_per_minute": 245,
    "pending_syncs": 12,
    "conflict_rate": 0.02
  }
}
```

## Network Requirements and Quality

### WiFi Configuration

#### Recommended WiFi Setup
```yaml
Network Requirements:
  - SSID: "BCT-Staff" (dedicated for scanners)
  - Security: WPA3-Enterprise (or WPA2-Personal minimum)
  - Bandwidth: 10 Mbps minimum, 50 Mbps recommended
  - Latency: <100ms to API servers
  - Coverage: -65 dBm minimum signal at all gate locations

Quality of Service (QoS):
  - Scanner traffic priority: High
  - API requests: TCP/443 (HTTPS)
  - Sync operations: TCP/443 (HTTPS)
  - Background sync: TCP/443 (HTTPS)
```

#### Network Monitoring
```bash
#!/bin/bash
# WiFi quality testing script
# Assumes a Linux host with iwconfig, ping, curl, and bc, and a WiFi interface named wlan0.
WIFI_SSID="BCT-Staff"
API_ENDPOINT="https://api.blackcanyontickets.com/health"

echo "Testing WiFi Quality for Scanner Operations"
echo "==========================================="
echo "Expected SSID: $WIFI_SSID"

# Signal strength test (dBm values are negative; closer to 0 is stronger)
SIGNAL=$(iwconfig wlan0 | grep 'Signal level' | awk '{print $4}' | cut -d= -f2)
echo "Signal Strength: $SIGNAL dBm"
if [ -n "$SIGNAL" ] && [ "$SIGNAL" -lt -65 ]; then
  echo "⚠️ Warning: Signal strength may affect scanning"
fi

# Latency test
PING=$(ping -c 5 api.blackcanyontickets.com | tail -1 | awk '{print $4}' | cut -d/ -f2)
echo "Average Latency: ${PING}ms"
if (( $(echo "$PING > 200" | bc -l) )); then
  echo "⚠️ Warning: High latency may slow scanning"
fi

# API response time test
echo "Testing API response time..."
API_TIME=$(curl -o /dev/null -s -w "%{time_total}" "$API_ENDPOINT")
echo "API Response Time: ${API_TIME}s"
if (( $(echo "$API_TIME > 2.0" | bc -l) )); then
  echo "❌ Critical: API too slow for real-time scanning"
fi
```

### Cellular Fallback Configuration

#### Carrier Requirements
```yaml
Cellular Backup:
  - Carriers: Verizon, AT&T, T-Mobile (multi-carrier preferred)
  - Data Plans: Unlimited or 10GB+ per device per event
  - Coverage: -85 dBm minimum LTE signal
  - Speeds: 5 Mbps down, 1 Mbps up minimum

Network Priority:
  1. WiFi (primary)
  2. 5G/LTE (automatic fallback)
  3. Offline mode (complete network failure)
```

#### Network Handoff Testing
```javascript
// Network quality detection
const detectNetworkQuality = () => {
  const connection = navigator.connection || navigator.mozConnection || navigator.webkitConnection;

  return {
    type: connection?.effectiveType || 'unknown', // '4g', '3g', '2g', 'slow-2g'
    downlink: connection?.downlink || 0, // Mbps
    rtt: connection?.rtt || 0, // ms round trip time
    saveData: connection?.saveData || false
  };
};

// Adaptive behavior based on connection
// `quality` is the object returned by detectNetworkQuality(), which exposes
// the effective connection type under the `type` key.
const adaptToNetworkQuality = (quality) => {
  if (quality.type === 'slow-2g' || quality.type === '2g') {
    // Reduce sync frequency, smaller batches
    return { syncInterval: 30000, batchSize: 5 };
  } else if (quality.type === '3g') {
    return { syncInterval: 10000, batchSize: 10 };
  } else {
    // 4G or better - full speed
    return { syncInterval: 5000, batchSize: 25 };
  }
};
```

## Performance Baselines and SLAs

### Expected Performance Metrics

#### Scanning Operations
```yaml
Scan Processing:
  - QR Detection Time: <500ms (camera to recognition)
  - API Verification Time: <1000ms (network request)
  - UI Response Time: <100ms (result display)
  - Total Scan Time: <3 seconds (end-to-end)

Throughput:
  - Peak Rate: 8 scans/second (rate limit)
  - Sustained Rate: 4-6 scans/second per device
  - Concurrent Devices: 20+ devices per event
  - Entry Processing: <10 seconds per attendee (including interaction)
```

#### Device Performance
```yaml
Memory Usage:
  - Initial Load: <15MB heap size
  - After 100 scans: <25MB heap size
  - After 1000 scans: <40MB heap size
  - Memory Growth Rate: <20MB per hour

Battery Life:
  - Continuous Scanning: 4+ hours minimum
  - Standby with App Open: 8+ hours
  - Background Sync Impact: <10% additional drain
  - Flashlight Usage: -25% battery life

CPU Performance:
  - Camera Processing: <30% CPU usage
  - QR Detection: <50% CPU burst (brief)
  - Background Sync: <10% CPU usage
  - Thermal Throttling: Graceful degradation
```

#### Network Performance
```yaml
Sync Operations:
  - Single Scan Sync: <500ms
  - Batch Sync (25 scans): <2 seconds
  - Full Queue Sync (100 scans): <10 seconds
  - Conflict Resolution: <1 second per conflict

Offline Capabilities:
  - Queue Capacity: 1000+ scans per device
  - Storage Persistence: Survives app/browser restart
  - Sync Success Rate: >99% when network restored
  - Data Integrity: Zero scan loss during offline operation
```

### Service Level Agreements (SLAs)

#### Availability Targets
```yaml
System Availability: 99.9% during event hours
  - API Uptime: 99.95%
  - Database Uptime: 99.99%
  - CDN Availability: 99.9%
  - SSL Certificate: 99.99%

Response Time Targets:
  - Scanner API: <1 second (95th percentile)
  - Health Checks: <200ms (99th percentile)
  - Sync Operations: <5 seconds (99th percentile)

Error Rate Targets:
  - False Rejections: <0.5% (valid tickets rejected)
  - False Acceptances: <0.1% (invalid tickets accepted)
  - Sync Failures: <1% of total sync operations
  - Conflict Rate: <2% of offline scans
```
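Availability percentages are easier to reason about as a downtime budget. As a worked example, 99.9% availability over an 8-hour event leaves about (1 - 0.999) × 8 × 3600 ≈ 28.8 seconds of allowed downtime; the 8-hour event length is an assumption for illustration, and `allowedDowntimeSeconds` is a hypothetical helper.

```typescript
// Converts an availability target into the downtime it permits over a time window.
function allowedDowntimeSeconds(availability: number, windowHours: number): number {
  if (availability < 0 || availability > 1) {
    throw new Error("availability must be a fraction between 0 and 1");
  }
  if (windowHours < 0) {
    throw new Error("windowHours must not be negative");
  }
  return (1 - availability) * windowHours * 3600;
}

// Budgets for an assumed 8-hour event, using the targets listed above.
const systemBudget = allowedDowntimeSeconds(0.999, 8);   // about 28.8 seconds
const apiBudget = allowedDowntimeSeconds(0.9995, 8);     // about 14.4 seconds
const databaseBudget = allowedDowntimeSeconds(0.9999, 8); // about 2.9 seconds
```

Seen this way, a single API restart during gate opening can consume the whole budget for the event, which is why the offline queue is the primary mitigation rather than an edge case.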

#### Performance Degradation Response
```yaml
Performance Tiers:
  Tier 1 (Green - Normal):
    - Scan latency: <1 second
    - Memory usage: <40MB
    - Battery life: >4 hours
    - Action: Monitor only

  Tier 2 (Yellow - Degraded):
    - Scan latency: 1-3 seconds
    - Memory usage: 40-70MB
    - Battery life: 2-4 hours
    - Action: Alert on-call, prepare mitigation

  Tier 3 (Red - Critical):
    - Scan latency: >3 seconds
    - Memory usage: >70MB
    - Battery life: <2 hours
    - Action: Immediate response, activate backup procedures
```
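The tier table above can be applied mechanically by taking the worst tier across the three measurements. The sketch below assumes that reading (the runbook does not say how mixed readings combine), and `DeviceSample` and `tierFor` are hypothetical names.

```typescript
type Tier = 1 | 2 | 3;

interface DeviceSample {
  scanLatencyMs: number;
  memoryMb: number;
  batteryHoursLeft: number;
}

// Returns the worst tier reached by any single metric (assumed combination rule).
function tierFor(sample: DeviceSample): Tier {
  const latencyTier: Tier = sample.scanLatencyMs > 3000 ? 3 : sample.scanLatencyMs >= 1000 ? 2 : 1;
  const memoryTier: Tier = sample.memoryMb > 70 ? 3 : sample.memoryMb >= 40 ? 2 : 1;
  // Battery is inverted: less remaining time is worse.
  const batteryTier: Tier = sample.batteryHoursLeft < 2 ? 3 : sample.batteryHoursLeft <= 4 ? 2 : 1;
  return Math.max(latencyTier, memoryTier, batteryTier) as Tier;
}
```

For example, a device with fast scans and low memory use but only 90 minutes of battery left lands in Tier 3, which matches the intent that any one critical reading should trigger the backup procedures.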

## Escalation Procedures and Contacts

### Technical Escalation Matrix

#### Severity Levels
```yaml
P1 - Business Critical (Response: 5 minutes):
  - Complete scanner system failure
  - Database corruption affecting scan verification
  - Security breach or data leak
  - >50% of devices unable to scan

P2 - High Impact (Response: 15 minutes):
  - Single API endpoint failure
  - Performance degradation affecting >25% devices
  - Network sync failures >10 minutes
  - Rate limiting preventing legitimate scans

P3 - Medium Impact (Response: 1 hour):
  - Individual device issues (hardware/software)
  - Non-critical API errors
  - Monitoring/alerting issues
  - Minor UI/UX problems

P4 - Low Impact (Response: 4 hours):
  - Enhancement requests
  - Documentation updates
  - Non-urgent performance optimization
  - Cosmetic UI issues
```
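Some of the severity criteria above are numeric and can be evaluated automatically when an incident is opened. The sketch below covers only those quantifiable criteria (device share, sync outage duration, security flag); everything else still needs human judgment. `IncidentReport` and `severityFor` are hypothetical names.

```typescript
type Severity = "P1" | "P2" | "P3" | "P4";

// Response times from the matrix above, in minutes.
const RESPONSE_MINUTES: Record<Severity, number> = { P1: 5, P2: 15, P3: 60, P4: 240 };

interface IncidentReport {
  devicesAffected: number;
  devicesTotal: number;
  securityBreach: boolean;
  syncFailingMinutes: number;
}

// Picks the highest severity whose numeric criteria are met.
function severityFor(report: IncidentReport): Severity {
  const share = report.devicesTotal > 0 ? report.devicesAffected / report.devicesTotal : 0;
  if (report.securityBreach || share > 0.5) {
    return "P1";
  }
  if (share > 0.25 || report.syncFailingMinutes > 10) {
    return "P2";
  }
  if (report.devicesAffected > 0) {
    return "P3";
  }
  return "P4";
}
```

A first-pass classification like this is meant to set the initial paging level quickly; the incident commander can still raise or lower it once the scope is understood.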

#### Contact Directory

**Primary On-Call (24/7 During Events)**
```yaml
Platform Engineering Lead:
  - Name: [REDACTED]
  - Phone: [REDACTED]
  - Email: oncall-platform@blackcanyontickets.com
  - Slack: @platform-oncall
  - Responsibilities: API issues, database problems, sync failures

DevOps Engineer:
  - Name: [REDACTED]
  - Phone: [REDACTED]
  - Email: oncall-devops@blackcanyontickets.com
  - Slack: @devops-oncall
  - Responsibilities: Infrastructure, networking, deployment issues

Frontend Engineering Lead:
  - Name: [REDACTED]
  - Phone: [REDACTED]
  - Email: oncall-frontend@blackcanyontickets.com
  - Slack: @frontend-oncall
  - Responsibilities: PWA issues, camera problems, UI/UX bugs
```

**Secondary/Escalation Contacts**
```yaml
Engineering Manager:
  - Phone: [REDACTED]
  - When: P1 incidents >30 minutes, team coordination needed

CTO:
  - Phone: [REDACTED]
  - When: P1 incidents >60 minutes, business decisions needed

CEO:
  - Phone: [REDACTED]
  - When: Business-critical failures, external communication needed
```

**Vendor/External Support**
```yaml
Supabase Support:
  - Contact: support@supabase.com
  - SLA: 4 hours (Pro Plan)
  - Phone: Emergency hotline available

Sentry Support:
  - Contact: support@sentry.io
  - SLA: 8 hours (Business Plan)
  - Documentation: docs.sentry.io

Vercel Support:
  - Contact: support@vercel.com
  - SLA: 24 hours (Pro Plan)
  - Status Page: vercel-status.com
```

### Escalation Procedures

#### P1 Incident Response
```yaml
Step 1 (0-5 minutes):
  - Acknowledge incident in #incidents Slack channel
  - Start incident bridge call/video chat
  - Assign incident commander (usually platform lead)
  - Begin status page updates

Step 2 (5-15 minutes):
  - Assess scope and impact
  - Implement immediate mitigation (fallback procedures)
  - Escalate to Engineering Manager if needed
  - Communicate with venue/operations team

Step 3 (15-30 minutes):
  - Deploy fixes or activate backup systems
  - Monitor recovery and impact reduction
  - Update stakeholders every 10 minutes
  - Document timeline and actions taken

Step 4 (Post-resolution):
  - Conduct immediate hot wash (15 minutes)
  - Schedule full post-mortem within 48 hours
  - Update runbooks based on lessons learned
  - Communicate resolution to all stakeholders
```

#### Common Escalation Scenarios

**Complete Scanner Failure**
```yaml
Symptoms: All devices unable to scan, API returning errors
Immediate Actions:
  1. Check system status dashboard
  2. Verify database connectivity
  3. Activate manual entry procedures at gates
  4. Estimate impact and communicate timeline

Escalation Triggers:
  - Issue not resolved in 15 minutes → Engineering Manager
  - Manual entry activated → Operations Manager
  - Expected duration >1 hour → CTO
```

**Mass Device Issues**
```yaml
Symptoms: >10 devices experiencing problems simultaneously
Immediate Actions:
  1. Check for CDN/deployment issues
  2. Verify service worker updates aren't causing problems
  3. Roll back recent deployments if necessary
  4. Distribute backup devices

Escalation Triggers:
  - >50% devices affected → Immediate escalation to Engineering Manager
  - Hardware-related issues → Venue operations team
  - Software issues persisting >20 minutes → CTO notification
```

**Database/API Performance Issues**
```yaml
Symptoms: Slow scan response times, sync delays, timeouts
Immediate Actions:
  1. Check database performance metrics
  2. Review API response times and error rates
  3. Scale database resources if possible
  4. Enable aggressive client-side caching

Escalation Triggers:
  - Database CPU >90% for >5 minutes → DevOps immediate response
  - API latency >5 seconds → Platform Engineering lead
  - Unable to scale resources → CTO (budget approval needed)
```

## Security Considerations

### Data Protection and Privacy

#### Local Data Storage
```yaml
IndexedDB Contents:
  - Scan Queue: QR codes, timestamps, results (temporary)
  - Device Settings: Zone, preferences (non-sensitive)
  - Conflict Log: Sync discrepancies (temporary)
  - No Storage: Customer PII, payment info, passwords

Data Retention:
  - Scan Queue: Auto-purged after successful sync
  - Settings: Persisted until manual reset
  - Conflicts: Retained 24 hours for review
  - Error Logs: Purged every 7 days
```
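The time-based retention rules above (conflicts kept 24 hours, error logs purged after 7 days) reduce to the same check: compare each record's age with a retention window. The sketch below works on plain arrays so it stays independent of the storage layer; `StoredRecord`, `splitExpired`, and the `createdAtMs` field are assumptions for illustration.

```typescript
interface StoredRecord {
  id: string;
  createdAtMs: number; // creation time as a Unix timestamp in milliseconds
}

const CONFLICT_RETENTION_MS = 24 * 60 * 60 * 1000;      // 24 hours
const ERROR_LOG_RETENTION_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

// Partitions records into those still within retention and those due for deletion.
function splitExpired<T extends StoredRecord>(
  records: T[],
  retentionMs: number,
  nowMs: number
): { keep: T[]; purge: T[] } {
  const keep: T[] = [];
  const purge: T[] = [];
  for (const record of records) {
    if (nowMs - record.createdAtMs > retentionMs) {
      purge.push(record);
    } else {
      keep.push(record);
    }
  }
  return { keep, purge };
}
```

In the PWA, the `purge` list would then be deleted from the corresponding IndexedDB object store, for example on app start and after each successful sync.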

#### Network Security
```yaml
API Communication:
  - Protocol: HTTPS only (TLS 1.3 preferred)
  - Authentication: JWT tokens with 2-hour expiration
  - Rate Limiting: 8 requests/second per device
  - Input Validation: All QR codes validated server-side

Content Security Policy:
  - script-src: 'self' cdn.sentry.io
  - connect-src: 'self' *.supabase.co *.sentry.io
  - img-src: 'self' data: https:

Permissions Policy:
  - camera: allowed for 'self' (required for QR scanning; set via the Permissions-Policy header, not CSP)
```

#### Device Security
```yaml
Client-Side Protection:
  - No secret keys stored in client code (only the public Supabase anon key ships to the browser; the service-role key stays server-side)
  - Device fingerprinting (non-PII) for abuse tracking
  - Local encryption of sensitive scan data
  - Secure session management

Physical Security:
  - Device lock screen during breaks
  - Remote wipe capabilities if stolen
  - Tamper detection for unusual usage patterns
  - Secure storage when not in use
```

### Vulnerability Management

#### Regular Security Assessments
```yaml
Automated Scanning:
  - Daily: npm audit for dependency vulnerabilities
  - Weekly: OWASP ZAP scan of staging environment
  - Monthly: Full penetration test of production

Manual Review:
  - Code Review: Required for all scanner-related changes
  - Security Review: Required for API changes
  - Access Review: Quarterly review of admin permissions
```

#### Incident Response Plan
```yaml
Security Incident Types:
  1. Data Breach (customer PII exposed)
  2. Unauthorized Access (admin account compromise)
  3. Service Disruption (DDoS, malicious traffic)
  4. Device Compromise (stolen/hacked scanner device)

Response Timeline:
  - Detection to Acknowledgment: <15 minutes
  - Initial Assessment: <30 minutes
  - Containment Actions: <1 hour
  - Customer Notification: <24 hours (if PII involved)
```

## Maintenance and Updates

### Regular Maintenance Tasks

#### Daily Operations
```bash
#!/bin/bash
# Daily health check script
echo "BCT Scanner Daily Health Check - $(date)"

# Check API health
curl -f https://api.blackcanyontickets.com/health || echo "❌ API health check failed"

# Check database connections (expects DATABASE_URL in the environment)
psql "$DATABASE_URL" -c "SELECT 1" || echo "❌ Database connection failed"

# Check error rates in Sentry
# (This would integrate with Sentry API to get error counts)

# Check device online status
# (Query database for devices that haven't synced in >24 hours)
```

#### Weekly Maintenance
```yaml
System Updates:
  - Review and apply npm dependency updates
  - Update browser compatibility testing
  - Review performance metrics and trends
  - Clean up old scan logs and conflict data

Monitoring Review:
  - Review Sentry error trends
  - Analyze performance degradation patterns
  - Update alert thresholds based on data
  - Test escalation procedures
```

#### Monthly Maintenance
```yaml
Security Updates:
  - Apply security patches to all dependencies
  - Review and rotate API keys/secrets
  - Update SSL certificates if needed
  - Review access logs for unusual patterns

Performance Optimization:
  - Analyze database query performance
  - Review and optimize API endpoints
  - Update caching strategies
  - Test load balancing configuration
```

### Deployment and Rollback Procedures

#### Production Deployment
```yaml
Deployment Process:
  1. Code Review: Minimum 2 approvals required
  2. Staging Testing: Full test suite + manual QA
  3. Blue-Green Deployment: Zero downtime deployment
  4. Gradual Rollout: 10% → 50% → 100% traffic
  5. Monitoring: 24-hour observation period

Rollback Triggers:
  - Error rate >5% increase from baseline
  - Response time >2x normal latency
  - Critical functionality broken (scanning failure)
  - Security vulnerability discovered

Rollback Process:
  1. Immediate: Switch traffic to previous version (2 minutes)
  2. Communication: Alert all stakeholders (5 minutes)
  3. Investigation: Root cause analysis (30 minutes)
  4. Fix Forward: Rapid bug fix deployment if possible
```
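The measurable rollback triggers above can be checked automatically at each step of the gradual rollout. The sketch below is one possible reading: it treats "error rate >5% increase from baseline" as five percentage points above the baseline rate, which is an assumption since the runbook does not say whether the increase is absolute or relative. `ReleaseMetrics` and `rollbackReasons` are hypothetical names.

```typescript
interface ReleaseMetrics {
  errorRate: number;      // fraction of failed requests, e.g. 0.01 for 1%
  p95LatencyMs: number;
  scanningWorks: boolean; // result of a synthetic end-to-end scan check
}

// Returns the list of triggered rollback conditions; an empty list means "keep rolling out".
function rollbackReasons(baseline: ReleaseMetrics, current: ReleaseMetrics): string[] {
  const reasons: string[] = [];
  // Assumption: ">5% increase" means more than five percentage points over baseline.
  if (current.errorRate - baseline.errorRate > 0.05) {
    reasons.push("error rate");
  }
  if (current.p95LatencyMs > 2 * baseline.p95LatencyMs) {
    reasons.push("latency");
  }
  if (!current.scanningWorks) {
    reasons.push("scanning broken");
  }
  return reasons;
}
```

Returning the reasons rather than a bare boolean makes the stakeholder alert in the rollback process easier to write, since the message can state exactly which trigger fired.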

#### Emergency Hotfixes
```yaml
Hotfix Criteria:
  - Security vulnerability (any severity)
  - Data corruption or loss
  - Complete service outage
  - Critical business function broken

Hotfix Process:
  1. Create hotfix branch from production
  2. Make minimal fix with tests
  3. Emergency code review (single approver)
  4. Direct to production deployment
  5. Post-deployment verification
  6. Full retrospective within 24 hours
```

---

## Appendices

### A. Browser Compatibility Matrix
```yaml
Recommended Browsers:
  - Chrome 88+: ✅ Full support, best performance
  - Safari 14+: ✅ iOS support, some PWA limitations
  - Firefox 85+: ✅ Good fallback, slower QR detection
  - Edge 88+: ✅ Windows support, Chromium-based

Required APIs:
  - getUserMedia (Camera): All supported browsers
  - IndexedDB: All supported browsers
  - Service Workers: All supported browsers
  - BarcodeDetector: Chrome only, ZXing fallback for others
```

### B. Database Schema Reference
```sql
-- Complete scanner database schema
-- See SCANNER_DATABASE_SCHEMA.sql for full implementation
```

### C. Monitoring Queries
```sql
-- Active scanning sessions
SELECT
  COUNT(DISTINCT device_id) as active_devices,
  COUNT(*) as total_scans,
  AVG(EXTRACT(EPOCH FROM (NOW() - scan_timestamp))) as avg_age_seconds
FROM scanner_logs
WHERE scan_timestamp > NOW() - INTERVAL '1 hour';

-- Conflict analysis
SELECT
  offline_result,
  server_result,
  COUNT(*) as conflicts
FROM scan_conflicts
WHERE conflict_timestamp > NOW() - INTERVAL '24 hours'
GROUP BY offline_result, server_result;
```

### D. Performance Benchmarking Scripts
```bash
# Device performance test script
# See SCANNER_PERFORMANCE_TEST.sh for full implementation
```

This technical runbook provides comprehensive guidance for IT administrators and technical staff to successfully deploy, monitor, and maintain the Scanner PWA system. Regular updates to this document should reflect operational lessons learned and system evolution.