Testing and Debugging

Eventuall uses Vitest for unit testing, a custom telemetry system for client-side logging, and several debugging tools for development. This page covers how to write tests, use the logging infrastructure, and debug issues across the stack.

Testing with Vitest

Both the webapp and workers apps use Vitest as their test runner. Each has its own configuration tailored to its runtime environment.

Webapp Test Configuration

The webapp uses a minimal Vitest config (apps/webapp/vitest.config.ts):

import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    exclude: ["node_modules", "e2e"],
  },
});

Tests run with default settings — parallel execution, no custom setup file. This works because webapp tests don't depend on Cloudflare-specific APIs.

Workers Test Configuration

The workers config (apps/workers/vitest.config.ts) is more involved because Cloudflare Workers APIs (Durable Objects, D1, Queues) don't exist in Node.js:

import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    reporters: ["verbose"],
    pool: "forks",
    testTimeout: 5000,
    setupFiles: ["./vitest.setup.ts"],
    sequence: { concurrent: false },
  },
});

Key settings:

  • pool: "forks" — each test file runs in its own process, preventing state leakage between tests
  • sequence: { concurrent: false } — tests run sequentially, not in parallel. This is required because D1 database operations can conflict when run concurrently
  • setupFiles — loads mock implementations of Cloudflare APIs before any test runs

Mocking Cloudflare APIs

The setup file (apps/workers/vitest.setup.ts) mocks the Cloudflare runtime APIs that don't exist in Node.js:

import { vi } from "vitest";

// Workflow mocking
vi.mock("cloudflare:workers", () => ({
  WorkflowEntrypoint: class {},
  WorkflowEvent: class {},
  WorkflowStep: class {
    do = vi.fn().mockImplementation(async (name, callback) => callback());
    sleep = vi.fn();
    sleepUntil = vi.fn();
  },
}));

// Durable Object namespace
class MockDurableObjectNamespace {
  idFromName = vi.fn().mockReturnValue({ toString: () => "mock-id" });
  idFromString = vi.fn().mockReturnValue({ toString: () => "mock-id" });
  get = vi.fn().mockReturnValue({ fetch: vi.fn() });
}

// D1 Database
class MockD1Database {
  prepare = vi.fn().mockReturnValue({
    bind: vi.fn().mockReturnThis(),
    run: vi.fn(),
    first: vi.fn(),
    all: vi.fn().mockResolvedValue({ results: [] }),
  });
  batch = vi.fn();
  exec = vi.fn();
}

These mocks let you test business logic that depends on Cloudflare APIs without running a real Workers runtime. The mocks return sensible defaults — empty arrays, resolved promises, chainable methods.
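
For example, a handler that reads from D1 can be exercised by handing it a mock binding. This is a minimal sketch — it assumes the mock classes are exported from vitest.setup.ts (or redefined in the test) and that the D1Database type comes from @cloudflare/workers-types; the listEvents helper is hypothetical:

import { describe, it, expect } from "vitest";

// Hypothetical helper under test — substitute your actual module.
async function listEvents(db: D1Database) {
  return db.prepare("SELECT * FROM events WHERE status = ?").bind("live").all();
}

describe("listEvents", () => {
  it("returns the mock's empty result set", async () => {
    const db = new MockD1Database() as unknown as D1Database;
    const { results } = await listEvents(db);
    expect(results).toEqual([]); // all() resolves to { results: [] } by default
  });
});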

Pitfall: These mocks simulate the interface of Cloudflare APIs, not their behavior. A test that passes with mocked D1 might fail against a real D1 database if your SQL is invalid or if you depend on D1-specific behavior (like its limited transaction support). For integration-level confidence, test against a local D1 database using wrangler d1 execute --local.

Writing a Test

Tests follow the standard Vitest pattern:

import { describe, it, expect } from "vitest";
// createEventWithDefaults is the function under test — import it from its module.

describe("EventService", () => {
  it("should create an event with default room", async () => {
    const result = await createEventWithDefaults({
      name: "Test Event",
      accountId: "acc-123",
    });

    expect(result.name).toBe("Test Event");
    expect(result.status).toBe("draft");
  });
});

Running tests:

# Run all tests once
pnpm test

# Run tests in watch mode (re-runs on file changes)
pnpm test:watch

# Run tests for a specific app
cd apps/webapp && pnpm test
cd apps/workers && pnpm test

Test Coverage Gaps

The codebase currently has minimal test coverage. Most test files are placeholders. If you're adding new features, writing tests is encouraged but not enforced by CI. Focus testing efforts on:

  1. Business logic — pure functions that compute permissions, validate inputs, or transform data
  2. tRPC procedures — test the handler logic with mocked context (a sketch follows this list)
  3. Durable Object message handlers — test state transitions in response to WebSocket messages
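
For the tRPC case, here is a minimal sketch using tRPC's createCaller. The router import path, the event.create procedure, and the context shape are illustrative — adjust them to match the actual router and createContext:

import { describe, it, expect } from "vitest";
import { appRouter } from "../server/router"; // adjust to the real router path

describe("event.create", () => {
  it("creates a draft event", async () => {
    // Illustrative context — mirror whatever createContext actually returns.
    const caller = appRouter.createCaller({
      session: { userId: "user-123", accountId: "acc-123" },
      db: new MockD1Database() as unknown as D1Database,
    });

    const event = await caller.event.create({ name: "Test Event" });
    expect(event.status).toBe("draft");
  });
});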

Architect's Note: The test infrastructure is intentionally lightweight. Since the platform runs on Cloudflare Workers, true integration tests require either wrangler dev or Miniflare, both of which add complexity. The team has prioritized manual testing through the dev server over automated integration tests. If you're considering adding integration tests, look at Cloudflare's vitest-pool-workers package, which provides a Workers-compatible Vitest pool.

Client-Side Logging and Telemetry

The platform has a structured logging system that captures client-side events and sends them to the server for storage and analysis. This is invaluable for debugging issues that users experience in production.

The Logger Utility

The logger (apps/webapp/src/utils/logger.ts) provides two logging channels:

  1. Console logging via the loglevel library — visible in the browser's developer tools
  2. Remote logging via a batched HTTP transport — persisted to the D1 database

import { getLogger } from "@/utils/logger";

const logger = getLogger("MyComponent");

// Console only
logger.info("Component mounted");

// Console + remote (persisted to database)
logger.remote.info("User joined room", {
  roomId: "room-123",
  deviceType: "mobile",
});

logger.remote.error("Failed to connect to LiveKit", {
  error: err.message,
  retryCount: 3,
});

The getLogger(namespace) function creates a logger scoped to a namespace (typically the component or module name). The namespace helps filter logs when debugging specific features.

Log Batching

Remote logs aren't sent immediately. The LogBatcher class batches them for efficiency:

  • Batch size: 10 entries (flushes when the batch reaches 10)
  • Flush interval: 5 seconds (flushes even if the batch isn't full)
  • Max queue: 100 entries (drops oldest entries if the queue overflows)
  • Page unload: flushes remaining entries with keepalive: true to ensure delivery

This means there can be up to a 5-second delay between a log call and the entry appearing in the database. During active usage, batches fill up quickly and flush at the 10-entry threshold.
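
The shape of that logic, as a minimal sketch — the real LogBatcher lives in the logger utility; the class and type names below are illustrative, while the thresholds and the /api/telemetry endpoint follow the descriptions on this page:

type QueuedLog = { level: string; namespace: string; message: string; data?: unknown };

class LogBatcherSketch {
  private queue: QueuedLog[] = [];
  private timer: ReturnType<typeof setTimeout> | null = null;

  add(entry: QueuedLog) {
    this.queue.push(entry);
    if (this.queue.length > 100) this.queue.shift(); // max queue: drop oldest
    if (this.queue.length >= 10) {
      this.flush(); // batch size threshold reached
    } else {
      this.timer ??= setTimeout(() => this.flush(), 5000); // flush interval
    }
  }

  flush(unloading = false) {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.queue.length === 0) return;
    const batch = this.queue.splice(0);
    // keepalive lets the request complete even as the page unloads
    void fetch("/api/telemetry", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(batch),
      keepalive: unloading,
    });
  }
}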

Telemetry Context

Every log entry is enriched with a telemetry context that includes browser, device, and connection information. The useTelemetryInit hook (apps/webapp/src/hooks/useTelemetryInit.ts) sets up this context when the app loads:

// Context includes:
{
  userId: "user-123",           // From session
  eventId: "evt-456",           // Current event
  roomId: "room-789",           // Current room
  browserName: "Chrome",        // User agent parsed
  browserVersion: "120",
  osName: "macOS",
  activeCameraId: "abc",        // Current camera device
  activeCameraLabel: "FaceTime HD",
  activeMicId: "def",           // Current microphone
  activeMicLabel: "MacBook Pro Microphone",
  clientTimestamp: "2024-01-15T10:30:00Z"
}

This context is automatically attached to every remote log entry. When debugging a production issue, you can query logs by user, event, room, browser, or device to narrow down the problem.

Server-Side Log Storage

Logs arrive at the server through two paths:

  1. tRPC mutation (telemetry.log) — used by the React app when tRPC is available
  2. REST endpoint (POST /api/telemetry) — used as a fallback and by the LogBatcher

Both paths validate the log entries with Zod and insert them into the client_logs table:

CREATE TABLE client_logs (
  id TEXT PRIMARY KEY,        -- UUID
  level TEXT NOT NULL,        -- "debug" | "info" | "warn" | "error"
  namespace TEXT,             -- Component/module name
  message TEXT NOT NULL,      -- Log message
  data TEXT,                  -- JSON: additional structured data
  context TEXT,               -- JSON: full telemetry context
  userId TEXT,                -- Foreign key to users (indexed)
  eventId TEXT,               -- Event context (indexed)
  roomId TEXT,                -- Room context (indexed)
  clientTimestamp DATETIME    -- When the log was created on the client
);
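
The Zod validation on those paths plausibly looks like the following sketch — field names mirror the table above, but the actual schema lives with the telemetry router:

import { z } from "zod";

const logEntrySchema = z.object({
  level: z.enum(["debug", "info", "warn", "error"]),
  namespace: z.string().optional(),
  message: z.string(),
  data: z.record(z.unknown()).optional(),    // serialized into the data column
  context: z.record(z.unknown()).optional(), // serialized into the context column
  clientTimestamp: z.string().datetime().optional(),
});

const logBatchSchema = z.array(logEntrySchema);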

The context column stores the full telemetry context as a JSON string. You can query it with D1's json_extract():

-- Find all errors from Safari users in a specific event
SELECT * FROM client_logs
WHERE level = 'error'
  AND eventId = 'evt-456'
  AND json_extract(context, '$.browserName') = 'Safari';

Pitfall: The telemetry endpoint is a publicProcedure — it doesn't require authentication. This is intentional because logging needs to work even when the user's session has expired or during the login flow. However, it means the endpoint could be abused. Rate limiting is not currently implemented but would be a good addition for production hardening.

Error Handling

Global Error Boundary

The app has a global error boundary (apps/webapp/src/app/error.tsx) that catches unhandled errors in any route:

"use client";

export default function GlobalError({
  error,
  reset,
}: {
  error: Error & { digest?: string };
  reset: () => void;
}) {
  useEffect(() => {
    console.error("Global error:", error);
  }, [error]);

  return (
    <div className="error-card">
      <h2>Something went wrong</h2>
      {error.digest && <p>Error ID: {error.digest}</p>}
      <button onClick={reset}>Try Again</button>
      <a href="/">Return Home</a>
    </div>
  );
}

In development mode, the error boundary also displays the full stack trace in a collapsible <details> element. In production, only the error digest (a short hash) is shown, which can be used to look up the full error in server logs.

Segment Error Boundaries

Specific route segments have their own error boundaries for more granular error handling:

  • Chat segment (/event/[eventId]/room/[roomId]/audience/@chat/error.tsx) — shows "An error occurred loading chat" without crashing the entire event page
  • Vetting segment (/event/[eventId]/room/[roomId]/vetting/[userId]/error.tsx) — isolates vetting page errors from the main room

This pattern leverages Next.js parallel routes: if chat fails to load, the video stream and participant list continue working. Users see a localized error message instead of a full-page crash.

Architect's Note: Error boundaries only catch rendering errors and errors thrown during React lifecycle methods. They don't catch errors in event handlers, async operations, or server-side code. For those, use try-catch blocks and the logger's .remote.error() method to capture the error for debugging.
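
A minimal sketch of that pattern — joinRoom here is a hypothetical async operation:

import { getLogger } from "@/utils/logger";

declare function joinRoom(roomId: string): Promise<void>; // hypothetical operation

const logger = getLogger("RoomControls");

async function handleJoinClick(roomId: string) {
  try {
    await joinRoom(roomId); // any awaited operation that can fail
  } catch (err) {
    // Error boundaries won't see this; log it remotely for debugging.
    logger.remote.error("Failed to join room", {
      roomId,
      error: err instanceof Error ? err.message : String(err),
    });
  }
}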

Development Server Debugging

The Dev Server Script

The scripts/dev-server.sh script manages the development environment. It handles starting, stopping, and monitoring all services:

# Check if everything is running
./scripts/dev-server.sh status

# View logs from all services
./scripts/dev-server.sh logs

# Tail logs in real-time
./scripts/dev-server.sh tail

# Clean up orphaned processes
./scripts/dev-server.sh clean

The script manages several processes simultaneously: the Next.js dev server, Cloudflare Workers (via Wrangler), the Cloudflare tunnel, and optionally the statuspage app. Logs from all services are written to /tmp/eventuall-dev-logs/ with timestamped filenames.

Environment Verification

Before starting the dev server, the scripts/messages/pre-dev-check.sh script verifies that infrastructure has been provisioned:

  1. Checks for .worktree-id or start-dev.sh file
  2. If neither exists, shows an error with instructions to run pnpm run setup
  3. Prevents the dev server from starting without proper Cloudflare resources

This prevents a common issue where developers try to run pnpm dev without first provisioning their D1 databases, KV namespaces, and Cloudflare tunnel.

Common Debugging Scenarios

"The page loads but shows a blank screen"

  1. Check the browser console for JavaScript errors
  2. Check the dev server logs: ./scripts/dev-server.sh logs
  3. Look for hydration mismatches — these happen when server-rendered HTML doesn't match what the client renders. Common causes: rendering non-deterministic values like Date.now() or Math.random(), which produce different output on the server and the client
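
A minimal illustration of the hydration pitfall and its fix:

import { useEffect, useState } from "react";

// Anti-pattern: Date.now() differs between the server render and the
// client's hydration render, so the HTML won't match.
function Timestamp() {
  return <span>{Date.now()}</span>;
}

// Fix: compute the value after mount so the first client render
// matches the server-rendered HTML.
function TimestampFixed() {
  const [now, setNow] = useState<number | null>(null);
  useEffect(() => setNow(Date.now()), []);
  return <span>{now ?? "…"}</span>;
}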

"tRPC calls return 500 errors"

  1. Check the Next.js server output in the dev logs
  2. Look for TRPCError stack traces — these include the error code and message
  3. Common cause: database schema mismatch. If you pulled changes that include new migrations, run pnpm migrate:local to apply them

"LiveKit video doesn't connect"

  1. Check that the Cloudflare tunnel is running: ./scripts/dev-server.sh status
  2. Verify LiveKit token generation in the dev logs — look for errors in the livekit.getToken tRPC call
  3. Check that your environment has valid LiveKit API credentials (set via Doppler)
  4. Use the browser's WebRTC internals (chrome://webrtc-internals) to inspect the connection state

"Authentication fails after pulling new changes"

  1. Clear cookies for your dev domain (*.eventuall.live)
  2. Verify the auth configuration: check that AUTH_SECRET and OAuth credentials are set in your environment
  3. If using OTP, verify Twilio credentials are valid
  4. Check for database migration issues — auth tables may have changed

"Durable Object state seems stale"

  1. Durable Objects persist state across requests. If you changed the DO code, the old state may be incompatible
  2. For local development, restart the Workers dev server to clear DO state
  3. Check the Event DO's alarm cycle — stale presence data is cleaned up every 30 seconds
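
The alarm pattern, as an illustrative sketch — this is not the actual Event DO implementation; the class name and the "presence:" storage layout are assumptions:

export class EventDOSketch {
  constructor(private state: DurableObjectState) {}

  async alarm() {
    // Drop presence entries that haven't been refreshed recently.
    const entries = await this.state.storage.list<{ lastSeen: number }>({ prefix: "presence:" });
    for (const [key, value] of entries) {
      if (Date.now() - value.lastSeen > 30_000) {
        await this.state.storage.delete(key);
      }
    }
    // Re-arm so the cleanup runs again in 30 seconds.
    await this.state.storage.setAlarm(Date.now() + 30_000);
  }
}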

Pitfall: When debugging Workers locally with wrangler dev, the D1 database is a local SQLite file, not the remote D1 instance. Changes you make locally won't appear in the remote database, and vice versa. If you're debugging a production issue, use wrangler d1 execute <db-name> --remote to query the production database directly.

Type Checking and Linting

Beyond tests, the project enforces code quality through TypeScript and ESLint:

# Type check all packages
pnpm check-types

# Lint all packages
pnpm lint

# Format code with Prettier
npx prettier --write .

Type checking is particularly important in this codebase because tRPC relies on TypeScript's type inference for end-to-end type safety. If type checking passes, you have strong confidence that your tRPC calls match the server's expected input and output shapes.

Pre-Commit Hooks

Lefthook is configured for pre-commit hooks that run linting and type checking before each commit. This catches issues before they reach CI:

  • Lint check — ESLint runs on staged files
  • Type check — TypeScript compiler validates the full project

If a pre-commit hook fails, the commit is blocked. Fix the issues and try again. You can bypass hooks with --no-verify in an emergency, but this is discouraged.

Architect's Note: The pre-commit hooks check the entire project, not just changed files, for type checking. This is because a change in a shared type (like a Drizzle schema) can break files you didn't modify. The trade-off is slower commits. If this becomes a bottleneck, consider switching to incremental type checking with tsc --build.