# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Repo layout
- `kaisetsu-app/` — the Next.js 16 app. All app work happens here.
- `scripts/setup-dotenv-key.sh` — one-shot per-developer bootstrap (fetches dotenvx private key from 1Password, writes to shell rc, enables `.githooks/`).
- `.githooks/pre-commit` — blocks commits that stage plaintext secrets in any `.env*` file (matches variable names `API_KEY|SECRET|TOKEN|PASSWORD|CREDENTIAL|PRIVATE_KEY`; exempts `DOTENV_PUBLIC_KEY_*`).
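The name matching done by the pre-commit hook can be sketched as follows. This is a hypothetical TypeScript rendering, not the hook itself: the function name `findPlaintextSecrets` is invented for illustration, and it assumes that dotenvx-encrypted values start with the `encrypted:` prefix.

```typescript
// Hypothetical sketch of the check in .githooks/pre-commit (the real hook is a shell script).
// Assumption: a value is treated as encrypted when it starts with "encrypted:".
const SECRET_NAME = /(API_KEY|SECRET|TOKEN|PASSWORD|CREDENTIAL|PRIVATE_KEY)/;
const EXEMPT_NAME = /^DOTENV_PUBLIC_KEY_/;

function findPlaintextSecrets(envText: string): string[] {
  const offenders: string[] = [];
  for (const rawLine of envText.split("\n")) {
    const line = rawLine.trim();
    // Skip blank lines and comments.
    if (line === "" || line.startsWith("#")) continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue;
    const name = line.slice(0, eq).replace(/^export\s+/, "").trim();
    // Strip one pair of surrounding quotes, if present.
    const value = line.slice(eq + 1).trim().replace(/^["']|["']$/g, "");
    if (EXEMPT_NAME.test(name)) continue;
    if (SECRET_NAME.test(name) && !value.startsWith("encrypted:")) {
      offenders.push(name);
    }
  }
  return offenders;
}
```

An empty result means the staged file would pass; any returned name is a variable that looks like a secret but holds a plaintext value.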
## Common commands (run from `kaisetsu-app/`)
```bash
npm run dev # dotenvx run -f .env.local -- next dev (maps QUINTIA_DOTENV_PRIVATE_KEY → DOTENV_PRIVATE_KEY_LOCAL)
npm run build # next build
npm run lint # eslint
npx tsc --noEmit # type-check only
```
`npm run dev` and `npm run start` require `QUINTIA_DOTENV_PRIVATE_KEY` to be exported in the shell (`setup-dotenv-key.sh` handles this). If you see `MISSING_PRIVATE_KEY` / `could not decrypt`, the shell rc wasn't sourced.
## Environment
`.env.local` is committed **encrypted** via dotenvx; the private key lives only in 1Password (`Quintia - Developers` vault, item `QUINTIA_DOTENV_PRIVATE_KEY`) and in each dev's shell rc. Non-secret values (e.g. `STORAGE_BACKEND`) may be encrypted too; that is fine, because dotenvx decrypts them transparently.
To add/update a secret: `npx dotenvx set NAME "value" --encrypt -f .env.local`. For non-secrets: `npx dotenvx set NAME "value" --plain -f .env.local`. To encrypt everything already in the file: `npx dotenvx encrypt -f .env.local`.
## Architecture — the three switchable backends
The app is designed so **local dev needs no GCP at all**. Three orthogonal env vars pick the backend for each concern. Production sets them to GCP-backed values.
### 1. `STORAGE_BACKEND` (`local` | `gcs`) — `src/lib/storage.ts`
`storage.ts` exports a single `storage` object; all job I/O and video uploads go through it. Adding new storage-touching code means adding a method to the `StorageBackend` interface and implementing it in both `LocalBackend` (writes to `.data/jobs/{id}/…` and `.data/videos/`) and `GcsBackend` (writes to the GCS bucket). Never import `@google-cloud/storage` directly outside `storage.ts` or `upload-url/route.ts`.
Video URIs are `file://<absolute-path>` in local mode, `gs://<bucket>/<path>` in GCS mode. `storage.downloadVideoToFile(uri, destPath)` abstracts the difference for Whisper.
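A minimal sketch of how the two URI schemes can be told apart. The `VideoLocation` type and `parseVideoUri` helper are hypothetical names for illustration; the real `storage.downloadVideoToFile` may branch differently.

```typescript
// Hypothetical helper: classifies a video URI as local or GCS.
type VideoLocation =
  | { kind: "local"; path: string }
  | { kind: "gcs"; bucket: string; objectPath: string };

function parseVideoUri(uri: string): VideoLocation {
  if (uri.startsWith("file://")) {
    // file://<absolute-path> keeps the leading slash of the absolute path.
    return { kind: "local", path: uri.slice("file://".length) };
  }
  if (uri.startsWith("gs://")) {
    const rest = uri.slice("gs://".length);
    const slash = rest.indexOf("/");
    // Require a non-empty bucket and a non-empty object path.
    if (slash > 0 && slash < rest.length - 1) {
      return {
        kind: "gcs",
        bucket: rest.slice(0, slash),
        objectPath: rest.slice(slash + 1),
      };
    }
  }
  throw new Error(`Unsupported video URI: ${uri}`);
}
```

Under this sketch, a local backend would copy from `path`, while a GCS backend would download `objectPath` from `bucket`; callers such as the Whisper step only ever see the destination file.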
### 2. `WHISPER_BACKEND` (`openai` | `mlx` | `cpu`) — `src/lib/process-job.ts`
`transcribeWithWhisper()` dispatches:
- `openai` (default, zero install) → POST to `api.openai.com/v1/audio/transcriptions` with `verbose_json` + word timestamps. The API has a 25 MB audio limit; the function throws a clear error if the file exceeds it.
- `mlx` → `mlx_whisper` CLI (Apple Silicon)
- `cpu` → `whisper` CLI (production Cloud Run Job has CUDA; uses this)
All three return the same shape (`{ segments: WhisperSegment[] }`). The OpenAI path synthesizes per-segment `words[]` from the flat top-level `words` array (probability stubbed to 1.0).
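The word synthesis on the OpenAI path can be sketched like this. The field names and the rule "a word belongs to the segment containing its start time" are assumptions; the actual `WhisperSegment` type in `process-job.ts` may carry more fields. The size check treats 25 MB as 25 × 1024 × 1024 bytes, which is also an assumption.

```typescript
// Assumed shapes; the real types live in src/lib/process-job.ts.
interface FlatWord { word: string; start: number; end: number }
interface FlatSegment { start: number; end: number; text: string }
interface WhisperWord extends FlatWord { probability: number }
interface WhisperSegment extends FlatSegment { words: WhisperWord[] }

// Assumption: 25 MB is interpreted as 25 * 1024 * 1024 bytes.
const OPENAI_MAX_AUDIO_BYTES = 25 * 1024 * 1024;

function assertWithinOpenAiLimit(sizeBytes: number): void {
  if (sizeBytes > OPENAI_MAX_AUDIO_BYTES) {
    throw new Error(
      `Audio is ${sizeBytes} bytes; the OpenAI backend accepts at most ${OPENAI_MAX_AUDIO_BYTES}.`,
    );
  }
}

// Distribute the flat top-level words array onto segments by start time.
function attachWords(segments: FlatSegment[], words: FlatWord[]): WhisperSegment[] {
  return segments.map((seg) => ({
    ...seg,
    words: words
      .filter((w) => w.start >= seg.start && w.start < seg.end)
      // The API response carries no per-word probability, so it is stubbed to 1.0.
      .map((w) => ({ ...w, probability: 1.0 })),
  }));
}
```

Because the result has the same `{ segments }` shape as the CLI backends, downstream code does not need to know which backend produced it.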
### 3. Gemini — `src/lib/genai.ts`
Auto-switches at module load: if `GEMINI_API_KEY` is set, uses AI Studio API; else uses Vertex AI (requires `GCP_PROJECT_ID` + ADC). Throws immediately if neither is configured.
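The selection rule reads roughly as below. `resolveGeminiMode` and the `GeminiMode` type are hypothetical names; the env map is passed in explicitly so the sketch stays testable, whereas `genai.ts` presumably reads the process environment at module load.

```typescript
// Hypothetical sketch of the mode switch described above.
type GeminiMode =
  | { kind: "api-key"; apiKey: string }
  | { kind: "vertex"; projectId: string };

function resolveGeminiMode(env: Record<string, string | undefined>): GeminiMode {
  const apiKey = env["GEMINI_API_KEY"];
  // The API key takes precedence, even when GCP_PROJECT_ID is also set.
  if (apiKey) return { kind: "api-key", apiKey };
  const projectId = env["GCP_PROJECT_ID"];
  // Vertex AI additionally needs ADC, which cannot be verified from env vars here.
  if (projectId) return { kind: "vertex", projectId };
  throw new Error(
    "Gemini is not configured: set GEMINI_API_KEY (AI Studio) or GCP_PROJECT_ID (Vertex AI with ADC).",
  );
}
```

Failing fast at load time keeps a misconfigured environment from surfacing only after a job has already started.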
`prepareVideo(uri, mimeType)` must be called before passing a video to `generateContent`:
- Vertex AI mode: `gs://` URI passes through unchanged.
- API-key mode: uploads local file via Gemini Files API, polls until state becomes `ACTIVE`, returns the `generativelanguage.googleapis.com/...` URI.
All `generateContent` calls go through `callGeminiWithVideo(prompt, VideoRef)` or `callGeminiTextOnly(prompt)`, both wrapped in `withRetry` (handles 429 and undici header timeouts).
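A retry wrapper of this kind might look like the following. This is a sketch under assumptions, not the repository's `withRetry`: the attempt count, the exponential backoff, and the error fields inspected (`status`, `code`, `cause`) are illustrative. `UND_ERR_HEADERS_TIMEOUT` is the error code undici uses for header timeouts, usually found on the `cause` of a failed `fetch`.

```typescript
// Hypothetical classification of retryable failures: HTTP 429 and undici header timeouts.
function isRetryable(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const e = err as { status?: unknown; code?: unknown; cause?: unknown };
  if (e.status === 429) return true;
  if (e.code === "UND_ERR_HEADERS_TIMEOUT") return true;
  // fetch() wraps the underlying undici error in `cause`, so look one level down.
  return isRetryable(e.cause);
}

// Sketch of a retry loop with exponential backoff (parameters are assumptions).
async function withRetrySketch<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Give up on the last attempt or on errors that retrying cannot fix.
      if (attempt >= maxAttempts || !isRetryable(err)) throw err;
      const delayMs = baseDelayMs * 2 ** (attempt - 1);
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Routing every call through one wrapper keeps the retry policy in a single place, which is why new Gemini calls should use the two helpers rather than calling `generateContent` directly.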
## Job execution
`src/app/api/analyze/route.ts` always runs `processJob(jobId)` in-process via Next.js `after()` (fire-and-forget after the response). There is no separate worker — the API route and the job share the Next.js process. Long jobs therefore need the Next.js process to stay alive until completion. If a separate worker becomes necessary again, reintroduce it as a sibling of the API route rather than putting orchestration logic inside `processJob`.
## Prompts & genre routing
`src/lib/process-job.ts` loads three prompt files from `prompts/<genre>/` (`1onseimeta.txt`, `2eizoumeta.txt`, `3genkouseisei.txt`). Genre `news` currently maps to `variety` (see `GENRE_PROMPT_DIR`). To add a genre, create a new directory under `prompts/` and add an entry to `GENRE_PROMPT_DIR`.
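The routing can be pictured as a lookup followed by three path joins. In this sketch the `variety: "variety"` entry and the `promptPaths` helper are assumptions; only the `news` to `variety` mapping and the three file names come from the description above.

```typescript
// Assumed contents of the mapping; the real GENRE_PROMPT_DIR lives in process-job.ts.
const GENRE_PROMPT_DIR: Record<string, string> = {
  variety: "variety",
  news: "variety", // news currently reuses the variety prompts
};

const PROMPT_FILES = ["1onseimeta.txt", "2eizoumeta.txt", "3genkouseisei.txt"] as const;

// Hypothetical helper: resolves the three prompt paths for a genre.
function promptPaths(genre: string): string[] {
  const dir = GENRE_PROMPT_DIR[genre];
  if (dir === undefined) {
    throw new Error(`Unknown genre: ${genre}`);
  }
  return PROMPT_FILES.map((file) => `prompts/${dir}/${file}`);
}
```

Giving `news` its own prompts later would then be a new `prompts/news/` directory plus a one-line change to the mapping.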
## Whisper hallucination handling
`process-job.ts` cross-checks Whisper output against Gemini's audio-meta analysis. If `isWhisperHallucinated()` flags it (dominant-text >40% or avg first-word probability <0.05), the pipeline falls back to Gemini blocks with `detectAndCorrectTcDrift()` realigning timecodes. When Whisper is trusted, Gemini metadata (speaker, BGM level) is merged onto Whisper blocks via `enrichBlocksWithGeminiMeta()`. Keep this dual-source logic intact when modifying the block-building code.
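One plausible reading of the two thresholds is sketched below. It is not the repository's `isWhisperHallucinated()`: the name `looksHallucinated`, the interpretation of "dominant text" as the share of segments carrying the most frequent text, and the handling of empty input are all assumptions.

```typescript
// Assumed minimal shapes for the check.
interface CheckedWord { word: string; probability: number }
interface CheckedSegment { text: string; words: CheckedWord[] }

// Hypothetical heuristic mirroring the thresholds named above (40% and 0.05).
function looksHallucinated(segments: CheckedSegment[]): boolean {
  // Assumption: an empty transcript is not flagged by this check.
  if (segments.length === 0) return false;

  // Share of segments whose (trimmed) text equals the most frequent text.
  const counts = new Map<string, number>();
  for (const seg of segments) {
    const key = seg.text.trim();
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  const dominantShare = Math.max(...Array.from(counts.values())) / segments.length;

  // Average probability of the first word of each segment that has words.
  let probSum = 0;
  let probCount = 0;
  for (const seg of segments) {
    const first = seg.words[0];
    if (first !== undefined) {
      probSum += first.probability;
      probCount += 1;
    }
  }
  const avgFirstWordProb = probCount === 0 ? 1 : probSum / probCount;

  return dominantShare > 0.4 || avgFirstWordProb < 0.05;
}
```

Note that when word probabilities are stubbed to 1.0, as on the OpenAI path, the probability test in this sketch can never fire, so only the dominant-text test applies there.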
## Deployment
No deployment infrastructure in-repo (no Cloud Build, no Dockerfiles, no Cloud Run Job). If production is needed later, `STORAGE_BACKEND=gcs` + Vertex AI mode still works (Gemini falls back to Vertex when `GEMINI_API_KEY` is not set) — wire up hosting separately.