# Research: Copy-Paste Detection Quality Gate

## jscpd Configuration Strategy

**Decision**: Use `.jscpd.json` at the repo root for configuration.

**Rationale**: jscpd supports a dedicated JSON config file (`.jscpd.json`), which is the conventional approach. This keeps configuration discoverable alongside other tool configs (`biome.json`, `knip.json`). Command-line flags could work but are less maintainable and harder to share across scripts.

**Alternatives considered**:

- CLI flags in the `pnpm jscpd` script: Less discoverable, and threshold changes are harder to maintain.
- `package.json` `jscpd` key: Supported but clutters `package.json`; a separate file is preferred for consistency with project conventions.

## Threshold Configuration

**Decision**: Keep jscpd's default detection sizes (minimum 5 lines, minimum 50 tokens) and add a percentage threshold (e.g., 5% maximum duplication) that fails the check when exceeded.

**Rationale**: jscpd's defaults (5 lines, 50 tokens) are a widely used starting point: they avoid flagging trivially similar code (imports, short utility patterns) while still catching meaningful copy-paste blocks. The percentage threshold provides a clear pass/fail gate.

**Alternatives considered**:

- Stricter thresholds (3 lines, 30 tokens): Too aggressive; would flag structural similarities common in TypeScript (type declarations, import blocks).
- No percentage threshold (fail on any duplicate): Too strict for an existing codebase that may have some acceptable duplication.

## File Inclusion/Exclusion Strategy

**Decision**: Scan TypeScript and TSX files (`**/*.ts`, `**/*.tsx`). Exclude `node_modules`, `dist`, `build`, `coverage`, `.specify`, `specs`, and lock files via jscpd's `ignore` configuration.

**Rationale**: The project is TypeScript-only. Excluding build artifacts, vendored dependencies, and non-source files prevents false positives and keeps scan times fast.

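
The configuration decisions so far can be collected in one place. The following is a minimal TypeScript sketch, not the project's actual file: it builds the object that would be serialized to `.jscpd.json`. The keys `threshold`, `minLines`, `minTokens`, `format`, and `ignore` are jscpd configuration options; the exact glob patterns, the `pnpm-lock.yaml` entry, the 5% value, and the choice of `format` (rather than a `pattern` glob) to select file types are assumptions made for illustration.

```typescript
// Subset of jscpd options used in this sketch (not the full option set).
interface JscpdConfig {
  threshold: number; // maximum allowed duplication, in percent
  minLines: number; // minimum block size, in lines
  minTokens: number; // minimum block size, in tokens
  format: string[]; // source formats to scan
  ignore: string[]; // glob patterns to skip
}

const jscpdConfig: JscpdConfig = {
  threshold: 5, // example value taken from the "e.g., 5%" in the decision
  minLines: 5, // jscpd default
  minTokens: 50, // jscpd default
  format: ["typescript", "tsx"],
  // Assumed glob forms for the exclusions listed in the decision.
  ignore: [
    "**/node_modules/**",
    "**/dist/**",
    "**/build/**",
    "**/coverage/**",
    "**/.specify/**",
    "**/specs/**",
    "**/pnpm-lock.yaml", // assumed lock file name for a pnpm project
  ],
};

// JSON text that would be saved as .jscpd.json at the repo root.
const jscpdJson: string = JSON.stringify(jscpdConfig, null, 2);
```

Keeping `minLines` and `minTokens` explicit, even though they match the defaults, makes later threshold changes a one-line edit in a single file.
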
**Alternatives considered**:

- Scan all file types: Unnecessary, since the project contains no other source languages.
- Exclude test files: Rejected per spec assumption that test duplication is also worth catching.

## Integration with pnpm check

**Decision**: Add `jscpd` to the `pnpm check` script chain in `package.json`, running it alongside knip, biome, typecheck, and vitest.

**Rationale**: The existing `pnpm check` script is already the pre-commit gate via Lefthook (`lefthook.yml` runs `pnpm check`). Adding jscpd to this chain automatically integrates it into the pre-commit workflow with zero Lefthook config changes.

**Alternatives considered**:

- Separate Lefthook job for jscpd: Would work but deviates from the existing pattern where `pnpm check` is the single merge gate.
- Run jscpd only in CI: Misses the pre-commit enforcement requirement from the spec.

## Knip Integration

**Decision**: Ensure jscpd is recognized by Knip as a used dev dependency. Knip may need the binary referenced in a script to avoid being flagged as unused.

**Rationale**: The project runs `knip` as part of `pnpm check`. Adding `jscpd` as a devDependency and referencing it in a `package.json` script ensures Knip won't report it as unused.

**Alternatives considered**: None. This is a necessary housekeeping step given the project's use of Knip.
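
To make the script-chain and Knip points concrete, here is a rough TypeScript sketch. The `scripts` object is hypothetical (the project's real commands are not given in these notes), and `referencesBinary` is an illustrative helper, not Knip's actual detection logic. It only shows the property the decision relies on: some `package.json` script invokes the `jscpd` binary.

```typescript
// Hypothetical "scripts" section; the real commands in package.json may differ.
const scripts: Record<string, string> = {
  jscpd: "jscpd .",
  check: "knip && biome check . && tsc --noEmit && vitest run && pnpm jscpd",
};

// Illustrative helper (not Knip's algorithm): returns true when any script
// contains a command whose first word is the given binary name.
function referencesBinary(
  allScripts: Record<string, string>,
  binary: string,
): boolean {
  return Object.values(allScripts).some((command) =>
    command
      .split(/&&|\|\||;/) // split a chain into individual commands
      .some((part) => part.trim().split(/\s+/)[0] === binary),
  );
}

const jscpdIsReferenced: boolean = referencesBinary(scripts, "jscpd");
```

In this sketch the dedicated `jscpd` script is what provides the direct binary reference, while `check` reaches it through `pnpm jscpd`, so the pre-commit gate and the Knip requirement are both covered by the same two entries.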