Skip to content

Conversation

@manasdutta04
Copy link

@manasdutta04 manasdutta04 commented Feb 5, 2026

feat: migrate dictionary cache to JSON format (#1266)

Description

This PR migrates the dictionary cache format from a Function-evaluated JavaScript file (export default {...}) to a safer, standard JSON format (.json). This resolves the security risk associated with using new Function(...) to evaluate cache files and improves tooling compatibility.

Fixes #1266

Key Changes

  • Format Change: Changed the dictionary cache file from dictionary.js to dictionary.json.
  • Security: Replaced the unsafe new Function(...) constructor with JSON.parse for reading cache files.
  • Migration Logic: Implemented an automatic, backward-compatible migration path. When the compiler encounters an existing dictionary.js (or legacy format content), it:
    1. Detects the export default syntax.
    2. Safely parses the content.
    3. Automatically rewrites it as pure JSON.
  • Test Coverage: Added comprehensive tests covering:
    • Reading/writing the new JSON format.
    • Automatic migration from legacy to new format.
    • Data integrity preservation during migration.
    • ensureDictionaryFile behavior for new projects.

Implementation Details

  • Modified packages/compiler/src/lib/lcp/cache.ts to handle both the legacy JS format (for migration) and the new JSON format.
  • Updated packages/compiler/src/_const.ts to point to dictionary.json.
  • Updated test helpers in cache.spec.ts and fixed hardcoded references in _loader-utils.spec.ts and _utils.spec.ts.

Verification

Ran the full test suite to ensure no regressions:

pnpm test

Result: 228/228 existing tests passed, including new migration tests.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (if applicable)
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Summary by CodeRabbit

  • New Features

    • Compiler now uses JSON format for dictionary storage with automatic migration from legacy JavaScript format.
  • Tests

    • Updated test suites to reflect new JSON dictionary format and migration behavior.

- Replace Function-evaluated JS with safer JSON.parse
- Add backward-compatible migration for legacy export default format
- Change file extension from .js to .json
- Remove unsafe Function constructor usage
- Add comprehensive test coverage for migration paths

Fixes lingodotdev#1266
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 5, 2026

📝 Walkthrough

Walkthrough

This PR migrates the dictionary cache format from JavaScript modules (evaluated via Function constructor) to JSON format with automatic backward-compatible migration. The constant for the dictionary filename is updated to reflect .json extension, tests are adjusted accordingly, and the core cache read/write logic detects and converts legacy format on-the-fly.

Changes

Cohort / File(s) Summary
Translation Updates
demo/new-compiler-vite-react-spa/public/translations/de.json, es.json, fr.json
Translation entries updated across German, Spanish, and French locale files, with selective additions and removals of translation keys.
Dictionary Format Constant
packages/compiler/src/_const.ts
Updated LCP_DICTIONARY_FILE_NAME export from "dictionary.js" to "dictionary.json" to align with new JSON-based cache format.
Test File Updates
packages/compiler/src/_loader-utils.spec.ts, _utils.spec.ts
Test expectations updated to reference dictionary.json instead of dictionary.js across multiple test cases; no logic changes.
Cache Test Suite
packages/compiler/src/lib/lcp/cache.spec.ts
Extensive new test coverage added for JSON cache operations: file creation, migration from legacy export default format, data preservation, and JSON parsing scenarios. Helper functions introduced for cache serialization.
Core Cache Implementation
packages/compiler/src/lib/lcp/cache.ts
Updated ensureDictionaryFile to initialize with JSON object; _format now defaults to JSON parser; _write stores pure JSON; _read gains migration logic to detect and convert legacy format synchronously before parsing as JSON.

Sequence Diagram

sequenceDiagram
    participant App as Application
    participant Reader as Cache Reader
    participant FS as File System
    participant Migrator as Migration Logic

    App->>Reader: Read cache
    Reader->>FS: Read file
    FS-->>Reader: File content
    alt Legacy Format (export default)
        Reader->>Migrator: Detect legacy format
        Migrator->>Migrator: Parse JSON from export default
        Migrator->>FS: Write as pure JSON
        FS-->>Migrator: File written
        Migrator-->>Reader: Parsed object
    else New Format (JSON)
        Reader->>Reader: JSON.parse content
    end
    Reader-->>App: Cache object
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • vrcprl

Poem

🐰 Hops through code with JSON delight,
No more Functions causing fright!
Safe migrations, old to new,
Your cache is secure, through and through!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: migrating the dictionary cache format from JavaScript to JSON, which aligns with the core objective of the PR.
Description check ✅ Passed The PR description comprehensively covers the summary, key changes, implementation details, verification, and includes a completed checklist, exceeding the template requirements.
Linked Issues check ✅ Passed All requirements from issue #1266 are met: JSON format implementation, backward-compatible migration logic, safe parsing without Function constructor, and comprehensive test coverage.
Out of Scope Changes check ✅ Passed All changes are directly related to the JSON migration objective; translation file updates support testing, and core cache logic modifications directly implement the issue requirements.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@packages/compiler/src/lib/lcp/cache.ts`:
- Around line 187-205: Detection uses fileContent.trim() but the subsequent
replace/parse uses the untrimmed fileContent, so files with leading whitespace
pass isLegacyFormat but fail the replacement and JSON.parse; fix by performing
the replacement and JSON.parse against the trimmed input (or use a regex that
allows leading whitespace, e.g. /^\s*export default/), i.e. compute a trimmed
variable (e.g. const trimmed = fileContent.trim()), use that for isLegacyFormat
and for producing jsonContent before JSON.parse, then write the migrated JSON to
cachePath via fs.writeFileSync as before (symbols: fileContent, isLegacyFormat,
jsonContent, JSON.parse, fs.writeFileSync, cachePath).
🧹 Nitpick comments (2)
packages/compiler/src/lib/lcp/cache.spec.ts (1)

464-475: Consider verifying mkdirSync is called when directory doesn't exist.

The test mocks existsSync to return false for all calls, which includes both the cache file and directory checks. The implementation should call mkdirSync in this case, but the test doesn't verify this.

🧪 Optional: Add assertion for directory creation
       LCPCache.ensureDictionaryFile({
         sourceRoot: params.sourceRoot,
         lingoDir: params.lingoDir,
       });

+      expect(mockMkdirSync).toHaveBeenCalledWith(
+        expect.any(String),
+        { recursive: true }
+      );
       expect(mockWriteFileSync).toHaveBeenCalledWith(cachePath, "{}");
packages/compiler/src/lib/lcp/cache.ts (1)

177-209: Consider adding error handling for malformed cache files.

Both JSON.parse calls (Lines 197 and 208) can throw on corrupted/malformed cache files, which could crash the build process. Returning a fresh empty cache on parse errors would make the system more resilient.

🛡️ Optional: Add defensive parsing
   private static _read(params: LCPCacheParams): DictionaryCacheSchema {
     const cachePath = this._getCachePath(params);
     if (!fs.existsSync(cachePath)) {
       return {
         version: 0.1,
         files: {},
       };
     }
     const fileContent = fs.readFileSync(cachePath, "utf8");

     // Detect legacy format (export default {...};) and migrate to JSON
     const isLegacyFormat = fileContent.trim().startsWith("export default");
     
+    const parseOrDefault = (content: string): DictionaryCacheSchema => {
+      try {
+        return JSON.parse(content);
+      } catch {
+        return { version: 0.1, files: {} };
+      }
+    };
+
     if (isLegacyFormat) {
       // Strip 'export default' and trailing semicolon, then parse as JSON
       const jsonContent = fileContent
+        .trim()
         .replace(/^export default/, "")
         .replace(/;\s*$/, "")
         .trim();
       
-      const obj = JSON.parse(jsonContent);
+      const obj = parseOrDefault(jsonContent);
       
       // Migrate: immediately rewrite in new JSON format
       const newContent = JSON.stringify(obj, null, 2);
       fs.writeFileSync(cachePath, newContent);
       
       return obj;
     }

     // New format: parse directly as JSON
-    return JSON.parse(fileContent);
+    return parseOrDefault(fileContent);
   }

Comment on lines +187 to +205
// Detect legacy format (export default {...};) and migrate to JSON
const isLegacyFormat = fileContent.trim().startsWith("export default");

if (isLegacyFormat) {
// Strip 'export default' and trailing semicolon, then parse as JSON
const jsonContent = fileContent
.replace(/^export default/, "")
.replace(/;\s*$/, "")
.trim();

const obj = JSON.parse(jsonContent);

// Migrate: immediately rewrite in new JSON format
// This is a synchronous write to ensure migration happens atomically
const newContent = JSON.stringify(obj, null, 2);
fs.writeFileSync(cachePath, newContent);

return obj;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Potential bug: Leading whitespace handling inconsistency.

The legacy format detection uses fileContent.trim().startsWith() (Line 188), but the replace(/^export default/, "") (Line 193) operates on the original fileContent without trimming. If a legacy file has leading whitespace, detection succeeds but the replacement fails, causing JSON.parse to throw.

🐛 Proposed fix
     // Detect legacy format (export default {...};) and migrate to JSON
     const isLegacyFormat = fileContent.trim().startsWith("export default");
     
     if (isLegacyFormat) {
       // Strip 'export default' and trailing semicolon, then parse as JSON
       const jsonContent = fileContent
+        .trim()
         .replace(/^export default/, "")
         .replace(/;\s*$/, "")
         .trim();
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Detect legacy format (export default {...};) and migrate to JSON
const isLegacyFormat = fileContent.trim().startsWith("export default");
if (isLegacyFormat) {
// Strip 'export default' and trailing semicolon, then parse as JSON
const jsonContent = fileContent
.replace(/^export default/, "")
.replace(/;\s*$/, "")
.trim();
const obj = JSON.parse(jsonContent);
// Migrate: immediately rewrite in new JSON format
// This is a synchronous write to ensure migration happens atomically
const newContent = JSON.stringify(obj, null, 2);
fs.writeFileSync(cachePath, newContent);
return obj;
}
// Detect legacy format (export default {...};) and migrate to JSON
const isLegacyFormat = fileContent.trim().startsWith("export default");
if (isLegacyFormat) {
// Strip 'export default' and trailing semicolon, then parse as JSON
const jsonContent = fileContent
.trim()
.replace(/^export default/, "")
.replace(/;\s*$/, "")
.trim();
const obj = JSON.parse(jsonContent);
// Migrate: immediately rewrite in new JSON format
// This is a synchronous write to ensure migration happens atomically
const newContent = JSON.stringify(obj, null, 2);
fs.writeFileSync(cachePath, newContent);
return obj;
}
🤖 Prompt for AI Agents
In `@packages/compiler/src/lib/lcp/cache.ts` around lines 187 - 205, Detection
uses fileContent.trim() but the subsequent replace/parse uses the untrimmed
fileContent, so files with leading whitespace pass isLegacyFormat but fail the
replacement and JSON.parse; fix by performing the replacement and JSON.parse
against the trimmed input (or use a regex that allows leading whitespace, e.g.
/^\s*export default/), i.e. compute a trimmed variable (e.g. const trimmed =
fileContent.trim()), use that for isLegacyFormat and for producing jsonContent
before JSON.parse, then write the migrated JSON to cachePath via
fs.writeFileSync as before (symbols: fileContent, isLegacyFormat, jsonContent,
JSON.parse, fs.writeFileSync, cachePath).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Harden dictionary cache: replace Function-evaluated JS with JSON (+ migration)

1 participant