fix(cache,scoring): implement cache deserialization and fix doc scoring precision#360
Conversation
…ng precision Implement Assessment cache deserialization by adding from_dict() class methods to all model classes (Citation, Attribute, Remediation, Finding, DiscoveredSkill, AssessmentMetadata, Repository, Assessment) and wiring them into AssessmentCache._deserialize_assessment(). Previously, this method raised NotImplementedError, making the entire caching layer non-functional. Fix integer division precision loss in ArchitectureDecisionsAssessor._check_template_compliance() by changing 20 // len(sample_files) to 20 / len(sample_files) and int() to round(). This fixes systematic underscoring for repositories with non-divisor ADR file counts (e.g., 7 files was capped at 14/20 instead of 20/20). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
Warning
|
| Cohort / File(s) | Summary |
|---|---|
Model Deserialization Methods src/agentready/models/assessment.py, src/agentready/models/attribute.py, src/agentready/models/citation.py, src/agentready/models/discovered_skill.py, src/agentready/models/finding.py, src/agentready/models/metadata.py, src/agentready/models/repository.py |
Added from_dict classmethods to reconstruct model instances from dictionaries. Changes include field mapping, defaults for optional fields, nested model reconstruction, timestamp parsing, and in Repository.from_dict bypassing filesystem/git validation when creating instances. |
Documentation Scoring Precision src/agentready/assessors/documentation.py |
Changed per-file point calculation from integer floor division to true division (20 / len(sample_files)) and replaced truncating int(total_points) with round(total_points) to preserve fractional scoring across files. |
Cache Deserialization Integration src/agentready/services/assessment_cache.py |
_deserialize_assessment() now returns Assessment.from_dict(data) instead of raising; get() treats additional exceptions (KeyError, TypeError, pydantic.ValidationError) as cache misses and returns None for invalid cached data. |
Test Coverage tests/unit/test_cache_deserialization.py, tests/unit/test_doc_scoring_precision.py |
Added extensive unit tests: model from_dict round-trips, cache set/get/invalidate and TTL/expiry and malformed JSON handling, and ADR template scoring precision across 0–10 files (including non-divisor counts and partial compliance cases). |
Sequence Diagram(s)
mermaid
sequenceDiagram
participant Client as Client
participant Cache as AssessmentCache (SQLite)
participant DB as SQLiteDB
participant Model as Assessment.from_dict
rect rgba(200,200,255,0.5)
Client->>Cache: get(assessment_id)
Note right of Cache: read row from assessments
Cache->>DB: SELECT row
DB-->>Cache: row (may be missing/expired/malformed)
alt row missing or expired
Cache-->>Client: None
else valid row
Cache->>Cache: parse JSON -> dict
alt parse error or validation exceptions
Cache-->>Client: None
else
Cache->>Model: Assessment.from_dict(dict)
Model-->>Cache: Assessment instance
Cache-->>Client: Assessment
end
end
end
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | Docstring coverage is 73.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the two main changes: implementing cache deserialization and fixing documentation scoring precision loss. |
| Description check | ✅ Passed | The description clearly explains both the cache deserialization implementation (adding from_dict methods and wiring them into Assessment cache) and the doc scoring precision fix (changing integer to float division and rounding). |
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
✨ Finishing Touches
🧪 Generate unit tests (beta)
- Create PR with unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Comment @coderabbitai help to get the list of available commands and usage tips.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/agentready/models/finding.py`:
- Around line 37-47: The from_dict in Remediation currently defaults steps to []
which violates the non-empty steps invariant enforced by __post_init__; change
from_dict (Remediation.from_dict) to require and validate a non-empty steps
list: read steps = data.get("steps") (or data["steps"]) and if steps is missing
or empty raise a clear ValueError (e.g. "Remediation.from_dict requires a
non-empty 'steps'"), otherwise pass the validated steps into the constructor;
keep the rest of the fields (tools, commands, examples, citations) as-is.
In `@src/agentready/models/repository.py`:
- Around line 77-93: from_dict currently calls cls(...) which triggers
Repository.__post_init__ filesystem validation and prevents loading cached
repositories whose local path moved; change from_dict to construct the instance
without running __post_init__ (e.g., use object.__new__(Repository) and then set
attributes like path, name, url, branch, commit_hash, languages, total_files,
total_lines) or add an internal bypass flag checked by __post_init__ to skip FS
validation during deserialization; ensure the created object matches the same
attribute names used in Repository and preserves existing types (Path for path)
so other code can use it unchanged.
In `@src/agentready/services/assessment_cache.py`:
- Line 253: The get() method currently returns Assessment.from_dict(data)
directly, allowing KeyError/TypeError (or other deserialization errors) from
malformed cache rows to escape; modify get() so the call to
Assessment.from_dict(data) is wrapped in a try/except that catches
deserialization exceptions (e.g., KeyError, TypeError, ValueError,
JSONDecodeError) and handles them by logging the error and returning None (or
the cache-miss sentinel used elsewhere) instead of letting the exception bubble.
Ensure you reference the Assessment.from_dict call and keep the surrounding
cache read logic intact.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7d7c626a-a506-44c9-9d20-2309ebd09d1a
📒 Files selected for processing (11)
src/agentready/assessors/documentation.pysrc/agentready/models/assessment.pysrc/agentready/models/attribute.pysrc/agentready/models/citation.pysrc/agentready/models/discovered_skill.pysrc/agentready/models/finding.pysrc/agentready/models/metadata.pysrc/agentready/models/repository.pysrc/agentready/services/assessment_cache.pytests/unit/test_cache_deserialization.pytests/unit/test_doc_scoring_precision.py
| @classmethod | ||
| def from_dict(cls, data: dict) -> "Remediation": | ||
| """Create Remediation from dictionary.""" | ||
| return cls( | ||
| summary=data["summary"], | ||
| steps=data.get("steps", []), | ||
| tools=data.get("tools", []), | ||
| commands=data.get("commands", []), | ||
| examples=data.get("examples", []), | ||
| citations=[Citation.from_dict(c) for c in data.get("citations", [])], | ||
| ) |
There was a problem hiding this comment.
from_dict can generate invalid Remediation instances.
Line 42 defaults steps to [], but __post_init__ rejects empty steps. Missing steps in input will fail deserialization instead of recovering safely.
Proposed fix (preserve invariant on missing steps)
def from_dict(cls, data: dict) -> "Remediation":
"""Create Remediation from dictionary."""
+ steps = data.get("steps")
+ if not steps:
+ steps = [data["summary"]]
return cls(
summary=data["summary"],
- steps=data.get("steps", []),
+ steps=steps,
tools=data.get("tools", []),
commands=data.get("commands", []),
examples=data.get("examples", []),
citations=[Citation.from_dict(c) for c in data.get("citations", [])],
)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/agentready/models/finding.py` around lines 37 - 47, The from_dict in
Remediation currently defaults steps to [] which violates the non-empty steps
invariant enforced by __post_init__; change from_dict (Remediation.from_dict) to
require and validate a non-empty steps list: read steps = data.get("steps") (or
data["steps"]) and if steps is missing or empty raise a clear ValueError (e.g.
"Remediation.from_dict requires a non-empty 'steps'"), otherwise pass the
validated steps into the constructor; keep the rest of the fields (tools,
commands, examples, citations) as-is.
- Remediation.from_dict(): fall back to summary as step when steps is missing/empty, preserving the non-empty steps invariant - Repository.from_dict(): skip filesystem validation via object.__new__() so cached assessments remain readable after the repo is moved/deleted - AssessmentCache.get(): catch KeyError and TypeError from malformed cache entries to prevent crashes on corrupt data - Add tests for all three edge cases Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📈 Test Coverage Report
Coverage calculated from unit tests only |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/agentready/services/assessment_cache.py`:
- Around line 111-112: The except block in get() currently catches
sqlite3.Error, json.JSONDecodeError, ValueError, KeyError, and TypeError but not
Pydantic's ValidationError thrown by Assessment.from_dict() /
Config.model_validate; update the exception tuple to also include
pydantic.ValidationError (import it if missing) so corrupted config data will be
caught and get() will return None instead of raising.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 13afab3c-1ac7-4d87-af49-917fe709a08f
📒 Files selected for processing (4)
src/agentready/models/finding.pysrc/agentready/models/repository.pysrc/agentready/services/assessment_cache.pytests/unit/test_cache_deserialization.py
|
@jeremyeder - should be good for you to take a look and see if we can merge this one. |
Config.model_validate() raises pydantic.ValidationError (not a subclass of ValueError) when cached config data is malformed. Add it to the except tuple so corrupted entries return None instead of crashing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
new tests all passing, lgtm. |
This is based on the agentready deep instrospection done by Amber agent.
more details here:
agentready-deep-review.md
Implement Assessment cache deserialization by adding from_dict() class methods to all model classes (Citation, Attribute, Remediation, Finding, DiscoveredSkill, AssessmentMetadata, Repository, Assessment) and wiring them into AssessmentCache._deserialize_assessment(). Previously, this method raised NotImplementedError, making the entire caching layer non-functional.
Fix integer division precision loss in
ArchitectureDecisionsAssessor._check_template_compliance() by changing 20 // len(sample_files) to 20 / len(sample_files) and int() to round(). This fixes systematic underscoring for repositories with non-divisor ADR file counts (e.g., 7 files was capped at 14/20 instead of 20/20).