diff --git a/.flake8 b/.flake8 deleted file mode 100644 index 2f3d98d..0000000 --- a/.flake8 +++ /dev/null @@ -1,16 +0,0 @@ -[flake8] -max-line-length = 88 -extend-ignore = E203, W503, C901 -per-file-ignores = - __init__.py:F401 - tests/*:D100,D101,D102,D103 -exclude = - .git, - __pycache__, - .venv, - venv, - build, - dist, - *.egg-info, - .pytest_cache, - .mypy_cache diff --git a/.github/QUICK_START.md b/.github/QUICK_START.md deleted file mode 100644 index 6ca72e0..0000000 --- a/.github/QUICK_START.md +++ /dev/null @@ -1,140 +0,0 @@ -# CI/CD Quick Start Guide - -## First Time Setup - -### 1. Enable GitHub Environments -``` -Settings → Environments → New environment -- Create "test-pypi" (optional reviewers) -- Create "pypi" (require reviewers, main branch only) -``` - -### 2. Configure PyPI Trusted Publishing -**On PyPI.org:** -``` -Account Settings → Publishing → Add GitHub OIDC publisher -- Repository: xarf/xarf-parser-python -- Workflow: publish-pypi.yml -- Environment: pypi -``` - -**On Test PyPI (test.pypi.org):** -``` -Same steps but with environment: test-pypi -``` - -### 3. Enable Branch Protection -``` -Settings → Branches → Add rule -Branch: main -☑ Require status checks: - - Quality Checks / quality-checks - - Test Suite / test - - CI Summary / ci-summary -☑ Require PR reviews: 1 approval -``` - -## Testing the Pipeline - -### Test PR Workflow -```bash -git checkout -b test-pipeline -echo "# test" >> README.md -git add . 
&& git commit -m "Test CI" -git push origin test-pipeline -# Create PR on GitHub -``` - -### Test Security Scan -``` -GitHub → Actions → Security Scan → Run workflow -``` - -### Test Publishing (Test PyPI) -``` -GitHub → Actions → Publish to PyPI → Run workflow -Select: ☑ Publish to Test PyPI -``` - -### Test Release (Production) -```bash -git tag v4.0.0 -git push origin v4.0.0 -# Create release on GitHub → publishes automatically -``` - -## Common Commands - -### Run Tests Locally -```bash -pip install -e ".[dev,test]" -pytest --cov=xarf -``` - -### Run Quality Checks Locally -```bash -isort --check xarf/ tests/ -black --check xarf/ tests/ -flake8 xarf/ tests/ -bandit -r xarf/ -mypy xarf/ -pydocstyle xarf/ -radon cc --min B xarf/ -``` - -### Run Security Scans Locally -```bash -pip-audit -bandit -r xarf/ -``` - -## Monitoring - -### Check Workflow Status -``` -GitHub → Actions → View runs -``` - -### Check Security Issues -``` -GitHub → Security → Code scanning alerts -``` - -### Download Artifacts -``` -Actions → Workflow run → Artifacts section -``` - -## Troubleshooting - -### Quality Checks Fail -```bash -# Fix imports -isort xarf/ tests/ - -# Fix formatting -black xarf/ tests/ - -# Show what would be fixed -black --diff xarf/ -``` - -### Coverage Too Low -```bash -# Run with coverage report -pytest --cov=xarf --cov-report=html -open htmlcov/index.html -``` - -### Publishing Fails -1. Verify trusted publishing on PyPI -2. Check environment permissions -3. 
Ensure release is published (not draft) - -## Documentation - -- **Full Design**: [docs/ci-cd-pipeline-design.md](../docs/ci-cd-pipeline-design.md) -- **Workflows**: [.github/workflows/WORKFLOWS_README.md](workflows/WORKFLOWS_README.md) - ---- -**Need Help?** Check the troubleshooting section in ci-cd-pipeline-design.md diff --git a/.github/WORKFLOWS_SUMMARY.md b/.github/WORKFLOWS_SUMMARY.md deleted file mode 100644 index bad2ab6..0000000 --- a/.github/WORKFLOWS_SUMMARY.md +++ /dev/null @@ -1,403 +0,0 @@ -# GitHub Actions Workflows - Implementation Summary - -## Overview - -Successfully created 4 comprehensive GitHub Actions workflows for the xarf-parser-python project, adapted from abusix-parsers best practices while removing AWS/CodeArtifact dependencies. - -## Created Workflows - -### 1. **quality-checks.yml** (162 lines) - -Parallel execution of code quality and security checks using matrix strategy. - -**Key Features:** -- ✅ Blocking checks: isort, black, flake8, bandit -- ⚠️ Warning checks: mypy, pydocstyle, radon, pytest-cov -- Matrix-based parallel execution for speed -- Artifact uploads for logs and coverage -- Configurable timeouts per check - -**Differences from abusix-parsers:** -- ❌ Removed: AWS OIDC authentication -- ❌ Removed: CodeArtifact setup -- ❌ Removed: Poetry dependency (using pip + setuptools) -- ❌ Removed: Trivy scanner (moved to security-scan.yml) -- ✅ Added: Direct pip installation with caching -- ✅ Added: Editable install for coverage check -- ✅ Simplified: No custom GitHub actions needed -- 🔧 Adjusted: Tool versions and paths for xarf project - -**Tools & Versions:** -- isort 5.13.2 -- black 24.3.0 -- flake8 7.0.0 -- bandit 1.7.8 -- mypy 1.9.0 -- pydocstyle 6.3.0 -- radon 6.0.1 -- pytest-cov (latest) - -### 2. **security-scan.yml** (216 lines) - -Weekly security scanning with automatic issue creation. 
- -**Key Features:** -- 🔒 Three scan types: pip-audit, bandit, trivy -- 📅 Scheduled: Weekly on Monday 9 AM UTC -- 🐛 Auto-creates GitHub issues on scheduled failures -- 📊 SARIF reports uploaded to GitHub Security tab -- 💾 90-day artifact retention for audit trail - -**Differences from abusix-parsers:** -- ✅ Added: pip-audit for dependency CVE scanning -- ✅ Added: Automatic GitHub issue creation -- ✅ Added: Trivy filesystem scanning with SARIF -- ✅ Added: Security summary job -- 🔧 Adjusted: Scan paths and configuration - -**Schedule:** -- Cron: `0 9 * * 1` (Every Monday 9 AM UTC) -- Also runs on: Push to main, PR, workflow_dispatch - -### 3. **test.yml** (168 lines) - -Comprehensive test matrix across Python versions and platforms. - -**Key Features:** -- 🐍 Python versions: 3.8, 3.9, 3.10, 3.11, 3.12 -- 💻 Platforms: Ubuntu (all), macOS (3.12), Windows (3.12) -- 📊 Coverage upload to Codecov -- 🧪 Minimum dependency version testing -- 🔗 Integration test job (conditional) - -**Differences from abusix-parsers:** -- ❌ Removed: Poetry/CodeArtifact dependency -- ✅ Added: Multi-platform testing (macOS, Windows) -- ✅ Added: Minimum version compatibility test -- ✅ Added: Codecov integration -- ✅ Added: Integration test placeholder -- 🔧 Simplified: Direct pip installation - -**Matrix Strategy:** -```yaml -strategy: - fail-fast: false - matrix: - python-version: ['3.8', '3.9', '3.10', '3.11', '3.12'] - os: [ubuntu-latest] - include: - - python-version: '3.12' - os: macos-latest - - python-version: '3.12' - os: windows-latest -``` - -### 4. **publish.yml** (202 lines) - -Automated PyPI publishing with validation and testing. 
- -**Key Features:** -- 🚀 Trusted Publishing (no API tokens needed) -- ✅ Pre-publish validation and testing -- 📦 Builds both wheel and sdist -- 🎯 Dual targets: PyPI and Test PyPI -- 🏷️ Triggered by GitHub releases - -**Differences from abusix-parsers:** -- ❌ Removed: CodeArtifact publishing -- ❌ Removed: AWS authentication -- ✅ Added: Test PyPI support -- ✅ Added: Version validation from pyproject.toml -- ✅ Added: Tag/version matching check -- ✅ Added: Pre-publish quality checks -- ✅ Added: Manual dispatch with test_pypi flag -- 🔧 Using: PyPA trusted publishing (OIDC) - -**Publishing Logic:** -- Prerelease → Test PyPI -- Release → PyPI -- Manual dispatch → Configurable via input - -## Key Adaptations from abusix-parsers - -### Removed Components -1. **AWS Integration** - - No OIDC authentication - - No CodeArtifact repository - - No assume-role secrets - -2. **Poetry Dependency** - - Replaced with pip + setuptools - - Direct editable installs: `pip install -e ".[dev,test]"` - - Simpler dependency management - -3. **Custom GitHub Actions** - - No `.github/actions/setup-poetry` - - Direct action usage only - -### Added Features -1. **Enhanced Security** - - Dedicated security-scan workflow - - Weekly automated scans - - Automatic issue creation - - SARIF reporting to GitHub Security - -2. **Improved Testing** - - Multi-platform support (Linux, macOS, Windows) - - Minimum version compatibility tests - - Codecov integration - - Integration test framework - -3. 
**Better Publishing** - - Trusted Publishing support - - Test PyPI option - - Version validation - - Pre-publish test gate - -### Configuration Files - -The workflows reference configuration in `pyproject.toml`: - -```toml -[tool.black] -line-length = 88 -target-version = ["py38"] - -[tool.isort] -profile = "black" -line_length = 88 - -[tool.mypy] -python_version = "3.8" -strict = true - -[tool.pytest.ini_options] -addopts = "-v --cov=xarf --cov-report=term-missing --cov-report=html" -testpaths = ["tests"] - -[tool.coverage.run] -source = ["xarf"] -omit = ["tests/*", "setup.py"] -``` - -## Setup Requirements - -### 1. PyPI Trusted Publishing - -Configure at https://pypi.org/manage/account/publishing/ - -**PyPI Settings:** -- Project: `xarf-parser` -- Owner: `xarf` (or your GitHub org/user) -- Repository: `xarf-parser-python` -- Workflow: `publish.yml` -- Environment: `pypi` - -**Test PyPI Settings:** -Repeat at https://test.pypi.org with environment: `test-pypi` - -### 2. GitHub Environments (Optional) - -Create environments in repository settings: -- `pypi` - Production PyPI publishing -- `test-pypi` - Test PyPI publishing - -### 3. Branch Protection (Recommended) - -Configure for `main` branch: -- ✅ Require status checks: quality-checks, test -- ✅ Require branches to be up to date -- ✅ Require linear history -- ✅ Include administrators - -### 4. Codecov (Optional) - -1. Sign up at https://codecov.io -2. Connect GitHub repository -3. 
No token needed for public repos - -## Workflow Execution Flow - -``` -┌─────────────────────────────────────────────────────┐ -│ Push/PR to main/develop │ -└──────────────────┬──────────────────────────────────┘ - │ - ├─────────────────┐ - │ │ - ▼ ▼ - ┌─────────────────┐ ┌──────────────┐ - │ Quality Checks │ │ Tests │ - │ (Parallel) │ │ (Matrix) │ - └────────┬────────┘ └──────┬───────┘ - │ │ - └────────┬─────────┘ - │ - ▼ - ┌──────────┐ - │ Merge │ - └─────┬────┘ - │ - ▼ - ┌────────────────┐ - │ Create Release │ - └────────┬────────┘ - │ - ▼ - ┌────────────────────────┐ - │ Publish Workflow │ - │ 1. Validate │ - │ 2. Test │ - │ 3. Quality Check │ - │ 4. Build │ - │ 5. Publish to PyPI │ - └────────────────────────┘ -``` - -## Monitoring & Maintenance - -### Weekly Tasks -- Review security scan results (Monday mornings) -- Address any security issues found -- Update vulnerable dependencies - -### Monthly Tasks -- Update GitHub Actions versions -- Review and update tool versions -- Check for new best practices - -### Quarterly Tasks -- Review workflow efficiency -- Update Python version matrix -- Audit security configurations - -### On Python Release -- Add new Python version to test matrix -- Update classifiers in pyproject.toml -- Test compatibility - -## Performance Metrics - -Compared to sequential execution: - -| Metric | Sequential | Parallel (Matrix) | Improvement | -|--------|-----------|-------------------|-------------| -| Quality Checks | ~15 min | ~5 min | 3x faster | -| Test Suite | ~25 min | ~8 min | 3.1x faster | -| Total CI Time | ~40 min | ~13 min | 3x faster | - -**Note:** Times are estimates based on similar projects. Actual times depend on test complexity and runner availability. 
- -## Artifact Retention - -| Artifact | Retention | Purpose | -|----------|-----------|---------| -| Coverage Reports | 30 days | Code coverage analysis | -| Test Results | 7 days | Debugging test failures | -| Security Scans | 90 days | Audit trail and compliance | -| Build Packages | 30 days | Distribution packages | -| Check Logs | 7 days | Debugging quality issues | - -## Best Practices Implemented - -1. ✅ **Parallel Execution**: Matrix strategy for speed -2. ✅ **Fail-Fast Disabled**: See all failures in one run -3. ✅ **Continue on Error**: Non-blocking checks don't fail builds -4. ✅ **Caching**: Pip cache for faster installs -5. ✅ **Retry Logic**: Implicit in GitHub Actions -6. ✅ **Timeouts**: Prevent hanging jobs -7. ✅ **Artifact Uploads**: Preserve important files -8. ✅ **Summary Jobs**: Clear pass/fail indicators -9. ✅ **Security First**: Dedicated security workflow -10. ✅ **Version Pinning**: Specific tool versions - -## Troubleshooting - -### Common Issues - -**1. Quality checks fail on first run** -- Expected on legacy code -- Run formatters locally first: - ```bash - black xarf/ tests/ - isort --profile black xarf/ tests/ - ``` - -**2. Security scan finds vulnerabilities** -- Review severity levels -- Update dependencies: `pip install --upgrade ` -- Use `pip-audit --fix` for automatic fixes - -**3. Tests fail on specific Python version** -- Check for syntax incompatibilities -- Review dependency version constraints -- Test locally with specific version - -**4. Publishing fails with authentication error** -- Verify Trusted Publishing configuration -- Check environment names match exactly -- Ensure repository settings are correct - -**5. 
Coverage below threshold** -- Add tests for uncovered code -- Update coverage thresholds in pyproject.toml -- Review coverage report: `coverage.json` - -## Files Created - -``` -.github/ -└── workflows/ - ├── README.md # Detailed documentation - ├── quality-checks.yml # Code quality & security - ├── security-scan.yml # Weekly security scanning - ├── test.yml # Test matrix - └── publish.yml # PyPI publishing -``` - -**Total Lines of Code:** 748 (excluding README) - -## Next Steps - -1. **Test Workflows** - ```bash - # Push to trigger workflows - git add .github/workflows/ - git commit -m "Add GitHub Actions workflows" - git push - ``` - -2. **Configure PyPI** - - Set up Trusted Publishing - - Create environments - -3. **Review First Run** - - Check all jobs complete - - Address any failures - - Review artifact uploads - -4. **Add Badges to README** - ```markdown - [![Quality](https://github.com/xarf/xarf-parser-python/actions/workflows/quality-checks.yml/badge.svg)](https://github.com/xarf/xarf-parser-python/actions/workflows/quality-checks.yml) - [![Tests](https://github.com/xarf/xarf-parser-python/actions/workflows/test.yml/badge.svg)](https://github.com/xarf/xarf-parser-python/actions/workflows/test.yml) - [![Security](https://github.com/xarf/xarf-parser-python/actions/workflows/security-scan.yml/badge.svg)](https://github.com/xarf/xarf-parser-python/actions/workflows/security-scan.yml) - ``` - -5. 
**Monitor First Week** - - Watch for security scan on Monday - - Verify PR checks work correctly - - Check artifact retention - -## Support & Documentation - -- Workflow documentation: `.github/workflows/README.md` -- GitHub Actions docs: https://docs.github.com/actions -- PyPI Trusted Publishing: https://docs.pypi.org/trusted-publishers/ -- Issues: Open in repository with workflow logs - ---- - -**Implementation Date:** 2025-11-20 -**Based on:** abusix-parsers workflows -**Adapted for:** xarf-parser-python (pip + setuptools) -**Status:** ✅ Ready for testing diff --git a/.github/trivy.yaml b/.github/trivy.yaml deleted file mode 100644 index 2d7cca4..0000000 --- a/.github/trivy.yaml +++ /dev/null @@ -1,58 +0,0 @@ -# Trivy configuration for XARF Python parser -# Based on abusix-parsers security standards - -# Scan settings -scan: - # Scan for both vulnerabilities and secrets - security-checks: - - vuln - - secret - -# Vulnerability settings -vulnerability: - # Type of vulnerability sources - type: - - os - - library - - # Severity levels to report - severity: - - LOW - - MEDIUM - - HIGH - - CRITICAL - -# Secret scanning settings -secret: - # Additional secret scanning patterns - config: | - # AWS credentials - - name: AWS Access Key ID - regex: '(A3T[A-Z0-9]|AKIA|AGPA|AIDA|AROA|AIPA|ANPA|ANVA|ASIA)[A-Z0-9]{16}' - - # API Keys - - name: Generic API Key - regex: '(?i)(api[_-]?key|apikey)["\s:=]+[a-zA-Z0-9_\-]{20,}' - - # Private Keys - - name: Private Key - regex: '-----BEGIN (RSA|DSA|EC|OPENSSH) PRIVATE KEY-----' - - # GitHub tokens - - name: GitHub Token - regex: 'gh[pousr]_[A-Za-z0-9_]{36,}' - -# File path patterns to skip -skip-files: - - "**/.git/**" - - "**/node_modules/**" - - "**/.venv/**" - - "**/venv/**" - - "**/__pycache__/**" - - "**/*.pyc" - - "**/dist/**" - - "**/build/**" - - "**/.pytest_cache/**" - - "**/.mypy_cache/**" - - "**/htmlcov/**" - - "**/*.egg-info/**" diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 61513ca..a53579b 
100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -10,12 +10,12 @@ permissions: contents: read jobs: - version-checks: + test: name: Python ${{ matrix.python-version }} Tests runs-on: ubuntu-latest strategy: matrix: - python-version: ['3.8', '3.9', '3.10', '3.11', '3.12'] + python-version: ['3.10', '3.11', '3.12', '3.13'] steps: - uses: actions/checkout@v4 @@ -35,45 +35,14 @@ jobs: run: pytest --cov=xarf --cov-report=term -v tests/ - name: Upload coverage - if: matrix.python-version == '3.11' + if: matrix.python-version == '3.12' uses: codecov/codecov-action@v4 with: fail_ci_if_error: false code-quality: - name: Code Quality - ${{ matrix.check.name }} + name: Code Quality runs-on: ubuntu-latest - strategy: - fail-fast: false - matrix: - check: - - name: "Format (black)" - cmd: "black --check ." - error: false - - name: "Imports (isort)" - cmd: "isort --check-only --profile black ." - error: false - - name: "Linting (flake8)" - cmd: "flake8 xarf/ tests/" - error: false - - name: "Security (bandit)" - cmd: "bandit -r xarf/ -ll" - error: false - - name: "Types (mypy)" - cmd: "mypy xarf/" - error: false - - name: "Complexity (radon)" - cmd: "radon cc xarf/ -a -nb" - error: false - - name: "Maintainability (radon)" - cmd: "radon mi xarf/ -nb" - error: false - - name: "Docstrings (pydocstyle)" - cmd: "pydocstyle xarf/" - error: false - - name: "Dead code (vulture)" - cmd: "vulture xarf/ .vulture_whitelist.py --min-confidence 80" - error: false steps: - uses: actions/checkout@v4 @@ -81,14 +50,22 @@ jobs: - name: Set up Python uses: actions/setup-python@v5 with: - python-version: '3.11' + python-version: '3.12' cache: 'pip' - name: Install dependencies run: | python -m pip install --upgrade pip - pip install -e ".[dev,test]" + pip install -e ".[dev]" + + - name: Lint (ruff) + run: ruff check . + + - name: Format (ruff) + run: ruff format --check . 
+ + - name: Types (mypy) + run: mypy --strict xarf/ - - name: Run ${{ matrix.check.name }} - run: ${{ matrix.check.cmd }} - continue-on-error: ${{ matrix.check.error }} + - name: Security (bandit) + run: bandit -r xarf/ diff --git a/.gitignore b/.gitignore index a696c36..09d2f5a 100644 --- a/.gitignore +++ b/.gitignore @@ -45,24 +45,15 @@ coverage.xml # MyPy .mypy_cache/ + +# Ruff +.ruff_cache/ .dmypy.json dmypy.json -# IDE - VS Code (keep shared settings, ignore user-specific) -.vscode/* -!.vscode/settings.json -!.vscode/extensions.json -!.vscode/tasks.json -!.vscode/launch.json - -# IDE - IntelliJ IDEA (keep shared settings, ignore user-specific) -.idea/* -!.idea/inspectionProfiles/ -!.idea/codeStyles/ -!.idea/runConfigurations/ -!.idea/vcs.xml -!.idea/misc.xml -!.idea/.gitignore +# IDE +.vscode/ +.idea/ # IDE - Other *.swp diff --git a/.idea/.gitignore b/.idea/.gitignore deleted file mode 100644 index 06facd2..0000000 --- a/.idea/.gitignore +++ /dev/null @@ -1,30 +0,0 @@ -# User-specific stuff -workspace.xml -tasks.xml -usage.statistics.xml -dictionaries -shelf/ - -# Generated files -contentModel.xml -uiDesigner.xml - -# Sensitive or high-churn files -dataSources/ -dataSources.ids -dataSources.local.xml -sqlDataSources.xml -dynamic.xml - -# Gradle -.idea/**/gradle.xml -.idea/**/libraries - -# CMake -cmake-build-*/ - -# File-based project format -*.iws - -# IntelliJ -out/ diff --git a/.idea/codeStyles/Project.xml b/.idea/codeStyles/Project.xml deleted file mode 100644 index 0958f6f..0000000 --- a/.idea/codeStyles/Project.xml +++ /dev/null @@ -1,15 +0,0 @@ - - - - diff --git a/.idea/codeStyles/codeStyleConfig.xml b/.idea/codeStyles/codeStyleConfig.xml deleted file mode 100644 index 0f7bc51..0000000 --- a/.idea/codeStyles/codeStyleConfig.xml +++ /dev/null @@ -1,5 +0,0 @@ - - - - diff --git a/.idea/inspectionProfiles/Project_Default.xml b/.idea/inspectionProfiles/Project_Default.xml deleted file mode 100644 index 7b746be..0000000 --- 
a/.idea/inspectionProfiles/Project_Default.xml +++ /dev/null @@ -1,27 +0,0 @@ - - - - diff --git a/.idea/misc.xml b/.idea/misc.xml deleted file mode 100644 index 49ffe2c..0000000 --- a/.idea/misc.xml +++ /dev/null @@ -1,7 +0,0 @@ - - - - - - diff --git a/.idea/runConfigurations/Format_Code__Black_.xml b/.idea/runConfigurations/Format_Code__Black_.xml deleted file mode 100644 index 5fbf44a..0000000 --- a/.idea/runConfigurations/Format_Code__Black_.xml +++ /dev/null @@ -1,17 +0,0 @@ - - - - diff --git a/.idea/runConfigurations/Pre_commit_All.xml b/.idea/runConfigurations/Pre_commit_All.xml deleted file mode 100644 index 8a19ecd..0000000 --- a/.idea/runConfigurations/Pre_commit_All.xml +++ /dev/null @@ -1,17 +0,0 @@ - - - - diff --git a/.idea/runConfigurations/Tests.xml b/.idea/runConfigurations/Tests.xml deleted file mode 100644 index b98924c..0000000 --- a/.idea/runConfigurations/Tests.xml +++ /dev/null @@ -1,18 +0,0 @@ - - - - - diff --git a/.idea/vcs.xml b/.idea/vcs.xml deleted file mode 100644 index 5ace414..0000000 --- a/.idea/vcs.xml +++ /dev/null @@ -1,6 +0,0 @@ - - - - - - diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 439508b..7c8c9a8 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -1,116 +1,19 @@ -# Pre-commit hooks configuration -# Install: pip install pre-commit && pre-commit install -# Run manually: pre-commit run --all-files -# Update hooks: pre-commit autoupdate - repos: - # Code formatting - black - - repo: https://github.com/psf/black - rev: 24.10.0 - hooks: - - id: black - args: [--line-length=88] - - # Import sorting - isort - - repo: https://github.com/PyCQA/isort - rev: 5.13.2 + - repo: https://github.com/astral-sh/ruff-pre-commit + rev: v0.4.10 hooks: - - id: isort - args: [--profile=black, --line-length=88] + - id: ruff + args: [--fix] + - id: ruff-format - # Linting - flake8 - - repo: https://github.com/PyCQA/flake8 - rev: 7.1.1 - hooks: - - id: flake8 - args: [--max-line-length=100, 
--extend-ignore=E203,W503,C901] - additional_dependencies: [flake8-docstrings] - - # Security scanning - bandit - - repo: https://github.com/PyCQA/bandit - rev: 1.7.10 - hooks: - - id: bandit - args: [-r, xarf/, -ll] - exclude: ^tests/ - - # Type checking - mypy - repo: https://github.com/pre-commit/mirrors-mypy - rev: v1.13.0 + rev: v1.10.0 hooks: - id: mypy - args: [--strict, --python-version=3.8] - additional_dependencies: [pydantic>=2.0.0, types-python-dateutil] - files: ^xarf/ - - # Dead code detection - vulture - - repo: https://github.com/jendrikseipp/vulture - rev: v2.13 - hooks: - - id: vulture - args: [xarf/, .vulture_whitelist.py, --min-confidence=80] + additional_dependencies: [pydantic, types-jsonschema] - # Docstring style - pydocstyle - - repo: https://github.com/PyCQA/pydocstyle - rev: 6.3.0 - hooks: - - id: pydocstyle - args: [--convention=google, --add-ignore=D100,D104,D105,D107] - files: ^xarf/ - exclude: ^tests/ - - # Code complexity - radon - - repo: local - hooks: - - id: radon-cc - name: radon complexity check - entry: radon - language: system - args: [cc, xarf/, -a, -nb] - files: ^xarf/.*\.py$ - pass_filenames: false - - id: radon-mi - name: radon maintainability check - entry: radon - language: system - args: [mi, xarf/, -nb] - files: ^xarf/.*\.py$ - pass_filenames: false - - # YAML validation - - repo: https://github.com/pre-commit/pre-commit-hooks - rev: v5.0.0 - hooks: - - id: check-yaml - - id: check-json - exclude: ^\.vscode/.*\.json$ - - id: check-toml - - id: end-of-file-fixer - - id: trailing-whitespace - - id: check-added-large-files - args: [--maxkb=1000] - - id: check-merge-conflict - - id: check-case-conflict - - id: detect-private-key - - id: mixed-line-ending - args: [--fix=lf] - - # Python security checks - - repo: https://github.com/Lucas-C/pre-commit-hooks-safety - rev: v1.3.3 + - repo: https://github.com/PyCQA/bandit + rev: 1.7.9 hooks: - - id: python-safety-dependencies-check - files: pyproject.toml - -# CI/CD equivalent 
hooks (informational only, not blocking) -ci: - autofix_commit_msg: | - [pre-commit.ci] auto fixes from pre-commit.com hooks - - for more information, see https://pre-commit.ci - autofix_prs: true - autoupdate_branch: '' - autoupdate_commit_msg: '[pre-commit.ci] pre-commit autoupdate' - autoupdate_schedule: weekly - skip: [python-safety-dependencies-check] - submodules: false + - id: bandit + args: [-r, xarf/] diff --git a/.vscode/extensions.json b/.vscode/extensions.json deleted file mode 100644 index a174158..0000000 --- a/.vscode/extensions.json +++ /dev/null @@ -1,39 +0,0 @@ -{ - "recommendations": [ - // Python essentials - "ms-python.python", - "ms-python.vscode-pylance", - "ms-python.black-formatter", - "ms-python.isort", - "ms-python.mypy-type-checker", - - // Testing - "littlefoxteam.vscode-python-test-adapter", - "ryanluker.vscode-coverage-gutters", - - // Code quality - "ms-python.flake8", - "usernamehw.errorlens", - - // Git - "eamodio.gitlens", - "mhutchie.git-graph", - - // YAML/JSON - "redhat.vscode-yaml", - "ZainChen.json", - - // GitHub - "github.vscode-pull-request-github", - "github.copilot", - - // Markdown - "yzhang.markdown-all-in-one", - "DavidAnson.vscode-markdownlint", - - // Utilities - "christian-kohler.path-intellisense", - "visualstudioexptteam.vscodeintellicode", - "gruntfuggly.todo-tree" - ] -} diff --git a/.vscode/launch.json b/.vscode/launch.json deleted file mode 100644 index 8c0bb71..0000000 --- a/.vscode/launch.json +++ /dev/null @@ -1,51 +0,0 @@ -{ - "version": "0.2.0", - "configurations": [ - { - "name": "Python: Current File", - "type": "debugpy", - "request": "launch", - "program": "${file}", - "console": "integratedTerminal", - "justMyCode": true - }, - { - "name": "Python: Tests", - "type": "debugpy", - "request": "launch", - "module": "pytest", - "args": [ - "-v", - "--cov=xarf", - "--cov-report=term-missing", - "tests/" - ], - "console": "integratedTerminal", - "justMyCode": false - }, - { - "name": "Python: Current Test 
File", - "type": "debugpy", - "request": "launch", - "module": "pytest", - "args": [ - "-v", - "${file}" - ], - "console": "integratedTerminal", - "justMyCode": false - }, - { - "name": "Python: Specific Test Function", - "type": "debugpy", - "request": "launch", - "module": "pytest", - "args": [ - "-v", - "${file}::${selectedText}" - ], - "console": "integratedTerminal", - "justMyCode": false - } - ] -} diff --git a/.vscode/settings.json b/.vscode/settings.json deleted file mode 100644 index c74ad99..0000000 --- a/.vscode/settings.json +++ /dev/null @@ -1,104 +0,0 @@ -{ - // Python configuration - "python.defaultInterpreterPath": "${workspaceFolder}/.venv/bin/python", - "python.terminal.activateEnvironment": true, - - // Formatting - "[python]": { - "editor.defaultFormatter": "ms-python.black-formatter", - "editor.formatOnSave": true, - "editor.codeActionsOnSave": { - "source.organizeImports": "explicit" - } - }, - - // Black formatter - "black-formatter.args": [ - "--line-length=88" - ], - - // isort - "isort.args": [ - "--profile=black", - "--line-length=88" - ], - - // Linting - "python.linting.enabled": true, - "python.linting.flake8Enabled": true, - "python.linting.flake8Args": [ - "--max-line-length=100", - "--extend-ignore=E203,W503,C901" - ], - "python.linting.banditEnabled": true, - "python.linting.banditArgs": [ - "-r", - "xarf/", - "-ll" - ], - "python.linting.mypyEnabled": true, - "python.linting.mypyArgs": [ - "--strict", - "--python-version=3.8" - ], - "python.linting.pydocstyleEnabled": true, - "python.linting.pydocstyleArgs": [ - "--convention=google", - "--add-ignore=D100,D104,D105,D107" - ], - "python.linting.lintOnSave": true, - - // Testing - "python.testing.pytestEnabled": true, - "python.testing.unittestEnabled": false, - "python.testing.pytestArgs": [ - "tests", - "-v", - "--cov=xarf", - "--cov-report=term-missing" - ], - "python.testing.autoTestDiscoverOnSaveEnabled": true, - - // Editor - "editor.rulers": [88, 100], - 
"editor.trimAutoWhitespace": true, - "files.trimTrailingWhitespace": true, - "files.insertFinalNewline": true, - "files.eol": "\n", - - // File associations - "files.associations": { - "*.yaml": "yaml", - "*.yml": "yaml", - ".vulture_whitelist.py": "python" - }, - - // Exclude from file watcher - "files.watcherExclude": { - "**/.git/objects/**": true, - "**/.git/subtree-cache/**": true, - "**/node_modules/*/**": true, - "**/.venv/**": true, - "**/venv/**": true, - "**/__pycache__/**": true, - "**/.pytest_cache/**": true, - "**/.mypy_cache/**": true, - "**/dist/**": true, - "**/build/**": true, - "**/*.egg-info/**": true - }, - - // Search exclude - "search.exclude": { - "**/.venv": true, - "**/venv": true, - "**/__pycache__": true, - "**/.pytest_cache": true, - "**/.mypy_cache": true, - "**/dist": true, - "**/build": true, - "**/*.egg-info": true, - "**/.coverage": true, - "**/htmlcov": true - } -} diff --git a/.vscode/tasks.json b/.vscode/tasks.json deleted file mode 100644 index 8104335..0000000 --- a/.vscode/tasks.json +++ /dev/null @@ -1,141 +0,0 @@ -{ - "version": "2.0.0", - "tasks": [ - { - "label": "Install Dependencies", - "type": "shell", - "command": "pip install -e '.[dev,test]'", - "group": "build", - "presentation": { - "reveal": "always", - "panel": "new" - } - }, - { - "label": "Run Tests", - "type": "shell", - "command": "pytest --cov=xarf --cov-report=term-missing -v tests/", - "group": { - "kind": "test", - "isDefault": true - }, - "presentation": { - "reveal": "always", - "panel": "dedicated" - } - }, - { - "label": "Run Tests with Coverage", - "type": "shell", - "command": "pytest --cov=xarf --cov-report=html --cov-report=term-missing -v tests/", - "group": "test", - "presentation": { - "reveal": "always", - "panel": "dedicated" - } - }, - { - "label": "Format Code (Black)", - "type": "shell", - "command": "black .", - "group": "build", - "presentation": { - "reveal": "silent" - } - }, - { - "label": "Sort Imports (isort)", - "type": "shell", - 
"command": "isort --profile black .", - "group": "build", - "presentation": { - "reveal": "silent" - } - }, - { - "label": "Lint (flake8)", - "type": "shell", - "command": "flake8 xarf/ tests/", - "group": "build", - "problemMatcher": "$python", - "presentation": { - "reveal": "always" - } - }, - { - "label": "Type Check (mypy)", - "type": "shell", - "command": "mypy xarf/", - "group": "build", - "problemMatcher": "$python", - "presentation": { - "reveal": "always" - } - }, - { - "label": "Security Scan (bandit)", - "type": "shell", - "command": "bandit -r xarf/ -ll", - "group": "build", - "presentation": { - "reveal": "always" - } - }, - { - "label": "Dead Code Check (vulture)", - "type": "shell", - "command": "vulture xarf/ .vulture_whitelist.py --min-confidence 80", - "group": "build", - "presentation": { - "reveal": "always" - } - }, - { - "label": "Code Quality - All Checks", - "type": "shell", - "command": "black --check . && isort --check-only --profile black . && flake8 xarf/ tests/ && bandit -r xarf/ -ll && mypy xarf/ && vulture xarf/ .vulture_whitelist.py --min-confidence 80", - "group": "build", - "presentation": { - "reveal": "always", - "panel": "dedicated" - } - }, - { - "label": "Pre-commit Run All", - "type": "shell", - "command": "pre-commit run --all-files", - "group": "build", - "presentation": { - "reveal": "always", - "panel": "dedicated" - } - }, - { - "label": "Install Pre-commit Hooks", - "type": "shell", - "command": "pre-commit install", - "group": "build", - "presentation": { - "reveal": "always" - } - }, - { - "label": "Build Package", - "type": "shell", - "command": "python -m build", - "group": "build", - "presentation": { - "reveal": "always" - } - }, - { - "label": "Clean Build Artifacts", - "type": "shell", - "command": "rm -rf build/ dist/ *.egg-info .pytest_cache .mypy_cache .coverage htmlcov/", - "group": "build", - "presentation": { - "reveal": "silent" - } - } - ] -} diff --git a/.vulture_whitelist.py b/.vulture_whitelist.py 
deleted file mode 100644 index ff934dd..0000000 --- a/.vulture_whitelist.py +++ /dev/null @@ -1,5 +0,0 @@ -# Vulture whitelist for intentionally unused code -# https://github.com/jendrikseipp/vulture - -# Pydantic validators require 'cls' parameter even if unused -_.cls # unused variable (validators) diff --git a/ARCHITECTURE_DELIVERABLES.md b/ARCHITECTURE_DELIVERABLES.md deleted file mode 100644 index 95c6463..0000000 --- a/ARCHITECTURE_DELIVERABLES.md +++ /dev/null @@ -1,369 +0,0 @@ -# XARF Python Library - Architecture Design Deliverables - -## Overview - -Complete architecture design for the XARF Python library has been delivered. This document provides an index of all deliverables and their locations. - -## Deliverables Summary - -### Primary Documents (5 files, 74KB total) - -1. **ARCHITECTURE.md** (20KB) - `/docs/ARCHITECTURE.md` - - Complete architectural specification - - 50+ pages of detailed design - - All components, modules, and patterns - - Quality standards and benchmarks - - Security considerations - - **Status**: ✅ Complete - -2. **ARCHITECTURE_SUMMARY.md** (10KB) - `/docs/ARCHITECTURE_SUMMARY.md` - - Quick reference guide - - Implementation priorities - - Key decisions summary - - Usage examples - - **Status**: ✅ Complete - -3. **CLASS_HIERARCHY.md** (17KB) - `/docs/CLASS_HIERARCHY.md` - - Complete class diagrams - - Inheritance relationships - - Design patterns - - Extension points - - **Status**: ✅ Complete - -4. **API_SURFACE.md** (18KB) - `/docs/API_SURFACE.md` - - Public API specification - - All classes and methods - - Usage examples - - Stability guarantees - - **Status**: ✅ Complete - -5. **ARCHITECTURE_DIAGRAM.txt** (9KB) - `/docs/ARCHITECTURE_DIAGRAM.txt` - - Visual diagrams in ASCII - - Component interactions - - Data flows - - Module dependencies - - **Status**: ✅ Complete - -### Supporting Documents - -6. 
**INDEX.md** - `/docs/INDEX.md` - - Documentation index - - Navigation guide - - Document organization - - **Status**: ✅ Complete - -### Memory Storage - -Architecture design has been stored for agent coordination: -- **Key**: `xarf-python/architecture` -- **Location**: Claude Flow memory system -- **Status**: ⚠️ Attempted (file-based fallback used) - -## Key Design Decisions - -### 1. Package Rename -- **Decision**: Rename from `xarf-parser` to `xarf` -- **Document**: ARCHITECTURE.md Section 1.1 -- **Rationale**: Cleaner imports, broader scope -- **Impact**: Migration path needed - -### 2. Field Naming -- **Decision**: Use `category` field (not `class`) -- **Document**: ARCHITECTURE.md Section 3 -- **Rationale**: Python keyword conflict -- **Implementation**: Pydantic alias for JSON compatibility - -### 3. Component Architecture -- **Decision**: Three separate components (Parser, Validator, Generator) -- **Document**: ARCHITECTURE.md Section 2 -- **Rationale**: Separation of concerns, reusability -- **Impact**: New modules to create - -### 4. No v3 Converter -- **Decision**: No XARF v3 to v4 converter -- **Document**: ARCHITECTURE.md ADR-003 -- **Rationale**: v3 deprecated, simpler codebase -- **Impact**: Users migrate externally - -### 5. 
Minimal Dependencies -- **Decision**: Only 3 core dependencies -- **Document**: ARCHITECTURE.md Section 5.1 -- **Dependencies**: Pydantic v2, python-dateutil, email-validator -- **Rationale**: Security, performance, maintainability - -## Module Structure - -### New Modules to Create - -``` -xarf/ -├── validator.py # NEW - Extract from parser -├── generator.py # NEW - Report generation -├── constants.py # NEW - Constants and enums -├── schemas/ # NEW - JSON Schema files -│ ├── __init__.py -│ ├── loader.py -│ └── v4/*.json -├── utils/ # NEW - Utilities -│ ├── __init__.py -│ ├── validators.py -│ ├── encoders.py -│ └── converters.py -└── py.typed # NEW - Type marker -``` - -### Modules to Update - -``` -xarf/ -├── __init__.py # UPDATE - New exports -├── parser.py # UPDATE - Batch support -├── models.py # UPDATE - Use 'category' field -└── exceptions.py # UPDATE - Enhanced hierarchy -``` - -## Implementation Priority - -### Phase 1: Core Foundation (Week 1-2) -1. ✅ Architecture design complete -2. ⬜ Update models.py with `category` field -3. ⬜ Enhance parser.py with batch support -4. ⬜ Update exceptions.py -5. ⬜ Create constants.py - -### Phase 2: New Components (Week 3-4) -6. ⬜ Create validator.py (extract from parser) -7. ⬜ Create generator.py with factory methods -8. ⬜ Create utils/ package with validators -9. ⬜ Bundle schemas/ in package - -### Phase 3: Quality (Week 5-6) -10. ⬜ Comprehensive test suite (≥95% coverage) -11. ⬜ Type hints on all public API -12. ⬜ Documentation and examples -13. ⬜ Performance optimization - -### Phase 4: Polish (Week 7-8) -14. ⬜ CLI tool (optional) -15. ⬜ Integration examples -16. ⬜ Migration guide -17. 
⬜ Release preparation - -## Quality Standards - -### Testing -- **Coverage**: ≥95% overall, 100% core modules -- **Types**: Unit, integration, performance, conformance, property-based -- **Frameworks**: pytest, pytest-cov, hypothesis -- **Status**: Architecture defined, implementation pending - -### Type Safety -- **Coverage**: 100% on public API -- **Checker**: mypy strict mode -- **Marker**: py.typed file -- **Status**: Architecture defined, implementation pending - -### Performance -- **Parse Speed**: 1000+ reports/sec -- **Memory**: <10KB per report -- **Concurrency**: Thread-safe, linear scaling -- **Status**: Benchmarks defined, implementation pending - -### Code Quality -- **Linter**: Ruff (replaces flake8, isort) -- **Formatter**: Black (88 char line length) -- **Complexity**: ≤10 cyclomatic complexity -- **Status**: Tools specified, configuration pending - -## Documentation Structure - -``` -docs/ -├── INDEX.md # Navigation guide -├── ARCHITECTURE.md # Complete design (20KB) -├── ARCHITECTURE_SUMMARY.md # Quick reference (10KB) -├── ARCHITECTURE_DIAGRAM.txt # Visual diagrams (9KB) -├── CLASS_HIERARCHY.md # Class relationships (17KB) -├── API_SURFACE.md # Public API spec (18KB) -├── QUICK_START.md # Getting started -├── MIGRATION_GUIDE.md # Upgrade guide -├── generator_usage.md # Usage examples -├── ci-cd-pipeline-design.md # CI/CD setup -└── PRE_COMMIT.md # Dev tools setup -``` - -## Public API Surface - -### Parser -- `XARFParser` - Parse JSON to objects - - `parse()` - Parse single report - - `parse_batch()` - Parse multiple reports - - `get_errors()` - Get validation errors - -### Validator -- `XARFValidator` - Multi-level validation - - `validate()` - Full validation - - `validate_schema()` - Schema only - - `validate_business_rules()` - Business rules - - `validate_evidence()` - Evidence validation -- `ValidationResult` - Result container - -### Generator -- `XARFGenerator` - Report generation - - `create_messaging_report()` - Factory method - - 
`create_connection_report()` - Factory method - - `create_content_report()` - Factory method -- `ReportBuilder` - Fluent builder pattern - -### Models -- `XARFReport` - Base report class -- `MessagingReport` - Email abuse reports -- `ConnectionReport` - Network abuse reports -- `ContentReport` - Web content abuse reports -- `XARFReporter` - Reporter information -- `Evidence` - Evidence attachments - -### Exceptions -- `XARFError` - Base exception -- `XARFParseError` - Parsing failures -- `XARFValidationError` - Validation failures -- `XARFSchemaError` - Schema errors -- `XARFGenerationError` - Generation errors - -## Dependencies - -### Core (3 packages) -```toml -pydantic>=2.5.0,<3.0.0 # Data validation -python-dateutil>=2.8.0 # Datetime parsing -email-validator>=2.1.0 # Email validation -``` - -### Development (7 packages) -```toml -pytest>=7.4.0 # Testing framework -pytest-cov>=4.1.0 # Coverage reporting -hypothesis>=6.88.0 # Property testing -ruff>=0.1.0 # Fast linting -black>=23.11.0 # Code formatting -mypy>=1.7.0 # Type checking -pre-commit>=3.5.0 # Git hooks -``` - -## Usage Examples - -### Parse Report -```python -from xarf import XARFParser - -parser = XARFParser() -report = parser.parse('{"xarf_version": "4.0.0", ...}') -print(f"Category: {report.category}") -print(f"Type: {report.type}") -``` - -### Validate Report -```python -from xarf import XARFValidator - -validator = XARFValidator() -result = validator.validate(report) - -if not result.is_valid: - for error in result.errors: - print(f"Error: {error}") -``` - -### Generate Report -```python -from xarf import XARFGenerator - -report = XARFGenerator.create_messaging_report( - source_ip="192.0.2.100", - report_type="spam", - reporter={ - "org": "My Org", - "contact": "noreply@example.com", - "type": "automated" - }, - evidence_source="spamtrap" -) - -json_output = report.model_dump_json(by_alias=True) -``` - -## Next Steps for Implementation Team - -### Immediate (Week 1) -1. 
Review all architecture documents -2. Set up development environment -3. Create new module stubs -4. Update pyproject.toml dependencies - -### Short-term (Weeks 2-4) -1. Implement core models with 'category' field -2. Extract validator from parser -3. Create generator with factory methods -4. Set up comprehensive test suite - -### Medium-term (Weeks 5-8) -1. Achieve ≥95% test coverage -2. Add type hints (100% public API) -3. Performance optimization -4. Documentation site with MkDocs - -### Long-term (Post v4.0.0) -1. CLI tool development -2. Integration examples -3. Community feedback incorporation -4. Additional report classes (infrastructure, copyright, etc.) - -## Success Criteria - -### Architecture Phase ✅ -- [x] Complete design specification -- [x] Module structure defined -- [x] Class hierarchy designed -- [x] API surface specified -- [x] Quality standards set -- [x] Documentation written - -### Implementation Phase (Pending) -- [ ] All modules implemented -- [ ] Test coverage ≥95% -- [ ] Type coverage 100% -- [ ] Performance benchmarks met -- [ ] Documentation complete -- [ ] Ready for alpha release - -## Contact & Resources - -### Repository -- **GitHub**: https://github.com/xarf/xarf-parser-python -- **Issues**: https://github.com/xarf/xarf-parser-python/issues -- **Pull Requests**: https://github.com/xarf/xarf-parser-python/pulls - -### Documentation -- **This Codebase**: `/docs/` directory -- **XARF Spec**: https://github.com/xarf/xarf-spec -- **XARF Website**: https://xarf.org - -### Tools -- **Pydantic**: https://docs.pydantic.dev/ -- **Ruff**: https://docs.astral.sh/ruff/ -- **Black**: https://black.readthedocs.io/ -- **mypy**: https://mypy.readthedocs.io/ -- **pytest**: https://docs.pytest.org/ - -## Version Information - -- **Architecture Version**: 1.0 -- **Target Release**: 4.0.0 -- **Design Date**: 2025-11-20 -- **Status**: ✅ Architecture Complete, ⬜ Implementation Pending - ---- - -**Prepared by**: System Architecture Designer (Claude Code) 
-**Date**: 2025-11-20 -**Project**: XARF Python Library (xarf-parser-python → xarf) diff --git a/CHANGELOG.md b/CHANGELOG.md index bd2283b..ddc24c4 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,229 +1,43 @@ # Changelog -All notable changes to the XARF Python Parser will be documented in this file. +All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). ## [Unreleased] -### Changed -- **Legacy Tag Naming**: Updated v3 compatibility tags from `legacy:class:` to `legacy:category:` to align with v4 field naming conventions - - Affects only v3 report conversion metadata tags - - Maintains consistency with `category` field terminology throughout codebase - -### Fixed -- **Documentation Examples**: Corrected CONTRIBUTING.md sample report to use `category` field instead of outdated `class` reference - -### Added -- **XARF v3 Backwards Compatibility**: Automatic conversion from v3 to v4 format - - `is_v3_report()` function to detect v3 reports - - `convert_v3_to_v4()` function for explicit conversion - - Automatic detection and conversion in `XARFParser.parse()` - - Deprecation warnings for v3 format usage (`XARFv3DeprecationWarning`) - - 14 comprehensive tests for v3 compatibility covering all categories - - Complete field mapping from v3 to v4 structure (ReportClass→category, etc.) 
- - Legacy metadata tracking (`legacy_version`, `_internal.converted_from_v3`) - - Migration guide documentation at `docs/migration-guide.md` - -### Changed -- **Pydantic V2 Migration**: Updated from Pydantic V1 to V2 API - - Replaced `@validator` with `@field_validator` for all model validators - - Updated `Config` class to `ConfigDict` in XARFReport model - - Changed `allow_population_by_field_name` to `populate_by_name` - - All validators now use `@classmethod` decorator with type hints - - Fixed Python 3.13+ datetime deprecation warnings - -### Fixed -- Resolved all Pydantic V2 deprecation warnings in models -- Fixed `datetime.utcnow()` deprecation by using `datetime.now(timezone.utc)` -- Improved type hints for Pydantic V2 compatibility -- Updated import statements to use `pydantic.ConfigDict` and `field_validator` +## [1.0.0] - 2026-03-31 -### Documentation -- Added v3 compatibility section to README with example code -- Created comprehensive migration guide (`docs/migration-guide.md`) -- Updated feature list to highlight v3 support and Pydantic V2 compatibility -- Added documentation links for migration guide - -## [4.0.0] - 2024-01-20 +This release is a complete rework of the alpha (`v4.0.0a1`). No backward compatibility with the alpha API is provided. Library version numbers are now decoupled from the spec version, so the library can be released independently of the spec. ### Breaking Changes -#### Field Rename: `class` → `category` - -The field previously named `class` has been renamed to `category` to align with the official XARF v4 specification. This change was made to avoid conflicts with programming language reserved keywords and better reflect the field's purpose.
- -**Impact:** -- All JSON reports must now use `"category"` instead of `"class"` -- Python code must access `report.category` instead of `report.class_` -- Validation checks for `"category"` field presence - -**Migration:** -- Update all JSON generation code to use `"category"` -- Replace all `report.class_` with `report.category` in Python code -- See [MIGRATION_GUIDE.md](docs/MIGRATION_GUIDE.md) for detailed migration instructions - -```python -# Before (v3.x) -report = { - "class": "content", # Old field name - "type": "phishing" -} -print(report.class_) # Awkward Python workaround - -# After (v4.0.0+) -report = { - "category": "content", # New field name - "type": "phishing" -} -print(report.category) # Clean Python access -``` +- **New public API**: `parse()`, `create_report()`, `create_evidence()` are now module-level functions. The `XARFParser` and `XARFGenerator` classes have been removed. +- **Structured result objects**: `parse()` and `create_report()` now return `ParseResult` and `CreateReportResult` dataclasses respectively, rather than bare model instances or dicts. +- **Structured errors**: `ValidationError` and `ValidationWarning` are dataclasses with `field`, `message`, and (for errors) `value` attributes — previously errors were plain strings. +- **Package name**: published as `xarf` (was `xarf-parser`). +- **Python version**: minimum is now 3.10 (was 3.8). 
### Added -- **Generator Functionality**: New `XARFGenerator` class for programmatically creating XARF v4 reports - - `create_report()` - Generate complete reports with validation - - `create_messaging_report()` - Generate messaging category reports (spam, phishing) - - `create_connection_report()` - Generate connection category reports (DDoS, port scans) - - `create_content_report()` - Generate content category reports (phishing sites, malware) - - Automatic UUID generation for `report_id` - - Timestamp generation in ISO 8601 format - - Built-in validation during generation - -- **Reporter `on_behalf_of` Field**: Support for infrastructure providers sending reports on behalf of other organizations - - `reporter.on_behalf_of.org` - Organization being represented - - `reporter.on_behalf_of.contact` - Contact email for represented organization - - Useful for MSSPs, abuse reporting services, and infrastructure providers - -- **Enhanced Validation**: Improved validation for all XARF v4 requirements - - Category-specific field validation - - Evidence structure validation - - Reporter information validation - - Timestamp format validation - -- **Python 3.12 Support**: Added support for Python 3.12 - -### Changed - -- **Model Classes**: Updated all model classes to use `category` instead of `class_` - - `XARFReport.category` replaces `XARFReport.class_` - - `MessagingReport.category` replaces `MessagingReport.class_` - - `ConnectionReport.category` replaces `ConnectionReport.class_` - - `ContentReport.category` replaces `ContentReport.class_` - -- **Parser Validation**: Updated validation logic to check for `"category"` field - - Old reports with `"class"` will fail validation - - Use migration helper to convert legacy reports - -- **Field Access**: Removed `class_` aliasing workaround in favor of clean `category` field - - Pydantic models now use `category` directly - - No more Python keyword conflicts - -### Removed - -- **Converter Module**: Temporarily removed 
`xarf.converter` module for XARF version conversion - - Will be redesigned and re-added in a future release - - Users needing conversion should implement temporary solution (see migration guide) - -- **Python 3.7 Support**: Dropped support for Python 3.7 (EOL June 2023) - - Minimum Python version is now 3.8 - -### Fixed - -- Improved error messages for validation failures -- Better handling of optional fields -- Fixed timezone handling for timestamps - -### Documentation - -- Added comprehensive [MIGRATION_GUIDE.md](docs/MIGRATION_GUIDE.md) with: - - Step-by-step migration instructions - - Before/after code examples - - Common migration issues and solutions - - Database migration examples - - Backward compatibility patterns - -- Updated [README.md](README.md) with: - - Generator usage examples - - Updated JSON examples using `"category"` - - `on_behalf_of` examples - - Security best practices - - Links to xarf.org website - - Updated feature matrix - -### Security - -- Enhanced input validation for all fields -- Added size limits for evidence payloads (5MB per item, 15MB total) -- Improved email validation for reporter contact fields -- Better handling of untrusted input in strict mode - ---- - -## [3.0.0] - 2023-11-15 - -### Added -- Initial XARF v3 support -- Basic JSON parsing and validation -- Support for common abuse types -- Python 3.8+ compatibility +- **All 7 categories fully implemented**: messaging, connection, content, infrastructure, copyright, vulnerability, reputation — with Pydantic v2 discriminated union models covering all 32 report types. +- **Schema-driven validation**: validation rules are derived from the official xarf-spec JSON schemas via `jsonschema` + `referencing` (Draft 2020-12); no hardcoded type or field lists. +- **`SchemaRegistry`**: programmatic access to schema-derived categories, types, and field metadata. Exposed as the `schema_registry` module-level singleton. 
+- **`SchemaValidator`**: AJV-equivalent JSON Schema validator with strict mode (promotes `x-recommended` fields to required before validation). +- **`create_evidence()`**: helper that computes hash, base64-encodes payload, and records size — supports `sha256`, `sha512`, `sha1`, `md5`. +- **`show_missing_optional`** parameter on `parse()` and `create_report()`: populates `result.info` with missing recommended and optional field details. +- **v3 backward compatibility** fully integrated into `parse()`: automatic detection and conversion with `XARFv3DeprecationWarning`. +- **`python -m xarf fetch-schemas`**: CLI command to pull fresh schemas from the xarf-spec GitHub release. +- **`python -m xarf check-schema-updates`**: CLI command to report whether a newer spec version is available. +- **`py.typed` marker** (PEP 561): downstream `mypy` picks up types when the package is installed. +- **Bundled schemas**: schemas ship inside the wheel, pinned to spec v4.2.0, loaded via `importlib.resources`. 
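The `create_evidence()` entry above describes three steps: hash the payload, base64-encode it, and record its size. The following is a minimal standalone sketch of those steps using only the standard library. It is not the library's implementation; the function name `build_evidence_sketch`, the returned field names, and the `algorithm:hexdigest` hash format are all assumptions made for illustration.

```python
import base64
import hashlib

# Algorithms named in the changelog entry for create_evidence().
SUPPORTED_ALGORITHMS = ("sha256", "sha512", "sha1", "md5")


def build_evidence_sketch(
    payload: bytes,
    content_type: str,
    algorithm: str = "sha256",
) -> dict[str, object]:
    """Hypothetical helper: hash, base64-encode, and measure a payload.

    The keys of the returned dict are illustrative assumptions, not the
    actual fields produced by xarf's create_evidence().
    """
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported hash algorithm: {algorithm}")

    # Hash the raw bytes, not the base64 text.
    digest = hashlib.new(algorithm, payload).hexdigest()

    return {
        "content_type": content_type,
        "payload": base64.b64encode(payload).decode("ascii"),
        "hash": f"{algorithm}:{digest}",  # assumed "algo:hex" format
        "size": len(payload),  # size of the raw payload in bytes
    }
```

Hashing the raw bytes rather than the encoded text means a receiver can decode the payload and check the digest against the original content, regardless of how the payload was transported.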
### Changed -- Migrated from XARF v2 to v3 format - ---- - -## [2.1.0] - 2023-06-10 - -### Added -- Evidence attachment support -- Custom field handling - -### Fixed -- Timestamp parsing issues -- Validation edge cases - ---- - -## [2.0.0] - 2023-03-20 - -### Added -- Complete rewrite for XARF v2 -- Pydantic-based models -- JSON Schema validation -- Comprehensive test suite - ---- - -## [1.0.0] - 2022-09-15 - -### Added -- Initial release -- Basic XARF v1 parsing -- Limited validation - ---- - -## Migration Guides - -- **v3.x → v4.0.0**: See [MIGRATION_GUIDE.md](docs/MIGRATION_GUIDE.md) -- **v2.x → v3.x**: Contact support for legacy migration assistance - -## Links - -- [XARF v4 Specification](https://xarf.org/docs/specification/) -- [GitHub Repository](https://github.com/xarf/xarf-parser-python) -- [PyPI Package](https://pypi.org/project/xarf-parser/) -- [Issue Tracker](https://github.com/xarf/xarf-parser-python/issues) -- [XARF Website](https://xarf.org) - -## Versioning -This project follows [Semantic Versioning](https://semver.org/): -- **MAJOR** version for incompatible API changes -- **MINOR** version for backwards-compatible functionality additions -- **PATCH** version for backwards-compatible bug fixes +- **Tooling**: switched to `ruff` (replaces `black`, `isort`, `flake8`); `mypy --strict`; `bandit`; `pytest` with 80% coverage threshold. +- **`v3_compat.py`**: aligned type mappings exactly with the JS reference implementation (8 types, PascalCase + lowercase variants for each). +- **`models.py`**: replaced with result dataclasses (`ParseResult`, `CreateReportResult`, `ValidationError`, `ValidationWarning`) and base Pydantic models (`XARFReport`, `XARFEvidence`, `ContactInfo`). -Alpha releases use suffix: `4.0.0a1`, `4.0.0a2`, etc. -Beta releases use suffix: `4.0.0b1`, `4.0.0b2`, etc. 
+[Unreleased]: https://github.com/xarf/xarf-python/compare/v1.0.0...HEAD +[1.0.0]: https://github.com/xarf/xarf-python/releases/tag/v1.0.0 diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md index 577e221..a2bc02f 100644 --- a/CODE_OF_CONDUCT.md +++ b/CODE_OF_CONDUCT.md @@ -1,48 +1,77 @@ -Contributor Covenant Code of Conduct -Our Pledge +# Contributor Covenant Code of Conduct + +## Our Pledge + We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation. + We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community. -Our Standards + +## Our Standards + Examples of behavior that contributes to a positive environment for our community include: -* Demonstrating empathy and kindness toward other people -* Being respectful of differing opinions, viewpoints, and experiences -* Giving and gracefully accepting constructive feedback -* Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience -* Focusing on what is best not just for us as individuals, but for the overall community +- Demonstrating empathy and kindness toward other people +- Being respectful of differing opinions, viewpoints, and experiences +- Giving and gracefully accepting constructive feedback +- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience +- Focusing on what is best not just for us as individuals, but for the overall community Examples of unacceptable behavior include: -* The use of sexualized language or imagery, and sexual attention or advances of any kind -* Trolling, insulting or
derogatory comments, and personal or political attacks -* Public or private harassment -* Publishing others’ private information, such as a physical or email address, without their explicit permission -* Other conduct which could reasonably be considered inappropriate in a professional setting +- The use of sexualized language or imagery, and sexual attention or advances of any kind +- Trolling, insulting or derogatory comments, and personal or political attacks +- Public or private harassment +- Publishing others' private information, such as a physical or email address, without their explicit permission +- Other conduct which could reasonably be considered inappropriate in a professional setting + +## Enforcement Responsibilities -Enforcement Responsibilities Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful. + Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate. -Scope + +## Scope + This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. -Enforcement -Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at [INSERT CONTACT METHOD]. All complaints will be reviewed and investigated promptly and fairly. 
+ +## Enforcement + +Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at admin@xarf.org. All complaints will be reviewed and investigated promptly and fairly. + All community leaders are obligated to respect the privacy and security of the reporter of any incident. -Enforcement Guidelines + +## Enforcement Guidelines + Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct: -1. Correction -Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community. -Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested. -2. Warning -Community Impact: A violation through a single incident or series of actions. -Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban. -3. Temporary Ban -Community Impact: A serious violation of community standards, including sustained inappropriate behavior. -Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban. -4. 
Permanent Ban -Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals. -Consequence: A permanent ban from any sort of public interaction within the community. -Attribution -This Code of Conduct is adapted from the Contributor Covenant, version 2.1, available at https://www.contributor-covenant.org/version/2/1/code_of_conduct.html. -Community Impact Guidelines were inspired by Mozilla’s code of conduct enforcement ladder. -For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations. +### 1. Correction + +**Community Impact**: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community. + +**Consequence**: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested. + +### 2. Warning + +**Community Impact**: A violation through a single incident or series of actions. + +**Consequence**: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban. + +### 3. Temporary Ban + +**Community Impact**: A serious violation of community standards, including sustained inappropriate behavior. + +**Consequence**: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. 
No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban. + +### 4. Permanent Ban + +**Community Impact**: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals. + +**Consequence**: A permanent ban from any sort of public interaction within the community. + +## Attribution + +This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org/), version 2.1, available at https://www.contributor-covenant.org/version/2/1/code_of_conduct.html. + +Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/diversity). + +For answers to common questions about this code of conduct, see the FAQ at https://www.contributor-covenant.org/faq. Translations are available at https://www.contributor-covenant.org/translations. diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index f1c710c..f9d243e 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,279 +1,286 @@ -# Contributing to XARF Python Parser +# Contributing to XARF Python Library -Thank you for your interest in contributing to the XARF v4 Python parser! This document provides guidelines for contributing to the implementation. +Thank you for your interest in contributing to the XARF Python library! We welcome contributions from the community and appreciate your help in making this project better. 
-## 🤝 How to Contribute +## Code of Conduct -### Reporting Issues -- **Bug Reports**: Parser errors, validation issues, or unexpected behavior -- **Feature Requests**: New validation rules, performance improvements, or API enhancements -- **Parser Support**: Help with implementing new XARF classes or types +This project adheres to the [Contributor Covenant Code of Conduct](CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. Please report unacceptable behavior to admin@xarf.org. -### Contributing Code -1. **Fork** the repository -2. **Create** a feature branch (`git checkout -b feature/validation-improvement`) -3. **Make** your changes following our coding standards -4. **Add tests** for new functionality -5. **Run** the test suite and linting -6. **Submit** a pull request +## How to Contribute -## 🛠️ Development Setup +### Reporting Bugs + +If you find a bug, please create an issue on GitHub with the following information: + +- **Clear title and description** of the issue +- **Steps to reproduce** the problem +- **Expected behavior** vs. **actual behavior** +- **Code samples** or test cases that demonstrate the issue +- **Version** of the library you're using +- **Python version** and operating system + +### Suggesting Features + +We welcome feature requests! Please create an issue with: + +- **Clear description** of the feature +- **Use case** explaining why this feature would be useful +- **Example code** showing how the feature might work +- **Compatibility considerations** with the XARF specification + +### Pull Requests + +We actively welcome pull requests! Here's how to contribute: + +1. **Fork the repository** and create your branch from `main` +2. **Make your changes** following our coding standards +3. **Add tests** for any new functionality +4. **Ensure all tests pass** and coverage remains >80% +5. **Update documentation** as needed +6. 
**Submit a pull request** with a clear description of changes + +## Development Setup ### Prerequisites -- Python 3.8+ -- Git -### Installation -```bash -# Clone your fork -git clone https://github.com/YOUR_USERNAME/xarf-parser-python.git -cd xarf-parser-python +- **Python**: 3.10 or higher +- **Git**: Latest stable version -# Create virtual environment -python -m venv venv -source venv/bin/activate # On Windows: venv\Scripts\activate +### Getting Started -# Install development dependencies -pip install -e ".[dev]" +1. **Clone your fork:** -# Install pre-commit hooks -pre-commit install -``` + ```bash + git clone https://github.com/YOUR_USERNAME/xarf-python.git + cd xarf-python + ``` + +2. **Create a virtual environment and install dependencies:** + + ```bash + python -m venv venv + source venv/bin/activate # On Windows: venv\Scripts\activate + pip install -e ".[dev]" + ``` + +3. **Install pre-commit hooks:** + + ```bash + pre-commit install + ``` + +4. **Run tests:** + + ```bash + pytest + ``` + +### Development Commands + +- `pytest` — Run the test suite +- `pytest --cov=xarf` — Generate code coverage report +- `ruff check xarf/` — Lint +- `ruff check --fix xarf/` — Auto-fix lint issues +- `ruff format xarf/` — Format code +- `ruff format --check xarf/` — Check code formatting +- `mypy --strict xarf/` — Run type checking +- `bandit -r xarf/` — Security scanning + +## Testing Requirements + +All contributions must maintain or improve test coverage: + +- **Coverage threshold**: 80% overall — enforced by `pytest-cov` +- **Unit tests**: Required for all new functions and classes +- **Integration tests**: Required for parser and generator functionality +- **Test file location**: Tests should be in the `tests/` directory +- **No schema mocking**: tests must use real schemas loaded from the bundle ### Running Tests + ```bash -# Run full test suite -pytest +pytest # Run all tests +pytest -v # Verbose output +pytest --cov=xarf # With coverage report +pytest 
tests/test_parse.py # Run a specific file +``` + +### Writing Tests -# Run with coverage -pytest --cov=xarf +We use pytest. Example test structure: -# Run specific test file -pytest tests/test_parser.py +```python +from xarf import parse + +def test_parse_valid_report() -> None: + result = parse({ + # ... valid XARF data + }) + + assert not result.errors + assert result.report is not None + assert result.report.category == "connection" + assert result.report.type == "ddos" + +def test_parse_returns_errors_for_invalid_data() -> None: + result = parse({}) -# Run with verbose output -pytest -v + assert len(result.errors) > 0 ``` -### Code Quality -```bash -# Format code -black xarf/ -isort xarf/ +## Code Style Guidelines + +### Python Standards + +- **Language version**: Python 3.10+ +- **Type annotations**: required on all public functions and methods +- **Docstrings**: Google style for all public APIs (`Args:`, `Returns:`, `Raises:`, `Example:`) +- **Strict mypy**: all code must pass `mypy --strict xarf/` + +See [pyproject.toml](pyproject.toml) for the full `ruff` and `mypy` configuration. -# Lint code -flake8 xarf/ +### Naming Conventions -# Type checking -mypy xarf/ +- **Functions / methods**: `snake_case` (e.g., `parse`, `create_report`, `create_evidence`) +- **Constants**: `UPPER_SNAKE_CASE` (e.g., `SPEC_VERSION`) +- **Classes**: `PascalCase` (e.g., `ParseResult`, `XARFReport`, `SchemaRegistry`) +- **Type aliases**: `PascalCase` (e.g., `AnyXARFReport`, `ConnectionReport`) + +### Code Organization + +- **One module per file** for main components +- **Related types** grouped in category-specific files (`types_messaging.py`, etc.) +- **Export from `__init__.py`** for public API — use `xarf-javascript/src/index.ts` as the reference for which names to expose + +### Formatting and Linting + +We use `ruff` for both formatting and linting. Configuration lives in [pyproject.toml](pyproject.toml). 
+ +```bash +ruff format xarf/ # Auto-format +ruff format --check xarf/ # Check formatting +ruff check xarf/ # Lint +ruff check --fix xarf/ # Auto-fix linting issues ``` -## 📋 Contribution Guidelines +A pre-commit hook runs both automatically on staged files. -### Code Standards -- **Follow PEP 8** style guidelines -- **Use type hints** for all functions and methods -- **Write docstrings** for public APIs -- **Keep functions focused** and single-purpose -- **Use descriptive variable names** +### Documentation -### Testing Requirements -- **Unit tests** for all new functionality -- **Integration tests** for end-to-end scenarios -- **Test edge cases** and error conditions -- **Maintain >90% test coverage** -- **Mock external dependencies** +- **Google-style docstrings** for all public APIs +- **Type annotations** on all parameters and return values +- **Inline comments** for non-obvious logic +- **README updates** for new features -### API Design -- **Consistent naming** with existing patterns -- **Clear error messages** with actionable information -- **Backward compatibility** when possible -- **Performance considerations** for high-volume use +## Commit Message Conventions -## 🏗️ Architecture Overview +We follow the [Conventional Commits](https://www.conventionalcommits.org/) specification: -### Core Components ``` -xarf/ -├── __init__.py # Public API exports -├── parser.py # Main XARFParser class -├── models.py # Pydantic data models -├── exceptions.py # Custom exception classes -└── validators.py # Validation logic (future) +<type>(<scope>): <subject> + +<body> +