PASTAplus/staging


EDI Data Package Staging Service

A web service for staging data packages for upload to PASTA.

Overview

The staging service provides a public API and web interface for uploading and managing data packages before final submission to PASTA. It removes the need for users to maintain their own publicly accessible hosting for the data they submit.

Features

  • REST API protected by JWT bearer token or API key
  • Web UI with upload dashboard
  • Asynchronous upload processing
  • S3-backed storage
  • Per-upload status reporting with checksums (SHA-1/MD5)
  • Configurable data lifecycle management (garbage collection)
  • Integration with the EDI IAM service for authentication

Architecture

  • Runtime: Python 3.11+
  • Server: nginx + gunicorn + uvicorn
  • ORM: SQLAlchemy
  • Database: PostgreSQL
  • Storage: Amazon S3 via boto3
  • Auth: JWT bearer tokens / API keys via IAM service

API

Method   Endpoint                              Description
POST     /upload?[key=|token=]                 Stage a data package (returns 202)
GET      /upload?[key=|token=]                 List all of the caller's uploads
GET      /upload/{id}/data?[key=|token=]       Download a staged file (302 redirect)
DELETE   /upload/{id}?[key=|token=]            Delete a staged upload

Authentication

All API endpoints require either a ?token=<jwt> or ?key=<api-key> query parameter. API keys are issued through the EDI IAM service and are automatically exchanged for a JWT before each request is processed.

Example

# Stage a file using an API key
curl -X POST "https://<host>/upload?key=<api-key>" \
  -F "file=@data.zip" \
  -F "label=my-dataset-2026"

# List your uploads
curl "https://<host>/upload?token=<jwt>"

# Download a file
curl -L "https://<host>/upload/<id>/data?token=<jwt>"

# Delete an upload
curl -X DELETE "https://<host>/upload/<id>?token=<jwt>"
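
For scripted clients, the same URLs can be assembled in Python. The following is a minimal sketch, not part of the service: the helper name staging_url and its parameters are hypothetical, and rejecting calls that pass both a key and a token is an assumption about how a client should behave rather than documented server behavior.

```python
from urllib.parse import quote, urlencode


def staging_url(host, upload_id=None, data=False, *, key=None, token=None):
    """Build a staging API URL carrying one credential as a query parameter.

    Hypothetical helper: mirrors the endpoint table, nothing more.
    """
    # Assumption: a client sends either ?key= or ?token=, never both.
    if (key is None) == (token is None):
        raise ValueError("pass exactly one of key= or token=")

    path = "/upload"
    if upload_id is not None:
        path += "/" + quote(str(upload_id), safe="")
        if data:
            path += "/data"
    elif data:
        raise ValueError("data=True requires an upload_id")

    # urlencode percent-escapes any characters that are unsafe in a query.
    query = urlencode({"key": key} if key is not None else {"token": token})
    return f"https://{host}{path}?{query}"
```

The returned string can be passed to any HTTP client. The download endpoint answers with a 302 redirect, so the client has to follow redirects, as curl -L does above.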

Upload Report

Each entry in the upload list includes:

  • Upload ID
  • Label
  • Upload status (pending / success / failure)
  • SHA-1 and MD5 checksums (populated after successful upload)
  • Data download URL (/upload/{id}/data)
  • Expiry timestamp (time-to-live)
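
The reported checksums can be used to confirm that a downloaded copy matches what was staged. The sketch below assumes the checksums are reported as lowercase or uppercase hexadecimal strings, which the README does not state; the function names are hypothetical.

```python
import hashlib


def stream_checksums(stream, chunk_size=1 << 20):
    """Return SHA-1 and MD5 hex digests of a binary stream, read in chunks."""
    sha1 = hashlib.sha1()
    md5 = hashlib.md5()
    while chunk := stream.read(chunk_size):
        sha1.update(chunk)
        md5.update(chunk)
    return {"sha1": sha1.hexdigest(), "md5": md5.hexdigest()}


def matches_report(stream, reported_sha1, reported_md5):
    """Compare a stream against reported checksums (assumed to be hex strings)."""
    actual = stream_checksums(stream)
    return (
        actual["sha1"] == reported_sha1.lower()
        and actual["md5"] == reported_md5.lower()
    )
```

To check a file on disk, open it in binary mode and pass the file object, for example matches_report(open("data.zip", "rb"), sha1, md5).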

Data Lifecycle

Staged data is automatically removed after a configurable retention period (default: one month). The timer starts at the time of upload.
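
As an illustration of the retention rule, the expiry time is the upload time plus the retention period. This sketch approximates "one month" as 30 days, which is an assumption; the service's actual default value and its garbage collection logic are not shown in this README, and the names below are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "one month" approximated as 30 days for illustration only.
DEFAULT_RETENTION = timedelta(days=30)


def expiry_time(uploaded_at, retention=DEFAULT_RETENTION):
    """Return when an upload becomes eligible for removal."""
    return uploaded_at + retention


def is_expired(uploaded_at, now=None, retention=DEFAULT_RETENTION):
    """Return True once the retention period has elapsed since upload."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now >= expiry_time(uploaded_at, retention)
```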

Development Setup

Prerequisites

  • Python 3.11+
  • PostgreSQL
  • AWS credentials with S3 access
  • Access to an EDI IAM service instance

Install

git clone <repo-url>
cd staging
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Configure

Copy the config.py template and fill in the required values:

cp webapp/config.py.template webapp/config.py

Run (development)

uvicorn app.main:app --reload

Run (production)

gunicorn app.main:app -k uvicorn.workers.UvicornWorker

Deployment

The service is deployed behind nginx. See deploy/ for configuration templates and the nginx site configuration.

Test deployments target the web-x server.

License

See LICENSE.
