A config-driven, modular pipeline for preprocessing American Sign Language (ASL) datasets. Supports YouTube-ASL and How2Sign with two landmark extractors (MediaPipe Holistic and MMPose RTMPose3D) and two output modes (pose landmarks and video clips).
- YAML configs with base inheritance and CLI overrides
- MediaPipe Holistic (553 keypoints) and MMPose RTMPose3D (133 keypoints)
- Add datasets, processors, and extractors via decorators
- Multi-worker extraction, normalization, and clipping
- Sharded tar archives for efficient training data loading
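Decorator-based registration is usually built on a small name-to-class registry. The following is a minimal sketch of that pattern, not the project's actual API: `EXTRACTORS`, `register_extractor`, `build_extractor`, and `DummyExtractor` are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, Type

# Hypothetical registry; the project's real registry and decorator names may differ.
EXTRACTORS: Dict[str, Type] = {}


def register_extractor(name: str) -> Callable[[Type], Type]:
    """Return a class decorator that records the class under `name`."""

    def decorator(cls: Type) -> Type:
        if name in EXTRACTORS:
            raise ValueError(f"extractor {name!r} is already registered")
        EXTRACTORS[name] = cls
        return cls  # the class itself is returned unchanged

    return decorator


@register_extractor("dummy")
class DummyExtractor:
    """Placeholder extractor that returns a fixed number of zeroed keypoints."""

    num_keypoints = 3

    def extract(self, frame):
        # A real extractor would run a landmark model on `frame` here.
        return [(0.0, 0.0, 0.0)] * self.num_keypoints


def build_extractor(name: str):
    """Look up an extractor class by its config name and instantiate it."""
    if name not in EXTRACTORS:
        raise KeyError(f"unknown extractor {name!r}; known: {sorted(EXTRACTORS)}")
    return EXTRACTORS[name]()
```

With this layout, a config file only needs to carry the string name; adding a new extractor means defining a decorated class, with no edits to the code that builds the pipeline.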
📖 New? See the Installation Guide to get started.
git clone https://github.com/balaboom123/Sign-Language-Preprocessing.git
cd Sign-Language-Preprocessing
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

MediaPipe works on CPU out of the box. MMPose requires a CUDA-capable GPU and additional dependencies -- see the Installation Guide for full setup instructions.
# Download YouTube-ASL videos, extract MediaPipe landmarks, normalize, and package into WebDataset shards
python -m sign_prep configs/youtube_asl/pose_mediapipe.yaml
# Extract MMPose landmarks from pre-downloaded How2Sign data (CUDA required)
python -m sign_prep configs/how2sign/pose_mmpose.yaml
# Override any config value from the command line (e.g. more workers, stop after extraction)
python -m sign_prep configs/youtube_asl/pose_mediapipe.yaml \
    --override processing.max_workers=8 pipeline.stop_at=extract

Both modes produce WebDataset tar shards for efficient training data loading. See Pipeline Stages for detailed output formats and data shapes.
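The `--override` flag in the command above takes dotted `key=value` pairs. One way such overrides could be layered on top of an inherited config is sketched here, assuming configs are loaded into plain nested dicts. The helpers `deep_merge`, `parse_scalar`, and `apply_overrides` are hypothetical and may not match how `sign_prep` implements this; only the keys `processing.max_workers` and `pipeline.stop_at` come from the example command.

```python
import copy
from typing import Any, Dict, List


def deep_merge(base: Dict[str, Any], child: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where `child` values override `base`, recursing into dicts."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def parse_scalar(text: str) -> Any:
    """Interpret a CLI value as bool, int, or float when possible, else keep the string."""
    lowered = text.lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def apply_overrides(config: Dict[str, Any], overrides: List[str]) -> Dict[str, Any]:
    """Apply dotted `a.b.c=value` overrides to a copy of `config`."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, sep, raw_value = item.partition("=")
        if not sep:
            raise ValueError(f"override {item!r} is not of the form key=value")
        *parents, leaf = dotted_key.split(".")
        node = result
        for part in parents:
            node = node.setdefault(part, {})  # create missing sections on the way down
        node[leaf] = parse_scalar(raw_value)
    return result


# Illustrative values only; the real base and dataset configs will differ.
base_config = {"processing": {"max_workers": 4}, "pipeline": {"stop_at": None}}
dataset_config = {"processing": {"max_workers": 2}}

config = apply_overrides(
    deep_merge(base_config, dataset_config),
    ["processing.max_workers=8", "pipeline.stop_at=extract"],
)
```

The precedence in this sketch is base config, then the dataset config that inherits from it, then command-line overrides, so the most specific source is applied last.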
| Dataset | Venue | Description | License |
|---|---|---|---|
| YouTube-ASL | NeurIPS 2023 | 11,000+ videos, 73,000+ segments -- open-domain ASL-English parallel corpus | Apache-2.0 |
| How2Sign | CVPR 2021 | 80+ hours of instructional ASL in a controlled studio environment | CC BY-NC 4.0 |
For paper-aligned preprocessing methodology, see Research-Aligned Preprocessing.
- Installation Guide -- base setup and MMPose GPU dependencies
- Architecture -- system design, registry, pipeline flow
- Configuration -- full config reference, inheritance, CLI overrides
- Pipeline Stages -- all 6 processing stages
- Datasets -- YouTube-ASL vs How2Sign setup
- Research-Aligned Preprocessing -- paper-aligned preprocessing notes
The MIT license in this repository applies to the code and documentation in this project. Use of external datasets, research artifacts, and upstream repos referenced above must comply with their original licenses and usage terms.
MIT -- see LICENSE.