Tractor
Welcome to the Tractor documentation.
Tractor is a document redaction application with ML-powered Named Entity Recognition (NER). Users upload documents, the system identifies sensitive information using SpanCat, GLiNER, and Presidio, and users accept or reject the suggested redactions before exporting redacted versions.
Key Features
- Document Upload: Support for PDF and DOCX files
- Automatic Redaction: Hybrid three-model NER pipeline — SpanCat (trained), GLiNER (zero-shot), and Presidio (pattern-based)
- Review Workflow: Accept, reject, or bulk-action suggested redactions. Adjacent same-type spans are automatically merged for easier review.
- Model Training: Train custom SpanCat models from accepted redactions; the GLiNER model is managed separately
- Export: Generate disclosure packages with original, redacted, and edited versions
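To illustrate the span merging mentioned under Review Workflow, here is a minimal sketch of how adjacent same-type spans could be combined before review. It is not Tractor's actual implementation: the `Span` class, the `merge_adjacent` function, and the `max_gap` parameter are hypothetical names, and the rule used here (merge when two spans share a label and are separated by at most one whitespace character) is an assumption about what "adjacent" means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical redaction suggestion; not Tractor's real data model."""
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str  # entity type, e.g. "PERSON"


def merge_adjacent(spans: list[Span], text: str, max_gap: int = 1) -> list[Span]:
    """Merge same-label spans that overlap, touch, or are separated
    only by whitespace of at most `max_gap` characters (assumed rule)."""
    merged: list[Span] = []
    for span in sorted(spans, key=lambda s: (s.start, s.end)):
        if merged:
            prev = merged[-1]
            gap = text[prev.end:span.start]  # empty if spans overlap or touch
            close_enough = len(gap) <= max_gap and gap.strip() == ""
            if prev.label == span.label and close_enough:
                # Extend the previous span instead of adding a new one.
                merged[-1] = Span(prev.start, max(prev.end, span.end), prev.label)
                continue
        merged.append(span)
    return merged


text = "Jane Doe lives in Ottawa"
suggestions = [
    Span(0, 4, "PERSON"),      # "Jane"
    Span(5, 8, "PERSON"),      # "Doe"
    Span(18, 24, "LOCATION"),  # "Ottawa"
]
for s in merge_adjacent(suggestions, text):
    print(s.label, repr(text[s.start:s.end]))
# PERSON 'Jane Doe'
# LOCATION 'Ottawa'
```

Because each span is compared only with the most recently kept span, a span of a different type that falls between two same-type spans prevents them from merging. That keeps the sketch simple, and a reviewer then sees one suggestion for "Jane Doe" rather than two.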
Quick Links
- Getting Started - Set up and start using Tractor
- Redacting Documents - Learn how to redact documents
- Settings - Configure exemption templates and other settings
- Developer Setup - Set up your development environment
- Architecture - Understand how Tractor works
License
Tractor is licensed under the AGPL-3.0 License.