SDS Extraction API for Structured 16-Section Compliance Data
An SDS extraction API converts supplier SDS and MSDS documents into structured, schema-ready records with confidence and warning metadata, so compliance workflows can scale without manual data entry.
Last updated: 2026-03-09
What this page covers
This is the canonical commercial page for SDS extraction API evaluation. It focuses on production behavior: section-level extraction, failure modes, batch pipelines, integration patterns, and governance controls.
Section-level extraction scope
| SDS section group | Typical extracted entities | Operational use |
|---|---|---|
| Sections 1-3 | Product identity, supplier details, composition tables | Supplier onboarding and master-data synchronization |
| Sections 4-8 | First-aid, fire controls, handling/storage, PPE | EHS workflows and worker-safety runbooks |
| Sections 9-12 | Physical properties, stability/reactivity, tox/ecotox summaries | Risk scoring, review queues, downstream analytics |
| Sections 13-16 | Disposal, UN transport fields, regulatory refs, revision metadata | Transport checks, compliance audit trail, version control |
Failure modes and fallback behavior
- Low OCR confidence: response contains warnings and confidence values so records can route to human review.
- Malformed tables: table rows are normalized where possible and flagged when structure is incomplete.
- Mixed-language labels: section and field labels are mapped to normalized names where possible, and a warning is returned when the mapping remains ambiguous.
- Missing required entities: required fields can be enforced by downstream validation rules before persistence.
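The routing described in the list above could look roughly like the following. This is a minimal sketch, not the vendor's implementation: the field names `confidence_score` and `warnings` come from the sample response on this page, while the `route_record` function, the `0.85` threshold, and the two queue names are assumptions chosen for illustration.

```python
from typing import Any

# Hypothetical threshold; tune it against your own review outcomes.
MIN_CONFIDENCE = 0.85


def route_record(response: dict[str, Any], min_confidence: float = MIN_CONFIDENCE) -> str:
    """Return 'human_review' or 'auto_accept' for one extraction response."""
    confidence = response.get("confidence_score")
    warnings = response.get("warnings") or []

    # A missing score is treated as untrusted rather than assumed good.
    if confidence is None or confidence < min_confidence:
        return "human_review"

    # Any warning (low OCR confidence, incomplete table, ambiguous label)
    # sends the record to a reviewer in this sketch.
    if warnings:
        return "human_review"

    return "auto_accept"
```

A stricter or looser policy, for example ignoring warnings that concern sections a team does not use, is a local design choice; the sketch simply routes every flagged record to review.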
Batch workflow and integration patterns
Single-file extraction uses POST /extract-sds. High-volume programs use batch jobs plus webhook
completion events. Typical downstream patterns include ERP master-data sync, EHS ingestion queues, and
exception routing to compliance QA.
curl -X POST "https://api.safetydatasheetapi.com/v1/extract-sds" \
  -H "Authorization: Bearer <api_key>" \
  -F "file=@supplier-sds.pdf" \
  -F "language_hint=en" \
  -F "schema_version=2026-01"
Output model and warning semantics
{
  "request_id": "req_7f92d8",
  "confidence_score": 0.91,
  "schema_version": "2026-01",
  "section_coverage_score": 0.94,
  "warnings": [
    "Low confidence in Section 14 transport table row 2"
  ],
  "data": {
    "product_identification": { "product_name": "Acetone" },
    "hazards_identification": { "ghs_classification": ["Flammable Liquid - Category 2"] },
    "transport_information": { "un_number": "UN1090", "hazard_class": "3", "packing_group": "II" },
    "revision_metadata": { "revision_date": "2024-01-15" }
  }
}
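As an illustration of consuming this payload, the sketch below flattens the sample response into a single row suitable for an ERP or EHS staging table. The JSON keys are taken from the sample above; the `to_flat_record` helper, the flat column names, and the `needs_review` flag are hypothetical and would need to match your own target schema.

```python
import json

# The sample response from this page, embedded so the sketch is self-contained.
SAMPLE = """
{
  "request_id": "req_7f92d8",
  "confidence_score": 0.91,
  "schema_version": "2026-01",
  "section_coverage_score": 0.94,
  "warnings": ["Low confidence in Section 14 transport table row 2"],
  "data": {
    "product_identification": {"product_name": "Acetone"},
    "hazards_identification": {"ghs_classification": ["Flammable Liquid - Category 2"]},
    "transport_information": {"un_number": "UN1090", "hazard_class": "3", "packing_group": "II"},
    "revision_metadata": {"revision_date": "2024-01-15"}
  }
}
"""


def to_flat_record(response: dict) -> dict:
    """Flatten one extraction response into a single staging row (hypothetical columns)."""
    data = response.get("data") or {}
    transport = data.get("transport_information") or {}
    hazards = data.get("hazards_identification") or {}
    return {
        "request_id": response.get("request_id"),
        "product_name": (data.get("product_identification") or {}).get("product_name"),
        # Multiple GHS classes are joined into one delimited column.
        "ghs_classification": "; ".join(hazards.get("ghs_classification") or []),
        "un_number": transport.get("un_number"),
        "hazard_class": transport.get("hazard_class"),
        "packing_group": transport.get("packing_group"),
        "revision_date": (data.get("revision_metadata") or {}).get("revision_date"),
        # Any warning marks the row for follow-up instead of dropping it.
        "needs_review": bool(response.get("warnings")),
    }


record = to_flat_record(json.loads(SAMPLE))
print(record["un_number"], record["needs_review"])  # UN1090 True
```

Missing sections resolve to `None` rather than raising, so incomplete documents still produce a row that downstream validation can reject or queue.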
Real implementation scenarios
- Chemical manufacturer: supplier SDS intake into ERP/PLM with revision and compliance traceability.
- EHS platform vendor: embedded extraction API with webhook-driven tenant pipelines.
- Distributor/importer: normalization across mixed supplier templates and regional transport requirements.
Governance and schema stability
Production teams typically enforce mandatory-field validation, confidence thresholds, warning-triggered review, and schema version pinning. For migration planning, use the schema versioning guide and the API docs.
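A pre-persistence gate combining schema version pinning with mandatory-field validation might be sketched as follows. The pinned value `2026-01` matches the sample request on this page; the `REQUIRED_FIELDS` list and the `validate_before_persist` function are assumptions for illustration, and the real set of mandatory fields depends on your compliance rules.

```python
# Pinned to the schema version used in the sample request on this page.
PINNED_SCHEMA_VERSION = "2026-01"

# Hypothetical mandatory fields, written as (section, field) paths under "data".
REQUIRED_FIELDS = [
    ("product_identification", "product_name"),
    ("transport_information", "un_number"),
    ("revision_metadata", "revision_date"),
]


def validate_before_persist(response: dict) -> list[str]:
    """Return a list of violations; an empty list means the record may be stored."""
    violations = []

    # Reject responses produced under a schema version other than the pinned one.
    version = response.get("schema_version")
    if version != PINNED_SCHEMA_VERSION:
        violations.append(
            f"schema_version {version!r} does not match pinned {PINNED_SCHEMA_VERSION!r}"
        )

    # Every mandatory field must be present and non-empty.
    data = response.get("data") or {}
    for section, field in REQUIRED_FIELDS:
        if not (data.get(section) or {}).get(field):
            violations.append(f"missing required field: {section}.{field}")

    return violations
```

Returning the full list of violations, rather than failing on the first one, gives reviewers and audit logs a complete picture of why a record was held back.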
FAQ
Can this run in a governed enterprise workflow?
Yes. Responses include confidence and warning metadata for routing, validation, and audit-ready controls.
What if a document is scanned and low quality?
Scanned files are supported. Low-quality segments are surfaced through warning metadata for targeted review.
How do we integrate with existing systems?
Most teams map JSON/XML output into ERP or EHS ingestion pipelines and use webhooks for asynchronous completion handling.
Is this the canonical commercial page for the SDS extraction API?
Yes. This is the primary English-language commercial page for evaluating the SDS extraction API.
Decision and proof links
- Generic OCR vs SDS normalization
- Build vs buy SDS parser
- Schema versioning
- Security controls
- API docs and endpoint reference