Core API

SDS Extraction API for Structured 16-Section Compliance Data

An SDS extraction API converts supplier SDS and MSDS documents into structured, schema-ready records with confidence and warning metadata, so compliance workflows can scale beyond manual data entry.

Last updated: 2026-03-09

What this page covers

This is the canonical commercial page for SDS extraction API evaluation. It focuses on production behavior: section-level extraction, failure modes, batch pipelines, integration patterns, and governance controls.

Section-level extraction scope

SDS section group | Typical extracted entities | Operational use
Sections 1-3 | Product identity, supplier details, composition tables | Supplier onboarding and master-data synchronization
Sections 4-8 | First-aid, fire controls, handling/storage, PPE | EHS workflows and worker-safety runbooks
Sections 9-12 | Physical properties, stability/reactivity, tox/ecotox summaries | Risk scoring, review queues, downstream analytics
Sections 13-16 | Disposal, UN transport fields, regulatory refs, revision metadata | Transport checks, compliance audit trail, version control

Failure modes and fallback behavior

  • Low OCR confidence: response contains warnings and confidence values so records can route to human review.
  • Malformed tables: table rows are normalized where possible and flagged when structure is incomplete.
  • Mixed-language labels: language normalization attempts mapping and returns warnings when ambiguity remains.
  • Missing required entities: required fields can be enforced by downstream validation rules before persistence.
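
The routing described in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the vendor's logic: the function name, the queue labels, and the 0.85 threshold are assumptions, and only the confidence_score and warnings fields come from the response sample on this page.

```python
from typing import Any

# Hypothetical threshold; tune it against your own review outcomes.
REVIEW_THRESHOLD = 0.85


def route_record(response: dict[str, Any], threshold: float = REVIEW_THRESHOLD) -> str:
    """Return "auto_accept" or "human_review" for one extraction response."""
    confidence = response.get("confidence_score")
    warnings = response.get("warnings") or []
    # Treat a missing score as untrusted rather than assuming it is fine.
    if confidence is None or confidence < threshold:
        return "human_review"
    # Any warning sends the record to review, even when overall confidence is high.
    if warnings:
        return "human_review"
    return "auto_accept"


sample = {
    "confidence_score": 0.91,
    "warnings": ["Low confidence in Section 14 transport table row 2"],
}
print(route_record(sample))  # human_review
```

Sending every warned record to review is the conservative choice. A team with high volume might instead route only warnings that touch fields it actually persists.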

Batch workflow and integration patterns

Single-file extraction uses POST /extract-sds. High-volume programs use batch jobs plus webhook completion events. Typical downstream patterns include ERP master-data sync, EHS ingestion queues, and exception routing to compliance QA.

curl -X POST "https://api.safetydatasheetapi.com/v1/extract-sds" \
  -H "Authorization: Bearer <api_key>" \
  -F "file=@supplier-sds.pdf" \
  -F "language_hint=en" \
  -F "schema_version=2026-01"
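
For teams calling the endpoint from application code, the same request can be assembled with the Python standard library. The sketch below only builds the request and does not send it. The helper name build_extract_request is hypothetical, the application/pdf part type is an assumption, and the field names are copied from the curl call above.

```python
import urllib.request
import uuid

API_URL = "https://api.safetydatasheetapi.com/v1/extract-sds"


def build_extract_request(api_key, filename, pdf_bytes,
                          language_hint="en", schema_version="2026-01"):
    """Build (but do not send) a multipart POST with the same fields as the curl call."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain form fields, one multipart section each.
    for name, value in (("language_hint", language_hint),
                        ("schema_version", schema_version)):
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'.encode()
        )
    # File section; the content type is an assumption, not a documented requirement.
    parts.append(
        (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            'Content-Type: application/pdf\r\n\r\n'
        ).encode()
        + pdf_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


# Placeholder bytes stand in for a real PDF read from disk.
req = build_extract_request("<api_key>", "supplier-sds.pdf", b"%PDF-1.7 ...")
# To send: urllib.request.urlopen(req, timeout=60), then parse the JSON body.
```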

Output model and warning semantics

{
  "request_id": "req_7f92d8",
  "confidence_score": 0.91,
  "schema_version": "2026-01",
  "section_coverage_score": 0.94,
  "warnings": [
    "Low confidence in Section 14 transport table row 2"
  ],
  "data": {
    "product_identification": { "product_name": "Acetone" },
    "hazards_identification": { "ghs_classification": ["Flammable Liquid - Category 2"] },
    "transport_information": { "un_number": "UN1090", "hazard_class": "3", "packing_group": "II" },
    "revision_metadata": { "revision_date": "2024-01-15" }
  }
}
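
Downstream systems such as ERP or EHS tables usually want a flat row rather than nested JSON. The following Python sketch flattens the response above. The function name, the output column names, and the needs_review flag are illustrative assumptions, and real responses may carry more fields than this sample shows.

```python
from typing import Any


def flatten_record(response: dict[str, Any]) -> dict[str, Any]:
    """Map the nested extraction response to one flat row for ingestion."""
    data = response.get("data", {})
    transport = data.get("transport_information", {})
    ghs = data.get("hazards_identification", {}).get("ghs_classification", [])
    return {
        "request_id": response.get("request_id"),
        "schema_version": response.get("schema_version"),
        "product_name": data.get("product_identification", {}).get("product_name"),
        # Join multiple classifications into one delimited column.
        "ghs_classification": "; ".join(ghs),
        "un_number": transport.get("un_number"),
        "hazard_class": transport.get("hazard_class"),
        "packing_group": transport.get("packing_group"),
        "revision_date": data.get("revision_metadata", {}).get("revision_date"),
        # Hypothetical flag: any warning marks the row for review.
        "needs_review": bool(response.get("warnings")),
    }


response = {
    "request_id": "req_7f92d8",
    "confidence_score": 0.91,
    "schema_version": "2026-01",
    "warnings": ["Low confidence in Section 14 transport table row 2"],
    "data": {
        "product_identification": {"product_name": "Acetone"},
        "hazards_identification": {"ghs_classification": ["Flammable Liquid - Category 2"]},
        "transport_information": {"un_number": "UN1090", "hazard_class": "3", "packing_group": "II"},
        "revision_metadata": {"revision_date": "2024-01-15"},
    },
}
row = flatten_record(response)
print(row["product_name"], row["un_number"], row["needs_review"])  # Acetone UN1090 True
```

Missing sections come back as None instead of raising, which leaves the decision about rejecting incomplete rows to the validation step.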

Real implementation scenarios

  • Chemical manufacturer: supplier SDS intake into ERP/PLM with revision and compliance traceability.
  • EHS platform vendor: embedded extraction API with webhook-driven tenant pipelines.
  • Distributor/importer: normalization across mixed supplier templates and regional transport requirements.

Governance and schema stability

Production teams typically enforce mandatory-field validation, confidence thresholds, warning-triggered review, and schema version pinning. For migration planning, use the schema versioning guide and the API docs.
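
Two of these controls, schema version pinning and mandatory-field validation, can be sketched as a pre-persistence check in Python. The required-field list and the function name are assumptions chosen for illustration; the field paths follow the response sample on this page.

```python
from typing import Any

PINNED_SCHEMA = "2026-01"

# Hypothetical mandatory fields, as (section, field) paths under "data".
REQUIRED_FIELDS = [
    ("product_identification", "product_name"),
    ("transport_information", "un_number"),
    ("revision_metadata", "revision_date"),
]


def validate(response: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record may be persisted."""
    problems = []
    version = response.get("schema_version")
    if version != PINNED_SCHEMA:
        problems.append(f"schema_version {version!r} != pinned {PINNED_SCHEMA!r}")
    data = response.get("data", {})
    for section, field in REQUIRED_FIELDS:
        # Empty strings and missing keys both count as missing.
        if not data.get(section, {}).get(field):
            problems.append(f"missing {section}.{field}")
    return problems


incomplete = {
    "schema_version": "2026-01",
    "data": {
        "product_identification": {"product_name": "Acetone"},
        "revision_metadata": {"revision_date": "2024-01-15"},
    },
}
print(validate(incomplete))  # ['missing transport_information.un_number']
```

Returning every problem at once, rather than failing on the first, gives reviewers a complete picture of what a record lacks.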

FAQ

Can this run in a governed enterprise workflow?

Yes. Responses include confidence and warning metadata for routing, validation, and audit-ready controls.

What if a document is scanned and low quality?

Scanned files are supported. Low-quality segments are surfaced through warning metadata for targeted review.

How do we integrate with existing systems?

Most teams map JSON/XML output into ERP or EHS ingestion pipelines and use webhooks for asynchronous completion handling.

Is this page the canonical commercial endpoint?

Yes. This page is the primary English commercial page for SDS extraction API evaluation.

Decision and proof links

Request a pilot implementation plan: talk to the implementation team, or run a live document through sample extraction.