How to Extract Data from Bills of Lading Automatically

Bills of lading can be parsed automatically using an AI-powered document parser. The parser identifies key fields — BOL number, shipper, consignee, cargo description, weight, and freight charges — across different carrier formats without requiring a custom template for each carrier. The extracted data can then be exported to a TMS, ERP, Google Sheets, or freight audit system via webhook or native integration.

TL;DR: Create a Parsio inbox → choose the AI PDF parser → define the BOL fields you need → forward BOL PDFs to the inbox → export structured data to your logistics or finance system automatically.

What Is a Bill of Lading

A bill of lading (BOL or B/L) is the primary legal document in freight shipping. It serves three purposes at once: a contract of carriage between the shipper and carrier, a receipt confirming the carrier has taken possession of the goods, and — in the case of a negotiable BOL — a title document that can be used to claim ownership of the cargo at the destination.

BOLs or their equivalents appear in every mode of freight transport. An ocean bill of lading accompanies container and bulk sea shipments and may be negotiable, meaning an original document must be presented to release the cargo at the port. A straight bill of lading is non-negotiable and consigns the goods directly to a named consignee; it is the usual form in road freight. An air waybill is the non-negotiable equivalent for air cargo. An inland waybill or rail bill covers domestic land movements where the document is not used as a title instrument.

Each type carries slightly different fields, but the core data structure is consistent: who is sending the goods, who is receiving them, what the goods are, how much they weigh, and who is paying for the shipment.

What Data Fields Does a Bill of Lading Contain

Despite carrier-specific layouts, every BOL captures a consistent set of logistics facts. Knowing which fields matter for your workflow is the first step before setting up automated extraction.

Shipment identifiers

  • BOL number — the carrier’s or freight broker’s reference for this shipment
  • PRO number — the carrier’s internal tracking number, common in US LTL trucking
  • Booking or purchase order reference — the shipper’s internal PO or job number
  • Container number and seal number — for ocean freight

Parties to the shipment

  • Shipper name and address
  • Consignee name and address
  • Notify party, if different from the consignee
  • Bill-to party, if freight charges are billed to a third party
  • Carrier name

Cargo details

  • Commodity description
  • Number and type of handling units (pallets, cartons, drums)
  • Gross weight
  • Freight class and NMFC code (US LTL shipments)
  • Volume or dimensions
  • Hazardous material indicator, if applicable

Routing and timing

  • Port of loading and port of discharge (ocean), or airport of departure and destination (air)
  • Origin and destination (road and rail)
  • Vessel name and voyage number (ocean)
  • Ship date or date of issue

Financial

  • Freight charges
  • Accessorial charges such as fuel surcharges or detention fees
  • Payment terms: prepaid, collect, or third party

Finance teams typically care most about BOL number, shipper, carrier, freight charges, and payment terms. Operations and logistics teams need cargo description, weight, handling units, and routing. Customs brokers and compliance teams need commodity descriptions, weights, and port information. A well-configured extraction schema captures all of these in one pass, so each downstream team has what they need without re-touching the source document.

Why BOL Data Extraction Is Challenging

Bills of lading are structurally harder to parse than standard invoices, and several specific issues make template-based approaches impractical for most freight operations.

No universal carrier format. Ocean carriers, LTL carriers, FTL operators, freight brokers, and 3PLs all produce different BOL layouts. The same carrier may update their template after a system change or rebrand. A parsing rule built for one carrier’s BOL will break the moment their format changes, and freight teams routinely deal with dozens of carriers simultaneously.

Complex cargo tables. The commodity section of a BOL typically contains multiple rows — one per SKU, commodity class, or handling unit type — with column structures that vary between carriers. Extracting these as repeating line items rather than a single block of text requires parsing logic that understands tabular structure in context.

Mixed digital and scanned documents. Freight operations still involve significant paper handling. Many inland BOLs are signed at pickup, stamped at the dock, and scanned. OCR must run before field extraction, and handwritten endorsements or port stamps can obscure original printed fields in scanned copies.

Multiple reference numbers on a single document. A single BOL may carry a shipper reference, a carrier reference, a booking number, a container number, and a customer PO number — each labeled differently depending on the carrier. Routing extracted data correctly downstream requires correctly identifying which number is which.

Carrier-specific terminology. What one carrier calls a "PRO number" another labels a "waybill number." What appears as "freight charges" on one BOL appears as "transportation charges" on another. Field definitions need to account for these synonyms to extract consistently across carriers.
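The synonym problem can be illustrated with a small Python sketch. It is not how Parsio resolves labels internally (there, the synonyms go into the field descriptions); it only shows the idea for teams that post-process raw labels themselves. The synonym table and the canonical field names are assumptions made up for this example.

```python
import re

# Illustrative synonym table (an assumption for this example, not a carrier standard).
LABEL_SYNONYMS = {
    "carrier_reference": ["pro number", "pro no", "waybill number", "carrier reference"],
    "freight_charges": ["freight charges", "transportation charges"],
}


def _normalize(label):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    letters_only = re.sub(r"[^a-z0-9 ]", " ", label.lower())
    return re.sub(r"\s+", " ", letters_only).strip()


# Reverse lookup: normalized label text -> canonical field name.
_LOOKUP = {
    _normalize(synonym): canonical
    for canonical, synonyms in LABEL_SYNONYMS.items()
    for synonym in synonyms
}


def canonical_field(label):
    """Return the canonical field name for a printed label, or None if unknown."""
    return _LOOKUP.get(_normalize(label))
```

Calling canonical_field("PRO Number:") and canonical_field("Waybill Number") both return "carrier_reference", so downstream systems see one field regardless of which carrier issued the document.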

How to Set Up Automated BOL Extraction in Parsio

Step 1: Create a dedicated BOL inbox

In Parsio, create a new inbox and name it for bills of lading. If you receive BOLs from multiple carriers, keep them in a single inbox — the AI parser handles layout variation without needing separate configurations per carrier. If you process different logistics document types (BOLs, freight invoices, delivery notes) and want separate extraction schemas or export destinations for each, give them their own inboxes.

Step 2: Choose the AI-powered PDF parser

Select the AI-powered PDF parser for your BOL inbox. Unlike the template-based parser — which requires building field rules for each carrier format and breaks when layouts change — the AI PDF parser reads each document’s visual and textual structure and locates your defined fields across different layouts. Bills of lading fit within the category of structured, page-bounded documents the parser handles well: consistent field categories, tabular cargo data, and defined party and routing sections.

For scanned BOLs and image-based PDFs such as photographed pages, OCR is applied automatically before the structured extraction runs. The system converts the scanned image to readable text first, then extracts the defined fields from that text.

Step 3: Define your extraction fields

Define the specific BOL fields your team needs. A standard schema for a freight operations team might include:

  • BOL number
  • PRO number or waybill number
  • Shipper reference or PO number
  • Ship date
  • Shipper name and address
  • Consignee name and address
  • Carrier name
  • Commodity description (repeating, one entry per cargo line)
  • Number of handling units per cargo line
  • Gross weight per cargo line
  • Freight class, if applicable
  • Total freight charges
  • Payment terms (prepaid, collect, or third party)

If your team processes ocean BOLs, also add container number, seal number, port of loading, port of discharge, and vessel name.

Write field descriptions clearly, especially where carrier terminology varies. For the carrier tracking number, a description such as "the carrier’s internal reference for this shipment, which may be labeled PRO number, waybill number, or carrier reference" covers the common label variants. For cargo line items, ask for each row as a separate structured entry rather than a single merged description.
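To make the shape of such a schema concrete, here is a minimal Python sketch of the same fields as data classes, with the cargo table modeled as a repeating list. The class and field names (BolRecord, CargoLine, carrier_reference, and so on) are hypothetical and chosen for this example; they are not Parsio identifiers.

```python
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class CargoLine:
    """One row of the cargo table (a repeating item)."""
    description: str
    handling_units: int
    gross_weight: float                   # in whatever unit the BOL states
    freight_class: Optional[str] = None   # US LTL only


@dataclass
class BolRecord:
    """Hypothetical container for the fields listed in the schema above."""
    bol_number: str
    shipper: str
    consignee: str
    carrier_name: str
    carrier_reference: Optional[str] = None   # PRO, waybill, or carrier reference
    shipper_reference: Optional[str] = None   # shipper reference or PO number
    ship_date: Optional[str] = None
    total_freight_charges: Optional[str] = None
    payment_terms: Optional[str] = None       # prepaid, collect, or third party
    cargo_lines: List[CargoLine] = field(default_factory=list)


def total_gross_weight(record):
    """Sum the gross weight over all cargo lines."""
    return sum(line.gross_weight for line in record.cargo_lines)


# Example record with two cargo lines (invented sample data).
example = BolRecord(
    bol_number="BOL-1001",
    shipper="Acme Paper Co.",
    consignee="Beta Print Ltd.",
    carrier_name="XYZ Freight",
    cargo_lines=[
        CargoLine("Paper rolls", 10, 4200.0, "55"),
        CargoLine("Resin drums", 4, 1800.0),
    ],
)
example_as_dict = asdict(example)  # nested dict, ready to serialize as JSON
```

Keeping cargo lines as a list of small records, rather than one text field, is what lets later steps sum weights or compare charges per line.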

Step 4: Route BOL documents into the inbox

BOLs typically arrive through one of several paths:

Email forwarding — many carriers and freight platforms send BOL confirmations as PDF attachments. Forward these emails directly to the Parsio inbox address. The parser extracts data as soon as the email arrives, with no additional setup.

Manual or batch upload — for paper BOLs received at the dock and scanned, upload the PDFs directly to the inbox. Batch uploads handle end-of-day processing for teams that accumulate documents throughout the day.

Automation via Zapier or Make — if BOL PDFs land in a shared folder (Google Drive, SharePoint, Dropbox) or arrive through a freight management portal, connect that source to Parsio using a no-code workflow. New BOLs are picked up automatically and submitted to the inbox without manual steps. See Best Ways to Automate Document Parsing in Zapier, Make and n8n for setup details.

API submission — for teams building freight or logistics platforms, the Parsio API accepts document submissions programmatically. This integrates high-volume BOL processing directly into a TMS or freight brokerage system without workflow tools.

Step 5: Export extracted data to your logistics systems

Once extraction runs, the structured BOL data is available for export. Where it goes depends on your workflow:

  • TMS or ERP via webhook — send extracted BOL data as JSON directly to your Transportation Management System or ERP on document arrival. This eliminates manual entry into freight management systems entirely, and the webhook fires as soon as Parsio finishes processing each document.
  • Google Sheets — the fastest starting point for teams that currently aggregate BOL data in spreadsheets. Each processed BOL appends a row with all extracted fields. From there, formulas and pivot tables can handle carrier comparisons, shipment tracking, cost summaries, and audit reviews.
  • Freight audit software — route extracted freight charges and carrier references to your freight audit tool for automated invoice matching and dispute detection. Catching billing errors before payment requires having BOL data in structured form to compare against carrier invoices.
  • Make or Zapier multi-step workflows — use extracted BOL fields to trigger downstream actions: create shipment records in your WMS, notify receiving teams via Slack, update customer-facing tracking portals, or route cargo data to customs brokers.
  • CSV or Excel — for batch export at the end of a shipping period, download all processed BOLs as a structured CSV and import into any system that accepts tabular data.
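For teams that receive the JSON themselves, the following Python sketch shows one way to flatten a parsed BOL into spreadsheet-style rows, one row per cargo line. The payload shape and key names are assumptions for illustration only; check the actual structure your inbox sends before relying on them.

```python
import csv
import io
import json

# Assumed key names; the real payload depends on the fields you define.
HEADER_FIELDS = ["bol_number", "carrier_name", "payment_terms"]
LINE_FIELDS = ["description", "handling_units", "gross_weight"]


def payload_to_rows(payload):
    """Repeat the shipment-level fields on every cargo line."""
    header = {key: payload.get(key, "") for key in HEADER_FIELDS}
    lines = payload.get("cargo_lines") or [{}]  # still emit one row if no lines
    return [
        {**header, **{key: line.get(key, "") for key in LINE_FIELDS}}
        for line in lines
    ]


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=HEADER_FIELDS + LINE_FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented sample payload for demonstration.
SAMPLE = json.loads("""
{
  "bol_number": "BOL-1001",
  "carrier_name": "XYZ Freight",
  "payment_terms": "prepaid",
  "cargo_lines": [
    {"description": "Paper rolls", "handling_units": 10, "gross_weight": 4200},
    {"description": "Resin drums", "handling_units": 4, "gross_weight": 1800}
  ]
}
""")

sample_rows = payload_to_rows(SAMPLE)
sample_csv = rows_to_csv(sample_rows)
```

One row per cargo line keeps the output easy to pivot in a spreadsheet, at the cost of repeating the shipment-level fields on each row.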

Common Failure Modes in BOL Extraction

Handwritten and stamped fields. Many physical BOLs are partially filled in by hand — weight figures, dates, and reference numbers added at the dock — or carry port stamps and endorsements applied after the document was issued. OCR handles clean printed text well but is less consistent on handwriting or low-contrast stamps. For workflows where handwritten fields are critical, validate those specific values on a sample before running at full volume.
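One way to build that validation into a pipeline is to check that handwritten-prone values parse cleanly and to send anything that does not to a reviewer. The sketch below does this for weight fields in Python. The accepted formats (digits with optional thousands separators and an optional lb or kg unit) and the field name gross_weight are assumptions for this example.

```python
import re

# Accepts values such as "1200", "1,200 lbs", or "450.5 kg" (assumed formats).
_WEIGHT = re.compile(
    r"^\s*(\d{1,3}(?:,\d{3})+|\d+)(?:\.(\d+))?\s*(lb|lbs|kg|kgs)?\.?\s*$",
    re.IGNORECASE,
)


def parse_weight(text):
    """Return the weight as a float, or None if the text is not a clean number."""
    match = _WEIGHT.match(str(text))
    if not match:
        return None
    whole = match.group(1).replace(",", "")
    fraction = match.group(2) or "0"
    return float(f"{whole}.{fraction}")


def fields_needing_review(record, weight_fields=("gross_weight",)):
    """List the weight fields whose extracted value could not be parsed."""
    return [
        name for name in weight_fields
        if parse_weight(record.get(name, "")) is None
    ]
```

A value such as "12OO", where OCR read zeros as the letter O, fails the pattern and is flagged for review instead of flowing into the TMS as bad data.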

Multiple cargo line items collapsed into one field. If the BOL lists several commodity classes or handling unit types, the cargo table contains multiple rows. Make sure your field definition asks for these as a repeating structure, not as a single block of text. Without a repeating schema, the parser may return a merged string that is hard to split downstream.
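A quick structural check on the parsed output catches this failure early. The Python sketch below assumes the cargo section arrives under a key named cargo_lines with description and gross_weight sub-fields; those names are hypothetical and should be swapped for whatever your schema uses.

```python
def check_cargo_lines(record):
    """Return a list of human-readable problems with the cargo section."""
    problems = []
    cargo = record.get("cargo_lines")
    if isinstance(cargo, str):
        # The failure mode described above: rows merged into one block of text.
        problems.append("cargo_lines is a single merged string, not a repeating list")
    elif not isinstance(cargo, list) or not cargo:
        problems.append("cargo_lines is missing or empty")
    else:
        for index, line in enumerate(cargo, start=1):
            for key in ("description", "gross_weight"):
                if not isinstance(line, dict) or line.get(key) in (None, ""):
                    problems.append(f"cargo line {index} is missing {key}")
    return problems
```

An empty result means the cargo table came through as a proper repeating structure; anything else is worth fixing in the field definition before processing at volume.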

Carrier-specific abbreviations and freight class codes. Freight class values (50, 55, 60, 65, 70, and so on up to 500) and NMFC codes are standard in US LTL freight but are not meaningful to all downstream systems. Include them in your schema if your freight audit or ERP needs them; exclude them if they add noise without value.

Split BOLs across multiple legs. When a single shipment involves multiple carriers or transfer points, several BOLs may be issued with cross-references between them. If your workflow needs to link related documents, add a field for any referenced BOL number so the relationship is captured at extraction time.
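Once the referenced BOL number is captured, linking the legs is a grouping problem. The Python sketch below groups records that point at each other, directly or through a chain, using a small union-find structure. The key names bol_number and referenced_bol_number are assumptions for this example.

```python
from collections import defaultdict


def group_related_bols(records):
    """Group BOL numbers that are linked by cross-references.

    Each record is a dict with 'bol_number' and an optional
    'referenced_bol_number' (assumed key names).
    """
    parent = {}

    def find(number):
        parent.setdefault(number, number)
        while parent[number] != number:
            parent[number] = parent[parent[number]]  # path halving
            number = parent[number]
        return number

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for record in records:
        find(record["bol_number"])  # register BOLs that have no references
        reference = record.get("referenced_bol_number")
        if reference:
            union(record["bol_number"], reference)

    groups = defaultdict(set)
    for number in parent:
        groups[find(number)].add(number)
    return sorted(sorted(group) for group in groups.values())
```

For three legs where B references A and C references B, plus an unrelated BOL D, the function returns [["A", "B", "C"], ["D"]], so the whole shipment can be handled as one unit.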

Cluttered scanned copies. The shipper’s retained copy of a BOL sometimes carries fewer stamps and endorsements than the carrier’s or consignee’s copy. If scanned BOLs consistently produce extraction errors on specific fields, test with the cleaner version of the document from the same shipment.

Downstream Workflows That Use Structured BOL Data

Once BOL extraction is running reliably, the structured output enables workflows that are impractical with PDFs alone:

Freight invoice matching. Compare freight charges extracted from the BOL against the carrier’s invoice to catch billing discrepancies before payment. This is one of the highest-value automation opportunities for accounts payable teams that process freight: even small per-shipment billing errors add up significantly at volume. See How to Extract Data from Freight Invoices Automatically for the invoice side of this workflow.
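The comparison itself is simple once both amounts are structured. A minimal Python sketch follows; the 0.50 tolerance, the status labels, and the amount formats handled (dollar sign and thousands separators) are assumptions for this example, not recommended audit settings.

```python
from decimal import Decimal, InvalidOperation


def parse_amount(text):
    """Parse an amount such as '$1,250.00'; return None if it is not a number."""
    cleaned = str(text).replace(",", "").replace("$", "").strip()
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    return value if value.is_finite() else None


def compare_charges(bol_amount, invoice_amount, tolerance=Decimal("0.50")):
    """Return (status, difference), where difference is invoice minus BOL."""
    bol = parse_amount(bol_amount)
    invoice = parse_amount(invoice_amount)
    if bol is None or invoice is None:
        return ("review", None)  # unreadable amount: send to a person
    difference = invoice - bol
    status = "match" if abs(difference) <= tolerance else "discrepancy"
    return (status, difference)
```

Using Decimal rather than float avoids rounding artifacts on currency values, and returning the signed difference shows at a glance whether the carrier billed over or under the BOL amount.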

Customs documentation pre-population. Use shipper, consignee, cargo description, weight, and port data drawn from the BOL to pre-fill import entry forms, reducing customs broker processing time and the risk of transcription errors on declarations.

Warehouse receiving preparation. Send consignee and cargo data to a WMS receiving queue automatically when a BOL is processed, so the warehouse team knows what to expect before the truck arrives. See How to Extract Data from Delivery Notes Automatically for the complementary receiving-side workflow.

Customer shipment notifications. Trigger outbound emails or messages to customers when a BOL is processed, including the carrier reference number and relevant cargo details, without anyone on the operations team manually composing that communication.

Freight emissions tracking. Use shipment weight, mode of transport, and the origin and destination extracted from BOLs as inputs to Scope 3 freight emissions calculations, which are increasingly required under corporate ESG reporting frameworks.

FAQ

Can Parsio handle bills of lading from multiple carriers without separate templates?

Yes. The AI-powered PDF parser reads each document’s visual and textual structure to locate the fields you defined, rather than relying on coordinate-based rules tied to a specific layout. It therefore works across carrier formats (ocean BOLs, US LTL truck BOLs, air waybills, inland waybills) without a separate template for each carrier, and a single inbox with a common field definition can handle BOLs from multiple carriers. Some highly unusual carrier formats or very low-quality scans may need field description adjustments after initial testing, but this is the exception rather than the rule. For freight operations teams that deal with many carriers simultaneously, the practical benefit is that you define what you need once instead of maintaining a dedicated setup per carrier.

What is the difference between a BOL number and a PRO number, and should I extract both?

A BOL number is the reference assigned when the shipment is booked — it identifies the shipment on the shipping documents and in the shipper’s or freight broker’s system. A PRO number (short for progressive number) is the carrier’s own internal tracking number, assigned when the carrier takes possession of the freight, and is most common in US LTL trucking. On ocean BOLs, the carrier’s reference is typically the BOL number itself. On air shipments, it is the air waybill number. Whether you need both depends on your downstream use. For freight audit and payment reconciliation, the carrier’s reference (PRO or BOL number) is the match key against invoices. For order management and customer communication, the shipper reference or PO number is more useful. Extracting all available reference numbers — BOL number, PRO number, booking reference, and any customer PO on the document — means every downstream team has the identifier they need without returning to the source document.

How does extraction handle cargo tables with multiple commodity rows?

Bills of lading often list cargo as a table with multiple rows — one per commodity class, SKU, or handling unit type — each with its own description, quantity, weight, and freight class. The AI PDF parser can extract these as a structured list of items rather than collapsing them into a single text field, provided the field definition asks for them in a repeating format. In Parsio, you define the cargo section as a repeating item by listing the sub-fields you want for each row — description, number of pieces, weight, freight class — and the parser returns one entry per cargo line. This is important for freight audit, where you need to match individual commodity charges rather than a single total, and for customs filings, where each commodity must be declared separately with its own description and weight.

Can Parsio process scanned or partially handwritten bills of lading?

Yes, with some important caveats. Parsio’s AI PDF parser includes OCR processing that converts scanned images to readable text before extraction runs. For clean, high-resolution scans of digitally generated BOLs, extraction typically works as well as it does on native digital PDFs. The challenge comes with partially handwritten BOLs — common in trucking, where the driver may fill in weight or date fields at pickup, or where dock workers add stamps and signatures after the BOL was originally issued. OCR handles printed text reliably but is less consistent on handwriting, especially low-contrast stamps or faded ink. For workflows where handwritten fields are critical — weight at pickup, driver signature date, dock notes — test extraction on a representative sample first and build in a review step for those specific fields before running at full volume.

What systems can I connect BOL data to after extraction?

Extracted BOL data from Parsio can reach most business systems without requiring development work. The most direct connection is via webhook: Parsio sends structured JSON to any endpoint you specify — a TMS, WMS, ERP, or customs broker platform — as soon as a BOL is processed. For teams that already use Zapier or Make, Parsio has native connectors that let you build multi-step automation: trigger on a new parsed document, filter by carrier or freight class, then create a shipment record in Airtable, update a row in Google Sheets, send a Slack notification to the receiving team, or push data into a logistics platform. Google Sheets is the most common starting point for freight teams that currently work in spreadsheets — each BOL appends a row automatically, which feeds dashboards, carrier performance reviews, and audit workflows. For teams building custom logistics software, the Parsio API returns structured data from submitted documents and can be integrated directly into existing pipelines.

How do I handle BOLs where payment terms are collect versus prepaid?

Payment terms (prepaid, where the shipper pays; collect, where the consignee pays; or third party) appear as a checked box or text field on most BOLs. Extract this as an explicit field in your schema, for example a field named "freight payment terms" whose value is prepaid, collect, or third party. This distinction is especially important for accounts payable teams. A collect BOL means the consignee’s AP team is responsible for the freight invoice when it arrives from the carrier. A prepaid BOL means the shipper has already settled or will settle directly with the carrier. Routing logic in your TMS or ERP often depends on this field. If you process a mix of prepaid and collect shipments, use payment terms as a filter in a Zapier or Make workflow to route documents automatically to the right cost center or invoice queue, rather than relying on someone to check each BOL individually.
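For teams that route in their own code rather than in Zapier or Make, the same filter can be sketched in a few lines of Python. The queue names, the payment_terms key, and the text variants recognized here (such as "PPD" or "3rd party") are assumptions for this example.

```python
# Hypothetical queue names; replace with your own cost centers or inboxes.
QUEUES = {
    "prepaid": "shipper-ap",
    "collect": "consignee-ap",
    "third_party": "third-party-billing",
}


def normalize_payment_terms(raw):
    """Map free-text payment terms to prepaid, collect, or third_party."""
    text = (raw or "").strip().lower()
    if "third" in text or "3rd" in text:
        return "third_party"
    if "collect" in text:
        return "collect"
    if "prepaid" in text or "ppd" in text:
        return "prepaid"
    return None  # unrecognized wording


def route(record):
    """Pick a queue from the payment terms; unknown values go to manual review."""
    terms = normalize_payment_terms(record.get("payment_terms"))
    return QUEUES.get(terms, "manual-review")
```

Sending unrecognized values to a manual review queue, rather than guessing, keeps a misread checkbox from landing a freight invoice with the wrong payer.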

How does BOL extraction differ from freight invoice extraction, and should I set up both?

A BOL is created when the shipment is booked and travels with the goods, capturing what was shipped, by whom, to whom, and at what agreed freight rate. A freight invoice is issued by the carrier after delivery, billing for the actual charges. Both documents carry freight charges, but the BOL represents the agreed terms while the invoice represents what the carrier actually billed. Setting up automated extraction for both is the foundation of a freight audit workflow: compare the freight charge on the BOL against the freight charge on the carrier’s invoice, and flag discrepancies for review before payment. Without extraction on both sides, this comparison happens manually — or not at all, which means billing errors are often paid without review. Most teams that automate BOL extraction find that adding freight invoice extraction in a parallel inbox immediately unlocks a freight audit workflow that pays for the setup cost in caught billing errors within the first few months.

For the invoice side of this workflow, see How to Extract Data from Freight Invoices Automatically.
