Introducing tables & repetitive data parsing

We are happy to announce that today Parsio reached a huge milestone – it’s finally possible to extract data from tables and repetitive structures such as lists 🎉.

Until now, Parsio has been great at parsing emails and attachments with a fixed structure. But what if you need to parse an invoice or a receipt with an unknown number of rows, or an Amazon confirmation email with 1, 2, 3, or more ordered items? Parsio users have already run into these situations and had to use a workaround: creating multiple templates. The first template parsed an email with 1 product, the second an email with 2 products, and so on. This approach works, but managing all the templates is not always easy, especially when you need to edit them.

The Solution

Today we are introducing a brand-new solution: a new "Table" field type.

Template creation
The parsed result

How does it work?

Technically speaking, parsing a table is a pretty complex task (more complex than we’d imagined at the beginning). To achieve a perfect result, the table must be well structured, and users should be very precise when highlighting the data to export. For example, the selection shouldn’t end with line breaks or include any HTML code from adjacent elements. Let’s face it: doing this perfectly every time is impossible.
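To illustrate why stray markup and trailing line breaks matter, here is a minimal Python sketch of how a highlighted selection could be tidied before use. It is not Parsio's code: `clean_selection` is a hypothetical helper, and it assumes the selection arrives as a raw HTML fragment.

```python
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and ignores every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def clean_selection(selection: str) -> str:
    """Hypothetical helper: drop stray tags and surrounding line breaks."""
    parser = _TextOnly()
    parser.feed(selection)
    parser.close()  # flush any text still buffered by the parser
    text = "".join(parser.parts)
    # Collapse runs of whitespace (including line breaks) and trim both ends.
    return " ".join(text.split())


# A selection that accidentally includes neighbouring tags and trailing breaks.
print(clean_selection("</td><td>Blue T-shirt\n\n"))
```

A selection that is already precise passes through unchanged, which is why careful highlighting gives the most predictable results.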

Once you start selecting the first cell value to extract, Parsio will run a powerful algorithm to determine the whole table structure, trying to split it into rows and columns. To improve the result, keep selecting data: each selection helps Parsio refine the table structure template.
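The general idea of splitting a table into rows and columns can be sketched in a few lines of Python. This is only an illustration under simple assumptions (a well-formed HTML table built from `tr` and `td`/`th` tags), not Parsio's actual algorithm; the class `TableRows`, the function `parse_table`, and the column names are all hypothetical.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the text of each cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Normalise whitespace inside the cell before storing it.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html, columns):
    """Return one dict per row; rows with a different cell count are skipped."""
    parser = TableRows()
    parser.feed(html)
    parser.close()
    return [dict(zip(columns, row)) for row in parser.rows
            if len(row) == len(columns)]


order_html = """
<table>
<tr><td>Blue T-shirt</td><td>2</td><td>$19.99</td></tr>
<tr><td>Coffee mug</td><td>1</td><td>$7.50</td></tr>
</table>
"""
items = parse_table(order_html, ["name", "quantity", "price"])
for item in items:
    print(item)
```

Because the result is a list with one entry per row, the same template handles an email with one ordered item or with ten, which is what the multiple-template workaround could not do.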

Limitations

Although this new table parsing feature is meant to cover most use cases (based on feedback from existing users), it should still be considered a beta version. Our new approach has a couple of important limitations: it is currently not possible to parse plain-text emails (or PDF files converted to text), nor to export links and other "hidden" data from HTML attributes.

Next steps

We hope this new feature will prove helpful and make your business processes even smoother. Don’t hesitate to share your feedback with us: we will use all of it to make table parsing more robust so that it can meet all of your needs.

Learn more: Extracting tables and repetitive data with Parsio