barry-from-ws
Active Member
6 months ago

Removing HTML tags while ingesting using the API

Has anyone tried to remove tags from the HTML body of email campaigns when ingesting into a data warehouse? I'd love to have a human-readable version of the email body with the tags stripped out.

I don't want to build a bespoke tag-stripping pipeline: doing it in the warehouse is prohibitively slow, and the alternative is to put something ad hoc between e.g. Fivetran and my data warehouse.

These ad-hoc solutions can be error-prone and can expose security holes for customers if the HTML parser they rely on isn't well designed. It would be so much nicer and more reliable to have the tag-stripped body available as a field during ingestion.
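For context, the kind of ad-hoc step I mean would look roughly like the sketch below. It uses only Python's standard-library `html.parser`; the names `strip_tags` and `_TextExtractor` and the choice of which tags count as block-level are my own assumptions, not anything the API provides. It is a minimal sketch, not a hardened implementation.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and drops tags (hypothetical helper)."""

    # Content inside these tags is never shown to a reader, so skip it.
    SKIP = {"script", "style"}
    # Assumed set of tags that should start a new line in the output.
    BLOCK = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &amp; and friends
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.chunks.append("\n")

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def strip_tags(html_body: str) -> str:
    """Return a plain-text rendering of an HTML email body."""
    parser = _TextExtractor()
    parser.feed(html_body)
    parser.close()
    raw = "".join(parser.chunks)
    # Collapse runs of whitespace within each line and drop empty lines.
    lines = (" ".join(line.split()) for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    body = "<div><h1>Spring sale</h1><p>Save <b>20%</b> today &amp; tomorrow.</p></div>"
    print(strip_tags(body))
```

Note that the result is plain text with entities decoded, so it would still need escaping if it were ever rendered back into a web page. That is exactly the kind of detail that makes me wary of maintaining this myself.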

Wondering if anyone has had a similar need and how you tackled it!
