Learn how to use Xpath, XML, IndexOf and Substring expressions in Power Automate in order to extract an HTML table from an incoming email, convert to a JSON array, and import into a new Excel File containing a table. I will talk you though the basics of HTML Tables, how to read the tags and understand how XPath might help you with your data extraction.
#PowerAutomate #JSON #HTML #Table #Excel
00:00 Intro
01:01 Let’s get started
02:25 HTML Tables W3Schools
03:31 How to Use Xpath in Power Automate
05:35 Quick Overview of HTML to JSON Array
06:14 Explore the solution in detail
10:22 Further explanation of XPath
11:03 Select vs Apply to Each method of extracting data from HTML
18:12 Create Excel File and populate with HTML Table
20:15 A demo of the solution, HTML Table to JSON to Excel
21:23 Summary and thanks for watching
Download Flow
Create and Populate and Excel File FAST
Create a new Excel File in Power Automate …(read more)
Convert Word 2 PDF: Word to PDF Converter
Convert PowerPoint 2 PDF: PPT to PDF Converter
Convert Excel 2 PDF: Excel to PDF Converter
Convert an Image 2 PDF: Image to PDF Converter
Convert HTML 2 PDF: HTML to PDF Converter
More Tools: PDF Converter
Hi Damien, as with our email discussion.
Some of you guys might face some issues with the xpath when following this tutorial. You might get this error when you run the flow that is shown in this tutorial : "Jason property name: the property with name " already exists.".
Fear not, this is an issue with xpath.
In my working solution to select 'Country', I used the following xpath. " xpath(xml(outputs('ComposeHTML')), concat('//table//tr[1]/td[1]//text()'))?[0]
(Use wildcard to select text, aka use "//text()" rather than "/text()", as the formatting for the table headers might not allow you to select the text() properly.)
Do the same for the remaining column headers and anytime that these column headers are used in the tutorial. Another best practice for Microsoft flow is to create variables first, so it is easier to just change 1 variable declaration rather than finding all the points where this expression is used and changing it individually.
I hope this helps!
Hello Could you please help me in creating this I want the outlook table rows data into different excel column i downloaded you github file also but its not working for me it will be really helpfull if you can help me on this
This didnt work but it was the last straw that broke the camels back. I had to go and learn how to use Arrays, strings, split and int (all from your vids) and have managed to make something that works despite the CSS stopping this solution working for me 🙂
First 4 steps run fine.. but at the ComposeHTML step I'm getting the following error:Unable to process template language expressions in action 'ComposeHTML' inputs at line '0' and column '0': 'The template language function 'xml' parameter is not valid. The provided value cannot be converted to XML: 'Reference to undeclared entity 'nbsp'. Line 1, position 8201
Any easy fix to this fail point? Thank you
Great video and explanation
Hi damian, im trying to parse and email that haves 10 html tables, and particularry i want to extract the data of the 8 table, do you have any idea of how can i get the first row of that table and the last row of that table, so i could make the loop properly, with no empty results, if you want i can contact you and send a sample of the email body. Thanks for the video
Do you have any idea of what can i change in step composehtml, because I´m trying to run the flow with an email that contains another html and sends me the message of error in that step.
When I try with the html of the example, the issue does not happen.
Thanks for the video
Hi Damo
Im trying with another html table and It is not possible to go further than the step compose html, the error message "Unable to process template language expreessions in action "Composehtml" inputs at line "0" and column "0". The template language function "xml" parameter is not valid. The provided value cannot be converted to xml:"Unexpected end of file has occurred. The following elements are not closed: td, tr, tbody, table. Line 1, position 773
The thing is when i try with the example of the video, the flow goes right, but when i tried with another email the resultd is error on that step.
Any idea about this stuff?
Thanks for the video
Hi Damien,
Can I use the SelectRowsFromHTML action if I receive tables with the same headers but different number of rows each time? From what I understand, it can only be used if the number of rows received is fixed.
Is this flow shared or can I find it in the power automate library?
Is there any solution if there are more than 1 table in the email?
Hi Damien, how do i pull 2nd table from my email. using your method, it stops at the first table
Nice vídeo! When I Have two table in mail
In "add rows to excel table" row option keeps disappearing and when i continue without it, I'm getting 'Rows' is required error
I have created an excel file with table and columns in onedrive.
I got Json array of my table data from email.
My question is, how to add those array data to rows and remove duplicates afterwards.
You're grossly underliked my friend – for that a sub from me
Hi Damien, excellent tutorial. It worked perfectly with the tip in the comment section to replace '/text()' to '//text()". I'm trying to, instead of making a new excel file with the table, update a table in an existing based on the data of the table in the mail via a key in the first column. I haven't figured out if that's possible yet but this seems like a first step to achieve that.
Let's say emails are received regularly with different amount of rows each time. Would this flow no longer work as you need to specify how many rows the table has? I'm wondering how to make the flow dynamic so it can handle any amount of rows..thanks
hello, I have a problem. Why the header and first row is taken both? Can someone help me?
Hi Damien,
Congratulations on the case.
It's fantastic.
I'm trying to create the table from a file of 17 columns and an average of 1800 rows and the process is taking an average of 3 and a half hours to complete.
In the stage of composing the table, through the Xpath that is the bottleneck.
Is there any way to improve this performance?
Great video. Thanks for your effort. Everything was nicely explained.
Hey @damoBird365 thanks for the spot-on content and detailed info.
In the GetHTMLDataAsJSONArray > ComposeStart and END table will it be possible to extract table data from .html file attached on email or saved SharePoint/OneDrive.?
Many thanks in advance
Hi Damien, thank you so much for sharing. I am working with larger data sets and Apply-to-each causes too many API calls. Adapted to my needs and really superfast.
InvalidTemplate. The execution of template action 'SelectRowsFromHTML' failed: The evaluation of 'query' action 'where' expression '{ "@{xpath(xml(outputs('ComposeHTML')),'//table[1]//tr[1]/th[1]/text()')?[0]}": "@xpath(xml(outputs('ComposeHTML')),concat('//table[1]//tr[',item(),']/td[1]/text()'))?[0]", "@{xpath(xml(outputs('ComposeHTML')),'//table[1]//tr[1]/th[2]/text()')?[0]}": "@xpath(xml(outputs('ComposeHTML')),concat('//table[1]//tr[',item(),']/td[2]/text()'))?[0]", "@{xpath(xml(outputs('ComposeHTML')),'//table[1]//tr[1]/th[3]/text()')?[0]}": "@xpath(xml(outputs('ComposeHTML')),concat('//table[1]//tr[',item(),']/td[3]/text()'))?[0]", "@{xpath(xml(outputs('ComposeHTML')),'//table[1]//tr[1]/th[4]/text()')?[0]}": "@xpath(xml(outputs('ComposeHTML')),concat('//table[1]//tr[',item(),']/td[4]/text()'))?[0]" }' failed: 'Unable to evaluate template language expression '@{xpath(xml(outputs('ComposeHTML')),'//table[1]//tr[1]/th[2]/text()')?[0]}' as JSON property name: the property with name '' already exists.'.
Hi Damien,
Great explanation. How would you handle nested tables? For example <head></head><body><table><tr><td><table><tr><td>second table</td></tr></table></td></tr></table></body>
would you code just return:
<table><tr><td><table><tr><td>second table</td></tr></table>
Thanks!
Hi Damien – Thanks for this tutorial. I'm trying this process and I can't seem to get past ComposeHTML. I'm seeing the error — "The 'col' start tag on "some position" does not match the end tag of 'colgroup'.
In my HTML expression I have start tag for colgroup and four col elements (representing my columns) and an end tag for colgroup (no start and end tag for col, just directly col and its parameters for each column). But with this segment, I suppose it should have been addressed given the replacement of </colgroup> with </col></colgroup>.
I wonder what I'm missing.