[Solved] HTML parser node

At Hack Arizona I built a project with IBM Watson Node-RED that displays local bus schedules on the LCD kit. I used Watson because IBM was a sponsor of the hackathon, so I could apply for their prize.
From a functionality point of view, IBM Watson is very similar to Losant Workflows, but it’s more complicated and harder to use; their IoT platform is somewhat disconnected from the workflows, and it lacks the realtime MQTT debugging that is available on the Losant “application” page.

I did find one feature that Node-RED has but Losant Workflows doesn’t: an html node, which parses a string as an HTML DOM and applies a CSS3 selector to extract one or more elements as strings or JSON objects.
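To make the idea concrete, here is a minimal sketch in Python of what such a node does: parse a string as HTML and pull out the text of the elements matching a selector. It is not Node-RED's implementation. It uses only the standard library and supports just a `tag.class` selector rather than full CSS3, and the `SimpleSelector` class, the `select_text` helper, and the bus-schedule markup are all invented for illustration.

```python
from html.parser import HTMLParser


class SimpleSelector(HTMLParser):
    """Collect the text of every element matching a 'tag.class' selector."""

    def __init__(self, selector):
        super().__init__()
        # "td.due" -> tag "td", class "due"; a bare "td" matches any class.
        self.tag, _, self.cls = selector.partition(".")
        self.depth = 0      # > 0 while inside a matching element
        self.current = []   # text pieces of the element being read
        self.results = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            # Track same-name tags nested inside a match so the right
            # closing tag ends the match.
            if tag == self.tag:
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and (not self.cls or self.cls in classes):
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if self.depth and tag == self.tag:
            self.depth -= 1
            if self.depth == 0:
                self.results.append("".join(self.current).strip())

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


def select_text(html, selector):
    """Return the text content of each element matching the selector."""
    parser = SimpleSelector(selector)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical bus-schedule markup, invented for this example.
PAGE = """
<table>
  <tr><td class="route">Route 4</td><td class="due">5 min</td></tr>
  <tr><td class="route">Route 9</td><td class="due">12 min</td></tr>
</table>
"""

print(select_text(PAGE, "td.due"))  # ['5 min', '12 min']
```

The point is that the selector follows the document structure, so it keeps working when whitespace or attribute order changes, which is where regular expressions tend to break.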

This node would be useful for scraping information from a webpage that isn’t otherwise available as a JSON/XML feed. During the development of the LCD calories tracker, I attempted to import jQuery into the workflow, but it didn’t work because the workflow environment doesn’t have a DOM, and importing a DOM implementation brings in a whole lot more dependencies. I ended up using regular expressions and string processing, but that doesn’t scale to increasingly complex webpages.
I hope Losant Workflows can get an HTML parser node to simplify such applications.


This is a cool idea - I created a ticket for the feature request. Just for our reference, the source for NodeRED’s HTML node is here:

We added this node in our release about 2 weeks ago (https://www.losant.com/blog/platform-update-20170228), but forgot to reply to this thread. You can check out the documentation for it here.

Does this actually work for importing jQuery? It looks like it just finds text inside of an HTML element…

No, this doesn’t import jQuery and would not execute any JavaScript on the webpage. It only allows you to extract information from the DOM already present in the HTML structure.
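As a rough illustration of that distinction, the Python sketch below parses a page without running its scripts. It is a standard-library stand-in, not Losant's implementation; the `TextCollector` class, the `static_text` helper, and the sample page are hypothetical. Only the text already present in the markup is returned, and the value the script would have written never appears.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect text from the markup, ignoring <script> and <style> bodies."""

    def __init__(self):
        super().__init__()
        self.skip = 0     # > 0 while inside <script> or <style>
        self.texts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip:
            self.skip -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self.skip:
            self.texts.append(text)


def static_text(html):
    """Return the non-empty text nodes found in the HTML as written."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


# Hypothetical page: one value is in the markup, the other would only
# exist after a browser executed the script.
PAGE = """
<div id="static">Next bus: 5 min</div>
<div id="dynamic"></div>
<script>
  document.getElementById("dynamic").textContent = "Delay: 2 min";
</script>
"""

print(static_text(PAGE))  # ['Next bus: 5 min']
```

If a page fills in its data with JavaScript after loading, a parser like this sees only the empty placeholder element.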