[Solved] HTML parser node

yoursunny · January 16, 2017, 4:47am

At Hack Arizona I built a project with IBM Watson Node-RED that displays local bus schedules on the LCD kit. I used Watson because IBM was a sponsor of the hackathon, so I could apply for their prize.
In terms of functionality, IBM Watson is very similar to Losant Workflows, but it’s more complicated and harder to use; their IoT platform is somewhat disconnected from the workflows, and it lacks the real-time MQTT debugging that is available on the Losant “application” page.

I did find one feature that Node-RED has but Losant Workflows doesn’t: an html node, which parses a string as an HTML DOM and applies a CSS3 selector to extract one or more elements as strings or JSON objects.

This node would be useful for scraping information from a webpage that isn’t otherwise available as a JSON/XML feed. During the development of the LCD calories tracker, I attempted to import jQuery into the workflow, but it didn’t work because the workflow environment doesn’t have a DOM, and importing a DOM implementation pulls in a whole lot more dependencies. I ended up using regular expressions and string processing, but that doesn’t scale to increasingly complex webpages.
I hope Losant Workflows can get an HTML parser node to simplify such applications.
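
To make the request concrete, here is a minimal sketch of the kind of extraction such a node performs: parse a string as HTML, match elements, and return their text. It is written in Python with only the standard library, not with Node-RED’s or Losant’s actual implementation, and it assumes reasonably well-formed markup. The names `TagTextExtractor` and `select_text` and the sample bus-schedule markup are hypothetical, and the "selector" here supports only `tag` or `tag.class`, far less than real CSS3 selectors.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagTextExtractor(HTMLParser):
    """Collect the text of every element matching "tag" or "tag.class".

    Illustration only: this is not a CSS3 selector engine, and a match
    nested inside another match is folded into the outer match's text.
    """

    def __init__(self, selector):
        super().__init__()
        tag, _, cls = selector.partition(".")
        self.tag = tag.lower()
        self.cls = cls or None
        self.depth = 0       # nesting depth inside the current match
        self.results = []    # text of each completed match
        self._buffer = []    # text pieces of the match in progress

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            # Already inside a match: just track nesting.
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if tag == self.tag and (self.cls is None or self.cls in classes):
            self.depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self.depth > 0:
            self._buffer.append(data)


def select_text(html, selector):
    """Return the text of all elements in `html` matching `selector`."""
    parser = TagTextExtractor(selector)
    parser.feed(html)
    parser.close()
    return parser.results


# Hypothetical markup standing in for a bus-schedule page.
sample = ('<ul><li class="bus">Route 4: <b>5 min</b></li>'
          '<li class="bus">Route 9: 12 min</li>'
          '<li class="ad">Buy now</li></ul>')
print(select_text(sample, "li.bus"))  # ['Route 4: 5 min', 'Route 9: 12 min']
```

Compared with regular expressions, the parser tracks nesting, so the inner `<b>` element does not break the match, and the `li.ad` item is skipped by class rather than by a hand-tuned pattern.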

Brandon_Cannaday · January 16, 2017, 5:09pm

This is a cool idea - I created a ticket for the feature request. Just for our reference, the source for Node-RED’s HTML node is here:

github.com

node-red/node-red/blob/master/nodes/core/parsers/70-HTML.html


<script type="text/x-red" data-template-name="html">
    <div class="form-row">
        <label for="node-input-tag"><i class="fa fa-filter"></i> <span data-i18n="html.label.select"></span></label>
        <input type="text" id="node-input-tag" placeholder="h1">
    </div>
    <div class="form-row">
        <label for="node-input-ret"><i class="fa fa-sign-out"></i> <span data-i18n="html.label.output"></span></label>
        <select id="node-input-ret" style="width:70%">
            <option value="html" data-i18n="html.output.html"></option>
            <option value="text" data-i18n="html.output.text"></option>
            <option value="attr" data-i18n="html.output.attr"></option>
            <!-- <option value="val">return the value from a form element</option> -->
        </select>
    </div>
    <div class="form-row">
        <label for="node-input-as">&nbsp;</label>
        <select id="node-input-as" style="width:70%">
            <option value="single" data-i18n="html.format.single"></option>
            <option value="multi" data-i18n="html.format.multi"></option>

This file has been truncated.

Michael_Kuehl · March 14, 2017, 2:19pm

We added this node in our release about 2 weeks ago (https://www.losant.com/blog/platform-update-20170228), but forgot to reply to this thread. You can check out the documentation for it here.

Nathan_Harvey · February 20, 2019, 11:00pm

Does this actually work for importing jQuery? It looks like it just finds text inside of an HTML element…

yoursunny · February 20, 2019, 11:19pm

No, this doesn’t import jQuery, and it does not execute any JavaScript on the webpage. It only allows you to extract information from elements that are already present in the HTML markup.
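
As an illustration of that distinction, here is a small sketch in Python using only the standard library; it is not the Losant node itself. The class `ElementTextCollector`, the function `static_text`, and the sample page are hypothetical. A static parser sees the `<script>` body as plain text and never runs it, so an element that the script would fill in at load time stays empty.

```python
from html.parser import HTMLParser

# Tags without a closing tag; skipped so later text is not attributed to them.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ElementTextCollector(HTMLParser):
    """Record the text found directly inside each tag, keyed by tag name."""

    def __init__(self):
        super().__init__()
        self.stack = []          # currently open tags
        self.text_by_tag = {}    # tag name -> concatenated direct text

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        self.stack.append(tag)
        self.text_by_tag.setdefault(tag, "")

    def handle_endtag(self, tag):
        if tag in self.stack:
            # Pop up to and including the matching open tag.
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        if self.stack:
            self.text_by_tag[self.stack[-1]] += data


def static_text(html):
    """Parse `html` without executing anything and return text per tag."""
    parser = ElementTextCollector()
    parser.feed(html)
    parser.close()
    return parser.text_by_tag


# Hypothetical page: the <span> is only filled in when a browser runs the script.
PAGE = """
<div id="static">Route 4</div>
<span id="eta"></span>
<script>document.getElementById("eta").textContent = "5 min";</script>
"""

found = static_text(PAGE)
print(repr(found["div"]))   # 'Route 4'
print(repr(found["span"]))  # '' because the script was never executed
```

If the data you need is only produced by JavaScript running in the browser, a parser of this kind will not see it; you would need the underlying feed the script calls, assuming one exists.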
