I’m currently using a notebook to convert raw data files into KPIs. Each device sends one raw data file per day. Right now, I have a workflow that triggers the notebook every time a new file is uploaded.
The issue is that each notebook run takes about 4 minutes to execute even though the actual KPI calculation takes less than a second (see the "Notebook execution time" thread in the Help category of the Losant Forums). This overhead means I'll quickly hit the monthly notebook runtime limit.
For example:
1 device × 30 days × 4 minutes = 120 minutes per device per month,
so with the 930-minute monthly limit (930 ÷ 120 ≈ 7.75), I can only support about 7 devices before running out of minutes.
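For reference, the same back-of-the-envelope calculation in code form (the 4-minute overhead and the 930-minute limit are the figures from above; everything else is just arithmetic):

```python
MINUTES_PER_RUN = 4     # observed per-execution overhead
DAYS_PER_MONTH = 30     # one raw file per device per day
MONTHLY_LIMIT = 930     # notebook minutes included per month

minutes_per_device = MINUTES_PER_RUN * DAYS_PER_MONTH  # 120 minutes/device/month
max_devices = MONTHLY_LIMIT // minutes_per_device       # 7 devices fit fully
print(minutes_per_device, max_devices)
```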
I was thinking of changing the approach: since the notebook can handle multiple input files and generate multiple KPI outputs, I could run it once per day and process all the new raw data files at once.
However, I’m not sure how to configure the notebook to accept a dynamic number of input files, and I’m wondering if that’s even the right approach.
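To illustrate what I have in mind, here is a rough sketch of the notebook looping over however many files show up in a given run. This assumes the inputs land in the directory the notebook sees through the INPUT_DIR environment variable and that outputs are collected from OUTPUT_DIR; decode_raw_file and compute_kpis are placeholders standing in for my real decoding and KPI code:

```python
import glob
import os

import pandas as pd


def decode_raw_file(path):
    """Placeholder for my real decoder; here it just reads the raw bytes."""
    with open(path, "rb") as f:
        return f.read()


def compute_kpis(raw_bytes):
    """Placeholder KPI calculation; the real one finishes in under a second."""
    return {"bytes": len(raw_bytes)}


input_dir = os.environ["INPUT_DIR"]    # directory the notebook inputs are mounted into
output_dir = os.environ["OUTPUT_DIR"]  # directory collected as notebook outputs

# Process every file that was provided, however many there are.
rows = [compute_kpis(decode_raw_file(path))
        for path in sorted(glob.glob(os.path.join(input_dir, "*")))]

pd.DataFrame(rows).to_csv(os.path.join(output_dir, "kpis.csv"), index=False)
```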
Question:
What would be the best practice for this situation?
First, a little background - yes, there is some overhead in executing a notebook in Losant's environment that you do not experience when running the notebook locally. That is because we spin up a virtual machine, isolated from other environments and from the public internet, to perform the execution within our platform infrastructure. So even the simplest notebook usually takes at least four minutes to complete; a notebook that takes, say, 15 minutes locally will take roughly 19 minutes in our environment.
As to your question … I’m going to file a feature request to allow for an application files directory to be an input to a notebook so that you can maintain separate files and pull them all in for an execution - something you cannot do now.
In the meantime, what is the nature of the files you are creating and how are you creating them? If they are CSVs with similar column structures, could you do something like the following …
Continue generating separate files as you are now; the following steps will combine them into a single file.
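However the combining ends up being done (in the workflow or inside the notebook itself), the operation is just a concatenation of same-shaped CSVs. A minimal pandas sketch, with purely illustrative paths:

```python
import glob

import pandas as pd

# Concatenate all of the separate per-device, per-day CSVs into one file
# that can then be handed to the notebook as a single input.
frames = [pd.read_csv(path) for path in sorted(glob.glob("raw/*.csv"))]
combined = pd.concat(frames, ignore_index=True)
combined.to_csv("combined.csv", index=False)
```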
Another option - and this one is easier but potentially more dangerous - is to do the following immediately before invoking the Notebook: Execute Node that fires off the notebook …
Use the API Node again, this time with the Notebook: Patch endpoint, to update the notebook's inputs, creating an external URL input for each file retrieved by the first call (a rough sketch of that patch follows below).
This would only work when executing from the workflow environment, though … using the UI or the API, you’d have to do the same work of creating one input per file first.
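To make that second option more concrete, here is a very rough sketch of the patch, written as a plain REST call rather than as API Node configuration. The payload field names are illustrative only, so verify them against the Notebook: Patch documentation before relying on this; the IDs, token, and file URLs are placeholders:

```python
import requests

APP_ID = "MY_APPLICATION_ID"    # placeholder
NOTEBOOK_ID = "MY_NOTEBOOK_ID"  # placeholder
API_TOKEN = "MY_API_TOKEN"      # placeholder

# One external URL input per raw file found by the earlier API call.
file_urls = [
    "https://files.example.com/device-a/2024-05-01.dat",  # placeholders
    "https://files.example.com/device-b/2024-05-01.dat",
]

inputs = [
    {
        "inputType": "externalUrl",   # field names are illustrative; check the
        "sourceUrl": url,             # Notebook: Patch docs for the exact schema
        "fileName": f"raw-{i}.dat",
    }
    for i, url in enumerate(file_urls)
]

resp = requests.patch(
    f"https://api.losant.com/applications/{APP_ID}/notebooks/{NOTEBOOK_ID}",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={"inputs": inputs},
)
resp.raise_for_status()
```

Note that a patch like this updates the notebook's stored inputs, so any later execution (manual or otherwise) will reuse whatever URLs were set last until the notebook is patched again.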
Unfortunately, the raw data files aren't CSVs; they need to be decoded in the notebook before the KPI calculations can run, so the first option won't work in this case.
I’ll try out the second option you mentioned tomorrow and will share the results here once I’ve tested it.