Dynamic number of inputs/outputs for a notebook

Hi Losanters,

I’m currently using a notebook to convert raw data files into KPIs. Each device sends one raw data file per day. Right now, I have a workflow that triggers the notebook every time a new file is uploaded.

The issue is that each notebook run takes about 4 minutes to execute even though the actual KPI calculation takes less than a second (see the "Notebook execution time" topic in the Help category of these forums). This overhead means I'll quickly hit the notebook runtime limit.

For example:
1 device × 30 days × 4 minutes = 120 minutes per device per month,
so with the 930-minute monthly limit, I can only support about 7 devices (7 × 120 = 840 minutes; an 8th device would push past the limit).

I was thinking of changing the approach: since the notebook can handle multiple input files and generate multiple KPI outputs, I could run it once per day and process all the new raw data files at once.

However, I’m not sure how to configure the notebook to accept a dynamic number of input files, and I’m wondering if that’s even the right approach.

Question:
What would be the best practice for this situation?

Thanks in advance!

First, a little background - yes, there is some overhead in executing a notebook in Losant’s environment that you do not experience when executing the notebook in your local environment. That is because we are spinning up a virtual machine, isolated from other environments and from the public internet, to perform the execution within our platform infrastructure. So even the simplest notebook usually takes at least four minutes to complete (and a notebook that locally takes, say, 15 minutes takes about 19 minutes in our environment).

As to your question … I’m going to file a feature request to allow for an application files directory to be an input to a notebook so that you can maintain separate files and pull them all in for an execution - something you cannot do now.

In the meantime, what is the nature of the files you are creating, and how are you creating them? If they are CSVs with similar column structures, could you do something like the following (there's a code sketch of the merge logic after the list) …

  1. Continue generating separate files as you are now; the steps below merge each new file into a single combined file.
  2. Application File Trigger that fires whenever a new file is added to a given directory.
  3. File: Get Node that fetches the combined file’s contents.
  4. CSV: Decode Node that parses the combined file contents into an array of objects.
  5. File: Get Node that fetches the contents of the newly added separate file.
  6. CSV: Decode Node that parses the new file contents.
  7. Array Node Concat operation that combines the two into a single array.
  8. CSV: Encode Node that turns that combined array back into a CSV string.
  9. File: Create Node that overwrites the previous combined file with the same contents plus the new file.
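
For reference, the CSV decode / concat / encode work in the middle of that list (everything between the file fetches and the final write) boils down to logic like this - a minimal Python sketch, assuming both files share the same header row (the function and variable names are just illustrative):

```python
import csv
import io

def append_to_combined(combined_csv: str, new_csv: str) -> str:
    """Merge a newly added CSV into the running combined file.

    Mirrors the CSV: Decode, Array concat, and CSV: Encode steps
    above: decode both files into lists of row dicts, concatenate
    them, and encode the result back into a single CSV string.
    Assumes both files share the same column structure.
    """
    combined_rows = list(csv.DictReader(io.StringIO(combined_csv)))
    new_rows = list(csv.DictReader(io.StringIO(new_csv)))

    # The Array Node's concat operation (step 7).
    merged = combined_rows + new_rows

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(new_rows[0].keys()))
    writer.writeheader()
    writer.writerows(merged)
    return out.getvalue()  # written back over the combined file (step 9)
```

The workflow nodes perform that same transform, just one node per step; the combined file only ever grows by appending the newest rows, so nothing gets reprocessed.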

Another option - and this is easier but potentially more dangerous - is to do the following immediately before firing off the notebook with the Notebook: Execute Node (see the code sketch below) …

  1. Use the Losant API Node and the Files: Get endpoint to fetch all the files under a given directory.
  2. Use the API Node again and the Notebook: Patch endpoint to update the inputs to the notebook, creating an external URL input for each file retrieved by the first call.
  3. Then invoke the Notebook: Execute Node.

This would only work when executing from the workflow environment, though … if you run the notebook from the UI or the API, you'd first have to do the same work of creating one input per file yourself.
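
For completeness, that same three-call sequence performed outside a workflow (from a script, for instance) against the Losant REST API would look roughly like this. This is only a sketch: the application ID, notebook ID, and token are placeholders, and the exact shapes of the Files: Get response and of the notebook inputs array are assumptions to verify against the API documentation.

```python
import requests

API_BASE = "https://api.losant.com"
APP_ID = "my-app-id"            # placeholder
NOTEBOOK_ID = "my-notebook-id"  # placeholder
HEADERS = {"Authorization": "Bearer my-api-token"}  # placeholder token

# 1. Files: Get - list every file under the raw-data directory.
#    (The "directory" filter and "items" key are assumptions; check the docs.)
resp = requests.get(
    f"{API_BASE}/applications/{APP_ID}/files",
    headers=HEADERS,
    params={"directory": "/raw-data"},
)
resp.raise_for_status()
files = resp.json()["items"]

# 2. Notebook: Patch - rebuild the notebook's inputs with one external
#    URL input per file. This input schema is an assumption as well.
inputs = [
    {"inputType": "externalUrl", "sourceUrl": f["url"], "fileName": f["name"]}
    for f in files
]
resp = requests.patch(
    f"{API_BASE}/applications/{APP_ID}/notebooks/{NOTEBOOK_ID}",
    headers=HEADERS,
    json={"inputs": inputs},
)
resp.raise_for_status()

# 3. Notebook: Execute - kick off the run with the freshly patched inputs.
resp = requests.post(
    f"{API_BASE}/applications/{APP_ID}/notebooks/{NOTEBOOK_ID}/execute",
    headers=HEADERS,
)
resp.raise_for_status()
```

The "more dangerous" part is largely step 2: the patch rewrites the notebook's saved input configuration every time, so anything else that executes the notebook in the meantime could pick up an unexpected input list.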

Thanks Dylan for the quick reply!

Unfortunately, the raw data files aren't CSVs; they need to be decoded in the notebook before the KPI calculations can run. So the first option won't work in this case.

I’ll try out the second option you mentioned tomorrow and will share the results here once I’ve tested it.