Exporting historical data to GCP

What are the best methods for copying all historical time series data to Google Cloud Platform? I'm aware of the archive feature, but my concern is that it only takes data that is older than 31 days. I'd like to figure out how to take all time series data and transfer it to Google BigQuery or Google Cloud (at the very least) in order to use their AI/ML features in a relatively simple way. Any help would be much appreciated, thanks!

Hey @Krishan_Patel,

Welcome to the Losant Forums, we’re excited you’re here!

We are working on a How-To guide for this very thing right now, so I will be sure to follow up with you once it is released.

The gist of the guide, though, is that you use the GCP: BigQuery Node in a workflow to insert data into your BigQuery table.

How you get that data out depends on your use case.

If you are looking to export data that is already stored, then you will need to run a series of Gauge Query Nodes or Time Series Nodes and then loop through that data to send it to BigQuery.
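To make that concrete, here is a rough sketch of that query-and-loop pattern using the Time Series Query endpoint and the BigQuery client library. The application ID, API token, device ID, attribute names, and BigQuery table below are all placeholders, and the response shape is an assumption based on the docs, so verify it against your own API responses:

import requests
from google.cloud import bigquery

APP_ID = "my-application-id"         # placeholder Losant application ID
API_TOKEN = "my-api-token"           # placeholder Losant API token
TABLE_ID = "my-project.iot.history"  # placeholder BigQuery table

# Query the last 24 hours of data for a few attributes.
resp = requests.post(
    f"https://api.losant.com/applications/{APP_ID}/data/time-series-query",
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    json={
        "end": 0,              # assumed: 0 means "now"
        "duration": 86400000,  # 24 hours, in milliseconds
        "resolution": 60000,   # 1-minute buckets, in milliseconds
        "attributes": ["voltage", "amps", "temperature"],
        "deviceIds": ["my-device-id"],  # placeholder
    },
)
resp.raise_for_status()

# Flatten the response into one row per point. Each point is assumed to
# look like {"time": ..., "voltage": ..., ...}; check your own responses.
rows = []
for device_id, device in resp.json().get("devices", {}).items():
    for point in device.get("points", []):
        rows.append({"deviceId": device_id, **point})

# Stream the rows into BigQuery (the destination table must already exist
# with matching columns).
errors = bigquery.Client().insert_rows_json(TABLE_ID, rows)
if errors:
    print("BigQuery insert errors:", errors)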

If you would simply like to start sending new data to BigQuery as it arrives, then the work depends on how you are sending data to Losant. If you are using MQTT and sending device state directly with one of our Libraries or through the device state topic, then I would suggest a Device: State Trigger that fires a workflow to send that data to BigQuery (within the throttle limits of BigQuery, as well). If you are using a webhook to send data to Losant, then I would suggest adding the BigQuery Node to the workflow in which you are ingesting that data.
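For reference, publishing device state over MQTT looks roughly like this (a sketch using the paho-mqtt 1.x client; the device ID and access key/secret are placeholders, and you should double-check the topic and payload format against the Losant MQTT docs):

import json
import paho.mqtt.client as mqtt

DEVICE_ID = "my-device-id"          # placeholder
ACCESS_KEY = "my-access-key"        # placeholder Losant access key
ACCESS_SECRET = "my-access-secret"  # placeholder Losant access secret

client = mqtt.Client(client_id=DEVICE_ID)
client.username_pw_set(ACCESS_KEY, ACCESS_SECRET)
client.connect("broker.losant.com", 1883)

# Publishing to the device state topic reports attribute values, which is
# what fires the Device: State Trigger.
client.publish(
    f"losant/{DEVICE_ID}/state",
    json.dumps({"data": {"voltage": 3.3, "amps": 0.5, "temperature": 72.1}}),
)
client.disconnect()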

My above suggestion does assume, though, that you are receiving data at a rate lower than the rate limits for BigQuery. If not, then my suggestion would be to set up a workflow with a Timer Trigger and then use a series of Gauge Query Nodes or Time Series Nodes to send that data over in batches.

Please let me know if this answers your question or if you need any other guidance.

Thank you,
Heath

Thanks for your reply, Heath!

To follow up, I did some testing with the Time Series Node, but I was only able to query one attribute, while each of our devices contains several attributes. I also explored querying the time series data through the API (https://docs.losant.com/rest-api/data/#time-series-query) but am only able to query the last 180 days per device. Any tips on how to get around this? If not, are there any other recommended ways to do a full data dump into GCP?

Hey @Krishan_Patel,

So with the Time Series API endpoint that you linked, you are able to shift the query time range back to an older 180-day period. A quick note, though: you will only be able to query as far back as your data retention limit. So if your organization's data retention is 365 days, you will only be able to query data from up to 365 days ago. You can control the window with the end, duration, and resolution fields in the API request body.
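As a rough sketch, walking the window back looks something like this. It assumes end is a millisecond timestamp and a 365-day retention limit, and it reuses the request pattern from my first reply:

import time

DAY_MS = 24 * 60 * 60 * 1000
WINDOW_MS = 180 * DAY_MS     # the 180-day per-query maximum
RETENTION_MS = 365 * DAY_MS  # assumes a 365-day retention limit

now_ms = int(time.time() * 1000)
end = now_ms
while end > now_ms - RETENTION_MS:
    body = {
        "end": end,            # assumed: milliseconds since epoch
        "duration": WINDOW_MS,
        "resolution": 60000,   # 1-minute buckets
        "attributes": ["voltage", "amps", "temperature"],
    }
    # POST `body` to /applications/{APP_ID}/data/time-series-query as in
    # my earlier example, then insert the returned points into BigQuery.
    end -= WINDOW_MS  # step the window back another 180 days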

Using the API, were you able to get more than one attribute? Since the attributes field in the request body is an array, you can query multiple attributes:

"attributes": [
    "voltage",
    "amps", 
    "temperature"
  ],

As for your other question:

are there any other recommended ways to do a full data dump into GCP?

Using the Losant API would be my recommended approach. If you would like to send data to BigQuery, I would also recommend that you use the BigQuery node that I mentioned in my first reply.

Let me know if this works for you.

Thank you,
Heath