Hi
I am seeing anomalies in the line graph on the dashboard that aren’t reflected in the data explorer and that appear and disappear with refreshes.
The dashboard in question shows a 30-minute window with 10-second averages. The data is fed in at roughly 10-second intervals. Data points are presented with wildly different values, e.g. 2000 vs 1450 (actual) RPM.
I have included a couple of screenshots showing the discrepancy and the same period in data explorer.
Seeing the odd data can be quite concerning, as sometimes the numbers shown are either (1) not possible or (2) indicative of a major problem with our equipment.
My guess is it has something to do with the MEAN and the sparse data.
Tim,
Could you send in a support email with the device ids and the dashboard that is showing the anomalous data? I’d like to dig into the raw values and try and figure out what might be going on. Thanks!
Hi Michael.
Device IDs are:
578881b26e6a550100c80342
5788841e6e6a550100c80343
57888438fc8a5001007a98a5
5788844afc8a5001007a98a6
https://app.losant.com/#/dashboards/581c03b1b9475401004e58b8
The period when we were seeing it was around 10:40am GMT+8 yesterday, 15/11/2016.
It would change all the time: for a few refreshes the data looked good, then we would see the anomalies as shown in the screenshot.
Then they would disappear, then re-appear. I have also noticed odd things at times on the rightmost data point, which corrects itself after the next sample.
When we are starting equipment we tend to run dashboards with a low refresh interval (10 or 30 sec) so we can see issues such as oscillating pressures. In this situation we have 4 pumps, each feeding the next, dewatering a 300-400 m deep open-pit gold mine.
None of that equipment is running at the moment.
Cheers
T
Tim,
I haven’t found a root cause yet (or actually been able to reproduce the issue), but I do have a question after looking at the raw data. When your pumps send up their data, are they attaching timestamps to it, or is Losant just timestamping it with the time received? I noticed that most data points are approximately 10 seconds apart, but there are a couple of places where multiple data points arrive within just a couple of seconds (and it happens right around 2016-11-15T02:42:40 GMT, which is right at the anomalous data, assuming I did my timezone math correctly).
Also, as far as the “rightmost” datapoint on a graph depicting current data, it is pretty normal for it to fluctuate weirdly - since it is reacting to data still flowing into the system.
I’m going to keep digging, looking specifically to reproduce the fact that the anomalous data would appear or disappear as the data refreshed. And you said the data always looked correct in the data explorer, even through multiple data refreshes? The data explorer and dashboard graphs use much of the same data query path, so that difference is interesting.
Hi Michael
Thanks for looking at this.
Yes, we are adding a timestamp. It is taken from the payload timestamp, which is the time of data collection, not the time sent.
Current process
Essentially we have a python process with an async loop collecting data via Modbus from the pumps.
This is not always reliable for various reasons. (For instance, the controllers can’t support more than one Modbus connection at a time, so we close and open connections a lot.)
That set of data is timestamped and logged locally.
It is pushed asynchronously via pubnub to another, more mobile-friendly dashboard.
This same process writes it to REDIS. (The pubnub bit will move out of the core process and into a separate process listening to REDIS in future.)
A separate process is subscribed to REDIS, grabs each payload as it is received (a queue), and then sends it via MQTT to Losant. The timestamp of when the data was collected is taken from the payload and added as the Losant timestamp. (This describes an early version of the software; we are now using a local mosquitto broker to queue.) The idea is that the timestamp is local to the engine and is not the time the data is received at Losant.
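A minimal sketch of the forwarding step described above, to show where the collection time ends up. The `collected_at` field name and the shape of `reading` are hypothetical, and the topic and payload layout (`losant/<device id>/state` with `time` and `data` keys, `time` in milliseconds since the epoch) reflect my understanding of Losant's device state format, so please check them against the Losant MQTT documentation. Publishing is left out on purpose; any MQTT client could send the returned topic and payload.

```python
import json
from datetime import datetime, timezone


def build_state_message(device_id, reading):
    """Build a (topic, payload) pair that carries the collection time.

    Assumes `reading` looks like {"collected_at": <epoch seconds>, "rpm": ...}.
    Both the structure and the field names are hypothetical.
    """
    collected_at = reading["collected_at"]
    # Everything except the timestamp is treated as device state.
    data = {k: v for k, v in reading.items() if k != "collected_at"}
    topic = f"losant/{device_id}/state"
    payload = json.dumps({
        "time": int(collected_at * 1000),  # collection time, not send time
        "data": data,
    })
    return topic, payload


sample = {
    "collected_at": datetime(2016, 11, 15, 2, 42, 40, tzinfo=timezone.utc).timestamp(),
    "rpm": 1450,
}
topic, payload = build_state_message("578881b26e6a550100c80342", sample)
```

Because the timestamp travels inside the payload, a message that is queued or delayed still lands at its collection time rather than its arrival time.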
If there is interference on the network, a Modbus query can be delayed and the sampling can take longer, so it is quite possible that some data comes in with a shorter gap.
I figured it might be hard for you to replicate, as it seemed to be very dependent on active data coming in, and quite possibly on the refresh.
When it was incorrect, however, it always showed up in the same time period, but with different anomalous values.
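To make the effect of a delayed Modbus read concrete, here is a small sketch that groups samples into fixed 10-second buckets and averages each one. The sample values and timestamps are made up for illustration, and `bucket_means` is a hypothetical helper, not the aggregation Losant actually runs.

```python
from collections import defaultdict


def bucket_means(points, bucket_seconds=10):
    """Group (epoch_seconds, value) points into fixed buckets and average each."""
    buckets = defaultdict(list)
    for ts, value in points:
        start = int(ts // bucket_seconds) * bucket_seconds
        buckets[start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical samples: nominally one every 10 s, but one read is delayed,
# so the 10-20 s bucket is empty and the 20-30 s bucket holds two points.
points = [(1, 1450), (21, 1452), (29, 1448), (31, 1450)]
means = bucket_means(points)
```

With uneven spacing, some buckets end up empty and others hold two points. A plain mean can never fall outside the range of the values it averages, though, so uneven spacing on its own would not turn 1450 RPM readings into 2000.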
Hope this helps
Cheers
Tim
Tim,
I still haven’t been able to identify the issue or been able to reproduce the problem. I have identified one potential race condition with our MEAN aggregation, which in some cases could return anomalous data momentarily if data points are still flowing in for the particular time bucket being aggregated - but I don’t think that would be causing the issue you have experienced (although we will be fixing that race condition, and I’ll let you know when that fix is out, just in case). Please continue to monitor and let us know if you experience the issue again so we can catch it as early as possible.
Thanks,
Michael
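For readers wondering how a mean can be wrong only momentarily while data is still arriving, here is one generic way it can happen. This is purely illustrative and is not a description of Losant's aggregation: it assumes a hypothetical bucket that keeps a running sum and count, and a reader that samples the two at slightly different moments.

```python
class BucketStats:
    """Running sum and count for one time bucket (hypothetical structure)."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1


def torn_mean(stats, late_value):
    """Simulate a reader whose two reads are split by an incoming write."""
    count_seen = stats.count   # reader samples the count first
    stats.add(late_value)      # a late point lands in between
    total_seen = stats.total   # reader samples the sum afterwards
    return total_seen / count_seen


stats = BucketStats()
stats.add(1450)
stats.add(1450)
momentary = torn_mean(stats, 1450)    # 4350 / 2 = 2175.0
settled = stats.total / stats.count   # 4350 / 3 = 1450.0
```

In this sketch, the momentary value lies outside the range of the real readings and vanishes on the next consistent read. That would at least match a symptom that comes and goes with refreshes.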