I would like to improve the fault tolerance of my edge gateways by running an additional software watchdog inside the Edge Agent. To do so, I plan to build a latch by setting a variable as follows:
(1) for Device: Connect;
(0) for Device: Disconnect;
(0) for Device: Startup.
If the agent doesn’t connect to Losant within 10 minutes of a Disconnect or Startup event, the edge host OS will get a command to reboot the gateway.
My question is: Is there a way to check the Losant connection status continuously from the container, for example every minute with a timer? I want to avoid a situation where the value was not set for some reason during the connection and the gateway reboots without a reason.
Suggestions to improve the logic are also welcome.
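For readers following along, the latch described above can be sketched in plain Python. This is only an illustration of the logic, not Losant workflow code: the class name `ConnectionLatch`, the event strings, and the use of seconds for timestamps are all assumptions made for the example.

```python
REBOOT_TIMEOUT_S = 10 * 60  # 10 minutes, as in the description above


class ConnectionLatch:
    """Hypothetical latch: 1 after Connect, 0 after Disconnect or Startup."""

    def __init__(self):
        self.value = 0
        self.cleared_at = None  # time the latch first dropped to 0, if any

    def on_event(self, event, now):
        """Update the latch for an event name; `now` is a timestamp in seconds."""
        if event == "connect":
            self.value = 1
            self.cleared_at = None
        elif event in ("disconnect", "startup"):
            self.value = 0
            # Keep the earliest "offline since" time so that repeated
            # disconnect/startup events do not keep restarting the countdown.
            if self.cleared_at is None:
                self.cleared_at = now
        else:
            raise ValueError(f"unknown event: {event}")

    def should_reboot(self, now):
        """True only if the latch has been 0 for at least the timeout."""
        if self.value == 1 or self.cleared_at is None:
            return False
        return now - self.cleared_at >= REBOOT_TIMEOUT_S
```

Note that a latch that has never seen any event does not request a reboot, which is one way to avoid rebooting just because a value was never set.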
If you would like to check the connection status from the container, you would likely want to add it as an environment variable on container startup.
I believe that if you were to create a workflow and use a Timer Node in conjunction with the Device: Disconnect Node, then make use of Workflow Storage and a connected environment variable, that combination would be quite useful for building this fault tolerance workflow!
Please let me know if you have any further questions.
Hi Julia, thank you for your help. I rebuilt my logic using an isConnectedToLosant boolean and a timer that checks the status every minute. The reboot decision is based on the time difference between the last connected/disconnected times. The only two questions I still have are: 1) is there a way to catch the moment when a newer workflow version is applied to the agent, and 2) can I reboot the agent with one of the nodes?
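A rough sketch of that polling approach, in plain Python rather than workflow nodes, might look like the following. The function name `update_watchdog`, the `last_ok_at` key, and the dict standing in for stored state are hypothetical; the 600-second default mirrors the 10-minute limit mentioned earlier in the thread, and "time since the agent was last seen connected" is one possible reading of the time difference described above.

```python
def update_watchdog(state, is_connected, now, timeout_s=600):
    """Run once per timer tick; return True when a reboot looks warranted.

    state        -- dict standing in for persisted storage (assumption)
    is_connected -- the isConnectedToLosant boolean sampled on this tick
    now          -- current time in seconds
    """
    if is_connected or "last_ok_at" not in state:
        # Either we are connected, or this is the very first tick and no
        # reference time exists yet. Start (or restart) the countdown here
        # instead of rebooting on missing data.
        state["last_ok_at"] = now
        return False
    return now - state["last_ok_at"] >= timeout_s
```

Because the first tick only seeds the reference time, a value that was never stored leads to a fresh countdown rather than an immediate reboot.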
There is not currently a workflow node that is able to reboot the agent. When your use case requires a reboot, would the device be connected or disconnected?
I am interpreting “catch a moment when a newer workflow version is applied” to mean you are hoping to trigger off of this event. Currently, there is not a way to “store” the exact time a workflow was pushed to your device. This value is currently only accessible as a timestamp on your device page, though you could do some handling on the device side. If I have interpreted your question correctly, could you explain your use case for this value?
My request came from wanting better control over the processes happening on the agent. For example, I am not fully aware whether stored variables reset, and to which value, when 1) a newer workflow version is published, 2) the agent reboots, or 3) the agent reboots after hanging.
In terms of Workflow Storage, those values are persisted on disk. So, given that you have the store configured on your Edge Agent, the values shouldn’t reset upon new workflows or agent reboots.
Also, as an added note: Losant does expose flowVersion on the payload; it is the name of the current version of the Losant Workflow that is running. You can use that variable in the On Change Node to easily perform actions when a new version occurs.
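The comparison behind that idea can be sketched in plain Python. This is not how the On Change Node is implemented; it only illustrates the logic of comparing the current flowVersion against the last one seen. The function name, the `lastFlowVersion` key, and the dicts standing in for the payload and Workflow Storage are assumptions for the example.

```python
def flow_version_changed(storage, payload):
    """Return True when payload['flowVersion'] differs from the stored one.

    storage -- dict standing in for persisted Workflow Storage (assumption)
    payload -- dict standing in for the workflow payload
    """
    current = payload.get("flowVersion")
    previous = storage.get("lastFlowVersion")
    storage["lastFlowVersion"] = current
    # The first run has nothing to compare against, so it is not
    # reported as a change.
    return previous is not None and current != previous
```

Since the stored value survives agent restarts when storage is persisted, a change would be reported only when a different version name appears, not on every reboot.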
Also, touching on your Reboot Node suggestion. I’m wondering about the need to reboot the agent from within a workflow. I am curious about one comment you made. Could you elaborate what you mean here by “hang”:
Ah, got it! I wanted to make sure that there wasn’t a bug there. However, yes, there are a number of items, like saving workflow storage to disk, that happen on the shutdown of the agent.