To make my device more robust, I am using a timer interrupt to drive some critical tasks. Since I started doing this, the connection to Ubidots is reported as lost fairly often. I don’t think it is actually lost. I suspect that PubSubClient does not cope well with being interrupted - perhaps its timing logic is at fault?
The symptoms are that Reconnect() finds the connection down and attempts to connect again. It often takes a couple of tries before it reconnects. I know you drop the connection from time to time but this is much more frequent, and what I am seeing stops if I disable the timer interrupt.
The timer interrupt runs 5 times a second and each run takes at most 50 ms, so it uses up to 5 × 50 ms = 250 ms out of every 1000 ms, or 25% of the processor. If I increase this to 10 times a second (up to 50% of the processor), the MQTT connection is “lost” quite often. To me this suggests a timeout that is set too low, rather than an actual lost connection.
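To sanity-check those numbers, here is a minimal sketch of the load calculation. The function name `isrLoadPercent` is just something I made up for illustration; it is not part of PubSubClient or any Arduino library.

```cpp
#include <cstdint>

// Worst-case share of the processor taken by a periodic interrupt, in percent.
// ratePerSecond: how many times the interrupt fires each second.
// worstCaseMs:   longest time a single run of the interrupt takes, in ms.
constexpr uint32_t isrLoadPercent(uint32_t ratePerSecond, uint32_t worstCaseMs) {
    // Busy milliseconds per second, scaled to a percentage of 1000 ms.
    return (ratePerSecond * worstCaseMs * 100u) / 1000u;
}
```

With my figures, `isrLoadPercent(5, 50)` gives 25 and `isrLoadPercent(10, 50)` gives 50. The load figure is only part of the picture: each run can also stall the main loop, and whatever the MQTT client is doing there, for up to 50 ms at a time.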
Typical reconnect times are around 2.5 seconds, but can go as high as 4.5 seconds. If it exceeds 8 seconds the watchdog timer fires and reboots the processor - which happens a few times a day.
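One thing I am considering is to stop retrying in a tight loop and instead make at most one connect attempt per pass through the main loop, with a growing delay between attempts, so the control code and the watchdog feed keep running in between. Below is a rough sketch of that scheduling logic only. The class `ReconnectPolicy` and its method names are hypothetical, the delay values are placeholders, and it deliberately knows nothing about PubSubClient.

```cpp
#include <cstdint>

// Decides WHEN a reconnect attempt is allowed; it does not do the connecting.
// Times are in milliseconds, e.g. taken from millis() on Arduino.
class ReconnectPolicy {
public:
    ReconnectPolicy(uint32_t initialDelayMs, uint32_t maxDelayMs)
        : initialDelayMs_(initialDelayMs),
          maxDelayMs_(maxDelayMs),
          delayMs_(initialDelayMs),
          lastAttemptMs_(0),
          hasFailed_(false) {}

    // True if no attempt has failed yet, or the current delay has elapsed.
    // Unsigned subtraction keeps this correct when the ms counter wraps.
    bool shouldAttempt(uint32_t nowMs) const {
        if (!hasFailed_) return true;
        return (nowMs - lastAttemptMs_) >= delayMs_;
    }

    // Call after a failed attempt. The first failure waits initialDelayMs;
    // each further failure doubles the wait, capped at maxDelayMs.
    void onFailure(uint32_t nowMs) {
        if (hasFailed_) {
            delayMs_ = (delayMs_ > maxDelayMs_ / 2) ? maxDelayMs_ : delayMs_ * 2;
        }
        hasFailed_ = true;
        lastAttemptMs_ = nowMs;
    }

    // Call after a successful connect to reset the backoff.
    void onSuccess() {
        hasFailed_ = false;
        delayMs_ = initialDelayMs_;
    }

    uint32_t currentDelayMs() const { return delayMs_; }

private:
    uint32_t initialDelayMs_;
    uint32_t maxDelayMs_;
    uint32_t delayMs_;
    uint32_t lastAttemptMs_;
    bool hasFailed_;
};
```

In the sketch's `loop()` the idea would be: if `client.connected()` is false and `policy.shouldAttempt(millis())` is true, call `client.connect(...)` once and report the result with `onSuccess()` or `onFailure(millis())`. This does not make the connect call itself non-blocking. A single attempt can still take seconds, so the attempt has to fit inside the 8 second watchdog window on its own.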
Based on my experience so far, I am wondering how anyone can build a robust control system that uses an IoT service like this. It’s a pity there isn’t a non-blocking MQTT client. I am back to thinking that I need to dedicate one processor to the IoT communication (Ubidots) and use a separate one for the control system.
Every so often the reconnect attempt fails with “no socket available” - once this happens there is no recovery other than rebooting the processor. This seems to be an error in the code doing the connect - it isn’t freeing up the sockets.