Site/device health monitoring

There is a need to automatically monitor the health of a site and its devices. Each NetReceiver/VidGrind device has a corresponding uptime variable, named <SiteKey>_<DeviceID>.uptime, which is updated with each /poll request from the device. The proposed approach is to periodically read the device uptime variables and check that their update times are current. NetReceiver and VidGrind already calculate device status this way, the difference being that they do so on demand, whereas we need a task that runs automatically in the background.

Previously, App Engine had a Task Queue API for this kind of thing, but it has been deprecated in favor of Cloud Tasks. In general, the web service implements a method that is invoked periodically by the task.

It is proposed to extend VidGrind's API. The first step is to implement a VidGrind method, such as /api/health, that reports the health of a site and/or device. The second step is to invoke the method from the cloud task and report the results, e.g., via an email notification.

/api/health/SiteKey would report the health of all devices at a site.
/api/health/SiteKey/DeviceID would report the health of a specific device.
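As a rough sketch of how the two URL forms might be routed, the Go code below parses the request path into a site key and an optional device ID. The names parseHealthPath and healthHandler are hypothetical, and the plain-text response is only a placeholder; the actual response format and health lookup are not specified in this proposal.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"strings"
)

// healthPrefix is the path prefix of the proposed method.
const healthPrefix = "/api/health/"

var errBadHealthPath = errors.New("expected /api/health/SiteKey or /api/health/SiteKey/DeviceID")

// parseHealthPath extracts the site key and the optional device ID from a
// request path. deviceID is empty when the path names only a site.
func parseHealthPath(path string) (siteKey, deviceID string, err error) {
	if !strings.HasPrefix(path, healthPrefix) {
		return "", "", errBadHealthPath
	}
	rest := strings.Trim(strings.TrimPrefix(path, healthPrefix), "/")
	parts := strings.Split(rest, "/")
	switch {
	case len(parts) == 1 && parts[0] != "":
		return parts[0], "", nil
	case len(parts) == 2 && parts[0] != "" && parts[1] != "":
		return parts[0], parts[1], nil
	default:
		return "", "", errBadHealthPath
	}
}

// healthHandler is a hypothetical handler. It only echoes the parsed values;
// a real implementation would look up the uptime variables and report status.
func healthHandler(w http.ResponseWriter, r *http.Request) {
	siteKey, deviceID, err := parseHealthPath(r.URL.Path)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	fmt.Fprintf(w, "site=%s device=%s\n", siteKey, deviceID)
}
```

The handler could be registered with http.HandleFunc("/api/health/", healthHandler), so that both the site-level and device-level forms reach the same code.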

