Abstract:
The recent availability of powerful Single Board Computing (SBC) devices has enabled edge
computing at a scale that was previously hard to deploy. This shift exposed a gap: learning at the
edge with low power consumption, hitherto considered difficult to implement in industry.
With preventive maintenance as the key purpose, a simple and quick implementation of federation
in industry had to be addressed. Industries need failure predictions that are accurate and that
preserve data privacy, so that recurring spare-part replacements can be made before equipment
fails.
We were presented with an opportunity to suggest preventive maintenance procedures and make
manufacturing decisions based on Industrial Internet of Things (IIoT) data from multiple sensors
across the enterprise, collected from similar machines on different shop floors of an industrial
setup spread across a varied geography.
The chosen IIoT sensors send sensor data to the edge device at each location at regular intervals.
The Message Queuing Telemetry Transport (MQTT) protocol was used [7], and the data reached the edge
in JavaScript Object Notation (JSON) format with a timestamp and a sensor value. The SBC ran in a
low-power operation mode and was adequately cooled with a passive aluminium heat-sink and
fans. This allowed the edge server to be kept on for long periods of time while consuming only
about 15 W of power on average.
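As a minimal sketch of the message handling described above, the snippet below decodes one JSON payload of the kind an MQTT subscriber would receive at the edge. The key names `timestamp` and `value`, the ISO 8601 timestamp format, and the names `Reading` and `parse_reading` are assumptions made for illustration; the actual payload schema is not specified here.

```python
import json
from datetime import datetime
from typing import NamedTuple


class Reading(NamedTuple):
    """One sensor sample as received at the edge (hypothetical structure)."""
    timestamp: datetime
    value: float


def parse_reading(payload: bytes) -> Reading:
    """Decode one JSON sensor message into a typed reading.

    Assumes a payload such as
    {"timestamp": "2021-03-01T10:15:00", "value": 231.4};
    both key names are assumptions, not taken from the deployment.
    """
    message = json.loads(payload.decode("utf-8"))
    return Reading(
        timestamp=datetime.fromisoformat(message["timestamp"]),
        value=float(message["value"]),
    )


if __name__ == "__main__":
    sample = b'{"timestamp": "2021-03-01T10:15:00", "value": 231.4}'
    print(parse_reading(sample))
```

A function of this shape could be called from the message callback of whichever MQTT client library is used, before the reading is written to the local time series database.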
We introduce a unique method of federation based on HDF5 model file transfer. Each node saves its
model as a checkpoint file, which is then synchronized to the central server using timed file
transfer scripts at the nodes, achieving simple federation. Preset cron jobs at the clients allow
real-time federation as a quick solution using off-the-shelf hardware.
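The timed transfer described above might look like the following sketch, which copies a checkpoint file to an outgoing directory only when its content has changed. The transfer mechanism itself is not specified in this abstract, so a local copy (for example onto a mounted share) stands in for it here; all paths, the 15-minute schedule, and the function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def sync_checkpoint(checkpoint: Path, outbox: Path) -> bool:
    """Copy the checkpoint into `outbox` if it is new or has changed.

    Returns True when a copy was made, and False when the destination
    already holds identical content.
    """
    outbox.mkdir(parents=True, exist_ok=True)
    target = outbox / checkpoint.name
    if target.exists() and file_digest(target) == file_digest(checkpoint):
        return False
    shutil.copy2(checkpoint, target)
    return True


# Hypothetical crontab entry on an edge node, running every 15 minutes:
# */15 * * * * /usr/bin/python3 /opt/edge/sync_checkpoint.py
if __name__ == "__main__":
    # Both paths are placeholders for illustration only.
    src = Path("/opt/edge/model.h5")
    if src.exists():
        sync_checkpoint(src, Path("/mnt/central/edge-01"))
```

Comparing digests before copying keeps repeated cron runs cheap when the model has not been retrained since the last transfer.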
The setup has a central server in the monitoring station, with a cloud server as a fallback.
This server was networked to edge devices on each shop floor across the industry. Sensor data from
the individual machines was captured into a named time series database at each edge
device. Learning was done at each edge device, and the model was then sent back to the central
server, without any actual sensor data, for incremental learning. The learned model could be reused
at another similar deployment based on similar sensor data.
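How the central server combines the models it receives is not detailed in this abstract. One common option, shown here purely as an illustrative assumption and not as the method used in this work, is a sample-weighted average of the per-layer weights from each edge model (federated averaging). The flat list-of-lists weight layout and the name `average_weights` are hypothetical.

```python
from typing import List, Sequence

Weights = List[List[float]]  # one flat list of floats per model layer


def average_weights(client_weights: Sequence[Weights],
                    sample_counts: Sequence[int]) -> Weights:
    """Sample-weighted mean of per-layer weights from several edge models."""
    if not client_weights or len(client_weights) != len(sample_counts):
        raise ValueError("need one sample count per client model")
    total = sum(sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    averaged: Weights = []
    for layer_group in zip(*client_weights):  # same layer across all clients
        layer = [
            sum(w * n for w, n in zip(column, sample_counts)) / total
            for column in zip(*layer_group)   # same weight across all clients
        ]
        averaged.append(layer)
    return averaged


if __name__ == "__main__":
    edge_a = [[1.0, 2.0], [3.0]]
    edge_b = [[3.0, 4.0], [5.0]]
    print(average_weights([edge_a, edge_b], sample_counts=[1, 3]))
```

Only model weights and sample counts cross the network in this scheme, which matches the requirement that no raw sensor data leaves the edge.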
This implementation uses Split Federation with linear, DNN, CNN, and RNN models.
Various sensor data was collected by the edge device on each industrial floor. We chose to base
our first set of experiments on the relayed time series voltage data, since the supply from the
power grid fluctuated at times and had a seasonality pattern. Identifying periods of least
fluctuation in which to run sensitive gauging machinery was the first step in forecasting
preventive maintenance routines for sensitive equipment across the enterprise. It also ensured
higher accuracy of the gauging equipment. Federated Learning (FL) models were used to predict the
sensor values and make decisions. The sensor data was stored and processed at the edge, and the
Machine Learning (ML) techniques operated only at the edge. The models were then synchronized to
the central cloud server and back to other edge devices in the network. The results lay a
foundation for FL using a split learning paradigm in the IIoT space, with SBCs that consume
minimal power, for an enterprise spread across a diverse geography.
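To illustrate what identifying a period of least fluctuation can mean in practice, the sketch below scans a voltage series with a sliding window and returns the window with the smallest standard deviation. It is a simple selection criterion that could be applied to observed or predicted values; it is not the forecasting model itself, and the function name, window width, and sample values are assumptions.

```python
from statistics import pstdev
from typing import Sequence, Tuple


def quietest_window(values: Sequence[float], width: int) -> Tuple[int, float]:
    """Return (start_index, std_dev) of the least-fluctuating window.

    Ties are resolved in favour of the earliest window.
    """
    if width < 2 or width > len(values):
        raise ValueError("width must be between 2 and len(values)")
    best_start, best_spread = 0, float("inf")
    for start in range(len(values) - width + 1):
        spread = pstdev(values[start:start + width])
        if spread < best_spread:
            best_start, best_spread = start, spread
    return best_start, best_spread


if __name__ == "__main__":
    # Hypothetical voltage readings, in volts.
    volts = [230, 238, 225, 231, 231, 231, 240, 222]
    print(quietest_window(volts, width=3))
```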
Data privacy is maintained, while reduced bandwidth requirements
between the edge and the cloud make this a simple and easy first implementation of federation in
industry.
This approach also has social implications. Being quick and simple, it can enable cheaper
implementations in public service projects where site data needs to remain private. Even
power cuts in rural areas will not affect the federation, and decision making can happen
even in the harshest of field situations.
This has considerable impact on decentralized decision making. Failure patterns can be identified
and, in general, an accurate model can be generated with limited resources.
A distinguishing feature of this approach is that the training checkpoints are saved. The
TensorFlow Keras ModelCheckpoint callback saves the model continuously during training and again
at the end of training, so the saved model remains available in case of any interruption.
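A minimal sketch of this checkpointing pattern follows, assuming TensorFlow with its bundled Keras is installed. The synthetic data, the small dense model, and the file name `edge_model.h5` are placeholders and do not reflect the models or data used in this work.

```python
import numpy as np
import tensorflow as tf

# Synthetic stand-in for windowed sensor data: 64 windows of 8 readings each.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 8)).astype("float32")
y = x.mean(axis=1, keepdims=True)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Save the full model in HDF5 format at the end of every epoch.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    filepath="edge_model.h5",  # hypothetical file name
    save_freq="epoch",
)
model.fit(x, y, epochs=2, batch_size=16, callbacks=[checkpoint], verbose=0)

# After an interruption, reload the saved model and continue training.
# compile=False skips restoring the optimizer state, so the model is
# recompiled here; this is a simplification made for the sketch.
restored = tf.keras.models.load_model("edge_model.h5", compile=False)
restored.compile(optimizer="adam", loss="mse")
restored.fit(x, y, epochs=1, batch_size=16, verbose=0)
```

The resulting HDF5 file is the kind of artifact that the timed transfer scripts would then synchronize to the central server.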