Consider the following question: Why do most teams face pressure to rethink traditional logging and observability approaches? Ask most engineers, and the answers will likely center on the challenges posed by microservices applications. Because microservices are more complex than monoliths and involve more moving parts, they require more sophisticated, granular log collection, correlation, and analysis.
But microservices are not the only reason why logging is so challenging today. Equally important is the rise of Internet of Things (IoT) systems. IoT devices are widely distributed and produce log data in a broad range of types and formats. Ultimately, this presents many of the same logging and observability challenges as microservices.
Here’s what the IoT means for logging and observability, along with an overview of IoT logging best practices.
The Challenges of IoT Logging
At a high level, the challenges of logging for the IoT stem from one simple fact: IoT infrastructure is very complex.
But we can distill that complexity into several specific challenges that teams must manage when logging and analyzing data from IoT sensors, networks, and other devices in any type of IoT setting – from manufacturing to healthcare to robotics and beyond.
Distributed Devices Equal Distributed Logs
Like microservices applications, IoT environments often consist of devices spread across a wide geographic area. You might have a network of Internet-connected traffic lights distributed across a city, for instance, or medical IoT sensors deployed in patients’ homes.
Because of this distributed architecture, the logs that IoT devices produce are typically dispersed across a wide area. That makes collecting and aggregating log data more difficult because you can’t collect it all from a single local network. You’ll need to find a way to collect logs from each device, no matter where it is.
Diverse Communication Protocols
Some IoT devices communicate using standard Internet protocols. But others, especially those that are subject to intermittent network connectivity or that have minimal hardware resources available for interacting with the network, rely on less standard networking protocols. These protocols help IoT devices manage communications under conditions that wouldn’t exist in a conventional data center.
Ultimately, this can be a challenge when logging because logging strategies based on standard, IP-based network configurations and assumptions of continuous connectivity don’t always work with IoT networks. We can’t necessarily pull log data from a device at any time, and tracking devices based on IP doesn’t always work.
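As a rough illustration of tracking devices by something other than IP, the sketch below groups incoming records by a stable device identifier and ignores the source address. The record layout (device_id, source_ip, message) and the function name are assumptions made for this example, not part of any particular IoT protocol or logging product.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical record layout for this sketch: (device_id, source_ip, message).
Record = Tuple[str, str, str]


def group_by_device(records: List[Record]) -> Dict[str, List[str]]:
    """Group messages by device_id, ignoring whichever IP delivered them."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for device_id, _source_ip, message in records:
        grouped[device_id].append(message)
    return dict(grouped)


records = [
    ("sensor-a", "10.0.0.5", "boot"),
    ("sensor-b", "10.0.0.9", "boot"),
    ("sensor-a", "172.16.4.2", "reading sent"),  # same device, new address
]
by_device = group_by_device(records)
print(by_device["sensor-a"])
```

Both messages from sensor-a end up together even though the device changed addresses between them.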
Intermittent Device Availability
Due to intermittent network connectivity and other issues, IoT devices may go offline and come back online frequently. They may power down periodically due to limited power resources, be physically destroyed by harsh environmental conditions, or be stolen or moved.
Issues like these make it hard to count on collecting every log from every device in your IoT network. Teams must assume that some log data will be lost or unavailable.
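One way to make lost log data visible, assuming devices can attach an increasing sequence number to each log entry (an assumption for this sketch, since many devices do not), is to look for holes in the numbers that actually arrive. The function name below is hypothetical.

```python
from typing import Iterable, List, Tuple


def find_sequence_gaps(seq_numbers: Iterable[int]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) ranges of sequence numbers that never arrived."""
    gaps: List[Tuple[int, int]] = []
    ordered = sorted(set(seq_numbers))  # tolerate duplicates and out-of-order delivery
    for prev, curr in zip(ordered, ordered[1:]):
        if curr - prev > 1:
            gaps.append((prev + 1, curr - 1))
    return gaps


received = [1, 2, 3, 7, 8, 10]
print(find_sequence_gaps(received))  # [(4, 6), (9, 9)]
```

A check like this does not recover the missing entries. It only tells you how much data you are missing, so you can judge how far to trust the rest.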
Diverse Logging Formats
Compared to standard infrastructure and applications, IoT devices offer little standardization or consistency in how they log data. You can count on virtually every Linux server in the world to log data inside /var/log and every Kubernetes cluster to store its various logs in the same directory on its nodes.
But with IoT devices, all bets are off about where they will store logs and how those logs will be structured. Sometimes, the devices may not log data at all because they lack the hardware resources necessary to store data locally, or because developers didn’t implement any logic for logging.
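A common way to cope with inconsistent formats is to normalize every line into one shared record shape before analysis. The sketch below assumes two made-up input formats (a JSON object with "level" and "msg" keys, and a plain "LEVEL: message" line) and falls back to an "unknown" level for anything else. Real devices will need their own parsers; the field names here are illustrative only.

```python
import json
from typing import Dict

# Levels recognized in the hypothetical "LEVEL: message" text format.
KNOWN_LEVELS = {"debug", "info", "warn", "error"}


def normalize(raw: str, device_id: str) -> Dict[str, str]:
    """Map a raw log line to a record with device_id, level, and message."""
    text = raw.strip()
    record = {"device_id": device_id, "level": "unknown", "message": text}

    if text.startswith("{"):
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            return record  # keep the raw line rather than dropping it
        if isinstance(data, dict):
            record["level"] = str(data.get("level", "unknown")).lower()
            record["message"] = str(data.get("msg", text))
        return record

    prefix, sep, rest = text.partition(":")
    if sep and prefix.strip().lower() in KNOWN_LEVELS:
        record["level"] = prefix.strip().lower()
        record["message"] = rest.strip()
    return record


print(normalize('{"level": "WARN", "msg": "battery low"}', "sensor-1"))
print(normalize("ERROR: valve stuck", "pump-7"))
print(normalize("boot ok", "light-3"))
```

Lines that match neither format are kept as-is with an "unknown" level, so that unparseable data is still available as evidence.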
Best Practices for Working with IoT Logs
There is no simple solution for coping with the logging challenges that IoT devices present. However, the following practices can help mitigate the difficulties.
Focus On the System, Not Individual Logs
Because some IoT logs may be lost or intermittently available, log analysis operations should focus on assessing the state of the network as a whole rather than obsessing over the status of individual devices. Lack of visibility into some devices is not likely to present significant challenges as long as you can collect and analyze enough data to manage the health of the IoT system as a whole.
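A system-level view can be as simple as asking what share of the fleet has been heard from recently, rather than alerting on each silent device. The sketch below assumes you already track a last-seen timestamp per device. The function name, device names, and the five-minute threshold are illustrative choices, not recommendations.

```python
from typing import Dict


def fleet_reporting_ratio(last_seen: Dict[str, float], now: float, max_age_s: float) -> float:
    """Return the fraction of devices whose last log arrived within max_age_s seconds."""
    if not last_seen:
        return 0.0
    fresh = sum(1 for ts in last_seen.values() if now - ts <= max_age_s)
    return fresh / len(last_seen)


# Timestamps are seconds on an arbitrary clock for this example.
last_seen = {"light-01": 990.0, "light-02": 400.0, "light-03": 950.0, "light-04": 999.0}
ratio = fleet_reporting_ratio(last_seen, now=1000.0, max_age_s=300.0)
print(f"{ratio:.0%} of devices reported recently")  # 75% of devices reported recently
```

An alert could then fire when the ratio drops below a threshold the team chooses, instead of on every individual device that goes quiet.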
Gather All of the Evidence You Can
That said, it’s nonetheless crucial to gather and analyze all of the data that you can from IoT devices and infrastructure. That includes any logs that the devices generate and metrics you can collect based on network performance. If your IoT devices connect to centralized data centers, use the logging data and metrics from those data centers, too, to gain added visibility into your IoT network.
In other words, be as creative as possible when it comes to gaining insight into IoT devices. Conventional logs may be the best source of visibility, but they’re not always available or complete, so you’ll need to think outside the box when building an IoT logging strategy.
Pull, Don’t Push
Because IoT devices have limited hardware resources, a logging strategy in which logging agents pull log data into a centralized logging platform, rather than one in which the devices push it directly, may improve overall logging performance. Your goal should be to place as little burden as possible on IoT devices and networks when collecting log and metrics data.
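To make the pull model concrete, the sketch below simulates it in memory: a device keeps only a small buffer of recent lines, and a collector fetches whatever is new on its own schedule while tracking a cursor. The class names, the read_since method, and the buffer size are all hypothetical. A real deployment would replace the in-memory calls with whatever transport its agents use.

```python
from collections import deque
from typing import Deque, List, Tuple


class DeviceBuffer:
    """Hypothetical stand-in for a device that keeps only its newest log lines."""

    def __init__(self, capacity: int) -> None:
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._lines: Deque[Tuple[int, str]] = deque(maxlen=capacity)
        self._next_seq = 0

    def log(self, line: str) -> None:
        self._lines.append((self._next_seq, line))
        self._next_seq += 1

    def read_since(self, cursor: int) -> List[Tuple[int, str]]:
        """Return buffered (seq, line) pairs with seq >= cursor."""
        return [(seq, line) for seq, line in self._lines if seq >= cursor]


class Collector:
    """Pulls new lines from a device and remembers where it left off."""

    def __init__(self) -> None:
        self.cursor = 0
        self.collected: List[str] = []

    def pull(self, device: DeviceBuffer) -> int:
        batch = device.read_since(self.cursor)
        for seq, line in batch:
            self.collected.append(line)
            self.cursor = seq + 1
        return len(batch)


device = DeviceBuffer(capacity=3)
collector = Collector()
device.log("a")
device.log("b")
first = collector.pull(device)
for line in ["c", "d", "e", "f"]:
    device.log(line)  # "c" is overwritten before the next pull
second = collector.pull(device)
print(first, second, collector.collected)
```

In this run the line "c" is lost because the device buffer overflowed between pulls. A pull model keeps the device's work small, but the collector still has to tolerate gaps.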
Continuously Improve Your IoT Logging Strategy
IoT networks tend to evolve and grow constantly. The logging processes and tools you use for them must do the same. As new devices come online or networks grow even more distributed, adapt your logging strategy to accommodate the changes.
Conclusion: Managing Logging in an IoT-Centric World
IoT logging success hinges on building a creative, adaptable logging strategy that can accommodate IoT infrastructure’s distributed and transient nature. Teams can achieve this goal by using a log management platform like LogDNA to collect as much evidence as possible and use it to track the health and status of their IoT environment as a whole.