Maryleen Ndubuaku Edge-enhanced smart analytics in IoT-based distributed systems

Automatic Summary

Edge-Enhanced Smart Analytics in IoT-Based Distributed Systems

Hello! This is Maryleen Ndubuaku, a PhD scholar and lecturer at the University of Derby. Today, we’re diving into the intersection of Machine Learning and the Internet of Things (IoT). More specifically, let's explore the fascinating world of data stream processing using intelligent techniques on IoT distributed systems. In this article, we will look at real-world applications and business use cases of this technology.

Understanding Edge Enhanced Smart Analytics

In today's fast-paced technological landscape, extracting valuable insights from data quickly and efficiently is crucial. This is where edge-enhanced smart analytics comes into play. The approach is about making quicker, data-driven decisions, either by adding sensors to collect more data or by leveraging existing data to derive insights faster. The goal is to minimize decision-making delay as much as possible. Industries such as healthcare, self-driving cars, gaming, and surveillance are already benefiting from this approach, because even a second's delay could lead to a life-threatening situation or hamper the user experience.

Growth in IoT and Associated Challenges

From roughly 5 billion deployed devices in 2016 to around 40 billion devices today, the IoT landscape has witnessed exponential growth. This increase has brought with it several challenges, including:

  • Continuous, rapid arrival of data from multiple sensor origins
  • Low computational power in IoT devices
  • Need for faster, life-saving decisions
  • Deployment of devices in harsh environments or disaster-prone areas
  • Heterogeneity of data collected from different sources, locations, and types
  • Connectivity issues between the data source and the data center due to remote locations

Due to these complexities, businesses are constantly seeking to develop low-latency applications and scalable systems.

The Switch from Cloud Computing to Edge Computing

Traditionally, cloud computing has been the preferred data stream processing paradigm: massive volumes of sensor-generated data are sent to a central cloud location for processing. However, cloud computing suffers from latency and network congestion because raw data is dumped onto the network in bulk.

This is where edge computing comes to the rescue. In this model, edge devices serve as an intermediary layer between the sensors and the cloud. More powerful than the data-source devices but less powerful than the cloud, these edge devices preprocess the data before sending it to the cloud for further processing. This not only reduces the load on the cloud but also localizes decision-making, reduces network traffic, and helps address privacy concerns.
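To make this intermediary role concrete, here is a minimal Python sketch of an edge gateway; the window size, the simulated readings, and the `send_to_cloud` callback are illustrative placeholders for a real sensor feed and cloud API. It simply collapses each window of raw readings into a compact summary before forwarding it.

```python
from statistics import mean

WINDOW_SIZE = 60  # e.g. one reading per second aggregated into a per-minute summary


def summarize(window):
    """Collapse a window of raw readings into one compact record."""
    return {
        "count": len(window),
        "mean": mean(window),
        "min": min(window),
        "max": max(window),
    }


def edge_gateway(sensor_stream, send_to_cloud):
    """Buffer raw sensor readings and forward only window summaries."""
    window = []
    for reading in sensor_stream:              # continuous, rapid arrival
        window.append(reading)
        if len(window) == WINDOW_SIZE:
            send_to_cloud(summarize(window))   # far fewer, more meaningful messages
            window.clear()


if __name__ == "__main__":
    # Simulated data in place of real sensors; print() stands in for a cloud API call.
    fake_stream = (20.0 + (i % 7) * 0.1 for i in range(300))
    edge_gateway(fake_stream, send_to_cloud=print)
```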

Applications and Benefits of Edge Enhanced Smart Analytics

The essence of edge-enhanced analytics lies in leveraging the edge's role as an intermediary layer by performing smart preprocessing with machine learning. This yields significant savings in data transmission costs and ensures that the data sent to the cloud is more meaningful and insightful. The cloud, in turn, is freed up for higher-level tasks, enabling faster actuation and lower latency across the system.
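One example of such smart preprocessing is dimension reduction on the edge (discussed further below). The sketch here is a simplified illustration, assuming scikit-learn is available on the edge device and using simulated, partially redundant sensor vectors: 32-dimensional readings are compressed to 8 principal components before transmission, and the cloud side can approximately reconstruct them when needed.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# Simulated readings from 32 partially redundant sensors (low intrinsic dimension).
latent = rng.normal(size=(1000, 4))
mixing = rng.normal(size=(4, 32))
readings = latent @ mixing + 0.05 * rng.normal(size=(1000, 32))

# Edge side: fit a small PCA model and transmit 8 components instead of 32 values.
pca = PCA(n_components=8)
compressed = pca.fit_transform(readings)       # shape (1000, 8): 4x less to send

# Cloud side: approximately reconstruct the original readings when needed.
reconstructed = pca.inverse_transform(compressed)
error = np.mean((readings - reconstructed) ** 2)
print(f"sent {compressed.shape[1]} of {readings.shape[1]} dimensions, "
      f"mean squared reconstruction error ~ {error:.4f}")
```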

Techniques such as federated learning, dimension reduction, anomaly detection, and clustering help maintain data privacy, reduce data redundancy, efficiently handle resource constraints, and even detect irregularities in a timely manner, thereby dramatically enhancing system scalability and efficiency.
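To give a flavor of the anomaly-detection idea, here is a minimal sketch of an edge-side sieve that forwards only statistically unusual readings to the cloud; the rolling-window z-score rule, the threshold, and the simulated stream are illustrative assumptions rather than a prescribed method.

```python
from collections import deque
from statistics import mean, stdev


def anomaly_sieve(sensor_stream, send_to_cloud, window=120, threshold=3.0):
    """Forward only readings that deviate strongly from the recent baseline."""
    history = deque(maxlen=window)
    for reading in sensor_stream:
        if len(history) >= 30:                 # wait for a minimal baseline first
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(reading - mu) / sigma > threshold:
                send_to_cloud({"value": reading, "baseline": mu})  # "interesting" data only
        history.append(reading)


if __name__ == "__main__":
    # Simulated stream: mostly normal values with one injected spike.
    stream = [20.0 + 0.1 * (i % 5) for i in range(500)]
    stream[250] = 35.0
    anomaly_sieve(stream, send_to_cloud=print)
```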

With data-driven decision-making now a necessity more than a convenience, IoT-based systems enhanced with smart analytics on the edge truly hold the key to future advancements.

Hope you enjoyed reading! For further queries or interactions, please feel free to connect with me on LinkedIn or Twitter. Let's keep the data conversation going!

Remember, in the era of digital domination, staying updated helps us stay ahead. So, here’s to learning and growing together in the realm of IoT and Machine Learning!


Video Transcription

Hello everyone. My name is Maryleen Ndubuaku, and I'm a lecturer at the University of Derby, where I'm also doing my PhD. My work is at the intersection of machine learning and the Internet of Things. Basically, I'm looking at data stream processing using intelligent techniques on IoT distributed systems. My talk today is going to be around my research, and I'm also going to be talking about some real-world applications in this area and some business use cases.

So the topic is edge-enhanced smart analytics in IoT-based distributed systems. When you look at these applications, they have been with us for a while, but today new technologies are coming into place, and the focus is on faster decision making, on making quicker decisions with what we do with data.

The difference we see today is that we want a data-driven approach to how we make decisions. It could be adding more sensors to the system to collect more data for data-driven decisions, or it could be using the data we already have to make quicker decisions or to get more insight into what's happening in the system. We can see here, in healthcare, self-driving cars, gaming and surveillance, that we want to reduce the delay in making decisions as much as possible, because just one second of delay can cost someone their life; it can lead to loss of life or damage.

In the case of gaming, those of us who are game lovers know that if there is a lag in the system, it's going to affect the user experience and it won't be enjoyable. In the case of surveillance, we are looking at detecting, let's say, anomalies in the scene. So we want to go from looking at just redundant data to extracting useful insights. Now we're going to see how today's technology is coming into place to improve these sorts of applications and improve the efficiency and performance of the system. Internet of Things deployment has gone from about 5 billion devices as of 2016 to 40 billion devices, and even more devices are being added to the network. We know that the data coming from Internet of Things sensors arrives quickly. Take the autonomous greenhouse, for example: we can have sensors deployed to collect data every second, that data has to be sent somewhere to be analyzed, and it is arriving continuously. We also know that IoT devices have low computational power.

As I've shown on the previous slide, today we're looking at time-critical applications where you're either improving the user experience in real time or actually making real-time decisions that can be a matter of life and death.

Now, the nature of IoT-based systems: we'll look at it broadly in terms of the challenges as well as the requirements. For the challenges, we know that IoT devices are resource-constrained, and sometimes they are deployed in harsh environments, for example in disaster areas where they need to collect data, so some of them may be unmanned. Then we know that they are heterogeneous in nature: we are collecting data from different data sources and different locations. In the case of healthcare, you can have one patient's data spanning different data types; let's say you have images, numerical data and time-series data, all related to the same patient. So how can we handle this heterogeneity? These are some of the questions that come into play when we talk of IoT-based systems. Then we also look at connectivity issues. For example, in a disaster zone or a remote area, when you deploy your IoT sensors, you may have connectivity issues between the data source and the data center.

Those are some of the challenges. Then for requirements, these days most businesses are looking to build low-latency applications. For example, if you go on a website and it takes even 10 seconds to load, you're already impatient.

It's a fast-paced world, and we're looking at arriving at decisions with very low latency. There is also scalability. Like we said, IoT devices are increasing in number; they are being added every day, and because they are low-cost, low-power devices, it's easy to just add more of them. But what that means is that we are going to have more data to handle, so we should be able to scale up our system: as we add these devices, we want to maintain about the same performance in the system. Those are some requirements. Now we look at cloud computing, which is the traditional paradigm for data stream processing. In this paradigm, we have sensors that are collecting data. These sensors are the data source, and they send data to a central location, which is the cloud. The cloud is a powerful entity; it has massive storage and computing capacity, so it can handle all that data. But the problem is that if you are in an area where you're experiencing end-to-end latency between the data source and the cloud, it's going to take more time to make decisions.

If you need to make an actuation decision, let's say open a vent, or in the case of self-driving cars, halt the car to prevent an accident, using this architecture is going to be really challenging. And again, all of these sensors are sending raw data to the cloud, so you're going to have a massive stream coming in, and that can cause congestion and network traffic as well. These are some of the challenges, and there are many more, with the cloud, and we're going to see how we can reduce or solve some of them using edge computing. In the edge computing paradigm, we have the edge device as an intermediary layer, and you can have several edge devices as intermediary layers. Sometimes you hear cloudlet, mobile cloud and other terminologies, but the whole idea is that you have this intermediary layer between the sensor and the cloud. The edge is not as powerful as the cloud but is usually more powerful than the sensors, the data-source devices.

What it does is preprocess the data before it gets sent to the cloud, so the edge is making low-level decisions, or you can say doing some low-level processing, while the cloud is doing higher-level processing. In that way, we can say that the edge is serving as a buffer to the cloud, offloading the cloud by having some of these processing tasks handled on the edge. This way as well, if we want to make decisions quickly, rather than sending the data all the way to the cloud, we can localize the decision and do some actuation at the edge level. Again, when it comes to sending data, we're not sending all that raw data; we can preprocess the data and send data that is more meaningful and insightful to the cloud. Now, the whole idea of edge-enhanced smart analytics is that we can leverage this ability the edge has as an intermediary layer by performing some preprocessing, and that preprocessing can be done in a smart way using machine learning. The edge and the cloud can collaborate, so it's a sort of cooperative data processing within a single distributed system.

We can have machine learning on the edge and machine learning on the cloud, and the data processing can go back and forth between them. In terms of meeting IoT system requirements, we're going to look at how edge-enhanced smart analytics can come in handy to achieve the goals, or requirements, of IoT systems. We're going to look at that in two separate parts: we can use learning techniques, which is basically how the intelligent model is trained, and we can also use data reduction techniques, which is about the way the data is processed. I'm just going to pause and check how this is going. OK, so moving on. Now we look at the different learning techniques. We have federated learning, which is basically where we have local models on the edge; those models process the data locally, and the model gets sent to the cloud and aggregated on the cloud. So what is the advantage of doing this, and why do we need it? One is privacy: for example, a business may be concerned about sending its data to the cloud, to avoid the data being compromised, or it wants to ensure the privacy of the data.

So the model can be trained locally on the data, and then the models are sent and aggregated, in case, say, the business units need to make some global decision but don't want to expose their data. So privacy is one motivation. Another motivation is the case of resource constraints: rather than having to send all the data to the cloud, the edge node can still make decisions or still be trained, and it can still receive the global model during model distribution. So the motivations are privacy and resource constraints, and there are some other motivations as well for federated learning. Then we look at data reduction techniques: dimension reduction, anomaly detection and clustering. The whole idea is that we want to preprocess the data on the edge before it's sent to the cloud, and these are different techniques we can use to achieve that. With dimension reduction, we're still sending the same data to the cloud, but in a miniaturized version: the data is compressed, which means the channel between the edge and the cloud is freed up and we save bandwidth. Then in the case of anomaly detection, we just want to send the special data, or the interesting data as we may call it, to the cloud, and a typical example is in video surveillance.

So basically, let's say you have a human monitor who ordinarily would be looking at each stream as it comes in. But with the edge-enhanced model, you actually have an intermediary layer that acts as a sieve, one that sieves out the abnormal data and sends only that abnormal data to the cloud. You can even set up an alert system so that whenever there is an anomaly, everyone is alerted, and that can improve decision making. By doing that, you tend to miss fewer of the anomalies, which are themselves rare. Then there is clustering as well. With clustering, when data tends to be similar, you can have a lot of redundancy, so what you can do is group the similar data and send an instance of that data to the cloud rather than the entire data. This is a way to reduce redundancy in your data transmission, for example in the case of data summarization: you take the raw input data and try to find some structure and similarity in it, as the sketch below illustrates.
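A rough sketch of that clustering-based summarization, using scikit-learn's KMeans on simulated sensor vectors (an illustrative choice, not necessarily the method used in the talk): group similar records and transmit only one representative per cluster.

```python
import numpy as np
from sklearn.cluster import KMeans


def summarize_batch(records, n_clusters=5):
    """Cluster similar records and return one representative per cluster."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(records)
    representatives = []
    for label in range(n_clusters):
        members = records[km.labels_ == label]
        # Pick the member closest to the cluster center as the representative.
        center = km.cluster_centers_[label]
        closest = members[np.argmin(np.linalg.norm(members - center, axis=1))]
        representatives.append(closest)
    return np.array(representatives)


# Simulated batch of highly redundant sensor vectors drawn around five patterns.
rng = np.random.default_rng(1)
batch = np.vstack([rng.normal(loc=c, scale=0.05, size=(200, 3)) for c in range(5)])
summary = summarize_batch(batch)
print(f"reduced {len(batch)} records to {len(summary)} representatives")
```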

When you have put the data in clusters, you can just choose a sample of that data and send it across your network to summarize what is going on. A typical example is in scene understanding: you want to know what's happening in the scene without having to look at every second of data coming in, so you can use this approach. So what are some of the benefits of edge-enhanced smart analytics? One, like we said, is that we incur lower transmission costs, because we're leveraging the edge to preprocess the data, so we can reduce the cost of data transmission; the data we're transmitting is not only smaller or compressed, it's actually more meaningful and more insightful as it goes to the cloud. By doing that, the cloud can then perform some higher-level tasks, maybe object recognition or correlation of anomalies, or some global task, so it frees up the cloud for higher-level decisions. Then there is faster actuation as well: if we need to take decisions, instead of sending all the data to the cloud, we can localize our decision making, and once, say, the edge detects anomalies, we can make an actuation decision without going back and forth.

That also leads us to low latency in the system. In the case of scalability, because we can add multiple edge resources that are local to that network, it means we can scale up the analytics. Privacy as well: like we saw, if you don't want to send all your data to the cloud because of privacy and security concerns, you can use the federated learning model to process your data locally and only send the model, rather than the data, to the cloud for aggregation.
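As a minimal sketch of that federated setup, with tiny NumPy logistic-regression models standing in for real edge models and simple equal-weight averaging on the cloud side (illustrative assumptions only, not the speaker's exact method):

```python
import numpy as np


def local_update(weights, features, labels, lr=0.1, epochs=20):
    """Train a tiny logistic-regression model on data that never leaves the edge node."""
    for _ in range(epochs):
        preds = 1.0 / (1.0 + np.exp(-features @ weights))
        grad = features.T @ (preds - labels) / len(labels)
        weights = weights - lr * grad
    return weights


def federated_round(global_weights, edge_datasets):
    """Each edge node trains locally; only the model weights travel to the cloud."""
    local_models = [local_update(global_weights.copy(), X, y) for X, y in edge_datasets]
    return np.mean(local_models, axis=0)   # cloud-side aggregation, FedAvg-style


# Three hypothetical edge nodes, each with its own private dataset.
rng = np.random.default_rng(2)
true_w = np.array([1.5, -2.0, 0.5])


def make_private_data():
    X = rng.normal(size=(100, 3))
    y = (X @ true_w + 0.1 * rng.normal(size=100) > 0).astype(float)
    return X, y


edges = [make_private_data() for _ in range(3)]
w = np.zeros(3)
for _ in range(10):                        # ten communication rounds
    w = federated_round(w, edges)
print("aggregated global weights:", np.round(w, 2))
```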

So the takeaways from this are basically that IoT systems have different challenges, like resource constraints, heterogeneity and connectivity, and we are tackling those with learning and data reduction techniques. Some of the benefits, as we've seen, are that we can save data transmission costs, we can have quicker actuation and make decisions faster, and we can also achieve privacy and scalability in the system. OK, let's see, do we have any questions yet? Please, if you have a question, feel free to ask in the chat and I'll answer with the little time we have left, and also feel free to connect with me on LinkedIn and Twitter. OK. Ella asks: how can we get started with putting data models on the cloud? So, do you mean machine learning models? OK, machine learning models. Basically it depends on your architecture, so you draw up an architecture of what you want. Thank you, Mariana. Depending on your architecture, if you want to use the cloud computing approach where you're sending all your data to the cloud, you can have a specific machine learning model that you deploy on the cloud.

Google, Amazon and others even have machine learning platforms where you can do your machine learning training; they have sort of custom-made algorithms. So you can check out those platforms for what they have, or you can build your own machine learning model and deploy it on the cloud. They already have structures in place to achieve that, I mean the cloud platforms like Amazon. Cool. OK. All right. You work on cybersecurity and data privacy? All right, yeah, sure, please, let's connect, and thank you very much for joining. Yeah, thank you. Thanks a lot. All right, so I'm signing off now. Yeah, Dennis, thank you for joining. Thanks, Lija, thank you very much for joining. All right, goodbye. Thanks, Sarah. Thank you, Judy. Thanks for joining. All right, bye.