Processing large-scale data and computational functions in a serverless manner


Video Transcription

Hello, everyone. Let me start with a quick introduction, and then we'll get into the topic we are going to discuss today. I'm Rinu Saluja, and I'm part of the Commercial Software Engineering group within Microsoft. We work closely with the engineering teams to get customer solutions deployed to production, and we also help the engineering teams prioritize the high-priority customer scenarios so that new features can be accommodated. As you can see, my title is software engineer; I love coding and hands-on work. So let's start the session. Today we are going to talk about processing large-scale data. Large-scale data processing is not a new concept, but the specific thing we will cover today is how we can handle the computational functions in a serverless manner; that's the key point. Our agenda looks like this: first, I will introduce the topic and the key problem we are going to solve in a serverless way.

Once we have the problem statement, we will lay down an information architecture using a specific scenario, and from that information architecture we will derive a technical architecture. After that, I will demonstrate the architecture and a few things that are relevant to our use case. Then we will talk about the key considerations you need to keep in mind before designing any such big data solution with serverless technology. We will end with best practices, followed by Q&A. So let's get started. Whenever we talk about large data, think about the volume we receive every day from various platforms and various kinds of social media; within a company, we have many channels for gathering data. So when we say we need to handle this large volume of data, what are the key hurdles or challenges we generally face when designing a solution for it?

The first is data quality. Since we are getting data from various channels, how do we check data quality? For the same set of fields, some channels may give us null values, and others may give us values that are out of range and therefore not relevant; if we need to perform analytics, that data is not usable. In the same way, there may be other data quality issues that become hurdles when we want to implement any kind of analytics, including machine learning algorithms, and any kind of reporting on top of that. That's the first key challenge. Second, as the data grows and we need to run computational functions on it, we obviously need more resources, whether for storing the data, getting it into the system, processing it, or finally displaying the results. As resources grow with the data, the budget grows too, so how do we handle that? Another challenge, as I already mentioned, is the ever-increasing volume: data grows day by day, and we cannot afford to miss any particular day's data, because it may be crucial for the business; think of sales data, for example.

Based on the season, the specific geography, and other parameters, whether it is data for a specific country or a specific city, we need to handle all of it; even though it keeps growing, we can't drop anything. Another key concern is security, and it matters because the data we handle may contain personal information, data that is needed only by specific stakeholders, or data subject to geo-specific regulations or industry-specific guidelines such as HIPAA.

To meet all the compliance requirements, we need proper data security in place. Whether my data volume is growing or not, everything related to data security has to be in place; that is something we cannot compromise on, even as our requirements grow. Beyond security and growing volume, another challenge is that there are multiple stakeholders we need to serve at the same time. They may be sharing the data and collaborating in various ways, whether that is displaying the data, processing it, or sharing the processed information, and this also matters during the governance phase. So that is another key factor.

And the last thing we can't miss is performance. For handling large data we can't compromise on any of these points; we need to handle them all in a sensible way. But coming back to the serverless angle: all of these requirements can be met by most of the big data platforms on the market today, so why serverless? Out of these six major hurdles, three are critical to the decision: can we handle the budgetary requirements with a growing data set, can we handle the increasing volume, and on top of that can we maintain performance? Those three questions decide whether we go with a serverless architecture or the traditional data processing model. Now let's talk about how the kind of data we deal with today impacts our daily lives.

The screen you see shows COVID-related data from across the globe: the number of cases rising, and along with the cases, vaccine doses and related data by state and city. This data grows day by day, and it is used by many industries. The two industries I want to focus on directly impact our lives: one is healthcare and the other is the supply chain. By supply chain I mean delivery, whether of medical equipment or day-to-day groceries; all of that online order processing is handled by these supply chains. On the medical side, because this is a pandemic, medical scientists need to look at the data to anticipate the next wave, to work out the structure or combination of the next medicine we might need, and to evaluate the results of the medicines we have already been taking. So this is data we can't afford to miss; it is very important to us. Almost any data we see today is highly relevant to us.

Now, if we are talking about this data, why do we say it is a good use case for SQL on demand, or a serverless architecture? Before that, let me give a bit more insight into this specific COVID use case. This big data also helps us identify patterns and trends, and understand the relevance of the different variants of the virus; based on those trends, we can identify the risks that might materialize. So let's talk about why this data should be handled in a serverless manner, by looking at three scenarios that could be a perfect fit. The first is basic discovery and exploration. We are getting data from various channels: for COVID data, we get data from hospitals and from supply chain management, and governments are publishing data as well. So we have a number of platforms in which we can discover our data; those are the actual sources the data comes from.

And since we have multiple sources, the file formats and the data will also differ; we may receive CSV files, JSON files, or Parquet files. Can we have a system in which all this data lands in one place, like a data lake, and from there we can pick it up and extract insights? So that is the first scenario, where data comes from many places. The second scenario that could be a great fit is the logical data warehouse. By logical data warehouse I mean that as we store this data, we need to run queries and computational functions on it: can we use SQL-style query processing for that? If I need to apply some logic to filter information, can I use a platform or a language that my developers and engineers are already familiar with? One such language, when it comes to data, is SQL.
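As a rough sketch of that logical data warehouse idea on a Synapse serverless SQL pool, the common pattern is to wrap OPENROWSET in a view so analysts can query lake files with plain T-SQL. The storage URL below is the public pandemic data lake path used in Microsoft's samples, and the database and view names are placeholders, not the ones from our project.

```sql
-- Sketch of the logical data warehouse pattern on a serverless SQL pool.
-- The URL points at the public pandemic data lake; the database and view
-- names are placeholders, not the ones used in the actual project.
CREATE DATABASE CovidLdw;
GO
USE CovidLdw;
GO
CREATE VIEW dbo.DailyCases AS
SELECT *
FROM OPENROWSET(
        BULK 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.parquet',
        FORMAT = 'PARQUET'
     ) AS rows;
GO
-- Analysts can now treat the lake files like a relational table:
SELECT TOP 10 * FROM dbo.DailyCases;
```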

So can I use T-SQL on top of the data lake, extracting and querying the data with T-SQL, and have it work much the way we normally work with a relational database? That is another key requirement, where we also think about the skill set and the technologies we are already familiar with. The third scenario is data transformation. As I was saying, this data may be relevant to sectors beyond healthcare, like the supply chain, which is a good fit to use it. The data coming from various sources needs to be presented to specific stakeholders, and our stakeholders differ: a stakeholder may be an end user who just needs to look at the data; a stakeholder may be a data scientist who needs to apply machine learning algorithms to make future predictions; or a stakeholder may be an internal auditor tracking lineage and other information about where the data came from and how much processing has happened to it.

What were the various storage systems or intermediate tools we used? As the data moves through its stages, we need to see what kind of transformation has happened to it, whether simple or complex. Some systems may require only a simple transformation, so fewer resources are needed; but if I look at the COVID data, the transformation may be really complicated, because we have multiple stakeholders with very different requirements and, as I said, multiple types of input files as well. So how do we handle the resources for such complicated data transformation? Can I have something that scales up as my resource requirements grow, and automatically scales back down once my processing is done? A good example of auto-scale: if my data scientists run a machine learning algorithm on a specific data set, that obviously needs more resources, so while the query or algorithm is running it should scale up, and once the processing is done it should automatically scale down after waiting a couple of minutes, with that wait being configurable.

These are the things that tell us we need a system where we can have SQL on demand, or work in a serverless manner. Now let's talk specifically about the use case, which is the supply chain. We have multiple data sources and data sets available these days, essentially open data. There are a few commonly used public data sets, and apart from those we also have open data sources available in the Microsoft open data lake; I'll show you the one we used for one of our projects, which we were doing in the supply chain domain. There are other data sets published for public use as well, by the WHO and by the health ministries of various countries; we have open data from India as well as the US, plus data sets from other countries, and this data can be used for research and analysis. Now, the requirement from our customer was to build a model to track the distribution of critical medical supplies.

They wanted to check which hospitals might need specific medicines, specific medical equipment, or other resources that could help frontline workers. Beyond that, they also wanted to know that if a specific city is going through the pandemic or facing a surge, then retail in that area will be under pressure, and so will the food supply, and how they could handle that. So they wanted to identify the peak requirement for a specific geography and a specific timeline so that they could align the required resources. And since everyone was staying at home and working from home, they also wanted to use this data to see what other resources might be needed; by other resources I mean work-from-home equipment such as monitors or laptops, which might be needed in larger quantities in those areas, so supply could be diverted there. In this way the supply chain scenario went well beyond tracking individual orders and folded into multiple facets, and they wanted to leverage the data to analyze all of this so they could align the required resources for each specific area.

So now we had the basic requirements, and we wanted to lay down the information architecture strategy. As we know, before designing any technical solution we first need to gather all the information related to that area, and based on that we lay down the technical architecture. In the same spirit, for the diagram you see, we will walk through each component one by one. This is the overall information architecture we designed with our customer. You can see we have four different segments, divided broadly into two domains: one is where the demand for information comes from, and the other is how we are going to meet that demand and deliver the information.

Let's take them one by one. Our first domain was the stakeholders. We identified four different stakeholder roles that were going to work with the data. One, as you can see, was the data subjects and owners, who were simply looking at the data. For the IT person it was really critical: if the data volume grows, the load grows, and they need additional resources. Similarly, our data protection officer was very keen to know how the data would be handled across all the other phases. So on the same data set there were multiple stakeholders who wanted to use the data for their own business requirements, and we identified those four roles. Once we had identified them, what were the key things they needed? One is that since the data is used across multiple stakeholders, we were looking for a platform that would surface only the data that is right for each role; for example, lineage information may not be relevant to a data subject.

So can we handle data collaboration in such a way that only the data required for a specific role is visible to that role? Beyond that, how do we connect the right data to the right person with clear semantics? There were multiple data sources, and based on the roles and different job requirements, some data is visible to one person while other data is hidden from the rest. After that there was a requirement on the reporting side as well: different data sources mean different reporting requirements, and reporting was not restricted to one platform like Power BI; they wanted to use other reporting platforms too, so that the data sits in one place but reporting can happen elsewhere.

They also wanted to leverage the existing analytics features in a serverless manner, so that as the data volume grows they do not need to scale the reporting tools. That covers the user experience. Now, how do we meet those stakeholder and user experience requirements from the information layer? As you can see, one layer is focused purely on data security. They had a hard requirement that they could not compromise on data security: for patient data, they did not want the patient identifiers, billing details, and so on to be visible to everyone. Similarly, they had two other categories of operational data to handle, and likewise some data was meant only for the scientists, because that is the data requiring the heaviest processing. The last layer we listed is how the computation will be used: since we are talking about serverless, we should not have to worry about it, because it should auto-scale automatically. That was the platform requirement.

Now, based on these four layers, how did we lay down the technical architecture? To do that, we considered four main things. One, the pricing should not be high. Two, the source and destination of the data, because there were various sources, various platforms, and various file formats, and the source and destination of the data can change at any point in time. Three, the frequency and size of the data, which will also grow over time; with COVID data, at a given time, as the pandemic surges in a specific area and a wave hits, the volume goes up.

So the frequency of the data will be high and the size of the data will be high. Four, they were looking for convenience: it should be easy to configure, and my developers, my data scientists, and the data team currently handling the existing environment should not have to upskill much; there should be no big surprises in terms of the tools or technologies they need to learn. Based on those requirements, we laid down this architecture, in which we divided everything into four pieces. One is how data ingestion will happen; this focuses on incoming data from the various sources landing in a data lake or some other kind of storage. After that comes the data storage, which is big data storage; this could be a storage system on Azure, or other systems as well, such as Azure Cosmos DB or SQL-based data storage. Then, once we have stored the data, another technical requirement was preparing and training on the data: can I use Hadoop or Spark if required, and can I train my machine learning models on it?

And once that is done, can I feed the output into different apps and insights? For example, if someone is using the Power Platform and has already designed some apps and Power BI reports, can they access and leverage the output? So we divided all these business requirements into four technical components, and this is the solution we found for them. After a few discussions with the customer, we evaluated Azure Synapse Analytics, which has most of the features that were required, and in a moment I will demonstrate its key features, including working with Hadoop-style and Spark workloads. The key question was whether it could auto-scale as the data grew, and it handled the workload at every scale we tested. Beyond that, being a managed platform, the developer experience was something they could use as-is, and another key requirement was a platform that works with the kind of language, like SQL, that they were already using.

Now, with the serverless technology, Synapse also supports T-SQL, and on top of that we were able to leverage Spark and open source technologies. We were able to write notebooks that were shared across multiple users, and they collaborated easily with each other to share processing results. Finally, the output of all of this was shared with other platforms, where they were able to generate reports with a few open source tools as well as publish Power BI reports.

This is the tool we identified. Beyond this, they were also looking at other integration points, for example the Internet of Things, because they wanted to monitor their current tracking systems across their various logistics.

So this is the final tool that we used. Now let me demonstrate, and talk about a few things we designed for the customer. Azure Synapse Analytics was the Azure service we leveraged as the heart of the solution, alongside a few open source technologies and integrations with third-party solutions. As you can see, I have already created a demo environment; this is a Synapse workspace. How do we do this? This is portal.azure.com: if you need to set up the environment, you just look for Azure Synapse Analytics, and with a few configuration parameters you can set it up. After setup, this is the ready environment I'm going to demonstrate. In the Synapse workspace I have set up, by default it always creates a serverless SQL pool, named Built-in, that auto-scales; it always works in a consumption-based manner. You can see the Built-in pool here, and if required I can create a new dedicated pool in which I specify the desired size.

You can also see that the Built-in pool's size is shown as auto, meaning that if more resources are needed to process a query, it will scale up automatically. Meanwhile, my portal is open: if I need to create the resource, I just search for Azure Synapse Analytics, create a new one, and provide a few basic configuration settings. As a basic requirement, I need to select my subscription, so let me select one, and then I need to provide a resource group. The key question is, as data comes into the system, where do I store it? I have multiple options for this; by default it offers to save the data into a data lake, so I select an Azure Data Lake Storage Gen2 account, which I can either enter manually or pick from my subscription, and I need to provide the file system (container) where my data will go. This can also be used as an output location. Once I have provided that, it asks for the security settings, where I provide the SQL administrator username and password. With all this basic configuration it sets up an environment for me, a kind of workspace, and as I already mentioned, a serverless SQL pool is provisioned automatically; if required, I can also configure an Apache Spark pool.

The Spark pool I need to create manually, although we also have Terraform scripts for it if required. By default I have set the node size to Medium, and we can change it if needed. Now let's talk about the serverless piece we were exploring, which is the SQL pool: how can I use the SQL pool to handle queries? For this, you can see the workspace web URL; I have already opened it, so let me open the workspace. This is the place where I can see all the options we have already talked about, the four things we basically need to handle the complete architecture. One is the ingestion of the data, how it gets inserted. Another is the transformation we need to apply. You can also see how I can process the data: I have two options. Either I can use SQL, for example T-SQL with the serverless technology, or, if I want to leverage an existing notebook, I can process everything on Spark.

So Spark or Hadoop-style processing, that flexibility is there too. You can see I have the Data, Develop, and Integrate options; the Integrate option is about integrating with multiple platforms, for example getting data ingested from various systems. Let's focus on the main one, which is Develop. I will create a new SQL script, because through a SQL script I want to leverage my existing SQL skills. And where does the data come from? As I was showing you with the COVID data, these open data sets are available in the built-in gallery, so let me search for COVID. You can see the data lake is already there; we used this data lake because it is continuously updated. It comes as four different data sets: the Bing COVID-19 data set, the Oxford government response tracker, a data set tracking vaccinations, hospitalizations, and cases, and one from the European Centre for Disease Prevention and Control. We can use any of these data sets. Let me show you how we leveraged one of them in a query; first, I should be able to open and see the data from this particular data lake.

So let me go back to my workspace, where I have already created a script, and I will just write my query there. You can see that in this query I'm simply selecting the top 10 rows, and I'm reading a CSV file; you can see this is an existing storage account and it is public. Since it is public, no authentication is required to run it, but if you are going to use your own data set, you will need to provide the SQL credentials for it. Otherwise it is a normal SQL query: I have just used TOP 10 with SELECT, and the key function you see here is OPENROWSET. That is the critical function when you need to query CSV or other file formats. You can see I have already got the data back, showing the cases, the country, the geography ID, the date, and other columns. So now you can see I am able to connect to the data.
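The ad hoc query described here looks roughly like the sketch below; it is a reconstruction rather than the exact demo script, using the publicly readable pandemic data lake CSV, so no credential is needed.

```sql
-- Sketch of the first demo query: read the public ECDC CSV directly with
-- OPENROWSET. No credential is needed because the container is public.
SELECT TOP 10 *
FROM OPENROWSET(
        BULK 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',   -- parser version 2.0 is required for HEADER_ROW
        HEADER_ROW = TRUE
     ) AS rows;
```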

Now let's look at a few other things we can do, for example getting the average cases. Can I apply an aggregate and an ORDER BY? Our supply chain partner wanted to know the average cases in specific locations, so let's run a query that averages the cases by geography; here we had narrowed it down to one specific country. You can see I'm using the same OPENROWSET pattern, but now I have switched to the Parquet file. Once I run it, you can see I'm getting the data by date, and there is another nice feature: I can view the time series chart on the same screen. So if I don't want to use a separate UI tool just for data presentation, I can do it here; we were analyzing the data and saving the chart as an image at the same time. These are a few of the basic functions we were using, and after that we also computed cumulative cases.
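A sketch of that per-geography aggregation over the Parquet copy of the same data set is shown below; the column names (cases, geo_id, countries_and_territories) are assumptions based on the public ECDC schema, and the commented WHERE clause shows how it could be narrowed to a single geography.

```sql
-- Average daily cases per geography, read from the Parquet file. Column
-- names are assumptions based on the public ECDC schema and may differ.
SELECT
    geo_id,
    countries_and_territories AS country,
    AVG(CAST(cases AS float))  AS avg_daily_cases
FROM OPENROWSET(
        BULK 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.parquet',
        FORMAT = 'PARQUET'
     ) AS rows
-- WHERE geo_id = 'RS'          -- hypothetical filter for a single geography
GROUP BY geo_id, countries_and_territories
ORDER BY avg_daily_cases DESC;
```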

Let me also show a scenario in which we ran a heavier, machine learning-style workload. The customer wanted to experiment with industry-standard algorithms, and as one example we used this data to estimate the value of pi. For that, we were pulling data from the same data set and deriving the estimate from the deaths and the number of cases. Let me show you that more complicated query. You can see it is still connected to the Built-in pool; I'm not going to change any resource settings, it should scale up on its own. Let me paste the query. First I pull data from my data lake storage, and based on geography I select the geography ID and the countries and territories, and compute the average number of deaths and cases. On top of that I apply a few formulas to get the value of pi, and I apply a join between two data sets: one of them is where I refine the data, as per the data quality requirement, by dropping rows where the counts are very low, roughly where deaths are below 10 and cases are below 100.

Then I calculate the value of pi. If I run this, it takes a few seconds, but I do not need to change any resource settings; it scales up and down automatically. On top of that, we also ran a few more complicated machine learning algorithms; the longest run we saw took around five minutes, and we got the results without scaling anything up or down ourselves. It also did not drive up the cost, because after being idle for five minutes it scaled down automatically. You can see that after about 18 seconds I got the result, and this is my value of pi. In this way we were able to handle heavy processing on a large volume of data without changing the resource requirements; a rough sketch of the shape of that query is shown below.
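The sketch only reproduces the shape that was described, aggregation per geography, a low-count data quality filter, and a join; the extra arithmetic the demo layered on top to approximate pi is not reproduced here.

```sql
-- Shape of the heavier query described above: aggregate, filter out
-- low-count geographies (data quality step), then join. The formulas used
-- in the demo to approximate pi are intentionally omitted.
WITH per_country AS (
    SELECT
        geo_id,
        countries_and_territories AS country,
        AVG(CAST(cases  AS float)) AS avg_cases,
        AVG(CAST(deaths AS float)) AS avg_deaths
    FROM OPENROWSET(
            BULK 'https://pandemicdatalake.blob.core.windows.net/public/curated/covid-19/ecdc_cases/latest/ecdc_cases.parquet',
            FORMAT = 'PARQUET'
         ) AS rows
    GROUP BY geo_id, countries_and_territories
),
filtered AS (
    -- Keep only geographies with meaningful volumes (roughly the filter
    -- mentioned in the talk: very low death and case counts are dropped).
    SELECT geo_id
    FROM per_country
    WHERE avg_deaths >= 10 AND avg_cases >= 100
)
SELECT p.country, p.avg_cases, p.avg_deaths
FROM per_country AS p
INNER JOIN filtered AS f
    ON p.geo_id = f.geo_id
ORDER BY p.avg_cases DESC;
```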

On top of that, there was no technology upskilling needed: I'm just using SQL queries. Fundamentally we were using T-SQL, and in the same way, if I need a Jupyter-style notebook, I can have one in Synapse; if I want to reuse a Spark job I have already created, that is perfectly acceptable and can be done with a few clicks. In this way we were able to use Synapse, and later, as the requirement came up, we connected it to Power BI reporting. Now let's talk about a few best practices we followed. First, since we need to handle large data, it is always recommended to convert CSV and JSON files into Parquet files. Second, if you still want to use CSV files, it is recommended to break them down into smaller pieces.

One file should not exceed 10 GB; that is the current recommendation. Also, for the OPENROWSET query I showed you, if you reference multiple files in one OPENROWSET call, it is recommended that they be of roughly equal size. The engine evaluates the path from the top levels down, so if you are going to use wildcard characters, place them at the lower levels of the path. There are a few other recommended practices like these on the development side. Beyond development, there are recommendations for how you set up the environment: for example, if you are going to use Azure Cosmos DB or Power BI, it is recommended to place those resources in the same region you have already chosen for Synapse.
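Coming back to the first recommendation, converting CSV to Parquet is typically a one-off step done with a CETAS (CREATE EXTERNAL TABLE AS SELECT) statement on the serverless pool; the sketch below uses placeholder storage, container, and folder names, and it has to run in a user database with an identity that can write to the target location.

```sql
-- Sketch of a one-off CSV-to-Parquet conversion with CETAS on the serverless
-- SQL pool. Storage account, container, and folder names are placeholders;
-- run this in a user database with write access to the target folder.
CREATE EXTERNAL DATA SOURCE MyLake
WITH (LOCATION = 'https://<storageaccount>.dfs.core.windows.net/<container>');
GO
CREATE EXTERNAL FILE FORMAT ParquetFormat
WITH (FORMAT_TYPE = PARQUET);
GO
CREATE EXTERNAL TABLE dbo.CasesParquet
WITH (
    LOCATION    = 'curated/covid-cases/',   -- Parquet output folder
    DATA_SOURCE = MyLake,
    FILE_FORMAT = ParquetFormat
)
AS
SELECT *
FROM OPENROWSET(
        BULK 'raw/covid-cases/*.csv',       -- wildcard kept at the lowest path level
        DATA_SOURCE = 'MyLake',
        FORMAT = 'CSV', PARSER_VERSION = '2.0', HEADER_ROW = TRUE
     ) AS rows;
```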

And if you are going to integrate with third-party systems, it is recommended to set up a dedicated pipeline so that you do not compromise on performance. Those are a few best practices aimed at developers. Beyond that, what benefits were we able to identify? One, since this is fully serverless, as I showed you when running the complex queries, I did not have to change any resources, so it was very simple in terms of management, and there was no upfront requirement for a minimum number of resources. As I showed you, the serverless SQL pool handled scaling automatically, whereas for the Spark pool, which I had configured without autoscale, I had to specify the number of nodes myself. With serverless there was no upfront commitment of resources, which means we reduce the overall operational cost. And since this runs on the Azure platform, it inherits most of the security capabilities required by geography, industry, and so on.

And finally, we did not compromise on performance: as you saw, I was able to execute the complex query in a matter of seconds. Those are the major benefits we were able to achieve, and our customer was able to roll out the application, initially for specific regions, and they are going to expand it to the others soon. That's all about the benefits of using this platform and its integration with others. Now, if you have any queries or questions, we can take them up. Thank you so much.