Unveiling the Power of Azure AI: Innovations and Applications
Hello all! I'm Sabita Mittal, a Senior Cloud Solutions Architect at Microsoft USA. Today, I'll dive into the power of Azure AI: its uses, innovations, and applications. Wherever we have data, we can apply AI; it's a significant part of our digital shift in technology. With AI, we can make automation more accessible and create smarter, more intuitive applications.
Azure AI’s Flair
Azure AI services are an easily integrated part of the Microsoft Azure cloud platform, designed to make the inclusion of AI in applications as straightforward as possible. No extensive expertise in AI is required. Azure AI offers a myriad of valuable solutions, like Microsoft 365 Copilot, Dynamics 365, and various other partner solutions.
Using Azure's low-code applications, such as Power BI, Power Apps, Power Automate, and Power Virtual Agents, we can create solutions faster and more easily. Azure also offers scenario-based services like Bot Service, AI Search, Document Intelligence, Video Indexer, and Metrics Advisor.
AI’s Extensive Offerings
AI Models
- Vision services - for optical character recognition (OCR) on images and videos.
- Speech service - transforming audio files into text and vice versa.
- Language understanding - interpreting the intent and meaning of natural-language text, a building block for many machine learning scenarios.
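To make the vision bullet concrete, here is a minimal sketch of assembling a REST call to the Azure AI Vision image-analysis (Read/OCR) endpoint. The resource endpoint, key, and image URL are placeholders, and the API version shown is an assumption; substitute the values from your own Azure AI Vision resource.

```python
# Sketch: building an Azure AI Vision OCR (Read) request.
# Endpoint, key, and image URL below are placeholders, not real resources.
import json

def build_ocr_request(endpoint: str, key: str, image_url: str) -> dict:
    """Return the pieces of an OCR REST call (url, headers, body)."""
    return {
        "url": f"{endpoint}/computervision/imageanalysis:analyze"
               "?features=read&api-version=2023-10-01",
        "headers": {
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        "body": json.dumps({"url": image_url}),
    }

req = build_ocr_request(
    "https://my-vision-resource.cognitiveservices.azure.com",  # placeholder
    "<api-key>",
    "https://example.com/receipt.png",
)
print(req["url"])
```

Posting `req["body"]` to `req["url"]` with those headers would return the recognized text lines as JSON, which downstream services can then index or query.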
Then there’s Microsoft’s Copilot for web and Copilot for work, offering chat capabilities with the public internet and Office 365 apps respectively. Copilot also integrates with Microsoft Word, Excel, Teams, and PowerPoint to make your tasks more interactive and easier.
Diving into Custom Creation with Azure AI Studio
Azure AI Studio provides all the AI services like vision, speech, and OpenAI models, which can be used to create custom packages for unique development needs. The Azure OpenAI service (openai.azure.com) offers a GPT model playground where you can create your own chat experience.
Furthermore, Azure AI enables you to create an enterprise-level chatbot, which involves landing your data, whether it’s audio, video, or a simple PDF, in blob storage. From there, various AI services come into play to break down and convert your data into smaller, more manageable chunks ready for querying via web app services.
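The "smaller, more manageable chunks" step can be sketched with a simple character-based splitter. Real ingestion pipelines (for example, Azure AI Search skillsets) are more sophisticated; the sizes here are illustrative only.

```python
# Sketch: splitting a long document into overlapping chunks before indexing.
# Chunk sizes are illustrative, not tuned values.
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks with a small overlap
    so sentences straddling a boundary still appear intact in one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]

doc = "Azure AI turns raw files into searchable knowledge. " * 40
pieces = chunk_text(doc, chunk_size=200, overlap=20)
print(len(pieces), len(pieces[0]))
```

Each chunk then becomes one searchable unit, so a query only has to surface the few pieces relevant to the question rather than the whole document.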
GPT 4 Vision: Taking AI to a New Level
GPT 4 Vision, available through Azure OpenAI, is a revolutionary image-based model. With Vision, you can input an image or video, and with a text prompt, get a textual output. Essentially, you can analyze your visual aids and ask specific questions about them, all thanks to GPT 4 Vision.
For example, in police departments where recording video is an everyday occurrence, GPT 4 Vision can allow officers to ask questions about the video files without needing to manually review the footage.
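A hedged sketch of what such a request can look like: the chat-completions message format below mixes a text prompt with an image URL in a single user message. The question and frame URL are invented placeholders, not from the source.

```python
# Sketch: the shape of a GPT-4 Vision chat request pairing a text prompt
# with an image URL. The question and URL are placeholder examples.
import json

def build_vision_message(question: str, image_url: str) -> dict:
    """One user message mixing text and image content parts."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

payload = {
    "messages": [build_vision_message(
        "List any vehicles visible in this frame and their colors.",
        "https://example.com/bodycam-frame-0042.jpg",  # placeholder frame
    )],
    "max_tokens": 300,
}
print(json.dumps(payload, indent=2)[:60])
```

For video, a common approach is to sample frames and send each frame through the same message shape, then aggregate the answers.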
Building Confidence in Azure AI
Microsoft ensures that your data doesn’t get used to train any of their models. However, data protection, encryption, and security are still primarily your responsibility. Microsoft encourages applying role-based access models, enforcing multifactor authentication, and regularly reviewing and updating permissions for improved security. Compliance and governance are also the user's responsibility.
Your Power, Amplified
So, how can OpenAI be leveraged on your data? By creating an app interface to chat with your data, you can connect your OpenAI model to any data source—be it OneDrive, Cosmos DB, Oracle, SQL, or Amazon S3. You can use OpenAI as a powerful tool to query your data sources effectively, ensuring you get the insights you need when you need them.
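As a toy illustration of that "chat with your data" pattern, the sketch below retrieves the most relevant stored chunks with naive keyword overlap and builds the prompt an OpenAI model would receive. A production system would use embeddings and Azure AI Search instead; the sample knowledge base is invented for the example.

```python
# Sketch: minimal retrieve-then-prompt loop over an in-memory knowledge base.
def retrieve(query: str, chunks: list[str], top_k: int = 2) -> list[str]:
    """Rank chunks by shared lowercase words with the query (toy scoring)."""
    q = set(query.lower().split())
    scored = sorted(chunks, key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:top_k]

def build_prompt(query: str, chunks: list[str]) -> str:
    """Assemble the grounded prompt a chat model would receive."""
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

kb = [
    "Invoices are stored in Cosmos DB and archived monthly.",
    "The sales team uses OneDrive for contract drafts.",
    "S3 buckets hold raw telemetry from field devices.",
]
prompt = build_prompt("Where are invoices stored?", kb)
print(prompt.splitlines()[-1])
```

The same loop works regardless of where the chunks originated: OneDrive, Cosmos DB, SQL, or S3, as long as an ingestion step has landed them as text.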
Moreover, you can ensure the safety of your content with Azure AI content safety and abuse monitoring. This AI tool helps to classify harmful content into four categories—hate, sexual, self-harm, and violence—allowing you to block inappropriate content promptly.
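A minimal sketch of turning those category scores into a blocking decision. The 0/2/4/6 severity scale matches what the talk describes for the text API, but the per-category threshold is a policy choice of this example, not a service default.

```python
# Sketch: applying a blocking policy to Azure AI Content Safety results.
# The service scores each category with a severity (0, 2, 4, or 6 on the
# default text scale); the threshold below is our own illustrative policy.
CATEGORIES = ("hate", "sexual", "self_harm", "violence")

def should_block(severities: dict[str, int], threshold: int = 2) -> bool:
    """Block the prompt or response if any category meets the threshold."""
    return any(severities.get(cat, 0) >= threshold for cat in CATEGORIES)

result = {"hate": 0, "sexual": 0, "self_harm": 0, "violence": 4}
print(should_block(result))  # a violence severity of 4 trips the policy
```

The same check can gate all three stages the talk mentions: the incoming prompt, the ingested source data, and the generated response.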
Conclusion
In conclusion, Azure AI grants you the power to deeply integrate AI into your applications, thereby improving their capabilities. Offering a low-code platform and many partner solutions, Microsoft Azure can turn data into insights quickly and efficiently. With security measures in place and a commitment to user privacy, Azure AI is a significant force in the evolution of AI technology. Now, go reap the benefits of this technological marvel—with your data, by your rules.
Video Transcription
Hello, everyone. Can you guys hear me? K. Great. Hello. This is Sabita Mittal, senior cloud solutions architect. I work at Microsoft USA. I'm based out of California. Today, I'm going to present Unveiling the Power of Azure AI: Innovations and Applications. We hear AI everywhere, and let's see how we can make a difference. Data is the fuel that powers AI. Basically, wherever we have data, we can apply AI everywhere. It's a digital shift in the technology where we can create pretty much every application by the use of AI. We can gain quick insights. Those days are gone when we used to write complex queries. Now everything can be accomplished via AI, and we can create automated applications. And look at this slide. This is the whole Microsoft Azure AI, basically, where we have the collection of AI services offered by Microsoft as a part of its Azure cloud platform.
The Azure AI services are designed to make it easier for developers and organizations to add AI capabilities to their applications without the need for extensive expertise in AI. As you can see in these applications, we have Microsoft 365 Copilot, Dynamics 365, and various partner solutions. We have these low code, no code applications, which is Power BI, Power Apps, Power Automate, and Power Virtual Agents, and then we have scenario based services like bot service, AI search, document intelligence, video indexer, metrics advisor. And then we have AI models like vision services, like how to do OCR on the images and the videos. Then we have the speech service, how we can leverage the speech service to transform audio files into text or to transform text files into an audio file, and then we have language understanding and then machine learning. We have various components available within Microsoft like Copilot for web and Copilot for work.
Basically, Copilot for web and work means Copilot for web is a similar replica of ChatGPT. Basically, it is an official version from Microsoft where you can chat with the public Internet. And then Copilot for work, basically, once you enable the toggle, then you are able to chat with Office 365 apps, whatever the account you have assigned to, like SharePoint files. You can talk to Teams channels. You can query your OneDrive files, and you can query the organization structure. Then we have the integrated Microsoft Office 365 Copilot. Basically, Copilot is at your service in your Microsoft Word, in your Excel, in your Teams, in your PowerPoint presentation. Basically, you can chat with your content within those integrated applications. Then you have Copilot Studio.
Basically, you can create your own custom Copilot, basically, custom chatbots with your Office 365 data. It is less flexible, but, basically, you have a set of powerful AI models that can be used for various AI tasks. You can spin up your quick Copilot without creating a lot of custom Copilot code. You do not have the flexibility to choose your own models, but you can just provide a URL of your SharePoint website, or you can quickly spin up some web files and chat with your data without a lot of customizations. And then we have Azure AI Studio, with pretty much all the AI services, vision and video and speech and OpenAI models and AI search, and you can create your own custom packages and download those packages and do the custom development. And then you have the Azure OpenAI service, at openai.azure.com, where you have a GPT models playground. It has PTU and PayGo pricing models.
You can create your own chat playground and chat with your data. And once you feel trust in the technology, then you can import the package and export the package and create custom applications. This is an enterprise level chatbot, basically, and we have different components in this chatbot. As you can see, we have a blob storage where we land all the data, whether it is a simple PDF file or whether it's an audio file or video file. Whatever the type of data you deal with, you can put the data in blob storage. And from there, you have AI search. Basically, it crawls each document, whatever the file type, and breaks it down into smaller chunks.
And then you have a web app service where you can chat with the data that is there in the blob storage, and then you have this bunch of AI services as well. Depending upon the file type, if you're dealing with an audio file, then you are converting the audio file into plain text. Or if you are dealing with video files, then you are converting the video file into plain text. And if you have a translation need, then you are leveraging the translator. And then you have the OpenAI LLM model to chat with your data via the web app service. As you can see on the right, there are lots of components that go into this enterprise level chatbot journey, and there are various pricing models also available. Basically, you can chat with your own data by having your own copilot and own chatbot, and you can deploy this on your website, on your intranet, or wherever you feel like on the web.
These are the different models available within Azure OpenAI Services, like GPT 4, GPT 4 Turbo, GPT 3.5 Turbo, and GPT 4 Vision as well. So GPT has the various text based models. Basically, you can do the summarization, classification, and GPT 4 Vision is basically an image based model where you can give an image as an input plus your text prompt, and you can get the response back as text as well. Basically, you can chat with your images, and you can chat with your video files as well. So GPT 4 Vision takes image and video files as an input, and then you can ask questions. So let's take an example of a police department, where they have body cameras and they record videos all day.
So at the end of the day, they can ask questions out of those video files without someone manually watching those videos and identifying the people in them, identifying the situation, what was the reason. So GPT 4 Vision is very helpful in that. So all these models you can deploy on your data. You have various pricing models available like provisioned throughput units, basically carrying an SLA, and then you have various pay as you go models as well for pricing. And you can create your own assistants and functions, and various plugins are also available to create custom applications using one of these models, basically. You can have confidence when you are using your Azure OpenAI service. Basically, your data is your data. Microsoft does not use your data, basically, to train any of these models, and you can completely trust OpenAI models the way you trust any of the cloud services, but security is still your responsibility as well.
Basically, like, data protection is yours. Like, you have to apply encryption. You have to apply confidential computing in order to protect your data. Even the identity and access management, basically, you have to leverage Azure Active Directory to properly secure your data. You have to apply a role based access model, basically, for the users, so that some users cannot access these kinds of files and some users cannot access those kinds of files, and you have to apply those security groups and role based access. And then multifactor authentication, and you have to regularly educate and train your users, basically, to make sure they are using the technology in the best possible way. You have to regularly review and update permissions as well because new users keep joining and there are different levels of permission, and then network security as well, like Azure VNET and DDoS protection.
And then compliance and governance is also the user's responsibility, to apply those compliance certifications, basically, and then various Azure policies and threat detection and monitoring and Security Center and Azure Monitor. So, basically, security considerations are still yours, but Microsoft does not consume any of your data to train any of the models. Then we have DALL-E 3, which is an image based model as well. You can create your own images, basically. You can apply a text prompt, like give me an image for the fashion industry, or you want to design a campaign. Whatever you want to do, create any creative content, basically, wherever you have an image need, you can use the DALL-E 3 model to create your own images without worrying about copyright. So how can you leverage the OpenAI service on your data? Basically, you need an app interface to chat with your data, and then you can leverage multiple data sources.
So your data can be in your OneDrive or in Cosmos DB or Oracle or SQL or Amazon S3 or GCP, wherever your data source is. So, basically, you can connect your OpenAI LLM model to any data, basically, and you can have additional data sources as well, some third party or SaaS based platform. Then you have to consume different integration platforms in order to feed the data to OpenAI, and then you can have an app interface. And OpenAI will serve your queries, basically, querying your data sources. The next one is Azure AI content safety and abuse monitoring. It appears to be a powerful solution for creating safer online spaces by leveraging cutting edge AI models. As you can see on the screen, once you apply Azure AI content safety, it classifies the harmful content into 4 categories: hate, sexual, self harm, and violence.
Once you apply content safety, you also get the severities for the categories. Like, for hate, you will get 0, 2, 4, or 6, and similarly for the other 3 categories as well. So based on that, you can block the questions and block the prompt, or block the source ingestion data, and block the responses. So if someone is giving a question, then you can apply the content safety endpoint. And once you get the response as 0, 2, 4, or 6, you can apply your own benchmark: okay, if I get the response as 2, please block the question with, sorry, we cannot accept your prompt. And the same thing goes for the response, because we cannot control the response once the data is fed in. So based on the response and these content safety categories, you can block the content. See this example.
The input is "painfully twist arm then punch him in the face." So this is a bit of inappropriate content. Then as you can see on the right in the output box, the violence category is 4. So based on that, you can block the content: sorry, we cannot accept. Maybe this is coming as an image. Someone is hitting someone in the image, or some violence category is detected in the image or in the video or in the plain text. So you can take the necessary actions for how you want to process that data and how you want to consume the output. The next one is GPT 4 Vision. As I mentioned, GPT 4 Vision takes a text prompt plus images as input, and you get text output. And these are the top use cases.
Basically, you can have a conversational model. You can have code generation and documentation. Once you have some code documentation, then you can ask questions. You can have questions on the sales cycle. You can do a lot of document processing. You can do fraud detection in the documents. Once you crawl the data, then you can identify fraud. You can have supply chain management. Intelligent contact center, basically, wherever communication is sort of key and the feedback is there, you can ask questions. So, basically, with the LLM, you have a virtual assistant available everywhere, basically. And call analytics: wherever, like, let's say, someone is filing a claim, you can do the call analytics. What kind of a claim is that?
Is that an auto claim, house insurance? What is the speaker talking about, what is the claim issue, what is the sentiment of it? Lots of possibilities in the use cases. Let's go to what to do next, basically. These are some sample architectures you can build, basically. A typical scenario of a call center: whatever the type of the audio is, you can save those audio files into your Azure storage, then leverage the speech to text service, and then what? Before OpenAI, we were stopping right here, because now OpenAI can leverage, basically, the powerful LLM models to gain quick insights. And then what? Once you get the response, you can feed the data to Power BI to do the quick dashboarding, or feed the data to CRM, or some third party web services can consume the response. And the same thing with documents. Documents are everywhere, basically, and we can consume the documents and apply AI search. And then what?
We can ask questions from our documents. Maybe let's talk about the fire reports every week. Do they have a fire report? What was the biggest size of the fire? What was the case number? What was the location? How do I identify the objects? And then what? You can feed the response to Power BI or consume it in the web application. Questions? And let's see the demo. Please confirm if you can see my demo screen. K? This is a very powerful Power Platform application. Basically, it showcases how Azure AI can be used to extract the data from various file formats and how the workflow can be created to leverage OpenAI for various common scenarios. Overall, the understanding is we deal with unstructured data all the time. How can I ingest the data using a low code, no code platform without any traditional code development?
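The call-center flow just described (audio to text, text to LLM insight, insight to dashboard) can be sketched as a tiny pipeline. The transcribe and insight functions below are stand-ins for the Azure Speech and Azure OpenAI calls a real system would make, not real SDK code.

```python
# Sketch of the call-center flow: audio -> text -> LLM insight -> dashboard row.
def transcribe(audio_blob: bytes) -> str:
    """Stand-in for Azure speech-to-text."""
    return audio_blob.decode("utf-8")  # pretend the bytes are the transcript

def gain_insight(transcript: str) -> dict:
    """Stand-in for an Azure OpenAI summarization call."""
    return {"summary": transcript[:40], "length": len(transcript)}

def to_dashboard_row(insight: dict) -> tuple:
    """Shape the insight for a Power BI / CRM style consumer."""
    return (insight["summary"], insight["length"])

row = to_dashboard_row(gain_insight(transcribe(b"Caller asks about an auto claim.")))
print(row)
```

Swapping the document scenario in means only the first stage changes (AI Search or OCR instead of speech-to-text); the insight and dashboard stages stay the same.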
I can leverage Power Apps, which is a component of Power Platform. It's UI based, and this is the Power Apps I'm using to create this complex application. And then behind the scenes, I have Power Automate. Basically, the reason I'm using Power Automate here is to extract the text and apply OpenAI. So, basically, if you're dealing with an audio file, I am ingesting the data, basically. You can do this behind the scenes as well. I'm leveraging Power Automate to ingest the audio file, and then what? Because I'm dealing with an audio file, I have to leverage the speech service to transcribe the speech file. And then what? Before OpenAI, we were stopping right here. Then similarly, the video service. So if I'm giving a video file, leveraging Video Indexer, if it is a 3 hour long video file, I am getting the plain text of the whole 3 hours.
As it is, same thing with audio. I'm getting the plain text out of the audio file as it is, and same thing with PDF. It is a 224 page long document, and leveraging the Azure Vision service, I'm getting the plain text as it is. So then what? OpenAI is used to gain insights out of that long data, basically. Then we can ask questions. And that's not the end of it. Once you get the data out of OpenAI, basically, then what? You want to save that data into some SharePoint list for lookup, or maybe save it as a JSON document into blob storage, or maybe save the data to SQL, or maybe do further lookup. And let's see how. So I have multiple demos here. The Azure OpenAI Studio playground, basically, how the playground works with the different models. But let's see something else.
A customer feedback dashboard. The typical scenario is I get an email, I get feedback in the email as an audio file. I leverage the speech service to transform each email attachment audio file into plain text. And then what? I do not want to have a chatbot for a customer scenario here where I get 100 emails every day. I do not want to have a chatbot to ask questions out of that. What was the email about? What is the customer trying to say? What is the issue? What is the sentiment? Basically, I have those 10 plus questions to be asked from that data every day from each file. So I do not want a chatbot experience. Again, I want to repeat, OpenAI is not about having a chatbot. Basically, the idea is to gain insights out of my data.
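That fixed-question pattern, the same question set run over every file rather than an interactive chatbot, might be sketched like this. `ask_llm` is a stand-in for a real Azure OpenAI call, and the question list is an illustrative subset, not the exact set used in the demo.

```python
# Sketch: run a fixed question set over every transcript (batch insights,
# not a chat UI). ask_llm fakes answers so the loop is runnable as-is.
QUESTIONS = [
    "What is the sentiment type?",
    "What is the reason for the email?",
    "What is the summary?",
    "What should the solution back to the customer be?",
]

def ask_llm(question: str, transcript: str) -> str:
    """Placeholder for a real chat-completions call over the transcript."""
    return f"[answer to: {question}]"

def analyze_batch(transcripts: dict[str, str]) -> dict[str, dict[str, str]]:
    """One row per file, one column per question: a dashboard-ready table."""
    return {
        name: {q: ask_llm(q, text) for q in QUESTIONS}
        for name, text in transcripts.items()
    }

rows = analyze_batch({"email_001.wav": "Customer complains about a travel refund."})
print(len(rows["email_001.wav"]))
```

Each resulting row is exactly what the dashboard described next renders: one file, its transcript, and the model's answer to every standing question.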
So imagine each row is a file, an audio file, and let's see what I have done with it. And look at that. This is the original email with the original audio file. I transcribed it into plain text. The customer is complaining about a travel company. So then, behind the scenes, I asked these 6 questions. You can have your own set of questions depending upon the business scope. Instead of 6 questions, you can even have 16 or 60 or 600 depending upon the business scope. But out of my original file, I'm asking, hey, what is the sentiment type? What is the reason? What is the result? What is the summary? And what should be my solution back to the customer? Imagine the amount of time I'm spending reading this original content versus quickly looking at these responses via the OpenAI LLM model. Look at that.
Oh, this is anger and frustration. Oh, mark it as unresolved or follow-up. And then this is out of my queue. Back to the dashboard, and let's check this one. Oh, this is a positive one. I can mark it as closed. I do not have to take any action here. Or, basically, you can extend this experience to further actions depending upon your business scope. Maybe once you mark it as closed, then send an email to the customer with a thank you email. Once you mark it as pending, then send an email to the escalation team. Once you mark it as unresolved, then it goes to the archive. So depending upon the business scope, a lot of things can be done. And let's check another demo, which is my GPT 4 Vision.
How can GPT 4 Vision be leveraged? I have my pothole dataset, basically. The city takes pothole images. What is going on with an individual pothole, and how can we take the necessary actions, basically? Let's see. So I got my pothole dataset uploaded to my blob storage, and then I ask questions. Basically, describe the image. So look at that. The image displays this damaged road, and so on. So imagine that the city gets too many images. Basically, traffic violations, the pothole dataset, wildfire images, and CCTV cameras. So all those use cases are there, and we accumulate tons of images and videos every day.
So then, basically, we can apply the Azure GPT 4 Vision endpoint on each image or each video file and gain insights quickly, and then use these details to do some lookup and take the necessary actions the way we saw in the dashboard. Then let's look at the next demo, which is my video file. How to gain insights out of the video file. So this is basically a video. Thank you for calling Coaching Gals. My name is this. What is the issue? And then, Gals, my name is Sam. How can I help you? And see, this is transcribed as it is, basically. And then what? I want to ask questions from OpenAI. What is the customer issue? Look at that. Immediately, I gain the insight. The issue is about the refund of a coat due to the wrong size. I don't have to go through the original video and the original content and then ask questions. What is the customer phone number?
And then I can ask questions to do the necessary action, maybe have a few buttons here: send an email, send a message, send to the escalation team, or attach another workflow to it. Similarly, I can extract text for another call center scenario. Oh, Martha Flores. Okay. How may I assist you? So, basically, we can gain quick insights. So if you look at that, you see how we can quickly tie different services together: because I was dealing with a video scenario, I did it behind the scenes via my Power Automate and then leveraged the Video Indexer service. That's the plain text you saw. And then I can ask questions of OpenAI, and then I can have a few more buttons here to save the data back into my data storage to be consumed by another application. And let's look at another demo, which is my content safety demo, which I covered. Sorry about this text.
This is already extreme text because I wanted to generate some extreme categories out of content safety. So I uploaded a PDF file, and the PDF file has some extreme content, which qualifies under content safety. And as I saw content violence 4, I said, sorry, the content is not acceptable for chat. For this, basically, to parse the PDF, I leverage the Azure Computer Vision Read API. So, basically, I uploaded my PDF, pressed the button, extracted text, the Azure Computer Vision Read API got triggered, got the plain text out of the PDF, and then what? I'm seeing generate insight, but my generate insight button got disabled because we detected the category violence equals 4.
So this is very helpful for public safety or social media or anywhere we are dealing with content that can be harmful. And so, basically, to have a digitally safer online experience for my audience, for my application audience, we should definitely use content safety. And then let's look at another demo. Same thing I'm doing with an audio file. I'm just showing the demo here: upload an audio file, extract plain text, and ask questions. You can have a similar experience here behind the scenes in an automated way, and I can show you with my flows. All those actions I showed, I am doing with the flows here, basically, with the endpoints.
So let's say the video file. I can show you, basically, the video file extraction is getting done all via HTTP endpoints and responding back to Power Apps. Similarly, I'm doing the audio here. The GPT 4 Vision, we saw with the pothole dataset, basically. And then everything is getting done. I'm getting the list of blobs, applying my prompt to each, and then what? Here is my GPT 4 Vision getting triggered, sending the response back, creating the item in SharePoint, and then what? That's what we see here in the GPT 4 Vision batch. All this data is coming as soon as this URL is given. It has tons of images in it. My Power Automate workflow is calling the GPT 4 Vision endpoint, running on each image, basically, and getting the details back.
So that's how you can see, basically, we are doing a low code, no code version of Azure AI applications, and we can create pretty much all the applications with low code, no code, leveraging Azure AI services. And once we want to chat with our data, basically, once we gain insights out of the images and the videos and the PDFs, we want to go further. But I also want to chat, because we are dealing with lots and lots of data.