AI Governance: Balancing Progress and Responsibility

Sharmila Devi
AI Consultant
Automatic Summary

AI Governance: Striking a Balance Between Progress and Responsibility

Technology has made great strides in the past few years, with Artificial Intelligence (AI) leading the way. Yet, despite its numerous benefits, recent events and expert opinions show that AI needs to be regulated and governed to mitigate potential risks.

So, buckle up for an engaging rundown of the highs and lows of AI and the dimensions of responsible AI, narrated by AI consultant Sharmila Devi from Google.

The Highs and Lows of AI

AI has become an integral part of our daily lives. Whether you're scrolling through YouTube recommendations or selecting movies on Netflix, AI underpins these processes. Its influence extends across many industries, simplifying tasks and making life easier for humans.

But comprehensive and responsible AI doesn't only entail high accuracy and usefulness – multiple factors come into play, such as ethical considerations and the risks associated with purely AI-driven systems. One of the recent major advancements is Artificial General Intelligence (AGI), which aims at a single generic model that handles a variety of tasks across different use cases.

However, it's also essential to consider the lows of AI. For instance, a Twitter bot released to the public once turned abusive within 24 hours of being exposed to user inputs. This scenario is an unmistakable sign that unrestricted AI systems can pose substantial risks.

Risks of AI

AI isn't flawless, and anticipating its risks is just as important as harnessing its powers. Risks include accuracy errors, bias, lack of transparency, cyberattacks, and models learning harmful behaviour from their users. Notable figures such as Elon Musk and Stephen Hawking have repeatedly stressed the need to regulate AI to mitigate these risks.

Dimensions of Responsible AI

Responsible AI revolves around five key dimensions:

  1. Bias and fairness: The focus here is on the dataset used to train the model. The data should be free of bias to avoid skewed results (see the fairness-check sketch after this list).
  2. Privacy and security: Striking a balance between usable data and privacy is crucial. Training requires data, but it should not infringe on the privacy of the users.
  3. Robustness: AI models should be able to withstand diverse and potentially adversarial testing scenarios.
  4. Transparency: Users should be able to see how the AI model arrived at specific predictions or decisions.
  5. Explainability: Closely related to transparency, the model should be able to explain why it made a particular decision or prediction, both for individual cases and at scale.
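
To make the bias and fairness dimension concrete, here is a minimal sketch of one common check, the demographic parity gap, which compares the positive-prediction rate across groups. The column names, toy data, and the 0.1 review threshold are illustrative assumptions, not something prescribed in the talk.

    # Minimal fairness check: demographic parity gap across groups.
    import pandas as pd

    def demographic_parity_gap(df: pd.DataFrame,
                               group_col: str = "gender",
                               pred_col: str = "approved") -> float:
        """Largest difference in positive-prediction rate between any two groups."""
        rates = df.groupby(group_col)[pred_col].mean()
        return float(rates.max() - rates.min())

    # Toy example: hypothetical loan-approval predictions for two groups.
    predictions = pd.DataFrame({
        "gender":   ["F", "F", "F", "M", "M", "M", "M", "F"],
        "approved": [  1,   0,   0,   1,   1,   1,   0,   1],
    })

    gap = demographic_parity_gap(predictions)
    print(f"Demographic parity gap: {gap:.2f}")  # e.g. flag for review if gap > 0.1

A gap close to zero means the model selects members of each group at a similar rate; a large gap is a signal to revisit the training data and the model.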

Through AI governance, these critical dimensions can be assessed to ensure that AI delivers fair outcomes and maintains user trust. Implementing AI governance is key for both the AI user and the AI owner, involving processes like regular audits, gaining user trust, and ensuring responsible use of data.

In conclusion, the power of AI is undoubted. However, with power comes a responsibility to use it ethically and for the benefit of society. As Sharmila Devi aptly puts it, “Building AI-based solutions and incorporating AI-based approaches in business use cases to derive business outcomes is great. But with that comes great responsibility. There has to be a human in the loop.”


Video Transcription

Sure. I'm Sharmila Devi, an AI consultant from Google, and today I'm here to share my experiences and thoughts on AI governance: balancing progress and responsibility. A very interesting topic, so the next 20 minutes are going to be super interesting. Folks, here's the agenda for the session. First, we will look at the highs and lows of AI: what are the latest advancements in AI, and what do we need to be aware of with artificial intelligence and machine learning? What are the risks of using a purely AI-driven system or architecture? Later, we will also cover the dimensions of responsible AI, which means implementing and thinking about artificial intelligence from an ethical and moral perspective as well. And finally, we will touch upon the process of building responsible AI systems. So let's get started. AI has touched our day-to-day lives.

We start our day by interacting with mobile phones, and there are many industries and many use cases where we constantly engage and communicate with artificial intelligence. Be it the YouTube videos that are recommended based on our preferences and the videos we choose to watch, or the Netflix movies, it's all AI algorithms, recommendation engines, running behind the scenes. It is the single largest technology the world has ever seen. That's the gist of the use cases we have already implemented and the use cases we are seeing in the near future. If you look here, there are two types of AI: artificial narrow intelligence and artificial general intelligence. Artificial narrow intelligence is all about performing specific tasks, tasks that are very specific to a particular domain, whereas artificial general intelligence, the AGI development we have been seeing recently, is all about building a generic model that addresses a variety of use cases.

If you look to the right, these are the AI applications that have been implemented in recent history. Right from image processing, identifying or detecting objects in images or matching query images; for example, you could have a set of images and say, show me the list of t-shirts which have Mickey Mouse on them. These are all image-processing use cases. You also have video-related use cases and audio-related use cases, audio in terms of translation into multiple languages: a single model which translates across a hundred-plus languages spoken all over the world. Very cool and very interesting. Though it seems to be a very simple use case when we think about it, behind the scenes it is the model, its complexity and architecture, that actually helps a single model solve such a variety of tasks.

And not just image, audio, and video; we also use it in manufacturing industries. Let's say we have machines and we want to predict whether a machine will need repair in the near future: that is the use case of predictive maintenance. We also have use cases in the healthcare industry that assist doctors. AI doesn't replace doctors, it just assists them, for example in suggesting whether a patient should be recommended a particular medicine or what diagnosis the patient has.

Such use cases are something we have been seeing in the recent past, and we have a very interesting and bright future for AI and its related applications. But we need to think about one point: is it able to perform tasks without supervision? There's always a human involved, a human who approves, reviews, or rejects, in any enterprise AI system. So is it able to perform tasks autonomously? No, we are still not at that stage. We still need to constantly train the models; there are scenarios where the models need to be retrained with the latest datasets, so constant maintenance and constant retraining of models is required. And AI systems still don't understand what is ethically or morally correct for humans. So this is the current state of the AI applications we have. And the last few months, probably I should say the last one or two years, have been the era of large language models, where we have seen a single model with billions or trillions of parameters being trained.

And it is one single model that caters to all the use cases you are able to see here: from summarization, to solving mathematical quantitative-reasoning problems like word problems, to translation, to question answering, to questions from different industries and topics; ask-me-anything types of models.

So this is the era of large language models. Now that we have seen all the highs of AI, all the sweet little things about AI, let's look at the lows of AI: what is it that we need to be cautious about? We have seen a scenario where a Twitter bot released by one of the largest cloud vendors had to be taken down within 24 hours, because it started learning from its users: it became abusive, it picked up offensive language and started using bad words. This is what happens when we let an AI system out to the public without any human intervention, and that's why it had to be brought down within just 24 hours. Here is another example of where AI became biased: a recruiting tool developed by Amazon that gives people recommendations on which jobs they should apply for.

It gave biased results: there was evidence of bias against dark-skinned people and against women. So this was a case where the predictions the AI system produced were biased against one particular section of our community. Here is another example, where ProPublica reported on an algorithm that predicts the risk level of a particular offender, whether they are low, medium, or high risk, and all these predictions are made based on prior offences. If you look at the first case here, even though the person, Vernon Prater, has two armed robberies, one attempted armed robbery, and one grand theft, a lot of serious crime against him, he is still evaluated as low risk.

Whereas if you look to the right, a dark-skinned woman is classified as a high-risk offender for just four juvenile offences. And similarly, if you look at the results on the right, you see that low-risk evaluations have been made even though the prior offences are very aggravated, with grand thefts, drug trafficking, and so on.

Meanwhile, there is a medium-risk rating for just one petty theft. So this is how our AI behaves in a biased manner, and it is all because of the dataset that was used to train the model. There have also been instances of an AI model predicting whether a particular offender should be released on parole or not, based on questionnaires. This is also an experience from the past, where we have seen that it gives black-box assessments; there is no explainability there.

Hence it is unknowable, and we have also seen cases of unaccountability. Though we are proud of our model, saying that it is 99% accurate, it is still 1% inaccurate. So who takes accountability for that 1% inaccuracy? Is it the developer who worked on the model, the team lead, the owner of the company, or the end user? There is no accountability defined yet. What if an autonomous car had an accident: who is legally responsible? That is the million-dollar question. We have also seen cases of biased AI where the New Zealand passport robot wrongly told an applicant that his eyes were closed in the photograph and asked him to open them. It was not trained well; the training data did not have enough examples from a particular section of the community, hence it produced this biased output. Now that we have seen the highs of AI and the lows of AI, let's look at the risks of AI. First, the risk of error: we know an AI system cannot be 100% accurate.

There is always an error rate; it might be as small as possible, but there is a possibility of error in AI predictions. There is also the risk of bias, and that bias depends on the training data. Then there is the lack of transparency: there is no explainability. We could face cyber attacks on the entire AI application, and we could have cases of the AI learning undesirable behaviour from its end users. So these are the risks of AI, and Elon Musk and Stephen Hawking have both highlighted at multiple forums how they feel that AI needs to be regulated. And that is the whole reason why we are talking about the next topic, the dimensions of responsible AI. These are the five primary dimensions of responsible AI. Bias and fairness: we evaluate whether the model has been trained on unbiased data in order to give unbiased results; that is why bias and fairness is a very important component of responsible AI. Privacy and security: the data we are using to train the model, is it approved by the end user? Is it approved by the data owner?

Are there any legal implications? Are we training on PII or PHI data? All these aspects come under privacy and security. Next is robustness. We really need to make sure that the model is robust. There have been cases where the model is put through adversarial testing, where particular types of questions, probably related to some war or some religion, are asked or tested against the model to understand how the model behaves.

So the model has to be robust enough to withstand such attacks. The next dimension we have is transparency. Here we need to make sure that whatever choices or predictions the model makes are transparent: can users see that the model has made this prediction for this reason? Then the users have a choice whether to go with it or against it. And the final dimension we have here is interpretability and explainability: do we know why the model made such a prediction, for an individual prediction or for predictions at scale? So these are the important dimensions of responsible AI.
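
To make the robustness dimension a little more concrete, here is a minimal sketch of a simple stress test: it measures how a model's accuracy degrades as its test inputs are perturbed with increasing noise. This is only a noise-based check, not a full adversarial attack, and the synthetic data and model here are illustrative assumptions rather than anything used in the talk.

    # Minimal robustness stress test with a toy model and synthetic data.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    rng = np.random.default_rng(0)
    for noise_scale in [0.0, 0.5, 1.0, 2.0]:
        # Perturb the test inputs and check how far accuracy drops.
        X_noisy = X_test + rng.normal(0.0, noise_scale, size=X_test.shape)
        print(f"noise={noise_scale:.1f}  accuracy={model.score(X_noisy, y_test):.3f}")

A sharp drop in accuracy under small perturbations is a sign that the model needs more robust training or more diverse data.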

This is a typical ML lifecycle, and we need to be cognizant enough as ML practitioners, data scientists, and ML engineers to ask important questions at each and every stage. When we are defining the problem, we need to understand what the business problem is and who the intended audience is; understanding the business problem and the end user is the very first thing. What are the evaluation and success criteria? How do I evaluate my model? And what are the risks associated with the use case? The use case could be very challenging and super interesting, but what are the risks when the model gives an inaccurate result? What are the implications of that? Next we come to data collection: how is the data collected? Is it biased? Are all the segments of the population considered when collecting the data? Is it skewed? Is there any PII or PHI data? Then we talk about model training: we need to understand how the model is trained, what the training frequency is, when it was last trained, and whether it is explainable. While evaluating, we need to make sure that the test set is a representation of real-world data. And while monitoring, we need to make sure the model doesn't perform well only on the majority class but on all the classes; it is super critical that the model has good performance over all the classes.
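
As one way to act on that monitoring point, here is a minimal sketch that computes per-class recall on a batch of recent predictions and flags any class that falls below a threshold. The labels, predictions, and the 0.8 threshold are illustrative assumptions.

    # Minimal monitoring check: per-class recall, not just overall accuracy.
    from sklearn.metrics import recall_score

    y_true = ["cat", "cat", "dog", "dog", "dog", "bird", "bird", "bird", "bird"]
    y_pred = ["cat", "dog", "dog", "dog", "dog", "bird", "bird", "cat", "bird"]

    classes = sorted(set(y_true))
    recalls = recall_score(y_true, y_pred, labels=classes, average=None)

    for cls, rec in zip(classes, recalls):
        status = "OK" if rec >= 0.8 else "NEEDS ATTENTION"
        print(f"class={cls:<5} recall={rec:.2f}  {status}")

A model can look excellent on overall accuracy while quietly failing on a minority class, which is exactly what a check like this is meant to surface.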

Next is explainability. It is not just the model prediction we are interested in, especially in segments like the healthcare industry and the finance sector, which are very highly regulated. So explainability has to be incorporated into the model. If a model says this is a cat and not a dog, then we really need to be sure, and the end user has to understand why the model said it is a cat. That explainability feature has to be incorporated, and the model has to be transparent: why was that decision taken by the model? And it has to be reproducible: if tomorrow I share the same picture of a cat with the model, it has to say again that this is a cat. So the results have to be reproducible, and the model has to be fair and unbiased. We have unbiased and fair models only when the dataset used to train the model is fair, so fairness and bias have to be measured at all these stages.
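
One common way to add post-hoc explainability of the kind described above is permutation feature importance, which shows how much each input feature contributed to the model's predictions. The sketch below is a minimal illustration on synthetic data; the dataset, model, and fixed random seeds (which also help with the reproducibility point) are assumptions for the example only.

    # Minimal post-hoc explainability sketch: permutation feature importance.
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.inspection import permutation_importance
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=500, n_features=5, random_state=42)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

    model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

    # How much does shuffling each feature hurt the model's score?
    result = permutation_importance(model, X_test, y_test,
                                    n_repeats=10, random_state=42)
    for i, importance in enumerate(result.importances_mean):
        print(f"feature_{i}: importance={importance:.3f}")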

And this is achieved using AI governance. AI governance has to be implemented from two perspectives: the AI user perspective and the AI owner perspective. If I am an AI user, then I need to be aware of how the data was collected. Did the end user have an option to opt out or opt in? This gives the AI user an idea of whether all users were considered or not. And the end users who are sharing their own data need to understand what this data will be used for, what the purpose is, and what the benefit of the AI model built on their data will be. They also need to understand that it will not harm them. From an AI owner perspective, the owner needs to gain the trust, understanding, and support of the AI end users; they need to make people understand that this is something that is going to be helpful and beneficial for society. And AI audits need to be performed at regular intervals; you could consider a time frame, probably monthly or quarterly, where you regularly evaluate your models and, based on that, take the required next steps.
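
As a rough illustration of what such a recurring audit might check, here is a minimal sketch of an audit function that recomputes accuracy and a simple fairness gap on the latest labelled data and flags whether action is needed. The thresholds, inputs, and the idea of wiring it to a monthly or quarterly schedule are assumptions for the example, not a prescribed process.

    # Minimal sketch of a periodic AI governance audit over fresh labelled data.
    import numpy as np

    def run_audit(model, X, y_true, sensitive, acc_floor=0.85, gap_ceiling=0.10):
        """Return audit findings for one period (e.g. run monthly or quarterly)."""
        y_pred = model.predict(X)
        accuracy = float(np.mean(y_pred == y_true))

        # Positive-prediction rate per sensitive group (demographic parity check).
        rates = {g: float(np.mean(y_pred[sensitive == g]))
                 for g in np.unique(sensitive)}
        gap = max(rates.values()) - min(rates.values())

        return {
            "accuracy": accuracy,
            "group_rates": rates,
            "fairness_gap": gap,
            "action_needed": accuracy < acc_floor or gap > gap_ceiling,
        }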

And there is a super interesting quote that I like and would like to state here: with great power comes great responsibility. Building AI-based solutions and incorporating AI-based approaches in our business use cases to derive business outcomes and lots of value is all great. But with that comes great responsibility, and we have to be aware of that. We have to acknowledge the fact that it is not just a machine to which we should be assigning all the accountability. There has to be a human in the loop; a human-centred interface has to be made available. And with this quote, I would like to pause here and take any questions that you may have.