How to get started and get ahead in AI & Data Science
Video Transcription
So let's get started with a little bit about me. Right now, hopefully, you have joined the session on how to get started in data and AI. If that's not the session you were hoping to attend, I hope you'll stay anyway. So welcome. My name is Pat Sidiki, and today we'll be talking about how to get started in a career in data science and AI, with a small section on how to get ahead if you're already in this space. That part is really going to be about some of the challenges that I see in my work. I have been a software engineer by profession for years. Currently, I work at ETS AI Research Labs, where we build new products backed by research and powered by AI. I also advise students and startups working in this field, so I have seen firsthand the challenges of people who want to break into it. There's a lot of information out there, and it can be very confusing. What I'm going to do is give you a few battle-tested tips that I give to anybody who reaches out to me and wants to start in this space. Just another reminder if you're just joining: do go to the Menti site and tell us what you do currently and how many years you have worked with data. I'm going to drop the link again in the chat in case you don't have it.
So let's keep going and talk about getting started. You're here because you're intrigued by this new, exciting field of data science and you think you want in on the act. Now, the demand is very high and salaries are strong. But before taking the leap onto this path, you must ask yourself how deep your interest is. I mean, life is too short to work on something you hate. So ask yourself the question: are you ready for the challenges and opportunities? Are you already working with data, or do you build reports at work? Then you're there too. Do you love taking surveys and analyzing the results? That's you, that's data. If you like doing any of these things, then a career in data science could possibly be a fit for you. That's just the first step, though. What you need to look at is where you would step in, and that depends on your skills and what you already know. So let's take a look. In a company, data science, like any other role, doesn't work in isolation. It is part of an ecosystem of teams that work together. None of us operates in a vacuum. And while we can play around with data for fun, most of us want to earn a living doing that. So it's crucial that you understand how data science fits into the business world and how it delivers value in general.
There are three main roles: the data engineering role, the data science analytics role, and the data science AI/ML role, and each of you may use one of these roles to enter your career. In the data engineering role, you will collect, organize, and clean the data. In the analytics role, you will analyze the data and generate insights to present to the business leadership and sometimes to your customers. In the AI/ML role, you will build machine learning models that make predictions for your users. Now, to get this work to your users, you will work with software and applications: either you will build them, or your engineering team will build them, or you will work with existing systems that produce the data. So understand where the data science function fits in, and drop any questions in the chat. Rina, database administrator could be an entry point into the data engineering role. On the next slide you will see some of the things that the data engineering function does, but a DBA has a good shot at stepping into that role. So here are the three roles in a little more detail.
Some of you may do all three. There is a lot of discussion in the data science industry today about whether there should be a specialized set of people, where each role is done by a different group or a different person, or whether there should be what we call the full life cycle data scientist, who does all these roles. There are good arguments on both sides.
In fact, if you go to my LinkedIn, just yesterday I posted a summary of something that somebody had shared, in which they compared and contrasted these approaches, including whether it's one person doing all of it. But for most of you, as beginners, I would recommend you attempt to master all three domains, though the space where you enter will be dictated by what you already know and where you are. If you are from a math, research, or software background, data science and ML work would be an easy sidestep. If you are on the business side, and I see one of our tech bloggers attending, you probably have familiarity with generating insights about your users. I mean, you are monitoring your blog to see who's joining and what they are doing. Your entry point could be the analysis path, where you analyze data, visualize it, and generate insights. Database administrator, you could go into the data engineering space, where you manage the warehouse, you run ETL scripts, you transform the data, and you then move it to the next stage. Remember, the most critical function in data science is cleaning the data.
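To make the point about cleaning concrete, here is a minimal sketch of what a first cleaning pass might look like. It is not from the talk: the sample data, the column names, and the `clean_rows` helper are all hypothetical, and it uses only the Python standard library rather than any particular data science toolkit. It trims whitespace, normalizes casing, drops rows with a missing value, and removes duplicates.

```python
import csv
import io

# Hypothetical raw export (not from the talk): stray spaces, inconsistent
# casing, a duplicate person, and a row with a missing count.
RAW = """name,department,reports_built
 Alice ,Analytics,12
alice,analytics,12
Bob,ENGINEERING,
Carol,Analytics,7
"""


def clean_rows(raw_text):
    """Return normalized, deduplicated rows; rows missing a count are dropped."""
    cleaned = []
    seen = set()
    for row in csv.DictReader(io.StringIO(raw_text)):
        name = row["name"].strip().title()
        department = row["department"].strip().lower()
        count = row["reports_built"].strip()
        if not count:
            continue  # drop rows where the value is missing
        key = (name, department)
        if key in seen:
            continue  # skip duplicates that only differed in spacing or casing
        seen.add(key)
        cleaned.append(
            {"name": name, "department": department, "reports_built": int(count)}
        )
    return cleaned


rows = clean_rows(RAW)
for row in rows:
    print(row)
```

Whether to drop or fill a missing value is a judgment call that depends on the question being asked; dropping is used here only to keep the sketch short.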
So no matter what you do here, you must learn how to clean and transform your data. Any questions on this specifically? I can take one question before I jump to the next slide.
Earlier, you asked how much experience these roles require relative to each other. I would say that all of these are beginner friendly. With the machine learning libraries out there, data science has become quite easy. With many of the courses that I will share, you can build models in a week or two. The key is to learn how to do it and then take that and apply it to a real-world situation. Thank you for sharing, Kevin. That was definitely a good progression. I've seen that as well.
Rina, you asked who provides the big question. I would say that it can come from your business side. But as a data scientist, you are expected to be curious, to look at your enterprise or the space you are in, and to generate a bunch of questions that you can then vet with the business owner to see if they're interested in having you work on them. So I would suggest you do not wait for somebody to give you the question. Find the question yourself. If you are close to the data, you will see patterns, and you can ask those questions and put them out there.
And I'll show you a path that I've recommended and that has worked well. OK, so once you have decided your entry point based on your current skills, you can find a course that teaches or builds upon the skills that you already have. Now, if you have a solid background in an analytic discipline, and I see many database engineers, computer scientists, and researchers in here, you can pretty much teach yourself data science. There are many free and almost free courses that you can take, and I would recommend that you do these courses before you spend any money on a paid certificate or a degree. Now, if you're a student, that's different: you have a path where you could take a class in college. But if you are already on a path and you are not sure whether to switch, you could take a data science class at your school or take one of these courses, just to understand whether you do want to sign up for that, before even switching your major or track.
You should probably do some exploration on your own. Now, how do you find a good course? You should look at the ratings, and you should look at how long ago the course was published and whether the instructor is adding new updates. There are many courses out there and it can get very confusing, but looking at these two things will usually help you narrow it down quickly. This field is moving very fast. While the foundations remain the same, new Python libraries pop up all the time and new tools pop up all the time. You have your BigQuery, your Looker, and new libraries in Python that keep coming up. So you want a course that's recent. The last five years is still OK, unless you're doing a foundational Python course, but still, you're trying to get into data science, so look for a newer course. Look for a course that's popular. Usually when I go to Udemy and try to find a course, I filter for four stars and above only. I don't want to spend time on anything that's not four stars. So that can be a way for you to do that. And of course, on Coursera you can audit courses for free, so that's a tremendous one. Some of the courses that I listed here are ones that I have recommended to the students that I mentor, and they have taken them.
Now, here's the other thing. If you find yourself bored and unable to finish the course, you should ask yourself whether you want to spend a lifetime, or at least many years, doing this work. Data science can often be a very solo activity, and it can be incredibly frustrating. So if you can't finish the course, you have to ask yourself why, and give it an honest shot. You know you're interested, but at the end of the day it's work. It's not all fun. So keep plugging at it, and if you still can't finish it, then I hope you look at other options. There are supporting roles that you can explore. You don't have to be the hands-on data scientist. As I showed earlier, data science is part of the organization, and there are many other roles. There's a data product manager, and there might be data integrators. So if you don't want to be hands-on but you still like working with data, you can be in those roles. And now, you asked the question of who provides the big question.
Often a data product manager might be the one who is actually talking to people, trying to find the problems, and bringing the question to the data scientist. So that's another role that you could explore if you don't want to be hands-on. Thank you, Rina.
OK. So how many of you have actually completed a data science project? Students at your school, or people who are working or working for themselves, have you worked on a project and completed it? Drop it in the chat. Awesome, that's great. The thing is, whether you are in the workplace or a student, I've seen many people do experimental data science projects, and while it's great that you get the experience, it doesn't move you to the next level. If you want a career here, you want to go to the next level, which means you need to get honest feedback from people who will not just tell you, "Yeah, good job." They will tell you, "Yes, this solves a problem that I have, and here's some money. Can you go build it for me?" So for getting ahead, for making this a career, it's important that you find a problem that exists.
If you're a student, maybe there is a problem that your professor has, or somebody in the research lab has, where they have tons of data and they need help massaging it, analyzing it, and processing it. You could start there, or you could start in your workplace, and apply your skills to solve the problem. When you look at the problem, make sure you define in your head the answers to these three questions: Who are you solving this problem for, that is, who's the user? What is the problem? And how does your project solve that problem? If you keep your focus on solving problems for somebody, that is a very valuable skill to have, and it will make you a star and a very valuable person in the data science field. And then once you have that, you must share it. You must get an honest peer review as well. Often we are afraid of offending, but make sure that you solicit honest feedback on whether your project has solved a real problem, and schedule a demo. I have seen this path, and I have observed that once the demo or the prototype is presented to the users, they are often willing to find money in their budget to pay you, or pay for the servers, or pay for somebody to help you bring this project to life and give it to them to use.
Because if it makes their life easier, they're going to be happier. And then gradually they may create a role for you in the data science, data analyst, or data engineer area, where you are now maybe the first data scientist in the company, or maybe the first data scientist in that department. As for the ideal time frame for this project, it depends on how much time you have. Obviously, you are probably not working on this full time, because you have other things to do. I would suggest no more than six weeks, and even shorter if you can do it.
So how do you go about understanding the domain area, Rina? I would say that wherever you are, start close to home. Talk to people. Look at the data. If you're already working with data, like I mentioned earlier, you will see patterns that you look at and say, "Well, why isn't anybody talking about this? Why isn't anybody saying something? I see this." So you can do a project around that.
But I would suggest you start close to home: start with your department or an adjacent department. Don't go too far afield, because you won't have the connections to be able to get the data that you need or to schedule a demo with the user. So start close by. And I would say either start with the data or start with the people: ask them, or observe them, to learn what they complain about and what they wish they had. Let me quickly move to the next slide. Now, if you cannot find a project at work, I would say that you haven't worked hard enough. There are many, many, many problems at work, so look for one.
Now, that was my section on getting started. Next, a few minutes on getting ahead. This section is for people who are already in the field and want to understand how to take the next step. So let's take a look at it. If you're already in the field of data science, many of these scenarios will look familiar to you. So ask yourself why, and what you can do better.
And for those who are starting out, once the initial enthusiasm wears off, sometimes you hit the hard reality that I listed here: dashboards that nobody sees and reports that nobody reads. You spend a lot of time on them, but they're not hitting the point. This comes back to what I mentioned previously: make sure you're solving a real problem. Sometimes, as a leader in the business, I see a dashboard that looks great, but it doesn't tell me what's important, or it asks me to remember something that happened and do math in my head to figure out how things have changed. Or there are models that, while interesting, are not tied to the business or to the application that the users are using.
And there is not enough thought put into how you will take it forward from where you are. A Jupyter notebook is not sufficient to put something in front of users. So as a data scientist, if you keep the need of the user front and center, you will add tremendous value and boost your career a lot. On the next slide, I can show you some specific examples. So here are some things you can do. Earlier, we saw that data science roles are part of the larger organization's flow and rhythm. Just as an example, this is something you will see if you Google it, and it actually came from Google. It shows that beyond the model, beyond the insight, there is an entire gray space that often gets neglected, and not addressing this space is often the cause of the failure of many data science projects. So if you own this space entirely, not just the orange box in there but all of it, you will be unstoppable. In fact, there's a new course that just came out from DeepLearning.AI. It's about MLOps. I would recommend you sign up.
It's on Coursera. I think it came out two weeks ago. But there are a lot of talks, resources, and communities out there, such as the MLOps community, that talk about this missing link between what the data scientist produces and what the users actually end up needing. Remember, your goal should be to deliver value through your grasp of the data and of the tools and skills needed to convert that raw data into something that users can take action on or that delivers value to them.
So let's take a look once more at the question that I asked you earlier. Wonderful. Student is still the leading audience category. So most of you students will probably benefit more from the earlier section, but the later section will be useful once you have a little more experience. Definitely look beyond Python and Jupyter notebooks. If you want to have a productive career, look at all those other areas. Any course in any of those areas would be useful. And we have quite a few people with lots of experience, but also with one and zero years, so beginners. Thank you for sharing. I have a few minutes before the session ends, so I'll take questions. And of course you can follow me on Twitter or LinkedIn and DM me on Twitter if you have any questions as well. Yes.
Actually, let me see, one second. You may just have to Google it, unfortunately. Let me think. I'm going to go back to that slide here. I'll have to find the links for you, but if you want to grab a screenshot, the courses have exactly these names, and you can search for them. Let me know if you have any trouble while I'm still on the call here. But yes, I would recommend you just take a picture with your phone or save it. And if you DM me later, I can send you a link. Maybe what I'll do is drop them in my Twitter feed. If you follow me on Twitter, you will get the links for each of these courses.
So, are there any communities? Yes. The Towards Data Science community on Medium is very good. There is a lot of information for all levels. If you don't already follow them, I would recommend you go and follow them. Twitter is an amazing, amazing resource. There are many people sharing their expertise, and that's at all levels. I mean, there are beginners, people who are students, people who are experts, and people who are moving this field. And the amazing thing about Twitter, I find, is that you can have a global conversation.
Like the other day, I was talking to a researcher in Israel, and I didn't even know he was in Israel. I was just talking to him, and it's great. And if you ask insightful questions or make insightful comments on something that those leaders are sharing, that's another way to network. Networking, as you know, is very important, right? I mean, even in data science, you've got to network. You've got to go out there and meet people. And especially for students, being a student gives you an opening that you can use. People are more receptive to students reaching out, for some reason. We just like students. So use that in your favor and network right now. Find people who do the work that you would like to be doing, or whom you look up to, and reach out to them, not to ask for a job, but to say, "Hey, I'm a fan."
Let me take another question here: some examples of projects on which you're working. One of the most common projects in the data science field is often the recommendation engine. Thank you, Annette, for that. And folks, go ahead and join Women in AI.
So recommendation engines are, like I said, one of the entry-level kinds of projects. You collect some data, and when the user visits your application, you present them with choices that help guide them to where they need to go. That's a very common project. Other, more advanced projects are around NLP, speech processing, image processing, and text processing. Those can get pretty complicated. In that field we have made progress in some areas, but it's really very far from being everywhere. I mean, we use Alexa, and there is a lot of ML around that. Speed is another problem. We often face problems with deploying ML models and insights on small devices and dealing with network connections. So those are the kinds of projects that I'm currently working on and the challenges that I'm looking at.
Donna Lakshmi, you had a question: with infrastructure experience, what would you suggest for me to learn to start? So yes, take a look at the slides, which I'll share, but definitely I would say explore how you work with data today.
Figure out what skills you have and take an inventory. Take a look at some of these courses and see which ones you actually finish. I mean, there are many, and sometimes you will start one of them and say, "No, this is not for me," and skip to another. I tried to give a few different types of courses that you could join here, to make sure that you want to do this work, and finishing the courses would be a good sign of your interest. Then find a project at your workplace that you can do on your own time. Nobody's going to give you hours to work on a project, so you'll have to do it on your own time. But if you do a project that provides value, that would be a pathway. Often, I think it's easier for you to transition into a data science role in your existing company than to get somebody outside to take a chance on you, because they do not have a good sense of whether you can actually do the job. But people at your company at least know you, and they might be willing to give you a chance.
The Google Data Analytics certificate: great certificate.
But again, as I said, start with the smaller course. If you like it, then go ahead and get that certificate. That's a bigger time investment, and it looks good on the resume. But nothing takes the place of an actual portfolio. So make sure that, in addition to the certificate, you have a portfolio, maybe projects on GitHub. I think we're going to get cut off in a couple of minutes. But often when I hire scientists, I ask them about their previous work and have them describe it, and I look for how much they own the space. So that's important. A certificate will get a resume a second look, but it won't necessarily get you hired without the work to back it up.
So, can anyone shift into data science? Yes, of course. With your experience, I'm sure you have a reputation in your industry and business, and a big network. Just use that, right? Have one of the people that you know give you a chance to work in that field. Any final questions? I think we may get cut off any second, but I'm happy to take them, and do connect with me.
As I mentioned, DM me on Twitter or just follow me on LinkedIn. I post content on this all the time. It's up here, and it's also in my speaker profile if you want to take a look at that. I'm going to drop it in the chat as well. And if I get cut off, thank you for coming, everybody. It was great hearing from you and about your experiences. That's my LinkedIn, and on Twitter, just find me as this. OK. Take care. All right. Thank you all.