Ayse Bilge Gunduz Introduction to Scientific Programming in Python

Automatic Summary

Introduction to Scientific Programming with Python

Today, we delve into scientific programming with Python. My name is Ayse Bilge Gunduz. I am a Ph.D. candidate in Computer Engineering and a data science enthusiast. Follow me on Twitter or Medium for more.

Why Python for Scientific Programming?

Python is relatively easy to learn and write, which makes it a good fit for scientific programming. Compared with MATLAB and R, Python is more versatile. Compared with C, however, its performance is worse. Everything here uses Python 3, which I highly recommend, because Python 2 is no longer supported.

Let's dive into some essential Python packages for scientific computing.

NumPy

NumPy is the fundamental Python package for scientific computing. At its core is an N-dimensional array object. In brief, with NumPy you can easily:

1. Store numbers of any data type, such as integer or float
2. Use mathematical functions such as exponential and absolute value, much like Python's math library
3. Define and manipulate arrays conveniently

```python
import numpy as np
```
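The three points above can be sketched roughly as follows. This is a minimal illustration: the variable names and values are assumptions chosen for the example, not the ones shown in the talk.

```python
import numpy as np

# Scalars can be integers or floats (illustrative values).
x = np.int64(3)
y = np.float64(-2.5)

# Element-wise math functions, similar to Python's math module.
abs_y = np.abs(y)        # absolute value
exp_zero = np.exp(0.0)   # e raised to the power 0

# Built-in constants np.pi and np.e.
circle_area = np.pi * x ** 2

# An array with an explicit data type, plus its basic attributes.
a = np.array([1, 2, 3, 4], dtype=float)
print(a)         # the array itself
print(a.ndim)    # number of dimensions
print(a.shape)   # size along each dimension
print(a.dtype)   # element data type
```

Passing `dtype=float` forces floating-point storage even though the input values are integers.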

Array Manipulation in NumPy

In NumPy, a one-dimensional array can be called a vector. For instance:

```python
import numpy as np
a = np.array([1, 2, 3, 4])
```
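Indexing and slicing a vector might look like the sketch below. The five-item array is an assumption made for illustration. Note that the stop index of a slice is excluded.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])  # illustrative values

whole = a[:]           # every item in the array
second = a[1]          # indexing starts at 0, so index 1 holds the second item
middle = a[1:4]        # items at indexes 1, 2 and 3 (stop index 4 is excluded)
every_other = a[::2]   # start at index 0 and take every second item

print(whole, second, middle, every_other)
```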

Creating Matrices in NumPy

NumPy also makes it easy to create matrices. For example, a matrix of ones with two rows and three columns:

```python
import numpy as np
m = np.ones((2,3))
m
```
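Other matrix constructors and basic matrix indexing could be sketched as follows. The names `d1`, `d2`, `z`, and the values in `m` are assumptions for illustration only.

```python
import numpy as np

# Diagonal matrices: zeros everywhere except on one diagonal.
d1 = np.diag([1, 2, 3])        # 3x3, values on the main diagonal
d2 = np.diag([1, 2, 3], k=1)   # 4x4, values on the first diagonal above the main one

# A matrix of zeros with 5 rows and 4 columns.
z = np.zeros((5, 4))

# Indexing a 2x3 matrix.
m = np.array([[1, 2, 3],
              [4, 5, 6]])
first_row = m[0]        # same as m[0, :]
element = m[1, 2]       # row index 1, column index 2
partial = m[-2, :-1]    # second-to-last row, every column except the last
flat = m.flatten()      # one-dimensional copy of the matrix

print(m.shape)          # (rows, columns)
```

Because `flatten` returns a copy, changing `flat` afterwards leaves `m` untouched.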

Pandas

Pandas is an essential Python library that provides a DataFrame object. You can think of a DataFrame as an Excel sheet, or as a Python dictionary of Series objects. With Pandas, you can read and write many file types.

```python
import pandas as pd
```
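As a rough sketch of the DataFrame idea, the example below builds a small table from a dictionary and round-trips it through CSV text. The column names and numbers are illustrative assumptions, and an in-memory buffer stands in for a real file.

```python
import io
import pandas as pd

# A DataFrame built from a dictionary: keys become column names.
df = pd.DataFrame({
    "Name": ["Bulbasaur", "Ivysaur", "Charizard"],
    "Speed": [45, 60, 100],   # illustrative numbers
})

# Write the table as CSV text, then read it back.
csv_text = df.to_csv(index=False)
df_again = pd.read_csv(io.StringIO(csv_text))

first_rows = df_again.head(2)  # first two rows of the table
print(first_rows)
```

With a file on disk, the same calls would take a path instead of the buffer.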

DataFrame Manipulation in Pandas

Pandas also handles DataFrame manipulation well. Its groupby function groups rows by one or more columns, for example by a Pokemon's primary and secondary type, so you can compute statistics such as the mean speed per group.

```python
import pandas as pd
pokemon = pd.read_csv('pokemon.csv')
# Mean Speed for each (Type 1, Type 2) combination
pokemon.groupby(['Type 1', 'Type 2']).Speed.mean()
```
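To try grouping without the CSV file, a small hand-made table can stand in for the dataset. The rows below are invented for illustration and are not taken from the real data. Only the column names follow the example above, and the `Legendary` column is assumed from the talk.

```python
import pandas as pd

# Invented rows that mimic the dataset's columns.
pokemon = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Type 1": ["Bug", "Bug", "Water", "Water", "Water"],
    "Type 2": ["Flying", "Flying", None, None, "Ice"],
    "Speed": [70, 50, 40, 60, 80],
    "Legendary": [False, False, False, False, True],
})

# Mean speed per (Type 1, Type 2) pair; rows with a missing key are dropped.
mean_speed = pokemon.groupby(["Type 1", "Type 2"]).Speed.mean()

# Number of missing values in each column.
missing = pokemon.isnull().sum()

# How often each value occurs in a column.
type_counts = pokemon["Type 1"].value_counts()
legendary_counts = pokemon["Legendary"].value_counts()

print(mean_speed, missing, type_counts, legendary_counts, sep="\n\n")
```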

Matplotlib and Seaborn

Matplotlib and Seaborn are standard libraries frequently used for creating static, animated, and interactive visualizations in Python. Seaborn is based on Matplotlib. Here's an approach to creating a plot:

```python
import matplotlib.pyplot as plt
import seaborn as sns
...
```
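A minimal Matplotlib-only sketch of the steps described in the talk (create a figure, plot in red, label the axes, add a title) might look like this. The data, labels, and output file name are assumptions for illustration. The `Agg` backend is selected only so the script also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]   # illustrative data
y = [0, 1, 4, 9]

plt.figure(figsize=(6, 4))   # create the figure first
plt.plot(x, y, "r")          # "r" draws the line in red
plt.xlabel("x")              # axis labels
plt.ylabel("y")
plt.title("Example plot")    # figure title
plt.savefig("example_plot.png")
```

In a notebook, `plt.show()` would typically replace the `savefig` call.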

In conclusion, scientific programming with Python might seem challenging at first, but with the right understanding of the basic packages and libraries such as NumPy, Pandas, Matplotlib, and Seaborn, you will quickly start to enjoy the process. Happy programming!


Video Transcription

Today, we will talk about an introduction to scientific programming in Python. I have limited time, so I will try to go as fast as possible to finish in time and answer your questions before my session ends. I would like to talk about myself a little bit. I am Ayse, a PhD candidate in a computer engineering department at a university in Istanbul, and I am an active member of a couple of women-oriented technical groups. I am interested in application security and I am a huge data science enthusiast. If you would like to follow me, you can follow me on Twitter or Medium.

So why Python, to begin with? Because it's relatively easy to learn and easy to write code in. It's like writing in English, it seems to me at least, and it's very intuitive. When we compare it with MATLAB and R, it's versatile. And of course, when we compare it with C, it has worse performance. So why Python 3? Today I will explain everything in Python 3 because Python 2 is not supported anymore. If you are still using Python 2, I highly recommend converting to Python 3, because it's easy. So I'd like to start with NumPy.

NumPy is a fundamental package for scientific computing in Python. We can easily import NumPy with an alias, as we can see here in the first line, and at its core is an N-dimensional array object. We can easily assign a number of any type. It can be an integer, it can be a float. There are many functions in NumPy, like in Python's math library. There's an exponential function and an absolute value function, as we can see here. With np.pi and np.e we can easily read the number pi and the number e, and here are the outputs for our inputs.

OK, we said that NumPy is built around an N-dimensional array object, right? So it's easy to use, especially for array manipulation. We can define an array easily, and we can define a data type. dtype refers to the data type. We can use the print function to print the array itself, like in the output, which I hope you can see here. We can ask for the number of dimensions of the array, which is one, as we can see here. We can easily get the shape, which is four, meaning there are four items in our array. And we can ask for the data type as well, which is float.

So as I told you before, it's an array object, and if it's one-dimensional, we can call it a vector. If we would like to do some operations on our one-dimensional array, or vector, or whatever you call it, it's easy. I created a figure here to show this. We named the array a, so I can call it directly by writing a, or we can write a with brackets and a colon in the middle, which means all the items in the array. If we would like specific indexes in the array, we can write the index number as we do in other programming languages. So this is index zero, this is one, this is two, three and four. If you write one, then we reach the value 2. We can also ask for a range, for example all the items between indexes one and three. Like here, we can ask directly with a, brackets, one to three, and it will give us 2, 3 and 4. We can also ask for even intervals if we put two colons between the brackets after the name of the array variable. It means: start from zero and give me every second item in my array.

So it will start at index zero, where we see 1, then skip the next index and give the item at index two, then pass to index four and give 5, as you can see down there. That is because we gave two as a parameter here.

We can also create diagonal matrices easily, and matrices full of zeros or ones, with np. As you know, np stands for NumPy, since we imported it with import numpy as np. np.diag will give us a diagonal matrix. What is a diagonal matrix, you may ask? In a diagonal matrix, everything outside the main diagonal is zero, and the main diagonal holds the values. That is the default diagonal matrix. So we can create a normal diagonal matrix, like in the first line of our code, or we can create offset ones as well. If you write k equal to one, it will create the first offset, like in d2: the first column is the offset, and then the diagonal is created.

And also, as I told you before, we can create matrices full of zeros. Here, five refers to the number of rows and four refers to the number of columns. And with np.ones, this is the second one, m2 in the output, which has two rows and three columns.

As I told you before, there is also array manipulation, right? Because in NumPy everything is an N-dimensional array object. If we name our variable m, we can call it directly by writing m, or we can put brackets after the variable name with colons and a comma in between. If we want to access a specific element in our array, we can give the row and column indexes to reach that element, or we can ask for a full row. For example, if I want to reach the first row, the third example shows that. As you can see, I'm going from top to bottom, then passing to the right and going to the bottom again. You can easily reach the first row by writing m, brackets, zero. And in the bottom right one, minus two actually means the first row. We can see that after the comma there's a colon, which means the whole row except, I'm sorry, except the last one.

There are many matrix manipulation functions in NumPy. You can ask for the shape easily, like a.shape, and it's very similar to MATLAB. The flatten function will give a copy of our matrix in flattened form, as we can see at the bottom of our output.

There is the Matplotlib library as well. The standard way to import Matplotlib is like that. As you can see here, first we create a figure, then we start to fill in the figure. We say plot according to x1 and x2, and r means the color red. xlabel and ylabel set the titles of the axis labels. We can also give a title before we display our Matplotlib figure.

Seaborn is so attractive, so nice. I love to use it. The library is based on Matplotlib. You can use it with import seaborn as sns and sns.set. I load the iris dataset here, and you can see the dataset plotted. Here you go. It's lovely, a really nice and attractive figure that we can see here.

So I want to pass directly to pandas, because there are so many things that I'd like to share. First of all, pandas provides a DataFrame object. What is a DataFrame, you may ask? You can think of a DataFrame as an Excel sheet, or as a Python dictionary of Series objects. You can import pandas with import pandas as pd. You can use different aliases, but pd is the standard. You can read and write many file types with pandas. Here I read the Pokemon CSV, and I can even give the encoding as a parameter. df.head will give us the first 10 rows of our dataset, as you can see here, I hope. The first one is Bulbasaur, the second one is Ivysaur, and there's Charizard here, and there are many, many attributes about the Pokemon that we can see here. From this point, I'd like to give more in-depth examples using Matplotlib, Seaborn and pandas. matplotlib inline will render the figures in notebooks.

After my session, I will share my notebook, a Google Colab notebook, with you. There are many more examples in it. As we can see here, we read pokemon.csv again, and we look at the column information in our dataset. With the head function, we read the first five rows of our dataset. There's also a groupby function in pandas. As you can see here, we pass Type 1 with Type 2, which means group by Type 1 and Type 2. Type 1 is the primary type and Type 2 is the secondary one. Type 1 can be Bug, it can be Water, it can be Poison, and so on, and Type 2 reveals the second type of the Pokemon. I'm checking the time as well. So we group by Type 1 and Type 2, and then ask for Speed and the mean function.

We say: according to Type 1 and Type 2, give us the average speed for that type of Pokemon. For example, Type 1 is Bug for Butterfree and Ledyba, and for both of them Type 2 is Flying. What if we want to know how many null values there are in the features of our dataset? We can ask with pokemon.isnull, and with sum we get the total for each feature. By the way, I don't know if I wrote it, but the dataset is from Kaggle. I will share it later if I didn't share it here.

Then I wanted to see every type of Pokemon according to Type 1. The first two lines are options for the Seaborn library to make the output nicer to look at. plt.figure creates a figure for us with that figure size. Then I say: take Type 1 from our DataFrame, count the values of Type 1, plot a bar chart, and set the title to Pokemon types with that font size.

And let's run it. When we run it, as we can see here, we have Water Pokemon and we have Flying Pokemon. They are mostly Water, then Normal, and the fewest are Flying. We can also ask how many legendary Pokemon exist in this list. Legendary has False and True options. If a Pokemon is legendary, it's True. If it's not, then it's False. OK, let's see: there are 700 non-legendary Pokemon and 65 legendary Pokemon. Among the legendary Pokemon, we want to see the Type 1 values, and when we check that with value_counts, we see that most legendary Pokemon are Psychic and then Dragon. So my time is up here. Unfortunately, I have many more things to show you, but the time is up. Thanks for listening, everyone.