What is Pandas?¶
Pandas is an add-on library to Python.
It lets us do more things with our code, specifically with dataframes.
Importing pandas¶
To analyze dataframes and load csv files, we need to bring the pandas library into Python.
Before we write any code for loading data and doing data analysis, we need to import it with the following code.
import pandas as pd
Reading in Data¶
Next, we can bring in our data named candybars, which is stored as a .csv file.
candy = pd.read_csv('candybars.csv')
candy
| | name | weight | chocolate | peanuts | caramel | available_canada_america |
---|---|---|---|---|---|---|
0 | Coffee Crisp | 50 | 1 | 0 | 0 | Canada |
1 | Butterfinger | 184 | 1 | 1 | 1 | America |
2 | Skor | 39 | 1 | 0 | 1 | Both |
3 | Smarties | 45 | 1 | 0 | 0 | Canada |
4 | Twix | 58 | 1 | 0 | 1 | Both |
5 | Reeses Peanutbutter Cups | 43 | 1 | 1 | 0 | Both |
6 | 3 Musketeers | 54 | 1 | 0 | 0 | America |
7 | Kinder Surprise | 20 | 1 | 0 | 0 | Canada |
8 | M & M | 48 | 1 | 1 | 0 | Both |
9 | Glosettes | 50 | 1 | 0 | 0 | Canada |
10 | KitKat | 45 | 1 | 0 | 0 | Both |
11 | Babe Ruth | 60 | 1 | 1 | 1 | America |
12 | Caramilk | 52 | 1 | 0 | 1 | Canada |
13 | Aero | 42 | 1 | 0 | 0 | Canada |
14 | Mars | 51 | 1 | 0 | 1 | Both |
15 | Payday | 52 | 0 | 1 | 1 | America |
16 | Snickers | 48 | 1 | 1 | 1 | Both |
17 | Crunchie | 26 | 1 | 0 | 0 | Canada |
18 | Wonderbar | 58 | 1 | 1 | 1 | Canada |
19 | 100 Grand | 43 | 1 | 0 | 1 | America |
20 | Take 5 | 43 | 1 | 1 | 1 | America |
21 | Whatchamacallits | 45 | 1 | 1 | 0 | America |
22 | Almond Joy | 46 | 1 | 0 | 0 | America |
23 | Oh Henry | 51 | 1 | 1 | 1 | Both |
24 | Cookies and Cream | 43 | 0 | 0 | 0 | Both |
Let’s break this up:
pd is the short form for pandas, which we are using to manipulate our dataframe.
read_csv() is the tool that does the job; in this case, it reads in the csv file named candybars.csv.
candy is the name of the object in which the dataframe is now saved.
Because the dataframe is stored in an object named candy, we can inspect it by “calling” the object name.
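As a minimal sketch of the same steps that does not need the candybars.csv file on disk, the example below reads the first three rows of the table from an in-memory string. The use of io.StringIO and the name candy_sample are assumptions made for illustration and are not part of the original lesson.

```python
import io

import pandas as pd

# Hypothetical stand-in for candybars.csv: the first three rows of the table above.
csv_text = """name,weight,chocolate,peanuts,caramel,available_canada_america
Coffee Crisp,50,1,0,0,Canada
Butterfinger,184,1,1,1,America
Skor,39,1,0,1,Both
"""

# pd.read_csv accepts a file path or a file-like object such as StringIO.
candy_sample = pd.read_csv(io.StringIO(csv_text))

# Outside a notebook, print() plays the role of "calling" the object name.
print(candy_sample)
```

In a notebook, writing candy_sample on the last line of a cell would display the same table without print().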
In this section, we can differentiate between the code that we typed in, which has a light grey background, and its output, which has a dark grey background.
From this dataframe, we can see that there are 25 different candy bars and 6 columns.
We can obtain the names of the columns using the .columns syntax, and if we wanted to see the dimensions of the whole dataframe, we could use .shape after the dataframe name.
candy.columns
Index(['name', 'weight', 'chocolate', 'peanuts', 'caramel',
'available_canada_america'],
dtype='object')
candy.shape
(25, 6)
Breaking up the code, we interpret this as:
“From our dataframe that we saved as candy, tell me the columns and shape.”
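The sketch below shows one way the values from .columns and .shape might be used in later code. The small dataframe is built by hand with pd.DataFrame, which is an assumption for illustration (the lesson itself loads its data from a file), and candy_sample is a hypothetical name.

```python
import pandas as pd

# Hypothetical three-row dataframe using a few columns from the lesson's table.
candy_sample = pd.DataFrame({
    "name": ["Coffee Crisp", "Butterfinger", "Skor"],
    "weight": [50, 184, 39],
    "chocolate": [1, 1, 1],
})

# .columns holds the column labels; list() turns them into a plain Python list.
column_names = list(candy_sample.columns)

# .shape is a (rows, columns) tuple, so it can be unpacked into two variables.
n_rows, n_cols = candy_sample.shape

print(column_names)
print(n_rows, n_cols)
```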
What if we don’t want to output the whole table when displaying a dataframe?
We can specify how many rows of the dataset to show with the .head() syntax.
For example, .head(2) will output the first 2 rows of the dataframe.
candy.head(2)
| | name | weight | chocolate | peanuts | caramel | available_canada_america |
---|---|---|---|---|---|---|
0 | Coffee Crisp | 50 | 1 | 0 | 0 | Canada |
1 | Butterfinger | 184 | 1 | 1 | 1 | America |
We can specify any number of rows within the parentheses, or we can leave them empty, which will default to the first 5 rows.
candy.head()
| | name | weight | chocolate | peanuts | caramel | available_canada_america |
---|---|---|---|---|---|---|
0 | Coffee Crisp | 50 | 1 | 0 | 0 | Canada |
1 | Butterfinger | 184 | 1 | 1 | 1 | America |
2 | Skor | 39 | 1 | 0 | 1 | Both |
3 | Smarties | 45 | 1 | 0 | 0 | Canada |
4 | Twix | 58 | 1 | 0 | 1 | Both |
This can be really useful when we have dataframes that are hundreds or thousands of rows long.
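To make the point about long dataframes concrete, here is a small sketch using a made-up dataframe with 1000 rows. The names big, row_id and value are hypothetical and chosen only for this example.

```python
import pandas as pd

# Hypothetical dataframe with 1000 rows, too long to read comfortably in full.
big = pd.DataFrame({
    "row_id": range(1000),
    "value": [i * 2 for i in range(1000)],
})

# With no argument, .head() returns the first 5 rows.
first_five = big.head()

# With an argument, .head(n) returns the first n rows.
first_two = big.head(2)

print(first_five)
print(first_two)
```

The original dataframe is left unchanged; .head() returns a shorter dataframe holding only the requested rows.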
Functions/Methods and Attributes¶

Something you may have noticed is that when we use pd.read_csv(), we put our instructions within the parentheses, whereas when we use .shape or .head(), the object that we are operating on comes before our desired command.
In Python, we use functions, methods and attributes. Functions and methods are special words that take instructions (we call these arguments) and do something, while attributes describe an object.
Attributes¶
Attributes can be distinguished from methods and functions as they do not have parentheses.
They can be thought of as nouns or adjectives that describe an object.
Take candy.shape as an example.
In this case, our dataframe candy is our object and .shape is the attribute describing it.
Functions¶
Functions and methods have parentheses.
They can be thought of as verbs that complete an action.
In the example of pd.read_csv(), this function does the action of reading in our data.
This is going to be discussed in more detail later in the course, but for now, simply be aware of the way we write the different instructions.
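The difference in how the three are written can be sketched side by side. This example uses Python's built-in callable() to check whether something can be used with parentheses; callable(), io.StringIO and the name candy_sample are assumptions added for illustration and are not covered in the lesson.

```python
import io

import pandas as pd

# Function: called through the library name, with the instructions in parentheses.
candy_sample = pd.read_csv(io.StringIO("name,weight\nSkor,39\nTwix,58\n"))

# Attribute: written after the object, with no parentheses.
shape = candy_sample.shape

# Method: written after the object, with parentheses (and optional arguments).
first_row = candy_sample.head(1)

# Functions and methods can be called; the value of the shape attribute cannot.
print(callable(pd.read_csv))         # True
print(callable(candy_sample.head))   # True
print(callable(candy_sample.shape))  # False
```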
Comments¶
While we write code, it’s often useful to annotate it or include information for humans that we do not want to be executed.
The easiest way to do this is with a hash (#) symbol. This creates a single-line comment and prevents anything written after it on that line from being executed by Python.
# This line does not execute anything.
candy.shape # This will output the shape of the dataframe
(25, 6)
We use comments frequently in the exercises to help you understand what to do and what our intentions are.
It’s good practice to use them to explain our code so if we or someone else wants to read it at a later date, it’s easier to understand.
Let’s apply what we learned!
1. What is Pandas?
a) A useful tool for data manipulation in Python
b) A programming language
c) A datatype
2. Which of the following statements is true?
a) Attributes and methods can be thought of as nouns and functions as verbs
b) Attributes can be thought of as nouns and functions and methods as verbs
c) Functions and methods can be thought of as nouns and attributes as verbs
Solutions!
1. a) A useful tool for data manipulation in Python
2. b) Attributes can be thought of as nouns and functions and methods as verbs