Introduction to Pandas Series

A pandas Series is one of the core data structures of the pandas library for Python. A Series can be thought of as a NumPy array paired with an array of labels: it is a one-dimensional array of indexed data.

In this tutorial, we will cover the pandas series in detail, and look at some examples to understand the concepts.

What is Pandas Series?

A pandas Series is a one-dimensional array of indexed data. Its index supports both label-based and integer-based (positional) access. A Series can hold various data types, such as integers, floats, strings, and arbitrary Python objects, and it supports many operations that make use of the index. It can be created from a list, a NumPy array, or a dictionary.
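
As a minimal sketch of the description above, the following builds a Series with an explicit index and reads the same element by label and by position. The variable name `prices` and the labels are illustrative and not part of the original tutorial.

```python
import pandas as pd

# Illustrative data: three float values with explicit string labels
prices = pd.Series([1.5, 2.25, 3.0], index=["apple", "banana", "cherry"])

# Label-based access uses the index label
by_label = prices.loc["banana"]

# Integer-based access uses the position, counting from 0
by_position = prices.iloc[1]

print(by_label, by_position)
print(prices.dtype)
```

Both lookups return the same element here because "banana" sits at position 1.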

Create a Pandas Series from a List

Enough theory; let's get hands-on. Create a pandas Series object from a list.

# Import pandas
import pandas as pd

# Create a List
lst = ['a', 'b', 'c', 'd', 'e']

# Pass the list object to series
series_1 = pd.Series(lst)

# Display Series object
print(series_1)

Output:

0    a
1    b
2    c
3    d
4    e
dtype: object
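
The dtype in the output is inferred from the list contents. As a small sketch (the variable names `ints` and `mixed` are illustrative), a list of integers should give an integer dtype, and a list that mixes integers and floats should be upcast to float:

```python
import pandas as pd

# A list of Python integers is stored with an integer dtype
ints = pd.Series([1, 2, 3])

# Mixing integers and floats upcasts every value to float
mixed = pd.Series([1, 2.5, 3])

print(ints.dtype)
print(mixed.dtype)
```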

Create a Pandas Series from a NumPy Array

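Let’s create a NumPy array and pass it to the Series constructor. The resulting dtype follows the array, so an integer array may show up as int32 or int64 depending on the platform and NumPy version.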

# Create a Series using a NumPy array
import pandas as pd
import numpy as np

# Create a NumPy array
narray = np.array([101, 103, 105, 109, 111, 115, 117, 189, 201])

# Pass the NumPy array to the Series constructor
series_2 = pd.Series(narray)

# Display Series object
print(series_2)

Output:

0    101
1    103
2    105
3    109
4    111
5    115
6    117
7    189
8    201
dtype: int32
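
If a specific dtype is wanted regardless of platform, it can be requested through the `dtype` argument of `pd.Series`. The sketch below assumes a shortened version of the array above; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

narray = np.array([101, 103, 105])

# By default the Series keeps the dtype of the NumPy array
series_default = pd.Series(narray)

# Request a dtype explicitly when a fixed width or float storage is needed
series_int64 = pd.Series(narray, dtype="int64")
series_float = pd.Series(narray, dtype="float64")

print(series_default.dtype == narray.dtype)
print(series_int64.dtype, series_float.dtype)
```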

Create a Pandas Series from a Dictionary

You can also create a Series from a dictionary. If no index is passed, the dictionary keys are used as the index labels.

# Create Series from Dictionary
import pandas as pd

# Create a dictionary
data = {'A': '1', 'B': '2', 'C': '3', 'D': '4', 'E': '5', 'F': '6', 'G': '7'}

# Pass the dictionary to the Series constructor
series_3 = pd.Series(data)

# Display Series object
print(series_3)

Output:

A    1
B    2
C    3
D    4
E    5
F    6
G    7
dtype: object
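
When an explicit `index` is passed together with a dictionary, pandas selects and orders the values by those labels, and a label that is missing from the dictionary is filled with NaN. A short sketch, using illustrative integer values and a label "Z" that deliberately does not exist:

```python
import pandas as pd

data = {"A": 1, "B": 2, "C": 3}

# Select and reorder keys; "Z" is not in the dict, so its value becomes NaN
subset = pd.Series(data, index=["C", "A", "Z"])

print(subset)
```

Because NaN is a float, the resulting Series is stored as float64 even though the dictionary held integers.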

Converting Pandas Series to ndarray or ndarray-like

Use the pandas.Series.values attribute to return the series as an ndarray or ndarray-like object, depending on the data type. Note that .values is a property, not a method, so it is written without parentheses. In the given example, .values returns a ‘numpy.ndarray’.

# Create Series from Dictionary
import pandas as pd

# Create a dictionary
data = {'A': '1', 'B': '2', 'C': '3', 'D': '4', 'E': '5', 'F': '6', 'G': '7'}

# Pass the dictionary to the Series constructor
series_3 = pd.Series(data)

# Series as ndarray
print("Series Data type: ", type(series_3))
print("Series.values Data type: ",type(series_3.values))
print(series_3.values)

Output:

Series Data type:  <class 'pandas.core.series.Series'>
Series.values Data type:  <class 'numpy.ndarray'>
['1' '2' '3' '4' '5' '6' '7']
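
pandas also offers the `Series.to_numpy()` method for the same purpose. The sketch below, using a shortened version of the dictionary above, converts the Series to a NumPy array directly and, as an extra illustrative step, converts the digit strings to integers first with `astype(int)`.

```python
import numpy as np
import pandas as pd

series_3 = pd.Series({"A": "1", "B": "2", "C": "3"})

# Convert the Series to a NumPy array
as_array = series_3.to_numpy()

# Cast the digit strings to integers before converting
as_ints = series_3.astype(int).to_numpy()

print(type(as_array))
print(as_ints)
```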

Slicing Pandas Series

Pandas lets you subset a series either by integer position or by label. We will discuss the .loc and .iloc indexers for slicing.

  • pandas.Series.loc[] – label-based indexer
  • pandas.Series.iloc[] – integer-position-based indexer
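
The difference between the two is easiest to see when the labels are themselves integers. In this sketch (the labels 10, 20, 30 are chosen only for illustration), `.loc[10]` looks up the label 10 while `.iloc[0]` looks up the first position:

```python
import pandas as pd

# Integer labels that do not match the positions 0, 1, 2
s = pd.Series(["x", "y", "z"], index=[10, 20, 30])

# .loc interprets 10 as a label
first_by_label = s.loc[10]

# .iloc interprets 0 as a position
first_by_position = s.iloc[0]

print(first_by_label, first_by_position)
```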

Slicing Series using .loc[]

The .loc indexer lets you select from a pandas series by index label. Passing a single label returns one element; passing a list of labels or a label slice returns a group of elements. Unlike ordinary Python slices, a label slice includes both its start and end labels.

# Create Series from Dictionary
import pandas as pd

# Create a dictionary
data = {'A': 101, 'B': 201, 'C': 301, 'D': 401, 'E': 501, 'F': 601, 'G': 701}

# Pass the dictionary to the Series constructor
ser = pd.Series(data)
print(ser)

Output:

A    101
B    201
C    301
D    401
E    501
F    601
G    701
dtype: int64

Let’s look at an example that selects a single element from the series.

# Selecting Single element from the series
print('Selecting Single Element, Index value is B: ', ser.loc["B"])

Output:

Selecting Single Element, Index value is B:  201

The next example selects a group of elements from the series with a label slice.

# Selecting Multiple elements from the series
print('Selecting Multiple Elements, from B to E: ', ser.loc["B":"E"])

Output:

Selecting Multiple Elements, from B to E:  B    201
C    301
D    401
E    501
dtype: int64
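
Besides a single label or a label slice, .loc also accepts a list of labels or a boolean mask. The following sketch rebuilds the same series so that it runs on its own; the names `picked` and `large` and the threshold 400 are illustrative.

```python
import pandas as pd

ser = pd.Series({"A": 101, "B": 201, "C": 301, "D": 401,
                 "E": 501, "F": 601, "G": 701})

# A list of labels selects exactly those elements, in the order given
picked = ser.loc[["A", "D", "G"]]

# A boolean mask selects the elements where the condition is True
large = ser.loc[ser > 400]

print(picked)
print(large)
```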

Slicing Series using .iloc[]

The .iloc indexer extracts single or multiple elements from a series by integer position, counting from 0.

# Selecting Single element from the series
print('Selecting Single Element, 0th position or first element: ', ser.iloc[0])

Output:

Selecting Single Element, 0th position or first element:  101

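Extracting multiple elements from the series. As with ordinary Python slices, the end position of an .iloc slice is excluded, so 2:6 returns positions 2 through 5.

# Selecting multiple elements from the series
print('Selecting Multiple Elements, positions 2 through 5: ', ser.iloc[2:6])

Output:

Selecting Multiple Elements, positions 2 through 5:  C    301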
D    401
E    501
F    601
dtype: int64
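
.iloc follows the usual Python positional conventions, so negative positions, step values, and lists of positions also work. A small sketch on the same data, with illustrative variable names:

```python
import pandas as pd

ser = pd.Series({"A": 101, "B": 201, "C": 301, "D": 401,
                 "E": 501, "F": 601, "G": 701})

# Negative positions count from the end
last = ser.iloc[-1]

# A step of 2 takes every other element
every_other = ser.iloc[::2]

# A list of positions picks specific elements
chosen = ser.iloc[[0, 3]]

print(last)
print(every_other)
print(chosen)
```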

Assignment Operation on Series

Series values can be modified and accessed using the index label or position.

Label-based assignment: assign the value 1000 to the series element whose index label is “A”.

# Before Changes
print('Before Change: ', ser['A'])

# After changes
ser['A'] = 1000

print('After Change: ', ser['A'])

Output:

Before Change:  101
After Change:  1000

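Updating the third element of the series by position. Use .iloc here: on a series with string labels, plain ser[2] relies on a positional fallback that recent pandas versions deprecate.

# Before Changes
print('Before Change: ', ser.iloc[2])

# Assign new value
ser.iloc[2] = 3001

# After changes
print('After Change: ', ser.iloc[2])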

Output:

Before Change:  301
After Change:  3001
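
Several elements can be updated in one statement by assigning through .loc with a list of labels or a boolean mask. This is a sketch on a fresh copy of the series; the replacement values and the cap of 600 are illustrative.

```python
import pandas as pd

ser = pd.Series({"A": 101, "B": 201, "C": 301, "D": 401,
                 "E": 501, "F": 601, "G": 701})

# Set two labelled elements at once
ser.loc[["B", "C"]] = 0

# Cap every value above 600 at 600 using a boolean mask
ser.loc[ser > 600] = 600

print(ser)
```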

Delete Operation on Series

Pandas provides two methods, .pop() and .drop(), for deleting items from a series by index label. The .pop() method removes one item in place and returns its value.

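# Recreate the series from the original dict so the earlier assignments do not carry over
ser = pd.Series(data)

# Use .pop to remove the label 'E' in place and print the removed value
print(ser.pop('E'))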

Output:

501

Let’s check our series to verify the deletion of the item.

# After Deletion
print(ser)

Output:

A    101
B    201
C    301
D    401
F    601
G    701
dtype: int64

With the .pop method, you can delete only one label at a time. To delete several items by index label, pass a list of labels to .drop.

# Delete using Drop
print(ser.drop(['B', 'C']))

Output:

A    101
D    401
F    601
G    701
dtype: int64

Remember that, unlike .pop, the .drop method does not modify the original series “ser”; it returns a new series with the labels removed. Display “ser” to confirm.

# Display ser after using Drop
print(ser)

Output:

A    101
B    201
C    301
D    401
F    601
G    701
dtype: int64
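
To keep the result of .drop, assign it to a name. The sketch below also shows the `errors="ignore"` option, which skips labels that are not present instead of raising an error; the label "Z" is deliberately nonexistent and the variable names are illustrative.

```python
import pandas as pd

ser = pd.Series({"A": 101, "B": 201, "C": 301, "D": 401, "F": 601, "G": 701})

# Assign the returned series to keep the deletion
trimmed = ser.drop(["B", "C"])

# Ignore labels that do not exist instead of raising KeyError
ignored = ser.drop(["B", "Z"], errors="ignore")

print(trimmed)
print(ignored)
```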

Statistical Operations on Pandas Series

Pandas provides many methods for statistical operations on a series. NumPy functions such as np.sum and np.mean also accept a Series directly, so you can apply arithmetic and statistical operations much as you would on a NumPy array.

# Import numpy and pandas
import numpy as np
import pandas as pd

# Create data using random function
np.random.seed(1234)
data = np.random.randint(1, 20, 10)

ser = pd.Series(data)

# Print Series as ndarray
print("Series Value: ", ser.values)

# Sort series values
print("Sort Series Value: ", ser.sort_values().values)

# Length of series
print('Length of Series: ', len(ser))

# Sum of series
print('Sum of Series: ', np.sum(ser))

# mean of series
print('Mean of Series: ', np.mean(ser))

# Standard Deviation of series
print('Standard Deviation of series: ', np.std(ser))

# variance of series
print('Variance of Series: ', np.var(ser))

Output:

Series Value:  [16  7 13 16 18 10 12 13 17  6]
Sort Series Value:  [ 6  7 10 12 13 13 16 16 17 18]
Length of Series:  10
Sum of Series:  128
Mean of Series:  12.8
Standard Deviation of series:  3.919183588453085
Variance of Series:  15.36
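
The same statistics are available as Series methods. One detail worth knowing: the values printed above correspond to the population standard deviation and variance (ddof=0, the NumPy default), whereas `Series.std()` and `Series.var()` default to the sample versions (ddof=1). The sketch below uses the values from the output above, hard-coded so that it does not depend on the random seed.

```python
import pandas as pd

ser = pd.Series([16, 7, 13, 16, 18, 10, 12, 13, 17, 6])

total = ser.sum()
mean = ser.mean()

# Population standard deviation, matching the np.std result above
pop_std = ser.std(ddof=0)

# Sample standard deviation, the pandas default (ddof=1)
sample_std = ser.std()

print(total, mean)
print(pop_std, sample_std)
```

The sample standard deviation is slightly larger because it divides by n - 1 instead of n.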

Conclusion

A pandas Series is similar to a one-dimensional NumPy array. The major difference is that a Series lets the user explicitly define the index, whereas a NumPy array has only an implicitly defined integer index. As a result, a Series offers more control over how its elements are labeled and accessed.
