Pandas: DataFrames

If you analyze data in Python with the Pandas library, you will work with DataFrames constantly. This tutorial covers how to create a DataFrame, set a custom index, select rows and columns, and perform basic operations such as adding columns, deleting columns, and handling missing data.

What is a Pandas DataFrame?

A Pandas DataFrame is a two-dimensional, size-mutable tabular data structure with labeled axes (rows and columns). It is heterogeneous in the sense that each column can hold a different data type. Think of it as an in-memory spreadsheet that you manipulate programmatically.
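
The properties named in this definition can be checked directly on a small table. The following is a minimal sketch; the `people` DataFrame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
people = pd.DataFrame({
    'Name': ['Alice', 'Bob'],    # strings
    'Age': [25, 30],             # integers
    'Height': [1.65, 1.80],      # floats
})

# Labeled axes: a row index and a set of column labels.
print(people.index)
print(people.columns)

# Heterogeneous: each column carries its own dtype.
print(people.dtypes)

# Size-mutable: a column can be added after creation.
people['Member'] = [True, False]
print(people.shape)  # (2, 4)
```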

Creating a DataFrame

A DataFrame can be built from several kinds of input:

From a Dictionary of Series or Lists

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'San Francisco', 'Los Angeles']
}

df = pd.DataFrame(data)
print(df)
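
The example above uses lists as the dictionary values. When the values are Series instead, Pandas aligns the rows by index label rather than by position. A small sketch, with hypothetical `ages` and `cities` Series whose labels only partly overlap:

```python
import pandas as pd

# Hypothetical Series with partially overlapping index labels.
ages = pd.Series([25, 30], index=['alice', 'bob'])
cities = pd.Series(['New York', 'Los Angeles'], index=['alice', 'charlie'])

# Rows are aligned by label; a label missing from one Series
# produces a missing value (NaN) in that column.
aligned = pd.DataFrame({'Age': ages, 'City': cities})
print(aligned)
```

Because 'bob' has no city and 'charlie' has no age, those cells come out as missing values, and the Age column is stored as floats to accommodate them.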

From a List of Dictionaries

data = [
    {'Name': 'Alice', 'Age': 25, 'City': 'New York'},
    {'Name': 'Bob', 'Age': 30, 'City': 'San Francisco'},
    {'Name': 'Charlie', 'Age': 35, 'City': 'Los Angeles'}
]

df = pd.DataFrame(data)
print(df)
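
With a list of dictionaries, each dictionary becomes one row and the keys become column names. The dictionaries do not need identical keys. The sketch below uses a hypothetical `records` list to show what happens when a key is absent:

```python
import pandas as pd

# Hypothetical records; the second one has no 'Age' and the first no 'City'.
records = [
    {'Name': 'Alice', 'Age': 25},
    {'Name': 'Bob', 'City': 'San Francisco'},
]

sparse = pd.DataFrame(records)
print(sparse)

# Count the missing values per column.
print(sparse.isna().sum())
```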

From a CSV File

# Assumes a file named 'data.csv' exists in the current working directory
df = pd.read_csv('data.csv')
print(df)
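
To try `read_csv` without a file on disk, you can pass it a file-like object instead of a path. This sketch uses `io.StringIO` with made-up CSV text standing in for 'data.csv':

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'data.csv'.
csv_text = """Name,Age,City
Alice,25,New York
Bob,30,San Francisco
"""

# read_csv accepts a path or any file-like object.
from_csv = pd.read_csv(io.StringIO(csv_text))
print(from_csv)

# The first line is used as the header, and column types are inferred.
print(from_csv.dtypes)
```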

Setting Custom Indexes

When creating a DataFrame, you can pass the index argument to label the rows yourself. The list must contain exactly one label per row:

df = pd.DataFrame(data, index=['Student1', 'Student2', 'Student3'])
print(df)
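
An index can also be chosen after the DataFrame exists, by promoting one of its columns with `set_index`. A minimal sketch, assuming a hypothetical `StudentID` column that is not part of the examples above:

```python
import pandas as pd

# Hypothetical data with an identifier column.
students = pd.DataFrame({
    'StudentID': ['S1', 'S2', 'S3'],
    'Name': ['Alice', 'Bob', 'Charlie'],
})

# set_index returns a new DataFrame; 'students' itself is unchanged.
by_id = students.set_index('StudentID')
print(by_id.loc['S2', 'Name'])

# reset_index moves the index back into a regular column.
restored = by_id.reset_index()
print(restored)
```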

Accessing Data in a DataFrame

Selecting Columns

print(df['Name'])
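
Selecting with a single label returns a Series, while selecting with a list of labels returns a DataFrame. The sketch below uses a hypothetical `frame` so that it runs on its own:

```python
import pandas as pd

# Hypothetical data mirroring the columns used in this tutorial.
frame = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'City': ['New York', 'San Francisco', 'Los Angeles'],
})

name_series = frame['Name']        # single label: a Series
subset = frame[['Name', 'Age']]    # list of labels: a DataFrame

print(type(name_series).__name__)
print(type(subset).__name__)
```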

Selecting Rows

Using loc for label-based indexing:

print(df.loc['Student1'])

Using iloc for position-based indexing:

print(df.iloc[0])
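
Both indexers also accept a column selector and slices. One difference is worth knowing: a label slice with `loc` includes its end label, while a position slice with `iloc` excludes its stop position. A short sketch on a hypothetical `frame`:

```python
import pandas as pd

# Hypothetical data with custom row labels.
frame = pd.DataFrame(
    {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
    index=['Student1', 'Student2', 'Student3'],
)

# Row label and column label together select a single value.
print(frame.loc['Student2', 'Age'])

# Label slices include both endpoints.
by_label = frame.loc['Student1':'Student2']

# Position slices exclude the stop position, as in regular Python.
by_position = frame.iloc[0:2]

# loc also accepts a boolean mask to filter rows.
older = frame.loc[frame['Age'] > 26, 'Name']
print(older)
```

Here `by_label` and `by_position` contain the same two rows even though the slice bounds look different.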

DataFrame Basic Operations

Adding a New Column

df['Score'] = [85, 90, 88]
print(df)
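
A new column does not have to come from a literal list; it can be computed from existing columns. The sketch below uses a hypothetical `scores` DataFrame and an arbitrary pass mark of 88, and also shows `assign`, which returns a copy with the extra column instead of modifying the original:

```python
import pandas as pd

# Hypothetical data for illustration.
scores = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [85, 90, 88],
})

# Vectorized comparison: evaluated for every row at once.
scores['Passed'] = scores['Score'] >= 88

# assign returns a new DataFrame and leaves 'scores' unchanged.
with_bonus = scores.assign(Bonus=scores['Score'] + 5)
print(with_bonus)
```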

Deleting a Column

df.drop('Score', axis=1, inplace=True)
print(df)
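
As a complement to the in-place call above, `drop` without `inplace=True` returns a new DataFrame and leaves its input untouched, which helps when the original data is still needed. A minimal sketch on a hypothetical `frame`:

```python
import pandas as pd

# Hypothetical data for illustration.
frame = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [85, 90, 88],
})

# columns= is an alternative to passing axis=1.
trimmed = frame.drop(columns='Score')
print(trimmed.columns.tolist())
print(frame.columns.tolist())   # 'Score' is still present here

# errors='ignore' skips labels that do not exist instead of raising KeyError.
same = trimmed.drop(columns='Score', errors='ignore')
```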

Handling Missing Data

df['Score'] = [85, None, 88]
print(df.fillna(0))
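
Filling with 0 is only one option, and `fillna` returns a new DataFrame rather than changing the one it is called on. The sketch below, on a hypothetical `frame`, shows how to count missing values, drop incomplete rows, and fill with the column mean instead. Which strategy is appropriate depends on the data.

```python
import pandas as pd

# Hypothetical data with one missing score.
frame = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Score': [85, None, 88],
})

# Count missing values in a column.
print(frame['Score'].isna().sum())

# Drop rows where 'Score' is missing.
complete = frame.dropna(subset=['Score'])

# Fill missing scores with the mean of the available ones.
filled = frame.fillna({'Score': frame['Score'].mean()})
print(filled)
```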

Conclusion

The Pandas DataFrame is the central tool for data manipulation and analysis with Pandas. This tutorial covered creating DataFrames, setting a custom index, selecting rows and columns, adding and deleting columns, and filling in missing data. For the full list of DataFrame functionality, see the official Pandas documentation.


Version 1.0

This is currently an early version of the learning material and it will be updated over time with more detailed information.

A video will be provided with the learning material as well.
