
Pandas DataFrame: Understanding the Difference Between loc and iloc #pandas #python #dataengineer

Jul 29, 2023

In this video, we look at one of the fundamental concepts of working with pandas DataFrames: the difference between `loc` and `iloc`. Both are used to access and modify data in a DataFrame, but they locate data differently: `loc` does label-based indexing and `iloc` does integer-position-based indexing. We walk through examples that show the key distinctions and when to use each one in your data analysis projects.

In pandas, `loc` and `iloc` are indexers: they are used with square brackets (`df.loc[...]`, `df.iloc[...]`) rather than called like methods. Their behavior differs as follows:

1. `loc` (label-based indexing): `loc` selects by the **labels** of rows and columns, meaning you refer to rows and columns by their actual index labels and column names. The syntax is `df.loc[row_label, column_label]`.

Example:

```python
import pandas as pd

# Create a sample DataFrame with string row labels
data = {
    'Name': ['John', 'Jane', 'Mike', 'Alice'],
    'Age': [25, 30, 22, 28],
    'City': ['New York', 'London', 'Paris', 'Tokyo']
}
df = pd.DataFrame(data, index=['A', 'B', 'C', 'D'])

# Using loc to access data by row label and column name
print(df.loc['B', 'Age'])   # Output: 30
print(df.loc['C', 'Name'])  # Output: Mike
```

2. `iloc` (integer-based indexing): `iloc` selects by the **integer positions** of rows and columns, meaning you refer to rows and columns by their zero-based position, regardless of their labels. The syntax is `df.iloc[row_position, column_position]`.

Example:

```python
import pandas as pd

# Create a sample DataFrame (default integer index 0..3)
data = {
    'Name': ['John', 'Jane', 'Mike', 'Alice'],
    'Age': [25, 30, 22, 28],
    'City': ['New York', 'London', 'Paris', 'Tokyo']
}
df = pd.DataFrame(data)

# Using iloc to access data by zero-based position
print(df.iloc[1, 1])  # Output: 30 (row at position 1, column at position 1)
print(df.iloc[2, 0])  # Output: Mike (row at position 2, column at position 0)
```

**Key Differences:**

1. **Input type:**
   - `loc` uses labels (index labels and column names) to access data.
   - `iloc` uses integer positions (zero-based) to access data.
2. **Slicing behavior:**
   - With `loc`, the end label of a slice is **inclusive**.
   - With `iloc`, the end position of a slice is **exclusive**, just like Python's regular slicing.
3. **Indexing errors:**
   - Using `loc` with a label that doesn't exist in the DataFrame raises a `KeyError`.
   - Using `iloc` with a single position that is out of bounds (for example, equal to or greater than the number of rows) raises an `IndexError`.

It's essential to understand the distinction between `loc` and `iloc`, since using the wrong one can lead to unexpected results or errors when accessing data in pandas DataFrames.
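The slicing and error behavior listed under the key differences can be checked with a short sketch. It reuses the sample data from the examples in this description; the variable names (`loc_slice`, `iloc_slice`, `loc_error`, `iloc_error`) are illustrative and not taken from the video.

```python
import pandas as pd

# Same sample data as in the examples, with string row labels.
df = pd.DataFrame(
    {
        "Name": ["John", "Jane", "Mike", "Alice"],
        "Age": [25, 30, 22, 28],
        "City": ["New York", "London", "Paris", "Tokyo"],
    },
    index=["A", "B", "C", "D"],
)

# loc slices by label and includes the end label: rows A, B and C.
loc_slice = df.loc["A":"C", "Name"]
print(loc_slice.tolist())   # ['John', 'Jane', 'Mike']

# iloc slices by position and excludes the end position: rows 0 and 1.
iloc_slice = df.iloc[0:2, 0]
print(iloc_slice.tolist())  # ['John', 'Jane']

# A label that is not in the index raises KeyError with loc.
try:
    df.loc["Z", "Age"]
except KeyError as exc:
    loc_error = type(exc).__name__

# A single out-of-range position raises IndexError with iloc.
try:
    df.iloc[10, 1]
except IndexError as exc:
    iloc_error = type(exc).__name__

print(loc_error, iloc_error)  # KeyError IndexError

# Out-of-range slices do not raise with iloc; they are truncated
# the same way Python list slices are.
print(len(df.iloc[2:10]))   # 2
```

Note that the `IndexError` applies to single positions only: an `iloc` slice that runs past the end of the DataFrame returns whatever rows exist, which matches ordinary Python slicing.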

