EDA using pandas and duckdb

Name: EDA using pandas and duckdb
Uploaded: Mar 27, 2026
Duration: 1383 s

AIgineer (1.02K subscribers)

168 views



The video demonstrates an exploratory data analysis (EDA) workflow using pandas and DuckDB on a Kaggle salaries CSV dataset. It sets up a Python environment and a Jupyter notebook, loads the data with pandas, and inspects the rows, columns, shape, and info output to confirm that there are no nulls. It then reviews key fields such as work year, company location, experience level, salary currency, and job title using value counts. Next, the data is cleaned with DuckDB and pandas. After exporting a cleaned CSV, the video creates plots with matplotlib and pandas.

GitHub repo: https://github.com/kokchun/youtube_demos/tree/main/eda_pandas_duckdb

#duckdb #pandas

00:00 Intro to EDA Setup
00:43 Project Environment Setup
01:53 EDA Mindset and Goals
02:45 Load Data and Inspect
05:56 Quick Profiling with Counts
06:51 Spotting Cleaning Needs
10:22 DuckDB for Title Analysis
13:35 Clean Titles and Levels
17:54 Export Cleaned CSV
18:17 Visualize Top Job Roles
20:09 Yearly Trends Plotting
22:24 Wrap Up and Next Steps
