EDA using pandas and duckdb

168 views
Mar 27, 2026
23:03

This video demonstrates an exploratory data analysis (EDA) workflow using pandas and DuckDB on a Kaggle salaries CSV dataset. It sets up a Python environment and a Jupyter notebook, loads the data with pandas, and inspects rows, columns, shape, and info to confirm there are no nulls. It then reviews key fields such as work year, company location, experience level, salary currency, and job title using value counts. Next, the data is cleaned with DuckDB and pandas. After exporting a cleaned CSV, the video creates plots with matplotlib and pandas.

GitHub repo: https://github.com/kokchun/youtube_demos/tree/main/eda_pandas_duckdb

#duckdb #pandas

00:00 Intro to EDA Setup
00:43 Project Environment Setup
01:53 EDA Mindset and Goals
02:45 Load Data and Inspect
05:56 Quick Profiling with Counts
06:51 Spotting Cleaning Needs
10:22 DuckDB for Title Analysis
13:35 Clean Titles and Levels
17:54 Export Cleaned CSV
18:17 Visualize Top Job Roles
20:09 Yearly Trends Plotting
22:24 Wrap Up and Next Steps