1. Data Loading and Preprocessing:
Load the dataset from both CSV and Excel files using Pandas.
Display the first 5 rows and basic dataset info (.head(), .info(), .describe()).
Check for missing values and handle them (drop, fill, or impute).
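The loading and missing-value steps above might look roughly like the sketch below. The `load_table` helper, the inline sample data, and the column names are illustrative assumptions, not part of the original task; a real run would pass a file path instead of the in-memory text.

```python
import io

import pandas as pd


def load_table(source, kind="csv"):
    """Hypothetical helper: load a CSV or Excel source into a DataFrame."""
    if kind == "csv":
        return pd.read_csv(source)
    if kind == "excel":
        # read_excel needs an Excel engine (for example openpyxl) installed
        return pd.read_excel(source)
    raise ValueError(f"unsupported kind: {kind!r}")


# Assumed sample data standing in for a real file such as "data.csv"
csv_text = "name,age,city\nAna,34,Lima\nBen,,Oslo\nCy,29,\n"
df = load_table(io.StringIO(csv_text))

# First rows and basic dataset info
print(df.head())
df.info()
print(df.describe())

# Count missing values per column
missing_counts = df.isna().sum()
print(missing_counts)

# Option 1: fill (impute) numeric gaps with the median, text gaps with a label
filled = df.copy()
filled["age"] = filled["age"].fillna(filled["age"].median())
filled["city"] = filled["city"].fillna("Unknown")

# Option 2: drop every row that has at least one missing value
dropped = df.dropna()
```

Whether to drop or fill depends on how much data is missing and why; dropping is simplest but discards whole rows.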
2. Data Cleaning and Feature Engineering:
Identify duplicate rows and remove them.
Convert categorical variables into numerical ones if needed (using LabelEncoder or one-hot encoding).
Normalize/scale numerical features if necessary.
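As a minimal sketch of the cleaning steps above, the example below uses pandas-only equivalents so it runs without extra dependencies: category codes stand in for scikit-learn's `LabelEncoder`, and `pd.get_dummies` stands in for a one-hot encoder. The sample data and column names are assumptions made for illustration.

```python
import pandas as pd

# Assumed sample data; rows 0 and 2 are exact duplicates
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green", "blue"],
    "size": [10.0, 20.0, 10.0, 40.0, 30.0],
})

# Identify duplicate rows, then remove them
n_dupes = df.duplicated().sum()
deduped = df.drop_duplicates().reset_index(drop=True)

# Label-style encoding: one integer per category (sorted alphabetically)
deduped["color_code"] = deduped["color"].astype("category").cat.codes

# One-hot encoding: one indicator column per category
one_hot = pd.get_dummies(deduped["color"], prefix="color")

# Min-max scaling to the range [0, 1]
size = deduped["size"]
deduped["size_minmax"] = (size - size.min()) / (size.max() - size.min())

# Standardization (z-score): zero mean, unit variance
deduped["size_z"] = (size - size.mean()) / size.std(ddof=0)

print(deduped)
print(one_hot)
```

Integer codes imply an ordering between categories, so one-hot encoding is usually the safer choice for unordered categories.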
3. Exploratory Data Analysis (EDA):
Check basic statistics: Mean, median, mode, standard deviation.
Visualize distributions: Use histograms, boxplots, and KDE plots.
Identify correlations: Use a heatmap to display relationships between variables.
Outlier Detection: Use boxplots or scatter plots.
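The numeric side of these EDA steps can be sketched as follows. The sample data is an assumption, and the 1.5 × IQR rule shown for outliers is one common convention (the same rule boxplot whiskers use by default), not the only valid method.

```python
import pandas as pd

# Assumed sample data; the height of 250 is a deliberate outlier
df = pd.DataFrame({
    "height": [150, 160, 160, 170, 180, 250],
    "weight": [50, 60, 62, 70, 80, 82],
})

# Basic statistics per column
summary = pd.DataFrame({
    "mean": df.mean(),
    "median": df.median(),
    "mode": df.mode().iloc[0],  # first mode if there are several
    "std": df.std(),
})
print(summary)

# Pairwise Pearson correlations (the matrix a heatmap would display)
corr = df.corr()
print(corr)

# Outlier detection with the 1.5 * IQR rule
q1 = df["height"].quantile(0.25)
q3 = df["height"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = df[(df["height"] < lower) | (df["height"] > upper)]
print(outliers)
```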
4. Data Visualization:
Univariate Analysis: Plot bar charts, histograms, and pie charts to analyze individual features.
Bivariate Analysis: Create scatter plots, pair plots, or violin plots to explore relationships.
Multivariate Analysis: Use heatmaps and correlation matrices to identify patterns.
Custom Visualization: Provide one insightful visualization of your choice.
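One possible way to cover the univariate, bivariate, and multivariate plots above is sketched below with plain Matplotlib. The synthetic data, column names, and figure layout are assumptions; libraries such as Seaborn offer shorter calls for pair plots, violin plots, and heatmaps.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen; omit this line in a notebook

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumed synthetic data: y depends linearly on x plus noise
rng = np.random.default_rng(0)
df = pd.DataFrame({"x": rng.normal(size=100)})
df["y"] = 2 * df["x"] + rng.normal(scale=0.5, size=100)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Univariate: histogram of a single feature
axes[0].hist(df["x"], bins=20)
axes[0].set_title("Distribution of x")

# Bivariate: scatter plot of two features
axes[1].scatter(df["x"], df["y"], s=10)
axes[1].set_xlabel("x")
axes[1].set_ylabel("y")
axes[1].set_title("x vs. y")

# Multivariate: correlation matrix drawn as a heatmap
corr = df.corr()
image = axes[2].imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
axes[2].set_xticks(range(len(corr.columns)))
axes[2].set_xticklabels(corr.columns)
axes[2].set_yticks(range(len(corr.columns)))
axes[2].set_yticklabels(corr.columns)
axes[2].set_title("Correlation heatmap")
fig.colorbar(image, ax=axes[2])

fig.tight_layout()
# fig.savefig("eda_overview.png")  # or plt.show() in an interactive session
```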
Exploratory Data Analysis (EDA) and Data Visualization from CSV and Excel files | NatokHD