R programming | Kernel density plots

357 views
Jan 1, 2024
6:14

Kernel density estimation is a nonparametric method for estimating the probability density function of a random variable. Basically, we’re trying to draw a smoothed histogram, where the area under the curve equals 1. Kernel density plots can be an effective way to view the distribution of a continuous variable.

The format for a density plot is:

    ggplot(data, aes(x = contvar)) + geom_density()

where data is a data frame and contvar is a continuous variable. The density plot can be filled with color.

The smoothness of the curve is controlled with a bandwidth parameter. A larger bandwidth gives a smoother curve with less detail; a smaller bandwidth gives a more jagged curve.

    bw.nrd0(cars2008$cty)   # display the default bandwidth (1.41)

    ggplot(cars2008, aes(x = cty)) +
      geom_density(fill = "red", bw = .5) +
      labs(title = "Kernel density plot with bw=0.5",
           x = "City Miles Per Gallon")

Kernel density plots can also be used to compare groups. For this example, we’ll compare the 2008 city gas mileage estimates for four-, six-, and eight-cylinder cars using both colored and filled density plots:

    # colored lines, one per cylinder group
    # (linewidth replaces the size argument for lines, deprecated since ggplot2 3.4.0)
    ggplot(cars2008, aes(x = cty, color = Cylinders, linetype = Cylinders)) +
      geom_density(linewidth = 1) +
      labs(title = "Fuel Efficiency by Number of Cylinders",
           x = "City Miles per Gallon")

    # semi-transparent filled curves, one per cylinder group
    ggplot(cars2008, aes(x = cty, fill = Cylinders)) +
      geom_density(alpha = .4) +
      labs(title = "Fuel Efficiency by Number of Cylinders",
           x = "City Miles per Gallon")

#rprogramming #rstudio #ggplot #kerneldensity #kernel #kerneldensitycurve #rdatacode
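To make the bandwidth idea concrete, here is a minimal base R sketch that needs no extra packages. It is an illustration under assumptions, not the video's own code: the built-in mtcars$mpg column stands in for cars2008$cty, and the helper functions area() and n_peaks() are hypothetical names introduced here. It checks that the area under the estimated curve is close to 1 and that a narrower bandwidth produces at least as many bumps as a wider one.

```r
# Assumption: mtcars$mpg (built into R) stands in for the video's cars2008$cty.
x <- mtcars$mpg

# Rule-of-thumb bandwidth; density() uses this rule by default.
bw_default <- bw.nrd0(x)

d_default <- density(x)                        # default bandwidth
d_narrow  <- density(x, bw = bw_default / 4)   # less smoothing, more jagged
d_wide    <- density(x, bw = bw_default * 4)   # more smoothing, fewer details

# Hypothetical helper: trapezoid-rule approximation of the area under the curve.
area <- function(d) {
  n <- length(d$x)
  sum(diff(d$x) * (d$y[-1] + d$y[-n]) / 2)
}

# Hypothetical helper: count local maxima (points where the slope turns from + to -).
n_peaks <- function(d) sum(diff(sign(diff(d$y))) == -2)

cat("default bandwidth:", round(bw_default, 2), "\n")
cat("area under default curve:", round(area(d_default), 3), "\n")
cat("peaks (narrow, default, wide):",
    n_peaks(d_narrow), n_peaks(d_default), n_peaks(d_wide), "\n")
```

In ggplot2, the same bandwidth value can be passed to geom_density() through its bw argument, as in the bw=.5 example above.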