2.5.2 Nearest Neighbour Methods - Pattern Recognition and Machine Learning
In this final section of Chapter 2 we discuss nearest-neighbour methods for estimating probability densities. We return to the idea of estimating the density at a given point as the fraction of the dataset contained in a small volume around that point, normalized by the volume.

Standard kernel density estimators keep the volume fixed and count the number of data points it contains. This can be problematic because it imposes a smoothing of fixed resolution over the entire dataset, which may be inappropriate if the true density varies from region to region. Depending on the kernel used (e.g. a rectangular kernel), it can also produce density estimates that are exactly zero, which may not be appropriate.

In contrast, nearest-neighbour methods keep the number of neighbours K fixed and expand the volume as necessary to encompass that many data points. This dynamically adapts the resolution to the local density of the data, but comes at the cost of not yielding a true density model, since the integral of the estimate over an unbounded space diverges. We end by discussing how nearest-neighbour density estimates can be used for classification, resulting in nearest-neighbour classifiers, which assign a point to whichever class is most represented among its K nearest neighbours.
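The two ideas above can be sketched in a few lines of NumPy. This is a minimal one-dimensional illustration, not the book's implementation: `knn_density` estimates the density at a point as K/(N·V), where V is the length of the smallest interval around the point containing its K nearest data points, and `knn_classify` assigns the majority label among the K nearest neighbours. All function and variable names here are illustrative.

```python
import numpy as np

def knn_density(x, data, k):
    """K-nearest-neighbour density estimate at x for 1-D data.

    Uses p(x) ~ K / (N * V), where V is the length of the smallest
    symmetric interval around x containing the k nearest points.
    """
    distances = np.sort(np.abs(data - x))
    r = distances[k - 1]          # radius enclosing the k nearest points
    volume = 2.0 * r              # "volume" of the interval [x - r, x + r]
    return k / (len(data) * volume)

def knn_classify(x, data, labels, k):
    """Assign x to the class most represented among its k nearest neighbours."""
    idx = np.argsort(np.abs(data - x))[:k]
    classes, counts = np.unique(labels[idx], return_counts=True)
    return classes[np.argmax(counts)]

# Toy example (illustrative data only)
data = np.array([0.0, 1.0, 2.0, 10.0])
labels = np.array([0, 0, 1, 1])
print(knn_density(0.5, data, k=2))      # 2 / (4 * 1.0) = 0.5
print(knn_classify(0.5, data, labels, k=3))  # majority of labels {0, 0, 1} -> 0
```

Note that, as the text warns, summing or integrating `knn_density` over all of space does not give one, so it is a local estimate rather than a proper density model.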