We consider how to make optimal decisions when different types of errors incur different costs. We introduce the notion of the loss function, or the loss matrix when working with discrete classes, to capture these costs. We define the expected loss and show that it can be minimized pointwise by assigning each data point to the class for which it incurs the lowest average loss. We conclude by exhibiting the loss matrix for which minimizing the expected loss reduces to the minimum misclassification rate rule discussed in the previous section. We also explicitly derive the intuitively clear result that, when we are certain about the class of a data point, the expected loss is minimized by assigning it to that class.
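The pointwise decision rule described above can be sketched numerically. The sketch below uses a hypothetical 2-class loss matrix and made-up posterior probabilities (none of these numbers come from the text): for each data point we compute the expected loss of every candidate assignment, pick the class with the lowest value, and check that the 0-1 loss matrix recovers the most-probable-class rule.

```python
import numpy as np

# Hypothetical loss matrix L[k, j]: cost of assigning class j when the true
# class is k. Here misclassifying class 0 as class 1 costs 10; the reverse
# mistake costs only 1. Correct decisions cost nothing (zero diagonal).
L = np.array([[0.0, 10.0],
              [1.0,  0.0]])

# Hypothetical posterior probabilities p(C_k | x) for three data points.
posteriors = np.array([[0.9, 0.1],
                       [0.4, 0.6],
                       [1.0, 0.0]])

# Expected loss of assigning each point to class j: sum_k L[k, j] p(C_k | x).
expected_loss = posteriors @ L          # shape (n_points, n_classes)

# Minimize pointwise: each point gets the class with lowest expected loss.
decisions = expected_loss.argmin(axis=1)
print(decisions)        # the asymmetric costs push decisions toward class 0

# With the 0-1 loss matrix (1 minus the identity), the expected loss of
# class j is 1 - p(C_j | x), so minimizing it just picks the most probable
# class: the minimum misclassification rate rule.
zero_one = 1.0 - np.eye(2)
mmr_decisions = (posteriors @ zero_one).argmin(axis=1)
print(mmr_decisions)

# Certainty case: the third point has posterior [1, 0]. Its expected loss
# for class j is exactly L[0, j], which (zero diagonal, positive
# off-diagonal) is minimized at the true class, as derived in the text.
```

Note that the second point, although more likely to belong to class 1, is still assigned to class 0 under the asymmetric loss matrix, because the cost of wrongly predicting class 1 dominates.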