Training HMMs and using them to recognize sequential patterns are framed in terms of the 'three problems' of HMMs.
We begin with 'embedded' (simultaneous) training of HMMs for subword (phone-like) units using an iterative training algorithm. Initial estimation of the HMM parameters is illustrated with a simple example. The intuition behind re-estimation of the HMM parameters is explained. The recursive computation of the likelihood that a partial feature vector sequence was generated by an HMM is also explained here, as a prelude to the forward-backward algorithm.
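As a rough illustration of the recursive likelihood computation mentioned above, here is a minimal Python sketch of the forward recursion. It is not taken from the slides: the function name, the 2-state model, and all probability values are hypothetical, and it assumes discrete observation symbols for brevity, whereas the lecture deals with feature vectors (for which the emission term would typically be a density such as a GMM).

```python
def forward_partial_likelihoods(pi, A, B, obs):
    """Return alpha, where alpha[t][j] is the joint probability of the
    first t+1 observations and of being in state j at time t."""
    n_states = len(pi)
    # Initialization: start in state j and emit the first observation.
    alpha = [[pi[j] * B[j][obs[0]] for j in range(n_states)]]
    # Recursion: extend every partial sequence by one observation.
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([
            sum(prev[i] * A[i][j] for i in range(n_states)) * B[j][o]
            for j in range(n_states)
        ])
    return alpha


# Hypothetical 2-state HMM with 2 discrete symbols (values are made up).
pi = [0.6, 0.4]                  # initial state probabilities
A = [[0.7, 0.3], [0.4, 0.6]]     # A[i][j]: transition probability i -> j
B = [[0.5, 0.5], [0.1, 0.9]]     # B[j][k]: probability that state j emits symbol k
obs = [0, 1, 1]                  # observed symbol indices

alpha = forward_partial_likelihoods(pi, A, B, obs)
likelihood = sum(alpha[-1])      # likelihood of the whole sequence
print(likelihood)
```

Each row of alpha reuses the previous row, so the cost grows linearly with the sequence length instead of exponentially with the number of possible state sequences. For long sequences a practical implementation would usually scale the rows or work in the log domain to avoid underflow.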
The slides are at
http://iitg.ernet.in/samudravijaya/tutorialSlides/ASR_MFCC_DTW_HMM_GMM_LM.pdf