This paper describes an embedded architecture that couples usable linguistic knowledge with innovative language models and modeling approaches for intensive language modeling in speech recognition. In this embedded mechanism, three innovative language modeling approaches at different levels, i.e., the composite N-gram, the distance-related unit association maximum entropy (DU-AME) model, and the linkgram, serve different functions: they extend the definitions of the basic language units, improve the underlying model over conventional N-grams, and combine it effectively with longer-history syntactic link dependency knowledge, respectively. In this three-level hybrid language modeling, each lower-level model serves the higher-level model(s): the results at each level are utilized, or embedded, in the higher level(s), and the models can be trained level by level. Accordingly, prospective language constraints can finally be embedded in a well-organized hybrid model. Experimental results based on the embedded modeling show that the hybrid model reduces WER by 14.5% relative to the conventional word-based bigram model. It can therefore be expected to improve on conventional statistical language modeling.
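As a rough illustrative sketch only (not the paper's actual formulation), the level-by-level embedding can be pictured as each modeling level refining the probability estimate produced by the level below it. All function names, scores, and the additive corrections here are hypothetical placeholders standing in for the composite N-gram, DU-AME, and linkgram components:

```python
# Hypothetical sketch of three-level embedded language modeling: each level
# wraps and adjusts the log-probability of the level below. The toy tables
# and correction terms are invented for illustration only.

def composite_ngram_logprob(word, history):
    # Level 1 (hypothetical): a bigram over composite units, here a toy table.
    table = {("<s>", "the"): -5.0, ("the", "model"): -1.0, ("model", "works"): -1.5}
    prev = history[-1] if history else "<s>"
    return table.get((prev, word), -8.0)

def duame_logprob(word, history):
    # Level 2 (hypothetical): a distance-related unit-association correction
    # applied on top of the level-1 estimate.
    correction = -0.2 if history and history[0] == "the" else -0.4
    return composite_ngram_logprob(word, history) + correction

def linkgram_logprob(word, history):
    # Level 3 (hypothetical): a long-span link-dependency correction applied
    # on top of the level-2 estimate.
    correction = -0.1 if len(history) > 1 else -0.3
    return duame_logprob(word, history) + correction

def sentence_logprob(words):
    # Score a sentence with the top-level (fully embedded) model.
    return sum(linkgram_logprob(w, words[:i]) for i, w in enumerate(words))

print(round(sentence_logprob(["the", "model", "works"]), 2))  # prints -9.0
```

The point of the sketch is only the control flow: lower-level estimates are consumed, not replaced, by the higher levels, which mirrors the level-by-level training and embedding described above.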