Abstract
Formulaic sequences (FSs), or prefabricated multi-word structures (e.g. on the other hand), are often difficult to identify objectively. Current corpus-driven methods yield structurally incomplete, overlapping, or overly extended structures of questionable psychological validity and pedagogical usefulness. To address these limitations, this study evaluated transitional probability as a potential metric for improving the identification of FSs. One hundred four-word sequences from the British National Corpus, varying in the transitional probabilities between their words, were presented to native and non-native speakers of English (N = 293) in a sequence completion task (e.g. for the sake__). Results revealed that applying transitional probability reduces many of the problems associated with current approaches to FS identification and can produce lists of FSs that are more functionally salient and psychologically valid.
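The abstract does not state how the transitional probabilities were calculated. As a rough illustration only, forward transitional probability is commonly defined as the number of times a word pair occurs divided by the number of times its first word occurs. The Python sketch below assumes that definition; the function name and the toy token list are hypothetical and are not taken from the study or from the British National Corpus.

```python
from collections import Counter


def forward_transitional_probabilities(tokens):
    """Map each adjacent pair (w1, w2) to count(w1 w2) / count(w1).

    Assumed definition of forward transitional probability; the study's
    own computation may differ.
    """
    bigram_counts = Counter(zip(tokens, tokens[1:]))
    # Count each word only where it has a successor, so the
    # probabilities out of any given word sum to 1.
    first_word_counts = Counter(tokens[:-1])
    return {
        (w1, w2): count / first_word_counts[w1]
        for (w1, w2), count in bigram_counts.items()
    }


# Toy token list invented for illustration; not data from the study.
tokens = "for the sake of argument and for the record".split()
tp = forward_transitional_probabilities(tokens)

# Transitional probabilities across one candidate sequence.
sequence = ["for", "the", "sake", "of"]
sequence_tps = [tp[pair] for pair in zip(sequence, sequence[1:])]
print(sequence_tps)
```

In this toy sample, "the" is followed by two different words, so the transition from "the" to "sake" is less predictable than the transition from "sake" to "of". A drop of this kind between adjacent words is the sort of signal that could mark a sequence boundary.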
Original language | English |
---|---|
Pages (from-to) | 24-43 |
Number of pages | 20 |
Journal | International Journal of Applied Linguistics (United Kingdom) |
Volume | 27 |
Issue number | 1 |
Publication status | Published - 2017 Mar 1 |
Externally published | Yes |
Keywords
- corpus-driven research
- formulaic language
- formulaic sequences
- lexical bundles
- n-grams
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language