ProtAnt

A tool for analysing the prototypicality of texts

Laurence Anthony, Paul Baker

    Research output: Contribution to journal (article)

    6 Citations (Scopus)

    Abstract

    Corpus-based researchers and traditional qualitative researchers, such as those interested in critical discourse analysis, are often required to select prototypical texts for close reading that include the language features of interest that are present in a much larger corpus. Traditional approaches to this selection procedure have been largely ad hoc. In this paper, we offer a more principled way of selecting texts for close reading based on a ranking of texts in terms of the number of keywords they contain. To facilitate this analysis, we have developed a multiplatform, freeware software tool called ProtAnt that analyses the texts, generates a ranked list of keywords based on statistical significance and effect size, and then orders the texts by the number of keywords in them. We describe various experiments that demonstrate the ProtAnt analysis is effective not only at identifying prototypical texts, but also identifying outlier texts that may need to be removed from a target corpus.
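
The procedure outlined in the abstract (score words for keyness, keep the significant ones, then order texts by how many keywords they contain) can be illustrated with a minimal sketch. This is not ProtAnt's implementation. The abstract names only "statistical significance and effect size", so the specific measures here are assumptions: log-likelihood (G2) as the significance statistic and a log2 ratio of relative frequencies as the effect size. The tokenizer, the 3.84 cut-off, the choice to count distinct keyword types per text, and all function names are likewise hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (an assumption; real tools offer more options)."""
    return re.findall(r"[a-z']+", text.lower())


def log_likelihood(a, b, c, d):
    """Two-corpus log-likelihood (G2) for a word seen a times in a target
    corpus of c tokens and b times in a reference corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the target corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll


def keywords(target_texts, reference_texts, ll_threshold=3.84, top_n=10):
    """Return up to top_n positive keywords of the target corpus, ranked by
    log-likelihood, then by log ratio (assumed effect-size measure)."""
    target = Counter(t for text in target_texts.values() for t in tokenize(text))
    ref = Counter(t for text in reference_texts for t in tokenize(text))
    c, d = sum(target.values()), sum(ref.values())
    scored = []
    for word, a in target.items():
        b = ref[word]
        if a / c <= b / d:
            continue  # keep only words over-represented in the target corpus
        ll = log_likelihood(a, b, c, d)
        if ll >= ll_threshold:
            # 0.5 stands in for a zero reference frequency to avoid dividing by zero
            log_ratio = math.log2((a / c) / ((b or 0.5) / d))
            scored.append((word, ll, log_ratio))
    scored.sort(key=lambda s: (-s[1], -s[2], s[0]))
    return [word for word, _, _ in scored[:top_n]]


def rank_texts(target_texts, keyword_list):
    """Order texts by the number of distinct keywords each one contains.
    Texts at the top are candidate prototypical texts; texts at the bottom
    are candidate outliers."""
    kws = set(keyword_list)
    counts = {name: len(kws & set(tokenize(text)))
              for name, text in target_texts.items()}
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
```

Here `target_texts` is assumed to be a dictionary mapping file names to text, and `reference_texts` a list of strings. Calling `rank_texts(target_texts, keywords(target_texts, reference_texts))` returns the texts from most to least keyword-rich. Whether to count keyword types or tokens, and whether to normalize by text length, are design choices this sketch does not settle.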

    Original language: English
    Pages (from-to): 273-292
    Number of pages: 20
    Journal: International Journal of Corpus Linguistics
    Volume: 20
    Issue number: 3
    DOI: 10.1075/ijcl.20.3.01ant
    Publication status: Published - 2015

    Keywords

    • Critical discourse analysis
    • Keywords
    • ProtAnt
    • Prototypicality
    • Qualitative research

    ASJC Scopus subject areas

    • Language and Linguistics
    • Linguistics and Language

    Cite this

    ProtAnt: A tool for analysing the prototypicality of texts. / Anthony, Laurence; Baker, Paul.

    In: International Journal of Corpus Linguistics, Vol. 20, No. 3, 2015, pp. 273-292.

    @article{377594def2cc46438a166be0219bfdd1,
    title = "ProtAnt: A tool for analysing the prototypicality of texts",
    keywords = "Critical discourse analysis, Keywords, ProtAnt, Prototypicality, Qualitative research",
    author = "Laurence Anthony and Paul Baker",
    year = "2015",
    doi = "10.1075/ijcl.20.3.01ant",
    language = "English",
    volume = "20",
    pages = "273--292",
    journal = "International Journal of Corpus Linguistics",
    issn = "1384-6655",
    publisher = "John Benjamins Publishing Company",
    number = "3",

    }

    TY  - JOUR
    T1  - ProtAnt
    T2  - A tool for analysing the prototypicality of texts
    AU  - Anthony, Laurence
    AU  - Baker, Paul
    PY  - 2015
    Y1  - 2015
    KW  - Critical discourse analysis
    KW  - Keywords
    KW  - ProtAnt
    KW  - Prototypicality
    KW  - Qualitative research
    UR  - http://www.scopus.com/inward/record.url?scp=84940649944&partnerID=8YFLogxK
    UR  - http://www.scopus.com/inward/citedby.url?scp=84940649944&partnerID=8YFLogxK
    U2  - 10.1075/ijcl.20.3.01ant
    DO  - 10.1075/ijcl.20.3.01ant
    M3  - Article
    VL  - 20
    SP  - 273
    EP  - 292
    JO  - International Journal of Corpus Linguistics
    JF  - International Journal of Corpus Linguistics
    SN  - 1384-6655
    IS  - 3
    ER  -