Corpus-Based Lexical Development of EFL Writing
- 10.2991/978-2-494069-37-4_9
- Lexical development; EFL writing; Longitudinal learner corpus
Learner corpora, with their detailed information on learner language use, have been widely explored in second language acquisition and teaching. This paper draws on a self-built longitudinal EFL learner corpus to work toward a long-standing goal: measuring and describing the general features and dynamics of learner language, especially for beginners. The study uses NLP tools to calculate the variables needed for measuring lexical development: types, tokens, TTR, unit length indices, COCA frequency list coverage, and lexical sophistication indices. As the data are not normally distributed, independent-samples Kruskal-Wallis tests are used to test for significance, and follow-up pairwise comparisons identify which year groups differ from one another. The study finds that conventional global variables are more applicable to describing beginners' language development: the number of tokens and types, the number of letters per word, the number of words per sentence, bigram frequency, and bigram mutual information. By contrast, some of the newer indices show no significant differences, namely TTR, MATTR, MTLD, MTLD-Ma-Wrap, COCA frequency list coverage, trigram frequency, and trigram mutual information. The study also notes that spelling mistakes reduce statistical accuracy when beginner language is processed. The real difficulty for beginners lies in their lack of knowledge and practice of non-literary, suggestive, or affective uses of content words; correct use of topic-specific words, lexical bundles, and set collocations also poses great challenges. The findings provide new insights into EFL learner language and offer helpful pedagogical implications.
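As a rough illustration of the global variables named in the abstract, the sketch below computes tokens, types, TTR, MATTR, letters per word, and words per sentence for one text. It is not the authors' toolchain: the regex tokenizer, the naive sentence splitter, the function names, the `window` parameter, and the sample text are all assumptions made for this example.

```python
import re
from statistics import mean


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def ttr(tokens):
    """Type-token ratio: distinct word forms divided by running words."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mattr(tokens, window=50):
    """Moving-average TTR: mean TTR over every run of `window` consecutive tokens."""
    if len(tokens) <= window:
        # Text shorter than one window: fall back to plain TTR.
        return ttr(tokens)
    return mean(
        ttr(tokens[i:i + window]) for i in range(len(tokens) - window + 1)
    )


def lexical_profile(text, window=50):
    """Collect the simple global indices for a single learner text."""
    tokens = tokenize(text)
    # Naive sentence split on ., ! and ? (an assumption; real tools do better).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return {
        "tokens": len(tokens),
        "types": len(set(tokens)),
        "ttr": ttr(tokens),
        "mattr": mattr(tokens, window),
        "letters_per_word": (
            mean(len(t.replace("'", "")) for t in tokens) if tokens else 0.0
        ),
        "words_per_sentence": len(tokens) / len(sentences) if sentences else 0.0,
    }


# Invented sample text; a tiny window is used only because the text is tiny.
sample = "I like my school. My school is big. I like my teacher."
profile = lexical_profile(sample, window=5)
print(profile)
```

Per-text values like these, grouped by year of study, are the kind of samples a Kruskal-Wallis test compares; in Python, `scipy.stats.kruskal` accepts one sequence of values per group. Misspelled words would be counted as extra types here, which is one way spelling errors can distort such indices.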
- © 2023 The Author(s)
- Open Access
- This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
Cite this article
TY  - CONF
AU  - Weilu Wang
AU  - Jijun Wang
AU  - Manfu Duan
PY  - 2022
DA  - 2022/12/19
TI  - Corpus-Based Lexical Development of EFL Writing
BT  - Proceedings of the 2022 International Conference on Diversified Education and Social Development (DESD 2022)
PB  - Atlantis Press
SP  - 53
EP  - 65
SN  - 2352-5398
UR  - https://doi.org/10.2991/978-2-494069-37-4_9
DO  - 10.2991/978-2-494069-37-4_9
ID  - Wang2022
ER  -