Proceedings of the International Conference on Culture and Language in Southeast Asia (ICCLAS 2017)

Projected Characteristics and Content of Arabic Corpus in Indonesia

Nur Hizbullah, Muchlis Madian Muhammad
Corresponding author
Nur Hizbullah
Keywords: Arabic Corpus; corpus characteristic; corpus content; comparative corpus
The integration of linguistics with information and communication technology yields the language corpus. A corpus is a collection of data that is prepared systematically and developed for use as research data. In general, the content of a corpus depends on the purpose for which the corpus is prepared in the context of linguistic research. It also depends on the availability of material to be included in the corpus. Given the long history and wide coverage of Arabic teaching in Indonesia, there is a large body of material and data on and in the Arabic language that can be documented and compiled into a corpus. This ensures that the Arabic corpus in Indonesia will be filled with a variety of data. Using a descriptive-comparative method, this paper describes various types of Arabic corpora, particularly with respect to corpus content, and compares the content of those corpora with the predicted availability of content material for the planned Arabic corpus in Indonesia. With reference to existing corpora, it can be projected that the Arabic corpus to be built in Indonesia will be a regional and diachronic corpus. The corpus will contain seven distinct classifications, in accordance with the availability of data in the field. Hence, the effort to build this corpus is important and strategic: it documents Arabic linguistic data actually produced by Indonesian speakers, and the corpus will be able to showcase the richness of the Arabic language in Indonesia for use in future research across Arabic studies.
© The authors. This article is distributed under the terms of the Creative Commons Attribution License 4.0, which permits non-commercial use, distribution and reproduction in any medium, provided the original work is properly cited.
Open Access | Under Creative Commons license CC BY-NC 4.0

Cite this article
  title={Projected Characteristics and Content of Arabic Corpus in Indonesia},
  author={Hizbullah, Nur and Madian Muhammad, Muchlis},
  booktitle={International Conference on Culture and Language in Southeast Asia (ICCLAS 2017)},
  publisher={Atlantis Press}