Proceedings of the 3rd International Conference on Integrated Intelligent Computing Communication & Security (ICIIC 2021)

Aadhaar Data Analysis Comparison in MapReduce, Hive and Spark

Authors
R Roopa, Varsha Ryali, Tejasvi Shrivastava, Syed Mahmood Nabeel Anwar
Corresponding Author
R Roopa
Available Online 13 September 2021.
DOI
10.2991/ahis.k.210913.036
Keywords
Aadhaar, Big data, MapReduce, Hadoop, Hive, Apache Spark
Abstract

Aadhaar is a 12-digit unique identification number for every Indian. It is linked to demographic and biometric information and is mandatory for purposes such as direct benefit transfer and healthcare. Aadhaar details must be stored for approximately 1.3 billion Indians, which makes this a big data problem. In this paper, the proposed hybrid model analyses the Aadhaar dataset with respect to several research questions, such as the count of applicants by gender, the number of approved applicants by state, and the number of applicants by age group. In existing systems, Aadhaar data analysis is done either manually or on primitive SQL platforms, which may take days to complete. This paper focuses on Aadhaar data analysis using distributed computing frameworks (MapReduce, Hive, and Apache Spark) on top of Hadoop, which could support better decision-making by government bodies, and concludes that, among these, the Apache Spark framework is the most efficient in terms of performance.

Copyright
© 2021, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 3rd International Conference on Integrated Intelligent Computing Communication & Security (ICIIC 2021)
Series
Atlantis Highlights in Computer Sciences
Publication Date
13 September 2021
ISSN
2589-4900

Cite this article

TY  - CONF
AU  - R Roopa
AU  - Varsha Ryali
AU  - Tejasvi Shrivastava
AU  - Syed Mahmood Nabeel Anwar
PY  - 2021
DA  - 2021/09/13
TI  - Aadhaar Data Analysis Comparison in MapReduce, Hive and Spark
BT  - Proceedings of the 3rd International Conference on Integrated Intelligent Computing Communication & Security (ICIIC 2021)
PB  - Atlantis Press
SP  - 286
EP  - 295
SN  - 2589-4900
UR  - https://doi.org/10.2991/ahis.k.210913.036
DO  - 10.2991/ahis.k.210913.036
ID  - Roopa2021
ER  -