Proceedings of the 2016 4th International Conference on Machinery, Materials and Information Technology Applications

Research on the Advanced Computing Method for Supporting Large Data Quality Assessment and Improvement

Authors
He Yang, Jiangqi Chen, Xiaojia Xiang, Heng Liu, Yunpeng Li
Corresponding Author
He Yang
Available Online January 2017.
DOI
https://doi.org/10.2991/icmmita-16.2016.29
Keywords
Cross-database Queries; Big Data Processing; Apache Hive; Data Quality Assessment and Improvement; Task Relevance.
Abstract
To support efficient, fast data quality assessment for electrical operations, the high-performance computing techniques of the computing platform must be optimized. This paper investigates in depth the performance bottlenecks faced by the data quality evaluation system. After analyzing the big data platform for data quality assessment and improvement, we design and implement easy cross-database queries that seamlessly integrate relational data into the Hadoop ecosystem, and we propose an optimization model for Hive that takes task relevance into account.
Open Access
This is an open access article distributed under the CC BY-NC license.

Cite this article

TY  - CONF
AU  - He Yang
AU  - Jiangqi Chen
AU  - Xiaojia Xiang
AU  - Heng Liu
AU  - Yunpeng Li
PY  - 2017/01
DA  - 2017/01
TI  - Research on the Advanced Computing Method for Supporting Large Data Quality Assessment and Improvement
BT  - Proceedings of the 2016 4th International Conference on Machinery, Materials and Information Technology Applications
PB  - Atlantis Press
SP  - 143
EP  - 151
SN  - 2352-538X
UR  - https://doi.org/10.2991/icmmita-16.2016.29
DO  - https://doi.org/10.2991/icmmita-16.2016.29
ID  - Yang2017/01
ER  -