Proceedings of the 2022 3rd International Conference on E-commerce and Internet Technology (ECIT 2022)

Prediction on Housing Price Based on the Data on Kaggle

Authors
Jiachen Yu¹,*
¹College of Letters and Science, University of California, Davis, USA
*Corresponding author. Email: hjcyu@ucdavis.edu
Corresponding Author
Jiachen Yu
Available Online 10 November 2022.
DOI
10.2991/978-94-6463-005-3_64
Keywords
Data preparation; Correlation; Exploratory data analysis; RMSE; Machine learning models
Abstract

People’s lives have always depended on having a secure place to live, so housing prices rank among their top concerns. This paper applies a series of correlation tests, exploratory data analysis, and feature selection approaches to the training and testing sets of a housing price dataset on Kaggle to find the most accurate model for forecasting housing prices. During data preparation, the author found that the variables PoolQC, MiscFeature, Alley, and Fence each consist of almost 90% missing values. After removing those variables, the author generated a correlation matrix to visualize the relationships among the remaining variables. In addition, the exploratory data analysis shows that the overall quality of a home, the size of the living area, the total basement size, and how new the home is contribute the most to a house’s value. The author built seven different machine learning models and calculated the R-squared and root mean square error (RMSE) values for each. Among these models, the random forest algorithm has the highest R-squared value and the lowest RMSE. As a result, the random forest algorithm is the best of the seven models for predicting the price of a house.
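The abstract does not include the author's code, so the following is only a minimal sketch of two steps it describes: dropping variables that are mostly missing, and scoring predictions with RMSE and R-squared. The function names, the 0.8 cutoff, and the toy rows are assumptions made for illustration and are not taken from the paper or from the Kaggle data.

```python
import math

def missing_fraction(rows, column):
    """Share of rows in which `column` is absent or None."""
    missing = sum(1 for row in rows if row.get(column) is None)
    return missing / len(rows)

def drop_sparse_columns(rows, threshold=0.8):
    """Return copies of the rows without columns whose missing share is
    at least `threshold` (0.8 is an assumed cutoff, not the paper's)."""
    columns = {key for row in rows for key in row}
    keep = {c for c in columns if missing_fraction(rows, c) < threshold}
    return [{k: v for k, v in row.items() if k in keep} for row in rows]

def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Toy rows (invented): PoolQC is missing in 4 of 5 rows, so it is dropped.
rows = [
    {"OverallQual": 7, "GrLivArea": 1710, "PoolQC": None},
    {"OverallQual": 6, "GrLivArea": 1262, "PoolQC": None},
    {"OverallQual": 7, "GrLivArea": 1786, "PoolQC": "Gd"},
    {"OverallQual": 5, "GrLivArea": 1077, "PoolQC": None},
    {"OverallQual": 8, "GrLivArea": 2198, "PoolQC": None},
]
cleaned = drop_sparse_columns(rows, threshold=0.8)

# Invented prices and predictions, only to show how the metrics are called.
actual = [200.0, 180.0, 220.0, 140.0, 250.0]
predicted = [195.0, 185.0, 215.0, 150.0, 240.0]
print(sorted(cleaned[0]), rmse(actual, predicted), r_squared(actual, predicted))
```

Under this reading, the model comparison in the abstract amounts to computing both metrics for each of the seven models on the same held-out data and preferring the model with the lowest RMSE and the highest R-squared.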

Copyright
© 2023 The Author(s)
Open Access
This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

Volume Title
Proceedings of the 2022 3rd International Conference on E-commerce and Internet Technology (ECIT 2022)
Series
Atlantis Highlights in Engineering
Publication Date
10 November 2022
ISBN
978-94-6463-005-3
ISSN
2589-4943

Cite this article

TY  - CONF
AU  - Jiachen Yu
PY  - 2022
DA  - 2022/11/10
TI  - Prediction on Housing Price Based on the Data on Kaggle
BT  - Proceedings of the 2022 3rd International Conference on E-commerce and Internet Technology (ECIT 2022)
PB  - Atlantis Press
SP  - 627
EP  - 634
SN  - 2589-4943
UR  - https://doi.org/10.2991/978-94-6463-005-3_64
DO  - 10.2991/978-94-6463-005-3_64
ID  - Yu2022
ER  -