Manuscript submitted November 20, 2024; revised November 29, 2024; accepted December 12, 2024; published January 17, 2025
Abstract—In recent years, the demand for accurate housing price predictions has intensified, driven
by the dynamic nature of real estate markets and the need for data-driven decision-making. Machine
learning models (a subset of AI) have emerged as powerful tools in this domain, offering enhanced
predictive capabilities over traditional statistical methods. In this paper, we aim to predict house
prices in Norwich and evaluate the factors that drive them. To achieve this, we trained four boosting
models (Gradient Boosting, XGBoost, LightGBM, and CatBoost) to predict house prices. The performance of
these models was assessed using a standard evaluation approach and a post-hoc residual evaluation
approach within three designed instances (testing, training, and combined [testing + training]). The
predictive performance and significant predictors were identified, with Beds, Baths, Sqm, and other
features showing high significance, while the age of the house was not significant. We found that the
residuals of Gradient Boosting and XGBoost are closely related, while those of LightGBM are largely
independent. The performance metrics revealed that LightGBM outperformed the other models with
the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) in both training (RMSE
[5.891], MAE [3.680]) and test (RMSE [13.170], MAE [7.092]) instances, with R-squared values of
combined [0.99], train [0.998], and test [0.99]. Correlation analyses of the residuals indicated a
strong positive correlation between Gradient Boosting and XGBoost (train [0.84], test [0.85], combined
[0.84]), while CatBoost demonstrated a moderate correlation with both. Notably, LightGBM (−0.04 ≤ r
≤ 0.3) exhibited distinct residual patterns, showing no significant correlation with the other models,
suggesting it captures different aspects of the dataset. These findings highlight the value of
an ensemble approach that includes LightGBM, which could enhance predictive accuracy by leveraging its
distinct error characteristics alongside the complementary strengths of the other models, and they can
inform model selection and ensemble strategies in future work.
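
The evaluation described above (RMSE, MAE, and R-squared per model, followed by pairwise correlation of the models' residuals) can be illustrated with a minimal sketch. This is not the authors' code: the helper names, the toy prices, and the three placeholder models are hypothetical, and residuals are assumed to be defined as actual minus predicted.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def residuals(y_true, y_pred):
    """Residuals, assumed here to be actual minus predicted."""
    return [t - p for t, p in zip(y_true, y_pred)]

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical toy prices and predictions; these are NOT the paper's data.
y_true = [200.0, 250.0, 300.0, 350.0, 400.0]
preds = {
    "model_a": [210.0, 245.0, 310.0, 340.0, 405.0],
    "model_b": [208.0, 247.0, 307.0, 343.0, 403.0],
    "model_c": [195.0, 255.0, 305.0, 345.0, 390.0],
}

# Standard evaluation: one row of metrics per model.
for name, y_pred in preds.items():
    print(name, round(rmse(y_true, y_pred), 3),
          round(mae(y_true, y_pred), 3),
          round(r_squared(y_true, y_pred), 3))

# Post-hoc residual evaluation: pairwise correlation of residuals.
res = {name: residuals(y_true, y_pred) for name, y_pred in preds.items()}
names = list(res)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(names[i], names[j], round(pearson(res[names[i]], res[names[j]]), 3))
```

In this toy setup, the residuals of `model_a` and `model_b` move together while those of `model_c` do not, which mirrors the kind of pattern the abstract reports: models with weakly correlated errors are the natural candidates for combining in an ensemble.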
Keywords—Boosting algorithm, house price, real estate valuation, residential property prices, Norwich
housing market
Cite: J. D. Adekunle, M. I. Oyeniran, H. S. Sule, T. T. Akinpelu, E. J. Ayanlowo, C. K. Ogu, C. O. Robert, "Let’s Boost House Price Predictions: A Machine Learning Approach for Norwich," Journal of Advances in Artificial Intelligence, vol. 3, no. 1, pp. 1-18, 2025. doi: 10.18178/JAAI.2025.3.1.1-18
Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).