This paper examines how house prices are affected by housing characteristics, both internal (such as the number of bathrooms and bedrooms) and external (such as public school scores or the neighborhood's walkability score). Using data on sold houses listed on Zillow, Trulia, and Redfin, three prominent housing websites, the paper applies both the hedonic pricing model (linear regression) and several machine learning algorithms, such as Random Forest and Support Vector Regression (SVR), to predict house prices. The models' prediction scores and their ratios of overestimated to underestimated houses are compared against those of Zillow's price estimates. Results show that SVR achieves a better price prediction score than Zillow's baseline on the same dataset for Hunt County (TX), and prediction scores close to Zillow's baseline on two other counties. Moreover, the paper's models reduce the ratio of overestimated to underestimated houses from 3:2 under Zillow's estimates to 1:1.
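The overestimated-to-underestimated ratio mentioned above can be illustrated with a minimal sketch. It assumes that a house counts as overestimated when the estimate exceeds the sold price and as underestimated when it falls below, with exact matches ignored; the abstract does not state the precise definition. The function names and all prices below are hypothetical and are not taken from the paper's data.

```python
from fractions import Fraction


def over_under_counts(sold_prices, estimates):
    """Count estimates above and below the sold price (exact matches ignored)."""
    if len(sold_prices) != len(estimates):
        raise ValueError("sold_prices and estimates must have the same length")
    pairs = list(zip(sold_prices, estimates))
    over = sum(1 for sold, est in pairs if est > sold)
    under = sum(1 for sold, est in pairs if est < sold)
    return over, under


def over_under_ratio(sold_prices, estimates):
    """Return the overestimated-to-underestimated ratio as a reduced fraction."""
    over, under = over_under_counts(sold_prices, estimates)
    if under == 0:
        raise ValueError("ratio is undefined when no house is underestimated")
    return Fraction(over, under)


# Made-up prices for illustration only (assumption, not the paper's data).
sold = [200_000, 250_000, 310_000, 180_000, 420_000]
baseline = [210_000, 260_000, 300_000, 195_000, 400_000]  # 3 over, 2 under
model = [205_000, 240_000, 315_000, 175_000, 420_000]     # 2 over, 2 under, 1 exact

print("baseline over:under =", over_under_ratio(sold, baseline))
print("model over:under    =", over_under_ratio(sold, model))
```

A ratio near 1:1 indicates that the estimator errs high and low about equally often. It says nothing about the size of the errors, so it is best read alongside a prediction score.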