Journal of Data Science

Exploring Racial and Ethnic Differences in US Home Ownership with Bayesian Beta-Binomial Regression
Volume 22, Issue 4 (2024), pp. 605–620
Jhonatan Medri, Tejasvi Channagiri, Lu Lu

https://doi.org/10.6339/23-JDS1113
Pub. online: 31 October 2023      Type: Data Science In Action      Open Access

Received
16 January 2023
Accepted
31 July 2023
Published
31 October 2023

Abstract

Racial and ethnic representation in home ownership rates is an important public policy topic for addressing inequality within society. Although more than half of US households own, rather than rent, their homes, home ownership is unequally distributed among racial and ethnic groups. Here we analyze the US Census Bureau’s American Community Survey data to conduct an exploratory and statistical analysis of home ownership in the US, and identify sociodemographic factors that are associated with differences in home ownership rates. We use binomial and beta-binomial generalized linear models (GLMs) with 2020 county-level data to model the home ownership rate, and fit the beta-binomial models with Bayesian estimation. We determine that race/ethnic group, geographic region, and income all have significant associations with the home ownership rate. To make the data and results accessible to the public, we develop a Shiny web application in R with exploratory plots and model predictions.
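
As an illustration of the modeling idea described above, the following minimal Python sketch contrasts a binomial and a beta-binomial likelihood for a county-level count of owner-occupied households. It is not the authors' implementation (their analyses are in R). The mean-precision parametrization (mu, phi), the logit link, the function names, and every numeric value are assumptions chosen for illustration only.

```python
import math


def log_choose(n, k):
    """Log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def inv_logit(eta):
    """Map a linear predictor to a rate in (0, 1); the logit link is assumed."""
    return 1.0 / (1.0 + math.exp(-eta))


def binomial_logpmf(k, n, mu):
    """Log-probability of k owned homes out of n under a binomial model."""
    return log_choose(n, k) + k * math.log(mu) + (n - k) * math.log1p(-mu)


def beta_binomial_logpmf(k, n, mu, phi):
    """Log-probability under a beta-binomial with mean mu and precision phi.

    Assumed parametrization: alpha = mu * phi, beta = (1 - mu) * phi.
    Smaller phi means more variation between units than the binomial allows.
    """
    a, b = mu * phi, (1.0 - mu) * phi
    return log_choose(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b)


def binomial_variance(n, mu):
    return n * mu * (1.0 - mu)


def beta_binomial_variance(n, mu, phi):
    """Binomial variance inflated by the factor (phi + n) / (phi + 1)."""
    return n * mu * (1.0 - mu) * (phi + n) / (phi + 1.0)


if __name__ == "__main__":
    # Hypothetical county: 50 households, linear predictor 0.6 on the logit scale.
    n, mu, phi = 50, inv_logit(0.6), 20.0
    print("binomial variance:     ", binomial_variance(n, mu))
    print("beta-binomial variance:", beta_binomial_variance(n, mu, phi))
    print("log-lik at k = 20 (binomial):     ", binomial_logpmf(20, n, mu))
    print("log-lik at k = 20 (beta-binomial):", beta_binomial_logpmf(20, n, mu, phi))
```

Under these assumptions, the beta-binomial assigns more probability to counts far from the expected value than the binomial does, which is one reason to consider it when county-level rates vary more than a binomial model can accommodate.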

Supplementary material

We have included a separate Supplementary Material section with additional discussion of the data, modeling analyses, and results, along with a description of a Shiny app developed in R for data exploration. The app features a user-friendly web interface built with the R packages shiny (Chang et al., 2022) and shinyWidgets (Perrier et al., 2023), enabling users to perform customized and interactive explorations of the data and models presented in this work. The interface was first showcased at the American Statistical Association (ASA) Data Challenge Expo 2022, held at the Joint Statistical Meetings (JSM) 2022, and was subsequently refined and expanded for this article. The code for the Shiny app and all our analyses is available at https://github.com/jmedri/JSM2022_HomeOwnership.

References

 
Aronowitz M, Golding EL, Choi JH (2020). The Unequal Costs of Black Homeownership. Massachusetts Institute of Technology, Golub Center for Finance and Policy, Cambridge, MA.
 
Ascari R, Migliorati S (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17): 3895–3914. https://doi.org/10.1002/sim.9005
 
Austin TM, Felicity S (1999). Mortgage Lending Discrimination: A Review of Existing Evidence. The Urban Institute.
 
US Census Bureau (2011). Overview of Race and Hispanic Origin: 2010. Technical report. US Census Bureau.
 
Bürkner PC (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software, 80(1): 1–28.
 
Bürkner PC, Gabry J, Kay M, Vehtari A (2023). posterior: Tools for working with posterior distributions. R package version 1.4.1.
 
Carpenter B, Gelman A, Hoffman MD, Lee D, Goodrich B, Betancourt M, et al. (2017). Stan: A probabilistic programming language. Journal of Statistical Software, 76: 1–32.
 
Chang W, Cheng J, Allaire J, Sievert C, Schloerke B, Xie Y, et al. (2022). shiny: Web Application Framework for R. R package version 1.7.2.
 
Choi JH, McCargo A, Neal M, Goodman L, Young C (2019). Explaining the Black-White Homeownership Gap. Urban Institute, Washington, DC. Retrieved March 25, 2021.
 
Dahl DB, Scott D, Roosen C, Magnusson A, Swinton J (2019). xtable: Export Tables to LaTeX or HTML. R package version 1.8-4.
 
Delgadillo L (2009). A model of factors correlated to homeownership: The case of Utah. Family and Consumer Sciences Research Journal, 30: 3–36. https://doi.org/10.1177/1077727X01301001
 
Deslatte A, Tavares A, Feiock RC (2018). Policy of delay: Evidence from a Bayesian analysis of metropolitan land-use choices. Policy Studies Journal, 46(3): 674–699. https://doi.org/10.1111/psj.12188
 
Erdoğdu H, Erdem N, Nacar F (2021). Housing appraisal under model uncertainty: Bayesian model averaging method. Advanced Engineering Journal, 1(1): 26–34.
 
Ferrari S, Cribari-Neto F (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31(7): 799–815. https://doi.org/10.1080/0266476042000214501
 
Flippen CA (2010). The spatial dynamics of stratification: Metropolitan context, population redistribution, and black and Hispanic homeownership. Demography, 47(4): 845–868. https://doi.org/10.1007/BF03214588
 
Gabry J, Mahr T (2022). bayesplot: Plotting for Bayesian Models. R package version 1.9.0.
 
Gabry J, Simpson D, Vehtari A, Betancourt M, Gelman A (2019). Visualization in Bayesian workflow. Journal of the Royal Statistical Society. Series A. Statistics in Society, 182(2): 389–402. https://doi.org/10.1111/rssa.12378
 
Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013). Bayesian Data Analysis, Third edition. Chapman & Hall/CRC Texts in Statistical Science. Chapman & Hall/CRC, Philadelphia, PA.
 
Goodman LS, Mayer C (2018). Homeownership and the American dream. The Journal of Economic Perspectives, 32(1): 31–58. https://doi.org/10.1257/jep.32.1.31
 
Haseman JK, Kupper LL (1979). Analysis of dichotomous response data from certain toxicological experiments. Biometrics, 35(1): 281–293. https://doi.org/10.2307/2529950
 
Bengtsson H (2017). matrixStats: Functions that Apply to Rows and Columns of Matrices (and to Vectors). R package version 0.52.2. Available at https://github.com/HenrikBengtsson/matrixStats.
 
Hu T, Gallins P, Zhou YH (2018). A zero-inflated beta-binomial model for microbiome data analysis. Stat, 7(1): e185.
 
Hui SK, Cheung A, Pang J, et al. (2010). A hierarchical Bayesian approach for residential property valuation: Application to Hong Kong housing market. International Real Estate Review, 13(1): 1–29. https://doi.org/10.53383/100117
 
Jones LD (1989). Current wealth and tenure choice. Real Estate Economics, 17(1): 17–40. https://doi.org/10.1111/1540-6229.00471
 
Krivo LJ, Kaufman RL (2004). Housing and wealth inequality: Racial-ethnic differences in home equity in the United States. Demography, 41(3): 585–605. https://doi.org/10.1353/dem.2004.0023
 
Kuebler M (2013). Closing the wealth gap: A review of racial and ethnic inequalities in homeownership. Sociology Compass, 7(8): 670–685. https://doi.org/10.1111/soc4.12056
 
Martin BD, Witten D, Willis AD (2020). Modeling microbial abundances and dysbiosis with beta-binomial regression. Annals of Applied Statistics, 14(1): 94–115.
 
Mast BD (2010). Measuring neighborhood quality with survey data: A Bayesian approach. Cityscape, 12(3): 123–142.
 
Medri J, Channagiri T (2022). Exploratory Analysis of Racial Representation in American Home Ownership. In: JSM Proceedings. Statistical Computing Section, 983–1013. American Statistical Association, Alexandria, VA.
 
Moore DJ (1991). Homeownership affordability series forecasting the probability of homeownership: A cross-sectional regression analysis. Journal of Housing Research, 2(2): 125–143. Publisher: American Real Estate Society.
 
Müller K, Wickham H (2023). tibble: Simple Data Frames. R package version 3.2.1.
 
Najera-Zuloaga J, Lee DJ, Arostegui I (2018). Comparison of beta-binomial regression model approaches to analyze health-related quality of life data. Statistical Methods in Medical Research, 27(10): 2989–3009. https://doi.org/10.1177/0962280217690413
 
Ospina R, Ferrari SLP (2012). A general class of zero-or-one inflated beta regression models. Computational Statistics & Data Analysis, 56(6): 1609–1623. https://doi.org/10.1016/j.csda.2011.10.005
 
Paleologos EK, Elhakeem M, Amrousi ME (2018). Bayesian analysis of air emission violations from waste incineration and coincineration plants. Risk Analysis, 38(11): 2368–2378. https://doi.org/10.1111/risa.13130
 
Pebesma E (2018). Simple features for R: Standardized support for spatial vector data. The R Journal, 10(1): 439–446. https://doi.org/10.32614/RJ-2018-009
 
Perrier V, Meyer F, Granjon D (2023). shinyWidgets: Custom Inputs Widgets for Shiny. https://github.com/dreamRs/shinyWidgets, https://dreamrs.github.io/shinyWidgets/.
 
Prentice RL (1986). Binary regression using an extended beta-binomial distribution, with discussion of correlation induced by covariate measurement errors. Journal of the American Statistical Association, 81(394): 321. https://doi.org/10.1080/01621459.1986.10478275
 
R Core Team (2021). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
 
Robb BH (2021). Council Post: Homeownership and the American Dream. https://www.forbes.com/sites/forbesrealestatecouncil/2021/09/28/homeownership-and-the-american-dream/.
 
Sanchez-Moyano R (2021). Geography and Hispanic homeownership: A review of the literature. Journal of Housing and the Built Environment, 36(1): 215–240. https://doi.org/10.1007/s10901-020-09745-5
 
Tennekes M (2018). tmap: Thematic maps in R. Journal of Statistical Software, 84(6): 1–39. https://doi.org/10.18637/jss.v084.i06
 
Thomas AF (2021). The racial wealth gap and the tax benefits of homeownership. The New York Law School Law Review, 66: 247.
 
US Census Bureau (2015-2020). American Community Survey. Tables S1501, S1903, S2301, S2501, S2502, S2503, S2506, S2507, DP05. Technical report, US Census Bureau.
 
US Census Bureau (2021a). Data Suppression. Technical report, US Census Bureau. Available at https://www.census.gov/programs-surveys/acs/technical-documentation/data-suppression.html.
 
US Census Bureau (2021b). Understanding and Using the American Community Survey Public Use Microdata Sample Files: What Data Users Need To Know. Technical report, US Census Bureau. Available at https://www.census.gov/content/dam/Census/library/publications/2021/acs/acs_pums_handbook_2021.pdf.
 
Vehtari A, Gabry J, Magnusson M, Yao Y, Bürkner PC, Paananen T, et al. (2022). loo: Efficient Leave-One-Out Cross-Validation and WAIC for Bayesian Models. R package version 2.5.1.
 
Vehtari A, Gelman A, Gabry J (2017). Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC. Statistics and Computing, 27: 1413–1432. https://doi.org/10.1007/s11222-016-9696-4
 
Vehtari A, Gelman A, Simpson D, Carpenter B, Bürkner PC (2021). Rank-normalization, folding, and localization: An improved R̂ for assessing convergence of MCMC (with discussion). Bayesian Analysis, 16(2): 667–718. Publisher: International Society for Bayesian Analysis. https://doi.org/10.1214/20-BA1221
 
Wickham H (2016). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag, New York.
 
Wickham H (2022). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.5.0.
 
Wickham H (2023). forcats: Tools for Working with Categorical Variables (Factors). R package version 1.0.0.
 
Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, et al. (2019). Welcome to the tidyverse. The Journal of Open Source Software, 4(43): 1686. https://doi.org/10.21105/joss.01686
 
Wickham H, François R, Henry L, Müller K (2022). dplyr: A Grammar of Data Manipulation. R package version 1.0.10.
 
Wickham H, Hester J, Bryan J (2023a). readr: Read Rectangular Text Data. R package version 2.1.4.
 
Wickham H, Vaughan D, Girlich M (2023b). tidyr: Tidy Messy Data. R package version 1.3.0.
 
Wilkinson GN, Rogers CE (1973). Symbolic description of factorial models for analysis of variance. Journal of the Royal Statistical Society. Series C. Applied Statistics, 22(3): 392–399.

Copyright
2024 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.
Open access article under the CC BY license.

Keywords
census, exploratory analysis, generalized linear models, GLM, housing, sociodemographic factors, statistical analysis

Funding
The USF Department of Mathematics & Statistics and the USF Graduate School provided travel funding to attend JSM 2022.

Journal of data science

  • Online ISSN: 1683-8602
  • Print ISSN: 1680-743X
