* * *
Here are excerpts:
We develop empirical tests for discrimination that use high-frequency evaluations to address the problem of unobserved heterogeneity in a conventional benchmarking test. Our approach to identifying discrimination requires two conditions: (1) the subject pool is time-invariant in a short time horizon and (2) there is high-frequency variation in the extent to which evaluators can rely on their subjective assessments. We bring our approach to the residential mortgage market, using data on the near-universe of
Monthly volume quotas reduce how much subjectivity loan officers apply to loans they process at the end of the month. As a result, the volume of new originations increases by 150% at the end of the month, while application volume and applicants' quality are constant within the month. Owing to within-month variation in loan officers' subjectivity, we estimate that
* * *
Racial and gender disparities have been documented in a range of fields, such as labor markets, the legal system, and credit markets. Yet whether these disparities are the result of discrimination by economic decision-makers–defined as an evaluator treating otherwise identical subjects from minority groups worse than subjects from the majority group–remains in dispute. There has been a growing trend toward using experiments and correspondence studies to test for discrimination (Bertrand and Duflo, 2017). Nonetheless, tests for discrimination that use observational data have several advantageous features. Such tests are accessible to a wide range of researchers, they are easy to replicate and scale, they can be used to estimate aggregate costs of discrimination in a given market, and policymakers can easily implement them.
However, tests for discrimination based on observational data face a number of econometric challenges that limit their appeal. The most straightforward test for discrimination is an audit or “benchmarking” test. Benchmarking tests claim to find discrimination when minority groups receive unfavorable evaluations relative to the majority group. But benchmarking tests are vulnerable to criticisms of omitted variable bias: differences in group characteristics that the researcher does not observe can cause differences in evaluations across groups.
Alternatively, Becker (1957) proposed an “outcome test.” Instead of comparing differences in how groups are evaluated, outcome tests compare the ex-post success of these evaluations. The marginal minority will have better ex-post outcomes than the marginal majority subject because minority groups face higher thresholds for inclusion when they are subject to discrimination.
Though intuitively appealing, outcome tests are notoriously difficult to implement, most notably because of the “infra-marginality” problem–the average difference in ex-post outcomes can be a poor approximation of the difference in marginal outcomes (Ayres, 2002). Recent research has made significant progress to improve econometric methods (e.g., Arnold et al., 2018), but addressing the infra-marginality problem requires additional modelling and distributional assumptions (Simoiu et al., 2017). Furthermore, ex-post outcomes can be the result of self-fulfilling prophecies (e.g., female students underperform in math because gender stereotypes reduce investment in females' math education; Bordalo et al., 2016) and ex-post outcomes are often not easily measured (e.g., worker productivity can be difficult to measure and proxies for productivity, such as wages, can also be affected by discrimination).
We propose an alternative way to test for discrimination. The approach is motivated by the observation that evaluators' subjectivity can often vary substantially within short time intervals.
For example, employers that have immediate staffing needs can ill afford to turn away job applicants.
The approach requires two simple assumptions: within a short time interval (e.g., a month), discrimination varies over time while unobserved subject characteristics do not. The identification rationale is straightforward. If the evaluations of a group vary within a short time interval, then these differences cannot be driven by unobserved subject characteristics, because those characteristics are time-invariant.
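The identification argument can be written out in a stylized form. The notation here is an illustrative assumption and not the paper's own specification: $A_{g,t}$ is the approval rate of group $g \in \{M, m\}$ (majority, minority) at time $t \in \{s, e\}$ (start and end of the short interval), $\lambda_t$ is a common time effect, $q_g$ collects the observed and unobserved characteristics of group $g$ (time-invariant by assumption), and $d_t \ge 0$ is the penalty applied to the minority group at time $t$.

$$
\begin{aligned}
A_{M,t} &= \lambda_t + q_M, \\
A_{m,t} &= \lambda_t + q_m - d_t, \\
G_t \equiv A_{M,t} - A_{m,t} &= (q_M - q_m) + d_t, \\
G_s - G_e &= d_s - d_e .
\end{aligned}
$$

In this sketch, the gap $G_t$ at any single date mixes the characteristics term $q_M - q_m$ with $d_t$, which is the omitted-variable problem of a benchmarking test. The change in the gap within the interval removes $q_M - q_m$ and leaves only $d_s - d_e$. Note that this difference recovers the change in the penalty; it equals the level $d_s$ only under the further assumption that $d_e = 0$.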
We apply our approach to high-frequency data on mortgage applications, to test for discrimination in the
We obtain the time-stamped version of the Home Mortgage Disclosure Act (HMDA) data, covering the near-universe of mortgage applications from 1994 to 2018 with 500 million loan applications across more than 28,000 lenders. Crucial to our empirical approach, we observe the exact application and decision date of each application.
Figure 1 demonstrates our key source of high-frequency variation in the mortgage market and the foundation of our empirical approach. The figure shows the volume of new originations and new applications relative to the first day of a given month. The total volume of new mortgage originations increases by more than 150% on the last day relative to the first day of a given month.
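The comparison described for Figure 1 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: it uses calendar days (the paper may define the first and last day differently), pools counts across months, and the function names and toy dates are hypothetical.

```python
import calendar
from datetime import date


def is_last_day_of_month(d: date) -> bool:
    """True if d is the last calendar day of its month."""
    return d.day == calendar.monthrange(d.year, d.month)[1]


def last_vs_first_day_change(origination_dates):
    """Percent change in origination counts, last day vs. first day of the month.

    Counts are pooled across all months present in the input.
    """
    first = sum(1 for d in origination_dates if d.day == 1)
    last = sum(1 for d in origination_dates if is_last_day_of_month(d))
    if first == 0:
        raise ValueError("no originations on the first day of any month")
    return 100.0 * (last - first) / first


# Made-up decision dates, for illustration only (not HMDA data).
toy_dates = (
    [date(2018, 1, 1)] * 2 + [date(2018, 1, 31)] * 5
    + [date(2018, 2, 1)] * 2 + [date(2018, 2, 28)] * 5
)
change = last_vs_first_day_change(toy_dates)
print(change)  # 150.0: 10 last-day originations against 4 first-day ones
```

Running the same function on application dates rather than origination dates would give the second series in the figure, which the text reports as roughly flat within the month.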
* * *
1/ The literature can be traced back at least as far as the public release of HMDA data and the work of Munnell et al. (1996).
* * *
At the same time, the number of submitted mortgage applications stays constant over the course of the month. These patterns reveal a crucial feature of the mortgage application process: loans are processed by individual loan officers who have monthly performance targets that determine their compensation./2
Moreover, this within-month pattern in loan approvals unveils the component of loan officers' decision-making that is orthogonal to observable and unobservable factors affecting loan originations (e.g., credit market conditions, applicant characteristics, and firm-level characteristics). Drawing from Becker (1957), a profit-maximizing agent can give disparate treatment to minority populations until market competition makes discrimination economically untenable.
* * *
2/ Though we do not directly measure the compensation of any individual loan officer, for the most part, commissions are set based on the number of loans and the loan amount originated. And the compensation scheme is common across employers. For example, see the following link for an article on the website of the
* * *
Loan officers have an economic incentive to meet their end-of-month performance targets. As such, loan officers' subjective favoritism toward applicants has to attenuate at the end of the month relative to the beginning of the month. Therefore, the within-month pattern, combined with a conventional benchmark test, allows us to estimate the extent to which loan approval decisions can be attributed to loan officers' subjectivity toward applicants.
Exploiting this within-month variation, our tests for discrimination estimate the difference in approval rates between
The regressions also include lender-month fixed effects that control for factors, such as regulations, that would affect lending at the institution level. Confirming the graphical evidence, the difference in the
Our approach to estimating discrimination hinges on a set of simple assumptions that we derive and that are easily supportable, either in the data or via narrative, or both. The first assumption is that the loan officer has time-varying costs of being subjective. In our setting, loan officers have nonlinear contract incentives./3
Loan officers that fail to meet their volume quotas will have reduced compensation and risk getting fired. The second assumption is that the characteristics of the subject pool are time invariant. Indeed, we find that application volume, the relative share of
* * *
3/ Importantly, with volume quotas, the optimal strategy would be to approve all loan applications. However, in practice, there are several constraints on this strategy. Lenders set origination standards that an application has to exceed and loan officers may have a fixed quantity of mortgage credit that they can distribute within a month. Loan officers can use their discretion and work to sidestep the origination standards by either using risk-based pricing or appealing to other “soft” criteria, such as noting that the applicant is a customer of the bank.
* * *
In contrast to other methodologies, our approach does not require ex-post outcomes to test for discrimination. Nevertheless, we show that a conventional outcome test is potentially misleading about the levels of discrimination in mortgage lending. We find that
Furthermore, our approach offers guidance, relative to both benchmarking and outcome tests, as to whether observed discrimination is taste-based or statistical. We develop an additional set of assumptions to distinguish between the two theories. Put simply, the case for statistical discrimination requires asymmetric information between evaluators and subjects. Because of the high-frequency nature of our data, statistical discrimination would require the loan officers' information set about applicants to change from the start to the end of the month. This explanation is unlikely because we show that the applicant pool is time invariant. Relatedly, we consider the role of inaccurate beliefs (i.e., stereotypes; Bordalo et al., 2016), and a similar logic precludes this explanation.
Finally, our approach is advantageous because it can easily be applied to evaluate the effect of market policies and market innovations on the quantity of discrimination. We consider three important features of modern mortgage lending: market concentration in banking, FinTech lending, and shadow banking. We find that the amount of discrimination due to loan officers' subjectivity is unaffected by both market concentration and FinTech lending. This result is largely consistent with the fact that our regressions include lender-by-month fixed effects and that the component of loan officer subjectivity our approach uncovers occurs within-lender. Moreover, despite these changes to the banking sector, loan officer compensation incentives have largely remained constant throughout our sample, and even mortgage lending at FinTech lenders involves significant discretion from human loan officers. On the other hand, we find that shadow banks have lower levels of subjective discrimination against
* * *
Tests for taste-based discrimination are often unconvincing because subject groups tend to have different unobserved characteristics. We show that high-frequency evaluations can help address the omitted variable problem when there is variation in the degree to which decision-makers rely on subjective evaluations. Under the null that decision-makers do not engage in taste-based discrimination, and assuming that the applicant pool is constant, a decrease in the degree of subjectivity should have no impact on the likelihood of favorable decisions for minority subjects relative to majority subjects. A reduction in disparate treatment for minority subjects would instead reveal the presence of taste-based discrimination.
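The logic of this test can be illustrated with a small difference-in-differences calculation on approval rates. This is a hedged sketch, not the paper's estimator: it omits the lender-month fixed effects and controls that the paper's regressions include, and the `Application` record, the function names, and the toy counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Application:
    minority: bool       # applicant belongs to the minority group
    end_of_month: bool   # decision made in the low-subjectivity window
    approved: bool


def approval_rate(apps, minority, end_of_month):
    """Share of approved applications in one group-by-period cell."""
    cell = [a.approved for a in apps
            if a.minority == minority and a.end_of_month == end_of_month]
    if not cell:
        raise ValueError("empty group-by-period cell")
    return sum(cell) / len(cell)


def gap(apps, end_of_month):
    """Majority minus minority approval rate in one period (benchmarking gap)."""
    return (approval_rate(apps, False, end_of_month)
            - approval_rate(apps, True, end_of_month))


def change_in_gap(apps):
    """Start-of-month gap minus end-of-month gap.

    Zero under the null of no taste-based discrimination, given a constant
    applicant pool; positive if the gap narrows when subjectivity falls.
    """
    return gap(apps, False) - gap(apps, True)


def make_cell(minority, end_of_month, n_approved, n_total):
    """Build n_total toy applications, n_approved of them approved."""
    return [Application(minority, end_of_month, i < n_approved)
            for i in range(n_total)]


# Made-up counts, for illustration only.
toy = (make_cell(False, False, 80, 100) + make_cell(False, True, 90, 100)
       + make_cell(True, False, 60, 100) + make_cell(True, True, 80, 100))
print(round(change_in_gap(toy), 2))  # gap of about 0.20 narrows to about 0.10
```

In the toy data both groups are approved more often at the end of the month, but the minority group gains more, so the gap narrows. A narrowing of this kind is what the test reads as evidence of taste-based discrimination.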
We use our approach to provide new evidence of discrimination in mortgage lending in the
Our findings have important policy implications for the distribution of credit in consumer credit markets. Legislation such as the Community Reinvestment Act and the Equal Credit Opportunity Act has been implemented over the past several decades to counteract historical inequities in credit access (e.g., red-lining; Appel and Nickerson, 2016; Aaronson et al., 2017). A crucial aspect of such legislation is that it intends to modify the behavior of lending institutions. We show that patterns of discriminatory behavior by loan officers exist within-institution and such behavior is not mitigated by important features of the market structure of lending markets–namely FinTech, institution size, and competition across lenders. This suggests that policies targeted toward institutions will have limited effects so long as individuals use their discretion to allocate credit.
Because institution-level policies do not eliminate biases held by individuals, the question is which policies would be effective. In accordance with classic economic theories of discrimination (Becker, 1957), competition reduces taste-based discrimination. Such competition occurs in the labor market for loan officers. Loan officers have to meet monthly performance targets; otherwise, they receive less compensation and risk being fired. However, loan officers' preference for discrimination is not fully undone by labor market competition, suggesting that there are barriers to entry in the labor market. Indeed, loan officers need at least a bachelor's degree in a field related to finance or business, and they have to obtain and maintain a license.
There are two recommendations that emerge from our study. First, the collection of high-frequency data on evaluations, combined with our approach, can be used to estimate the amount of discrimination across a variety of contexts and markets. Second, enhancing the data collection to go beyond the institution and down to individual decision-makers can provide further insight into the factors that determine discrimination. Such data can be used by researchers and policy-makers, as well as by consumers, for instance when shopping for credit in the mortgage market.
* * *
Aaronson, D., Hartley, D., Mazumder, B., 2017. The effects of the 1930s HOLC “redlining” maps. Working Paper.
Agarwal, S., Ben-David, I., 2012. Do loan officers' incentives lead to lax lending standards? Working Paper.
Agarwal, S., Benmelech, E., Bergman, N., Seru, A., 2012. Did the Community Reinvestment Act (CRA) lead to risky lending? Working Paper.
Akerlof, G. A., 1991. Procrastination and obedience. American Economic Review 81, 1-19.
Akey, P., Heimer, R. Z., Lewellen, S., 2020. Politicizing consumer credit.
Ambrose, B. W., Conklin, J. N.,
Appel, I., Nickerson, J., 2016. Pockets of poverty: The long-term effects of redlining. Working Paper.
Arnold, D., Dobbie, W., Yang, C. S., 2018. Racial bias in bail decisions.
Ayres, I., 2002. Outcome tests of racial disparities in police practices.
Bandiera, O., Barankay, I., Rasul, I., 2007. Incentives for managers and inequality among workers: Evidence from a firm-level experiment.
Beck, T., Behr, P., Madestam, A., 2018. Sex and credit: Is there a gender bias in lending?
Becker, G. S., 1957. The Economics of Discrimination. The
Berg, T., Burg, V., Gombovic, A., Puri, M., 2020a. On the rise of fintechs: Credit scoring using digital footprints. The Review of Financial Studies 33, 2845-2897.
Berg, T., Puri, M., Rocholl, J., 2020b. Loan officer incentives, internal rating models, and default rates. Review of Finance 24, 529-578.
Berkovec, J. A., Canner, G. B., Gabriel, S. A., Hannan, T. H., 1994. Race, redlining, and residential mortgage loan performance.
Berkovec, J. A., Canner, G. B., Gabriel, S. A., Hannan, T. H., 1998. Discrimination, competition, and loan performance in FHA mortgage lending. Review of Economics and Statistics 80, 241-250.
Bertrand, M., Duflo, E., 2017. Field experiments on discrimination. In: Handbook of economic field experiments,
Bhutta, N., 2011. The community reinvestment act and mortgage lending to lower income borrowers and neighborhoods.
Bhutta, N., Hizmo, A., 2020. Do minorities pay more for mortgages? The Review of Financial Studies Forthcoming.
Bohren, J. A., Haggag, K., Imas, A., Pope, D. G., 2020. Inaccurate statistical discrimination. Working Paper.
Bohren, J. A., Imas, A., Rosenberg, M., 2019. The dynamics of discrimination: Theory and evidence. American Economic Review 109, 3395-3436.
Bordalo, P., Coffman, K., Gennaioli, N., Shleifer, A., 2016. Stereotypes.
Buchak, G., Jorring, A., 2017. Does competition reduce racial discrimination in lending? Working Paper.
Buchak, G., Matvos, G., Piskorski, T., Seru, A., 2018. Fintech, regulatory arbitrage, and the rise of shadow banks.
Butler, A. W., Mayer, E. J.,
Cao, Y., Fisman, R., Lin, H., Wang, Y., 2020. Target setting and allocative inefficiency in lending: Evidence from two Chinese banks. Working Paper.
Cetorelli, N., Strahan, P. E., 2006. Finance as a barrier to entry: Bank competition and industry structure in local
Chen, D. L., Moskowitz, T. J., Shue, K., 2016. Decision making under the gambler's fallacy: Evidence from asylum judges, loan officers, and baseball umpires.
Cole, S., Kanz, M., Klapper, L., et al., 2015. Incentivizing calculated risk-taking: Evidence from an experiment with commercial bank loan officers.
Cortes, K., Duchin, R., Sosyura, D., 2016. Clouded judgment: The role of sentiment in credit origination.
Demiroglu, C., Ozbas, O., Silva, R., Ulu, M. F., 2019. Do physiological and spiritual factors affect economic decisions?
Dobbie, W., Liberman, A., Paravisini, D., Pathania, V. S., 2020. Measuring bias in consumer lending. Review of Economic Studies Forthcoming.
Engelberg, J., Gao, P., Parsons, C. A., 2012. Friends with money.
Fisman, R., Paravisini, D., Vig, V., 2017. Cultural proximity and loan outcomes. American Economic Review 107, 457-92.
Fisman, R., Sarkar, A., Skrastins, J., Vig, V., 2020. Experience of communal conflicts and intergroup lending.
Fuster, A., Goldsmith-Pinkham, P., Ramadorai, T., Walther, A., 2017. Predictably unequal? The effects of machine learning on credit markets. Tech. rep., CEPR Discussion Papers.
Fuster, A., Plosser, M., Schnabl, P., Vickery, J., 2019. The role of technology in mortgage lending. The Review of Financial Studies 32, 1854-1899.
Goldin, C., Rouse, C., 2000. Orchestrating impartiality: The impact of “blind” auditions on female musicians. American Economic Review 90, 715-741.
Gravelle, H., Sutton, M., Ma, A., 2010. Doctor behaviour under a pay for performance contract: Treating, cheating and case finding?
Hertzberg, A., Liberti, J. M., Paravisini, D., 2010. Information and incentives inside the firm: Evidence from loan officer rotation.
Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., Mullainathan, S., 2018. Human decisions and machine predictions.
Larkin, I., 2014. The cost of high-powered incentives: Employee gaming in enterprise software sales.
Li, J., Hurley, J., DeCicca, P.,
Liebman, J. B., Mahoney, N., 2017. Do expiring budgets lead to wasteful year-end spending? Evidence from federal procurement. American Economic Review 107, 3510-49.
Montoya, A. M., Parrado, E., Solis, A., Undurraga, R., 2019. Gender discrimination in the consumer credit market: Experimental evidence. Working Paper.
Munnell, A. H., Tootell, G. M., Browne, L. E., McEneaney, J., 1996. Mortgage lending in
Murfin, J., Petersen, M., 2016. Loans on sale: Credit market seasonality, borrower need, and lender rents.
Murphy, K. J., 2000. Performance standards in incentive contracts.
Oyer, P., 1998. Fiscal year ends and nonlinear incentive contracts: The effect on business seasonality.
Pierson, E., Simoiu, C., Overgoor, J., Corbett-Davies, S., Jenson, D., Shoemaker, A., Ramachandran, V., Barghouty, P.,
Qian, J., Strahan, P. E., Yang, Z., 2015. The impact of incentives and communication costs on information production and use: Evidence from bank lending.
Rosen, R. J., 2011. Competition in mortgage markets: The effect of lender type on loan characteristics. Economic Perspectives 35, 2-21.
Simoiu, C., Corbett-Davies, S., Goel, S., et al., 2017. The problem of infra-marginality in outcome tests for discrimination. The Annals of Applied Statistics 11, 1193-1216.
Tang, H., 2019. Peer-to-peer lenders versus banks: Substitutes or complements? The Review of Financial Studies 32, 1900-1938.
Tootell, G. M., 1996. Redlining in
Tzioumis, K., Gee, M., 2013. Nonlinear incentives and mortgage officers' decisions.
* * *
View full text of the white paper at https://www.philadelphiafed.org/-/media/frbp/assets/working-papers/2021/wp21-04r.pdf