House Price Markups and Mortgage Defaults

The housing market is characterized by major information asymmetries, heterogeneous preferences, and high transaction costs. Accordingly, the prices at which buyers and sellers are willing to trade identical units may vary widely relative to an average market price. Idiosyncratic conditions of each sale lead to either a positive or a negative price markup, measured as the difference between the actual transaction price and the average market price. [Footnote 1: Case and Shiller (1989) estimate a property‐specific appreciation standard deviation of 15% around the market average using repeat‐sales regressions, suggesting large price variation for transactions of identical homes. These regularities are explained and predicted by a long and growing literature that employs search‐and‐matching models to study the microstructure of housing transactions. Han and Strange (2015) present a comprehensive survey of this topic; examples of recent studies in the housing literature include Carrillo (2012), Merlo, Ortalo‐Magne, and Rust (2015), and Adelino, Gerardi, and Hartman‐Glaser (2019). For an excellent survey of search and matching models in labor markets, see Mortensen and Pissarides (1999). Nominal loss aversion and house price expectations may also play roles in markups; see Genesove and Mayer (2001) and Glaeser and Nathanson (2017), respectively.] Housing is also highly leveraged, making individual mortgage performance sensitive to variation in the value of the collateral on the loan and the market susceptible to leveraged bubbles (Jordà, Schularick, and Taylor 2015).
These attributes make housing the ideal market to explore issues related to markups, leverage, and loan performance.

In this paper, building on the pioneering work of Harding, Rosenthal, and Sirmans (2003), Ben‐David (2011), and others, we introduce the general concept of markups as a major source of unaccounted‐for risk in loan performance using information on over 100 million home transactions and 40 million mortgages in the United States between 1975 and 2018. The findings have important consequences for modeling mortgage defaults, credit losses conditional on default, and cash flows and pricing of mortgage‐backed securities.

A simple conceptual framework can show that a transaction price and its deviation from the expected price mechanically determine future equity value and are thus key predictors of mortgage performance. Loans with higher (lower) price markups at origination should be, all other things equal, more (less) risky.

We measure markups using a comprehensive data set of housing transactions in the United States. Combining house price data from mortgages purchased or securitized by Fannie Mae, Freddie Mac, and the Federal Housing Administration with property values from sales and transfer documents that are filed at county recorder offices, more than 100 million transactions are amassed involving single‐family housing units since the mid‐1970s. [Footnote 2: The county recorder data have been licensed from CoreLogic.] Our measure exploits repeat‐sales to predict the transaction price of a unit in a manner similar to Anenberg (2016): take an initial purchase price and project it forward using a house price index (HPI). The markup is then computed as the percentage difference between the observed and predicted purchase price of the next transaction; refinances are discarded from the data set. Intuitively, the markup resembles (but is not identical to) a residual from a repeat‐sales regression.
The richness of our data allows us to estimate markups for any house that sells more than once in the sample. Empirical tests indicate that markups calculated in this manner are partially mean reverting, suggesting that markups do not solely reflect unobserved changes in economic fundamentals of a housing unit's quality or location; rather, because markups are mean reverting, they contain a transitory, transaction‐specific component.

We merge this markups database onto two public sources of loan performance information that combine to track nearly 40 million loans from origination to termination. The loan performance sample begins in the early 2000s, which allows us to capture rising and falling markets and, most importantly, provide takeaways about both the Great Recession and the recovery environment. The mortgage origination data provide important details on the loan's terms (e.g., amount, length, and interest rate) and underwriting characteristics (e.g., acquisition channel for securitization). The mortgage performance data track outstanding loan balances, loan maturity, delinquency status, and details about termination. The combined data are used to estimate traditional mortgage performance models for a sample of purchase‐money mortgages. Results from a wide set of specifications and samples provide convincing evidence that markups are associated with future mortgage default. In the baseline specification, posterior default rates are about 50% higher for mortgages with a +20% markup versus those with a −20% markup, [Footnote 3: This range of markups is consistent with earlier work by Molloy and Nielsen (2018) and contains 80% of the markups in our main sample.] conditional on identical covariates including loan‐to‐value ratios (LTV), between 2001 and 2012. [Footnote 4: Throughout this paper, we refer to the “combined” LTV ratio as the LTV. This includes the balances of all loans using the house as collateral at the time of origination. This does not include subsequent second liens. We filter out all loans with either an individual or combined LTV < 25 to focus on first liens.] These findings are robust across years and other sample cuts, and after taking into account nonlinearities and interaction terms in a series of models.

These price and mortgage data also allow us to estimate the relation between markups and credit losses when a loan defaults. Conditional on default, markups are associated with greater credit losses for the holder of the mortgage. Overall, a +20% markup is associated with an additional 3.5% in eventual credit losses relative to a loan with no markup. Importantly, unlike variation in LTV, which has mechanical interactions with mortgage insurance (MI) coverage and depth and is thus insured, variation in the markup translates to credit risk regardless of MI incidence, though the existence of MI does mitigate some of the losses due to markups. [Footnote 5: Lenders often require credit enhancements that require MI depending on the LTV of the loan. MI is typically required for loans with LTVs greater than 80. MI is then in force until enough collateral is accumulated through principal payments to achieve a 78 LTV based on the valuation at origination, or if the mark‐to‐market LTV is below 80.]

Why can markups predict credit risk even after LTV is taken into account? It seems that neither the sale price nor the appraisal captures the “true” average value of a home. The sale price captures conditions that are idiosyncratic to each transaction and that may not reflect average market conditions. Average rather than idiosyncratic conditions should be stronger driving factors for determining collateral value. Moreover, manually performed (human) appraisals may be subject to bias as reported by a growing set of studies in this area (see Nakamura 2010, Agarwal, Ben‐David, and Yao 2013, Calem, Lambie‐Hanson, and Nakamura 2015, Ding and Nakamura 2016, Calem et al. 2021).
To confirm the existence of appraisal bias in the sample, the following is shown: (i) the distribution of appraisal‐based markups has excess mass at zero (in nearly 50% of cases, the sale price is within 0.5% of the appraisal); and (ii) this notch point is predictive of default in a manner consistent with right‐censoring of appraisal valuations. [Footnote 6: We compute appraisal‐based markups using the difference between the purchase price and the appraised price.] The previous evidence and discussion allow us to conclude that collateral coverage may be substantially mismeasured, and this is likely the reason why HPI‐generated markups are correlated with default risk even when conditioning on LTV.

This paper contributes to a growing literature concerned with how the mispricing of assets affects immediate valuations and subsequent performance outcomes. A large number of studies have documented that collateral value relative to the outstanding balance is an important determinant of payment default, highlighting the role of accurately assessing current equity when estimating runoffs and default risks (Campbell and Dietrich 1983, Deng, Quigley, and Van Order 2000, Foote, Gerardi, and Willen 2008, Mayer, Pence, and Sherlund 2009, Piskorski, Seru, and Vig 2010, among many others). This study makes an important point: due to factors idiosyncratic to each trade, transaction values should not be expected to always reflect average market conditions.
Units that have been sold or refinanced at an above (below) average price have, other things equal, less (more) equity and are exposed to more (less) mortgage default risk.

Importantly, these dynamics are not limited to narrow contexts such as hot or cold housing markets or fraud. [Footnote 7: Ben‐David (2011) and Carrillo (2013) show that house prices that are artificially high at origination due to fraudulent practices can be associated with future default rates.] Rather, markups exist for every transaction, pointing to an additional source of asset default risk that may not be apparent with certain underwriting valuation (i.e., appraisal) techniques. Accordingly, this research has fundamental implications for the valuation of pools of loans and asset‐backed securities on which they are based.

More broadly, these findings could be applicable to other assets that are characterized by both large spreads and highly leveraged transactions where the asset purchased serves as the collateral on the loan. These include the markets for auto loans; collectibles such as art, wine, or trading cards; heavy machinery; and insurance policies. When these assets are highly leveraged and markups are large and positive, incentives to default are higher than for loans with small and/or negative markups, holding leverage constant.

CONCEPTUAL FRAMEWORK

A simple conceptual framework can illustrate why price markups exist in the housing market, and the relation between markups and future default rates. Consider a stylized one‐sided, partial equilibrium, two‐period model where potential home buyers search for a house in period $t=0$. To keep things simple and tractable, assume that each buyer is able to find exactly one seller. Homes are identical in their observed characteristics but a buyer obtains a random utility component $B$ that is idiosyncratic to each home‐buyer combination.
From a buyer's point of view, the seller's reservation value and potential trading price $P_0$ is random as it depends on the seller's characteristics and valuation of the house. Assume that the house price is an independently and identically distributed realization drawn from a predetermined continuous distribution $F_0$; that is, $P_0 \sim F_0$. A buyer has access to credit and can purchase the house after paying a downpayment of $\delta P_0$, where $1-\delta \in [0,1]$ measures the LTV ratio. In period $t=1$, a home buyer can become a seller and put a house on the market. A random price offer $P_1 \sim F_1$ is expected. Let $F_1$ be exogenously determined. If selling the house is not profitable, a seller has the option to default on the loan. In what follows, the model is formally set up while denoting random variables and their realizations with upper‐ and lower‐case letters, respectively.

After a buyer has inspected a house and negotiated with the seller, a realization of $B$ and $P_0$ is revealed. From the buyer's perspective, the value of having an opportunity to buy such a house $V^b_0$ in period $t=0$ is equal to
$$\begin{equation} V^b_0(p_0)= \max {\left\lbrace b-\delta p_0+ E {\left[ V^s_1(P_1,p_0) \right]},0 \right\rbrace} , \end{equation}$$
where the match value $b$ denotes the increased utility from housing consumption relative to the outside option (renting), and $E [ V^s_1(P_1,p_0) ]$ is the expected value of having an opportunity to sell this house next period. In period $t=1$, a home owner has the option to sell the house and repay the loan or default
$$\begin{equation} V_1^s(P_1,p_0)=\max \lbrace P_1-b-(1+r)(1-\delta )p_0,-D \rbrace , \end{equation}$$
where $r$ is the loan interest rate and $D$ captures both pecuniary and nonpecuniary default costs.
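As an illustration, the seller's choice in equation (2) can be evaluated numerically. The sketch below is not part of the paper's analysis; the function names and the parameter values in the usage note are hypothetical, chosen only to show how a higher purchase price $p_0$ tips the owner toward default.

```python
def sale_payoff(p1, b, r, delta, p0):
    """Payoff from selling at p1 and repaying the loan (first term in equation (2))."""
    return p1 - b - (1 + r) * (1 - delta) * p0


def seller_value(p1, b, r, delta, p0, default_cost):
    """Equation (2): the owner takes the better of selling or defaulting."""
    return max(sale_payoff(p1, b, r, delta, p0), -default_cost)


def chooses_default(p1, b, r, delta, p0, default_cost):
    """True when the sale payoff is no better than the default payoff -D."""
    return sale_payoff(p1, b, r, delta, p0) <= -default_cost
```

With hypothetical values b = 10, r = 0.05, δ = 0.2, D = 20, and a period‐1 offer of 80, a purchase price of 100 leaves a sale payoff of about −14, so the owner sells; at a purchase price of 120 the sale payoff is about −30.8, which is below −D, so the owner defaults.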
A simple examination of equation (2) allows us to conclude that, conditional on $b$ and $p_0$, a seller would default as long as the realization of $P_1 \le \gamma$, where $\gamma =b+(1+r)(1-\delta )p_0-D$. Note that the probability of default (given realizations $b$ and $p_0$) is equal to
$$\begin{equation} \Pr [P_1\le \gamma \mid b,p_0]=F_1(b+(1+r)(1-\delta )p_0-D). \end{equation}$$
Let
$$\begin{equation} P_R^S(b,p_0)=\gamma \end{equation}$$
be the seller's reservation price: the minimum price at which the seller is willing to sell the house. Now compute
$$\begin{equation} E {\left[ V^s_1(P_1,p_0)\right]}= \int _{\gamma } ^\infty (P_1-b-(1+r)(1-\delta )p_0) dF_1(P_1)- F_1(\gamma )D. \end{equation}$$
After simple differentiation, one can show that
$$\begin{equation*} \frac{ \partial E {\left[ V^s_1(P_1,p_0)\right]}}{\partial p_0} \le 0, \end{equation*}$$
and
$$\begin{equation*} \frac{ \partial V^b_0(p_0)}{\partial p_0} \le 0. \end{equation*}$$
Hence, in period 0, it is also optimal for a home buyer to follow a reservation strategy such that a house is purchased if the realization of $P_0$ is smaller than a reservation price $P_0^R$. After inspecting equation (1), it is clear that $P_0^R$ is implicitly defined by
$$\begin{equation} \int {\left[ V^s_1{\left(P_1,P_0^R\right)}\right]}dF(P_1)=\delta P_0^R-b. \end{equation}$$
As shown in Figure 1, the solution to this equation exists and is unique. This simple model allows us to make three important conjectures that will later guide the empirical analysis.

Figure 1. Home Buyer's Optimal Reservation Price. Notes: This figure illustrates the solution to the home buyer's reservation problem defined in equation (6). The buyer would buy a house if, and only if, the transaction price is below $P_0^R$. The left‐hand side of equation (6) is nonincreasing, convex, and converges to $-D$ as $p_0$ approaches infinity. The right‐hand side is an increasing linear function of $p_0$. Hence, a unique solution exists. Any shift of the distribution $F_1$ to the right (that is, an increase in expected future appreciation rates) will increase current buyers' reservation values.

First, the model predicts that households pay different prices for identical housing. Some households pay a positive price markup (sale price above the average), while others pay less. What is important to remark here is that the price markup is not a consequence of unobserved housing heterogeneity but rather is due in part to random matching and heterogeneity in buyer and seller tastes. This is an important observation that is considered when estimating the price markups empirically.

Second, the model clearly shows that there is a positive relationship between price markups and default rates: higher markups (higher $p_0$) increase the probability of future default (see equation (3)). In the empirical section, this hypothesis is tested and the implications are investigated. [Footnote 8: Partial equilibrium solutions have been analyzed where it has been assumed that $F_0$ and $F_1$ are exogenous to the model. This may not be a reasonable assumption since sale prices are typically determined after a (sometimes complex) negotiating and bargaining process between buyers and sellers. Moreover, a home buyer can become a home seller in the future. This complication is avoided for the sake of clarity and simplicity.]

Finally, also note that buyers with lower costs of future default (lower $D$) are willing to pay higher prices for houses. [Footnote 9: To see this from Figure 1, note that as $D$ decreases the dotted horizontal line in that figure moves higher, closer to the x‐axis. At the same time, the downward sloping curve above ($E[V_1^s(p_0)]$) shifts up, raising the value at the intersection that determines the buyer's reservation house price ($P^R_0$). We thank an anonymous referee for making this point.] This means that buyers with lower default costs will pay on average higher prices for identical housing. Higher markups and lower default costs, in turn, predict higher mortgage default rates. Are defaults more common among buyers that begin with positive markups because these markups make negative equity more likely, or because these markups indicate that these buyers have lower costs of default? From an empirical perspective, it is challenging to isolate these channels and precisely identify the causal effect of markups on future default rates. We emphasize that we are not seeking to uncover such a causal link. The empirical section focuses on estimating the conditional correlation between current markups and future default rates.

MEASUREMENT OF MARKUPS, DATA, AND STATISTICS

A markup is defined as the (log) difference between the actual price and the predicted price of a home. To predict the average sale price of a housing unit, we use microdata and a valuation method based on market averages. The method uses two transaction prices of the same housing unit, along with a local HPI, to calculate the difference between the actual price and a predicted price. After outlining this method along with its strengths and weaknesses, a description is provided for the data used to calculate the measures found in the subsequent empirical sections.

Finally, some basic descriptive statistics of the markups are presented for origination cohorts, loan outcomes, distributions, and same‐unit dynamics. We note that predicted home prices (and markups) can be estimated in alternative ways, using estimates from professional appraisals and/or automated valuation models (AVMs).
In Section 6, we explore the role (if any) of appraisal‐based markups, leaving analysis of AVMs for further research.

Markup Measurement

The house price markup for a transaction is defined as the log‐difference between the actual transaction value ($V$) and a predicted value ($\hat{V}$) of a unit. Markups are measured by taking the previous transaction value of unit $i$ in area $j$ at time $t$, and multiplying this value by the change in a local HPI ($P$) between the first (anchor) transaction at time $t$ and the second (subject) transaction at time $t+1$. [Footnote 10: This value estimation approach is also used by Anenberg (2016) and Molloy and Nielsen (2018) in other contexts.] This method requires two transactions of the same unit to calculate a markup ($m$). With lower‐cased variables indicating logs, the markup is defined as follows:
$$\begin{equation} m_{i,j,t+1}\equiv v_{i,j,t+1}-\hat{v}_{i,j,t+1}; \quad \hat{v}_{i,j,t+1}=\hat{p}_{j,t+1}-\hat{p}_{j,t}+ v_{i,j,t}, \end{equation}$$
where $\hat{p}_{j,t+1}-\hat{p}_{j,t}$ reflects house price appreciation as measured by the area's repeat sales index from time $t$ to time $t + 1$.

An example of the calculation of the price markups for a particular housing unit is shown in Figure 2. This house transacted four times between 1998 and 2015, with transaction prices denoted by solid circles. Each predicted price, based on the previous anchor transaction and the change in the HPI, is denoted by a hollow diamond. The markup is shown in the lower panel with the bar height representing the percent difference between the actual and predicted subject transaction prices. The 2002 transaction was about 6% lower than the predicted transaction price, indicating a −6% markup that is denoted with a solid red circle. [Footnote 11: Throughout the paper, log‐differences and percent changes are used interchangeably. This is a reasonable approximation for small ($<|0.3|$) log‐differences, but becomes increasingly inaccurate as differences increase.] Then, based on this second transaction price, the predicted price in 2009 was about 10% lower than the actual, giving a +10% markup as depicted with a solid green circle. Finally, the fourth transaction indicates a −10% markup relative to the third transaction, and is depicted with another red circle because of the negative markup. We focus on purchase transactions (both cash and purchase‐money mortgages).

Figure 2. An Illustration of the Calculation of a Markup. Notes: This figure is calculated using GSEs' data on mortgages and county recorder data on transactions from an actual sequence of four transactions of a single housing unit in Washington, DC.

Data

To calculate markups for a large set of houses and estimate their association with various mortgage outcome measures, a database is assembled with information on three main items for each housing unit: a time series of transactions, a relevant HPI, and mortgage attributes and performance.

Transaction record

The base data set offers nearly complete coverage of real property and refinance transactions for single‐family housing in the United States. This record is based on two main sources and is identical to the data used by the Federal Housing Finance Agency (FHFA) to produce its suite of house price indices. The first source is administrative data from Fannie Mae, Freddie Mac, and the Federal Housing Administration. These include transaction and appraised values of homes from both purchase‐money and refinance mortgages purchased, securitized, or guaranteed by any of these three entities. Appended to this data set is a county recorder transaction file from CoreLogic consisting of transaction values from sales and transfer documents that are filed with county recorder offices.
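The markup definition in equation (7) can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the HPI is assumed to be supplied in levels (so its log is taken here), and the prices and index values in the usage note are made up.

```python
import math


def predicted_log_value(anchor_price, hpi_anchor, hpi_subject):
    """Project the log anchor price forward by the log change in the local HPI."""
    return math.log(anchor_price) + math.log(hpi_subject) - math.log(hpi_anchor)


def markup(subject_price, anchor_price, hpi_anchor, hpi_subject):
    """Equation (7): log actual subject price minus log predicted price."""
    v_hat = predicted_log_value(anchor_price, hpi_anchor, hpi_subject)
    return math.log(subject_price) - v_hat
```

For a hypothetical house bought for 200,000 when the local index stood at 100 and resold for 330,000 when the index stood at 150, the predicted price is 300,000 and the markup is log(1.1), or roughly +10%. A resale at exactly 300,000 would give a markup of zero.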
The county recorder records include information that the administrative data miss, including cash purchases or purchases with loans that are held in portfolios or private‐label securities. These sources combine to include more than 100 million transactions involving single‐family housing units since the mid‐1970s.

House price indices

A panel of local HPIs is merged onto the transaction records. To account for local variation in house prices, a ZIP‐code‐level file is obtained from the FHFA as described in Bogin, Doerner, and Larson (2019a). [Footnote 12: These indices are produced at an annual frequency for ZIP codes with at least 25 “half‐pairs” (a count of the paired transactions where either the first or the second transaction occurs in the respective year) in each year.] After combining these indices with the transactions record, markups are calculated for all transactions on a property except for the first recorded transaction. The out‐of‐sample standard deviation of prediction errors for these indices is approximately 10% for a 1‐year holding period, rising to about 15% for 8 years (see Figure 1a in Bogin, Doerner, and Larson 2019b). [Footnote 13: For some context, estimates by Goodman and Thibodeau (2003) show hedonic models to have unit‐level prediction errors between 3% and 10% depending on the level of aggregation. While an HPI may produce noisier price estimates, this should only affect the second moment, not the first. Accordingly, measurement error should only serve to inflate standard errors (left‐hand side error) and attenuate estimates (right‐hand side error). An HPI also has the advantage of not requiring a vector of detailed housing unit attributes. This allows us to capture an extensive number of transactions going back in time. A large number of markups are required in later stages of the paper when estimating effects of markups on loan performance, as these models require a sufficient number of calculated markups.]

Loan performance data

Finally, mortgage origination and performance information are gathered at the loan level to measure the effects of price markups on specific payment outcomes. This information is from Fannie Mae and Freddie Mac's publicly available single‐family performance files. [Footnote 14: Data are obtained from Freddie Mac and Fannie Mae from their websites and a database of proprietary identifiers via a regulatory oversight agreement. Freddie Mac data are at https://www.freddiemac.com/research/datasets/sf-loanlevel-dataset. Fannie Mae data can be found at http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html.] The mortgages represent fully amortizing fixed‐rate loans that have been purchased or securitized between 2001 and 2012. These are “full documentation” loans, which means that loan application information has been verified or waived. A typical loan is underwritten for 30 years but the data also contain 15‐, 20‐, and 40‐year terms, although loan data with these other lengths are not available for originations before 2005. Mortgages are excluded if they are adjustable‐rate, interest‐only, balloon, step‐rate, or are not first liens. The sample also usually excludes mortgages that have credit enhancements that go beyond primary MI. Cash transactions or government‐issued mortgages, such as from the Federal Housing Administration (FHA) or U.S. Department of Veterans Affairs (VA), are not present in this performance data set, though they are present in the transactions records used in the creation of the markups measures.
Loans are also excluded if they participated in the Home Affordable Refinance Program (HARP) or if they have been flagged for other nonstandard attributes such as LTVs above 97%, immediate liquidations, biweekly payment due dates, reduced documentation, streamlined processing, or participation in a lender recourse or third‐party credit‐sharing arrangement.

The origination data provide details about a loan's terms (amount, length, and rate), purpose (purchase or refinance), MI coverage, origination channel, and general location. The ongoing performance files track monthly outstanding loan balance, maturity (age and remaining months), loan modification, delinquency status, and outcomes upon termination (actual loss revenues and expenses, covered chargeoffs, or repurchased workouts). A “default” is defined as a negative loan termination, including foreclosure, foreclosure alternative, or lender repurchase.

Loan Counts

The analysis begins with the 3.9 million purchase‐money mortgages (“purchase mortgages”) in the loan performance file for which we are able to calculate a markup. About 3.4 million loans pass several data quality filters. [Footnote 15: Major drops include 350,000 observations with markups greater than +/−50% and 90,000 observations without fully populated covariates.] Table 1 outlines counts of loan originations and outcomes by origination cohort. [Footnote 16: Outcomes are as of March 2018.]

Table 1. Purchase Loan Counts and Outcomes

Year             Loans      Current  Prepay  Default  Direct   Modification  Positive  Neg. markup   Pos. markup
                                                      default                markup    default rate  default rate
2001             366,582    1.5%     97.4%   1.0%     31.5%    0.5%          56.2%     0.9%          1.2%
2002             338,215    3.0%     95.6%   1.3%     38.7%    0.8%          58.8%     1.2%          1.4%
2003             352,748    7.8%     90.0%   2.2%     51.1%    1.6%          67.0%     2.1%          2.3%
2004             248,345    8.7%     88.1%   3.2%     57.1%    2.1%          65.9%     2.8%          3.4%
2005             273,183    10.0%    84.6%   5.4%     63.4%    3.0%          62.8%     4.0%          6.2%
2006             243,051    7.3%     86.6%   6.1%     60.8%    3.6%          57.6%     5.3%          6.7%
2007             251,120    8.6%     84.5%   6.9%     53.2%    4.4%          51.8%     6.2%          7.6%
2008             276,320    8.6%     87.3%   4.1%     44.7%    3.6%          41.3%     3.4%          5.1%
2009             279,763    20.4%    78.9%   0.6%     41.0%    0.5%          43.5%     0.6%          0.8%
2010             247,342    29.8%    69.9%   0.3%     38.9%    0.4%          49.5%     0.2%          0.3%
2011             232,879    37.9%    62.0%   0.2%     32.2%    0.3%          48.6%     0.1%          0.2%
2012             305,801    67.3%    32.6%   0.1%     30.5%    0.2%          56.0%     0.1%          0.1%
Total (all)      3,415,349  17.0%    80.5%   2.5%     44.7%    1.7%          55.3%     2.1%          2.8%
Total (2006–07)  494,171    8.0%     85.5%   6.5%     57.0%    4.0%          54.6%     5.7%          7.1%

Note: The sample includes all purchase loans with a calculated markup subject to filters noted in the text as of March 2018. A direct default is defined following Foote et al. (2010): (i) the borrower must be current for three consecutive months, and then register a 90‐day delinquency 3 months later; (ii) the borrower must never have been seriously delinquent (90 days) before triggering (i); and (iii) the borrower must never become current again before defaulting.

The number of purchase originations in the sample is highest in 2001–2003 at around 350,000 loans per year, and then remains roughly constant at between 240,000 and 300,000 loans per year for the remainder of the sample. For loans originated prior to 2009, about 90% or more of each vintage of loans has resolved in either default or prepayment. Defaults are low in 2001 at 1.0%, rise to 6.9% in 2007, and fall to 0.1% in 2012, though with each successive vintage past 2007, an increasing share of loans remains current. Loan modifications follow a similar pattern as defaults.

Direct defaults are tracked as a measure of strategic default.
A direct default happens when a borrower makes no payments after becoming delinquent, instead choosing to “simply walk away from the home” (Foote et al. 2010). Direct defaults follow Foote et al. (2010) by applying three criteria: (i) the borrower must be current for three consecutive months, and then register a 90‐day delinquency three months later; (ii) the borrower must never have been seriously delinquent (90 days) before triggering (i); and (iii) the borrower must never become current again before defaulting. Across the sample, about 45% of defaults are direct, with loans in 2006 and 2007 averaging about 57%.

Over the entire sample, about 55% of all purchase loans have positive markups. [Footnote 17: The average markup is positive, indicating a difference in the selection mechanism into the sample versus the HPI. By construction, the HPI approach should give markups that are, on average, equal to zero if the samples are identical and markups are symmetric. The combination of a price index and property selection has isolated a subset of houses that have transacted for prices that are higher than the market area's average. This is likely due to two main reasons. First, the price indices are based on conforming, conventional loans. If an initial transaction were cash or a non‐GSE foreclosure sale, the price could be depressed and the markup calculated for the second transaction may be artificially inflated. Second, houses that have transacted multiple times in sequence may have undergone a renovation or “flip” causing, again, a positive markup. Ultimately, we continue with the analysis but acknowledge such potential shortcomings.]

Table 1 also presents default rates by markup sign. In every year, loans with negative markups have default rates no higher than those of loans with positive markups. The magnitudes are often remarkably dissimilar, indicating a large unconditional correlation between markups and defaults.
For instance, in 2007, the default rate for loans with negative markups is 6.2%; for positive markups, it is 7.6%. This relationship between negative and positive markups and default rates persists in noncrisis periods; even with default rates dropping to less than 1% in post‐2009 cohorts, positive markups still have (weakly) greater default rates than negative markups. Overall, there is a clear, bivariate, reduced‐form relationship between the sign of the markup at origination and eventual default. This relationship is remarkably robust and economically relevant, especially in periods of heightened default risk.

Kernel densities related to markups and defaults are shown in Figure 3. The distribution of markups is shifted to the left for those that prepay versus those that default, indicating a positive association between default and markups throughout the distribution. The large variance of the markups is also notable. We attribute the large variance to buyer/seller match idiosyncrasies, the presence of error in the markup measurement, and quality changes across transactions. The last two factors attenuate any estimated relation between markups and outcome variables in empirical work. However, as will be shown, the markup is highly predictive across a battery of loan outcomes, model specifications, and samples, despite the potentially high degree of noise in the particular measure. [Footnote 18: Markups are greater than 0.25 or less than −0.25 in 14.1% of all our loans, which is comparable to the 20–26% of “markups” falling outside of this range when using Zillow's Home Value Index, according to Molloy and Nielsen (2018).]

Figure 3. Markup Distribution. Notes: The sample includes GSE purchase mortgages originated in 2001 through 2012 with a calculated markup.

Same‐Unit Dynamics

The model in the prior section suggests that idiosyncratic factors related to the buyer/seller match generate markups.
We assess the presence of these factors in markups across consecutive transactions of the same housing unit. We find that, on average, markups are partially mean reverting, suggesting that markups contain the crucial idiosyncratic component we believe is related to default.

Continuing from Equation (7), let us assume value can be decomposed into one-dimensional price $p$ and quantity $q$ and that the estimate of HPI appreciation is measured with error $e$. Then the markup is $m_{t+1}=\Delta q_{t+1}+\Delta p_{t+1}-\Delta \hat{p}_{t+1}-e_{t+1}$. From this expression, a number of factors influence the markup: changes to the quantity of housing services provided by the property, changes to the price paid per unit of housing services, HPI changes, and measurement error in the HPI. Each factor deserves some discussion.

Quantity changes may be one-time events (e.g., a large-scale renovation), so the markup in $t+1$ may be uncorrelated with future quality changes in $t+2$. In this case, the markup is uncorrelated with future markups. On the other hand, we would expect price changes for individual units in excess of market-measured prices to be mean-reverting upon subsequent transactions, giving markups that are negatively correlated. If the HPI had measurement error, the markup would be of equal and opposite sign to this error. Note that in period $t+2$, the markup from period $t+1$ may be, but is not required to be, relevant; some portions of the markup are mean-reverting and others are not. In general, we expect markups to be partially mean-reverting when averaged over the market.[19]

[19] Drift in quality may also cause the average markup in a group to be nonzero, even with an unbiased HPI.

To model the same-unit dynamics of markups, a future markup is expressed as a function of the current markup, the prior markup, and other covariates, as shown below.
This equation is an adapted version of the simple mean-reversion specification found in Davis and Weinstein (2002). A variable is also included for default due to the well-documented foreclosure premium, as well as the financing type (versus a GSE purchase-money mortgage), and in some specifications, the time between transactions, $h(t,t+1)$. To control for the possibility that markups are nonzero on average, time period and state fixed effects are included. To account for residual variance in the model, standard errors are clustered at the level of the state interacted with the year.

$$\begin{equation} m_{i,t+1}=\beta _0+\beta _1 m_{i,t-1}+\beta _2 m_{i,t}+X^{\prime }\gamma +u_{i,t+1}. \tag{8} \end{equation}$$

If $\beta _1<0$ and/or $\beta _2<0$, then the prior markup contains a transitory component; the markup in the past is predictive of a markup of opposite sign in the future. On the other hand, if both are equal to 0, then the prior markup is not predictive of future markups and can be considered permanent and reflective of a change to home quality. A positive coefficient would represent the presence of a series of positive or negative markups relative to the market average price, suggesting serially correlated changes to home quality.

Table 2 shows estimates of same-unit markup dynamics across transactions. There are four models, with each corresponding to a different conditioning set or subsample. Column 1 presents estimates using the full set of purchase mortgages. The current markup parameter is −0.34 and the previous markup parameter is −0.03.[20]

[20] Genesove and Mayer (2001) estimate the residual from the previous sale hedonic to have a partial effect of 0.16 on the next sale, and a “months since last sale” parameter of −0.0004, or about −0.005 per year. This hedonic residual is an alternative calculation of a markup.
This specification is remarkably close to ours but is not adapted to the task of estimating markup dynamics, making comparison somewhat difficult. If they had estimated an interacted variable parameter relating the residual $\times$ months since last sale along with a vector of controls, then the parameters would be comparable. As it stands, their estimate of 0.16 for the residual on the next sale corresponds to a −0.84 estimate (about 2.5$\times$ our estimate) on the following markup in our specification. The −0.005 per year is larger than our 0.0005 estimate, though by assumption, they are restricting the interacted coefficient to 0, while ours is −0.01. Accordingly, the findings are broadly consistent with the signs and significance levels found in Genesove and Mayer (2001), though they are fairly far apart in magnitude.

In terms of other covariates, the foreclosure premium is estimated to be about −25%, which is similar to the estimates found in Campbell, Giglio, and Pathak (2011). Financing appears to have little effect with parameters between −0.5% and 1% versus GSE-acquired mortgages. Column 2 presents results from a model with covariates for the time-between-transactions and interactions with the markup. This result suggests the markup increasingly mean reverts the longer the holding period.
Columns 3 and 4 show that positive markups exhibit lower levels of mean reversion than negative markups.

Table 2. Same-Unit Markup Dynamics
Dependent variable: Next markup (Markup[t+1])

| | [1] | [2] | [3] | [4] |
|---|---|---|---|---|
| Sample | All | All | Positive Markup | Negative Markup |
| Markup[t] | −0.337*** | −0.283*** | −0.0718*** | −0.523*** |
| | [0.00720] | [0.00850] | [0.0116] | [0.0193] |
| Markup[t−1] | −0.0336*** | −0.0336*** | −0.0297*** | −0.0367*** |
| | [0.00163] | [0.00163] | [0.00189] | [0.00193] |
| Markup[t] $\times$ h[t,t+1] | | −0.0119*** | −0.0214*** | −0.00713* |
| | | [0.00137] | [0.00189] | [0.00383] |
| h[t,t+1] | | −0.00049 | 0.000577 | 0.000497 |
| | | [0.000381] | [0.000410] | [0.000399] |
| Default[t] | −0.250*** | −0.249*** | −0.271*** | −0.222*** |
| | [0.00916] | [0.00917] | [0.00830] | [0.0127] |
| FHA[t−1] | 0.0132*** | 0.0134*** | 0.00699*** | 0.00705*** |
| | [0.00189] | [0.00188] | [0.00163] | [0.00271] |
| non FHA, GSE[t−1] | 0.0103*** | 0.0104*** | 0.000195 | 0.0185*** |
| | [0.00240] | [0.00242] | [0.00249] | [0.00338] |
| FHA[t+1] | −0.0481*** | −0.0484*** | −0.0477*** | −0.0520*** |
| | [0.00218] | [0.00213] | [0.00200] | [0.00268] |
| non FHA, GSE[t+1] | −0.00592*** | −0.00551*** | −0.00889*** | −0.00077 |
| | [0.00157] | [0.00159] | [0.00173] | [0.00190] |
| Constant | 0.0499*** | 0.0527*** | 0.0290*** | 0.0292*** |
| | [0.00108] | [0.00200] | [0.00239] | [0.00243] |
| Observations | 475,729 | 475,729 | 264,036 | 211,693 |
| R2 | 0.115 | 0.116 | 0.094 | 0.103 |

Note: Robust standard errors in brackets and clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. Fixed effects by State $\times$ Year are absorbed prior to estimation, with standard errors reflecting true degrees of freedom. The sample includes GSE purchase mortgages originated in 2001–2012 with both an earlier ($t-1$) and later ($t+1$) markup, necessitating at least four sequential purchase or refinance transactions on the same housing unit.

In sum, markups are mean reverting. These results indicate that markups are driven by transaction-level pricing that is, in part, transitory. Our theoretical model suggests that due to these transitory factors, markups are associated with variation in collateral risk for mortgages.
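As a reading aid for column 2 of Table 2, the interaction term implies that the coefficient on the current markup becomes more negative as the holding period lengthens. The Python sketch below only restates that arithmetic. The function names are hypothetical, and the unit of the holding period $h$ is an assumption, since it is not stated in this section.

```python
# Point estimates copied from Table 2, column 2.
BETA_MARKUP = -0.283        # coefficient on Markup[t]
BETA_INTERACTION = -0.0119  # coefficient on Markup[t] x h[t,t+1]


def implied_reversion(holding_period):
    """Implied coefficient on the current markup at a given holding period.

    The unit of `holding_period` is whatever unit h[t,t+1] is measured in
    (an assumption here; the text does not state it).
    """
    return BETA_MARKUP + BETA_INTERACTION * holding_period


def markup_contribution(current_markup, holding_period):
    """Part of the next markup attributed to the current markup alone.

    Other covariates, fixed effects, and the h[t,t+1] main effect are ignored.
    """
    return implied_reversion(holding_period) * current_markup


if __name__ == "__main__":
    for h in (0, 5, 10):
        print(h, round(implied_reversion(h), 4), round(markup_contribution(0.10, h), 4))
```

Under these point estimates, a longer holding period always implies stronger reversion, which is the pattern described in the text. The sketch ignores sampling uncertainty in both coefficients.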
The analysis now turns to an examination of the relationship between markups, mortgage defaults, survival lengths, losses conditional on default, and comparisons with markups calculated using mortgage financing appraisals.

PREDICTING EVENTUAL DEFAULTS

Conceptually, it is clear that lower collateral, holding the loan balance and other variables constant, should be associated with a higher risk of default. This directly applies to markups for reasons best shown by a simple illustration. Suppose a mortgage with a stated 97 LTV ratio is used to purchase a house, but the sale price contains a +10% pricing error. This loan would immediately be underwater with a current LTV of about 108 because the collateral is worth less than the initial valuation going into the calculation of the LTV. Similarly, if the housing unit is underpriced by 10%, the 97 LTV mortgage would immediately have a current LTV of 88, making it substantially less risky.[21]

[21] The LTV math is as follows: $\frac{97}{90}\approx 1.078$; $\frac{97}{110}\approx 0.882$.

In either case, the standard LTV is not based on an accurate collateral valuation. This is especially dangerous in the case of a positive markup, because it potentially indicates an appraisal failure. One of the primary purposes of an appraisal is to help assess the risk of default for both the lender and the borrower. With a positive markup, the LTV understates the default risk. For these reasons, emphasis is placed on assessing how price markups affect default risk conditional on the origination LTV.

The bivariate relationship between the markup and default is shown in the first panel of Figure 4 for the 2001–2012 sample. At a −20% markup, the default probability is 2%, whereas at +20%, the default probability is about 3.1%, roughly 1.5 times the rate at the low markup. The relation is monotonic and of the predicted sign, giving a strong indication that markups can be used to predict eventual defaults in a meaningful way.
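To express a markup in collateral terms, the LTV arithmetic from the 97 LTV illustration can be restated as a small Python sketch. The function name `current_ltv` is hypothetical. The sketch assumes, as that illustration does, that a markup of $m$ means the collateral is worth $(1-m)$ times the transaction price.

```python
def current_ltv(stated_ltv, markup):
    """Restate an origination LTV on collateral valued net of the markup.

    Assumes the convention used in the illustration: a +10% pricing error
    means the collateral is worth 90% of the transaction price.
    """
    if markup >= 1.0:
        raise ValueError("markup must be below 1 for a positive collateral value")
    return stated_ltv / (1.0 - markup)


if __name__ == "__main__":
    for m in (-0.20, -0.10, 0.10, 0.20):
        print(f"markup {m:+.2f}: current LTV {current_ltv(97, m):.1f}")
```

Under this convention, the same stated LTV maps to a wide range of effective LTVs across the ±20% markup interval.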
While +/−20% may seem to be a wide range, recall that (i) the particular markups estimate likely includes substantial idiosyncratic noise, and (ii) the range between these values includes about 80% of the markups; that is, over 20% of values are outside this range.

Figure 4. Distributions of Markups and Defaults Using a House Price Index (HPI) versus Appraisals. Notes: The sample includes loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. Histogram bins at the edges of the respective figures include censored values: values between −50% and −20% are attributed to the −20% bucket, and values between 20% and 50% to the 20% bucket. The default rate curve in Panel (b) does not include appraisal markups within 0.5% of zero, which includes 49% of all loans. This particular group of loans has been associated with adverse selection in the literature; we also find that it introduces a moderate discontinuity in the default rate curve. Loans within this range have a default rate of 2.5%.

Methods

To examine empirically the link between markups and default in a more rigorous fashion, a standard competing options default model is constructed. The dependent variable $d$ for loan $i$ is set equal to 1 if the mortgage terminates in default, defined as a foreclosure, foreclosure alternative such as a short sale, or a lender repurchase, 2 if the loan prepaid, and 0 if it is current as of March 2018. The variable of interest is the markup, calculated for the origination period, and the partial correlation of this variable is hypothesized to be positive on all forms of default. Note that this model requires complete ex post information and cannot be used to model causal lender or borrower behavior or answer the questions regarding whether lenders (borrowers) account for markups in their choices to issue (take out) a mortgage. Additionally, the markup calculation in a repeat-sales framework benefits from sales subsequent to the origination period.
These limitations notwithstanding, the model here is illustrative of partial relations between the variables in question.

The functional form of the specification is a logit equation, with control variables in the vector $X$, coefficients for controls in the vector $\gamma$, and $e$ as a generalized extreme value IID random variable.[22]

[22] This is not the precise functional form as presented in Section 2. In the conceptual framework (and in other models, e.g., Hatchondo, Martinez, and Sanchez 2015, Bailey, Davila, Kuchler, and Stroebel 2017), the interaction between interest rates and the LTV determines the equilibrium outcome. The markup is layered onto this expression, which if properly specified, would necessitate a triple interaction and multiple double interactions, with default increasing in each of the arguments both through direct and interacted effects. These complex interactions are omitted to focus on the total partial effects using a parsimonious specification, acknowledging the possibility of omitted variable bias at extreme values.

Normalization requires all $d=0$ parameters set equal to zero, or $\lbrace \beta _{0,0}, \beta _{1,0}, \gamma _0\rbrace =\lbrace 0,0,\mathbf {0}\rbrace$.

$$\begin{equation} \Pr(d_{i}=j)= \frac{\exp {(\beta _{0,j}+\beta _{1,j} m_{i}+X_{i}^{\prime }\gamma _j+e_{i,j})}}{\sum _{\tau =0}^{J}\exp {(\beta _{0,\tau }+\beta _{1,\tau } m_{i}+X_{i}^{\prime }\gamma _\tau +e_{i,\tau })}}. \tag{9} \end{equation}$$

The choice of controls is standard in the literature and is motivated in particular by Ghent and Kudlyak (2011) and Davis et al. (2019). First, variable ranges or “buckets” are included for the LTV, debt service payment to income ratio (DTI), and the credit score. The DTI at origination is a standard default indicator, as a high debt fraction of household income makes a household particularly susceptible to income shocks in terms of ability to repay.
Credit score at origination is also a common indicator, representing a household's willingness and ability to repay debt in the past and also may be related to the cost of default ($D$), a key determinant of loan repayment in our conceptual framework. Multiple borrowers can mitigate the income risk of default, as the risk of a default-inducing income shock falls. First-time home buyers may be more or less risky because they often have less wealth, yet also often receive preferential treatment in the tax code, have acquired mortgages during a period of their lives when incomes tend to be accelerating, and may be held to stricter lending standards than repeat borrowers, all else equal. Investment properties and second homes are also risk indicators because debt on these sorts of luxury assets is the first to be defaulted upon in times of stress for the borrower, who presumably has multiple mortgages simultaneously.[23]

[23] Chinco and Mayer (2016) show that such buyers are more prone to mispricing home purchases. This can be taken a step further by analyzing whether such borrowers also perform worse on loans and if the mispricing exacerbates credit losses.

The origination channel controls for associations between different types of mortgage originators and loan outcomes. The amortization period is covered by dummy variables for 20-, 30-, and 40-year terms (versus 15-year term).

A prepayment option control is included by way of the interest rate at origination. Combined with a time period fixed effect, the interest rate variable turns into a spread-at-origination variable, which is a prepayment risk indicator.[24]

[24] Agarwal, Ben-David, and Yao (2017) show that mortgage points are correlated with prepayment speeds, but less so than would be implied by the cost of the points. These points may be associated with markups as well, as a buyer who has a positive preference for a particular home may be both willing to overpay for a house and also purchase points on the mortgage.
Unfortunately, a variable representing mortgage points is not available in the data.

The (relative) interest rate acts as a control for a latent preference to remain in a home.

A default option variable is included as the cumulative change in the HPI between origination and the next transaction. This variable captures the Deng, Quigley, and Van Order (2000) concept of an underwater mortgage without explicitly calculating the probability.[25]

[25] Deng, Quigley, and Van Order (2000) calculate the probability a mortgage is underwater by taking the cumulative standard normal distribution of the log-difference in the current loan balance ($u$) and the current house value (i.e., the log-approximation of the current LTV) divided by the estimated residual standard deviation of the HPI value, or $\Phi [(\ln u_{i,t+h} - \ln V_{i,t+h})/\sqrt {\omega ^2}]$. Rather than using a probabilistic measure, the origination LTV and the change in the HPI are included as separate covariates.

In addition, the positive markup variable itself is an underwater indicator, as it serves to increase or decrease the fundamental property value relative to the mortgage balance. The seasoning of the loan is included as a vector of amortization year fixed effects to account for amortization.

Time period and state fixed effects serve as important controls as well. In addition to the previously mentioned transformation of the interpretation of the interest rate variable, period fixed effects control for other macroeconomic-related conditions, including housing market liquidity, national unemployment rate changes, and other factors. State effects capture state-level variation in recourse versus nonrecourse laws, elasticity of housing supply, propensity to overbuild, fraud, and other factors throughout the crisis and its aftermath.
Overall, these controls mitigate omitted variable bias in the treatment parameter, allowing us to capture the predictive relation between markups and defaults after accounting for a comprehensive set of variables.

Results

Results of several different default models are shown in Table 3. The first model is a multinomial logit specification, with “current” as the baseline outcome, and defaults and prepayments as competing options. Markups have a consistently positive and statistically significant effect on the probability of default. These estimates are also economically relevant, as going from a −20% markup to a +20% markup increases the default probability from 5.0% to 8.4% for a high-risk mortgage originated in 2006.[26]

[26] Variables are chosen to represent a 30-year, fixed rate, purchase mortgage in Florida in 2006, with one first-time borrower, issued through a retail channel, which terminates after 3 years. The DTI is 39, the FICO is 690, and the LTV is 91.

Table 3. Markups and Loan Outcomes, Part 1
Estimator: Multinomial Logit (vs. Current) for models [1] and [2]. Sample: All Loans. Model [2] splits the default outcome by default type.

| | [1] Default | [1] Prepay | [2] Forec. Alt. | [2] Repurchase | [2] REO | [2] Prepay |
|---|---|---|---|---|---|---|
| Markup | 1.380*** | 0.0199 | 1.319*** | 1.045*** | 1.494*** | 0.0174 |
| $\%\Delta$ House Price Index | −4.794*** | 0.861*** | −5.659*** | −1.666*** | −4.845*** | 0.906*** |
| Mortgage interest rate | 0.648*** | 0.155*** | 0.500*** | 0.890*** | 0.645*** | 0.156*** |
| Combined LTV Bucket (vs. 0–60) | | | | | | |
| 61–70 | 0.944*** | −0.0321** | 1.135*** | 0.200** | 1.038*** | −0.0342** |
| 71–75 | 1.339*** | −0.0134 | 1.548*** | 0.549*** | 1.430*** | −0.0175 |
| 76–80 | 1.696*** | 0.00317 | 1.923*** | 0.699*** | 1.826*** | −0.00051 |
| 81–85 | 1.983*** | 0.0327 | 2.103*** | 1.187*** | 2.141*** | 0.0298 |
| 86–90 | 2.330*** | 0.0668*** | 2.456*** | 1.357*** | 2.511*** | 0.0635*** |
| 91–95 | 2.598*** | 0.0572*** | 2.717*** | 1.591*** | 2.795*** | 0.0524** |
| 96+ | 2.768*** | −0.0491 | 2.847*** | 1.922*** | 2.962*** | −0.0528 |
| DTI Bucket (vs. 0–33) | | | | | | |
| 34–38 | 0.154*** | 0.0150* | 0.194*** | 0.0876* | 0.145*** | 0.0147* |
| 39–43 | 0.279*** | 0.0148* | 0.335*** | 0.230*** | 0.260*** | 0.0138* |
| 44–50 | 0.322*** | −0.0133 | 0.389*** | 0.328*** | 0.287*** | −0.0154 |
| 51+ | 0.369*** | −0.142*** | 0.356*** | 0.549*** | 0.346*** | −0.143*** |
| Credit Score Bucket (vs. 300–579) | | | | | | |
| 580–619 | −0.232** | 0.295*** | 0.263*** | −0.0316 | −0.681*** | −0.139 |
| 620–639 | −0.358*** | 0.406*** | 0.367*** | −0.104 | −0.944*** | −0.262** |
| 640–659 | −0.436*** | 0.540*** | 0.496*** | −0.107 | −1.206*** | −0.344*** |
| 660–689 | −0.589*** | 0.710*** | 0.664*** | −0.169 | −1.404*** | −0.528*** |
| 690–719 | −0.780*** | 0.841*** | 0.794*** | −0.256* | −1.656*** | −0.758*** |
| 720–769 | −1.144*** | 0.950*** | 0.902*** | −0.491*** | −2.148*** | −1.172*** |
| 770+ | −1.683*** | 0.926*** | 0.879*** | −0.910*** | −2.763*** | −1.777*** |
| Multiple Borrowers | −0.589*** | 0.167*** | −0.327*** | −0.775*** | −0.683*** | 0.165*** |
| First-time homebuyer | −0.136*** | −0.0831*** | −0.220*** | 0.0195 | −0.124*** | −0.0827*** |
| Acquisition Channel (vs. Retail) | | | | | | |
| Broker | 0.0836*** | −0.127*** | 0.101*** | 0.0123 | 0.0755*** | −0.124*** |
| Correspondent | −0.0702*** | −0.122*** | 0.00696 | −0.181** | −0.0774*** | −0.121*** |
| Not Specified | 0.599*** | 0.448*** | 0.570*** | 0.452*** | 0.630*** | 0.441*** |
| Occupancy Type (vs. Owner) | | | | | | |
| Investment Property | 0.190*** | −0.167*** | −0.221*** | 0.124 | 0.376*** | −0.164*** |
| Second Home | 0.0829* | −0.0146 | −0.215*** | 0.129 | 0.218*** | −0.0101 |
| Amortization Period (vs. 15 years) | | | | | | |
| 20 Years | 0.0685 | −2.61E-05 | 0.286 | −0.581** | 0.16 | 0.000193 |
| 30 Years | 0.678*** | 0.0601** | 0.999*** | −0.184 | 0.737*** | 0.0589** |
| 40 Years | 0.456*** | −2.397*** | 0.0337 | 1.011*** | 0.688*** | −2.330*** |

| | Model [1] | Model [2] |
|---|---|---|
| Observations | 3,415,349 | 3,415,349 |
| Pseudo-R2 | 0.726 | 0.706 |

Note: Values presented are the log-odds estimates. Robust standard errors are clustered by State $\times$ Year, but are intentionally omitted for brevity. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. Cohort year, state, seasoning, and GSE fixed effects are included in all specifications, but omitted from the table.

Control variables give estimates as predicted.
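As a reading aid for the log-odds in Table 3, the multinomial logit form of Equation (9) can be sketched in Python with “current” as the normalized baseline. Only the two markup coefficients below come from Table 3, model 1. The intercept-like index values are hypothetical placeholders standing in for all other covariates and fixed effects, so the printed probabilities are illustrative and are not a reproduction of the paper's 5.0% to 8.4% calculation.

```python
import math

# Markup log-odds from Table 3, model 1: [default, prepay].
MARKUP_COEFS = [1.380, 0.0199]

# Hypothetical index values for everything except the markup
# (intercepts, controls, fixed effects). These are assumptions.
HYPOTHETICAL_INDEX = [-1.0, 1.5]


def outcome_probabilities(markup, base_index, markup_coefs):
    """Return [P(current), P(default), P(prepay)] for one loan.

    The baseline outcome ("current") has its index normalized to zero,
    matching the normalization described for Equation (9).
    """
    indices = [0.0] + [b0 + b1 * markup for b0, b1 in zip(base_index, markup_coefs)]
    denom = sum(math.exp(v) for v in indices)
    return [math.exp(v) / denom for v in indices]


low = outcome_probabilities(-0.20, HYPOTHETICAL_INDEX, MARKUP_COEFS)
high = outcome_probabilities(0.20, HYPOTHETICAL_INDEX, MARKUP_COEFS)

# Ratio of default-vs-current odds between the two markups. This ratio
# depends only on the markup coefficient, not on the hypothetical index.
odds_ratio = (high[1] / high[0]) / (low[1] / low[0])

if __name__ == "__main__":
    print("P(default) at -20%:", round(low[1], 4))
    print("P(default) at +20%:", round(high[1], 4))
    print("default/current odds ratio:", round(odds_ratio, 3))
```

The odds ratio equals $\exp(1.380 \times 0.40)$ regardless of the placeholder index, whereas the probability levels depend entirely on the assumed index values.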
The default option control variable (the cumulative level of ZIP-code-level house price appreciation) reduces the probability of default. The mortgage interest rate, LTV, longer loan terms, broker and other third-party originators (TPOs), and DTI are each associated with increased default probabilities. Credit score, the presence of multiple borrowers, status as a first-time home buyer, and intended status as an owner-occupier are all factors that reduce the probability of default.

Model 2 expands the default indicator into three separate outcomes: foreclosure alternatives, repurchases, and foreclosure/REO sales. Estimates for the markup variable and controls are remarkably robust. Due to this robustness, the omnibus default indicator in model 1 is the preferred default metric in subsequent models.

Models 3–6, as shown in Table 3, consider four other outcomes of interest. The first, shown as model 3, is direct defaults. Markups are hypothesized to have a positive effect on the incidence of direct default, conditional on default. Borrowers are continually gathering information and updating their beliefs of property values. This is aided, in part, by companies such as Zillow and Redfin, which produce real-time estimates of house values. As shown in models 1 and 2, a borrower who faces a large, positive markup is more likely to default. This type of borrower is also more likely to be underwater and, upon receiving information on the true price of their previously overpriced unit, may choose to direct default.
Evidence for this hypothesis exists as the parameter estimate is positive and statistically significant.

Table 3. Markups and Loan Outcomes, Part 2

| | [3] | [4] | [5] | [6] | [7] |
|---|---|---|---|---|---|
| Estimator | Logit | Logit | Logit | Logit | Cox |
| Sample | Defaults | All | All | All | 2006–2007 |
| Outcome | Direct Default | Ever Modified | Ever D90 | Ever D180 | Default |
| Markup | 0.163*** | 0.398*** | 0.765*** | 0.892*** | 0.481*** |
| $\%\Delta$ House Price Index | −0.852*** | −1.588*** | −3.754*** | −4.145*** | −3.397*** |
| Mortgage interest rate | −0.236*** | 0.621*** | 0.577*** | 0.551*** | 0.890*** |
| Combined LTV Bucket (vs. 0–60) | | | | | |
| 61–70 | 0.315*** | 0.695*** | 0.497*** | 0.631*** | 1.053*** |
| 71–75 | 0.323*** | 0.920*** | 0.674*** | 0.814*** | 1.451*** |
| 76–80 | 0.316*** | 1.057*** | 0.862*** | 1.032*** | 1.623*** |
| 81–85 | 0.256*** | 1.280*** | 1.031*** | 1.196*** | 1.813*** |
| 86–90 | 0.381*** | 1.425*** | 1.258*** | 1.449*** | 2.016*** |
| 91–95 | 0.394*** | 1.551*** | 1.432*** | 1.630*** | 2.185*** |
| 96+ | 0.339*** | 1.738*** | 1.649*** | 1.830*** | 2.351*** |
| DTI Bucket (vs. 0–33) | | | | | |
| 34–38 | 0.00246 | 0.347*** | 0.196*** | 0.181*** | 0.121*** |
| 39–43 | −0.0619** | 0.523*** | 0.325*** | 0.305*** | 0.226*** |
| 44–50 | −0.0878*** | 0.670*** | 0.427*** | 0.389*** | 0.299*** |
| 51+ | −0.140*** | 0.853*** | 0.571*** | 0.517*** | 0.410*** |
| Credit Score Bucket (vs. 300–579) | | | | | |
| 580–619 | 0.319*** | −0.404*** | −0.509*** | −0.416*** | −0.347*** |
| 620–639 | 0.564*** | −0.682*** | −0.763*** | −0.629*** | −0.512*** |
| 640–659 | 0.748*** | −0.936*** | −1.013*** | −0.837*** | −0.616*** |
| 660–689 | 1.023*** | −1.294*** | −1.377*** | −1.154*** | −0.765*** |
| 690–719 | 1.203*** | −1.625*** | −1.755*** | −1.495*** | −0.975*** |
| 720–769 | 1.398*** | −2.054*** | −2.269*** | −1.987*** | −1.285*** |
| 770+ | 1.512*** | −2.709*** | −2.886*** | −2.598*** | −1.717*** |
| Multiple Borrowers | 0.153*** | −0.238*** | −0.656*** | −0.663*** | −0.499*** |
| First-time homebuyer | −0.0500*** | −0.126*** | −0.0897*** | −0.0949*** | −0.0988*** |
| Acquisition Channel (vs. Retail) | | | | | |
| Broker | −0.0893*** | 0.338*** | 0.243*** | 0.253*** | 0.227*** |
| Correspondent | −0.0349 | 0.292*** | 0.0995*** | 0.0866*** | 0.166*** |
| Not Specified | −0.0457* | 0.175*** | 0.185*** | 0.185*** | 0.0944*** |
| Occupancy Type (vs. Owner) | | | | | |
| Investment Property | 0.209*** | −1.793*** | −0.170*** | −0.101* | 0.0805*** |
| Second Home | 0.0869** | −0.965*** | −0.134*** | −0.102*** | 0.229*** |
| Amortization Period (vs. 15 years) | | | | | |
| 20 Years | −0.262 | 0.113 | 0.0759 | 0.0311 | 0.141 |
| 30 Years | −0.0105 | 0.745*** | 0.526*** | 0.634*** | 0.682*** |
| 40 Years | −0.709*** | 9.266*** | 6.567*** | 4.278*** | 1.444*** |
| Observations | 83,640 | 3,415,349 | 3,415,349 | 3,415,349 | 494,171 |
| Pseudo R2 | 0.0954 | 0.546 | 0.346 | 0.34 | 0.214 |

Note: Values presented are the log-odds estimates for the logit models, and partial effects for the Cox hazard model. Robust standard errors are clustered by State $\times$ Year, but are intentionally omitted for brevity. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. Cohort year, state, seasoning, and GSE fixed effects are included in all specifications, but omitted from the table.

Model 4 estimates the effects of markups at origination on propensity to seek and acquire a loan modification from a lender. Because markups are a default indicator, they are also likely an indicator of modifications for the same reasons. Results show that markups are highly significant predictors of loan modifications. Controls are also of anticipated sign and significance, and match those in models 1 and 2 with some exceptions. In particular, mortgages on investment properties and second homes are substantially less likely to receive loan modifications versus owner-occupied homes. Models 5 and 6 estimate the effects of markups on delinquency using both D90 and D180 definitions. The signs and significance levels for these two models are nearly identical to models 1 and 2.[27]

[27] Robustness tests by origination cohort and considering further nonlinearities in the markup measure are available later in the main body of the paper. Additionally, Appendix Table A.1 considers several other subsamples, including markups that are nearly zero for the previous markup and for different loan amounts.
These models reinforce the robustness of the markup's sign, significance, and magnitude.

Model 7 presents estimates from a simple Cox (1972) proportional hazard model to estimate reduced-form relations between a price markup and a mortgage's survival length.[28]

[28] Prepayment and default hazards, from a borrower's perspective, are competing options: in each period, a borrower must decide to remain current, prepay the balance of the mortgage, or default. In a proportional hazard model, competing options can be treated as censored, facilitating a substantial reduction in the computational burden necessary to estimate unbiased and consistent reduced-form parameters, but removing causal interpretation from the resulting estimates. For a recent example of this approach, see Foote et al. (2010), who model the prepayment and default hazards for prime and subprime loans.

This model is estimated for the crisis sample of originations (the 2006/2007 cohort) because these loans have mostly resolved. Results from the default hazard model are consistent with the other variables and methods considered. Markups are highly predictive of default.

This set of models demonstrates that house price markups provide a simple but robust mortgage stress indicator, conditional on standard controls. Markups at origination are predictive of delinquencies, defaults, loan modifications, and a loan's default path.[29]

[29] It is possible our results, as robust as they are, may be in part driven by our methodological choice of using an HPI to calculate the markup. We have constructed the best markup measure we can using available information, which consists of a long historical transaction record. To our knowledge, this is the only way to generate market-based valuations for a large number of properties before, during, and after the Great Recession.
We leave evaluation of alternative markup calculations to further research.

Markups in Hot and Cold National Housing Markets

Estimates thus far are based on the pooled sample of transactions between 2001 and 2012. Another interest is the extent to which parameters vary over time when estimated on a year-by-year basis. In doing so, a link may be established between the association of markups with defaults and prepayments in hot versus cold national housing markets.

Table 4 shows the results of 12 models, each with the same specification as Table 3, model 1. The markups log-odds parameter is presented and all other covariates are estimated but not reported. Default parameters are between 0.6 and 1.8, with no discernible pattern in terms of parameters over time. This is evidence that the relation between markups and defaults is relatively stable over time, with variation due to chance. It is remarkable that even as absolute default rates rise and fall, the parameter estimate stays reasonably similar.

Table 4. Markups and Loan Outcomes by Origination Cohort
Estimator: Multinomial Logit (vs. Current). Sample: origination year shown in the column header.

| Year | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 |
|---|---|---|---|---|---|---|
| Markup | 1.668*** | 1.793*** | 1.332*** | 1.144*** | 1.283*** | 0.733*** |
| | [0.187] | [0.272] | [0.225] | [0.193] | [0.222] | [0.156] |
| Other Covariates | Yes | Yes | Yes | Yes | Yes | Yes |
| Observations | 366,582 | 338,215 | 352,748 | 248,345 | 273,183 | 243,051 |
| Pseudo-R2 | 0.595 | 0.666 | 0.689 | 0.661 | 0.63 | 0.598 |

| Year | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 |
|---|---|---|---|---|---|---|
| Markup | 0.974*** | 1.381*** | 1.786*** | 1.508*** | 1.499*** | 0.676*** |
| | [0.111] | [0.168] | [0.143] | [0.179] | [0.411] | [0.248] |
| Other Covariates | Yes | Yes | Yes | Yes | Yes | Yes |
| Observations | 251,120 | 276,320 | 279,763 | 247,342 | 232,879 | 305,801 |
| Pseudo-R2 | 0.608 | 0.647 | 0.781 | 0.807 | 0.804 | 0.748 |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans originated in the relevant column year, with a calculated markup, subject to filters noted in the text.
The models include all covariates in Table 3, but all coefficients except the markup coefficient are omitted from the table for brevity. The prepay equation is also omitted for brevity.

Markups and the Cost of Default

The conceptual framework suggests that the cost of default $D$ is an important determinant of both markups and default.[30]

[30] Figure A.1 in the Appendix shows that default costs (as measured by predetermined credit scores) are correlated with markups. As default costs rise (when credit score goes up), markups decrease. This is an interesting finding that confirms the theoretical predictions: lower default costs lead to higher reservation prices and, potentially, higher markups.

Does the relation between markups and default depend on $D$? We bring this question to the data because the theoretical model is unable to provide a direct answer.[31]

[31] Equation (3) shows that the probability of default $\theta =\Pr [P_1\le \gamma \mid b,p_0]=F_1(\gamma )$ is a function of both $p_0(D)$ and $D$, where $\gamma =b+(1+r)(1-\delta )p_0(D)-D$. Simple differentiation shows that $\frac{\partial ^2 \theta }{\partial p_0 \partial D}$ can be positive, negative, or even zero, depending on the nature of $F_1$ and the value of $\gamma$.

As a first step, we need to identify a measure of default costs. But measuring individual default costs is challenging, as they depend on a variety of idiosyncratic factors that are unique to each individual and generally unobserved. Certain variables, however, may be used as imperfect proxies for the cost of default. The first variable we consider is the individual's credit worthiness. Ceteris paribus, higher credit scores should reflect higher default costs and vice versa. Also note that default costs depend on local foreclosure laws.
In judicial foreclosure states (where the lender has to file a lawsuit in court in order to foreclose), default costs should be lower than in nonjudicial foreclosure states. These two variables, credit scores and local foreclosure laws, are used to test whether the relation between markups and default depends on $D$.

Results are shown in Table 5. All models use a specification similar to Table 3, model 1, but in each column the sample is restricted to a subgroup with a plausibly different level of default costs. Two findings deserve discussion. First, the markup coefficient is positive and both economically and statistically significant across specifications. The markup is a powerful predictor of default whether default costs are high or low. Second, the relationship between markups and default is stronger when default costs are higher; borrowers with lower default costs are less sensitive to variation in markups. Further research is needed to explain the mechanisms driving this heterogeneity.

Table 5. Markups and Loan Outcomes by Default Cost

| | Judicial State | Nonjudicial State | Credit Score [300,660) | Credit Score [660,720) | Credit Score [720,850] |
|---|---|---|---|---|---|
| Markup | 1.240*** | 1.505*** | 1.097*** | 1.268*** | 1.548*** |
| | [0.0682] | [0.0618] | [0.0774] | [0.0604] | [0.0651] |
| Other Covariates | Yes | Yes | Yes | Yes | Yes |
| Observations | 1,381,087 | 2,034,262 | 299,156 | 775,690 | 2,340,503 |
| Pseudo-R2 | 0.716 | 0.734 | 0.555 | 0.648 | 0.785 |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans subject to filters noted in the text. Models with "other covariates" include all covariates in Table 3, but all coefficients other than the markup coefficient are omitted from the table for brevity. Judicial vs. nonjudicial foreclosure resolution state classifications are from Fannie Mae, found at https://singlefamily.fanniemae.com/media/6726/display.

Interactions of Markups with LTV Ranges

Because the relation between LTV and defaults is convex, it is logical that the effect of the markup conditional on LTV is also convex. For instance, at low LTVs, default rates are extremely low, and a negative markup is likely to have a de minimis effect on defaults. On the other hand, at high LTV ranges, a positive markup may be particularly dangerous. This is confirmed in reduced form in Figure 5, which presents smoothed estimates of the probability of default as a function of the markup, segmented by LTV bucket. Negative markups exhibit a much smaller marginal effect on defaults than positive markups, and positive markups at higher LTVs have higher marginal effects than those at lower LTVs. Note that LTVs are important determinants of future default, regardless of the markup. Figure 5 shows precisely that there is not a one-for-one trade-off between LTV and markups: a loan with an ostensible LTV of 95% and a negative markup of 50% has a higher predicted default rate than a loan with an LTV of 85% and a positive 50% markup. Additionally, when LTV is modeled continuously alongside the markup, LTV remains the stronger predictor, but markups are also important (see Table A.2).

Figure 5. Distributions of HPI-constructed Markups and Defaults by Origination LTV Bucket.
Notes: The sample includes loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. Lines present the polynomial-smoothed estimated default rate over all mortgages in the sample, estimated with a bandwidth of 0.15, using STATA.

Conditional estimates using a multinomial logit model with a full markup $\times$ LTV grid add further nuance. Table 6 shows estimates from this model.
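As a rough illustration of how a loan could be placed in one cell of such a markup $\times$ LTV grid, the Python sketch below maps an origination LTV and a markup to bucket labels. The bucket edges are read off the row and column labels of Table 6; the function names, the rounding of markups to two decimals, and the treatment of values outside the tabulated ranges are assumptions made for this example and are not taken from the authors' code.

```python
import bisect
from typing import Optional, Tuple

# Inclusive upper edges of the LTV buckets (percent), from the Table 6 columns.
LTV_UPPER = [60, 75, 80, 90, 96]
LTV_LABELS = ["25-60", "61-75", "76-80", "81-90", "91-96", "97+"]

# Inclusive upper edges of the markup buckets, in hundredths, from the Table 6 rows.
MARKUP_UPPER = [-21, -16, -11, -6, -2, 1, 5, 10, 15, 20, 50]
MARKUP_LABELS = [
    "-0.50 to -0.21", "-0.20 to -0.16", "-0.15 to -0.11", "-0.10 to -0.06",
    "-0.05 to -0.02", "about 0 (omitted)", "0.02 to 0.05", "0.06 to 0.10",
    "0.11 to 0.15", "0.16 to 0.20", "0.21 to 0.50",
]


def ltv_bucket(ltv: float) -> Optional[str]:
    """Return the LTV bucket label, or None below the lowest tabulated bucket."""
    if ltv < 25:
        return None
    return LTV_LABELS[bisect.bisect_left(LTV_UPPER, ltv)]


def markup_bucket(markup: float) -> Optional[str]:
    """Return the markup bucket label, or None outside the -0.50 to 0.50 range."""
    hundredths = round(markup * 100)  # assumption: buckets apply to rounded markups
    if hundredths < -50 or hundredths > 50:
        return None
    return MARKUP_LABELS[bisect.bisect_left(MARKUP_UPPER, hundredths)]


def grid_cell(ltv: float, markup: float) -> Optional[Tuple[str, str]]:
    """Return the (LTV bucket, markup bucket) cell, or None if either is out of range."""
    cell = (ltv_bucket(ltv), markup_bucket(markup))
    return None if None in cell else cell


# Hypothetical loan: 95% LTV bought at a 30% markup.
print(grid_cell(95, 0.30))
```

In a regression setting, each distinct cell returned by `grid_cell` would correspond to one indicator variable, with the reference cell left out.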
The omitted LTV bucket is the 97+ range, and the omitted markup bucket is for approximately 0 markup (the +/− 0.01 range). In this table, we can see that markups less than +0.10 for the 25–60 LTV range have no statistically distinguishable effect. These estimates suggest that, at low LTVs, markups are mostly irrelevant to defaults, except when markups are extremely large. For moderate LTV ranges, effects of markups are mostly symmetric in log-odds units, giving a slightly convex marginal effect. For the highest LTV range, 97+ LTVs, there is no statistically distinguishable effect of positive markups at the 5% level. It should be noted, however, that the point estimates are mostly positive and of reasonably similar magnitude to the 91–96 LTV range, indicating similar but much less precise estimates. We attribute this to the relative infrequency of high LTV loans in the database. [Footnote 32: Most 97+ loans are intentionally filtered out. For parts of the sample, they represented questionable loan originations and, for other periods, they were not even permissible with underwriting standards.]

Table 6. Logit Model with Markup-LTV Grid

| LTV Bucket | 25–60 | 61–75 | 76–80 | 81–90 | 91–96 | 97+ |
|---|---|---|---|---|---|---|
| Bucket | −3.035*** | −1.843*** | −1.194*** | −0.603*** | −0.349*** | (omitted) |
| $\times$ Markup Bucket | | | | | | |
| −0.50 to −0.21 | −0.0173 | −0.392** | −0.547*** | −0.422*** | −0.270*** | −0.477*** |
| −0.20 to −0.16 | −0.466* | −0.255* | −0.367*** | −0.225*** | −0.0703 | −0.209 |
| −0.15 to −0.11 | 0.122 | −0.116 | −0.196*** | −0.176*** | −0.0665 | −0.225* |
| −0.10 to −0.06 | 0.113 | −0.0493 | −0.106** | −0.0871 | 0.0278 | −0.267*** |
| −0.05 to −0.02 | 0.079 | 0.0762 | −0.0638 | −0.0554 | 0.014 | −0.0258 |
| 0.02–0.05 | 0.0141 | 0.0408 | 0.0183 | 0.0379 | 0.112** | 0.0314 |
| 0.06–0.10 | 0.271 | 0.247** | 0.117*** | 0.0942* | 0.172*** | 0.00597 |
| 0.11–0.15 | 0.360** | 0.329*** | 0.215*** | 0.207*** | 0.248*** | 0.1 |
| 0.16–0.20 | 0.483*** | 0.385*** | 0.340*** | 0.171** | 0.267*** | 0.199* |
| 0.21–0.50 | 0.550*** | 0.585*** | 0.468*** | 0.439*** | 0.395*** | −0.0529 |
| Other Covariates | Yes | | | | | |
| Observations | 3,415,349 | | | | | |
| Pseudo-R2 | 0.726 | | | | | |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. The models include all covariates in Table 3, but all coefficients other than the markup coefficients are omitted from the table for brevity. The prepay equation is also omitted for brevity.

CREDIT LOSSES CONDITIONAL ON DEFAULT

This section considers the intensive margin of defaults: the credit losses suffered by mortgage holders, conditional on default. As with the extensive margin, it is clear how markups could affect losses conditional on default based on the difference between perceived and actual collateral value. When there is a positive markup at origination, the collateral is less valuable than the LTV at origination indicates, and this may persist to an eventual REO sale, reducing recoveries.

The accounting of credit losses is constructed with the GSE performance data. Credit losses are computed from the net proceeds of the final REO sale: gross sale proceeds and MI payments enter as additions, and the unpaid principal balance (UPB), legal fees, taxes, insurance, homeowners' association fees, and maintenance enter as subtractions. To simplify the accounting and the notation in this section, the net loss ($L$) realized at the time of the REO sale ($t+1$) is expressed in terms of the gross sale proceeds ($P$), MI claims ($I$), UPB ($u$), and an omnibus "other expenses" category ($e$):

$$\begin{equation} L_{t+1}= u_{t+1}+e_{t+1}-P_{t+1}-I_{t+1}. \tag{10} \end{equation}$$

Let us define the loss fraction of the final UPB as $\mathcal {L}_{t+1}=L_{t+1}/u_{t+1}$ and $i_{t+1}$ as the MI coverage ratio at the time of default. Additionally, recall that $LTV$ is the origination (time $t$) LTV ratio, $u_t/P_t$, $\Delta HPI$ is the change in the HPI over the time period, and $m_t$ is the markup at origination.
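To make the accounting in equation (10) and the loss-fraction definition concrete, the short Python sketch below computes both from a set of dollar amounts. The class name, function names, and all dollar figures are hypothetical and chosen only for illustration; they do not come from the GSE performance data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReoOutcome:
    """Dollar amounts at the REO sale date (all values hypothetical)."""
    upb: float            # u_{t+1}: unpaid principal balance
    expenses: float       # e_{t+1}: legal fees, taxes, insurance, maintenance
    sale_proceeds: float  # P_{t+1}: gross REO sale proceeds
    mi_claim: float       # I_{t+1}: mortgage insurance claim paid


def net_loss(o: ReoOutcome) -> float:
    """Equation (10): L = u + e - P - I."""
    return o.upb + o.expenses - o.sale_proceeds - o.mi_claim


def loss_fraction(o: ReoOutcome) -> float:
    """Net loss as a fraction of the final UPB: L / u."""
    if o.upb <= 0:
        raise ValueError("final UPB must be positive")
    return net_loss(o) / o.upb


example = ReoOutcome(upb=200_000, expenses=20_000,
                     sale_proceeds=150_000, mi_claim=30_000)
print(net_loss(example))       # dollars lost by the mortgage holder
print(loss_fraction(example))  # share of the final UPB that is lost
```

Dividing each term of equation (10) by the final UPB in this way is also the starting point for rewriting the loss fraction in terms of ratios.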
After some manipulation, equation (10) becomes the following:

$$\begin{equation} \mathcal {L}_{t+1}=1+\underbrace{\frac{e_{t+1}}{u_{t+1}}}_{\text{Expense Ratio}}-\underbrace{\frac{1+\%\Delta P_{t+1}}{LTV_t+\Delta u_{t+1}/P_t}}_{\text{Current Equity}}-\underbrace{i_{t+1}}_{\text{MI Payment Ratio}}. \tag{11} \end{equation}$$

Within "Current Equity," the term $(1+\%\Delta P_{t+1})$ represents the change in the sale price between the initial sale and the REO sale. This term includes the markup, the change in the average market value, and variation in the REO sale price from the market average. The term $\Delta u_{t+1}/P_t$ represents the contribution of principal payments to equity. Because this contribution is typically positive on a standard amortizing loan, this term usually reduces losses in the case of default. Accordingly, higher LTVs, negative house price appreciation, higher origination markups, and slower amortization increase losses. Expenses and MI payments are expressed as fractions of the final UPB, with expenses adding to credit losses and MI reducing losses.

Empirical Model

A reduced-form expression for the loss fraction $\mathcal {L}_{t+1}$ is linearized and stochastically specified below. The markup is the treatment variable, and the remaining variables are added to guard against omitted variable bias. In this model, as with the default model, only partial correlations are captured, and the parameter estimates should not be interpreted as causal relations between the variables.

In place of the loan payoff fraction, the mortgage interest rate, $r_t$, is included and negatively affects amortization. In place of the house appreciation, which includes the markup and the appreciation of the particular house, three components are distinguished: the change in the local HPI, $\Delta \ln HPI_{t+1}$; the markup; and any other random variation in the REO sale price, which is absorbed within the residual. In specifications where loans have MI, the depth of coverage at the time of REO sale, $i_{t+1}$, is tracked. Additional controls include GSE, state, and origination month fixed effects to account for different Enterprise strategies for REO properties, recourse versus nonrecourse states and other legal issues, as well as macroeconomic factors related to the housing and mortgage finance system. The expenditure share is taken as a constant fraction and is therefore subsumed within the fixed effects. Other factors affecting the amortization period and small factors due to linearization are subsumed within the error term.

$$\begin{equation} \mathcal {L}_{t+1}=\alpha _{it}+\beta _1 m_t+\beta _2 r_t +\beta _3 LTV_t+\beta _4\Delta \ln HPI_{t+1}+\beta _5 i_{t+1}+e_t. \tag{12} \end{equation}$$

A higher interest rate causes the average principal payment each period to decline in the early years of the amortization schedule, so the effect of the interest rate $r_t$ on the loss fraction is predicted to be positive. State-level variation in foreclosure premiums and in the time until foreclosure is captured by the fixed effects. A term for the MI depth of coverage is added in samples where coverage is present. While this is correlated with LTV, it typically increases in a stepwise fashion.
Both conditional and unconditional on LTV, the coefficient on coverage depth is predicted to be negative and between 0 and 1 in absolute value, on the basis that some nonzero fraction of the losses will be made up by the coverage payment, but this payment will not exceed the total losses.

Results

Estimates of this model are shown in Table 7, calculated over samples of purchase loans originated in the 2006–2007 period that eventually defaulted. Model 1 covers the sample of loans without MI, and model 2 considers loans with MI in effect. The sample is split into two subsamples based on the incidence of credit enhancements because it is possible that MI protects the creditor from losses on the loan associated with markups.

Table 7. Losses, Markups, and Mortgage Insurance
Dependent variable: Loss fraction of final UPB

| Sample: | No MI coverage | MI coverage |
|---|---|---|
| Model: | [1] | [2] |
| Markup | 0.196*** | 0.153*** |
| | [0.0123] | [0.0126] |
| LTV | 0.556*** | −0.215*** |
| | [0.0374] | [0.0832] |
| $\%\Delta$ House Price Index | −0.429*** | −0.419*** |
| | [0.0191] | [0.0226] |
| Mortgage Interest Rate | 0.105*** | 0.0728*** |
| | [0.00628] | [0.00547] |
| Mortgage Insurance Coverage Depth | | −0.443*** |
| | | [0.0551] |
| Constant | −0.737*** | 0.0894 |
| | [0.0444] | [0.0666] |
| Observations | 29,390 | 32,029 |
| R2 | 0.294 | 0.244 |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans originated in 2006 or 2007, with a calculated markup, subject to filters noted in the text. The loss fraction is calculated as the proceeds net of expenses from property sale divided by the final unpaid principal balance on the mortgage. The MI coverage depth is defined as the percent of the UPB covered by mortgage insurance. GSE fixed effects are included in all specifications but omitted from the table.

In model 1, the markup coefficient is about 0.2, indicating that about one-fifth of the markup at origination is predictive of credit losses for loans without MI. For instance, consider a 10% markup on a $500,000 house ($50,000).
This coefficient suggests that credit losses for the holder of the mortgage are about $10,000 higher than for an equivalent loan with no markup. Other control coefficients are consistent with comparative statics from equation (11), with LTV and the interest rate contributing positively to losses and house price appreciation contributing negatively.

Model 2 presents a similar story. The markup coefficient is a bit smaller at about 0.15, indicating that in loans with MI, the markup is still predictive of losses, but with a weaker relationship. This suggests that MI may alleviate some of the risk posed by markups to creditors. The role of MI is highlighted by the LTV parameter, which is now negative, implying that higher LTVs are associated with more-than-compensating MI depth and therefore lower overall credit risk.

When considering both models together, the markup coefficient is remarkably robust across different samples and control variables, indicating that the markup calculated using an HPI is strongly associated with credit losses conditional on default. [Footnote 33: While presented within the context of mortgages, this approach could be extended similarly to other financial products where credit risk involves not only the loss of an asset but also a cash flow or an associated value that may not be homogeneous for all assets within a class.] MI may protect against credit losses associated with markups, but the estimated effect is small.

MECHANISMS

Why can markups predict default even after controlling for LTV? To answer this question, it is useful to note that the sale price ($P_0$) and the appraised value ($P_{app}$) mechanically determine a loan's $LTV$ at origination:

$$\begin{equation*} LTV=100 \times \frac{L}{\min \lbrace P_{app},P_0\rbrace }. \end{equation*}$$

If the "true" average value of a home were correctly captured by $P_0$ and/or $P_{app}$, the correlation between positive markups and future defaults should be partially (or completely) captured by the coefficient(s) on $LTV$ in a default model. However, both the transaction price and the appraisal may be inadequate measures of the housing unit's valuation for the following reasons. First, as discussed in the conceptual framework, $P_0$ captures conditions that are idiosyncratic to each transaction and that may not reflect average market conditions. Obviously, a lender cares about the average and not the idiosyncratic component of the collateral's value. Second, previous studies have shown that the appraisal $P_{app}$ may be subject to appraisal bias: appraisers typically "ratify" selling prices that are above the index-based price of the house, and appraised values are often equal to (and rarely below) sales prices (see Nakamura 2010, Agarwal, Ben-David, and Yao 2013, Calem, Lambie-Hanson, and Nakamura 2015, Ding and Nakamura 2016, Shui and Murthy 2018, Calem et al. 2021). These factors suggest a mechanism for the predictability of markups: a failure of the average appraisal to accurately estimate the value of the mortgage collateral at origination.

While other mechanisms cannot be ruled out, two simple exercises provide substantial evidence reinforcing this claim. First, a new version of the markup is calculated using appraisals; this variable is highly right-censored at zero and does not appear to be associated with defaults.

Second, competing-options default models are run with markups calculated using the standard HPI approach and then with markups calculated using the appraisal; the appraisal markup contributes almost no information, and the HPI markup maintains its magnitude, sign, and significance.
These findings suggest that appraisals are missing important variation in the fundamental value of a home, based on their failure to predict defaults, with potentially serious consequences.

Appraisal Bias

Appraisal markups are calculated as the difference between the sale price $P_0$ and the appraisal $P_{app}$. Because mortgage appraisals are used to estimate the collateral on a loan, an appraisal that is substantially lower than a transaction price will often cause a loan application to be rejected. If loan rejection were the only cause of a lack of mass in the right half of the distribution, it could be said that appraisers are fulfilling one of their primary objectives: to prevent a buyer from overpaying on a home, which, from the lender's perspective, would reflect lower relative collateral. However, excess mass exists at zero in the second panel of Figure 4, suggesting censoring rather than truncation in the distribution and reflecting substantial appraisal bias. For purchase mortgages, 37% of all appraisals are exactly equal to the transaction price and 49% are within 0.5% of the transaction price. Previous research has shown that an appraisal nearly equal to the contract price is an indicator of heightened risk (see Nakamura 2010, Agarwal, Ben-David, and Yao 2013, Calem, Lambie-Hanson, and Nakamura 2015, Ding and Nakamura 2016, Calem et al. 2021).

Appraisal Markups and Default

Across the distribution of appraisal-based markups, it appears as though appraisal markups have a slight negative correlation with defaults on a reduced-form basis. It is possible, however, that appraisal markups are predictive conditional on other observables. This may occur due to the mechanical interaction of the appraisal with the LTV: a negative appraisal markup results in a lower LTV, all else equal. On the other hand, as with HPI markups, endogenous sorting into different markup values may occur and drive partial correlations within the data.

As before, the relation between appraisal markups and default is analyzed using a competing-options default model. Estimates are reported for three models in Table 8, each with the same baseline specification and sample as model 1 in Table 3. These models estimate default and prepayment log-odds parameters as a function of the markup and covariates. Model 1 uses HPI markups and is identical to the earlier benchmark model. Model 2 estimates the bivariate relationship between appraisal markups and default, excluding all other covariates. Because of the heavy censoring at zero, and following the literature, a dummy variable is added for when the appraisal is within 0.5% of the contract price to indicate a potentially censored appraisal. The negative coefficient on the markup is consistent with the pattern shown in the bottom panel of Figure 4. The coefficient on the censoring dummy is positive, indicating heightened risk, consistent with Calem et al. (2021). Model 3 adds the same list of covariates as in the benchmark model. The coefficient on the appraisal markup switches sign and becomes similar to the HPI markup. This indicates substantial correlation between markups and other covariates, and omitted variable bias in model 2.

Table 8. Defaults and Markups Calculated Using HPIs versus Appraisals

| Markup variable(s) | HPI | Appraisal | Appraisal | Both |
|---|---|---|---|---|
| Model | [1] | [2] | [3] | [4] |
| Markup (HPI) | 1.380*** | | | 1.309*** |
| | [0.0470] | | | [0.0482] |
| Markup (Appraisal) | | −0.734* | 1.518*** | 0.432*** |
| | | [0.425] | [0.156] | [0.148] |
| Appraisal = Sale Price | | 0.233*** | 0.124*** | 0.112*** |
| | | [0.0455] | [0.0139] | [0.0132] |
| Other Covariates | Yes | No | Yes | Yes |
| Observations | 3,415,349 | 3,415,349 | 3,415,349 | 3,415,349 |
| Pseudo-R2 | 0.726 | 0.00252 | 0.725 | 0.726 |
| Log-Likelihood | −531,837 | −1,934,434 | −533,143 | −531,683 |

Note: Values presented are the marginal log-odds estimates. Robust standard errors in brackets, clustered by State $\times$ Year.
*** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. The models include all covariates in Table 3, but all coefficients other than the markup coefficients are omitted from the table for brevity. The prepay equation is also omitted. Model [1] is the same as in Table 3, model [1].

Finally, model 4 includes both the appraisal and HPI markups to assess the relative information contained in each of the markup variables. There are three factors to consider relative to models 1–3: the coefficients, the log-likelihoods, and the variation in the markups themselves. The coefficient on the appraisal markup falls by about 75% relative to model 3, from 1.5 to 0.4, while the parameter on the HPI markup is mostly maintained relative to model 1. This indicates that the HPI markup nearly encompasses the appraisal markup. The overall fit of the model is marginally lower using appraisal markups alone in model 3, with a pseudo-$R^2$ of 0.725 and a log-likelihood of −533,143 versus 0.726 and −531,873 for HPI markups in model 1. In the fourth model, when both appraisal and HPI markups are included, the log-likelihood is almost identical to model 1 at −531,683, implying little explanatory gain from adding the appraisal markup variables. Finally, while the parameter on the appraisal markup is about 1/4 the size of the parameter on the HPI markup, the standard deviation of the appraisal markup is also much smaller, at 4.4% versus the HPI markup's 16.6%. This suggests that the continuous portion of the appraisal markup explains about 1/16 of the default variation explained by the HPI markup.

Overall, these combined findings confirm that appraisals contain limited information beyond the observed LTV, and that collateral-related information is still predictive of defaults. [Footnote 34: Appraisal bias is not necessarily inconsistent with a well-functioning mortgage market. Valuing homes at purchase prices might be consistent with a market that is willing to accept a small number of additional foreclosures in return for a much higher number of home sales. We thank an anonymous referee for making this point.]

CONCLUDING DISCUSSION

House price markups, calculated as the difference between a transaction price and a predicted price, are associated with mortgage delinquencies, defaults, prepayments, losses conditional on default, and loan modifications. Moreover, these associations are economically relevant: the difference between a −20% and a +20% markup corresponds to a 50% increase in the default rate of a mortgage, holding all other characteristics of the loan and borrower constant. Because appraisals are biased toward the contract price, and the LTVs calculated using these appraisals give measures of expected defaults, appraisal bias may be an important source of credit risk mismeasurement. The analysis has the dual strengths of being based on a set of modeling approaches rooted firmly in the literature and being estimated with a near-universe of house sales from a large pool of mortgages.

Markups are related to loan performance for fundamental reasons related to the microstructure of the housing and mortgage markets. As shown in the conceptual framework, markups do not cause default. Rather, the price paid for an asset is chosen simultaneously with the expected probability of default because housing is highly leveraged relative to expected house price appreciation, and the house itself serves as the collateral on the loan.

Although the estimates are specific to the market for residential real estate, a similar mechanism emerges in many settings: any market with both large spreads and high leverage, where the collateral on the loan is the purchased asset, likely exhibits a similar type of relation between markups and loan performance.
Of particular similarity is the market for auto loan debt, but this relation could also extend to insurance policies or even alternative investments like collectibles. These interesting applications are left as topics for further research.

APPENDIX

Table A.1. Markups and Loan Outcomes, Sample Robustness

| Sample | $|M(t-1)| < 5\%$ | $Loan < \$100k$ | $Loan > \$300k$ |
|---|---|---|---|
| Model | [A1] | [A2] | [A3] |
| Markup | 1.997*** | 1.310*** | 1.431*** |
| | [0.125] | [0.106] | [0.0623] |
| Observations | 428,609 | 444,921 | 601,853 |
| Pseudo-R2 | 0.738 | 0.756 | 0.706 |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans subject to filters noted in the text. The models include all covariates in Table 3 but each except the markup coefficient is omitted from the table text for brevity. Model A1 consists of a sample of loans where the prior markup is nearly zero. Models A2 and A3 are for different loan amount subsamples.

Table A.2. Markups and Loan Outcomes, Other Robustness

| | Linear CLTV | No Credit Score |
|---|---|---|
| Markup | 1.385*** | 1.398*** |
| | [0.0471] | [0.0478] |
| CLTV (linear) | 5.81*** | |
| | [0.161] | |
| Other Covariates | Yes | Yes |
| Observations | 3,415,349 | 3,415,349 |
| Pseudo-R2 | 0.726 | 0.719 |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans subject to filters noted in the text. Models with "other covariates" include all covariates in Table 3 but each except the markup coefficient is omitted from the table text for brevity.

Figure A.1. Distributions of HPI-constructed Markups and Default Costs Measured Using Credit Scores.
Notes: The sample includes loans originated in 2001 through 2012, with a calculated markup, subject to filters noted in the text. Credit scores are measured at origination.
The line presents the polynomial-smoothed estimated markup over all mortgages in the sample, estimated with a bandwidth of 200, using STATA.

LITERATURE CITED

Adelino, Manuel, Kristopher Gerardi, and Barney Hartman-Glaser. (2019) “Are Lemons Sold First? Dynamic Signaling in the Mortgage Market.” Journal of Financial Economics, 132, 1–25.

Agarwal, Sumit, Itzhak Ben-David, and Vincent Yao. (2013) “Collateral Valuation and Borrower Financial Constraints: Evidence from the Residential Real Estate Market.” Working Paper 19606, National Bureau of Economic Research.

Agarwal, Sumit, Itzhak Ben-David, and Vincent Yao. (2017) “Systematic Mistakes in the Mortgage Market and Lack of Financial Sophistication.” Journal of Financial Economics, 123, 42–58.

Anenberg, Elliot. (2016) “Information Frictions and Housing Market Dynamics.” International Economic Review, 57, 1449–79.

Bailey, Michael, Eduardo Davila, Theresa Kuchler, and Johannes Stroebel. (2017) “House Price Beliefs and Mortgage Leverage Choice.” National Bureau of Economic Research Working Paper Series.

Ben-David, Itzhak. (2011) “Financial Constraints and Inflated Home Prices during the Real Estate Boom.” American Economic Journal: Applied Economics, 3, 55–87.

Bogin, Alexander N., William M. Doerner, and William D. Larson. (2019a) “Local House Price Dynamics: New Indices and Stylized Facts.” Real Estate Economics, 47, 365–98.

Bogin, Alexander N., William M. Doerner, and William D. Larson. (2019b) “Missing the Mark: Mortgage Valuation Accuracy and Credit Modeling.” Financial Analysts Journal, 75, 32–47.

Calem, Paul, Jeanna Kenney, Lauren Lambie-Hanson, and Leonard Nakamura. (2021) “Appraising Home Purchase Appraisals.” Real Estate Economics, 49, 134–68.

Calem, Paul S., Lauren Lambie-Hanson, and Leonard I. Nakamura. (2015) “Information Losses in Home Purchase Appraisals.” Working Paper 15-11, Federal Reserve Bank of Philadelphia.

Campbell, John Y., Stefano Giglio, and Parag Pathak. (2011) “Forced Sales and House Prices.” American Economic Review, 101, 2108–31.

Campbell, Tim S., and J. Kimball Dietrich. (1983) “The Determinants of Default on Insured Conventional Residential Mortgage Loans.” Journal of Finance, 38, 1569–81.

Carrillo, Paul. (2013) “Testing for Fraud in the Residential Mortgage Market: How Much Did Early-Payment-Defaults Overpay for Housing?” Journal of Real Estate Finance and Economics, 47, 36–64.

Carrillo, Paul E. (2012) “An Empirical Stationary Equilibrium Search Model of the Housing Market.” International Economic Review, 53, 203–34.

Case, Karl E., and Robert J. Shiller. (1989) “The Efficiency of the Market for Single-Family Homes.” American Economic Review, 79, 125–37.

Chinco, Alex, and Christopher Mayer. (2016) “Misinformed Speculators and Mispricing in the Housing Market.” Review of Financial Studies, 29, 486–522.

Cox, David R. (1972) “Regression Models and Life-Tables.” Journal of the Royal Statistical Society. Series B (Methodological), 34, 187–220.

Davis, Donald R., and David E. Weinstein. (2002) “Bones, Bombs, and Break Points: The Geography of Economic Activity.” American Economic Review, 92, 1269–89.

Davis, Morris A., William D. Larson, Stephen D. Oliner, and Benjamin R. Smith. (2019) “A Quarter Century of Mortgage Risk.” Technical Report, Federal Housing Finance Agency.

Deng, Yongheng, John M. Quigley, and Robert Van Order. (2000) “Mortgage Terminations, Heterogeneity and the Exercise of Mortgage Options.” Econometrica, 68, 275–307.

Ding, Lei, and Leonard Nakamura. (2016) “The Impact of the Home Valuation Code of Conduct on Appraisal and Mortgage Outcomes.” Real Estate Economics, 44, 658–90.

Foote, Christopher, Kristopher Gerardi, Lorenz Goette, and Paul Willen. (2010) “Reducing Foreclosures: No Easy Answers.” NBER Macroeconomics Annual, 24, 89–138.

Foote, Christopher L., Kristopher Gerardi, and Paul S. Willen. (2008) “Negative Equity and Foreclosure: Theory and Evidence.” Journal of Urban Economics, 64, 234–45.

Genesove, David, and Christopher Mayer. (2001) “Loss Aversion and Seller Behavior: Evidence from the Housing Market.” Quarterly Journal of Economics, 116, 1233–60.

Ghent, Andra C., and Marianna Kudlyak. (2011) “Recourse and Residential Mortgage Default: Evidence from US States.” Review of Financial Studies, 24, 3139–86.

Glaeser, Edward L., and Charles G. Nathanson. (2017) “An Extrapolative Model of House Price Dynamics.” Journal of Financial Economics, 126, 147–70.

Goodman, Allen C., and Thomas G. Thibodeau. (2003) “Housing Market Segmentation and Hedonic Prediction Accuracy.” Journal of Housing Economics, 12, 181–201.

Han, Lu, and William C. Strange. (2015) “Chapter 13 - The Microstructure of Housing Markets: Search, Bargaining, and Brokerage.” In Handbook of Regional and Urban Economics, edited by J. Vernon Henderson, Gilles Duranton, and William C. Strange, vol. 5, pp. 813–86. Amsterdam: Elsevier.

Harding, John P., Stuart S. Rosenthal, and Clemon F. Sirmans. (2003) “Estimating Bargaining Power in the Market for Existing Homes.” Review of Economics and Statistics, 85, 178–88.

Hatchondo, Juan Carlos, Leonardo Martinez, and Juan M. Sanchez. (2015) “Mortgage Defaults.” Journal of Monetary Economics, 76, 173–90.

Jordà, Óscar, Moritz Schularick, and Alan M. Taylor. (2015) “Leveraged Bubbles.” Journal of Monetary Economics, 76, S1–S20.

Mayer, Christopher, Karen Pence, and Shane M. Sherlund. (2009) “The Rise in Mortgage Defaults.” Journal of Economic Perspectives, 23, 27–50.

Merlo, Antonio, Francois Ortalo-Magne, and John Rust. (2015) “The Home Selling Problem: Theory and Evidence.” International Economic Review, 56, 457–84.

Molloy, Raven, and Eric Nielsen. (2018) “How Can We Measure the Value of a Home? Comparing Model-Based Estimates with Owner-Occupant Estimates.” FEDS Notes, Board of Governors of the Federal Reserve System.

Mortensen, Dale T., and Christopher A. Pissarides. (1999) “Chapter 39 New Developments in Models of Search in the Labor Market.” In Handbook of Labor Economics, Vol. 3, pp. 2567–27. Amsterdam: Elsevier.

Nakamura, Leonard I. (2010) “How Much Is That Home Really Worth? Appraisal Bias and House-Price Uncertainty.” Business Review, Q1, 11–22.

Piskorski, Tomasz, Amit Seru, and Vikrant Vig. (2010) “Securitization and Distressed Loan Renegotiation: Evidence from the Subprime Mortgage Crisis.” Journal of Financial Economics, 97, 369–97.

Shui, Jessica, and Shriya Murthy. (2018) “Are Appraisal Management Companies Value-Adding? Stylized Facts from AMC and Non-AMC Appraisals.” Working Paper 18-01, Federal Housing Finance Agency.

Journal of Money, Credit and Banking, Wiley

House Price Markups and Mortgage Defaults

References (42)

Publisher
Wiley
Copyright
© 2023 The Ohio State University.
ISSN
0022-2879
eISSN
1538-4616
DOI
10.1111/jmcb.12940

Abstract

The housing market is characterizedby major information asymmetries, heterogeneous preferences, and high transaction costs. Accordingly, the prices at which buyers and sellers are willing to trade identical units may vary widely relative to an average market price. Idiosyncratic conditions of each sale lead to either a positive or a negative price markup, measured as the difference between the actual transaction price and the average market price.1Case and Shiller (1989) estimate a property‐specific appreciation standard deviation of 15% around the market average using repeat‐sales regressions, suggesting large price variation for transactions of identical homes. These regularities are explained and predicted by a long and growing literature that employs search‐and‐matching models to study the microstructure of housing transactions. Han and Strange (2015) present a comprehensive survey of this topic; examples of recent studies in the housing literature include Carrillo (2012), Merlo, Ortalo‐Magne, and Rust (2015), and Adelino, Gerardi, and Hartman‐Glaser (2019). For an excellent survey of search and matching models in labor markets, see Mortensen and Pissarides (1999). Nominal loss aversion and house price expectations may also play roles in markups; see Genesove and Mayer (2001) and Glaeser and Nathanson (2017), respectively.Housing is also highly leveraged, making individual mortgage performance sensitive to variation in the value of the collateral on the loan and the market susceptible to leveraged bubbles (Jordà, Schularick, and Taylor 2015). 
These attributes make housing the ideal market to explore issues related to markups, leverage, and loan performance.

In this paper, building on the pioneering work of Harding, Rosenthal, and Sirmans (2003), Ben‐David (2011), and others, we introduce the general concept of markups as a major source of unaccounted‐for risk in loan performance using information on over 100 million home transactions and 40 million mortgages in the United States between 1975 and 2018. The findings have important consequences for modeling mortgage defaults, credit losses conditional on default, and cash flows and pricing of mortgage‐backed securities.

A simple conceptual framework can show that a transaction price and its deviation from the expected price mechanically determine future equity value and are thus key predictors of mortgage performance. Loans with higher (lower) price markups at origination should be, all other things equal, more (less) risky.

We measure markups using a comprehensive data set of housing transactions in the United States. Combining house price data from mortgages purchased or securitized by Fannie Mae, Freddie Mac, and the Federal Housing Administration with property values from sales and transfer documents that are filed at county recorder offices, we amass more than 100 million transactions involving single‐family housing units since the mid‐1970s.[2]

[2] The county recorder data have been licensed from CoreLogic.

Our measure exploits repeat sales to predict the transaction price of a unit in a manner similar to Anenberg (2016): take an initial purchase price and project it forward using a house price index (HPI). The markup is then computed as the percentage difference between the observed and predicted purchase price of the next transaction; refinances are discarded from the data set. Intuitively, the markup resembles (but is not identical to) a residual from a repeat‐sales regression.
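The projection step just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the index values, and the prices are hypothetical, and the HPI is assumed to be a simple level index for the unit's area.

```python
import math


def predicted_price(prev_price: float, hpi_prev: float, hpi_next: float) -> float:
    """Project the anchor sale price forward by the change in the local HPI."""
    return prev_price * (hpi_next / hpi_prev)


def markup_pct(actual_price: float, predicted: float) -> float:
    """Markup as the percentage difference between actual and predicted price."""
    return (actual_price - predicted) / predicted


def markup_log(actual_price: float, predicted: float) -> float:
    """Markup as a log difference, close to the percentage form for small values."""
    return math.log(actual_price) - math.log(predicted)


# Hypothetical repeat sale: bought at 200,000 when the index stood at 100,
# resold at 262,500 when the index stood at 125.
predicted = predicted_price(200_000.0, hpi_prev=100.0, hpi_next=125.0)
pct = markup_pct(262_500.0, predicted)
log_m = markup_log(262_500.0, predicted)
print(f"predicted={predicted:,.0f}  pct markup={pct:.3f}  log markup={log_m:.3f}")
```

In this hypothetical case the index implies a price of 250,000, so the buyer paid a markup of about 5%.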
The richness of our data allows us to estimate markups for any house that sells more than once in the sample. Empirical tests suggest that markups calculated in this manner are partially mean reverting, which indicates that markups do not solely reflect unobserved changes in economic fundamentals of a housing unit's quality or location; rather, they contain a transitory, transaction‐specific component.

We merge this markups database onto two public sources of loan performance information that combine to track nearly 40 million loans from origination to termination. The loan performance sample begins in the early 2000s, which allows us to capture rising and falling markets and, most importantly, provide takeaways about both the Great Recession and the recovery environment. The mortgage origination data provide important details on the loan's terms (e.g., amount, length, and interest rate) and underwriting characteristics (e.g., acquisition channel for securitization). The mortgage performance data track outstanding loan balances, loan maturity, delinquency status, and details about termination. The combined data are used to estimate traditional mortgage performance models for a sample of purchase‐money mortgages. Results from a wide set of specifications and samples provide convincing evidence that markups are associated with future mortgage default. In the baseline specification, posterior default rates are about 50% higher for mortgages with a +20% markup than for those with a −20% markup,[3] conditional on identical covariates including loan‐to‐value ratios (LTV), between 2001 and 2012.[4]

[3] This range of markups is consistent with earlier work by Molloy and Nielsen (2018) and contains 80% of the markups in our main sample.

[4] Throughout this paper, we refer to the “combined” LTV ratio as the LTV. This includes the balances of all loans using the house as collateral at the time of origination. This does not include subsequent second liens.
We filter out all loans with either an individual or combined LTV < 25 to focus on first liens.

These findings are robust across years and other sample cuts, and to accounting for nonlinearities and interaction terms in a series of models.

These price and mortgage data also allow us to estimate the relation between markups and credit losses when a loan defaults. Conditional on default, markups are associated with greater credit losses for the holder of the mortgage. Overall, a +20% markup is associated with an additional 3.5% in eventual credit losses relative to a loan with no markup. Importantly, unlike variation in LTV, which has mechanical interactions with mortgage insurance (MI) coverage and depth and is thus insured, variation in the markup translates to credit risk regardless of MI incidence, though the existence of MI does mitigate some of the losses due to markups.[5]

[5] Lenders often require credit enhancements that require MI depending on the LTV of the loan. MI is typically required for loans with LTVs greater than 80. MI is then in force until enough collateral is accumulated through principal payments to achieve a 78 LTV based on the valuation at origination, or if the mark‐to‐market LTV is below 80.

Why can markups predict credit risk even after LTV is taken into account? It seems that neither the sale price nor the appraisal captures the “true” average value of a home. The sale price captures conditions that are idiosyncratic to each transaction and that may not reflect average market conditions. Average rather than idiosyncratic conditions should be stronger driving factors for determining collateral value. Moreover, manually performed (human) appraisals may be subject to bias as reported by a growing set of studies in this area (see Nakamura 2010, Agarwal, Ben‐David, and Yao 2013, Calem, Lambie‐Hanson, and Nakamura 2015, Ding and Nakamura 2016, Calem et al. 2021).
To confirm the existence of appraisal bias in the sample, the following is shown: (i) the distribution of appraisal‐based markups has excess mass at zero (in nearly 50% of cases, the sale price is within 0.5% of the appraisal); and (ii) this notch point is predictive of default in a manner consistent with right‐censoring of appraisal valuations.[6]

[6] We compute appraisal‐based markups using the difference between the purchase price and the appraised price.

The previous evidence and discussion allow us to conclude that collateral coverage may be substantially mismeasured, and this is likely the reason why HPI‐generated markups are correlated with default risk even when conditioning on LTV.

This paper contributes to a growing literature concerned with how the mispricing of assets affects immediate valuations and subsequent performance outcomes. A large number of studies have documented that collateral value relative to the outstanding balance is an important determinant of payment default, highlighting the role of accurately assessing current equity when estimating runoffs and default risks (Campbell and Dietrich 1983, Deng, Quigley, and Van Order 2000, Foote, Gerardi, and Willen 2008, Mayer, Pence, and Sherlund 2009, Piskorski, Seru, and Vig 2010, among many others). This study makes an important point: due to factors idiosyncratic to each trade, transaction values should not be expected to always reflect average market conditions.
Units that have been sold or refinanced at an above (below) average price have, other things equal, less (more) equity and are exposed to more (less) mortgage default risk.

Importantly, these dynamics are not confined to narrow contexts such as hot or cold housing markets or fraud.[7]

[7] Ben‐David (2011) and Carrillo (2013) show that house prices that are artificially high at origination due to fraudulent practices can be associated with future default rates.

Rather, markups exist for every transaction, pointing to an additional source of asset default risk that may not be apparent with certain underwriting valuation (i.e., appraisal) techniques. Accordingly, this research has fundamental implications for the valuation of pools of loans and of the asset‐backed securities based on them.

More broadly, these findings could be applicable to other assets that are characterized by both large spreads and highly leveraged transactions where the asset purchased serves as the collateral on the loan. These include the markets for auto loans, collectibles such as art, wine, or trading cards, heavy machinery, and insurance policies. When these assets are highly leveraged and markups are large and positive, incentives to default are higher than for loans with small and/or negative markups, holding leverage constant.

CONCEPTUAL FRAMEWORK

A simple conceptual framework can illustrate why price markups exist in the housing market, and the relation between markups and future default rates. Consider a stylized one‐sided, partial equilibrium, two‐period model where potential home buyers search for a house in period $t=0$. To keep things simple and tractable, assume that each buyer is able to find exactly one seller. Homes are identical in their observed characteristics but a buyer obtains a random utility component $B$ that is idiosyncratic to each home‐buyer combination.
From a buyer's point of view, the seller's reservation value and potential trading price $P_0$ is random as it depends on the seller's characteristics and valuation of the house. Assume that the house price is an independently and identically distributed realization drawn from a predetermined continuous distribution $F_0$; that is, $P_0 \sim F_0$. A buyer has access to credit and can purchase the house after paying a downpayment of $\delta P_0$, where $1-\delta \in [0,1]$ measures the LTV ratio. In period $t=1$, a home buyer can become a seller and put a house on the market. A random price offer $P_1 \sim F_1$ is expected. Let $F_1$ be exogenously determined. If selling the house is not profitable, a seller has the option to default on the loan. In what follows, the model is formally set up while denoting random variables and their realizations with upper‐ and lower‐case letters, respectively.

After a buyer has inspected a house and negotiated with the seller, a realization of $B$ and $P_0$ is revealed. From the buyer's perspective, the value of having an opportunity to buy such a house, $V^b_0$, in period $t=0$ is equal to

$$\begin{equation} V^b_0(p_0)= \max {\left\lbrace b-\delta p_0+ E {\left[ V^s_1(P_1,p_0) \right]},0 \right\rbrace} , \tag{1} \end{equation}$$

where the match value $b$ denotes the increased utility from housing consumption relative to the outside option (renting), and $E [ V^s_1(P_1,p_0) ]$ is the expected value of having an opportunity to sell this house next period. In period $t=1$, a home owner has the option to sell the house and repay the loan or to default:

$$\begin{equation} V_1^s(P_1,p_0)=\max \lbrace P_1-b-(1+r)(1-\delta )p_0,-D \rbrace , \tag{2} \end{equation}$$

where $r$ is the loan interest rate and $D$ captures both pecuniary and nonpecuniary default costs.
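Equation (2) can be read as a simple decision rule. The sketch below is an illustration with hypothetical parameter values (none are taken from the paper): it evaluates the seller's payoff from selling, compares it with the default payoff $-D$, and reports the price offer at which the two are equal.

```python
def sale_payoff(p1: float, p0: float, b: float, r: float, delta: float) -> float:
    """Payoff from selling at p1: price, less forgone match value b, less loan repayment."""
    return p1 - b - (1.0 + r) * (1.0 - delta) * p0


def seller_value(p1: float, p0: float, b: float, r: float, delta: float,
                 default_cost: float) -> float:
    """Equation (2): the better of selling and defaulting (payoff -D)."""
    return max(sale_payoff(p1, p0, b, r, delta), -default_cost)


def indifference_price(p0: float, b: float, r: float, delta: float,
                       default_cost: float) -> float:
    """Offer p1 at which selling and defaulting give the same payoff."""
    return b + (1.0 + r) * (1.0 - delta) * p0 - default_cost


# Hypothetical values: 20% down payment, 5% loan rate, match value 10, default cost 20.
params = dict(b=10.0, r=0.05, delta=0.20, default_cost=20.0)
for p0 in (100.0, 120.0):
    cutoff = indifference_price(p0, **params)
    print(f"purchase price {p0:.0f}: default is chosen for offers below {cutoff:.1f}")
```

With these numbers, raising the purchase price from 100 to 120 raises the cutoff from 74 to 90.8, so a wider range of future offers leads to default.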
A simple examination of equation (2) allows us to conclude that, conditional on $b$ and $p_0$, a seller would default whenever the realization of $P_1$ satisfies $P_1 \le \gamma$, where $\gamma =b+(1+r)(1-\delta )p_0-D$. Note that the probability of default (given realizations $b$ and $p_0$) is equal to

$$\begin{equation} \Pr [P_1\le \gamma \mid b,p_0]=F_1(b+(1+r)(1-\delta )p_0-D). \tag{3} \end{equation}$$

Let

$$\begin{equation} P_R^S(b,p_0)=\gamma \tag{4} \end{equation}$$

be the seller's reservation price: the minimum price at which the seller is willing to sell the house. Now compute

$$\begin{equation} E {\left[ V^s_1(P_1,p_0)\right]}= \int _{\gamma } ^\infty (P_1-b-(1+r)(1-\delta )p_0) dF_1(P_1)- F_1(\gamma )D. \tag{5} \end{equation}$$

After simple differentiation, one can show that

$$\begin{equation*} \frac{ \partial E {\left[ V^s_1(P_1,p_0)\right]}}{\partial p_0} \le 0, \end{equation*}$$

and

$$\begin{equation*} \frac{ \partial V^b_0(p_0)}{\partial p_0} \le 0. \end{equation*}$$

Hence, in period 0, it is also optimal for a home buyer to follow a reservation strategy such that a house is purchased if the realization of $P_0$ is smaller than a reservation price $P_0^R$. After inspecting equation (1), it is clear that $P_0^R$ is implicitly defined by

$$\begin{equation} \int {\left[ V^s_1{\left(P_1,P_0^R\right)}\right]}dF_1(P_1)=\delta P_0^R-b. \tag{6} \end{equation}$$

As shown in Figure 1, the solution to this equation exists and is unique. This simple model allows us to make three important conjectures that will later guide the empirical analysis.

Figure 1. Home Buyer's Optimal Reservation Price.
Notes: This figure illustrates the solution to the home buyer's reservation problem defined in equation (6). The buyer would buy a house if, and only if, the transaction price is below $P_0^R$.
The left‐hand side of equation (6) is nonincreasing, convex, and converges to $-D$ as $p_0$ approaches infinity. The right‐hand side is an increasing linear function of $p_0$. Hence, a unique solution exists. Any shift of the distribution $F_1$ to the right (that is, an increase in expected future appreciation rates) will increase current buyers' reservation values.

First, the model predicts that households pay different prices for identical housing. Some households pay a positive price markup (sale price above the average), while others pay less. What is important to remark here is that the price markup is not a consequence of unobserved housing heterogeneity but rather it is due in part to random matching and heterogeneity in buyer and seller tastes. This is an important observation that is considered when estimating the price markups empirically.

Second, the model clearly shows that there is a positive relationship between price markups and default rates: higher markups (higher $p_0$) increase the probability of future default (see equation (3)). In the empirical section, this hypothesis is tested and the implications are investigated.[8]

[8] Partial equilibrium solutions have been analyzed where it has been assumed that $F_0$ and $F_1$ are exogenous to the model. This may not be a reasonable assumption since sale prices are typically determined after a (sometimes complex) negotiating and bargaining process between buyers and sellers. Moreover, a home buyer can become a home seller in the future. This complication is avoided for the sake of clarity and simplicity.

Finally, also note that buyers with lower costs of future default (lower $D$) are willing to pay higher prices for houses.[9]

[9] To see this from Figure 1, note that as $D$ decreases the dotted horizontal line in that figure moves higher, closer to the x‐axis.
At the same time, the downward sloping curve above ($E[V_1^s(p_0)]$) shifts up, raising the value at the intersection that determines the buyer's reservation house price ($P^R_0$). We thank an anonymous referee for making this point.

This means that buyers with lower default costs will pay on average higher prices for identical housing. Higher markup and lower default costs, in turn, predict higher mortgage default rates. Are defaults more common among buyers that begin with positive markups because these markups make negative equity more likely, or because these markups indicate that these buyers have lower costs of default? From an empirical perspective, it is challenging to isolate these channels and precisely identify the causal effect of markups on future default rates. We emphasize that we are not seeking to uncover such a causal link. The empirical section focuses on estimating the conditional correlation between current markups and future default rates.

MEASUREMENT OF MARKUPS, DATA, AND STATISTICS

A markup is defined as the (log) difference between the actual price and the predicted price of a home. To predict the average sale price of a housing unit, we use microdata and a valuation method based on market averages. The method uses two transaction prices of the same housing unit, along with a local HPI, to calculate the difference between the actual price and a predicted price. After outlining this method along with its strengths and weaknesses, a description is provided for the data used to calculate the measures found in the subsequent empirical sections.

Finally, some basic descriptive statistics of the markups are presented for origination cohorts, loan outcomes, distributions, and same‐unit dynamics. We note that predicted home prices (and markups) can be estimated in alternative ways, using estimates from professional appraisals and/or automated valuation models (AVMs).
In Section 6, we explore the role (if any) of appraisal‐based markups, leaving analysis of AVMs for further research.

Markup Measurement

The house price markup for a transaction is defined as the log‐difference between the actual transaction value ($V$) and a predicted value ($\hat{V}$) of a unit. Markups are measured by taking the previous transaction value of unit $i$ in area $j$ at time $t$, and multiplying this value by the change in a local HPI ($P$) between the time of the first (anchor) transaction, $t$, and that of the second (subject) transaction, $t+1$.[10]

[10] This value estimation approach is also used by Anenberg (2016) and Molloy and Nielsen (2018) in other contexts.

This method requires two transactions of the same unit to calculate a markup ($m$). With lower‐cased variables indicating logs, the markup is defined as follows:

$$\begin{equation} m_{i,j,t+1}\equiv v_{i,j,t+1}-\hat{v}_{i,j,t+1}; \quad \hat{v}_{i,j,t+1}=\hat{p}_{j,t+1}-\hat{p}_{j,t}+ v_{i,j,t}, \tag{7} \end{equation}$$

where $\hat{p}_{j,t+1}-\hat{p}_{j,t}$ reflects house price appreciation as measured by the area's repeat sales index from time $t$ to time $t + 1$.

An example of the calculation of the price markups for a particular housing unit is shown in Figure 2. This house transacted four times between 1998 and 2015, with transaction prices denoted by solid circles. Each predicted price, based on the previous anchor transaction and the change in the HPI, is denoted by a hollow diamond. The markup is shown in the lower panel with the bar height representing the percent difference between the actual and predicted subject transaction prices. The 2002 transaction was about 6% lower than the predicted transaction price, indicating a −6% markup that is denoted with a solid red circle.[11]

[11] Throughout the paper, log‐differences and percent changes are used interchangeably.
This is a reasonable approximation for small ($<|0.3|$) log‐differences, but becomes increasingly inaccurate as differences increase.

Then, based on this second transaction price, the predicted price in 2009 was about 10% lower than the actual, giving a +10% markup as depicted with a solid green circle. Finally, the fourth transaction indicates a −10% markup relative to the third transaction, and is depicted with another red circle because of the negative markup. We focus on purchase transactions (both cash and purchase‐money mortgages).

Figure 2. An Illustration of the Calculation of a Markup.
Notes: This figure is calculated using GSEs' data on mortgages and county recorder data on transactions from an actual sequence of four transactions of a single housing unit in Washington, DC.

Data

To calculate markups for a large set of houses and estimate their association with various mortgage outcome measures, a database is assembled with information on three main items for each housing unit: a time series of transactions, a relevant HPI, and mortgage attributes and performance.

Transaction record

The base data set provides nearly complete coverage of real property and refinance transactions for single‐family housing in the United States. This record is based on two main sources and is identical to the data used by the Federal Housing Finance Agency (FHFA) to produce its suite of house price indices. The first source is administrative data from Fannie Mae, Freddie Mac, and the Federal Housing Administration. These include transaction and appraised values of homes from both purchase‐money and refinance mortgages purchased, securitized, or guaranteed by any of these three entities. Appended to this data set is a county recorder transaction file from CoreLogic consisting of transaction values from sales and transfer documents that are filed with county recorder offices.
These include information that the administrative data miss, including cash purchases or purchases with loans that are held in portfolios or private‐label securities. These sources combine to include more than 100 million transactions involving single‐family housing units since the mid‐1970s.

House price indices

A panel of local HPIs is merged onto the transaction records. To account for local variation in house prices, a ZIP‐code‐level file is obtained from the FHFA as described in Bogin, Doerner, and Larson (2019a).[12]

[12] These indices are produced at an annual frequency for ZIP codes with at least 25 “half‐pairs” (a count of the paired transactions where either the first or the second transaction occurs in the respective year) in each year.

After combining these indices with the transactions record, markups are calculated for all transactions on a property except for the first recorded transaction. The out‐of‐sample standard deviation of prediction errors for these indices is approximately 10% for a 1‐year holding period, rising to about 15% for 8 years (see Figure 1a in Bogin, Doerner, and Larson 2019b).[13]

[13] For some context, estimates by Goodman and Thibodeau (2003) show hedonic models to have unit‐level prediction errors between 3% and 10% depending on the level of aggregation. While an HPI may produce noisier price estimates, this should only affect the second moment, not the first. Accordingly, measurement error should only serve to inflate standard errors (left‐hand side error) and attenuate estimates (right‐hand side error). An HPI also has the advantage of not requiring a vector of detailed housing unit attributes. This allows us to capture an extensive number of transactions going back in time.
A large number of markups is needed in later stages of the paper, where the models of loan performance require a sufficient number of calculated markups.

Loan performance data

Finally, mortgage origination and performance information are gathered at the loan level to measure the effects of price markups on specific payment outcomes. This information is from Fannie Mae and Freddie Mac's publicly available single‐family performance files.[14]

[14] Data are obtained from Freddie Mac and Fannie Mae from their websites and a database of proprietary identifiers via a regulatory oversight agreement. Freddie Mac data are at https://www.freddiemac.com/research/datasets/sf-loanlevel-dataset. Fannie Mae data can be found at http://www.fanniemae.com/portal/funding-the-market/data/loan-performance-data.html.

The mortgages represent fully amortizing fixed‐rate loans that have been purchased or securitized between 2001 and 2012. These are “full documentation” loans, which means that loan application information has been verified or waived. A typical loan is underwritten for 30 years but the data also contain 15‐, 20‐, and 40‐year terms, although loan data with these other lengths are not available for originations before 2005. Mortgages are excluded if they are adjustable‐rate, interest‐only, balloon, step‐rate, or are not first liens. The sample also usually excludes mortgages that have credit enhancements that go beyond primary MI. Cash transactions or government‐issued mortgages, such as from Federal Housing Administration (FHA) or U.S. Department of Veterans Affairs (VA), are not present in this performance data set, though they are present in the transactions records used in the creation of the markups measures.
Loans are also excluded if they participated in the Home Affordable Refinance Program (HARP) or if they have been flagged for other nonstandard attributes such as LTVs above 97%, immediate liquidations, biweekly payment due dates, reduced documentation, streamlined processing, or participation in a lender recourse or third‐party credit‐sharing arrangement.

The origination data provide details about a loan's terms (amount, length, and rate), purpose (purchase or refinance), MI coverage, origination channel, and general location. The ongoing performance files track monthly outstanding loan balance, maturity (age and remaining months), loan modification, delinquency status, and outcomes upon termination (actual loss revenues and expenses, covered chargeoffs, or repurchased workouts). A “default” is defined as a negative loan termination, including foreclosure, foreclosure alternative, or lender repurchase.

Loan Counts

The analysis begins with the 3.9 million purchase‐money mortgages (“purchase mortgages”) in the loan performance file where we are able to calculate a markup. About 3.4 million loans pass several data quality filters.[15]

[15] Major drops include 350,000 observations with markups outside ±50% and 90,000 observations without fully populated covariates.

Table 1 outlines counts of loan originations and outcomes by origination cohort.[16]

[16] Outcomes are as of March 2018.

Table 1. Purchase Loan Counts and Outcomes

| Year | Loans | Current | Prepay | Default | Direct default | Modification | Positive markup | Neg. markup default rate | Pos. markup default rate |
|---|---|---|---|---|---|---|---|---|---|
| 2001 | 366,582 | 1.5% | 97.4% | 1.0% | 31.5% | 0.5% | 56.2% | 0.9% | 1.2% |
| 2002 | 338,215 | 3.0% | 95.6% | 1.3% | 38.7% | 0.8% | 58.8% | 1.2% | 1.4% |
| 2003 | 352,748 | 7.8% | 90.0% | 2.2% | 51.1% | 1.6% | 67.0% | 2.1% | 2.3% |
| 2004 | 248,345 | 8.7% | 88.1% | 3.2% | 57.1% | 2.1% | 65.9% | 2.8% | 3.4% |
| 2005 | 273,183 | 10.0% | 84.6% | 5.4% | 63.4% | 3.0% | 62.8% | 4.0% | 6.2% |
| 2006 | 243,051 | 7.3% | 86.6% | 6.1% | 60.8% | 3.6% | 57.6% | 5.3% | 6.7% |
| 2007 | 251,120 | 8.6% | 84.5% | 6.9% | 53.2% | 4.4% | 51.8% | 6.2% | 7.6% |
| 2008 | 276,320 | 8.6% | 87.3% | 4.1% | 44.7% | 3.6% | 41.3% | 3.4% | 5.1% |
| 2009 | 279,763 | 20.4% | 78.9% | 0.6% | 41.0% | 0.5% | 43.5% | 0.6% | 0.8% |
| 2010 | 247,342 | 29.8% | 69.9% | 0.3% | 38.9% | 0.4% | 49.5% | 0.2% | 0.3% |
| 2011 | 232,879 | 37.9% | 62.0% | 0.2% | 32.2% | 0.3% | 48.6% | 0.1% | 0.2% |
| 2012 | 305,801 | 67.3% | 32.6% | 0.1% | 30.5% | 0.2% | 56.0% | 0.1% | 0.1% |
| Total (all) | 3,415,349 | 17.0% | 80.5% | 2.5% | 44.7% | 1.7% | 55.3% | 2.1% | 2.8% |
| Total (2006–07) | 494,171 | 8.0% | 85.5% | 6.5% | 57.0% | 4.0% | 54.6% | 5.7% | 7.1% |

Note: The sample includes all purchase loans with a calculated markup subject to filters noted in the text as of March 2018. A direct default is defined following Foote et al. (2010): (i) the borrower must be current for three consecutive months, and then register a 90‐day delinquency 3 months later; (ii) the borrower must never have been seriously delinquent (90 days) before triggering (i); and (iii) the borrower must never become current again before defaulting.

The number of purchase originations in the sample is highest in 2001–2003 at around 350,000 loans per year, and then remains roughly constant at between 240,000 and 300,000 loans per year for the remainder of the sample. For loans originated prior to 2009, about 90% or more of each vintage of loans has resolved in either default or prepayment. Defaults are low in 2001 at 1.0%, rise to 6.9% in 2007, and fall to 0.1% in 2012, though with each successive vintage past 2007, an increasing share of loans remains current. Loan modifications follow a pattern similar to that of defaults.

Direct defaults are tracked as a measure of strategic default.
A direct default happens when a borrower makes no payments after becoming delinquent, instead choosing to “simply walk away from the home” (Foote et al. 2010). Following Foote et al. (2010), a direct default is identified by three criteria: (i) the borrower must be current for three consecutive months, and then register a 90‐day delinquency three months later; (ii) the borrower must never have been seriously delinquent (90 days) before triggering (i); and (iii) the borrower must never become current again before defaulting. Across the sample, about 45% of defaults are direct, with loans in 2006 and 2007 averaging about 57%.

Over the entire sample, about 55% of all purchase loans have positive markups.[17]

[17] The average markup is positive, indicating a difference in the selection mechanism into the sample versus the HPI. By construction, the HPI approach should give markups that are, on average, equal to zero if the samples are identical and markups are symmetric. The combination of a price index and property selection has isolated a subset of houses that have transacted for prices that are higher than the market area's average. This is likely due to two main reasons. First, the price indices are based on conforming, conventional loans. If an initial transaction were cash or a non‐GSE foreclosure sale, the price could be depressed and the markup calculated for the second transaction may be artificially inflated. Second, houses that have transacted multiple times in sequence may have undergone a renovation or “flip” causing, again, a positive markup. Ultimately, we continue with the analysis but acknowledge such potential shortcomings.

Table 1 also presents default rates by markup sign. In every year, loans with negative markups have default rates at least as low as those with positive markups. The magnitudes are often remarkably dissimilar, indicating a large unconditional correlation between markups and defaults.
For instance, in 2007, the default rate for loans with negative markups is 6.2%; for positive markups, it is 7.6%. This relationship between negative and positive markups and default rates persists in noncrisis periods; even with default rates dropping to less than 1% in post‐2009 cohorts, positive markups still have (weakly) greater default rates than negative markups. Overall, there is a clear, bivariate, reduced‐form relationship between the sign of the markup at origination and eventual default. This relationship is remarkably robust and economically relevant, especially in periods of heightened default risk.

Kernel densities related to markups and defaults are shown in Figure 3. The distribution of markups is shifted to the left for those that prepay versus those that default, indicating a positive association between default and markups throughout the distribution. Another thing to note is the large variance of the markups. We attribute the large variance to buyer/seller match idiosyncrasies, the presence of error in the markup measurement, and quality changes across transactions. The last two factors attenuate any estimated relation between markups and outcome variables in empirical work. However, as will be shown, the markup is highly predictive across a battery of loan outcomes, model specifications, and samples, despite the potentially high degree of noise in the particular measure.[18]

[18] Markups are greater than 0.25 or less than −0.25 in 14.1% of all our loans, which is comparable to the 20–26% of “markups” falling outside of this range when using Zillow's Home Value Index, according to Molloy and Nielsen (2018).

Figure 3. Markup Distribution.
Notes: The sample includes GSE purchase mortgages originated in 2001 through 2012 with a calculated markup.

Same‐Unit Dynamics

The model in the prior section suggests that idiosyncratic factors related to the buyer/seller match generate markups.
We assess the presence of these factors in markups across consecutive transactions of the same housing unit. We find that, on average, markups are partially mean reverting, suggesting that markups contain the crucial idiosyncratic component we believe is related to default.

Continuing from equation (7), let us assume value can be decomposed into one‐dimensional price $p$ and quantity $q$ and that the estimate of HPI appreciation is measured with error $e$. Then the markup is $m_{t+1}=\Delta {q}_{t+1}+\Delta {p}_{t+1}-\Delta \hat{p}_{t+1}-e_{t+1}$. From this expression, a number of factors influence the markup: changes to the quantity of housing services provided by the property, changes to the price paid per unit of housing services, HPI changes, and measurement error in the HPI. Each factor deserves some discussion.

Quantity changes may be one‐time events (e.g., large‐scale renovation), so the markup in $t+1$ may be uncorrelated with future quality changes in $t+2$. In this case, the markup is uncorrelated with future markups. On the other hand, we would expect price changes for individual units in excess of market‐measured prices to be mean‐reverting upon subsequent transactions, giving markups that are negatively correlated. If the HPI had measurement error, the markup would be of equal and opposite sign to this error. Note that in period $t+2$, the markup from period $t+1$ may be, but is not required to be, relevant; some portions of the markup are mean‐reverting and others are not. In general, we expect markups to be partially mean‐reverting when averaged over the market.[19]

[19] Drift in quality may also cause the average markup in a group to be nonzero, even with an unbiased HPI.

To model the same‐unit dynamics of markups, a future markup is expressed as a function of the current markup, the prior markup, and other covariates, as shown below.
This equation is an adapted version of the simple mean-reversion specification found in Davis and Weinstein (2002). A variable is also included for default due to the well-documented foreclosure premium, as well as the financing type (versus a GSE purchase-money mortgage), and in some specifications, the time between transactions, $h(t,t+1)$. To control for the possibility that markups are nonzero on average, time period and state fixed effects are included. To account for residual variance in the model, standard errors are clustered at the level of the state interacted with the year.

$$m_{i,t+1}=\beta_0+\beta_1 m_{i,t-1}+\beta_2 m_{i,t}+X^{\prime}\gamma+u_{i,t+1}. \tag{8}$$

If $\beta_1<0$ and/or $\beta_2<0$, then the prior markup contains a transitory component; the markup in the past is predictive of a markup of opposite sign in the future. On the other hand, if both are equal to 0, then the prior markup is not predictive of future markups and can be considered permanent and reflective of a change to home quality. A positive coefficient would represent the presence of a series of positive or negative markups relative to the market average price, suggesting serially correlated changes to home quality.

Table 2 shows estimates of same-unit markup dynamics across transactions. There are four models, with each corresponding to a different conditioning set or subsample. Column 1 presents estimates using the full set of purchase mortgages. The current markup parameter is −0.34 and the previous markup parameter is −0.03.

20 Genesove and Mayer (2001) estimate the residual from the previous sale hedonic to have a partial effect of 0.16 on the next sale, and a “months since last sale” parameter of −0.0004, or about −0.005 per year. This hedonic residual is an alternative calculation of a markup. 
This specification is remarkably close to ours but is not adapted to the task of estimating markup dynamics, making comparison somewhat difficult. If they had estimated an interacted variable parameter relating the residual $\times$ months since last sale along with a vector of controls, then the parameters would be comparable. As it stands, their estimate of 0.16 for the residual on the next sale corresponds to a −0.84 estimate (about 2.5$\times$ our estimate) on the following markup in our specification. Their −0.005 per year is larger in magnitude than our 0.0005 estimate, though by assumption, they are restricting the interacted coefficient to 0, while ours is −0.01. Accordingly, our findings are broadly consistent with the signs and significance levels found in Genesove and Mayer (2001), though they are fairly far apart in magnitude.

In terms of other covariates, the foreclosure premium is estimated to be about −25%, which is similar to the estimates found in Campbell, Giglio, and Pathak (2011). Financing appears to have little effect with parameters between −0.5% and 1% versus GSE-acquired mortgages. Column 2 presents results from a model with covariates for the time between transactions and its interaction with the markup. The negative interaction estimate suggests that the markup mean reverts more strongly the longer the holding period. 
Columns 3 and 4 show that positive markups exhibit lower levels of mean reversion than negative markups.

Table 2. Same-Unit Markup Dynamics
Dependent variable: Next markup (Markup[t+1])

| | [1] | [2] | [3] | [4] |
|---|---|---|---|---|
| Sample | All | All | Positive Markup | Negative Markup |
| Markup[t] | −0.337*** | −0.283*** | −0.0718*** | −0.523*** |
| | [0.00720] | [0.00850] | [0.0116] | [0.0193] |
| Markup[t−1] | −0.0336*** | −0.0336*** | −0.0297*** | −0.0367*** |
| | [0.00163] | [0.00163] | [0.00189] | [0.00193] |
| Markup[t] $\times$ h[t,t+1] | | −0.0119*** | −0.0214*** | −0.00713* |
| | | [0.00137] | [0.00189] | [0.00383] |
| h[t,t+1] | | −0.00049 | 0.000577 | 0.000497 |
| | | [0.000381] | [0.000410] | [0.000399] |
| Default[t] | −0.250*** | −0.249*** | −0.271*** | −0.222*** |
| | [0.00916] | [0.00917] | [0.00830] | [0.0127] |
| FHA[t−1] | 0.0132*** | 0.0134*** | 0.00699*** | 0.00705*** |
| | [0.00189] | [0.00188] | [0.00163] | [0.00271] |
| non FHA, GSE[t−1] | 0.0103*** | 0.0104*** | 0.000195 | 0.0185*** |
| | [0.00240] | [0.00242] | [0.00249] | [0.00338] |
| FHA[t+1] | −0.0481*** | −0.0484*** | −0.0477*** | −0.0520*** |
| | [0.00218] | [0.00213] | [0.00200] | [0.00268] |
| non FHA, GSE[t+1] | −0.00592*** | −0.00551*** | −0.00889*** | −0.00077 |
| | [0.00157] | [0.00159] | [0.00173] | [0.00190] |
| Constant | 0.0499*** | 0.0527*** | 0.0290*** | 0.0292*** |
| | [0.00108] | [0.00200] | [0.00239] | [0.00243] |
| Observations | 475,729 | 475,729 | 264,036 | 211,693 |
| R2 | 0.115 | 0.116 | 0.094 | 0.103 |

Note: Robust standard errors in brackets and clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. Fixed effects by State $\times$ Year are absorbed prior to estimation, with standard errors reflecting true degrees of freedom. The sample includes GSE purchase mortgages originated in 2001–2012 with both an earlier ($t-1$) and later ($t+1$) markup, necessitating at least four sequential purchase or refinance transactions on the same housing unit.

In sum, markups are mean reverting. These results indicate that markups are driven by transaction-level pricing that is, in part, transitory. Our theoretical model suggests that due to these transitory factors, markups are associated with variation in collateral risk for mortgages. 
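As a rough illustration of what the column 2 estimates in Table 2 imply, the sketch below evaluates only the markup terms of Equation (8). It is a minimal sketch under stated assumptions: the holding period $h$ is taken to be measured in years (the text does not state the unit), the constant, the main effect of $h$, the controls, and the fixed effects are left out, and the function name `markup_terms` is hypothetical rather than part of the paper.

```python
# Coefficients copied from Table 2, column 2 (markup terms only).
BETA_CURRENT = -0.283       # Markup[t]
BETA_PREVIOUS = -0.0336     # Markup[t-1]
BETA_INTERACTION = -0.0119  # Markup[t] x h[t,t+1]


def markup_terms(m_t, m_prev, holding_period):
    """Contribution of the current and prior markups to the expected next markup.

    Assumption: holding_period is in years. The constant, the main effect of h,
    the controls, and the fixed effects are omitted, so this is not a full
    prediction of the next markup.
    """
    slope_on_current = BETA_CURRENT + BETA_INTERACTION * holding_period
    return slope_on_current * m_t + BETA_PREVIOUS * m_prev


# A +10% markup today, no prior markup, held for 2 versus 10 periods.
short_hold = markup_terms(0.10, 0.0, 2)
long_hold = markup_terms(0.10, 0.0, 10)
print(short_hold, long_hold)
```

Under these assumptions, the longer holding period produces the more negative contribution, which is the sense in which mean reversion strengthens with the time between transactions.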
The analysis now turns to an examination of the relationship between markups, mortgage defaults, survival lengths, losses conditional on default, and comparisons with markups calculated using mortgage financing appraisals.

PREDICTING EVENTUAL DEFAULTS

Conceptually, it is clear that lower collateral, holding the loan balance and other variables constant, should be associated with a higher risk of default. This directly applies to markups for reasons best shown by a simple illustration. Suppose a mortgage with a stated LTV ratio of 97 is used to purchase a house, but the sale price contains a +10% pricing error. This loan would immediately be underwater with a current LTV of about 108 because the collateral is worth less than the initial valuation going into the calculation of the LTV. Similarly, if the housing unit is underpriced by 10%, the 97 LTV mortgage would immediately have a current LTV of 88, making it substantially less risky.

21 The LTV math is as follows: $\frac{97}{90}=1.077$; $\frac{97}{110}=0.881$.

In either case, the standard LTV is not based on an accurate collateral valuation. This is especially dangerous in the case of a positive markup, because it potentially indicates an appraisal failure. One of the primary purposes of an appraisal is to help assess the risk of default for both the lender and the borrower. With a positive markup, the LTV understates the default risk. For these reasons, emphasis is placed on assessing how price markups affect default risk conditional on the origination LTV.

The bivariate relationship between the markup and default is shown in the first panel of Figure 4 for the 2001–2012 sample. At a −20% markup, the default probability is 2%, whereas at +20%, the default probability is about 3.1%, roughly 1.5 times the rate at the low markup. The relation is monotonic and of the predicted sign, giving a strong indication that markups can be used to predict eventual defaults in a meaningful way. 
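Returning to the LTV illustration earlier in this section, the arithmetic in footnote 21 can be checked with a short sketch. It assumes, as the footnote's fractions imply, that the pricing error is expressed as a share of the transaction price, so the collateral value is the price times one minus the markup. The function name `markup_adjusted_ltv` is hypothetical.

```python
def markup_adjusted_ltv(stated_ltv, markup):
    """Current LTV once the pricing error is removed from the collateral value.

    Assumption (from footnote 21): markup is the pricing error as a fraction
    of the sale price (+0.10 means overpriced by 10%), so the collateral is
    worth price * (1 - markup).
    """
    if markup >= 1:
        raise ValueError("markup must be below 1 for a positive collateral value")
    return stated_ltv / (1.0 - markup)


overpriced = markup_adjusted_ltv(97, 0.10)    # 97 / 0.90
underpriced = markup_adjusted_ltv(97, -0.10)  # 97 / 1.10
print(round(overpriced, 1), round(underpriced, 1))
```

The overpriced case lands above 100, so the loan starts underwater, while the underpriced case starts with a cushion of nearly 12 points.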
While +/−20% may seem to be a wide range, recall that (i) the particular markup estimate likely includes substantial idiosyncratic noise, and (ii) the range between these values includes about 80% of the markups; that is, over 20% of values are outside this range.

Figure 4. Distributions of Markups and Defaults Using a House Price Index (HPI) versus Appraisals. Notes: The sample includes loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. Histogram bins at the edges of the respective figures include censored values: markups between −50% and −20% are attributed to the −20% bucket, and markups between 20% and 50% to the 20% bucket. The default rate curve in Panel (b) does not include appraisal markups within 0.5% of zero, a group that includes 49% of all loans. This particular group of loans has been associated with adverse selection in the literature; we also find that it introduces a moderate discontinuity in the default rate curve. Loans within this range have a default rate of 2.5%.

Methods

To examine empirically the link between markups and default in a more rigorous fashion, a standard competing options default model is constructed. The dependent variable $d$ for loan $i$ is set equal to 1 if the mortgage terminates in default, defined as a foreclosure, foreclosure alternative such as a short sale, or a lender repurchase, 2 if the loan prepaid, and 0 if it is current as of March 2018. The variable of interest is the markup, calculated for the origination period, and the partial correlation of this variable is hypothesized to be positive on all forms of default. Note that this model requires complete ex post information and cannot be used to model causal lender or borrower behavior or to answer questions regarding whether lenders (borrowers) account for markups in their choices to issue (take out) a mortgage. Additionally, the markup calculation in a repeat-sales framework benefits from sales subsequent to the origination period. 
These limitations notwithstanding, the model here is illustrative of partial relations between the variables in question.

The functional form of the specification is a logit equation, with control variables in the vector $X$, coefficients for controls in the vector $\gamma$, and $e$ as a generalized extreme value IID random variable.

22 This is not the precise functional form as presented in Section 2. In the conceptual framework (and in other models, e.g., Hatchondo, Martinez, and Sanchez 2015, Bailey, Davila, Kuchler, and Stroebel 2017), the interaction between interest rates and the LTV determines the equilibrium outcome. The markup is layered onto this expression, which if properly specified, would necessitate a triple interaction and multiple double interactions, with default increasing in each of the arguments both through direct and interacted effects. These complex interactions are omitted to focus on the total partial effects using a parsimonious specification, acknowledging the possibility of omitted variable bias at extreme values.

Normalization requires all $d=0$ parameters set equal to zero, or $\lbrace \beta_{0,0}, \beta_{1,0}, \gamma_0\rbrace=\lbrace 0,0,\mathbf{0}\rbrace$.

$$\Pr(d_{i}=j)=\frac{\exp(\beta_{0,j}+\beta_{1,j} m_{i}+X_{i}^{\prime}\gamma_j+e_{i,j})}{\sum_{\tau=0}^{J}\exp(\beta_{0,\tau}+\beta_{1,\tau} m_{i}+X_{i}^{\prime}\gamma_\tau+e_{i,\tau})}. \tag{9}$$

The choice of controls is standard in the literature and, in particular, is motivated by Ghent and Kudlyak (2011) and Davis et al. (2019). First, variable ranges or “buckets” are included for the LTV, debt service payment to income ratio (DTI), and the credit score. The DTI at origination is a standard default indicator, as a high debt fraction of household income makes a household particularly susceptible to income shocks in terms of ability to repay. 
Credit score at origination is also a common indicator, representing a household's willingness and ability to repay debt in the past, and it may also be related to the cost of default ($D$), a key determinant of loan repayment in our conceptual framework. Multiple borrowers can mitigate the income risk of default, as the risk of a default-inducing income shock falls. First-time home buyers may be more or less risky because they often have less wealth, yet also often receive preferential treatment in the tax code, have acquired mortgages during a period of their lives when incomes tend to be accelerating, and may be held to stricter lending standards than repeat borrowers, all else equal. Investment properties and second homes are also risk indicators because debt on these sorts of luxury assets is the first to be defaulted upon in times of stress for the borrower, who presumably has multiple mortgages simultaneously.

23 Chinco and Mayer (2016) show that such buyers are more prone to mispricing home purchases. This can be taken a step further by analyzing whether such borrowers also perform worse on loans and if the mispricing exacerbates credit losses.

The origination channel controls for associations between different types of mortgage originators and loan outcomes. The amortization period is covered by dummy variables for 20-, 30-, and 40-year terms (versus a 15-year term).

A prepayment option control is included by way of the interest rate at origination. Combined with a time period fixed effect, the interest rate variable turns into a spread-at-origination variable, which is a prepayment risk indicator.

24 Agarwal, Ben-David, and Yao (2017) show that mortgage points are correlated with prepayment speeds, but less so than would be implied by the cost of the points. These points may be associated with markups as well, as a buyer who has a positive preference for a particular home may be both willing to overpay for a house and also purchase points on the mortgage. 
Unfortunately, a variable representing mortgage points is not available in the data. The (relative) interest rate acts as a control for a latent preference to remain in a home.

A default option variable is included as the cumulative change in the HPI between origination and the next transaction. This variable captures the Deng, Quigley, and Van Order (2000) concept of an underwater mortgage without explicitly calculating the probability.

25 Deng, Quigley, and Van Order (2000) calculate the probability a mortgage is underwater by taking the cumulative standard normal distribution of the log-difference in the current loan balance ($u$) and the current house value (i.e., the log-approximation of the current LTV) divided by the estimated residual standard deviation of the HPI value, or $\Phi[(\ln u_{i,t+h} - \ln V_{i,t+h})/\sqrt{\omega^2}]$. Rather than using a probabilistic measure, the origination LTV and the change in the HPI are included as separate covariates.

In addition, the markup variable itself is an underwater indicator, as it serves to increase or decrease the fundamental property value relative to the mortgage balance. The seasoning of the loan is included as a vector of amortization year fixed effects to account for amortization.

Time period and state fixed effects serve as important controls as well. In addition to the previously mentioned transformation of the interpretation of the interest rate variable, period fixed effects control for other macroeconomic-related conditions, including housing market liquidity, national unemployment rate changes, and other factors. State effects capture state-level variation in recourse versus nonrecourse laws, elasticity of housing supply, propensity to overbuild, fraud, and other factors throughout the crisis and its aftermath. 
Overall, these controls mitigate omitted variable bias in the treatment parameter, allowing us to capture the predictive relation between markups and defaults after accounting for a comprehensive set of variables.

Results

Results of several different default models are shown in Table 3. The first model is a multinomial logit specification, with “current” as the baseline outcome, and defaults and prepayments as competing options. Markups have a consistently positive and statistically significant effect on the probability of default. These estimates are also economically relevant, as going from a −20% markup to a +20% markup increases the default probability from 5.0% to 8.4% for a high-risk mortgage originated in 2006.

26 Variables are chosen to represent a 30-year, fixed rate, purchase mortgage in Florida in 2006, with one first-time borrower, issued through a retail channel, which terminates after 3 years. The DTI is 39, the FICO is 690, and the LTV is 91.

Table 3. Markups and Loan Outcomes, Part 1
Estimator: Multinomial Logit (vs. Current) for models [1] and [2]. Sample: All Loans. Model [2] separates the outcome by default type.

| | [1] Default | [1] Prepay | [2] Forec. Alt. | [2] Repurchase | [2] REO | [2] Prepay |
|---|---|---|---|---|---|---|
| Markup | 1.380*** | 0.0199 | 1.319*** | 1.045*** | 1.494*** | 0.0174 |
| $\%\Delta$ House Price Index | −4.794*** | 0.861*** | −5.659*** | −1.666*** | −4.845*** | 0.906*** |
| Mortgage interest rate | 0.648*** | 0.155*** | 0.500*** | 0.890*** | 0.645*** | 0.156*** |
| Combined LTV Bucket (vs. 0–60) | | | | | | |
| 61–70 | 0.944*** | −0.0321** | 1.135*** | 0.200** | 1.038*** | −0.0342** |
| 71–75 | 1.339*** | −0.0134 | 1.548*** | 0.549*** | 1.430*** | −0.0175 |
| 76–80 | 1.696*** | 0.00317 | 1.923*** | 0.699*** | 1.826*** | −0.00051 |
| 81–85 | 1.983*** | 0.0327 | 2.103*** | 1.187*** | 2.141*** | 0.0298 |
| 86–90 | 2.330*** | 0.0668*** | 2.456*** | 1.357*** | 2.511*** | 0.0635*** |
| 91–95 | 2.598*** | 0.0572*** | 2.717*** | 1.591*** | 2.795*** | 0.0524** |
| 96+ | 2.768*** | −0.0491 | 2.847*** | 1.922*** | 2.962*** | −0.0528 |
| DTI Bucket (vs. 0–33) | | | | | | |
| 34–38 | 0.154*** | 0.0150* | 0.194*** | 0.0876* | 0.145*** | 0.0147* |
| 39–43 | 0.279*** | 0.0148* | 0.335*** | 0.230*** | 0.260*** | 0.0138* |
| 44–50 | 0.322*** | −0.0133 | 0.389*** | 0.328*** | 0.287*** | −0.0154 |
| 51+ | 0.369*** | −0.142*** | 0.356*** | 0.549*** | 0.346*** | −0.143*** |
| Credit Score Bucket (vs. 300–579) | | | | | | |
| 580–619 | −0.232** | 0.295*** | 0.263*** | −0.0316 | −0.681*** | −0.139 |
| 620–639 | −0.358*** | 0.406*** | 0.367*** | −0.104 | −0.944*** | −0.262** |
| 640–659 | −0.436*** | 0.540*** | 0.496*** | −0.107 | −1.206*** | −0.344*** |
| 660–689 | −0.589*** | 0.710*** | 0.664*** | −0.169 | −1.404*** | −0.528*** |
| 690–719 | −0.780*** | 0.841*** | 0.794*** | −0.256* | −1.656*** | −0.758*** |
| 720–769 | −1.144*** | 0.950*** | 0.902*** | −0.491*** | −2.148*** | −1.172*** |
| 770+ | −1.683*** | 0.926*** | 0.879*** | −0.910*** | −2.763*** | −1.777*** |
| Multiple Borrowers | −0.589*** | 0.167*** | −0.327*** | −0.775*** | −0.683*** | 0.165*** |
| First-time homebuyer | −0.136*** | −0.0831*** | −0.220*** | 0.0195 | −0.124*** | −0.0827*** |
| Acquisition Channel (vs. Retail) | | | | | | |
| Broker | 0.0836*** | −0.127*** | 0.101*** | 0.0123 | 0.0755*** | −0.124*** |
| Correspondent | −0.0702*** | −0.122*** | 0.00696 | −0.181** | −0.0774*** | −0.121*** |
| Not Specified | 0.599*** | 0.448*** | 0.570*** | 0.452*** | 0.630*** | 0.441*** |
| Occupancy Type (vs. Owner) | | | | | | |
| Investment Property | 0.190*** | −0.167*** | −0.221*** | 0.124 | 0.376*** | −0.164*** |
| Second Home | 0.0829* | −0.0146 | −0.215*** | 0.129 | 0.218*** | −0.0101 |
| Amortization Period (vs. 15 years) | | | | | | |
| 20 Years | 0.0685 | −2.61E-05 | 0.286 | −0.581** | 0.16 | 0.000193 |
| 30 Years | 0.678*** | 0.0601** | 0.999*** | −0.184 | 0.737*** | 0.0589** |
| 40 Years | 0.456*** | −2.397*** | 0.0337 | 1.011*** | 0.688*** | −2.330*** |
| Observations | 3,415,349 | | 3,415,349 | | | |
| Pseudo-R2 | 0.726 | | 0.706 | | | |

Note: Values presented are the log-odds estimates. Robust standard errors are clustered by State $\times$ Year, but are intentionally omitted for brevity. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. Cohort year, state, seasoning, and GSE fixed effects are included in all specifications, but omitted from the table.

Control variables give estimates as predicted. 
The default option control variable (the cumulative level of ZIP-code-level house price appreciation) reduces the probability of default. The mortgage interest rate, LTV, longer loan terms, broker and third-party originator (TPO) channels, and DTI are each associated with increased default probabilities. Credit score, the presence of multiple borrowers, status as a first-time home buyer, and intended status as an owner-occupier are all factors that reduce the probability of default.

Model 2 expands the default indicator into three separate outcomes: foreclosure alternatives, repurchases, and foreclosure/REO sales. Estimates for the markup variable and controls are remarkably robust. Due to this robustness, the omnibus default indicator in model 1 is the preferred default metric in subsequent models.

Models 3–6, as shown in Table 3, consider four other outcomes of interest. The first, shown as model 3, is direct defaults. Markups are hypothesized to have a positive effect on the incidence of direct default, conditional on default. Borrowers are continually gathering information and updating their beliefs of property values. This is aided, in part, by companies such as Zillow and Redfin, which produce real-time estimates of house values. As shown in models 1 and 2, a borrower who faces a large, positive markup is more likely to default. This type of borrower is also more likely to be underwater and, upon receiving information on the true price of their previously overpriced unit, may choose a direct default. 
The evidence supports this hypothesis: the parameter estimate is positive and statistically significant.

Table 3. Markups and Loan Outcomes, Part 2

| | [3] | [4] | [5] | [6] | [7] |
|---|---|---|---|---|---|
| Estimator | Logit | Logit | Logit | Logit | Cox |
| Sample | Defaults | All | All | All | 2006–2007 |
| Outcome | Direct Default | Ever Modified | Ever D90 | Ever D180 | Default |
| Markup | 0.163*** | 0.398*** | 0.765*** | 0.892*** | 0.481*** |
| $\%\Delta$ House Price Index | −0.852*** | −1.588*** | −3.754*** | −4.145*** | −3.397*** |
| Mortgage interest rate | −0.236*** | 0.621*** | 0.577*** | 0.551*** | 0.890*** |
| Combined LTV Bucket (vs. 0–60) | | | | | |
| 61–70 | 0.315*** | 0.695*** | 0.497*** | 0.631*** | 1.053*** |
| 71–75 | 0.323*** | 0.920*** | 0.674*** | 0.814*** | 1.451*** |
| 76–80 | 0.316*** | 1.057*** | 0.862*** | 1.032*** | 1.623*** |
| 81–85 | 0.256*** | 1.280*** | 1.031*** | 1.196*** | 1.813*** |
| 86–90 | 0.381*** | 1.425*** | 1.258*** | 1.449*** | 2.016*** |
| 91–95 | 0.394*** | 1.551*** | 1.432*** | 1.630*** | 2.185*** |
| 96+ | 0.339*** | 1.738*** | 1.649*** | 1.830*** | 2.351*** |
| DTI Bucket (vs. 0–33) | | | | | |
| 34–38 | 0.00246 | 0.347*** | 0.196*** | 0.181*** | 0.121*** |
| 39–43 | −0.0619** | 0.523*** | 0.325*** | 0.305*** | 0.226*** |
| 44–50 | −0.0878*** | 0.670*** | 0.427*** | 0.389*** | 0.299*** |
| 51+ | −0.140*** | 0.853*** | 0.571*** | 0.517*** | 0.410*** |
| Credit Score Bucket (vs. 300–579) | | | | | |
| 580–619 | 0.319*** | −0.404*** | −0.509*** | −0.416*** | −0.347*** |
| 620–639 | 0.564*** | −0.682*** | −0.763*** | −0.629*** | −0.512*** |
| 640–659 | 0.748*** | −0.936*** | −1.013*** | −0.837*** | −0.616*** |
| 660–689 | 1.023*** | −1.294*** | −1.377*** | −1.154*** | −0.765*** |
| 690–719 | 1.203*** | −1.625*** | −1.755*** | −1.495*** | −0.975*** |
| 720–769 | 1.398*** | −2.054*** | −2.269*** | −1.987*** | −1.285*** |
| 770+ | 1.512*** | −2.709*** | −2.886*** | −2.598*** | −1.717*** |
| Multiple Borrowers | 0.153*** | −0.238*** | −0.656*** | −0.663*** | −0.499*** |
| First-time homebuyer | −0.0500*** | −0.126*** | −0.0897*** | −0.0949*** | −0.0988*** |
| Acquisition Channel (vs. Retail) | | | | | |
| Broker | −0.0893*** | 0.338*** | 0.243*** | 0.253*** | 0.227*** |
| Correspondent | −0.0349 | 0.292*** | 0.0995*** | 0.0866*** | 0.166*** |
| Not Specified | −0.0457* | 0.175*** | 0.185*** | 0.185*** | 0.0944*** |
| Occupancy Type (vs. Owner) | | | | | |
| Investment Property | 0.209*** | −1.793*** | −0.170*** | −0.101* | 0.0805*** |
| Second Home | 0.0869** | −0.965*** | −0.134*** | −0.102*** | 0.229*** |
| Amortization Period (vs. 15yr) | | | | | |
| 20 Years | −0.262 | 0.113 | 0.0759 | 0.0311 | 0.141 |
| 30 Years | −0.0105 | 0.745*** | 0.526*** | 0.634*** | 0.682*** |
| 40 Years | −0.709*** | 9.266*** | 6.567*** | 4.278*** | 1.444*** |
| Observations | 83,640 | 3,415,349 | 3,415,349 | 3,415,349 | 494,171 |
| Pseudo R2 | 0.0954 | 0.546 | 0.346 | 0.34 | 0.214 |

Note: Values presented are the log-odds estimates for the logit models, and partial effects for the Cox hazard model. Robust standard errors are clustered by State $\times$ Year, but are intentionally omitted for brevity. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. Cohort year, state, seasoning, and GSE fixed effects are included in all specifications, but omitted from the table.

Model 4 estimates the effects of markups at origination on the propensity to seek and acquire a loan modification from a lender. Because markups are a default indicator, they are also likely an indicator of modifications for the same reasons. Results show that markups are highly significant predictors of loan modifications. Controls are also of anticipated sign and significance, and match those in models 1 and 2 with some exceptions. In particular, mortgages on investment properties and second homes are substantially less likely to receive loan modifications versus owner-occupied homes. Models 5 and 6 estimate the effects of markups on delinquency using both D90 and D180 definitions. The signs and significance levels for these two models are nearly identical to those in models 1 and 2.

27 Robustness tests by origination cohort and considering further nonlinearities in the markup measure are available later in the main body of the paper. Additionally, Appendix Table A.1 considers several other subsamples, including markups that are nearly zero for the previous markup and for different loan amounts. 
These models reinforce the robustness of the markup's sign, significance, and magnitude.

Model 7 presents estimates from a simple Cox (1972) proportional hazard model to estimate reduced-form relations between a price markup and a mortgage's survival length.

28 Prepayment and default hazards, from a borrower's perspective, are competing options: in each period, a borrower must decide to remain current, prepay the balance of the mortgage, or default. In a proportional hazard model, competing options can be treated as censored, facilitating a substantial reduction in the computation burden necessary to estimate unbiased and consistent reduced-form parameters, but removing causal interpretation from the resulting estimates. For a recent example of this approach, see Foote et al. (2010), who model the prepayment and default hazards for prime and subprime loans.

This model is estimated for the crisis sample of originations (the 2006/2007 cohort) because these loans have mostly resolved. Results from the default hazard model are consistent with the other variables and methods considered. Markups are highly predictive of default.

This set of models demonstrates that house price markups provide a simple but robust mortgage stress indicator, conditional on standard controls. Markups at origination are predictive of delinquencies, defaults, loan modifications, and a loan's default path.

29 It is possible our results, as robust as they are, may be in part driven by our methodological choice of using an HPI to calculate the markup. We have constructed the best markup measure we can using available information, which consists of a long historical transaction record. To our knowledge, this is the only way to generate market-based valuations for a large number of properties before, during, and after the Great Recession. 
We leave evaluation of alternative markup calculations to further research.

Markups in Hot and Cold National Housing Markets

Estimates thus far are based on the pooled sample of transactions between 2001 and 2012. Another interest is the extent to which parameters vary over time when estimated on a year-by-year basis. In doing so, a link may be established between the association of markups with defaults and prepayments in hot versus cold national housing markets.

Table 4 shows the results of 12 models, each with the same specification as Table 3, model 1. The markups log-odds parameter is presented and all other covariates are estimated but not reported. Default parameters are between 0.6 and 1.8, with no discernible pattern in terms of parameters over time. This is evidence that the relation between markups and defaults is relatively stable over time, with variation due to chance. It is remarkable that even as absolute default rates rise and fall, the parameter estimate stays reasonably similar.

Table 4. Markups and Loan Outcomes by Origination Cohort
Estimator: Multinomial Logit (vs. Current). Sample: Column header.

| Year | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 |
|---|---|---|---|---|---|---|
| Markup | 1.668*** | 1.793*** | 1.332*** | 1.144*** | 1.283*** | 0.733*** |
| | [0.187] | [0.272] | [0.225] | [0.193] | [0.222] | [0.156] |
| Other Covariates | Yes | Yes | Yes | Yes | Yes | Yes |
| Observations | 366,582 | 338,215 | 352,748 | 248,345 | 273,183 | 243,051 |
| Pseudo-R2 | 0.595 | 0.666 | 0.689 | 0.661 | 0.63 | 0.598 |

| Year | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 |
|---|---|---|---|---|---|---|
| Markup | 0.974*** | 1.381*** | 1.786*** | 1.508*** | 1.499*** | 0.676*** |
| | [0.111] | [0.168] | [0.143] | [0.179] | [0.411] | [0.248] |
| Other Covariates | Yes | Yes | Yes | Yes | Yes | Yes |
| Observations | 251,120 | 276,320 | 279,763 | 247,342 | 232,879 | 305,801 |
| Pseudo-R2 | 0.608 | 0.647 | 0.781 | 0.807 | 0.804 | 0.748 |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans originated in the relevant column year, with a calculated markup, subject to filters noted in the text. 
The models include all covariates in Table 3, but all coefficients except the markup coefficient are omitted from the table for brevity. The prepay equation is also omitted for brevity.

Markups and the Cost of Default

The conceptual framework suggests that the cost of default $D$ is an important determinant of both markups and default.

30 Figure A.1 in the Appendix shows that default costs (as measured by predetermined credit scores) are correlated with markups. As default costs rise (when credit score goes up), markups decrease. This is an interesting finding that confirms the theoretical predictions: lower default costs lead to higher reservation prices and, potentially, higher markups.

Does the relation between markups and default depend on $D$? We bring this question to the data because the theoretical model is unable to provide a direct answer.

31 Equation (3) shows that the probability of default $\theta=\Pr[P_1\le \gamma \mid b,p_0]=F_1(\gamma)$ is a function of both $p_0(D)$ and $D$, where $\gamma=b+(1+r)(1-\delta)p_0(D)-D$. Simple differentiation shows that $\frac{\partial^2 \theta}{\partial p_0\, \partial D}$ can be positive, negative, or even zero, depending on the nature of $F_1$ and the value of $\gamma$.

As a first step, we need to identify a measure of default costs. But measuring individual default costs is challenging, as they depend on a variety of idiosyncratic factors that are unique to each individual and generally unobserved. Certain variables, however, may be used as imperfect proxies for the cost of default. The first variable we consider is the individual's creditworthiness. Ceteris paribus, higher credit scores should reflect higher default costs and vice versa. Also note that default costs depend on local foreclosure laws. 
In judicial foreclosure states (where the lender has to file a lawsuit in court in order to foreclose), default costs should be lower compared to nonjudicial foreclosure states. These two variables, credit scores and local foreclosure laws, are used to test if the relation between markups and default depends on $D$.

Results are shown in Table 5. All models use a specification similar to that of Table 3, model 1, but in each column, the sample is restricted to subgroups with plausibly different levels of default costs. Two findings deserve discussion. First, note that the markup coefficient is positive, and both economically and statistically significant across specifications. The markup is a powerful predictor of default whether default costs are high or low. Second, the relationship between markups and default is stronger when default costs are higher; borrowers with lower default costs are less sensitive to variation in markups. Further research is needed to explain the mechanisms driving this heterogeneity.

Table 5. Markups and Loan Outcomes by Default Cost

| | Judicial State | Nonjudicial State | Credit Score [300,660) | Credit Score [660,720) | Credit Score [720,850] |
|---|---|---|---|---|---|
| Markup | 1.240*** | 1.505*** | 1.097*** | 1.268*** | 1.548*** |
| | [0.0682] | [0.0618] | [0.0774] | [0.0604] | [0.0651] |
| Other Covariates | Yes | Yes | Yes | Yes | Yes |
| Observations | 1,381,087 | 2,034,262 | 299,156 | 775,690 | 2,340,503 |
| Pseudo-R2 | 0.716 | 0.734 | 0.555 | 0.648 | 0.785 |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans subject to filters noted in the text. Models with “other covariates” include all covariates in Table 3, but all coefficients except the markup coefficient are omitted from the table for brevity. Judicial vs. 
nonjudicial foreclosure resolution state classifications are from Fannie Mae, found at https://singlefamily.fanniemae.com/media/6726/display.

Interactions of Markups with LTV Ranges

Because the relation between LTV and defaults is convex, it is logical that the effect of the markup conditional on LTV is also convex. For instance, at low LTVs, default rates are extremely low, and a negative markup is likely to have a de minimis effect on defaults. On the other hand, at high LTV ranges, a positive markup may be particularly dangerous. This is confirmed in Figure 5 in reduced form, which presents smoothed estimates of the probability of default as a function of the markup, segmented by LTV bucket. Negative markups exhibit a much smaller marginal effect on defaults than positive markups, and positive markups at higher LTVs have higher marginal effects than those at lower LTVs. Note that LTVs are important determinants for predicting future default, regardless of the markup. Figure 5 shows that there is not a one-for-one trade-off between LTV and markups: a loan with an ostensible LTV of 95% and a negative markup of 50% has a higher predicted default rate than a loan with an LTV of 85% and a positive 50% markup. Additionally, when LTV is modeled continuously alongside the markup, LTV remains the stronger predictor, but markups are also important (see Table A.2).

Figure 5. Distributions of HPI-constructed Markups and Defaults by Origination LTV Bucket. Notes: The sample includes loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. Lines present the polynomial-smoothed estimated default rate over all mortgages in the sample, estimated with a bandwidth of 0.15, using Stata.

Conditional estimates using a multinomial logit model with a full markup $\times$ LTV grid add further nuance. Table 6 shows estimates from this model. 
The omitted LTV bucket is the 97+ range, and the omitted markup bucket is for approximately 0 markup (the +/− 0.01 range). In this table, we can see that markups less than +0.10 for the 25–60 LTV range have no statistically distinguishable effect. These estimates suggest that low LTVs are mostly relevant to defaults, except when markups are extremely large. For moderate LTV ranges, effects of markups are mostly symmetric in log-odds units, giving a slightly convex marginal effect. For the highest LTV range, 97+ LTVs, there is no statistically distinguishable effect of positive markups at the 5% level. It should be noted, however, that the point estimates are all positive and of reasonably similar magnitude to the 91–96 LTV range, indicating similar but much less precise estimates. We attribute this to the relative infrequency of high-LTV loans in the database.[32]

[32] Most 97+ loans are intentionally filtered out. For parts of the sample, they represented questionable loan originations and, for other periods, they were not even permissible under underwriting standards.

Table 6. Logit Model with Markup-LTV Grid

| LTV Bucket | 25–60 | 61–75 | 76–80 | 81–90 | 91–96 | 97+ |
|---|---|---|---|---|---|---|
| Bucket | −3.035*** | −1.843*** | −1.194*** | −0.603*** | −0.349*** | |
| $\times$ Markup Bucket: | | | | | | |
| −0.50 to −0.21 | −0.0173 | −0.392** | −0.547*** | −0.422*** | −0.270*** | −0.477*** |
| −0.20 to −0.16 | −0.466* | −0.255* | −0.367*** | −0.225*** | −0.0703 | −0.209 |
| −0.15 to −0.11 | 0.122 | −0.116 | −0.196*** | −0.176*** | −0.0665 | −0.225* |
| −0.10 to −0.06 | 0.113 | −0.0493 | −0.106** | −0.0871 | 0.0278 | −0.267*** |
| −0.05 to −0.02 | 0.079 | 0.0762 | −0.0638 | −0.0554 | 0.014 | −0.0258 |
| 0.02–0.05 | 0.0141 | 0.0408 | 0.0183 | 0.0379 | 0.112** | 0.0314 |
| 0.06–0.10 | 0.271 | 0.247** | 0.117*** | 0.0942* | 0.172*** | 0.00597 |
| 0.11–0.15 | 0.360** | 0.329*** | 0.215*** | 0.207*** | 0.248*** | 0.1 |
| 0.16–0.20 | 0.483*** | 0.385*** | 0.340*** | 0.171** | 0.267*** | 0.199* |
| 0.21–0.50 | 0.550*** | 0.585*** | 0.468*** | 0.439*** | 0.395*** | −0.0529 |

Other Covariates: Yes. Observations: 3,415,349. Pseudo-$R^2$: 0.726.

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. The models include all covariates in Table 3, but all coefficients except the markup coefficients are omitted from the table for brevity. The prepay equation is also omitted for brevity.

CREDIT LOSSES CONDITIONAL ON DEFAULT

This section considers the intensive margin of defaults: the credit losses suffered by mortgage holders, conditional on default. As with the extensive margin, it is clear how markups could affect losses conditional on default based on the difference between perceived and actual collateral value. When there is a positive markup at origination, the collateral is less valuable than the LTV at origination indicates, and this may persist to an eventual REO sale, reducing recoveries.

The accounting of credit losses is constructed with the GSE performance data. Credit losses are defined as the net proceeds from the final REO sale, including additions from gross sale proceeds and MI payments, and subtractions from unpaid principal balance (UPB), legal fees, taxes, insurance, homeowners' association fees, and maintenance. To simplify the accounting and the notation in this section, the net loss ($L$) realized is expressed at the time of the REO sale ($t+1$) as the signed sum of the gross sale proceeds ($P$), MI claims ($I$), UPB ($u$), and an omnibus "other expenses" category ($e$):

$$\begin{equation} L_{t+1}= u_{t+1}+e_{t+1}-P_{t+1}-I_{t+1}. \tag{10} \end{equation}$$

Let us define the loss fraction of the final UPB as $\mathcal{L}_{t+1}=L_{t+1}/u_{t+1}$ and $i_{t+1}$ as the MI coverage ratio at the time of default. Additionally, recall that $LTV$ is the origination (time $t$) LTV ratio, $u_t/P_t$, $\Delta HPI$ is the change in the HPI over the time period, and $m_t$ is the markup at origination.
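As a numerical illustration of equation (10) and the loss fraction, the following minimal Python sketch may help. All dollar amounts are hypothetical and chosen only for illustration; none come from the GSE data. The sketch also checks that the ratio of sale proceeds to final UPB can be rewritten in terms of the origination LTV, the price change, and the change in UPB, under the assumption (ours, not stated in the text) that $\Delta u_{t+1}$ denotes $u_{t+1}-u_t$.

```python
def net_loss(upb, expenses, sale_proceeds, mi_claims):
    """Equation (10): L = u + e - P - I, all measured at the REO sale."""
    return upb + expenses - sale_proceeds - mi_claims


def loss_fraction(upb, expenses, sale_proceeds, mi_claims):
    """Net loss expressed as a fraction of the final UPB."""
    return net_loss(upb, expenses, sale_proceeds, mi_claims) / upb


# Hypothetical loan, for illustration only.
price_0 = 200_000.0   # purchase price at origination, P_t
upb_0 = 160_000.0     # balance at origination, u_t (LTV of 0.80)
upb_1 = 155_000.0     # balance at the REO sale, u_{t+1}
price_1 = 150_000.0   # gross REO sale proceeds, P_{t+1}
expenses = 15_000.0   # omnibus other expenses, e_{t+1}
mi_claims = 0.0       # MI claim payments, I_{t+1}

loss = net_loss(upb_1, expenses, price_1, mi_claims)
frac = loss_fraction(upb_1, expenses, price_1, mi_claims)

# The same loss fraction, with P_{t+1}/u_{t+1} rewritten using the
# origination LTV, the price change, and the change in UPB.
ltv = upb_0 / price_0
pct_price_change = price_1 / price_0 - 1.0
delta_upb_over_price = (upb_1 - upb_0) / price_0  # assumed sign convention
equity_term = (1.0 + pct_price_change) / (ltv + delta_upb_over_price)
frac_rewritten = 1.0 + expenses / upb_1 - equity_term - mi_claims / upb_1

print(f"net loss:      {loss:,.0f}")
print(f"loss fraction: {frac:.4f}")
print(f"rewritten:     {frac_rewritten:.4f}")
```

With these made-up inputs the net loss is 20,000, or roughly 12.9% of the final UPB, and the two ways of computing the loss fraction agree up to floating-point error.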
After some manipulation, equation (10) becomes the following:

$$\begin{equation} \mathcal{L}_{t+1}=1+\underbrace{\frac{e_{t+1}}{u_{t+1}}}_{\text{Expense Ratio}}-\underbrace{\frac{1+\%\Delta P_{t+1}}{LTV_t+\Delta u_{t+1}/P_t}}_{\text{Current Equity}}-\underbrace{i_{t+1}}_{\text{MI Payment Ratio}}. \tag{11} \end{equation}$$

Within "Current Equity," the term $(1+\%\Delta P_{t+1})$ represents the change in the sale price between the initial and REO sale. It includes the markup, the change in the average market value, and variation in the REO sale price from the market average. The term $\Delta u_{t+1}/P_t$ represents the contribution of principal payments to equity. Because this is typically positive on a standard amortization loan, this term usually represents decreases in losses in the case of default. Accordingly, higher LTVs, negative house price appreciation, higher origination markups, and slower amortization increase losses. Expenses and MI payments are expressed as fractions of the initial balance, with expenses adding to credit losses and MI reducing losses.

Empirical Model

A reduced-form expression for the loss fraction $\mathcal{L}_{t+1}$ is linearized and stochastically specified below. The markup is the treatment variable, and the remaining variables are included to avoid omitted variable bias. In this model, as with the default model, only partial correlations are captured, and these parameter estimates should not be interpreted as causal relations between the variables.

In place of the loan payoff fraction, the mortgage interest rate, $r_t$, is included and negatively affects amortization.
In place of the house appreciation, which includes the markup and the appreciation of the particular house, three terms are used: the change in the local HPI, $\Delta \ln HPI_t$; the markup; and a residual that absorbs any other random variation in the REO sale price. In specifications where loans have MI, the depth of coverage at the time of REO sale, $i_{t+1}$, is tracked. Additional controls include GSE, state, and origination month fixed effects to account for different Enterprise strategies for REO properties, recourse versus nonrecourse states and other legal issues, as well as macroeconomic factors related to the housing and mortgage finance system. The expenditure share is taken as a constant fraction, and is therefore subsumed within the fixed effects. Other factors affecting the amortization period and small factors due to linearization are subsumed within the error term.

$$\begin{equation} \mathcal{L}_{t+1}=\alpha_{it}+\beta_1 m_t+\beta_2 r_t+\beta_3 LTV_t+\beta_4\Delta \ln HPI_{t+1}+\beta_5 i_{t+1}+e_t. \tag{12} \end{equation}$$

A higher interest rate causes the average principal payment each period to decline in the early years of the amortization schedule, so the effect of the interest rate $r_t$ on the loss fraction is predicted to be positive. Variation across states in foreclosure premiums and in the time until foreclosure is captured by the fixed effects. A term for the MI depth of coverage is added in samples where coverage is present. While this is correlated with LTV, it typically increases in a stepwise fashion.
Both conditional and unconditional on LTV, the coefficient on coverage depth is predicted to be negative and between 0 and 1 in magnitude, on the basis that some nonzero fraction of the losses will be made up by the coverage payment, but this payment will not exceed the total losses.

Results

Estimates of this model are shown in Table 7, calculated over samples of purchase loans originated in the 2006–2007 period that eventually defaulted. Model 1 covers the sample of loans without MI, and model 2 considers loans with MI in effect. The sample is split into two subsamples based on the incidence of credit enhancements because it is possible that MI protects the creditor from losses on the loan associated with markups.

Table 7. Losses, Markups, and Mortgage Insurance
Dependent variable: Loss fraction of final UPB

| | [1] No MI coverage | [2] MI coverage |
|---|---|---|
| Markup | 0.196*** | 0.153*** |
| | [0.0123] | [0.0126] |
| LTV | 0.556*** | −0.215*** |
| | [0.0374] | [0.0832] |
| $\%\Delta$ House Price Index | −0.429*** | −0.419*** |
| | [0.0191] | [0.0226] |
| Mortgage Interest Rate | 0.105*** | 0.0728*** |
| | [0.00628] | [0.00547] |
| Mortgage Insurance Coverage Depth | | −0.443*** |
| | | [0.0551] |
| Constant | −0.737*** | 0.0894 |
| | [0.0444] | [0.0666] |
| Observations | 29,390 | 32,029 |
| $R^2$ | 0.294 | 0.244 |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans originated in 2006 or 2007, with a calculated markup, subject to filters noted in the text. The loss fraction is calculated as the proceeds net of expenses from property sale divided by the final unpaid principal balance on the mortgage. The MI coverage depth is defined as the percent of the UPB covered by mortgage insurance. GSE fixed effects are included in all specifications but omitted from the table.

In model 1, the markup coefficient is about 0.2, indicating that about one-fifth of the markup at origination is predictive of credit losses for loans without MI. For instance, suppose a 10% markup on a $500,000 house ($50,000).
This coefficient suggests that credit losses for the holder of the mortgage are about $10,000 higher than for an equivalent loan with no markup. Other control coefficients are consistent with comparative statics from equation (11), with LTV and the interest rate contributing positively to losses and house price appreciation contributing negatively.

Model 2 presents a similar story. The markup coefficient is a bit smaller at about 0.15, indicating that in loans with MI, the markup is still predictive of losses, but with a weaker relationship. This suggests that MI may alleviate some of the risk posed by markups to creditors. The role of MI is highlighted by the LTV parameter, which is now negative, implying that higher LTVs are associated with more-than-compensating MI depth and thus lower overall credit risk.

When considering both models together, the markup coefficient is remarkably robust across different samples and control variables, indicating that the markup calculated using an HPI is strongly associated with credit losses conditional on default.[33] MI may protect against credit losses associated with markups, but the estimated effect is small.

[33] While presented within the context of mortgages, this approach could be extended similarly to other financial products where credit risk involves not only the loss of an asset but also a cash flow or an associated value that may not be homogeneous for all assets within a class.

MECHANISMS

Why can markups predict default even after controlling for LTV? To answer this question, it is useful to note that the sale price ($P_0$) and the appraised value ($P_{app}$) mechanically determine a loan's $LTV$ at origination:

$$\begin{equation*} LTV=100 \times \frac{L}{\min \lbrace P_{app},P_0\rbrace }. \end{equation*}$$

If the "true" average value of a home were correctly captured by $P_0$ and/or $P_{app}$, the correlation between positive markups and future defaults should be partially (or completely) captured by the coefficient(s) on $LTV$ in a default model. However, both the transaction price and the appraisal may be inadequate measures of the housing unit's valuation for the following reasons. First, as discussed in the conceptual framework, $P_0$ captures conditions that are idiosyncratic to each transaction and that may not reflect average market conditions. A lender cares about the average and not the idiosyncratic component of the collateral's value. Second, previous studies have shown that the appraisal $P_{app}$ may be subject to appraisal bias: appraisers typically "ratify" selling prices that are above the index-based price of the house, and appraised values are often equal to (and rarely below) sales prices (see Nakamura 2010, Agarwal, Ben-David, and Yao 2013, Calem, Lambie-Hanson, and Nakamura 2015, Ding and Nakamura 2016, Shui and Murthy 2018, Calem et al. 2021). These factors suggest a mechanism for the predictability of markups: a failure of the average appraisal to accurately estimate the value of the mortgage collateral at origination.

While other mechanisms cannot be ruled out, two simple exercises provide substantial evidence reinforcing this claim. First, a new version of the markup is calculated using appraisals; this variable is highly right-censored at zero and does not appear to be associated with defaults.

Second, competing options default models are run with markups calculated using the standard HPI approach and then using the appraisal; the appraisal markup contributes almost no information and the HPI markup maintains its magnitude, sign, and significance.
These findings suggest that appraisals are missing important variation in the fundamental value of a home based on their failure to predict defaults, with potentially serious consequences.

Appraisal Bias

Appraisal markups are calculated as the difference between the sale price $P_0$ and the appraisal $P_{app}$. Because mortgage appraisals are used to estimate the collateral on a loan, an appraisal that is substantially lower than a transaction price will often cause a loan application to be rejected. If loan rejection were the only cause of a lack of mass in the right half of the distribution, it could be said that appraisers are fulfilling one of their primary objectives: to prevent a buyer from overpaying on a home, which, from the lender's perspective, would reflect lower relative collateral. However, excess mass exists at zero in the second panel of Figure 4, suggesting censoring rather than truncation in the distribution and reflecting substantial appraisal bias. For purchase mortgages, 37% of all appraisals are exactly equal to the transaction price and 49% are within 0.5% of the transaction price. Previous research has shown that an appraisal nearly equal to the contract price is an indicator of heightened risk (see Nakamura 2010, Agarwal, Ben-David, and Yao 2013, Calem, Lambie-Hanson, and Nakamura 2015, Ding and Nakamura 2016, Calem et al. 2021).

Appraisal Markups and Default

Across the distribution of appraisal-based markups, it appears as though markups have a slight negative correlation with defaults on a reduced-form basis. It is possible, however, that appraisal markups are predictive conditional on other observables. This may occur due to the mechanical interaction of the appraisal with the LTV: a negative appraisal markup results in a lower LTV, all else equal.
On the other hand, as with HPI markups, endogenous sorting into different markup values may occur and drive partial correlations within the data.

As before, the relation between appraisal markups and default is analyzed using a competing-options default model. Estimates are reported for three models in Table 8, each with the same baseline specification and sample as model 1 in Table 3. These models estimate default and prepayment log-odds parameters as a function of the markup and covariates. Model 1 uses HPI markups, and is identical to the earlier benchmark model. Model 2 estimates the bivariate relationship between appraisal markups and default, excluding all other covariates. Because of the heavy censoring at zero, and following the literature, a dummy variable is added for when the appraisal is within 0.5% of the contract price to indicate a potentially censored appraisal. The negative coefficient on the markup is consistent with the pattern shown in the bottom panel of Figure 4. The coefficient on the censoring dummy is positive, indicating heightened risk, consistent with Calem et al. (2021). Model 3 adds the same list of covariates as in the benchmark model. The coefficient on the appraisal markup switches sign and becomes similar to the HPI markup. This indicates substantial correlation between markups and other covariates, and omitted variable bias in model 2.

Table 8. Defaults and Markups Calculated Using HPIs versus Appraisals

| Markup variable(s) | HPI | Appraisal | Appraisal | Both |
|---|---|---|---|---|
| Model | [1] | [2] | [3] | [4] |
| Markup (HPI) | 1.380*** | | | 1.309*** |
| | [0.0470] | | | [0.0482] |
| Markup (Appraisal) | | −0.734* | 1.518*** | 0.432*** |
| | | [0.425] | [0.156] | [0.148] |
| Appraisal = Sale Price | | 0.233*** | 0.124*** | 0.112*** |
| | | [0.0455] | [0.0139] | [0.0132] |
| Other Covariates | Yes | No | Yes | Yes |
| Observations | 3,415,349 | 3,415,349 | 3,415,349 | 3,415,349 |
| Pseudo-$R^2$ | 0.726 | 0.00252 | 0.725 | 0.726 |
| Log-Likelihood | −531,837 | −1,934,434 | −533,143 | −531,683 |

Note: Values presented are the marginal log-odds estimates. Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes loans originated in 2001–2012, with a calculated markup, subject to filters noted in the text. The models include all covariates in Table 3, but all coefficients except those on the markup variables are omitted from the table for brevity. The prepay equation is also omitted. Model [1] is the same as in Table 3, model [1].

Finally, model 4 includes both the appraisal and HPI markups to assess the relative information contained in each of the markup variables. There are three factors to consider relative to models 1–3: the coefficients, the log-likelihoods, and the variation in the markups themselves. The coefficient on the appraisal markup falls by about 75% relative to model 3, from 1.5 to 0.4, while the parameter on the HPI markup is mostly maintained relative to model 1. This indicates that the HPI markup nearly encompasses the appraisal markup. The overall fit of the model is marginally lower using appraisal markups alone in model 3, with a pseudo-$R^2$ of 0.725 and a log-likelihood of −533,143 versus 0.726 and −531,873 for HPI markups in model 1. In the fourth model, when both appraisal and HPI markups are included, the log-likelihood is almost identical to model 1 at −531,683, implying little explanatory gain from adding the appraisal markup variables. Finally, while the parameter on the appraisal markup is about 1/4 the size of the HPI markup, the standard deviation of the appraisal markup is also much smaller at 4.4% versus the HPI markup's 16.6%. This suggests that the continuous portion of the appraisal markup explains about 1/16 of the default variation of the HPI markup.

Overall, these combined findings confirm that appraisals contain limited information beyond the observed LTV, and that collateral-related information is still predictive of defaults.[34]

[34] Appraisal bias is not necessarily inconsistent with a well-functioning mortgage market.
Valuing homes at purchase prices might be consistent with a market that is willing to accept a small number of additional foreclosures in return for a much higher number of home sales. We thank an anonymous referee for making this point.

CONCLUDING DISCUSSION

House price markups, calculated as the difference between a transaction price and a predicted price, are associated with mortgage delinquencies, defaults, prepayments, losses conditional on default, and loan modifications. Moreover, these associations are economically relevant: the difference between a −20% and +20% markup is a 50% increase in the default rate of a mortgage, holding all other characteristics of the loan and borrower constant. Because appraisals are biased toward the contract price, and the LTVs calculated using these appraisals give measures of expected defaults, appraisal bias may be an important factor in credit risk mismeasurement. The analysis has the dual strengths of being based on a set of modeling approaches rooted firmly in the literature and being estimated with a near-universe of house sales from a large pool of mortgages.

Markups are related to loan performance for fundamental reasons related to the microstructure of the housing and mortgage markets. As shown in the conceptual framework, markups do not cause default. Rather, the price paid for an asset is chosen simultaneously with the expected probability of default because housing is highly leveraged relative to expected house price appreciation, and the house itself serves as the collateral on the loan.

Although the estimates are specific to the market for residential real estate, a similar mechanism emerges in many settings: any market with both large spreads and high leverage, where the collateral on the loan is the purchased asset, likely exhibits a similar type of relation between markups and loan performance.
The market for auto loan debt is particularly similar, but this relation could also extend to insurance policies or even alternative investments like collectibles. These interesting applications are left as topics for further research.

APPENDIX

Table A.1. Markups and Loan Outcomes, Sample Robustness

| Sample | $\lvert M(t-1)\rvert < 5\%$ | Loan < \$100k | Loan > \$300k |
|---|---|---|---|
| Model | [A1] | [A2] | [A3] |
| Markup | 1.997*** | 1.310*** | 1.431*** |
| | [0.125] | [0.106] | [0.0623] |
| Observations | 428,609 | 444,921 | 601,853 |
| Pseudo-$R^2$ | 0.738 | 0.756 | 0.706 |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans subject to filters noted in the text. The models include all covariates in Table 3, but all coefficients except the markup coefficient are omitted from the table for brevity. Model A1 consists of a sample of loans where the prior markup is nearly zero. Models A2 and A3 are for different loan amount subsamples.

Table A.2. Markups and Loan Outcomes, Other Robustness

| | Linear CLTV | No Credit Score |
|---|---|---|
| Markup | 1.385*** | 1.398*** |
| | [0.0471] | [0.0478] |
| CLTV (linear) | 5.81*** | |
| | [0.161] | |
| Other Covariates | Yes | Yes |
| Observations | 3,415,349 | 3,415,349 |
| Pseudo-$R^2$ | 0.726 | 0.719 |

Note: Robust standard errors in brackets, clustered by State $\times$ Year. *** $p<0.01$, ** $p<0.05$, * $p<0.1$. The sample includes purchase loans subject to filters noted in the text. Models with "other covariates" include all covariates in Table 3, but all coefficients except the markup coefficient are omitted from the table for brevity.

Figure A.1. Distributions of HPI-constructed Markups and Default Costs Measured Using Credit Scores.
Notes: The sample includes loans originated in 2001 through 2012, with a calculated markup, subject to filters noted in the text. Credit scores are measured at origination.
The line presents the polynomial-smoothed estimated markup over all mortgages in the sample, estimated with a bandwidth of 200, using STATA.

LITERATURE CITED

Adelino, Manuel, Kristopher Gerardi, and Barney Hartman-Glaser. (2019) “Are Lemons Sold First? Dynamic Signaling in the Mortgage Market.” Journal of Financial Economics, 132, 1–25.
Agarwal, Sumit, Itzhak Ben-David, and Vincent Yao. (2013) “Collateral Valuation and Borrower Financial Constraints: Evidence from the Residential Real Estate Market.” Working Paper 19606, National Bureau of Economic Research.
Agarwal, Sumit, Itzhak Ben-David, and Vincent Yao. (2017) “Systematic Mistakes in the Mortgage Market and Lack of Financial Sophistication.” Journal of Financial Economics, 123, 42–58.
Anenberg, Elliot. (2016) “Information Frictions and Housing Market Dynamics.” International Economic Review, 57, 1449–79.
Bailey, Michael, Eduardo Davila, Theresa Kuchler, and Johannes Stroebel. (2017) “House Price Beliefs and Mortgage Leverage Choice.” National Bureau of Economic Research Working Paper Series.
Ben-David, Itzhak. (2011) “Financial Constraints and Inflated Home Prices during the Real Estate Boom.” American Economic Journal: Applied Economics, 3, 55–87.
Bogin, Alexander N., William M. Doerner, and William D. Larson. (2019a) “Local House Price Dynamics: New Indices and Stylized Facts.” Real Estate Economics, 47, 365–98.
Bogin, Alexander N., William M. Doerner, and William D. Larson. (2019b) “Missing the Mark: Mortgage Valuation Accuracy and Credit Modeling.” Financial Analysts Journal, 75, 32–47.
Calem, Paul, Jeanna Kenney, Lauren Lambie-Hanson, and Leonard Nakamura. (2021) “Appraising Home Purchase Appraisals.” Real Estate Economics, 49, 134–68.
Calem, Paul S., Lauren Lambie-Hanson, and Leonard I. Nakamura. (2015) “Information Losses in Home Purchase Appraisals.” Working Paper 15-11, Federal Reserve Bank of Philadelphia.
Campbell, John Y., Stefano Giglio, and Parag Pathak. (2011) “Forced Sales and House Prices.” American Economic Review, 101, 2108–31.
Campbell, Tim S., and J. Kimball Dietrich. (1983) “The Determinants of Default on Insured Conventional Residential Mortgage Loans.” Journal of Finance, 38, 1569–81.
Carrillo, Paul. (2013) “Testing for Fraud in the Residential Mortgage Market: How Much Did Early-Payment-Defaults Overpay for Housing?” Journal of Real Estate Finance and Economics, 47, 36–64.
Carrillo, Paul E. (2012) “An Empirical Stationary Equilibrium Search Model of the Housing Market.” International Economic Review, 53, 203–34.
Case, Karl E., and Robert J. Shiller. (1989) “The Efficiency of the Market for Single-Family Homes.” American Economic Review, 79, 125–37.
Chinco, Alex, and Christopher Mayer. (2016) “Misinformed Speculators and Mispricing in the Housing Market.” Review of Financial Studies, 29, 486–522.
Cox, David R. (1972) “Regression Models and Life-Tables.” Journal of the Royal Statistical Society. Series B (Methodological), 34, 187–220.
Davis, Donald R., and David E. Weinstein. (2002) “Bones, Bombs, and Break Points: The Geography of Economic Activity.” American Economic Review, 92, 1269–89.
Davis, Morris A., William D. Larson, Stephen D. Oliner, and Benjamin R. Smith. (2019) “A Quarter Century of Mortgage Risk.” Technical Report, Federal Housing Finance Agency.
Deng, Yongheng, John M. Quigley, and Robert Van Order. (2000) “Mortgage Terminations, Heterogeneity and the Exercise of Mortgage Options.” Econometrica, 68, 275–307.
Ding, Lei, and Leonard Nakamura. (2016) “The Impact of the Home Valuation Code of Conduct on Appraisal and Mortgage Outcomes.” Real Estate Economics, 44, 658–90.
Foote, Christopher, Kristopher Gerardi, Lorenz Goette, and Paul Willen. (2010) “Reducing Foreclosures: No Easy Answers.” NBER Macroeconomics Annual, 24, 89–138.
Foote, Christopher L., Kristopher Gerardi, and Paul S. Willen. (2008) “Negative Equity and Foreclosure: Theory and Evidence.” Journal of Urban Economics, 64, 234–45.
Genesove, David, and Christopher Mayer. (2001) “Loss Aversion and Seller Behavior: Evidence from the Housing Market.” Quarterly Journal of Economics, 116, 1233–60.
Ghent, Andra C., and Marianna Kudlyak. (2011) “Recourse and Residential Mortgage Default: Evidence from US States.” Review of Financial Studies, 24, 3139–86.
Glaeser, Edward L., and Charles G. Nathanson. (2017) “An Extrapolative Model of House Price Dynamics.” Journal of Financial Economics, 126, 147–70.
Goodman, Allen C., and Thomas G. Thibodeau. (2003) “Housing Market Segmentation and Hedonic Prediction Accuracy.” Journal of Housing Economics, 12, 181–201.
Han, Lu, and William C. Strange. (2015) “Chapter 13 - The Microstructure of Housing Markets: Search, Bargaining, and Brokerage.” In Handbook of Regional and Urban Economics, edited by J. Vernon Henderson, Gilles Duranton, and William C. Strange, vol. 5, pp. 813–86. Amsterdam: Elsevier.
Harding, John P., Stuart S. Rosenthal, and Clemon F. Sirmans. (2003) “Estimating Bargaining Power in the Market for Existing Homes.” Review of Economics and Statistics, 85, 178–88.
Hatchondo, Juan Carlos, Leonardo Martinez, and Juan M. Sanchez. (2015) “Mortgage Defaults.” Journal of Monetary Economics, 76, 173–90.
Jordà, Óscar, Moritz Schularick, and Alan M. Taylor. (2015) “Leveraged Bubbles.” Journal of Monetary Economics, 76, S1–S20.
Mayer, Christopher, Karen Pence, and Shane M. Sherlund. (2009) “The Rise in Mortgage Defaults.” Journal of Economic Perspectives, 23, 27–50.
Merlo, Antonio, Francois Ortalo-Magne, and John Rust. (2015) “The Home Selling Problem: Theory and Evidence.” International Economic Review, 56, 457–84.
Molloy, Raven, and Eric Nielsen. (2018) “How Can We Measure the Value of a Home? Comparing Model-Based Estimates with Owner-Occupant Estimates.” FEDS Notes, Board of Governors of the Federal Reserve System.
Mortensen, Dale T., and Christopher A. Pissarides. (1999) “Chapter 39 New Developments in Models of Search in the Labor Market.” In Handbook of Labor Economics, Vol. 3, pp. 2567–627. Amsterdam: Elsevier.
Nakamura, Leonard I. (2010) “How Much Is That Home Really Worth? Appraisal Bias and House-Price Uncertainty.” Business Review, Q1, 11–22.
Piskorski, Tomasz, Amit Seru, and Vikrant Vig. (2010) “Securitization and Distressed Loan Renegotiation: Evidence from the Subprime Mortgage Crisis.” Journal of Financial Economics, 97, 369–97.
Shui, Jessica, and Shriya Murthy. (2018) “Are Appraisal Management Companies Value-Adding? Stylized Facts from AMC and Non-AMC Appraisals.” Working Paper 18-01, Federal Housing Finance Agency.

Journal

Journal of Money, Credit and Banking, Wiley

Published: Jun 1, 2023

Keywords: appraisal bias; credit risk; collateral risk; house price; mortgage
