The Retail Habitat

  • Date: 2025-10-01
  • Authors: Toomas Laarits, Marco Sammon

  Abstract

  Retail investors trade hard-to-value stocks. Controlling for size, stocks with a high share of retail-initiated trades are composed of more intangible capital, have longer duration cash-flows and a higher likelihood of being mispriced. Consistent with retail-heavy stocks being harder to value, we document that such stocks are less sensitive to earnings news. As an additional consequence, the well-known earnings announcer risk premium is limited to low retail stocks only. Further, high-retail stocks are more sensitive to retail order flow and are especially expensive to trade around earnings announcements. Overall, the findings document a new dimension of investor heterogeneity and suggest the comparative advantage of retail in holding hard-to-value stocks.

A theme in recent asset pricing research has been to employ trading and ownership data to explain cross-sectional variation in returns. Motivated by this line of inquiry, we argue that the contrasting attributes of institutional and retail investors provide a particularly useful framework for capturing variation in trading and ownership patterns and, in turn, for identifying the drivers of risk and return.
Institutional investors are frequently spoken of in the finance literature as “smart money”: their portfolios are typically much larger than those of any individual investor, and they have the scale to acquire and process various kinds of data. This scale also allows them to better accommodate the fixed costs of trading, to take on leverage, and to short. Retail investors, by contrast, are seen as suffering from a variety of behavioral biases and cognitive errors, and as less equipped to do meaningful research. They are thus deemed “noise traders” in the sense of Black (1986). Yet retail investors do have advantages. As they are investing their own money, they have more control over the investment horizon, and they do not face flow sensitivity to recent performance. They are also not constrained by mandates restricting their investable universe or tying their compensation to performance relative to a benchmark.
Indeed, a number of empirical findings contradict a pure noise trader view. A prominent strand of work finds that retail trades, which are contrarian with respect to past returns, on aggregate predict future returns with a positive sign. On the other side of the retail-institutional divide, recent work by Di Maggio et al. (2023) documents that institutional investors trade away from stocks shortly before their earnings announcements, precisely when their presumptive informational advantage should be the strongest. Further, the pandemic-era surge in retail trading (Welch, 2022) and the response of the stock market to US government stimulus checks (Greenwood et al., 2023) have served as reminders of the potentially large effect of retail traders on markets.
These findings, taken together, suggest that a framework that goes beyond an informed versus uninformed dichotomy is required to account for the holdings and trading patterns of retail and institutional investors. Our argument is that the nature of the stocks heavily traded by retail and institutional investors plays a critical role. Specifically, we contend that the above empirical findings reflect the tendency of retail investors to trade stocks that are hard-to-value. These stocks typically have a weaker connection between current fundamentals and market values. For instance, current fundamentals might bear less on the firm’s prospects because its cash flows have a long duration or because its intangible capital comprises much of its total value. As an extreme example, consider a biotechnology firm. Earnings per share (EPS) may not be useful for valuing this firm because the success of a potential breakthrough product may not materialize as gradual EPS growth; it can arrive suddenly.
We argue that in hard-to-value stocks, the presumptive informational advantages of institutional investors are mitigated. To motivate this idea, we provide a model in the spirit of Kyle (1985). The model features an informed institutional investor who endogenously allocates a fixed attention budget across a set of stocks to maximize total expected trading profits. Importantly, the stocks differ in the cost of generating a signal of a given precision, capturing the notion that some stocks are harder to value than others. Retail order flow is modeled as unpredictable and unrelated to fundamentals, but retail traders’ behavior reflects an element of sophistication in that their trading intensity can depend on stock characteristics.
Where in the cross-section should this informed investor choose to produce information? At first blush, it may seem that a stock with high retail trader presence would be the best bet, as the informed investor can better hide their trades among the retail investors’ order flow. This logic only holds, however, if retail investors’ propensity to trade is equal across stocks. If, however, retail investors tend to trade hard-to-value stocks, informed investors may allocate less attention to these stocks, as their expected profits would be lower on account of a weaker informational advantage. In the presence of participation costs, informed investors may avoid learning about the stocks favored by retail investors altogether, leading to segmentation and a habitat of stocks favored by retail investors. Overall, the model illustrates that where informed investors choose to trade depends on the relative strength of these two forces: the intensity of retail trading and the propensity of retail traders to focus on hard-to-value stocks.
Our empirical analysis is motivated by the tension between these two forces. First, to establish a premise for our analysis—the existence of stocks with high retail investor trading interest (“high retail”)—we document new facts on the distribution of retail trading in the cross-section. We show that retail trading intensity is concentrated and persistent: roughly 90% of stocks in the top 20% of retail trading intensity at any given time remain in the top two quintiles of retail trading intensity 12 months later.
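The persistence statistic described above can be sketched as a simple quintile transition share. The following is a minimal illustration, assuming rank-based quintiles within each month; the function names, the toy tickers, and the breakpoint convention are hypothetical and not taken from the paper.

```python
def assign_quintiles(intensity):
    """Map each stock to a quintile 1..5 by rank of retail trading intensity.

    `intensity` is a dict {stock: retail share of trading volume}.
    Rank-based breakpoints are an assumption made for this sketch.
    """
    ranked = sorted(intensity, key=intensity.get)
    n = len(ranked)
    return {stock: 1 + (rank * 5) // n for rank, stock in enumerate(ranked)}


def top_quintile_persistence(intensity_now, intensity_later):
    """Share of today's top-quintile stocks in the top two quintiles later."""
    q_now = assign_quintiles(intensity_now)
    q_later = assign_quintiles(intensity_later)
    # Keep only stocks that are still observed in the later cross-section.
    top_now = [s for s, q in q_now.items() if q == 5 and s in q_later]
    if not top_now:
        return float("nan")
    stayed = sum(1 for s in top_now if q_later[s] >= 4)
    return stayed / len(top_now)


# Toy cross-section: ten hypothetical stocks; s8 loses its retail following.
now = {f"s{i}": float(i) for i in range(10)}
later = dict(now, s8=0.5)
share = top_quintile_persistence(now, later)
```

In this toy example one of the two top-quintile stocks stays in the top two quintiles, so the share is one half; the paper reports roughly 90% at a 12-month horizon.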
Having documented a persistent retail focus, we then seek to test which of the two forces suggested by our framework matters more empirically. We find that the concept of difficulty-to-value—defined in the model as the relative cost of obtaining a signal of a given precision—is a key characteristic for explaining the cross-sectional heterogeneity in retail trading intensity: retail investors trade hard-to-value stocks.
To establish this cross-sectional regularity, we employ three types of proxies of difficulty-to-value. First, we examine cash-flow duration (as constructed in Gormsen and Lazarus (2023)) under the view that firms with longer duration cash flows are harder to value because investors need to forecast fundamentals further into the future. Second, we examine measures of intangible capital (Peters and Taylor, 2017; Kogan et al., 2017), which is harder to value than physical capital (Lev and Gu, 2016). Third, we examine several composite measures: the mispricing score of Stambaugh and Yuan (2017), the valuation uncertainty score of Golubov and Konstantinidi (2023), and the hard-to-value score of Ben-David et al. (2023). Across these measures, we find that hard-to-value stocks see higher retail trading intensity, and all of these relationships hold when controlling for market capitalization quintiles. Combined with the evidence of retail trading persistence, these results establish a new way to capture the trading patterns of retail versus institutional investors.
In the second set of results, we demonstrate that this “retail sort” is particularly powerful in capturing differential return dynamics around earnings announcements in a way that is consistent with our contention that retail investors trade hard-to-value stocks. We find that high retail stocks have more volatile announcement news and returns, with a standard deviation of standardized unexpected earnings (SUE) that is almost three times as large for high retail stocks as for low retail stocks. We also find that the dispersion in analysts’ forecasts for high retail stocks is roughly five times as large as for low retail stocks. Analyst price and earnings forecasts are less accurate among stocks heavily traded by retail investors, and the selection of unskilled or attention-constrained analysts into high retail stocks does not appear to drive this inaccuracy.
These results comport with our claim that retail traders concentrate in stocks with weak links between observable fundamentals and market prices. To quantify this connection, we employ earnings-response regressions, following Kothari and Sloan (1992), and find that, for an earnings surprise of a given magnitude, the prices of high retail stocks respond significantly less to earnings news than those of low retail stocks. A stock in the highest quintile in terms of past retail trading share has an almost 50% lower sensitivity to SUE news than a stock in the middle quintile. This effect endures when we control for characteristics known to be correlated with retail activity and holds at almost every point along the firm size distribution.
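The comparison of earnings-response coefficients across retail quintiles can be illustrated with a stylized sketch. The code below is not the paper's specification: it runs a univariate regression of announcement returns on SUE separately for two groups, using simulated data whose "true" coefficients (1.0 and 0.5) are made-up values chosen only to mimic a roughly 50% lower sensitivity. All names are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope from a univariate OLS regression of y on x (with intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def simulate_group(true_erc, n, rng):
    """Simulated (SUE, announcement return) pairs; purely illustrative data."""
    sue = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ret = [true_erc * s + rng.gauss(0.0, 1.0) for s in sue]
    return sue, ret


rng = random.Random(0)  # fixed seed so the sketch is reproducible
erc_mid = ols_slope(*simulate_group(1.0, 5000, rng))   # middle retail quintile
erc_high = ols_slope(*simulate_group(0.5, 5000, rng))  # highest retail quintile
relative_sensitivity = erc_high / erc_mid
```

A pooled regression with an interaction between SUE and a high-retail indicator, plus the controls mentioned above, would be the natural extension; the split-sample version is used here only to keep the example short.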
The results so far establish descriptive differences in the composition of the high retail portfolio as well as dynamics around earnings announcements. In a third set of results, we show that these differences reflect the decisions of retail investors as a group. Specifically, we document that retail-heavy stocks see substantial retail-initiated buys in anticipation of earnings announcements. This net retail buying, normalized by respective daily trading volume, cumulates to over 2% in the run-up to earnings announcements. Retail traders hence actively trade to assume incremental earnings news risk of these stocks. Our result elucidates the phenomenon documented by Di Maggio et al. (2023) regarding the tendency of institutional investors to exit positions ahead of earnings announcements. We find that this effect is substantially larger among high retail stocks, suggesting institutional investors understand that hard-to-value stocks have volatile and idiosyncratic earnings-day returns and want to avoid this risk.
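The cumulative net retail buying measure can be sketched as follows, assuming daily retail-initiated buys, sells, and total volume are available in event time. The function names and the toy numbers are hypothetical; the steps (net buys scaled by volume, demeaned by an unconditional average, then cumulated from the start of the event window) follow the description in the text.

```python
from itertools import accumulate


def abnormal_net_buying(buys, sells, volume, unconditional_mean):
    """Daily (buys - sells) / volume, minus the series' unconditional mean."""
    return [
        (b - s) / v - unconditional_mean
        for b, s, v in zip(buys, sells, volume)
    ]


def cumulate(series):
    """Running sum, starting at the first day of the event window."""
    return list(accumulate(series))


# Toy three-day window with made-up share counts.
daily = abnormal_net_buying(
    buys=[120, 130, 110],
    sells=[100, 100, 100],
    volume=[1000, 1000, 1000],
    unconditional_mean=0.01,
)
cumulative = cumulate(daily)
```

In the paper the window starts 10 trading days before the announcement and the resulting series is averaged with equal weights within each retail quintile.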
The tendency of retail investors to buy in anticipation of earnings announcements is suggestive of liquidity provision. On the announcement level, we find that stocks heavily bought by retail investors in anticipation of earnings news releases outperform stocks heavily sold by retail investors over the subsequent 60 days, inclusive of the earnings day itself. This pattern, documented unconditionally by Kaniel et al. (2012), is particularly pronounced for stocks with high past retail trading. In a decomposition, we attribute about 25% of this predictable return differential to liquidity provision, emphasizing the active role retail traders take around earnings announcements.
To summarize, the announcement time returns of high retail stocks are more volatile and have a weaker relationship with earnings news. Further, retail investors take a disproportionate long position in these stocks before earnings announcements, consistent with liquidity provision. In the final set of results, we argue that these two relations lead to substantial differences in average returns earned over the announcement window.
As established in a long literature, starting with Beaver (1968), stocks tend to earn high average returns when they are scheduled to make earnings announcements. A potential explanation for this earnings announcer premium is that announcing firms provide information about non-announcing firms and the premium compensates investors for exposure to systematic risk, as argued by Savor and Wilson (2016). Our expectation therefore is that the premium should be smaller for high retail stocks, as we have shown their earnings news to mostly reflect idiosyncratic information. This is what we find: high retail stocks consistently see lower announcement time returns. Unconditionally, stocks earn a six-day earnings announcement premium of 32 basis points (bps). However, those in the highest retail trading quintile see an average return of negative 48 bps over the same time window. This gap in returns stems from high retail stocks having much lower average returns in response to the earnings news. In pre-announcement returns, the pattern flips: high retail stocks earn higher returns than low retail stocks. High retail stocks therefore contribute to the puzzle (as documented by Frazzini and Lamont (2007)) of high pre-announcement returns. That pre-announcement returns are highest among stocks that institutional investors exit in anticipation of earnings announcements suggests that liquidity provision helps drive pre-announcement returns.
Finally, note that our proxy for retail trading intensity is based on the algorithm developed by Boehmer et al. (2021). Several recent studies have raised concerns about the algorithm both failing to capture true retail trades and incorrectly classifying institutional trades as retail-initiated (Barber et al., 2024; Battalio et al., 2023). In Online Appendix A.2, we argue that these issues can have only a minor impact on our results. A quantitative assessment of the documented biases in the algorithm shows that the magnitude of classification errors is too small to explain the spread we observe in retail trading intensity. Further, we confirm that all our results endure when using the improved trade direction classification algorithm proposed by Barber et al. (2024).

  0.1. Connection to the literature

Our work contributes to the research highlighting the importance of investor heterogeneity and less-than-perfect risk-sharing in determining the risk-return trade-off in security prices. One part of this work seeks to estimate demand curves of different investor classes as functions of various characteristics (Koijen and Yogo, 2019; Koijen et al., 2024; McLean et al., 2020; Haddad et al., 2025; van der Beck, 2022). We document a new point of distinction in the trading habits of two principal investor classes: retail and institutional investors. Other recent work by Balasubramaniam et al. (2023) and Gabaix et al. (2023) has studied the portfolios of retail investors specifically. Balasubramaniam et al. (2023) use account-level data from India to document the role of characteristics in attracting retail holdings. They find that firm age and nominal price and, to a weaker degree, turnover and recent returns best capture the heterogeneity in retail holding intensity. Our aggregate retail trading data are consistent with a retail focus on firm age and nominal price as well as turnover and past returns while pointing to a unifier of these regularities.
Outside of that recent work, the literature has devoted surprisingly little attention to the determinants of retail trading and holdings in the cross-section. Most of the literature has focused on behavioral frictions that bring stocks to the attention of retail investors. However, we find substantial and persistent cross-sectional heterogeneity in retail trading intensity, which can be explained by a metric that is not an obvious function of past returns, betas, or accounting metrics. Our results add to the literature by suggesting that difficult-to-value stocks attract retail attention or, equivalently, deter institutional investors.
This aspect of retail selection allows us to reconcile two broad, seemingly contradictory aspects of retail investing. On one hand, research has repeatedly found that retail trades, on aggregate, positively predict stock returns going forward. For example, Kaniel et al. (2012) show that the direction and magnitude of retail order flow predict returns on and after earnings announcements. Along the same lines, in more recent work, Welch (2022) documents that Robinhood investors as a group did well in 2020–2021. On the other hand, retail traders have been shown to suffer from behavioral biases, including excessive trading (Barber and Odean, 2000, 2002), familiarity bias (Huberman, 2001; Seasholes and Zhu, 2010), extrapolation (Benartzi, 2001), and the disposition effect (Odean, 1998; Dhar and Zhu, 2006; Vaarmets et al., 2019). Moreover, the relaxation of retail investors’ budget constraints leads to rallies among retail-heavy stocks (Greenwood et al., 2023). Taken as a whole, our results suggest that, because of retail traders’ preference for hard-to-value stocks, these biases and predictable errors are particularly hard for professional investors to correct.
More broadly, our results can be used to recast several results in the asset pricing literature. First, the literature has shown significant effects of retail investor buying on stock prices (Kumar and Lee, 2006; Greenwood et al., 2023). Our results on the concentration of retail investors’ trading as well as the types of stocks they prefer may explain why retail investors can have such a large effect on prices, despite their small share of overall stock market wealth. Second, the focus on hard-to-value stocks can explain why retail order flow strongly predicts returns going forward, as documented by Kaniel et al. (2012). In fact, we show that this predictability is particularly pronounced among high retail share stocks. Given that retail order flow is persistent, it may be difficult for institutional investors to maintain bets against retail order flow long enough to benefit from long-run reversion (De Long et al., 1990). Further, we believe our results speak to the literature on how inventory risk borne by intermediaries can lead to high pre-earnings-announcement returns (Johnson and So, 2018). Specifically, we find that high retail stocks have earnings-announcement returns that are significantly more volatile than those of low retail stocks. Market makers holding inventory therefore may offer a premium to retail investors buying ahead of the announcement, as this would reduce their naturally long exposure to this risk. Finally, we show that stocks favored by retail investors have very low or high mispricing scores (constructed using the data from Stambaugh and Yuan (2017)), suggesting they are often in the extreme ends of anomaly portfolios. This opens the door for retail investors to directly contribute to anomaly returns, as their trading in these stocks impedes institutional investors from trying to correct any mispricing.

  1. Hypothesis development

In this section, we outline a model and our empirical predictions.

  1.1. Motivation

Consider a model in the spirit of Kyle (1985) with multiple securities. An informed insider, representing institutional investors, has a total attention budget allocated across stocks to maximize their total expected trading profits. Apportioning more attention to a given stock increases the insider’s signal precision, but learning about any given asset entails diminishing marginal returns. We allow the securities to be heterogeneous in two ways: (A) in the level of noise trading intensity, standing in for differences in the intensity of retail trading, and (B) in the cost of generating a signal of a given precision, standing in for the difficulty of valuing the stock. Finally, the model features fixed participation costs that informed investors must pay if they want to trade a given security. Within this model we ask: where in the cross-section would the institutional investor find it most profitable to produce information?
When assets differ only in the intensity of noise trading, informed investors allocate more attention to those with higher retail activity. This reflects the standard intuition that noise traders help mask informed investors’ trades. However, cross-sectional differences in the informed investor’s ability to generate a signal can overturn this relationship. Faced with different costs of signal production, all else equal, the informed investor would allocate more attention to the stocks with lower information production costs. To the extent that the stocks retail investors prefer are precisely the ones with high information production costs, this second force can outweigh the expected benefits of hiding among retail order flow.
The two-period Kyle (1985) model described in Online Appendix A.1 allows us to illustrate this point. We specify the model with five assets featuring different levels of noise trader order flow volatility and, potentially, differences in information acquisition costs and explore the resulting attention allocation of the informed investor.
First, we consider the case with no cross-sectional differences in the cost of producing a signal of a given precision. As illustrated in the top left panel of Fig. 1, the informed investor’s optimal attention allocation increases monotonically with noise trading intensity. This relationship aligns with the standard intuition from Kyle (1985): all else equal, an informed trader’s expected profits are increasing in noise trader activity. As a result, when equating the marginal benefits of learning across assets, the informed investor concentrates on those with higher noise trading intensity. Introducing fixed participation costs at the asset level reinforces this emphasis. As shown in the bottom left panel of Fig. 1, with positive participation costs, the stocks with the lowest noise trading share can receive zero attention. These fixed participation costs aim to capture several features of the investment process, such as the effort required to identify viable investments before conducting deeper research, maintain coverage of existing positions, and manage the operational burden of justifying underperforming positions.

Fig. 1. Optimal attention allocation of the informed investor. The five bars in each panel represent five stocks that differ in noise trading intensity: the standard deviation of noise trading in assets 1 to 5 is 1, 1.25, 1.5, 1.75 and 2, respectively. Assets potentially differ in the inverse cost factor, the rate at which attention translates into improved signal precision. In the left column both parametrizations have constant inverse cost factors; in the right column the inverse cost factors are decreasing going from asset 1 to asset 5. Total attention budget is fixed at 15. Fundamental volatility (σν) is fixed at 0.25.

Fig. 2. Retail share of trading volume. The average retail share of trading volume in the top and bottom quintiles sorted on previous month’s retail trading intensity.

Fig. 3. Abnormal net trading around earnings announcements. Daily net trading (retail-initiated buys minus retail-initiated sells, measured in shares), normalized by aggregate trading volume or shares outstanding. We subtract out the unconditional means in respective series to construct an abnormal measure and take an equal-weighted average within each quintile. Q1 represents the bottom quintile of retail intensity, while Q5 represents the top quintile. Bottom panels cumulate the values in top panels starting at time −10 relative to earnings announcement day at time 0.

Second, we consider the case with cross-sectional differences in information production costs. We introduce the parameter α_i, which captures the ease of producing a signal of a given precision in stock i. Low values of α_i make signals of a given precision costlier to obtain, capturing the concept of a hard-to-value stock. What is more, while we still model retail investors’ order flow as uncorrelated with fundamentals, we allow their trading intensity to be correlated with the difficulty of valuing a particular stock. In other words, we specify low values of the parameter α_i for precisely the stocks where noise trading is the highest. A correlation between difficulty-to-value and retail interest might stem from various behavioral effects, such as hard-to-value stocks being more attention grabbing, or it could stem from retail investors evincing a degree of sophistication and recognizing their informational disadvantage among relatively easy-to-value securities.

The top right panel of Fig. 1 shows that such retail investor focus on hard-to-value stocks can flip the results in the top left panel. In this particular parametrization, the informed investor’s attention allocation monotonically decreases with noise trading intensity, the opposite pattern relative to the set-up with equal learning costs across assets. While the informed trader would still prefer to trade in securities with more noise traders, the signal production costs of these stocks are high enough to push them into allocating most of their attention to low noise trader intensity stocks. Therefore, equalizing the marginal benefits of learning across the five assets results in the informed investor devoting more attention to the low noise trading stocks.
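The two forces can be illustrated numerically with a deliberately simplified stand-in for the model. The sketch below is not the two-period model of Online Appendix A.1: it assumes a one-period Kyle (1985) setting for each asset, no participation costs, and a signal equal to the fundamental plus noise with variance σ_v²/(α_i a_i), where a_i is attention. Under these assumptions expected profit in asset i is 0.5 · σ_u,i · σ_v · sqrt(x_i / (1 + x_i)) with x_i = α_i a_i. The noise trading volatilities, the budget of 15, and σ_v = 0.25 follow the Fig. 1 caption; the α values and all function names are hypothetical.

```python
import math

SIGMA_V = 0.25  # fundamental volatility, as in the Fig. 1 caption


def marginal_profit(a, sigma_u, alpha):
    """Derivative of 0.5*sigma_u*SIGMA_V*sqrt(x/(1+x)) w.r.t. a, x = alpha*a."""
    x = alpha * a
    return 0.25 * sigma_u * SIGMA_V * alpha * x ** -0.5 * (1.0 + x) ** -1.5


def attention_given_multiplier(mu, sigma_u, alpha):
    """Attention at which marginal profit equals the shadow price mu."""
    lo, hi = 1e-12, 1e6
    for _ in range(200):  # bisection; marginal profit is decreasing in a
        mid = 0.5 * (lo + hi)
        if marginal_profit(mid, sigma_u, alpha) > mu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def allocate_attention(sigma_u, alpha, budget):
    """Equalize marginal profits across assets subject to the budget."""
    lo, hi = 1e-12, 1e3
    for _ in range(200):  # geometric bisection on the shadow price
        mu = math.sqrt(lo * hi)
        total = sum(
            attention_given_multiplier(mu, s, al)
            for s, al in zip(sigma_u, alpha)
        )
        if total > budget:
            lo = mu  # too much attention demanded, so raise the shadow price
        else:
            hi = mu
    mu = math.sqrt(lo * hi)
    return [attention_given_multiplier(mu, s, al) for s, al in zip(sigma_u, alpha)]


noise_sd = [1.0, 1.25, 1.5, 1.75, 2.0]
equal_cost = allocate_attention(noise_sd, [0.01] * 5, budget=15.0)
hard_to_value = allocate_attention(
    noise_sd, [0.008, 0.004, 0.002, 0.001, 0.0005], budget=15.0
)
```

With identical α the allocation rises with noise trading intensity, while with the steeply declining hypothetical α values it falls. Whether the ordering flips depends on how quickly α declines relative to the rise in noise trading, which is the tension the text describes.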

Like before, the introduction of fixed costs can exclude certain securities from the informed investor’s consideration altogether. The bottom right panel of Fig. 1 shows that, as fixed costs increase from 0 to 2, the informed investor stops learning about the highest retail interest securities #4 and #5. This phenomenon illustrates our notion of the retail habitat: a subset of stocks that—due to being hard to value in the sense that signal generation is costly—are heavily traded by retail investors and avoided by institutional investors.

To summarize, the model in Online Appendix A.1 highlights two competing forces: informed investors’ desire to hide their trades among noise traders versus the precision of their signal. The results from the model suggest that which of these forces dominates depends on whether retail investors have a persistent habitat of hard-to-value stocks and therefore is an empirical question.

  1.2. Cross-sectional heterogeneity in retail trading intensity

Motivated by the model, we first seek to establish which of these two forces (hiding among retail order flow versus signal precision) dominates. In the model, a stock is harder to value if the informed investor must expend more effort to obtain a signal of a given precision in that stock. While we cannot observe this effort, we aim to identify empirical characteristics that capture the notion of stocks that require more learning to accurately predict their future value. Importantly, in the model, the terminal payoff of the asset is explicitly tied to its fundamental value. We believe, however, that the model can capture broader notions of hard to value. If instead, for example, informed investors were to receive signals about terminal prices rather than fundamentals, difficulty to value could also capture the idea of early liquidation risk in the sense of De Long et al. (1990).
Empirically, we show that the concept of difficulty-to-value, referring to stocks whose valuation is not strongly related to their current financial performance, powerfully summarizes retail trading interest. With that result in hand, we develop three sets of predictions regarding the stocks heavily traded by retail investors.

  1.3. Predictions on retail trading and earnings announcement dynamics

Our first set of predictions concerns the differences in earnings announcement dynamics, regarding both earnings news and the resulting returns. Given the difficulty of producing a signal regarding the value of stocks heavily traded by retail investors, we predict:
Prediction 1A: High retail stocks should have earnings surprises of larger magnitude and more volatile earnings-day returns. In addition, high retail stocks should have more dispersion in analysts’ forecasts.
Given that high retail stocks are hard to value, any news about current cash-flows will have a smaller effect on current prices. Additionally, in hard-to-value stocks, different investors may focus on different pieces of the news, leading to more disagreement and an underreaction to news. These considerations yield the following prediction:
Prediction 1B: High retail stocks should respond less to earnings news. Their earnings surprises should be mostly driven by idiosyncratic news.
Prediction 1B suggests a possible tension with Prediction 1A: if high retail stocks react less to earnings news, we might expect them to have less volatile returns on earnings days. Therefore, it is an empirical question whether the effect of larger earnings surprises dominates the effect of reduced responses to earnings news of a given size.
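One way to see how the two predictions can coexist is a variance decomposition under a linear earnings-response relation. This is an illustrative assumption rather than part of the model: the symbols β_i (response coefficient) and ε_i (non-earnings return component, assumed uncorrelated with SUE) are introduced here only for the sketch.

```latex
r_i = a + \beta_i\,\mathrm{SUE}_i + \varepsilon_i
\quad\Longrightarrow\quad
\operatorname{Var}(r_i) = \beta_i^{2}\,\operatorname{Var}(\mathrm{SUE}_i) + \operatorname{Var}(\varepsilon_i).
```

As a purely hypothetical example, if β_i is halved while the standard deviation of SUE triples, the news-driven component β_i² Var(SUE_i) changes by a factor of (0.5 × 3)² = 2.25. Earnings-day returns can therefore be more volatile even though the response per unit of surprise is weaker.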

  1.4. Prediction on retail trading around earnings announcements

Our second prediction ascribes an active role to retail investors around earnings announcements, particularly with respect to liquidity provision. As with the first set of predictions, we consider the cross-section of stocks as a function of their prior intensity of retail-initiated trades. If, according to Prediction 1B, high retail stocks are less sensitive to earnings news, we anticipate the presumptive advantage of institutional investors in trading based on fundamental signals to be mitigated around these events.
For that reason, we expect institutional investors to comprise a relatively low share of trading in retail-favored stocks around earnings announcements. What is more, prior work has documented that institutional investors tend to trade away from announcing stocks (Di Maggio et al., 2023). We hypothesize that this behavior is particularly pronounced among the hard-to-value stocks, in keeping with institutional investors’ informational advantage being the smallest in that subset.
Prediction 2: High retail stocks should have a high share of retail trading intensity around earnings announcements. Around the scheduled earnings announcement, retail investors act as liquidity providers and hold an outsized position in announcing stocks.

  1.5. Prediction on retail trading and the earnings announcement premium

Finally, the differing earnings information dynamics across the retail sort should be evident in average returns. Savor and Wilson (2016) argue that the announcement risk premium—the positive average returns earned by announcing firms—derives from the systematic value-relevant information in announcements regarding non-announcing firms. This mechanism, however, is unlikely to apply to high retail trading intensity firms. Specifically, because these stocks are hard to value, the information contained in a given earnings announcement is likely to be mostly idiosyncratic and carry little information about other stocks. For that reason, we anticipate the earnings announcement premium to decrease with the intensity of retail trading:
Prediction 3: High retail stocks should have a lower earnings announcement premium.
