How do women react to higher standards?
As a final exercise, I investigate how women react to higher standards as they update beliefs about referees' expectations. Figure 5 compares papers pre- and post-review at increasing publication counts. Solid circles denote NBER draft readability; arrow tips reflect readability in the final, published versions of those same papers; dashed lines trace changes made as papers undergo peer review.
Figure 5 Readability of authors' t-th publication (draft and final versions)

Note: Flesch Reading Ease marginal mean scores for authors' first, second, third, fourth to fifth, and sixth and later publications in the data. Solid circles denote estimated readability of NBER working papers; arrow tips show the estimated readability in the published versions of the same papers. Pink represents women co-authoring only with other women; blue represents men co-authoring only with other men.
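For readers unfamiliar with the measure, the Flesch Reading Ease score plotted in Figure 5 is a standard formula based on average sentence length and average syllables per word; higher scores indicate easier text. The sketch below is only an illustration of that formula, not the code used in the underlying paper. The function name and the example counts are hypothetical, and word, sentence and syllable counts are assumed to have been computed elsewhere.

```python
def flesch_reading_ease(total_words: int, total_sentences: int, total_syllables: int) -> float:
    """Standard Flesch Reading Ease formula (higher = easier to read).

    Hypothetical helper for illustration; the counts are assumed to be
    supplied by the caller (syllable counting is not handled here).
    """
    if total_words <= 0 or total_sentences <= 0:
        raise ValueError("word and sentence counts must be positive")
    words_per_sentence = total_words / total_sentences
    syllables_per_word = total_syllables / total_words
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Made-up counts for a draft abstract and its revised version (illustrative only).
draft_score = flesch_reading_ease(total_words=100, total_sentences=4, total_syllables=170)
final_score = flesch_reading_ease(total_words=100, total_sentences=5, total_syllables=165)
print(f"change during review: {final_score - draft_score:+.2f} points")
```

In this made-up case the revised text has shorter sentences and fewer syllables per word, so its score rises; the arrows in Figure 5 represent changes of this kind between draft and published versions.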
All things equal, economists who anticipate referees' demands are rejected less often; economists who do not, by contrast, enjoy more free time. Figure 5 implies little, if any, gender difference in this trade-off – senior economists of both sexes sacrifice time upfront to increase acceptance rates.
Moreover, Figure 5 emphasises that only inexperienced women make changes during peer review. Assuming choices by senior economists express optimal tradeoffs with full information, this implies that women initially underestimate referees’ expectations.
Men, however, do not. Draft and final readability choices remain relatively stable over the course of their careers.
Are men just better informed about referees' expectations? Yes and no. Male and female draft readability scores for first-time publications are exactly the same. This suggests that men and women start out with identical beliefs. But those beliefs reflect standards that apply only to men. Women are mistaken in thinking those standards apply to them, too.
Policy implications
Figure 5 suggests that women respond to biased treatment in ways that not only obscure the line between personal preferences and external constraints, but can paint a rosier picture than even preferences justify. This raises a couple of concerns about identifying discrimination from narrow viewpoints. For example, if we only concentrate attention on a cross-section of papers written by senior economists, we might conclude that women simply prefer writing more clearly. Alternatively, if we limit our focus to the gap formed inside peer review, we might decide it declines with experience.
But neither conclusion is supported when the data are analysed from a broader perspective. A smaller gap in peer review is completely offset by a wider gap before peer review. Senior female economists did not enjoy writing so well when they were junior economists.
My evidence also emphasises that discrimination impacts more than just obvious outcomes. It corrupts productivity, too. Work that is evaluated more critically at any point in the production process will be systematically better (holding prices fixed) or systematically cheaper (holding quality fixed). This reduces women’s wages (for example, if judges require better writing in female-authored briefs, female attorneys must charge lower fees and/or under-report hours to compete with men) and distorts measurement of female productivity (billable hours and client revenue decline; female lawyers appear less productive than they truly are).
Unfortunately, there is no easy way to eliminate implicit bias. But the least intrusive – and arguably most effective – remedy is simple awareness and constant supervision. Monitoring referee reports is difficult but it isn’t impossible, especially if peer review were open. Several science and medical journals not only reveal referees’ identities, they also post reports online. Quality does not decline (it may actually increase); referees still referee (even those who initially refuse) (van Rooyen et al. 1999). And given what’s at stake, is spending an extra 25–50 minutes reviewing a paper really all that bad (van Rooyen et al. 2010)?
References
Azmat, G and R Ferrer (2017), "Gender Gaps in Performance: Evidence from Young Lawyers", Journal of Political Economy 125(5): 1306-1355.
Bloor, K, N Freemantle and A Maynard (2008), "Gender and variation in activity rates of hospital consultants", Journal of the Royal Society of Medicine 101(1): 27-33.
Ceci, S J, D K Ginther, S Kahn and W M Williams (2014), "Women in Academic Science: A Changing Landscape", Psychological Science in the Public Interest 15(3): 75-141.
Ellison, G (2002), "The Slowdown of the Economics Publishing Process", Journal of Political Economy 110(5): 947-993.
Goldberg, P K (2015), "Report of the Editor: American Economic Review", American Economic Review 105(5): 698-710.
Hartley, J, J W Pennebaker and C Fox (2003), "Abstracts, introductions and discussions: How far do they differ in style?", Scientometrics 57(3): 389-398.
Hatamyar, P W and K M Simmons (2004), "Are Women More Ethical Lawyers? An Empirical Study", Florida State University Law Review 31(4): 785-858.
Hengel, E (2017), "Publishing while female: Are women held to higher standards? Evidence from peer review", mimeo.
Salter, S P, F M Mixon and E W King (2012), "Broker beauty and boon: a study of physical attractiveness and its effect on real estate brokers’ income and productivity", Applied Financial Economics 22(10): 811-825.
Seagraves, P and P Gallimore (2013), "The Gender Gap in Real Estate Sales: Negotiation Skill or Agent Selection?", Real Estate Economics 41(3): 600-631.
Tsugawa, Y, A B Jena, J F Figueroa, E J Orav, D M Blumenthal and A K Jha (2017), "Comparison of Hospital Mortality and Readmission Rates for Medicare Patients Treated by Male vs Female Physicians", JAMA Internal Medicine 177(2): 206-218.
van Rooyen, S, T Delamothe and S J Evans (2010), "Effect on peer review of telling reviewers that their signed reviews might be posted on the web: randomised controlled trial", British Medical Journal 341(c5729).
van Rooyen, S, F Godlee, S Evans, N Black and R Smith (1999), "Effect of open peer review on quality of reviews and on reviewers' recommendations: a randomised trial", British Medical Journal 318(7175): 23-27.
Walsh, E, L Appleby and G Wilkinson (2000), "Open peer review: a randomised controlled trial", British Journal of Psychiatry 176(1): 47-51.
Endnotes
[1] The ‘publishing paradox’ and ‘leaky pipeline’ refer to phenomena in academia whereby women publish fewer papers and disproportionately leave the profession, respectively.
[2] Readability scores are highly correlated across an article’s abstract, introduction and discussion sections (Hartley et al. 2003).
[3] NBER persistently releases its working papers two to three years before publication (mean 2.1 years), precisely the length of time papers spend in peer review (Goldberg 2015, Ellison 2002).
[4] This estimate averages results over all five scores. It assumes women are accepted in a subset of the states in which men are accepted and that within-pair differences are zero for the 30–40% of matched pairs that fail to satisfy conditions 1 and 2. See Hengel (2017) for alternative estimates based on weaker assumptions. (Conclusions drawn from those estimates mirror the conclusions discussed here.)
[5] Ellison (2002) evaluates how non-gender author compositional effects contribute to higher mean-accept times at the American Economic Review, Econometrica, Journal of Political Economy, Quarterly Journal of Economics and Review of Economic Studies. (Although his analysis controls for female authorship, it did not investigate gender differences specifically.)