23rd September 2023, 16 min read

Malcolm Gladwell: Meritocracies don't work

Malcolm Gladwell was invited to Google Zeitgeist again. He gave a talk on meritocracies and their failures. This is something of a follow-up to his earlier talk at Google Zeitgeist. I wrote about that earlier talk here: Malcolm Gladwell: Don't go to Harvard, go to the Lousy Schools!.

The talk is here:

Meritocracy, a ruling system based on merit:

the notion of a political system in which economic goods or political power are vested in individual people based on ability and talent, rather than wealth or social class. Advancement in such a system is based on performance, as measured through examination or demonstrated achievement.

At first sight, this looks like it would favor good outcomes. Malcolm Gladwell analyzes a number of pitfalls.

Below is the transcript of the talk directly from YouTube:

It's a real pleasure to be invited back to Google's Zeitgeist. I think the last time I spoke at this was many, many years ago in Phoenix, and if memory serves, my talk was a critical examination of my decision to agree to talk at Google Zeitgeist. Incredibly, I got invited back, and so I thought as an encore what I would do is do a critical examination of why all of you were invited to Google Zeitgeist. Now, there is a standard answer to that, which is that this is a gathering of the best and the brightest and all of you have reason to believe that you are the best and the brightest. But my question is: How do you know you're the best and the brightest? And what I want to suggest this morning is that there is a great deal more uncertainty over that question than you may care to admit, and that paradoxically, this is a very good thing. So, I want to focus in the brief time that I have on the role of gatekeepers, because meritocracies of the sort that we've erected in our world are run by gatekeepers, and I would like to advance a series of propositions to suggest that gatekeepers are really, really bad at what they do. So, there are going to be four of these propositions.

Proposition 1. Gatekeepers very often do not understand the meritocracy that they are supposed to be keeping.

Proposition 1 is that gatekeepers very often do not understand the meritocracy that they're supposed to be watching the gates for. So, tons of examples, but the one I will focus on is the NIH, the National Institutes of Health. This is one of the most consequential meritocracies in the world, probably. NIH has a budget of 40 billion dollars a year. They get 80,000 grant applications a year, which represents an extraordinary percentage of the most crucial research we do in the world, and they put together groups of experts who grade each one of those grant applications on a scale of 10 to 90 where 10 is fantastic and 90 is terrible, right. So, this is a classic meritocracy guarded by a group of expert gatekeepers.

So, a couple years ago the guy, the Deputy Director of extramural research at NIH, the guy running this process, decides to try and verify how good the process is, right. So, when you do a score on a grant application, you're making a prediction of how good you think that research is going to turn out to be. So, his question was, well, how good are these predictions? He does a really simple analysis. In medicine, the way we judge the quality of research is how many citations are made to that research once it's finished. He says, let's simply correlate the grant score on an application with the number of citations it gets once the research is finally finished. So, what does he discover? He discovers that the correlation between your score and how good your research ends up being is modest to nonexistent. Now, we're talking about one of the most crucial meritocracies in modern society. We're talking about $40 billion of intellectual activity, and the guy running the whole show takes a look and discovers the experts who are manning the gates to this particular meritocracy don't know what they're doing. So, why doesn't gatekeeping work in this example? Well, one possibility is that it's simply impossible to predict who is a good researcher and who isn't. Maybe it is that the groups of experts, by virtue of being experts, belong to a particular generation of medical research and are hopelessly out of touch with what the next generation of research is supposed to be all about. It doesn't really matter. The point is that this is a meritocracy that is not a meritocracy, right. My favorite response to this guy, Dr. Lauer's paper: a bunch of microbiologists published this paper where they said the only rational thing now is to tell all the grant reviewers to go home and shut down that entire cumbersome process of trying to evaluate all these 80,000 grant applications, just have a big round cylindrical container, put all the applications in the container, and pick them out at random, and that should be how we govern the grant process in this country. That strikes me as a system that makes a great deal of sense. Okay.

Proposition 2. Meritocracies don't work sometimes because they are run for the benefit of gatekeepers.

Proposition No. 2. Meritocracies don't work sometimes because they are run for the benefit of gatekeepers. Again, many examples.

The LSAT. I got so obsessed with the LSAT a couple years ago I took it. I challenged my assistant to an LSAT contest. So, we all know about the LSAT. It's six sections, you know, reading, problem solving, logic problems, I've forgotten the others, writing. You get 35 minutes for each section, and your score determines whether you get into an elite law school, and whether you get into an elite law school determines whether you get an elite job once you graduate, and whether you get an elite job determines whether you get a job on the Supreme Court and an invitation to Zeitgeist. Now, you can make a distinction between power tests and speed tests. So, a speed test is where I give you a whole lot of relatively easy questions, and I'm interested to see how many you can answer in a given amount of time, right. So, video games are really very often speed tests, right. We place a time constraint on you and see how well you do under that constraint. A power test is where I give you really hard questions, and all I'm really interested in is how many of those questions you can get right. So, Scrabble tends to be, is really, a power game. Untimed chess is a power game. So, what is the LSAT? Well, the LSAT is a series of very, very hard questions, but we require that the test taker complete them in a limited period of time. And the time constraint is deliberately strict because we want to make it so hard. We want to make it hard enough that the average test taker cannot answer all the questions in the allotted time. So, what we have here is what a psychometrician would call a speeded power test. We're collecting power data with a speed constraint. Here is the question: Why do we collect it with a speed constraint? Why is there a 35-minute limit on the six sections of the LSAT?

We have two test takers here. We have tortoise and hare. Hare, we all know hare is super speedy, very confident. He answers every one of the 101 questions on the LSAT in the time allotted. He gets 82 of them correct for an accuracy rate of 81.2 percent, and he has an LSAT score of 165, which puts him in the 94th percentile. He gets a job at an elite law firm, works 80 hours a week, never sees his kids, and his marriage falls apart.

Tortoise, by contrast, allow me to say tortoise is a woman, for no particular reason. But she is super analytic. She doesn't do things quickly. Whenever she has a hard question, she goes over it 17 times. There is no way tortoise can finish all of the answers to all the questions in the time allotted, so she only gets to 80 questions, of which she answers 78 correctly. She has an accuracy rate of 97.5 percent, but she makes so many guesses that she gets penalized. She ends up with an LSAT score of 165, which is the 94th percentile. She gets a job at an elite law firm, works 80 hours a week, never sees her children, her marriage falls apart and she quits law.

The LSAT will tell you that these two people, hare and tortoise, are identical. They both got a score of 165, but who would you rather have as your lawyer? Would you like to call up hare and say, "Have you gotten to my contract yet?" and have hare tell you, "Yeah, I looked at it over lunch, it's fine," right? No.

You want tortoise as your lawyer. You want the person who is analytic and doesn't skip ahead, right. But the purpose of creating a speeded power test for the LSAT, the result of having that time constraint, is to make hare look better than he is and tortoise look worse than she actually is, right. Why would we do this in a profession that is based on tortoise thinking? I was so baffled by this I went to see the organization that runs the LSAT, big fancy office building in New Jersey, huge conference room. They all gathered, and I just said to them, can you explain to me why you have a time limit on the LSAT? Makes no sense. Why not just let them spend all day doing it? Ask really hard questions. That is what the law is, hard questions that take a long time. They charge by the hour, for goodness' sake. There is no institutional reason why you would want people to move quickly. And I had my tape recorder all ready, because I was expecting them to give this long, nuanced answer, and their answer was, nah, just easier. How you going to rent the hall for the whole day? All right.
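The accuracy figures quoted for hare and tortoise can be checked with a few lines of arithmetic. The following is only a minimal sketch: the question counts are the ones given in the talk, while the `TestTaker` class and its field names are made up for illustration. It does not try to reproduce how an LSAT score is actually computed, since the talk does not say.

```python
from dataclasses import dataclass


@dataclass
class TestTaker:
    """Hypothetical record of one test taker, using the counts from the talk."""
    name: str
    attempted: int  # questions answered in the allotted time
    correct: int    # answers that were right

    @property
    def accuracy(self) -> float:
        """Share of attempted questions answered correctly, in percent."""
        return 100 * self.correct / self.attempted


TOTAL_QUESTIONS = 101  # number of questions mentioned in the talk

hare = TestTaker("hare", attempted=101, correct=82)
tortoise = TestTaker("tortoise", attempted=80, correct=78)

for taker in (hare, tortoise):
    print(
        f"{taker.name}: attempted {taker.attempted}/{TOTAL_QUESTIONS}, "
        f"accuracy {taker.accuracy:.1f}%"
    )
# hare: attempted 101/101, accuracy 81.2%
# tortoise: attempted 80/101, accuracy 97.5%
```

The two come out with the same reported score even though tortoise is right on a far larger share of the questions she actually reaches, which is the gap the time limit hides.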

Proposition 3. Meritocratic systems often do not recognize that being real good is not an individual effort, but a team accomplishment.

Proposition 3. This is a crucial one, particularly for the kind of intellectual work that we do in the modern economy. This is about surgeons. Now, we're all familiar with the observation that the more operations a surgeon does, the more procedures they do, the better they get. There is a learning curve with surgery. That is why we tend to have rare surgeries clustered at major teaching hospitals, so we can keep the volume of the surgeon up really high, right. You don't go and get some kind of very particular brain surgery at some rural hospital, you go to the major medical center for this very reason. So, this is a chart demonstrating this. The Norwood operation is a very difficult pediatric cardiac surgery, and you can see the learning curve: if you are under like 150 cases a year, your mortality rate is really high. It's terrible. But once you get to about 400 a year, the mortality rate comes down dramatically, you know, it's a quarter of what it would be otherwise. This is a pattern that we see throughout all of surgery, and the people who are on this end, right, are the ones who get rewarded, they are the ones who make the most money, the ones that have the fanciest titles, those are the ones that are the winners of the meritocratic game that is academic medicine.

But there is a complication with this, and that is: what happens to the surgeons who do their Norwood operations at a different hospital? So, lots of surgeons do this, right. You have privileges at more than one hospital. Maybe you do 90% of your procedures at one place, but then you go down the street or across town or on the weekends or whatever and do some at another place. And the answer is that when you leave your regular hospital and moonlight somewhere else, you move from being at this end of the curve and you go all the way back to the other end of the curve. This is a result beautifully demonstrated in this paper. I am going to read to you the conclusion. "Higher volume in a prior period for a given surgeon at a particular hospital is correlated with significantly lower risk adjusted mortality for that surgeon hospital pair." That is what they're talking about. That volume, however, does not significantly improve the surgeon's performance at other hospitals.

What does that mean? Well, what that means is that cardiac surgery, or any kind of complex surgery, is a team activity, right. So, when you are with your team at your regular hospital, you all get better together, but then when you leave on the weekend to moonlight somewhere else, you leave your team behind, and without your team, you're hopeless, right. Now, does the meritocratic system recognize that being a real good surgeon is not an individual accomplishment, but a team accomplishment? No, it doesn't. The whole meritocratic system is based on the assumption that what we're observing here is the greatness of this particular individual surgeon. Now, I would suggest to you that it is a pretty big problem, particularly if you are someone who picks your elite surgeon and just happens to be seeing that surgeon at the hospital they're moonlighting at and not their regular place. And I think that this applies to an extraordinary number of complex, allegedly meritocratic systems. I mean, think about me up here right now. How much of this talk is me? Do you know whether I wrote it or whether a team wrote it? You don't know. You have no idea how good I am based on this particular talk that I'm giving to you without knowing the actual process that I use to come up with these observations.

Proposition 4. Meritocracies are bad because gatekeepers don't fix them.

Proposition No. 4. Meritocracies are bad because gatekeepers don't fix them. Once you realize there is an accumulating body of knowledge that suggests we're not very good at managing meritocracies, you would assume that there should be an ongoing process by which we try and improve the quality of the gatekeeping function, and it turns out that there isn't.

So, again, a million examples, but this is one I've been obsessed with for a while. I wrote about it in my book Outliers in 2008. This is the roster of the 2007 Medicine Hat Tigers. This is the actual chart I used in my book Outliers, and this is a major junior hockey team, so this is one rung below the NHL. The point of this, those who read Outliers will know this, the point of this chart is this is a group of elite hockey players in a country that takes hockey very seriously, and what is most striking about this group is how many of them are born in the first four months of the year. January, January, March, April, September, October. April, and January, January, August, March, May, January. Now, this is a very, very well known phenomenon, it's called the relative age effect, and it's a function of the way in which we select, the way in which we structure the particular meritocracy that is elite youth sports. In Canada, they're crazy about hockey, so they start forming all-star teams at the age of nine, and at the age of nine, the kids who look like the best hockey players are the ones who are relatively the oldest. If you are born in January, you're going to look better than a kid born in December. So, we take that kid out and put them on an all-star team and give them way more practice time, way better coaches, way more access to good competition, way more encouragement, and lo and behold, ten years later they are the best. An arbitrary advantage has been elevated to a real advantage.

You can see this everywhere. It's true in soccer, basketball, swimming. Any competitive sport that looks to develop and identify talent at an early level, early age, has the problem of creating these relative age effect arbitrary advantages. For example, look at this. Schools. This is a study of gifted and talented programs in England, and they have broken down the composition of gifted and talented programs by birth cohort. In England, the cutoff is September 1st. If you are in the relatively youngest cohort in your class, your chance of being in a gifted science and math program is roughly half that of those who are born in the relatively oldest cohort. Basically, if your kid is among the youngest in your class, kiss goodbye to getting into a gifted and talented program. And of course, we use those to decide who gets into quality schools and we use quality schools to determine who gets advantages, and so on. It's the same old system.

This is not a meritocracy, it's something pretending to be a meritocracy. I wrote my book in 2008, and as a result of all the discussion about my book, there was a lot of public attention to this particular relative age effect. I thought when I was coming here to talk about meritocracies, what I would do was revisit the hockey example and show you how this particular Canadian institution has learned its lesson and fixed its ways and no longer pursues a policy that has the misfortune of leaving half the talent on the table. So, I decided I would look at the 2022 Canadian junior hockey roster, and let's just go through the birth months, shall we. September, November, June, March, January, January, April, September, January, August, September, July, October, January, February, January, August, April, October, January, March, January, March, February, January. They have learned nothing.

They haven't done a single thing to fix the problem. 15 years ago, it was brought to their attention that a country that was more passionate about hockey than almost anything else had created this system that was arbitrarily leaving half the talent on the table, right. No one could possibly be more powerfully motivated to fix this system than Canadians. Hockey is the national everything. Have they fixed it? No, they haven't. By the way, has anyone fixed the system? No.
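The birth months read out for the 2022 roster can be tallied directly. This is a rough sketch that assumes the list in the transcript is complete and accurately transcribed; the variable names are illustrative. The "first four months" window is the one Gladwell uses for the 2007 roster, and the one-third baseline is simply what four months out of twelve would give if births were spread evenly across the year.

```python
from collections import Counter

# Birth months as read out in the talk for the 2022 Canadian junior roster
# (taken from the transcript; assumed complete for this illustration).
birth_months = [
    "September", "November", "June", "March", "January", "January",
    "April", "September", "January", "August", "September", "July",
    "October", "January", "February", "January", "August", "April",
    "October", "January", "March", "January", "March", "February",
    "January",
]

FIRST_FOUR_MONTHS = {"January", "February", "March", "April"}

counts = Counter(birth_months)
early = sum(counts[month] for month in FIRST_FOUR_MONTHS)
share = early / len(birth_months)

# Baseline if birth months were spread evenly over the year: 4 of 12 months.
uniform_share = len(FIRST_FOUR_MONTHS) / 12

print(f"January births: {counts['January']}")
print(f"Born January to April: {early} of {len(birth_months)} ({share:.0%})")
print(f"Share expected under an even spread: {uniform_share:.0%}")
# January births: 8
# Born January to April: 15 of 25 (60%)
# Share expected under an even spread: 33%
```

On these numbers, January alone accounts for about a third of the players listed, and no one on the list is born in May or December.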

Think about your child's elementary school. In first and second grade, do they divide the kids up and put the January to March kids in one class and the April to June kids in another and the July to September kids in another? No, they don't do that. Right. Even though we've had years and years of evidence that it's completely unfair to ask a January kid to compete with a December kid. When your child takes standardized tests, do the kids born in December take the standardized tests on the same day as the kids born in January? Yes, they do. Does that make any sense whatsoever? No, it doesn't.

For some reason we are powerfully incurious about the problems that we created with our meritocracies. We think we know a good research proposal from a bad one, and we don't. We think we're selecting the right people for law school, and we aren't. We, you know, we think we know that an individual is responsible for their surgical success, and they aren't. And when we're presented with evidence of the falsity of our systems, what do we do? We do nothing. Now, I said at the beginning that this observation about our failed meritocracies is a very good thing. How can it be? If we fixed meritocracies, then most of us wouldn't be here, right.

But think about it. If we fix the system, the people who would replace us at a conference like this would be so much smarter than we are. This conference would have been so much more fun. Google would make so much more money, and I wouldn't be here. Someone far more gifted than I would be giving this talk, and it would have been infinitely more interesting. Thank you.

Also see The rise and fall of peer review and the followup The dance of the naked emperors by Adam Mastroianni.