Mark Twain summed up what I take to be one of the fundamental problems of cognitive science with a single witticism. He said, "There's something fascinating about science. One gets such wholesale returns of conjecture out of such a trifling investment of fact."
Twain meant it as a joke, of course, but he's right: There's something fascinating about science. From a few bones, we infer the existence of dinosaurs. From spectral lines, the composition of nebulae. From fruit flies, the mechanisms of heredity, and from reconstructed images of blood flowing through the brain, or in my case, from the behavior of very young children, we try to say something about the fundamental mechanisms of human cognition. In particular, in my lab in the Department of Brain and Cognitive Sciences at MIT, I have spent the past decade trying to understand the mystery of how children learn so much from so little so quickly. Because, it turns out that the fascinating thing about science is also a fascinating thing about children, which, to put a gentler spin on Mark Twain, is precisely their ability to draw rich, abstract inferences rapidly and accurately from sparse, noisy data. I'm going to give you just two examples today. One is about a problem of generalization, and the other is about a problem of causal reasoning. And although I'm going to talk about work in my lab, this work is inspired by and indebted to a field. I'm grateful to mentors, colleagues, and collaborators around the world.
Let me start with the problem of generalization. Generalizing from small samples of data is the bread and butter of science. We poll a tiny fraction of the electorate and we predict the outcome of national elections. We see how a handful of patients responds to treatment in a clinical trial, and we bring drugs to a national market. But this only works if our sample is randomly drawn from the population. If our sample is cherry-picked in some way—say, we poll only urban voters, or say, in our clinical trials for treatments for heart disease, we include only men—the results may not generalize to the broader population.
So scientists care whether evidence is randomly sampled or not, but what does that have to do with babies? Well, babies have to generalize from small samples of data all the time. They see a few rubber ducks and learn that they float, or a few balls and learn that they bounce. And they develop expectations about ducks and balls that they're going to extend to rubber ducks and balls for the rest of their lives. And the kinds of generalizations babies have to make about ducks and balls they have to make about almost everything: shoes and ships and sealing wax and cabbages and kings.
So do babies care whether the tiny bit of evidence they see is plausibly representative of a larger population? Let's find out. I'm going to show you two movies, one from each of two conditions of an experiment, and because you're going to see just two movies, you're going to see just two babies, and any two babies differ from each other in innumerable ways. But these babies, of course, here stand in for groups of babies, and the differences you're going to see represent average group differences in babies' behavior across conditions. In each movie, you're going to see a baby doing maybe just exactly what you might expect a baby to do, and we can hardly make babies more magical than they already are. But to my mind the magical thing, and what I want you to pay attention to, is the contrast between these two conditions, because the only thing that differs between these two movies is the statistical evidence the babies are going to observe. We're going to show babies a box of blue and yellow balls, and my then-graduate student, now colleague at Stanford, Hyowon Gweon, is going to pull three blue balls in a row out of this box, and when she pulls those balls out, she's going to squeeze them, and the balls are going to squeak. And if you're a baby, that's like a TED Talk. It doesn't get better than that. But the important point is it's really easy to pull three blue balls in a row out of a box of mostly blue balls. You could do that with your eyes closed. It's plausibly a random sample from this population. And if you can reach into a box at random and pull out things that squeak, then maybe everything in the box squeaks. So maybe babies should expect those yellow balls to squeak as well. Now, those yellow balls have funny sticks on the end, so babies could do other things with them if they wanted to. They could pound them or whack them. But let's see what the baby does.
See this? Did you see that? Cool. See this one? Wow.
Told you.
See this one? Hey Clara, this one's for you. You can go ahead and play.
I don't even have to talk, right? All right, it's nice that babies will generalize properties of blue balls to yellow balls, and it's impressive that babies can learn from imitating us, but we've known those things about babies for a very long time. The really interesting question is what happens when we show babies exactly the same thing, and we can ensure it's exactly the same because we have a secret compartment and we actually pull the balls from there, but this time, all we change is the apparent population from which that evidence was drawn. This time, we're going to show babies three blue balls pulled out of a box of mostly yellow balls, and guess what? You cannot randomly draw three blue balls in a row out of a box of mostly yellow balls. That is not plausibly randomly sampled evidence. That evidence suggests that maybe Hyowon was deliberately sampling the blue balls. Maybe there's something special about the blue balls. Maybe only the blue balls squeak. Let's see what the baby does.
See this? See this toy? Oh, that was cool. See? Now this one's for you to play. You can go ahead and play.
So you just saw two 15-month-old babies do entirely different things based only on the probability of the sample they observed. Let me show you the experimental results. On the vertical axis, you'll see the percentage of babies who squeezed the ball in each condition, and as you'll see, babies are much more likely to generalize the evidence when it's plausibly representative of the population than when the evidence is clearly cherry-picked. And this leads to a fun prediction: Suppose you pulled just one blue ball out of the mostly yellow box. You can't pull three blue balls in a row at random out of a yellow box, but you could randomly sample just one blue ball. That's not an improbable sample. And if you could reach into a box at random and pull out something that squeaks, maybe everything in the box squeaks. So even though babies are going to see much less evidence for squeaking, and have many fewer actions to imitate in this one ball condition than in the condition you just saw, we predicted that babies themselves would squeeze more, and that's exactly what we found. So 15-month-old babies, in this respect, like scientists, care whether evidence is randomly sampled or not, and they use this to develop expectations about the world: what squeaks and what doesn't, what to explore and what to ignore.
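To make the sampling logic concrete, here is a minimal back-of-the-envelope sketch. The box proportions and the with-replacement assumption are mine, for illustration only, and are not the actual parameters of the experiment; the point is simply that three blue draws in a row are unremarkable from a mostly blue box, very unlikely from a mostly yellow one, and a single blue draw from a mostly yellow box is not improbable at all.

```python
# Illustrative only: probability of drawing blue balls in a row purely at
# random, assuming hypothetical box compositions and sampling with replacement.

def p_all_blue(p_blue: float, n_draws: int) -> float:
    """Probability that n_draws random draws are all blue when a fraction
    p_blue of the balls in the box is blue."""
    return p_blue ** n_draws

mostly_blue_box = 0.75    # hypothetical: 75% of the balls are blue
mostly_yellow_box = 0.25  # hypothetical: only 25% of the balls are blue

print(p_all_blue(mostly_blue_box, 3))    # ~0.42  -> plausibly a random sample
print(p_all_blue(mostly_yellow_box, 3))  # ~0.016 -> looks deliberately sampled
print(p_all_blue(mostly_yellow_box, 1))  # 0.25   -> one blue ball is not improbable
```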
Let me show you another example now, this time about a problem of causal reasoning. And it starts with a problem of confounded evidence that all of us have, which is that we are part of the world. And this might not seem like a problem to you, but like most problems, it's only a problem when things go wrong. Take this baby, for instance. Things are going wrong for him. He would like to make this toy go, and he can't. I'll show you a few-second clip. And there are two possibilities, broadly: maybe he's doing something wrong, or maybe there's something wrong with the toy. So in this next experiment, we're going to give babies just a tiny bit of statistical data supporting one hypothesis over the other, and we're going to see if babies can use that to make different decisions about what to do.
Here's the setup. Hyowon is going to try to make the toy go and succeed. I am then going to try twice and fail both times, and then Hyowon is going to try again and succeed, and this roughly sums up my relationship to my graduate students in technology across the board. But the important point here is it provides a little bit of evidence that the problem isn't with the toy, it's with the person. Some people can make this toy go, and some can't. Now, when the baby gets the toy, he's going to have a choice. His mom is right there, so he can go ahead and hand off the toy and change the person, but there's also going to be another toy at the end of that cloth, and he can pull the cloth towards him and change the toy. So let's see what the baby does.
Two, three. Go! One, two, three, go! Arthur, I'm going to try again. One, two, three, go! Arthur, let me try again, okay? One, two, three, go! Look at that. Remember these toys? See these toys? Yeah, I'm going to put this one over here, and I'm going to give this one to you. You can go ahead and play.

Okay, Laura, but of course, babies love their mommies. Of course babies give toys to their mommies when they can't make them work. So again, the really important question is what happens when we change the statistical data ever so slightly. This time, babies are going to see the toy work and fail in exactly the same order, but we're changing the distribution of evidence. This time, Hyowon is going to succeed once and fail once, and so am I. And this suggests it doesn't matter who tries this toy: the toy is broken. It doesn't work all the time. Again, the baby's going to have a choice. Her mom is right next to her, so she can change the person, and there's going to be another toy at the end of the cloth. Let's watch what she does.
Two, three, go! Let me try one more time. One, two, three, go! Hmm.
Let me try, Clara. One, two, three, go! Hmm, let me try again. One, two, three, go! I'm going to put this one over here, and I'm going to give this one to you. You can go ahead and play.
Let me show you the experimental results. On the vertical axis, you'll see the distribution of children's choices in each condition, and you'll see that the distribution of the choices children make depends on the evidence they observe. So in the second year of life, babies can use a tiny bit of statistical data to decide between two fundamentally different strategies for acting in the world: asking for help and exploring. I've just shown you two laboratory experiments out of literally hundreds in the field that make similar points, because the really critical point is that children's ability to make rich inferences from sparse data underlies all the species-specific cultural learning that we do. Children learn about new tools from just a few examples. They learn new causal relationships from just a few examples. They even learn new words, in this case in American Sign Language.
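One way to see why the two evidence patterns pull in different directions is a toy likelihood comparison. This is my own simplified sketch with made-up numbers, not the model behind the experiments: it asks how probable each sequence of successes and failures is if outcomes depend on the person versus if the toy is simply flaky for everyone.

```python
# Simplified illustration with hypothetical numbers; NOT the model from the study.
from itertools import product

# Each trial is (agent, outcome); True means the toy went.
# Condition 1: Hyowon succeeds, Laura fails twice, Hyowon succeeds again.
person_pattern = [("hyowon", True), ("laura", False), ("laura", False), ("hyowon", True)]
# Condition 2: same outcome order, but each person succeeds once and fails once.
toy_pattern = [("hyowon", True), ("hyowon", False), ("laura", False), ("laura", True)]

def likelihood_person(data, p_skilled=0.9, p_unskilled=0.1):
    """Hypothesis: the toy is fine, but some people can work it and some can't.
    Average over who is skilled, with a uniform prior over assignments."""
    agents = sorted({agent for agent, _ in data})
    total = 0.0
    for skills in product([p_skilled, p_unskilled], repeat=len(agents)):
        success_prob = dict(zip(agents, skills))
        p = 1.0
        for agent, went in data:
            p *= success_prob[agent] if went else 1 - success_prob[agent]
        total += p / 2 ** len(agents)
    return total

def likelihood_toy(data, p_works=0.5):
    """Hypothesis: the toy is broken and works only half the time, for everyone."""
    p = 1.0
    for _, went in data:
        p *= p_works if went else 1 - p_works
    return p

for label, data in [("condition 1", person_pattern), ("condition 2", toy_pattern)]:
    print(label, likelihood_person(data), likelihood_toy(data))
# Condition 1 favors "it depends on the person" (~0.168 vs ~0.063);
# condition 2 favors "the toy is broken" (~0.008 vs ~0.063).
```

On those made-up numbers, changing the person makes sense in the first condition and changing the toy makes sense in the second, which is the direction of the choices the babies actually made.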
I want to close with just two points. If you've been following my world, the field of brain and cognitive sciences, for the past few years, three big ideas will have come to your attention. The first is that this is the era of the brain. And indeed, there have been staggering discoveries in neuroscience: localizing functionally specialized regions of cortex, turning mouse brains transparent, activating neurons with light. A second big idea is that this is the era of big data and machine learning, and machine learning promises to revolutionize our understanding of everything from social networks to epidemiology. And maybe, as it tackles problems of scene understanding and natural language processing, it will tell us something about human cognition. And the final big idea you'll have heard is that maybe it's a good idea we're going to know so much about brains and have so much access to big data, because left to our own devices, humans are fallible, we take shortcuts, we err, we make mistakes, we're biased, and in innumerable ways, we get the world wrong. I think these are all important stories, and they have a lot to tell us about what it means to be human, but I want you to note that today I told you a very different story. It's a story about minds and not brains, and in particular, it's a story about the kinds of computations that uniquely human minds can perform, which involve rich, structured knowledge and the ability to learn from small amounts of data, the evidence of just a few examples. And fundamentally, it's a story about how, starting as very small children and continuing all the way to the greatest accomplishments of our culture, we get the world right.
Folks, human minds do not only learn from small amounts of data. Human minds think of altogether new ideas. Human minds generate research and discovery, and human minds generate art and literature and poetry and theater, and human minds take care of other humans: our old, our young, our sick. We even heal them. In the years to come, we're going to see technological innovations beyond anything I can even envision, but we are very unlikely to see anything even approximating the computational power of a human child in my lifetime or in yours. If we invest in these most powerful learners and their development, in babies and children and mothers and fathers and caregivers and teachers, the way we invest in our other most powerful and elegant forms of technology, engineering, and design, we will not just be dreaming of a better future, we will be planning for one.
Thank you very much.
Laura, thank you. I do actually have a question for you. First of all, the research is insane. I mean, who would design an experiment like that? I've seen that a couple of times, and I still don't honestly believe that that can truly be happening, but other people have done similar experiments; it checks out. The babies really are that genius.
You know, they look really impressive in our experiments, but think about what they look like in real life, right? It starts out as a baby. Eighteen months later, it's talking to you, and babies' first words aren't just things like balls and ducks, they're things like "all gone," which refers to disappearance, or "uh-oh," which refers to unintentional actions. It has to be that powerful. It has to be much more powerful than anything I showed you. They're figuring out the entire world. A four-year-old can talk to you about almost anything.
And if I understand you right, the other key point you're making is this: we've been through these years where there's been all this talk of how quirky and buggy our minds are, with behavioral economics and the whole body of theory behind it saying that we're not rational agents. You're really saying that the bigger story is how extraordinary our minds are, and that there really is a genius there that is underappreciated.
One of my favorite quotes in psychology comes from the social psychologist Solomon Asch, and he said the fundamental task of psychology is to remove the veil of self-evidence from things. There are orders of magnitude more decisions you make every day that get the world right. You know about objects and their properties. You know them when they're occluded. You know them in the dark. You can walk through rooms. You can figure out what other people are thinking. You can talk to them. You can navigate space. You know about numbers. You know causal relationships. You know about moral reasoning. You do this effortlessly, so we don't see it, but that is how we get the world right, and it's a remarkable and very difficult-to-understand accomplishment.
I suspect there are people in the audience who share this view of accelerating technological power and who might dispute your statement that never in our lifetimes will a computer do what a three-year-old child can do, but what's clear is that in any scenario, our machines have so much to learn from our toddlers.

I think so. You'll have some machine learning folks up here. I mean, you should never bet against babies or chimpanzees or technology as a matter of practice, but it's not just a difference in quantity, it's a difference in kind. We have incredibly powerful computers, and they do do amazingly sophisticated things, often with very large amounts of data. Human minds do, I think, something quite different, and I think it's the structured, hierarchical nature of human knowledge that remains a real challenge.
Laura Schulz, wonderful food for thought. Thank you so much.
Thank you.