The Diversity of Cell Types in the Human Brain
Date Posted:
August 11, 2019
Date Recorded:
August 10, 2019
CBMM Speaker(s):
Christof Koch
Brains, Minds and Machines Summer Course 2019
Description:
Christof Koch, Allen Institute for Brain Science
PRESENTER: Good evening. And welcome to this year's working mass lecture. My name is [INAUDIBLE], of the Methods in Computational Neuroscience, or MCN, course. What a massive family actually. It's co-hosted by the MCN and BMM, Brains, Minds, and Machines.
The latter is co-directed by Gabriel Kreiman and Tomaso Poggio. I'm delighted that you're all here. And this year is the first time that the students from both courses are together. We're also very grateful to the family of [INAUDIBLE], who generously sponsor this [INAUDIBLE] series.
I'm really pleased and honored to introduce this year's lecturer, Dr. Christof Koch. Christof obtained his PhD in [INAUDIBLE] from the Max Planck Institute in Tübingen, Germany, which is considered by many as kind of the birthplace of cybernetics. After spending two years at MIT, Christof took a faculty position at Caltech in 1986, where [INAUDIBLE] when he became the founding director of the institute.
Now he is the president and chief scientist of the Allen Institute for Brain Science. His scientific research interests are very broad, and he is well recognized for many contributions to the field, including the pioneering work in collaboration with [INAUDIBLE] the brain [INAUDIBLE] the basis of consciousness.
[INAUDIBLE] also played a major role in fostering the development of the field that now is called [INAUDIBLE] science ever since its inception. Many people will know that he was the co-founder of the MCN course in 1988 here at MBL. Over the last three decades, this course has been active in training many people.
[INAUDIBLE] institute, Christof pushed for new ways for the last [INAUDIBLE] research, which I think really radically shaped the landscape of neuroscience today. So please join me in welcoming Christof. This is the [INAUDIBLE] of this title. The diversity of cell types in the human brain.
[APPLAUSE]
CHRISTOF KOCH: And he was our first student in 1988 in the Methods in Computational Neuroscience course, and he survived, and he did well. All right, I'm greatly pleased to be speaking to both of you, the Brains, Minds, and Machines course and the Methods in Computational Neuroscience course, since those topics are obviously very close to my long-term interests.
So today I'm going to talk not about computation, not about consciousness, but about the other interesting thing, which is the cell and cell types. Early on, I'll talk a little bit about the types of cells in the mouse, but then the bulk of the talk will be about human.
And the basic message is that we, and other people, can now do human-level systems neuroscience at the cellular level, where you can record from neurons, identify them, do the transcriptomics, do the morphology, do the electrophysiology. But not on a model system anymore, be that a mouse or a monkey, but actually on the real thing. And for many things, it's absolutely crucial to work on humans. Thank you, [INAUDIBLE].
All right, so first I'll just briefly introduce us. So just because we do differ in terms of the sociology-- we are an independent, not-for-profit research organization started in 2003, so I'm not the foundational director. There were people before me. And what we do is support basic research in the brain sciences.
We've now become larger. There's now an institute with various subunits, and its focus is all of biology, not just brain science. But the Institute I lead is for brain science. So my institute is roughly 330 people.
And the culture is somewhere in between. It's like a university in the sense that most of us, like myself, are from universities. [INAUDIBLE] Smith, a lot of people that you'd recognize, who have been here for many years and taught here, have now joined us. But then we also have DNA from biotech. We have a bunch of people that come from biotech.
We have what's called project management. And we have something that probably most of you are blissfully ignorant of, called SMART goals. How many of you know what a SMART goal is? One person. Two. Three. OK. Well, a SMART goal exists because it has a Wikipedia entry. That's my definition of existence. And SMART stands for Specific, Measurable, Actionable, Realistic, Timely.
So a typical SMART goal that we may have says, for instance: by Q3 of 2020, make a go/no-go decision on whether we can do a complete cubic millimeter electron microscopic reconstruction of a piece of human cortex-- a complete connectome, doing connectomics in human EM.
So it means it's specific. It's measurable. It has a specific timeline, so we have to make a determination by Q3 whether or not we're going to do it. And then, of course, then we say, OK, now that we think we can do it, we're going to have a milestone on how quickly we can do it. And so this is one of the biggest differences between us and the university.
We only take on a few big projects that require scale, and we work in large teams. So the papers I'm talking about today, a bunch of Nature papers, and the paper in tomorrow's talk on the Brain Observatory, typically involve 50, 60, 70, 80, 90 people. So it's a very different sociology, but I would submit that the field of neuroscience is now mature enough, and is getting big enough funding in this country, $10 billion a year alone, that people want a return on their money.
Politicians, NIH, want a return on their money. And so, ultimately, if we want to sort of transition from the romantic phase of neuroscience to a more mature phase, a phase transition that physics has already undergone with its large accelerators, in particle physics and in astronomy, certain projects at least need to be done at scale and in this way.
All right. So we initiated, when I came talking with Paul Allen, we had this plan for roughly-- for these people, for 10 years, for a billion dollars. And we're executing on that, so that's our current budget. And it's all due to the generosity of one individual, who, of course, has passed away but has left us his legacy. So we are here to stay.
So what we do, we take on these large projects that we believe help all of neuroscience-- at least those aspects of neuroscience that relate to the mammalian brain, particularly the mouse brain, the monkey brain, and the human brain. Those are the systems we focus on.
So over the years, we've released all-- this is the timeline of all the various projects we have done. And with each one, there are certain unique aspects to it that make it different from a typical project, for example, what I did for 25 years at Caltech. So typically these are large projects. We make all the data and all the metadata available, including Jupyter notebooks and white papers about the protocols, the data, the metadata.
As much as we can put down on paper, in notebooks, we make available, accessible. And we make-- right now, you can go to brain-map.org, and you can download all the data I'm going to talk about. There's no log-in. There's no restriction. You can download everything we do once it passes the internal quality control stage.
And then the papers, typically, they're large papers. They typically only become available two or three years later, because, of course, we all know there's a difference between producing the data and then evaluating it and getting it through peer review. But we've never had a problem. I think this is a problem of many people being paranoid, believing you can't share your data. And I think the archive shows that you can share your data perfectly well without being scooped.
So today I'm just going to talk uniquely about cell types. So we know roughly, pretty much exactly for 200 years now, since the early 19th century, that all biological organisms consist of one or more different cells. A cell is a basic unit in biology. And we know that cells beget other cells.
And then, of course, we realized in the 19th century that there are different types of cells. There are brain cells. There are kidney cells. There are heart cells.
And then, with the chemical industry that gave rise to dyes, Golgi, Cajal, and others showed a great variety of different types of neurons by their morphology. So this, of course, has evolved over the last 150 years. So what the field is trying to do, and what we are heavily involved in-- probably 2/3 or 3/4 of our institute is involved in this-- is to come up with a classification.
How many different types of cells are there, and can we classify them? Of course, the mother of all classification systems in biology is, of course, the periodic table. And so that's a difficult problem.
People have addressed this problem before in other fields, particularly in the field of trying to classify all biological organisms. You know, a dog is different from a cat. And how are they related to each other? And how do they differ from a lion? And so people have the entire tree of life with its kingdoms and its various levels.
And so there is a lot of knowledge about classification, and taxonomy, and how to name things, and ontology, all the fields that we are now dealing with on a day-to-day basis. And so we know, from those types of classification of species, that one has to be very flexible. You can't be rigid. You have to use continuous parameters.
So typically-- you all know this-- textbook knowledge classifies, let's say, cells in the brain as non-neurons and neurons. And then there are classes of excitatory and inhibitory neurons. There are subclasses. There are types, and there are subtypes. And ultimately, we want to get to this level. And I'll come back at the end to whether this is useful, or how useful it is, or to what uses we can put this knowledge.
And the question is, how many different subtypes are there? How many different leaves are there on this tree? And by what criteria?
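The hierarchy described above (class, subclass, type) can be pictured as a nested tree whose leaves are the cell types. The following is only a minimal sketch of that idea; every name in `taxonomy` is a made-up placeholder, not an actual cell type from the talk or from any dataset.

```python
# Hypothetical toy taxonomy: dict nodes are branches, lists hold leaf types.
taxonomy = {
    "neurons": {
        "excitatory": {"subclass_E1": ["type_a", "type_b"], "subclass_E2": ["type_c"]},
        "inhibitory": {"subclass_I1": ["type_d", "type_e"], "subclass_I2": ["type_f"]},
    },
    "non-neurons": {"subclass_N1": ["type_g"]},
}


def count_leaves(node):
    """Count the leaf types under a node of the nested taxonomy."""
    if isinstance(node, dict):
        return sum(count_leaves(child) for child in node.values())
    return len(node)  # a list of leaf type names


print("leaves in whole tree:", count_leaves(taxonomy))
print("leaves under neurons:", count_leaves(taxonomy["neurons"]))
```

Counting leaves this way answers "how many types" only relative to how finely the tree has been split, which is the point the speaker returns to later.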
So in classifying species, it was morphology and embryology, and then, of course, everything changed when molecular biology and genomes came along. Same thing here: historically, the most dominant way to classify neurons, since Ramón y Cajal, is morphology. Then we had electrophysiology, and now, of course, we have transcriptomics-- the genes that are expressed inside the cell or inside the nucleus. We do all of these methods.
This is all just two species, mouse and human. So it's almost exactly a factor of 1,000 in terms of mass. It's 0.6 grams versus 1,400 grams. In terms of surface area, it's 1,200 square centimeters compared to, a little bit, under 1 square centimeter. It's a factor of 1,000 in terms of neurons. 86 billion versus 74 million. It's a factor of 1,000 in terms of cortical neurons. 16 billion cortical neurons and 14 million cortical neurons in mouse.
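As a quick arithmetic check on the comparison above, the ratios can be computed directly from the quoted figures. This is a small sketch, not part of the talk; the dictionary keys are illustrative, and the mouse surface area is set to 1 square centimeter as an assumption, since the talk only says "a little bit under".

```python
# Human and mouse figures as quoted in the talk.
human = {"mass_g": 1400.0, "surface_cm2": 1200.0, "neurons": 86e9, "cortical_neurons": 16e9}
# Assumption: mouse surface area rounded to 1 cm^2 ("a little bit under" in the talk).
mouse = {"mass_g": 0.6, "surface_cm2": 1.0, "neurons": 74e6, "cortical_neurons": 14e6}

# Human-to-mouse ratio for each quantity.
ratios = {key: human[key] / mouse[key] for key in human}

for key, value in ratios.items():
    print(f"{key}: about {value:,.0f}x")
```

With these inputs the neuron counts and surface area give ratios between roughly 1,100 and 1,200, while the quoted masses give a ratio closer to 2,300, so "a factor of 1,000" is best read as an order-of-magnitude statement.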
And so we can ask the question, and I will ask this. Well, where's the difference? Why are we the dominant species and not mice? Is it just because we have 1,000 times more? And people say, well, obviously, that can't be.
But if you look at your computers in your pockets, today, you can talk to them. 20 years ago, when there were 1,000 times fewer transistors, you couldn't talk to them. Yet there is no fundamental difference in terms of the architecture between computers 20 years ago and computers today.
All right, so I said we operate in a large team. These are sort of the senior scientists. I'm not going to go through them all. It's really led by [INAUDIBLE] at the human cell type program and [INAUDIBLE], who's in charge of the structured science, who's in charge of the mouse, and who's in charge of transgenic animals that many of you, at least if you're experimental scientists, will use.
So we have these flexible guides to how we classify things. And we use, by and large, three modalities, now going to a fourth one. We use morphology, particularly the classification of dendrites and axons: local axons if it's in slice, and distal projections if we do a whole-brain reconstruction, which we now do increasingly in the mouse. We can't do that, at least not yet, in humans.
We do large-scale electrophysiology in both human neurons and mouse neurons. This is all in slice. This is not in vivo, so this is in a comatose piece of brain tissue.
And then we use a technique that many people use today. And partly it's a question of the drunkard who is looking for his keys, and he, you know, looks for his keys under the streetlight, because that's where the light is.
So we are using transcriptomics. In other words, we're doing RNA-seq, reverse transcribing and sequencing the RNA that we find, either inside the nucleus for human or inside the cell for the mouse brain. We typically get between 8,000 and 10,000 different species of RNA that are being expressed. And then we can do classification in these very high-dimensional spaces.
It's getting hugely expensive. Most of the papers I'm reporting on use a technology called Smart-Seq version 4. It's a technique that allows you to go very deep, but it's very expensive. It's, like, $30 per cell. And if you do 100,000 or a million cells, it's expensive. And for some of the later results, we now use 10x, which uses Drop-Seq, which is roughly a factor of 50 cheaper.
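To make the cost argument concrete, here is a back-of-the-envelope sketch using only the two numbers quoted above: about $30 per cell for the deep method and a cost roughly 50 times lower for the droplet-based method. The function and constant names are illustrative, and real per-cell costs depend on many factors not covered here.

```python
DEEP_COST_PER_CELL = 30.0  # dollars per cell, as quoted in the talk
CHEAPER_FACTOR = 50        # "roughly a factor of 50" cheaper, as quoted


def sequencing_cost(n_cells, cost_per_cell):
    """Total cost in dollars for sequencing n_cells at a flat per-cell price."""
    return n_cells * cost_per_cell


droplet_cost_per_cell = DEEP_COST_PER_CELL / CHEAPER_FACTOR

for n_cells in (100_000, 1_000_000):
    deep = sequencing_cost(n_cells, DEEP_COST_PER_CELL)
    droplet = sequencing_cost(n_cells, droplet_cost_per_cell)
    print(f"{n_cells:>9,} cells: deep ${deep:,.0f} vs droplet ${droplet:,.0f}")
```

At a million cells the quoted figures imply tens of millions of dollars for the deep method, which is why the cheaper method is attractive for brain-wide surveys.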
OK, so our methods are the following. So we define types using different criteria. So when I refer to e-type, I mean electrophysiological types, where we do the following.
We take out a little piece of brain, human or mouse brain. We put it in a dish, so now it's cut off from its input and its output. It's really a piece of comatose brain. We put an electrode inside the cell body, and we inject current. Ramps, steps, random noise, frozen noise, pulses, all sorts of different stimuli to try to evoke a very diverse response.
And then we do unsupervised classification based on those criteria. Or we do the same thing, where we go inside a slice. We record from the cell, but then we inject [INAUDIBLE], and we recover the morphology of the cell-- whether it's human or mouse. And we can see its dendrites, its pretty much complete dendritic arbor, and its local axon. And we do classification, based on that, unsupervised.
This has a drawback that it's a comatose brain. You don't have background activity. This has a drawback that you're cutting off everything. You know, you have these slices 350 micrometers thin, so you don't see the distal axons.
But now there's this technique we have, called [INAUDIBLE], where we can do much better, where we can reconstruct neurons and the axon projection across an entire brain. But we can't do that in humans. So doing this-- for instance, we've done this. This came out a couple of months ago in Nature Neuroscience, where we do this exhaustively in primary visual cortex, which we chose for a very idiosyncratic reason.
We chose-- most of our tools are concentrated on primary visual cortex in the mouse. All of our Brain Observatory work, which I'll speak about tomorrow, is in mouse visual cortex. So we go very deep in one area, and we classify-- and I'm not going to go into the detail.
We've classified these in the mouse according to the pyramidal cells, primarily their dendrites and the position of the cell bodies within layer 2/3. So in the mouse, you can't differentiate [INAUDIBLE] layer 2 and 3. It's one layer. Then layer four, layer five, layer six. And interneurons that, of course, can be found in layer one to layer six.
And then also using electrophysiology, whether-- for example, in response to a step, they fire a few spikes, and then they shut up. Or whether they stutter, or whether they adapt, or whether they're high-frequency firing. Whether they burst, et cetera.
And using this classification, we can identify 46 different morphological and electrophysiological types. So different types cluster in these spaces, which have on the order of several hundred dimensions. Now, of course, as you all know, if you've done unsupervised clustering, it really depends a lot on the details. So there's nothing particularly magic about 46.
In the first paper that we published two years earlier, we got 42, and I knew that was the answer. But unfortunately, looking more, using a different clustering algorithm, it's now 46 and may well be 47. The exact numbers aren't so important and depend on the various details of your learning algorithm. All right, so quite a few cells. More than just regular spiking cells and fast spiking interneurons.
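The point that the number of clusters depends on algorithmic details can be illustrated with a toy example. The sketch below is not the method used in the papers; it applies single-linkage clustering to one-dimensional data, which reduces to splitting the sorted values wherever the gap exceeds a threshold. The feature values are invented for illustration.

```python
def single_linkage_1d(values, max_gap):
    """Single-linkage clustering of 1-D values, cut at distance max_gap.

    In one dimension this is equivalent to sorting the values and starting
    a new cluster whenever two neighbors are more than max_gap apart.
    """
    ordered = sorted(values)
    if not ordered:
        return []
    clusters = [[ordered[0]]]
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > max_gap:
            clusters.append([current])
        else:
            clusters[-1].append(current)
    return clusters


# Hypothetical values of one electrophysiological feature, made up for illustration.
features = [0.10, 0.12, 0.15, 0.40, 0.43, 0.47, 0.80, 0.82, 1.20]

for max_gap in (0.05, 0.30, 0.35):
    n_clusters = len(single_linkage_1d(features, max_gap))
    print(f"max_gap={max_gap}: {n_clusters} clusters")
```

The same nine data points yield a different number of clusters for each threshold, which is the toy analogue of getting 42 with one pipeline and 46 with another.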
And then we do automatic-- you know, I used to do this for my PhD. And it took me, you know, six months to write the program to do that. We now have automatic code that can do this: you give it the morphology of a cell. You give it the electrophysiology. And then, using a genetic algorithm, it finds the best distribution of currents in the soma and in the dendrites. These are active models.
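For readers who have not used one, the sketch below shows the general shape of a genetic algorithm: selection, crossover, and mutation applied to a population of candidate parameter sets. It is a generic illustration under strong simplifying assumptions, not the institute's fitting code. In particular, the fitness here compares candidates against a known target vector, whereas a real model fit would simulate the cell and compare the simulated response with the recorded electrophysiology.

```python
import random


def fitness(params, target):
    """Negative squared error to the target; higher is better.

    Stand-in for comparing a simulated response with a recording.
    """
    return -sum((p - t) ** 2 for p, t in zip(params, target))


def evolve(target, pop_size=30, generations=60, sigma=0.1, seed=0):
    """Return the best parameter vector found by a simple genetic algorithm."""
    rng = random.Random(seed)
    population = [[rng.uniform(0.0, 1.0) for _ in target] for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged (elitism).
        population.sort(key=lambda ind: fitness(ind, target), reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]    # uniform crossover
            child = [g + rng.gauss(0.0, sigma) for g in child]  # Gaussian mutation
            children.append(child)
        population = parents + children
    return max(population, key=lambda ind: fitness(ind, target))


# Hypothetical target: two normalized conductance densities, made up for illustration.
target = [0.3, 0.7]
best = evolve(target)
print("best parameters:", best)
print("squared error:", -fitness(best, target))
```

Keeping the parents unchanged each generation means the best candidate never gets worse, a common design choice when each fitness evaluation is expensive.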
For each one of those cells, you can download. And if you want to do neuron models, you can get all that data. Yeah, so as I said, all this data, whether it's human or mouse, it's available at that website.
OK, but those techniques, they don't scale very well. So electrophysiology and morphology, for those of you who've done it, remain very laborious. You know, we have various robotic helpers. We can now do automatic recording from eight cells simultaneously, but still it's very challenging.
And also biocytin reconstructions are really difficult to automate, so we have an entire group that does computational anatomy. And it's been challenging. And not just for us. For everybody in the field. If you use GFP fluorescent signals, there are certain advantages there. But this is the advantage of transcriptomics: you can easily scale it up to 1,000, 10,000, 100,000, or a million.
So we do this technique. And we have this pipeline that we built, where you can either put in human tissue, or you can put in mouse tissue. In fact, you can put in any tissue. It doesn't really matter. So here, we put in either mouse or human tissue. We dissect, for example, only layer one, or only layer four, or particular parts of the brain.
And then we do all the magic of transcriptomics. We may pick certain types of cells. For example, we may pick cells that are marked with NeuN, which is a marker of neurons, because we are primarily interested in neurons and less interested in microglial cells, for instance.
And then we do, in this case, as I said, Smart-Seq 4 or now the 10x. And so our typical level is, you know, 8,000, 9,000 genes. For some of the big pyramidal cells, the human pyramidal cells, we may get 10,000 genes that are expressed. So quite high, between 5,000 and 10,000. It's lower for glia. It's lower for interneurons. It's highest for pyramidal cells.
It's biggest for the layer two, three, and the layer five gigantic pyramidal cells. It's like nanograms of stuff. All right, so here the idea is quite simple. We have this diversity of cells, and then we use various clustering algorithms, unsupervised clustering algorithms, to get different types of cells.
Partly, we do this because then, with the magic of biotechnology, we can convert each one of these cell types into a transgenic animal or into a virus that only expresses in this subset of cells. So we can then build a reporter, for example, where all the cells that express these particular genes light up. Or you can put a channelrhodopsin in and turn them off or turn them on.
And so independent of the epistemic value of cell types, which I think remains unclear, they have great practical value because they allow us to go into the system and break open the system in very deliberate ways. All right, so this came out last year.
So here we did a deep dive in primary visual cortex. This is the back of the brain. This is the front. It's a mouse brain, of course.
This is a secondary motor area which we were interested in for various reasons. We picked two areas because they are quite a way apart on the dorsal surface-- one visual primary sensory. The other one secondary motor-- because we wanted to compare and contrast.
All right, so in the meantime, we've done several hundred thousand, and the results still hold. This is what was published with 21,000 cells. We find 116 clusters. And once again, the exact number-- 116, 112, or 120-- will depend on the various details.
We do at least two different clustering algorithms independently, and now we work with others in the field-- because all this data is available-- to use as many clustering algorithms as possible. And everybody comes up with a different number, but the taxonomy remains pretty much the same.
OK, so there are two really interesting insights here. So these are all-- so all the cells divide into two big branches. One is neurons, and the other branch is non-neurons. So we have many fewer non-neurons than we have neurons. We did that by design.
As I said, we use Cre lines, and we use NeuN, because we are primarily interested in those and less interested in those. So there are only 14 cell clusters here. Within the neurons, there are two broad classes. Of course, excitatory and inhibitory. And within the inhibitory, there are four subclasses-- the [INAUDIBLE], the VIP, somatostatin, PV, and SNCG.
Partly, they have different developmental origins. They come from two different parts of the ganglionic eminence, so that's why they fall into different clusters. And then we have all the excitatory ones. Now this is incredibly interesting. This graph here, what does it say?
Well, it says, of all the cells of a given type, which fraction did we find in this motor area, and which fraction did we find in the visual area, in V1? So the interneurons, almost without fail, we find in both areas. So all these-- [INAUDIBLE] or SST, or PV. We find the same transcriptomically defined cell types in these two different areas.
While excitatory cells, they are either 0 or 1. So either they are found in one area or the other. They don't seem to mix. Again, transcriptomically defined. OK, so the bottom line is there are roughly 100 different cell types, transcriptomically defined cell types, in these two different brain areas.
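The quantity described here, the fraction of each cluster's cells that came from one area, is simple to compute once every cell carries a cluster label and an area label. The sketch below uses a tiny invented dataset; the cluster names and counts are hypothetical and chosen only to mimic the pattern described in the talk.

```python
from collections import Counter, defaultdict

# Hypothetical (cluster, area) labels for individual cells, made up for illustration.
cells = [
    ("Sst_1", "V1"), ("Sst_1", "motor"), ("Sst_1", "V1"), ("Sst_1", "motor"),
    ("Pvalb_1", "V1"), ("Pvalb_1", "motor"),
    ("L5_exc_A", "V1"), ("L5_exc_A", "V1"),
    ("L5_exc_B", "motor"), ("L5_exc_B", "motor"),
]


def v1_fraction_by_cluster(cells):
    """For each cluster, return the fraction of its cells that came from V1."""
    counts = defaultdict(Counter)
    for cluster, area in cells:
        counts[cluster][area] += 1
    return {
        cluster: per_area["V1"] / sum(per_area.values())
        for cluster, per_area in counts.items()
    }


for cluster, fraction in v1_fraction_by_cluster(cells).items():
    print(f"{cluster}: {fraction:.2f} of cells from V1")
```

In this toy data the inhibitory clusters sit near 0.5 (shared between areas) and the excitatory clusters sit at 0 or 1 (area-specific), which is the shape of the result the speaker describes.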
The inhibitory ones are identical or very similar, and the excitatory ones are different. Now why might that be? Well, it may well be that this high-dimensional transcriptomic space is really subdivided into two orthogonal spaces.
One is the space that defines the function of the neuron itself and its input, and the other one is the set of genes that defines where it goes for output. And of course, the output from layer 5, so these are-- note, you have very few cell types in layer two, three, and four. Very few. [INAUDIBLE] few in either V1 or in the motor areas.
You have a very large number of cell types in layer five and layer six. Of course, those are all the output cells, and they project to very different areas. So in V1 layer five, you're going to go, for instance, to the superior colliculus, because you need to send the information to the visual tectum.
There is no projection in motor cortex from layer five into the colliculus. Instead, it projects through the [INAUDIBLE] down to the spinal cord, et cetera. So I suspect, because these have to carry the zip code for a distal area outside their local area, while these are just addressed locally, all with one exception.
An SST subtype actually does have a long distal projection. All of these are local interneurons that don't go outside the area. So all they need to know is-- well, I'm sitting in layer two, and I have a dendritic tree in layer one, and I extend a basal dendrite into layer three. And I can code that in the same way in V1 as in motor cortex. OK, so that's mouse.
So as I said, some of the different classes we can explain by their different developmental origins, from the caudal or the medial ganglionic eminence. And we just finished this. We haven't put it online yet, because we're still not fully done with the quality control. And we're probably a year or two away from publishing any of this.
So we've done this now in a very large number of cells. I think the total now is half a million across the entire cortex and hippocampus. And we're getting ready, by the end of the year, to have done this in an entire mouse brain. And where you can see, so these are the so-called tiers and the [INAUDIBLE] where you can sort of differentiate the cell types.
So at the bottom, we find roughly the same rule: interneurons, by and large, are shared, while pyramidal excitatory cells differ across cortex. We find this holds true across the cortical mantle. The final bottom line will be that a brain like the mouse's has on the order of 1,000 cell types.
Again, this knowledge is only useful because you can focus on-- I'm not going to talk about it. You can use this knowledge to build tools, like viral tools or transgenic tools, that express, let's say, a promoter or reporter in a specific part of them, which you can then use to study.
So here we study the hierarchy that we find. We can precisely define the hierarchy in all of cortex and all of thalamus, thalamocortical, corticothalamic in the mouse using such tools. And of course, you can also do causal experiments where you perturb the system. OK, now let's switch to humans.
So when I came from Caltech to Seattle, I had just finished a long collaboration with [INAUDIBLE], together with Gabriel Kreiman here-- that was his PhD-- where we recorded single neurons from the human brain. You know, the [INAUDIBLE] and neurons. And so I was really gung ho, and I believed that the field was ready to work with neurosurgeons at a large scale to really move-- to go beyond mouse and monkey.
Because everybody said, no, you can't do humans. You have to study monkey. And I didn't want to study monkey. There's nothing wrong with studying monkeys, but then you get stuck, for your entire life, studying monkeys. And I didn't want that.
So how do you get access to this thing? How do you get access to the human brain? This is the brain [INAUDIBLE]?
Well, you can do what many of us-- you can do, for example, what [INAUDIBLE] does to perfection. You can do all sorts of sophisticated imaging scans, where you can get ever smaller brain areas in a magnetic scanner of volunteers. It's very cool because you can do it reliably and safely and over and over again.
But, you know, the smallest voxels there, we're talking about, you know, 2 by 2 by 1 millimeter or something like that. That's half a million neurons, and I'm interested in understanding those half a million neurons, probably of 100 different types. And so you have to go inside the skull.
You can work on dead brains, so that's what we do. That's what we've done in the past, and that's what we continue to do. We work closely with the coroner's office. It's a very elaborate procedure, and you have to have lots of patience, because you have a very long list of exclusionary criteria that you have to screen against until you get donor brains from people who died from non-brain-related diseases and weren't involved in a potential crime.
And they have to be a certain age, et cetera, et cetera. And people don't necessarily donate these days anymore, because they're more skeptical. It's a long-term endeavor, but we've been doing it for many years. And so we have quite a few clean brains in our freezers.
The other thing you can do is fetal tissue. Unfortunately, due to our administration-- let's just keep it at that-- going down that path is almost impossible. It's very, very difficult. And of course, there, you're studying a very young brain. So typically we're talking about aborted fetuses, which is the first trimester.
So there is already-- there are progenitor neurons there, but it's not really, you know, developed into neurons yet. You can do neurosurgery tissue, and we'll do that. I'll explain that. That's what we choose.
Or you can do brain organoids, using induced pluripotent stem cells. Very cool. Very exciting. But still very, very immature, both the field, which is only 10 years old, as well as these neurons.
You wait nine months, and then you get neurons that, transcriptomically or by the transcription factors they express, you can sort of equate to end-of-first-trimester progenitor neurons. They don't really have action potentials yet. They don't really have apical dendrites. They don't have, you know, the stuff that we associate with the machinery of neurons yet. And people haven't really been able to terminally differentiate these into what you and I would call, let's say, a spiking pyramidal cell.
All right. So what we do, we either work with post-mortem brain, or we work with neurosurgical brain. So that's very simple. Every major neurosurgical practice will do one of two operations. They will either have to remove tumors. So let's say it's a deep tissue tumor, let's say, in the amygdala or hippocampus.
So depending on the exact approach-- and different clinics and different surgeons have different approaches-- typically it involves going through the middle temporal gyrus. And so in order to get to the deep tissue tumor, the surgeon has to tunnel through the overlying piece of brain. And of that brain, a little bit is snipped off and given to the pathologist. Most of that piece of brain, in most hospitals on the planet, is discarded as medical waste.
All right? This is really shocking. This is some of the most precious tissue there is, because it comes from an individual who lived with this brain until they were 30, 40, 50, 60, whatever. Most of that tissue, and we've done detailed histology, is normal because it's a centimeter or 2 centimeters away from the tumor. Not always, but most of the time, you cannot distinguish it from healthy post-mortem tissue.
But most of the time, it's thrown away as medical waste. So we've built up this pipeline to work with lots of neurosurgeons to retrieve that. So within a few minutes, typically two or three minutes, we have a special way to transport it and to perfuse it with carbogen.
So we have this optimized way where, within 20 to 30 minutes, literally, let's say-- we try to keep it under 30 minutes-- after it was cut from the human brain, it sits in our lab, and we cut it in slices and do experiments with it. And you've got to remember, half an hour ago, it was part of somebody's memory of their first kiss or whatever. The other major source of tissue is epilepsy surgery.
Same thing. Very often-- 70% of the time-- you have temporal lobe epileptic seizures. They originate, let's say, in the hippocampus or associated structures, and then the surgeon has to cut through the overlying piece of middle temporal gyrus. So about 2/3 to 3/4 of our tissue comes from middle temporal gyrus. Most of the rest comes from anterior cingulate and other frontal areas.
And then we found out-- we've optimized this now. We do array tomography. We are now starting to do Neuropixels recordings. This is in the patient, with surgeons. I won't talk about that now.
We do single-cell transcriptomics. I'll talk about this. Clearly, it's getting ready. They've done already some samplings, and we're getting ready to do a full [INAUDIBLE] of a cortical column of the human. We're doing slice physiology. I'll talk about that. We're doing synaptic physiology. I won't talk about that. And we're doing viral tools. So you can import a lot of the viral tools, because you can't build transgenic lines.
But you can use AAV. So you can use a lot of the machinery, and the enhancers, et cetera, and the attack sequence. All that knowledge, you import it from the mouse, and you can partially test it in a monkey and then take it into human. So this holds immense therapeutic promise.
And so we work with a variety of surgeons at all the local clinics, and it really works great. And they all work with each other. It's a great thing. In fact, on this paper-- so we have a big Nature paper coming out two weeks from now. These surgeons are all co-authors.
So as I said, we do pretty much the same pipeline for the mouse as for the human. We do, again, NeuN, because we're more interested in neurons than in non-neurons. The study I'm going to talk about here, which we finished, like, a year ago or so, we have eight donors. Four are from neurosurgery, and four are from post-mortem.
16,000 cells. It's all from the middle temporal gyrus. We FACS-sort them, yeah. So we do it eight. I talked about all of this already.
So this is what we find. So if we do clustering, and then we derive a taxonomy, this is what we find in the human MTG. And here we are using nuclear RNA-seq. So typically, the tissue comes from people who are 40, 50, 60, 70, unlike all of our mice, which are all P56. They're 56 days old. We have everything optimized for P56, C57BL/6J.
So mice are highly inbred, and, you know, they're all sacrificed the same day, et cetera. That's, of course, not the case with humans. So we have vastly more diversity. That's just the nature of the game.
And also we can't-- because the tissue has been like this for 60 years, dissociating it without breaking it up is very difficult, but it's easy to get the nucleus. So that's why we do nuclear RNA. And we've done, in the mouse, detailed comparisons where we compare the sequencing we get from intranuclear reads versus reads outside the nucleus. And there are some interesting differences.
So here again, we have two families. Non-neurons and neurons. The neurons divide into inhibitory (GABAergic) and glutamatergic. The GABAergic divide very similarly. So here, they have sometimes different molecular marker genes, but it's the same family, [INAUDIBLE], VIP, SST, [INAUDIBLE].
And then the excitatory classes. Again, we find few classes in the superficial layer, the marker for two, three, and four. So both in the mouse and in the human, which is sort of amazing, we find effectively only one or two cell types in layer four. But we find, like, 10 or 12 excitatory cell types in layer five, the output. And we find, again, lots of them in layer five and layer six.
Once again, we believe that these different branches reflect the different embryologic origin in the medial and caudal ganglionic eminences. But now what we can do, we can co-cluster. We can now establish really good evolutionary relationships between the mouse and the human.
To answer this question. Well, how many cell types does a human have? So many people-- certainly when I give talks, and I ask members of the audience-- many people would assume or do assume that a human, obviously, because we are so much more complex, has many more cells. That the cells are much more complex than the mouse cells.
But, of course, we've got to remember people also thought that in the 1980s when they found 20,000 genes first in the mouse. And then people said, well, obviously, there are going to be 100,000 or 200,000 genes in the human, because we are, after all, Homo sapiens, right.
But it turned out embarrassingly deflationary for us. They are the same genes as in the mouse, and the same story appears to be true here. So here, these are these clustering plots-- these t-SNE, these nonlinear clustering plots-- where we cluster them in these very high dimensions. 5,000, 6,000 dimensions. And then project them down onto those axes that are most easily discriminable.
If you just do straightforward PCA, you can nicely segregate mouse from human. But if you do this canonical correlation analysis, then you can see now the different cell types nicely clustered in human and in mice. So in fact here, we cluster the GABAergic using the same technique in the same space. The triangle is human, and the crosses are the mouse.
And you can really see how they very nicely co-cluster. So for example, [INAUDIBLE] cells. There's one type of [INAUDIBLE] cell in the mouse, and there's one type in the human, and they co-cluster. That's not always the case.
But at the subclass level, not necessarily at the level of what we call T-leaves. T stands for transcriptomically defined. Leaves, it's the bottom layer of the taxonomical tree. At the level of T-leaves, we get seven that cluster one to one. Most cluster at one level above the-- and the final one. So this is how that looks.
So you can align the human non-neurons with mouse non-neurons, glutamate excitatory projection cells, PV, SST, VIP, and [INAUDIBLE] cells. And you can do the following. So this is just for GABAergic neurons. So here are all the different GABAergic neurons we find in either V1 or in ALM, because remember I told you the same.
And here we find all the GABAergic neurons in human middle temporal gyrus. And of course, [INAUDIBLE] different areas, the different techniques. One is nuclear sequencing. The other one is cell sequencing.
These are highly outbred, hugely diverse people. These are hundreds of highly inbred mice all at the same age. Lots and lots of differences. But still, when you do the clustering, you find a whole bunch of cell types, where you get one-on-one mapping. So you can make predictions.
So, for example, we just previously published a Nature Neuroscience paper, where in layer five, we find a cell that has a unique morphology called rose hip, because they look like rose hip bushes. Well, it turns out, transcriptomically defined, there is a cell that looks more like a neurogliaform cell in layer one, but expresses many of the same genes. In fact, 40 of the same genes that are highly differentially expressed that we used to cluster.
There's one type of [INAUDIBLE] cell. So we can predict just like-- so we know this in the mouse. We've now done-- Clay Reid has done the connectomics reconstruction. Those specifically have these nice [INAUDIBLE], these lamp with candles. And so we expect exactly the same thing to be true in humans.
We find one type of somatostatin [INAUDIBLE] cell. So those are the inhibitory cells that project outside, so they have long range axons in the mouse. So first of all, molecularly, we find one cell type that matches human and mouse. In the mouse, we know they have long distance axon projections, the somatostatin. And they're involved in sleep regulation. So the prediction would be that they also have long-range axons in the human, and they're also involved in sleep regulation. Yeah, this is the rosehip cell.
And we see similarities and differences in the [INAUDIBLE]-- for the excitatory cell. A few number match, but then you get differences. You get more differences in-- you get one cell type in the mouse that maps onto four different cell types in MTG. And you get one cell type, the projection neurons, the layer five extra-parametal projection neurons that break down into different cell types in the mouse.
Now, of course, we have to be cautious here with the exact number. The mouse, we have hundreds of mice. This study only involves eight humans. Humans, we're always extremely limited in tissue. So we all suspect that if we have more tissue and dig deeper, we will find more cell types. So right now, it's 76 versus 116 in the mouse. If we dig deeper, we're all sure we'll find on the order of 100 cell types in both.
Now, this is really important, particularly if you are in the big business of making drugs. Well, because here, for example, we show in two cells where we have one-to-one matching. So I told you there's only one type of [INAUDIBLE] cell in the mouse, and there's one type in the human, and they map onto each other. So in other words, they have a large number of differentially-expressed genes that are differentially expressed both in the human and the mouse that they share, on the order of 40.
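To make the idea of a one-to-one match concrete, here is a minimal Python sketch that pairs cell types when each picks the other as its best partner by number of shared marker genes. This is only an illustration and not the canonical correlation analysis described in the talk; the function name, the cell-type names, and the gene labels are all made up.

```python
def reciprocal_best_matches(human, mouse):
    """Pair cell types that choose each other as best match by shared marker genes.

    human, mouse: dict mapping a cell-type name to a set of marker genes.
    Returns a sorted list of (human_type, mouse_type, n_shared) tuples.
    """
    def best_partner(source, target):
        partner = {}
        for name, genes in source.items():
            overlap = {other: len(genes & target[other]) for other in target}
            top = max(overlap, key=overlap.get)
            if overlap[top] > 0:  # ignore types with no shared markers at all
                partner[name] = top
        return partner

    h2m = best_partner(human, mouse)
    m2h = best_partner(mouse, human)
    return sorted(
        (h, m, len(human[h] & mouse[m]))
        for h, m in h2m.items()
        if m2h.get(m) == h
    )


# Hypothetical marker sets (made-up names, not real data).
human = {
    "human_A": {"g1", "g2", "g3", "g4"},
    "human_B1": {"g5", "g6", "g7"},
    "human_B2": {"g5", "g8"},
}
mouse = {
    "mouse_A": {"g1", "g2", "g3", "g9"},
    "mouse_B": {"g5", "g6", "g7", "g8"},
}
matches = reciprocal_best_matches(human, mouse)
print(matches)
```

In this toy data, `human_B2` also points at `mouse_B` but is not chosen back, which is the one-to-many situation where a single mouse type corresponds to several human types.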
But here we're plotting all the genes. We have like 7,000 genes-- [INAUDIBLE]-- probably 6,000 genes that are expressed. Here we plot on a log scale, log2 scale, the expression level of all the different genes in this cell type in the mouse, and here in the human. These bands are plus-minus a factor of 10. So if you're outside this band, this gene, for example, is expressed 30 times higher in the mouse than it is in the human, yet it's still the same cell type. Same thing here for a microglial cell.
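The band on that plot can be written down as a simple test on the log2 fold change. Here is a minimal Python sketch, assuming per-gene expression values for one matched cell type in each species; the function name, the pseudocount, and the numbers are illustrative assumptions, not values from the study.

```python
import math


def outside_band(mouse_expr, human_expr, fold=10.0, pseudocount=1.0):
    """Return {gene: log2(mouse / human)} for genes beyond the +/- `fold` band.

    The pseudocount (an assumption of this sketch) avoids log2(0) for
    genes that are not detected in one species.
    """
    limit = math.log2(fold)  # a factor of 10 is about 3.32 on a log2 axis
    flagged = {}
    for gene in mouse_expr.keys() & human_expr.keys():
        lfc = (math.log2(mouse_expr[gene] + pseudocount)
               - math.log2(human_expr[gene] + pseudocount))
        if abs(lfc) > limit:
            flagged[gene] = lfc
    return flagged


# Made-up expression values for three hypothetical genes.
mouse_expr = {"gene_a": 299.0, "gene_b": 50.0, "gene_c": 3.0}
human_expr = {"gene_a": 9.0, "gene_b": 40.0, "gene_c": 127.0}
flagged = outside_band(mouse_expr, human_expr)
for gene, lfc in sorted(flagged.items()):
    print(gene, round(lfc, 2), "fold:", round(2 ** abs(lfc), 1))
```

With these toy numbers, `gene_a` comes out 30 times higher in the mouse and `gene_c` 32 times higher in the human, so both fall outside the band, while `gene_b` stays inside it.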
So if you're pharma and you used to have a large mouse lab-- because of course, they all shut them down-- and you study this receptor, let's say somatostatin receptor, or serotonin receptor because those are all the psychoactive drugs, and you have a cell type, and you can successfully label that, knock it out, and get some behavioral phenotype, well, there's no guarantee that that gene is also expressed in the human.
So in fact, we're putting this tool online where you can-- for each of these cell types, you can look up your favorite gene. So for example, oxytocin, which we know is critically involved in social behavior and autism, et cetera. So here, what you can see-- this is on a log plot. This is log 2: 2 to the 0, so this is 1. 2 to the 10, so this is about 1,000. So these are the violin plots of the expressed genes: cannabinoid receptor 2, cannabinoid receptor 1, adrenergic receptor, oxytocin. So it always shows that the open one is the distribution of that particular gene in the different cells in the human and in the mouse.
And so you can see many are similar, but many are very, very different. Where you have oxytocin highly expressed in some somatostatin cell type in the mouse, but not in the human, or vice versa.
And so this really helps explain why most mouse models have been a failure. The principles are very similar between mice and human, the principles of processing. But the details differ, and of course, the devil's in the details.
All right. Yeah, so he had to show the principle, right? So let's see, hippocampus. You have a lesion here, has to be taken out. It's a tumor or lesion that gives rise to seizure. So the surgeon will tunnel through here, and will cut this out, this part of medial temporal lobe, and give it to us. [INAUDIBLE] like this. Looks like a piece of sushi. Sometimes bigger. Sometimes it can be as big as this. Very often, it's smaller. On average, it's like a cubic centimeter. So think of a sugar cube.
And then we drop it into this special [INAUDIBLE]-- it's temperature-controlled, and controlled for oxygen, and carbogen and other things. We mark the location so we can later on put it with all the pictures online, so people can then extract the metadata. And we do a number of things with it. We slice it for doing culture, but also for-- slices, but also for doing long-term recordings.
Yeah, and we can perfectly well-- as I said, we have these robots where we can now do automatic patching, up to eight of them simultaneously, where we can record and inject current. We can inject [INAUDIBLE]. We can do the reconstructions.
We do a panel-- and all of this is accessible. We do a panel of [INAUDIBLE], cytoarchitectonics, and markers of pathology. We just do it routinely. We just want to know this information for every one sample. By and large, the samples look very good.
So here, we have one slice where we had, I don't know, 10 neurons that we could inject, record from, and reconstruct. And-- wait, why is this fuzzy? OK, this shouldn't be fuzzy.
OK, so these are layer 2/3. This is the stuff you think with. If you're going to remember anything about this lecture, it is because neurons like these-- so these are layer 2/3 pyramidal cells in your middle temporal gyrus. Take all that sensory input that they get from, let's say, a higher order visual cortex, and encode it, and lay it down as memory many weeks later. You can see they are thin, because that's seen at 50 micrometer of the slice.
So I'm a great believer in that we are all nature's children, and we are all conscious. And so I did this at an old [INAUDIBLE] where we had 300 people in the audience. And I flashed up onto the screen one cell at a time. Not together like here. Different cells. 12 cells, half of which were from mouse, half of which were from human, and we removed the scale bar. And then people had a phone applet to guess which one was human and which one was mouse. Can you do this?
OK. So people were, on average, at chance. Now, that's not to say you can't train somebody. So A, of course, this is human. This is 2.5 millimeter. This is like 700 micron. So that's an obvious way to do. But even if you remove that, of course, you can train people to tell the difference, and you can for sure train a deep network to tell the difference.
But the overall point is the [INAUDIBLE] is just so similar, although the last common ancestor between us and those guys you probably have here in the basement, wild type, is like 60 million years. So that's the last common ancestor. But we are sitting upstairs, and they're downstairs. And why is that? That's what people always want to know.
And most people-- at least in a general audience-- they want to be told, oh, there's a magic explanation. It's von Economo cells, or it's a special brain area that only we have. And all those explanations, if you look deeper, none of them holds true.
We are specialized, but so is a mouse, so are dogs. Each creature is specialized in its own way. And the biggest difference that people routinely under-appreciate is the factor of 1,000. The fact that the dendritic trees are two times larger, synaptic density is somewhat higher. So it's probably going to be a combination of all these different factors that leads to our brain being the way it is.
So here, you can do electrophysiology. So what we discovered was really cool. People were extremely skeptical of this. Look, this is a recording three days after we took it from a human brain. You get beautiful spikes. You get interneurons. You get fast spiking. You can record synaptic potentials.
So here, we plotted for-- it doesn't show-- no, I have another slide. When we do the same thing for tumor patients versus for epileptic patients, we don't find a difference whether the tissue's taken from one or the other. Here, you can see this is time after slicing. So this goes to three days. Yeah, and there are some trends, like the resting potential goes up a little bit. But by and large, it's incredibly stable.
You can't do this in a mouse. In a mouse, typically, if the slice is more than six hours old, the gradients run down, and you have to throw the slices away. Humans, presumably because we live much longer, we may have more antioxidants, we may have more anti-stress factors. We don't really know why it is, but human tissue lasts much longer than rodent tissue.
And then you can find-- so here, we did a study-- we can find differences-- I'm not going to go into this-- between human and mouse. So we tried to identify equivalent cells. This is layer 2/3 pyramidal cells. And you can find differences-- whether they're meaningful or not, who knows-- between mouse and human.
And then you can do the same thing that we did in a mouse now in a human. So you can collect cells. We're doing this in collaboration with [INAUDIBLE] in Amsterdam, and with Gábor Tamás in Hungary, who really pioneered this technique. So all using common standards, we're now doing three modalities at the same time. So-called Patch-seq.
We're going into human cells. Right now we're doing interneurons, and then layer 2/3 pyramidal cells. We're injecting biocytin and reconstructing the dendrites and the axon. We get a standard electrophysiology. We suck out the nucleus to do single-cell transcriptomics on that same cell.
So in other words, we then get the morphology, the electrophysiology, and the transcriptomics all on the same cell to be able to identify, well, if this is a GABAergic cell, which particular transcriptomic type of GABAergic cell is this? Which particular type of layer 2 pyramidal cell is it? And we're doing this now quite routinely.
And we also-- as I mentioned, we're now using these viral techniques, very exciting, where we can get-- these are a class that make up 20% of the cells here. Layer 5 cells, so-called pyramidal tract, that go down to the spinal cord. In the monkey, we've done this. There are many fewer. In the human, we have now a virus that identifies these. They are very specific cells, but there are many fewer. They're only 0.5%, but they project all the way down. And so we're now busy trying to find various promoters for GABAergic cells, for VIP cells, for subsets of GABAergic cells in the human brain.
All right, so let me finish. So the bottom line is I hope I've shown you believable evidence that in any one cortical area there are going to be on the order of 100 distinct cell types. In some areas, it may be 80, in other areas, it may be 120, but it's going to be on the order of 10 to the 2. With most inhibitory interneurons shared across regions. Again, it's very difficult to be perfectly general. There may well be some distinct inhibitory cell types, but if they exist, they seem to be in the distinct minority.
What remains unclear as of now is how these different properties map onto each other. So how does the morphology, where they project to, the electrophysiology, and the genes that they express, how does it map onto each other? Does it map onto each other totally one to one? It's unlikely. What are the redundancy in the-- we see redundancy, and we see degeneracy, but we need to understand that. By "we," I mean collectively the field.
The brain as a whole, whether it's going to be human or mouse, is going to be on the order of 1,000. Maybe it's 2,000, but on that order. Probably not 10,000, and certainly much more than 100.
So the big question is, particularly for you, who think about the brain as a computational machine-- that I'm not fully sure is the right way to think about it. But if you do, you're confronted with this challenge. What is the function of 1,000 cell types from a computational point of view?
In particular, I'd like to remind you of this foundational paper of computational neuroscience. It was published almost 80 years ago during the war by McCulloch and Pitts at the time at Chicago, [INAUDIBLE]. And they published this beautiful paper-- if you haven't read it, you should go out and read it immediately-- in which they propose one of the first neural network models. It was a very simple model where [INAUDIBLE] all or none. Inhibition has a veto power. It's a threshold model that they proposed.
And then they showed systematically, they showed how any one logical proposition-- like all men are mortal, and Socrates is a man, therefore, Socrates will die-- how all such propositions could be mapped onto networks of these neurons. It was immensely influential. Really influenced the way society's organized and civilization. There's a direct connection between that and John von Neumann-- his book, The Computer and the Brain, that he finished in '57 when he was dying-- and Alan Turing, and the rest literally is history.
So we are influenced-- we have this profound belief going back to Leibniz, and probably can trace it back even earlier to the Greeks, that the brain-- not the kidney and not the heart, but curiously only the brain-- instantiates these steps, this operation that we think of as computation, and that is some primary function of the brain. In particular, of this cortical tissue.
Now, we know that simple McCulloch and Pitts neurons are Turing complete. So in fact, we know that just NAND gates, or Rule 110, or the Game of Life, all those things are very simple and they're Turing complete. In other words, anything that can be computed on a Turing machine with infinite memory can be computed with such simple networks.
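A threshold unit of the kind McCulloch and Pitts described can be sketched in a few lines of Python. The version below assumes binary inputs, an all-or-none output, and absolute inhibition (any active inhibitory input vetoes firing); the function names and the constant "always-on" drive used to build NOT are choices made for this illustration, not notation from the original paper.

```python
def mp_neuron(excitatory, inhibitory, threshold):
    """All-or-none threshold unit with veto inhibition.

    Fires (returns 1) only if no inhibitory input is active and the
    number of active excitatory inputs reaches the threshold.
    """
    if any(inhibitory):
        return 0
    return 1 if sum(excitatory) >= threshold else 0


def and_(a, b):
    return mp_neuron([a, b], [], threshold=2)


def or_(a, b):
    return mp_neuron([a, b], [], threshold=1)


def not_(a):
    # A constant excitatory drive (assumed here) that input `a` can veto.
    return mp_neuron([1], [a], threshold=1)


def nand(a, b):
    # Two-layer network: an AND unit feeding a NOT unit.
    return not_(and_(a, b))


def xor(a, b):
    # Built from NAND alone, to illustrate that NAND suffices for Boolean logic.
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "AND", and_(a, b), "OR", or_(a, b),
              "NAND", nand(a, b), "XOR", xor(a, b))
```

NAND alone is enough to build any Boolean function, which is the sense in which networks of such units can express arbitrary logical propositions. Getting from there to a full Turing machine additionally requires unbounded memory, as noted in the talk.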
So why the hell do we need 1,000 of these incredibly complicated cell types of neurons? What do they possibly compute? You need a neuron for doing inhibition and for doing gain normalization, but why do you need 15 types of somatostatin cells and 13 types of PV interneurons? And I don't have the answer.
But we suspect-- and certainly, if you read the biological literature-- that much of that has to do with developmental reasons. Your brain had to develop from a single cell. There are all sorts of neurons that we know are progenitor neurons that are expressed transiently, and some remain around. Some are evolutionary remnants, no doubt, from evolutionary ancestors. Some may be there for metabolic reasons. So these are other reasons, rather than purely computational ones.
But for us, of most interest is this. The belief of many of us in biology and in medicine that many diseases-- many ophthalmological diseases, eye diseases that are specific to the eye, psychiatric diseases, immunological diseases-- are not just general degenerative diseases, but at least start out with one or two specific cell types, that they're cell-specific diseases.
So for example, there are many types of retinal blindness that are not generic damage in the eye, but are very specific genes that are missing in the photoreceptor, for instance, or in [INAUDIBLE] cells, or in bipolar cells. And so that then gives you a promise for therapy, because if you can introduce that gene, then you can help them.
There's this spectacular success-- now, unfortunately, it costs $2.1 million by Novartis, but there's this spectacular success in an AAV trial, a single-shot AAV trial for infants who suffer from SMA, spinal muscular atrophy. In many of them, it's deadly. They die within two years. It's a dysfunction of a single gene that's expressed in an alpha motor neuron in the spinal cord.
There's now an AAV treatment. Very successful in 12 children, where not a single one has died-- usually 90% die-- where you have a virus that delivers that gene to only those motor neurons. So if we can do--
There's evidence that some Parkinson's disease, of course. ALS, amyotrophic lateral sclerosis, schizophrenia. In many of these diseases, there's at least a very promising hint that the disease is specific to a cell type, or starts out with a specific cell type, or a specific synaptic connection among cell types.
And so therefore, if you can identify the subtypes in the human-- because mice don't really get Parkinson's. They don't really get schizophrenia. If you can identify those in humans, then you have a very promising avenue using AAV or other viral vectors to introduce the missing or defunct gene, and introduce them back into the patients to alleviate or maybe eliminate the symptoms.
So plus, in the mouse, of course, or in the monkey, it gives us very powerful tools to perturb. So independent of what the functions of these are, we can use them to access the brain in a causal way in animals. And we can maybe access them to devise more powerful diagnostic tools for the humans.
And with that, this is our team at the Allen Institute. And this is our founder, who started us all off on this road, particularly the road towards looking at big problems in large teams, and making all the knowledge available for all of humanity for free. Thank you very much.
[APPLAUSE]