Building Brain Observatories for Large-Scale Cellular Surveys
Date Posted:
August 11, 2019
Date Recorded:
August 11, 2019
CBMM Speaker(s):
Christof Koch
Brains, Minds and Machines Summer Course 2019
Description:
Christof Koch, Allen Institute for Brain Science
GABRIEL KREIMAN: OK. Good morning, everyone. It's a great privilege to get Christof to give his third talk here. We really made it work this time.
So in out ultra-specialized world, you will encounter scientists who are world experts in the [INAUDIBLE] data of [INAUDIBLE], like the [INAUDIBLE] days of Tuesday mornings when it's raining. In stark contrast to that, [INAUDIBLE] perhaps the only person in the world who can seriously scrutinize physics and neurons. It's a beautiful book, Biophysics of Computation, that he wrote. All the way up to [INAUDIBLE] years of consciousness, as he'll discuss here today.
And then the two talks today, one's starting now with the [INAUDIBLE]. The next one [INAUDIBLE] I think maybe [INAUDIBLE] interesting [INAUDIBLE] here [INAUDIBLE] to think about potential projects. I encourage you to think about all the amazing things that [INAUDIBLE] doesn't exist on the basic thing that he can do with this type of power. So without further ado, Christof.
CHRISTOF KOCH: Thank you. Thank you, Gabriel. Yeah, it's really a joint presentation with my junior colleague Michael Buice over there, who's going to give the immediately following talk about the brain observatory in much more detail than I will.
So what we have set out to do at the Allen Institute is build brain observatories in the mouse-- today is all mouse, not human-- for large-scale cellular surveys. And one of the key motivations for doing this is the current reproducibility crisis that we're in. So here, I'm just showing you two headlines.
One is the cover of The Economist, which is a very science-oriented weekly international publication, an Anglo-Saxon publication. And it goes into depth into the study that we all know about. Here's a review from Nature Reviews Neuroscience: it has been claimed on statistical grounds, and demonstrated quite often, that many, probably more than half, of the conclusions drawn from biomedical research are probably false. So we, of course, always believe that about the other field.
We believe it in fMRI, for instance, if we don't do fMRI. Or we think it's in psychology and in priming, or we think it's in cancer biology. Or we think it's in cell biology.
Because in all those cases, people have done studies of reproducibility. In fact, there's a large-scale reproducibility project ongoing in psychology right now as we speak. And if you talk to folks in industry, they tell you: well, unless several independent academic labs have reproduced something, we routinely disregard high-profile publications that claim to find some new effect.
This is a book that just came out. I can warmly recommend it. It's by a medical philosopher, Stegenga at Cambridge. Medical Nihilism-- by which he means not that modern medicine doesn't have some very effective treatments.
He says explicitly, there's no place I'd rather be after a serious accident than in the emergency room of a clinic. And of course, there are a few magic bullets, like insulin, like penicillin, like Gleevec. But the vast majority of medical interventions, including lipid-lowering drugs and anti-depressants, serotonin reuptake inhibitors, et cetera, are probably at best ineffectual.
And so there are many culprits here. One of them is, for example, a very strong publication bias. And of course, we all know you only publish positive results.
Negative results are very difficult to establish. So the literature is full of positive results, whether these be clinical trials that supposedly show the effectiveness of some medical intervention, or reports about cellular, behavioral, or channelrhodopsin interventions, or whatever it is. And it's much more difficult to publish negative results, which is, of course, a huge source of bias.
There's p-hacking. There's HARKing-- hypothesizing after the results are known. There's all sorts of bad statistical practice that we all engage in.
All right, so one way to get around that is to do what, for example, astronomy has done. Astronomy, of course, suffers from an issue that we don't have: you can't intervene in astronomy. You can't just turn the sun on or off or something like that, unlike in biology.
But what astronomy has done over the last 100 years is these large-scale surveys, which are hypothesis-free. And of course, your NIH program officer doesn't like a hypothesis-free R01, right? You have to have some particular hypothesis to chase down. But then, of course, you're very biased toward finding results in favor of that hypothesis.
So one way to try to get at reproducibility is to do large-scale, unbiased surveys. Of course, nothing is ever truly unbiased. You can just try to be as explicit as possible about your biases, describe them in detail, and discuss them at length with all your colleagues.
And then also to make all the data available, so that if you don't trust our data, well, here's our data. And here's our metadata. And here's our Jupyter notebooks.
This is how we got from the raw data to the final published paper, including all the parameters. So you can yourself, A, reproduce it. And B, you can play around and maybe find other statistical relationships between variables in the data. And I think that's absolutely essential going forward for science if we really want to be successful. In particular, to deliver on the promise of neuroscience for therapeutic value, which is what most people care about, we have to practice these techniques.
All right, so the vision comes from Caltech, where we have a large number of astronomical sites. This is one, the so-called Thirty Meter Telescope. It was supposed to be built 10 years ago on Mauna Kea in Hawaii.
It would have been, or could be, the world's biggest telescope, 30 meters, with independently steered mirrors. However, due to local protests on Mauna Kea, it hasn't been built and probably will never be built. But the point about these observatories is, whether they're terrestrially based or space based, they have separated the process of building instruments from the process of analyzing the data.
So in astronomy, there are really three communities. There's the theoretical community, the cosmologists. Then there's the community of engineers and physicists and software people who build the instruments and who run these large-scale sites.
And then there's the third category of astronomers who, let's say, work on solar astronomy or extragalactic radio sources, you know, black holes or whatever they do. And the way they work, they apply for observation time on a particular telescope. And since observational time is always a scarce resource, they have to compete with other proposals.
And then the best proposals are chosen. And then, you know, they observe whatever they observe with the relevant instrument. And then sometime later, the PI gets all the data and all the metadata for analysis. This is how astronomy works, and, incidentally, also particle accelerators.
And the idea when I came-- Clay Reid and I, who were sort of the senior PIs in this effort, wrote about this together. So we imagined that we'd set up an observatory that would be open ultimately to the community, which we're doing now, where people such as yourself can apply-- we'll talk a little bit about that-- for observational time.
Well, we have different sets of instruments. And of course, again, astronomy, in some sense, is vastly simpler. There's only one sky. There's standard coordinate systems. You can't intervene in the sky.
So it's much, much different from biology where, of course, you have different animals. There's no, you know, unique coordinate system. You can do a vast number of different behavioral manipulations, and so it's much more challenging.
And so we'll see in the fullness of time whether this works out. But the instruments we wanted are for the mouse-- this is all on mouse. And for a variety of idiosyncratic reasons, we have focused on vision. It doesn't have to be vision.
We wanted three sets of instruments. We wanted multi-photon imaging-- we have 2p and 3p cellular imaging--
1p wide-field imaging-- think of that like the equivalent of fMRI, albeit at the single-trial level, where we can survey the entire dorsal surface of the mouse-- and then high-density electrical recordings. And at the time, we were imagining thousands. Now, we're talking about tens of thousands of neurons that we can image simultaneously in the standard way.
So that's how we started off, back seven years ago. In parallel at the institute, the 10-year plan that we initiated was to focus a lot of resources onto primary visual cortex, to really understand this highly complex piece of excitable matter. So we are generating the data with a series of pipelines. So you'll see that here, the observatory.
Yesterday evening, I talked about the cell types. In the mouse, at least initially, it was all focused on V1: the morphology, the electrophysiology, and the transcriptomics. Then we have a very detailed connectivity atlas that analyzes in detail the massive number of projections going into visual cortex and leaving visual cortex. I won't talk about that. I talked about it last night.
We're building this brain observatory. And I also won't talk about it. We have this very major effort under Clay Reid and Nuno da Costa, where we have now cut, sectioned, and imaged-- and are now reconstructing, together with Sebastian Seung-- a cubic millimeter of visual cortex.
So this really requires, particularly if you want to do imaging under standard conditions, that you have an exact standardized map. In mice, we can do that, unlike in humans where it's much more difficult. We can do it in mice, because these are inbred mice. These are not wild mice. These are C57BL/6J mice that-- how many of you are doing mouse experiments, anybody? One, OK, two.
Well, the majority of mouse researchers, of which there are, like, 4,000 labs in the world, use one inbred strain, for better or worse. It's a monoculture out there, by and large. It has advantages, of course, and also disadvantages. And so that's the mouse we use, C57 black 6J.
And then, of course, here we are superimposing the brains of 1,300 of these mice. You can really see lots of features, including the whisker barrels, et cetera. So it really tells you they are much more standardized, because these are all inbred animals that have very similar genomes.
So this is the advantage of having this CCF, the Common Coordinate Framework. We developed this over the last eight years, a series of them, that everybody now uses. So we have roughly half a billion voxels. And we know that an individual voxel is 10 by 10 by 10 micrometers.
So we know exactly where we are. Whether we are doing an ophys recording with optical physiology, or an ephys recording with Neuropixels, or we're doing a patch, or any other manipulation, we know exactly where we are at that sort of resolution. And we have a detailed atlas with roughly 900 different regions. So no matter where you are in the mouse, if you refer to it here, everybody knows what we're talking about. That's really essential if you want to do highly reproducible science.
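To make that concrete, here is a toy sketch of the kind of coordinate-to-region lookup a voxelized atlas like the CCF enables. The volume dimensions, region ID, and coordinates below are invented stand-ins for illustration, not the real CCF annotation volume:

```python
import numpy as np

RESOLUTION_UM = 10.0  # CCF voxels are 10 x 10 x 10 micrometers

def um_to_voxel(coord_um):
    """Map a physical (AP, DV, ML) coordinate in micrometers to a voxel index."""
    return tuple(int(c // RESOLUTION_UM) for c in coord_um)

# Toy annotation volume: each voxel stores a brain-region ID.
# (The real atlas volume is far larger and has roughly 900 annotated regions.)
annotation = np.zeros((100, 100, 100), dtype=np.int32)
annotation[40:60, 20:40, 50:70] = 385  # invented ID standing in for "VISp"

def region_at(coord_um):
    """Look up the annotated region ID at a physical coordinate."""
    return int(annotation[um_to_voxel(coord_um)])
```

With a shared volume like this, a Neuropixels track or a patched cell can be reported as region IDs that every lab resolves identically.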
All right, then we also developed this. This is just a lot of infrastructure that you need to develop. It's about as sexy as toilet plumbing. But believe me, as you've all experienced, if your toilet at home doesn't work, you're in trouble.
And so you need to develop this. For example, you need to develop standardized protocols. So we spent a lot of time working with other people in the field to develop this data standard-- it's not the data standard yet; I hope it will be-- which allows people to store any sort of cellular data, whether it's electrophysiology or optical physiology. It's called Neurodata Without Borders.
So it has its own web page. There are various releases now; the current one is NWB 2.0. It includes data and metadata in a standard format, where anybody can then download these data and analyze them.
And there's a Python version. There's a MATLAB version. You can open them up, and you can do all your basic analysis. I think you're going to talk about some of that, right? So Michael will talk a little bit about that.
And everything we do we release NWB. And a few other labs are now also using NWB. NIH is strongly encouraging it. That's for cellular level data.
So yeah, so this brain observatory has a number of components. There's wide-field imaging and two-photon imaging. I'll talk a bit about that. And then Michael will talk much more.
There are the Neuropixels recordings. I'll talk about that. And then there is an entire modeling effort, where we do either point models, because we have the positions of all the neurons here, or very detailed biophysical models, or more high-level population statistical models or machine intelligence models. I won't talk about that.
All right, so-- ah, I forgot my mouse. Wait, I have a mouse, mouse brain, a C57 of course. OK, so this is the mouse brain.
Here, it's actually affected to the-- all right, so if you want to pass it around. So there are thousands of papers written on this brain. That's the actual size printed directly from the CCF, 3D printed.
And this is a keychain holder. This is a factor of eight bigger in volume, a factor of two linearly, in X, Y, and Z. So this guy has 74 million neurons. And as I showed you yesterday, its cell types are not that different from human cell types.
Yeah, so this is the mouse brain front to back. So the eyes are somewhere here. They go from the eyes to the LGN.
And this is another part of the thalamus called the LP. Think of it like the monkey or human pulvinar. And then here, the visual areas are at the back, roughly where they are in us, and that's what we study. So there are six visual areas, visual cortical areas, that we study.
Of course, the visual input from the eye goes to more than two dozen different areas. In fact, a massive projection goes to the superior colliculus, and a somewhat smaller projection goes to the LGN. But there are lots and lots of other projections.
We're studying the cortical one, because we're interested in cortex. And of course, in us, it's the dominant one. Yeah, so here you can see this network of different areas: LGN, V1, these higher-order areas. And then there's the other visual thalamic nucleus, called LP. As I said, think of it like the pulvinar.
And we can do routine imaging. So this is wide field imaging. We do this in every mouse.
So think of it like fMRI. We do that before we do cellular imaging or before we do Neuropixels recordings. So we know exactly, when we have our Neuropixels electrodes, where we put them in. Or when we do optical physiology, we know exactly where we are with respect to these maps.
So we're trying to do the functional maps-- well, we are doing the functional maps in each animal. And of course, then we have the standard anatomical, because we have its brain. And we obviously reconstruct its brain afterwards. So I think that's really important to know, to have high confidence, in where you are.
So there's this bevy of areas. This is primary visual cortex. And then there are these bevies of other higher-order visual areas. So people have identified roughly 12 different visual areas.
Some other senses over here, auditory is over there. Front is there. Back is here. Medial is here.
All right, so we have this standard, what we call a pipeline. A pipeline is a set of procedures that are highly standardized. So unlike at a university, where you, the grad student, have to do all of this by yourself for better or worse, and you learn from your predecessor, we operate slightly differently.
We have a series of detailed white papers and detailed SOPs, Standard Operating Procedures, where we write down, in as much detail as humanly possible, the exact procedures that we undertake. And so if we get a new team member, they get sort of indoctrinated into the SOP method. So we try, again, to do things as similarly as possible. Of course, occasionally we have to change, because we discover there's some better way of doing it. But there's a change management process.
So the advantage of doing things standardized is that they're highly standardized and highly reproducible. The drawback is it's difficult to change things. You can't just sort of everyday twiddle and optimize things like you can do in a university lab. That's a drawback, but it comes with the advantage that you can compare things across days and weeks and months.
OK, you know, we do the surgery. We let the animal recover. We do the intrinsic imaging. We habituate the animal.
If we do a behavioral training, we train the animal. We do the functional imaging or the Neuropixel recording. We again do imaging.
You know, because this functional imaging may have been a week or two weeks or longer if there's behavior ongoing. And then we do the reconstruction of the brain, so we know where we are. And we also look for brain damage.
So we have an engineering team, a team of 14 engineers or so, that builds all these little widgets and gadgets and, you know, 3D models, [INAUDIBLE] models. You can access all of that if you wanted to. We spend a great amount of time and effort on making things, again, very reproducible.
So we have these fixtures. We try to position things highly accurately from one animal to the next. So each of these is made for each animal and 3D printed. So we can now go back up to 10 times, over 10 consecutive days of imaging, and find the same cells again. That's the advantage of using optical imaging.
OK, so for the optical brain observatory that I'll talk about and that Michael will talk about, we wanted to know: what is the response of a large number of cells to a standard battery of visual stimuli that people have used over the last 30 years? That is, drifting gratings-- you know, drifting in various directions, at various speeds appropriate to the animal; the acuity of the animal is roughly 50 times worse than ours-- static gratings, locally sparse noise, natural scenes from a natural scenes database, and movies. So originally, we used Touch of Evil.
How many of you have seen this? It's a great movie, Touch of Evil. We chose it, I think, because it has this long single-shot opening sequence. But these days we show a greater diversity of movies, particularly movies from nature shows. Because we figure mice haven't seen too many movies that take place on the Mexican-American border and involve drugs.
And then spontaneous activity-- and so in the data you'll see in the first part, we're doing three imaging sessions over three consecutive days in many animals. So the hope was-- it didn't quite turn out to be the case-- that, if we use different transgenic lines, we can do imaging in different populations of cells, which is true. So what do I mean by that?
So these are different lines that we build. We build more transgenic lines certainly for cortex than anyone else. We make them available through [INAUDIBLE]. So any of these you can order.
Some of them are very popular, where we have a fluorescent calcium indicator, GCaMP6-- we use either 6s or 6f-- expressed in a particular subtype of cells. So this is the advantage of doing optical imaging. It comes with drawbacks. I'll talk about that.
But the advantage is that you can do imaging in particular, genetically identified subtypes of cells. So you only image, for example, in excitatory cells. We have some lines, like the Emx1 line, that image excitatory cells in layers two/three, four, and five. Or this Rorb line only images layer four excitatory cells; another, layer five excitatory cells. We have one that's layer six.
And then we have some inhibitory ones. SST, VIP, these are very interesting, and now PV. So that's the advantage of using this technology. We used 10 different excitatory lines-- unfortunately, this didn't quite pan out-- and two inhibitory lines.
And then for each one-- so these are the different lines, or some of the different lines. Again, we do this intrinsic imaging to ascertain, you know, do they all have normal maps, et cetera? There is a belief, of course, in the field that transgenic animals are basically normal, happy animals.
Although they will have something changed in their genome. Why? Because you had to insert something to get your GCaMP, your genetically encoded calcium indicator, expressed in a specific genetic subtype of cells with a specific promoter.
And in some of these animals, you not only insert one gene or two genes, but you may have to insert three or four genes. And once again, if you do sort of long studies over many years of these lines, you'll see, unfortunately, that some of these lines aren't exactly like wild type. They're not just like happy mice that happen to have this magic thing that makes layer four cells glow when they fire action potentials.
All right, so the first survey took us a long time to set up, because we had to set up the entire machinery of building these highly reproducible sets of instruments in behaving animals, which neither we nor anyone in the field had ever done before. And so the first survey is now coming out in Nature Neuroscience: six cortical areas, three hours of these visual stimuli, using two-photon calcium imaging with this calcium indicator.
Everything is freely and publicly accessible. You can download it. The manuscript has been downloaded several thousand times. It went up on bioRxiv last year. The data has been available for two years, the preprint for one year, and now the actual paper is coming out.
Now, the actual work is really led by three brilliant young scientists that joined us as investigators, so Saskia de Vries, Jerome Lecoq, and Michael. Michael Buice is here, is going to be talking more about this.
All right, so the standard view is this, a view that I grew up with and that you guys also grew up with, because it's part of all of our textbooks in neuroscience: in early visual cortex, in humans, or in monkeys, or in cats, or in other mammals, there's sort of the standard model, where cells have a well-defined receptive field. So they're looking at the world through these small or larger receptive fields. And you can essentially predict their spiking responses by doing a convolution of whatever stimulus falls into the receptive field, passing it through some nonlinearity, like half-wave rectification and squaring, and then some sort of normalization. That's sort of the standard model. Of course, that's also used in many neural networks.
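The stages of that standard model can be sketched in a few lines. The Gabor receptive field and all parameters below are invented for illustration; this is a cartoon of the linear-nonlinear scheme, not the actual analysis:

```python
import numpy as np

def gabor(size=16, theta=0.0, freq=0.25, sigma=3.0):
    """A Gabor patch used as a stand-in linear receptive field."""
    ax = np.arange(size) - size / 2.0
    x, y = np.meshgrid(ax, ax)
    xr = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return envelope * np.cos(2.0 * np.pi * freq * xr)

def lnp_response(stimulus, rf, sigma_norm=1.0):
    """Linear filter -> half-wave rectification and squaring -> normalization."""
    drive = float(np.sum(stimulus * rf))      # linear stage: filter the stimulus
    halfsq = max(drive, 0.0) ** 2             # half-wave rectify and square
    energy = float(np.sum(stimulus**2))       # crude stand-in for pooled contrast
    return halfsq / (sigma_norm + energy)     # divisive normalization

rf = gabor(theta=0.0)
preferred = gabor(theta=0.0)          # grating-like patch matching the RF
orthogonal = gabor(theta=np.pi / 2)   # same patch at the orthogonal orientation

r_pref = lnp_response(preferred, rf)
r_orth = lnp_response(orthogonal, rf)
```

A matched stimulus drives the model cell strongly, the orthogonal one barely at all, and a contrast-reversed stimulus not at all; that orientation tuning is exactly what the half-squaring nonlinearity builds in.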
So this was one of several articles that appeared-- the other one was by Bruno Olshausen-- that sort of questioned this and said, well, to what extent do we actually know that this model is the case? Or, for what fraction of cells does this model hold? Is it only true for a very small subset of cells that we happen to record from, because they are shouting the loudest?
Because you have to remember, if you do classical electrophysiology with an electrode, you drive your electrode blindly into cortex. You're listening to an audio monitor until you come to a cell-- brrp, brrp, brrp-- that's nicely modulated. And that's the one that you're recording from.
So that's like interviewing the people in the country that shout the loudest. And you're only going to interview the people that happen to shout the loudest. So you're going to get a view, but you're going to get a biased, a partially biased, view.
All right, so here you can see the mouse watching Touch Of Evil, the movie. And there you can see sort of the flickering activity. So typically, we get, I don't know, on the order of 60 or 70 cells-- Michael, what's the average number of excitatory cells that we see in a single field of view, like 60?
MICHAEL BUICE: About 180.
CHRISTOF KOCH: 180, OK. So we can record simultaneously at frame rate. So this is slow frame rate. But this is typical. This is what two-photon calcium imaging is. It's 30 Hertz. And then we track the eye to make sure we know where the animal is looking and to regress against that.
All right, then for more housekeeping, we have this. Again, we think this gives rise to high-quality data: after the imaging has been done, we are still doing checks for quality control. So we have lots of quality control checks. And the people who are doing the quality control typically may not know the end result or how good the data was-- again, to keep it as unbiased as possible.
So from the 422 mice for this particular paper that entered this pipeline-- which goes through the surgery, the intrinsic imaging, the two-photon calcium imaging, and then the histology-- lots of them get rejected. The two biggest sources are brain health-- if we look in detail at the brain and the histology doesn't look as good as it should, we reject the animal, even though it may have been imaged-- and Z drift, although we've managed to reduce this now.
All right, and then we have this. So you can go online and look at all these cells. And Michael is going to talk more about this.
So you can download 60,000 cells and get for each of them, in principle, up to 3 hours worth of responses. And so we quantify this. We spent a lot of time generating these cute plots that I'm not trying to decode for you.
Here, for example, you can see these so-called corona plots. This shows you the response. So we show 101 natural scenes. And each one you can see. This cell responded on multiple trials to this one natural scene very strongly, to this image somewhat more weakly, and to most of the images hardly at all.
Or you can look at the receptive field. Or you can look at the natural movies. Or you can look at the response to drifting gratings. So it's a quick way to summarize. And then there's an SDK, a Software Development Kit, that Michael is going to talk about, that allows you to download all of this and analyze it, and do orientation tuning curves and correlation curves and whatnot.
All right, something else we did. We wanted to know-- so everybody records with two-photon calcium imaging and publishes papers on it and assumes these are the cellular responses. And most people think, oh, cellular responses are really like spiking responses. But they aren't.
And there are a few people who have done this. For example, Karel Svoboda has done this work. It's very difficult. It's very demanding.
So what we did here: in these animals, we simultaneously recorded using two-photon calcium imaging, and then we did a loose cell-attached patch on one of the cells. So in other words, we get ground truth. We know what its actual electrical behavior is, whether it spikes or doesn't spike. And so now we can compare its electrical behavior to the observed optical imaging data.
And the bad news is-- well, so here we plot-- this is an analysis done by one particular scientist under the supervision of Michael here. So this shows you, for two different excitatory lines, layer two/three lines in cortex, the probability of detecting an event in the calcium imaging.
So we used a theoretical method that extracts events, known as L0 events. And Michael will tell you more about it.
So we get these changes in fluorescence, delta F over F. And then we extract significant events. And this is what many people do.
And the question is, what's the relationship between those significant calcium events and the underlying spiking events? I mean fast Hodgkin-Huxley action potentials. And here we plot the actual spikes, measured using the loose-patch electrode. And here we plot the probability of detecting one of those calcium events, the L0 events.
And you can see, OK, if there are no spikes, the signal basically doesn't respond. So it's a very low false positive. That's good.
But what's not so good: if you have one spike, there's less than a 5% chance of getting an L0 change in calcium fluorescence. Only if you have four or five spikes within half a second do you get, like, you know, a 70% chance of detecting an event. So these are different time windows.
And this is the strength of the event. Essentially, when you're looking at two-photon calcium imaging data, you're seeing sort of bursts of spikes-- 3, 4, 5, 6, 7 spikes within an observational window of, let's say, half a second, or a quarter of a second, or 3/4 of a second, OK? You will almost never pick up individual spikes, or two spikes that fire within 200 milliseconds or something like that. All right, that's the weakness of all calcium imaging schemes.
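You can see why this happens with a toy simulation: convolve spikes with a slow calcium kernel, add noise, and threshold. The amplitudes, noise level, and threshold below are invented, and the detector is a simple threshold, not the L0 method used in the actual analysis:

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 30.0                           # imaging frame rate, Hz
t = np.arange(0.0, 10.0, 1.0 / fs)  # 10 s of imaging
tau = 0.5                           # rough GCaMP6 decay constant, s
kernel = np.exp(-np.arange(0.0, 2.0, 1.0 / fs) / tau)

def dff_trace(spike_frames, amp=0.05, noise_sd=0.03):
    """Synthetic dF/F: one small calcium transient per spike, plus noise."""
    spikes = np.zeros_like(t)
    spikes[spike_frames] = 1.0
    clean = amp * np.convolve(spikes, kernel)[: len(t)]
    return clean + rng.normal(0.0, noise_sd, len(t))

def detect_events(trace, thresh=0.1):
    """Toy detector: flag frames where dF/F exceeds a fixed threshold."""
    return trace > thresh

single = dff_trace([100])                      # one isolated spike
burst = dff_trace([100, 101, 102, 103, 104])   # five spikes within ~170 ms
```

The transients from a burst summate well above the noise floor, while a single spike's transient is comparable to the noise and is routinely missed, which is the asymmetry seen in the measured detection curves.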
So here, we have four example cells, just to see what-- so here, this is the delta F over F trace. These are the events, and the magnitude of these L0 events that we infer, for this particular cell. So this is an inhibitory somatostatin line. This is another inhibitory VIP line. Those are two excitatory lines.
This is the response. So these cells are tuned to this particular orientation. That cell is tuned.
This one cell isn't tuned at all. This one is also tuned to static [INAUDIBLE]. These cells respond to a few natural scene images-- here four, there only two.
This shows you the movie. So we repeat the movie, in this case, 10 times. And this shows you the response to each repetition of this movie.
So it shows that the cell responds very reliably to something happening in the movie here, or there, or there, or there. So it responds quite reliably. And these are the receptive fields.
All right, so Michael will go into more detail on these results. I'm just highlighting two results. One is that it's a very sparse code that we find in this optical imaging. And some of this we're now replicating using Neuropixels.
So here, we compute lifetime sparseness and population sparseness. This is the measure. It comes from Jack Gallant. So here, for example, we have the response of one particular cell to different natural scenes.
And the response to most natural scenes was either nothing or tiny. But for some natural scenes, it responds, you know, with a large response, as you can see here. So this cell has a high lifetime sparseness. In the limit, a lifetime sparseness of one means the cell responds to only one thing.
A lifetime sparseness of 0 means it responds to everything equally. So this one here has a much lower lifetime sparseness, OK? And then you can do the same for population sparseness.
So here, you're asking: how does one cell respond to different images shown over the three days, the three imaging sessions? Or you can ask: given that I'm seeing 180 cells in this one window, what is their response to one image? So that's the population sparseness.
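For reference, the Gallant-style sparseness measure can be computed in a few lines; whether you feed it one cell's responses across images (lifetime) or all cells' responses to one image (population) is the only difference. The example response vectors below are invented:

```python
import numpy as np

def sparseness(responses):
    """Vinje-Gallant sparseness: 0 if the responses are all equal,
    1 if exactly one stimulus (or cell) gives a nonzero response.
    Assumes nonnegative responses that are not all zero."""
    r = np.asarray(responses, dtype=float)
    n = r.size
    return (1.0 - (r.mean() ** 2) / np.mean(r**2)) / (1.0 - 1.0 / n)

one_hot = np.zeros(100)
one_hot[7] = 1.0                    # a cell that responds to a single scene
flat = np.ones(100)                 # a cell that responds to every scene equally
mixed = np.array([0.0, 0.1, 0.1, 0.2, 3.0])  # mostly tiny, one big response
```

For lifetime sparseness, `responses` would be one cell's mean responses across the natural scenes; for population sparseness, the roughly 180 cells in one field of view for a single scene.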
And here, we have a lot of data packed in. We spent a lot of time making comprehensible ways to illustrate the data. I think they're at least aesthetically appealing-- at least we'd like to believe so.
So here, this is called a parse plot. This is V1, the response of V1. And these are the responses of LM, AL, RL, PM, and AM. So these are the six different areas that we imaged.
The challenge we have, we have massive amount of data. We've got 60,000 cells. We've got all these different tuning properties. So how do we best represent that?
You know, so you can either go through individual plots till the cows come home, or you can try to abstract. So here we do this parse. We show, for example, the response in all these different areas, in layer five, in layer four, in layer two/three.
And I'm not going to go-- and here, we see it in one area. This is in V1, for the different Cre lines. So you can see, for example, VIP cells tend to be very prolific responders.
They respond quite indiscriminately to things, whereas SST cells are very discerning. They're much more discerning. So you have two very different classes of inhibitory cells.
And then you get these excitatory lines. And they have sort of a complex pattern of responses. But the bottom line is: cells are sparse, in both lifetime sparseness to natural stimuli and population sparseness. And this doesn't quite accord with the view that, no matter what comes into the receptive field, the cell will always respond.
AUDIENCE: Is all this averaged over running speed?
CHRISTOF KOCH: So this is-- no, here-- I mean, so we've done a totally separate analysis where we regress against running speed.
AUDIENCE: So running speed is--
CHRISTOF KOCH: This one, I think this is averaged over running speed. What?
MICHAEL BUICE: All of this ignored running speed.
CHRISTOF KOCH: Yup.
AUDIENCE: It's averaging that?
MICHAEL BUICE: Yeah.
CHRISTOF KOCH: That's one point I wanted to make: sparseness. B, I'm just going to talk about this. So here, we plot the cells that are responsive. And by responsive, we mean we take the stimulus they respond best to.
So here, let's say we show gratings that move in eight different directions. And we pick the grating the cell is most responsive to. And we ask, well, how often does it respond to its optimal stimulus, the stimulus that evokes the strongest response?
And here, you can see, the median is around 50%. You know, many cells respond to their best stimulus less than 50% of the time. Some, like SST, respond, again, highly reliably. VIP and [INAUDIBLE]-- incredibly variable. So there's a great deal of variability, right? So cells, even to the optimal stimulus, the one that really turns them on, respond on average less than half the time.
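A minimal sketch of this reliability analysis, under assumptions: the real analysis uses a statistical responsiveness criterion per trial, whereas here a simple response threshold stands in for it, and the function name is made up.

```python
import numpy as np

def preferred_stim_reliability(trial_responses, threshold=0.0):
    """Fraction of trials on which a cell responds to its optimal stimulus.

    trial_responses: shape (n_conditions, n_trials), e.g. the mean dF/F on
    each trial of each of eight grating directions. The condition with the
    largest mean response is taken as the optimal stimulus; returns the
    fraction of its trials with a response above `threshold` (a placeholder
    for a proper significance test).
    """
    trial_responses = np.asarray(trial_responses, dtype=float)
    best = np.argmax(trial_responses.mean(axis=1))  # optimal condition
    return float(np.mean(trial_responses[best] > threshold))
```

For the population plots in the talk, one would apply this per cell and look at the distribution of the resulting fractions across Cre lines.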
And then the last thing we wanted to ascertain-- all this hides a lot of work, and Michael is going to unpack some of that, a lot of work by a lot of people. Well, we take the standard canonical model, where we take a receptive field-- either just a linear receptive field, or in- and out-of-phase quadrature receptive fields.
We pass it through some nonlinearity. And then we include running speed as a weighted variable. And then we compare it against the response to see how well the standard model fits.
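The structure of that standard model can be sketched as follows. This is a toy version on synthetic data, assuming the receptive-field filter is already known; in the actual analysis the filter itself is also estimated, and all sizes and weights here are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 500 stimulus frames of 64 pixels, a running-speed
# trace, and a dF/F response generated from the model itself plus noise.
n_frames, n_pix = 500, 64
stim = rng.standard_normal((n_frames, n_pix))
running = rng.random(n_frames)

w = rng.standard_normal(n_pix)  # "true" linear receptive field
resp = np.maximum(stim @ w, 0.0) + 0.5 * running \
       + 0.1 * rng.standard_normal(n_frames)

# Standard model: linear filter -> rectifying nonlinearity, plus a
# weighted running-speed term, fit by least squares.
drive = np.maximum(stim @ w, 0.0)
X = np.column_stack([drive, running, np.ones(n_frames)])
coef, *_ = np.linalg.lstsq(X, resp, rcond=None)
pred = X @ coef

r = np.corrcoef(pred, resp)[0, 1]  # goodness of fit, as in the talk
```

Because the synthetic response was generated from the model, the fit here is near perfect; the point of the analysis in the talk is that for real V1 cells this correlation is usually tiny.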
And yes, we find some cells here-- these cells are all V1-- some cells where it actually does very well. So the red, see, this is over time. This is a 30 Hertz frame rate.
So here, we have a cell whose response to both the natural stimuli, i.e. the natural scenes and the natural movie, as well as to the artificial stimuli, i.e. the stationary grating or the moving grating, is fit quite well. It really responds pretty much as you would expect-- I mean, not perfectly, but really quite well. Now, you also have some cells that respond well to natural stimuli, but they don't respond at all to artificial stimuli, to the gratings.
You have some cells that can be very well fitted by the standard model-- the standard model fits very well-- that respond to gratings, but not to natural scenes or to natural movies.
And then you have cells like this, where you get these effectively zero regression coefficients. Now, it turns out, if we look at all cells-- and I think this is 15,000 cells-- ah, there it is, yeah. So I just showed you those four examples. So here we plot this regression of the model against responses for artificial stimuli. And here, we do it for the natural stimuli, for the natural scenes and movies.
And generally, you can see, yeah, the cells respond better to natural stimuli than to artificial ones. But you know, these are 15,000 cells. So the vast number of cells are down here. And look at these correlation coefficients. They're tiny.
So only 2% of neurons have a correlation above 0.5. So it's difficult to find cells that follow the standard model. And it turns out-- we've done a detailed clustering analysis that Michael is going to talk to you more about.
There's a class of cells we call "none." They don't really respond to any of the stimuli that we tried over 3 hours of recording. And about one third of all of our cells fall into that null category.
So they're there. We can see them. They do respond. But they respond to events that don't correspond to anything obvious on the screen, at least among the stimuli that we've shown.
So maybe it's a very strange or very particular visual stimulus, like in IT. But this is V1, and some of the other areas, so it's rather early on. Or they respond to some non-visual stimulus, or to some combination that we have not latched onto.
OK, so that's the status of the current survey, where the mouse is free to run or not to run. So I should have said this earlier. There's no behavioral control in this case.
The mouse can sit. Many mice like to run. And so we're doing this imaging while the mouse does what it does-- it's head-fixed, of course. But there's no behavior.
So in the meantime, we're setting up more instruments. In particular, we have this beautiful instrument that came to us from Karel Svoboda's lab, the Janelia mesoscope, which we improved upon. So we can image twice as many planes as the original one.
So this allows us now to do high-resolution imaging, with better signal to noise than on the original scientific scope, in eight different planes. So what can we do? For instance, we image two areas, like V1 and AM, simultaneously. Simultaneously means at 30 Hertz, simultaneous. And then we can, again, do simultaneous imaging of four imaging planes in each.
So we can image here, here, here, here, and so on, and repeat that. I think we're doing that at 10 Hertz. You can see that here.
So this is simultaneous in V1 and in higher-order visual areas, where we can see what the cells do in many different areas. So we're making that available to the community soon.
And much more interesting, we have moved, over the last year-- which makes it even more challenging-- to behavior. So in addition to doing all the imaging, we now also have established a pipeline where the animal enters and we check its training. So we have supervised training, computer-supervised training, where the animal in this case has to do a change detection task.
So we show it a series of natural scenes, because we know some cells respond very nicely to natural scenes, albeit very sparsely. So we show image, blank, image, blank. So this is 250 milliseconds of image, 250 milliseconds of blank.
And then the image changes from one natural scene to another. And here, the animal has to respond by licking. And then if it's correct, it gets rewarded. If it's incorrect, there's a timeout.
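The trial logic just described can be sketched as a small classifier. This is a simplification for illustration: the one-flash response window and the function name are assumptions, not the real task's exact timing rules.

```python
def classify_trial(flashes, lick_flash):
    """Classify one change-detection trial.

    flashes: image identity shown on each 250 ms flash (each followed by a
             250 ms blank), e.g. ["A", "A", "B", "B"].
    lick_flash: index of the flash on which the mouse licked, or None.
    """
    # Find the first flash where the image differs from the previous one.
    change = next((i for i in range(1, len(flashes))
                   if flashes[i] != flashes[i - 1]), None)
    if lick_flash is None:
        return "miss" if change is not None else "correct_reject"
    if change is not None and change <= lick_flash <= change + 1:
        return "hit"      # licked in the window after the change -> water reward
    return "false_alarm"  # licked at the wrong time -> timeout
```

Counting hits and false alarms over many such trials is what the computer-supervised training pipeline tracks while the animal learns.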
OK. The animals can do this quite well. It takes them some time to habituate to the entire setup. But once they habituate, it takes them a week or two to learn this task. So here you can see the animal licks, gets water, and continues to run.
All right, so now this is the same imaging, plus behavior. And so you can see what I told you already before. This is from an excitatory line. This shows you the fluorescence signal normalized by its baseline. So it's delta f over f.
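For reference, delta f over f is typically computed against a slowly varying baseline. A minimal sketch, assuming a rolling low-percentile baseline; the actual Allen pipeline differs in detail, and the window and percentile values here are illustrative defaults.

```python
import numpy as np

def delta_f_over_f(f, fs=30.0, window_s=60.0, pct=10):
    """Compute dF/F = (F - F0) / F0 from a raw fluorescence trace.

    F0 is a rolling low-percentile baseline over `window_s` seconds,
    a common convention for two-photon calcium data.
    """
    f = np.asarray(f, dtype=float)
    half = int(window_s * fs) // 2
    f0 = np.array([np.percentile(f[max(0, i - half):i + half + 1], pct)
                   for i in range(f.size)])
    return (f - f0) / f0
```

A flat trace yields a dF/F of zero everywhere; transient calcium events show up as positive deflections relative to the local baseline.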
And you can see-- so these colors are the different images. So there are eight different images. And the animal has to detect a transition, in principle, between any of these images and any other. It just has to detect when the image changes.
And here, you can see this excitatory cell line-- it's a layer two/three line-- responds quite sparsely to individual images, which is an interesting point. It's certainly very different from my idea of V1, that a cell in V1 will only respond to this image, but not to these other seven images. And this happens at the first stage of cortical processing already.
And of course, this is two-photon calcium imaging. So we don't have the temporal resolution to see the exact onset. To what extent, for example, given this is 30 Hertz, is there feedback from higher-order areas that makes these responses so selective?
So you can see it's now a very rich dataset to mine for behavioral modification of cellular responses. So here you can see the excitatory lines responding, as I said, to the eight different images. And you can see-- these are 2,000 cells.
And you can see some cells respond very strongly, but only to particular images. So this is with highly trained images. Here, on one day, we exposed the mouse to a new set of eight images.
It learns very quickly that the task is still the same-- only the image set changed-- but you get less sparse responses. And we find this highly reliable.
So if you train the animal for a week or two, the responses get very sparse. If you use new images, the animal can still do the task, but the responses are less sparse. And then, also, we see really interesting predictive signals when we look at, for example, subtypes of inhibitory cells.
So here, you can see the responses-- so red, this is an excitatory layer two/three line when the animal has seen these eight images many times. And this is when the image set is new. And here, what we did is we left one out on purpose. We omitted it.
So you can see the images are always repeated at a fixed interval. But here, we omitted one, because we wanted to see: the mouse expects an image but doesn't get an image-- what happens?
And you can see the VIP cells are very reliable. You know, they have some expectation about what should happen-- it should happen around zero, and it doesn't happen. And they ramp up until the next actual stimulus comes.
So this is just a teaser. There's lots of interesting data there. Of course, the data is much more interesting when the animal is on a behavioral control and is actively doing a task that must involve V1. We know, because if we take out V1 using optogenetics, by exciting all the interneurons, the animal can't do this task nearly as well.
All right, so then over the last six years, we also worked together with Janelia, with Tim Harris, to develop these Neuropixels probes that are very, very popular now. So it's really the next generation in electrical recording. It's silicon technology.
Actually, it's a 120-nanometer process. It's not very aggressive. But it's a huge jump over conventional microelectrodes. You have a piece of silicon that's 22 micrometers by 70 micrometers thin-- so it's a quarter of the thickness of a human hair. And these ones are 10 millimeters long. There are ones being developed for the monkey that are up to 49 millimeters long, for monkey and possibly human use.
And they have amplification here. And then everything gets read out here at the base of the chip. It gets amplified by a factor of [INAUDIBLE], multiplexed, and digitized.
So all you have at the output of this is a USB connector. So you have 960 electrodes. You can select which ones you want to read out-- you can read out 384 of them.
They're very low impedance, with very stable noise, 6 to 7 microvolts. So they're really the next generation. We insisted on this when we built them:
they are cheaply available. So for $10,000, anyone can set up a new recording lab and, within a few months, record from cells themselves in each experiment. So here, we have six of them.
So yeah, that's all there is. You know, they weigh less than a quarter of a gram. People are now trying to make wireless versions. We are also making versions with two optical channels on them.
So you can not only record, but also send down two different colored beams of light to activate opsins. And we're also working on ones that can not only read, but also write-- in other words, stimulate. It's very exciting technology development.
And in fact, we are just now working together with Nick Steinmetz, who's a young faculty member at the University of Washington, and Tim at Janelia. We just got these Neuropixels Ultra probes, which have even four times higher density, so we get an electrode every 6.25 micrometers in x and in y. So we can get these ultra-dense fields.
So a lot of my early work, and I still love it, was the electrophysiology of excitable neurons, which is why these are called Neuropixels. Because at every pixel, we now get an electrical signal with, again, a very low noise floor. It's a little bit higher here, 8 microvolts.
So we can really pinpoint. You may not know, but there are a number of issues here associated with what are called-- did you talk about the dark neurons? OK.
There's a problem of dark neurons, just like dark matter. There are some neurons that don't show up on some electrodes, or they don't show up at all. They never seem to spike.
And it may be because our electrodes are so coarse and sample only very sparsely. In particular, if the neurons are symmetric and their electric field is very compact, then with a very big electrode, you won't see them. So one way to partially visualize them is to use high-density electrodes.
All right, yeah, so they're now commercially available. They're 1,200 euros apiece. And as I said, all you need is a USB port that plugs into a PXI board.
AUDIENCE: Christof, are those-- just I wonder, in this picture, they look very similar to your simulations [INAUDIBLE].
CHRISTOF KOCH: Yes.
AUDIENCE: [INAUDIBLE]
CHRISTOF KOCH: Yes. Yes. We do understand neurons at the level of biophysics. We have a reasonably good understanding of electrical fields, local electric fields.
So we have the same pipeline now for E-phys that we have for O-phys. So here we have an animal-- it's a mock-up of an animal with six electrodes. And we have to do things slightly differently here, because we want to locate where those probes are. So we use optical tomography imaging afterwards to locate where those electrodes are.
So here, those are our six areas. And just like butterflies on a needle, we can stick electrodes through all of them. And in this case, we are fortunate: if we know what we're doing, we can also get the superior colliculus, which of course gets a massive input from the retina, as well as the pulvinar-- in the mouse called the LP-- and the LGN. So we can get on the order of 10 visual areas.
Now, this compares the responses that you get from a Neuropixels recording. So each bubble is a cell. And the size of the bubble shows you the strength of the response-- this one is in AL.
So with two-photon calcium imaging, we get this plane of cells at 30 Hertz, on the order of 180 cells if it's an excitatory line, many fewer if it's an inhibitory line, because they're sparser. Now, with Neuropixels, we get all of those responses down through layer six, through hippocampus-- dentate gyrus, CA1, CA3-- and down into the LGN. And if you wanted to, you could go all the way to the bottom of the brain.
And we get this-- this is why I love electrophysiology-- at 30 kilohertz, at 30,000 Hertz, rather than getting those cells at 30 Hertz.
So you can see, you know, it's just a mock-up. The fact that we get hippocampus is just due to the geometry of the recording. Yeah, so you know, in typical experiments now you get an overwhelming number of cells.
So this is from a few experiments that we did just to test the basics-- we call this pilot data to test our basic setup-- you know, 14,000 cells. And you can discover things that we couldn't see before, because the noise is very low and the electrodes are spaced closely together. So for example, I don't know, have you talked about back-propagating action potentials?
So there are action potentials that propagate not only from the soma-- the site where the action potential gets initiated at the axon hillock-- down the axon onto the synapse, jumping across the synapse. They can also propagate back from the spike initiation zone up into the apical dendrites. And if you look carefully, we can see that here. In a subset of pyramidal cells in cortical structures, we can see these action potentials that back-propagate very nicely, in particular in hippocampus and in cortex. And here you can see the electrical signatures. And they have the right orientation.
So you can begin to do in vivo recordings, at large scale, not only of classical action potentials, but also of dendritic action potentials. So it's pretty cool. So why would you-- you can tell I'm very enthusiastic about E-phys. Why would you want to do O-phys? It seems like a poor man's version, right?
OK, I get this question very often, which is why I summarized it here. So for O-phys recording, the advantage, the pro, is that you can record from several hundred neurons from a genetically identifiable population. That's a huge advantage, right?
With Neuropixels, you pick up everything that has an electrical action potential. Everything that shouts, you know, in the electrical field, you pick up. And you don't know what its identity is.
You can try to infer whether it's an interneuron or an excitatory cell, but that's indirect. So that's a big advantage if you have some specific hypothesis about a subset of cells being involved in a particular behavior. And of course, if the neurons are genetically identified, you can turn them on or off using opsins.
And you can track them across multiple sessions. So it's easy, for example-- no, it's not easy, but it's doable-- to track the same set of neurons over 10 sessions.
We do this now reliably, where we can assure that this is one particular cell in the upper part of layer two/three that expresses that particular promoter. And we can do that on day one, on day two, and on day 10. You can't do that with E-phys. With E-phys, you're recording from all the neurons, and then you retract the electrode. The next day, you put it in again. You're recording from maybe the same or probably from different neurons. You have no idea.
But the drawback is the signal is slow. It's 30 Hertz. So certain things are completely invisible to you.
You don't know about high-frequency oscillations-- gamma, high gamma, all of that, you're blind to. You don't know anything about synchrony. And then, as I showed, the signal isn't really spikes.
The activity you see, whether it's delta f over f or L0 events, is not really spikes. It's indirectly related, in a non-linear, non-stationary manner, to the underlying action potentials. And it's difficult to reach deep structures, because you're still imaging with photons.
Even with three photons, you're limited to 600 or 700 micrometers. So in order to get to the hippocampus, you have to remove the overlying cortex, which gets bloody. You have to make a lesion.
The advantage of E-phys is that you can now record several hundred neurons per electrode. In a couple of years, that'll probably go to a few thousand neurons. And you can do it in the entire brain.
You can do it at 1,000 times higher temporal frequency. It's minimally invasive. And you really understand the physics.
And it's easy to set up. The main drawbacks are that you can't track neurons over multiple sessions and-- oh, sorry-- you can't ascertain the genetic identity.
All right, the last thing I want to talk about is the observatory model. So this just shows the statistics from one particular observatory, the Keck Observatory in Hawaii, which is run by Caltech and the University of California. We had a speaker from there, Hilton Lewis, who's the director, speak to us. These are the numbers of papers that are published each year based just on that particular observatory.
And we want to do something similar, but with two-photon calcium imaging. So we have this whimsical drawing. So the idea is you perform small-scale experiments.
You form a hypothesis. Or you may not even have a small-scale experiment; you just have an interesting hypothesis about what goes on, let's say, in the areas in the back of the mouse brain, with a particular stimulus, in a particular transgenic animal, for instance.
And then you submit a protocol to the OpenScope request for proposals. We've just had two of them. We're going to release a third one in the next month, right?
The proposals are reviewed by internal referees inside the Allen Institute and by external referees, so we don't bias it. And then we perform the experiments that come out of this refereeing process. What you have to commit to, if you apply, is that you'll do the data analysis. Because it takes us significant effort and time and, ultimately, money, right?
We have to pay the people and pay for all the infrastructure for each experiment. So you have to promise us, and you have to specify ahead of time: these are the types of analysis I'll do with this data. And you know, you have to give a detailed justification for your data plan.
And then you publish the data. And the data, a year later I think, has to be made public. This is run by Jerome Lecoq, who came to us from Stanford.
In the first call, we got 33 proposals. And we have run these three stimulus sets. This one involved people from universities, as well as from DeepMind and Google. This one involved the Tononi lab. And this one is internal, because our own people also want to run their own experiments on OpenScope, because it's much easier to do things on these big pipelines.
Well, last year we opened it up for a second round, got a lot of proposals, selected three of them, and are executing them now. And now, we're going to go to the next request for proposals. So once again, if you're interested in doing this, it doesn't cost you anything.
There will be a request for proposals. And then you can submit something. It's best to contact us ahead of time, because certain things we can easily adjust.
Certain other things are very difficult. You know, we don't have infinite flexibility. We're trying to become more flexible-- to have other surgical windows and other sites where you can observe other brain parts-- but it's a non-trivial process. But I think it's an exciting one, moving towards this open observatory model.
And with that, I'm finishing my talk. This is our large team that supports all of this. And as I said yesterday, that's our visionary founder, who funded all of this and who kept asking us the hard questions.
And his voice is gone now, so we have to internalize his asking of the hard questions. But he enabled all of this. And with that, I thank you.
[APPLAUSE]