09 June 2022
Il Memming Park
Deciphering Neuro Chatter
Thanks to technological advances, neuroscientists can now collect mouth-watering amounts of data, simultaneously recording the activity of hundreds of neurons in real time. But what's the use of having this "big neuro data" if we can't make sense of it?
Il Memming Park, a young professor at Stony Brook University who will soon join the Champalimaud Research faculty, is up for the challenge. Unlike most neuroscientists, he is not dedicated to a particular brain region or behaviour. Instead, he moves from one scientific question to another, developing new mathematical tools along the way.
In this interview, Park lightheartedly talks about his path and even shares some of his secret weapons.
Basically, the biggest influence that made me a scientist was my father. He's a solid-state physicist, always asking one "why question" after another. I guess all kids ask why questions, but my father actually answered all of them scientifically. Until I was about 25, our conversations were always about science.
I moved around a lot when I was a kid, but I ended up getting all my public education in Korea. When I was a really small child, around 7 years old, I started programming. At the time, the typical way of learning how to programme was by reading books. The books would have the source code printed. You didn't know what it meant; you would just type it in and run it. And then you would think… "What would happen if I changed things here and there?" As you can imagine, learning to programme like this is a very long process, but I enjoyed it.
Maybe my more creative programming started when I was in middle school. At that time, I made my first AI-type programme, a version of ELIZA, the classic computer-psychotherapist software. It's a keyword-matching response system that reacts to particular words with psychology-type questions, like "you said you feel sad; why is that?" So it's just clever enough to pretend to be a human therapist.
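For illustration, here is a minimal Python sketch in the spirit of such a keyword-matched system; the rules and wording are invented for the example, not taken from ELIZA or from Park's programme.

```python
import re

# Toy ELIZA-style responder: scan the input for keywords and reply with a
# canned, therapist-like question. (Illustrative rules only.)
RULES = [
    (r"\bsad\b",    "You said you feel sad; why is that?"),
    (r"\bmother\b", "Tell me more about your mother."),
    (r"\balways\b", "Can you think of a specific example?"),
]

def respond(text: str) -> str:
    for pattern, reply in RULES:
        if re.search(pattern, text.lower()):
            return reply
    return "Please, go on."

print(respond("I feel sad today."))  # -> "You said you feel sad; why is that?"
```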
It was actually a couple of years later when I came across a book about artificial neural networks. That's when I understood that machines can recognise different digits using neural network structures. It was fascinating. Half of the book was just source code; I typed it in and built a multilayer perceptron.
The perceptron is the first learning algorithm ever invented using neural-like structures. Actually, the whole field of machine learning emerged from neuroscience. Scientists at the time thought, "Oh, look! Neurons behave like binary switches that turn on and off depending on the input, just like a logic gate. We can use this structure for programming, for creating systems that learn."
When they made the first perceptron in the '60s, they believed, "we found how the brain works!". But then they entered the "winter of AI", where there was no progress. Later, they understood that they could create multilayer perceptrons by stacking individual perceptrons, which eventually helped them move forward.
This is just a bit of history. But multilayer perceptrons are, still now, the fundamental building blocks of deep learning, where you basically have interconnected layers and layers of neuron-like artificial units.
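As a rough sketch of what such a network looks like in code, here is a tiny two-layer perceptron trained on the XOR problem with plain NumPy; the architecture, learning rate, and data are illustrative choices, not the code from the book.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)              # XOR targets

W1, b1 = rng.standard_normal((2, 8)), np.zeros(8)  # hidden layer: 8 units
W2, b2 = rng.standard_normal((8, 1)), np.zeros(1)  # output layer: 1 unit

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(10000):
    h = sigmoid(X @ W1 + b1)        # forward pass: hidden activations
    out = sigmoid(h @ W2 + b2)      # forward pass: network output

    d_out = (out - y) * out * (1 - out)   # error signal at the output
    d_h = (d_out @ W2.T) * h * (1 - h)    # error back-propagated to hidden units

    W2 -= h.T @ d_out                     # gradient-descent updates
    b2 -= d_out.sum(axis=0)
    W1 -= X.T @ d_h
    b1 -= d_h.sum(axis=0)

print(out.round(2).ravel())  # typically converges toward [0, 1, 1, 0]
```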
Then, I read another book that was very influential to me, called The Emperor's New Mind. It’s a popular science book by Roger Penrose, the winner of the Nobel Prize in Physics in 2020. In this book, he laid out a logical boundary, saying there's a definite limit to what computers can do. And I thought, "this can't be true. We have to break through this mathematical limit. How can it be that computers cannot do more than this?" I was very much into understanding this particular question, so I spent the next few years studying Logic and Theory of Computation.
Turns out this was a useless endeavour in terms of understanding the brain. All I have told you so far is just the motivation for the real work that I started later on. Fortunately, I was still pretty young and could change fields. I left computer science and went deeper into neuroscience.
My PhD advisor was Dr. José Príncipe, head of the Computational Neuroengineering lab at the University of Florida. He's a Portuguese scientist who is deeply invested in understanding information processing and machine learning systems. As it happens, he told me about the Champalimaud Foundation back when I was a student in his lab.
Computer science is based on binary things - 0’s and 1’s. And you don't get to deal with continuous streams of information. However, electrical engineers have this powerful tool called signal processing, which they use to study continuous signals, like sound, radio waves, video… really anything that changes over time.
It's a great tool, but it was difficult to apply it to neuroscience because spiking activity is essentially binary: a neuron either fires a spike or it doesn't, so a spike train is a sequence of discrete events rather than a continuous signal. So part of my PhD was dedicated to building a set of mathematical tools to allow signal processing of neural spike trains.
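One standard textbook way to bridge that gap, shown below purely as an illustration (it is not the specific framework developed in Park's PhD), is to convolve the discrete spike train with a smoothing kernel, turning it into a continuous rate signal that ordinary signal-processing tools can handle.

```python
import numpy as np

# Spike times for one neuron over one second (made-up example data).
spike_times = np.array([0.12, 0.15, 0.31, 0.48, 0.52, 0.55, 0.81])

dt = 0.001                                   # 1 ms resolution
t = np.arange(0.0, 1.0, dt)
binary_train = np.zeros_like(t)
binary_train[np.round(spike_times / dt).astype(int)] = 1.0

# Gaussian kernel with a 20 ms width, normalised so the result is in spikes/s.
sigma = 0.02
kernel_t = np.arange(-4 * sigma, 4 * sigma, dt)
kernel = np.exp(-kernel_t**2 / (2 * sigma**2))
kernel /= kernel.sum() * dt

rate = np.convolve(binary_train, kernel, mode="same")  # continuous rate estimate
print(f"Peak estimated firing rate: {rate.max():.1f} Hz")
```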
I also started working with real neural systems then. Before this, I just imagined how the brain works purely based on theory, and this… it's not the right way to go. Real neural signals are nothing like the electrical signals of modern computers. They are super variable, highly nonlinear, and very hard to understand. It took me a while to get the hang of what was going on, what the signals were, and why there was so much “noise”.
From then on, everything I do starts from neural data. I regularly collaborate with experimentalists and have worked on many different kinds of data and, in the process, developed many different mathematical and statistical methods.
Not really. I did my postdoc with Jonathan Pillow, who was at the University of Texas in Austin at the time (now he's at Princeton). He had just started his lab with a couple of students, and I joined as his first postdoc. I was drawn to his work because my PhD advisor gave me some of his papers early on, and these papers were impressive; I really liked them.
There, I was introduced to the new world of Bayesian Machine Learning. It sounds complicated, but the principle is simple: you explicitly state your assumptions before beginning the analysis. When you draw conclusions from data, you always rely on certain assumptions, but they can be hidden, or you may never write them down, which is not ideal. In Bayesian form, the assumptions are always written out explicitly as a probability distribution. Then, you combine your assumptions with the data to reach a conclusion. This allows you to extract the biological structure that might have generated the data. We developed a lot of Bayesian machine learning techniques for analysing neural data.
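As a deliberately simple illustration of that principle, here is a sketch of a Bayesian estimate of a single neuron's firing rate: the prior assumption is written down explicitly as a Gamma distribution and then combined with observed spike counts. The numbers and the conjugate Gamma-Poisson model are assumptions made for this example, not a method from the interview.

```python
import numpy as np

# Prior assumption, stated explicitly: firing rate ~ Gamma(alpha0, beta0),
# i.e. we believe the rate is around alpha0 / beta0 = 5 spikes/s before seeing data.
alpha0, beta0 = 5.0, 1.0

# Observed data: spike counts in ten 1-second bins (simulated here).
rng = np.random.default_rng(0)
spike_counts = rng.poisson(lam=8.0, size=10)

# With a Poisson likelihood, the Gamma prior is conjugate, so the posterior
# is again a Gamma distribution with simple updated parameters.
alpha_post = alpha0 + spike_counts.sum()
beta_post = beta0 + len(spike_counts)

posterior_mean = alpha_post / beta_post      # combines prior belief and data
posterior_var = alpha_post / beta_post**2    # remaining uncertainty about the rate

print(f"Posterior mean rate: {posterior_mean:.2f} spikes/s "
      f"(+/- {np.sqrt(posterior_var):.2f})")
```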
Of course! When I landed at Stony Brook in 2015 as a junior faculty member, I was very confused. Nobody teaches you how to become a professor… I am interested in so many things, but I had to focus on a couple of important questions that I wanted to answer. During my postdoctoral training, I mainly worked with probabilistic models. Now I wanted to expand to questions about computation and dynamics, that is, describing the process of neural coding and how it changes over time. The idea is that this temporal transformation is the basis of all computation and behaviour, and also of neurological disease when it goes wrong. So I started building up a set of machine learning tools that could extract these temporal structures.
There are two different ways in which it goes. Sometimes you have a cool idea, and then you look for the data. The other way is when we work closely with a collaborator, and specific questions may need new methods. In the end, we get innovation in terms of algorithms, computation, and statistics, which we then put out to the world so that everyone can benefit from it.
Statistically speaking, we need models that can scale with the dimensionality of the data. That by itself is a huge problem. Some datasets have hundreds of neurons, so you are dealing with a high-dimensional space, at least as high as the number of neurons.
Fortunately, there are regularities in the data that can help us reduce the number of dimensions. For example, even if you observe 1000 neurons simultaneously, not all neurons function independently. They are actually in a network and work together to perform a computation. If we can identify close relationships in the activity of the neurons, then we can exploit them to study the data in a lower-dimensional space.
Many of the methods that I've developed are in this direction. We extract structure out of neural data by making smart assumptions consistent with biology and our theory about how the brain works.
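The sketch below illustrates the general idea with plain principal component analysis, a generic dimensionality-reduction tool rather than one of the methods described here: one hundred simulated neurons driven by only three shared signals collapse onto a handful of components.

```python
import numpy as np

rng = np.random.default_rng(1)
n_neurons, n_timebins, n_latents = 100, 500, 3

# Simulate a population in which 100 neurons are driven by 3 shared latent signals.
latents = rng.standard_normal((n_timebins, n_latents))
mixing = rng.standard_normal((n_latents, n_neurons))
activity = latents @ mixing + 0.5 * rng.standard_normal((n_timebins, n_neurons))

# PCA via the singular value decomposition of the mean-centred data.
centred = activity - activity.mean(axis=0)
_, singular_values, _ = np.linalg.svd(centred, full_matrices=False)
variance_explained = singular_values**2 / np.sum(singular_values**2)

# Most of the variance sits in the first few components, so the 100-dimensional
# recording can be studied in a much smaller space.
print("Variance explained by the first 5 components:",
      variance_explained[:5].round(3))
```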
Yes, I am very excited about forming new collaborations with experimentalists there. I also intend to continue external collaborations, for example with Alex Huk at the University of Texas at Austin. We have ideas for applying different types of analysis to neural data from working-memory tasks. I am also interested in decision-making, so I will spend a good amount of time trying to solve these problems. But these are only some of the many things that I want to do.
I don't want to reveal too many of my secret weapons! Still, a particular method that I'm excited about is creating a real-time neuroscience system.
Now, during an experiment, neuroscientists record from hundreds, even thousands of neurons simultaneously. Then, they go home and analyse the data. After drawing conclusions, the experiment is repeated with adjustments, and the entire process begins again. It's a very long and slow loop.
The tools I've been developing can perform this analysis on the fly. So you're recording from hundreds of neurons, and, within milliseconds, I would like to provide you with a visual saying, "oh, look at this! There’s a hidden 3D structure in the neural activity!" With this online analysis, the experimentalist can immediately see what is happening and even tweak the experiment in real time to test how different manipulations influence the relationship between the neurons.
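A toy sketch of what such an on-the-fly loop could look like is given below: each incoming population vector updates a running mean and covariance, and the activity is re-projected onto the current top three components so it could be visualised immediately. This is a stand-in written for illustration, not the lab's actual real-time algorithms.

```python
import numpy as np

class OnlineProjector:
    """Toy streaming analysis: running mean/covariance plus a top-3 projection."""

    def __init__(self, n_neurons: int):
        self.mean = np.zeros(n_neurons)
        self.cov = np.zeros((n_neurons, n_neurons))
        self.n_seen = 0

    def update(self, sample: np.ndarray) -> np.ndarray:
        """Incorporate one new population vector; return its 3D projection."""
        self.n_seen += 1
        delta = sample - self.mean
        self.mean += delta / self.n_seen                  # running mean
        self.cov += np.outer(delta, sample - self.mean)   # running (unnormalised) covariance
        _, eigvecs = np.linalg.eigh(self.cov / max(self.n_seen - 1, 1))
        return sample @ eigvecs[:, -3:]                   # coordinates in the top-3 subspace

rng = np.random.default_rng(2)
projector = OnlineProjector(n_neurons=200)
for t in range(1000):                                     # stand-in for the acquisition stream
    point3d = projector.update(rng.poisson(5.0, size=200).astype(float))
```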
I believe that real-time neuroscience systems are critical for doing new types of experiments and for making scientific progress faster. They also have clinical applications. Online feedback is already available for brain stimulation, for example, to stop epileptic seizures. However, current devices are not very smart. We are building next-generation analysis algorithms that will be able to do this kind of analysis quickly and efficiently.