by Ben Dickson
Welcome to AI book reviews, a series of posts that explore the latest literature on artificial intelligence.
Since the early years of artificial intelligence, scientists have dreamed of creating computers that can “see” the world. As vision plays a key role in many things we do every day, cracking the code of computer vision seemed to be one of the major steps toward developing artificial general intelligence.
But like many other goals in AI, computer vision has proven to be easier said than done. In 1966, scientists at MIT launched “The Summer Vision Project,” a two-month effort to create a computer system that could identify objects and background areas in images. But it took much more than a summer break to achieve those goals. In fact, it wasn’t until the early 2010s that image classifiers and object detectors were flexible and reliable enough to be used in mainstream applications.
Over the past few decades, advances in machine learning and neuroscience have enabled great strides in computer vision. But we still have a long way to go before we can build AI systems that see the world as we do.
Biological and Computer Vision, a book by Harvard Medical School Professor Gabriel Kreiman, provides an accessible account of how humans and animals process visual data and how far we’ve come toward replicating these functions in computers.
Kreiman’s book helps readers understand the differences between biological and computer vision. It details how billions of years of evolution have equipped us with a complicated visual processing system, and how studying that system has helped inspire better computer vision algorithms. Kreiman also discusses what separates contemporary computer vision systems from their biological counterparts.
While I would recommend a full read of Biological and Computer Vision to anyone who is interested in the field, I’ve tried here (with some help from Gabriel himself) to lay out some of my key takeaways from the book.
Hardware differences
Biological vision runs on organic matter and cortical cells. Computer vision runs on transistors and electronic circuits.
In the introduction to Biological and Computer Vision, Kreiman writes, “I am particularly excited about connecting biological and computational circuits. Biological vision is the product of millions of years of evolution. There is no reason to reinvent the wheel when developing computational models. We can learn from how biology solves vision problems and use the solutions as inspiration to build better algorithms.”
And indeed, the study of the visual cortex has been a great source of inspiration for computer vision and AI. But before they could digitize vision, scientists had to reckon with the huge hardware gap between biological and computer vision. Biological vision runs on an interconnected network of cortical cells and organic neurons. Computer vision, on the other hand, runs on electronic chips composed of transistors.
A theory of vision, therefore, must be defined at a level of abstraction that can be implemented in computers while remaining comparable to what living beings do. Kreiman calls this the “Goldilocks resolution,” a level of abstraction that is neither too detailed nor too simplified.
For instance, early efforts in computer vision tried to tackle the problem at a very abstract level, in a way that ignored how human and animal brains recognize visual patterns. Those approaches proved to be brittle and inefficient. At the other extreme, studying and simulating brains at the molecular level would be computationally intractable.
“I am not a big fan of what I call ‘copying biology,’” Kreiman told TechTalks. “There are many aspects of biology that can and should be abstracted away. We probably do not need units with 20,000 proteins and a cytoplasm and complex dendritic geometries. That would be too much biological detail. On the other hand, we cannot merely study behavior—that is not enough detail.”
In Biological and Computer Vision, Kreiman places the Goldilocks resolution for neocortical circuits at the level of individual neurons and their activity, measured millisecond by millisecond. Advances in neuroscience and medical technology have made it possible to record the activities of individual neurons at this millisecond granularity.
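To make that level of abstraction concrete, here is a minimal sketch (my own illustration, not code from the book) of a leaky integrate-and-fire neuron stepped forward one millisecond at a time. All the constants are arbitrary textbook-style values; the point is that the unit is described by a single voltage and its spike times, with proteins, cytoplasm, and dendritic geometry abstracted away:

```python
import numpy as np

# Minimal leaky integrate-and-fire neuron, simulated at 1 ms resolution.
# All parameter values are illustrative, not taken from the book.
dt = 1.0            # timestep in milliseconds
tau = 20.0          # membrane time constant (ms)
v_rest = -70.0      # resting potential (mV)
v_thresh = -54.0    # spike threshold (mV)
v_reset = -80.0     # post-spike reset potential (mV)

v = v_rest
spikes = []
input_current = 2.0  # constant drive (arbitrary units)

for t in range(1000):  # simulate one second, millisecond by millisecond
    # The membrane potential decays toward rest while integrating its input.
    dv = (-(v - v_rest) + input_current * 10.0) * (dt / tau)
    v += dv
    if v >= v_thresh:   # threshold crossing: emit a spike, then reset
        spikes.append(t)
        v = v_reset

print(f"{len(spikes)} spikes in 1 second; first spikes at t = {spikes[:5]} ms")
```

Running this for one simulated second produces a regular spike train, the kind of millisecond-resolution signal that single-neuron recordings capture.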
The results of those studies have, in turn, helped shape different types of artificial neural networks, AI algorithms that loosely simulate the workings of the cortical areas of the mammalian brain. In recent years, neural networks have proven to be the most effective algorithms for pattern recognition in visual data and have become a key component of many computer vision applications.
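As a rough illustration of what such a network looks like in practice, here is a minimal convolutional classifier written in PyTorch (my own sketch; the layer sizes, input resolution, and ten output classes are arbitrary assumptions, not an architecture from the book). Stacked convolutions with small local receptive fields, followed by pooling, loosely echo the hierarchical organization of cortical visual areas:

```python
import torch
from torch import nn

# A toy convolutional network, loosely inspired by the layered, hierarchical
# organization of the visual cortex. Layer sizes are arbitrary illustrative
# choices, not a reference architecture.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # local receptive fields
    nn.ReLU(),
    nn.MaxPool2d(2),                              # spatial downsampling
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # higher-level features
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),                    # 10 hypothetical classes
)

x = torch.randn(1, 3, 32, 32)   # one fake 32x32 RGB image
logits = model(x)
print(logits.shape)             # torch.Size([1, 10])
```

Production vision systems are far deeper and are trained on millions of labeled images, but they are built from the same basic ingredients.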