Do you see what I see? (12/26/2011)
An essential question confronting neuroscientists and computer vision researchers alike is how objects can be identified by simply "looking" at an image. Introspectively, we know that the human brain solves this problem very well. We only have to look at something to know what it is.
But teaching a computer to "know" what it's looking at is far harder. In research published this fall in the journal Public Library of Science (PLoS) Computational Biology, a team from Los Alamos National Laboratory, Chatham University, and Emory University first measured human performance on a visual task: identifying a certain kind of shape when an image is flashed in front of a viewer for a very short time (20 to 200 milliseconds). As expected, human performance gets worse when the image is shown for shorter periods, and also when the shapes are more complicated.
But could a computer be taught to recognize shapes as well, and then do it faster than humans? The team tried developing a computer model based on human neural structure and function, to do what we do, and possibly do it better.
Their paper, "Model Cortical Association Fields Account for the Time Course and Dependence on Target Complexity of Human Contour Perception," describes how, after measuring human performance, they created a computer model to also attempt to pick out the shapes.
"This model is biologically inspired and relies on leveraging lateral connections between neurons in the same layer of a model of the human visual system," said Vadas Gintautas of Chatham University in Pittsburgh and formerly a researcher at Los Alamos.
Neuroscientists have characterized neurons in the primate visual cortex that appear to underlie object recognition, noted senior author Garrett Kenyon of Los Alamos. "These neurons, located in the inferotemporal cortex, can be strongly activated when particular objects are visible, regardless of how far away the objects are or how the objects are posed, a phenomenon referred to as viewpoint invariance."
The brain has an uncanny ability to detect and identify certain things, even if they're barely visible. Now the challenge is to get computers to do the same thing. And programming the computer to process the information laterally, like the brain does, might be a step in the right direction.
How inferotemporal neurons acquire their viewpoint invariant properties is unknown, but many neuroscientists point to the hierarchical organization of the human visual cortex as likely being an essential aspect.
"Lateral connections have been generally overlooked in similar models designed to solve similar tasks. We demonstrated that our model qualitatively reproduces human performance on the same task, both in terms of time and difficulty. Although this is certainly no guarantee that the human visual system is using lateral interactions in the same way to solve this task, it does open up a new way to approach object detection problems," Gintautas said.
Simple features, such as edges in a specific orientation, are extracted at the first cortical processing stage, called the primary visual cortex, or V1. Subsequent cortical processing stages (V2, V4, and so on) extract progressively more complex features, culminating in the inferotemporal cortex, where that essential "viewpoint invariant object identification" is thought to occur. However, most of the connections in the human brain do not project up the cortical hierarchy, as might be expected from gross neuroanatomy. Instead, they either connect neurons located at the same hierarchical level (these are called lateral connections) or project down the cortical hierarchy to lower processing levels.
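To make the idea of a first-stage edge detector concrete, here is a minimal sketch of oriented filtering on a tiny image. The 3x3 kernels, the toy image, and the function names are illustrative assumptions; they are not taken from the team's model or from PetaVision.

```python
# Toy V1-like stage: two oriented edge detectors applied to a small image.
# Kernels and image are illustrative assumptions, not the published model.

KERNELS = {
    # Responds to a left-to-right change in brightness (a vertical edge).
    "vertical": [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]],
    # Responds to a top-to-bottom change in brightness (a horizontal edge).
    "horizontal": [[-1, -1, -1],
                   [0, 0, 0],
                   [1, 1, 1]],
}


def filter_response(image, kernel, row, col):
    """Absolute response of a 3x3 kernel centered on pixel (row, col)."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += kernel[dr + 1][dc + 1] * image[row + dr][col + dc]
    return abs(total)


def preferred_orientation(image, row, col):
    """Return the best-responding orientation and all responses at a pixel."""
    responses = {
        name: filter_response(image, kernel, row, col)
        for name, kernel in KERNELS.items()
    }
    best = max(responses, key=responses.get)
    return best, responses


# Dark left half, bright right half: the boundary is a vertical edge.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

if __name__ == "__main__":
    print(preferred_orientation(IMAGE, 1, 1))
```

At a pixel next to the boundary, the vertical detector responds and the horizontal one does not, which is the kind of simple, orientation-specific feature attributed to V1 above.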
In the recently published work, the team modeled lateral interactions between cortical edge detectors to determine whether such connections could explain the difficulty and time course of human contour perception. The research combined high-performance computer simulations of cortical circuits with "speed-of-sight" psychophysical measurements of human contour perception. The simulations used PetaVision, a National Science Foundation-funded neural simulation toolbox developed by LANL researchers. The psychophysical measurements, an experimental technique that neuroscientists use to study mechanisms of cortical processing, used the open-source Psychtoolbox software as a starting point.
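As a rough illustration of how lateral interactions between edge detectors could favor a contour over clutter as processing time increases, consider the sketch below. It is an assumption-laden toy, not the published model: the weight formula (nearby, collinear elements support each other), the update rule, the parameter values, and the element positions are all invented for this example.

```python
import math

# Each edge element is (x, y, orientation in radians). The first five sit on a
# straight horizontal contour; the rest are clutter. All values are made up.
CONTOUR = [(float(x), 0.0, 0.0) for x in range(5)]
CLUTTER = [(0.5, 1.5, 1.0), (2.0, 2.0, 2.2), (3.5, 1.5, 0.7),
           (1.5, -1.5, 1.9), (3.0, -2.0, 2.6)]
ELEMENTS = CONTOUR + CLUTTER


def lateral_weight(a, b, sigma=1.5):
    """Hypothetical lateral weight: large when a and b are close together and
    both are aligned with the line that joins them."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    phi = math.atan2(dy, dx)  # direction of the line joining the two elements
    proximity = math.exp(-(dx * dx + dy * dy) / (2.0 * sigma ** 2))
    alignment = math.cos(a[2] - phi) ** 2 * math.cos(b[2] - phi) ** 2
    return proximity * alignment


def run_lateral_dynamics(elements, steps, gain=0.2):
    """Start every element at the same feedforward activity (1.0), then let
    each element repeatedly add weighted support from all the others."""
    n = len(elements)
    weights = [
        [0.0 if i == j else lateral_weight(elements[i], elements[j])
         for j in range(n)]
        for i in range(n)
    ]
    activity = [1.0] * n
    for _ in range(steps):
        activity = [
            1.0 + gain * sum(weights[i][j] * activity[j] for j in range(n))
            for i in range(n)
        ]
    return activity


def contour_margin(activity, n_contour):
    """Weakest contour element minus strongest clutter element."""
    return min(activity[:n_contour]) - max(activity[n_contour:])


if __name__ == "__main__":
    for steps in (1, 5):
        act = run_lateral_dynamics(ELEMENTS, steps)
        print(steps, round(contour_margin(act, len(CONTOUR)), 3))
```

In this toy setting, the gap between contour and clutter activity widens with more update steps, loosely mirroring the idea that longer viewing times give lateral interactions more opportunity to pull a contour out of the background.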
"Our research represented the first example of a large-scale cortical model being used to account for both the overall accuracy, as well as the processing time, of human subjects performing a challenging visual-perception task," said Kenyon.
Note: This story has been adapted from a news release issued by the DOE/Los Alamos National Laboratory.