lpetrich
Contributor
Where We See Shapes, AI Sees Textures | Quanta Magazine
They noticed the poor generalization of DL image recognizers and considered what kind of image feature is most affected by noise. They suspected textures, and they ran experiments to test that hypothesis, using images whose textures and shapes suggest different identifications.

When you look at a photograph of a cat, chances are that you can recognize the pictured animal whether it’s ginger or striped — or whether the image is black and white, speckled, worn or faded. You can probably also spot the pet when it’s shown curled up behind a pillow or leaping onto a countertop in a blur of motion. You have naturally learned to identify a cat in almost any situation. In contrast, machine vision systems powered by deep neural networks can sometimes outperform humans at recognizing a cat under fixed conditions, but images that are even a little novel, noisy or grainy can throw off those systems completely.
A research team in Germany has now discovered an unexpected reason why: While humans pay attention to the shapes of pictured objects, deep learning computer vision algorithms routinely latch on to the objects’ textures instead.
So a hairless cat like a sphynx might be misidentified as an elephant, and a woolly mammoth as a cat.

Geirhos, Bethge and their colleagues created images that included two conflicting cues, with a shape taken from one object and a texture from another: the silhouette of a cat colored in with the cracked gray texture of elephant skin, for instance, or a bear made up of aluminum cans, or the outline of an airplane filled with overlapping clock faces. Presented with hundreds of these images, humans labeled them based on their shape — cat, bear, airplane — almost every time, as expected. Four different classification algorithms, however, leaned the other way, spitting out labels that reflected the textures of the objects: elephant, can, clock.
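As a toy illustration of the cue-conflict idea: the actual study generated its stimuli with iterative style transfer, but the basic recipe — take the silhouette of one object and fill it with the texture of another — can be sketched much more simply. The function name and mask-and-tile approach below are my own simplification, not the authors' method:

```python
import numpy as np

def cue_conflict(shape_mask, texture_tile):
    """Toy cue-conflict stimulus: fill the silhouette of one object
    (shape_mask, a boolean array) with the texture of another
    (texture_tile, tiled to cover the canvas). Background stays white."""
    h, w = shape_mask.shape
    # Tile the texture patch so it covers the whole canvas, then crop.
    reps = (h // texture_tile.shape[0] + 1, w // texture_tile.shape[1] + 1)
    texture = np.tile(texture_tile, reps)[:h, :w]
    canvas = np.full((h, w), 255.0)      # white background
    canvas[shape_mask] = texture[shape_mask]  # paint texture inside the shape
    return canvas

# A 4x4 "silhouette" (a square) filled with a checkerboard "texture".
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
tile = np.array([[0.0, 100.0],
                 [100.0, 0.0]])
out = cue_conflict(mask, tile)
```

The shape cue (the square outline) and the texture cue (the checkerboard) now point at different "objects", which is exactly the tension the experiment exploits.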
There are ways of getting around this kind of incorrect learning, however, like training on versions of each shape rendered with many different textures. That works, and the resulting networks also handle noisy images well. It may also help to do some preprocessing that brings out edges and other such features, in the fashion of biological early vision.
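One classic way to bring out edges, in the spirit of the early-vision preprocessing mentioned above, is a gradient filter. The Sobel operator below is a standard technique, but it is my own choice of example, not something the article specifies; the implementation is a minimal pure-numpy sketch:

```python
import numpy as np

def sobel_edges(image):
    """Gradient-magnitude map of a 2-D grayscale image using 3x3 Sobel
    kernels — a simple stand-in for edge-enhancing early vision."""
    kx = np.array([[-1.0, 0.0, 1.0],     # horizontal gradient kernel
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])
    ky = kx.T                            # vertical gradient kernel
    padded = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(window * kx)
            gy[i, j] = np.sum(window * ky)
    return np.hypot(gx, gy)              # gradient magnitude

# Synthetic image: dark left half, bright right half.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_edges(img)
# The response is zero in the flat regions and peaks at the vertical
# boundary between the two halves.
```

Feeding edge maps like this (instead of, or alongside, raw pixels) into a classifier is one plausible way to push it toward shape cues and away from texture cues.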