lpetrich
Contributor
Where We See Shapes, AI Sees Textures | Quanta Magazine
They noticed the poor generalization of DL image recognizers and considered what kind of image feature is most affected by noise. They suspected textures, and they ran experiments to test that hypothesis, using images whose textures and shapes suggest different identifications.

When you look at a photograph of a cat, chances are that you can recognize the pictured animal whether it’s ginger or striped — or whether the image is black and white, speckled, worn or faded. You can probably also spot the pet when it’s shown curled up behind a pillow or leaping onto a countertop in a blur of motion. You have naturally learned to identify a cat in almost any situation. In contrast, machine vision systems powered by deep neural networks can sometimes outperform humans at recognizing a cat under fixed conditions, but images that are even a little novel, noisy or grainy can throw off those systems completely.
A research team in Germany has now discovered an unexpected reason why: While humans pay attention to the shapes of pictured objects, deep learning computer vision algorithms routinely latch on to the objects’ textures instead.
So a hairless cat like a sphynx might be misidentified as an elephant, and a woolly mammoth as a cat.

Geirhos, Bethge and their colleagues created images that included two conflicting cues, with a shape taken from one object and a texture from another: the silhouette of a cat colored in with the cracked gray texture of elephant skin, for instance, or a bear made up of aluminum cans, or the outline of an airplane filled with overlapping clock faces. Presented with hundreds of these images, humans labeled them based on their shape — cat, bear, airplane — almost every time, as expected. Four different classification algorithms, however, leaned the other way, spitting out labels that reflected the textures of the objects: elephant, can, clock.
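As a toy illustration of the cue-conflict idea: the actual study generated its stimuli with iterative style transfer, but the basic recipe — take the silhouette of one object and fill it with the texture of another — can be sketched much more simply. The function name and mask-and-tile approach below are my own simplification, not the authors' method:

```python
import numpy as np

def cue_conflict(shape_mask, texture_tile):
    """Toy cue-conflict stimulus: fill the silhouette of one object
    (shape_mask, a boolean array) with the texture of another
    (texture_tile, tiled to cover the canvas). Background stays white."""
    h, w = shape_mask.shape
    # Tile the texture patch so it covers the whole canvas, then crop.
    reps = (h // texture_tile.shape[0] + 1, w // texture_tile.shape[1] + 1)
    texture = np.tile(texture_tile, reps)[:h, :w]
    canvas = np.full((h, w), 255.0)      # white background
    canvas[shape_mask] = texture[shape_mask]  # paint texture inside the shape
    return canvas

# A 4x4 "silhouette" (a square) filled with a checkerboard "texture".
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
tile = np.array([[0.0, 100.0],
                 [100.0, 0.0]])
out = cue_conflict(mask, tile)
```

The shape cue (the square outline) and the texture cue (the checkerboard) now point at different "objects", which is exactly the tension the experiment exploits.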
There are ways of getting around this kind of incorrect learning, however, like training on versions of each shape rendered with many different textures. That works, and the resulting networks also handle noisy images well. It may also help to do some preprocessing that brings out edges and other such features, in the fashion of biological early vision.
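One classic way to bring out edges, in the spirit of the early-vision preprocessing mentioned above, is a gradient filter. The Sobel operator below is a standard technique, but it is my own choice of example, not something the article specifies; the implementation is a minimal pure-numpy sketch:

```python
import numpy as np

def sobel_edges(image):
    """Gradient-magnitude map of a 2-D grayscale image using 3x3 Sobel
    kernels — a simple stand-in for edge-enhancing early vision."""
    kx = np.array([[-1.0, 0.0, 1.0],     # horizontal gradient kernel
                   [-2.0, 0.0, 2.0],
                   [-1.0, 0.0, 1.0]])
    ky = kx.T                            # vertical gradient kernel
    padded = np.pad(image.astype(float), 1, mode="edge")
    h, w = image.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            window = padded[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(window * kx)
            gy[i, j] = np.sum(window * ky)
    return np.hypot(gx, gy)              # gradient magnitude

# Synthetic image: dark left half, bright right half.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_edges(img)
# The response is zero in the flat regions and peaks at the vertical
# boundary between the two halves.
```

Feeding edge maps like this (instead of, or alongside, raw pixels) into a classifier is one plausible way to push it toward shape cues and away from texture cues.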