Resolving richer textures with computer vision

Automatic analysis of complex textures within digital images may lead to improved machine learning and video compression applications.

Studies suggest that the human visual system relies on indicators of texture to differentiate between objects. A team from KAUST has developed a method to enhance the way computers decipher texture through an innovative digital image processing program¹.

The software could prove valuable to computer vision recognition systems and, as video traffic takes up a growing share of bandwidth, may yield faster Internet video streaming.

To replicate the way humans resolve texture, scientists are developing ways to analyze textons, groups of pixels that describe the repeating units of a texture. But recognizing random and irregular textons, such as those in the bark of a tree, is challenging. Current techniques aggregate statistics within a neighborhood surrounding each pixel, which helps “firm up” the texton description. However, this approach fails at texture boundaries, where the neighborhood aggregates data from unrelated textures, leading to image analysis errors.
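
The boundary failure can be illustrated with a toy example. The sketch below is not the researchers' code: it assumes a one-dimensional "image" made of two made-up textures and uses a plain window average as a stand-in for a neighborhood statistic. The names `window_mean`, `signal` and `radius` are hypothetical.

```python
def window_mean(signal, center, radius):
    """Average the samples in a fixed window around `center` (clipped at the ends)."""
    lo = max(0, center - radius)
    hi = min(len(signal), center + radius + 1)
    window = signal[lo:hi]
    return sum(window) / len(window)


# Toy 1-D "image": two textures, each a repeating two-sample unit.
left = [0, 2] * 8       # texture A, values average to 1
right = [10, 12] * 8    # texture B, values average to 11
signal = left + right
boundary = len(left)    # index of the first sample of texture B

radius = 4
deep_in_a = window_mean(signal, 4, radius)                 # window lies entirely in A
deep_in_b = window_mean(signal, len(signal) - 5, radius)   # window lies entirely in B
at_edge = window_mean(signal, boundary - 1, radius)        # window straddles the boundary

print(deep_in_a, deep_in_b, at_edge)
```

Well inside either texture the window average stays close to that texture's own level, but the last sample of texture A receives a value that matches neither texture, because its window also collects samples from texture B.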

KAUST’s Ganesh Sundaramoorthi, Assistant Professor of Electrical Engineering, and colleagues in the United States cracked the texture code by overhauling the strategy.

“The problem,” Sundaramoorthi said, “is how to construct an invariant descriptor that depends on texture when the texture itself is unknown. To make a robust system, we had to eliminate the concept of choosing neighborhoods altogether.”

To replace the concept, the researchers formulated a joint estimation problem in which the descriptors and the texture boundaries are solved for together, with each estimate continually checked against the other.
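
The idea of solving for descriptors and boundaries together can be sketched as an alternating loop. This is a deliberately simplified stand-in, not the published algorithm: it assumes a one-dimensional signal with a single boundary, uses each region's mean as its "descriptor", and scores a boundary by squared error. The function name `alternate_boundary_and_descriptors` and the test signal are hypothetical.

```python
def alternate_boundary_and_descriptors(signal, start, max_rounds=20):
    """Alternate between (1) computing a descriptor per region from that
    region's samples only and (2) moving the boundary to best fit them."""
    boundary = start
    for _ in range(max_rounds):
        # Step 1: descriptors are computed strictly inside each region.
        mean_left = sum(signal[:boundary]) / boundary
        mean_right = sum(signal[boundary:]) / (len(signal) - boundary)

        # Step 2: with descriptors fixed, pick the boundary with the lowest error.
        def cost(b):
            left_err = sum((x - mean_left) ** 2 for x in signal[:b])
            right_err = sum((x - mean_right) ** 2 for x in signal[b:])
            return left_err + right_err

        new_boundary = min(range(1, len(signal)), key=cost)
        if new_boundary == boundary:
            break  # descriptors and boundary agree with each other
        boundary = new_boundary
    return boundary


# Texture A occupies the first 20 samples, texture B the remaining 12.
signal = [0, 2] * 10 + [10, 12] * 6
found = alternate_boundary_and_descriptors(signal, start=len(signal) // 2)
print(found)
```

Starting from a wrong guess in the middle of the signal, the right-hand descriptor is at first contaminated by texture A. Once the boundary moves, the descriptors are recomputed from cleaner regions, which is the mutual checking the paragraph above describes.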

They programmed novel “shape-tailored” descriptors that use Poisson-like partial differential equations (PDEs) to analyze data such as light radiance and color channels in an image. An optimization algorithm then refines initially large neighborhoods into ever smaller pixel groups until surface patterns are accurately traced out.
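
A rough sketch of what a region-restricted, Poisson-like descriptor might look like follows. The article does not give the equation, so this assumes a screened-Poisson form, u − αΔu = I, solved only inside a region, with no flux across the region's boundary. The function `shape_tailored_descriptor`, the smoothing weight `alpha`, the Gauss-Seidel solver and the test image are all illustrative assumptions rather than the authors' implementation.

```python
def shape_tailored_descriptor(image, mask, alpha=2.0, sweeps=200):
    """Approximately solve u - alpha * laplacian(u) = image inside `mask`.

    Neighbors outside the mask are skipped, which acts as a discrete
    zero-flux condition on the region boundary. Pixels outside the mask
    are left at their original image values.
    """
    rows, cols = len(image), len(image[0])
    u = [[float(image[r][c]) for c in range(cols)] for r in range(rows)]
    for _ in range(sweeps):  # Gauss-Seidel sweeps over the region
        for r in range(rows):
            for c in range(cols):
                if not mask[r][c]:
                    continue
                total, count = 0.0, 0
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]:
                        total += u[rr][cc]
                        count += 1
                u[r][c] = (image[r][c] + alpha * total) / (1.0 + alpha * count)
    return u


size = 8
half = size // 2
# Left half: a 0/2 checkerboard texture. Right half: a flat bright area.
image = [[((r + c) % 2) * 2 if c < half else 100 for c in range(size)]
         for r in range(size)]

left_region = [[c < half for c in range(size)] for _ in range(size)]
whole_image = [[True] * size for _ in range(size)]

inside = shape_tailored_descriptor(image, left_region)
mixed = shape_tailored_descriptor(image, whole_image)

# Compare the descriptor at a pixel that touches the texture boundary.
edge_inside = inside[3][half - 1]
edge_mixed = mixed[3][half - 1]
print(edge_inside, edge_mixed)
```

When the equation is solved only within the left region, the value at a boundary pixel stays within the range of that texture. Solving over the whole image lets the bright right half leak in, which is the contamination a region-shaped descriptor is meant to avoid.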

The team found their method automatically detected textures in a wide range of examples under challenging video and image conditions.

“The PDEs provide some invariance like traditional descriptors, but their key difference is that they are naturally defined within regions of arbitrary shape,” Sundaramoorthi explained.

One application that may be improved by the research is video compression. Some estimates predict that 80-90 percent of all Internet activity in the year 2019 will be video traffic, a situation that calls for more efficient bandwidth use.

Sundaramoorthi notes that textures are redundant, and that encoding them demands far more bits than the information they convey.

“None of the best video compression schemes take this into account, but our texture segmentation scheme may bring us closer to this,” he said.

References

1. Khan, N., Algarni, M., Yezzi, A. & Sundaramoorthi, G. Shape-tailored local descriptors and their application to segmentation and tracking. IEEE Conference on Computer Vision and Pattern Recognition (2015).