Robot AI vision

Computers trained in virtual environments to recognise objects and locations


Computers are being taught to ‘see’ and ‘understand’ objects in the real world by training their vision systems in a virtual environment.

For computers to learn to recognise objects accurately, such as buildings, streets or people, the machines must process a huge amount of labelled data - in this case, images of objects with accurate annotations.

A self-driving car, for instance, needs thousands of images of roads and cars to learn from. Datasets therefore play a crucial role in the training and testing of the computer-vision systems.

Using manually labelled training datasets, a computer-vision system compares its current situation to known situations and takes the best action it can ‘think’ of - whatever that happens to be.
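The "compare the current situation to known situations" idea can be illustrated loosely (this is not the paper's method, and the feature values and action labels below are invented for illustration) as a nearest-neighbour lookup over hand-labelled examples:

```python
# Loose illustration: pick the action attached to the closest
# known, manually labelled situation. Feature vectors and action
# strings here are hypothetical placeholders.

LABELLED = [
    ((0.9, 0.1), "clear road: continue"),
    ((0.2, 0.8), "pedestrian ahead: brake"),
    ((0.5, 0.5), "vehicle merging: slow down"),
]

def best_action(features):
    """Return the action of the nearest labelled situation."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, action = min(LABELLED, key=lambda item: dist(item[0], features))
    return action

print(best_action((0.85, 0.15)))  # closest to the "clear road" example
```

Real systems learn a far richer mapping from pixels to decisions, but the dependence on labelled examples is the same - which is why the size and quality of the dataset matter so much.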

“However, collecting and annotating images from the real world is too demanding in terms of labour and money investments,” wrote Kunfeng Wang, an associate professor at China’s State Key Laboratory for Management and Control for Complex Systems and lead author of the paper.

He said the goal of their research is to tackle a specific problem: real-world image datasets are insufficient for training and testing computer-vision systems.

To solve this issue, the researchers created a dataset called ParallelEye, generated virtually using commercially available software - primarily the Unity3D video game engine.

Using a map of Zhongguancun, one of the busiest urban areas in Beijing, China, as their reference, they recreated the urban setting virtually by adding various buildings, cars and even different weather conditions. Then they placed a virtual camera on a virtual car. The car drove around the virtual Zhongguancun and created datasets that are representative of the real world.
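The appeal of this approach is that in a simulated scene the identity and position of every object are already known, so accurate annotations come for free as the virtual camera moves. The paper's actual pipeline is built in Unity3D; the sketch below is a hypothetical, heavily simplified Python stand-in (the scene objects, coordinates and "render" step are all invented) that shows the principle:

```python
import json

# Hypothetical sketch of automatic annotation in a virtual scene:
# the simulator knows every object, so each frame's labels are
# generated, not hand-drawn. All values below are placeholders.

SCENE_OBJECTS = [
    {"kind": "car", "x": 12.0, "y": 3.0},
    {"kind": "building", "x": 40.0, "y": -8.0},
    {"kind": "pedestrian", "x": 5.5, "y": 1.2},
]

def render_frame(camera_x, view_range=20.0):
    """Return labels for every scene object the virtual camera can
    currently 'see', with camera-relative offsets as ground truth."""
    annotations = []
    for obj in SCENE_OBJECTS:
        offset = obj["x"] - camera_x
        if 0.0 <= offset <= view_range:  # object lies ahead of the camera
            annotations.append({"label": obj["kind"], "offset": offset})
    return annotations

def drive_and_collect(route, step=2.0):
    """Move the camera along a straight route, emitting one
    automatically annotated frame per step."""
    dataset, x, frame_id = [], 0.0, 0
    while x <= route:
        dataset.append({"frame": frame_id, "objects": render_frame(x)})
        x += step
        frame_id += 1
    return dataset

dataset = drive_and_collect(route=30.0)
print(json.dumps(dataset[0]))
```

Because the simulator controls weather, layout and traffic, the same drive can be replayed under arbitrarily varied conditions - something prohibitively expensive to do with real-world capture and manual labelling.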

Through their complete control of the virtual environment, Wang’s team was able to create extremely specific, usable data for their object-detection system - a simulated autonomous vehicle.

The results showed a marked performance increase on nearly every metric tested. The ability to design custom-made datasets will make it practical to train a far greater variety of autonomous systems.

While their greatest performance increases came from incorporating ParallelEye datasets with real-world datasets, Wang’s team has demonstrated that their method is capable of easily creating diverse sets of images.
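Since the best results came from using synthetic and real data together, the training-side idea reduces to building one mixed pool from both sources. A minimal sketch (the file names and set sizes are invented; real pipelines would load actual images and annotations) could look like:

```python
import random

# Hypothetical sketch: merge a large synthetic set with a smaller
# real-world set into one shuffled training pool, so every batch
# blends virtual and real imagery. Records are (image_id, source)
# placeholders, not real files.

synthetic = [(f"virtual_{i:04d}.png", "synthetic") for i in range(1000)]
real = [(f"street_{i:04d}.jpg", "real") for i in range(200)]

def mixed_training_pool(synthetic, real, seed=0):
    """Combine both sources and shuffle deterministically."""
    pool = synthetic + real
    random.Random(seed).shuffle(pool)
    return pool

pool = mixed_training_pool(synthetic, real)
print(len(pool), sum(1 for _, src in pool if src == "real"))
```

The deterministic seed keeps the mix reproducible across training runs; in practice the ratio of synthetic to real examples is itself a tuning knob.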

“Using the ParallelEye vision framework, massive and diversified images can be synthesised flexibly and this can help build more robust computer-vision systems,” Wang said.

The research team’s proposed approach can be applied to many visual computing scenarios, including visual surveillance, medical image processing and biometrics.

They are now planning to create an even larger set of virtual images, improve the realism of virtual images and explore the utility of virtual images for other computer vision tasks.
