Microsoft quietly takes down huge facial-recognition database
Microsoft has discreetly taken down its database of millions of images used to train neural networks to perform facial recognition, the Financial Times has reported.
Training a neural network to recognise faces is a data-intensive process, requiring as many different images as possible, ideally representing as wide a range of people as possible. Microsoft was one of a handful of companies offering a sufficiently large dataset for anybody to use.
The Microsoft database (MS Celeb) contained more than 10 million images of approximately 100,000 people, gathered from search engines and largely without their consent. This was possible under the Creative Commons license, which allows material to be reused for academic purposes with the permission of the copyright holder (rather than of the people pictured).
Microsoft told the Financial Times: “The site was intended for academic purposes. It was run by an employee that is no longer with Microsoft and has since been removed.” The dataset will remain available to any university or company that has already downloaded it.
According to the Financial Times, the MS Celeb dataset was originally meant to be used to recognise celebrities, but also contained photographs of private citizens and was used by IBM, Panasonic, Alibaba, Hitachi, Nvidia and two companies which supply technology to the Chinese government, SenseTime and Megvii.
The Chinese government is using facial-recognition technology to closely track its minority Muslim population in Xinjiang through a widespread surveillance system, including cameras located around mosques and internet cafés. Reports by Human Rights Watch have accused Beijing of subjecting Muslim minorities to intensive surveillance, forcing them to renounce their religion and even holding millions in internment camps. Megvii and SenseTime denied any involvement in the crackdown on ethnic minorities in China.
Research has shown that the MS Celeb dataset was used in China more than anywhere else in the world over the last two years.
The use of facial-recognition technology by governments and security organisations has been extremely controversial given concerns about invasion of privacy and the inaccuracy of facial-recognition systems, particularly with regard to women and people with dark skin tones. In January 2019, more than 85 human rights groups (including the American Civil Liberties Union and the Freedom of the Press Foundation) lobbied tech giants Microsoft, Amazon and Google to stop selling their facial-recognition software to governments, arguing that this use “threatens the safety of community members and will also undermine public trust.”
In February, Microsoft president Brad Smith commented in an interview with Business Insider that facial-recognition technology could be used for humanitarian purposes and that a blanket ban on government use of the technology “risks being cruel in its humanitarian effect”. Last week, Amazon shareholders rejected a motion calling on the company to limit selling its Rekognition software to the US government.
In March, IBM came under criticism for building a facial-recognition dataset using almost 100 million photographs collected from Flickr without the knowledge of the people pictured.