Facial recognition trap: users in the dark as millions of pics grabbed from web
User-generated content has accelerated the harvesting of images needed for facial recognition research and design, and while it has enabled major leaps in the technology’s accuracy, it has also raised concerns over the legitimacy of using private content and, by extension, over user security.
However imperfect facial recognition technology currently is, it is advancing at an extraordinary pace, and it already rests on an array of algorithms that learn to map a person’s facial features and to tell one face apart from another.
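As a rough illustration of what “learning facial features and the ways they differ” means in practice, the sketch below uses the open-source face_recognition library (our choice for illustration; the article names no specific tool). Each face is reduced to a 128-number encoding, and two faces are compared by the distance between their encodings:

```python
# pip install face_recognition -- an open-source library used here purely
# for illustration; it is not one of the systems discussed in the article.
import face_recognition

# Compute a 128-dimensional encoding for the first face found in each photo.
known = face_recognition.face_encodings(
    face_recognition.load_image_file("person_a.jpg"))[0]
candidate = face_recognition.face_encodings(
    face_recognition.load_image_file("unknown.jpg"))[0]

# Faces "differ" as a distance between encodings: a small distance suggests
# the same person, a large one a different person (0.6 is the library's default cutoff).
distance = face_recognition.face_distance([known], candidate)[0]
print(f"encoding distance: {distance:.3f} -> "
      f"{'match' if distance < 0.6 else 'no match'}")
```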
To make these algorithms work and grow more sophisticated, vast numbers of pictures of a wide variety of human faces are needed, and developers scrape them from the depths of the internet. Meanwhile, the people who posted the pictures or posed for them are completely unaware of the practice, or that the images are being shared in bulk with researchers and academic groups, NBC News reported.
User-generated content has taken modern life by storm, putting images of ordinary people on practically every platform, from YouTube, Facebook and Wikipedia to police mugshot databases.
“This is the dirty little secret of AI training sets. Researchers often just grab whatever images are available in the wild”, said NYU School of Law Professor Jason Schultz, as cited by NBC News.
One of the examples cited by the American outlet is the activity of IBM, the latest company to engage in facial recognition design. In January, it released a dataset of roughly a million photos that were taken from the photo-hosting site Flickr and specially coded, or annotated, to describe each person’s appearance, to the bafflement of many photographers and models:
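The report does not spell out the annotation format, but purely as a hypothetical illustration, a single entry in such a “training dataset” might pair a scraped photo link with appearance descriptors along these lines (all field names here are invented, not IBM’s actual schema):

```python
# Entirely hypothetical record layout; every field name is illustrative,
# not taken from IBM's dataset.
annotation = {
    "photo_url": "https://www.flickr.com/photos/<user>/<photo_id>",  # scraped source
    "face_box": [312, 144, 498, 360],    # pixel coordinates of the detected face
    "appearance": {                      # descriptors of the person's appearance
        "estimated_age_range": [25, 34],
        "skin_tone": "type III",         # e.g. a Fitzpatrick-style category
        "nose_width": 0.31,              # example craniofacial measurements
        "eye_distance": 0.42,
    },
}
```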
“None of the people I photographed had any idea their images were being used in this way”, said Greg Peverill-Conti, a Boston-based public relations executive who has more than 700 photos in IBM’s collection, known as a “training dataset”. He added that it seems a bit “sketchy that IBM can use these pictures without saying anything to anybody”.
Interestingly, despite IBM’s stated commitment to “protecting the privacy of individuals”, as John Smith, who oversees AI research at IBM, put it, NBC News discovered that it’s almost impossible to get photos removed from the database: the company requires users to post individual links to each picture they want taken down, rather than erasing en masse everything registered in the database under a Flickr user’s login.
Some experts and activists argue that this is not just an infringement on the privacy of the millions of people whose images have been harvested online, but that it also raises broader questions about where ever-improving facial recognition technology may lead; the disproportionate targeting of minorities is a particularly worrisome possibility.
“People gave their consent to sharing their photos in a different internet ecosystem”, said Meredith Whittaker, co-director of the AI Now Institute, which studies the social implications of AI.
“Now they are being unwillingly or unknowingly cast in the training of systems that could potentially be used in oppressive ways against their communities”.
Civil liberties advocates and tech ethics specialists are now questioning IBM’s motives, pointing to the company’s history of selling surveillance tools and underscoring that, outrageously, no consent is sought from either the photographers or their subjects.
For instance, in the wake of the 9/11 attacks, IBM is known to have sold the New York City Police Department technology that allowed it to search CCTV feeds for people with particular skin tones or hair colour. Separately, IBM has released an “intelligent video analytics” product that uses body camera surveillance to recognise people by “ethnicity” tags, such as Asian, black or white.
The company also now sells a system called IBM Watson Visual Recognition, which IBM says can estimate the age and gender of people depicted in images. Some note that such advances are positive, given that they make facial recognition systems less and less prone to error and add to their accuracy.
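IBM’s own API is not reproduced in the article; as a stand-in, the snippet below performs the same kind of age and gender estimation with the open-source DeepFace library (an assumed analogue, not IBM Watson Visual Recognition itself):

```python
# pip install deepface -- an open-source analogue used for illustration,
# not IBM Watson Visual Recognition.
from deepface import DeepFace

# Estimate age and gender for the face(s) detected in a local image file.
results = DeepFace.analyze(img_path="photo.jpg", actions=["age", "gender"])
print(results)  # recent versions return a list with one dict per detected face,
                # including estimated "age" and "dominant_gender" fields
```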
(SPUTNIK)