Always be wary of hidden algorithms, says Jennifer Stark

University of Maryland's Jennifer Stark says we should use new technologies with caution, describing an experiment she performed to "audit" Google's search results for biases.

Stark was speaking at the first European Data and Computational Journalism Conference on Thursday, 6th July, held in UCD and sponsored by RTÉ and Google News Lab. 

The audit analysed images that appear in a box to the right of Google search results, serving both as a preview of the dedicated image-search tab and as a supplement to the text results. Stark said these images are not selected in an obvious way, such as taking the top eight results, but by a Google algorithm that is kept secret.

The concern over this information being kept secret by Google, Facebook and others is that, as people grow more reliant on the giants' products, there is increased potential for unknown biases or imbalances to affect which information is presented to people. What comes up first on Google, and what is selected for the image box, is generally taken to be the most popular, important, or representative.

Images in particular may be more important than they appear; in elections, for instance, a candidate's appearance has a direct bearing on their success, said Stark.

"Generally, the more attractive one wins." she said, "Even after we know more about them".

And which image or search result appears, or appears first, has a profound effect on where people "end up" on the internet, particularly as every image has a source: a host web page or article it links back to.

"Google has a way of pointing you in a particular direction by directing you to particular articles"- Jennifer Stark.

And of course, there is huge competition in media to get the top spot: to make sure your article has enough photos and the right keywords to be ranked higher by Google. This, the tech giants say, is part of their need for secrecy: the more people know about how to rank higher, the more they can game the system to manipulate results.

Stark said she decided to audit the search engine to discover whether she could see anything unusual, or unbalanced, about the way images were chosen for the top spot.

The subjects of her experiment would be the candidates in the 2016 US presidential election: Hillary Clinton and Donald Trump.

Images were analysed with Microsoft's Emotion API to determine how many photos of each candidate were classified as happy, neutral, angry, surprised, and so on.
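Stark's talk did not include code, but a minimal sketch of such a call might look like the following, assuming the Emotion API's v1.0 REST endpoint (a service Microsoft has since folded into its Face offering); the endpoint URL, placeholder key and the `dominant_emotion` helper are illustrative, not from the talk.

```python
import requests

# Assumed v1.0 REST endpoint; requires an Azure subscription key (placeholder here).
ENDPOINT = "https://westus.api.cognitive.microsoft.com/emotion/v1.0/recognize"
API_KEY = "YOUR_SUBSCRIPTION_KEY"

def dominant_emotion(image_url):
    """Return the highest-scoring emotion for the first face found, or None."""
    resp = requests.post(
        ENDPOINT,
        headers={"Ocp-Apim-Subscription-Key": API_KEY,
                 "Content-Type": "application/json"},
        json={"url": image_url},
    )
    resp.raise_for_status()
    faces = resp.json()  # one entry per detected face
    if not faces:
        return None
    # Each face carries scores such as {"happiness": 0.93, "neutral": 0.05, ...}
    scores = faces[0]["scores"]
    return max(scores, key=scores.get)
```

Running this over every saved image yields a label per photo, which can then be tallied per candidate.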

"Research shows that women in news media smile more than men, they’re portrayed as happy and passive and men are portrayed more neutral or aggressive." said Jennifer.

To check for bias, she compared photos of the candidates in the hundreds of image search results with the smaller, more prominent selection appearing beside a normal search.

She checked whether the emotions in the box selection were representative of the larger selection, and also whether the links (the articles the images appeared in, for instance) came from left-leaning, right-leaning or neutral sources.
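The talk did not specify the exact test, but one plausible way to check whether the box is a representative sample of the larger pool is a chi-square goodness-of-fit comparison; the counts below are invented purely for illustration.

```python
from scipy.stats import chisquare

emotions = ["happiness", "neutral", "anger", "surprise"]

# Hypothetical counts of each emotion for one candidate.
box_counts = [14, 4, 1, 1]        # the small selection beside a normal search
pool_counts = [120, 260, 70, 50]  # the much larger image-search results

# Scale the pool's distribution down to the box total to get expected counts.
total_box = sum(box_counts)
total_pool = sum(pool_counts)
expected = [c / total_pool * total_box for c in pool_counts]

stat, p = chisquare(f_obs=box_counts, f_exp=expected)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
# A small p-value suggests the box is not a representative
# sample of the wider image results.
```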

Self Control

Describing herself as a lab-coat scientist, Stark said it can be tough in social science to achieve the desired conditions for an experiment. "In social [science], you don't have control over everything; you're measuring something that happened in the world at large," said Stark. "To have a control group is very challenging."

To counter her own bias, and to rank sources by how left- or right-leaning they were, Stark turned to AllSides.com, a site that offers bias ratings for media outlets. She also used images saved over a period of time leading up to the election, as both the search results and the image-box selection change regularly.
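As a sketch of that tagging step: each image's host article can be mapped to a leaning by its domain. The domains and labels below are hand-transcribed examples of the kind of ratings AllSides publishes, not Stark's actual table.

```python
from collections import Counter
from urllib.parse import urlparse

# Illustrative leanings in the style of AllSides ratings (assumed, not Stark's data).
LEANING = {
    "cnn.com": "left",
    "nytimes.com": "left",
    "foxnews.com": "right",
    "reuters.com": "center",
}

def source_leaning(article_url):
    """Map the article an image links back to onto a political leaning, if rated."""
    domain = urlparse(article_url).netloc.lower()
    if domain.startswith("www."):
        domain = domain[4:]
    return LEANING.get(domain, "unrated")

# Hypothetical article URLs that box images linked back to.
urls = [
    "https://www.cnn.com/2016/10/some-story",
    "https://www.foxnews.com/politics/another-story",
]
print(Counter(source_leaning(u) for u in urls))
```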

The experiment did eventually show discrepancies: the selection of images in the box showed Clinton as mostly happy, whereas the overall image results generally showed her with a neutral expression.

"We showed statistically that representation of candidates’ emotions in the image box is not equal to Google’s universe." Stark said.

The sources of the referenced images were also, more often than not, liberal, for both Trump and Clinton, although Stark was quick to say this could simply be because there are more liberal news sources online.

Does this mean Google's algorithm is biased? Stark says it doesn't; there are many possible reasons for the discrepancy, such as Microsoft's emotion-recognition system being itself biased or inaccurate. But, she said, it does show that what Google presents is not necessarily perfectly representative of reality.

"With any new technology, we should be wary" she said.