Taming the Web

What is an Algorithm?

By Shawn Fuller

Contributing Writer

The earliest World Wide Web (Web 1.0) was a static network of websites and search engines, alongside the pre-web Usenet groups and AOL conferencing. It resembled a vast library: you found things of interest through a search engine and retrieved the best content from the search results. With the next iteration (Web 2.0), companies stayed away from generating any content, virtual reality or otherwise, and focussed their attention on the network itself. The user experience of Facebook, Twitter, and YouTube was pared down to simple scrolling web pages. All content came from the users, or from the advertisers. All the money, the millions of dollars in investment capital, was spent on algorithms that enhanced social networking.

Algorithms 101

In its simplest form, an algorithm is any set of steps that, if followed, will accomplish a goal. For instance, a recipe is an algorithm that a human follows to make a meal.

[Figure: empty circles divided into three columns (input, hidden, and output) with arrows connecting them]
An artificial neural network is an interconnected group of nodes. These networks are inspired by the neural architecture of the human brain. Here, each circular node represents an artificial neuron and an arrow represents a connection from the output of one artificial neuron to the input of another. Coloured neural network by Glosser.ca / CC BY-SA 3.0

A traditional computer algorithm should provide the same answer, or do the same thing, each time it is given the same input. Increasingly, tech companies use machine learning algorithms for many of their systems. Algorithms developed by machine learning don’t follow an ordered set of steps written out by a programmer. Instead, they work more like the perceptual systems of the human brain, which process visual input through successive layers of neurons, each layer specialized to extract different kinds of information. The software that supports machine learning mimics these interconnected neurons, whose behaviour changes with learning. The algorithms of social media take in information about our behaviours, make decisions based on their own hidden learning, and feed the results back to us.
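To make the contrast concrete, here is a minimal sketch in Python. The names `both_on` and `Neuron` are invented for illustration. The first function is a traditional algorithm with its steps written out by hand; the second is a single artificial neuron (a classic perceptron, far simpler than anything a tech company would deploy) that starts out knowing nothing and changes its behaviour as it is shown examples.

```python
def both_on(a, b):
    """Traditional algorithm: fixed steps, same input always gives same output."""
    return 1 if a == 1 and b == 1 else 0


class Neuron:
    """One artificial neuron whose behaviour changes as it learns."""

    def __init__(self):
        self.weights = [0.0, 0.0]
        self.bias = 0.0

    def predict(self, inputs):
        # Weighted sum of the inputs; the neuron "fires" (1) if it is positive.
        total = self.bias + sum(w * x for w, x in zip(self.weights, inputs))
        return 1 if total > 0 else 0

    def learn(self, inputs, target, rate=1.0):
        # Nudge the weights toward the right answer whenever the guess is wrong.
        error = target - self.predict(inputs)
        self.weights = [w + rate * error * x
                        for w, x in zip(self.weights, inputs)]
        self.bias += rate * error


# Training examples: the desired output is 1 only when both inputs are 1.
examples = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

neuron = Neuron()
before = neuron.predict([1, 1])   # the untrained neuron's answer

for _ in range(20):               # show the examples repeatedly
    for inputs, target in examples:
        neuron.learn(inputs, target)

after = neuron.predict([1, 1])    # the trained neuron's answer
```

Nobody wrote the rule "both inputs must be on" into the neuron. It ends up stored in the learned weights, which is why the decisions of much larger networks can be hard to inspect.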

Hannah Fry summarizes the four major categories of algorithms that are used by Google, YouTube, and Facebook in her book, Hello World: Being Human in the Age of Algorithms. They are prioritization, association, classification, and filtering.

Prioritization algorithms rank things according to criteria such as popularity or ratings. When you search YouTube for videos of hurricanes, cats, TV bloopers, or how to repair your dishwasher, it uses prioritization algorithms to bring you the most popular videos, ranked according to the number of times each video has been viewed by other people. It also displays videos that you might like to watch, based on their association with other videos that people have viewed.
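Ranking by popularity can be sketched in a few lines of Python. The video titles, view counts, and the `prioritize` function below are made up for illustration, and a real search engine weighs many more signals than a raw view count.

```python
# Hypothetical catalogue of videos with made-up view counts.
videos = [
    {"title": "Hurricane makes landfall", "views": 850_000},
    {"title": "How to repair your dishwasher", "views": 120_000},
    {"title": "Inside the eye of a hurricane", "views": 2_300_000},
    {"title": "Hurricane season explained", "views": 410_000},
]


def prioritize(videos, query):
    """Return the videos matching the query, most viewed first."""
    matches = [v for v in videos if query.lower() in v["title"].lower()]
    return sorted(matches, key=lambda v: v["views"], reverse=True)


ranked_titles = [v["title"] for v in prioritize(videos, "hurricane")]
```

Here `ranked_titles` lists the three hurricane videos from most to least viewed, and the dishwasher video is left out because it does not match the search.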

Association algorithms find connections between things. They are what Amazon uses to display other books and products that you may be interested in, based on what you just searched for. It is the association algorithms that may create a radicalization pipeline in which the recommendation engine offers increasingly extreme videos. The media scholar Zeynep Tufekci observed that after she watched a number of videos of Donald Trump rallies on YouTube, the site began to autoplay videos featuring white supremacist rants and Holocaust denials. When she started watching videos of Hillary Clinton and Bernie Sanders, YouTube started recommending left-wing conspiratorial videos filled with allegations of secret government agencies and 9/11 cover-ups. In response to public outcry and media attention over ISIS and extremist videos, YouTube began removing videos that directly preach hate or incite violence. It also adjusted its ranking algorithms to de-amplify videos that promote conspiracy theories and pseudoscience. Now, if you look up chloroquine and COVID-19, for example, YouTube will use its association algorithms to connect you to videos by medical researchers, who provide a more factual view of the topic.
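One simple way to find such associations is to count which items turn up together in people's viewing histories. The Python sketch below does that with invented data; `also_watched` is a hypothetical name, and this is not a description of how YouTube's or Amazon's recommendation engines actually work.

```python
from collections import Counter

# Hypothetical viewing histories, one list per viewer.
histories = [
    ["hurricane footage", "storm chasers", "tornado close call"],
    ["hurricane footage", "storm chasers"],
    ["hurricane footage", "storm chasers", "cat bloopers"],
    ["dishwasher repair", "fix a leaky tap"],
]


def also_watched(video, histories, top_n=1):
    """Return the videos most often watched by people who watched `video`."""
    counts = Counter()
    for history in histories:
        if video in history:
            # Count every other video in the same viewer's history.
            counts.update(v for v in history if v != video)
    return [title for title, _ in counts.most_common(top_n)]


recommendation = also_watched("hurricane footage", histories)
```

With this data the top recommendation for "hurricane footage" is "storm chasers", because three viewers watched both. The algorithm has no idea what either video contains. It only follows what other people watched next, which is how a chain of recommendations can drift toward more extreme material.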

Classification algorithms attempt to place you in various categories. The massive data harvesting that social media applications and data broker companies engage in is aimed at placing you in demographic and behavioural categories in order to target ads for products that might interest you. It may comfort you to know that these algorithmic guesses are not entirely accurate (yet). Based on my signed-in activity, Google thinks I am not a parent. Meanwhile, my daughter’s Google account thinks she is interested in cars and basketball (neither is true).
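A toy classifier shows how easily such guesses go wrong. In the Python sketch below, the categories, keywords, and the `classify` function are all invented for illustration. Real ad systems rely on learned models and far more data, but they make the same kind of leap from behaviour to category.

```python
# Hypothetical interest categories and the keywords that trigger them.
CATEGORY_KEYWORDS = {
    "cars": {"engine", "sedan", "tires"},
    "basketball": {"nba", "dunk", "playoffs"},
    "parenting": {"stroller", "diapers", "daycare"},
}


def classify(search_history):
    """Guess a user's interest categories from the words they searched for."""
    words = set()
    for query in search_history:
        words.update(query.lower().split())
    # A category applies if any of its keywords appears in the history.
    return sorted(category
                  for category, keywords in CATEGORY_KEYWORDS.items()
                  if words & keywords)


guesses = classify(["winter tires sale", "nba playoffs schedule for dad"])
```

Someone who bought winter tires once and looked up a game schedule for a relative is labelled as interested in "basketball" and "cars", whether or not that is true.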

Filtering algorithms remove or exclude information that is considered noise or not of interest. Siri and Alexa need to filter out background noise in order to recognize your voice commands. Similarly, your phone filters out background noise while you are speaking to someone. This sometimes leads to the perception that your call has been disconnected during a pause in the conversation. (“Hello? Are you still there?” “Yes, I’m still here.”) Social media apps use filtering algorithms to include only the stories, memes, and videos that match your known interests.
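The feed-filtering case can be sketched in Python as follows. The posts, topics, and the `filter_feed` function are hypothetical, and real feeds combine filtering with the ranking and association steps described earlier.

```python
# Hypothetical posts, each tagged with a single topic.
posts = [
    {"text": "Ten cat memes to start your day", "topic": "cats"},
    {"text": "City council debates new budget", "topic": "local politics"},
    {"text": "One-pot pasta in 20 minutes", "topic": "cooking"},
    {"text": "Playoff highlights", "topic": "basketball"},
]


def filter_feed(posts, interests):
    """Keep only the posts whose topic is among the user's known interests."""
    return [post for post in posts if post["topic"] in interests]


feed = filter_feed(posts, {"cats", "cooking"})
feed_topics = [post["topic"] for post in feed]
```

The resulting feed contains only the cat and cooking posts. Everything else, including the council story, is silently dropped, so the user never learns what was left out.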

In the series: Taming the Web



Further Reading


Websites:

First Draft
The mission of First Draft is to protect communities from harmful misinformation. Through its CrossCheck program, it works with a global network of journalists to investigate and verify emerging news stories. The site has many research articles, educational resources, and guidelines on misinformation and infodemics.

Data & Society
Data & Society studies the social implications of data-centric technologies & automation. It has a wealth of information and articles on social media and other important topics of the digital age.

Stanford Internet Observatory
The Stanford Internet Observatory is a cross-disciplinary program of research, teaching and policy engagement for the study of abuse in current information technologies, with a focus on social media.

Profiles:

Sinan Aral

Sinan Aral is the David Austin Professor of Management, IT, Marketing and Data Science at MIT, Director of the MIT Initiative on the Digital Economy (IDE), and a founding partner at Manifest Capital. He has done extensive research on the social and economic impacts of the digital economy, artificial intelligence, machine learning, natural language processing, and social technologies such as digital social networks.


Renée DiResta

Renée DiResta is the technical research manager at Stanford Internet Observatory, a cross-disciplinary program of research, teaching and policy engagement for the study of abuse in current information technologies. Renée investigates the spread of malign narratives across social networks and assists policymakers in devising responses to the problem. She has studied influence operations and computational propaganda in the context of pseudoscience conspiracies, terrorist activity, and state-sponsored information warfare, and has advised Congress, the State Department, and other academic, civil society, and business organizations on the topic. At the behest of the Senate Select Committee on Intelligence (SSCI), she led one of the two research teams that produced comprehensive assessments of the Internet Research Agency’s and GRU’s influence operations targeting the U.S. from 2014 to 2018.

YouTube talks:
The Internet’s Original Sin
Renée DiResta shows how the business models of the internet companies led to platforms that were designed for propaganda.

Articles:
Computational Propaganda
“Computational Propaganda: If You Make It Trend, You Make It True”
The Yale Review


Claire Wardle

Dr. Claire Wardle is the co-founder and leader of First Draft, the world’s foremost non-profit focused on research and practice to address mis- and disinformation.


Zeynep Tufekci

Zeynep Tufekci is an associate professor at the School of Information and Library Science at the University of North Carolina, Chapel Hill, a contributing opinion writer at The New York Times, and a faculty associate at the Berkman Klein Center for Internet and Society at Harvard University. Her first book, Twitter and Tear Gas: The Power and Fragility of Networked Protest, provided a firsthand account of modern protest fueled by social movements on the internet.
She writes regularly for The New York Times and The New Yorker.

TED Talk:

We’re building a dystopia just to make people click on ads