Gsd-skill-creator algorithmic-awareness

Understanding how algorithmic systems shape what users see, know, and do -- from recommendation feeds to search ranking to credit scoring to hiring software. Covers the mechanics of recommendation systems, algorithmic bias and its sources, personalization's effects on information diets, opacity and accountability, AI limitations (hallucination, confident wrongness), and the human-in-the-loop question. Use when a learner needs to think critically about why particular content reached them.

install

source · Clone the upstream repo

git clone https://github.com/Tibsfox/gsd-skill-creator

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/Tibsfox/gsd-skill-creator "$T" && mkdir -p ~/.claude/skills && cp -r "$T/examples/skills/digital-literacy/algorithmic-awareness" ~/.claude/skills/tibsfox-gsd-skill-creator-algorithmic-awareness && rm -rf "$T"

manifest: examples/skills/digital-literacy/algorithmic-awareness/SKILL.md

source content

Algorithmic Awareness

Algorithmic awareness is the discipline of noticing that the content reaching you was selected by a system optimizing for something, and asking what that something is. Most online experience is now mediated by recommendation algorithms: what you see on social media, what videos YouTube queues, what appears at the top of search results, which products Amazon pushes, which job postings surface, which loan offers arrive. The systems are not neutral; they are trained to produce specific outcomes, and those outcomes are not always aligned with yours. This skill draws from Safiya Noble's Algorithms of Oppression, Cathy O'Neil's Weapons of Math Destruction, and the algorithmic accountability research community.

Agent affinity: noble (algorithmic bias, power asymmetry), palfrey (institutional framing), rheingold (user-facing strategies)

Concept IDs: diglit-recommendation-systems, diglit-algorithmic-bias, diglit-ai-limitations, diglit-data-collection

What Is An Algorithm, In This Context

The word "algorithm" has two meanings that get conflated.

Narrow technical meaning: A finite sequence of precise steps that produces an output from an input. Sorting a list is an algorithm. Computing a checksum is an algorithm.

Broader popular meaning: A proprietary, often machine-learned system that makes decisions about what users see or what happens to them. "The Facebook algorithm" or "the hiring algorithm." This is usually a pipeline of statistical models trained on historical data, optimized for a business objective.

This skill is about the second. The systems we call "algorithms" in everyday speech are not neutral calculators; they are trained-to-maximize machines whose training objectives are almost always different from what users would consciously choose.

How Recommendation Systems Work

At a high level, a recommendation system does this:

Represent the user as a vector of features: explicit preferences (things you followed, liked, purchased) and implicit signals (time on page, scroll speed, mouse hover, pause rate, replay).
Represent the content as a vector of features: topics, creators, format, engagement history, freshness.
Score each piece of content against the user vector using a model trained to predict a target metric.
Rank content by score. Show the top K.
Observe the user's behavior on what was shown. Feed that back into training.

The critical question is step 3: what is the target metric?

The objective function problem

Recommendation systems are trained to optimize a specific measurable outcome. Common choices:

Click-through rate -- will you click?
Watch time -- will you keep watching?
Engagement -- will you like, comment, share?
Retention -- will you come back tomorrow?
Revenue per user -- will ads shown to you earn the company money?

These metrics are proxies for "value to the user" but they are imperfect proxies. Outrage drives clicks. Anxiety drives engagement. Polarizing content drives retention. A system trained to maximize engagement will surface engaging content whether or not it is true, healthy, or good for you.

This is not a conspiracy. No one at a platform company sits in a room deciding to amplify misinformation. The objective function does it automatically. The engineers would need to actively override the optimization to stop it.

Filter Bubbles and Echo Chambers

Personalization means different users see different things. The systemic consequence is that your information environment becomes progressively tuned to your past behavior.

Filter bubble

Eli Pariser's 2011 concept: the personalized web produces an information environment shaped by your clicks, where dissenting or unfamiliar viewpoints are filtered out not by a human editor but by an algorithm trained to predict your preferences.

Echo chamber

A community whose members primarily interact with each other, reinforcing shared beliefs and suppressing counter-evidence. Social platforms naturally produce these because homophily (the tendency to connect with similar others) is strong in human networks.

How large is the effect

The empirical literature is mixed. Early filter-bubble claims were sometimes overstated -- most people still encounter diverse content, and social platforms can actually expose users to more diverse views than their offline networks would. But the effect is real for heavy users, and the asymmetry of amplification means extreme content spreads disproportionately.

Algorithmic Bias

Bias in algorithmic systems is not a bug in the code. It is a feature of how the systems are built.

Sources of bias

Training data bias. The system learns from historical data. If the history reflects bias, the system reproduces it. A resume screener trained on a company's historical hires learns who the company historically hired -- which may not be who they should have hired.
Proxy bias. The system uses variables that are correlated with protected attributes. ZIP code correlates with race in the U.S., so lending models that use ZIP code may produce discriminatory outcomes even when race is not an input.
Feedback loop bias. A system's predictions affect the world, which produces the next round of training data. Predictive policing models direct police to areas where crime was previously reported, leading to more reports in those areas, reinforcing the model's prediction.
Measurement bias. The target variable itself is biased. "Engagement" measures what users clicked; it does not measure what users found valuable. Optimizing for the former does not give you the latter.
Evaluation bias. Models are tested on datasets that do not represent the full user population. Facial recognition systems performed dramatically worse on darker-skinned faces for years because the test sets were overwhelmingly lighter-skinned.

Documented cases

Amazon's hiring model (reported 2018): trained on 10 years of resumes, mostly from men. The model learned to penalize mentions of "women's" (as in "women's chess club"). Amazon scrapped the project.
COMPAS recidivism scoring: A 2016 ProPublica investigation found the system rated Black defendants as higher risk than white defendants with comparable histories.
Healthcare risk algorithms: A 2019 Science paper found a widely used system used healthcare costs as a proxy for healthcare needs, systematically underestimating Black patients' needs because less money was historically spent on their care.
Search results: Safiya Noble's 2018 work documented how search queries for "Black girls" produced pornographic results, while searches for "white girls" did not. Google adjusted the results after coverage; the underlying dynamics remain.

The AI Limitations Conversation

In 2026, "algorithm" increasingly means "large language model" or "generative AI." Understanding what these systems can and cannot do is essential digital literacy.

What LLMs are good at

Summarizing and paraphrasing text
Drafting structured content (emails, outlines, code skeletons)
Translation (high-resource language pairs)
Pattern recognition in text

What LLMs are bad at

Facts. LLMs produce plausible-sounding output regardless of truth. They will generate fake citations, wrong dates, misremembered quotes, and confident nonsense.
Math and arithmetic. Without tool use, LLMs are unreliable calculators.
Reasoning at scale. Short chains of reasoning work; long chains compound errors.
Current events. Training data has a cutoff date. The model does not know what happened yesterday.
Nuance about their own limitations. They are often confidently wrong about what they know.

Hallucination

Hallucination is the technical term for when an LLM generates factual content that is wrong. It is not lying -- the model has no concept of lying -- but the output is not reliable and must be verified against external sources.

The human in the loop

The correct relationship with generative AI is partnership, not delegation. The AI drafts; the human verifies. The AI suggests; the human decides. Delegating decisions entirely to AI systems produces exactly the failure modes this skill describes.

Opacity and Accountability

Most algorithmic systems are opaque: you cannot see the code, you cannot inspect the training data, and you often cannot even know when a decision was made by a machine versus a person.

Why opacity matters

When you are rejected for a loan, a job, or an insurance claim, you have very limited means to understand why or to challenge it. "The algorithm decided" is not an answer in a democratic society.

Emerging accountability

Some jurisdictions now require algorithmic impact assessments, explanation rights, or human review of automated decisions. The EU AI Act (2024) classifies AI systems by risk and imposes obligations on high-risk uses. Enforcement is still developing.

What individual users can do

Ask whether an automated system made the decision. You may have a legal right to know.
Ask for human review. Often available and often underused.
Document. Keep evidence of decisions you want to challenge.
Support organizations doing algorithmic accountability work. EFF, AI Now Institute, Algorithmic Justice League, Mozilla Foundation.

When NOT to Use This Skill

Evaluating a specific claim's truth. Use
```
information-evaluation
```
.
Privacy settings and data collection. Use
```
data-privacy
```
.
Understanding the computer hardware and networks. Use
```
computational-literacy
```
.

Decision Guidance

When an algorithmic system affects your experience, ask:

What is the system optimizing for? If you do not know, assume the answer is not "your wellbeing."
What evidence am I seeing? Is this content really representative or just what the algorithm surfaced?
Who benefits from my current attention? Follow the money.
What would a different system show me? Occasionally use different search engines, different platforms, different sources to calibrate.
Am I the product or the customer? If the service is free, the answer is almost always the first.

Cross-References

noble agent: Algorithmic bias, power asymmetry, "Algorithms of Oppression"
palfrey agent: Institutional accountability, policy framing
rheingold agent: User-facing strategies, crap detection against AI output
information-evaluation skill: Downstream defense against algorithmic amplification
data-privacy skill: Upstream cause -- collected data feeds the algorithms

References

Noble, S. U. (2018). Algorithms of Oppression: How Search Engines Reinforce Racism. NYU Press.
O'Neil, C. (2016). Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown.
Pariser, E. (2011). The Filter Bubble: What the Internet Is Hiding from You. Penguin.
Eubanks, V. (2018). Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. St. Martin's Press.
Buolamwini, J., & Gebru, T. (2018). "Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification." Proceedings of Machine Learning Research 81.
Obermeyer, Z. et al. (2019). "Dissecting racial bias in an algorithm used to manage the health of populations." Science 366(6464), 447-453.