Alignment in recommender systems
Over the span of any given Monday you likely interact with at least three different recommender systems. If you’re like me and actively avoid most forms of social media, you might still scroll X (“it’s for learning!”) as you wait for your coffee, put on a Spotify daily mix at the gym, and let YouTube play a recommended Tiny Desk as you make dinner. If you also use TikTok, Instagram, and Netflix on an average Monday, you’ve just doubled your recommender system exposure for the day.
Very brief history of recommender systems
Modern recommender systems are “designed to provide personalized recommendations to users of online products and services, enhancing the user’s online experience” (Li et al., 2021). But recommender systems have a rich (pun intended) history that began decades before widespread internet use. One of the lesser-sung heroes of computer science is Dr. Elaine Rich, who pioneered one of the earliest recommender systems, Grundy, at CMU in 1979.
Dr. Rich writes:
Grundy exploited relatively deep models, both of the books it knew and of its users. This meant that the models of the individual books had to be created by hand. And what about the models of each user? The goal, in designing Grundy, was to minimize the amount of interaction a user would have to have with Grundy in order to get going. To achieve that goal, Grundy used stereotypes, which could be triggered for a particular person just from a small set of words that the person provided as a simple self description. Grundy generated book recommendations by comparing its model of the current user to the models of the books it knew about. It chose the best match and generated a short description of the book, which emphasized the reasons it thought this user would like the book. Then it asked the user whether it liked the recommendation, and, if not, why not. Using that feedback, it updated both its stereotypes and its model of the current user.

Dr. Rich compares this approach, which still required “relatively deep [hand-created] models, both of the books it knew and of its users”, to collaborative filtering, which predicts a user’s preferences based on the behavior of users similar to them.
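To make collaborative filtering concrete, here is a minimal sketch of the user-based variant: predict a user’s rating for an unseen item as a similarity-weighted average of other users’ ratings. The toy data and function names are mine, not from any particular production system.

```python
import numpy as np

# Toy user-item ratings matrix (rows: users, columns: books); 0 = unrated.
ratings = np.array([
    [5, 4, 0, 1],
    [4, 5, 1, 0],
    [1, 0, 5, 4],
], dtype=float)

def cosine_sim(a, b):
    """Cosine similarity between two users' rating vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return (a @ b) / denom if denom else 0.0

def predict(user, item):
    """Similarity-weighted average of other users' ratings for this item."""
    num = den = 0.0
    for other in range(ratings.shape[0]):
        if other == user or ratings[other, item] == 0:
            continue  # skip the user themselves and non-raters
        sim = cosine_sim(ratings[user], ratings[other])
        num += sim * ratings[other, item]
        den += abs(sim)
    return num / den if den else 0.0

print(predict(user=0, item=2))  # user 0's predicted rating for book 2
```

The intuition is Dr. Rich’s stereotype idea turned inside out: instead of hand-building user models, let overlap in behavior between users stand in for the model.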
Referencing the 2023 IEEE survey Recent Developments in Recommender Systems, a good example of a hybrid recommender system is Netflix’s recommendation system: at a high level, it combines collaborative filtering with content-based methods, the latter conceptually similar to Dr. Rich’s original approach of learning a user model from behavior and using it to personalize recommendations. The following is just a snapshot of the input data Netflix considers, the tip of the iceberg, but it already shows how complex these systems become and what bottlenecks we face in designing them to be scalable and memory- and time-efficient (a toy sketch of hybrid scoring follows the list below):
We estimate the likelihood that you will enjoy a particular title in our catalog based on a number of factors including:
your interactions with our service (such as your viewing history and how you rated other titles),
other members with similar tastes and preferences on our service, and
information about the titles, such as their genre, categories, actors, release year, etc.
In addition to knowing what you have watched on Netflix, to best personalize the recommendations we also consider factors including:
the time of day you're enjoying Netflix,
the languages you prefer,
the devices you are enjoying Netflix on, and
how long you enjoyed a Netflix title.
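As a toy illustration of the hybrid idea, the sketch below blends a collaborative-filtering score with content and context features like those Netflix lists above. The weights, feature names, and structure are all invented for illustration; Netflix’s actual model is far more complex and not public.

```python
# Hypothetical hybrid scorer; all weights and feature names are assumptions.
def hybrid_score(cf_score, title, user):
    content = len(set(title["genres"]) & set(user["preferred_genres"]))
    context = 1.0 if title["language"] in user["languages"] else 0.0
    return 0.6 * cf_score + 0.3 * content + 0.1 * context

user = {"preferred_genres": {"documentary", "comedy"}, "languages": {"en"}}
title = {"genres": {"documentary", "music"}, "language": "en"}
print(hybrid_score(cf_score=0.8, title=title, user=user))  # 0.88
```

Even this toy version hints at the scaling bottleneck: the collaborative term alone implies comparing each user against a meaningful slice of millions of others, for every title in the catalog.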
Yanir Seroussi highlights two primary motivations for working on recommender systems:
Money
True data science problem: intersection of software engineering, machine learning, and statistics.
This blog post is not going to be a comprehensive overview of recommender systems. Instead, I want to propose and focus on a third primary motivation for working on recommender systems, a topic that has finally been getting the airtime it so desperately needs in the general artificial intelligence space but remains underrepresented and understudied in the heavily commercialized field of recommender systems:
alignment.
Traditional alignment models don’t economically serve large-scale social recommender systems
Alignment is rapidly growing as a globally coordinated research effort to ensure humans retain control over AI systems and “align” them with human values (survey paper, 2023). In 2021, researchers in SF and Berkeley published the short paper “What are you optimizing for? Aligning Recommender Systems with Human Values”, proposing that “recommender system design can draw from the study of the value alignment problem: the problem of ensuring that an AI system’s behavior aligns with the values of the principal on whose behalf it acts”. Applications of explicitly values-aligned recommenders include identifying and down-boosting or removing clickbait, harmful content, and misinformation; increasing fairness and reducing bias in algorithmic discovery to boost, e.g., smaller artists on streaming platforms; and proactively encouraging positive social media outcomes over toxic ones.
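What might a values-aligned recommender look like mechanically? One common framing is re-ranking: score candidates on engagement as usual, then adjust for value signals before serving. The sketch below is a minimal, hypothetical version; the penalty and boost weights, and the classifier outputs they consume, are assumptions for illustration.

```python
# Hypothetical values-aware re-ranking: engagement score adjusted by
# clickbait/misinformation penalties and a small-creator fairness boost.
def rerank(candidates):
    def adjusted(item):
        score = item["engagement_score"]
        score -= 0.5 * item["clickbait_prob"]       # down-boost likely clickbait
        score -= 1.0 * item["misinfo_prob"]         # penalize likely misinformation
        score += 0.2 * (1 - item["creator_reach"])  # boost under-exposed creators
        return score
    return sorted(candidates, key=adjusted, reverse=True)

feed = [
    {"id": "a", "engagement_score": 0.9, "clickbait_prob": 0.8,
     "misinfo_prob": 0.1, "creator_reach": 0.9},
    {"id": "b", "engagement_score": 0.7, "clickbait_prob": 0.1,
     "misinfo_prob": 0.0, "creator_reach": 0.2},
]
print([item["id"] for item in rerank(feed)])  # ['b', 'a']: 'b' outranks the clickbait
```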
These are, in theory, big improvements over first-generation recommender systems, but in practice money still talks a much bigger game; social media remains one of the roughest, hardest landscapes to “align” because a few huge players are actively shaping the way humans process each other’s online content. These networks and feedback loops, engineered specifically to increase profit via human engagement, have the power to bring people together, democratize education and information (including this very post), and change lives for the better in ways previously unheard of, but at the same time they introduce pressures and dangers completely unique to these platforms.
Alfie Lamerton of Formation Research is one of the first I’ve seen to start actually formalizing alignment risk in recommender systems, something Lamerton is generally calling “lock-in risk”. (Somebody get Lamerton into a PhD program ASAP.) Lamerton highlights from the literature that recommender systems are “believed to contribute to polarization by promoting extreme content”, but that “the evidence for this is inconclusive and it’s possible that this effect is not very pronounced… either because it was never the problem in the first place, or because companies are effectively mitigating the effect in their in-house recommender system”. However, “The way recommender systems curate information can lead to the isolation of information into a filter bubble”, and Lamerton argues that these are bad outcomes that measurements of lock-in risk could help counteract.
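To show why measurement matters here, one simple, hypothetical way to quantify a filter bubble is the Shannon entropy of the topics a user is shown over time; this is my illustrative stand-in, not Lamerton’s actual lock-in formalism.

```python
import math
from collections import Counter

def topic_entropy(recommended_topics):
    """Shannon entropy (bits) of a user's recommended-topic distribution.
    Entropy trending toward 0 suggests exposure is collapsing into a bubble."""
    counts = Counter(recommended_topics)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

print(topic_entropy(["politics"] * 9 + ["cooking"]))              # ~0.47 bits: near-bubble
print(topic_entropy(["politics", "cooking", "music", "sports"]))  # 2.0 bits: diverse
```

A platform (or auditor) tracking a metric like this per user over time could flag feeds whose diversity is steadily collapsing, which is the kind of lock-in measurement Lamerton argues for.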
How to flip the problem
Lamerton is off to a rolling start with funding for Formation Research, building on a huge existing body of work on fairness and generalized bad-actor detection in recommender systems (including the many in-house systems mentioned above, with controls and protocols built in and, in enough cases, open sourced). Still, it’s disappointing but not shocking that aligning systems that can influence and ultimately change human behavior gets nowhere near the attention given to aligning the machines and models we are building. After all, in a media industry built on universal human attention, where corporations have so much say in what societies define as beautiful, desirable, creative, and aspirational, it simply doesn’t make economic sense to focus on driving forward some nebulous definition of “what makes us good as people” while we’re also busy building potentially too-smart killer robots we need to protect ourselves against by continuously checking and re-calibrating them (traditional alignment).
Perhaps it’s more convenient to dismiss these claims as fearmongering sensationalism; perhaps this post itself is an example of a polarized outcome of a filter bubble, where I’ve been fed post after post and paper after paper covering alignment risk and how to build responsible, safe AI. But I’m hopeful that consumer trends are beginning to support a larger movement here. There is a growing appetite for personalized mental health solutions and “healthier social media”. One huge benefit of existing media platforms is that diverse lifestyles and differing reward systems can be highlighted; a good example was hygge, the “Danish way-of-life-turned-phenomenon”, a few years ago. And a number of reasonably high-profile influencers are proactively and publicly disconnecting to reconnect with themselves and their loved ones, including one of my personal favorites, plant-based chef Derek Sarno.
The easiest way to contribute to fixing the alignment problem in recommender systems is to provide human feedback that rewards the online behaviors you want to see more of, even aspirationally. Personally, if I discover, organically or via algorithm, some small musician or band I like, I’ll put all of their tracks on repeat even if some are a little too experimental for my taste; I’ll subscribe, like their videos, and watch all the way through even if the camera work looks like my own (amateurish). These are small steps, but at population scale they are the core input signals to the vast majority of recommender systems.
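For a sense of why these small actions matter, here is a hypothetical sketch of how such signals might be folded into a single implicit-feedback weight; the specific signals and weights are assumptions, though full plays, likes, and subscriptions are standard implicit signals in industrial recommenders.

```python
# Hypothetical implicit-feedback weighting; signal names and weights assumed.
def implicit_weight(event):
    w = 0.0
    w += 1.0 * event.get("completed_play", 0)  # watched/listened to the end
    w += 0.5 * event.get("liked", 0)
    w += 2.0 * event.get("subscribed", 0)      # strongest deliberate signal
    w += 0.3 * event.get("repeat_plays", 0)    # each repeat adds signal
    return w

print(implicit_weight({"completed_play": 1, "liked": 1, "repeat_plays": 3}))  # 2.4
```

Deliberately completing, liking, and repeating is the difference between passively training the system on your impulses and actively training it on your intentions.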
We can get there without sacrificing progress on AI alignment in robotics, defense, and fighting deepfakes; in fact, in the same way governments and organizations have the power to encourage healthier nutrition and fitness habits, it’s time to take a wider and stronger stance on healthier internet consumption habits. To me, approaching this movement as a recommender-system alignment problem democratizes it, because we are the humans these systems cater to; we hold the utmost power to mold them to better serve our purposes.
And you can take one second right now to contribute to recommender system alignment yourself: subscribe to me here and to Alfie Lamerton’s Substack, boost my blog engagement metrics, and increase healthy, informed debate on the internet by leaving your unfiltered comments below.