On Content Discovery


A content discovery platform lets you find items similar to what you’re interested in. An example is YouTube. It shows you videos that you would likely enjoy watching, given your interests and what you are currently watching. Spotify is another content discovery platform. It creates curated playlists based on what you seem to like. And there is Google News, which attempts to deliver news aggregated from multiple sources, with a section customized for the sort of articles a user is interested in. There are plenty of other such platforms that are popular today.

An important consequence of the rise of these platforms is that they now hold a significant part of the user’s attention. And with this comes a responsibility to improve these systems for the benefit of the user. The heart of these “suggested for you” platforms is a recommendation system.

A recommendation system is a subclass of information filtering system that seeks to predict the “rating” or “preference” a user would give to an item.

Simply put, given what a user has already liked or disliked, the system aims to predict how the user would feel about something they haven’t seen yet. For instance, if a user recently purchased a cellphone on Amazon, they would likely buy accessories for that particular phone, such as a case or a screen guard. Or, if the user has been listening to a lot of Black Sabbath on Spotify, they’d probably be interested in music in a similar vein, such as Dio. But there is a lot more that constitutes a good content discovery platform. Let’s take a look at some of the aspects of a recommendation system.

Repeated recommendations

Showing the same items repeatedly should be discouraged. There should be a point after which it can be concluded that the user is not interested in that item, and showing it again is simply wasteful. This is most annoying when the user has already seen that item, and possibly even purchased it. This is exemplified by Amazon recommending to me a shirt identical to one I’ve already purchased.

Incorporating methods to keep track of how often a certain recommendation has been pushed towards a user is important. The system must then be penalized for showing the same recommendation too many times.
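As a rough illustration of that idea, here is a minimal Python sketch. It assumes the system already has a relevance score per item and a count of how many times each item has been shown to the user. The function name, the halving penalty, and the cap of three impressions are all made up for this example; they are not taken from any real platform.

```python
from collections import Counter

def rerank_with_repeat_penalty(scores, impressions, purchased=frozenset(),
                               penalty=0.5, max_impressions=3):
    """Re-rank candidate items, discouraging repeats.

    scores:      item -> relevance score from some upstream model (assumed)
    impressions: item -> number of times the item was already shown
    purchased:   items the user already owns; these are never shown again
    penalty:     multiplier applied once per past impression (hypothetical value)
    """
    adjusted = {}
    for item, score in scores.items():
        shown = impressions.get(item, 0)
        # Give up on items the user already bought or has ignored too often.
        if item in purchased or shown >= max_impressions:
            continue
        # Each earlier impression shrinks the score a little more.
        adjusted[item] = score * (penalty ** shown)
    return sorted(adjusted, key=adjusted.get, reverse=True)

scores = {"shirt": 0.9, "phone case": 0.7, "headphones": 0.6}
impressions = Counter({"phone case": 1})
print(rerank_with_repeat_penalty(scores, impressions, purchased={"shirt"}))
# ['headphones', 'phone case']
```

The shirt I already bought disappears entirely, and the phone case that was shown once drops below the headphones that have not been shown yet.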

Lack of discovery

A good content discovery platform must occasionally mix things up and show a user something that is outside their perceived range of interests. YouTube, for instance, locks me into a couple of topics it thinks I am interested in, and never shows me anything else. After a few days of listening to synthwave albums, I’m only recommended other synthwave content. After a long bout of listening to only atmospheric black metal, that is the only genre of music I’ll see on my feed. At no point will YouTube show me a hip-hop video.

This is part of a wider problem known as a filter bubble. The basic problem with a lot of online services is that they like to box users into certain categories, and feed them content pertaining only to those categories. This is a dangerous problem, for it severely limits the understanding people have of topics, whilst giving them the illusion that they have access to all the various opinions about a topic. A supporter of a political party would rarely get to see a view contradicting the views of the party on their Facebook news feed. A lot of online communities, where the content is completely user-generated, tend to turn into “echo chambers” that simply reiterate the same ideas, rejecting anything that is non-conformist. This is an important fact to be wary of when navigating various subreddits on Reddit. It’s almost like the world went through centuries of globalization, only to end up forming localized tribes again.

It is important that platforms that grab huge shares of users’ attention do not enclose them in a filter bubble. There is a concept in reinforcement learning called the exploration-exploitation dilemma. The dilemma is the choice between exploiting something that is known to provide a certain reward, and exploring a new choice, which may either offer a significantly higher reward or result in a negative outcome. In terms of recommendation engines, exploitation would be showing content that the system knows the user is receptive to. Exploration would be showing content to the user without knowing what their reaction would be. Striking this balance is delicate, and the right mix would vary from user to user. Personally, I’d prefer that the system lean towards exploration. Another type of user might find that too much exploration produces irrelevant recommendations.
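One of the simplest ways to make this trade-off concrete is the epsilon-greedy strategy from the bandit literature. The sketch below is only an illustration of the dilemma, not a description of how any particular platform works. The function name, the `estimated_reward` dictionary, and the example genres and numbers are assumptions for this example.

```python
import random

def epsilon_greedy_pick(estimated_reward, epsilon=0.1, rng=random):
    """Pick one item to recommend.

    estimated_reward: item -> the system's current guess of how much
                      the user will like it (assumed to exist already)
    epsilon:          probability of exploring instead of exploiting
    rng:              source of randomness, injectable for reproducibility
    """
    items = list(estimated_reward)
    if rng.random() < epsilon:
        # Explore: show something without regard to its estimated reward.
        return rng.choice(items)
    # Exploit: show the item the system believes the user likes most.
    return max(items, key=estimated_reward.get)

estimates = {"synthwave": 0.9, "black metal": 0.8, "hip-hop": 0.2}
print(epsilon_greedy_pick(estimates, epsilon=0.0))  # always 'synthwave'
print(epsilon_greedy_pick(estimates, epsilon=0.3))  # usually 'synthwave', sometimes a surprise
```

Here `epsilon` is the knob: a user like me would want it set higher, while someone who finds unfamiliar content irrelevant would want it closer to zero.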

Spotify handles this brilliantly, with different sections of recommendations. There is a “Discover Weekly” playlist, created every week, that is filled with new music curated based on the user’s overall interests. Apart from that, there are multiple sets of albums chosen, each of a “because you liked X” variety. And the absolute best part is the song radio. You can right-click on any song, select “song radio”, and it creates a playlist with similar music. This effectively allows the user to choose how much to explore or exploit! I’ll spare the technical details on how Spotify’s recommendation system works, but the gist of it is that they use a combination of traditional collaborative filtering, natural language processing based on the metadata of songs, and features extracted from the audio itself. This ensures that even artists with a very small number of plays have a chance to be recommended.

Ignoring interests

Content discovery platforms should not simply abandon an interest shown by the user. YouTube is especially guilty of this: it almost never shows me tech talks, despite the fact that I watch them occasionally. Sometimes, a platform can cross over to the other side of the spectrum: throwing similar content at you simply because you watched one such instance. Content discovery platforms need to figure out how to walk the fine line between completely ignoring an interest and spamming the user with content because of a single action.
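To make that fine line a little more tangible, here is one hypothetical way to turn watch counts into shares of a feed. It is a sketch under my own assumptions, not something any platform is known to do: topics need at least `min_watches` views before they count (so a single click does not flood the feed), and a square root dampens the dominant topics (so occasional interests still get a visible share). The function name and both choices are invented for illustration.

```python
import math

def interest_shares(watch_counts, min_watches=2):
    """Convert per-topic watch counts into fractions of the feed.

    watch_counts: topic -> number of items the user watched in that topic
    min_watches:  topics below this count are ignored for now (hypothetical)
    """
    # Square-root damping: 100 watches count as 10, 4 watches count as 2.
    damped = {
        topic: math.sqrt(count)
        for topic, count in watch_counts.items()
        if count >= min_watches
    }
    total = sum(damped.values())
    if total == 0:
        return {}
    return {topic: weight / total for topic, weight in damped.items()}

history = {"synthwave": 100, "tech talks": 4, "cooking": 1}
for topic, share in interest_shares(history).items():
    print(f"{topic}: {share:.0%}")
```

With these numbers, tech talks would get roughly a sixth of the feed instead of the 4% or so that raw proportions would give them, and the single cooking video would not trigger anything yet.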

Excessive promoted content

At the end of the day, content discovery platforms have a business to run. They need money to be able to continue to provide their services. One way they earn revenue is by allowing content creators to pay to have their creations promoted. These promotions can be targeted at certain demographics. Most platforms use native advertising to seamlessly blend these promoted items with other recommendations. When a platform overdoes this, by excessively pushing promoted content over regular recommendations, it can become annoying. Not to mention, most promoted content is of the clickbait variety.

I look forward to the day more recommendation systems get good at helping people discover and expand their knowledge. Although my goal initially was to outline a few characteristics of what constitutes a content discovery platform, in retrospect I believe the real message here is for consumers to become aware of the limitations of the mediums they use to absorb information about the world, and to break out of filter bubbles.