Sky Limited, the British media and telecommunications conglomerate owned by Comcast and headquartered in London, is working on the future of television. Ironically, the future of television has less and less to do with television itself, at least in the traditional sense of the medium. As viewers cut the cord in favor of streaming on-demand content from a virtually infinite archive, content providers are turning to recommendation algorithms as a way to improve the viewing experience and set their service apart from competitors. While the industry standard (think Netflix) is to base recommendations largely on previous viewing history, Sky seeks to extend that model with computer vision capable of automatically recognizing subjects and objects within video, as well as a semantic representation of mood.
The Royal Wedding in 2018 gave Sky an opportunity to demonstrate computer vision and the automatic recognition of subjects and objects within video. Using machine learning resources from Amazon Web Services (AWS), Sky automatically recognized and tagged famous guests as they arrived at Saint George’s Chapel for the Royal Wedding. Their names were then added to a list and matched with a biography and information on their connection to the royal couple. With this information mapped out and stored, viewers could then search the footage on demand for their favorite guests. This kind of technology marks a big step forward in the creation of a more interactive viewing experience, one that is all the more remarkable considering it was done on a live broadcast.
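The tag-match-search flow described above can be sketched roughly as follows. This is a minimal illustration, not Sky's implementation: the detections are hard-coded stand-ins for whatever the recognition service returned, and the names `detections`, `biographies`, `build_guest_index`, and `search` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical recognition output: (seconds into the footage, recognized name).
detections = [
    (12.0, "Guest A"),
    (12.5, "Guest A"),
    (47.0, "Guest B"),
]

# Hypothetical editorial data keyed by guest name.
biographies = {
    "Guest A": "Placeholder biography and connection to the couple.",
    "Guest B": "Placeholder biography and connection to the couple.",
}


def build_guest_index(detections, biographies):
    """Group detection timestamps by guest and attach each guest's biography."""
    index = defaultdict(lambda: {"bio": None, "timestamps": []})
    for seconds, name in detections:
        entry = index[name]
        entry["bio"] = biographies.get(name)
        entry["timestamps"].append(seconds)
    return dict(index)


def search(index, query):
    """Return the guests whose name contains the query, ignoring case."""
    q = query.casefold()
    return {name: entry for name, entry in index.items() if q in name.casefold()}


index = build_guest_index(detections, biographies)
results = search(index, "guest b")
```

A viewer's search then returns the matching guests together with the points in the footage where each one appears.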
“We certainly want to explore what machine learning and automatic recognition of objects within video can do for us, because that’s a very interesting opportunity for us to analyse live data and tell people different things. I think we’ve definitely opened a theme where we can go with this.”
-Hugh Westbrook, Senior Product Owner at Sky
In addition to computer vision, Sky has been working on bolstering its recommendation algorithm with a semantic representation of mood that allows customers to search for content based on sentiment. To some extent, the existence of genres provides a crude means of classifying content by emotional theme: comedy, romance, action, suspense, and so on. However, this rudimentary classification fails to take into account the fact that an “action movie” can also be funny, or that a comedy can also be romantic. To look past this kind of oversimplification, it was necessary to develop a new model where each film is scored across a variety of emotional metrics.
The team at Sky started by tagging content with keywords describing the nature of the content on the one hand, and a list of mood labels on the other. They then built a model to learn the correlations between moods and keywords, resulting in a map where every mood is represented by every keyword in the model, but where each keyword is scored by its relevance to that particular mood. In other words, every keyword is present in each mood; if it is not relevant to a particular mood, its score will be low, but it will still be there. For example, a film like the spy comedy Johnny English would be assigned a score of 5.2 for ‘funny’, 3.8 for ‘adventurous’ and 1.4 for ‘exciting’. This sort of semantic mapping makes it possible to “weigh” content across many different categories of sentiment, allowing viewers to filter content by the ratio between moods in a more personal and granular way.
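To make the idea of a mood map concrete, here is a minimal sketch that scores every keyword against every mood. The article does not say what model Sky used, so simple co-occurrence shares stand in for the learned correlations; the catalogue entries and the name `build_mood_map` are invented for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical tagged catalogue: each title has content keywords and mood labels.
catalogue = [
    {"keywords": ["spy", "slapstick"], "moods": ["funny", "adventurous"]},
    {"keywords": ["spy", "chase"], "moods": ["exciting", "adventurous"]},
    {"keywords": ["slapstick", "wedding"], "moods": ["funny"]},
]


def build_mood_map(items):
    """Score each keyword per mood as the share of that keyword's titles carrying the mood."""
    keyword_counts = Counter()          # number of titles tagged with each keyword
    pair_counts = defaultdict(Counter)  # pair_counts[mood][keyword] = co-occurrences
    moods = set()
    for item in items:
        moods.update(item["moods"])
        for keyword in set(item["keywords"]):
            keyword_counts[keyword] += 1
            for mood in set(item["moods"]):
                pair_counts[mood][keyword] += 1
    # Every mood gets an entry for every keyword; irrelevant keywords score 0.0.
    return {
        mood: {kw: pair_counts[mood][kw] / keyword_counts[kw] for kw in keyword_counts}
        for mood in moods
    }


mood_map = build_mood_map(catalogue)
```

As in the description above, a keyword that never appears alongside a mood is still present in that mood's entry, just with a low score.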
Sky’s first attempt at a user interface for its semantic mood map took the form of a series of on-screen bubbles, each containing a word representing an emotional category. Users are prompted to select one bubble or a combination of bubbles, and can shrink or expand each bubble to set the ratio of the vibe they are seeking. This sort of user interface is much more interactive and personal than the typical search function or recommendation algorithm, in which users play a more or less passive role. Although the technology is still developing, a patent is pending, and in the future it may provide a significant way for Sky to differentiate its viewing experience.
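One way to turn bubble sizes into a ranking is to treat them as a weight vector and compare it with each title's mood scores. The sketch below uses cosine similarity, which depends only on the ratio between moods and not on absolute size; that choice is an assumption, not Sky's documented method. The Johnny English scores come from the example above, while the other two titles and the names `cosine` and `rank_titles` are hypothetical.

```python
import math

titles = {
    "Johnny English": {"funny": 5.2, "adventurous": 3.8, "exciting": 1.4},
    # The two entries below are invented for illustration.
    "Hypothetical Thriller": {"funny": 0.5, "adventurous": 2.0, "exciting": 6.0},
    "Hypothetical Rom-Com": {"funny": 4.0, "adventurous": 0.5, "exciting": 0.5},
}


def cosine(a, b):
    """Cosine similarity between two sparse mood vectors stored as dicts."""
    dot = sum(a.get(m, 0.0) * b.get(m, 0.0) for m in set(a) | set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_titles(bubbles, catalogue):
    """Order title names from best to worst match for the selected bubble sizes."""
    return sorted(catalogue, key=lambda name: cosine(bubbles, catalogue[name]), reverse=True)


# The viewer expands "funny" to twice the size of "adventurous".
bubbles = {"funny": 2.0, "adventurous": 1.0}
ranking = rank_titles(bubbles, titles)
```

Because only the ratio matters, doubling every bubble size leaves the ranking unchanged, which mirrors the shrink-and-expand interaction described above.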
In a landscape where users have access to practically unlimited archives of content, choice paralysis becomes a real issue, and providers need to find creative ways of surfacing the content they have spent so much to acquire. Using computer vision and machine learning to harness attributes that are “buried” within the content itself can give content providers powerful tools to enhance their users’ viewing experience and differentiate their platform from the many competitors in this space.