Great content can seem like a magic potion, concocted by story writers and sprinkled with some secret formula, essentially a figment of the writer’s imagination. But is it really? How can production houses be assured that their content is worth the moolah? TV content production requires big investments, yet the investment decision currently relies heavily on intuition about what will appeal to viewers. With so many shows and so much content on offer, the viewer is not bound to any show by a contract. What can keep viewers glued to your TV show? Enter content analytics. Content creation has long been driven by intuition, and content analytics is changing that.

Scripts, screenplays, transcripts, blogs, summaries, reviews… there is so much unstructured text data. Is there nothing we can do with it? Yes, there is! Natural language processing has matured thanks to the contributions of dedicated developers in the open-source community. A recent development was Google’s release of Parsey McParseface, an English parser that provides fairly accurate Part-Of-Speech tagging. It is built on TensorFlow, Google’s deep learning platform, which allows complex neural network models to be trained on high volumes of data. Further, with the emergence of cloud technologies, the cost of analytics infrastructure has come down drastically, making it commercially viable for organizations with a modest budget.

Today, it is a show’s TRPs that set the production houses’ hearts racing. TRP is the only measure used to gauge a show’s performance. But is it enough to understand the viewer’s response to content? The viewer also engages with the content at a more granular level: with the characters, the emotions between them, the events in their lives and so on. Text mining extracts and quantifies these granular components of the story through techniques such as Stemming, Part-Of-Speech Tagging and Named Entity Recognition. Once the components are quantified, machine learning algorithms learn the patterns in them and identify the main components that drive the TRP.
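To make the idea concrete, here is a minimal sketch in Python of how story components might be quantified and related to TRP. Everything in it is an assumption made for illustration: the episode summaries, the character names and the TRP values are invented, a fixed cast list stands in for Named Entity Recognition, a crude suffix stripper stands in for a real stemmer, and a simple Pearson correlation stands in for the machine learning step. It is not how any particular product works.

```python
import re
from math import sqrt

# Hypothetical data: (episode summary, TRP). All names and numbers are invented.
EPISODES = [
    ("Meera argues with Kabir. Meera confronts the villain Rana.", 2.1),
    ("Kabir travels alone. Kabir misses Meera.", 1.4),
    ("Rana plots against Meera. Rana threatens Kabir. Rana wins.", 3.0),
    ("Meera and Kabir celebrate quietly.", 1.2),
]

# Assumed cast list: a toy stand-in for Named Entity Recognition.
CHARACTERS = ("Meera", "Kabir", "Rana")

# Assumed stems of "conflict" verbs, matched after crude stemming.
CONFLICT_STEMS = {"argu", "confront", "threaten", "plot"}


def crude_stem(word):
    """Strip a few common English suffixes; a toy stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def extract_features(text):
    """Quantify story components: character mentions and conflict events."""
    tokens = re.findall(r"[A-Za-z]+", text)
    features = {name: tokens.count(name) for name in CHARACTERS}
    features["conflict"] = sum(
        crude_stem(token.lower()) in CONFLICT_STEMS for token in tokens
    )
    return features


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_drivers(episodes):
    """Rank components by how strongly they correlate with TRP, highest first."""
    rows = [extract_features(text) for text, _ in episodes]
    trps = [trp for _, trp in episodes]
    scores = {
        name: pearson([row[name] for row in rows], trps) for name in rows[0]
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for component, score in rank_drivers(EPISODES):
        print(f"{component}: {score:+.2f}")
```

On this made-up data, mentions of the negative character rank highest, followed by conflict events. With four episodes that is only an illustration, not a finding; a real analysis would need far more data, proper NLP tooling and a model that accounts for interactions between components.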

This component analysis may differ across target groups and markets, or it may turn out to be similar across the population. Understanding the key drivers in content empowers content creators to choose the type of content they want to create, keeping viewers’ preferences in mind. When scaled up to other shows, the analysis can build a knowledge base that gives an understanding of the entire ecosystem: content preferences and how they change over time.

The industry has sought a data-driven decision mechanism, and the response is turning out to be phenomenal. Netflix has already shown the way, demonstrating what can be achieved with it. What if you knew that the TRPs soar whenever the protagonist has a romantic rendezvous outside of marriage, or that viewers are most intrigued when the negative character takes center stage? Content feature analytics by AthenasOwl unlocks the doors to endless possibilities. Looking into the future, this can scale up to audio and video analytics as well. There is so much more we can do, and this is just the beginning.