“Mr. Chambers, don’t get on the spaceship! The rest of the book, To Serve Man, it’s a cookbook!” Who can forget the infamous ending to the Twilight Zone adaptation of Damon Knight’s humorously dark short story? The people of Earth take comfort in believing that the alien book bestowed upon them is called ‘To Serve Man’, thinking that the title is a mantra about serving humanity. Little did they know that the rest of the book suggested quite the opposite.
However, the Twilight Zone characters aren’t the only ones who suffer from an inability to understand a phrase in its greater context; machine learning algorithms fall short every day.
There are a number of real-time and lexical limitations that stand in the way of automating the semantic curation process of social media commentary.
Let’s look at an example. Say we wanted to put together some advanced Boolean search queries to collect chatter about the Chicago Bears.
People talk about the Chicago Bears in a number of different ways: @mentions, #hashtags, direct mentions, and references to the stadium, coaches, managers, players, opponents, and so on. Let’s focus on the query directly related to finding comments about ‘the bears’.
If we were to search for ‘the bears’ as a standalone term, the relevance of the comments pulled in would vary depending on a number of factors. Is football in season? Do the Bears have a game today? Are comments being pulled only from the greater Chicago area? Is there breaking news about the bears at the zoo, or perhaps a recent bear attack? Is there a hilarious YouTube video about a couple of bears going viral? What about references to the ’80s band, The Bears?
Location, time, breaking news, the existence of other brands with the same name, and colloquial use all affect the relevance of searching for ‘the bears’ as a standalone term. The context of a comment outside of the phrase itself is important in determining its relevance to the football team. Although there is room to fine-tune these search queries once we believe we have accounted for all the possible ways ‘the bears’ can be mentioned in social media, there is always the chance that the algorithms will break down in light of some unforeseen factor. In fact, the human element of social media content curation at scale is more essential than machine learning scientists would like you to believe.
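To make the point concrete, here is a minimal Python sketch of a context-aware filter for ‘the bears’. It is only an illustration: the term lists and the `classify` function are hypothetical, and a real query would need far longer, curated lists plus signals such as location and date. Note that anything the rules can’t settle is labeled ambiguous and left for a human curator.

```python
import re

# Hypothetical context terms, chosen only for illustration.
FOOTBALL_TERMS = {"nfl", "touchdown", "quarterback", "soldier field", "#dabears"}
OTHER_TERMS = {"zoo", "grizzly", "attack", "youtube", "band", "album"}

BEARS = re.compile(r"(?<!\w)the bears(?!\w)", re.IGNORECASE)


def has_any(text: str, terms: set) -> bool:
    """True if any term appears in text as a whole word or phrase."""
    return any(
        re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", text)
        for term in terms
    )


def classify(comment: str) -> str:
    """Label a comment 'football', 'other', 'ambiguous', or 'no match'."""
    if not BEARS.search(comment):
        return "no match"
    text = comment.lower()
    football = has_any(text, FOOTBALL_TERMS)
    other = has_any(text, OTHER_TERMS)
    if football and not other:
        return "football"
    if other and not football:
        return "other"
    # No context, or conflicting context: hand it to a human curator.
    return "ambiguous"


for comment in (
    "The Bears need a new quarterback",
    "Saw the bears at the zoo today",
    "the bears are amazing",
):
    print(comment, "->", classify(comment))
```

Even in this toy version, a comment like ‘the bears are amazing’ can’t be resolved from the text alone, which is exactly the gap that outside context has to fill.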
Social media at its best is served up like delicate hors d’oeuvres: in petite, rich, and appealing tidbits. Still, let’s heed Rod Serling’s advice and pay attention to the greater context of the phrase before we end up as ‘an ingredient in someone’s soup’.