In July 2017, Gartner placed ‘Machine Learning’, very broadly understood as a particular branch, or subset, of artificial intelligence, at the peak of ‘inflated expectations’ in its renowned Hype Cycle.
Gartner hype cycle for emerging technologies, 2017
According to the curve, machine learning will reach the “plateau of productivity” within two to five years. Unfortunately for data and computer scientists, if Gartner’s predictions are correct, this first means a sharp turn into the trough of disillusionment, which would see optimism towards machine learning fall and, in real terms, potentially hiring freezes and budget cuts.
While machine learning has seen a resurgence in recent years, this is not the first time it has attracted both unbridled optimism and, conversely, criticism.
In the 1970s, following a decade of AI optimism in the 1960s, the Lighthill report caused a “massive loss of confidence in AI”, particularly in areas of natural language processing. This push-and-pull response to artificial intelligence, and within it machine learning, is an unusual reaction to what is essentially a research and insights method that has been in development since the 1950s.
With this in mind, one of the challenges machine learning projects often face at a high level is managing unrealistic expectations of what is achievable while, conversely, addressing skepticism and a preference for reduced automation. At a low level, there is a series of barriers data scientists and engineers must overcome. As a studio responsible for implementing machine learning and AI products for some clients, we are acutely aware of the expectations placed on machine learning, and of how it must not be seen as a time-sink, free from the constraints of results or performance.
We have recently embarked on a machine learning project which includes text-mining and classification to understand customer needs. For the project, we interviewed customers of a utility service to understand their frictions and needs. During the interviews, we found that users typically talked about two to three themes for why they utilised this service. Taking these insights, we then used machine learning to understand how these themes impacted the broader customer base.
The more technical details of our model are provided here, but essentially we used Naive Bayes classification to understand both the sentiment and the themes discussed by users tweeting at the service, to see how this mapped back to our user interviews. From this work, we were able to validate that the customer base was talking about the themes identified, and to gauge its sentiment towards the service. Furthermore, given that Twitter was operating as a pinch point for customer queries, we identified that a major issue was customer service. Such projects are the basis of more physical applications of AI, such as chatbots; however, we wanted to first understand customer needs as part of a detailed research piece.
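To make the idea concrete, here is a minimal sketch of Naive Bayes text classification, written with the Python standard library only. It is an illustration rather than the model from our project: the training tweets, the theme labels ("billing", "customer_service") and the class and function names are all invented for the example. It uses the multinomial variant with add-one smoothing, which is one common choice for short texts.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Invented training data; the theme labels are hypothetical.
train_texts = [
    "my bill is wrong again this month",
    "why was I charged twice on my bill",
    "no one answers the phone at customer service",
    "still waiting for customer service to reply",
]
train_labels = ["billing", "billing", "customer_service", "customer_service"]

model = NaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("charged the wrong amount on my bill")
```

The same structure works for sentiment: swap the theme labels for labels such as "positive" and "negative" and train a second model on the same tweets.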
Data scientists spend from 50 percent to 80 percent of their time mired in mundane labor of collecting and preparing unruly digital data, before it can be explored for useful nuggets.
While machine learning projects often carry the expectation of large budgets and even larger timeframes, we were able to roll out our findings within two weeks, at zero cost apart from one data scientist. The fundamentals for understanding where to apply machine learning, and how to do so nimbly, cost-effectively and with results, were quite straightforward:
Don’t Skip on Statistics
In a rush to apply machine learning to your product or service, there can often be a temptation to skip the more practical elements of data science and statistics, namely inference and exploratory data analysis (EDA). This preliminary analysis is essential and can make or break the success of a project. Details on how to conduct exploratory data analysis are provided here.
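As a rough illustration of what a first pass at EDA on labelled text might look like, the sketch below summarises class balance, document length and the most frequent words. The function name, the sample texts and the labels are invented for the example, and a real analysis would go considerably further.

```python
from collections import Counter
from statistics import mean, median


def summarise(texts, labels):
    """Return a few basic descriptive statistics for a labelled text corpus."""
    lengths = [len(t.split()) for t in texts]
    word_freq = Counter(w for t in texts for w in t.lower().split())
    return {
        "n_documents": len(texts),
        "label_counts": dict(Counter(labels)),  # is one theme dominating?
        "mean_tokens": mean(lengths),           # typical document length
        "median_tokens": median(lengths),
        "top_words": word_freq.most_common(3),  # candidates for stop words
    }


# Invented sample data for illustration only.
sample_texts = [
    "my bill is wrong",
    "bill charged twice",
    "no reply from customer service",
]
sample_labels = ["billing", "billing", "customer_service"]

summary = summarise(sample_texts, sample_labels)
```

Even a summary this small can flag problems early, such as a heavily skewed label distribution or documents too short to classify reliably.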
Clean your Data
In 2014, the New York Times reported that the cleanliness of data is often a key “hurdle” to insights. Three years later, little has changed, but by having a set methodology for cleaning and processing data, you can reduce the time spent being a data janitor. As a blended team, we have both traditional programming skills and in-depth statistical knowledge, meaning we can be involved in data collection methods as well as work with engineers, who are excellent at taking on our needs for more complex data pipelines.
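A set methodology can be as simple as a small, repeatable cleaning function that is applied to every record in the same way. The sketch below shows one possible version for tweets. The specific steps (lowercasing, removing URLs, mentions and punctuation, dropping duplicates) and the sample data, including the "@UtilityCo" handle, are assumptions made for the example, not a description of our pipeline.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
NON_WORD_RE = re.compile(r"[^a-z0-9\s']")


def clean_tweet(text):
    """Normalise one tweet: lowercase, drop URLs, mentions and punctuation."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")      # keep the hashtag word, drop the symbol
    text = NON_WORD_RE.sub(" ", text)
    return " ".join(text.split())      # collapse repeated whitespace


def clean_corpus(tweets):
    """Clean every tweet, then drop empties and exact duplicates, keeping order."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        text = clean_tweet(tweet)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


# Invented sample data for illustration only.
raw = [
    "@UtilityCo my bill is WRONG again!! https://example.com/x",
    "@UtilityCo my bill is wrong again",
    "   ",
    "Great #service today :)",
]
cleaned = clean_corpus(raw)
```

Keeping the rules in one place means the same cleaning is applied when the data is explored, when the model is trained and when new data arrives.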
Our most recent application of machine learning was a relatively straightforward text-mining and classification project. While we could have pitched a larger piece of work, we recognised that this work is often part of a longer journey and, as such, it is important to be able to deliver incremental improvements, as opposed to tinkering away without results for months on end. A lot of the time, our work might be the first steps of just data science, let alone machine learning, namely the aforementioned data cleansing and EDA, but this health-check is often essential for future work.
Future projects we’re working on will include clustering using machine learning, as well as image recognition and text transcription using Google’s Cloud Vision. By delivering consistent results and following a structured process for machine learning, we are hopeful that we can continue to provide our clients with intelligent and valuable applications of machine learning.