Why Data Science has a ‘last-mile’ problem

Finding the right insight at the right time isn’t easy. The sheer volume of data can be overwhelming, and seldom are the questions we ask straightforward to answer.
People who work at the interface between data, technology and business are commonly referred to as data scientists. Gaining the skills and experience to become a data scientist takes many years of university study and practice.
Much of the work of a data scientist is devoted to the creation of predictive models. This calls for the application of statistical theory, and even more for the development and application of machine learning algorithms and other computational methods. Model outputs, reflective of the research goals, usually take the form of predictions, summary statistics, and data visualisations.
A central tenet of the data science value proposition is that model outputs can be interpreted and applied within a business context. This step is the most critical link of the data science value chain, because it’s the point where people must apply their own qualitative judgements to determine the relevancy and meaning of the information produced. It requires much skill and effort, and is commonly the step that data scientists and business people find most challenging.

The biggest pain-points people encounter at this step are:
- Data science outputs are not easily interpretable or translatable
A common complaint of business people is that model outputs are overwhelmingly complex and full of jargon, which leads to unclear business messages and a failure of findings to be understood and acted upon. Jargon is inevitable in any profession, but for data science it’s particularly acute because of a high dependency on technical methods. A further setback that inhibits the acceptance of model outputs by business people is that there is no common interface, and every new model requires considerable human effort to translate. What’s more, many models appear to function as "black boxes" that data scientists cannot easily explain and business people do not naturally trust.
- Software tools lack functionality for users to iteratively reframe their questions and expose increasingly relevant insights
Rarely do insights appear when the first question gets answered. Instead, successive reframing of the question or problem is needed, where interesting visual patterns and data perspectives are investigated, and new hypotheses are formed. Without good software, the end-to-end reframing process, including data preparation, is manual, time-consuming, and something only a data scientist can perform. This constraint creates either a production bottleneck for the data scientist, or a limit to the direct involvement of domain experts and the number of ideas or scenarios that can be investigated within the allotted time.
Some data science teams are looking to solve the data science bottleneck problem by using automated machine learning technologies that make the process of building and testing predictive models faster. However, these investments — while freeing up scarce data science resources — don’t address the wider organisational challenges of making data science and analytics a mainstream business capability.
The reality is there remains a “last-mile” problem with the interpretation and application of data science outputs by business people – which in turn limits the business value that can be generated by data science overall.
So how do we solve this last-mile problem with software?
- Enable deeper exploration with interactivity
Advances in AI technology now give users the ability to discover “unknowns” in data. Importantly, users can iteratively reframe their questions based on their own knowledge and understanding of the problem. What’s more, users can explore data in the first person, from any angle or viewpoint they choose, above and beyond what they already know. This creates a diversity of data insights that are truly unique, meaningful and valuable to data scientists and business users alike.
- Keep the software interface as simple as possible
A good self-service solution simplifies the complexity of data science, and offers functionality that is intuitive to use and consistent regardless of skill level. This requires balancing the need to control complexity with the need to give users the freedom and flexibility to think creatively. It’s also critical that data scientists and business users have a common, visual interface that allows users to explore their own data sources, as well as outputs from any third-party tool such as R, SAS and Python. These features can be achieved through a combination of good user interface design and data visualisation techniques.
- Foster organisation-wide collaboration
This requires software to be pushed to the widest and lowest possible level of the organisation. At the same time, in-platform communication and collaboration features allow for the creation of trusted networks and shared objectives, as well as the cross-pollination of ideas and perspectives through a virtuous cycle of business innovation.
DeepConnect is the first company to make organisation-wide learning through big data analysis a core business competency for medium and large sized enterprises. The DeepConnect solution extends insight generation as a core capability throughout the enterprise, as close as possible to the people who ask the questions and have the domain expertise.
Best of all, we believe it’s through these technology and mindset shifts where the most exciting opportunities for insight discovery will emerge.
To learn more about our DeepConnect platform, methods and training solutions, please contact us, or check out our Leadership Academy program.