The Perfect Data Scientist - A Mythical Beast?
  • 13th February, 2018

Steven Sidley - Head of Strategy, Ixio Analytics

John Tukey was a Princeton statistician who, in the 1960s, first imagined the job of a data scientist. It is safe to say that this new job category did not catch a popular wave until decades later, when Hal Varian, Chief Economist at Google, conflated statisticians and eroticism in 2009 by proclaiming that ‘the sexy job in the next ten years will be statisticians’, thereby exciting the prurient interests of inflamed techies all over the globe.

In case you are wondering about the linkage between statistics and data science, here is the pithiest definition of a data scientist I have read: “A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.” Not very precise, but there is more than a kernel of truth in there.

Still, a little rigour is required. The problem with pure statisticians is that most are not great coders, and fewer still understand the boundaries of software engineering. Their code is often a convoluted mess of spaghetti, and because it works for them, they usually see no need to share it with a team, or the world at large. When presented with a problem, the statistician will find an elegant predictive or modelling solution, even if it is buried inside a best-efforts scrabble of computer code.

On the other hand, the great coders of the world, for whom GitHub, commits and open-source protocols are baked into their DNA, rarely have deep statistical skills. They are sometimes unaware of assumptions, axioms, alternative predictive approaches and the many other tentacles of the statistical arts. So they grab SAS libraries or open-source modules that implement some sort of predictive solution, and are adept at jamming parameters through an API door.

This is not to say that never the twain shall meet. The global sprouting of data science education is trying hard to merge these two disciplines. Given the rewards on offer, the gap will surely close.

But there is a more pressing problem, and it sits within the enterprise. The subject of data science/data analysis/big data is all abuzz right now. And because of the word ‘data’, guess who does the hiring? It is usually a job description that ping-pongs between IT and Human Resources. And because IT managers (particularly legacy IT managers) have little understanding of the statistical pillar of data science, they tend to hire who they have always hired: programmers who might have done a course or two in data analytics. Or worse, they move an internal enterprise programmer laterally into data science, with an instruction to ‘read up on the subject’.

We have recently seen this in various enterprises we have visited. An entire team assembled as the ‘Data Team’ or ‘Data Hub’. And then lots of busy work, with no solutions presented and no problems solved two years after a big internal launch.

There is a final pillar that is often not addressed by the job description. Without an understanding of operations and business process, no one (not even the mythical statistician/software engineer cyborg) is going to make much headway.

There are many guides out there for hiring data scientists. Here is one that we like:

It’s written by Ben Dias, the Head of Advanced Analytics and Data Science at the Royal Mail in the UK. We think it covers most, if not all, of the bases.

About Author

Steve has over 35 years’ experience in diverse areas of business, technology, telecommunications, media, information management and private equity, and has worked at operational, executive and board levels. After qualifying with an MSc (Computer Science) from the University of California, Los Angeles in 1979, Steve spent the next 17 years in California. This was the Cambrian period of technological development, particularly on the West Coast of the US, and he was fortunate enough to be involved with a great deal of new technology. This included working as an engineer on the space program at Hughes, followed by designing video games at Tronix, a stint as an artificial intelligence researcher at Citicorp research in Los Angeles, and a period spent designing computer peripherals. Steve also founded one of Los Angeles’ most successful computer animation companies, Sidley Wright, which was sold to National Video Systems in 1994. Steve returned to South Africa in 1995, where he has held a number of C-level positions in blue chip companies, including Group Chief Technology Officer for Anglo American plc. In South Africa, Steve has successfully co-founded two companies and continues to work on private equity and venture capital transactions. Steve also serves on several boards of directors. Steve is an award-winning novelist, and has published four books with Pan Macmillan.