Cutting through the exabytes

The world is so short of engineers that countries around the globe are steering more children towards the study of STEM (Science, Technology, Engineering and Mathematics) subjects. But data scientists are in even shorter supply than engineers. With the arrival of Big Data, served up in a daily dose of 2.5 quintillion bytes (a quintillion is a 1 followed by 18 zeros), companies and governments now have the haystack that brings them tantalisingly close to unlocking the secrets of commercial success, human psychology and the universe. [1] All that is now required is people who are able to mine these data and extract the needle from them. The need will only grow, since data creation is expected to reach 240 exabytes per day by 2020. One exabyte is equivalent to one quintillion bytes, so that means a roughly hundredfold increase over the next four years.
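For readers who want to check the arithmetic behind these figures, here is a minimal sketch in Python. It assumes the short-scale meaning of quintillion (10^18) and the decimal definition of an exabyte (10^18 bytes). The variable names are illustrative only and do not come from the cited sources.

```python
# Unit check for the data volumes quoted in the paragraph above.
# Assumptions: short-scale "quintillion" and the decimal (SI) exabyte.
QUINTILLION = 10**18   # a 1 followed by 18 zeros
EXABYTE = 10**18       # bytes in one exabyte

# 2.5 quintillion bytes per day, kept as an exact integer
daily_now_bytes = 25 * QUINTILLION // 10
# Forecast of 240 exabytes per day by 2020
daily_2020_bytes = 240 * EXABYTE

daily_now_exabytes = daily_now_bytes / EXABYTE
growth_factor = daily_2020_bytes / daily_now_bytes

print(f"Today: {daily_now_exabytes} exabytes per day")
print(f"Growth to 2020: about {growth_factor:.0f} times")
```

On these assumptions, today's daily volume works out at 2.5 exabytes and the forecast at 96 times that, which is where the "roughly hundredfold" figure comes from.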


No wonder data scientists are in such demand. Adam Byrnes, Senior Director of International Expansion at Freelancer, whose outsourcing marketplace has 16 million users, says data scientists can pretty much write their own contracts at the moment, so hungry are organisations such as his to recruit them.

At present there are about 40 universities offering data science courses in the UK, and about half that number offering courses in specialist areas such as bioinformatics. Data science is, broadly speaking, a marriage of computer science and statistics, with varying degrees of mathematics, analytics and modelling thrown in. Bioinformatics applies these sciences directly to the health sphere and is split, by the NHS, into three areas: physical sciences, health sciences and genomics.

According to a report by Deloitte in 2015 [2], there are 164 genomics firms operating in the UK, mostly around Cambridge, London, Oxford and Edinburgh. Most are small, with fewer than 10 employees, and Deloitte believes that these firms are having difficulty in scaling up because the NHS is unable or unwilling to commission more work from them. There are certainly budgetary constraints within the NHS (the Secretary of State for Health is currently urging the NHS to ‘find’ £22bn in savings) that might make the organisation stop short of commissioning full-scale sequencing projects with an unclear cost/benefit ratio.

Bioinformatics could help. According to Deloitte, the UK currently has around 35,000 qualified health informatics professionals, almost half the number required if we wish to provide a service that is comparable with that of the USA. If a third party were to trawl through the volumes of data on their behalf, both the genomics companies and their natural customer, the NHS, could benefit.

There are few companies in the UK dedicated exclusively to the genomics area of bioinformatics. Desktop Genetics is one. Based in Cambridge, it specialises in gene editing to reduce the impact of Alzheimer's and to overcome cancer resistance to chemotherapy. Other companies, such as Synthace, Genestack and, best known, Benevolent AI, organise the information that other scientists may require.

With an NHS that is desperate to spend money on preventing people from falling ill rather than the costlier alternative of curing them once they are sick, there must be space for more companies and more investment in this field.

[1] TechCrunch. How to stem the global shortage of data scientists. 31 December 2015.

[2] Office for Life Sciences.  Genomics in the UK.  September 2015.