So you wanna be a data scientist?
A guide to 2015's hottest profession
Are you good at math? Like, really good at math? Do you also know Python and, oh yeah, have deep knowledge of a particular industry?
On the off chance that you possess this agglomeration of skills, you might have what it takes to be a data scientist. If so, these are good times. LinkedIn just ranked "statistical analysis and data mining" as the top skill that got people hired in 2014.
Glassdoor reports that the average salary for a data scientist is $118,709 versus $64,537 for a programmer. A McKinsey study predicts that by 2018, the U.S. could face a shortage of 140,000 to 190,000 "people with deep analytic skills" as well as 1.5 million "managers and analysts with the know-how to use the analysis of big data to make effective decisions."
The field is so hot right now that Roy Lowrance, the managing director of New York University's new Center for Data Science program, says he thinks it has peaked. "It's probably in a bubble," he says. "Anything that gets hot like this can only cool off." Still, NYU is looking to expand its data science program from 40 students to 60 over the next few years. The current school year won't be over for another five months, and 50% to 75% of its students already have firm job offers.
Why the explosion? Linda Burtch, managing director of Burtch Works, a Chicago-based executive recruiting firm, notes that while tech firms like Google, Amazon, Netflix and Uber have data science groups, the use of such professionals is now starting to filter down to non-tech companies like Neiman Marcus, Walmart, Clorox and Gap. "All these are companies looking to hire data scientists," she says.
The hope is that such professionals will unearth new information that will prompt new streams of revenue or let a company streamline its business. Pratt & Whitney, the aerospace manufacturer, can now predict with 97% accuracy when an aircraft engine will need maintenance, conceivably helping it run its operations much more efficiently, says Anjul Bhambhri, VP of Big Data at IBM.
Though IBM just released its freemium, cloud-based Watson Analytics program this month, most often data scientists have to create homegrown software programs to analyze unstructured data, which is one reason that programming skills are required.
Lowrance says there are basically three skills that a data scientist needs to possess: math/statistics, computer literacy and knowledge of a particular business domain (autos, for example). NYU's program teaches those so that each area of expertise builds on the other. When you graduate, you're sort of a jack-of-all-trades for data crunching. "When working on data science projects in coursework they have to do all the jobs," he says.
Not everyone has to go through a college course to become a data scientist, though. A company called Metis, for instance, started offering a 12-week data science boot camp in September. The program, in New York, costs $14,000 and admission is highly competitive. Metis Cofounder Jason Moss says that about half the students come in with a Master's or PhD.
Just a couple of weeks after the first boot camp ended in early December, Moss said six of the class's 15 students had job offers.
"I don't think it's a replacement for college," Moss says of his program. "I think college is about more than the fastest path to getting a job. I also don't believe that you have to have gone to college to be successful as a data scientist," he says. "There's a personality type - innately curious, has grit, wants to figure things out - that does well."
Anmol Rajpurohit, an independent data scientist and consultant, says being a fast learner is the most important attribute for this line of work. "Generic programming skills are a lot more important than being the expert of any particular programming language," he says. "Living in an age of rapid technology advancement, we see languages quickly becoming obsolete and new languages quickly getting popular. Thus, a fast learner will go a lot farther than an expert."
Lowrance says he believes boot camps and online courses can be helpful for candidates who are strong in some skills but weak in others. One virtue of NYU's program is that it teaches the skills sequentially so that they build on each other. "We give you everything you need in an order that makes sense," he says.
What do data scientists do?
"On an average day, I manage a series of dashboards that tell our company about our business: what the users are doing," says Jon Greenberg, a data scientist at Playstudios, a gaming firm. Greenberg is a manager now, so he's programming less than he used to, but he still does his fair share. Usually, he pulls data out of Apache Hadoop storage, runs it through Revolution R, an analytics platform, and comes up with some kind of visualization. "It may be how one segment of the population is interacting with a new feature," he explains.
Greenberg got a Master's degree in statistics six years ago. He expected to go into government work, but was surprised to see that data scientists were so in demand in the private sector. "It was definitely not as hot a field then," he says. Now, he says he gets about one call or email a day from a headhunter. "It's not me," he says. "They probably bother everyone else [with this expertise]."
For Greenberg, employability is a plus, but he loves the work itself. "I think it starts with, you have to have an analytical mind. You have to be curious," he says. "You have to be flexible and creative and think of a different way to solve problems." The only downside of the job, Greenberg says, is the time spent "cleaning" data — pruning it to remove irrelevant findings. "That part's not that exciting and you spend a lot of time doing it," he says.
Rajpurohit says he spends a lot of his energy cleaning data, but also researching. "A significant part of my time is spent on research, because I often come across absolutely new problems and thus, have to study the latest literature on research in that particular field or reach out to experts on those topics for advice," he says.
"Despite its name, data science requires a good mix of both art and science. The science part is obvious: mathematics, programming, etc. The art part is equally important: creativity, deep contextual understanding, etc. Both the parts put together make one a great problem solver."
That said, Rajpurohit acknowledges that "working in Data Science is not even remotely as sexy or glamorous as it is being perceived these days. This field is definitely gaining significance (and seeing high pay offers) across organization[s], but there is a lot of not-so-exciting tasks that a data scientist needs to work on [an] almost daily basis."
Is this the career for you?
If the idea of spending much of your day programming and analyzing dashboards for relevant information appeals to you, then you might have the makings of a data scientist. If you're merely motivated by the salaries, though, you may have a tougher time. Consider: People who fall into this line of work often spend their spare time writing programs and analyzing data just to amuse themselves.
Adam Flugel, data science recruiter for Burtch Works, recalls a recent candidate, a PhD holder, whom he placed at Electronic Arts this fall. "What really stood out was the work that he was doing for fun in his free time," Flugel says. "He was involved in the online multiplayer game World of Tanks and led a 'clan,' basically a team of players. He created a utility to scrape data from the game server and then ran analytics on that data to evaluate his clan's performance. He used this info to figure out how to adjust their strategy, what types of players he should recruit to improve the team, etc."
If you don't love data for its own sake, then you will find it hard to compete with such candidates. Burtch, however, says everyone should learn to love data, if only for the sake of their career. "Within 10 years, if you're not a data geek, you can forget about being in the C-suite," Burtch says.
But what about Steve Jobs, Bill Gates and other such visionaries who saw the big picture and didn't get bogged down in the minutiae of data science? "That was 30 years ago," says Burtch. "I'm talking about the next 10 years."