Photo by Sarah Ackerman
Roger Magoulas asked me an interesting question during Strata: what was the biggest theme that emerged from this year's gathering? It took a bit of thought, but I realized that I was seeing a lot of people from all kinds of professions and organizations becoming conscious of, and open about, their identity as data scientists.
The term itself has received a lot of criticism and there are always worries about 'big-data-washing', but what became clear from dozens of conversations was that it describes something very real and innovative. The people I talked to came from professions as diverse as insurance actuaries, physicists, marketers, geologists, quants, biologists, and web developers, and they were all excited about the same new tools and ways of thinking. Kaggle is concrete proof that the same machine-learning skills can be applied across a lot of different domains to produce better results than traditional approaches, and the same is being proved for all sorts of other techniques, from NoSQL databases to Hadoop.
A year ago, if you worked in a traditional sector, your manager would probably have rolled her eyes if she caught you experimenting with the standard data science tools. These days, there's an awareness and acceptance that they have real advantages over the old approaches, so people have been able to make an official case for using them in their jobs. There's also been a massive amount of cross-fertilization, as it's become clear how transferable the best practices are across domains.
This year, thousands of people across the world have realized they have problems and skills in common with others they would never have imagined talking to. It's been a real pleasure seeing so much knowledge being shared across boundaries, as people realize that 'data scientist' is a useful label for connecting with other people and resources that can help with their problems. We're starting to develop a community, and a surprising amount of the growth is from those who are announcing their professional identity as data scientists for the first time.