The Essential Skills of a Great Data Scientist

Introduction

The growing demand for data

In today’s data-driven world, the demand for skilled data scientists is soaring. As organizations increasingly rely on data to make informed decisions and gain a competitive edge, the technical prowess of data scientists is essential. However, technical skills alone are no longer sufficient to thrive in this field. To truly excel as a data scientist, one must also possess a range of soft skills that enable effective communication, collaboration, and problem-solving.

Soft skills

The essential soft skills needed to make a profound difference in your work are communication, problem solving, continuous learning and curiosity, collaboration, and adaptability and resilience.

There are many resources out there telling you what the top skills are to learn to be a data scientist. Many of them list the same skills, and I haven’t used many of those skills in my own work. It feels like many people are selling themselves a little too much through these advertisements and recommendations. I don’t think you need all the latest fancy tools and knowledge of the most complicated algorithms. In industry, you will soon realize that simpler is sometimes better.

So many of these people say “You need SQL, Python, R, machine learning, etc. so take this FREE SQL course that I am offering which really only has basic examples and doesn’t really explain much else!”

I have yet to see a real comprehensive source mention anything about the REAL skills you need to be a GREAT data scientist.

While I do think a lot of these technical skills are pertinent to the field, I feel there isn’t much attention on the soft skills. Many people seem to gloss over the fact that data scientists aren’t there JUST to build and run models. They are there to help interpret the results from said models to inform decision making.

Now I know you have heard that before, but it is much easier said than done. You have to be able to take the information you have and tell a story, and tell it to someone who has no idea about the technical aspects of how you reached your conclusion. This is where the term “black box” may come to mind.

Communication

This storytelling with the data is where data science is as much an art as it is a science. And I see many people fall flat in this area.

For some people, this skill is rather innate because they are naturally creative and explorative. But others have trouble seeing how their analysis applies to the real world. I browse through countless analysis projects using datasets like those on Kaggle. Some people have really great models and get accurate results, but I don’t see the story anywhere. Just numbers on a page.

I know what these things mean, but would the senior leadership at your company know what they mean? Not likely, as they are much more focused on seeing the business impact than on the mathematical methods and results.

It is an art as much as it is a science…

Sometimes I see people get off to a good start by painting a scenario in which the dataset might be used for an analysis, but then they never reference the use case again. They just put the code out there for people to see, and once again only the technical part of the project is on display. They should have taken the results from their models and analysis and tied them back to the use case to tell a full story and really make it GREAT. That would tell me that they can do the “science” part of data science as well as the part that makes it an art: they can tell a story and really interpret the results.

Another problem with providing only code and no story: how do you expect someone to see that you really know what you are doing? Sometimes people just do things without providing the reasoning. This can stem from habit, after doing something a hundred times, or from seeing everyone else do it (which doesn’t mean it is right).

Here are some questions to keep in mind. I often ask them myself when I see projects with only code provided:

  • How do I know you didn’t just copy someone else’s code?
  • Where is your personal touch? Your own interpretation?
  • I have seen this plot a hundred times! Why are you using it?
  • Can you explain why you dropped those observations or leverage points?

If you keep these questions in mind and answer them as you present your project, you should be able to present your findings, results and insights in a clear and concise manner, even to non-technical stakeholders.
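To make the last question in that list concrete, here is a minimal sketch of what "showing your reasoning" could look like for leverage points. It assumes a simple linear regression with one predictor and uses the common 2p/n rule of thumb as the cutoff. The function names, the ad_spend data and the cutoff choice are all made up for illustration, so treat this as one possible approach rather than the right one for your project.

```python
def leverages(x):
    """Leverage (hat value) of each point for a simple linear regression on x.

    For one predictor plus an intercept: h_i = 1/n + (x_i - mean)^2 / Sxx.
    Assumes the values in x are not all identical (otherwise Sxx is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]


def flag_high_leverage(x, n_params=2):
    """Return (index, value, leverage) for points above the 2p/n rule of thumb.

    n_params counts the fitted coefficients (slope and intercept here).
    The cutoff is a convention, not a law, so say in your write-up why you used it.
    """
    cutoff = 2 * n_params / len(x)
    return [
        (i, xi, round(h, 3))
        for i, (xi, h) in enumerate(zip(x, leverages(x)))
        if h > cutoff
    ]


# Hypothetical data: one value sits far away from the rest.
ad_spend = [1, 2, 3, 4, 5, 20]

for i, value, h in flag_high_leverage(ad_spend):
    # This sentence is the part that belongs in the story, not just in the code.
    print(f"Row {i} (x={value}) has leverage {h}, above the 2p/n cutoff; "
          f"reviewing it before deciding whether to drop it.")
```

The point is not the formula. It is that the decision to flag or drop a row is written down next to a reason that a reader can question.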

Problem solving

Data scientists are problem solvers at their core.

They tackle complex challenges, analyze vast amounts of data, and uncover meaningful insights. A great data scientist possesses critical thinking skills, an analytical mindset, and the ability to approach problems with creativity and resourcefulness.

This is what you have heard before. But where do you obtain or hone these skills? This is where math comes in handy. Many people shudder when they think of math, maybe because it didn’t make sense to them in high school or college, when a level of abstraction was added and they were ill-equipped with the tools to think at that level. (I have my own thoughts on how math is taught in public school, but that is a whole other post and I won’t get into it here.) The point I am trying to make is that a lot of people think math is hard.

The little-understood truth is that math is there to make things easier. When studying math in college, you hone the skills required for problem solving by breaking apart what you know to find out what you don’t know. Because many people get turned off once x and y are involved, they don’t get to this point.

Critical thinking is a foundational skill for data scientists. It involves the ability to objectively evaluate information, identify biases, and apply logical reasoning to draw conclusions.

Great data scientists question assumptions, challenge existing methodologies, and explore alternative perspectives. By exercising critical thinking, data scientists can uncover hidden insights, detect anomalies, and refine their models to achieve better results.

Data science problems rarely have one definitive solution. Great data scientists embrace an iterative approach, constantly refining their solutions based on feedback and new insights. They are not discouraged by setbacks but rather see them as opportunities for improvement. By continuously learning from their mistakes and evolving their methodologies, data scientists can deliver more robust and accurate solutions over time.

Great data scientists can break down complex problems into smaller, manageable components, which lets them focus on specific aspects and identify patterns and trends. By combining logical reasoning, data exploration, and domain knowledge, they can address intricate data-related problems efficiently.

Continuous learning and curiosity

The field of data science is constantly evolving, with new technologies, tools, and methodologies emerging regularly.

To stay ahead, a great data scientist must have an insatiable curiosity and a commitment to continuous learning. They actively seek out new knowledge, stay updated with the latest trends, and adapt their skill set accordingly.

By embracing a growth mindset and dedicating time to explore new techniques and approaches, data scientists can bring fresh insights to their work and maintain a competitive edge.

Collaboration

Data science projects are rarely solo endeavors. Data scientists must collaborate with colleagues from various backgrounds, including business stakeholders, domain experts, data engineers, and software developers.

A great data scientist is a team player who can effectively collaborate and integrate diverse perspectives. They understand the importance of cross-functional teamwork, actively contribute to group discussions, and value the insights and expertise of others. By fostering a collaborative environment, data scientists can leverage collective intelligence and deliver more impactful results.

Adaptability and resilience

The world of data science is characterized by ambiguity, uncertainty, and rapidly changing circumstances.

Successful data scientists possess adaptability and resilience, enabling them to thrive in dynamic environments. They can quickly adjust their strategies, learn new techniques, and navigate through challenges and setbacks. By embracing change and demonstrating resilience in the face of adversity, data scientists can continuously deliver value and drive meaningful impact.

Wrap up

While technical skills are undoubtedly crucial for a data scientist, the importance of soft skills cannot be overstated. The abilities to communicate effectively, solve complex problems, continuously learn, collaborate with other professionals, and adapt to evolving landscapes are key differentiators that set great data scientists apart. By honing these essential soft skills, data scientists can unlock their full potential, make a significant impact in their organizations, and achieve long-term success in the dynamic world of data science.