Yesterday in Data 101, we addressed a few of data’s most fundamental questions. Today, you’ll get to know the most common data professions. Let’s jump right in!
There are three overarching professional domains:
• data analytics
• data science
• data engineering
We’ll first break them down one by one to clearly distinguish their differences, and finish with a simple exhibit of how all three generally work together to make the most of data.
Often referred to simply as analytics, data analytics is the domain where professionals quality-assure data, build reports and dashboards, and run basic analyses. Their work helps an organization make better decisions by answering four high-level questions with data:
• What happened? (How many users registered on our mobile app yesterday?)
• Why did it happen? (Which platform registered the most mobile app users yesterday, iOS or Android?)
• What is likely to happen? (How many mobile app users will we have by the end of the year?)
• What actions should we take? (What experiment should we run to better understand how to convert users faster?)
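To make the first two questions concrete, here is a minimal sketch of how an analyst might count yesterday's registrations and then break them down by platform. The event list, dates, and function names are made up for illustration; in practice this data would usually come from a database or reporting tool.

```python
from collections import Counter

# Hypothetical registration events as (date, platform) pairs.
# This is made-up illustration data, not real figures.
registrations = [
    ("2024-05-01", "iOS"),
    ("2024-05-01", "Android"),
    ("2024-05-01", "iOS"),
    ("2024-04-30", "Android"),
    ("2024-05-01", "iOS"),
]


def registrations_on(events, day):
    """'What happened?': count registrations on a given day."""
    return sum(1 for date, _ in events if date == day)


def registrations_by_platform(events, day):
    """Break the same count down by platform, as in the second question."""
    return Counter(platform for date, platform in events if date == day)


total = registrations_on(registrations, "2024-05-01")
by_platform = registrations_by_platform(registrations, "2024-05-01")

print("Registrations:", total)
print("Top platform:", by_platform.most_common(1))
```

The last two questions (forecasting and choosing actions) usually need more than counting, which is where data science tends to come in.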
Roles in this domain generally end with “analyst” (e.g., data analyst, business analyst, sales operations analyst, growth marketing analyst, or product analyst).
While the field of data science is vast, we can break it down into two simple genres: decision-based and feature-based data science. Let’s dig a bit deeper:
Decision-based data science. This genre of data science is simply an extension of data analytics. People operating in this space apply advanced data-modeling and statistical techniques to problems that are too complex for basic analysis. Specifically, they bring more sophistication to answering the last two questions from our list above (i.e., “What is likely to happen?” and “What actions should we take?”).
For example, the interpretation of whether a product experiment was successful (especially if it was tested with a small sample of customers) would typically be categorized as a data science effort, as it would require the use of advanced statistical and probabilistic methods.
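As a rough illustration of what one such "advanced statistical method" can look like, here is a sketch of a one-sided Fisher's exact test, which is one common choice when sample sizes are small. The conversion numbers are invented, and a real data scientist might pick a different test depending on the experiment design.

```python
from math import comb


def fisher_one_sided(conv_a, n_a, conv_b, n_b):
    """One-sided Fisher's exact test for a small A/B experiment.

    Returns the probability that variant B would show at least `conv_b`
    conversions purely by chance, if A and B truly converted at the same
    rate (the total number of conversions is held fixed).
    """
    total_conv = conv_a + conv_b
    total_n = n_a + n_b
    denominator = comb(total_n, n_b)
    p_value = 0.0
    # Sum the hypergeometric probabilities for every outcome at least as
    # favorable to B as the one we observed.
    for k in range(conv_b, min(total_conv, n_b) + 1):
        p_value += comb(total_conv, k) * comb(total_n - total_conv, n_b - k) / denominator
    return p_value


# Hypothetical experiment: 2 of 20 customers converted on the control (A),
# 8 of 20 converted on the new variant (B).
p = fisher_one_sided(conv_a=2, n_a=20, conv_b=8, n_b=20)
print(f"p-value: {p:.3f}")
```

With these made-up numbers the p-value comes out at roughly 0.03, so a lift this large would be fairly unlikely to appear by chance alone. Whether that is convincing enough to ship the change is a judgment call the data scientist helps the team make.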
The common role in this area of data science is: data scientist.
Feature-based data science. This genre of data science refers to the building of data products. Data products are literal product features, based on data, that are consumed by a company’s external users and customers. Let’s dive into an example:
Amazon’s recommendation feature, “Customers Who Viewed This Item Also Viewed,” is a data product. The data science folks at Amazon have built a data model that relates the item you’ve just selected to other items that customers tend to view alongside it. Amazon uses data not only to deliver this data product but also to keep making it more accurate over time.
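To give a feel for the idea, here is a toy sketch that recommends items by counting how often they show up in the same browsing session. It is not Amazon's actual method, which is far more sophisticated; the session data and function names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical browsing sessions: each list holds items one customer viewed.
sessions = [
    ["tent", "sleeping bag", "lantern"],
    ["tent", "sleeping bag"],
    ["tent", "camp stove"],
    ["lantern", "camp stove"],
]


def build_co_views(sessions):
    """For every item, count how often each other item shares a session with it."""
    co_views = defaultdict(Counter)
    for items in sessions:
        # set() ensures an item viewed twice in one session is counted once.
        for a, b in permutations(set(items), 2):
            co_views[a][b] += 1
    return co_views


def also_viewed(co_views, item, k=1):
    """Return the k items most often viewed alongside `item`."""
    return [other for other, _ in co_views.get(item, Counter()).most_common(k)]


co_views = build_co_views(sessions)
print(also_viewed(co_views, "tent"))
```

As more sessions are collected, the counts shift and the recommendations change with them, which is a simple version of a data product improving over time.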
The common role in this area of data science is: machine learning engineer.
In order for any of the above to be possible (i.e., data analytics and data science), data actually has to be available. Analysts, data scientists, and machine learning engineers need access to clean, organized, and timely data to do their work. This is where data engineering comes into the picture!
Sometimes referred to as data infrastructure or data architecture, data engineering is about building “data pipelines” that collect, store, update, and move data from one place to the next. These efforts are typically engineering-forward rather than analytical.
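A pipeline can be sketched in a few lines as three steps: collect the raw data, clean it, and store it where analysts can query it. The example below is a minimal illustration only; the raw events, table name, and function names are assumptions, and it uses an in-memory SQLite database in place of a real data warehouse.

```python
import sqlite3

# Hypothetical raw events as they might arrive from an app: messy and incomplete.
raw_events = [
    {"user_id": "1", "platform": " ios ", "registered_at": "2024-05-01"},
    {"user_id": "2", "platform": "Android", "registered_at": "2024-05-01"},
    {"user_id": "", "platform": "iOS", "registered_at": "2024-05-01"},  # missing id
]


def extract():
    """Collect: in a real pipeline this might read from an API, log files, or a queue."""
    return list(raw_events)


def transform(rows):
    """Clean: drop rows without a user id and normalize the platform name."""
    cleaned = []
    for row in rows:
        if not row["user_id"]:
            continue
        cleaned.append(
            (int(row["user_id"]), row["platform"].strip().lower(), row["registered_at"])
        )
    return cleaned


def load(conn, rows):
    """Store: write the clean rows to a table that analysts can query."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS registrations "
        "(user_id INTEGER PRIMARY KEY, platform TEXT, registered_at TEXT)"
    )
    conn.executemany("INSERT OR REPLACE INTO registrations VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
count = conn.execute("SELECT COUNT(*) FROM registrations").fetchone()[0]
print("Rows available for analysis:", count)
```

Real pipelines add scheduling, monitoring, and error handling on top of this, but the collect, clean, and store shape stays much the same.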
A common role in data engineering is: data engineer.
In summary, folks in data engineering help build the infrastructure for data to be accurately collected, so folks in data science and analytics actually have data to work with!
Tomorrow, we’ll discuss the data tools these professions use to get their work done!
—Colby and Serge