What’s Next?

06.02.2017

Episode #10 of the course An introduction to data science by Roger Peng

You should now be armed with an approach that you can apply to your data analyses. Although each data set is its own unique organism and each analysis has its own specific issues to contend with, tackling each step with the epicycle framework is useful for any analysis. As you work through developing your question, exploring your data, modeling your data, interpreting your results, and communicating your results, remember to always set expectations and then compare the result of your action to your expectations. If they don’t match, identify whether the problem is with the result of your action or your expectations and fix the problem so that they do match. If you can’t identify the problem, seek input from others, and then when you’ve fixed the problem, move on to the next action. This epicycle framework will help to keep you on a course that will end at a useful answer to your question.
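As a rough illustration of that loop, here is a minimal Python sketch of one pass: state an expectation, run an action, and compare the two. The `Check` class, the `run_check` function, and the `readings` data are all hypothetical names and values invented for this example. They do not come from the course or the book.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Check:
    """One pass through the loop: an expectation stated before the action."""
    description: str
    expected: Callable[[object], bool]  # the expectation, set up front
    action: Callable[[], object]        # the step that collects information


def run_check(check: Check) -> bool:
    """Run the action and report whether the result matched the expectation."""
    result = check.action()
    matched = check.expected(result)
    status = "matches" if matched else "does NOT match"
    print(f"{check.description}: result {result!r} {status} the expectation")
    return matched


# Hypothetical data: we expect one reading per day for a full week.
readings = [12.1, 11.8, 12.4, 13.0, 12.7, 12.2]

row_count = Check(
    description="number of readings",
    expected=lambda n: n == 7,
    action=lambda: len(readings),
)

if not run_check(row_count):
    # A mismatch means either the result or the expectation needs revising
    # before moving on to the next action.
    print("Investigate: is a day missing, or was the expectation wrong?")
```

Here the count comes back as 6 rather than 7, so the sketch stops and prompts the analyst to decide whether the data or the expectation is at fault, which is the decision the framework asks for at every step.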

In addition to the epicycle framework, there are also specific activities of data analysis that we discussed throughout these lessons. Although all of the analysis activities are important, if we had to identify the ones that are most important for ensuring that your data analysis provides a valid, meaningful, and interpretable answer to your question, we would include the following:

1. Be thoughtful about developing your question and use the question to guide you throughout all of the analysis steps.

2. Follow the ABCs:

a. Always be checking
b. Always be challenging
c. Always be communicating

The best way for the epicycle framework and these activities to become second nature is to do a lot of data analysis, so we encourage you to take advantage of the data analysis opportunities that come your way. Although many of these principles will become automatic with practice, we have found that revisiting them has helped to resolve a range of issues we’ve faced in our own analyses. We hope, then, that these lessons continue to serve as a useful resource after you’re done and whenever you hit the stumbling blocks that occur in every analysis.

The book The Art of Data Science, written with Elizabeth Matsui, provides an in-depth look at how data science works best. We expand on many of the topics covered in these lessons and provide concrete examples of how they apply in the real world. The book is available on a pay-what-you-want basis from Leanpub.
