A quick side note: as anyone who has been handed a bunch of analytic results can tell you, there is still a lot of work to be done to translate those results into clear and engaging tables, charts, presentations and reports. That is (figuratively) the icing on the cake, and we'll cover it another time.
For now, we want to focus on what happens during the mixing. What gets done to data before it's ready to go into the oven? Here I've compiled a fairly comprehensive list of things to look for/do with raw data prior to the actual analyses. It is by no means exhaustive, nor does it apply to every data situation, but it's a great place to start the discussion.
~Bon Appétit!
Do a quick review to see if the data 'make sense', if the data are complete, and if the data are what was expected.
Review the analysis plan (the recipe!) and make sure what is planned is feasible with the data you have.
Check the data structure to make sure it is set up correctly for the type of analysis that is planned. Reconfigure if necessary.
Know what the unique ID variable in the dataset is.
Identify and get rid of any duplicate or test observations.
Examine the inclusion/exclusion criteria for the project and make sure all observations in the dataset actually belong there.
Check each variable - frequencies if categorical, univariate statistics if continuous - is it complete, and is it formatted correctly?
Check for skip patterns and exclude responses to questions if they should have been skipped over.
Collapse, recode, and create new variables as needed for the final analyses.
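To make a few of these steps concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made up for illustration: the records, the field names (`id`, `age`, `smoker`, `cigs_per_day`, `is_test`), the adults-only inclusion rule, and the age cut-point. It is one possible way to walk through the list, not a prescribed workflow.

```python
from collections import Counter
from statistics import mean

# Hypothetical raw survey records; all field names and values are invented.
raw = [
    {"id": 1, "age": 34, "smoker": "yes", "cigs_per_day": 10, "is_test": False},
    {"id": 2, "age": 17, "smoker": "no", "cigs_per_day": None, "is_test": False},
    {"id": 3, "age": 51, "smoker": "no", "cigs_per_day": 5, "is_test": False},
    {"id": 3, "age": 51, "smoker": "no", "cigs_per_day": 5, "is_test": False},
    {"id": 4, "age": 45, "smoker": "yes", "cigs_per_day": 20, "is_test": True},
    {"id": 5, "age": None, "smoker": "Yes", "cigs_per_day": 3, "is_test": False},
]

def drop_duplicates_and_tests(rows):
    """Keep the first record per unique ID and drop records flagged as tests."""
    seen, kept = set(), []
    for row in rows:
        if row["is_test"] or row["id"] in seen:
            continue
        seen.add(row["id"])
        kept.append(dict(row))  # copy, so the raw data stay untouched
    return kept

def apply_inclusion(rows, min_age=18):
    """Assumed criterion: adults only. Missing ages are kept for manual review."""
    return [r for r in rows if r["age"] is None or r["age"] >= min_age]

def standardize_smoker(rows):
    """Fix inconsistent formatting of a categorical variable."""
    for r in rows:
        r["smoker"] = r["smoker"].strip().lower()
    return rows

def enforce_skip_pattern(rows):
    """Non-smokers should have skipped the cigarettes question; blank it out."""
    for r in rows:
        if r["smoker"] == "no":
            r["cigs_per_day"] = None
    return rows

def add_age_group(rows, cut=45):
    """Collapse a continuous variable into a new categorical one."""
    for r in rows:
        if r["age"] is None:
            r["age_group"] = None
        else:
            r["age_group"] = "18-44" if r["age"] < cut else "45+"
    return rows

def summarize_continuous(rows, field):
    """Univariate check: counts, missingness, and range for one variable."""
    values = [r[field] for r in rows if r[field] is not None]
    return {
        "n": len(values),
        "missing": len(rows) - len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }

clean = apply_inclusion(drop_duplicates_and_tests(raw))

# Frequency check before any recoding: exposes "Yes" vs "yes" as separate values.
smoker_freq_before = Counter(r["smoker"] for r in clean)

clean = add_age_group(enforce_skip_pattern(standardize_smoker(clean)))
age_summary = summarize_continuous(clean, "age")
```

Running the frequency check before standardizing is deliberate: the point of that step is to see the formatting problems, so the raw values are counted first and only then recoded.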
The views expressed on the Institute for Community Health blog page are solely those of the blog post author(s), and do not necessarily reflect the views of ICH, the author’s employer or other organizations with which the author is associated.