Processing data

Over the past two weeks I have been taking my surveys and “transforming” the answers into absence and presence data and quantitative P flows. I had already done this for about 50 or 60 of the surveys in July, to check whether the information was complete and to work out how I should re-check the other surveys once they were completed, to make sure everything I needed was there.

I am so happy I took the time to do this “test” processing, took lots and LOTS OF NOTES, and made good summary Excel sheets of all the conversion factors I had used. It is definitely making my job easier now. I am still finding some “novel” conversions that I need to look up, and I am also thinking hard about the conversions I used before to make sure I am OK with the assumptions behind them. Still, the task seems a little less overwhelming because I did the prep work a few months ago.

I chose to start the processing even though I am still waiting for some missing pieces of information from some stakeholders. This makes the processing a little trickier, because I need to go back in, complete the surveys, and make sure those changes carry all the way through, even when I change Excel files (I am actually not a very big fan of linking between Excel files: I have had too many problems when changing computers, and to me it makes the data less shareable with people). I am using COLOR CODES for the cells that need extra information, and I think this should be enough to keep me on track.

Here are my basic steps for going from the filled-out surveys stored in LimeSurvey to something I can use:

  1. Go back to your research questions. What are you looking for? (There is a certain amount of change along the way: my questions are quantitative, but after the pilot I also added some yes/no questions to be sure I wouldn’t lose possible data and also to “ease” people in.)
  2. Work out what steps need to be taken between what you have and what you need. For me it’s converting everything to P, so unit conversion (all metric, but also from volume to mass, so I look up densities online and also use the data we have on commercial inputs).
  3. Write down your assumptions. If the survey doesn’t give you the information you need for a conversion, you have to make assumptions, and you need to write them all down.
  4. Take notes on each transformation of the raw data. Trust me, if you don’t take meticulous notes on how you transformed the data and why, it will be hard to summarize later (and even with notes it’s hard). I think this has been the big difference between my MSc and my PhD: I have made sure to take copious notes. I wouldn’t say it systematically reduces mistakes, but it makes it 1000% easier to see where you made them, and then to be systematic about correcting assumptions as you go along if you need to.
  5. Look at all the data. Even though I am nowhere near finished processing, there are already things in the data that I didn’t anticipate. I am realizing there might be interesting information on the number of inputs used and the type of inputs, so more about management practices in addition to the quantitative values.
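
For anyone who prefers a script to a spreadsheet, steps 2 to 4 could be sketched in Python roughly as follows. This is only an illustration and not the Excel workflow described above: the names `ConversionLog` and `litres_to_kg_p` are made up for the example, and the density and P-content numbers are placeholders, not real conversion factors. The idea is simply that every conversion writes its own note, including where the assumed factor came from, so the assumptions can be reviewed and corrected later.

```python
from dataclasses import dataclass, field


@dataclass
class ConversionLog:
    """Collects one note per transformation so assumptions can be reviewed later."""
    notes: list = field(default_factory=list)

    def record(self, survey_id, step, detail):
        self.notes.append({"survey_id": survey_id, "step": step, "detail": detail})


def litres_to_kg_p(survey_id, volume_l, density_kg_per_l, p_fraction, log, source):
    """Convert a reported volume of an input to kilograms of P.

    density_kg_per_l and p_fraction are assumptions that have to be looked up
    for each input; `source` says where they came from.
    """
    # Step 1: volume to mass, using the assumed density.
    mass_kg = volume_l * density_kg_per_l
    log.record(survey_id, "volume -> mass",
               f"{volume_l} L x {density_kg_per_l} kg/L = {mass_kg} kg ({source})")

    # Step 2: mass to P, using the assumed P content as a mass fraction.
    p_kg = mass_kg * p_fraction
    log.record(survey_id, "mass -> P",
               f"{mass_kg} kg x {p_fraction} = {p_kg} kg P ({source})")
    return p_kg


# Hypothetical example: the survey ID and all numbers are placeholders.
example_log = ConversionLog()
example_p = litres_to_kg_p("S001", volume_l=100, density_kg_per_l=0.5,
                           p_fraction=0.01, log=example_log,
                           source="placeholder values, not a real reference")

# Presence/absence follows directly from the quantitative value.
example_present = example_p > 0

for note in example_log.notes:
    print(note["survey_id"], "|", note["step"], "|", note["detail"])
```

Keeping the notes next to the numbers means that if an assumed density turns out to be wrong, it is easy to find every survey where it was used and redo only those conversions.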