This post is long overdue (I feel like I say that a lot!).
It is a follow-up to a post I published in January titled “Here’s my communication research class assignment on analyzing media placement.” Recently, I received a public comment on that post from a professor I greatly admire, Kelli Burns. She pointed out that the project assignment (see the bottom of this post for that document) notes at the bottom that additional work will be assigned the following day, but I never discussed what that entails in the blog post. I apologize to everyone who read that post because, in that sense, it was incomplete in explaining the project.
Thank you to Dr. Burns for bringing this to my attention. With this in mind, I’ve decided to do a much-delayed follow-up post, turning that initial post into a two-part series.
So, if you haven’t read the first post in this series, I encourage you to go back and do so. If you just want to know about teaching students to do data entry from coded data and to create data legends, then read on, my friend!
The Set Up
To review, in the first post I provided an assignment where students download a data set of media articles using the Meltwater social intelligence software. Their task is to conduct a quantitative content analysis using a coding sheet (which I’ve provided in that first post). They are then told to do all of the coding at home, dividing up the articles to code as evenly as possible among their team.
On the second day of class, students come back with a completed coding sheet for each article they were responsible for. I instruct the students to download the coding sheet, copy it onto a new page in their document once for each article they need to code, and code each article by highlighting the answers on its copy of the coding sheet. For example, a student who needed to code 30 articles would return with a digital copy of an MS Word document with 30 pages, each page containing a completed coding sheet.
All good, right? They just need to get their coded data into something that SPSS can read… because that always goes smoothly! 😛
This whole project is aimed at introducing students to quantitative research and all we’re doing is running descriptive statistics. But here’s the problem:
As you probably remember learning in a quantitative methods class some years ago (let’s not age ourselves), the numbers in a data set don’t mean anything themselves. We, the researchers, assign meaning to them. This is an idea that we have to teach the students.
Here’s a simple example. Let’s say that we are coding for eye color. We assign the following numbers for coding purposes:
- 1 = brown
- 2 = blue
- 3 = hazel
- 4 = green
- … and so forth until we have an exhaustive list.
But when a student runs descriptive statistics and finds that variable 1 has a mode of 3, they ask, “What the heck does that mean?”
The problems here are:
- They don’t know what variable 1 corresponds to on their coding sheet (in this example, eye color).
- They don’t know what a mode of 3 represents (that the most common eye color is hazel).
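For readers who like to see the idea in code, here is a minimal Python sketch of why the legend matters. It is not part of the class (we use SPSS), and the list of coded values is made up purely for illustration. The legend is what turns “a mode of 3” into “the most common eye color is hazel.”

```python
from statistics import mode

# Data legend for one variable: numeric code -> meaning
# (codes taken from the eye color example above)
eyecolor_legend = {1: "brown", 2: "blue", 3: "hazel", 4: "green"}

# Hypothetical coded values, one per person coded (invented for this sketch)
eyecolor_codes = [3, 1, 3, 2, 3, 4, 1]

# The software only reports the number...
most_common_code = mode(eyecolor_codes)

# ...and the legend tells us what that number means
most_common_label = eyecolor_legend[most_common_code]

print(f"eyecolor mode = {most_common_code} ({most_common_label})")
```

Without the `eyecolor_legend` lookup, all a student has is the bare number.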
Oh, and keep in mind that the students haven’t done any data entry yet. They don’t have their data in a spreadsheet format that can be imported into SPSS. So, there’s another problem: most students have never entered data into a spreadsheet before.
What They Need to Do
- Get their coded data into a spreadsheet format that can be analyzed in SPSS.
- Create a data legend so they can interpret the SPSS output.
What They Need to Know About Measurements First
In my class, students need to know the four common types of measurement (nominal, ordinal, interval and ratio), as the Netflix assignment (and other assignments to follow) uses them. Students in our major are not required to take any statistics class, so this is new information to the vast majority of them. If your students know this, you can skip it. If you need a refresher on these, here is a quick summary that explains each measurement type and its strengths and limitations. I teach these concepts with a lecture and an in-class activity to test students’ ability to apply them. I do this earlier in the semester, before we get into the Netflix assignment.
Teaching Students Basic Data Entry
This part is pretty simple. As a reminder, the students are working in teams on this project. So the team needs to create a shared Google spreadsheet in which they enter all the coded data from their coding sheets. They just need to open their Word document alongside the shared Google spreadsheet and, for each article coded, enter the corresponding numbers from the coding sheet. The key thing is that in this spreadsheet the columns are the questions (i.e., variables) on the coding sheet and the rows are the individual articles (such as in the image below). Otherwise, it won’t import into SPSS correctly. (Note: You can import a CSV file into SPSS. So, I have my students download the Google spreadsheet in CSV format and import that into SPSS.)
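To make the layout concrete, here is a small Python sketch of what the downloaded CSV might look like: one column per variable, one row per article. The variable labels (`articleid`, `tone`, `mediatype`) and the coded values are hypothetical, invented for illustration; your own coding sheet determines the real ones.

```python
import csv
import io

# Hypothetical variable labels: one column per question on the coding sheet
header = ["articleid", "tone", "mediatype"]

# Hypothetical coded data: one row per article
rows = [
    [1, 2, 1],
    [2, 1, 3],
    [3, 2, 2],
]

# Write the data in CSV form (to an in-memory buffer here, rather than a file)
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Read it back to confirm each article is a row keyed by variable label
records = list(csv.DictReader(io.StringIO(csv_text)))
print(records[0])
```

The first line of the file holds the variable labels, and every line after that is one article.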
But before they can enter their data, they need a data legend. So...
Teaching Students to Create Data Legends
A data legend lets the researcher quickly put meaning to the variables and numbers in their results.
Creating a data legend can be done in SPSS. But, for time purposes and because students won’t always be using SPSS, I prefer to do it another way. It is quite useful, as I can have the data legend right in front of me on a piece of paper.
Simply have your students type or write up their data legend and keep it handy.
Each variable needs a descriptive label of 13 characters or fewer (13 characters is the max that SPSS allows you to use in describing a variable).
Each possible numerical value of that variable needs a name, which is the simplest possible description of what that number means. So, in our example above, if 1 equaled brown eye color, 2 equaled blue eye color and so forth, then we write it up to look like this:
eyecolor (1) brown, (2) blue, (3) hazel, (4) green.
In the above, I have given the eye color variable the label eyecolor. The numbers in parentheses represent the numerical values that I have assigned to the possible responses.
For scale questions, the number equals the number on the scale. Example: On a scale of 1-7 where 1 means not at all, and 7 means very much so, how much do you like string cheese?
stringcheese (1) not at all, (2) 2, (3) 3, (4) 4, (5) 5, (6) 6, (7) very much so.
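If your students keep their legend in a file rather than on paper, the same notation can be generated from a simple lookup table. The sketch below is one possible way to do that in Python; the `format_legend` helper is a hypothetical name of my own, not something from SPSS or the handout.

```python
def format_legend(label, values):
    """Build a legend line such as 'eyecolor (1) brown, (2) blue.'"""
    parts = [f"({code}) {meaning}" for code, meaning in values.items()]
    return f"{label} " + ", ".join(parts) + "."


# Nominal example: each code gets a name
eyecolor = {1: "brown", 2: "blue", 3: "hazel", 4: "green"}

# Scale example: each code is its own number, except the named endpoints
stringcheese = {n: str(n) for n in range(1, 8)}
stringcheese[1] = "not at all"
stringcheese[7] = "very much so"

eyecolor_line = format_legend("eyecolor", eyecolor)
stringcheese_line = format_legend("stringcheese", stringcheese)
print(eyecolor_line)
print(stringcheese_line)
```

Keeping the legend as a lookup table also means the same table can later be used to translate numbers in the output back into words.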
So, the instructions for creating a data legend are quite simple:
On a separate file or paper:
- Assign each variable a label (max 13 letters). So, “schoolstatus”, “favicecream” and “rankicecream” work.
- If it is nominal or ordinal, note that in parentheses (this is optional, but I like to do it to remind students what type of variable it is).
- With each label, make a list that indicates what # we have assigned to each term within our measurement, by placing the # in parentheses.
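The label-length rule in the steps above is easy to check automatically. Here is a minimal Python sketch, assuming the 13-character limit stated in this post; the `label_fits` helper and the too-long label `favoriteicecream` are made up for illustration.

```python
MAX_LABEL_LENGTH = 13  # limit given in this post's instructions


def label_fits(label):
    """Return True if the label is within the length limit."""
    return len(label) <= MAX_LABEL_LENGTH


# The three labels from the instructions, plus one hypothetical label that is too long
labels = ["schoolstatus", "favicecream", "rankicecream", "favoriteicecream"]

for label in labels:
    status = "ok" if label_fits(label) else "too long"
    print(f"{label} ({len(label)} characters): {status}")
```

A check like this can save a team from discovering a problem label only after the data has been entered.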
Of course, there are some caveats when dealing with different measurement types, such as ordinal data. Indeed, ordinal data and ‘check all that apply’ questions are tough. These can be a bit frustrating when doing data entry. That’s why I’ve provided below a handout I created and use in class to teach students how to create data legends using the different types of measurements. It walks them through not only how to create a data legend for each variable but also how to enter that data correctly from their coding sheet into their spreadsheet so that the spreadsheet can be analyzed in SPSS or elsewhere.
Once you walk students through this process, you can give them an activity to test for understanding and application. If the students don’t enter their data correctly now, it is going to be a mess when they try to import it into SPSS. So while this may take some valuable class time or may serve as homework, I recommend assigning the data entry and data legend activity (see below) and making sure the students entered their data correctly.
In the activity, it is important to clarify to students that, in part 2 of the activity, the survey responses are separated by semi-colons such that the first respondent’s answers are: a) digital film, b) freshman, c) 4, and d) Domino’s, Pizza Hut, Pizza Perfection.
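To illustrate how a response like that might become a spreadsheet row, here is a hedged Python sketch. It uses one common approach to ‘check all that apply’ questions: a separate column per option, coded 1 if checked and 0 if not. This may or may not match the approach in the handout. The raw response string, the legends, the column names and the extra option “Papa John's” are all assumptions invented for this sketch.

```python
# Assumed raw form of the first respondent's answers, separated by semicolons
raw = "digital film; freshman; 4; Domino's, Pizza Hut, Pizza Perfection"

# Hypothetical legends: meaning -> numeric code
major_legend = {"digital film": 1, "public relations": 2}
year_legend = {"freshman": 1, "sophomore": 2, "junior": 3, "senior": 4}

# Hypothetical check-all-that-apply options: option -> column label
pizza_columns = {
    "Domino's": "dominos",
    "Pizza Hut": "pizzahut",
    "Pizza Perfection": "pizzaperf",
    "Papa John's": "papajohns",
}


def encode(response):
    """Turn one semicolon-separated response into a row of numeric codes."""
    major, year, answer_c, pizzas = [part.strip() for part in response.split(";")]
    chosen = {name.strip() for name in pizzas.split(",")}
    row = {
        "major": major_legend[major],
        "year": year_legend[year],
        "answer_c": int(answer_c),
    }
    # One column per option: 1 if the respondent checked it, 0 otherwise
    for option, column in pizza_columns.items():
        row[column] = 1 if option in chosen else 0
    return row


row = encode(raw)
print(row)
```

Each respondent still occupies exactly one row; the multi-answer question simply spreads across several columns.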
Once the students have created the data legend and entered it into the table on the activity sheet, their answers should look like this:
Once your students have this down, set them loose to do their data entry. You may want to assign that as homework. You can then give them a lecture on descriptive statistics and work with SPSS or whatever software you’ll be doing the analysis in. Help the students interpret what the data means by pointing them to their data legend.
I hope this blog post was helpful. Again, if you have not yet done so, check out the first post in this series to learn more about the Netflix media placement assignment. If you want to know more about my applied communication research class, you can see all blog posts related to communication research here.
Data Entry and Data Legend Handout for Students
Data Entry and Data Legend Activity for Students
Project 1: Media Placement Assignment Handout (from previous blog post cited above).
credits: Photo public domain from Pexels