By Kevin Shane / August 1, 2017
Data Vocabulary: Some of Many!
Here at Learnmetrics, we encounter people with widely varying levels of data knowledge. We work with everyone from experts in the field to people who are just starting to dip their toes into the world of data. It is our goal to help everyone realize the power of their data, so we’ve written this short guide for individuals at all levels. I’ll cover a beginner data term, an intermediate term, and an advanced term.
Beginner – “Data Set”
When you first look at data, it’s bound to feel confusing and overwhelming. The importance of thinking of your data in smaller pieces, instead of looking at everything at once, cannot be overstated. A “data set” is a collection of related information that can be examined one record at a time or as a whole.
For example, a data set could be “Fall 2017 CoGAT Scores.” Each student’s scores can be looked at individually, but the entire set can also be looked at together to find trends, averages, and more. It’s always worth looking for bigger trends in your data when you can.
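To make the idea concrete, here is a minimal Python sketch of a data set that can be looked at one record at a time or as a whole. The student labels, the scores, and the helper names `score_for` and `average_score` are all made up for illustration; they are not real data or part of any product.

```python
from statistics import mean

# Hypothetical data set: labels and scores are invented for this example.
fall_2017_scores = [
    {"student": "Student A", "score": 102},
    {"student": "Student B", "score": 95},
    {"student": "Student C", "score": 118},
    {"student": "Student D", "score": 109},
]

def score_for(data_set, student):
    """Look at a single student's record within the set."""
    for row in data_set:
        if row["student"] == student:
            return row["score"]
    return None  # student not found in this data set

def average_score(data_set):
    """Look at the set as a whole by averaging every score."""
    return mean(row["score"] for row in data_set)

print(score_for(fall_2017_scores, "Student B"))  # 95
print(average_score(fall_2017_scores))           # 106
```

The same list answers both kinds of question: one student’s result, or a summary of the whole group.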
Intermediate – “Personally Identifiable Information”
When working with data about people (in our case, mostly students), it is very important to keep in mind which data can identify someone. “Personally Identifiable Information” refers to any information that could be linked back to a student to find that student’s identity. There are two types of personally identifiable information: direct identifiers and indirect identifiers.
Direct identifiers are very straightforward. They include things like a student’s name or ID number. Indirect identifiers are a little more complicated. They are things that may not identify a student on their own but could be combined with other information to figure out a student’s identity. For instance, address, birthday, and race could be combined to figure out who a student is.
For example, an address alone doesn’t provide enough information in the case of a family with more than one student or a family living in an apartment building. However, combined with the student’s date of birth, the address could be enough to identify one student.
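The address and date-of-birth example can be checked with a few lines of Python. This is only a sketch with invented records, and the function name `count_uniquely_identified` is hypothetical. It counts how many records are the only one with a given combination of values.

```python
from collections import Counter

# Hypothetical records: two siblings share one address, and two
# unrelated students live in the same apartment building.
students = [
    {"address": "12 Oak St", "birth_date": "2008-03-14"},
    {"address": "12 Oak St", "birth_date": "2010-07-02"},
    {"address": "40 Elm Ave", "birth_date": "2008-03-14"},
    {"address": "40 Elm Ave", "birth_date": "2009-11-23"},
]

def count_uniquely_identified(records, fields):
    """Count records whose combination of `fields` appears exactly once."""
    combos = Counter(tuple(row[f] for f in fields) for row in records)
    return sum(
        1 for row in records
        if combos[tuple(row[f] for f in fields)] == 1
    )

print(count_uniquely_identified(students, ["address"]))                # 0
print(count_uniquely_identified(students, ["birth_date"]))             # 2
print(count_uniquely_identified(students, ["address", "birth_date"]))  # 4
```

In this made-up set, address alone singles out nobody, but address plus date of birth singles out every student, which is why indirect identifiers need protecting too.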
Advanced – “Masking”
Data security and privacy are very important to us at Learnmetrics. “Masking” describes how original values in a set of data can be changed, or “masked,” to protect individual information. In our case, one use is product demos. We have the ability to mask all personally identifiable information about students so that we can show people the capabilities of our product using actual, live data. Masking can also be used in public reports to, again, protect the identities of students.
For example, when we show demos of our product, we use “masked” data. Each school and student is represented by a unique number in our database. That number cannot be traced back to the student without knowledge of our database.
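As a rough illustration of the idea, and not a description of how Learnmetrics actually implements it, the Python sketch below swaps each student’s real ID for a random unique number and drops the name. The field names (`student_id`, `name`, `masked_id`), the six-digit number range, and the function `mask_records` are all assumptions made for this example.

```python
import secrets

def mask_records(records, fields_to_drop=("name",)):
    """Return masked copies of `records` plus a private lookup table.

    Assumes every record has a "student_id" key (a hypothetical field name).
    """
    lookup = {}   # real ID -> masked number; must be kept private
    used = set()  # masked numbers already handed out
    masked = []
    for row in records:
        real_id = row["student_id"]
        if real_id not in lookup:
            # Draw random six-digit numbers until we find an unused one.
            number = secrets.randbelow(900_000) + 100_000
            while number in used:
                number = secrets.randbelow(900_000) + 100_000
            used.add(number)
            lookup[real_id] = number
        # Copy everything except the real ID and the dropped fields.
        safe_row = {
            key: value for key, value in row.items()
            if key != "student_id" and key not in fields_to_drop
        }
        safe_row["masked_id"] = lookup[real_id]
        masked.append(safe_row)
    return masked, lookup

# Hypothetical records, invented for this example.
records = [
    {"student_id": "S-001", "name": "Student A", "score": 102},
    {"student_id": "S-002", "name": "Student B", "score": 95},
    {"student_id": "S-001", "name": "Student A", "score": 110},
]
masked, lookup = mask_records(records)
print(masked)
```

The scores survive, so the masked data is still useful for a demo or report, while tracing a masked number back to a student requires the lookup table. Note that this sketch only removes direct identifiers; indirect identifiers such as address or date of birth would need the same treatment.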
We’ve covered other data terms in the past, such as in our blog posts on interoperability (BLOG #1, BLOG #2). We write posts like this to help educators become more familiar with the large vocabulary surrounding data. As always, if you have any questions or want to discuss other data terms, feel free to reach out!