The life cycle of data
There's a popular image of data as a pure, all-knowing body of information. But data is a tool, and just like any other tool, it has phases of construction. After two years working in social impact research, I've learned that mobilizing data to its fullest capabilities requires diving into each one of these stages. In this blogpost, I take you through the key life stages of data and what questions we should be asking ourselves in each.
When we think of data, we often visualize densely populated spreadsheets and color-coded bar charts. But data starts way before then, and continues long after.
Concept operationalization
How do we decide what to measure?
Our biggest questions often aren’t measurable. There’s no pure metric for something like social impact. Think about it: how would you measure social impact without measuring something smaller, like the number of communities reached or the resulting change in individuals’ wellbeing? Instead of taking something like “social impact” head-on, we have to decide what we can measure that will accurately reflect our bigger question.
Breaking down big questions into measurable, smaller questions is a process called operationalization. We make big questions operable by deciding upon indicators, smaller questions that relate to the broader question.
More often than not, our overarching questions need more than one indicator to yield valuable insights (a social impact model that only looks at the number of communities reached doesn’t look at social impact; it looks at the number of communities reached). This is why high-quality impact models look at multiple metrics.
When considering data operationalization, ask yourself: What is the relationship between the measured topics and my overarching question? What other topics may reflect my overarching question?
Data collection
How do we measure?
Once we know what we want to measure, it’s time to go out and find information that reflects that topic. This information is our data, and this process of data collection comes with its own suite of choices.
Careful decision making during the collection process makes for high-quality data. Any data point is touched by at least one person–at the very least, it takes someone to write a data point down, not to mention the people it takes to design and conduct the studies that yield data or the systems that house it.
This means that every data point has at least one pivot point for potential bias. But while bias may be a threat, it can be controlled. Carefully considering the origins of data–and the motivations of the people involved–helps filter bias and increase information accuracy.
When considering data collection, ask yourself: Where did these numbers come from? Who touched the data between the reality it aims to reflect and the moment I first saw it?
Data analysis
How do we make sense of data?
Individual data points make up a data set. Once we have our data set, we then have to decide how to make sense of it. Do we take averages? Evaluate which inputs are most frequent? Look at how wide the data range is?
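To make those three options concrete, here is a minimal Python sketch. The two data sets are made up for illustration (imagine 1-to-5 survey scores from two hypothetical programs), and the summarize helper is my own naming, not a standard function. It only uses Python's built-in statistics module.

```python
from statistics import mean, multimode


def summarize(values):
    """Return three simple summaries of a list of numbers."""
    return {
        "mean": mean(values),                # the average
        "most_frequent": multimode(values),  # every value tied for most common
        "range": max(values) - min(values),  # how wide the data spreads
    }


# Hypothetical survey scores (1-5); these numbers are invented for illustration.
program_a = [3, 3, 3, 3, 3]
program_b = [1, 1, 3, 5, 5]

summary_a = summarize(program_a)
summary_b = summarize(program_b)

print(summary_a)
print(summary_b)
```

Both made-up programs have the same average score, but the most frequent values and the range tell very different stories: one group is uniformly lukewarm, the other is split between the extremes. Looking at the average alone would hide that.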
There are countless tools and methods for analysis, and each type highlights some level of insight at the cost of others (after all, not everything can be highlighted at once–that defeats the point). So how do we choose what to look at?
Strong analytic models employ multiple tools and methods to look at a data set from all angles. It’s like highlighting a paragraph in a book. If you go through and color all the nouns in blue, it’s easier to identify the nouns at a glance. But if you then go through and highlight the verbs in green, adjectives in purple, conjunctions in red–and so on until the whole paragraph is strategically shaded–you can pull the information apart at a glance.
Too many colors can get confusing. Because of this, good analysis models work to strike a balance, looking at data from multiple angles without getting too twisted around.
When considering data analysis, ask yourself: What insight does this analysis highlight? What does it obscure?
Drawing conclusions
What can data do for us?
After making sense of the data through analysis, we’re at the final step in data-backed decision making: the decision making. So we’ve color-coded a paragraph–what does that do for us?
Data is a unique tool because it’s multipurpose. As insight, data can support changes in the intensity and direction of the moves we make. As evidence, data can help us communicate our rationale for these moves, spurring conversation and fostering teamwork.
And as a child of inquiry, data leads us to new questions. It’s the rebirth stage of data’s life cycle. Our big question, broken down into smaller questions, leads us to bits of information that lead us to a set of information, which leads us to insight that leads us to… more questions.
If there are two central themes of data, they’re curiosity and connection-making.
When drawing conclusions, ask yourself: What do I still not know? How do I come to know it?