Introduction to the Data Lifecycle

Overview

This guide will help you identify the actions taken at different stages of the data lifecycle, building from the foundations of data management, data curation, and data literacy.

Learning Objectives

  1. Identify the different stages of the data lifecycle.
  2. Distinguish between the foundational concepts of data management, the data-to-knowledge cycle, data curation, and data literacy.

Data lifecycle models

A data lifecycle illustrates how data (in all its various forms and derivatives, including data points, datasets, databases, data files, visualizations, and code) conceptually flows through its lifecycle of usefulness. While data lifecycles are helpful frameworks for discussing the appropriate actions at different stages, it’s important to remember that for most data the path is not linear, and some actions may not occur at all.

Here are two examples of data lifecycle models:

Example 1: U.S. Fish and Wildlife Service "Data Management Life Cycle"

Figure: “Data Management Life Cycle” by the U.S. Fish and Wildlife Service, published at https://www.fws.gov/data/life-cycle

This data lifecycle by the U.S. Fish and Wildlife Service places Quality Assurance (QA) and Quality Control (QC) at the center of data management. Components of this lifecycle include:

  • Plan
  • Acquire
  • Maintain
  • Access
  • Evaluate
  • Archive

Example 2: "Biomedical Data Lifecycle" by Harvard University Library

Figure: “Biomedical Data Lifecycle” by Harvard Longwood Medical School LMA Research Data Management Working Group. License: CC-BY-NC 4.0.

This Biomedical Data Lifecycle example by the Harvard Longwood Medical School LMA Research Data Management Working Group adds a layer over the base components, giving more context to the actions typically taken at each stage.

At the center of the lifecycle, and continuing throughout it, are the ongoing actions of storage and management, including:

  • Data safety
  • Data security
  • Storage Options

 

Data lifecycle stages

Data management best practices span the entire data lifecycle, from project start to end, along with all the governance, rules, laws, and regulations that might apply. Our training includes the following stages: Plan, Create, Manage, Use, Share, Collect/Reuse, and Destroy.

Figure: “Data Lifecycle model” by University of Wisconsin Data Governance Program, showing the stages Plan, Collect, Use, Share, and Project End. Updated Aug 9, 2022. License: CC-BY-NC 4.0.

Planning Stage

Before collecting or acquiring data, plan for how the data will be managed throughout the data lifecycle. An actionable data management plan should consider data governance roles and responsibilities such as who can make decisions about data access, use, and retention. It is also necessary to consider how any laws, rules, and regulations may apply to the data and who will be accountable.

Learn more about Data Planning


Figure: “Data Management in the Data Lifecycle” by University of Wisconsin Data Governance Program, showing the actions store, assure, secure, and monitor. Updated Sep 7, 2022. License: CC-BY-NC 4.0.

Manage Data

From data creation to destruction, data management actions include storing data, assuring its quality and integrity, securing it, and monitoring how long to retain it. At this stage, information technology experts, including data architects, data modelers, and risk managers, play a critical role in designing and developing the appropriate infrastructure for data management. Finally, data archives, records retention, and digital preservation best practices play an important role in deciding how long to keep the data by weighing the requirements of laws, local policy, and anticipated usefulness.
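One common way to monitor data integrity over time is a checksum (sometimes called a fixity check): record a digest when the file is stored, then recompute it later to confirm nothing has changed. The sketch below is illustrative only. The file name, its contents, and the helper names `sha256_of` and `is_unchanged` are hypothetical, and it is not a description of any particular institution's tooling.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unchanged(path, recorded_checksum):
    """Return True if the file still matches the checksum recorded earlier."""
    return sha256_of(path) == recorded_checksum


# Demonstration with a throwaway file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as folder:
    data_file = Path(folder) / "observations.csv"
    data_file.write_text("site,count\nA,12\nB,7\n")

    recorded = sha256_of(data_file)          # stored alongside the data
    unchanged_before = is_unchanged(data_file, recorded)

    data_file.write_text("site,count\nA,12\nB,70\n")  # simulate an edit
    unchanged_after = is_unchanged(data_file, recorded)

print(unchanged_before, unchanged_after)
```

A mismatch only shows that the file changed; deciding whether the change was authorized still depends on documentation and governance roles.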

Learn more about Data Management


Figure: “Data Literacy in the Data Lifecycle” by University of Wisconsin Data Governance Program. Updated Sep 7, 2022. License: CC-BY-NC 4.0.

Use Data

In the data use phase, data literacy skills help us organize, transform, analyze, and interpret data to convey meaningful information. Documentation, data pipelines, and reproducible workflows support the data-to-knowledge cycle: they help future users of your data understand the changes and transformations made in your analysis, which improves transparency and increases trust.
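As a minimal sketch of what a documented, reproducible transformation might look like, the example below applies two steps to a small dataset and records each one in a log. The sample data, the step descriptions, and the helper names `fingerprint` and `apply_step` are all hypothetical and chosen only for illustration; real pipelines typically rely on dedicated workflow tools.

```python
import hashlib
import json


def fingerprint(rows):
    """Return a short SHA-256 digest of the rows so outputs can be compared later."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def apply_step(rows, log, description, func):
    """Apply one transformation and append a record of it to the log."""
    result = func(rows)
    log.append({
        "step": len(log) + 1,
        "description": description,
        "rows_in": len(rows),
        "rows_out": len(result),
        "output_fingerprint": fingerprint(result),
    })
    return result


# Hypothetical sample data; the raw input is never modified in place.
raw = [
    {"site": "A", "temp_f": 68.0},
    {"site": "B", "temp_f": None},
    {"site": "C", "temp_f": 77.0},
]

log = []
cleaned = apply_step(
    raw, log, "Drop rows with missing temperature",
    lambda rows: [r for r in rows if r["temp_f"] is not None],
)
converted = apply_step(
    cleaned, log, "Convert Fahrenheit to Celsius",
    lambda rows: [
        {"site": r["site"], "temp_c": round((r["temp_f"] - 32) * 5 / 9, 1)}
        for r in rows
    ],
)

# The log is the documentation a future user would read.
for entry in log:
    print(entry)
```

Keeping the raw data untouched and logging each step lets a later user see that one row was dropped and why, rather than discovering an unexplained difference in row counts.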

Learn more about Using Data Ethically


Figure: “Data Sharing in the Data Lifecycle” by University of Wisconsin Data Governance Program, showing that data sharing involves data curation, data preparation, selection, and contextualization. Updated Sep 7, 2022. License: CC-BY-NC 4.0.

Share Data

The goals of data sharing include facilitating data reuse, replicability, validation, and transparency. The data sharing stage involves the data curation techniques of preparation, selection, and contextualization to aid in effective and appropriate data reuse. Modes of transmission and authorization for access will vary. When long-term access is desirable, the responsibilities of data sharing may be transferred to a trusted data repository to aid in preservation and access over time.


Figure: “Data Reuse in the Data Lifecycle” by University of Wisconsin Data Governance Program, showing that finding and reusing data involves the data literacy techniques of find, evaluate, and access. Updated Sep 7, 2022. License: CC-BY-NC 4.0.

Find/Reuse Stage

Data reuse involves data literacy skills for finding, evaluating, and understanding data, and for agreeing to any necessary conditions of access. A good understanding of the data’s purpose, history, and lineage is essential to reusing data appropriately and effectively.

Learn more about Finding/Reusing Data

 


Project Close Out Stage

Any data collected is kept according to predefined retention schedules. Keep only what is essential, and retain or archive what is required by law and what might be needed for future use.
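To make the idea of a predefined retention schedule concrete, here is a small sketch that decides what to do with a record category at project close. The categories, the number of years, and the function names are hypothetical placeholders; actual retention periods come from applicable laws and institutional policy, not from this example.

```python
from datetime import date

# Hypothetical schedule: years to keep each category after project close.
RETENTION_YEARS = {
    "consent_forms": 7,
    "raw_survey_data": 3,
    "working_copies": 0,
}


def add_years(d, years):
    """Return the same calendar day `years` later, handling Feb 29."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Feb 29 does not exist in the target year; fall back to Feb 28.
        return d.replace(year=d.year + years, day=28)


def retention_action(category, closed_on, today, legal_hold=False):
    """Return 'retain', 'eligible for disposal', or 'review' for a category."""
    if legal_hold:
        return "retain"      # a legal requirement overrides the schedule
    if category not in RETENTION_YEARS:
        return "review"      # no rule defined: a person must decide
    keep_until = add_years(closed_on, RETENTION_YEARS[category])
    return "retain" if today < keep_until else "eligible for disposal"


closed = date(2020, 6, 30)
today = date(2024, 1, 1)
for category in ["consent_forms", "raw_survey_data", "working_copies", "field_notes"]:
    print(category, "->", retention_action(category, closed, today))
```

Returning "review" for an unlisted category is a deliberate choice in this sketch: data without a defined rule is flagged for a person to decide instead of being disposed of by default.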