10 Jul 2025 | Blogs
The Central Question: Technical capability alone doesn't justify data use. State CDOs must consistently ask both "Can we?" and "Should we?" when considering data initiatives.
Build on Proven Foundations: The Fair Information Practice Principles (FIPPs) from the 1970s remain essential ethical guideposts, but need modern implementation approaches for today's complex data environment.
Apply the Five Safes Framework: Use this structured approach to evaluate data sharing decisions across five dimensions—People (who gets access), Projects (what purposes justify access), Data (what protection level is needed), Settings (where and how data is accessed), and Outputs (what can be released).
Prepare for AI Governance Now: AI amplifies existing data quality issues and creates new challenges. State governments need flexible frameworks that ensure high-quality data, effective governance principles, and enhanced technical capacity.
Embrace Dynamic Risk Management: Use tiered access approaches and coordination mechanisms that adapt to changing technology while maintaining public trust. High-value, high-risk use cases (like linking data for child welfare) require careful safeguards, not automatic rejection.
Lead Through Collaboration: State CDOs are uniquely positioned to demonstrate responsible data innovation. Share experiences, coordinate approaches across agencies, and maintain focus on public benefit.
The Bottom Line: Governments will continue collecting and using data in the years ahead at an escalating pace. Success depends on doing so in ways that build public trust while delivering real value to citizens.
On my drive back from presenting to Maryland's state Chief Data Officers in early July 2025, I passed a gate with a simple sign: "Push to Open." A perfect metaphor for the challenge I had just spent two hours discussing with the state’s key data stewards.
In the data world, we're constantly pushing gates open—accessing new datasets, linking information across agencies, and now applying AI to government services. Emerging technology makes it easier than ever to push these gates open. But as we do, we must be mindful of what lies on the other side: individual privacy, entity confidentiality, public trust, and the need to achieve transparency and protection at the same time.
This tension was a central theme of an Executive Education session for Chief Data Officers from Maryland's state departments and agencies. The session was led by Stefaan Verhulst and joined by Maryland CDO and Data Foundation Senior Fellow Natalie Evans Harris, and I presented alongside Lynn Overmann, Executive Director of the Beeck Center. Our discussion centered on a fundamental question that every state CDO must grapple with: "Just because we can push the gate open, should we?"
Before diving into modern frameworks, recall that we're not starting from scratch in answering this question. The Fair Information Practice Principles (FIPPs), established in the 1970s, remain a bedrock of responsible data governance and stewardship. These core principles—transparency, individual participation, purpose specification, data minimization, use limitation, data quality and integrity, security, and accountability—provide the ethical foundation for data activities and have shaped data policy implementation for decades.
However, the FIPPs were designed for simpler data environments more than 50 years ago—before many modern data laws, before the popular use of generative AI, and before the vast expansion of computational capability. Today's challenges—data linkage across agencies and with the private sector, responsible use of AI and machine learning applications, and open data mandates—require more nuanced approaches while maintaining these core principles.
Maryland's work in this arena is noteworthy. The state’s open data policy was a foundation for the federal government's open data directives in 2010, which ultimately led to the OPEN Government Data Act.
The Five Safes framework, originally developed in the UK for statistical activities, offers state CDOs and data stewards a relevant, structured approach for evaluating data sharing and access decisions. Each "safe" addresses a different dimension of risk to be managed to avoid harms to individuals and businesses.
During our session in Maryland, one attendee highlighted a powerful use case that exemplifies both the high value and high risk of linking and blending data: understanding the full public support ecosystem for a child with divorced parents. In the example, one parent may receive SNAP, SSI, or child welfare supports while the other pays child support as a noncustodial parent. To monitor and improve children's outcomes, caseworkers need to see the complete picture across data held by multiple departments—but this requires linking data across multiple agencies and benefit systems. Yet policymakers and program managers may not require this level of detailed knowledge to support families effectively.
This example helps illustrate why the Five Safes framework is relevant and useful for data stewards:
Safe People – Who should have access to your data? In our child support example, this might include caseworkers from multiple agencies, court personnel, and benefit administrators. Responsible use involves establishing clear criteria for both internal staff (through role-based access and clearance levels) and external partners (through researcher vetting and contractor requirements). Different people may also have different levels of access, seeing more or less detail based on their needs. The key question: Do you have standardized, transparent criteria for data access approval?
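To make the "Safe People" idea concrete, here is a minimal sketch of how role-based, tiered access criteria might be encoded in software. The role names, tiers, and approval checks are hypothetical illustrations for this post, not Maryland's actual policy or the Five Safes framework's official terminology.

```python
from dataclasses import dataclass
from enum import Enum

class AccessTier(Enum):
    """Hypothetical access tiers, from least to most detailed."""
    AGGREGATE_ONLY = 1   # de-identified, aggregate statistics
    CASE_SUMMARY = 2     # limited record-level fields for an active caseload
    FULL_RECORD = 3      # complete linked records across agencies

@dataclass
class AccessRequest:
    role: str                      # e.g., "caseworker", "program_manager"
    purpose: str                   # stated purpose for access (Safe Projects)
    training_complete: bool        # privacy/confidentiality training on file
    data_sharing_agreement: bool   # signed agreement covering this use

# Hypothetical mapping of roles to the maximum tier their duties justify.
ROLE_TIER_CEILING = {
    "caseworker": AccessTier.FULL_RECORD,
    "court_personnel": AccessTier.CASE_SUMMARY,
    "program_manager": AccessTier.AGGREGATE_ONLY,
    "external_researcher": AccessTier.AGGREGATE_ONLY,
}

def evaluate_request(req: AccessRequest, requested_tier: AccessTier) -> bool:
    """Apply standardized, transparent criteria to an access request."""
    ceiling = ROLE_TIER_CEILING.get(req.role)
    if ceiling is None:
        return False  # unknown roles are denied by default
    if requested_tier.value > ceiling.value:
        return False  # request exceeds what the role's duties require
    # Safe People also means vetted people: training and agreements on file.
    return req.training_complete and req.data_sharing_agreement

# Example: a caseworker requesting full linked records for active cases.
request = AccessRequest("caseworker", "monitor child welfare outcomes", True, True)
print(evaluate_request(request, AccessTier.FULL_RECORD))  # True
```

The point of such a sketch is simply that access criteria written down as explicit rules can be published, audited, and applied consistently, which is what "standardized, transparent criteria" asks for.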
The emergence of AI and machine learning at scale has fundamentally changed the data landscape. As the Data Foundation emphasized in a recent report on “Data Policy for the Age of AI,” there is no "one-size-fits-all" approach. Policymakers need flexible frameworks to assess whether existing laws and policies adequately address AI's unique challenges, particularly amid the constellation of more than 3,000 privacy laws across the country.
At least three factors make AI particularly challenging for state CDOs:
Moving forward to build AI-ready data governance, there are several key components for data stewards to consider. First, amplify efforts to ensure high-quality data that is fit for purpose: apply appropriate data standards, establish knowable data revision processes, publish rich metadata with assessments of data sensitivity, and provide sufficient documentation to give context for AI training and use. Second, apply effective governance principles alongside enhanced privacy protections for AI training data—especially for open data available from the public sector—as well as transparency in AI decision-making processes. Finally, stewards can build the technical capacity for secure infrastructure for AI model training, ethical procurement standards for AI tools, and workforce development for AI literacy across government.
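One way to operationalize the "rich metadata with assessments of data sensitivity" component is a machine-readable metadata record published alongside each dataset. The fields, sensitivity labels, agency name, and URL below are hypothetical examples of what such a record might capture, not a prescribed standard.

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class DatasetMetadata:
    """Illustrative metadata record for an AI-ready government dataset."""
    title: str
    steward_agency: str
    sensitivity: str            # e.g., "public", "restricted", "confidential"
    data_standard: str          # the data standard the dataset conforms to, if any
    revision_process: str       # how and when corrections are published
    documentation_url: str      # context needed for responsible AI training and use
    known_limitations: list = field(default_factory=list)

record = DatasetMetadata(
    title="Linked benefits participation extract (illustrative)",
    steward_agency="Example Department of Human Services",
    sensitivity="confidential",
    data_standard="agency case-record schema v2 (hypothetical)",
    revision_process="monthly refresh; corrections logged in a public changelog",
    documentation_url="https://example.gov/data/benefits-extract/docs",
    known_limitations=["coverage gaps before 2018", "county-level geography only"],
)

# Publish the metadata as JSON so both people and AI pipelines can read it.
print(json.dumps(asdict(record), indent=2))
```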
In 2024, the National Academies' Committee on National Statistics published a policy-risk framework for weighing the value and the risk of harm when blending data from multiple sources. The framework can be applied to a range of purposes, including both statistical uses and administrative actions. The National Academies report concludes that when agencies combine multiple data sources, they create magnified disclosure risks. These "composition effects" mean that multiple data releases can accumulate disclosure risks in ways that weren't anticipated when each dataset was collected or released individually.
Initially designed for statistical disclosures, the National Academies framework emphasizes three essential elements that state governments can implement, whether the goal is generating aggregate insights or linking individual records:
In the framework, the child support example would be recognized as both high value and high risk. That does not preclude using the data, but it does require successfully applying approaches that manage risk responsibly and protect vulnerable populations.
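The framework's core intuition—that value and disclosure risk are weighed together, and that high risk triggers safeguards rather than automatic rejection—can be sketched as a simple decision matrix. The tiers and safeguard lists below are my own illustrative shorthand, not the National Academies' wording.

```python
# A minimal sketch of a value/risk decision matrix for blended-data use cases.
SAFEGUARDS = {
    ("high", "high"): [
        "restrict access to vetted staff with a documented need (Safe People)",
        "use a secure, audited environment for linkage (Safe Settings)",
        "review outputs for disclosure risk before release (Safe Outputs)",
    ],
    ("high", "low"): ["standard agreements and role-based access"],
    ("low", "high"): ["reconsider: disclosure risk may outweigh public benefit"],
    ("low", "low"): ["routine open-data or internal-use handling"],
}

def recommend(value: str, risk: str) -> list:
    """Return illustrative handling steps for a (value, risk) assessment."""
    return SAFEGUARDS[(value, risk)]

# The child-support linkage example: high value and high risk,
# so it proceeds only with layered safeguards, not automatic rejection.
for step in recommend("high", "high"):
    print("-", step)
```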
Perhaps the most important insight from our session with Maryland's CDOs in July 2025 was that technical capability alone does not justify data use. The question "Can we?" must always be paired with "Should we?"
Answering the "can we/should we" question requires weighing multiple factors: public benefit, privacy risks, community impact, and whether less invasive alternatives exist. The Fair Information Practice Principles provide an ethical standard. The Five Safes framework provides structure for these decisions, though it does not replace ethical considerations and community engagement. The National Academies policy-risk framework can also help identify appropriate management strategies when the answer to both questions is yes.
State CDOs are uniquely positioned to lead in responsible data innovation, sitting at the nexus of the American people and their governments. Importantly, state governments can be more agile while maintaining strong public accountability.
Addressing the data use paradox facing state governments – and all data stewards – highlights why CDOs will shape public trust in government for years to come. The Maryland CDOs who participated in our session are already demonstrating this leadership. By sharing experiences, coordinating approaches, and maintaining focus on public benefit, state governments can lead the way in responsible data innovation.
The question isn't whether public sector agencies will continue to collect and use data—they certainly will. The question is whether they'll do so in ways that build or maintain public trust while delivering real value to citizens.
NICK HART, Ph.D. is President and CEO of the Data Foundation. This article reflects themes from a presentation delivered on July 8, 2025, to Maryland state Chief Data Officers, co-presented with Lynn Overmann of the Beeck Center and led by Stefaan Verhulst. The session was co-sponsored by The Gov Lab, Open Data Policy Lab, and The Data Tank, with participation from Maryland CDO and Data Foundation Senior Fellow Natalie Evans Harris.