An April 2018 public deliberation event in Vancouver, British Columbia, brought together 23 members of the public to elicit informed and civic-minded advice regarding the sharing and use of linked data for research. Over the course of two weekends, participants learned about how integrated data is leveraged for research, deliberated on common issues in the field, and developed 19 policy-relevant statements that “provide a broad view of public support and concerns regarding the use of linked data sets for research and offer guidance on measures that can be taken to improve the trustworthiness of policies around data sharing.” Overall, participants expressed support for data integration, recognizing the many societal benefits. They also shared concerns about data privacy, security, and transparency, most of which participants believe can be ameliorated by systematizing processes and clarifying roles.
The participant statements can teach us a lot about how best to develop the “social license” and public trust required to effectively and ethically share data and conduct research of this kind. The event can also serve as a blueprint for future community-engagement efforts and a guide for localities using integrated data systems (IDS) to conduct research and improve social policy.
Individual attendees represented a variety of backgrounds and were not required to have prior knowledge about data or research in order to participate. To create a baseline of understanding among the group, participants were provided educational materials prior to the start of the event that described linked datasets (e.g. what they are, how they are developed), data sharing regulations, and current questions regarding data use.
On the first day, participants heard from five speakers representing particular perspectives on linked data in order to further foster their understanding and knowledge on the topic. Speakers were available for questions, but did not participate in the deliberation process. Perspectives included:
A data steward
A privacy advocate
An Indigenous community member
In preparation for the event, organizers explicitly asked the five presenters to state the reasoning behind any viewpoints and positions they expressed, rather than attempting to present them as “neutral.” The purpose of this framing was to “make potentially competing interests explicit to participants of the public deliberation, so that these tensions could be discussed and worked through in the deliberative process” (p. 3).
The hosts—a group composed of researchers and four current or previous graduate students experienced in facilitation—established a set of three core questions and one scenario prior to the event to serve as a guide for the deliberation process. They included:
What is important information to consider when approving access to and use of linked data?
When is it justified to grant access to linked data, and what measures are important to reduce risks?
Working with scenarios: applying previous discussions to work out trade-offs and recommendations
What processes would make the assessments of risks and benefits from the use of linked data trustworthy?
Each question or exercise was constructed to foster critical thinking about typical concerns around data sharing in research in order to elicit informed responses from the public regarding the use and the sharing of linked data for research. Questions, comments, concerns, and reactions from contributors were recorded.
To facilitate deliberation and conversation among participants, the attendees met in four small groups of 6-8 individuals, as well as in a large group that included all participants. As the groups and ideas about data sharing began to converge, facilitators helped participants formulate preliminary statements that would then be edited, honed, and voted upon by all participants (the results of this activity are included in the following section).
Researchers grouped the 19 policy-relevant statements developed by the group into four major themes that are generalizable and relevant to communities engaged in data integration.
Themes and corresponding summaries are reproduced below:
The governance of linked datasets
Security and review process for releasing linked data sets
Participants expressed a need for secure datasets as well as robust assessments of the scientific merit of research proposals.
Participants also raised concerns about risks to the research study population, regardless of intent.
Disagreement arose as to whether sufficient ethics review requirements were already in place or if new requirements or resources were needed.
The responsibilities of data stewards and researchers
Involvement of the public
Participants expressed both concern about and need for greater public transparency in terms of what data are shared, with whom, for what reasons, and when. As a result, they recommended creating a website that lists information on data access requests, including the rationale behind each approval or denial. Additionally, some group members found the term “transparency” to be too vague, which made it difficult for them to identify useful steps to resolve issues discussed.
As these statements demonstrate, participants have nuanced and thoughtful ideas about integrating data for research. Much can be gleaned from both points of consensus and the disagreements evidenced among deliberators.
Standardization of data access processes and guidelines, as well as of training, particularly for data stewards
Future deliberations will address how private enterprises, particularly corporations, should be involved in research
The authors of the study are developing plans for future deliberations on the growing influence of private corporations and institutions in data sharing for research, a topic that was raised during the April 2018 deliberation process. The authors suggest that the statements and recommendations that did not reach full consensus offer opportunities for deeper investigation. One such area for further consideration is the “meaning and enactment of” transparency, an idea that is central to ensuring public access and awareness but is often difficult to define.
Engaging a full spectrum of stakeholders is necessary to ensure the highest ethical standards of data use, to identify barriers to implementation and successful operations, and to promote sustainability in future projects. The validity of a project, program, or even system (such as an IDS) increases when communities with conflicting perspectives are brought together to gather insight from various sources and, ultimately, to find consensus; doing so can promote the sustainability of data sharing practice over time. As such, the work in Vancouver offers an excellent example of actively engaging people with diverse thoughts, opinions, and perspectives to strengthen the use of linked data now and in the future.
For tips and tools for public engagement around data sharing and privacy, check out Actionable Intelligence for Social Policy's “Nothing to Hide” Toolkit here.
Access the full report here.
The research was published in the International Journal of Population Data Science.
Corresponding author: Jack Teng from Population Data BC. Contact at firstname.lastname@example.org.
Citation: Teng, J., Bentley, C., Burgess, M.M., O'Doherty, K.C., & McGrail, K.M. (2019). Sharing linked data sets for research: Results from a deliberative public engagement event in British Columbia, Canada. International Journal of Population Data Science, 4(1). doi: https://doi.org/10.23889/ijpds.v4i1.1103
Statements and recommendations
(* = full consensus. A full explanation of the votes can be found on pages 6-8 of the full report.)
Develop a plan to make the data linkage approval process more efficient, without compromising the evaluation process.
It is important to invest in a collection of linked data sets to promote efficient research while enhancing privacy protection.
Policy makers should establish categories that identify requests that require different paths or speed for review, e.g., fast-track for urgent research priorities.
If a commercial entity funds research with linked data, it should not be involved in the production and review of that research.
There should be a committee or governing body with authority to:
Provide oversight and investigation for breaches and/or harms
Apply penalties or other consequences
Develop policies to mitigate the potential for future breaches and/or harms
Intervene when data stewards disagree
Develop and operate an appeals process
Provide certification for data stewards
Scientific review of the research proposal should be performed by an independent party.*
There should be best practices and guidelines for secure storage and access to linked data.*
Results and publications of linked data research must be reviewed to ensure that they are justified by the analysis of the data.
The proposed research and data access should be reviewed by an independent ethics committee to ensure benefits outweigh potential harms (e.g., potential for re-identification, stigma).
Research results should be reviewed by a qualified independent party to affirm the original purpose of the research.
An independent party should assess requests for data to be sure that the data are necessary to conduct the research.
Data stewards should have standard training or certification to ensure appropriate expertise for their role.
Data stewards should have standard policies and procedures to guide their work and there should be a certifying body to maintain them.
Research using linked data must be monitored by data stewards to ensure data are used in accordance with the original request.
Researchers have some responsibility to vulnerable populations they study or identify as vulnerable in their research.*
Anyone seeking access to linked data must sign a standardized contract outlining confidentiality requirements and further dissemination of data.
A data security certificate program should be established, and it should be mandatory for people who are using linked data.
There should be public disclosure (e.g. on a website) of requests for access to data. This should include approvals, denials, and reasons for those decisions.
Transparency and disclosure of research requests is sufficient as a form of public consultation.