Can responsible research be done with hacked data?

EPFL College of Humanities (CDH) researcher Marcello Ienca has recently co-authored, with ETH Zurich bioethics professor Effy Vayena, a paper on the ethics of doing scientific research using hacked data. They have published a series of recommendations for researchers and policymakers as a Perspective piece in the journal Nature Machine Intelligence.

According to a press release by the journal, Ienca and Vayena “intend to stimulate a debate in the scientific community to clarify when – if at all – hacked data can be used in research, and under what conditions”.

Hacked data – which the authors define as that “obtained in an unauthorized manner through illicit access to a computer or computer network” – can, for example, be a valuable source of publicly available (and therefore legally usable) data for machine learning. However, using it also creates a pressing ethical dilemma.

“Research integrity is not simply reducible to lawfulness…the ethical acceptability of using datasets of illicit origin cannot simply be presumed, but requires ethical justification,” Ienca and Vayena argue.

In their peer-reviewed article, the authors use historical examples of scientific misconduct and current research ethics guidelines to consider how best to manage this conundrum. They conclude that by default, research using hacked data should be considered unethical, notably because it violates the informed consent and privacy of the people from whom the data were taken. However, they also argue that exceptions can be made if a formal research ethics assessment finds that the benefits of using such data for research outweigh the risks.

Ienca and Vayena go on to outline criteria for determining whether such exceptions should be made, and for carrying out such research ethically, including: assessing the risks and benefits of using hacked data in advance; demonstrating that the data are unique and therefore not available through any other source; keeping a record of how and where the data have been obtained; and preserving data subjects’ informed consent and privacy wherever possible.

In addition to protecting data subjects, Ienca emphasizes that his and Vayena’s guidelines are also intended to protect scientific integrity.

“As hacked data become increasingly available for research, the scientific community has a moral responsibility to discuss whether or under which conditions it is ethical to use those data,” he says. “Without adequate ethical reflection, there is a risk that the use of hacked data may undermine public trust in science and lower the standards of scientific integrity”.

The Nature Machine Intelligence article can be read with a subscription; a free e-PDF is also viewable online.

About Marcello Ienca

Marcello Ienca is a specialist in bioethics and technology ethics. He is also an expert advisor to the Bioethics Committee of the Council of Europe and the OECD Steering Committee on Neurotechnology, where he coordinates the Swiss implementation strategy. At the CDH, Ienca is principal investigator of the ERA-NET-funded multinational project “Hybrid Minds”, and leads a small research unit investigating the interaction between bioethics and AI. He also contributes to the teaching of ethics at EPFL and to the strengthening of possible synergies between CDH and other schools and universities.

References: Ienca, M., Vayena, E. Ethical requirements for responsible research with hacked data. Nat Mach Intell 3, 744–748 (2021).

This article was originally published on 30.09.21 by Celia Luterbacher, EPFL CDH.