As a public health researcher, I love data, the more the better. I held this belief until I found that I myself had become the “subject” of research without my consent. This experience made me rethink ethical research.

The more data, the better?

In 2017, I encountered a state-level bill that required all the government agencies in Massachusetts to collect “disaggregated Asian data”, which is a person’s national or ancestral origin. I am from China, thus my “disaggregated race” is “Chinese”. The stated purpose of such data collection was to identify the underserved communities among the broader “Asian” community and provide services accordingly. Yet, bills like this faced fierce opposition from Asian immigrants. The Massachusetts bill failed to pass in three successive legislative cycles.

The research community in general favors the concept of disaggregated race data. Indeed, the broad Asian category encompasses people who hail from a continent with more than half of the world’s population. Disaggregated race seems more meaningful. In 2014, I myself gave a presentation to my classmates in support of breaking down the category of “Asian” into smaller subgroups. I had no concerns about using this granular data to do more research with the intention of helping Asian Americans.

Aside from being a researcher, I am also an immigrant. With the toughening of US immigration policy and the deterioration of the US-China relationship, my Chinese immigrant background began to play an increasingly significant role in shaping my view of ethical research.

In 2018, FBI director Christopher Wray described Chinese students and scholars in nearly every discipline across the US as potential intelligence collectors for the Chinese government, a threat he said required a “whole-of-society” response from the US. When COVID-19 devastated America, the coronavirus became the “Chinese virus”. After 100 years, the “yellow peril” nightmare suddenly came back.

Overlooked risks

My instincts told me that collecting national origin data and reporting public health results by origin might be dangerous. The danger is easily overlooked, but it is particularly acute when the “subjects” belong to an ethnic group made up predominantly of immigrants. Former President Trump was reported to have said that he didn’t want immigrants from Haiti because “all of them have AIDS”. Besides being bigoted and hurtful, these words may have real consequences for Haitian immigrants.

The Department of Homeland Security (DHS) has vast discretion in deciding who is admissible to this country and who is not. One of the major decision criteria is public health concern. Under the USCIS Policy Manual (current as of March 23, 2022), Volume 8 – Admissibility, Part B – Health-Related Grounds of Inadmissibility, the following four basic medical conditions may make an applicant inadmissible on health-related grounds: communicable disease of public health significance, an immigrant’s failure to show proof of required vaccinations, physical or mental disorder with associated harmful behavior, and drug abuse or addiction.


Imagine, for example, that the government produces disease prevalence reports by national origin, and DHS has every Asian subgroup’s prevalence for the four categories of inadmissible medical conditions. The data could be sorted to identify the subgroups with a high prevalence of each condition. DHS could then issue a directive to the physicians who conduct immigration medical exams, asking them to screen applicants from these subgroups more strictly.

Many of these “inadmissible” medical conditions are behavioral health conditions, which are difficult to evaluate in a single exam. Is a person who was diagnosed with major depression and attempted suicide in the past, but has been recovered for years, admissible? What about a person who had a psychotic break, has taken medication ever since, and is functioning well? What about a person who formerly abused alcohol but has stopped doing so? When facing political pressure to “take action”, DHS could reject all these “borderline” applicants. These immigrants may not pose any excessive burden on American public health. They may not have set foot on American soil yet, but their rights deserve to be protected as well. At the very least, public health researchers should not impose more challenges on their lives.

Power and privilege held by researchers

I thought my public health colleagues could easily understand the fear of racial profiling by the US government. I hoped so especially after former President Trump insisted on calling COVID-19 the “China virus.” Yet, the proponents of data disaggregation, especially those of Asian descent, persisted. One article tried to summarize the potential dangers and concerns of data disaggregation voiced by the opposing side. The authors cited various injustices inflicted by the US government on racial minorities, including the Japanese internment camps, the very situation many Chinese immigrants fear given the poor US-China relationship. Yet, the authors concluded that the opponents’ fear is misplaced.

This attitude surprises me. I wonder where it comes from. It feels very “academic”. Perhaps it comes from the power and privilege researchers hold. In the public health research community, people examine the power and privileges brought by race, gender, and other immutable characteristics. Yet, the power that comes with being a public health researcher can itself be collectively forgotten.

In this profession, we breathe data. We are under pressure to obtain more data to publish or apply for funding. As researchers, we talk about abstract concepts and lofty missions. We educate politicians as a way to shape the landscape of health policy. The social status brought by our profession shelters us from many hardships, thus we might be insensitive to many perilous situations other people face. Many of us automatically assume more national origin data means better healthcare, rather than harm. Yet, outside the ivory tower, this assumption could be wrong.

Ethics training

Every public health researcher has gone through research ethics training. The foundation of biomedical and behavioral research is to protect the people whom researchers want to “study”. The participants of a study must be willing participants, fully informed of the risks and benefits. They are told how their data will be used and must not face coercive power when deciding whether to provide any information. To proceed with data collection, participants usually need to sign an informed consent form to give the researcher permission. In the simplest terms, researchers should never put the people they want to research in harm’s way. In my view, requiring government agencies to collect national origin information and report results by national origin hardly fits into the picture of ethical research.

Questions to public health researchers

To conduct ethical research, we need to ask ourselves a few questions, such as:

  • Government collects administrative data solely for administrative purposes. Is it ethical to ask government to collect additional data for research purposes?
  • What type of research topics by nature impose excessive risk on vulnerable populations?
  • Is it beyond dispute that certain data is critical and irreplaceable for research, such as national origin data for health equity research?
  • Is it possible that we are conducting research without consent when we use government administrative data?
  • Are we responsible for the psychological pressure and other harm imposed on these “participants” in the data collection process? (I use quotation marks around the word participants because the persons being studied never gave their consent.)
  • How could we know if we’re causing harm when the data collection process does not typically require researcher involvement?
  • Do the current research ethics standards cover requesting new data collection through government agencies?
  • Is asking the government agencies to collect data for research a convenient way to bypass Institutional Review Board (IRB) review and the lengthy and costly data collection process?

Ultimately, we need to ask this question: Is making data collection “easier” serving the communities that need quality healthcare, or is it actually serving our own careers?

A call for discussion about ethical research

We are in an era of big data. The race data disaggregation case is one example that showcases the potential harms of asking the government to collect sensitive information for public health research. By no means is this article advocating against research studies and data collection that involve government administrative data. However, I want to open a discussion about the ethical and equity issues of using government administrative data without informed consent from “participants”. I alone cannot provide a definition of what is ethical and what is not in this new era, but this discussion needs to happen. And it is long overdue.

Ye Zhang Pogue

Ye Zhang Pogue, Ph.D., is a research public health analyst at RTI International. She is an expert in health service research related to mental illnesses and other behavioral health conditions. At RTI, she contributes to teams in a variety of health care research studies, encompassing such areas as secondary data analyses involving Medicare claims, evaluation of health policy initiatives, and quality measurement and outcomes.

