Flying Blind
The Australian Health Data Series

Flying Blind is a series of three reports dedicated to uncovering the acute levels of data fragmentation existing at all levels of Australia’s health landscape.


The Ethics Quagmire: Case Studies

In Flying Blind 2, we have been highlighting the tortuous journey researchers face as they negotiate the ethics processes and the myriad data sources required for their research. In the next few blogs, Australian health and medical researchers who have been through that journey present real-life case studies and back-of-the-envelope calculations of what it takes to identify existing data sets, negotiate the ethics processes and link the data sets to support their research.

What is sad for Australian health research is that these numbers do not reflect researchers' time spent actually performing research!

We hope the case studies will shine a light on the complexities and the lack of efficiency and transparency around tapping into de-identified, pre-existing administrative data sets from multiple state and federal health data sources.

Case Study 1
The Real Cost of Accessing Linked Health Data Study

Guest post by Kathy Tannous

Overview

The value of linking health data for research purposes is well documented. However, applying for linked health data for research is a time-consuming and demanding exercise. The researcher must be passionate, determined and willing to sacrifice a significant amount of their paid and unpaid time to negotiate it. This is a case study of a teaching academic who began this journey. Having identified a clear and important need, the researcher’s idea was to use linked health data to measure the cost of a specific incident.

The process of just obtaining access to the data, not the actual analysis of that data, took approximately two years. The researcher spent two days per week during this time on:

  • identifying which data are available
  • writing justifications for access to them
  • writing and revising the research protocol and ethics application
  • obtaining data custodians’ approvals
  • submitting grant applications to fund the cost of data linkage and hosting.

That’s 1,456 hours. If a researcher’s base rate is $70 per hour, with $30 per hour of overheads, the labour cost of this process alone amounts to $145,600. This is all before they have actually obtained the data or done any analysis, and it doesn’t account for many additional cost factors.
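
The arithmetic behind these figures can be reproduced with a short sketch. The 7-hour working day and 52-week year used below are assumptions, not figures stated in the case study; they are chosen because together they reproduce the 1,456-hour total. All function and variable names are illustrative.

```python
# Back-of-the-envelope labour cost of the data access process.
# ASSUMPTION: a 7-hour working day and a 52-week year. Neither is stated
# in the case study, but together they reproduce the 1,456-hour figure.
HOURS_PER_DAY = 7
WEEKS_PER_YEAR = 52


def access_hours(years, days_per_week):
    """Total hours spent over the given period."""
    return years * WEEKS_PER_YEAR * days_per_week * HOURS_PER_DAY


def labour_cost(hours, base_rate, overhead_rate):
    """Labour cost in dollars: hours times (base rate plus overheads)."""
    return hours * (base_rate + overhead_rate)


hours = access_hours(years=2, days_per_week=2)
cost = labour_cost(hours, base_rate=70, overhead_rate=30)
print(f"{hours:,} hours, ${cost:,}")  # 1,456 hours, $145,600
```

Changing the hourly rates or the number of days per week shows how sensitive the total is to these inputs.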

So what’s going on?

The Start of the Journey – Getting Data Access Approval

First, the researcher identifies the data used to measure the frequency and effects of the incident of interest. Then a literature review is conducted, including international comparisons. Through this process, the current figures about the number of incidents and their impacts are identified as being systematically understated. This is a major issue because these figures inform decisions around policy and staffing. Now there is a demonstrated rationale for the study and a research proposal can be developed. This is then subject to approval. In this particular case, the major challenges to approval are study design and ethical considerations.

The research proposal had to include detailed information on:

  • participating sites
  • study design
    • type of study
    • data sources – this subsection involved itemised listing and detailing of:
      • principal diagnosis data on hospital admissions based on:
        • International Classification of Diseases, Tenth Revision, Australian Modification (ICD-10-AM) [1]
        • ICD, Ninth Revision, Clinical Modification (ICD-9-CM) [2]
        • Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT) [3]
      • ambulance data including:
        • Patient Health Care Record (PHCR) [4]
        • electronic Medical Record (eMR) [5]
    • population size
    • flow charts of required data linkage processes
    • statistical analysis to be conducted
  • ethical considerations
  • outcomes and significance
  • timelines/milestones
  • publication policy

Within the data sources section, the researcher must describe each individual dataset item and justify its need. For data collected by external agencies, the researcher must develop a data dictionary listing the variables and their justifications, similar to those developed by the Centre for Health Record Linkage (CHeReL) for its Master Linkage Key (MLK). In this case, listing the codes and negotiating access with the different data custodians faced so many obstacles that it is estimated to have delayed the project by a full year.

Next Step – Getting Ethics Approval

Another aspect of the approval process that took a significant amount of time to resolve was the ethical considerations. In this project, the privacy and confidentiality of the study participants are protected through the use of CHeReL. CHeReL links records within and between its Master Linkage Key (MLK) and other datasets [6], generating a unique project-specific person number (PPN). The data that is provided to the researcher contains only the PPN and the associated health and incident extracts.

At first, the researcher proposed that this data would be stored and used on a password-protected secure server located at their university, with only the named researcher having access to the server. The data custodians rejected this as insufficient to ensure safe storage of, and access to, the data. Instead, the Secure Unified Research Environment (SURE), a remote-access data laboratory hosted by the Sax Institute [7], was deemed acceptable for hosting the data. According to CHeReL requirements, this agency must also accept the research protocol.

This also introduces additional costs. CHeReL charges a fee for record linkage [8]; as a guide, the fee for linking an external dataset of 10,000 records is approximately $7,000 [9]. In addition, users of SURE are required to complete training, and access charges apply. As of 1 January 2017, exclusive of GST, the project fees are $6,564 (covering project establishment, annual storage and operations, and an archiving and storage fee), plus a standard user fee of $2,426 per annum [10]. Again, this is still just the phase of the project concerned with getting approval to access the data and data custodian sign-off, which is required before the project can be submitted for ethics approval.
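
The direct fees quoted here can be tallied in the same back-of-the-envelope way. This is only a sketch: it assumes a single standard user and a single year of SURE access, and it treats the $7,000 guide figure for a 10,000-record dataset as the actual linkage fee. None of these assumptions is confirmed for this project, and the split of the $6,564 project fee into one-off and annual parts is not given, so it is counted once.

```python
# Guide figures taken from the text. The SURE fees are as of 1 January 2017
# and exclude GST; the linkage fee is only a guide for a 10,000-record dataset.
LINKAGE_FEE_GUIDE = 7_000
SURE_PROJECT_FEES = 6_564        # counted once (ASSUMPTION: split not given)
SURE_USER_FEE_PER_YEAR = 2_426   # per standard user, per annum


def direct_fees(users=1, years=1):
    """Rough direct outlay in dollars under the assumptions stated above."""
    user_fees = SURE_USER_FEE_PER_YEAR * users * years
    return LINKAGE_FEE_GUIDE + SURE_PROJECT_FEES + user_fees


print(f"${direct_fees():,}")  # $15,990
```

Even under these conservative assumptions, the direct outlay is small next to the labour cost of the researcher’s own time.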

The next step for the study was to apply to the state’s population health research ethics committee. The requirement is to complete a National Ethics Application Form (NEAF) and attach the detailed research protocol, data linkage flow chart, and variable list, including data custodian sign-off [11]. The submission to the ethics committee involved repeated provision and rejection of documentation over six months. Each change required by the ethics committee involved the added challenge of liaising with CHeReL and the various data custodians (when applicable) to ensure their acceptance.

What now?

Despite the fact that two years have passed, the researcher is only up to the point of obtaining approval to access the data and receiving ethics clearance. The actual work of analysing the data has not even begun. There were direct cost outlays, for SURE training for example, but more important were the significant indirect costs of the researcher’s time. Not even considered here are the significant time and costs of everyone else in contact with this project, such as the data custodians, CHeReL personnel, members of the ethics committee and supporting staff. Nor does this take into account the time and effort the researcher spent applying for funding from different sources to cover the outlay costs of data linkage and SURE and ensure the project can actually continue.

We know how valuable the insights gained from linked data can be. Right now, the real cost of actually achieving this is something that individual researchers are often bearing, and these costs are often much more significant than initially thought. Does this mean that we should stop linking data for health research? Of course not. It does mean that efforts to improve the situation should be an urgent priority for the sector. The time and expertise of health researchers are too valuable to be spent on this level of bureaucracy. Imagine what these 1,456 hours of that expertise could have already achieved.

 
