Best Practices in Ethical Data Analysis - Program

The workshop will be held on the afternoon of Friday, February 17. No colloquium will be held that day.

The tentative schedule is as follows:

Rubab Arim and Evelyne Bougie, Statistics Canada (1:00-1:45)

Statistics Canada’s Disaggregated Data Action Plan: Best Practices and Analytical Guidelines

This presentation will provide an overview of Statistics Canada’s Disaggregated Data Action Plan with a particular focus on a recommended systematic approach to research. Best practices and guidelines under this approach will be discussed along with top challenges. Finally, recent publications and ongoing projects and training related to disaggregated data will be highlighted.

Irene Chen (1:45-2:30)

Beyond Bias Audits: Bringing Equity to the Machine Learning Pipeline

Advances in machine learning and the explosion of clinical data have demonstrated immense potential to fundamentally improve clinical care and deepen our understanding of human health. However, algorithms for medical interventions and scientific discovery in heterogeneous patient populations are particularly challenged by the complexities of healthcare data. Not only are clinical data noisy, missing, and irregularly sampled, but questions of equity and fairness also raise grave concerns and create additional computational challenges.

In this talk, I present two approaches for leveraging machine learning towards equitable healthcare. First, I examine how to address algorithmic bias in supervised learning for cost-based metrics of discrimination. By decomposing discrimination into bias, variance, and noise components, I propose tailored actions for estimating and reducing each term of the total discrimination. Second, I demonstrate how to address one specific health disparity through the early detection of intimate partner violence from clinical indicators. Using a time-based model with noisy labels, we can correct for biases in data measurement to learn more clinically useful subtypes and improve prediction. The talk concludes with a discussion of how to rethink the entire machine learning pipeline through an ethical lens in order to build algorithms that serve the entire patient population.

Break (2:30-2:45)

AJ Lowik (2:45-3:30)

Gender Equity in Research: Strategies for Improving Accuracy, Precision and Inclusion

During this talk, you will be introduced to the Centre for Gender & Sexual Health Equity's Research Equity Toolkit, entitled Gender & Sex in Methods and Measurement. This toolkit addresses two interconnected problems in research: the pervasive erasure of intersex, trans and Two-Spirit people, among others; and the often inadvertent misuse of gender and sex concepts. Dr. Lowik will share some strategies for improving accuracy, precision and inclusion in research with a particular eye towards gender diversity, from determining eligibility criteria, to thinking about sample size, to designing surveys and analyzing data.

Emma Pierson (3:30-4:15)

Using Machine Learning to Increase Equity in Healthcare and Public Health

Our society remains profoundly unequal. This talk discusses how data science and machine learning can be used to combat inequality in healthcare and public health by presenting several vignettes about policing, women's health, and cancer risk prediction.

Open floor discussion (4:15-4:30)