Algorithmic Fairness
Motivation: Algorithms influence our lives in many ways
Machine learning based systems have been used to automate complex decisions, for example:
Selecting job applicants
Recidivism prediction and predictive policing
Credit scoring and loans
Facial recognition
Search and recommendations
Machine Translation
and many other critical applications (involving humans)
Unfortunately it has been repeatedly shown that these systems are (often significantly) biased
Advanced Topics 26
Data Analytics and Machine Learning
Biased algorithms influence our lives in many ways
Selecting job applicants
XING ranks less qualified male candidates higher than more qualified
female candidates (Lahoti et al. 2018)
Recidivism prediction and predictive policing
COMPAS: labeled high-risk but did not reoffend (FP): 23.5% for white vs. 44.9% for black defendants;
labeled low-risk but did reoffend (FN): 47.7% for white vs. 28.0% for black defendants (ProPublica article)
Facial recognition
Commercial software has much lower accuracy on darker-skinned women (Buolamwini and Gebru, 2018)
Search and recommendations
Search queries for African-American names are more likely to return ads
suggestive of an arrest record (Sweeney, 2013)
Bias found in word embeddings
$\text{man} - \text{woman} \approx \text{surgeon} - \text{nurse}$ (Bolukbasi et al. 2016)
What causes the bias?
Tainted training data: Any ML system maintains (and amplifies) the existing bias in the data caused by human bias, e.g. hiring decisions made by a (biased) manager used as labels, historic and systematic biases in the data collection process, etc.
Skewed sample: Initial predictions influence future observations, e.g. regions with an initially high crime rate get more police attention (and thus record more crime in the future); this is a form of selection bias
Proxies: Even if we exclude legally protected features (e.g. race, gender, sexuality), other features may be highly correlated with them
Sample size disparity: Models tend to fit the larger groups first, possibly at the expense of accuracy on minority groups
Limited features: Features may be less informative or less reliably collected for minority groups
Why Fairness is Hard
How to define fairness?
How can we formulate it so it can be considered in ML systems?
Two distinct notions from the law (Barocas and Selbst, 2016):
Disparate treatment: decisions are (partly) based on the subject's
sensitive attribute
Disparate impact: decisions disproportionately hurt (or benefit) people with
certain sensitive attribute values
Currently, no consensus on the mathematical formulations of fairness
An illustrating example
We are a bank trying to decide fairly who should get a loan, i.e. to predict which people are likely to pay us back
We have two groups: Blue and Orange (the sensitive attribute). This is where discrimination could occur
Figure: Simulating loan thresholds, research.google.com/bigpicture
Definitions of Fairness
How can we test if our (loan repay) classifier is fair?
Notions of group fairness aim to treat all groups equally
e.g. we can require that the same percentage of Blue and Orange
applicants receive loans
or require equal false positive/negative rates, e.g.
$P(\text{no loan} \mid \text{would repay}, \text{Blue}) = P(\text{no loan} \mid \text{would repay}, \text{Orange})$
Individual notions of fairness (treat similar examples similarly) also exist but won't be covered in this lecture
Counterfactual fairness uses tools from causal inference
Same decision in the actual world and in a counterfactual world where the individual belongs to a different group
Setup Group Fairness
Consider binary classification with a single sensitive attribute for simplicity:
$X \in \mathbb{R}^d$: features of an individual (e.g. credit history)
$A \in \{a, b, \dots\}$: sensitive feature (gender, race, etc.)
$R = r(X, A) \in \{0, 1\}$: binary predictor (e.g. whether to grant a loan or not), which makes a decision
by thresholding a score $R = r(X, A) \in [0, 1]$, e.g. the output of a NN classifier
$Y \in \{0, 1\}$: the target variable representing the ground truth
Assume $(X, A, Y) \sim D$ are generated from an underlying distribution $D$
$X$, $A$, $Y$ and $R$ are thus random variables
Notation: $P_a\{R\} = P\{R \mid A = a\}$
Naive Approach: Fairness through Unawareness
We should not include the sensitive attribute as a feature in the training data
$R = r(X)$ instead of $R = r(X, A)$
Pros/Cons:
Intuitive, easy to use and implement
Consistent with disparate treatment, which has legal support (e.g. the General Equal Treatment Act in Germany)
However, there can be many highly correlated features (e.g. neighborhood) that are proxies of the sensitive attribute (e.g. race)
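The proxy problem can be made concrete with a small sketch. The records below are hypothetical toy data (neighborhood names and group labels are invented for illustration): the sensitive attribute is never shown to the model, yet a simple majority vote per neighborhood recovers it for most individuals.

```python
from collections import Counter, defaultdict

# Hypothetical toy records: (neighborhood, sensitive_group).
# Only the neighborhood would be available as a feature.
records = [
    ("north", "blue"), ("north", "blue"), ("north", "blue"), ("north", "orange"),
    ("south", "orange"), ("south", "orange"), ("south", "orange"), ("south", "blue"),
]

def proxy_accuracy(rows):
    """Accuracy of guessing the group from the proxy by majority vote."""
    by_proxy = defaultdict(Counter)
    for proxy, group in rows:
        by_proxy[proxy][group] += 1
    # For each proxy value, the majority group is guessed for everyone.
    correct = sum(counts.most_common(1)[0][1] for counts in by_proxy.values())
    return correct / len(rows)

accuracy = proxy_accuracy(records)
print(accuracy)  # 0.75
```

A predictor $r(X)$ trained on such a feature can therefore still behave very differently across groups even though $A$ was removed.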
First Criterion: Independence
Require: $R$ independent of $A$, denoted $R \perp A$
Also called Demographic Parity, Statistical Parity, Group Fairness,
Darlington criterion (4)
In case of binary classification, for all groups $a, b$: $P_a\{R = 1\} = P_b\{R = 1\}$
In our example, this means that the acceptance rates of the applicants from the two groups must be equal, i.e. same percentage of applications receive loans
Approximate versions (for some slack $\epsilon > 0$):
$\frac{P_a\{R = 1\}}{P_b\{R = 1\}} \geq 1 - \epsilon$ and $|P_a\{R = 1\} - P_b\{R = 1\}| \leq \epsilon$
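As a minimal sketch, the acceptance rates and both approximate measures can be estimated from observed decisions. The decisions and group labels below are hypothetical, and the ratio is computed symmetrically (smaller rate over larger rate), which is an assumption made here so that the result does not depend on the order of the groups.

```python
def acceptance_rate(decisions, groups, group):
    """Estimate P_group{R = 1} from binary decisions."""
    selected = [r for r, a in zip(decisions, groups) if a == group]
    return sum(selected) / len(selected)

# Hypothetical decisions R and sensitive attribute A for ten applicants.
R = [1, 1, 1, 0, 0, 1, 0, 0, 0, 1]
A = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

rate_a = acceptance_rate(R, A, "a")  # 3 of 5 accepted
rate_b = acceptance_rate(R, A, "b")  # 2 of 5 accepted

# Absolute difference and (symmetric) ratio of the acceptance rates.
parity_difference = abs(rate_a - rate_b)
parity_ratio = min(rate_a, rate_b) / max(rate_a, rate_b)
print(round(parity_difference, 3), round(parity_ratio, 3))
```

Independence holds exactly when the difference is 0 and the ratio is 1.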
How to achieve Independence?
Post-processing
Adjust a learned classifier so that its decisions are uncorrelated with the sensitive attribute
Training time constraint
Include the exact/approximate constraints in the optimization
Pre-processing: e.g. via representation learning (next slide)
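One simple post-processing scheme, sketched here under the assumption that scores are distinct within each group, is to choose a separate threshold per group so that every group ends up with the same acceptance rate. The scores are hypothetical, and this is only one of several possible ways to enforce Independence after training.

```python
def threshold_for_rate(scores, target_rate):
    """Threshold t such that accepting scores >= t admits
    round(target_rate * n) individuals (assumes distinct scores)."""
    k = round(target_rate * len(scores))
    if k == 0:
        return float("inf")  # accept nobody
    return sorted(scores, reverse=True)[k - 1]

# Hypothetical scores from an already trained classifier.
scores = {
    "blue": [0.9, 0.8, 0.7, 0.4],
    "orange": [0.6, 0.5, 0.3, 0.2],
}

def rates(thresholds):
    """Acceptance rate per group for the given per-group thresholds."""
    return {g: sum(x >= thresholds[g] for x in s) / len(s)
            for g, s in scores.items()}

single = rates({"blue": 0.55, "orange": 0.55})  # one shared threshold
thresholds = {g: threshold_for_rate(s, 0.5) for g, s in scores.items()}
adjusted = rates(thresholds)                    # group-specific thresholds
print(single, thresholds, adjusted)
```

With the shared threshold the acceptance rates are 0.75 and 0.25; with group-specific thresholds both are 0.5.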
Representation learning approach
Map $(X, A)$ to a representation $Z$ (e.g. dimensionality reduction)
Train the predictor on the representation: $R = r(Z)$
How to learn a fair representation Z ?
e.g. optimize for $\max I(X; Z)$ and $\min I(A; Z)$, where $I$ is some measure of information (e.g. mutual information)
e.g. Fair PCA, Fair VAE
Figure: The Variational Fair Autoencoder (Louizos et al., 2016)
Pros/Cons of Independence
Legal support: the "four-fifths rule" prescribes that a selection rate for any disadvantaged group that is less than four-fifths of that for the group with the highest rate must be justified
What if 83% of Blue is likely to repay, but only 43% of Orange is? Then Independence is too strong
Rules out perfect predictor R = Y when base rates are different
Laziness: We can trivially satisfy the criterion if we give loans to qualified people from one group and to random people from the other
This can even establish a negative track record for the second group
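The four-fifths rule mentioned above translates directly into a check on selection rates. The sketch below uses hypothetical rates and a hypothetical helper name; it only flags groups whose rate falls below 80% of the highest rate and says nothing about whether the disparity is justified.

```python
def four_fifths_violations(selection_rates):
    """Groups whose selection rate is below 4/5 of the highest rate."""
    highest = max(selection_rates.values())
    return sorted(g for g, rate in selection_rates.items()
                  if rate < 0.8 * highest)

# Hypothetical selection rates P_a{R = 1} per group.
flagged = four_fifths_violations({"blue": 0.50, "orange": 0.35})
not_flagged = four_fifths_violations({"blue": 0.50, "orange": 0.45})
print(flagged, not_flagged)  # ['orange'] []
```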
Second Criterion: Separation
Require: the prediction $R$ and $A$ to be independent conditional on the target $Y$, denoted $R \perp A \mid Y$
Also called Equalized Odds, Conditional procedure accuracy, Avoiding disparate mistreatment
In case of binary classification, for all groups $a, b$:
$P_a(R = 1 \mid Y = 1) = P_b(R = 1 \mid Y = 1)$ (equal true positive (TP) rates)
$P_a(R = 1 \mid Y = 0) = P_b(R = 1 \mid Y = 0)$ (equal false positive (FP) rates)
Equality of Opportunity is a commonly used relaxation
Only match the TP rate: $P_a(R = 1 \mid Y = 1) = P_b(R = 1 \mid Y = 1)$
In our example, this means that among the individuals who would in reality repay, the same proportion should receive a loan in each group
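A minimal sketch of how Separation can be checked empirically: estimate the true positive and false positive rates per group and compare them. The labels, predictions and group assignments below are hypothetical.

```python
def rates_by_group(y_true, y_pred, groups):
    """Return {group: (TPR, FPR)} from binary labels and predictions."""
    result = {}
    for g in sorted(set(groups)):
        rows = [(y, r) for y, r, a in zip(y_true, y_pred, groups) if a == g]
        positives = [r for y, r in rows if y == 1]  # predictions where Y = 1
        negatives = [r for y, r in rows if y == 0]  # predictions where Y = 0
        result[g] = (sum(positives) / len(positives),
                     sum(negatives) / len(negatives))
    return result

# Hypothetical ground truth Y, decisions R and sensitive attribute A.
Y = [1, 1, 0, 0, 1, 1, 0, 0]
R = [1, 1, 1, 0, 1, 0, 0, 0]
A = ["a"] * 4 + ["b"] * 4

rates = rates_by_group(Y, R, A)
print(rates)  # {'a': (1.0, 0.5), 'b': (0.5, 0.0)}
```

Here both rates differ between the groups, so neither Equalized Odds nor Equality of Opportunity holds.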
Achieving Separation
Area under the ROC (Receiver Operating Characteristic) curve
Each point on the solid curve(s) is realized by thresholding the
predicted score at some value $t$, i.e. predict $\mathbb{I}(r(X, A) > t)$
Pick a classifier that minimizes the given cost (e.g. maximizes profit)
Figure: Intersection of area under the curves (https://fairmlbook.org/)
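The thresholding idea can be sketched as follows, with hypothetical scores and labels. For each group we enumerate the (FPR, TPR) points reachable by deterministic thresholds and keep the points reachable in both groups; any such common point satisfies Separation. This sketch only looks at exact matches between threshold points; reaching other points in the intersection of the areas under the curves generally requires randomizing between thresholds, which is not shown here.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by predicting 1 when score > t."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = set()
    # t = -1.0 accepts everyone because scores are assumed to lie in [0, 1].
    for t in [-1.0] + sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s > t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s > t and y == 0)
        points.add((fp / neg, tp / pos))
    return points

# Hypothetical scores and ground-truth labels for two groups.
points_a = roc_points([0.9, 0.7, 0.6, 0.2], [1, 1, 0, 0])
points_b = roc_points([0.8, 0.5, 0.4, 0.3], [1, 0, 1, 0])

# Operating points with identical FPR and TPR in both groups.
common = sorted(points_a & points_b)
print(common)  # [(0.0, 0.0), (0.0, 0.5), (0.5, 1.0), (1.0, 1.0)]
```

Among the common points, one would then pick the one with the lowest cost.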
Pros/Cons of Separation
Optimal predictor not ruled out: R = Y is allowed
Penalizes laziness: it provides incentive to reduce errors uniformly in
all groups
It may not help to close the gap between two groups
Granting more loans to the group that is more likely to repay now
tends to improve that group's living conditions and thus makes it even more likely to repay in the future, widening the gap
Third Criterion: Sufficiency
Require: the target $Y$ and $A$ to be independent conditional on the prediction (or score) $R$, denoted $Y \perp A \mid R$
Also called Cleary model, Conditional use accuracy, Calibration within groups
In case of binary classification, for all groups $a, b$ and all $r \in [0, 1]$: $P_a(Y = 1 \mid R = r) = P_b(Y = 1 \mid R = r)$
In our example, the score used to determine if a candidate would repay should reflect the candidate's actual capability of repaying, regardless of group
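For scores that take only a few distinct values, Sufficiency can be checked by estimating $P_a(Y = 1 \mid R = r)$ for every group and score value. The data below is hypothetical; with continuous scores one would have to bin the scores first, which is not shown.

```python
from collections import defaultdict

def positive_rate_by_score(scores, labels, groups):
    """Estimate P_a(Y = 1 | R = r) for every group a and score value r."""
    counts = defaultdict(lambda: [0, 0])  # (group, score) -> [positives, total]
    for r, y, a in zip(scores, labels, groups):
        counts[(a, r)][0] += y
        counts[(a, r)][1] += 1
    return {key: pos / total for key, (pos, total) in counts.items()}

# Hypothetical scores R, outcomes Y and groups A (8 individuals per group).
scores = [0.75] * 4 + [0.25] * 4 + [0.75] * 4 + [0.25] * 4
labels = [1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1]
groups = ["a"] * 8 + ["b"] * 8

table = positive_rate_by_score(scores, labels, groups)
print(table[("a", 0.75)], table[("b", 0.75)])  # 0.75 0.5
```

In this toy data a score of 0.75 corresponds to a 75% repayment rate in group a but only 50% in group b, so Sufficiency is violated at that score value.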
Achieving Sufficiency
In general, a classifier $R$ is calibrated if for all $r \in [0, 1]$: $P(Y = 1 \mid R = r) = r$
Of all instances assigned a score value $r$, an $r$ fraction of them should be positive
Calibration for each group $a$ implies sufficiency: $P_a(Y = 1 \mid R = r) = r$
Apply standard calibration techniques to each group (if necessary)
e.g. Platt scaling: given an uncalibrated score, treat it as a single feature and fit a one-variable regression model against $Y$
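When the one-variable regression model is a logistic regression, this technique is commonly known as Platt scaling. The sketch below fits it with plain gradient descent on hypothetical data; the step count and learning rate are arbitrary choices, and in practice one would fit one such model per group on held-out data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_platt(scores, labels, steps=2000, lr=0.5):
    """Fit P(Y = 1 | s) = sigmoid(w * s + b) by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = sigmoid(w * s + b)
            grad_w += (p - y) * s / n  # d(log loss)/dw
            grad_b += (p - y) / n      # d(log loss)/db
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical uncalibrated scores and observed outcomes for one group.
raw = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
y = [0, 0, 1, 0, 0, 1, 1, 0, 1]

w, b = fit_platt(raw, y)
calibrated = [sigmoid(w * s + b) for s in raw]
print([round(p, 2) for p in calibrated])
```

The calibrated values stay monotone in the raw score, so the ranking of individuals within the group is unchanged.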
Pros/Cons of Sufficiency
Satisfied by the Bayes optimal score $r(x, a) = E[Y \mid X = x, A = a]$
To predict $Y$ we do not need to see $A$ once we have $R$
Equal chance of success ($Y = 1$) given acceptance ($R = 1$)
As before, it may not help to close the gap between the groups
Fairness Summary: A growing list of fairness criteria
General theme: Require some invariance w.r.t. the sensitive attribute
Independence: $R \perp A$
Separation: $R \perp A \mid Y$
Equality of Opportunity: $R \perp A \mid Y = 1$
Sufficiency: $Y \perp A \mid R$
Conditional statistical parity
Predictive equality
Predictive parity
and many more
Many of these definitions are (provably) incompatible, i.e. they are mutually exclusive except in degenerate cases
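One such incompatibility can be illustrated with Bayes' rule. Assume, hypothetically, that a binary classifier satisfies Separation (same TPR and FPR in both groups) while the base rates $P(Y = 1)$ differ between the groups. Then both the acceptance rate $P(R = 1)$ and the chance of success given acceptance $P(Y = 1 \mid R = 1)$ differ, so Independence and Sufficiency are violated. All numbers below are made up.

```python
def acceptance_rate(tpr, fpr, base_rate):
    """P(R = 1) from TPR, FPR and the base rate P(Y = 1)."""
    return tpr * base_rate + fpr * (1.0 - base_rate)

def ppv(tpr, fpr, base_rate):
    """P(Y = 1 | R = 1) via Bayes' rule."""
    return tpr * base_rate / acceptance_rate(tpr, fpr, base_rate)

# Same error rates in both groups (Separation holds), different base rates.
tpr, fpr = 0.8, 0.2
ppv_blue, ppv_orange = ppv(tpr, fpr, 0.5), ppv(tpr, fpr, 0.25)
acc_blue = acceptance_rate(tpr, fpr, 0.5)
acc_orange = acceptance_rate(tpr, fpr, 0.25)

print(round(ppv_blue, 3), round(ppv_orange, 3))  # 0.8 0.571
print(round(acc_blue, 3), round(acc_orange, 3))  # 0.5 0.35
```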
Visualizing the trade-offs: research.google.com/bigpicture
Comparing different criteria
Profit for a TP and cost for a FP
The cost of FP is typically much greater than the profit for TP
Figure: Different thresholds induced by different criteria (Hardt et al., 2016)
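Without any fairness constraint, the bank would simply pick the threshold that maximizes profit. A minimal sketch, assuming a hypothetical profit of 1 per repaid loan (TP) and a cost of 3 per defaulted loan (FP), with made-up scores and outcomes:

```python
def best_threshold(scores, labels, profit_tp, cost_fp):
    """Return (threshold, profit) maximizing profit when loans go to score >= t."""
    best = (float("inf"), 0)  # granting no loans yields zero profit
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        profit = profit_tp * tp - cost_fp * fp
        if profit > best[1]:  # ties keep the stricter threshold
            best = (t, profit)
    return best

# Hypothetical scores and repayment outcomes (1 = would repay).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
labels = [1, 1, 0, 1, 0, 0]

choice = best_threshold(scores, labels, profit_tp=1, cost_fp=3)
print(choice)  # (0.8, 2)
```

A fairness criterion restricts which (group-specific) thresholds are allowed, so the achievable profit can only stay the same or decrease.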
Adversarial Examples
Adversarial Examples are deliberate perturbations of the data designed to achieve a specific malicious goal (e.g. cause a misclassification)
Figure: The panda is classified as a gibbon by the NN (Goodfellow et al., 2014)
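One well-known way to construct such perturbations is the fast gradient sign method (FGSM) of Goodfellow et al.: move every input coordinate by a small amount $\epsilon$ in the direction that increases the loss. The sketch below applies a single FGSM step to a hypothetical logistic model instead of a neural network; weights, input and $\epsilon$ are made up for illustration.

```python
import math

def score(x, w, b):
    """Linear score w.x + b of a logistic model."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(x, y, w, b, eps):
    """One FGSM step against p = sigmoid(w.x + b).

    The gradient of the log loss w.r.t. the input is (p - y) * w, so each
    coordinate moves by eps in the direction sign((p - y) * w_i).
    """
    p = sigmoid(score(x, w, b))
    sign = lambda v: (v > 0) - (v < 0)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]

# Hypothetical model and input with true label y = 1.
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.2, 0.1, 0.2], 1

x_adv = fgsm(x, y, w, b, eps=0.2)
p_clean = sigmoid(score(x, w, b))    # above 0.5: predicted class 1
p_adv = sigmoid(score(x_adv, w, b))  # below 0.5: predicted class 0
print(round(p_clean, 3), round(p_adv, 3))
```

No coordinate changes by more than 0.2, yet the predicted class flips.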
Adversarial Examples
Figure: Adversarial glasses fool Facial Recognition systems into classifying the wearer as someone else, Sharif et al., 2016
Figure: ML systems classify the adversarially modified Stop sign as a Speed Limit sign, Eykholt et al., 2018
Adversarial Examples
Many other recent studies show that most ML models are vulnerable to adversarial examples
At a high level, this is because ML models do not generalize well beyond the training distribution
If the distribution of the test data is even slightly different from the
distribution of the training data, their performance can degrade sharply
How can we create, detect and defend against Adversarial Examples?
Especially important to quantify this risk if we are in a safety-critical
application, e.g. self-driving cars
Certifiable robustness provides mathematical guarantees
Nature as an adversary: Even if there is no adversary in our use-case, we should quantify robustness to worst-case noise
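For a linear classifier, a robustness certificate can be computed in closed form, which gives a small illustration of what a mathematical guarantee looks like. A perturbation $\delta$ with $\|\delta\|_\infty < \epsilon$ changes the score $w \cdot x + b$ by less than $\epsilon \|w\|_1$, so the prediction cannot flip as long as $\epsilon \le |w \cdot x + b| / \|w\|_1$. The model and input below are hypothetical; certifying neural networks requires more involved techniques that are not shown here.

```python
def certified_radius_linf(x, w, b):
    """Largest eps such that sign(w.x + b) is unchanged for all ||delta||_inf < eps."""
    margin = abs(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return margin / sum(abs(wi) for wi in w)

# Hypothetical linear classifier and input.
w, b = [2.0, -1.0, 0.5], 0.0
x = [0.2, 0.1, 0.2]

radius = certified_radius_linf(x, w, b)
print(round(radius, 4))  # 0.1143

# Worst-case perturbation just inside the radius: the score keeps its sign.
eps = 0.99 * radius
x_worst = [xi - eps * (1 if wi > 0 else -1) for xi, wi in zip(x, w)]
worst_score = sum(wi * xi for wi, xi in zip(w, x_worst)) + b
```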
Decisions based on data are not always accurate, reliable, or fair
Differential Privacy allows us to compute arbitrary queries on (sensitive) data with provable guarantees on information leakage
There are no absolute privacy guarantees: your neighbors' habits are correlated with your habits
Algorithmic Fairness criteria require (and enforce) some invariances w.r.t. sensitive attributes
Algorithmic Fairness $\neq$ Actual Fairness; social/legal/political effort is also needed
Without a model of long-term impact it is difficult to foresee the effect of a fairness criterion implemented as a constraint
Accuracy, Fairness, Privacy, Adversarial Robustness, Explainability and other aspects are non-trivially related
Algorithmic solutions are only a (small) part of the puzzle
Reading material
Main reading
"The Algorithmic Foundations of Differential Privacy" by Dwork and Roth
[ch. 2, 3.1-3.5],
https://www.cis.upenn.edu/~aaroth/privacybook.html
"Fairness and Machine Learning" by Barocas, Hardt, and Narayanan [ch. 1, 2], https://fairmlbook.org/ (see footnote 6)
Footnote 6: Part of the slides adapted from the CSC 2515 lecture by and the Differential Privacy Tutorial by