Objectives
In the FairXCluster project, we will focus on two fundamental challenges in AI research:
explainability and fairness. We will address these challenges in the context of clustering, a cornerstone AI
task, through the use of counterfactual explanations (CFEs).
Explainable AI is an essential facet of modern artificial intelligence development,
addressing the critical need for transparency and understandability in AI systems.
As AI models, particularly deep learning algorithms, become increasingly complex and integrated
into high-stakes domains such as healthcare, finance, and autonomous vehicles, the ability to
understand and interpret their decisions becomes paramount. Explainable AI seeks to make
the decision-making processes of these algorithms transparent, providing insight into how
and why certain outcomes are reached. This transparency not only fosters trust among users
and stakeholders, but also enables developers and researchers to more effectively diagnose
and refine AI models to ensure they meet ethical standards, legal compliance, and social acceptance.
Counterfactual explanations delve into the realm of "what might have been" to shed light
on how a machine learning (ML) model's decisions work. At their core, these explanations
explore alternative scenarios by suggesting minimal adjustments to the original input that
would result in a different outcome from an ML system. They operate on the principle of
identifying and modifying critical variables that would change the model's decision, without
relying on specific examples. This approach provides a straightforward and tangible method
for users to understand the decision-making process of the ML model. By focusing on the changes
necessary to achieve a different outcome, counterfactual explanations not only shed light
on how ML models make decisions, but also empower users with the knowledge to navigate
and potentially influence the outcomes of these systems in future interactions. Notably,
this method significantly contributes to enhancing the fairness of AI systems by ensuring
that decisions are made transparently and can be scrutinized for bias or inaccuracies,
thus promoting equitable treatment across all users.
1. Our first objective is to formally define and solve
the counterfactual explanation problem for clustering and deep clustering. We will define counterfactuals
for explaining clustering decisions for both individual points and sets of points. In addition, we will seek
counterfactuals with desirable properties such as actionability (e.g., changing only mutable features) and
similarity to the training data.
2. Our second objective is to provide a counterfactual-based approach for explaining and achieving fairness.
Formalizing and operationalizing fairness in clustering is a challenging problem and counterfactuals offer an
intuitive way of formulating this problem. Intuitively, our approach will explain why a specific instance was
clustered in an unfavorable cluster and will measure the cost of unfairness through the distance between the
instance and its counterfactual, i.e., the minimum cost required for reversing the decision. We will consider
both individual fairness (i.e., nondiscrimination against individual instances) and group fairness (i.e.,
nondiscrimination against groups of instances as defined by the values of protected attributes, e.g., gender, or
ethnicity). Furthermore, we will express our fairness models as constraints towards making clustering
algorithms fair. We will exploit approaches that use the constraints during clustering and as a postprocessing
step.
3. Our third objective is to introduce new counterfactual-based cluster quality indices and exploit them
towards novel clustering algorithms. Clustering is a fundamental and well-studied problem in AI research
and introducing new perspectives to the problem is challenging. Our indices will use statistics over the
individual counterfactual distances as such distances are indicative of the cost of moving an instance to a
different cluster and thus the compactness of a clustering assignment. The proposed indices will be compared
and used in conjunction with existing ones for improving clustering.
Key Activities
During the first year of this two-year project, research focused on the first and most
critical objective, i.e., the definition of counterfactuals for clustering and the development
of methodologies for computing counterfactuals to explain clustering decisions for both
traditional (k-means, Gaussian clustering) and deep clustering.
First, a literature survey on counterfactual explanations for classification was
conducted, and the major approaches were identified and reported. Moreover, available
software tools for computing counterfactuals were examined. During this survey,
we verified that the problem of computing counterfactuals for clustering had not been
previously considered in the literature.
Next, the research work focused on proposing a broad definition of counterfactuals
under the general assumption that each cluster is modeled by a probability density. It
should be noted that the popular k-means clustering algorithm can be considered a
special case of this more general framework. In analogy with the classification
case, the proposed definition also accommodates actionability and plausibility
constraints, taking into account that some features cannot be modified
and that the produced counterfactuals should be located in regions of
sufficient data density.
Based on the above general definitions of counterfactuals for clustering, we focused on
the two most popular clustering paradigms, namely centroid-based clustering (e.g.,
solutions provided by the k-means algorithm) and Gaussian clustering, with solutions typically
provided by training a Gaussian mixture model (GMM). We considered that the distance
to be minimized between a factual and its counterfactual is the Euclidean distance (L2
norm) and defined the cluster boundary equations for centroid-based and Gaussian
clustering respectively. Then we formulated the counterfactual generation problem as
the minimization of the Euclidean distance between factual and counterfactual under
constraints that take into account the cluster boundary and the actionability of features
(implemented using a feature mask).
Interestingly, we found that for centroid-based clustering the optimal solution to
the above constrained minimization problem is given directly by a closed-form
analytical formula, while for Gaussian clustering the solution of a nonlinear
equation in a single parameter is required, which can be computed easily.
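For the centroid-based case, the closed-form solution can be sketched as follows (a minimal illustration assuming plain Euclidean k-means and no actionability mask; all names are ours):

```python
import numpy as np

def centroid_counterfactual(x, c_src, c_tgt, eps=1e-6):
    """Closest point to x (in L2) that is assigned to c_tgt instead of c_src.

    The boundary between the two clusters is the bisecting hyperplane
    w . x' = b, with w = c_tgt - c_src and b = (|c_tgt|^2 - |c_src|^2) / 2.
    The optimum is the orthogonal projection of x onto that hyperplane,
    nudged by eps so the counterfactual lands just inside the target cluster.
    """
    w = c_tgt - c_src
    b = (c_tgt @ c_tgt - c_src @ c_src) / 2.0
    step = (b - w @ x) / (w @ w)
    return x + (step + eps / np.linalg.norm(w)) * w
```

With an actionability mask, the same projection would be carried out using only the mutable coordinates of `w`; we omit that here for brevity.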
Next, we extended the above results to the case where the Mahalanobis distance (a
generalization of the Euclidean distance) is used as the distance measure between factual
and counterfactual. We also studied the case where a counterfactual should be
generated for a group of factual points.
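One standard way to handle the Mahalanobis case (a sketch of a textbook reduction, not necessarily the exact derivation used in the project) is to whiten the space so that the Mahalanobis distance becomes Euclidean, apply the closed-form Euclidean machinery there, and map back:

```python
import numpy as np

def whiten(M):
    """Return T such that (x-y)^T M (x-y) = ||T(x-y)||^2 for positive-definite M.

    With the Cholesky factorization M = L L^T, the linear map T = L^T turns
    the Mahalanobis distance into the plain Euclidean distance, so Euclidean
    counterfactual computations can be reused in the whitened space.
    """
    return np.linalg.cholesky(M).T
```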
Finally, we considered the computation of counterfactuals in a deep clustering
framework, typically used to cluster complex data such as images. In this case, deep
neural networks (autoencoders) transform the original complex data into
vectorial representations belonging to a so-called latent space, and clustering is applied
to the latent vectors. Therefore, we applied the previous methodologies to generate
counterfactuals in the latent space, and we conducted research and proposed solutions
on how to exploit a counterfactual computed in the latent space so as to generate the
corresponding counterfactual in the original data space.
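The deep-clustering pipeline can be sketched as the following composition, where `encode`, `decode`, and `latent_cf` are placeholders for trained or derived components (the names are ours):

```python
def latent_counterfactual(x, encode, decode, latent_cf):
    # Sketch of the latent-space pipeline: encode the complex input,
    # compute the counterfactual among the latent-space clusters, and
    # decode it back to the original data space.
    z = encode(x)        # data space -> latent space
    z_cf = latent_cf(z)  # counterfactual in the latent space
    return decode(z_cf)  # latent space -> data space
```

Since the decoder is not an exact inverse, faithfully mapping the latent counterfactual back to the data space is the nontrivial step that the proposed solutions address.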
A significant part of the first-year work was allocated to the implementation and evaluation
of the proposed techniques on several datasets. For comparison, we developed
an indirect approach in which the clustering problem is transformed into an
equivalent classification problem, thus allowing the use of software tools that generate
counterfactuals for classification. The approaches were compared in terms of
minimum L2 distance, validity, and execution time.
In the second year of the project, the research was conducted in two parallel lines,
exploiting the counterfactual generation results produced in the first year.
The first line of second-year research focused on the definition, implementation, and
evaluation of novel fairness measures for clustering solutions. Such measures exploit the
average counterfactual distances of the protected and unprotected groups
of points. They also make it possible to assess the effectiveness of each feature in mitigating
unfairness. More specifically, we first addressed the key question of quantifying the cost
of fairness interventions in clustering. We initially defined similarity-based costs that
compare an original (unfair) clustering solution with a fair one obtained from existing
fair clustering algorithms. The comparison was made using: (a) a group-level adaptation
of Normalized Mutual Information (NMI), and (b) a misalignment metric that counts
data points assigned to different clusters in the fair and unfair partitions. Then, we
introduced a novel counterfactual-based cost to measure the effort required for an
individual point to receive a fair assignment, which similarity-based scores fail to
capture. We considered both cases of hard and soft clustering solutions. We evaluated
the proposed cost metrics using several datasets, drawing interesting conclusions. For
example, we observed that the contributions of individual features to counterfactual
cost differ across groups, revealing asymmetries in how fairness is achieved.
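The counterfactual cost and its feature-level attribution can be illustrated as follows (a minimal sketch with names of our choosing; `X_cf` holds one precomputed counterfactual per point and `g` the protected-group labels):

```python
import numpy as np

def group_cf_costs(X, X_cf, g):
    """Per-group average counterfactual distance and per-feature attribution.

    The distance between a point and its counterfactual is the minimum cost
    of reversing its cluster assignment; averaging per group quantifies how
    unevenly that cost is distributed, and the mean absolute per-feature
    change shows which features carry the cost in each group.
    """
    delta = X_cf - X
    dist = np.linalg.norm(delta, axis=1)
    groups = np.unique(g)
    avg_cost = {k: float(dist[g == k].mean()) for k in groups}
    feat_attr = {k: np.abs(delta[g == k]).mean(axis=0) for k in groups}
    return avg_cost, feat_attr
```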
Next, we have introduced ‘separation fairness’, a boundary-aware group fairness
criterion that relies on the equity of counterfactual distances of groups, i.e., the average
distance of the groups from their nearest cluster boundaries. Intuitively, separation
fairness pushes cluster boundaries outward from both groups, reducing near-boundary
disparities that can produce unstable or inequitable assignments. We have also
combined the proposed separation fairness with the so-called ‘social fairness’ (that
computes cluster compactness at group level) criterion to capture both aspects of group
equity in clustering: compactness and separation. We have developed and evaluated
three gradient-based fair clustering methods as fair adaptations of the k-means algorithm:
- Separation Fair k-Means that maximizes the minimum average counterfactual
distance across groups (separation fairness).
- Socially Fair k-Means that minimizes the worst-group inter-cluster distortion.
- Unifair k-Means that jointly optimizes both fairness criteria under a single
framework.
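Under the centroid-based model, the quantity driving Separation Fair k-Means can be sketched as follows (an illustrative implementation with our own naming; a point's distance to its nearest cluster boundary is its distance to the closest bisecting hyperplane):

```python
import numpy as np

def boundary_distances(X, C, assign):
    """Distance of each point to the nearest boundary of its cluster.

    For centroid-based clustering, the boundary with cluster j is the
    bisecting hyperplane between centroids C[i] and C[j], so the margin is
    (||x - C[j]||^2 - ||x - C[i]||^2) / (2 ||C[j] - C[i]||), minimized over j.
    """
    d = np.full(len(X), np.inf)
    for n, i in enumerate(assign):
        for j in range(len(C)):
            if j == i:
                continue
            w_norm = np.linalg.norm(C[j] - C[i])
            margin = (np.sum((X[n] - C[j]) ** 2)
                      - np.sum((X[n] - C[i]) ** 2)) / (2 * w_norm)
            d[n] = min(d[n], margin)
    return d

def separation_fairness(X, C, assign, g):
    # Worst-group average boundary distance: Separation Fair k-Means
    # adjusts the centroids to maximize this quantity.
    d = boundary_distances(X, C, assign)
    return min(float(d[g == k].mean()) for k in np.unique(g))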
Finally, we have also implemented a deep clustering extension of the above
methodology for fair clustering of complex data (e.g., images). We have extended the
objectives to a deep latent-space setting via an autoencoder, enabling joint optimization
of reconstruction error, cluster quality and fairness cost.
The second line of second-year research focused on exploiting counterfactual distances
to estimate the quality of clustering solutions. We relied on the observation that, if the
generated counterfactuals lie on the cluster boundary, then the corresponding
counterfactual distances provide the distance of each factual point from the boundary of
its cluster. Therefore, the sum of the counterfactual distances of the points of a cluster
provides a sensible measure of cluster separation, which we call counterfactual
separation.
Typical clustering quality scores (such as the silhouette score) combine two terms: a term
expressing cluster variance to account for cluster compactness and a term expressing
cluster separation. By using counterfactual separation to account for cluster separation,
we have proposed a new clustering quality score called CFQ and evaluated its
effectiveness in estimating the number of clusters through extensive comparison with
standard approaches (e.g. silhouette score) in centroid-based and Gaussian clustering.
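The exact definition of CFQ is the project's own; the following toy sketch only illustrates the general recipe of combining a compactness term with counterfactual separation, under a normalization we assume for illustration:

```python
import numpy as np

def cfq_like(X, C, assign, cf_dist):
    """Toy CFQ-style quality score (illustrative, not the official CFQ).

    Combines within-cluster compactness (mean squared distance to the
    assigned centroid) with counterfactual separation (mean distance of
    each point to its counterfactual on the cluster boundary).
    Higher values indicate more compact, better-separated clusterings.
    """
    compact = np.mean(np.sum((X - C[assign]) ** 2, axis=1))
    separation = np.mean(cf_dist)
    return separation / (separation + np.sqrt(compact))
```

To estimate the number of clusters, one would sweep k, score each resulting clustering, and keep the k with the highest score.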
We additionally exploited counterfactual separation to improve traditional and deep
clustering algorithms. At first, we considered agglomerative clustering methods. Such
bottom-up hierarchical algorithms rely on the so-called linkage criteria to decide
whether two clusters should be merged or not, where a linkage criterion measures the
separation between two clusters. We proposed, implemented and evaluated two
counterfactual-based linkages focusing on the frontier (boundary) points between
clusters. Both linkages were found to improve on the traditional ones, albeit at higher
computational cost.
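The idea of a frontier-focused linkage can be sketched like this (an assumption for illustration, not the project's exact criterion): restrict the inter-cluster separation measure to the few closest cross-cluster pairs, i.e., the points near the mutual boundary:

```python
import numpy as np

def frontier_linkage(A, B, m=3):
    """Average of the m smallest cross-cluster distances between A and B.

    Focusing on the closest pairs makes the criterion sensitive to the
    frontier (boundary) region between the two clusters, at the cost of
    computing the full cross-distance matrix.
    """
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    m = min(m, D.size)
    return float(np.sort(D, axis=None)[:m].mean())
```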
We also studied an adaptation of the k-means algorithm, introducing the
counterfactual separation objective directly into the k-means objective so as to
account for both cluster compactness and separation. The resulting algorithm, called cf-
means, implements a gradient-based refinement of the initial cluster centers prior to the
typical k-means phase. It has been shown that it reduces sensitivity to
center initialization and promotes solutions with both greater compactness and better
separability. Finally, we extended this methodology to deep clustering by designing the
deep cf-means algorithm.
Research Results
The project succeeded in delivering a unified framework connecting explainability,
fairness, and quality in clustering via counterfactual reasoning. The project
achievements can be summarized as follows:
- First formulation of counterfactuals for clustering
  - Easy-to-compute, non-iterative solutions for k-means and Gaussian clustering.
  - Extensions to actionability and plausibility constraints, Mahalanobis distance, and counterfactuals for groups.
- Novel counterfactual-based fairness framework
  - Counterfactual fairness cost and feature-level fairness attribution.
  - Separation fairness criterion.
  - New fair and deep fair clustering algorithms optimizing the separation fairness criterion.
- New counterfactual-based cluster quality measures
  - The novel CFQ score that uses counterfactual distances to measure cluster separation.
  - New linkages for agglomerative clustering that focus on cluster frontiers defined by counterfactual points.
  - Adaptation of k-means and its deep extension so that counterfactual-based cluster separation is taken into account.