
FairXCluster

Counterfactuals for Clustering: Explainability, Fairness, and Quality


Our main objectives

In the FairXCluster project, we will focus on two fundamental challenges in AI research, namely explainability and fairness. We will address these challenges in the context of clustering, a cornerstone AI task, through the use of counterfactual explanations (CFEs).

Explainable AI is an essential facet of modern artificial intelligence development, addressing the critical need for transparency and understandability in AI systems. As AI models, particularly deep learning models, become increasingly complex and integrated into high-stakes domains such as healthcare, finance, and autonomous vehicles, the ability to understand and interpret their decisions becomes paramount. Explainable AI seeks to make the decision-making processes of these models transparent, providing insight into how and why certain outcomes are reached. This transparency not only fosters trust among users and stakeholders, but also enables developers and researchers to diagnose and refine AI models more effectively, ensuring they meet ethical standards, comply with legal requirements, and gain social acceptance.

Counterfactual explanations delve into the realm of "what might have been" to shed light on how a machine learning (ML) model arrives at its decisions. At their core, these explanations explore alternative scenarios by suggesting minimal adjustments to the original input that would result in a different outcome from the ML system. They operate on the principle of identifying and modifying the critical variables that would change the model's decision, without relying on specific examples. This approach gives users a straightforward and tangible way to understand the decision-making process of the ML model. By focusing on the changes necessary to achieve a different outcome, counterfactual explanations not only clarify how ML models make decisions, but also empower users with the knowledge to navigate and potentially influence the outcomes of these systems in future interactions. Notably, this method contributes significantly to the fairness of AI systems by ensuring that decisions are made transparently and can be scrutinized for bias or inaccuracies, thus promoting equitable treatment across all users.

Objectives

1. Our first objective is to formally define and solve the counterfactual explanation problem for clustering and deep clustering. We will define counterfactuals for explaining clustering decisions for both individual points and sets of points (the first sketch after this list illustrates this idea for k-means). In addition, we will seek counterfactuals with desirable properties such as actionability (e.g., changing only mutable features) and similarity to the training data.

2. Our second objective is to provide a counterfactual-based approach for explaining and achieving fairness. Formalizing and operationalizing fairness in clustering is a challenging problem, and counterfactuals offer an intuitive way of formulating it. Intuitively, our approach will explain why a specific instance was assigned to an unfavorable cluster and will measure the cost of unfairness as the distance between the instance and its counterfactual, i.e., the minimum cost required to reverse the decision. We will consider both individual fairness (i.e., nondiscrimination against individual instances) and group fairness (i.e., nondiscrimination against groups of instances defined by the values of protected attributes, e.g., gender or ethnicity). Furthermore, we will express our fairness models as constraints towards making clustering algorithms fair. We will investigate approaches that apply these constraints both during clustering and as a post-processing step.

3. Our third objective is to introduce new counterfactual-based cluster quality indices and exploit them towards novel clustering algorithms. Clustering is a fundamental and well-studied problem in AI research, and introducing new perspectives to it is challenging. Our indices will use statistics over the individual counterfactual distances, as such distances are indicative of the cost of moving an instance to a different cluster and, thus, of the compactness of a clustering assignment (the second sketch after this list illustrates such an index). The proposed indices will be compared with, and used in conjunction with, existing indices to improve clustering.
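
As a concrete illustration of the first objective, the following Python sketch computes a counterfactual for a cluster assignment in the simplest possible setting: a fitted Euclidean k-means model, where the minimal change that moves a point towards another cluster is (up to ties) its projection just past the bisecting hyperplane between the two centroids. The helper names (kmeans_counterfactual, best_counterfactual) and the use of scikit-learn are our own illustrative choices here, not part of the project's methodology.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_counterfactual(x, kmeans, target, eps=1e-6):
    """Minimally perturbed copy of x that crosses onto cluster `target`'s side
    of the bisector between the source and target centroids."""
    centers = kmeans.cluster_centers_
    source = kmeans.predict(x.reshape(1, -1))[0]
    if source == target:
        return x.copy()
    ci, cj = centers[source], centers[target]
    u = (cj - ci) / np.linalg.norm(cj - ci)   # unit normal of the bisector
    # Signed distance of x to the bisecting hyperplane through the midpoint
    # (negative while x is still on the source-cluster side).
    d = np.dot(x - (ci + cj) / 2.0, u)
    return x + (max(-d, 0.0) + eps) * u       # smallest step past the bisector

def best_counterfactual(x, kmeans):
    """Cheapest counterfactual over all alternative clusters; keeps only
    candidates the model really assigns to the intended cluster.
    Returns (None, inf) if no candidate survives the check."""
    source = kmeans.predict(x.reshape(1, -1))[0]
    best, best_cost = None, np.inf
    for j in range(kmeans.n_clusters):
        if j == source:
            continue
        x_cf = kmeans_counterfactual(x, kmeans, j)
        if kmeans.predict(x_cf.reshape(1, -1))[0] != j:
            continue                           # a third centroid intercepts it
        cost = np.linalg.norm(x_cf - x)
        if cost < best_cost:
            best, best_cost = x_cf, cost
    return best, best_cost

# Toy usage
X = np.random.RandomState(0).randn(200, 2)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
x_cf, cost = best_counterfactual(X[0], km)
print("counterfactual cost for the first point:", round(cost, 3))
```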
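
Building on the sketch above, the second snippet hints at how the second and third objectives could be operationalized: the per-point counterfactual cost is aggregated into a compactness-style quality score, and the gap in mean cost between two protected groups serves as a rough group-fairness signal. Again, cf_quality_index and cf_group_gap are hypothetical names used only for illustration; the project's actual indices and fairness measures will be defined in the course of the work.

```python
def counterfactual_costs(X, kmeans):
    """Per-point cost (L2 distance) of the cheapest counterfactual."""
    return np.array([best_counterfactual(x, kmeans)[1] for x in X])

def cf_quality_index(X, kmeans):
    # Mean counterfactual cost: larger values mean points are harder to move to
    # another cluster, i.e. the assignment is more compact under this reading.
    return counterfactual_costs(X, kmeans).mean()

def cf_group_gap(X, kmeans, protected):
    # Gap in mean counterfactual cost between the two groups defined by a boolean
    # protected-attribute vector; a large gap suggests one group pays a
    # systematically higher price to reverse its cluster assignment.
    protected = np.asarray(protected, dtype=bool)
    costs = counterfactual_costs(X, kmeans)
    return abs(costs[protected].mean() - costs[~protected].mean())

# Toy usage, reusing X and km from the previous sketch
group = X[:, 0] > 0   # stand-in for a binary protected attribute
print("counterfactual quality index:", round(cf_quality_index(X, km), 3))
print("group fairness gap:", round(cf_group_gap(X, km, group), 3))
```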

Project Information:

  • Project Title: Counterfactuals for Clustering: Explainability, Fairness, and Quality
  • Acronym: FairXCluster
  • Thematic Area: Mathematics & Information Science
  • Thematic Field: Artificial Intelligence and Robotics
  • Host Institution: University of Ioannina
  • Department: Department of Computer Science & Engineering

  • Funding

    The research project is implemented in the framework of the H.F.R.I. call "Basic Research Financing (Horizontal support of all Sciences)" under the National Recovery and Resilience Plan "Greece 2.0", funded by the European Union – NextGenerationEU (H.F.R.I. Project Number: 15940).
