Research

My research sits at the intersection of machine learning, ethics, and society. I study how we can build AI systems that are not only accurate, but also transparent, fair, and privacy-preserving — and what happens when these goals conflict with each other or with commercial incentives.

Explainable AI · Algorithmic Fairness · Privacy · Large Language Models · Computational Social Science

One area of my work focuses on counterfactual explanations (“what would need to change for a different outcome?”) and how they can be used to audit models for bias, quantify privacy risks, and provide actionable recourse to individuals. More recently, I have been studying LLM-based AI agents and their societal implications: do they homogenise our choices? Can we derive their decision boundaries? Can they lead to privacy leakage?
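For readers outside the field, here is a toy sketch of the counterfactual-explanation idea: a brute-force, single-feature search written purely for illustration. It is not the method from any of the papers below, and the model and features are invented.

```python
# Toy counterfactual search: find the smallest single-feature change that
# flips a classifier's prediction for one instance. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))              # three made-up numeric features
y = (X[:, 0] - X[:, 2] > 0).astype(int)    # synthetic binary outcome
model = LogisticRegression().fit(X, y)

x = X[0]                                   # the instance we want to explain
original = model.predict(x.reshape(1, -1))[0]

best = None                                # (feature index, signed change)
for feature in range(X.shape[1]):
    # try candidate changes from smallest to largest magnitude
    for delta in sorted(np.linspace(-3, 3, 601), key=abs):
        if delta == 0:
            continue
        x_cf = x.copy()
        x_cf[feature] += delta
        if model.predict(x_cf.reshape(1, -1))[0] != original:
            if best is None or abs(delta) < abs(best[1]):
                best = (feature, delta)
            break                          # minimal flip for this feature found

if best is not None:
    print(f"Prediction flips if feature {best[0]} changes by {best[1]:+.2f}")
```

Real counterfactual methods search over many features at once and add constraints (plausibility, actionability), but the question answered is the same one quoted above.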

Also on Google Scholar  ·  ORCID  ·  Synced 2026-04-17


Journal Publications

Goethals, S., Rhue, L., & Sundararajan, A. (2026). Fairness Principles Across Contexts: Evaluating Gender Disparities of Facts and Opinions in Large Language Models. AI & Ethics.

Goethals, S., Favier, M., & Calders, T. (2026). Reranking individuals: The effect of fair classification within-groups. ACM Journal on Responsible Computing, 3(2), 1–27.

Greene, T., Goethals, S., Martens, D., & Shmueli, G. (2025). Monetization Could Corrupt Algorithmic Explanations. AI & Society.

Goethals, S., Matz, S., Provost, F., Martens, D., & Ramon, Y. (2025). The Impact of Cloaking Digital Footprints on User Privacy and Personalization. Big Data.

Goethals, S., Sörensen, K., & Martens, D. (2023). The privacy issue of counterfactual explanations: explanation linkage attacks. ACM Transactions on Intelligent Systems and Technology, 14(5), 1–24.

Goethals, S., Martens, D., & Calders, T. (2023). PreCoF: counterfactual explanations for fairness. Machine Learning, 1–32.

Goethals, S., Martens, D., & Evgeniou, T. (2022). The non-linear nature of the cost of comprehensibility. Journal of Big Data, 9(1), 1–23.

Vermeire, T., Brughmans, D., Goethals, S., de Oliveira, R. M. B., & Martens, D. (2022). Explainable image classification with evidence counterfactual. Pattern Analysis and Applications, 25(2), 315–335.


Conference Publications

Ding, F., Shmueli, G., Greene, T., & Goethals, S. (2026). Adapting Multiverse Analysis for Prediction: A Decision-Maker's Dashboard. Navigating the Model Uncertainty and the Rashomon Effect: From Theory and Tools to Applications and Impact.

Goethals, S., Sedoc, J., & Provost, F. (2025). What If the Prompt Were Different? Counterfactual Explanations for the Characteristics of Generative Outputs. Adjunct Proceedings of UMAP 2025, pp. 237–242.

Goethals, S., Luther, J., & Matz, S. (2025). Words reveal wants: How well can simple LLM-based AI agents replicate people's choices based on their social media posts? Adjunct Proceedings of UMAP 2025, pp. 126–131.

Goethals, S., & Rhue, L. (2025). One world, one opinion? The superstar effect in LLM responses. Proceedings of the 2nd Workshop on Cross-Cultural Considerations in NLP (C3NLP @ NAACL).

Goethals, S., Martens, D., & Evgeniou, T. (2023). Manipulation Risks in Explainable AI: The Implications of the Disagreement Problem. ECML-PKDD 2023.

Goethals, S., Martens, D., & Calders, T. (2023). Explainability methods to detect and measure discrimination in machine learning models. European Workshop on Algorithmic Fairness (EWAF). CEUR Workshop Proceedings, Vol. 3442.

Mazzine, R., Goethals, S., Brughmans, D., & Martens, D. (2021). Counterfactual explanations for employment services. FEAST Workshop @ ECML-PKDD 2021.


Preprints

Reusens, M., Goethals, S., Calders, T., & Martens, D. (2026). Would a Large Language Model Pay Extra for a View? Inferring Willingness to Pay from Subjective Choices.

Goethals, S., Provost, F., & Sedoc, J. (2026). Prompt-Counterfactual Explanations for Generative AI System Behavior.

Hinns, J., Goethals, S., Van der Veeken, S., Evgeniou, T., & Martens, D. (2026). On the Definition and Detection of Cherry-Picking in Counterfactual Explanations.

Matz, S., Horton, B., & Goethals, S. (2025). The Basic B*** Effect: The Use of LLM-based Agents Reduces the Distinctiveness and Diversity of People's Choices.

Cedro, M., Ichmoukhamedov, T., Goethals, S., He, Y., Hinns, J., & Martens, D. (2025). Cash or Comfort? How LLMs Value Your Inconvenience.

Martens, D., Shmueli, G., Evgeniou, T., Bauer, K., Janiesch, C., Feuerriegel, S., ... & Provost, F. (2025). Beware of "explanations" of AI.

Goethals, S., Delaney, E., Mittelstadt, B., & Russell, C. (2024). Resource-constrained fairness.