Research

My research sits at the intersection of machine learning, ethics, and society. I study how we can build AI systems that are not only accurate, but also transparent, fair, and privacy-preserving — and what happens when these goals conflict with each other or with commercial incentives.

Explainable AI · Algorithmic Fairness · Privacy · Large Language Models · Computational Social Science

One area of my work focuses on counterfactual explanations (“what would need to change for a different outcome?”) and how they can be used to audit models for bias, quantify privacy risks, and provide actionable recourse to individuals. More recently, I have been studying LLM-based AI agents and their societal implications: do they homogenise our choices? Can we derive their decision boundaries? Can they lead to privacy leakage?
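For readers outside the field, here is a toy sketch of the counterfactual-explanation idea: a brute-force, single-feature search written purely for illustration. It is not the method from any of the papers below, and the model and features are invented.

```python
# Toy counterfactual search: find the smallest single-feature change that
# flips a classifier's prediction for one instance. Illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))              # three made-up numeric features
y = (X[:, 0] - X[:, 2] > 0).astype(int)    # synthetic binary outcome
model = LogisticRegression().fit(X, y)

x = X[0]                                   # the instance we want to explain
original = model.predict(x.reshape(1, -1))[0]

best = None                                # (feature index, signed change)
for feature in range(X.shape[1]):
    # try candidate changes from smallest to largest magnitude
    for delta in sorted(np.linspace(-3, 3, 601), key=abs):
        if delta == 0:
            continue
        x_cf = x.copy()
        x_cf[feature] += delta
        if model.predict(x_cf.reshape(1, -1))[0] != original:
            if best is None or abs(delta) < abs(best[1]):
                best = (feature, delta)
            break                          # minimal flip for this feature found

if best is not None:
    print(f"Prediction flips if feature {best[0]} changes by {best[1]:+.2f}")
```

Real counterfactual methods search over many features at once and add constraints (plausibility, actionability), but the question answered is the same one quoted above.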

Also on Google Scholar  ·  ORCID  ·  Synced 2026-04-17


Journal Publications

Goethals, S., Rhue, L., & Sundararajan, A. (2026). Fairness Principles Across Contexts: Evaluating Gender Disparities of Facts and Opinions in Large Language Models. AI & Ethics.

Goethals, S., Favier, M., & Calders, T. (2026). Reranking individuals: The effect of fair classification within-groups. ACM Journal on Responsible Computing, 3(2), 1–27.

Greene, T., Goethals, S., Martens, D., & Shmueli, G. (2025). Monetization Could Corrupt Algorithmic Explanations. AI & Society.

Goethals, S., Matz, S., Provost, F., Martens, D., & Ramon, Y. (2025). The Impact of Cloaking Digital Footprints on User Privacy and Personalization. Big Data.

Goethals, S., Sörensen, K., & Martens, D. (2023). The privacy issue of counterfactual explanations: explanation linkage attacks. ACM Transactions on Intelligent Systems and Technology, 14(5), 1–24.

Goethals, S., Martens, D., & Calders, T. (2023). PreCoF: counterfactual explanations for fairness. Machine Learning, 1–32.

Goethals, S., Martens, D., & Evgeniou, T. (2022). The non-linear nature of the cost of comprehensibility. Journal of Big Data, 9(1), 1–23.

Vermeire, T., Brughmans, D., Goethals, S., de Oliveira, R. M. B., & Martens, D. (2022). Explainable image classification with evidence counterfactual. Pattern Analysis and Applications, 25(2), 315–335.


Conference Publications

Ding, F., Shmueli, G., Greene, T., & Goethals, S. (2026). Adapting Multiverse Analysis for Prediction: A Decision-Maker's Dashboard. Navigating the Model Uncertainty and the Rashomon Effect: From Theory and Tools to Applications and Impact.

Goethals, S., Sedoc, J., & Provost, F. (2025). What If the Prompt Were Different? Counterfactual Explanations for the Characteristics of Generative Outputs. Adjunct Proceedings of UMAP 2025, pp. 237–242.

Goethals, S., Luther, J., & Matz, S. (2025). Words reveal wants: How well can simple LLM-based AI agents replicate people's choices based on their social media posts? Adjunct Proceedings of UMAP 2025, pp. 126–131.

Goethals, S., & Rhue, L. (2025). One world, one opinion? The superstar effect in LLM responses. Proceedings of the 2nd Workshop on Cross-Cultural Considerations in NLP (C3NLP @ NAACL).

Goethals, S., Martens, D., & Evgeniou, T. (2023). Manipulation Risks in Explainable AI: The Implications of the Disagreement Problem. ECML-PKDD 2023.

Goethals, S., Martens, D., & Calders, T. (2023). Explainability methods to detect and measure discrimination in machine learning models. European Workshop on Algorithmic Fairness (EWAF). CEUR Workshop Proceedings, Vol. 3442.

Mazzine, R., Goethals, S., Brughmans, D., & Martens, D. (2021). Counterfactual explanations for employment services. FEAST Workshop @ ECML-PKDD 2021.


Preprints

Reusens, M., Goethals, S., Calders, T., & Martens, D. (2026). Would a Large Language Model Pay Extra for a View? Inferring Willingness to Pay from Subjective Choices.

Goethals, S., Provost, F., & Sedoc, J. (2026). Prompt-Counterfactual Explanations for Generative AI System Behavior.

Hinns, J., Goethals, S., Van der Veeken, S., Evgeniou, T., & Martens, D. (2026). On the Definition and Detection of Cherry-Picking in Counterfactual Explanations.

Matz, S., Horton, B., & Goethals, S. (2025). The Basic B*** Effect: The Use of LLM-based Agents Reduces the Distinctiveness and Diversity of People's Choices.

Cedro, M., Ichmoukhamedov, T., Goethals, S., He, Y., Hinns, J., & Martens, D. (2025). Cash or Comfort? How LLMs Value Your Inconvenience.

Martens, D., Shmueli, G., Evgeniou, T., Bauer, K., Janiesch, C., Feuerriegel, S., ... & Provost, F. (2025). Beware of "explanations" of AI.

Goethals, S., Delaney, E., Mittelstadt, B., & Russell, C. (2024). Resource-constrained fairness.