My research sits at the intersection of machine learning, ethics, and society. I study how we can build AI systems that are not only accurate, but also transparent, fair, and privacy-preserving — and what happens when these goals conflict with each other or with commercial incentives.
One strand of my work centres on counterfactual explanations, which ask "what would need to change for a different outcome?", and on how they can be used to audit models for bias, quantify privacy risks, and provide actionable recourse to individuals. More recently, I have been studying LLM-based AI agents and their societal implications: do they homogenise our choices? Can we derive their decision boundaries? Can they lead to privacy leakage?
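To make that question concrete, here is a minimal, self-contained sketch of the idea. It is purely illustrative: the toy data, the hypothetical feature names, and the brute-force search are my own assumptions, not the method from any paper below. It trains a small scikit-learn classifier and looks for the smallest single-feature change that flips one individual's prediction.

```python
# Illustrative counterfactual search (a toy sketch, not a reference
# implementation from any of the papers listed on this page): for one
# individual, find the smallest single-feature change that flips the
# classifier's decision, i.e. "what would need to change for a
# different outcome?"
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical "credit scoring" data with two standardised features.
X = rng.normal(size=(500, 2))             # columns: [income_z, debt_z]
y = (X[:, 0] - X[:, 1] > 0).astype(int)   # 1 = approved, 0 = rejected
model = LogisticRegression().fit(X, y)

def counterfactual(x, model, step=0.05, max_delta=6.0):
    """Smallest single-feature change to x that flips the predicted label."""
    original = model.predict(x.reshape(1, -1))[0]
    best = None                                  # (feature index, signed change)
    for j in range(x.size):                      # perturb one feature at a time
        for delta in np.arange(step, max_delta, step):  # smallest change first
            flipped = False
            for sign in (1.0, -1.0):
                x_cf = x.copy()
                x_cf[j] += sign * delta
                if model.predict(x_cf.reshape(1, -1))[0] != original:
                    if best is None or delta < abs(best[1]):
                        best = (j, sign * delta)
                    flipped = True
                    break
            if flipped:
                break                            # larger deltas cannot be smaller
    return best

x = X[model.predict(X) == 0][0]                  # one "rejected" individual
result = counterfactual(x, model)
if result is not None:
    j, change = result
    print(f"Recourse: change feature {j} by {change:+.2f} to flip the decision")
```

Real counterfactual methods replace this brute-force loop with guided search and add constraints for plausibility, sparsity, and actionability; the sketch only conveys the underlying question.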
Also on Google Scholar · ORCID
Goethals, S., Rhue, L., & Sundararajan, A. (2026). Fairness Principles Across Contexts: Evaluating Gender Disparities of Facts and Opinions in Large Language Models. AI & Ethics.
Goethals, S., Favier, M., & Calders, T. (2026). Reranking individuals: The effect of fair classification within-groups. ACM Journal on Responsible Computing, 3(2), 1–27.
Greene, T., Goethals, S., Martens, D., & Shmueli, G. (2025). Monetization Could Corrupt Algorithmic Explanations. AI & Society.
Goethals, S., Matz, S., Provost, F., Martens, D., & Ramon, Y. (2025). The Impact of Cloaking Digital Footprints on User Privacy and Personalization. Big Data.
Goethals, S., Sörensen, K., & Martens, D. (2023). The privacy issue of counterfactual explanations: explanation linkage attacks. ACM Transactions on Intelligent Systems and Technology, 14(5), 1–24.
Goethals, S., Martens, D., & Calders, T. (2023). PreCoF: counterfactual explanations for fairness. Machine Learning, 1–32.
Goethals, S., Martens, D., & Evgeniou, T. (2022). The non-linear nature of the cost of comprehensibility. Journal of Big Data, 9(1), 1–23.
Vermeire, T., Brughmans, D., Goethals, S., de Oliveira, R. M. B., & Martens, D. (2022). Explainable image classification with evidence counterfactual. Pattern Analysis and Applications, 25(2), 315–335.
Ding, F., Shmueli, G., Greene, T., & Goethals, S. (2026). Adapting Multiverse Analysis for Prediction: A Decision-Maker's Dashboard. Navigating the Model Uncertainty and the Rashomon Effect: From Theory and Tools to Applications and Impact.
Goethals, S., Sedoc, J., & Provost, F. (2025). What If the Prompt Were Different? Counterfactual Explanations for the Characteristics of Generative Outputs. Adjunct Proceedings of UMAP 2025, pp. 237–242.
Goethals, S., Luther, J., & Matz, S. (2025). Words reveal wants: How well can simple LLM-based AI agents replicate people's choices based on their social media posts? Adjunct Proceedings of UMAP 2025, pp. 126–131.
Goethals, S. & Rhue, L. (2025). One world, one opinion? The superstar effect in LLM responses. Proceedings of the 2nd Workshop on Cross-Cultural Considerations in NLP (C3NLP @ NAACL).
Goethals, S., Martens, D., & Evgeniou, T. (2023). Manipulation Risks in Explainable AI: The Implications of the Disagreement Problem. ECML-PKDD 2023.
Goethals, S., Martens, D., & Calders, T. (2023). Explainability methods to detect and measure discrimination in machine learning models. European Workshop on Algorithmic Fairness (EWAF). CEUR, Vol. 3442.
Mazzine, R., Goethals, S., Brughmans, D., & Martens, D. (2021). Counterfactual explanations for employment services. FEAST Workshop @ ECML-PKDD 2021.
Matz, S., Horton, B., & Goethals, S. (2025). The Basic B*** Effect: The Use of LLM-based Agents Reduces the Distinctiveness and Diversity of People's Choices.
Cedro, M., Ichmoukhamedov, T., Goethals, S., He, Y., Hinns, J., & Martens, D. (2025). Cash or Comfort? How LLMs Value Your Inconvenience.
Martens, D., Shmueli, G., Evgeniou, T., Bauer, K., Janiesch, C., Feuerriegel, S., ... & Provost, F. (2025). Beware of "explanations" of AI.
Goethals, S., Delaney, E., Mittelstadt, B. & Russell, C. (2024). Resource-constrained fairness.
Reusens, M., Goethals, S., Calders, T., & Martens, D. (2026). Would a Large Language Model Pay Extra for a View? Inferring Willingness to Pay from Subjective Choices.
Goethals, S., Provost, F. & Sedoc, J. (2026). Prompt-Counterfactual Explanations for Generative AI System Behavior.