Explore the topics we focus on, our latest projects, and our research publications.
See how we turn ideas into impact.
What we do
Research – One of the goals of CERTAIN is to conduct excellent research on diverse aspects of trustworthy AI. Several doctoral students and researchers work under the umbrella of CERTAIN on topics such as safe and trustworthy reinforcement learning, interpretability and transparency of large language models, and guardrails for large AI models. Research in CERTAIN also spans topics beyond the purely technical, such as AI ethics, effective and transparent human oversight, interdisciplinary aspects of AI alignment, and more.
Defragmentation – Building trustworthy AI and measuring the trustworthiness of AI systems has emerged as one of the most pressing topics in AI research, especially since the boom of generative AI in recent years. Trustworthiness research is consequently heterogeneous: it borders on many adjacent research fields, and its objectives and processes are often unclear. One of the goals of CERTAIN is to defragment this landscape through publications, standardization proposals, and communication with and among its members. This also includes building a network of European Union stakeholders – institutes, companies, and other entities with diverse backgrounds and views on AI usage and benchmarking.
Application – CERTAIN aims to bridge the gap between academic AI trustworthiness research and industry's concrete need for trustworthy systems. We do so by connecting industry partners that are members of the CERTAIN network with relevant academic stakeholders, and by fostering an environment for applied research on trustworthiness that can be transferred and adapted to the real-world problems AI-using industries face.
Communication – Finally, CERTAIN serves a disseminative purpose by communicating the latest news, calls, collaboration opportunities, and research outcomes around trustworthy AI to the CERTAIN network. We also perform matchmaking between network partners, helping them find complementary expertise, for example when assembling a project consortium or applying for funding.
Topics
Topic: AI Safety, Alignment, Machine Ethics
As AI systems become more capable and autonomous, their behaviour can become opaque, unpredictable, and difficult to control. This increases the risk of unintended, harmful, or misaligned outputs, whether due to technical limitations, misuse, or a mismatch with human goals. Ensuring that these systems are safe, without causing unintended or unnecessary harm, and that they meet broader normative […]
Topic: (Mechanistic) Interpretability, Explainability, Transparency
As large AI systems — particularly language models — grow increasingly powerful and complex, understanding how they operate “under the hood” is no longer optional. Making these typically opaque systems more transparent has become a central goal in modern AI research. Several key subfields contribute to this effort: Advances in these areas are critical not […]
Topic: Fairness
Method: Causal Modeling
Causal models represent how changes in one part of a system affect other system components. In contrast to correlations in statistical (machine learning) models, they can thus explain why effects occur and how external interventions would impact the system. Causal models are grounded in expert knowledge, assumptions and data-based inferences and can ideally be tested […]
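As a toy illustration (not drawn from any CERTAIN project), the sketch below builds a three-variable structural causal model in Python; the variable names and linear mechanisms are invented for the example. It demonstrates the distinction made above: a correlation estimated from observational data is biased by a confounder, while simulating an external intervention (Pearl's do-operator) recovers the true causal effect.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, do_treatment=None):
    # Hypothetical linear SCM: confounder -> treatment, confounder -> outcome,
    # treatment -> outcome. All mechanisms are invented for this example.
    confounder = rng.normal(size=n)
    if do_treatment is None:
        # Observational regime: treatment depends on the confounder.
        treatment = 0.8 * confounder + rng.normal(size=n)
    else:
        # Interventional regime do(T = t): the confounder -> treatment
        # edge is severed and treatment is set from outside.
        treatment = np.full(n, float(do_treatment))
    outcome = 1.5 * treatment + 2.0 * confounder + rng.normal(size=n)
    return treatment, outcome

# Naive regression on observational data overstates the treatment's effect
# (~2.5 here) because the confounder drives both treatment and outcome.
t_obs, y_obs = sample(100_000)
print("observational slope:", np.polyfit(t_obs, y_obs, 1)[0])

# Simulated interventions recover the true causal effect (~1.5).
_, y0 = sample(100_000, do_treatment=0.0)
_, y1 = sample(100_000, do_treatment=1.0)
print("interventional effect:", y1.mean() - y0.mean())
```

In practice, the mechanisms of such a model come from expert knowledge or data-based inference, and its interventional predictions can then be tested against experiments.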
Method: Neuro-Explicit Modeling
Some of the current problems with trust in AI systems stem directly from the widespread use of black-box methods that rely solely on processing data. The next generation of AI instead builds on hybrid AI systems (also known as neuro-symbolic or neuro-explicit). These hybrids do […]
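A minimal sketch of this hybrid idea, under assumptions invented for illustration: an explicit physics formula provides the interpretable backbone, and a learned component corrects only the residual the formula misses (plain least squares stands in here for a neural network).

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic data, invented for illustration: the measured fall time of an
# object dropped from height h deviates from the ideal formula sqrt(2h/g),
# e.g. because of drag effects the formula ignores.
h = rng.uniform(1.0, 100.0, size=500)
measured = np.sqrt(2 * h / 9.81) * (1 + 0.02 * np.sqrt(h))

# Explicit component: known physics, fully interpretable.
physics = np.sqrt(2 * h / 9.81)

# Learned component: fitted only to the residual the physics misses.
features = np.column_stack([np.ones_like(h), h, np.sqrt(h)])
weights, *_ = np.linalg.lstsq(features, measured - physics, rcond=None)
hybrid = physics + features @ weights

print("physics-only mean error:", np.abs(physics - measured).mean())
print("hybrid mean error:      ", np.abs(hybrid - measured).mean())
```

Because the explicit term carries the known structure, the learned part stays small and its contribution can be inspected separately, which purely black-box models do not allow.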
Perspective: Interdisciplinary Trusted AI Research
Although much of CERTAIN’s current research is technical in nature, we firmly believe that trustworthy AI requires more than computer science alone: issues such as hybrid intelligence, human oversight, agency, and the very concept of trust itself cannot be addressed by technical means only. Legal, ethical, empirical, and societal insights are indispensable. Normative frameworks—drawn from legal scholarship, ethics, and regulatory initiatives […]
To achieve these objectives, we employ a wide range of methodologies, including causal inference, neuro-explicit modeling, mechanistic interpretability, and many more. Our interdisciplinary approach enables us to develop innovative yet practical solutions across different domains, driving the advancement of trustworthy AI systems. We are committed to responsible AI development and deployment, ensuring that AI is designed and implemented in a way that supports autonomy, human oversight, responsible decision-making, and many other essential aspects of ethical and sustainable AI.
Projects
Albatross
Albatross aims to advance the scientific frontier by developing effective learning techniques for robust, long-term lifelong learning. These techniques will enable models to sustainably expand their knowledge while preserving previously acquired concepts and, when necessary, intentionally forgetting undesired or harmful capabilities.
DisAI
The DisAI project focuses on advancing research in artificial intelligence and language technologies to address the critical societal challenge of combating disinformation. Led by the Kempelen Institute of Intelligent Technologies (KInIT), the project emphasizes three core research areas: multilingual language technologies, multimodal natural language processing, and trustworthy AI. These efforts aim to develop cutting-edge solutions […]
lorAI
The main objective of the lorAI project is to develop the Kempelen Institute of Intelligent Technologies (KInIT) into a leading R&I institution in the field of low-resource artificial intelligence (LRAI) in Slovakia and in Europe. The lorAI consortium has been carefully assembled to bring together leading European AI research institutes with different organizational models and budget structures, as well as unique national experience and history – ADAPT (Ireland), DFKI […]
MAC-MERLin
The project Multi-Level Abstractions and Causal Modeling for Enhanced Reinforcement Learning (MAC-MERLin) aims to integrate established expert and domain knowledge into artificial neural networks by using multi-level abstractions in combination with causal modeling. Building on the established concepts of causality and abstraction, MAC-MERLin will improve the interpretability of deep reinforcement learning (DRL) agents, aligning them […]
Momentum
MOMENTUM is a research project dedicated to TRUSTED-AI that advances the development and application of artificial intelligence by integrating robustness and explainability. The project aims to make the development of autonomous systems safer, more reliable, and more transparent. Particular attention will be paid to ensuring that these systems can interact […]
Perspicuity and Societal Risk
This interdisciplinary project integrates expertise from law and ethics into research on the design, regulation, and deployment of systems that ensure perspicuity, thereby enabling effective human oversight and control to mitigate risks. By examining both run-time and inspection-time scenarios, the project analyzes the interplay between technical, legal, and ethical perspectives. The goal is to develop […]
TRAILS
The TRAILS project focuses on addressing challenges in natural language processing (NLP) related to linguistic and cultural inclusivity, robustness, and efficiency. Current pre-trained neural models rely on vast amounts of uncurated text data, leading to issues such as cultural bias, underrepresentation of rare languages and phenomena, and overemphasis on large, resource-intensive models. TRAILS aims to […]