Publications of Andreas Holzinger, Human-Centered AI > Scholar, DBLP, ORCID, SCI

The introduction of artificial intelligence (AI) in domains that affect human life (agriculture, climate, forestry, health, …) has led to an increased demand for trustworthy AI. Andreas Holzinger and his group work on Human-Centered AI (HCAI) and multimodal Causability *), motivated by the desire to improve robustness and explainability in order to foster trustworthy AI solutions. Andreas promotes a synergistic approach to keep the human in control and to align AI with human values, ethical principles and legal requirements, ensuring privacy, security, and safety. Large Language Models (such as ChatGPT) show impressively how important the question remains: “Can I trust the results?”

Subject: Computer Science (102) > Artificial Intelligence (102 001)
Technical Area: Machine Learning (102 019)
Keywords: Human-Centered AI (HCAI), Explainable AI (XAI), interactive Machine Learning (iML), Decision Support, trustworthy AI
Application Areas: Domains that impact human life (agriculture, climate, forestry, health, …)
United Nations Sustainability Goals (SDG): 2, 3, 12, 13, 15 (see research plan: https://doi.org/10.3390/s22083043)

ORCID-ID: http://orcid.org/0000-0002-6786-5194

Publication metrics as of December 2023

Google Scholar citations: 25,232, Google Scholar h-Index = 75
Scopus h-Index = 52, Scopus citations = 12,311
Web of Science (Clarivate Science Citation Index SCI – formerly Publons) h-Index = 45, WoS citations = 7,388
DBLP Peer-reviewed conference papers c = 208, Peer-reviewed journal papers j = 108

*) N.B. Causability is neither a typo nor a synonym for Causality in the sense of Judea Pearl. We introduced the term Causa-bil-ity in reference to Usa-bil-ity. While XAI is about implementing transparency and traceability, Causability is about the measurement of the quality of explanations, i.e. the measurable extent to which an explanation of a statement to a user achieves a specified level of causal understanding with effectiveness, efficiency and satisfaction in a specified context of use.

  • Explainability := technically highlights decision-relevant parts of machine representations and machine models, i.e., parts which contributed to model accuracy in training, or to a specific prediction. It does NOT refer to a human model!
  • Causality := the relationship between cause and effect in the sense of Pearl
  • Usability := according to DIN EN ISO 9241-11, the measurable extent to which a software product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use, see Holzinger
  • Causability := the measurable extent to which an explanation to a human achieves a specified level of causal understanding, see Holzinger. It does refer to a human model!

Successful mapping between Explainability and Causability requires new human-AI interfaces which allow a contextual understanding and let domain experts ask questions and explore counterfactuals (“what-if” questions). Critics of this approach continue to ask what the human-in-the-loop should do; my concrete answer: the human-in-the-loop can (sometimes – not always) bring human experience and conceptual knowledge into AI processes – something that the best AI algorithms on this planet are still lacking.

N.B. Causability is neither a typo nor a synonym for Causality in the sense of Judea Pearl. We introduced the term Causa-bil-ity in reference to Usa-bil-ity. While XAI is about implementing transparency and traceability, Causability is about measuring the quality of explanations.

  • Explainability := technically highlights decision-relevant parts of machine representations and machine models, i.e. parts which contributed to model accuracy in training or to a specific prediction. It does not refer to a human model!
  • Usability := (German: “Gebrauchstauglichkeit”) according to DIN EN ISO 9241-11 is the measurable extent to which a system achieves a specified level of usability for a user with effectiveness, efficiency and satisfaction in a specified context of use, see Holzinger
  • Causality := the relationship between cause and effect in the sense of Pearl
  • Causability := (German: “Ursachenerkennbarkeit”, extending “Ursachentauglichkeit”) is the measurable extent to which an explanation of a statement achieves, for a user (the human model!), a specified level of causal understanding with effectiveness, efficiency and satisfaction in a specified context of use, see Holzinger

This requires new human-AI interfaces that enable contextual understanding and allow domain experts to ask questions and explore counterfactuals (“what-if” questions). Critics of this approach keep asking what the human-in-the-loop is supposed to do; my concrete answer: the human-in-the-loop can (sometimes – of course not always) bring human experience and conceptual knowledge into AI processes – something that the best AI algorithms on this planet still lack.


On generating trustworthy counterfactual explanations

Deep learning models like ChatGPT exemplify the success of AI but also necessitate a deeper understanding of trust in critical sectors. Trust can be fostered through counterfactual explanations, which is how humans become familiar with unknown processes: by understanding the hypothetical input circumstances under which the output changes. We argue that generating counterfactual explanations requires considering several aspects of the generated counterfactual instances, not just their counterfactual ability. We present a framework for generating counterfactual explanations that formulates its goal as a multiobjective optimization problem balancing three objectives: plausibility, the intensity of changes, and adversarial power. We use a generative adversarial network to model the distribution of the input, along with a multiobjective counterfactual discovery solver balancing these objectives. We demonstrate the usefulness of the framework on six classification tasks with image and 3D data, confirming with evidence the existence of a trade-off between the objectives, the consistency of the produced counterfactual explanations with human knowledge, and the capability of the framework to unveil the existence of concept-based biases and misrepresented attributes in the input domain of the audited model. Our pioneering effort shall inspire further work on the generation of plausible counterfactual explanations in real-world scenarios where attribute-/concept-based annotations are available for the domain under analysis. Javier Del Ser, Alejandro Barredo-Arrieta, Natalia Díaz-Rodríguez, Francisco Herrera, Anna Saranti, Andreas Holzinger (2024), Information Sciences
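
The multi-objective view can be illustrated with a toy search loop that scores candidate counterfactuals on the three objectives named above. This is only a minimal sketch under invented assumptions (a linear "black box", a Gaussian density as plausibility proxy, random perturbations and a simple scalarisation); it is not the GAN-based framework of the paper.

```python
# Minimal sketch of multi-objective counterfactual search (illustrative only).
# Assumptions: a toy linear "black box", a Gaussian density as plausibility proxy,
# and random perturbations instead of the paper's GAN-based generator.
import numpy as np

rng = np.random.default_rng(0)
w, b = np.array([1.5, -2.0]), 0.1           # toy black-box classifier

def predict(x):                              # class probability of the toy model
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

def objectives(x, x_cf):
    adversarial_power = abs(predict(x_cf) - predict(x))   # how strongly the output flips
    intensity = np.linalg.norm(x_cf - x)                   # how much the input changed
    plausibility = np.exp(-0.5 * np.sum(x_cf ** 2))        # proxy: density under N(0, I)
    return adversarial_power, intensity, plausibility

x = np.array([0.2, 0.8])                     # instance to explain
candidates = x + 0.5 * rng.standard_normal((500, 2))       # random counterfactual proposals

# Simple scalarisation of the three objectives; a real solver would keep a Pareto front.
def score(x_cf):
    adv, inten, plaus = objectives(x, x_cf)
    return adv - 0.5 * inten + 0.3 * plaus

best = max(candidates, key=score)
print("original prediction:", round(predict(x), 3))
print("counterfactual prediction:", round(predict(best), 3), "at", np.round(best, 3))
```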


The Next Frontier: AI We Can Really Trust

This is the keynote paper “The Next Frontier: AI We Can Really Trust”, presented at ECML-PKDD 2021 – Machine Learning and Principles and Practice of Knowledge Discovery in Databases: Holzinger A. (2021) The Next Frontier: AI We Can Really Trust. In: Kamp M. et al. (eds) Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2021. Communications in Computer and Information Science, vol 1524. Springer, Cham. https://doi.org/10.1007/978-3-030-93736-2_33 This paper fosters a human-centered AI approach, because the use of AI in domains that impact human life has led to an increased demand for robust, explainable and hence trustworthy AI. One possible step to make AI more robust is to combine statistical learning with knowledge representations. For certain tasks, it can be advantageous to include a human in the loop to make use of the fascinating abilities of the natural intelligence of humans. [paper, pdf, 752 KB] [Scholar] [WoS]


Digital Transformation in Smart Farm and Forest Operations Needs Human-Centered AI

Andreas Holzinger, Anna Saranti, Alessa Angerschmid, Carl Orge Retzlaff, Andreas Gronauer, Vladimir Pejakovic, Francisco Medel, Theresa Krexner, Christoph Gollob & Karl Stampfer (2022). Digital Transformation in Smart Farm and Forest Operations needs Human-Centered AI: Challenges and Future Directions. Sensors, 22, (8), 3043, doi:10.3390/s22083043. In this paper we describe the use of trustworthy AI, fostered by explainability and robustness, in two domains that are important for humankind and our planet: agriculture and forestry. One step to make AI more robust is to use expert knowledge. For example, a farmer or a forester can bring their experience and conceptual understanding into the AI pipeline. Consequently, human-centered AI (HCAI) can be seen as a combination of ‘artificial intelligence’ and ‘natural intelligence’ to strengthen, enhance and complement human performance, rather than replacing humans. To ensure practical success we introduce three Frontier Research Areas: (1) Intelligent Information Fusion, (2) Robotics and Embodied Intelligence, and (3) Augmentation, Explanation and Verification for Trustworthy Decision Support. [Scholar] [WoS]


Towards multi-modal causability with Graph Neural Networks enabling information fusion for explainable AI

Andreas Holzinger, Bernd Malle, Anna Saranti & Bastian Pfeifer (2021). Towards Multi-Modal Causability with Graph Neural Networks enabling Information Fusion for explainable AI. Information Fusion, 71, (7), 28-37, doi:10.1016/j.inffus.2021.01.008. In this paper our central hypothesis is that using conceptual knowledge as a guiding model of reality will help to train more explainable, more robust and less biased machine learning models, ideally able to learn from less data. One important aspect in many application domains is that various modalities contribute to one single result. Our main question is: “How can we construct a multi-modal feature representation space (spanning images, text, genomics data) using knowledge bases as an initial connector for the development of novel explanation techniques?” In this paper we argue for using Graph Neural Networks as a method of choice, enabling information fusion for multi-modal causability (causability – not to be confused with causality – is the measurable extent to which an explanation to a human expert achieves a specified level of causal understanding). [Project Page] [Scholar] [publons]
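
To make the idea of information fusion over a graph concrete, here is a hand-rolled single message-passing step in NumPy: node features from different (hypothetical) modalities are projected into a shared space and then averaged over a neighbourhood defined by an adjacency matrix. This is an illustrative sketch with toy dimensions and random weights, not the architecture used in the paper.

```python
# Minimal sketch of one graph message-passing step over multi-modal node features.
# Assumptions: toy dimensions and random weights; in the paper's setting the graph
# would come from a knowledge base linking image, text and genomics entities.
import numpy as np

rng = np.random.default_rng(42)

# Three nodes with features from different modalities (different dimensionalities).
image_feat = rng.standard_normal((1, 8))     # e.g. embedding of an image patch
text_feat  = rng.standard_normal((1, 5))     # e.g. embedding of a report sentence
omics_feat = rng.standard_normal((1, 12))    # e.g. gene-expression summary

# Modality-specific projections into a shared 6-dimensional space.
W_img, W_txt, W_omx = (rng.standard_normal((d, 6)) for d in (8, 5, 12))
H = np.vstack([image_feat @ W_img, text_feat @ W_txt, omics_feat @ W_omx])  # (3, 6)

# Adjacency with self-loops: here every node is connected to every other node.
A = np.ones((3, 3))
D_inv = np.diag(1.0 / A.sum(axis=1))

# One graph-convolution-like step: average neighbours, transform, apply ReLU.
W_gnn = rng.standard_normal((6, 6))
H_next = np.maximum(0.0, D_inv @ A @ H @ W_gnn)
print(H_next.shape)   # (3, 6): fused node representations
```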


Classification by ordinal sums of conjunctive & disjunctive functions for explainable AI & interpretable machine learning

Miroslav Hudec, Erika Minarikova, Radko Mesiar, Anna Saranti & Andreas Holzinger (2021). Classification by ordinal sums of conjunctive and disjunctive functions for explainable AI and interpretable machine learning solutions. Knowledge Based Systems, doi:10.1016/j.knosys.2021.106916. In this paper we propose a novel classification according to aggregation functions of mixed behavior, realized by variability in ordinal sums of conjunctive and disjunctive functions. Domain experts are empowered to assign only the most important observations regarding the considered attributes. This has the advantage that the variability of the functions provides opportunities for machine learning to learn the best possible option. Such a solution is comprehensible, reproducible and explainable-per-design to domain experts. We discuss the proposed approach with examples and outline the research steps in interactive machine learning with a human-in-the-loop over aggregation functions. Although human experts are not always able to explain their reasoning either, they are sometimes able to bring in experience, contextual understanding and implicit knowledge, which is desirable in certain machine learning tasks and can contribute to the robustness of algorithms. [Scholar] [publons]
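
To give a feel for ordinal sums of aggregation functions, the sketch below uses the classical ordinal-sum construction on [0, 1] with two t-norm summands (product and Lukasiewicz) chosen for illustration; the paper goes further and mixes conjunctive and disjunctive summands, which is not reproduced here.

```python
# Minimal sketch of an ordinal-sum aggregation on [0, 1] (illustrative only).
# Classical construction with two t-norm summands; the paper's approach mixes
# conjunctive and disjunctive functions, which this toy example does not cover.

def product_tnorm(x, y):
    return x * y

def lukasiewicz_tnorm(x, y):
    return max(0.0, x + y - 1.0)

SUMMANDS = [(0.0, 0.5, product_tnorm), (0.5, 1.0, lukasiewicz_tnorm)]

def ordinal_sum(x, y):
    """Aggregate x, y in [0,1]: behaves like the local summand inside each
    sub-square [a,b]^2 and like min(x, y) everywhere else."""
    for a, b, t in SUMMANDS:
        if a <= x <= b and a <= y <= b:
            u, v = (x - a) / (b - a), (y - a) / (b - a)   # rescale to [0, 1]
            return a + (b - a) * t(u, v)                  # rescale back to [a, b]
    return min(x, y)

print(ordinal_sum(0.2, 0.4))   # handled by the product t-norm summand -> 0.16
print(ordinal_sum(0.7, 0.9))   # handled by the Lukasiewicz summand -> 0.6
print(ordinal_sum(0.3, 0.8))   # falls back to min -> 0.3
```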

 


KANDINSKYPatterns - An experimental exploration environment for Pattern Analysis and Machine Intelligence

Andreas Holzinger, Anna Saranti & Heimo Mueller (2021). arXiv 2103.00519. Machine intelligence is successful at recognition tasks when high-quality data are available. There is still a gap between machine-level pattern recognition and human-level concept learning: humans can learn under uncertainty from few examples and generalize these concepts to solve new problems. The growing interest in explainable AI requires diagnostic tests to analyze weaknesses in existing approaches. In this paper, we discuss existing diagnostic test data sets such as CLEVR, CLEVRER, CLOSURE, CURI, Bongard-LOGO and V-PROM, and present our own experimental environment: KANDINSKYPatterns, named after Wassily Kandinsky, who made contributions to compositivity, i.e. the idea that all perceptions consist of geometrically elementary individual components. This was experimentally proven by Hubel & Wiesel in the 1960s and became the basis for machine learning approaches such as the Neocognitron and later Deep Learning. While KANDINSKYPatterns have computationally controllable properties, providing a ground truth, they are also distinguishable by human observers, i.e., controlled patterns can be described by both humans and algorithms, making them another important contribution to international research in machine intelligence. [Project Page] [Scholar]
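
As a toy impression of such a diagnostic environment: generate small "figures" as sets of geometric objects with symbolic attributes and test them against a logical ground-truth rule. The attribute values and the rule below are invented for illustration; this is not the actual KANDINSKYPatterns generator.

```python
# Toy sketch in the spirit of KANDINSKY Figures: each figure is a set of objects
# with symbolic attributes, and membership in a "pattern" is a logical rule.
# Illustrative only; attributes and the rule are invented for this example.
import random

SHAPES, COLORS = ["circle", "square", "triangle"], ["red", "blue", "yellow"]

def random_figure(n_objects=4):
    return [{"shape": random.choice(SHAPES),
             "color": random.choice(COLORS),
             "x": random.random(), "y": random.random(),
             "size": random.uniform(0.05, 0.2)} for _ in range(n_objects)]

def ground_truth(figure):
    """Example rule: the figure belongs to the pattern iff all circles are red."""
    return all(o["color"] == "red" for o in figure if o["shape"] == "circle")

random.seed(1)
figures = [random_figure() for _ in range(1000)]
positives = [f for f in figures if ground_truth(f)]
print(f"{len(positives)} of {len(figures)} random figures satisfy the rule")
```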


Artificial Intelligence and Machine Learning for Digital Pathology

Andreas Holzinger, Randy Goebel, Michael Mengel, Heimo Mueller (eds.) 2020. Artificial Intelligence and Machine Learning for Digital Pathology: State-of-the-Art and Future Challenges, Springer Lecture Notes in Artificial Intelligence Volume 12090, doi:10.1007/978-3-030-50402-1. Data-driven AI and ML in digital pathology, radiology and dermatology are promising. In specific cases Deep Learning even exceeds human performance; however, in the context of medicine it is important for a human expert to verify the outcome. There is a need for transparency and re-traceability of state-of-the-art solutions to make them usable for ethically responsible medical decision support. Moreover, big data is required for training, covering a wide spectrum of human diseases in different organ systems. These data sets must meet top-quality and regulatory criteria and must be well annotated for ML at patient, sample and image level. Mentioned by the WHO as an important application of AI for human health. [Book Homepage]


Measuring the Quality of Explanations: The System Causability Scale (SCS). Comparing Human and Machine Explanations.

Andreas Holzinger, Andre Carrington & Heimo Müller 2020. Measuring the Quality of Explanations: The System Causability Scale (SCS). Comparing Human and Machine Explanations. KI – Künstliche Intelligenz (German Journal of Artificial Intelligence), Special Issue on Interactive Machine Learning, Edited by Kristian Kersting, TU Darmstadt, 34, (2), doi:10.1007/s13218-020-00636-z, online available via https://link.springer.com/article/10.1007/s13218-020-00636-z In this paper we introduce the System Causability Scale (SCS) to measure the quality of explanations. It is based on the notion of Causability (Holzinger et al., 2019) combined with concepts adapted from the widely accepted System Usability Scale (SUS). In the same way as usability measures the quality of use, Causability measures the quality of explanations. [xAI-Project] [Scholar] [publons]
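
A minimal sketch of aggregating a ten-item Likert questionnaire such as the SCS is shown below. The aggregation used here (mean rating normalised to [0, 1]) and the example ratings are assumptions for illustration only; the exact scoring procedure is defined in the paper.

```python
# Minimal sketch of aggregating a ten-item Likert questionnaire such as the
# System Causability Scale (SCS). The aggregation below (mean rating normalised
# to [0, 1]) is an assumption for illustration; see the paper for the exact scoring.

def scs_like_score(ratings, max_rating=5):
    if len(ratings) != 10:
        raise ValueError("expected ten item ratings")
    return sum(ratings) / (len(ratings) * max_rating)

# Example: one participant's ratings of an explanation interface
# (1 = strongly disagree, 5 = strongly agree); values invented for illustration.
ratings = [4, 5, 3, 4, 4, 5, 2, 4, 3, 4]
print(f"SCS-like score: {scs_like_score(ratings):.2f}")   # 0.76
```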


Causability and Explainability of Artificial Intelligence in Medicine

Andreas Holzinger, Georg Langs, Helmut Denk, Kurt Zatloukal & Heimo Mueller 2019. Causability and Explainability of AI in Medicine. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, doi:10.1002/widm.1312. In this paper we introduce the notion of Causability, which extends explainability and is of great importance for future Human-AI interfaces (see our paper on dialogue systems for intelligent user interfaces). Such interfaces for explainable AI have to map technical explainability (which is a property of an AI, e.g. the heatmap of a neural network produced by, e.g., layer-wise relevance propagation) to causability (which is a property of a human, i.e. the extent to which the technical explanation is interpretable by a human) and to answer the question of why we need a ground truth, i.e. a framework for understanding. Here counterfactuals P(y_x | x′, y′) are important, with the typical activity of “retrospection” and questions including “what if?” [Systems Causability Scale] [Scholar] [publons]


KANDINSKY Patterns: A Swiss-Knife for the Study of Explainable AI

Andreas Holzinger, Peter Kieseberg & Heimo Müller 2020. KANDINSKY Patterns: A Swiss-Knife for the Study of Explainable AI. ERCIM News, (120), 41-42. [pdf, 755 KB] Online available: https://ercim-news.ercim.eu/en120/r-i/kandinsky-patterns-a-swiss-knife-for-the-study-of-explainable-ai Kandinsky Patterns enable testing, benchmarking and evaluating machine learning algorithms under mathematically strictly controllable conditions, but at the same time are accessible and understandable for human observers, with the possibility to produce (and hide) a ground truth. This helps us to understand “how do humans explain?” and to do basic research on ground truth. This is important, as adversarial examples have already demonstrated their potential in attacking security mechanisms applied in various domains, especially medical environments. Last, but not least, Kandinsky Patterns can be used to produce “counterfactuals” – the “what if”, which is difficult to handle for both humans and machines – but can provide novel insights into the behaviour of explanation methods. [Project page]


A new concordant partial AUC and partial c statistic for imbalanced data in the evaluation of machine learning algorithms

André M. Carrington, Paul W. Fieguth, Hammad Qazi, Andreas Holzinger, Helen H. Chen, Franz Mayr & Douglas G. Manuel 2020. A new concordant partial AUC and partial c statistic for imbalanced data in the evaluation of machine learning algorithms. Springer/Nature BMC Medical Informatics and Decision Making, 20, (1), 4, doi:10.1186/s12911-019-1014-6. In explainable AI a very important issue is the robustness of machine learning algorithms. For measuring robustness, we introduce a novel concordant partial Area Under the Curve (AUC) and a new partial c statistic for Receiver Operating Characteristic (ROC) data as foundational measures to help to understand and to explain ROC and AUC. Our partial measures are continuous and discrete versions of the same measure, are derived from the AUC and c statistic respectively, are validated as equal to each other, and validated as equal in summation to whole measures where expected. [relevant for xAI] [Scholar] [publons]
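
For context, the sketch below computes the plain partial AUC over a restricted false-positive-rate range by trapezoidal integration on a toy data set. The concordant partial AUC and partial c statistic of the paper are refinements of this idea; only the underlying notion is illustrated here.

```python
# Minimal sketch: partial area under the ROC curve restricted to an FPR range.
# This is the plain partial AUC on toy data; the paper's concordant partial AUC
# and partial c statistic refine this idea (see the publication for definitions).
import numpy as np

def roc_curve(y_true, scores):
    order = np.argsort(-np.asarray(scores))          # descending score
    y = np.asarray(y_true)[order]
    tps = np.cumsum(y)                               # true positives per threshold
    fps = np.cumsum(1 - y)                           # false positives per threshold
    tpr = np.concatenate(([0.0], tps / y.sum()))
    fpr = np.concatenate(([0.0], fps / (len(y) - y.sum())))
    return fpr, tpr

def partial_auc(y_true, scores, fpr_lo=0.0, fpr_hi=0.2):
    fpr, tpr = roc_curve(y_true, scores)
    grid = np.linspace(fpr_lo, fpr_hi, 200)
    tpr_interp = np.interp(grid, fpr, tpr)           # ROC as a function of FPR
    return np.trapz(tpr_interp, grid)                # area over the restricted range

y_true = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.55, 0.6, 0.65, 0.7, 0.8, 0.9]
print(f"partial AUC on FPR in [0, 0.2]: {partial_auc(y_true, scores):.3f}")
```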


KANDINSKY Patterns as Intelligence Test for machines

Andreas Holzinger, Michael Kickmeier-Rust & Heimo Mueller 2019. KANDINSKY Patterns as IQ-Test for machine learning. Springer Lecture Notes LNCS 11713. Cham (CH): Springer Nature Switzerland, pp. 1-14, doi:10.1007/978-3-030-29726-8_1. AI follows the notion of human intelligence, which is not a clearly defined term; according to cognitive science it includes the abilities to think abstractly, to reason, and to solve problems from the real world. A hot topic in current AI/machine learning research is to find out whether and to what extent algorithms are able to learn abstract thinking and reasoning similarly to humans, or whether the learning remains based on purely statistical correlations. In this paper we propose to use our Kandinsky Patterns as an IQ-Test for machines and to study concept learning, which is a fundamental problem for future AI/ML. [Paper] [exploration environment] [TEDx] [Scholar] [publons]


Dialogue Systems for Intelligent Human Computer Interactions

Erinc Merdivan, Deepika Singh, Sten Hanke & Andreas Holzinger 2019. Dialogue Systems for Intelligent Human Computer Interactions. Electronic Notes in Theoretical Computer Science, 343, 57-71, doi:10.1016/j.entcs.2019.04.010. Online available via: https://www.sciencedirect.com/science/article/pii/S1571066119300106 In this paper we present some fundamentals of communication techniques for interaction in dialogues involving speech, gesture, semantic and pragmatic knowledge, and present a new image-based method in an out-of-vocabulary setting. The results show that treating a dialogue as an image performs well and helps the dialogue manager in expanding out-of-vocabulary dialogue tasks in comparison to Memory Networks. This is important for future Human-AI interfaces. [relevant for xAI] [Scholar] [publons]


The first publication on our KANDINSKY Universe, the experimental environment for explainability and causability

Mueller, H. & Holzinger, A. 2021. Kandinsky Patterns. Artificial Intelligence, 300, (11), 103546, doi:10.1016/j.artint.2021.103546. In the medical domain (e.g. histopathology) the ground truth is in generally accepted textbooks, hence in the brain of the human pathologist, but often not directly accessible. Here the KANDINSKY Figures and KANDINSKY Patterns come into play: these are mathematically-logically describable, simple, self-contained, hence controllable test data sets for the development, validation and training of explainability/interpretability in artificial intelligence (AI) and machine learning (ML). While they possess these computationally manageable properties, they are at the same time easily distinguishable by human observers, so they can be described by both humans and algorithms. We invite the international machine learning research community to a challenge: to experiment with our Kandinsky Patterns and thereby make progress in the field of explainable AI and contribute to the upcoming field of explainability and causability. [Project Page] [Scholar] [publons]


Interactive machine learning: experimental evidence for the human in the algorithmic loop: A case study on Ant Colony Optimization

Andreas Holzinger, Markus Plass, Michael Kickmeier-Rust, Katharina Holzinger, Gloria Cerasela Crişan, Camelia-M. Pintea & Vasile Palade 2019. Interactive machine learning: experimental evidence for the human in the algorithmic loop. Applied Intelligence, 49, (7), 2401-2414, doi:10.1007/s10489-018-1361-5. Online available: https://link.springer.com/article/10.1007/s10489-018-1361-5 In this paper we provide novel experimental insights on how we can improve computational intelligence by complementing it with human intelligence in an interactive machine learning approach (iML). For this purpose, we used the Ant Colony Optimization (ACO) framework, because it fosters multi-agent approaches with human agents in the loop (see when we need a human-in-the-loop). We propose a unification of human intelligence and interaction abilities with the computing power of an artificial machine learning system. The “human-in-the-loop” brings in conceptual knowledge that no algorithm on this planet yet has. [Scholar] [publons]
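
The following compact Ant Colony Optimization loop for a tiny travelling-salesman problem includes a hook where a human could re-weight pheromones on preferred edges. It is a simplified sketch with invented parameters, not the experimental platform used in the paper.

```python
# Compact Ant Colony Optimization for a small TSP with a human-in-the-loop hook
# (illustrative sketch only; parameters invented, not the paper's platform).
import numpy as np

rng = np.random.default_rng(3)
cities = rng.random((8, 2))                                   # random city coordinates
dist = np.linalg.norm(cities[:, None] - cities[None, :], axis=-1) + np.eye(8)  # eye avoids 1/0
pheromone = np.ones_like(dist)

def build_tour(alpha=1.0, beta=2.0):
    tour, unvisited = [0], set(range(1, 8))
    while unvisited:
        i = tour[-1]
        cand = np.array(sorted(unvisited))
        weights = (pheromone[i, cand] ** alpha) * ((1.0 / dist[i, cand]) ** beta)
        tour.append(int(rng.choice(cand, p=weights / weights.sum())))
        unvisited.remove(tour[-1])
    return tour

def tour_length(tour):
    return sum(dist[a, b] for a, b in zip(tour, tour[1:] + tour[:1]))

def human_feedback(preferred_edges, boost=2.0):
    """Human-in-the-loop hook: an expert strengthens pheromone on edges
    they consider promising (a placeholder for interactive input)."""
    for a, b in preferred_edges:
        pheromone[a, b] *= boost
        pheromone[b, a] *= boost

best = None
for it in range(50):
    tours = [build_tour() for _ in range(10)]
    for t in tours:                                           # pheromone deposit
        pheromone[t, np.roll(t, -1)] += 1.0 / tour_length(t)
    pheromone *= 0.9                                          # evaporation
    best = min(tours + ([best] if best else []), key=tour_length)
    if it == 25:
        human_feedback([(0, 1)])        # e.g. the expert prefers the edge 0-1
print(f"best tour length: {tour_length(best):.3f}")
```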


Evaluating the Quality of Machine Learning Explanations: A Survey on Methods and Metrics

Zhou, J., Gandomi, A. H., Chen, F. & Holzinger, A. 2021. Evaluating the Quality of Machine Learning Explanations: A Survey on Methods and Metrics. Electronics, 10, (5), 593, doi:10.3390/electronics10050593. While numerous explanation methods have been explored, there is a need for evaluations that quantify the quality of explanation methods, determine whether and to what extent the offered explainability achieves the defined objective, and compare available explanation methods in order to suggest the best explanation for a specific task. This survey paper presents a comprehensive overview of methods proposed in the current literature for the evaluation of ML explanations. We identify properties of explainability from a review of definitions of explainability; these identified properties are used as objectives that evaluation metrics should achieve. The survey found that quantitative metrics for both model-based and example-based explanations are primarily used to evaluate the parsimony/simplicity of interpretability, while quantitative metrics for attribution-based explanations are primarily used to evaluate the soundness/fidelity of explainability. The survey also showed that subjective measures, such as trust and confidence, have been embraced as the focal point for the human-centered evaluation of explainable systems. The paper concludes that the evaluation of ML explanations is a multidisciplinary research topic and that it is not possible to define one implementation of evaluation metrics which can be applied to all explanation methods. [Scholar] [publons]


From Computer Innovation to Human Integration: Current Trends and Challenges for Pervasive Health Technologies

Röcker, C., Ziefle, M. & Holzinger, A. 2014. From Computer Innovation to Human Integration: Current Trends and Challenges for Pervasive Health Technologies. In: Holzinger, Andreas, Ziefle, Martina & Röcker, Carsten (eds.) Pervasive Health: State-of-the-Art and Beyond. Springer London, pp. 1-17, doi:10.1007/978-1-4471-6413-5_1. This chapter starts with an overview of the technical innovations and societal transformation processes we have seen in the last decades, as well as the consequences those changes have for the design of pervasive healthcare systems. Based on this theoretical foundation, emerging design requirements and research challenges are outlined, which are crucial to address when developing future health technologies. [Scholar] [publons]


Rapid prototyping for a Virtual Medical Campus interface

Holzinger, A. 2004. Rapid Prototyping to the User Interface Development for a Virtual Medical Campus. IEEE Software, 21, (1), 92–99, doi:10.1109/MS.2004.1259241. This is a best-practice paper about the design of a Virtual Medical Campus: working under a strict timeline, we used simple, rapid, cost-effective prototyping techniques to create a user interface and release a working system within six months. Involving users early in the interface design facilitated acceptance. The VMC system architecture includes a multimedia repository for reusable learning objects; middleware containing the VMC logic that arranges learning objects into lectures, themes, and modules; and the user interface. To ensure the interface suited the target population, we used the user-centered design method. This pioneering project began in 2002 and was worldwide one of the first large-scale projects in the development of a virtual campus system to support full and large-scale online learning – 20 years later, in times of the COVID-19 pandemic, all this becomes important again. [Scholar] [publons]


Why imaging data alone is not enough: AI-based integration of imaging, omics, and clinical data

Andreas Holzinger, Benjamin Haibe-Kains & Igor Jurisica 2019. Why imaging data alone is not enough: AI-based integration of imaging, omics, and clinical data. European Journal of Nuclear Medicine and Molecular Imaging, 46, (13), 2722-2730, doi:10.1007/s00259-019-04382-9. Integration of clinical, imaging and molecular data is necessary to understand complex diseases and to achieve accurate diagnoses that provide the best possible treatment. In addition to the need for sufficient computing resources, suitable algorithms, models, and data infrastructure, three important aspects are often neglected: (1) the need for multiple independent, sufficiently large and, above all, high-quality data sets; (2) the need for domain knowledge and ontologies; and (3) the requirement for multiple networks that provide relevant relationships among biological entities. While one will always get results out of high-dimensional data, all three aspects are essential to provide robust training and validation of ML models, to provide explainable hypotheses and results, and to achieve the necessary trust in AI and confidence for clinical applications. [Preprint available here]


Human Activity Recognition Using Recurrent Neural Networks

Deepika Singh, Erinc Merdivan, Ismini Psychoula, Johannes Kropf, Sten Hanke, Matthieu Geist & Andreas Holzinger 2017. Human Activity Recognition Using Recurrent Neural Networks. In: Lecture Notes in Computer Science LNCS 10410. Cham: Springer International, pp. 267-274, doi:10.1007/978-3-319-66808-6_18. In this paper, we introduce a deep learning model that learns to classify human activities without using any prior knowledge. For this purpose, a Long Short-Term Memory (LSTM) recurrent neural network was applied to three real-world smart home datasets. The results of our experiments show that the proposed approach outperforms existing ones in terms of accuracy and performance. Human activity recognition using smart home sensors is one of the bases of ubiquitous computing in smart environments and a topic undergoing intense research in the field of ambient assisted living. The increasingly large amount of data sets calls for machine learning methods. https://arxiv.org/abs/1804.07144
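
A minimal PyTorch sketch of an LSTM classifier over windows of sensor readings is given below. Layer sizes, window length and the random data are invented for illustration; they are not the configuration or datasets used in the paper.

```python
# Minimal PyTorch sketch of an LSTM classifier for sensor-based activity recognition.
# Layer sizes, window length and the random data are invented for illustration;
# see the paper for the actual smart-home datasets and configuration.
import torch
import torch.nn as nn

class ActivityLSTM(nn.Module):
    def __init__(self, n_sensors=20, hidden=64, n_activities=10):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_sensors, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_activities)

    def forward(self, x):                 # x: (batch, time, n_sensors)
        _, (h_n, _) = self.lstm(x)        # h_n: (1, batch, hidden)
        return self.head(h_n[-1])         # logits: (batch, n_activities)

model = ActivityLSTM()
x = torch.randn(32, 50, 20)               # 32 windows of 50 time steps, 20 sensors
y = torch.randint(0, 10, (32,))            # activity labels
loss = nn.CrossEntropyLoss()(model(x), y)
loss.backward()
print(f"training loss on random data: {loss.item():.3f}")
```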


Augmenting Statistical Data Dissemination by Short Quantified Sentences of Natural Language

Miroslav Hudec, Erika Bednárová & Andreas Holzinger 2018. Augmenting Statistical Data Dissemination by Short Quantified Sentences of Natural Language. Journal of Official Statistics (JOS), 34, (4), 981, doi:10.2478/jos-2018-0048. Online available: https://content.sciendo.com/view/journals/jos/34/4/article-p981.xml In this paper we study the potential of natural language summaries expressed in short quantified sentences. Linguistic summaries are not intended to replace existing dissemination approaches, but can augment them by providing alternatives for the benefit of diverse users (e.g. domain experts, general public, disabled people, …). The concept of linguistic summaries is demonstrated on test interfaces, which can be important for future human-AI dialogue systems. [relevant for xAI]
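
The following sketch evaluates a quantified sentence such as "most of the records have a high value" in the classic Zadeh style. The membership functions and the fuzzy quantifier are invented for illustration and are not those used in the paper.

```python
# Minimal sketch of evaluating a linguistically quantified sentence in the style of
# Zadeh: "Most of the records have a high value." The membership functions and the
# quantifier below are invented for illustration, not those used in the paper.

def mu_high(value, low=60.0, high=90.0):
    """Fuzzy membership of 'high value', rising linearly between low and high."""
    return max(0.0, min(1.0, (value - low) / (high - low)))

def mu_most(proportion):
    """Fuzzy quantifier 'most': 0 below 0.3, 1 above 0.8, linear in between."""
    return max(0.0, min(1.0, (proportion - 0.3) / 0.5))

def truth_of_summary(values):
    proportion = sum(mu_high(v) for v in values) / len(values)
    return mu_most(proportion)

monthly_values = [55, 72, 88, 91, 64, 79, 95, 83, 70, 62, 99, 85]
print(f"truth of 'most values are high': {truth_of_summary(monthly_values):.2f}")
```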


Computational approaches for mining user’s opinions on the Web 2.0

Gerald Petz, Michał Karpowicz, Harald Fuerschuss, Andreas Auinger, Vaclav Stritesky & Andreas Holzinger 2014. Computational approaches for mining user’s opinions on the Web 2.0. Information Processing & Management, 50, (6), 899-908, doi:10.1016/j.ipm.2014.07.005. Computational opinion mining discovers, extracts and analyzes people’s opinions, sentiments, attitudes and emotions towards certain topics in social media. While providing interesting market research information, user-generated content presents numerous challenges regarding systematic analysis, owing to the differences and unique characteristics of the various social media channels. Here we report on the determination of such particularities and deduce their impact on text preprocessing and opinion mining algorithms (sentiment analysis). [RG] [publons]


Explainable AI: The New 42?

Randy Goebel, Ajay Chander, Katharina Holzinger, Freddy Lecue, Zeynep Akata, Simone Stumpf, Peter Kieseberg & Andreas Holzinger 2018. Explainable AI: the new 42? Springer Lecture Notes in Computer Science LNCS 11015. Cham: Springer, pp. 295-303, doi:10.1007/978-3-319-99740-7_21. In this 2018 output of our yearly xAI workshop at the CD-MAKE conference we discuss some issues of the current state of the art in what is now called explainable AI and outline what we think is the next big thing in AI/machine learning: the combination of statistical probabilistic machine learning methods with classic logic-based symbolic artificial intelligence. Maybe the field of explainable AI can act as an ideal bridge to combine these two worlds. [pdf, 875 kB]


Integrated web visualizations for protein-protein interaction databases

Jeanquartier, F., Jean-Quartier, C. & Holzinger, A. 2015. Integrated web visualizations for protein-protein interaction databases. BMC Bioinformatics, 16, (1), 195, doi:10.1186/s12859-015-0615-z. Understanding living systems is crucial for curing diseases. To achieve this we have to understand biological networks based on protein-protein interactions. Bioinformatics has come up with a great number of databases and tools that support analysts in exploring protein-protein interactions on an integrated level for knowledge discovery. They provide predictions and correlations, indicate possibilities for future experimental research and fill the gaps to complete the picture of biochemical processes. There are numerous, huge databases of protein-protein interactions used to gain insights into answering some of the many questions of systems biology, and many computational resources integrate interaction data with additional information on molecular background. However, the vast number of diverse bioinformatics resources poses an obstacle to the goal of understanding. We present a survey of databases that enable the visual analysis of protein networks. We selected M=10 out of N=53 resources supporting visualization and tested them against the following set of criteria: interoperability, data integration, quantity of possible interactions, data visualization quality and data coverage. The study reveals differences in usability, visualization features and quality, as well as in the quantity of interactions. StringDB is the recommended first choice. CPDB presents a comprehensive dataset and IntAct lets the user change the network layout. A comprehensive comparison table is available via web. [Scholar] [WoS]


Interpretierbare KI: Neue Methoden zeigen Entscheidungswege künstlicher Intelligenz auf

Andreas Holzinger 2018. Interpretierbare KI: Neue Methoden zeigen Entscheidungswege künstlicher Intelligenz auf. c’t Magazin für Computertechnik, 22, 136-141. Machine learning today produces AI systems that make decisions faster than a human. But should humans allow themselves to be disempowered in this way? New methods make decision paths traceable, interpretable and thus transparent, and so create trust and acceptance – or they uncover misunderstandings. Humans can (sometimes – not always) understand relationships in context and generalize from few examples. A human expert can help where AI reaches its limits, but AI can also assist where humans reach their limits. Physicians can be relieved of monotonous routine tasks while, at the same time, AI systems and human experts together make better decisions than either on their own [pdf, 871 kB]. Available online: https://www.heise.de/select/ct/2018/22/1540263049336608


Explainable AI

Andreas Holzinger 2018. Explainable AI (ex-AI). Informatik-Spektrum, 41, (2), 138-143, doi:10.1007/s00287-018-1102-5. “Explainable AI” is not a new field. The problem of explainability is as old as AI itself – indeed, it is rather a result of it. While the rule-based solutions of early AI were comprehensible “glass-box” approaches, their weakness lay in dealing with the uncertainties of the real world. Through the introduction of probabilistic modeling and statistical learning methods, applications became increasingly successful – but ever more complex and opaque. For example, words of natural language are mapped to high-dimensional vectors and thereby become incomprehensible to humans. In the future, context-adaptive methods will be necessary that establish a link between statistical learning methods and large knowledge representations (ontologies) and allow traceability, comprehensibility and explainability – the goal of “explainable AI”. Available online: https://link.springer.com/article/10.1007/s00287-018-1102-5


Emotion Detection: Application of the Valence Arousal Space for Rapid Biological Usability Testing

Stickel, C., Ebner, M., Steinbach-Nordmann, S., Searle, G. & Holzinger, A. 2009. Emotion Detection: Application of the Valence Arousal Space for Rapid Biological Usability Testing to Enhance Universal Access. In: Stephanidis, Constantine (ed.) Universal Access in Human-Computer Interaction. Addressing Diversity, Lecture Notes in Computer Science, LNCS 5614. Berlin, Heidelberg: Springer, pp. 615-624, doi:10.1007/978-3-642-02707-9-70. Emotion is an important mental and physiological state – in times of AI even more important – influencing cognition, perception, learning, communication, decision making, etc. It is considered a definitively important aspect of user experience (UX), although it is among the least well developed and, most of all, lacks experimental evidence. This paper deals with an application for emotion detection in usability testing of software. It describes an approach that utilizes the valence-arousal space for emotion modeling in a formal experiment. Our study revealed correlations between low performance and negative emotional states. Reliable emotion detection in usability tests will help to prevent negative emotions and attitudes in the final products. [Scholar] [publons]


Human Annotated Dialogues Dataset for Natural Conversational Agents

Erinc Merdivan, Deepika Singh, Sten Hanke, Johannes Kropf, Andreas Holzinger & Matthieu Geist 2020. Human Annotated Dialogues Dataset for Natural Conversational Agents. Applied Sciences, 10, (3), 1-16, doi:10.3390/app10030762. [Scholar] We developed a benchmark dataset with human annotations and replies, useful for developing metrics for conversational agents. This is relevant for the xAI research community, because conversational agents are gaining huge popularity in industrial applications (e.g. digital assistants, chatbots, and particularly systems for natural language understanding (NLU) and medical decision support). A major drawback is the unavailability of a common metric to evaluate replies against human judgement for conversational agents. Human responses include: (i) ratings of the dialogue reply in relevance to the dialogue history; and (ii) unique dialogue replies for each dialogue history from the users. This enables evaluating models against six unique human responses for each given history. A detailed analysis of how dialogues are structured and of human perception of dialogue scores in comparison with existing models is also presented.


Convolutional and Recurrent Neural Networks for Activity Recognition in Smart Environment

Singh, D., Merdivan, E., Hanke, S., Kropf, J., Geist, M. & Holzinger, A. 2017. Convolutional and Recurrent Neural Networks for Activity Recognition in Smart Environment. In: Holzinger, Andreas, Goebel, Randy, Ferri, Massimo & Palade, Vasile (eds.) Towards Integrative Machine Learning and Knowledge Extraction: BIRS Workshop, Banff, AB, Canada, July 24-26, 2015, Revised Selected Papers. Cham: Springer International Publishing, pp. 194–205, doi:10.1007/978-3-319-69775-8_12. Convolutional Neural Networks (CNN) are very useful for the fully automatic extraction of discriminative features from raw sensor data. This is an important problem in activity recognition, which is of enormous interest in ambient sensor environments due to its universality across various applications. Activity recognition in smart homes uses large amounts of time-series sensor data to infer daily living activities, and extracting effective features from those activities is a challenging task. In this paper we demonstrate the use of a CNN and compare the results with Long Short-Term Memory (LSTM) recurrent neural networks and other machine learning algorithms, including Naive Bayes, Hidden Markov Models, Hidden Semi-Markov Models and Conditional Random Fields. [Scholar] [WoS]
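
For comparison with the LSTM sketch further above, here is a minimal PyTorch sketch of a 1D convolutional classifier over the same kind of sensor windows; again, all sizes and the random data are invented and do not reflect the paper's setup.

```python
# Minimal PyTorch sketch of a 1D-CNN for activity recognition on sensor windows
# (sizes and random data invented for illustration, not the paper's setup).
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv1d(in_channels=20, out_channels=32, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Conv1d(32, 64, kernel_size=5, padding=2),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),              # global average pooling over time
    nn.Flatten(),
    nn.Linear(64, 10),                    # 10 activity classes
)

x = torch.randn(32, 20, 50)               # (batch, sensors as channels, time steps)
print(cnn(x).shape)                        # torch.Size([32, 10])
```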


Towards a Deeper Understanding of How a Pathologist Makes a Diagnosis

Birgit Pohn, Michaela Kargl, Robert Reihs, Andreas Holzinger, Kurt Zatloukal & Heimo Müller. Towards a Deeper Understanding of How a Pathologist Makes a Diagnosis: Visualization of the Diagnostic Process in Histopathology. IEEE Symposium on Computers and Communications (ISCC 2019), 2019 Barcelona. IEEE, 1081-1086, doi:10.1109/ISCC47284.2019.8969598. Advancements in Artificial Intelligence (AI) and Machine Learning (ML) are enabling new diagnostic capabilities. In this paper we argue that the very first step before introducing AI/ML into diagnostic workflows is a deep understanding of how pathologists work. We developed a visualization concept, including: (a) the sequence of the views observed by the pathologist (Observation Path), (b) the sequence of the spoken comments and statements of the pathologist (Dictation Path), (c) the underlying knowledge and experience of the pathologist (Knowledge Path), (d) information about the current phase of the diagnostic process, and (e) the current magnification factor of the microscope chosen by the pathologist. This is highly important for explainable AI. [Paper] [Scholar]


NLP for the Generation of Training Data Sets for Ontology-Guided Weakly-Supervised Machine Learning in Digital Pathology

Robert Reihs, Birgit Pohn, Kurt Zatloukal, Andreas Holzinger & Heimo Müller. NLP for the Generation of Training Data Sets for Ontology-Guided Weakly-Supervised Machine Learning in Digital Pathology. 2019 IEEE Symposium on Computers and Communications (ISCC), 2019. IEEE, 1072-1076, doi:10.1109/ISCC47284.2019.8969703. The combination of ontologies with machine learning (ML) approaches is a hot topic that is not yet extensively investigated but has great future potential – particularly for explainable AI and interpretable machine learning. Since full annotation at pixel level would be impracticably expensive, a practical solution lies in weakly-supervised ML. In this paper we used ontology-guided natural language processing (NLP) for term extraction and a decision tree built with an expert-curated classification system. This demonstrates the practical value of our solution for analyzing and structuring training data sets for ML and as a tool for the generation of biobank catalogues. [xAI-Project] [Scholar] [RG]


In silico modeling for tumor growth visualization

Fleur Jeanquartier, Claire Jean-Quartier, David Cemernek & Andreas Holzinger 2016. In silico modeling for tumor growth visualization. BMC Systems Biology, 10, (1), 1-15, doi:10.1186/s12918-016-0318-8.

In-silico methods overcome the lack of wet experimental possibilities and, as dry methods, succeed in terms of reduction, refinement and replacement of animal experimentation, also known as the 3R principles. Our visualization approach to simulation allows for more flexible usage and easy extension to facilitate understanding and gain novel insight. Biomedical research in general, and research on tumor growth in particular, will benefit from the systems biology perspective. We aim to provide a comprehensive and expandable simulation tool for visualizing tumor growth. This novel web-based application offers the advantage of a user-friendly graphical interface with several manipulable input variables to correlate different aspects of tumor growth. [Paper] [Scholar]
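
As an example of the kind of growth model such a tool visualizes, the sketch below integrates the Gompertz equation dV/dt = a·V·ln(K/V) with simple Euler steps. The parameter values are invented for illustration and are not taken from the paper.

```python
# Minimal sketch of an in-silico tumour growth curve using the Gompertz model
# dV/dt = a * V * ln(K / V); parameter values are invented for illustration.
import math

def gompertz_growth(v0=1.0, a=0.05, carrying_capacity=1000.0, days=365, dt=1.0):
    volumes, v = [v0], v0
    for _ in range(int(days / dt)):
        v += dt * a * v * math.log(carrying_capacity / v)   # explicit Euler step
        volumes.append(v)
    return volumes

curve = gompertz_growth()
for day in (0, 30, 90, 180, 365):
    print(f"day {day:3d}: tumour volume ~ {curve[day]:8.1f}")
```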


In silico cancer research towards 3R

Claire Jean-Quartier, Fleur Jeanquartier, Igor Jurisica & Andreas Holzinger 2018. In silico cancer research towards 3R. Springer/Nature BMC cancer, 18, (1), 408, doi:10.1186/s12885-018-4302-0

Underlining and extending the in-silico approach with respect to the 3Rs (replacement, reduction, refinement) will lead cancer research towards efficient and effective precision medicine. Therefore, we suggest refined translational models and testing methods based on integrative analyses and the incorporation of computational biology within cancer research. We give an overview of in vivo, in vitro and in silico methods used in cancer research. Common models such as cell lines, xenografts, or genetically modified rodents reflect relevant pathological processes to a different degree, but cannot replicate the full spectrum of human disease. There is an increasing importance of computational biology, advancing from the task of assisting biological analysis with network biology approaches as the basis for understanding a cell’s functional organization up to model building for predictive systems. [Paper] [Scholar]


From extreme programming & usability engineering to extreme usability in software engineering education (XP+UE > XU)

The success of extreme programming (XP) is based, among other things, on optimal communication in teams of 6-12 persons, simplicity, frequent releases and a reaction to changing demands. Most of all, the customer is integrated into the development process, with constant feedback. This is very similar to usability engineering (UE), which follows a spiral four-phase procedure model (analysis, draft, development, test) and a three-step (paper mock-up, prototype, final product) production model. In comparison, these phases are extremely shortened in XP; also, the ideal team size in UE user-centered development is 4-6 people, including the end user. The two development approaches have different goals but, at the same time, employ similar methods to achieve them. It seems obvious that there must be synergy in combining them. The authors present ideas on how to combine them in an even more powerful development method called extreme usability (XU). The most important issue of this paper is that the authors have embedded their ideas into software engineering education. [Scholar]


Biomedical informatics: Discovering knowledge in big data

This book provides a broad overview of biomedical informatics with a focus on data, information and knowledge. From data acquisition and storage to visualization, ranging through privacy, regulatory and other practical and theoretical topics, the author touches on several fundamental aspects of the innovative interface between the medical and technology domains that is biomedical informatics. Each chapter starts by providing a useful inventory of definitions and commonly used acronyms for each topic, and throughout the text the reader finds several real-world examples, methodologies and ideas that complement the technical and theoretical background. This new edition includes new sections at the end of each chapter, called “future outlook and research avenues”, providing pointers to future challenges. At the beginning of each chapter a new section called “key problems” has been added, where the author discusses possible traps and unsolvable or major problems. https://www.springer.com/de/book/9783319045276


Expectations of Artificial Intelligence for Pathology

Peter Regitnig, Heimo Mueller & Andreas Holzinger 2020. Expectations of Artificial Intelligence in Pathology. Springer Lecture Notes in Artificial Intelligence LNAI 12090. Cham: Springer, pp. 1-15, doi:10.1007/978-3-030-50402-1-1 [For students, pdf, 1,3 MB]

Within the last ten years, essential steps have been made to bring artificial intelligence (AI) successfully into the field of pathology. However, most medical experts are still far away from using AI in daily practice. This paper focuses on tasks that could be solved, or done better, by AI or image-based algorithms compared to a human expert. In particular, this paper focuses on the needs and demands of surgical pathologists; examples include: finding small tumour deposits within lymph nodes, detection and grading of cancer, quantification of positive tumour cells in immunohistochemistry, pre-checking of Papanicolaou-stained gynaecological cytology in cervical cancer screening, text feature extraction, text interpretation for tumour-coding error prevention, and AI in the next-generation virtual autopsy.


Legal, regulatory, ethical frameworks for standards in artificial intelligence and autonomous robotic surgery

Shane O’Sullivan, Nathalie Nevejans, Colin Allen, Andrew Blyth, Simon Leonard, Ugo Pagallo, Katharina Holzinger, Andreas Holzinger, Mohammed Imran Sajid & Hutan Ashrafian 2019. Legal, regulatory, and ethical frameworks for development of standards in artificial intelligence (AI) and autonomous robotic surgery. The International Journal of Medical Robotics and Computer Assisted Surgery, 15, (1), 1-12, doi:10.1002/rcs.1968. We classify responsibility into (1) Accountability; (2) Liability; and (3) Culpability. All three aspects were addressed when discussing responsibility for AI and autonomous surgical robots, be the patients civil or military (however, these aspects may require revision in cases where robots become citizens). The component which produces the least clarity is Culpability, since it is unthinkable in the current state of technology. We envision that in the near future a surgical robot can learn and perform routine operative tasks that can then be supervised by a human surgeon. This represents a surgical parallel to autonomously driven vehicles. Here a human remains in the ‘driving seat’ as a ‘doctor-in-the-loop’, thereby safeguarding patients undergoing operations that are supported by surgical machines with autonomous capabilities.


Analysis of biomedical data with multilevel glyphs

Heimo Müller, Robert Reihs, Kurt Zatloukal & Andreas Holzinger 2014. Analysis of biomedical data with multilevel glyphs. BMC Bioinformatics, 15, (Suppl 6), S5, doi:10.1186/1471-2105-15-S6-S5. We present multilevel data glyphs optimized for interactive knowledge discovery and visualization of large biomedical data sets. Data glyphs are 3D objects defined by multiple levels of geometric descriptions (levels of detail) combined with a mapping of data attributes to graphical elements and methods, which specify their spatial position. In the data mapping phase, meta-information about the attributes (scale, number of distinct values) is compared with the visual capabilities of the graphical elements in order to give feedback to the user about the correctness of the variable mapping. The spatial arrangement of glyphs is done in a dimetric view, which leads to high data density, simplifies 3D navigation and avoids perspective distortion. We show the usage of data glyphs in the disease analyser for personalized medicine, where they have been successfully applied. In particular, the automatic validation of the data mapping, the selection of subgroups within histograms and the visual comparison of the value distributions were seen by experts as important functionality.


From Machine Learning to Explainable AI (reading for students)

Andreas Holzinger 2018. From Machine Learning to Explainable AI. 2018 World Symposium on Digital Intelligence for Systems and Machines (IEEE DISA). IEEE, pp. 55-66, doi:10.1109/DISA.2018.8490530. The success of statistical machine learning (ML) methods made the field of Artificial Intelligence (AI) popular again, after the last AI winter. Meanwhile, deep learning approaches even exceed human performance in particular tasks. However, such approaches have some disadvantages besides needing big, high-quality data, much computational power and engineering effort: they are becoming increasingly opaque, and even if we understand the underlying mathematical principles of such models they still lack explicit declarative knowledge. For example, words are mapped to high-dimensional vectors, making them unintelligible to humans. What we need in the future are context-adaptive procedures, i.e. systems that construct contextual explanatory models for classes of real-world phenomena. This is the goal of explainable AI, which is not a new field; rather, the problem of explainability is as old as AI itself. While rule-based approaches of early AI were comprehensible “glass-box” approaches, at least in narrow domains, their weakness lay in dealing with uncertainties of the real world. Maybe one step further is to link probabilistic learning methods with large knowledge representations (ontologies) and logical approaches, thus making results re-traceable, explainable and comprehensible on demand. [For my students]


On Graph Extraction from Image Data

Andreas Holzinger, Bernd Malle & Nicola Giuliani 2014. On Graph Extraction from Image Data. In: Slezak, Dominik, Peters, James F., Tan, Ah-Hwee & Schwabe, Lars (eds.) Brain Informatics and Health, BIH 2014, Lecture Notes in Artificial Intelligence, LNAI 8609. Heidelberg, Berlin: Springer, pp. 552-563, doi:10.1007/978-3-319-09891-3-50. A hot topic in AI/machine learning is learning from graphs, particularly as graphs are a data structure which fosters explainability/causability. Any such approach first needs a relevant and robust graph representation of the image data. In this paper we present a novel approach to knowledge discovery by extracting graph structures from natural image data. For this purpose, we created a framework built upon modern Web technologies, utilizing HTML canvas and pure JavaScript inside a Web browser, which is a very promising engineering approach. This was the basis for our Graphinius project. [Paper]
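
A minimal NumPy/SciPy sketch of the general idea: threshold a toy grayscale image, take connected-component centroids as graph nodes and connect nodes that lie within a distance threshold. This is illustrative only and is not the JavaScript/HTML-canvas framework described in the paper.

```python
# Minimal sketch of extracting a graph from image data: threshold a toy grayscale
# image, use connected-component centroids as nodes, and connect nearby nodes.
# Illustrative only; the paper describes a JavaScript/HTML-canvas framework.
import numpy as np
from scipy import ndimage

image = np.zeros((64, 64))
image[10:20, 10:20] = 1.0                  # three bright "objects"
image[40:48, 30:38] = 1.0
image[15:22, 45:52] = 1.0

mask = image > 0.5                          # foreground by global threshold
labels, n = ndimage.label(mask)             # connected components
centroids = np.array(ndimage.center_of_mass(mask, labels, range(1, n + 1)))

# Build edges between centroids closer than a distance threshold.
edges = [(i, j)
         for i in range(n) for j in range(i + 1, n)
         if np.linalg.norm(centroids[i] - centroids[j]) < 40.0]

print(f"{n} nodes, edges: {edges}")
```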


The European Legal Framework for Medical AI

Schneeberger, D., Stöger, K. & Holzinger, A. The European Legal Framework for Medical AI. In: Springer Lecture Notes in Computer Science LNCS 12279, (2020) Cham. Springer International, doi:10.1007/978-3-030-57321-8_12. In February 2020, the EC published a White Paper on AI and a report on the safety and liability implications of AI, the Internet of Things (IoT) and robotics. The EC highlighted the “European Approach” to AI, stressing that “it is vital that European AI is grounded in our values and fundamental rights such as human dignity and privacy protection”. It also announced its intention to propose EU legislation for “high risk” AI applications in the near future, which will include the majority of medical AI applications. We analyse the current European framework regulating medical AI. Starting with the fundamental rights framework as a clear guideline, we focus on data protection, product approval procedures and liability law. This analysis of the current state of the law, including its problems and ambiguities regarding AI, is complemented by an outlook on the proposed amendments to product approval procedures and liability law, which, by endorsing a human-centred approach, will influence how medical AI will be used in Europe in the future. [paper] [Scholar] [publons]


Current Advances, Trends and Challenges in Machine Learning and Knowledge Extraction

Andreas Holzinger, Peter Kieseberg, Edgar Weippl & A Min Tjoa (2018). Current Advances, Trends and Challenges of Machine Learning and Knowledge Extraction: From Machine Learning to Explainable AI. Springer Lecture Notes in Computer Science LNCS 11015. Cham: Springer, pp. 1-8, doi:10.1007/978-3-319-99740-7-1. In this editorial we present thoughts on future trends in AI generally, and ML specifically. Industry is investing heavily in AI, and spin-offs and start-ups are emerging at an unprecedented rate. The European Union is allocating a lot of additional funding to AI research grants, and various institutions are calling for a joint European AI research institute. Even universities are taking AI/ML into their curricula and strategic plans. Finally, even the people on the street talk about it, and if grandma knows what her grandson is doing in his new start-up, then the time is ripe: we are reaching a new AI spring. However, as fantastic as current approaches seem to be, there are still huge problems to be solved: the best performing models lack transparency and hence are considered to be black boxes. The general and worldwide trends in privacy, data protection, safety and security make such black-box solutions difficult to use in practice, specifically in Europe, where the new General Data Protection Regulation (GDPR), which came into effect on May 25, 2018, affects everybody (right to explanation). Consequently, explainable AI, a niche field for many years, is exploding in importance. For the future, we envision a fruitful marriage between classic logical approaches (ontologies) and statistical approaches, which may lead to context-adaptive systems (stochastic ontologies) that might work similarly to our human brain.


The Ten Commandments of Ethical Medical AI

Mueller, H., Mayrhofer, M. T., Veen, E.-B. V. & Holzinger, A. 2021. The Ten Commandments of Ethical Medical AI. IEEE COMPUTER, 54, (7), 119–123, doi:10.1109/MC.2021.3074263. In this paper we propose ten commandments as practical guidelines for those applying artificial intelligence, providing a concise checklist for a wide group of stakeholders. The aim of the third United Nations (UN) Sustainable Development Goal, dedicated to “Good Health and Well-Being,” is that all people can access the health services they need without facing financial hardship. The goal has three targets: 1) 1 billion more people should benefit from universal health coverage, 2) 1 billion more people should be better protected from health emergencies, and 3) 1 billion more people should enjoy better health and well-being (World Health Organization, 2018). Artificial intelligence (AI) is generally acknowledged as an important component in achieving these three targets. [Scholar] [publons]


Digital Transformation for Sustainable Development Goals (SDGs) - A Security, Safety and Privacy Perspective on AI

Holzinger, A., Weippl, E., Tjoa, A. M. & Kieseberg, P. 2021. Digital Transformation for Sustainable Development Goals (SDGs) – a Security, Safety and Privacy Perspective on AI. Springer Lecture Notes in Computer Science, LNCS 12844. Cham: Springer, pp. 1-20, doi:10.1007/978-3-030-84060-0_1. The main driver of digital transformation is artificial intelligence (AI). The potential of AI to benefit humanity and its environment is enormous. AI can help find new solutions to the most pressing challenges in virtually all areas of life: from agriculture and forest ecosystems that affect our entire planet, to the health of every single human being. This article highlights a very different aspect: for all its benefits, the large-scale adoption of AI technologies also holds enormous and unimagined potential for new kinds of unforeseen threats. All stakeholders (governments, policy makers, industry, and academia) must ensure that AI is developed with these potential threats in mind and that the safety, traceability, transparency, explainability, validity, and verifiability of AI applications in our everyday lives are ensured. It is the responsibility of all stakeholders to ensure the use of trustworthy AI. Achieving this will require a concerted effort to ensure that AI is always consistent with human values and contributes to a future that is safe in every way for all people on this planet. In this paper, we describe some of these threats and show that safety, security and explainability are indispensable cross-cutting issues; we highlight this with two exemplary application areas: smart agriculture and smart health. [Scholar] [publons] 12


Legal aspects of data cleansing in medical AI

Stoeger, K., Schneeberger, D., Kieseberg, P. & Holzinger, A. (2021). Legal aspects of data cleansing in medical AI. Computer Law and Security Review, 42, 105587, doi:10.1016/j.clsr.2021.105587. Data quality is of paramount importance for the smooth functioning of modern data-driven AI applications with machine learning as a core technology. This is also true for medical AI, where malfunctions due to “dirty data” can have particularly dramatic harmful implications. Consequently, data cleansing is an important part of improving the usability of (Big) Data for medical AI systems. However, it should not be overlooked that data cleansing can also have negative effects on data quality if not performed carefully. This paper takes an interdisciplinary look at some of the technical and legal challenges of data cleansing against the background of European medical device law, with the key message that technical and legal aspects must always be considered together in such a sensitive context. [Scholar] [publons] 02


Network Module Detection from Multi-Modal Node Features with a Greedy Decision Forest for Actionable Explainable AI (AXAI)

Network-based algorithms are used in many real-world applications and are of great practical value. In this work, we demonstrate subnetwork detection based on multimodal node features using a new Greedy Decision Forest with improved interpretability, which will be a crucial factor for gaining the trust of human experts in the future. We show a concrete application example from bioinformatics and systems biology with a special focus on biomedicine, but our methodological approach is applicable in many other fields as well. Systems biology is a very good example of a field in which statistical, data-driven machine learning enables the analysis of large amounts of multimodal biomedical data. This is important for achieving the future goal of precision applications (e.g., precision medicine), where complexity is modeled at the system level to, for example, optimally tailor decisions, health practices, and therapies to individual patients. Our glass-box approach could be revolutionary in uncovering disease-causing network modules from multi-omics data to better understand diseases such as cancer. https://arxiv.org/abs/2108.11674 [Project Page]
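
To make the idea of a Greedy Decision Forest more tangible, here is a deliberately simplified sketch (not the implementation from the project page; the function names, the single-feature-per-node setup and the use of networkx and scikit-learn are assumptions for illustration only): a candidate subnetwork is grown greedily from a seed node as long as a decision tree trained only on the features of the selected nodes improves its cross-validated score, and a forest is obtained by repeating this from many random seeds.

# Illustrative sketch of greedy subnetwork growth with decision trees.
# NOT the authors' Greedy Decision Forest; names and setup are assumptions.
import networkx as nx
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def greedy_subnetwork(graph: nx.Graph, X: np.ndarray, y: np.ndarray, seed: int):
    """Grow a connected node set whose features best predict the label y.
    For brevity, X holds one feature column per graph node; multimodal
    node features would simply contribute several columns per node."""
    selected = {seed}
    best = cross_val_score(DecisionTreeClassifier(max_depth=3),
                           X[:, sorted(selected)], y, cv=5).mean()
    improved = True
    while improved:
        improved = False
        # candidate nodes are neighbours of the current subnetwork
        frontier = set().union(*(graph.neighbors(n) for n in selected)) - selected
        for cand in frontier:
            cols = sorted(selected | {cand})
            score = cross_val_score(DecisionTreeClassifier(max_depth=3),
                                    X[:, cols], y, cv=5).mean()
            if score > best:                 # keep the expansion only if it helps
                best, selected, improved = score, selected | {cand}, True
                break
    return selected, best

# A "forest" of subnetworks is obtained by repeating the greedy growth from
# many random seed nodes and ranking the detected subnetworks by their scores.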


Medical Artificial Intelligence: The European Legal Perspective

Karl Stöger, David Schneeberger & Andreas Holzinger (2021). Medical Artificial Intelligence: The European Legal Perspective. Communications of the ACM, 64, (11), doi:10.1145/3458652. Although the European Commission proposed new legislation for the use of “high-risk artificial intelligence” earlier this year, the existing European fundamental rights framework already provides some clear guidance on the use of medical AI. The European Commission has already published a white paper on artificial intelligence (AI) and an accompanying report on the security and liability implications of AI, the Internet of Things (IoT), and robotics. Here, the “European approach” to AI is highlighted, emphasizing that “it is crucial that European AI is based on human values and fundamental rights and privacy protection.” In April 2021, a proposal for a regulation entitled the Artificial Intelligence Act was presented. This regulation is intended to regulate the use of “high-risk” AI applications, which include those AI applications that affect human life in some way. [pdf preprint] [Scholar] [publons]


Robust, explainable, and trustworthy artificial intelligence

Holzinger, A., Dehmer, M., Emmert-Streib, F., Cucchiara, R., Augenstein, I., Del Ser, J., Samek, W., Jurisica, I. & Díaz-Rodríguez, N. 2022. Information fusion as an integrative cross-cutting enabler to achieve robust, explainable, and trustworthy medical artificial intelligence. Information Fusion, 79, (3), 263–278, doi:10.1016/j.inffus.2021.10.007. In this paper we argue that if we want to use AI to solve real-world problems outside the lab and in routine environments (beyond i.i.d. data), we need to integrate conceptual knowledge as a guiding model of reality so as to help develop more robust, explainable, and less biased machine learning models that can ideally learn from less data. We argue that achieving these goals will require a coordinated joint effort combining three complementary pioneering research areas: (1) complex networks (graphs) and their inference, (2) graphical causal models and counterfactual models, and (3) verification, interpretability and explainability methods. [Scholar] [publons] 07


Explainable AI Methods - A Brief Overview

Holzinger, A., Saranti, A., Molnar, C., Biecek, P. & Samek, W. 2022. Explainable AI Methods – A Brief Overview. XXAI – Lecture Notes in Artificial Intelligence LNAI 13200. Cham: Springer, pp. 13-38, doi:10.1007/978-3-031-04083-2_2. In this paper, we briefly introduce a few selected methods and discuss them in a short, clear and concise way. The goal of this article is to give beginners, especially application engineers and data scientists, a quick overview of the state of the art in explainable AI (XAI). The following 17 methods are covered in this chapter: LIME, Anchors, GraphLIME, LRP, DTD, PDA, TCAV, XGNN, SHAP, ASV, Break-Down, Shapley Flow, Textual Explanations of Visual Models, Integrated Gradients, Causal Models, Meaningful Perturbations, and X-NeSyL. [Scholar] [publons] –
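
For readers who want a hands-on starting point, one of the listed methods (SHAP) can be tried in a few lines. The snippet below is a minimal, illustrative use of the open-source shap package on a toy regression model; the dataset and model choice are arbitrary and not taken from the chapter.

# Minimal illustrative SHAP example (one of the 17 methods listed above);
# dataset and model are arbitrary choices, not taken from the chapter.
import numpy as np
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.TreeExplainer(model)      # model-specific explainer for tree ensembles
shap_values = explainer.shap_values(X)     # per-sample, per-feature attributions

# Global view: mean absolute attribution per feature, largest first
importance = np.abs(shap_values).mean(axis=0)
for name, value in sorted(zip(X.columns, importance), key=lambda t: -t[1]):
    print(f"{name}: {value:.2f}")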


Toward Human-AI Interfaces to Support Explainability and Causability in Medical AI

Andreas Holzinger & Heimo Mueller (2021). Toward Human-AI Interfaces to Support Explainability and Causability in Medical AI. IEEE COMPUTER, 54, (10), 78-86, doi:10.1109/MC.2021.3092610.  Our concept of causability is a measure of whether and to what extent humans can understand a given machine explanation. We motivate causability with a clinical case from cancer research. We argue for using causability in medical artificial intelligence (AI) to develop and evaluate future human–AI interfaces. In Figure 2, we outline a model for the information flow between humans and an AI system. On the interaction surface, which can be seen as a “border” between human intelligence and AI, the information flow is maximal. As one gradually goes “deeper” into the AI system, the information flow decreases; at the same time, the semantic richness (SR) of potential information objects increases. In traditional human–computer interactions, the information flow is extremely asymmetrical; that is, much more information is shown by high-resolution displays compared to mouse and/or textual input—not to mention other input modalities (see the dotted line in Figure 2). [WoS]

 


The explainability paradox: Challenges for xAI in digital pathology

Theodore Evans, Carl Orge Retzlaff, Christian Geißler, Michaela Kargl, Markus Plass, Heimo Müller, Tim-Rasmus Kiehl, Norman Zerbe & Andreas Holzinger (2022). The explainability paradox: Challenges for xAI in digital pathology. Future Generation Computer Systems, 133, (8), 281–296, doi:10.1016/j.future.2022.03.009. The increasing prevalence of digitised workflows in diagnostic pathology opens the door to life-saving applications of artificial intelligence (AI). Explainability is identified as a critical component for the safety, approval and acceptance of AI systems for clinical use. Despite the cross-disciplinary challenge of building explainable AI (xAI), very few application- and user-centric studies in this domain have been carried out. We conducted the first mixed-methods study of user interaction with samples of state-of-the-art AI explainability techniques for digital pathology. This study reveals challenging dilemmas faced by developers of xAI solutions for medicine and proposes empirically backed principles for their safer and more effective design. [WoS]


Emotion Detection: Application of the Valence Arousal Space for Rapid Biological Usability Testing

Christian Stickel, Martin Ebner, Silke Steinbach-Nordmann, Gig Searle & Andreas Holzinger (2009). Emotion Detection: Application of the Valence Arousal Space for Rapid Biological Usability Testing to Enhance Universal Access. In: Stephanidis, Constantine (ed.) Universal Access in Human-Computer Interaction. Addressing Diversity, Lecture Notes in Computer Science, LNCS 5614. Berlin, Heidelberg: Springer, pp. 615–624, doi:10.1007/978-3-642-02707-9_70. Emotions are important mental and physiological states that influence cognition, perception, learning, communication, decision making and more. They are considered an important aspect of user experience (UX), even though methods for measuring them are not yet well developed and, most importantly, solid experimental evidence is still scarce. This contribution addresses an application for emotion detection in software usability testing. It describes the approach of using the valence arousal space for emotion modeling in a formal experiment. Our study showed correlations between low performance and negative emotional states. Reliable detection of emotions in usability tests will help to avoid negative emotions and attitudes in final products, which can be a great advantage for improving Universal Access. [Physiological Computing] [Scholar] [WoS]
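
As a small illustration of the valence–arousal space referred to above (a sketch only, not the original experimental code; thresholds and quadrant labels are assumptions), continuous valence and arousal readings can be mapped to coarse emotion quadrants:

# Illustrative mapping from the valence-arousal space to coarse emotion
# quadrants; thresholds and labels are assumed, not taken from the paper.
def emotion_quadrant(valence: float, arousal: float) -> str:
    """valence and arousal are expected in [-1, 1], with 0 as neutral."""
    if valence >= 0 and arousal >= 0:
        return "excited/happy"      # positive valence, high arousal
    if valence < 0 and arousal >= 0:
        return "stressed/angry"     # negative valence, high arousal
    if valence < 0 and arousal < 0:
        return "sad/bored"          # negative valence, low arousal
    return "calm/relaxed"           # positive valence, low arousal

print(emotion_quadrant(0.4, -0.6))  # -> calm/relaxed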


Machine Learning and Knowledge Extraction to Support Work Safety for Smart Forest Operations

Hoenigsberger, F., Saranti, A., Angerschmid, A., Retzlaff, C. O., Gollob, C., Witzmann, S., Nothdurft, A., Kieseberg, P., Holzinger, A. & Stampfer, K. 2022. Machine Learning and Knowledge Extraction to Support Work Safety for Smart Forest Operations. International Cross-Domain Conference for Machine Learning and Knowledge Extraction. Cham: Springer, pp. 362–375, doi:10.1007/978-3-031-14463-9_23.

Forestry work is one of the most difficult and dangerous professions in all production areas worldwide; therefore, any contribution to increasing occupational safety plays a major role, in line with Sustainable Development Goal 3 (good health and well-being). Detailed records of occupational accidents and the analysis of these data play an important role in understanding the interacting factors that lead to occupational accidents and, where possible, mitigating them in the future. However, the application of machine learning and knowledge extraction in this domain is still in its infancy, so this contribution is also intended to serve as a starting point and test bed for the future application of artificial intelligence in occupational safety and health, particularly in forestry. In this context, this study evaluates the accident data of Österreichische Bundesforste AG (ÖBf), Austria’s largest forestry company, for the years 2005–2021. Overall, there are 2481 registered accidents, 9 of which were fatal. For the tasks of forecasting the absence hours due to an accident and classifying fatal versus non-fatal cases, decision trees, random forests and fully connected neural networks were used, as sketched below.
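
A minimal sketch of the two modelling tasks mentioned above (illustrative only: the column names and the CSV export are hypothetical, since the ÖBf accident data are not public, and the paper additionally evaluates decision trees and fully connected neural networks):

# Illustrative sketch of the two tasks described above: regression of absence
# hours and classification fatal vs. non-fatal. The CSV file and column names
# are hypothetical; the OeBf accident data are not publicly available.
import pandas as pd
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.model_selection import train_test_split

accidents = pd.read_csv("accidents_2005_2021.csv")          # hypothetical export
features = pd.get_dummies(accidents[["activity", "body_part", "age", "month"]])

# Task 1: forecast absence hours caused by an accident
X_tr, X_te, y_tr, y_te = train_test_split(
    features, accidents["absence_hours"], random_state=0)
reg = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out accidents:", reg.score(X_te, y_te))

# Task 2: classify fatal vs. non-fatal cases (heavily imbalanced: 9 of 2481)
X_tr, X_te, y_tr, y_te = train_test_split(
    features, accidents["fatal"], stratify=accidents["fatal"], random_state=0)
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                             random_state=0).fit(X_tr, y_tr)
print("Held-out accuracy (fatal vs. non-fatal):", clf.score(X_te, y_te))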


AI for Life: Trends in Artificial Intelligence for Biotechnology

Holzinger, A., Keiblinger, K., Holub, P., Zatloukal, K. & Müller, H. 2023. AI for Life: Trends in Artificial Intelligence for Biotechnology. New Biotechnology, 74, (1), 16–24, doi:10.1016/j.nbt.2023.02.001. Due to popular successes (e.g., ChatGPT), Artificial Intelligence (AI) is on everyone’s lips today. When advances in biotechnology are combined with advances in AI, unprecedented new potential solutions become available. This can help with many global problems and contribute to important Sustainable Development Goals. Current examples include Food Security, Health and Well-being, Clean Water, Clean Energy, Responsible Consumption and Production, Climate Action, Life below Water, and Life on Land (protect, restore and promote sustainable use of terrestrial ecosystems, sustainably manage forests, combat desertification, and halt and reverse land degradation and biodiversity loss). AI is ubiquitous in the life sciences today. Topics range from machine learning and Big Data analytics, knowledge discovery and data mining, biomedical ontologies, knowledge-based reasoning, natural language processing, decision support and reasoning under uncertainty, and temporal and spatial representation and inference, to methodological aspects of explainable AI (XAI), with applications in biotechnology. [WoS]


Quod erat demonstrandum? - Towards a typology of the concept of explanation for the design of explainable AI

Cabitza, F., Campagner, A., Malgieri, G., Natali, C., Schneeberger, D., Stoeger, K. & Holzinger, A. 2023. Quod erat demonstrandum? - Towards a typology of the concept of explanation for the design of explainable AI. Expert Systems with Applications, 213, (3), 1–16, doi:10.1016/j.eswa.2022.118888. In this paper, we present a fundamental framework for defining different types of explanations of AI systems and the criteria for evaluating their quality. Starting from a structural view of how explanations can be constructed, i.e., in terms of an explanandum (what needs to be explained), multiple explanantia (explanations, clues, or parts of information that explain), and a relationship linking explanandum and explanantia, we propose an explanandum-based typology and point to other possible typologies based on how explanantia are presented and how they relate to explananda. We also highlight two broad and complementary perspectives for defining possible quality criteria for assessing explainability: epistemological and psychological (cognitive). These definitional attempts aim to support the three main functions that we believe should attract the interest and further research of XAI scholars: clear inventories, clear verification criteria, and clear validation methods. [WoS]
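
The structural view described above can be made concrete with a small, purely illustrative data model (not part of the paper; the field names and the example values are assumptions):

# Purely illustrative encoding of the structural view described above:
# an explanation links one explanandum to several explanantia via a relation.
# Field names and example values are assumptions, not taken from the paper.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Explanation:
    explanandum: str                                      # what needs to be explained
    explanantia: List[str] = field(default_factory=list)  # clues or parts of information that explain
    relation: str = "unspecified"                         # how the explanantia relate to the explanandum

example = Explanation(
    explanandum="the model predicted 'high risk' for patient 42",
    explanantia=["elevated biomarker value", "age above 70", "prior diagnosis"],
    relation="feature attribution",
)
print(example)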

 


Deep ROC Analysis and AUC as Balanced Average Accuracy, for Improved Classifier Selection, Audit and Explanation

Carrington, A. M., Manuel, D. G., Fieguth, P. W., Ramsay, T., Osmani, V., Wernly, B., Bennett, C., Hawken, S., McInnes, M., Magwood, O., Sheikh, Y. & Holzinger, A. 2023. Deep ROC Analysis and AUC as Balanced Average Accuracy, for Improved Classifier Selection, Audit and Explanation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 45, (1), 329–341, doi:10.1109/TPAMI.2022.3145392. Optimal performance is desired for decision-making in any field with binary classifiers and diagnostic tests; however, common performance measures lack depth of information. The area under the receiver operating characteristic curve (AUC) and the area under the precision recall curve are too general because they evaluate all decision thresholds, including unrealistic ones. Conversely, accuracy, sensitivity, specificity, positive predictive value and the F1 score are too specific: they are measured at a single threshold that is optimal for some instances but not others, which is not equitable. In between both approaches, we propose deep ROC analysis to measure performance in multiple groups of predicted risk (like calibration), or groups of true positive rate or false positive rate. In each group, we measure the group AUC (properly), normalized group AUC, and averages of: sensitivity, specificity, positive and negative predictive value, and likelihood ratio positive and negative. The measurements can be compared between groups, to whole measures, to point measures and between models. We also provide a new interpretation of AUC in whole or part, as balanced average accuracy, relevant to individuals instead of pairs. We evaluate models in three case studies using our method and Python toolkit and confirm its utility. [WoS]
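
The group-wise idea can be illustrated with a short sketch using plain numpy and scikit-learn (this is not the authors' Python toolkit, whose API and normalisation differ): the ROC curve is split into ranges of the false positive rate, and a partial AUC is computed and normalised within each range.

# Illustrative group-wise ROC analysis: partial AUC within ranges of the
# false positive rate, normalised by the width of each range. This is NOT
# the authors' deep ROC toolkit; it only sketches the idea.
import numpy as np
from sklearn.metrics import roc_curve

def group_auc(y_true, y_score, fpr_ranges=((0.0, 1/3), (1/3, 2/3), (2/3, 1.0))):
    fpr, tpr, _ = roc_curve(y_true, y_score)
    results = {}
    for lo, hi in fpr_ranges:
        grid = np.linspace(lo, hi, 200)        # fine grid inside the group
        tpr_grid = np.interp(grid, fpr, tpr)   # ROC curve interpolated on the grid
        partial = np.trapz(tpr_grid, grid)     # area under the curve within the group
        results[(lo, hi)] = partial / (hi - lo)
    return results

y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.5, 0.9])
for rng, value in group_auc(y_true, y_score).items():
    print(f"FPR range {rng}: normalised partial AUC = {value:.3f}")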

 


GNN-SubNet: disease subnetwork detection with explainable Graph Neural Networks

Pfeifer, B., Saranti, A. & Holzinger, A. 2022. GNN-SubNet: disease subnetwork detection with explainable Graph Neural Networks. Bioinformatics, 38, (S-2), ii120-ii126, doi:10.1093/bioinformatics/btac478.

The tremendous success of graph neural networks (GNNs) has already had a major impact on systems biology research. For example, GNNs are currently being used for drug target recognition in protein–drug interaction networks, as well as for cancer gene discovery and more. Important aspects whose practical relevance is often underestimated are comprehensibility, interpretability and explainability.
In this work, we present a novel graph-based deep learning framework for disease subnetwork detection via explainable GNNs. Each patient is represented by the topology of a protein–protein interaction (PPI) network, and the nodes are enriched with multi-omics features from gene expression and DNA methylation. In addition, we propose a modification of the GNNExplainer that provides model-wide explanations for improved disease subnetwork detection.
Availability and implementation: The proposed methods and tools are implemented in the GNN-SubNet Python package, which we have made available on our GitHub for the international research community. [WoS]
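
For orientation, the following is a minimal sketch of the kind of model GNN-SubNet builds on: a graph-level classifier over patient PPI graphs, written with PyTorch Geometric. It is not the GNN-SubNet package itself (which additionally modifies GNNExplainer to obtain model-wide explanations), and the layer sizes are arbitrary.

# Minimal PyTorch Geometric sketch of a graph-level classifier of the kind
# GNN-SubNet builds on: each patient is a PPI graph whose nodes carry
# multi-omics features. This is NOT the GNN-SubNet package itself.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, global_mean_pool

class PatientGNN(torch.nn.Module):
    def __init__(self, num_node_features: int, num_classes: int, hidden: int = 64):
        super().__init__()
        self.conv1 = GCNConv(num_node_features, hidden)
        self.conv2 = GCNConv(hidden, hidden)
        self.lin = torch.nn.Linear(hidden, num_classes)

    def forward(self, x, edge_index, batch):
        x = F.relu(self.conv1(x, edge_index))   # message passing over the PPI topology
        x = F.relu(self.conv2(x, edge_index))
        x = global_mean_pool(x, batch)          # one embedding per patient graph
        return self.lin(x)                      # class logits per patient

# Usage (assuming a torch_geometric Batch object `data` of patient graphs):
# model = PatientGNN(num_node_features=2, num_classes=2)  # e.g. expression + methylation
# logits = model(data.x, data.edge_index, data.batch)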

 


Exploring artificial intelligence for applications of drones in forest ecology and management

This paper highlights the significance of Artificial Intelligence (AI) for drone applications in forestry. Drones have revolutionized various forest operations, and their role in mapping, monitoring, and inventory procedures is explored comprehensively. Leveraging advanced imaging technologies and data processing techniques, drones enable real-time tracking of changes in forested landscapes, facilitating effective monitoring of threats such as fire outbreaks and pest infestations. They expedite forest inventory by swiftly surveying large areas, providing precise data on tree species identification, size estimation, and health assessment, thus supporting informed decision-making and sustainable forest management practices. Moreover, drones contribute to tree planting, pruning, and harvesting, while monitoring reforestation efforts in real time. Wildlife monitoring is also enhanced, aiding in the identification of conservation concerns and informing targeted conservation strategies. Drones offer a safer and more efficient alternative for search and rescue operations within dense forests, reducing response times and improving outcomes. Additionally, drones equipped with thermal cameras allow early detection of wildfires, enabling timely response, mitigation, and preservation efforts. The integration of AI and drones holds immense potential for enhancing forestry practices and contributing to sustainable land management. In the future, explainable AI (XAI) will improve trust and safety by providing transparency in decision-making, aiding in liability issues, and enabling precise operations. XAI facilitates better environmental monitoring and impact analysis, contributing to efficient forest management and preservation efforts. If a drone’s AI can explain its actions, it will be easier to understand why it chose a particular path or action, which could inform safety procedures and improvements.