Runtime Verification of Computer Vision Deep Neural Networks against Symbolic Constraints

For Degree:
Contact Person:
Status: Available

Abstract

Recent work has introduced simple techniques to evaluate the compliance of black-box deep neural networks (DNNs) with symbolic rules. This thesis should investigate and quantitatively compare how compliant different computer vision DNNs are with symbolic rules on real-world datasets, e.g., from the automated driving domain. The goal is to assess to what extent existing verification testing techniques for DNNs are suitable for uncovering remaining issues in a model's learned knowledge. Finally, the approach shall be evaluated as a runtime verification setup for DNNs that can be installed post hoc and raises an alarm in case of implausible outputs.

Problem Statement

Deep neural networks are broadly used in computer vision tasks, but are still too unreliable for use in safety-critical applications like perception for fully automated driving. The reason is that they may automatically learn unintuitive and wrong correlations from their training data, which can lead to failures in rare situations (e.g., under high occlusion). This makes it important to ensure that DNNs behave consistently with given intuition in the form of symbolic rules on the desired outputs, for example "If there is a head, there should usually be a person" (isHead(region) => isPerson(region)). Techniques from concept-based explainable artificial intelligence (C-XAI; Lee et al. 2025) allow one to find, post hoc, representations of symbols (concepts) of interest, e.g., "head", within the internal representations of a trained DNN. As proposed by Schwalbe et al. (2022), this can be used to attach classification or segmentation outputs for those concepts post hoc, even if the DNN has not been trained on them directly. Subsequently, one can test on a test set whether the DNN outputs fulfill the (potentially fuzzy) logical constraints.
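For illustration, such a rule can be checked directly on the DNN's (post-hoc attached) concept confidence scores. The sketch below is a minimal example, assuming per-region scores in [0, 1] and using the Reichenbach fuzzy implication I(a, b) = 1 - a + a*b, which is just one common choice; all names and scores are made up:

    import torch

    def fuzzy_implies(p_head: torch.Tensor, p_person: torch.Tensor) -> torch.Tensor:
        """Reichenbach fuzzy implication I(a, b) = 1 - a + a*b on scores in [0, 1]."""
        return 1.0 - p_head + p_head * p_person

    # Hypothetical per-region confidence scores for "head" and "person":
    p_head = torch.tensor([0.9, 0.8, 0.1])
    p_person = torch.tensor([0.95, 0.2, 0.05])

    compliance = fuzzy_implies(p_head, p_person)
    print(compliance)        # tensor([0.9550, 0.3600, 0.9050])
    print(compliance < 0.5)  # the second region violates the rule

Here, a high "head" score paired with a low "person" score yields low compliance, which is exactly the kind of implausible output a runtime monitor should flag.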

However, this has so far only been showcased in a very small setup. It therefore remains open how effective the approach is in uncovering errors for different up-to-date vision DNNs, datasets, and rule sets. In other words: how many DNN failures arise from inconsistency with known rules?

Goals

  • Define a knowledge base of diverse rules applicable to vision tasks, which serves as the setup for testing rule compliance.
  • Implement the verification testing setup for a selection of current vision DNN architectures, rules, and datasets.
  • Conduct and evaluate a comparative study:
    • assess which factors of DNN architecture, rule type, and dataset influence rule compliance
    • correlate rule compliance with quality as a runtime monitor, i.e., the ratio of false alarms to uncovered true errors (see the sketch after this list)
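The monitor-quality metrics from the last item could be computed as sketched below; the function name and the per-sample alarm/error encoding are illustrative assumptions, with rule violations treated as alarms and compared against ground-truth DNN errors:

    import numpy as np

    def monitor_quality(alarms: np.ndarray, true_errors: np.ndarray) -> dict:
        """alarms: True where a rule violation was flagged;
        true_errors: True where the DNN output is actually wrong."""
        false_alarms = int(np.sum(alarms & ~true_errors))  # alarm although output was fine
        covered = int(np.sum(alarms & true_errors))        # true errors uncovered by the monitor
        missed = int(np.sum(~alarms & true_errors))        # true errors the monitor missed
        return {
            "false_alarm_rate": false_alarms / max(int(np.sum(~true_errors)), 1),
            "error_coverage": covered / max(int(np.sum(true_errors)), 1),
            "missed_errors": missed,
        }

    # Toy example: one false alarm, one covered error, one missed error
    print(monitor_quality(np.array([True, True, False, False]),
                          np.array([False, True, True, False])))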

Approach

  • Dimensions for the comparison (to be addressed in subsequent steps):
    • How do different DNN architectures compare (Convolutional DNNs / Vision Transformers; small / big; ...)?
    • How do different datasets compare (general-purpose like MS COCO vs. automotive like A2D2, ...)?
    • How do different tasks compare (object detection, semantic segmentation, ...)?
  • Setup for the extraction of symbols and relations from the DNN:
    • Rule base: For defining an exemplary rule base, it is recommended to start with a semantically rich domain like automated driving, for which plenty of intuitive rules and full ontologies are available (Giunchiglia et al. 2022).
    • The base method to extract symbols should be the C-XAI method described in Schwalbe et al. (2022), for which a rich code base is available from the team.
    • As symbols for the rule bases, simple object classes (e.g., street light, car, person), object parts (e.g., head, arm, steering wheel), and object attributes (e.g., red) can be used as a starting point. To probe their representations, existing datasets like ImageNet, the German Traffic Sign Datasets, or the widely used BRODEN dataset can serve as a basis, potentially extended later by generated data.
    • As relations, simple hierarchical relations (isA) and 2D spatial relations (isPartOf) may serve as a starting point; these can be estimated directly from concept segmentations (see the sketches after this list). They may later be extended to 3D spatial relations using (separately) predicted or ground-truth depth information.
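To make the symbol-extraction step concrete: in linear-probing C-XAI methods such as the one used in Schwalbe et al. (2022), a concept output can be attached as a 1x1 convolution, i.e., a per-pixel logistic regression, on frozen intermediate activations. The following is a minimal sketch with made-up tensor shapes and class names, not the team's actual code base:

    import torch
    import torch.nn as nn

    class ConceptProbe(nn.Module):
        """Per-pixel logistic regression on frozen DNN activations that predicts
        a segmentation mask for a single concept (e.g., "head")."""
        def __init__(self, n_channels: int):
            super().__init__()
            # A 1x1 convolution applies the same logistic regression at every pixel.
            self.linear = nn.Conv2d(n_channels, 1, kernel_size=1)

        def forward(self, activations: torch.Tensor) -> torch.Tensor:
            return torch.sigmoid(self.linear(activations))  # concept scores in [0, 1]

    # Dummy stand-ins for (frozen backbone activations, down-scaled concept masks):
    activations = torch.randn(8, 512, 14, 14)
    masks = torch.randint(0, 2, (8, 1, 14, 14)).float()

    probe = ConceptProbe(n_channels=512)
    optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
    loss = nn.BCELoss()(probe(activations), masks)
    loss.backward()
    optimizer.step()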
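Likewise, a 2D isPartOf relation can be estimated directly from two such concept masks by measuring how much of the part lies inside the whole; function name, mask shapes, and the choice of the Gödel t-norm (element-wise minimum) for the fuzzy intersection are illustrative:

    import torch

    def is_part_of(part_mask: torch.Tensor, whole_mask: torch.Tensor,
                   eps: float = 1e-6) -> torch.Tensor:
        """Fuzzy degree to which part_mask lies inside whole_mask;
        both are soft segmentation masks with values in [0, 1]."""
        overlap = torch.minimum(part_mask, whole_mask).sum()  # fuzzy intersection
        return overlap / (part_mask.sum() + eps)              # fraction of the part covered

    # Example: a "head" mask fully contained in a "person" mask yields ~1.0.
    head = torch.zeros(14, 14); head[2:4, 5:8] = 1.0
    person = torch.zeros(14, 14); person[1:10, 4:9] = 1.0
    print(is_part_of(head, person))  # tensor(1.0000) up to eps

The resulting containment degree can then feed into the fuzzy rule evaluation sketched in the problem statement.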

 

Requirements

  • Solid programming skills in Python and familiarity with the PyTorch deep learning framework
  • Familiarity with machine learning using DNNs and logistic regression models
  • Familiarity with formalization of knowledge as logical rules
  • Basic understanding of continuous fuzzy (=multi-valued) logics

 

Literature

  • Giunchiglia, Eleonora, Mihaela Stoian, Salman Khan, Fabio Cuzzolin, and Thomas Lukasiewicz. 2022. “ROAD-R: The Autonomous Driving Dataset with Logical Requirements.” In IJCLR 2022 Workshops. Vienna, Austria. https://arxiv.org/abs/2210.01597.
  • Lee, Jae Hee, Georgii Mikriukov, Gesina Schwalbe, Stefan Wermter, and Diedrich Wolter. 2025. “Concept-Based Explanations in Computer Vision: Where Are We and Where Could We Go?” In Computer Vision – ECCV 2024 Workshops, edited by Alessio Del Bue, Cristian Canton, Jordi Pont-Tuset, and Tatiana Tommasi, 266–87. Cham: Springer Nature Switzerland. https://doi.org/10.1007/978-3-031-92648-8_17.
  • Schwalbe, Gesina, Christian Wirth, and Ute Schmid. 2022. “Enabling Verification of Deep Neural Networks in Perception Tasks Using Fuzzy Logic and Concept Embeddings.” arXiv. https://doi.org/10.48550/arXiv.2201.00572. (Preprint)