Posts tagged: AI/ML

From embeddings to exploration: Engineering interactive latent space visualizations for AI model sensemaking

Machine learning systems are often inspected through 2D projections of high-dimensional representations using techniques such as t-SNE or UMAP. While these visualizations provide useful overviews of clustering and similarity, they are inherently static: they display only the existing data points and do not allow users to interactively explore a model's decision space. We present an interactive exploration system that uses a Variational Autoencoder (VAE) as a generative proxy over a model's training distribution, turning the latent space into a navigable workspace for model sensemaking. Unlike static embeddings, the proxy provides an explicit decoding path from latent coordinates to inputs, enabling interaction patterns such as continuous sampling, interpolation between anchors, and region probing. We operationalize these capabilities through a set of interactive probes that augment a familiar scatter-plot overview with generative overlays for comparing transitions between classes and examining sparsely populated regions. A within-subject formative study (N=16) comparing an interactive VAE-based method to a static t-SNE baseline shows that generative interaction substantially improves counterfactual reasoning and influences how users assess model behavior in sparse or uncertain regions, while static embeddings sometimes provide clearer boundary perception. From these findings, we derive concrete design guidelines and architectural considerations for engineering interactive AI model exploration systems using generative latent representations.
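As a rough illustration of the "interpolation between anchors" interaction described above, the following minimal sketch walks a straight line between two latent points and decodes each step. It assumes plain linear interpolation and uses a toy stand-in for the decoder; the names `interpolate_latents`, `probe_transition`, and `toy_decode` are hypothetical and are not taken from the paper's system.

```python
from typing import Callable, List, Sequence

Vector = List[float]


def interpolate_latents(z_start: Sequence[float],
                        z_end: Sequence[float],
                        steps: int) -> List[Vector]:
    """Return `steps` evenly spaced latent points from z_start to z_end (inclusive)."""
    if steps < 2:
        raise ValueError("steps must be at least 2 to include both anchors")
    if len(z_start) != len(z_end):
        raise ValueError("anchors must have the same dimensionality")
    path = []
    for i in range(steps):
        t = i / (steps - 1)  # t runs from 0.0 (start anchor) to 1.0 (end anchor)
        path.append([(1 - t) * a + t * b for a, b in zip(z_start, z_end)])
    return path


def probe_transition(decode: Callable[[Vector], Vector],
                     z_start: Sequence[float],
                     z_end: Sequence[float],
                     steps: int = 5) -> List[Vector]:
    """Decode every point on the interpolation path between two anchors."""
    return [decode(z) for z in interpolate_latents(z_start, z_end, steps)]


def toy_decode(z: Vector) -> Vector:
    """Hypothetical stand-in for a trained VAE decoder (assumption, not the real model)."""
    return [2.0 * v for v in z]


if __name__ == "__main__":
    for sample in probe_transition(toy_decode, [0.0, 0.0], [1.0, 2.0], steps=3):
        print(sample)
```

In a real system the decoded samples would be rendered as overlays along the path between two selected points in the scatter plot; whether linear or another interpolation scheme is most suitable depends on the latent prior.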

BeatriXR: Comprehensive and adaptive feedforward support for guidance in virtual reality

Virtual Reality (VR) environments present significant obstacles for users due to the sheer diversity of input devices, numerous interaction modalities, and varying user interface designs, which contribute to a steep learning curve. To address these complexities, feedforward is essential as it informs the user about the anticipated result of their actions, easing the learning process through contextualized previews of required interactions. However, designing and implementing effective direct feedforward, particularly without dedicated tools, can be tedious. We introduce BeatriXR, a comprehensive, reusable, and adaptive toolkit that provides extensive support for creating all possible configurations of direct feedforward to enhance user understanding and performance in VR. BeatriXR is an integrated system combining a modular VR toolkit with intelligent support derived from Large Language Models (LLMs). It supports the creation, visualization, and customization of direct feedforward using virtual avatars and two visualizations: an in-world representation and an on-screen comparison of interaction alternatives. This toolkit is mapped onto an established feedforward design space, covering phases such as Triggering, Previewing, and Exiting. To overcome the challenge designers face in determining optimal configurations, BeatriXR integrates an LLM-based adaptive decision support layer that proposes context-sensitive configuration alternatives. This guidance can be used both at design time to help domain experts select optimal configurations, and at runtime to adapt to the user's context and environment, such as recommending a change from a default partial avatar preview to a full ghosted avatar when a trainee shows uncertainty. 
We conducted an exploratory review with XR domain experts, who rated both the interface that end users use to modify the feedforward settings and the output of four LLMs customised for BeatriXR to interact with procedure creators. The results indicated that none of the evaluated models consistently outperformed the others, suggesting that the tested LLMs can be used interchangeably. Participants' feedback also provided valuable insights for improving both the user interface, which was generally perceived positively, and the quality of the LLM-generated responses.

Paper accepted at EICS 2026: Interactive Latent Space Visualization for AI Model Sensemaking

From Embeddings to Exploration: Engineering Interactive Latent Space Visualizations for AI Model Sensemaking

Our paper "From Embeddings to Exploration: Engineering Interactive Latent Space Visualizations for AI Model Sensemaking" (PDF) has been accepted at EICS 2026 and will appear in the EICS issue of Proceedings of the ACM on Human-Computer Interaction. This is work by Sebe Vanbrabant together with Jarne Thys, Gilles Eerlings, Gustavo Rovelo Ruiz, Davy Vanacken, and myself.

Paper accepted at EICS 2026: BeatriXR for Direct Feedforward in Virtual Reality

BeatriXR: Comprehensive and Adaptive Feedforward Support for Guidance in Virtual Reality

Our paper "BeatriXR: Comprehensive and Adaptive Feedforward Support for Guidance in Virtual Reality" (PDF) has been accepted at EICS 2026 and will appear in the EICS issue of Proceedings of the ACM on Human-Computer Interaction. This is work by Valentino Artizzu together with Gustavo Rovelo Ruiz, Lucio Davide Spano, and myself.

Teaching as training: Iterative and incremental AI skill development

Higher education must equip students with skills for complex, multidisciplinary challenges. Conventional approaches that rely on fixed deadlines and written exams often limit opportunities for growth and continuous skill development. This contribution presents an iterative and incremental teaching method, applied for five consecutive years in a master-level Computer Science course on Human–AI Interaction. Our approach emphasizes formative feedback, collaborative learning, and individual progression. Students work on group assignments and an individual project, with no strict deadlines and unlimited opportunities during the semester to resubmit until a "pass" is achieved. Compact feedback sessions after each iteration serve both as assessment moments and teaching opportunities, clarifying expectations and guiding improvement. The method is grounded in mastery learning, formative assessment, and the High Impact Learning that Lasts model, fostering motivation and self-determination. Survey data and performance analysis from a study conducted two years ago show positive effects on learning outcomes and student motivation: students valued the clarity of assessment, the removal of "one chance" exams, and the freedom to iteratively improve. Over five years of teaching, this approach has proven effective in balancing diverse prior knowledge, building applicable skills, and sustaining motivation during the semester. We conclude that incremental and iterative teaching constitutes a viable model for skill-oriented higher education, adaptable across contexts where collaboration, feedback, and progression are central.

Learning to delegate and act with DELEGACT: Multimodal language models for task-level human-cobot planning in industrial assembly

Industrial assembly is shifting toward human-robot collaboration (HRC) to leverage the complementary strengths of both agents. However, traditional task allocation, referred to as the Robotic Assembly Line Balancing Problem (RALBP), remains labor-intensive and often lacks transparency. We introduce DELEGACT, a framework designed to produce workable, intelligible human-cobot task allocations. The framework uses a Vision-Language Model (VLM) to extract atomic operations from expert demonstration videos, then employs a Large Language Model (LLM) to delegate these tasks based on robot specifications, operator competencies, and material definitions. We provide a proof-of-concept prototype and preliminary testing on illustrative cases. Results demonstrate the system's ability to reason about complex constraints such as precision, weight, and ergonomics. This paper illustrates how off-the-shelf foundation models can automate HRC decision-making via a human-in-the-loop paradigm while preserving operator agency and understanding.

DIVERSE: Disagreement-inducing vector evolution for Rashomon set exploration

We propose DIVERSE, a framework for systematically exploring the Rashomon set of deep neural networks, the collection of models that match a reference model's accuracy while differing in their predictive behavior. DIVERSE augments a pretrained model with Feature-wise Linear Modulation (FiLM) layers and uses Covariance Matrix Adaptation Evolution Strategy (CMA-ES) to search a latent modulation space, generating diverse model variants without retraining or gradient access. Across MNIST, PneumoniaMNIST, and CIFAR-10, DIVERSE uncovers multiple high-performing yet functionally distinct models. Our experiments show that DIVERSE offers a competitive and efficient exploration of the Rashomon set, making it feasible to construct diverse sets that maintain robustness and performance while supporting well-balanced model multiplicity. While retraining remains the baseline for generating Rashomon sets, DIVERSE achieves comparable diversity at reduced computational cost.
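To make the FiLM idea above more concrete, here is a minimal sketch of feature-wise linear modulation driven by a flat modulation vector, plus a simple disagreement measure between two sets of predictions. The parameterization (scale = 1 + first half of the vector, shift = second half, so that an all-zero vector reproduces the unmodified features) and the function names are assumptions made for illustration; they are not claimed to be the DIVERSE implementation, and the CMA-ES search itself is not shown.

```python
from typing import List, Sequence, Tuple


def split_modulation(z: Sequence[float], n_features: int) -> Tuple[List[float], List[float]]:
    """Split a flat modulation vector into per-feature scale (gamma) and shift (beta).

    Assumed parameterization: gamma = 1 + z[:n], beta = z[n:], so z = 0 is the identity.
    """
    if len(z) != 2 * n_features:
        raise ValueError("modulation vector must have length 2 * n_features")
    gamma = [1.0 + v for v in z[:n_features]]
    beta = list(z[n_features:])
    return gamma, beta


def film(features: Sequence[float],
         gamma: Sequence[float],
         beta: Sequence[float]) -> List[float]:
    """Apply feature-wise linear modulation: y_i = gamma_i * x_i + beta_i."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("features, gamma, and beta must have equal length")
    return [g * x + b for x, g, b in zip(features, gamma, beta)]


def disagreement_rate(preds_a: Sequence[int], preds_b: Sequence[int]) -> float:
    """Fraction of inputs on which two models predict different labels."""
    if len(preds_a) != len(preds_b) or not preds_a:
        raise ValueError("prediction lists must be non-empty and of equal length")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


if __name__ == "__main__":
    gamma, beta = split_modulation([1.0, -0.5, 0.5, -1.0], n_features=2)
    print(film([1.0, 2.0], gamma, beta))
    print(disagreement_rate([0, 1, 2, 3], [0, 1, 3, 3]))
```

A gradient-free optimizer could then score each candidate vector by combining accuracy on held-out data with a disagreement measure against the reference model's predictions; the exact objective used in the paper is not specified here.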

Challenges and opportunities for delay-invariant telerobotic interactions (short paper)

Effective operation in direct-control telerobotics relies heavily on real-time communication between the operator and the robot, as the operator retains full control over the robot's actions. However, in scenarios involving long distances, communication delays disrupt this feedback loop, creating significant challenges for precise control. To investigate these challenges, we conducted a user study where participants operated a TurtleBot3 Waffle Pi under varying delay conditions. Post-experiment brainstorming and analysis revealed recurring challenges, including over-correction, unpredictable robot behavior, and reduced situational awareness. Potential solutions identified include improving robot behavior predictability, integrating feedforward mechanisms, and enhancing visual feedback. These findings underscore the importance of designing intelligent interfaces to mitigate the impact of delays on telerobotic performance.

AI-Spectra: A visual dashboard for model multiplicity to enhance informed and transparent decision-making

We present AI-Spectra, an approach that leverages model multiplicity for interactive systems. Model multiplicity means using slightly different AI models that yield equally valid outcomes or predictions for the same task, thus relying on many simultaneous "expert advisors" that can have different opinions. Multiple AI models that generate potentially divergent results for the same task are challenging for users to deal with. Multiplicity helps users recognize that AI models are not always correct and might differ, but it can also result in information overload when users are confronted with multiple results instead of one. AI-Spectra leverages model multiplicity through a visual dashboard designed to convey which AI models generate which results while minimizing the cognitive effort needed to detect consensus among models and to see which types of models hold different opinions. We use a custom adaptation of Chernoff faces for AI-Spectra: Chernoff Bots. This visualization technique lets users quickly interpret complex, multivariate model configurations and compare predictions across multiple models. Our design builds on established Human-AI Interaction guidelines and well-known practices in information visualization. We validated our approach through a series of experiments in which we trained a wide variety of models on the MNIST dataset to perform digit recognition. Our work contributes to the growing discourse on making AI systems more transparent, trustworthy, and effective through the strategic use of multiple models.
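The notion of detecting consensus among several models can be sketched in a few lines. The example below takes one prediction per model for a single input and reports the majority label, the share of models that agree with it, and the models that dissent. The function name `consensus` and the model names are hypothetical; this is only a simple majority-vote illustration, not the logic behind the AI-Spectra dashboard.

```python
from collections import Counter
from typing import Dict, List, Tuple


def consensus(predictions: Dict[str, int]) -> Tuple[int, float, List[str]]:
    """Summarize agreement among models for a single input.

    Returns (majority_label, agreement_share, dissenting_model_names).
    On a tie, the label that appears first in `predictions` wins.
    """
    if not predictions:
        raise ValueError("at least one model prediction is required")
    counts = Counter(predictions.values())
    label, votes = counts.most_common(1)[0]
    dissenters = sorted(name for name, pred in predictions.items() if pred != label)
    return label, votes / len(predictions), dissenters


if __name__ == "__main__":
    # Hypothetical models classifying the same handwritten digit.
    preds = {"cnn_a": 7, "cnn_b": 7, "mlp": 1, "svm": 7}
    print(consensus(preds))
```

A dashboard could use the agreement share to decide how prominently to display dissenting models for a given input.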

Opportunities and challenges of model multiplicity in interactive software systems

The proliferation of artificial intelligence (AI) in interactive systems has led to significant challenges in model integration, but also in end-user-related aspects such as over- and undertrust. This paper explores how multiple AI models with the same performance and behavior but different internal workings, a phenomenon called model multiplicity, affect system integration and user interaction. We discuss the implications of model multiplicity for transparency, trust, and operational effectiveness in interactive software systems.
