Featured Blog: Exploring user interaction challenges with large language models

We’re using AI assistants and large language models everywhere in our daily lives. But what constitutes this interaction between person and machine? SRI Graduate Affiliate Davide Gentile writes about the virtues and pitfalls of user experience, highlighting some ways in which human-computer interaction could be made clearer, more trustworthy, and overall a better experience—for everyone.

Davide Gentile

3/22/2023 · 4 min read

For the full blog, please refer to the following link: uoft.me/9tD

The integration of large language models (LLMs) into our daily lives has magnified our understanding of what artificial intelligence (AI) systems can do, and what we can achieve with them.

For instance, LLMs have revolutionized customer support by providing instant and accurate responses to user inquiries, reducing wait times, and improving overall satisfaction. LLMs have also transformed content creation: they can generate high-quality text that is often indistinguishable from content written by people.

However, as we adopt LLMs more readily, we should also recognize some potential challenges. First, let’s examine a significant change in the interaction paradigm between users and AI products; then, let’s explore two key challenges in user interaction with LLMs and discuss potential mitigation strategies drawn from the human-automation interaction literature.

Switching the interaction paradigm between users and AI products

No longer hidden behind the scenes, AI systems—including those based on LLMs—are now designed to be more apparent, engaging us directly through human-like communication. This paradigm shift from invisible to apparent automation can undoubtedly have positive implications for productivity, entertainment, and creativity—think of a home cook asking an LLM to produce or refine a recipe, setting timers, and even playing music while their hands are dirty.

However, this also poses novel interaction challenges, because a person without technical expertise has a limited understanding of the capabilities and limitations of LLMs, and of current practices in their interface design. Consequently, users are left grappling with automated systems that demand their trust in order to be used effectively, yet may not be designed to support them in that endeavour. One concerning example that compromises trust is the rise of deepfake technology, in which AI is used to create highly convincing manipulated videos and audio, blurring the line between reality and fiction. In addition, a lack of transparency regarding data usage and conflicting narratives in the media about whether AI is “good” or “bad” further complicate the issue.

While the full extent of these challenges remains uncertain, two considerations warrant attention from a human-automation interaction perspective.

“BY ENHANCING TRANSPARENCY AS A DESIGN PRINCIPLE IN LLM INTERFACES, WE CAN FOSTER A MORE INFORMED APPROACH TO RELIANCE ON THESE TOOLS.”

Challenge 1: Overreliance and advice-seeking

As humans, we tend to attribute agency and intentions to things that exhibit agent-like characteristics. Consequently, when we engage with LLMs, we tend to trust their responses and treat them as authoritative sources when linguistic cues in their output (e.g., tone) suggest it. This can lead to overreliance, in the form of blind acceptance of information or advice-seeking without critical evaluation.

Even though some LLM interfaces include a warning advising users of their potential flaws and limitations, surveys show that many people already seek advice from AI on finance, relationships, healthcare, and more; professionals may also be misusing ChatGPT in the context of their work, unknowingly disseminating false or unhelpful information.

One mitigation strategy is to communicate explicitly to users the purpose of this kind of automation and its operational limits. Simply providing disclaimers like “this tool may occasionally be incorrect” is deliberately vague and evidently not enough to prevent misuse. The specific ways in which these tools can and cannot be useful to non-technical users should be communicated clearly and continually. In addition, the interface should communicate to users what it is doing, and why, at any given moment. This feedback should be provided within the user interface itself, not in the tool’s technical manual, which almost no one will read. Incorporating clear indicators of what the tool was designed for, what it is doing, and why into LLM interfaces can help users understand how to collaborate with these tools more effectively.

For example, in the context of automated driving systems, interfaces can display real-time information about the system’s decision-making process, such as detecting and avoiding obstacles, building user confidence in the system’s capabilities while also educating users about its limitations. By enhancing transparency as a design principle in LLM interfaces, we can foster a more informed approach to reliance on these tools.
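To make this concrete, here is a minimal sketch, in TypeScript, of how an LLM interface could keep this kind of information visible alongside each response. The field names and messages below are hypothetical illustrations of the design principle, not drawn from any existing product or API.

```typescript
// Hypothetical sketch: metadata an LLM interface could attach to each
// response so that purpose, current activity, and known limits stay visible.
interface ResponseTransparency {
  intendedUse: string;     // what the tool was designed for
  currentAction: string;   // what it is doing right now, and why
  knownLimits: string[];   // specific ways the output can fail
}

// Render a short, plain-language status block next to the model's answer,
// instead of burying this information in a manual or a vague disclaimer.
function renderTransparencyPanel(meta: ResponseTransparency): string {
  return [
    `Designed for: ${meta.intendedUse}`,
    `Right now: ${meta.currentAction}`,
    `Keep in mind: ${meta.knownLimits.join("; ")}`,
  ].join("\n");
}

// Example usage with purely illustrative content.
console.log(
  renderTransparencyPanel({
    intendedUse: "drafting and summarizing text, not verifying facts",
    currentAction: "summarizing the article you pasted, using only its text",
    knownLimits: [
      "may state incorrect facts confidently",
      "has no access to sources beyond the text you provided",
    ],
  })
);
```

The point of the sketch is simply that purpose, current action, and limits are surfaced at the moment of use, rather than in a one-time disclaimer.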

Challenge 2: New, hard-to-automate tasks may overload the user

While LLMs may be designed to seem competent and knowledgeable (even as the ChatGPT interface notes its “limited knowledge of the world”), it is essential to recognize that the introduction of automation does not remove humans from a given task. Rather, it redefines the nature of the task, which may now include novel subtasks that either involve monitoring the automation or are very hard to automate.

For example, producing a written artifact has traditionally required individuals to express their thoughts, ideas, and experiences in their own words. This process inherently reflected the writer’s unique perspective, voice, and individuality. With the advent of LLMs, the task of organizing thoughts and developing them into coherent paragraphs can be partially automated. But users are still left to figure out how to frame prompts in a way that reflects their individuality, inject personal anecdotes, and revise and edit the generated content to infuse it with their own voice. This creates a workload of novel, non-automatable tasks that may in fact overburden users.

To mitigate this risk, designers must carefully consider the respective roles of humans and automation in LLM-based tools and create interfaces that facilitate human-AI collaboration. For example, LLM interfaces could be designed to support the integration of human-generated content and LLM-generated suggestions, allowing users to selectively incorporate and modify suggestions without diluting their writing voice after a few iterations. However, it is also important to allow flexibility in the division of labour between humans and LLMs, as different users may have varied motivations for using these systems.
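As an illustration, here is a minimal sketch, again in TypeScript, of a document model in which LLM output enters the text only as pending suggestions that the writer explicitly accepts, edits, or rejects. The types and function names are hypothetical, intended only to show one possible division of labour rather than a prescribed design.

```typescript
// Hypothetical sketch: a document model in which LLM output enters the text
// only as pending suggestions that the writer accepts, edits, or rejects,
// so human-written passages are never silently overwritten.
type Author = "human" | "llm";

interface Segment {
  text: string;
  author: Author;
  status: "accepted" | "pending" | "rejected";
}

// Accept a pending LLM suggestion, optionally rewriting it in the user's own
// words before it becomes part of the document.
function acceptSuggestion(seg: Segment, editedText?: string): Segment {
  if (seg.author !== "llm" || seg.status !== "pending") return seg;
  return {
    text: editedText ?? seg.text,
    author: editedText ? "human" : "llm",
    status: "accepted",
  };
}

// The final document is assembled only from accepted segments, so the user
// decides how much LLM-generated text ends up in their own voice.
function assembleDocument(segments: Segment[]): string {
  return segments
    .filter((s) => s.status === "accepted")
    .map((s) => s.text)
    .join(" ");
}
```

Keeping authorship and acceptance status explicit is one way to leave the final editorial decision, and the writing voice, with the human.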

How to responsibly harness the full potential of LLMs

While the integration of LLMs has opened exciting possibilities, not only in the field of human-automation interaction but also for non-technical users in their daily lives, it is important to acknowledge and address the challenges that come with their adoption. Specifically, we may over-rely on automated outputs, or be left confused about how we are supposed to leverage LLMs to enhance our capabilities.

Current practices in LLM interface design may not be ideal, given our imperfect understanding of the limitations of LLMs. LLM interfaces should instead:

  • Communicate their specific purpose and operational limits, including what actions are carried out and why. For example, explaining in plain language what LLMs can be useful for, and what they cannot be useful for, across use cases.

  • Be designed with clear roles and responsibilities divided between user and automation, but kept flexible due to user differences in knowledge, skills, and motivation.

By understanding and actively addressing these challenges, we can navigate a path toward harnessing the full potential of LLMs in a responsible and socially beneficial manner.

This work was done in collaboration between Davide Gentile and the Schwartz Reisman Institute for Technology and Society at the University of Toronto. Please access the full blog using the link above (uoft.me/9tD).