
InferAct: Inferring Safe Actions for LLM-Based Agents Through Preemptive Evaluation and Human Feedback


This repository implements InferAct, a preemptive evaluation approach for LLM-based agents, as described in the paper InferAct: Inferring Safe Actions for LLM-Based Agents Through Preemptive Evaluation and Human Feedback.

Abstract: A crucial requirement for deploying LLM-based agents in real-life applications is the robustness against risky or even irreversible mistakes. However, the existing research lacks a focus on preemptive evaluation of reasoning trajectories performed by LLM agents, leading to a gap in ensuring safe and reliable operations. To explore better solutions, this paper introduces InferAct, a novel approach that leverages the Theory-of-Mind capability of LLMs to proactively detect potential errors before critical actions are executed (e.g., buy-now in automatic online trading or web shopping). InferAct is also capable of integrating human feedback to prevent irreversible risks as well as enhance the actor agent's decision-making process. Experiments on three widely-used tasks demonstrate the effectiveness of InferAct. The proposed solution presents a novel approach and concrete contributions towards developing LLM agents that can be safely deployed in different environments involving critical decision-making.

Contact person: Haishuo Fang

UKP Lab | TU Darmstadt

Don't hesitate to send us an e-mail or report an issue if something is broken (and it shouldn't be) or if you have further questions.


🚀 Setup

> python -m venv .inferact
> source ./.inferact/bin/activate
> pip install -r requirements.txt

WebShop

  • Install OpenJDK in the virtual environment, e.g. via the install-jdk package (a sanity check for this step follows the list):
import os

import jdk
from jdk.enums import OperatingSystem

# Install JDK 11 (install-jdk places it under ~/.jdk by default).
jdk.install('11', operating_system=OperatingSystem.LINUX)

# Point JAVA_HOME at the installed JDK, e.g. ~/.jdk/jdk-11.0.19+7.
os.environ['JAVA_HOME'] = 'path/to/jdk'
  • Configure the environment
> cd ./actor/webshop
> ./setup.sh -d all
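The OpenJDK step above exists because WebShop's search backend depends on Java. As an optional sanity check before running setup.sh, the sketch below (not part of the repository) verifies that JAVA_HOME points at a working JDK:

import os
import subprocess

# Optional sanity check: confirm JAVA_HOME is set and the JDK runs.
java_home = os.environ.get('JAVA_HOME')
assert java_home, 'JAVA_HOME is not set; see the install-jdk snippet above.'

# `java -version` reports to stderr by convention.
result = subprocess.run([os.path.join(java_home, 'bin', 'java'), '-version'],
                        capture_output=True, text=True, check=True)
print(result.stderr.strip())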

ALFWorld

  • Download the environment data

Please refer to ALFWorld for downloading the environment data, then point ALFWORLD_DATA at it:

export ALFWORLD_DATA="path/to/data"
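If you drive everything from Python rather than the shell, setting the variable in-process before ALFWorld is loaded is equivalent; a minimal sketch:

import os
from pathlib import Path

# Equivalent to `export ALFWORLD_DATA=...`; must run before ALFWorld loads.
os.environ['ALFWORLD_DATA'] = 'path/to/data'
assert Path(os.environ['ALFWORLD_DATA']).is_dir(), 'ALFWORLD_DATA path not found'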

🛠️ Usage

Run Actor

We adapt the code for ALFWorld and HotPotQA from the Reflexion repository.

The Actor agent performs tasks in the environments. --run_agents controls whether to run the Actor, and --task selects the environment, e.g. --task webshop. A conceptual sketch of the Actor loop follows the command below.

python main.py \
    --run_agents \
    --task webshop \
    --trial_num 0 \
    --feedback_type nl \
    --num_envs 300
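Conceptually, the Actor alternates between proposing an action from the trajectory so far and executing it in the environment. The sketch below is purely illustrative; env and actor_llm are hypothetical stand-ins, not the repository's classes:

# Illustrative Actor loop; `env` and `actor_llm` are hypothetical stand-ins.
def run_actor(env, actor_llm, max_steps=30):
    trajectory = [env.reset()]  # start from the initial observation
    for _ in range(max_steps):
        action = actor_llm.next_action(trajectory)  # reason over the history
        observation, done = env.step(action)
        trajectory += [action, observation]
        if done:
            break
    return trajectory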

Run Evaluator

The Evaluator assesses the Actor's trajectory before critical actions are executed.

python main.py \
    --do_eval \
    --task webshop \
    --eval_method inferact \
    --trial_num 0 \
    --model_name gpt4-turbo \
    --feedback_type nl \
    --threshold 0.9
  • --do_eval controls whether to evaluate the Actor's trajectory.
  • --eval_method selects the evaluation method.
  • --threshold specifies the F1-score threshold for the multi-step evaluation and inferact methods (see the sketch after this list).
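As a rough illustration of the role of --threshold: the Evaluator produces a score for how likely the trajectory is off-track, and only scores at or above the threshold trigger an alert for human feedback. The helper below is a hypothetical sketch, not the repository's implementation:

# Hypothetical gate: flag a trajectory for human review only when the
# evaluator's off-track score reaches the threshold (cf. --threshold).
def should_alert(off_track_score: float, threshold: float = 0.9) -> bool:
    return off_track_score >= threshold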

Run Feedback Generation

Once the Evaluator detects an off-track trajectory, binary or natural-language (NL) feedback is generated to prevent the critical action from being executed (see the sketch after the command).

python main.py \
    --do_feedback_gen \
    --task webshop \
    --eval_method inferact \
    --trial_num 0 \
    --model_name gpt4-turbo \
    --threshold 0.9 \
    --feedback_type nl
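To make the two --feedback_type options concrete: a flagged trajectory receives either a binary verdict (halt or proceed) or a natural-language critique the Actor can condition on in its next trial. The data shapes below are assumptions for illustration, not the repository's types:

from dataclasses import dataclass
from typing import Optional

# Assumed shapes for the two feedback types (`binary` vs. `nl`).
@dataclass
class Feedback:
    off_track: bool                 # binary verdict: block the critical action?
    critique: Optional[str] = None  # NL feedback the Actor can retry with

binary_fb = Feedback(off_track=True)
nl_fb = Feedback(off_track=True,
                 critique='The item in the cart is size S, but the task asks for size M.')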

Pipeline

To run all the components as a pipeline, you can use:

python main.py \
    --run_agents \
    --do_eval \
    --do_feedback_gen \
    --task webshop \
    --model_name gpt35-turbo \
    --num_envs 300 \
    --eval_method standard \
    --trial_num 0 \
    --threshold 0.0 \
    --feedback_type nl
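To script repeated runs (for example, sweeping trial numbers), a thin wrapper over the same documented flags is enough; this is a convenience sketch, not part of the repository:

import subprocess

# Run the full pipeline for several trials, reusing only the flags
# documented above.
for trial in range(3):
    subprocess.run([
        'python', 'main.py',
        '--run_agents', '--do_eval', '--do_feedback_gen',
        '--task', 'webshop',
        '--model_name', 'gpt35-turbo',
        '--num_envs', '300',
        '--eval_method', 'standard',
        '--trial_num', str(trial),
        '--threshold', '0.0',
        '--feedback_type', 'nl',
    ], check=True)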

Cite

Please use the following citation:

@article{fang2024inferact,
  title={InferAct: Inferring Safe Actions for LLM-Based Agents Through Preemptive Evaluation and Human Feedback},
  author={Fang, Haishuo and Zhu, Xiaodan and Gurevych, Iryna},
  journal={arXiv preprint arXiv:2407.11843},
  year={2024}
}

Disclaimer

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
