<script type="text/javascript">
<!--
document.write('<div id="oa_widget"></div>');
document.write('<script type="text/javascript" src="https://www.openaire.eu/index.php?option=com_openaire&view=widget&format=raw&projectId=undefined&type=result"></script>');
-->
</script>
When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs This is the source code accompanying the paper "When LLMs Go Online: The Emerging Threat of Web-Enabled LLMs". In this work, we investigate the risks associated with misuse of LLM agents in cyberattacks involving personal data. Specifically, we aim to understand: 1) how potent LLM agents can be when directed to conduct cyberattacks, 2) how cyberattacks are enhanced by web-based tools, and 3) how affordable and easy it becomes to launch cyberattacks using LLM agents. We examine three attack scenarios: Collection of Personally Identifiable Information (PII) Generation of impersonation posts Creation of spear-phishing emails To prevent the potential misuse of our findings, we have not disclosed the exact prompts used in these attacks. Instead, we have provided the source code for LLM agents with dummy prompts. The complete source code is available upon request for legitimate research purposes only. Implementation We implement LLM agents using the function calling feature provided by each LLM's API. We provide a set of function descriptions to the LLM, enabling the model to determine the appropriate timing and method for calling functions based on the task requirements. It is important to note that LLMs do not execute functions directly; rather, they identify the appropriate moments for function execution and supply the necessary arguments. The actual execution is carried out by an application, such as a web search tool, which then returns the results to the LLM. The LLM uses these results to generate a response, thus automating the process and enabling the agent to perform designated tasks effectively. There are two types of agents: WebSearch Agent and WebNav Agent For the WebSearch Agent, we implement the search() function using the Custom Search JSON API, which retrieves Google search results in a structured JSON format. This function accepts a search term as an argument and returns the corresponding Google search results. The WebSearch agent calls the search() function with an appropriate query and then uses the returned search results to generate a response. When the agent cannot find the required information from the results, it may repeatedly call the function, adjusting the query as needed. For the WebNav Agent, we implement the functionality using web automation tools, such as Selenium and BeautifulSoup with Requests. Specifically, we develop two functions: fetch_content() and find_button(). The fetch_content() function takes a URL as an argument and returns the content of the site, while the find_button() function identifies clickable buttons and their corresponding URLs at a given URL. Models We employ commercially available models, whose accessibility and capabilities can encourage misuse by attackers. Specifically, we use GPT-4o, Claude 3.5 Sonnet, and Gemini 1.5 Flash. We utilize the respective APIs: the OpenAI API for GPT-4o, the Anthropic API for Claude 3.5 Sonnet, and the Gemini API for Gemini 1.5 Flash. How to use? You need to replace the placeholder API keys and identifiers with your own. In test_agent.ipynb: Replace OPENAI_API with your OpenAI API key. Replace CLAUDE_API with your Claude API key. Replace GOOGLE_CSE_ID with your Google Custom Search Engine ID. Replace GOOGLE_API_KEY with your Google API key. In utils.py: Replace GOOGLE_CSE_ID with your Google Custom Search Engine ID. Replace GOOGLE_API_KEY with your Google API key. In test_agent.ipynb, you can configure different types of agents by adjusting the web_use and navi_use parameters: LLM Agent web_use = False navi_use = False WebSearch Agent web_use = True navi_use = False WebNav Agent web_use = True navi_use = True
llm, llm agent, agent
llm, llm agent, agent
citations This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 | |
popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |