BrowserTool
Use the BrowserTool class to create a new tool that interacts with web browsers. Such a tool can automate web browsing tasks, such as opening a webpage, clicking on elements, and extracting information from web pages.
from npiai import BrowserTool


class MyTool(BrowserTool):
    def __init__(self):
        super().__init__(
            name='<Provide the name of your tool here>',
            description='<Provide a brief description of your tool here>',
            system_prompt='<Provide a system prompt for LLMs to interact with your tool here>',
        )
Attributes
name
- Description: The name of the tool.
- Returns:
str
description
- Description: A brief description of the tool.
- Returns:
str
system_prompt
- Description: A system prompt for LLMs to interact with the tool.
- Returns:
str
tools
- Description: A list containing the JSON schema of the functions registered with the tool.
- Returns:
List[ChatCompletionToolParam]. See the Schema Generation section for more information.
Sync Methods
add_tool
-
Usage:
tool.add_tool(tool_a, tool_b, agent_a, agent_b)
-
Description: Adds tools or agents to the tool.
-
Arguments:
*tools (BaseTool)
: The tools to add. Each can be either a Tool or an Agent returned by the npiai.agent.wrap function.
-
Returns: None
use_hitl
-
Usage:
tool.use_hitl(hitl_handler)
-
Description: Attaches the given HITL (human-in-the-loop) handler to this tool and all its sub-tools.
-
Arguments:
hitl (HITLHandler)
: The HITL handler to attach.
-
Returns: None
Async Methods
start
-
Usage:
await tool.start()
-
Description: Starts the tool.
-
Arguments: None
-
Returns: None
end
-
Usage:
await tool.end()
-
Description: Ends the tool.
-
Arguments: None
-
Returns: None
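Because start and end bracket every session, it can help to pair them so that end is awaited even when the work in between fails. The following is a minimal sketch of that pattern, not part of the library: StubTool is a hypothetical stand-in that mimics only the start and end methods documented above, and run_session and demo_task are illustrative helper names.

```python
import asyncio


class StubTool:
    """Hypothetical stand-in for a BrowserTool subclass (start/end only)."""

    def __init__(self):
        self.events = []

    async def start(self):
        self.events.append("start")

    async def end(self):
        self.events.append("end")


async def run_session(tool, task):
    """Start the tool, await task(tool), and always end the tool afterwards."""
    await tool.start()
    try:
        return await task(tool)
    finally:
        # The finally clause ensures end() is awaited even if the task raises.
        await tool.end()


async def demo_task(tool):
    # Placeholder for real browsing work done between start() and end().
    tool.events.append("work")
    return "done"


if __name__ == "__main__":
    stub = StubTool()
    print(asyncio.run(run_session(stub, demo_task)), stub.events)
```

With a real tool, the same helper would be called as run_session(MyTool(), some_task), assuming the subclass exposes start and end as described in this section.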
call
-
Usage:
results = await tool.call(llm_response.tool_calls)
-
Description: Calls the function corresponding to the LLM tool calls and returns the results.
-
Arguments:
tool_calls (List[ChatCompletionMessageToolCall])
: The tool calls from the LLM response.
-
Returns:
List[ChatCompletionMessageToolCall]
- role: "tool"
- tool_call_id (str): The ID of the tool call.
- content (str): The result of the corresponding function call.
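The results of call are shaped as tool messages, so a common next step is to append them to the conversation history before asking the LLM for its next turn. The sketch below illustrates that step under stated assumptions: StubTool is a hypothetical stand-in whose call method returns plain dicts with the three fields listed above, run_tool_calls is an illustrative helper, and the caller is assumed to have already appended the assistant message that requested the tool calls.

```python
import asyncio
from types import SimpleNamespace


class StubTool:
    """Hypothetical stand-in; call() returns canned tool messages as dicts."""

    async def call(self, tool_calls):
        return [
            {
                "role": "tool",
                "tool_call_id": tc.id,
                "content": f"ran {tc.function.name}",
            }
            for tc in tool_calls
        ]


async def run_tool_calls(tool, messages, tool_calls):
    """Execute the requested tool calls and append the results to messages."""
    if not tool_calls:
        # The model answered directly, so there is nothing to execute.
        return messages
    results = await tool.call(tool_calls)
    messages.extend(results)
    return messages


if __name__ == "__main__":
    # A fake tool call with the attributes the stub reads (id, function.name).
    fake_call = SimpleNamespace(
        id="call_1",
        function=SimpleNamespace(name="get_text", arguments="{}"),
    )
    print(asyncio.run(run_tool_calls(StubTool(), [], [fake_call])))
```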
Browser-related Methods
The BrowserTool class provides several methods for interacting with web browsers. Here are some of the key methods:
get_text
-
Usage:
text = await tool.get_text()
-
Description: Extracts the text content of the current web page as markdown.
-
Arguments: None
-
Returns:
str
goto_blank
-
Usage:
await tool.goto_blank()
-
Description: Navigates to the about:blank page. This can be used as a cleanup step when a session finishes.
-
Arguments: None
-
Returns: None
get_screenshot
-
Usage:
screenshot = await tool.get_screenshot()
-
Description: Takes a screenshot of the current web page.
-
Arguments: None
-
Returns:
str | None
: The base64-encoded image data, or None if the screenshot fails.
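Since get_screenshot can return None, callers should check the result before decoding it. The sketch below shows one way to do that. StubTool and screenshot_bytes are hypothetical names, and the handling of a "data:...;base64," prefix is an assumption: this section does not say whether the string is a bare base64 payload or a data URL, so the helper accepts both.

```python
import asyncio
import base64


class StubTool:
    """Hypothetical stand-in that returns a preset screenshot value."""

    def __init__(self, screenshot):
        self._screenshot = screenshot

    async def get_screenshot(self):
        return self._screenshot


async def screenshot_bytes(tool):
    """Return the decoded screenshot bytes, or None if the capture failed."""
    data = await tool.get_screenshot()
    if data is None:
        return None
    # Assumption: the string may be a data URL; if so, keep only the
    # base64 payload after the first comma.
    if data.startswith("data:") and "," in data:
        data = data.split(",", 1)[1]
    return base64.b64decode(data)


if __name__ == "__main__":
    payload = base64.b64encode(b"fake-image").decode()
    print(asyncio.run(screenshot_bytes(StubTool(payload))))
```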
get_page_url
-
Usage:
url = await tool.get_page_url()
-
Description: Gets the URL of the current web page.
-
Arguments: None
-
Returns:
str
get_page_title
-
Usage:
title = await tool.get_page_title()
-
Description: Gets the title of the current web page.
-
Arguments: None
-
Returns:
str
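get_page_url and get_page_title are often read together, for example to log where a session currently is. The following sketch combines them into one summary. StubTool and describe_page are hypothetical names, and the extra host field is derived with the standard library's urlparse rather than anything provided by BrowserTool.

```python
import asyncio
from urllib.parse import urlparse


class StubTool:
    """Hypothetical stand-in that reports a fixed URL and title."""

    def __init__(self, url, title):
        self._url = url
        self._title = title

    async def get_page_url(self):
        return self._url

    async def get_page_title(self):
        return self._title


async def describe_page(tool):
    """Collect the current page's URL, title, and host name into a dict."""
    url = await tool.get_page_url()
    title = await tool.get_page_title()
    return {"url": url, "title": title, "host": urlparse(url).hostname}


if __name__ == "__main__":
    stub = StubTool("https://example.com/docs", "Docs")
    print(asyncio.run(describe_page(stub)))
```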
is_scrollable
-
Usage:
scrollable = await tool.is_scrollable()
-
Description: Checks if the current web page is scrollable.
-
Arguments: None
-
Returns:
bool
click
-
Usage:
await tool.click(element_handle)
-
Description: Clicks on the specified element on the web page.
-
Arguments:
element_handle (playwright.ElementHandle)
: The handle of the element to click. See the Playwright documentation for more details.
-
Returns: None
fill
-
Usage:
await tool.fill(element_handle, text)
-
Description: Fills the specified text into the input field of the element on the web page.
-
Arguments:
element_handle (playwright.ElementHandle)
: The handle of the input element. See the Playwright documentation for more details.
text (str)
: The text to fill into the input field.
-
Returns: None
select
-
Usage:
await tool.select(element_handle, value)
-
Description: Selects the specified value from the dropdown list element on the web page.
-
Arguments:
element_handle (playwright.ElementHandle)
: The handle of the dropdown list element. See the Playwright documentation for more details.
value (str)
: The value to select from the dropdown list.
-
Returns: None
enter
-
Usage:
await tool.enter(element_handle)
-
Description: Presses the Enter key on the specified element on the web page.
-
Arguments:
element_handle (playwright.ElementHandle)
: The handle of the element on which to press the Enter key. See the Playwright documentation for more details.
-
Returns: None
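fill and enter are naturally used as a pair, for instance to type a query into a search box and submit it. The sketch below shows that sequence. StubTool and submit_text are hypothetical names, the stub records calls instead of driving a browser, and the string "#q" merely stands in for a real Playwright ElementHandle, since obtaining a handle is outside the scope of this section.

```python
import asyncio


class StubTool:
    """Hypothetical stand-in that records fill() and enter() calls."""

    def __init__(self):
        self.actions = []

    async def fill(self, element_handle, text):
        self.actions.append(("fill", element_handle, text))

    async def enter(self, element_handle):
        self.actions.append(("enter", element_handle))


async def submit_text(tool, input_handle, text):
    """Type text into an input element, then press Enter on that element."""
    await tool.fill(input_handle, text)
    await tool.enter(input_handle)


if __name__ == "__main__":
    stub = StubTool()
    # "#q" is a placeholder; a real call would pass an ElementHandle.
    asyncio.run(submit_text(stub, "#q", "hello"))
    print(stub.actions)
```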
scroll
-
Usage:
await tool.scroll()
-
Description: Scrolls the web page down by one viewport height to reveal more content.
-
Arguments: None
-
Returns: None
back_to_top
-
Usage:
await tool.back_to_top()
-
Description: Scrolls the web page back to the top.
-
Arguments: None
-
Returns: None
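The scrolling helpers can be combined with get_text to read pages that load more content as they scroll. The following is a sketch under stated assumptions: StubTool is a hypothetical stand-in that reveals one more chunk of text per scroll, and load_full_text is an illustrative helper. This section does not say whether is_scrollable becomes False at the bottom of a page, so the loop also stops when the text stops changing or when max_scrolls is reached.

```python
import asyncio


class StubTool:
    """Hypothetical stand-in: each scroll() reveals one more text chunk."""

    def __init__(self, chunks):
        self._chunks = chunks
        self._visible = 1
        self.at_top = True

    async def get_text(self):
        return "\n".join(self._chunks[: self._visible])

    async def is_scrollable(self):
        return len(self._chunks) > 1

    async def scroll(self):
        self._visible = min(self._visible + 1, len(self._chunks))
        self.at_top = False

    async def back_to_top(self):
        self.at_top = True


async def load_full_text(tool, max_scrolls=10):
    """Scroll until no new text appears, then return the page text."""
    text = await tool.get_text()
    for _ in range(max_scrolls):
        if not await tool.is_scrollable():
            break
        await tool.scroll()
        new_text = await tool.get_text()
        if new_text == text:
            break  # scrolling revealed nothing new
        text = new_text
    # Restore the scroll position so later steps start from the top.
    await tool.back_to_top()
    return text


if __name__ == "__main__":
    print(asyncio.run(load_full_text(StubTool(["intro", "body", "footer"]))))
```

The max_scrolls cap is a safeguard against pages that keep producing new content indefinitely.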