Web Process Automation and RPA

When a system has no dependable API — or the API does not expose the operation you need — Wrk can automate through the user interface instead. These Wrk Actions simulate what a person would click, type, and read on screen.

Prefer Integrations when a stable API exists. See RPA vs API for the full decision guide.

Web Process Automation

What it is

Web Process Automation drives a headless browser session against websites and web applications. You open a session, perform steps (navigate, click, fill forms, download files), then close the session when finished.

Good for

  • Public websites and SaaS web apps without API coverage
  • Filling online forms, scraping data, or downloading files from a portal
  • Browser-based workflows where standard selectors and DOM interactions work reliably

What it does

  • Opens and manages browser sessions (active for up to 15 minutes per session)
  • Navigates URLs, clicks elements, fills fields, and runs JavaScript on pages
  • Handles cookies and CAPTCHAs, captures screenshots, and exports web pages to PDF
  • Works best when every session-opening Wrk Action is paired with a session-closing Wrk Action
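
The open, act, close pattern described above can be sketched in a few lines of Python. This is only an illustration of the lifecycle, not Wrk's API: `FakeBrowserSession`, `browser_session`, and the step names are hypothetical, and the only figure taken from this page is the 15-minute session limit.

```python
import time
from contextlib import contextmanager

SESSION_LIMIT_SECONDS = 15 * 60  # sessions stay active for up to 15 minutes


class FakeBrowserSession:
    """Hypothetical stand-in for a browser session; not a Wrk API."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self.opened_at = clock()
        self.closed = False
        self.steps = []

    def run(self, step, target):
        # Refuse work on a session that is already closed or has expired.
        if self.closed:
            raise RuntimeError("session is closed")
        if self._clock() - self.opened_at > SESSION_LIMIT_SECONDS:
            raise TimeoutError("session exceeded the 15-minute limit")
        self.steps.append((step, target))

    def close(self):
        self.closed = True


@contextmanager
def browser_session(clock=time.monotonic):
    """Pair the open with a guaranteed close, even if a step fails."""
    session = FakeBrowserSession(clock)
    try:
        yield session
    finally:
        session.close()


if __name__ == "__main__":
    with browser_session() as session:
        session.run("navigate", "https://example.com/login")
        session.run("fill", "#username")
        session.run("click", "button[type=submit]")
    print(session.closed, len(session.steps))
```

The `finally` clause is the point of the sketch: the close step runs whether the steps succeed or raise, which mirrors the advice to always pair the two Wrk Actions.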

Desktop RPA

What it is

Desktop RPA automates Windows desktop applications through a virtual machine (VM). Wrk's agent connects remotely to your VM — no RPA software needs to be installed on end-user desktops.

Good for

  • Legacy or thick-client applications with no web interface
  • Internal tools that only expose a desktop UI
  • Processes where the target software runs on a dedicated VM you control

What it does

  • Connects to and logs into a VM over RDP or VNC
  • Simulates mouse clicks, key presses, and screen reads on the desktop
  • Downloads and uploads files, captures screenshots, and retrieves clipboard text
  • Runs against applications installed and licensed on the VM

Vision RPA

What it is

Vision RPA finds and interacts with on-screen elements visually — using OCR text matching, reference images, or AI-generated element descriptions. It works on top of an existing Web Process Automation or Desktop RPA session.

Good for

  • Dynamic or frequently changing UIs where standard selectors break
  • Applications with non-standard rendering (canvas, shadow DOM, custom widgets)
  • Targeting elements described in natural language when you cannot write a reliable selector

What it does

  • Clicks, fills, and checks visibility of elements identified by text, image, or description
  • Retrieves text and element data from live sessions or screenshots
  • Requires a valid session ID from an open browser or desktop session
  • Supports rapid iteration using Wrk Action Testing against a reusable session
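
As a rough sketch of the two constraints above (a target identified by text, image, or description, and a session that must already be open), the following Python models a Vision RPA click request. Every name here is hypothetical and none of it is Wrk's API; requiring exactly one matching strategy per target is also an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VisualTarget:
    """Hypothetical description of an on-screen element; not a Wrk type."""
    text: Optional[str] = None          # match by OCR text
    image_path: Optional[str] = None    # match by reference image
    description: Optional[str] = None   # match by natural-language description

    def strategy(self) -> str:
        # Assumption for this sketch: exactly one strategy per target.
        candidates = (("text", self.text),
                      ("image", self.image_path),
                      ("description", self.description))
        chosen = [name for name, value in candidates if value]
        if len(chosen) != 1:
            raise ValueError("specify exactly one of text, image_path, description")
        return chosen[0]


def plan_vision_click(session_id: str, open_sessions: set, target: VisualTarget) -> dict:
    """Build a click request only if the session ID refers to an open session."""
    if session_id not in open_sessions:
        raise LookupError(f"no open session with ID {session_id!r}")
    return {"session_id": session_id, "action": "click", "match_by": target.strategy()}


if __name__ == "__main__":
    open_sessions = {"session-123"}  # IDs of sessions opened earlier (made up)
    print(plan_vision_click("session-123", open_sessions, VisualTarget(text="Submit")))
```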

When to use which

Need                                                       | Choose
Automate a website or web app with reliable page structure | Web Process Automation
Automate a desktop application on a VM                     | Desktop RPA
UI is brittle or changes often, or selectors fail          | Vision RPA (on top of a web or desktop session)
System has a documented API                                | Integrations instead (faster and more maintainable)
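
The same decision can be written as a small function. This is a sketch of the guidance on this page, not a Wrk feature; the function name, the `Target` enum, and the parameter names are made up for the example.

```python
from enum import Enum


class Target(Enum):
    WEB = "web"          # website or web application
    DESKTOP = "desktop"  # desktop application running on a VM


def choose_automation(api_covers_operation: bool, target: Target, ui_stable: bool) -> str:
    """Return the automation type suggested by the decision guide."""
    # A stable API that exposes the operation is preferred over UI automation.
    if api_covers_operation:
        return "Integrations"
    base = "Web Process Automation" if target is Target.WEB else "Desktop RPA"
    # Brittle UIs or failing selectors call for Vision RPA on top of a session.
    if not ui_stable:
        return f"Vision RPA on top of {base}"
    return base


if __name__ == "__main__":
    print(choose_automation(False, Target.WEB, ui_stable=True))
    print(choose_automation(False, Target.DESKTOP, ui_stable=False))
    print(choose_automation(True, Target.WEB, ui_stable=True))
```

The API check comes first because an integration is preferred whenever a stable API exposes the operation, regardless of how the UI behaves.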

Combining types: Many Wrkflows open a web or desktop session, use Vision RPA for hard-to-target elements, then pass extracted data to Integrations or data-transform Wrk Actions downstream.