Web Process Automation and RPA
When a system has no dependable API — or the API does not expose the operation you need — Wrk can automate through the user interface instead. These Wrk Actions simulate what a person would click, type, and read on screen.
Prefer Integrations when a stable API exists. See RPA vs API for the full decision guide.
Web Process Automation
What it is
Web Process Automation drives a headless browser session against websites and web applications. You open a session, perform steps (navigate, click, fill forms, download files), then close the session when finished.
Good for
- Public websites and SaaS web apps without API coverage
- Filling online forms, scraping data, or downloading files from a portal
- Browser-based workflows where standard selectors and DOM interactions work reliably
What it does
- Opens and manages browser sessions (each session stays active for up to 15 minutes)
- Navigates URLs, clicks elements, fills fields, and runs JavaScript on pages
- Handles cookies, captchas, screenshots, and PDF exports from web pages
- Performs best when every session-opening Wrk Action is paired with a session-closing Wrk Action
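The open, perform steps, close lifecycle described above can be sketched in Python. This is only an illustration of the pattern, not Wrk's API: `BrowserSession`, `open_session`, `step`, and the example URL and selectors are all hypothetical names. The only value taken from this page is the 15-minute session limit.

```python
import time
from contextlib import contextmanager

SESSION_LIMIT_SECONDS = 15 * 60  # per-session limit stated on this page


class BrowserSession:
    """Hypothetical stand-in for a browser session (not a real Wrk API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._opened_at = clock()
        self.closed = False
        self.log = []

    def remaining_seconds(self):
        elapsed = self._clock() - self._opened_at
        return max(0.0, SESSION_LIMIT_SECONDS - elapsed)

    def step(self, action, target):
        # Refuse to run steps on a closed or expired session.
        if self.closed:
            raise RuntimeError("session is closed")
        if self.remaining_seconds() <= 0:
            raise TimeoutError("session exceeded the 15-minute limit")
        self.log.append((action, target))

    def close(self):
        self.closed = True


@contextmanager
def open_session(clock=time.monotonic):
    """Pair every open with a close, even when a step raises."""
    session = BrowserSession(clock)
    try:
        yield session
    finally:
        session.close()


with open_session() as s:
    s.step("navigate", "https://example.com/portal")  # hypothetical URL
    s.step("fill", "#invoice-number")                 # hypothetical selector
    s.step("click", "#download")                      # hypothetical selector

print(s.closed, len(s.log))  # prints: True 3
```

The `finally` clause plays the role of the session-closing Wrk Action: the session is released whether the steps succeed or fail, so no session is left open until it times out.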
Go deeper
- Reference: Robotic Process Automation overview — start with Open a browser session
- Tutorials: Build your first Wrkflow
Desktop RPA
What it is
Desktop RPA automates Windows desktop applications through a virtual machine (VM). Wrk's agent connects remotely to your VM — no RPA software needs to be installed on end-user desktops.
Good for
- Legacy or thick-client applications with no web interface
- Internal tools that only expose a desktop UI
- Processes where the target software runs on a dedicated VM you control
What it does
- Connects to and logs into a VM over RDP or VNC
- Simulates mouse clicks, key presses, and screen reads on the desktop
- Downloads and uploads files, captures screenshots, and retrieves clipboard text
- Runs against applications installed and licensed on the VM
Go deeper
- Reference: Desktop RPA reference — start with Connect and log in to a VM
- Concepts: Robotic Process Automation (RPA) — Wrk's VM-based approach and requirements
Vision RPA
What it is
Vision RPA finds and interacts with on-screen elements visually — using OCR text matching, reference images, or AI-generated element descriptions. It works on top of an existing Web Process Automation or Desktop RPA session.
Good for
- Dynamic or frequently changing UIs where standard selectors break
- Applications with non-standard rendering (canvas, shadow DOM, custom widgets)
- Targeting elements described in natural language when you cannot write a reliable selector
What it does
- Clicks, fills, and checks visibility of elements identified by text, image, or description
- Retrieves text and element data from live sessions or screenshots
- Requires a valid session ID from an open browser or desktop session
- Supports rapid iteration using Wrk Action Testing against a reusable session
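To illustrate the session requirement above, here is a minimal Python sketch of a vision step that refuses to run unless it is given the ID of an open session. `VisionTarget`, `vision_click`, and the session ID are hypothetical names for illustration and are not part of Wrk. The three target kinds mirror the text, image, and description modes named on this page.

```python
from dataclasses import dataclass

# The three targeting modes described on this page.
TARGET_KINDS = {"text", "image", "description"}


@dataclass(frozen=True)
class VisionTarget:
    kind: str   # "text" (OCR match), "image" (reference image), or "description"
    value: str  # the text, image name, or natural-language description

    def __post_init__(self):
        if self.kind not in TARGET_KINDS:
            raise ValueError(f"unknown target kind: {self.kind!r}")


def vision_click(open_sessions, session_id, target):
    """Hypothetical vision step: requires an already-open parent session."""
    if session_id not in open_sessions:
        raise LookupError(f"no open session with ID {session_id!r}")
    return {
        "session": session_id,
        "session_type": open_sessions[session_id],
        "action": "click",
        "target": target,
    }


# Hypothetical registry: session ID -> session type ("web" or "desktop").
open_sessions = {"sess-123": "web"}

result = vision_click(
    open_sessions,
    "sess-123",
    VisionTarget("description", "the blue Submit button"),
)
print(result["action"], result["target"].kind)  # prints: click description
```

Keeping the session registry separate from the vision step reflects the layering described above: the web or desktop session owns the connection, and the vision step only borrows it by ID.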
Go deeper
- Reference: Vision RPA reference
- Tutorials: Vision RPA Training
When to use which
| Need | Choose |
|---|---|
| Automate a website or web app with reliable page structure | Web Process Automation |
| Automate a desktop application on a VM | Desktop RPA |
| UI is brittle or changes often, or selectors fail | Vision RPA (on top of a web or desktop session) |
| System has a documented API | Integrations instead — faster and more maintainable |
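The table can also be read as a short decision procedure. The Python sketch below is one possible encoding, not an official rule set. The function name, its parameters, and the order of the checks (API first, then UI type, then selector reliability) are assumptions made for illustration.

```python
def choose_automation(api_covers_operation, target, selectors_reliable=True):
    """Return a recommendation based on the rows of the table above.

    api_covers_operation: True if a dependable API exposes the operation.
    target: "web" or "desktop" (the kind of UI to automate).
    selectors_reliable: False if the UI is brittle or selectors fail.
    """
    if api_covers_operation:
        return "Integrations"
    if target not in ("web", "desktop"):
        raise ValueError("target must be 'web' or 'desktop'")
    base = "Web Process Automation" if target == "web" else "Desktop RPA"
    if not selectors_reliable:
        # Vision RPA runs on top of an existing web or desktop session.
        return f"Vision RPA on top of a {base} session"
    return base


print(choose_automation(False, "desktop", selectors_reliable=False))
# prints: Vision RPA on top of a Desktop RPA session
```

Checking for API coverage first follows the guidance at the top of this page to prefer Integrations whenever a stable API exists.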
Combining types: Many Wrkflows open a web or desktop session, use Vision RPA for hard-to-target elements, then pass extracted data to Integrations or data-transform Wrk Actions downstream.