Skyvern: Workflow Automation That Actually Handles Complex Forms

I needed to automate a client onboarding process - a 15-step workflow with multiple forms, document uploads, and verification steps. Traditional RPA tools couldn't handle the dynamic forms. Skyvern promised AI-powered workflow automation, but it would get stuck on validation errors and couldn't handle complex field types. Here's how I got reliable end-to-end automation.

Problem

The onboarding form had fields that would appear/disappear based on previous answers, and real-time validation that changed error messages dynamically. Skyvern would fill out the form but fail on submission because it didn't handle the dynamic validation.

Error: FormSubmissionError: Validation failed - 3 fields have errors

What I Tried

Attempt 1: Added explicit wait times between fields. This was unreliable and slow.
Attempt 2: Tried to fill all visible fields at once. The form would reject partially filled data.
Attempt 3: Used screenshots to read validation errors. This worked but required manual setup for each form.

Actual Fix

Configured Skyvern's adaptive form mode with real-time validation monitoring. The system now watches for validation state changes, reads error messages dynamically, and retries fields with validation feedback.

# Skyvern workflow configuration with adaptive form handling
workflow:
  name: client_onboarding
  description: Automated client onboarding with dynamic forms

  # Form handling settings
  form_settings:
    # Adaptive form detection
    detect_dynamic_fields: true
    detect_conditional_fields: true
    detect_validation_rules: true

    # Validation monitoring
    monitor_validation_state: true
    validation_check_interval: 500  # Check every 500ms
    wait_for_stable_validation: true
    stable_validation_duration: 1000  # Wait 1s of no errors

    # Error handling
    read_validation_errors: true
    error_selector_strategy: "auto"  # Auto-detect error messages
    retry_on_validation_error: true
    max_field_retries: 3

    # Field interaction
    fill_method: "sequential"  # Fill one field at a time
    wait_between_fields: 200  # 200ms between fields
    trigger_validation_on_change: true

  # Steps
  steps:
    - name: navigate_to_form
      action: goto
      url: "https://example.com/onboarding"

    - name: fill_personal_info
      action: fill_form
      fields:
        - name: first_name
          selector: "#firstName"
          value: "{{client.first_name}}"
          validate_after: true
        - name: last_name
          selector: "#lastName"
          value: "{{client.last_name}}"
          validate_after: true
        - name: email
          selector: "#email"
          value: "{{client.email}}"
          validate_after: true
          # This field has complex validation
          wait_for_validation: true
          read_error_message: true

    - name: handle_conditional_fields
      action: conditional_fill
      condition:
        field: "company_type"
        equals: "LLC"
      then_fill:
        - field: "ein_number"
          value: "{{client.ein}}"

    - name: submit_and_verify
      action: submit_form
      verify_success: true
      success_selector: ".success-message"

Problem

The onboarding workflow spanned multiple pages and required data from earlier steps (like a generated ID). Skyvern would complete step 1, but by step 3 it had "forgotten" the ID from step 1, causing failures.

What I Tried

Attempt 1: Hard-coded the ID in the workflow. This only worked for one run.
Attempt 2: Extracted data to a file and read it back. This was fragile and required extra steps.

Actual Fix

Enabled Skyvern's state management with variable extraction and persistence. The system now extracts values from pages, stores them in workflow state, and makes them available to subsequent steps.

# Workflow with state management
workflow:
  name: multi_step_onboarding
  description: Multi-step workflow with state persistence

  # State management
  state:
    persist_state: true
    state_file: "./workflows/state/{{workflow_id}}.json"
    # Extract variables from pages
    extract_variables:
      - name: client_id
        selector: ".client-id"
        extract_method: "text"
        required: true
      - name: confirmation_code
        selector: "[data-confirmation-code]"
        extract_method: "attribute"
        attribute: "data-confirmation-code"
      - name: session_token
        selector: "meta[name=session-token]"
        extract_method: "attribute"
        attribute: "content"

  # Steps with state usage
  steps:
    - name: create_account
      action: submit_form
      form:
        fields:
          - email: "{{client.email}}"
          - password: "{{client.password}}"
      # Extract client_id from response page
      extract:
        client_id:
          selector: ".client-id"
          required: true

    - name: upload_documents
      action: upload_files
      files:
        - field: "id_document"
          path: "{{client.id_doc_path}}"
        - field: "proof_address"
          path: "{{client.address_doc_path}}"
      # Use extracted client_id in URL
      url_params:
        client_id: "{{state.client_id}}"

    - name: verify_identity
      action: navigate
      url: "https://example.com/verify?client={{state.client_id}}&code={{state.confirmation_code}}"

    - name: complete_setup
      action: submit_form
      form:
        fields:
          - session_token: "{{state.session_token}}"
          - preferences: "{{client.preferences}}"

Problem

The workflow required uploading PDF and image files. Skyvern would try to upload but the server would reject files due to security checks, wrong file types, or file size issues.

What I Tried

Attempt 1: Converted all files to PDF. Some uploads required images specifically.
Attempt 2: Disabled browser security. This didn't work and was unsafe.

Actual Fix

Configured file upload handling with automatic type detection, size validation, and security header management. Skyvern now validates files before upload, handles multipart forms correctly, and works with cloud storage.

# File upload configuration
workflow:
  name: document_upload_workflow

  # File handling
  file_settings:
    # Validation
    validate_before_upload: true
    check_file_size: true
    max_file_size: 10485760  # 10MB
    check_file_type: true
    allowed_types:
      - "application/pdf"
      - "image/png"
      - "image/jpeg"

    # Conversion
    auto_convert_format: false
    compress_if_needed: true
    compression_quality: 0.9

    # Upload method
    upload_method: "multipart"  # or "base64", "cloud_storage"
    chunk_size: 5242880  # 5MB chunks

    # Headers
    set_content_type: true
    set_content_disposition: true
    custom_headers:
      X-File-Authorization: "{{api_token}}"

  # Upload steps
  steps:
    - name: upload_id_document
      action: upload_file
      file:
        path: "{{client.id_document}}"
        field: "id_doc"
        # Validate and prepare
        validate: true
        convert_if_needed: false
      # Handle upload response
      on_success:
        extract:
          upload_id:
            selector: ".upload-id"
            required: true
      on_failure:
        retry: true
        max_retries: 3
        fallback_action: "convert_and_retry"

    - name: upload_proof_of_address
      action: upload_file
      file:
        path: "{{client.address_proof}}"
        field: "address_proof"
        # This file might need conversion
        auto_convert: true
        target_format: "pdf"
      # Cloud storage option
      use_cloud_storage: true
      storage_provider: "s3"
      storage_bucket: "workflow-uploads"

What I Learned

Production Setup

Complete setup for production workflow automation.

# Install Skyvern
pip install skyvern

# Install browser dependencies
playwright install chromium

# For file handling (optional)
pip install python-magic  # File type detection
pip install pillow  # Image processing

Production workflow runner:

import asyncio
from skyvern import Skyvern
from skyvern.config import WorkflowConfig
from pathlib import Path

class WorkflowRunner:
    """Production workflow automation runner."""

    def __init__(self):
        self.skyvern = Skyvern(
            # Browser settings
            browser_type="chromium",
            headless=True,
            # Performance
            concurrency=2,  # Run 2 workflows in parallel
            # State
            state_dir="./workflows/state",
            checkpoint_dir="./workflows/checkpoints",
            # Monitoring
            log_level="INFO",
            log_dir="./logs",
            metrics_enabled=True
        )

    async def run_workflow(self, workflow_path: str, data: dict):
        """Run a workflow with data."""
        # Load workflow
        config = WorkflowConfig.from_yaml(workflow_path)

        # Validate data
        config.validate_data(data)

        # Run workflow
        result = await self.skyvern.run(
            config=config,
            data=data,
            # Options
            save_checkpoint=True,
            resume_from_checkpoint=True,
            on_error="continue"  # Continue to next step on error
        )

        return result

    async def run_batch(self, workflow_path: str, data_list: list[dict]):
        """Run workflow for multiple data sets."""
        tasks = [
            self.run_workflow(workflow_path, data)
            for data in data_list
        ]

        results = await asyncio.gather(*tasks, return_exceptions=True)

        # Process results
        for i, result in enumerate(results):
            if isinstance(result, Exception):
                print(f"Workflow {i} failed: {result}")
            else:
                print(f"Workflow {i} completed: {result.success}")

# Usage
async def main():
    runner = WorkflowRunner()

    # Single workflow
    result = await runner.run_workflow(
        workflow_path="./workflows/client_onboarding.yaml",
        data={
            "client": {
                "first_name": "John",
                "last_name": "Doe",
                "email": "john@example.com",
                "company_type": "LLC",
                "ein": "12-3456789",
                "id_doc_path": "./docs/id.pdf",
                "address_doc_path": "./docs/address.pdf"
            }
        }
    )

    print(f"Success: {result.success}")
    print(f"Client ID: {result.state.get('client_id')}")

    # Batch processing
    clients = load_clients_from_csv("./clients.csv")
    await runner.run_batch(
        workflow_path="./workflows/client_onboarding.yaml",
        data_list=clients
    )

if __name__ == "__main__":
    asyncio.run(main())

Monitoring & Debugging

Key metrics for workflow automation.

Red Flags to Watch For

Debug Commands

# Test workflow with dry run
skyvern test \
    --workflow ./workflows/client_onboarding.yaml \
    --data ./test_data.json \
    --dry-run

# Validate workflow configuration
skyvern validate \
    --workflow ./workflows/client_onboarding.yaml

# Run workflow with debugging
skyvern run \
    --workflow ./workflows/client_onboarding.yaml \
    --data ./test_data.json \
    --debug \
    --screenshot-on-error

# View workflow logs
skyvern logs \
    --workflow client_onboarding \
    --tail

# Analyze workflow performance
skyvern analyze \
    --log_dir ./logs \
    --workflow client_onboarding

Related Resources