
Supported Steps

This page lists all the available action types you can use when creating flow steps manually or through natural language processing.

Basic Interaction Actions

  • Tap - Tap or click on an element or coordinates
  • Type - Type text into a field (supports fake data generation)
  • Clear - Clear text from input fields
  • Long Press - Long press on an element (supports custom duration)
  • Double Tap - Double tap on an element

Controls

  • Swipe - Swipe in any direction (up, down, left, right) or to coordinates
  • Scroll - Scroll on the screen
  • Custom ADB - Execute custom ADB commands
  • Subflow - Execute another flow as a subflow


Time & Date Selection

  • Select Time - Select specific time from time picker
  • Select Relative Time - Select time relative to current time (e.g., "2 hours from now")
  • Select Date - Select date from date picker
  • Select Date Range - Select a date range with start and end dates


Screen Analysis

  • Verify Screen - Use AI vision to verify screen conditions with true/false responses

Advanced Actions

  • Drag Element - Drag from one element to another
  • Play Audio - Play audio files from the server/audio/ directory

Advanced Step Editing

The Flow Steps Editor provides powerful tools for creating and managing complex automation sequences. This section covers advanced editing features and detailed field requirements for each step type.

Step Editor Interface

Main Functions

  • Add Step - Insert new steps at specific positions in the flow
  • Edit Step - Modify existing step properties and parameters
  • Delete Step - Remove unwanted steps from the flow
  • Duplicate Step - Create copies of existing steps
  • Move Steps - Reorder steps using up/down buttons or drag-and-drop
  • Execute Step - Test individual steps without running the full flow
  • Extract Steps - Create subflows from selected consecutive steps

Step Management Features

  1. Drag-and-Drop Reordering

    • Click and drag the grip handle (⋮⋮) to reorder steps
    • Visual feedback during dragging with opacity changes
    • Automatic position updates
  2. Step Selection for Extraction

    • Enable "Select Steps to Extract" mode
    • Click checkboxes to select consecutive steps
    • Extract selected steps as reusable subflows
    • Validation ensures selected steps are consecutive
  3. Individual Step Execution

    • Click the play button (▶) next to any step
    • Executes only that step on the target device
    • Useful for testing and debugging individual actions
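The consecutive-selection rule used by step extraction can be sketched as follows. This is a minimal illustration, not the editor's actual code; the function name is_consecutive and the use of integer step positions are assumptions.

```python
def is_consecutive(selected_positions):
    """Return True if the selected step positions form one unbroken run.

    `selected_positions` is a hypothetical list of 1-based step positions
    taken from the ticked checkboxes; order of selection does not matter.
    """
    if not selected_positions:
        return False
    ordered = sorted(set(selected_positions))
    # An unbroken run of n distinct integers spans exactly n positions.
    return ordered[-1] - ordered[0] + 1 == len(ordered)


print(is_consecutive([3, 4, 5]))  # True
print(is_consecutive([2, 4, 5]))  # False
```

A check like this would run before extraction so that a selection with a gap is rejected instead of producing a subflow that silently skips a step.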


Step Fields and Requirements

Basic Interaction Actions

Tap
  • Target Element (required): Element to tap on screen
  • Cache Coordinates (optional): Store element position for faster subsequent executions
  • Device (optional): Target device (defaults to main device)
Long Press
  • Target Element (required): Element to long press
  • Duration (required): Press duration in seconds (default: 3 seconds)
  • Cache Coordinates (optional)
  • Device (optional)
Double Tap
  • Target Element (required): Element to double tap
  • Cache Coordinates (optional)
  • Device (optional)
Type
  • Text (required when not using fake data): Text to type
  • Target Element (optional): Input field to focus (auto-detected if not specified)
  • Use Fake Data (optional): Generate realistic test data instead of manual text
  • Fake Data Type (required if using fake data):
    • auto - Auto-detect based on context
    • firstName, lastName, fullName - Name generation
    • email - Email address generation
    • phone - Phone number generation
    • address, city - Address generation
    • company - Company name generation
    • password - Password generation
    • food - Food item generation
    • safeUrl, youtubeUrl - URL generation
    • words - Random lorem ipsum words
  • Cache Coordinates (optional)
  • Device (optional)
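The field rules for a Type step (Text is required unless fake data is used, and Fake Data Type is required when it is) can be expressed as a small validator. The dictionary keys text, useFakeData and fakeDataType are hypothetical names chosen for illustration; only the fake data type values come from the list above.

```python
# Fake data types as listed in this page.
FAKE_DATA_TYPES = {
    "auto", "firstName", "lastName", "fullName", "email", "phone",
    "address", "city", "company", "password", "food",
    "safeUrl", "youtubeUrl", "words",
}


def validate_type_step(step):
    """Return a list of validation errors for a Type step (empty if valid).

    `step` is a plain dict; its key names are assumptions, not the
    product's real schema.
    """
    errors = []
    if step.get("useFakeData"):
        fake_type = step.get("fakeDataType")
        if fake_type is None:
            errors.append("Fake Data Type is required when Use Fake Data is enabled")
        elif fake_type not in FAKE_DATA_TYPES:
            errors.append(f"Unknown Fake Data Type: {fake_type}")
    elif not step.get("text"):
        errors.append("Text is required when not using fake data")
    return errors


print(validate_type_step({"useFakeData": True, "fakeDataType": "email"}))  # []
print(validate_type_step({"text": ""}))
```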


Clear
  • Iterations (required): Number of clear attempts (default: 100)
  • Cache Coordinates (optional)
  • Device (optional)
Swipe
  • Target Element (optional): Element to swipe from
  • Direction (required): up, down, left, right
  • Distance (required): Swipe distance in pixels (default: 300)
  • Cache Coordinates (optional)
  • Device (optional)
Scroll
  • Target Element (optional): Area to scroll in
  • Direction (required): up, down, left, right
  • Distance (required): Scroll distance in pixels (default: 300)
  • Cache Coordinates (optional)
  • Device (optional)
Custom ADB
  • ADB Command (required): ADB shell command to execute
  • Timeout (required): Command timeout in seconds (default: 30)
  • Device (optional)
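To illustrate how the ADB Command, Timeout and Device fields might combine, here is a rough sketch that runs a shell command through the adb CLI with a timeout. It assumes adb is on PATH and that a device is identified by its serial; the helper names are hypothetical and this is not necessarily how the product invokes ADB.

```python
import subprocess


def build_adb_command(shell_command, device_serial=None):
    """Build the argv list for `adb [-s SERIAL] shell COMMAND`."""
    cmd = ["adb"]
    if device_serial:
        cmd += ["-s", device_serial]  # target a specific device
    # The command is passed as one argument and interpreted by the device shell.
    cmd += ["shell", shell_command]
    return cmd


def run_adb(shell_command, timeout=30, device_serial=None):
    """Run the command; return stdout, or None if the timeout expires."""
    try:
        result = subprocess.run(
            build_adb_command(shell_command, device_serial),
            capture_output=True,
            text=True,
            timeout=timeout,  # seconds, matching the documented default of 30
        )
    except subprocess.TimeoutExpired:
        return None
    return result.stdout


print(build_adb_command("input keyevent 3", device_serial="emulator-5554"))
```

Calling run_adb("input keyevent 3") would send the Home key event to the default device; the example only prints the argument list so it can be read without a device attached.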

Time & Date Selection

Select Time
  • Target Time (required): Time in format like "2:30 PM" or "14:30"
  • Validate Selection (optional): Verify time was set correctly
  • Max Scroll Attempts (optional): Maximum attempts to find time (default: 10)
  • Device (optional)
Select Relative Time
  • Hours (required): Hours from now (can be negative)
  • Minutes (required): Minutes from now (can be negative)
  • Validate Selection (optional)
  • Max Scroll Attempts (optional)
  • Device (optional)
Select Date
  • Target Date (required): Date in formats like "2024-12-25" or "December 25, 2024"
  • Validate Selection (optional)
  • Max Scroll Attempts (optional)
  • Device (optional)
Select Date Range
  • Start Date (required): Beginning date of range
  • End Date (required): Ending date of range
  • Validate Selection (optional)
  • Max Scroll Attempts (optional)
  • Device (optional)
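As an example of how Select Relative Time offsets resolve to a concrete time, the sketch below adds signed Hours and Minutes to a reference time. The function name resolve_relative_time is an assumption; the fixed base time is only there to make the output reproducible.

```python
from datetime import datetime, timedelta


def resolve_relative_time(hours, minutes, now=None):
    """Return the target datetime for the given offsets (either may be negative)."""
    now = now or datetime.now()
    return now + timedelta(hours=hours, minutes=minutes)


base = datetime(2024, 12, 25, 14, 30)
print(resolve_relative_time(2, 0, base).strftime("%H:%M"))     # 16:30
print(resolve_relative_time(-1, -45, base).strftime("%H:%M"))  # 12:45
```

Note that an offset can cross midnight, in which case the resolved date changes as well as the time.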

Screen Analysis

Verify Screen
  • Verification Query (required): Question about the current screen state that can be answered with true/false
  • Device (optional)
  • AI Response: The Ollama vision model analyzes the screenshot and responds with only "true" or "false"
  • Flow Behavior:
    • If "true": Flow continues to next step
    • If "false": Flow execution stops with failure
  • Variable Support: Use ${variableName} syntax to reference data from previous steps
  • Query Examples:
    • "Is there a blue submit button on screen?"
    • "Does the welcome message contain the user's name?"
    • "Is the error message visible?"
    • "Has the loading spinner disappeared?"
  • Technical Details:
    • Takes a fresh screenshot of the current device screen
    • Sends screenshot and query to Ollama API (default: localhost:11434)
    • Makes 3 parallel requests for reliability and uses majority voting
    • Parses response to extract true/false answer
    • Supports both direct queries and variable substitution
    • Requires Ollama service to be running and configured (see Configuration Requirements)
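The parsing and majority-voting steps described above can be sketched like this. It is an illustration under assumptions: the function names are hypothetical, and the tie-breaking rule (treat a tie as "false") is a choice made here, not documented behavior.

```python
import re
from collections import Counter


def parse_verdict(raw):
    """Extract True/False from a model reply, or None if neither word appears."""
    match = re.search(r"\b(true|false)\b", raw.lower())
    if match is None:
        return None
    return match.group(1) == "true"


def majority_vote(raw_responses):
    """Combine several replies into one verdict.

    Unparseable replies are ignored. A tie (possible when one of three
    replies is unparseable) resolves to False so the flow fails closed;
    that rule is an assumption.
    """
    verdicts = [v for v in map(parse_verdict, raw_responses) if v is not None]
    if not verdicts:
        raise ValueError("no response contained true or false")
    counts = Counter(verdicts)
    return counts[True] > counts[False]


print(majority_vote(["true", "True.", "false"]))    # True
print(majority_vote(["false", "no idea", "true"]))  # False
```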

Understanding Verify Screen

Verify Screen

  • Purpose: Screen state validation with binary outcomes
  • Response: Returns true/false based on screen condition
  • Use Case: Validate that the screen meets specific criteria
  • Flow Impact: Stops flow if verification fails
  • Example: Verify "Success message is visible after login"

Best Practices for Verify Screen

Query Formulation

  1. Be Specific: Use clear, unambiguous questions

    • Good: "Is there a green 'Submit' button at the bottom of the screen?"
    • Poor: "Is there a button?"
  2. Binary Questions: Frame questions that can be answered with true/false

    • Good: "Is the error message visible?"
    • Poor: "What error message is shown?"
  3. Context Awareness: Include relevant context in queries

    • Good: "Is the user's email address displayed in the profile header?"
    • Poor: "Is email there?"

Variable Integration

  1. Dynamic Validation: Reference data from previous steps

    Step 3 (Type): Enter "john.doe@example.com" in email field
    Step 7 (Verify Screen): Is the email "${step3text}" visible in the profile?
  2. Multi-step Validation: Chain multiple verifications

    Step 5 (Verify Screen): Is the "Welcome ${step2text}" message shown?
    Step 8 (Verify Screen): Has the loading spinner disappeared?

Error Handling

  1. Service Dependencies: Ensure Ollama is running and properly configured before using verify_screen
  2. Configuration Requirements: Set OLLAMA_BASE_URL and OLLAMA_MODEL_ID in server/.env
  3. Network Considerations: Verify Screen makes API calls to the Ollama service, so the server must be able to reach OLLAMA_BASE_URL
  4. Fallback Strategies: Use multiple verify_screen steps for comprehensive validation

Performance Tips

  1. Strategic Placement: Use verify_screen at critical validation points
  2. Avoid Overuse: Don't verify every minor UI change
  3. Combine with Waits: Add wait steps before verify_screen for dynamic content

Advanced Usage

  1. Complex Conditions: Break complex verifications into multiple steps
  2. State Validation: Verify application state transitions
  3. Data Integrity: Confirm data persistence after actions

Ollama Configuration Requirements

To use the Verify Screen step, you must configure Ollama in your server/.env file:

# Ollama Configuration (for screen verification)
OLLAMA_BASE_URL=http://localhost:11434
OLLAMA_MODEL_ID=your-model-id-here

Setup Steps:

  1. Install Ollama: Download and install from https://ollama.ai
  2. Pull a Vision Model: Run ollama pull <model-name>
    • Recommended: ollama pull qwen3-vl:30b-a3b-instruct-q8_0 (or any qwen3-vl model)
    • Alternative models: ollama pull llava or ollama pull bakllava
  3. Start Ollama Service: Ensure Ollama is running (usually starts automatically)
  4. Configure Environment: Set OLLAMA_BASE_URL and OLLAMA_MODEL_ID in server/.env
  5. Verify Connection: Test with curl http://localhost:11434/api/tags to see available models
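The connection check in step 5 can also be scripted. The sketch below queries the same /api/tags endpoint and checks whether the configured model has been pulled. The helper names are hypothetical, and matching a bare name such as llava against llava:latest is an assumption about how you want to compare IDs.

```python
import json
import os
import urllib.request


def model_names(tags_payload):
    """Return the model names from a parsed /api/tags response."""
    return [m["name"] for m in tags_payload.get("models", [])]


def is_model_available(model_id, names):
    """True if model_id matches a pulled model exactly or by its base name."""
    return any(n == model_id or n.split(":")[0] == model_id for n in names)


def check_ollama():
    """Fetch /api/tags from OLLAMA_BASE_URL and report on OLLAMA_MODEL_ID."""
    base_url = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
    model_id = os.environ.get("OLLAMA_MODEL_ID", "")
    url = f"{base_url.rstrip('/')}/api/tags"
    with urllib.request.urlopen(url, timeout=5) as resp:
        names = model_names(json.load(resp))
    return is_model_available(model_id, names), names


# Usage (requires a running Ollama service):
#   ok, names = check_ollama()
#   print("model available" if ok else "model missing", names)
```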

Important Notes:

  • The OLLAMA_MODEL_ID must match a vision-capable model you've pulled (models that support image input)
  • Recommended Model: qwen3-vl:30b-a3b-instruct-q8_0 or any qwen3-vl model variant
  • Other supported vision models: llava, bakllava, llava:13b, etc.
  • If Ollama is not configured, verify_screen steps will fail with a clear error message
  • The verification makes 3 parallel requests for reliability and uses majority voting

Advanced Actions

Drag Element
  • Source Element (required): Element to drag from
  • Target Element (required): Element to drop on
  • Long Press Duration (required): Initial press duration in seconds (default: 1)
  • Cache Coordinates (optional)
  • Device (optional)
Subflow
  • Subflow Selection (required): Choose from available subflows
  • Device (optional): Inherits from subflow unless overridden
Play Audio
  • Audio File (required): Select from available audio files in server/audio/ directory
  • Device (optional): Target device (defaults to main device)
  • Supported Formats: .mp3, .wav, .m4a, .aac, .ogg, .flac (maximum file size: 50MB)
  • File Management: Audio files can be managed through the Audio Management interface
  • Default File: If no file is specified, defaults to "pasta.mp3"
  • File Loading: The system automatically loads available audio files from the server directory
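The format and size limits above can be checked before a file is placed in server/audio/. This is a minimal sketch: the function name is hypothetical, and interpreting 50MB as 50 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".m4a", ".aac", ".ogg", ".flac"}
MAX_SIZE_BYTES = 50 * 1024 * 1024  # assumed meaning of "50MB"


def check_audio_file(filename, size_bytes):
    """Return an error message, or None if the file passes both checks."""
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        return f"unsupported format: {ext or '(none)'}"
    if size_bytes > MAX_SIZE_BYTES:
        return "file exceeds 50MB"
    return None


print(check_audio_file("pasta.mp3", 2_000_000))  # None
print(check_audio_file("notes.txt", 100))        # unsupported format: .txt
```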


Record Audio
  • Duration (required): Recording duration in seconds (default: 5 seconds)
  • Device (optional): Target device to record audio from
  • Output Format: Recordings are saved as WAV files with randomized filenames
  • Storage: Audio files are stored in the /record-audio directory with automatic hourly cleanup
  • Variables Stored:
    • step{N}AudioFile - The filename of the recording (e.g., step1AudioFile, step2AudioFile)
    • step{N}AudioFilePath - Full path to the recording file
  • Example Usage:
    • Record audio for 5 seconds from the device
    • Reference the recording in later steps using ${step1AudioFile} to play it back or validate content
  • Requirements: scrcpy must be installed and available in PATH

Timing and Control

Wait

  • Duration (required): Wait time in seconds (default: 1)
  • Device (optional)

Advanced Editing Features

Coordinate Caching

  • Cache Coordinates: Stores element positions for faster execution
  • Cache Hit Rate: Repeated executions run faster the more steps resolve their element position from the cache
  • Cache Sharing: Subflows inherit parent flow caches
  • Advanced Cache Management - Comprehensive coordinate cache guide

Device Targeting

  • Device Selection: Choose specific devices for step execution
  • Device Validation: Ensures selected device is connected
  • Multi-Device Flows: Different steps can target different devices
  • Multi-Device Testing Guide - Comprehensive guide for targeting multiple devices

Step Validation

  • Real-time Validation: Checks required fields as you type
  • Field Dependencies: Shows/hides fields based on action type
  • Error Prevention: Prevents saving incomplete steps

Dynamic Step References

  • Variable Substitution: Use ${variableName} syntax to reference data from previous steps
  • Auto-Generated Variables: Text entered by a step is automatically stored as step{N}text, where N is the step number (e.g., step1text, step2text)
  • Syntax Format: ${variableName} where variableName is the stored variable
  • Arithmetic Expressions: Supports expressions like ${step1text + 1} or ${counter - 2}
  • Use Cases:
    • Reference text typed in previous steps using ${step3text} format
    • Use data captured by earlier actions in validation instructions
    • Build dynamic workflows where later steps depend on earlier results
    • Perform calculations on numeric values from previous steps
  • Example: If step 2 types a value, step 5 can reference it using ${step2text} in verify_screen or other validation actions
  • Resolution: Variables are resolved at runtime and can be used in targetElement fields for check_screen actions
  • Storage: Variables persist throughout the flow execution session
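To make the ${variableName} syntax concrete, here is a simplified sketch of runtime substitution. It handles plain references and the two arithmetic forms shown above (a variable plus or minus an integer); the real resolver may support more. The function name and the choice to leave unknown references untouched are assumptions.

```python
import re

# Matches ${name}, ${name + 3} and ${name - 3}.
_REFERENCE = re.compile(r"\$\{\s*(\w+)\s*(?:([+-])\s*(\d+)\s*)?\}")


def substitute(template, variables):
    """Replace ${...} references in `template` using the `variables` dict."""
    def replace(match):
        name, op, operand = match.groups()
        if name not in variables:
            return match.group(0)  # assumption: unknown references stay as written
        value = variables[name]
        if op is None:
            return str(value)
        number = int(value)  # raises ValueError if the stored text is not numeric
        delta = int(operand)
        return str(number + delta if op == "+" else number - delta)

    return _REFERENCE.sub(replace, template)


stored = {"step3text": "john.doe@example.com", "step1text": "41", "counter": 10}
print(substitute('Is the email "${step3text}" visible in the profile?', stored))
print(substitute("${step1text + 1}", stored))  # 42
print(substitute("${counter - 2}", stored))    # 8
```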

Best Practices for Advanced Editing

Step Organization

  1. Logical Grouping - Group related actions together
  2. Descriptive Names - Use clear step descriptions
  3. Consistent Naming - Use consistent element names across steps

Performance Tips

  1. Enable Caching - Cache coordinates for frequently used elements
  2. Minimize Waits - Use appropriate wait times, not excessive delays
  3. Device Selection - Match steps to appropriate devices

Debugging Techniques

  1. Step-by-Step Execution - Use individual step execution to isolate issues
  2. Validation Checks - Use Verify Screen actions for verification
  3. Subflow Extraction - Break complex flows into testable subflows

Maintenance

  1. Regular Testing - Test steps individually after UI changes
  2. Cache Clearing - Clear coordinate cache when UI changes significantly
  3. Version Control - Track changes to critical automation flows

Managing Subflows (Advanced)

Subflows are reusable automation sequences that can be executed as part of larger workflows. They allow you to create complex automation chains by combining multiple flows.

Creating and Selecting Subflows

Using the Subflow Selector Dialog

  1. Access Subflow Selection - When editing a flow step, select "Subflow" as the action type
  2. Open Subflow Dialog - Click the "Choose a subflow..." button to open the selection dialog
  3. Search Subflows - Use the search bar to filter subflows by name or instructions
  4. Browse Available Subflows - The dialog displays:
    • Subflow name and description
    • Target device information
    • Associated labels
    • Pagination for large lists
  5. Select Subflow - Click on any subflow card or use the "Select" button


Subflow Execution Management

Managing Subflow Order

  1. Add Subflows - Use the Subflow Management interface to add subflows
  2. Reorder Subflows - Use the up/down arrow buttons to change execution order
  3. Remove Subflows - Use the X button to remove unwanted subflows

Visual Organization

The subflow management interface provides:

  • Color-coded Execution - Blue borders for "before" execution, green borders for "after" execution
  • Order Indicators - Numbered badges showing execution sequence
  • Execution Icons - Play icons for "before", stop icons for "after"
  • Hover Effects - Action buttons appear on hover for cleaner interface

Advanced Subflow Features

Coordinate Cache Sharing

When extracting steps to create a subflow:

  • Automatic Cache Transfer - Coordinate caches are copied from the parent flow
  • Performance Optimization - Subflows inherit cached element positions
  • Error Handling - Graceful handling of cache copy failures

Subflow Validation

The system validates:

  • Circular References - Prevents flows from referencing themselves
  • Execution Dependencies - Ensures proper flow dependencies
  • Device Compatibility - Validates target device availability
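The circular-reference check can be illustrated with a depth-first search over the "flow uses subflow" relation. This is a generic cycle-detection sketch, not the product's validator; the function name and the dict-of-lists representation are assumptions.

```python
def find_cycle(flow_id, subflows_of):
    """Return the list of flow IDs forming a cycle reachable from flow_id, or None.

    `subflows_of` maps a flow ID to the IDs of the subflows its steps execute.
    """
    path = []        # current chain of flows being expanded
    on_path = set()  # same contents as `path`, for fast membership tests
    done = set()     # flows already proven cycle-free

    def visit(node):
        if node in on_path:
            # Found a back-reference: report the loop, closing it on `node`.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        on_path.add(node)
        path.append(node)
        for child in subflows_of.get(node, []):
            cycle = visit(child)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(flow_id)


print(find_cycle("login", {"login": ["setup"], "setup": ["cleanup"], "cleanup": ["login"]}))
# ['login', 'setup', 'cleanup', 'login']
```

Running a check like this before saving a Subflow step would catch both direct self-references and longer loops through intermediate subflows.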

Bulk Operations

  • Extract Multiple Steps - Select consecutive steps to create subflows
  • Batch Management - Add multiple subflows at once
  • Template Creation - Save subflow configurations as templates


Best Practices for Subflows

Organization Tips

  1. Naming Conventions - Use clear, descriptive names (e.g., "Login Setup", "Data Cleanup")
  2. Logical Grouping - Group related setup/cleanup tasks in dedicated subflows
  3. Modular Design - Create small, focused subflows for reusability
  4. Documentation - Include detailed descriptions in subflow instructions

Performance Considerations

  • Cache Utilization - Subflows benefit from coordinate caching for faster execution
  • Device Selection - Ensure subflows target appropriate devices
  • Execution Order - Plan subflow execution order to minimize wait times
  • Resource Management - Consider device resource usage when chaining subflows

Troubleshooting

  • Execution Failures - Check subflow device compatibility and availability
  • Missing Subflows - Verify subflow exists and is not deleted
  • Performance Issues - Monitor coordinate cache hit rates and device responsiveness
  • Circular Dependencies - Use the flow selector to avoid self-referencing flows

