Desktop Automation: Switching Between Windows
While automating web or desktop applications, we often have to manage several windows of an application at once, moving between them and performing actions in each. It is tempting to think of acting on all windows simultaneously, but that is not always feasible. Instead, we need a reliable way to switch between windows in desktop automation so that every action runs in the window where it belongs.
During test case execution, it is common for new windows to open. Window handling is the technique that manages them: when a new window appears, we shift the driver’s focus to it, which gives us the control we need to perform the next actions. Whether we are testing a single desktop application or interacting with several, window handling is essential for reliable tests.
In desktop automation, a window can be either a pop-up window that opens within the application or a separate application. Unlike WebDriver, WinAppDriver does not provide built-in methods to handle multiple open windows within the application or other Windows applications. It is up to testers to write their own methods for switching between windows as the automation requires.
As testers, we switch between windows all the time. These transitions are essential to the testing process because they let us examine the application’s behavior in different scenarios: sometimes we toggle between pop-up windows within the application, and sometimes we switch to an entirely different application. In desktop automation, there are several techniques for switching, such as:
- switching the window with the window name
- switching the window with the class name
- switching the window with the window name and window class name (combined)
- switching the window with the current date etc.
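The first three techniques differ only in the rule used to pick a window. Here is a minimal sketch of those rules in plain Java, without any driver calls. The `WindowInfo` class and the sample class names in the usage note are hypothetical and only illustrate the idea; in a real run the name and class name would come from window attributes.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

class WindowMatchers {
    // Hypothetical snapshot of a top-level window's attributes.
    static final class WindowInfo {
        final String name;
        final String className;

        WindowInfo(String name, String className) {
            this.name = name;
            this.className = className;
        }
    }

    // Match when the window name contains the given text.
    static Predicate<WindowInfo> byName(String fragment) {
        return w -> w.name != null && w.name.contains(fragment);
    }

    // Match when the window class name is exactly the given value.
    static Predicate<WindowInfo> byClassName(String className) {
        return w -> className.equals(w.className);
    }

    // Match only when both the name and the class name rules hold.
    static Predicate<WindowInfo> byNameAndClass(String fragment, String className) {
        return byName(fragment).and(byClassName(className));
    }

    // Return the first window that satisfies the rule, if any.
    static Optional<WindowInfo> firstMatch(List<WindowInfo> windows, Predicate<WindowInfo> rule) {
        return windows.stream().filter(rule).findFirst();
    }
}
```

For example, `WindowMatchers.firstMatch(windows, WindowMatchers.byName("InputFile"))` would pick the first window whose name contains "InputFile". The combined rule is useful when two windows share part of a name.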
In this blog, we will explore how we can switch between the two desktop applications through automation.
In the following example, I will explain how we can switch between a calculator and an Excel workbook with the name of the application.
Let’s take an Excel workbook named “InputFile” with four columns: Operand1, Operand2, Operator, and Result. The Operand1, Operand2, and Operator columns already contain some sample data, while the Result column is empty. Our task is to read the data from the workbook, use it for calculations in the calculator, and then write the results back into the workbook. We have three ways to perform this task:
- Launch both the applications within the scenarios one after another and do the switching between them.
- Keep one application open before the execution of the scenario, launch the second application, and do the switching between them.
- Keep both applications open and do the switching between them.
In this blog, we will follow the second way to explain the example of switching. So let us follow these steps:
- Launch the desktop calculator through automation.
- Switch to the Excel workbook, which we keep open before the start of the execution, and read the operands and operators.
- Switch back to the calculator and do the calculations with the data from the Excel workbook, then read the result, switch to the workbook again, and save the result there.
Launch Desktop Application
public static WindowsDriver winAppInit() {
    try {
        // Start WinAppDriver in a new console only if nothing is listening on port 4723.
        if (!serverListening("localhost", 4723)) {
            Runtime.getRuntime().exec("cmd /c start cmd.exe /K "
                    + ConfigUtil.getPropValues("WinAppDriverLocation"));
        }
        sleep(2000); // give the server time to start
        DesiredCapabilities capabilities = new DesiredCapabilities();
        // Identifier of the Windows Calculator app
        capabilities.setCapability("app", "Microsoft.WindowsCalculator_8wekyb3d8bbwe!App");
        capabilities.setCapability("platformName", "Windows 10");
        capabilities.setCapability("ms:experimental-webdriver", true);
        winAppDriver = new WindowsDriver(new URL("http://127.0.0.1:4723/"), capabilities);
        winAppDriver.manage().timeouts().implicitlyWait(2, TimeUnit.SECONDS);
    } catch (Exception e) {
        e.printStackTrace();
    }
    return winAppDriver;
}
The above method launches a desktop application using WinAppDriver.
Windows Application Driver listens for requests at http://127.0.0.1:4723/. So, to establish a connection, we use “localhost” as the host name and port 4723 for communication between the automation framework and the target desktop application.
Let’s split the above method into three parts.
Part 1:
Check if a server is listening on localhost specifically on port 4723 using the “serverListening” function. If the server is not listening, it executes a command using Runtime.getRuntime().exec() to open the WinAppDriver executable.
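The body of “serverListening” is not shown in this blog. Here is a minimal sketch of how such a check could look, assuming it only needs to know whether a TCP connection to the host and port succeeds. The class name `PortCheck` and the one-second timeout are my assumptions, not the code from the repository.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {
    // Returns true if something accepts TCP connections on host:port.
    static boolean serverListening(String host, int port) {
        try (Socket socket = new Socket()) {
            // Assumed timeout of 1000 ms so the check does not hang.
            socket.connect(new InetSocketAddress(host, port), 1000);
            return true;
        } catch (IOException e) {
            // Connection refused or timed out: nothing is listening.
            return false;
        }
    }
}
```

With a check like this, WinAppDriver is started only when port 4723 is free, so a second server is not launched on every run.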
Part 2:
Create a new instance of DesiredCapabilities, which is a class used to specify the desired capabilities or configuration for the automation session. It sets the capability “app” to the value “Microsoft.WindowsCalculator_8wekyb3d8bbwe!App”. This indicates that we are automating the Windows Calculator application. The value represents the application’s package name and the entry point or application identifier. It also sets the capability “platformName” to the value “Windows 10”, which specifies that the automation session should run on a Windows 10 machine. Apart from that, it sets the capability “ms:experimental-webdriver” to true. This capability enables experimental features in WinAppDriver.
Part 3:
Create a new instance of WindowsDriver using the URL “http://127.0.0.1:4723/” (localhost:4723) as the server URL and the previously configured capabilities. This establishes a connection to the WinAppDriver server and initializes the WindowsDriver instance for automating the Windows Calculator application. Finally, the method returns the initialized WindowsDriver object.
Switch to the Window Name Method in Desktop Automation
public static WindowsDriver switchToWindowWithName(String name) throws MalformedURLException,
        InterruptedException {
    // Open a session on the desktop ("Root") so that the top-level windows can be found.
    DesiredCapabilities capabilities = new DesiredCapabilities();
    capabilities.setCapability("app", "Root");
    capabilities.setCapability("platformName", "Windows");
    capabilities.setCapability("ms:experimental-webdriver", true);
    rootDriver = new WindowsDriver(new URL("http://127.0.0.1:4723"), capabilities);
    sleep(3);
    List<WebElement> windows = rootDriver.findElementsByXPath("/Pane/Window");
    for (WebElement window : windows) {
        String windowName = window.getAttribute("Name");
        // Skip windows without a name instead of failing with a NullPointerException.
        if (windowName != null && windowName.contains(name)) {
            sleep(100);
            // The handle is read as a decimal string and converted to "0x..." hex form.
            String nativeHandle = window.getAttribute("NativeWindowHandle");
            int handle = Integer.parseInt(nativeHandle);
            String windowHandle = "0x" + Integer.toHexString(handle);
            // Attach a new session to the existing window through its handle.
            DesiredCapabilities caps = new DesiredCapabilities();
            caps.setCapability("appTopLevelWindow", windowHandle);
            caps.setCapability("platformName", "Windows");
            caps.setCapability("ms:experimental-webdriver", true);
            winAppDriver = new WindowsDriver(new URL("http://127.0.0.1:4723"), caps);
            // todo change the time
            winAppDriver.manage().timeouts().implicitlyWait(40, TimeUnit.SECONDS);
            System.out.println("Switched to window with name 1: " + winAppDriver.getTitle());
            return winAppDriver;
        }
    }
    // No window name contained the given text.
    return null;
}
This is the method that switches the control to another application window. It takes the name of the window as a parameter and returns a WindowsDriver object. Before getting into this method, let me discuss the “root” window in desktop applications.
In any window system, every window is part of some other window, which we call its parent window. This creates a hierarchy of windows, and the window at the top of this hierarchy is the “root” window. The “root” window is the screen surface, and all other windows are its descendants. The direct children of the root window are called top-level windows. The two figures make this clearer.
In Fig. 1 above, 1 is the root window, which covers the whole screen; 2, 3, and 4 are top-level windows; 5 and 6 are subwindows of 2.
Fig. 2 below shows the hierarchy of the windows: the calculator and the Excel file are children of the root window.
Now break down the method into 3 parts.
Part 1:
It sets up the desired capabilities for the session. It creates a new instance of DesiredCapabilities, which is used to specify the desired capabilities or properties for the automation session. Here, the “app” capability is set to “Root”, so the session targets the root window rather than a single application. It also sets the “platformName” capability to “Windows”, indicating that the target platform for the automation is Windows. Finally, a new instance of WindowsDriver, referred to as rootDriver, is created with these capabilities and the URL of the server running locally at http://127.0.0.1:4723. We will use this driver instance to find the top-level windows under the root window.
Part 2:
It finds all the top-level windows by using an XPath query (/Pane/Window). It iterates over each window and retrieves its name using the getAttribute method. If the window name contains the specified name parameter, it proceeds with the following steps:
- The method retrieves the native window handle of the matching window as a string using the getAttribute method. (Handles are numbers that uniquely identify each window.)
- It converts that string to an integer. The code uses Integer.parseInt, so the string is read as a decimal number.
- The integer is then converted to a hexadecimal string and prefixed with “0x” to create a valid window handle.
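The conversion in these steps can be tried on its own, without a driver. The sketch below is a small helper that does only the string conversion; the class name `WindowHandles` and the method name are mine and are used for illustration.

```java
class WindowHandles {
    // Converts a handle given as a decimal string into the "0x..." hex form.
    static String toHexHandle(String nativeWindowHandle) {
        // Integer.parseInt reads the text as a base-10 number.
        int handle = Integer.parseInt(nativeWindowHandle.trim());
        // Integer.toHexString returns lowercase hex digits without a prefix.
        return "0x" + Integer.toHexString(handle);
    }
}
```

For example, `WindowHandles.toHexHandle("1234")` returns "0x4d2", because 1234 in decimal is 4d2 in hexadecimal.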
Part 3:
A new set of DesiredCapabilities is created, and its “appTopLevelWindow” capability is set to the window handle value obtained in the preceding step. This set, which includes the window handle and the other essential capabilities, is then used together with the URL of the WinAppDriver server to initialize a new WindowsDriver object named ‘winAppDriver’. The resulting ‘winAppDriver’ now has control over the matched window.
Interaction with calculator & Excel workbook
We can inspect the locators of all the calculator elements, such as the input buttons and the result text box. In the figure, we can see the locators of the calculator: for example, the locator for button nine is “Nine”, and the locator for reading the result is “CalculatorResult”. To enter a number in the calculator, I search for elements by name, and to read the result from the calculator, I use the automation ID.
In the same way, we can locate all the cells of the Excel workbook once we switch to it. Using those locators, we can read the operands and operators from Excel and use them whenever needed. In the figure, we can see that every cell in the sheet has its own unique locator, like “A2”, “B2”, etc.
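Since both the calculator buttons and the Excel cells are found by name, the locator names can be built from the data. Here is a rough sketch of two helpers for that. Only “Nine”, “A2”, and “B2” come from the figures above; the other digit names are my assumption about how the calculator names its buttons, so check them with an inspection tool before relying on them.

```java
import java.util.ArrayList;
import java.util.List;

class LocatorNames {
    // Assumed button names for digits 0-9; only "Nine" is confirmed in this blog.
    private static final String[] DIGIT_NAMES = {
        "Zero", "One", "Two", "Three", "Four", "Five", "Six", "Seven", "Eight", "Nine"
    };

    // Turns a whole number such as "19" into the button names to click, in order.
    static List<String> buttonNamesFor(String number) {
        List<String> names = new ArrayList<>();
        for (char c : number.toCharArray()) {
            if (c < '0' || c > '9') {
                throw new IllegalArgumentException("Only digits are supported: " + number);
            }
            names.add(DIGIT_NAMES[c - '0']);
        }
        return names;
    }

    // Builds a cell locator such as "B2" from a column letter and a row number.
    static String cellName(String column, int row) {
        return column + row;
    }
}
```

Each name returned by `buttonNamesFor` could then be passed to a find-by-name call on the calculator window, and `cellName("A", 2)` gives the locator of the first operand in the first data row.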
Explore the Desktop Automation example using a Feature, Step, and Page File.
Feature File:
Scenario: Switching between calculator and excel workbook
  Given I am on Calculator Windows App
  When I switch to 'InputFile' excel workbook
  And I read first number from sheet
  And I read operator from the sheet
  And I read second number from sheet
  And I switch to calculator
  And I do calculation as per the data which I read from excel
  And I switch to 'InputFile' excel workbook
  Then I save the result in excel workbook
I kept the feature file very simple. First of all, I launch the calculator. Then I switch to “InputFile”, the Excel workbook that I kept open before the start of execution. Once I have control of the Excel workbook, I read the first row of data using the “name” locator. After that, I switch to the calculator, perform the calculation on the data, read the output, and switch back to the Excel workbook, where I save the result to the appropriate cell. To switch to the Excel workbook, I pass the name of the file as a parameter in the step. To switch to the calculator, I use “Calculator” as the parameter directly.
Step File:
In the step file, we can see that the step calls the page method and passes the window name written in the above feature file as a parameter. For the calculator, I have used a separate method.
@When("I switch to {string} excel workbook")
public void iSwitchToInputFileExcelWorkbook(String fileName) throws MalformedURLException, InterruptedException {
    switchToWindowPage.switchToOtherWindowApplication(fileName);
}

@And("I read first number from sheet")
public void iReadFirstNumberFromSheet() {
    switchToWindowPage.readOperand1();
}

@And("I switch to calculator")
public void iSwitchToCalculator() throws MalformedURLException, InterruptedException {
    switchToWindowPage.switchToCalculator();
}
Page File:
These are the two methods that I have used to call the “switchToWindowWithName” method, which I have in WinAppUtil:
public void switchToOtherWindowApplication(String fileName) throws MalformedURLException, InterruptedException {
    sleep(2000);
    winAppDriver = WinAppUtil.switchToWindowWithName(fileName);
    sleep(100);
}

public void switchToCalculator() throws MalformedURLException, InterruptedException {
    winAppDriver = WinAppUtil.switchToWindowWithName("Calculator");
}
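Note that “switchToWindowWithName” returns null when no window name matches, for example when the window has not finished opening yet. One possible way to handle this, instead of relying only on fixed sleeps, is to retry the switch a few times. The helper below is a sketch of that idea and is not part of the original project; the class name, method name, and parameters are hypothetical.

```java
import java.util.concurrent.Callable;

class SwitchRetry {
    // Calls the attempt until it returns a non-null value or the attempts run out.
    static <T> T retryUntilFound(Callable<T> attempt, int maxAttempts, long pauseMillis)
            throws Exception {
        for (int i = 0; i < maxAttempts; i++) {
            T result = attempt.call();
            if (result != null) {
                return result;
            }
            // Pause between attempts, but not after the last one.
            if (i < maxAttempts - 1) {
                Thread.sleep(pauseMillis);
            }
        }
        return null;
    }
}
```

A page method could then call something like `SwitchRetry.retryUntilFound(() -> WinAppUtil.switchToWindowWithName("Calculator"), 5, 500)` and fail the step with a clear message if the result is still null.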
You can find the complete code at https://github.com/kfale-spurqlabs/Desktop-Application-Testing-Switching-Between-The-Windows
Conclusion:
We have all been using desktop applications since long before web and mobile apps, yet automated testing of these apps is still rare. The community is not large, so it is hard to find solutions to automation issues. So, in this blog, I have tried to cover the scenarios where a desktop application tester needs to switch between windows or within a window. I hope you will find it useful. For me, desktop applications used to be a nightmare, but since I started working with them they seem like a piece of cake! I hope we can all overcome the challenges in desktop automation testing with good preparation, strategy, and the team’s test automation skills.
An SDET with hands-on experience in the life science domain, including manual testing, functional testing, Jira, defect reporting, web application, and desktop application testing.
I also have extensive experience in web and desktop automation using Selenium, WebDriver, WinAppDriver, Playwright, Java, Cucumber, maven, POM, Xray, and building frameworks.