Image Comparison Using Java & Selenium

Image Comparison Using Java & Selenium

Introduction:

Let’s explore the topic of image comparison testing, which is often overlooked by testers who typically focus more on validating texts, buttons, forms, fields, and other similar elements. However, in some instances, businesses may require testing of images such as logos, infographics, and other graphics. 

I encountered such a scenario in my project, where I had to compare two images from different windows. This project was related to life sciences, specifically comparing images of two different animals. During this process, I discovered two methods for image comparison testing through automation using Selenium WebDriver and Java. Today, we will look into these two aspects of image comparison testing.

I don’t think any industries are left that are not involved in the minor or major level of image testing. At least, organizations will ask testers to perform testing of the logo placed on their website and it is very common to ask. Still, I have listed examples of some of the industries below where Image comparison testing can be extensively used.

Prerequisite: 

Need to download API dependency, BufferedImage library (Optional – Pointlib and Sikuli ) 

Applications of Image Comparison in Various Domains:

  • E-commerce: Ensuring accurate representation of products and their images is crucial for online retailers.
  • Advertising and Marketing: Verification of visual advertisements, banners, and promotional materials is essential for maintaining brand consistency.
  • Gaming: Testing game graphics, character designs, and visual effects are vital for delivering an immersive gaming experience.
  • Healthcare and Medical Imaging: Evaluating medical images for accuracy and precision is critical for diagnostic purposes.
  • Automotive: Comparing images of vehicle designs, safety features, and user interfaces helps ensure optimal user experiences.

Image comparison cannot be directly performed using Selenium WebDriver alone. However, when there is a specific need for image comparison, we can rely on third-party APIs to achieve this functionality. One such API is AShot, which allows us to compare two images. In this blog, I will explain how to compare two images using the AShot API.

Sometimes during automation, we need to compare two images for verification. We can compare two images using Java selenium with the help of Ashot API as the web driver does not support image comparison. 

Step-by-Step Implementation Guide:

  1. We need to capture the screenshot of a particular element and store it in the framework of the project during execution.
  2. Then compare the captured image with the expected image which was already stored in the project.

To capture the image during automation Selenium has to take a screenshot interface which allows us to take the screenshot of the whole screen. The code snippet to take screenshots is explained below

TakesScreenshot scrShot =((TakesScreenshot)webdriver);

File SrcFile=scrShot.getscreenshots(OutputType.FILE);

But when there is a need to take a screenshot of a particular element or image then there are various methods available in Selenium like BufferedImage, PointLib library, and external libraries like Sikuli.

While doing image comparison some things need to be kept in mind such as both images should have the same dimensions and if images are grid images then both images must be grid images otherwise they won’t be recognized as the same image.

Java ImageIO class:

This is a final class that belongs to the javax.imageio package. This class provides a convenient method for reading and writing images and performing simple encoding and decoding. The class provides a lot of utility methods related to image processing. Using this class, we can deal with popular image extensions like .jpg, .gif, .png, etc.

BufferedImage class:

This is a subclass of the Image class. It handles and manipulates the image data. A ColorModel of image data makes up a Buffered Image. The bufferedImage class consists of so many things like getHeight, getWidth,getRGB, etc.


In the first Image comparison method, we take both images’ RGB values with the Color class’s help. The Color class is a part of the Java Abstract Window Toolkit (AWT) package. The Color class creates color by using the given RGBA values and has different methods which return the component in the range of 0-255.

Comparison Scenarios: Considered Image Scenarios:

Positive scenario:

Where both the images are the same. For this scenario both the methods mentioned below will give the result as – images are the same.

Negative scenario:

Where one small change is made in the images. one small black dot is added in the image as seen below. For this scenario, the first method will give the result as images are not the same and the second method will give the result as images are the same. So we can confirm that the first method is more accurate to check every small change in the image.

Method 1:

This method of comparing the RGB value is more reliable than the second one. The second method only reflects the difference when there is a significant change in the images.

Here is the link provided for the code of both methods.

ImageComparison/ImageProcessing.java at main · sarmorikar-spurqlabs/ImageComparison (github.com)

Let’s understand the program in detail

  1. So here, we use the BufferedImage class to read the images and save them as BufferedImage objects named img1 and img2, respectively.
  2. By using the getWidth()and getHeight() methods we are reading the height and width of both the images as the height and width of both the images should be the same otherwise it will not proceed further to find out the difference. And if it is not the same, the program will print “Both images should have the same dimensions”  as we compare them in an if loop.
  3. If the images are the same then it will execute 2 for loop for height and width. We are reading the RGB value of both the images by method getRGB. And we are storing that value in integers defined as pixels and pixel integer values are passing values into the Color class.
  4. The color class has a method to read the red, green, and blue values of the images and then find the sum of the differences in RGB values of the two images. 
  5. From the differences we find out the average and the percentage and if the percentage is equal to 0 then the images are the same and if the rate is more than 0 then the photos are different.

Method 2:

Let’s Understand Method 2 in detail

  1. Again, we are using the BufferedImage class to read the images, and we are saving them as BufferedImage objects named img1 and img2, respectively.
  2. Now we are creating the object of the ImageDiffer class and the ImageDiffer class has a built-in method makeDiff which compares two images. If there is a difference, it returns the ImageDiff class object.
  3. Then we are using the hasDiff built-in method to check the value of the diff objects and confirm whether the images are different.

This way we can compare two images whether they are the same or not using Ashot. Here the intention of showing a negative scenario is to make sure that the scenarios are working fine even if there is a small change in the expected image. Method 1 is more accurate as it is comparing the RGB values of the images and when there is a small change in the image also it will give an accurate result.

Conclusion:

Image comparison using the AShot API provides a reliable and efficient method to verify and validate images during automation testing. By using the capabilities of Ashot and implementing the image comparison process, we can enhance the quality and reliability of our automation tests, ensuring that the visual aspects of our applications meet the desired expectations. 

Read more blogs here.

Building a robust mobile test automation framework using Appium in Python

Building a robust mobile test automation framework using Appium in Python

In this blog, we will explore how to build a robust mobile test automation framework using Appium in Python (behave framework). As a result, it will be very useful for executing the program.

Mobile test automation can be more challenging than web automation, as inspecting and interacting with mobile elements requires additional effort. However, with the help of Appium, an open-source tool, it is possible to overcome these challenges and build a powerful mobile test automation framework. In this blog, we will explore how to create a robust framework using Appium in conjunction with the Behave framework in Python.

Let’s talk about robust test frameworks

Robust test automation framework ranks highly on the list of Software Testing “must-haves”.

It helps improve the overall quality and reliability of software when executed in a structured manner.

If we don’t build the right framework then the results will be: Inconsistent test results, Non-modularized tests, and Maintenance difficulties. The automation framework needs to be well organized to make it easier to understand. An organized framework provides an easier way to maintain and expand.

There are many features that we should consider to make the automation framework more robust. 

  • Scalability – The automation framework that you have in your organization should be scalable. It should not just apply to one project. Your automation framework should be applied throughout projects across the organization. It should be an organization-wide test automation framework.
  • Re-portability – Every automation framework should have a good reporting capability. The test framework engineer can choose a third-party reporting library.
  • Configurable – A framework should be configurable. It should execute scripts in different test environments. The automation framework should not be restricted to a single test environment. The user credentials should not be “hard-coded” in the automation script itself. 
  • Re-usability – The framework should follow re-usability. We should use the same methods, and page objects in all the test scenarios in the test automation framework.
  • Extendability –  You should be able to integrate easily with other third-party tools via APIs. Automation frameworks should be easily integrated with security testing tools, web proxy debugging tools, test case management tools, or with other global frameworks thereby making it more hybrid in nature. 

Benefits

  • Increase product reliability – Accurate, efficient, automated regression tests – reduce risks
  • Reduce the product release cycle time – Improve the time to market, Reduce QA cycle time

Let’s start with basic

Appium is an open source Test Automation Framework which is used for automating mobile applications.

Appium supports Android, IOS mobile apps, and Windows PC Desktop apps. We can automate Native, Hybrid, and Mobile web apps using Appium.

Uses of Appium:

  • Appium is open source and it is free of cost.
  • Appium supports Android, IOS, and Windows to run test scripts.
  • Appium supports languages such as Python, Java, Perl, PHP, C#, and Ruby
  • Appium supports different operating systems such as Mac, Windows, Linux, UNIX, etc.
  • Functional test cases of mobile applications can be easily automated.

Appium Inspector

Appium Inspector is a tool for QA engineers who want to automate mobile applications. Basically, this tool also serves as the standard procedure for identifying mobile application elements.

The following are the used for inspecting the mobile element for both Android and iOS.

  • Download the Appium inspector.exe.

This is the link: https://github.com/appium/appium-inspector

  • After downloading the exe file launch the Appium inspector.exe file. On top of the web page, select to Cloud-based platform – BrowserStack

BrowserStack is a cloud-based real devices platform that provides support for both manual and automated testing of mobile apps for both Android and iOS devices. One of its standout features is the App Live feature, which allows users to manually test their mobile apps on over 3000 real Android and iOS devices.

BrowserStack supports testing across different environments, including Dev, QA, Staging, and Production apps from the play store or app store. This makes it easy for developers to test their apps in various environments and ensure that they are working correctly in each environment.

To proceed further we need BrowserStack Username and BrowserStack Access Key

For that, Go to https://www.browserstack.com/

Following are the steps that will guide the process

  • Log in to your BrowserStack account ->Navigate to the “Account” section ->Then Go to Summary
  • After going to Summery Section you will get Username and Password. Copy both Username and Password and paste them into Appium inspector fields
  • Go to Desired Capabilities -> Here we need to add basic capabilities which are required for starting the session. Below image will guide you
  • To add capabilities need to click on the “+” symbol as shown in the below image
  • Add the capabilities with desired values as shown in the below image
  • In the Value field, we need to add data that we want to add.

The most important thing here is for the last field “Appium: app”, we have to upload the .ipa or .ipk or .aab file on BrowserStack.

For that, there are 2 ways mentioned below

  1. Through Command Line
  2. Directly through BrowserStack

The most important thing here is for the last field “Appium: app”, we have to upload the .ipa or .ipk or .aab file on BrowserStack.

Let’s start

  • Through Command Line

To upload the .ipa or .ipk or .aab file on BrowserStack, the following Curl command is very useful.

Curl is a command line tool that enables data exchange between a device and a server through a terminal. Using this command line interface (CLI), a user specifies a server URL (the location where they want to send a request) and the data they want to send to that server URL.

Go to cmd and write this curl command

curl -u “username:accesskey” -X POST “https://api-cloud.browserstack.com/app-automate/upload” -F “file=@path of the file where you save your apk or IPA file” -F “custom_id=any name “

  • Directly through BrowserStack:

 Go to “App Live” -> Click on Uploaded Apps -> And Upload your file.

But here 1st one is preferable: Through this command, we get “app_url” which is required in Desired Capabilities:

Copy that app_url and paste it into Desired Capabilities.

Here you need to Click on Start Session -> You will get below the window.

You can select the element in the App or from the App Source section and the attributes including ID, Name, Text, etc will be displayed on the right side under the Selected Element section and you can create Xpaths using those attributes.

Developing Framework

  1. Python: https://www.python.org/downloads/ visit the site to download and install Python in your system if it is not there. 
  2. Install Selenium and Behave using:

pip install selenium 

Pip install behave 

For more details please visit: https://pypi.org/project/behave/  &  https://pypi.org/project/selenium/ 

  1. Pycharm IDE (Professional or Community): https://www.jetbrains.com/pycharm/download/ 
  1. Install Appium-Python-Client:

pip install Appium-Python-Client

For more details please visit: https://pypi.org/project/Appium-Python-Client/

  1. Install allure for report generating using:

pip install allure-behave 

For more details please visit: https://pypi.org/project/allure-behave/ 

  1. We can also install all the required packages using the requirement.txt file using the below command. 

pip install -r requirement.txt

Framework Structure Overview

Here is the overview of our mobile test automation framework using Appium in Python.

You have to follow the below 7 steps to build a robust mobile test automation framework using Appium.

Step 1: 

Create a project in Pycharm (here I am using Pycharm professional) and as mentioned in the prerequisites install the packages

Step 2:

In this step, we will be creating a Features folder in which we will be creating our feature files for different scenarios. Every step in a Feature File describes the action we are going to perform on UI. A feature file is something that holds your test cases in the form of a scenario and scenario outline. In this framework, we are using a scenario. Both scenario and scenario outlines contain steps that are easy to understand for non-technical persons. We are giving tags for the feature files. We can also give it for the scenarios present in that file. Depending on our test cases. Note that the feature file should end with a .feature extension.

@iOS
#@android
Feature: Simple Calculator
 Addition of two numbers
 Scenario: Verify addition of two numbers
   Given I am on calculator home page
   When I enter '4'
   And I enter operator of addition
  And I enter operator of addition this should be like And I enter 
 ‘+’ operator so if possible you can update code as per this
   And Enter number '2'
   And I enter operator '='
   Then I see result as '6'

Step 3:

After creating the feature file now create a step file. Both feature files and step files are essential parts of the BDD framework. The steps with ‘When’ are related to the user actions like navigation, clicking on the button, and filling in the information in input boxes. The steps with ‘Then’ are related to the verifications or Assertions. In this, we are using both iOS and Android, so the step file should look like this. We are creating only one-step files for both iOS and Android.

Purpose of Step file: The step file is to attach steps from the feature file to the page file, where actual implementation is available.

from behave import *
use_step_matcher("parse")
@given("I am on calculator home page")
def step_impl(context):
   print("User is on Homepage")

@when("I enter '{number}'")
def step_impl(context, number):
   str = context.config.userdata["deviceType"]
   print("str " + str)
   if str == "['iOS']":
       context.iOS_cal.iOS_tap_number1(number)
   else:
       context.android_cal.tap_number1()

@step("I enter operator of addition")
def step_impl(context):
   str = context.config.userdata["deviceType"]
   print("str " + str)
   if str == "['iOS']":
       context.iOS_cal.iOS_tap_operator()
   else:
       context.android_cal.tap_operator()

@step("Enter number '{number}'")
def step_impl(context, number):
   str = context.config.userdata["deviceType"]
   print("str " + str)
   if str == "['iOS']":
       context.iOS_cal.iOS_tap_number1(number)
   else:
       context.android_cal.tap_number2()

@step("I enter operator '{operator}'")
def step_impl(context, operator):
   str = context.config.userdata["deviceType"]
   print("str " + str)
   if str == "['iOS']":
       context.iOS_cal.iOS_equals(operator)
   else:
       context.android_cal.equals()

@then("I see result as '{result}'")
def step_impl(context, result):
   str = context.config.userdata["deviceType"]
   print("str " + str)
   if str == "['iOS']":
       flag = context.iOS_cal.iOS_verify_result()
       assert flag == True
   else:
       flag = context.android_cal.verify_result()
       assert flag == True

Step 4:

In this step, we are creating two-page files one for iOS and one for Android that contains all the locators and the action methods to perform the particular action on the web element. We are going to add all the locators at the class level only and will be using them in the respective methods.

iOS page file:

import time
from appium.webdriver.common.mobileby import MobileBy
from selenium.common import NoSuchElementException
from selenium.webdriver.support import expected_conditions as EC
from time import sleep

from Features.Pages.BasePage import Basepage

class iOS_Calculator_Page(Basepage):
   def __init__(self, context):
       Basepage.__init__(self, context.driver)
       self.context = context

       self.add_operator = "//XCUIElementTypeStaticText[@name='+']"
       self.result = "(//XCUIElementTypeStaticText)[1]"

   def iOS_tap_number1(self,number):
       time.sleep(2)
       tap_on = self.wait.until(
           EC.presence_of_element_located((MobileBy.XPATH, "//XCUIElementTypeButton[@name='"+number+"']")))
       tap_on.click()

   def iOS_tap_operator(self):
       time.sleep(2)
       tap_on = self.wait.until(
           EC.presence_of_element_located((MobileBy.XPATH, self.add_operator)))
       tap_on.click()

   def iOS_equals(self, operator):
       time.sleep(2)
       tap_on = self.wait.until(
           EC.presence_of_element_located((MobileBy.XPATH, "//XCUIElementTypeStaticText[@name='"+operator+"']")))
       tap_on.click()

   def iOS_verify_result(self):
       sleep(5)
       try:
           verify_element = self.wait.until(EC.presence_of_element_located(
               (MobileBy.XPATH, self.result))).is_displayed()
       except NoSuchElementException:
           verify_element = False
       return verify_element

Android page file:

from appium.webdriver.common.mobileby import MobileBy
from selenium.common import NoSuchElementException
from selenium.webdriver.support import expected_conditions as EC
import time

from Features.Pages.BasePage import Basepage

class android_Calculator_Page(Basepage):
   def __init__(self, context):
       Basepage.__init__(self, context.driver)
       self.context = context
       self.number1 = "(//android.widget.Button)[5]"
       self.add = "//android.widget.Button[@content-desc='plus']"
       self.number2 = "(//android.widget.Button)[9]"
       self.operator_equals = "(//android.widget.Button)[13]"
     self.verify = "(//android.widget.TextView)[2]"
   def tap_number1(self):
       time.sleep(2)
       tap_on = self.wait.until(
           EC.presence_of_element_located((MobileBy.XPATH, self.number1)))
       tap_on.click()

   def tap_operator(self):
       time.sleep(2)
       tap_on = self.wait.until(
           EC.presence_of_element_located((MobileBy.XPATH, self.add)))
       tap_on.click()

   def tap_number2(self):
       time.sleep(2)
       tap_on = self.wait.until(
           EC.presence_of_element_located((MobileBy.XPATH, self.number2)))
       tap_on.click()

   def equals(self):
       time.sleep(2)
       tap_on = self.wait.until(
           EC.presence_of_element_located((MobileBy.XPATH, self.operator_equals)))
       tap_on.click()

   def verify_result(self):
       try:
           verify_element = self.wait.until(EC.presence_of_element_located(
               (MobileBy.XPATH, self.verify))).is_displayed()
       except NoSuchElementException:
           verify_element = False
       return verify_element

Base Page File:

The next one is the base page file. We are creating a base page file to make an object of the driver so that we can easily use that for our page and environment file. On this page, we can create a method that gets used frequently in our code like the click() method or send_keys() method, etc.

from selenium.webdriver.support.ui import WebDriverWait
# In the base page we are creating an object of the driver.
# We are using this driver in the other pages and environment page.
class Basepage(object):
   def __init__(self, driver):
       self.driver = driver
       self.wait = WebDriverWait(self.driver, 60)
       self.implicit_wait = 25

Step 5:

Environment file (i.e. Hooks file).

This file contains hooks for before and after scenarios to start and close the browser. Also if you want you can add after-step hooks for capturing screenshots for reporting. We have added a method to capture screenshots after every step and will attach them to the allure report. We have added before feature hooks.

In the feature file, we have given tags(@iOS and @android) before the feature.

  • def before_feature hook: This will check for which device type (iOS or Android) we are executing the code.
  • def before_scenario hook: We are checking the execution mode and within that adding device type conditions for iOS and Android. 

Here we are using  “context. config.userdata[]” This will read data from the behave.ini file

Added condition for iOS and Android

import json
from appium import webdriver
from allure_commons.types import AttachmentType
from allure_commons._allure import attach
from Features.Pages.BasePage import Basepage
from Features.Pages.android_Calculator_Page import android_Calculator_Page
from Features.Pages.iOS_Calculator_Page import iOS_Calculator_Page

data = json.load(open("Features/Resources/config.json"))

def before_feature(context, feature):
   tags = str(feature.tags)
   print("Tags " + tags)
   context.config.userdata["deviceType"] = tags
   print("Device Type :" + context.config.userdata["deviceType"])

def before_scenario(context, scenario):
   if context.config.userdata["executionMode"] == "Browserstack":
       if context.config.userdata["deviceType"] == "['iOS']":
           print(context.config.userdata["deviceType"])
           context.driver = webdriver.Remote(
               command_executor='https://' + context.config.userdata["userName"] + ':' + context.config.userdata[
                   "accessKey"] + '@hub-cloud.browserstack.com/wd/hub',
               desired_capabilities={
                   "platformName": "iOS",
                   "build": context.config.userdata["iOS_browserstack_build"],
                   "deviceName": context.config.userdata["iOS_browserstack_device"],
                   "os_version": context.config.userdata["iOS_device_os_version"],
                   "app": context.config.userdata["iOS_browserstack_appUrl"],
               }
           )
       else:
           context.driver = webdriver.Remote(
               command_executor='https://' + context.config.userdata["userName"] + ':' + context.config.userdata[
                   "accessKey"] + '@hub-cloud.browserstack.com/wd/hub',
               desired_capabilities={
                   "platformName": "android",
                   "build": context.config.userdata["android_browserstack_build"],
                   "deviceName": context.config.userdata["android_browserstack_device"],
                   "os_version": context.config.userdata["android_device_os_version"],
                   "app": context.config.userdata["android_browserstack_appUrl"],
               }
           )
   else:
       print("...")
   context.driver.switch_to.context('NATIVE_APP')
   baseobject = Basepage(context.driver)
   context.android_cal = android_Calculator_Page(baseobject)
   context.iOS_cal = iOS_Calculator_Page(baseobject)
   context.stepid = 1

def after_step(context, step):
   attach(context.driver.get_screenshot_as_png(), name=str(context.stepid), attachment_type=AttachmentType.PNG)
   context.stepid = context.stepid + 1
def after_scenario(context, scenario):
   context.driver.reset()
   context.driver.quit()

Step 6:

INI files are configuration files used by Windows to initialize program settings. The main role is to set values for parameters and configuration data required at startup or used by setup installers.

The configuration files should begin with the keyword [behave] and follow Windows INI style format.

Here we are giving BrowserStack Capabilities.

[behave]
color=False
show_snippets=True
show_skipped=True
dry_run=False
show_source=True
show_timings=True
stdout_capture=True
stderr_capture=True
log_capture=True
default_format=pretty
[behave.formatters]
allure=allure_behave.formatter:AllureFormatter
[behave.userdata]
executionMode=Browserstack
userName=harishekal_QYol9T
accessKey=RG6juTxk2Coom4p5JPSN
iOS_browserstack_appUrl=bs://f3b6763a19d710fc72fdc615db7119ea654e06da
iOS_browserstack_build=0.0.0
iOS_browserstack_device=iPhone 11
iOS_device_os_version=15.4
iOS_deviceType = iOS

android_browserstack_appUrl=bs://495dd3b36cae77a1ae2f4ce6f439456999e0bb45
android_browserstack_build=0.0.1
android_browserstack_device=Google Pixel 3
android_device_os_version=9.0
android_deviceType = Android

Copy user userName and accessKey of the user BrowserStack account. And iOS_broserstack_appUrl – Uploaded .ipa file through curl command. android_broserstack_appUrl – Uploaded .apk file through curl command.

Congratulations, finally we have created our own Python Selenium Behave BDD framework. 

Step 7:

As I mentioned earlier we will be using Allure for reporting the test result. For this use the below command in the terminal and it will generate the result folder for you.

  • behave Features/Calculator.feature -f allure_behave.formatter: AllureFormatter -o Report_Json

To convert the JSON file into readable HTML format use the below command. 

  • allure generate Report_Json -o Report_Html –clean

I have added this framework to the following Git Repository.

https://github.com/spurqlabs/PythonAppiumMobileFramework

Conclusion: 

Creating a robust mobile testing framework using Appium is very important as well as feels like a tedious task but with the right guidelines, everyone can create a testing framework. This framework helps improve the quality and efficiency of the testing process. I hope this blog will help everyone to create a robust mobile testing framework using Appium. Here, we choose a behave framework over other existing frameworks because of its better understanding, ease of adaptation, and ease to understand for end users.

Different Tools to Inspect Desktop App Elements

Different Tools to Inspect Desktop App Elements

Introduction:

Desktop applications can be accessed from the machines on which they are deployed. Generally, single users only use it, or whoever is signed in to the account at the time. Whereas Web applications don’t have any location constraint to be accessed as they are deployed on servers that can be accessed by any user with an internet connection and can be accessed via a URL or browser.

In terms of determining locators to use in test scripts, a different approach is necessary for different applications. With web applications, for instance, the browser’s developer tools can inspect the HTML elements on the page. But for desktop applications, you need to download an application to be able to inspect UI elements on the screen. So in software testing, when it comes to testing desktop applications, one of the key challenges is to identify the locators for different UI elements in an application.

 In this blog, we will explore different tools specifically designed to help testers to find the right locators for UI elements in desktop applications and also discuss how to use these tools effectively.

1.       Inspect tool

2.       WinAppDriver UI Recorder

3.       Appium

1. Inspect Tool:

What is an Inspect Tool?

It is a Windows-based tool that can select any UI element and view its accessibility data and also test the navigational structure of the automation elements in the UI Automation tree and the accessible objects in the Microsoft Active Accessibility hierarchy.

Where do you get Inspect tool?

This is shipped together with Visual studio.Inspect.exe. and it’s inside the SDK directory like this: “C:\Program Files (x86)\Windows Kits\10\bin\10.0.16299.0\x64\inspect.exe”

Note: If Windows kits are not available on your machine then use the following location to download Windows SDK https://developer.microsoft.com/en-us/windows/downloads/windows-sdk/

Now let’s see how to inspect desktop application elements using Inspect tool:

1. Double-click on the Inspect.exe file to open the Inspect tool: When you start Inspect, you will see Tree view on the left side it will show all the applications open in your machine, and on the right side you will see Data view which shows all the accessibility information about selected element also you can navigate the UI to view accessibility information about every element in the UI. By default, Inspect tracks the keyboard or mouse focus. As focus changes, the data view updates with the property information of the element with focus as shown in the below screenshot:

2. Second step is to open the desktop application you want to inspect.

3. Move your mouse cursor over the element that you want to inspect: then on the left side of the tool you will see the hierarchical structure of UI elements as a tree-view control and on the right side, you will see all the properties of selected elements like Name, AutomationId, Class Name, Control Type, Native Window Handle and so on…
You can also try the following options from Inspector

  • MSAA Mode- Displays Microsoft Active Accessibility property information.
  • UI Automation Mode-Displays UI Automation property information.
  • Active Hover Toolbar-Activates toolbar buttons on mouse hover, instead of requiring a mouse click.
  • Show Highlight Rectangle-Highlights a rectangle around the element with focus.
  • Show Information Tooltip-Shows property information in a tooltip.

then you can use these properties in your code as examples shown  below:

winAppDriver.findElementByName("Add").click();
winAppDriver.findElementByAccessibilityId("93").click();
winAppDriver.findElementByXPath("//*[@LocalizedControlType='button' and contains(@Name,'Add')]");

2. WinAppDriver UI Recorder:

What is a UI recorder?

This is a tool for Windows Application Driver (WinAppDriver) that allows you to record user actions on a Windows application for selected UI elements and view their attribute data to generate XPath queries.

WAD UI Recorder will record the absolute xPath for an application object on mouse hover or click. Then you can make some slight modifications to the recorded absolute xPath for use within your step. You can also use the absolute xPath recorded in WAD UI Recorder to help you construct a relative xPath that may be more readable and manageable within your Cycle steps.

Where do we get a UI recorder?

Download the WinAppDriverUIRecorder.zip from the following link:

https://github.com/microsoft/WinAppDriver/releases

Then in this unzip file you will see WinAppDriverUIRecorder.exe to launch the UI recorder

Now let’s see how to inspect desktop application elements using UI Recorder:

1. Run WAD UI Recorder as administrator. The application window will open on your desktop which is divided into two panels, as shown below. The top panel will display the absolute XPath recorded by the application and the bottom panel displays xPath node information.

There are two buttons located at the bottom of the window which are used to Record and Clear the panels. The Record button will change to a Pause button while the recording is in progress.

how to inspect desktop application elements using UI Recorder:

 2. Open the desktop application you want to inspect

3. Click on the Record button to start recording xPath. The WAD UI Recorder will now start recording xPaths of any object or element you interact with on your desktop.

4.  Move your mouse cursor or click on the element in your target application to generate the absolute xPath. The xPath appears in the top panel of WAD UI Recorder. Clicking the Pause button will stop the WAD UI Recorder from recording xPath.

Then Xpath is recorded in WAD UI Recorder, as shown below 

 inspect desktop application elements using UI Recorder:

5. Copy the xPath from the top panel of WAD UIRecorder and paste it into your text editor of choice then will need to be modified in order to use it in your step. You will need to replace the escaped double quote characters (  \” )  with single quotes  ( ‘ ) and change the starting point of the xPath and also remove the “Desktop” portion from the absolute xPath. This portion of the xPath is not necessary then we can use that xPath in your step as shown below:

winAppDriver.findElementByXPath("//*[@ClassName='CalcFrame']/Button[@ClassName='Button'][@Name='Add']").click();
winAppDriver.findElementByXPath("//*[@ClassName=\"CalcFrame\"]/Button[@ClassName=\"Button\"][@Name=\"Add\"]").click();

3. Appium:

What is Appium?

Appium is an open-source tool used for automating mobile and desktop applications. It supports a variety of programming languages, including Java, PHP, Objective C, Python, and JavaScript.

Appium is a combination of two components:

  • Appium Server: To ensure test automation of the applications, we use the server.
  • Appium Inspector: We use the feature to inspect and obtain all the specifications of the application’s UI element. For inspecting the element, you have to set the desired capabilities that can inform the Appium server about the type of application and platform you aim to automate.

Where do we get an Appium?

Download the exe file of the Appium desktop setup from the link:

https://github.com/appium/appium-desktop/releases

 On this Github page, you will see the latest release details as shown below:

Under the Assets section, click and download the Appium-Inspector-windows exe file and save it on your machine.

  • Under the Assets section, click and download the Appium-Inspector-windows exe file and save it on your machine.

  • Open the downloaded exe folder and double-click on the exe to start the installation process.
  • After completing the installation, the Appium GUI window will open as shown below.

Now let’s see how to inspect desktop application elements using Appium:

1. First step before opening the Appium you should launch the winAppDriver by running the nan automation script.

2. Second step is to open the app from which you can inspect the elements

3. Third step is to start the Appium server by clicking on the “Start Server” button it will launch the Appium server on your local machine.

4. Fourth step is to click on Attached Session button, then attach the root path of the application with Appium by clicking Attach Session button present at the bottom

Appium server

5. Then in App Source you will see all the elements and all opened apps and in Selected Elements, you will see the details of selected elements such as ID, class name, or XPath, that you can use in your test script as shown in the following example.

winAppDriver.findElementByName("Add").click();
winAppDriver.findElementByXPath("/Window[2]/Pane/Button[23]").click();

Comparison amongst the above three:

In the Inspect tool, you will immediately get the element attributes, there is no need to start the session every time and you can directly use these attributes in your script without any modifications. Whereas with a UI recorder, you can get an Xpath and with Appium you will get a unique xpath but you need to start the recorder and Appium every time. Along with these tools, you can also explore TestArchitect, Ranorex, Tricentis Tosca, SikuliX, and Askui’s James tools.

Appium im

Conclusion:

I have used Inspect, UI Recorder, and Appium Inspector tools, the ease of using these tools depends upon the application’s complexity. For me, getting attributes of the UI element Inspect tool will prove better and for Xpath UI recorder will help.

Read more blogs here