How to use Visual Studio Code for Java Selenium Automation?

The purpose of this blog is to provide a step-by-step guide on how to use Visual Studio Code as the Integrated Development Environment (IDE) for designing and running Selenium tests with Java. Visual Studio Code is a de facto IDE for JavaScript; however, it is rarely used for Java and Selenium automation, as test automation engineers have traditionally used either Eclipse or IntelliJ IDEA. Visual Studio Code is one of the most widely used IDEs, and software professionals prefer it for the following features:

  1. Cross-platform support: Visual Studio Code can run on Windows, macOS, and Linux
  2. IntelliSense: Visual Studio Code provides intelligent code completion and error detection, making it easier to write and debug code
  3. Built-in Git integration: Visual Studio Code provides built-in support for Git, allowing you to easily version control your projects
  4. Extensible with plugins: Visual Studio Code has a large community of developers who have created a variety of plugins to enhance the functionality of the IDE.
  5. Fast and lightweight: The design of Visual Studio Code focuses on being fast and lightweight, making it easy to use on lower-end hardware.
  6. Free and open-source: Visual Studio Code is free to use and is built on an open-source code base, making it accessible to everyone.
  7. Debugging and testing: Visual Studio Code has built-in support for debugging and testing, making it easier to identify and fix bugs in your code.
  8. Customization: You can customize Visual Studio Code with themes, keybindings, and settings, allowing you to tailor the IDE to your needs.

Here’s what you’ll need to get started with using Visual Studio Code for Java Selenium Automation:

  1. Java Development Kit (JDK): This is a software development kit that provides the necessary tools to create Java applications. You can download the JDK from the Oracle website (https://www.oracle.com/java/technologies/javase-downloads.html).
  2. Visual Studio Code: You can download it from the Microsoft website (https://code.visualstudio.com/).

Once you’ve installed these components, you can create a new Java project in Visual Studio Code and add the Maven libraries to it.

Steps to set up Visual Studio Code for Java Selenium Automation:

Assuming Visual Studio Code is installed, let’s start the recipe step by step to create a flow for execution.

Step 1:

You need to open Visual Studio Code and locate the Marketplace.

Step 2:

In the search bar, search for Java, then install the Extension Pack for Java from the Marketplace.

Step 3:

Now, navigate to the settings option in the lower-left corner.

Step 4:

Open the Command Palette and look for the command “Java: Create Java Project”.

Step 5:

This is where we start adding Maven to the project so that we can run the Java Selenium tests from Visual Studio Code. After selecting Create Java Project, select the “Maven” option as the project type.

Step 6:

Next, a folder browser opens. Choose the folder in which to create your Java project and open it in Visual Studio Code.

Step 7:

When prompted, enter your project name and press “Enter” to open that project in Visual Studio Code.

Step 8:

Once you open the project, you will see the default folder structure. To ensure code reusability, we build the following framework structure on top of it in Visual Studio Code.

  • Default Structure
  • Hybrid Framework folder structure

A Hybrid Driven Framework is a combination of the Data-Driven and Keyword-Driven frameworks. Here, we externalize both the keywords and the test data. Keywords are maintained in a separate Java class file, and test data can be maintained in a properties file, in an Excel file, or through the data provider of the TestNG framework.
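
To make the keyword idea concrete, here is a minimal sketch in plain Java. It is not the framework from this blog: the class name KeywordLibrary, the keyword names, and the log list (which stands in for real browser actions) are all hypothetical and only illustrate how a keyword plus a piece of external test data can drive one step.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical keyword library: each keyword name maps to one reusable action.
class KeywordLibrary {
    private final Map<String, Consumer<String>> keywords = new LinkedHashMap<>();

    // Stands in for real browser actions so the sketch runs without Selenium.
    final List<String> log = new ArrayList<>();

    KeywordLibrary() {
        keywords.put("openUrl", url -> log.add("open " + url));
        keywords.put("typeUsername", name -> log.add("type username " + name));
        keywords.put("clickSubmit", ignored -> log.add("click submit"));
    }

    // Runs one externalized step: a keyword plus the test data value it needs.
    void run(String keyword, String data) {
        Consumer<String> action = keywords.get(keyword);
        if (action == null) {
            throw new IllegalArgumentException("Unknown keyword: " + keyword);
        }
        action.accept(data);
    }
}
```

In a real framework the keyword/data pairs would come from the properties file, Excel sheet, or TestNG data provider mentioned above, and each action would call a page object instead of writing to a list.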

Step 9:

Adding Maven dependencies and downloading them:

Maven is a tool for building and managing projects. When several test engineers contribute files to the same framework, Maven is used to check for compilation problems among the framework components. Every time a change is made to the framework, the build is run again so that the state of the framework components is continuously monitored. If there are no compilation errors in the framework, Maven outputs a “build success” message; otherwise, it outputs a “build failure” message.

Once you select Maven as the project type, Visual Studio Code creates a Maven project with a pom.xml file. In the pom.xml file, add the Selenium dependency, which you can copy from https://mvnrepository.com/, inside the dependencies section.

This dependency has been added to provide the libraries required to run the Selenium project. In a normal Selenium project, we need to download standalone libraries and then add them to external libraries, whereas here, once you add this Maven dependency, it will automatically download all libraries required for the Selenium project.

In a Maven project, you can use the Apache POI library by adding its dependency to your project’s pom.xml file. This will automatically download and include the library in your project’s classpath. Once you have added the dependency, you can start using the Apache POI APIs in your Java code to read, write, and manipulate Microsoft Office documents.

In a Maven project, you can add the Extent Reports dependency to your project’s pom.xml file to use its reporting functionality. Once you have added the dependency, you can start using the Extent Reports APIs in your Java code to create and generate detailed reports for your automation test results.

In a Maven project, you can add the TestNG dependency to your project’s pom.xml file to use its testing functionality. Once you have added the dependency, you can start using the TestNG annotations in your Java code to define your tests, test suites, and test configurations.

In a Maven project, you can add the Log4j dependency to your project’s pom.xml file to use its logging functionality. Once you have added the dependency, you can start using the Log4j APIs in your Java code to log messages to various output targets. You configure Log4j through a configuration file that specifies the logging level, output target, and other logging settings: Log4j 1.x, whose org.apache.log4j API is used in the base class later in this blog, reads log4j.properties, while Log4j 2 reads files such as log4j2.xml or log4j2.properties.

The Java Selenium tests are run through the pom.xml file, so you can add more Maven dependencies to it according to your test requirements.

1. Test/Java/example:

This folder contains the test source code packages and classes. It holds everything related to the test cases, including the page objects, the base class, and the test case classes that run the TestNG framework.

Code Explanation in Demo example:

Login Page:

  • The base class takes care of browser setup, loading configuration files, and other reusable methods such as taking screenshots and handling synchronization issues.
  • Using the base class, we can avoid code duplication.
  • Reuse code as much as we can.

package com.example.Pages;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.FindBy;
import org.openqa.selenium.support.PageFactory;

public class LoginPage {
    WebDriver driver;
    public LoginPage(WebDriver driver){
        this.driver = driver;
        PageFactory.initElements(driver, this);
    }

    @FindBy(id="email")
    WebElement usernameBox;

    @FindBy(id="passwd")
    WebElement passwordBox;

    @FindBy(name="SubmitLogin")
    WebElement SignInBtn;

    public void enterUsername(String uname){
        usernameBox.sendKeys(uname);
    }

    public void enterPassword(String upwd){
        passwordBox.sendKeys(upwd);
    }

    public void submitButton(){
        SignInBtn.click();
    }
}

Base Class:

  • @BeforeClass: The annotated method will be run before the first test method in the current class is invoked.
  • @AfterClass: The annotated method will be run after all the test methods in the current class have been run.
package com.example.TestCases;
import org.apache.log4j.Logger;
import org.apache.log4j.PropertyConfigurator;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
import org.testng.annotations.AfterClass;
import org.testng.annotations.BeforeClass;
import com.example.Utilities.ReadConfig;
public class BaseClass {
    ReadConfig read = new ReadConfig();
    public String googlebaseurl = read.getGoogleBaseURL();
    public String loginbaseurl = read.getLoginBaseURL();
    public String uname = read.getUsername();
    public String upwd = read.getPassword();
    public String SuccessURL = read.getSuccessURL();
    public WebDriver driver; 
    public Logger logger;
    @BeforeClass
    public void SetUp(){
        System.setProperty("webdriver.chrome.driver", "Drivers/chromedriver.exe");
        // WebDriverManager.chromedriver().setup();
        driver = new ChromeDriver();
        driver.manage().window().maximize();

        logger = Logger.getLogger("googledemo");
        PropertyConfigurator.configure("log4j.properties");
    }

    @AfterClass
    public void tearDown(){
        driver.quit();
    }
}

Test Class:

This class runs the TestNG test cases and reports the results. You can also add third-party reporting tools such as Extent Reports and Allure.

package com.example.TestCases;

import java.io.File;
import java.io.IOException;

import org.openqa.selenium.OutputType;
import org.openqa.selenium.TakesScreenshot;
import org.testng.Assert;
import org.testng.annotations.Test;

import com.example.Pages.LoginPage;
import com.google.common.io.Files;

public class TC_LoginTest extends BaseClass{
    @Test
    public void LoginToWebsite(){
        LoginPage Lp = new LoginPage(driver);
        driver.get(loginbaseurl);
        Lp.enterUsername(uname);
        Lp.enterPassword(upwd);
        Lp.submitButton();
        // Fail with a readable message if login did not land on the expected page.
        Assert.assertEquals(driver.getCurrentUrl(), SuccessURL, "Unexpected URL after login");
        File screen = ((TakesScreenshot)driver).getScreenshotAs(OutputType.FILE);
        try {
            Files.copy(screen, new File("Screenshots/login.jpg"));
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}

2. Drivers:

This folder stores the browser driver executables required to run the test cases on a specific browser, whether that is Chrome, Firefox, Edge, or Opera; use whichever browser you are comfortable with. Alternatively, you can use a Maven dependency to set up the WebDriver: WebDriverManager is a library that automates the management of the WebDriver binaries, such as chromedriver and geckodriver, in your Java project.

3. Configuration:

Properties are used to externalize configurable data. If you put that data in your code (test script), you would need to rebuild the code each time you wanted to modify a property’s value. The fundamental advantage of properties is that they exist independently of your source code and can be modified at any time. Each parameter is stored as a pair of strings, with one string serving as the key and the other as the value.

You can see the folder structure, where the configuration folder contains a config file that includes all the common information required to run a test.

4. Utilities:

Once we have the config file loaded, we need to read its properties. The Properties object gives us the getProperty method, which takes the key of a property as a parameter and returns the value of the matching key from the .properties file. This folder therefore includes a class that assists in reading all of the data from the config.properties file.
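
The ReadConfig class used by the base class is not shown in this blog, so the following is only a sketch of what such a reader could look like. The getter names match the calls made in BaseClass; the file location Configuration/config.properties and the property key names are assumptions and should be adapted to your own config file. In the real project the class would be declared public inside the com.example.Utilities package.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Sketch of a config reader; key names and file path are assumptions.
class ReadConfig {
    private final Properties props;

    // Loads the file from a hypothetical location; adjust to your folder structure.
    ReadConfig() {
        this(loadFrom("Configuration/config.properties"));
    }

    // Accepts already-loaded properties, which keeps the class easy to test.
    ReadConfig(Properties props) {
        this.props = props;
    }

    private static Properties loadFrom(String path) {
        Properties loaded = new Properties();
        try (InputStream in = new FileInputStream(path)) {
            loaded.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read " + path, e);
        }
        return loaded;
    }

    // Returns the value for a key, failing early when the key is missing.
    private String require(String key) {
        String value = props.getProperty(key);
        if (value == null) {
            throw new IllegalStateException("Missing property: " + key);
        }
        return value;
    }

    public String getGoogleBaseURL() { return require("googleBaseUrl"); }
    public String getLoginBaseURL()  { return require("loginBaseUrl"); }
    public String getUsername()      { return require("username"); }
    public String getPassword()      { return require("password"); }
    public String getSuccessURL()    { return require("successUrl"); }
}
```

Failing on a missing key, rather than returning null, is a design choice made here so that a typo in the config file surfaces before the browser starts.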

5. Screenshots:

Screenshots are useful for bug analysis. Selenium WebDriver can capture them during execution through the TakesScreenshot interface, whose getScreenshotAs method tells WebDriver to take the screenshot and return it. This folder holds the screenshots that were taken while the tests ran.

6. log4j.properties:

log4j.properties is a configuration file used with Apache Log4j, a popular logging library for Java applications. It configures the logging behavior of the application, including the output destination for log messages, the format of log messages, and the logging levels for different categories of log messages. The log4j.properties file contains key-value pairs that define these logging settings.

Step 10:

This is how we organize the hybrid framework: we add a few folders to make it easier for testers to run the code without encountering errors.

Visual Studio Code may not show errors until you save your code with Ctrl+S. You can then hover over an error to see a suggested fix. If you need to install a package, the suggestion will show that as well.

Run Test:

  1. Navigate to the project directory: Open a command prompt or terminal window and navigate to the directory where your Maven project is located. This should be the directory that contains the “pom.xml” file, which is the Maven project configuration file.
  2. Run the Maven test command: Once you are in the project directory, run the following command to execute the Maven tests: “mvn test”
  3. This command instructs Maven to run the tests defined in your Maven project. Maven automatically compiles the necessary source code, downloads the dependencies, and executes the tests using the testing framework configured in your project’s dependencies, such as JUnit or TestNG.
  4. View the test results: After the tests have been executed, Maven displays the test results in the command prompt or terminal window, showing which tests passed, failed, or were skipped.

Conclusion:

Visual Studio Code is a well-known IDE among software professionals, and I can think of no better way to run Java Selenium automation than through it.

With the Extension Pack for Java, Visual Studio Code makes running test cases more convenient and adaptable.

How to click on an element with Sikuli using an image?

Introduction:

In the realm of automation testing, the conventional practice of identifying locators such as XPath, CSS, and ID is widely employed. However, there are scenarios where substantial time is expended in locating elements within diverse components, such as popup windows and Microsoft Foundation Class (MFC) windows. Additionally, there are cases where element location proves to be impossible. These challenges often impede progress and create bottlenecks. Hence, here in this blog, my aim is to propose a solution for addressing these issues and optimizing time allocation.

What if there was a way to bypass the traditional locator-finding technique and still identify and interact with elements? 

Well, it is indeed possible using Sikuli. Sikuli offers an alternative approach to automation by leveraging visual patterns, allowing users to interact with elements on the screen without relying on traditional locator-based techniques.

What is Sikuli?

Sikuli is a powerful open-source test automation tool that excels when there is limited access to a GUI’s internals or source code. Instead of relying on XPath, CSS, or ID, Sikuli employs image recognition and GUI component control to identify objects displayed on the screen. It operates as a separate tool that uses its image recognition mechanism to find an element and then performs an action on it.

Sikuli is a versatile tool that integrates with popular programming languages like Python and Java. It is compatible with various operating systems, including Windows, Mac, and Linux, and it can be used together with Selenium and with IDEs such as PyCharm. By adopting this approach, we significantly reduce the time required for element location, simplifying the automation process.

Pre-requisite For Sikuli:

To get started with Sikuli, we need the following:

  1. Download and install any IDE as per your preference. Here we are using IntelliJ IDEA.
  2. Create a new Maven project using IntelliJ IDEA.
  3. Get the Sikuli dependency or jar file from https://mvnrepository.com/artifact/org.sikuli and add it to your pom.xml file.
  4. Add the other required dependencies, such as Selenium, WebDriver, etc.
  5. Create a folder in the project to store screenshots.
  6. To take a screenshot, you can use the built-in snipping tool available on your system. Alternatively, you can install tools like Inspector, PowerShell, or AutoIT, which provide x and y coordinates. For more information on these tools, you can refer to this blog: https://spurqlabs.com/different-tools-to-inspect-desktop-app-elements/
  7. Using x and y coordinates, we can take a screenshot during execution and store it in a specific path. The code for this is shown later in this blog.
  8. Create one Java class.
  9. Build your project.

Architecture of Sikuli:

  • Sikuli is a framework that assists in automating various elements on web pages.
  • The framework utilizes an image recognition mechanism to identify elements on a webpage.
  • Image recognition is achieved by comparing the elements on the webpage with provided images.
  • If a provided image is not found on the webpage, Sikuli raises an exception.
  • In specific scenarios, it is advisable to select an appropriate image that precisely highlights a single element on the webpage.
  • Selecting a precise image helps to ensure greater accuracy in element identification.
  • The Sikuli framework offers different methods to execute actions on web pages.
  • These methods provide versatility and flexibility in achieving automation objectives.

Screen Class:

The Sikuli framework has a built-in Screen class, which provides predefined methods for performing actions on web elements using images. To access the methods of the Screen class, we need to declare a reference to this class and initialize it.

Screen screen = new Screen();

In the above code, the variable “screen” is declared as an instance of the Screen class, and the new keyword is used to create a new object of the Screen class.

Here are some of the methods available in the Screen class that can be used efficiently:
  1. Click on an element:

To perform a left click on an element, provide an image to locate/identify the element to be clicked.

Ex: screen.click("image path");

  2. Right-click on an element:

This method is used to perform a right-click on an element by providing an image to locate/identify the element to be clicked.

Ex: screen.rightClick("image path");

  3. Double-click on an element:

We use this method to perform a double-click action on an element. It first locates the element on the screen and then performs a double left click on the element.

Ex: screen.doubleClick("image path");

  4. Type on an element:

In the Sikuli framework, you use the type method to send keys by providing an image path and the text to send as method arguments.

Ex: screen.type("image path", "text to send");

  5. find():

We can use this method to check whether an element is visible on a webpage. It throws an exception if the image is not found.

Ex: screen.find("image path");

  6. dragDrop():

We use this method to perform a drag-and-drop action. We provide a source image and a target image as the method arguments.

Ex: screen.dragDrop("source image", "target image");

  7. hover():

We use this method to hover the cursor over a web element and validate the popup messages that appear.

Ex: screen.hover("image path");

Add this dependency to the pom.xml file to use the Screen class of Sikuli.

<dependencies>
<dependency>
<groupId>com.sikulix</groupId>
<artifactId>sikulixapi</artifactId>
<version>2.0.5</version>
</dependency>
</dependencies>

How to integrate Sikuli with Selenium:

In the world of automation testing, we use an Integrated Development Environment (IDE) to write code. Nowadays, it has become common to create Maven projects to facilitate collaboration with various add-ons. As a widely used automation tool, Selenium supports integration with many add-ons. To integrate Sikuli into the Selenium framework, we need to add the required dependencies in the pom.xml file of our project.

To find the Sikuli dependencies, we can search the Maven repository at https://mvnrepository.com/artifact/org.sikuli. From this repository, we can copy the necessary dependencies and paste them into the pom.xml file of our project.

By adding the Sikuli dependencies to the pom.xml file, we ensure that the required libraries and resources are properly imported and utilized within our Selenium-Sikuli integration. This allows us to leverage the capabilities of Sikuli for image recognition and interaction within our Selenium automation framework.

We will now create a Sikuli function.

1. Create a Maven project, then create a class with a main method that sets up and launches a browser:

public static void main(String[] args) {
    // Download and configure the matching chromedriver binary (WebDriverManager library)
    WebDriverManager.chromedriver().setup();
    ChromeDriver driver = new ChromeDriver();
    driver.get("https://demoqa.com/");
    driver.manage().window().maximize();
}

2. Take a screenshot and store it in a specific location:

  1. The Snipping Tool is a reliable tool for capturing screenshots. With it, we can capture customized screenshots and save them within the project folder.

Crop the image of a single element and save it in the project screenshot folder.

How to take screenshots by using x,y coordinates:

  • There is an alternative method to capture screenshots without relying on external tools.
  • We can utilize the Robot class and its methods to capture screenshots based on x and y coordinates.
  • To capture a rectangular screenshot, we need two sets of x and y coordinates.
  • The first set represents the top-left corner of the rectangle, and the second set represents the bottom-right corner.
  • By specifying these coordinates, we can define the area of the screen to capture.
  • An example code snippet captures a screenshot based on the specified coordinates.
  • Our framework saves the captured screenshot to a specific location.
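import java.awt.AWTException;
import java.awt.Rectangle;
import java.awt.Robot;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// imageFolderPath, xStart, yStart, xEnd and yEnd are assumed to be defined by the caller.
String fileName1 = "";
try {
    Robot robot = new Robot();
    String imageFormat = ".png";
    StringBuilder str = new StringBuilder(imageFolderPath)
            .append(System.currentTimeMillis()).append(imageFormat);
    fileName1 = str.toString();
    Rectangle captureRect = new Rectangle(xStart, yStart, xEnd - xStart, yEnd - yStart);
    BufferedImage screenFullImage = robot.createScreenCapture(captureRect);
    String format = "png"; // format name expected by ImageIO.write
    System.out.println("Path is " + fileName1);
    ImageIO.write(screenFullImage, format, new File(fileName1));
    System.out.println("A partial screenshot saved!");
} catch (AWTException | IOException ex) {
    System.err.println(ex);
}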
Explanation of the above code:
  • Declares a variable fileName1 of type String and initializes it as an empty string.
  • The try-catch block handles potential exceptions that may occur during execution.
  • Creates a new instance of the Robot class, which allows for programmatic control of the mouse and keyboard as well as screen capture.
  • Declares an image format variable and assigns it the value ‘.png’.
  • Constructs a StringBuilder object to create the file path for the screenshot. It concatenates the image folder path, the current system time in milliseconds, and the image format.
  • Converts the StringBuilder object to a String and assigns it to the ‘fileName1’ variable.
  • Defines a Rectangle object that represents the area of the screen to be captured. It takes the starting coordinates (xStart, yStart) and the width and height calculated from (xEnd – xStart) and ( yEnd – yStart)
  • Uses the ‘createScreenCapture(captureRect)’ method of the Robot class to capture the screen within the specified Rectangle area. It returns a BufferedImage object representing the captured image.
  • Writes the captured image to the file specified by fileName1 using the write()  method from the ImageIO class.
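
The capture rectangle is the step that is easiest to get wrong, because Rectangle expects a width and a height rather than a second corner. The helper below is a small sketch of that conversion; the class name CaptureRegion and the method fromCorners are hypothetical and not part of the code above. Unlike the original snippet, it also accepts the two corners in either order and rejects an empty area.

```java
import java.awt.Rectangle;

// Hypothetical helper: turns two corner points into the Rectangle that Robot expects.
class CaptureRegion {
    static Rectangle fromCorners(int xStart, int yStart, int xEnd, int yEnd) {
        // Use the smaller coordinate as the top-left corner so corner order does not matter.
        int x = Math.min(xStart, xEnd);
        int y = Math.min(yStart, yEnd);
        int width = Math.abs(xEnd - xStart);
        int height = Math.abs(yEnd - yStart);
        if (width == 0 || height == 0) {
            throw new IllegalArgumentException("Capture area must have a width and a height");
        }
        return new Rectangle(x, y, width, height);
    }
}
```

For example, the corners (100, 200) and (400, 260) describe an area that starts at (100, 200) and is 300 pixels wide and 60 pixels high; the result can be passed straight to robot.createScreenCapture.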

3. Click on an element by using the screenshot taken previously:

Using the Sikuli methods mentioned above, we can perform multiple actions on elements.

Screen s = new Screen();
s.find(fileName1);
s.click(fileName1);

Limitations:

  1. Manage a number of screenshots:

Managing a large number of screenshots can be a complex and time-consuming process. Locating a specific screenshot among many can become challenging. To simplify this process, a recommended solution is to establish a specific naming convention for the screenshots.

  2. Two similar images are available on the webpage:

If more than one matching image is available on the webpage, Sikuli cannot reliably distinguish the specific image you intend. If the image is not recognized at all, Sikuli throws an exception.
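
For the first limitation, one possible naming convention is page, element, and capture time joined into the file name. The sketch below is only an illustration; the class ScreenshotNames, the name pattern, and the timestamp format are assumptions, not something prescribed by Sikuli.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical naming helper: <page>_<element>_<timestamp>.png
class ScreenshotNames {
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMdd-HHmmss");

    static String fileName(String page, String element, LocalDateTime takenAt) {
        return sanitize(page) + "_" + sanitize(element) + "_" + takenAt.format(STAMP) + ".png";
    }

    // Lower-cases the text and replaces anything that is not a letter or digit with a dash.
    private static String sanitize(String text) {
        return text.trim().toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]+", "-");
    }
}
```

With this pattern, a screenshot of the submit button on the login page sorts next to the other login page images, and the timestamp keeps repeated captures from overwriting each other.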

Conclusion:

To overcome the challenges of locating elements in automation testing, especially within popup windows and MFC windows, we have successfully implemented Sikuli as a solution. By adopting Sikuli, we can eliminate the need for traditional locators in these cases, which reduces the time spent locating elements and improves the efficiency of our automation efforts. Sikuli’s visual recognition capabilities help users swiftly identify and interact with GUI elements. Overall, Sikuli proves to be a valuable alternative in scenarios where traditional locators are insufficient or inaccessible.

Image Comparison Using Java & Selenium

Introduction:

Let’s explore the topic of image comparison testing, which is often overlooked by testers who typically focus more on validating texts, buttons, forms, fields, and other similar elements. However, in some instances, businesses may require testing of images such as logos, infographics, and other graphics. 

I encountered such a scenario in my project, where I had to compare two images from different windows. This project was related to life sciences, specifically comparing images of two different animals. During this process, I discovered two methods for image comparison testing through automation using Selenium WebDriver and Java. Today, we will look into these two aspects of image comparison testing.

I don’t think there are any industries left that are not involved in image testing at some level, minor or major. At the very least, organizations commonly ask testers to verify the logo placed on their website. Still, I have listed below some of the industries where image comparison testing can be used extensively.

Prerequisite: 

You need to add the AShot API dependency to your project. The BufferedImage class used below is part of the JDK. (Optional: Pointlib and Sikuli.)

Applications of Image Comparison in Various Domains:

  • E-commerce: Ensuring accurate representation of products and their images is crucial for online retailers.
  • Advertising and Marketing: Verification of visual advertisements, banners, and promotional materials is essential for maintaining brand consistency.
  • Gaming: Testing game graphics, character designs, and visual effects are vital for delivering an immersive gaming experience.
  • Healthcare and Medical Imaging: Evaluating medical images for accuracy and precision is critical for diagnostic purposes.
  • Automotive: Comparing images of vehicle designs, safety features, and user interfaces helps ensure optimal user experiences.

Image comparison cannot be performed directly using Selenium WebDriver alone, but during automation we sometimes need to compare two images for verification. When that need arises, we can rely on third-party APIs. One such API is AShot, which allows us to compare two images. In this blog, I will explain how to compare two images using Java, Selenium, and the AShot API.

Step-by-Step Implementation Guide:

  1. We need to capture the screenshot of a particular element and store it in the framework of the project during execution.
  2. Then compare the captured image with the expected image which was already stored in the project.

To capture the image during automation, Selenium has the TakesScreenshot interface, which allows us to take a screenshot of the whole screen. The code snippet to take a screenshot is shown below:

// webdriver is the WebDriver instance that controls the browser
TakesScreenshot scrShot = (TakesScreenshot) webdriver;
File srcFile = scrShot.getScreenshotAs(OutputType.FILE);

But when there is a need to take a screenshot of a particular element or image, there are several options, such as Java’s BufferedImage class, the PointLib library, and external libraries like Sikuli.

While doing image comparison, some things need to be kept in mind: both images should have the same dimensions, and if one image is a grid image, the other must be a grid image too; otherwise they won’t be recognized as the same image.

Java ImageIO class:

ImageIO is a final class that belongs to the javax.imageio package. It provides convenience methods for reading and writing images and performing simple encoding and decoding, along with many utility methods related to image processing. Using this class, we can deal with popular image formats such as .jpg, .gif, and .png.
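
As a small illustration of the read and write methods, the sketch below saves a BufferedImage as a PNG file and loads it back. The class name ImageFiles and its two methods are hypothetical wrappers; only ImageIO.write and ImageIO.read are the actual JDK calls. Wrapping IOException in an unchecked exception is a choice made here to keep test code short.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import javax.imageio.ImageIO;

// Hypothetical wrapper around the ImageIO read and write calls.
class ImageFiles {
    // Writes the image as PNG; ImageIO.write returns false when no writer supports the format.
    static void savePng(BufferedImage image, File target) {
        try {
            if (!ImageIO.write(image, "png", target)) {
                throw new IllegalStateException("No PNG writer available");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads an image; ImageIO.read returns null when the file is not a supported image.
    static BufferedImage load(File source) {
        try {
            BufferedImage image = ImageIO.read(source);
            if (image == null) {
                throw new IllegalArgumentException("Not a readable image: " + source);
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The expected image stored in the project and the image captured during execution can both be loaded this way before they are compared.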

BufferedImage class:

This is a subclass of the Image class. It handles and manipulates the image data. A BufferedImage is made up of a ColorModel and a Raster of image data. The BufferedImage class offers many methods, such as getHeight, getWidth, and getRGB.


In the first Image comparison method, we take both images’ RGB values with the Color class’s help. The Color class is a part of the Java Abstract Window Toolkit (AWT) package. The Color class creates color by using the given RGBA values and has different methods which return the component in the range of 0-255.

Comparison Scenarios:

Positive scenario:

Both images are the same. For this scenario, both of the methods described below report that the images are the same.

Negative scenario:

One small change is made to one of the images: a small black dot is added. For this scenario, the first method reports that the images are not the same, while the second method reports that they are the same. So we can confirm that the first method is more accurate at detecting every small change in the image.

Method 1:

This method of comparing the RGB value is more reliable than the second one. The second method only reflects the difference when there is a significant change in the images.

Here is the link provided for the code of both methods.

ImageComparison/ImageProcessing.java at main · sarmorikar-spurqlabs/ImageComparison (github.com)

Let’s understand the program in detail

  1. We read the two images and store them as BufferedImage objects named img1 and img2, respectively.
  2. Using the getWidth() and getHeight() methods, we read the width and height of both images. The dimensions must match; if they do not, an if statement stops the comparison and the program prints “Both images should have the same dimensions”.
  3. If the dimensions match, two nested for loops iterate over the height and width. For each pixel we read the RGB value of both images with getRGB, store it in an integer, and pass that integer to the Color class.
  4. The Color class has methods to read the red, green, and blue values, and we sum the differences between the RGB values of the two images.
  5. From the sum of differences we calculate the average and the percentage. If the percentage is 0, the images are the same; if it is more than 0, the images are different.
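The steps above can be sketched in a few lines. The linked program is written in Java; the version below is only an illustrative sketch in Python that works on plain lists of (red, green, blue) tuples instead of BufferedImage objects. The function name and the exact scaling (average per channel, divided by 255) are assumptions, not taken from the linked code.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each in the range 0-255
Image = List[List[Pixel]]      # rows of pixels


def difference_percentage(img1: Image, img2: Image) -> float:
    """Return the average per-channel difference as a percentage (0.0 means identical)."""
    height, width = len(img1), len(img1[0])
    if height != len(img2) or width != len(img2[0]):
        raise ValueError("Both images should have the same dimensions")

    total = 0
    for y in range(height):
        for x in range(width):
            r1, g1, b1 = img1[y][x]
            r2, g2, b2 = img2[y][x]
            # Sum the absolute difference of each colour channel.
            total += abs(r1 - r2) + abs(g1 - g2) + abs(b1 - b2)

    # Average difference per channel, scaled against the maximum value 255.
    average = total / (width * height * 3)
    return average / 255 * 100


if __name__ == "__main__":
    white = [[(255, 255, 255)] * 4 for _ in range(4)]
    dotted = [row[:] for row in white]
    dotted[1][2] = (0, 0, 0)  # one small black dot
    print(difference_percentage(white, white))
    print(difference_percentage(white, dotted))
```

Because every pixel contributes to the sum, even a single changed pixel produces a percentage above 0, which is why this method catches the black dot in the negative scenario.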

Method 2:

Let’s Understand Method 2 in detail

  1. Again, we read the two images and store them as BufferedImage objects named img1 and img2, respectively.
  2. Now we create an object of the ImageDiffer class. ImageDiffer has a built-in makeDiff method that compares two images and returns the result as an ImageDiff object.
  3. Then we call the built-in hasDiff method on the diff object to confirm whether the images are different.

This way we can check whether two images are the same using AShot. The negative scenario is included to show how each method behaves when there is a small change in the expected image. Method 1 is more accurate because it compares the RGB values of the images, so it gives an accurate result even for a small change.

Conclusion:

Image comparison using the AShot API provides a reliable and efficient way to verify and validate images during automation testing. By using the capabilities of AShot and implementing the image comparison process, we can enhance the quality and reliability of our automation tests and ensure that the visual aspects of our applications meet expectations.

Building a robust mobile test automation framework using Appium in Python

Mobile test automation can be more challenging than web automation, as inspecting and interacting with mobile elements requires additional effort. However, with the help of Appium, an open-source tool, it is possible to overcome these challenges. In this blog, we will explore how to build a robust mobile test automation framework using Appium in conjunction with the Behave framework in Python.

Let’s talk about robust test frameworks

A robust test automation framework ranks highly on the list of software testing “must-haves”.

When executed in a structured manner, it helps improve the overall quality and reliability of software.

If we don’t build the right framework, the results will be inconsistent test results, non-modularized tests, and maintenance difficulties. The automation framework needs to be well organized so that it is easier to understand, maintain, and expand.

There are many features that we should consider to make the automation framework more robust. 

  • Scalability – The automation framework that you have in your organization should be scalable. It should not apply to just one project; it should be usable across projects as an organization-wide test automation framework.
  • Reporting – Every automation framework should have good reporting capability. The test framework engineer can choose a third-party reporting library.
  • Configurable – A framework should be configurable. It should execute scripts in different test environments and not be restricted to a single one. The user credentials should not be “hard-coded” in the automation script itself.
  • Re-usability – The framework should promote re-use. We should use the same methods and page objects across all the test scenarios in the test automation framework.
  • Extensibility – You should be able to integrate easily with other third-party tools via APIs. Automation frameworks should integrate easily with security testing tools, web proxy debugging tools, test case management tools, or other global frameworks, making them more hybrid in nature.
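As a small illustration of the “Configurable” point, credentials can be read from the environment instead of being written into a script. This is only a sketch: the variable names BROWSERSTACK_USERNAME and BROWSERSTACK_ACCESS_KEY and the function name are assumptions chosen for this example, not part of the framework described below.

```python
import os
from typing import Mapping, Tuple


def load_browserstack_credentials(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Read the user name and access key from environment variables."""
    try:
        # Hypothetical variable names; use whatever your team agrees on.
        return env["BROWSERSTACK_USERNAME"], env["BROWSERSTACK_ACCESS_KEY"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None
```

Passing the mapping as a parameter keeps the function easy to test with a plain dictionary, while real runs fall back to os.environ.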

Benefits

  • Increase product reliability – Accurate, efficient, automated regression tests – reduce risks
  • Reduce the product release cycle time – Improve the time to market, Reduce QA cycle time

Let’s start with the basics

Appium is an open-source test automation framework used for automating mobile applications.

Appium supports Android and iOS mobile apps as well as Windows desktop apps. We can automate native, hybrid, and mobile web apps using Appium.

Uses of Appium:

  • Appium is open source and it is free of cost.
  • Appium supports Android, iOS, and Windows to run test scripts.
  • Appium supports languages such as Python, Java, Perl, PHP, C#, and Ruby
  • Appium supports different operating systems such as Mac, Windows, Linux, UNIX, etc.
  • Functional test cases of mobile applications can be easily automated.

Appium Inspector

Appium Inspector is a tool for QA engineers who want to automate mobile applications. It is the standard way to identify mobile application elements.

The following steps are used to inspect mobile elements for both Android and iOS.

  • Download the Appium Inspector .exe file.

This is the link: https://github.com/appium/appium-inspector

  • After downloading the .exe file, launch Appium Inspector. At the top of the window, select the cloud-based platform – BrowserStack

BrowserStack is a cloud-based real devices platform that provides support for both manual and automated testing of mobile apps for both Android and iOS devices. One of its standout features is the App Live feature, which allows users to manually test their mobile apps on over 3000 real Android and iOS devices.

BrowserStack supports testing across different environments, including Dev, QA, Staging, and Production apps from the play store or app store. This makes it easy for developers to test their apps in various environments and ensure that they are working correctly in each environment.

To proceed further we need BrowserStack Username and BrowserStack Access Key

For that, Go to https://www.browserstack.com/

Following are the steps that will guide the process

  • Log in to your BrowserStack account -> Navigate to the “Account” section -> Then go to Summary
  • In the Summary section you will find your Username and Access Key. Copy both and paste them into the Appium Inspector fields
  • Go to Desired Capabilities -> Here we need to add the basic capabilities required for starting the session
  • To add a capability, click on the “+” symbol
  • Add the capabilities with the desired values
  • In the Value field, enter the value for each capability.

The most important thing here is the last field, “appium:app”: we have to upload the .ipa, .apk, or .aab file to BrowserStack.

For that, there are 2 ways mentioned below

  1. Through Command Line
  2. Directly through BrowserStack

Let’s start

  • Through Command Line

To upload the .ipa, .apk, or .aab file to BrowserStack, the following curl command is very useful.

curl is a command-line tool that enables data exchange between a device and a server through a terminal. Using this command-line interface (CLI), a user specifies a server URL (the location where they want to send a request) and the data they want to send to that server URL.

Go to cmd and run this curl command, replacing the placeholders with your own values:

curl -u "username:accesskey" -X POST "https://api-cloud.browserstack.com/app-automate/upload" -F "file=@<path to your .apk or .ipa file>" -F "custom_id=<any name>"

  • Directly through BrowserStack:

 Go to “App Live” -> Click on Uploaded Apps -> And Upload your file.

The first option is preferable: the response to this command contains the “app_url”, which is required in Desired Capabilities:

Copy that app_url and paste it into Desired Capabilities.
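If you script the upload instead of copying the value by hand, the app_url can be picked out of the JSON response. The sketch below assumes the response body is a JSON object with an "app_url" key whose value starts with bs:// (as the app URLs used later in this blog do); the function name and the sample value are made up for illustration.

```python
import json


def extract_app_url(response_body: str) -> str:
    """Return the app_url field from the upload response, or raise if it is missing."""
    data = json.loads(response_body)
    app_url = data.get("app_url", "")
    if not app_url.startswith("bs://"):
        raise ValueError(f"No usable app_url in response: {response_body!r}")
    return app_url


if __name__ == "__main__":
    sample = '{"app_url": "bs://example-app-id"}'  # made-up sample response
    print(extract_app_url(sample))
```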

Now click on Start Session to open the inspection window.

You can select an element in the app or from the App Source section. Its attributes, including ID, Name, Text, etc., are displayed on the right side under the Selected Element section, and you can create XPaths using those attributes.
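To show how an attribute from the Selected Element section turns into an XPath, here is a minimal sketch. The helper name xpath_by_attribute is hypothetical; the resulting strings have the same shape as the locators used in the page files later in this blog.

```python
def xpath_by_attribute(element_type: str, attribute: str, value: str) -> str:
    """Build an XPath that selects elements of a type by one attribute value."""
    if "'" in value:
        # Keep the sketch simple: values containing a single quote need extra escaping.
        raise ValueError("value must not contain a single quote")
    return f"//{element_type}[@{attribute}='{value}']"


if __name__ == "__main__":
    print(xpath_by_attribute("XCUIElementTypeButton", "name", "4"))
    print(xpath_by_attribute("android.widget.Button", "content-desc", "plus"))
```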

Developing Framework

  1. Python: visit https://www.python.org/downloads/ to download and install Python on your system if it is not already there.
  2. Install Selenium and Behave using:

pip install selenium

pip install behave

For more details please visit: https://pypi.org/project/behave/  &  https://pypi.org/project/selenium/

  3. PyCharm IDE (Professional or Community): https://www.jetbrains.com/pycharm/download/
  4. Install Appium-Python-Client:

pip install Appium-Python-Client

For more details please visit: https://pypi.org/project/Appium-Python-Client/

  5. Install Allure for report generation using:

pip install allure-behave

For more details please visit: https://pypi.org/project/allure-behave/

  6. We can also install all the required packages from the requirement.txt file using the command below.

pip install -r requirement.txt
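After installing, a quick check can confirm that the packages are present in the active environment. This is an optional sketch using only the standard library; the function name and the REQUIRED list are assumptions based on the packages named above.

```python
from importlib import metadata
from typing import Iterable, List


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the distribution names that are not installed in this environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Packages installed in the prerequisite steps above.
REQUIRED = ["selenium", "behave", "Appium-Python-Client", "allure-behave"]

if __name__ == "__main__":
    print("Missing packages:", missing_packages(REQUIRED) or "none")
```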

Framework Structure Overview

Here is the overview of our mobile test automation framework using Appium in Python.

You have to follow the 7 steps below to build a robust mobile test automation framework using Appium.

Step 1:

Create a project in PyCharm (here I am using PyCharm Professional) and install the packages mentioned in the prerequisites.

Step 2:

In this step, we create a Features folder in which we keep our feature files for different scenarios. Every step in a feature file describes an action we are going to perform on the UI. A feature file holds your test cases in the form of scenarios and scenario outlines; in this framework, we are using a scenario. Both scenarios and scenario outlines contain steps that are easy for non-technical people to understand. We tag the feature files, and we can also tag individual scenarios in a file, depending on our test cases. Note that the feature file must end with a .feature extension.

@iOS
#@android
Feature: Simple Calculator
  Addition of two numbers

  Scenario: Verify addition of two numbers
    Given I am on calculator home page
    When I enter '4'
    And I enter operator of addition
    And Enter number '2'
    And I enter operator '='
    Then I see result as '6'

Step 3:

After creating the feature file, create a step file. Both feature files and step files are essential parts of the BDD framework. The steps with ‘When’ relate to user actions such as navigation, clicking a button, and filling in input boxes. The steps with ‘Then’ relate to verifications or assertions. Here we are targeting both iOS and Android, so the step file should look like this. We create only one step file for both iOS and Android.

Purpose of the step file: it connects the steps from the feature file to the page file, where the actual implementation lives.

from behave import given, when, then, step, use_step_matcher

use_step_matcher("parse")


def is_ios(context):
    # before_feature stores str(feature.tags) in userdata, so a feature
    # tagged only with @iOS yields the string "['iOS']".
    device_type = context.config.userdata["deviceType"]
    print("Device type " + device_type)
    return device_type == "['iOS']"


@given("I am on calculator home page")
def step_impl(context):
    print("User is on Homepage")


@when("I enter '{number}'")
def step_impl(context, number):
    if is_ios(context):
        context.iOS_cal.iOS_tap_number1(number)
    else:
        # The Android page object taps a fixed button and ignores `number`.
        context.android_cal.tap_number1()


@step("I enter operator of addition")
def step_impl(context):
    if is_ios(context):
        context.iOS_cal.iOS_tap_operator()
    else:
        context.android_cal.tap_operator()


@step("Enter number '{number}'")
def step_impl(context, number):
    if is_ios(context):
        context.iOS_cal.iOS_tap_number1(number)
    else:
        context.android_cal.tap_number2()


@step("I enter operator '{operator}'")
def step_impl(context, operator):
    if is_ios(context):
        context.iOS_cal.iOS_equals(operator)
    else:
        context.android_cal.equals()


@then("I see result as '{result}'")
def step_impl(context, result):
    # Both page objects only check that the result element is displayed;
    # the expected value in `result` is not compared here.
    if is_ios(context):
        assert context.iOS_cal.iOS_verify_result()
    else:
        assert context.android_cal.verify_result()
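The '{number}' and '{operator}' placeholders in the decorators are what let one step definition serve many feature-file lines. The sketch below imitates that matching with a regular expression so you can see which values end up in the function arguments. It is an illustration only, not how behave implements its "parse" matcher, and match_step is a hypothetical name.

```python
import re
from typing import Dict, Optional


def match_step(pattern: str, step_text: str) -> Optional[Dict[str, str]]:
    """Match step_text against a pattern with {name} placeholders."""
    regex = ""
    # re.split with a capturing group alternates literal text and placeholder names.
    for index, part in enumerate(re.split(r"\{(\w+)\}", pattern)):
        if index % 2:
            regex += f"(?P<{part}>.+?)"   # placeholder: capture any text
        else:
            regex += re.escape(part)      # literal text: must match exactly
    found = re.fullmatch(regex, step_text)
    return found.groupdict() if found else None


if __name__ == "__main__":
    print(match_step("I enter '{number}'", "I enter '4'"))
    print(match_step("I enter '{number}'", "I enter operator '='"))
```

The second call returns None because the literal text before the placeholder does not match, which is why “I enter operator '='” is routed to its own step definition.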

Step 4:

In this step, we create two page files, one for iOS and one for Android, that contain all the locators and the action methods that perform a particular action on a mobile element. We add all the locators at the class level and use them in the respective methods.

iOS page file:

import time

from appium.webdriver.common.mobileby import MobileBy
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC

from Features.Pages.BasePage import Basepage


# Note: newer Appium-Python-Client releases provide AppiumBy in place of MobileBy.
class iOS_Calculator_Page(Basepage):
    def __init__(self, context):
        Basepage.__init__(self, context.driver)
        self.context = context

        # Locators used by the methods below.
        self.add_operator = "//XCUIElementTypeStaticText[@name='+']"
        self.result = "(//XCUIElementTypeStaticText)[1]"

    def iOS_tap_number1(self, number):
        # Tap the button whose name matches the digit passed in from the step.
        time.sleep(2)
        tap_on = self.wait.until(
            EC.presence_of_element_located((MobileBy.XPATH, "//XCUIElementTypeButton[@name='" + number + "']")))
        tap_on.click()

    def iOS_tap_operator(self):
        time.sleep(2)
        tap_on = self.wait.until(
            EC.presence_of_element_located((MobileBy.XPATH, self.add_operator)))
        tap_on.click()

    def iOS_equals(self, operator):
        time.sleep(2)
        tap_on = self.wait.until(
            EC.presence_of_element_located((MobileBy.XPATH, "//XCUIElementTypeStaticText[@name='" + operator + "']")))
        tap_on.click()

    def iOS_verify_result(self):
        time.sleep(5)
        try:
            # wait.until raises TimeoutException if the element never appears.
            return self.wait.until(EC.presence_of_element_located(
                (MobileBy.XPATH, self.result))).is_displayed()
        except TimeoutException:
            return False

Android page file:

import time

from appium.webdriver.common.mobileby import MobileBy
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support import expected_conditions as EC

from Features.Pages.BasePage import Basepage


# Note: newer Appium-Python-Client releases provide AppiumBy in place of MobileBy.
class android_Calculator_Page(Basepage):
    def __init__(self, context):
        Basepage.__init__(self, context.driver)
        self.context = context

        # Locators used by the methods below.
        self.number1 = "(//android.widget.Button)[5]"
        self.add = "//android.widget.Button[@content-desc='plus']"
        self.number2 = "(//android.widget.Button)[9]"
        self.operator_equals = "(//android.widget.Button)[13]"
        self.verify = "(//android.widget.TextView)[2]"

    def tap_number1(self):
        time.sleep(2)
        tap_on = self.wait.until(
            EC.presence_of_element_located((MobileBy.XPATH, self.number1)))
        tap_on.click()

    def tap_operator(self):
        time.sleep(2)
        tap_on = self.wait.until(
            EC.presence_of_element_located((MobileBy.XPATH, self.add)))
        tap_on.click()

    def tap_number2(self):
        time.sleep(2)
        tap_on = self.wait.until(
            EC.presence_of_element_located((MobileBy.XPATH, self.number2)))
        tap_on.click()

    def equals(self):
        time.sleep(2)
        tap_on = self.wait.until(
            EC.presence_of_element_located((MobileBy.XPATH, self.operator_equals)))
        tap_on.click()

    def verify_result(self):
        try:
            # wait.until raises TimeoutException if the element never appears.
            return self.wait.until(EC.presence_of_element_located(
                (MobileBy.XPATH, self.verify))).is_displayed()
        except TimeoutException:
            return False

Base Page File:

The next one is the base page file. We create a base page file to hold the driver object so that we can easily use it in our page files and environment file. This is also the place for helper methods that are used frequently in our code, such as click() or send_keys().

from selenium.webdriver.support.ui import WebDriverWait
# In the base page we are creating an object of the driver.
# We are using this driver in the other pages and environment page.
class Basepage(object):
   def __init__(self, driver):
       self.driver = driver
       self.wait = WebDriverWait(self.driver, 60)
       self.implicit_wait = 25
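The self.wait object created in the base page is what every page method calls with wait.until(...). Conceptually, an explicit wait polls a condition until it returns something truthy or the timeout runs out. The sketch below shows that idea using only the standard library. It is not Selenium's implementation: wait_until is a hypothetical helper, and it raises Python's built-in TimeoutError, whereas WebDriverWait raises Selenium's own TimeoutException.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(condition: Callable[[], T], timeout: float = 60, poll: float = 0.5) -> T:
    """Call condition repeatedly until it returns a truthy value or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(poll)
```

Because the wait returns as soon as the condition is met, a generous timeout such as the 60 seconds used above only costs time when an element is genuinely missing, unlike a fixed time.sleep().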

Step 5:

Environment file (i.e. Hooks file).

This file contains hooks that run before and after each scenario to start and close the driver session. If you want, you can also add after-step hooks that capture screenshots for reporting. We have added a method that captures a screenshot after every step and attaches it to the Allure report. We have also added a before-feature hook.

In the feature file, we have given tags (@iOS and @android) before the feature.

  • def before_feature hook: This checks for which device type (iOS or Android) we are executing the code.
  • def before_scenario hook: We check the execution mode and, within that, add device type conditions for iOS and Android.

Here we are using “context.config.userdata[]”, which reads data from the behave.ini file.

Added condition for iOS and Android:

import json

from appium import webdriver
from allure_commons.types import AttachmentType
from allure_commons._allure import attach

from Features.Pages.BasePage import Basepage
from Features.Pages.android_Calculator_Page import android_Calculator_Page
from Features.Pages.iOS_Calculator_Page import iOS_Calculator_Page

with open("Features/Resources/config.json") as config_file:
    data = json.load(config_file)


def before_feature(context, feature):
    # Store the feature tags (for example "['iOS']") as the device type.
    tags = str(feature.tags)
    print("Tags " + tags)
    context.config.userdata["deviceType"] = tags
    print("Device Type :" + context.config.userdata["deviceType"])


def before_scenario(context, scenario):
    userdata = context.config.userdata
    if userdata["executionMode"] == "Browserstack":
        hub_url = ('https://' + userdata["userName"] + ':' + userdata["accessKey"]
                   + '@hub-cloud.browserstack.com/wd/hub')
        # desired_capabilities matches the client version used in this blog;
        # newer Appium-Python-Client releases expect an options object instead.
        if userdata["deviceType"] == "['iOS']":
            print(userdata["deviceType"])
            context.driver = webdriver.Remote(
                command_executor=hub_url,
                desired_capabilities={
                    "platformName": "iOS",
                    "build": userdata["iOS_browserstack_build"],
                    "deviceName": userdata["iOS_browserstack_device"],
                    "os_version": userdata["iOS_device_os_version"],
                    "app": userdata["iOS_browserstack_appUrl"],
                }
            )
        else:
            context.driver = webdriver.Remote(
                command_executor=hub_url,
                desired_capabilities={
                    "platformName": "android",
                    "build": userdata["android_browserstack_build"],
                    "deviceName": userdata["android_browserstack_device"],
                    "os_version": userdata["android_device_os_version"],
                    "app": userdata["android_browserstack_appUrl"],
                }
            )
    else:
        print("...")
    context.driver.switch_to.context('NATIVE_APP')
    baseobject = Basepage(context.driver)
    context.android_cal = android_Calculator_Page(baseobject)
    context.iOS_cal = iOS_Calculator_Page(baseobject)
    context.stepid = 1


def after_step(context, step):
    # Attach a screenshot of every step to the Allure report.
    attach(context.driver.get_screenshot_as_png(), name=str(context.stepid), attachment_type=AttachmentType.PNG)
    context.stepid = context.stepid + 1


def after_scenario(context, scenario):
    context.driver.reset()
    context.driver.quit()
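The hooks above decide the platform by comparing the text "['iOS']", which only works while the feature carries exactly one tag. A more tolerant alternative is to look for the tag inside the tag list. The function below is a hedged sketch of that idea: device_type_from_tags is a hypothetical helper that is not part of the framework in the repository, and it is shown only to illustrate the design choice.

```python
from typing import Iterable


def device_type_from_tags(tags: Iterable[str]) -> str:
    """Return "iOS" or "android" depending on which tag is present."""
    normalized = {str(tag).lower() for tag in tags}
    if "ios" in normalized:
        return "iOS"
    if "android" in normalized:
        return "android"
    raise ValueError(f"Feature needs an @iOS or @android tag, got: {sorted(normalized)}")


if __name__ == "__main__":
    print(device_type_from_tags(["smoke", "iOS"]))
    print(device_type_from_tags(["android"]))
```

In a before_feature hook this could be called as device_type_from_tags(feature.tags), so adding a second tag such as @smoke would no longer change the detected platform.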

Step 6:

INI files are configuration files, originally used by Windows, that initialize program settings. Their main role is to set values for parameters and configuration data required at startup or used by setup installers.

The behave configuration file should begin with the [behave] section and follow the Windows INI format.

Here we provide the BrowserStack capabilities.

[behave]
color=False
show_snippets=True
show_skipped=True
dry_run=False
show_source=True
show_timings=True
stdout_capture=True
stderr_capture=True
log_capture=True
default_format=pretty
[behave.formatters]
allure=allure_behave.formatter:AllureFormatter
[behave.userdata]
executionMode=Browserstack
userName=<your BrowserStack username>
accessKey=<your BrowserStack access key>
iOS_browserstack_appUrl=bs://f3b6763a19d710fc72fdc615db7119ea654e06da
iOS_browserstack_build=0.0.0
iOS_browserstack_device=iPhone 11
iOS_device_os_version=15.4
iOS_deviceType = iOS

android_browserstack_appUrl=bs://495dd3b36cae77a1ae2f4ce6f439456999e0bb45
android_browserstack_build=0.0.1
android_browserstack_device=Google Pixel 3
android_device_os_version=9.0
android_deviceType = Android

Copy the userName and accessKey from your BrowserStack account. iOS_browserstack_appUrl is the app_url of the .ipa file uploaded through the curl command, and android_browserstack_appUrl is the app_url of the .apk file uploaded through the curl command.
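The keys in [behave.userdata] follow a naming pattern (iOS_... and android_...), so the capabilities for either platform can be assembled from a prefix. The sketch below reads a small sample with the standard library's configparser to show the idea. The function names and sample values are assumptions; inside behave hooks you would pass context.config.userdata directly instead of parsing the file yourself.

```python
import configparser
from typing import Dict, Mapping

SAMPLE_INI = """
[behave.userdata]
iOS_browserstack_appUrl=bs://example-ios-app
iOS_browserstack_build=0.0.0
iOS_browserstack_device=iPhone 11
iOS_device_os_version=15.4
"""


def read_userdata(ini_text: str) -> Dict[str, str]:
    """Parse the [behave.userdata] section into a plain dictionary."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep keys case-sensitive (the default lowercases them)
    parser.read_string(ini_text)
    return dict(parser["behave.userdata"])


def capabilities_for(userdata: Mapping[str, str], prefix: str) -> Dict[str, str]:
    """Build the capability dictionary for one platform prefix ("iOS" or "android")."""
    return {
        "platformName": prefix,
        "build": userdata[f"{prefix}_browserstack_build"],
        "deviceName": userdata[f"{prefix}_browserstack_device"],
        "os_version": userdata[f"{prefix}_device_os_version"],
        "app": userdata[f"{prefix}_browserstack_appUrl"],
    }


if __name__ == "__main__":
    print(capabilities_for(read_userdata(SAMPLE_INI), "iOS"))
```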

Congratulations, we have finally created our own Python Appium Behave BDD framework.

Step 7:

As I mentioned earlier, we will be using Allure for reporting the test results. Run the command below in the terminal and it will generate the result folder for you.

  • behave Features/Calculator.feature -f allure_behave.formatter:AllureFormatter -o Report_Json

To convert the JSON files into a readable HTML report, use the command below.

  • allure generate Report_Json -o Report_Html --clean

I have added this framework to the following Git Repository.

https://github.com/spurqlabs/PythonAppiumMobileFramework

Conclusion: 

Creating a robust mobile testing framework using Appium is important, and although it can feel like a tedious task, with the right guidelines anyone can create one. This framework helps improve the quality and efficiency of the testing process. I hope this blog helps you create a robust mobile testing framework using Appium. We chose the Behave framework over other existing frameworks because it is easy to adopt and easy for end users to understand.

How to implement Page Object Model (POM) using C# with Selenium

Introduction:

Selenium is an open-source web UI automation testing suite. It supports automation across different browsers, platforms, and programming languages, including Java, Python, C#, Ruby, PHP, and Perl, for developing automated tests. Selenium can be easily deployed on Windows, Linux, Solaris, and Macintosh operating systems. It also provides support for mobile platforms such as iOS, Windows Mobile, and Android.

Selenium provides bindings specific to each language. Selenium WebDriver is mostly used with Java and C#.

Test scripts can be written in any of the supported programming languages and can be run directly in most modern web browsers, including Internet Explorer, Microsoft Edge, Mozilla Firefox, Google Chrome, and Safari.

Furthermore, C# is an object-oriented programming language influenced by C++ and Java.
It supports the development of console, Windows, and web-based applications using the Visual Studio IDE on the .NET platform.

With Selenium C#, there is a wide variety of automation frameworks that can be used for automated browser testing. Each framework has its own advantages and disadvantages, and teams choose one based on their requirements, compatibility, and the kind of solution they prefer. These are the most popular Selenium C# frameworks used for test automation.

NUnit:

It is an open-source unit testing tool for the .NET Framework that was initially ported from JUnit. NUnit was released in the early 2000s; although the initial NUnit was ported from JUnit, the more recent version 3 was completely rewritten from scratch.

To run an NUnit test we need to add attributes to our methods. For example, the [Test] attribute marks a test method. Below are the NuGet packages required by NUnit

NUnit
NUnit3TestAdapter
Microsoft.NET.Test.Sdk 

XUnit:

xUnit is a unit testing tool for the .NET Framework that was released in 2007 as an alternative to NUnit. xUnit also uses attributes to mark tests, but they differ from NUnit's: the [Fact] and [Theory] attributes play a role similar to [Test].

Below are the NuGet packages required by xUnit

xunit
xunit.runner.visualstudio
Microsoft.NET.Test.Sdk

MSTest:

MSTest is a unit testing framework developed by Microsoft that ships with Visual Studio. Microsoft made version 2 open source, and it can easily be downloaded. MSTest provides a wide range of attributes, similar to NUnit, along with parallel run support at the class and method level.

Prerequisite:

To get started with Selenium C# and the Page Object Model framework, first, we need to have the following things installed.

1) IDE: Download and install any IDE of your choice.

  •  Here we are using Microsoft Visual Studio 2022
  •  After downloading the Visual Studio Installer, select the .NET desktop development option and then click on Install.
  •  Now let the Visual Studio Installer download the packages and perform the installation.
  •  Install the latest version of the .NET Framework on your machine.

2) Create New Project: After the installation is over, begin using Visual Studio.

  •  Select the Create a new project option, then select the xUnit Test Project option for C#.

3) Selenium WebDriver for Chrome Browser: You must also install Selenium’s WebDriver for the Chrome browser.

  •  In Visual Studio navigate to Tools -> NuGet Package Manager -> Manage NuGet Packages for Solution.
  • In the Search Bar, enter the name of the package you want to install (e.g. Selenium.WebDriver).
  • Check the Project checkbox, and click on Install.
  • In the dialogue box asking you to accept the licences, click on the Accept button.
  • This will start the installation process and install the Selenium WebDriver.

Selenium.WebDriver

This package contains the .NET bindings for the concise and object-based Selenium WebDriver API, which uses native OS-level events to manipulate the browser.

Selenium.Chrome.WebDriver (chrome driver exe)

This NuGet package installs Chrome Driver (Win32) for Selenium WebDriver in your xUnit Test Project.

Once Visual Studio has finished installing the Selenium WebDriver, it shows a message in the output window. With all dependencies set up, Visual Studio is ready for work.

Note: We will be using the demo testing website (https://www.calculator.net/) and trying to achieve the addition and subtraction operations for our automation test.

Writing the First Selenium C# Test:

Install the WebDriverManager package from Tools -> NuGet Package Manager -> Manage NuGet Packages for Solution.

WebDriverManager is an open-source library, originally written for Java and available to .NET projects as a NuGet package, that automates the management of the driver executables required by Selenium WebDriver by performing four steps (find, download, setup, and maintenance). Here are some benefits of WebDriverManager in Selenium:

  • WebDriverManager automates the management of WebDriver binaries, so you avoid installing any driver binaries manually.
  • WebDriverManager checks the version of the browser installed on your machine and downloads the proper driver binaries into the local cache (~/.cache/selenium by default) if not already present.
  • WebDriverManager matches the version of the drivers. If unknown, it uses the latest version of the driver.
  • WebDriverManager offers cross-browser testing without the hassle of installing and maintaining different browser driver binaries.

In the UnitTest1 file, the final code looks like this:

using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using WebDriverManager.DriverConfigs.Impl;
using Xunit;

public class UnitTest1
{
    IWebDriver driver;
    CalculatorPage calc_page;

    // Downloads a matching ChromeDriver, starts Chrome, and creates the page object.
    public void Initialize_driver()
    {
        new WebDriverManager.DriverManager().SetUpDriver(new ChromeConfig());
        driver = new ChromeDriver();
        calc_page = new CalculatorPage();
    }

    // Quit closes every window and ends the driver session.
    public void Close_driver()
    {
        driver.Quit();
    }

    [Fact]
    public void Add()
    {
        Initialize_driver();
        calc_page.Initialize(driver);
        string actualresult = calc_page.calculate("14", "+", "5");
        Assert.Equal("19", actualresult);
        Close_driver();
    }

    [Fact]
    public void Subtract()
    {
        Initialize_driver();
        calc_page.Initialize(driver);
        string actualresult = calc_page.calculate("24", "-", "5");
        Assert.Equal("19", actualresult);
        Close_driver();
    }
}

Now just build your code by right-clicking the project xUnitTestProject1 or by pressing Ctrl + Shift + B and you will be able to see your test in “Test Explorer”.

After following the above procedure, run the test case. But this code will not execute unless the ChromeDriver binary for Selenium has been downloaded and is available on the system.

When developing a scalable and robust automation framework, it is important to consider the following challenges:

  1. Keeping up with UI changes: The primary goal of automated UI web tests is to validate the functionality of web page elements. However, the UI is subject to constant evolution, leading to changes in web locators. These frequent changes in web locators pose a challenge to code maintenance.
  2. Code maintenance: With the ever-changing UI, it is crucial to maintain the automation codebase effectively. Failing to update Selenium test automation scripts to reflect changes in web locators can result in test failures. Proper maintenance is essential to ensure the longevity and reliability of the test scripts.
  3. Test failure due to lack of maintenance: Inadequate maintenance of automation scripts can lead to scenarios where tests fail. One common cause is a change in web locators. If the Selenium test automation scripts are not updated accordingly, it can cause a significant number of tests to fail, impacting the overall test suite’s reliability.

So to address this, restructure the Selenium test automation scripts for increased modularity and reduced code duplication.

Utilizing the Page Object Model (POM) design pattern achieves code restructuring and minimizes the effort required for test code maintenance.

Now, let’s delve into a comprehensive overview of the Page Object Model, including the implementation and effective maintenance of your Selenium test automation scripts.

Why do we need Page Object Model in Selenium C#?

Selenium test automation scripts become more complex as web applications add more features and web pages. With every new page, new test scenarios are added to the Selenium test automation scripts. As the number of lines of code grows, maintaining it can become tedious and time-consuming. The repeated use of the same web locators and their test methods can also make the test code difficult to read.

Instead of spending time updating the same set of locators in multiple Selenium test automation scripts, a design pattern such as the Page Object Model can be used to develop and maintain code.    

What is Page Object Model In Selenium C#?

Page Object Model is a design pattern widely used by the Selenium community for automation tests. Each web page (or each significant one) is represented by a separate class, and that class acts as a central object repository for the controls on the page.

  • Each page object (or page class) contains the elements of the corresponding web page along with the methods needed to access those elements.
  • Thus it is a layer between the test scripts and the UI, and it encapsulates the features of the page.
  • The Selenium test automation scripts do not interact directly with web elements on the page. Instead, a new layer (i.e. the page class/page object) resides between the test code and the UI of the web page.
  • Hence, Selenium test automation implementation that uses the Page Object Model in Selenium C# will constitute different classes for each web page thereby making code maintenance easier.
  • In complex test automation scenarios, automation scripts based on Page Object Model can have several page classes (or page objects). It is recommended that you follow a common nomenclature while coming up with file names (representing page objects) as well as the methods used in the corresponding classes. For example, if automation for a login page & dashboard page is to be performed, our implementation will have a class each for login & dashboard. The controls for the login page are in the ‘login page’ class and the controls for the dashboard page are in the ‘dashboard page’ class.
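The login and dashboard example above can be sketched as two page classes. This is only an illustration: the URL, the element IDs, and the class and method names (LoginPage, DashboardPage, LoginAs, GetHeading) are hypothetical placeholders, not part of the calculator project.

```csharp
using OpenQA.Selenium;

// Hypothetical page object for a login page; URL and element IDs are placeholders.
public class LoginPage
{
    private readonly IWebDriver driver;

    // Locators live in one place, so a UI change means editing only this class.
    private static readonly By UsernameInput = By.Id("username");
    private static readonly By PasswordInput = By.Id("password");
    private static readonly By LoginButton = By.Id("login");

    public LoginPage(IWebDriver driver)
    {
        this.driver = driver;
    }

    public void Open()
    {
        driver.Navigate().GoToUrl("https://example.com/login");
    }

    // Fills in the form and returns the page object for the page that follows.
    public DashboardPage LoginAs(string username, string password)
    {
        driver.FindElement(UsernameInput).SendKeys(username);
        driver.FindElement(PasswordInput).SendKeys(password);
        driver.FindElement(LoginButton).Click();
        return new DashboardPage(driver);
    }
}

// Hypothetical page object for the dashboard shown after login.
public class DashboardPage
{
    private readonly IWebDriver driver;
    private static readonly By Heading = By.TagName("h1");

    public DashboardPage(IWebDriver driver)
    {
        this.driver = driver;
    }

    public string GetHeading()
    {
        return driver.FindElement(Heading).Text.Trim();
    }
}
```

A test would then read as a sequence of page actions, for example new LoginPage(driver).LoginAs("user", "secret").GetHeading(), without mentioning any locator.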


How to Use Page Object Model:

We will now implement the Page Object Model for the use case we considered above i.e. trying to achieve the addition and subtraction operations for our automation test on the Calculator page.
Create a class file, CalculatePage.cs, for the Calculator page operations. This page class contains the locator information for the elements on that page. We also define the methods for that page in the CalculatePage.cs class and call them from UnitTest1.cs.


We initialize the ChromeDriver object in the initialize_driver() method in UnitTest1.cs. The same method also creates the instance of CalculatePage.
CalculatePage.cs contains an instance of IWebDriver and the following methods:

Initialize(): this method takes an IWebDriver object as its input parameter and assigns it to the class’s own IWebDriver field. It then uses this driver to open the required web page.

Calculate(): this method performs the calculation on two numbers, either addition or subtraction. It takes three input parameters: the first number, the operator (‘+’ or ‘-’), and the second number. It locates the required elements on the page, clicks them in that order, and returns the displayed result.
The final code of CalculatePage.cs looks like this:

using OpenQA.Selenium;

public class CalculatePage
{
    private IWebDriver driver;

    // Stores the driver supplied by the test and opens the calculator page.
    public void Initialize(IWebDriver driver)
    {
        this.driver = driver;
        driver.Navigate().GoToUrl("https://www.calculator.net/");
    }

    // Enters no1, clicks the operator, enters no2, and returns the displayed result.
    public string Calculate(string no1, string op, string no2)
    {
        ClickDigits(no1);

        // Operator buttons carry a quoted argument, e.g. onclick="r('+')".
        IWebElement opElement = driver.FindElement(By.XPath("//span[@onclick=\"r('" + op + "')\"]"));
        opElement.Click();

        ClickDigits(no2);

        IWebElement result = driver.FindElement(By.Id("sciOutPut"));
        return result.Text.Trim();
    }

    // Clicks the on-screen button for each digit of the given number.
    private void ClickDigits(string number)
    {
        foreach (char digit in number)
        {
            driver.FindElement(By.XPath("//span[@onclick='r(" + digit + ")']")).Click();
        }
    }
}

Advantages of Page Object Model in Selenium C#:

Page Object Model is a widely used design pattern nowadays. It reduces code duplication, enhances code readability, and improves maintainability by emphasizing reusability and extensibility.
Furthermore, below are some of the major advantages of using the Page Object Model in Selenium C#.

Better Maintenance – With separate page objects (or page classes) for different web pages, changes in functionality or web locators have less impact on the test scripts. This makes the code cleaner and more maintainable, as the Selenium test automation implementation is spread across separate page classes.

Minimal Changes Due To UI Updates – The effect of changes in the web locators will only be limited to the page classes, created for automated browser testing of those web pages. This reduces the overall effort spent in changing test scripts due to frequent UI updates.

Reusability – The page object methods defined in different page classes can be reused across Selenium test automation scripts. This, in turn, reduces the overall code size.

Simplification – Another important benefit of this design pattern is that it makes the functionality and model of each web page easier to visualize, because both are located in that page’s own class.

Execution:

Navigate to Test -> Run All Tests.
This will launch the test explorer in Visual Studio and will run our test. 

You can also run the tests from the command prompt or from Visual Studio’s terminal (Developer Command Prompt) with the following command:

dotnet test

The dotnet test command runs the tests in the project in the current directory. It builds the solution and runs a test host application for each test project in the solution. You can also apply filters to run only some of the tests, for example tests with particular tags, tests from specific projects, or tests with particular names.
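As a small sketch of tag-based filtering, xUnit tests can be tagged with the Trait attribute and then selected with the --filter option of dotnet test. The class name, the test names, and the “Category”/“Smoke” trait values below are assumptions chosen for illustration, and the test bodies are placeholders rather than real browser tests.

```csharp
using Xunit;

public class FilterExampleTests
{
    // Selected by: dotnet test --filter "Category=Smoke"
    [Fact]
    [Trait("Category", "Smoke")]
    public void Add_TaggedAsSmoke()
    {
        Assert.Equal(19, 14 + 5); // placeholder body
    }

    // Selected by name with: dotnet test --filter "FullyQualifiedName~Subtract"
    [Fact]
    public void Subtract_NotTagged()
    {
        Assert.Equal(19, 24 - 5); // placeholder body
    }
}
```

The ~ operator matches tests whose fully qualified name contains the given text, while = requires an exact match on the trait value.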

You can find this framework in the following Git Repository.

spurqlabs/CSharp-Selenium-Page-Object-Model (github.com)

Conclusion:

Implementing the Page Object Model in Selenium with C# provides a structured approach to automation testing, making the code more maintainable and reusable. It simplifies the handling of UI changes and enhances the overall efficiency of the testing process for large-scale applications.
