iOS App Automation on macOS – Configuring a macOS system to test on real iOS devices for mobile app automation is a lengthy and complicated process. The steps are tricky, and testers often struggle with them.
The steps include installing the right software, setting environment variables, configuring settings on the device, and connecting the device properly. A mistake at any stage can cause configuration issues and slow down the testing process.
In this blog, we’ll make this process easy for you to follow.
We’ll walk you through each step, from installing necessary software like the Java Development Kit (JDK) and Xcode, to setting up your iOS device for iOS app automation on macOS.
We’ll also show you how to install Appium, configure it correctly, and use tools like Appium Inspector to interact with your app.
By following this simple guide, you’ll be ready to test your mobile apps on real devices quickly and efficiently for iOS app automation on macOS.
What is Appium Testing in iOS App Automation on macOS?
Appium is a freely distributed open-source automation tool used for testing mobile applications. It allows testers to automate native, hybrid, and mobile web applications on iOS and Android platforms using the WebDriver protocol.
Appium provides a unified API (Application Programming Interface) that allows you to write tests using your preferred programming language (such as Java, C#, Python, or JavaScript) and test framework. It supports a wide range of automation capabilities and handles various types of mobile elements. Appium enables cross-platform testing, where the same tests can be executed on multiple devices, operating systems, and versions, providing flexibility and scalability in mobile app testing. Appium has no dependency on the mobile device OS, because it acts as a wrapper that translates Selenium WebDriver commands into XCUITest (iOS) or UiAutomator2 (Android) commands depending on the device type.
Prerequisites for iOS Automation Setup on macOS
JDK Installation
Install npm and Node.js
Install Appium Server & Appium Inspector
Setting up environment variables
Xcode installation and setup
Install XCUITest Driver
Install WebDriverAgent
Real Device Settings
Get the device-identifier or udid of real device
Configure the desired capabilities of Appium Inspector
2. Install npm and Node.js: Download the Node.js pre-built installer for macOS from https://nodejs.org/en/download and install it. If Node.js is already installed on the system, confirm it using the following commands in the terminal.
To see if Node is installed, type in node -v .
To see if NPM is installed, type in npm -v
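If you would rather script this check, the two commands above can be wrapped in a few lines of Python. This is only a sketch: the helper name tool_version is made up for illustration, and it assumes that node and npm would be on your PATH if they were installed.

```python
import shutil
import subprocess
from typing import Optional


def tool_version(command: str) -> Optional[str]:
    """Return the output of `<command> -v`, or None if the tool is not on PATH."""
    path = shutil.which(command)
    if path is None:
        return None
    result = subprocess.run([path, "-v"], capture_output=True, text=True, check=False)
    # Some tools print their version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()


if __name__ == "__main__":
    for tool in ("node", "npm"):
        version = tool_version(tool)
        print(f"{tool}: {version if version else 'not installed'}")
```

A result of "not installed" for either tool means the Node.js installer still needs to be run before continuing.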
3. Install Appium Server & Appium Inspector: To install Appium via the terminal, run the command below:
npm install -g appium
If you face permission problems, run the command using sudo. Download Appium Inspector for Mac from https://github.com/appium/appium-inspector/releases by clicking the .dmg file.
4. Setting up environment variables: Add the settings below to the .profile file. Open a terminal and type the following command:
nano ~/.profile
Then, paste the below commands: (Change your username!).
5. Xcode installation and setup: Open Xcode -> Preferences -> Accounts -> Add Apple ID
6. Install XCUITest Driver: Next, install the XCUITest driver, which allows Appium to interact with the UI elements of iOS apps during automated testing. Use the following command in the terminal to install it:
appium driver install xcuitest
7. Install WebDriverAgent: Open Terminal & Run the following command:
cd /Applications/Appium-Server-GUI.app/Contents/Resources/app/node_modules/appium/node_modules/appium-xcuitestdriver/appium-webdriveragent
Then create the bundle directory inside the Resources folder:
mkdir -p Resources/WebDriverAgent.bundle
How to setup WebDriverAgent on Mac for iOS App Automation
WebDriverAgent is a WebDriver server implementation for iOS that can be used to remotely control iOS devices. We need to add an account to Xcode (you can use your Apple ID or create a new one).
To do that, go to Xcode -> Preferences -> Accounts
Once you are signed in, your account will appear on the left.
We need to add a signing certificate to this account for iOS automation:
1. Click on Download Manual Profiles 2. Click on Manage Certificates -> Plus icon -> Apple Development. Once this is done, you will see a new certificate added to the list.
Open the WebDriverAgent.xcodeproj project in Xcode
To find it, use the path: /Applications/Appium-Server-GUI.app/Contents/Resources/app/node_modules/appium/node_modules/appium-xcuitestdriver/appium-webdriveragent
(NOTE: If you do not see this folder, use the keyboard shortcut Shift + Command + . to display hidden files in your Macintosh HD root)
Click on the project name in the left navigation (WebDriverAgent)
For both the WebDriverAgentLib and WebDriverAgentRunner targets, go to Signing & Capabilities, select the Automatically manage signing check box, select your development team, and select your device. This should also auto-select the signing certificate.
If an error appears while changing the Bundle Identifier, change the value of the Bundle Identifier to something else that Xcode can accept.
The value for Bundle Identifier should be changed in the following places:
WebDriverAgentLib target:
From the Signing & Capabilities tab -> change value of Bundle Identifier
From Build Settings tab -> Packaging section -> change value of Product Bundle Identifier
WebDriverAgentRunner target:
From Build Settings tab -> Packaging section -> change value of Product Bundle Identifier
IntegrationApp target:
From the Signing & Capabilities tab -> change value of Bundle Identifier
From Build Settings tab -> Packaging section -> change value of Product Bundle Identifier
After changing the values of Bundle Identifier, build the WebDriverAgentLib, WebDriverAgentRunner, and IntegrationApp targets from the WebDriverAgent project in Xcode.
8. Real device settings: For real device testing, we also need to make a change on the device itself and enable Developer Mode:
Open Settings and tap Privacy & Security -> Developer Mode.
9. Get the device-identifier or udid of real device
Once the Xcode build has succeeded and Developer Mode is turned on for the device, get the UDID (device identifier) of the device connected to the Mac from Xcode, along with the bundleId.
In Xcode, go to Window -> Devices and Simulators and check the identifier and other device details, which are required to define the capabilities used to connect to the device, either programmatically or through Appium Inspector.
Get the bundleId from Xcode:
A bundle ID, also known as a CFBundleIdentifier, is a unique identifier for an app that allows the system to distinguish it from other apps.
Bundle IDs are typically written in reverse-DNS format and can only contain alphanumeric characters (A–Z, a–z, and 0–9), hyphens (-), and periods (.). They are also case-insensitive.
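The character rules above can be checked mechanically before a bundle ID is pasted into a capability. The following is a minimal sketch, not an official validator: the function names are made up for illustration, and requiring non-empty dot-separated segments is an assumption that follows from the reverse-DNS convention rather than from the rules stated above.

```python
import re

# Allowed characters per the rules above: A-Z, a-z, 0-9, hyphen, and period.
# Assumption: each dot-separated segment must be non-empty (reverse-DNS style).
_BUNDLE_ID_PATTERN = re.compile(r"[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*")


def is_valid_bundle_id(bundle_id: str) -> bool:
    """Return True if the string uses only the allowed bundle ID characters."""
    return _BUNDLE_ID_PATTERN.fullmatch(bundle_id) is not None


def same_bundle_id(first: str, second: str) -> bool:
    """Compare two bundle IDs case-insensitively."""
    return first.lower() == second.lower()


if __name__ == "__main__":
    # "com.example.MyApp" is a placeholder, not a real app.
    print(is_valid_bundle_id("com.example.MyApp"))
    print(is_valid_bundle_id("com.example.my_app"))
```

The second call is rejected because an underscore is not among the allowed characters.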
The bundleId is required to define the capabilities used to connect to the device, either programmatically or through Appium Inspector. To find it in Xcode, select the top project item in the project navigator on the left, then select TARGETS -> General. The Bundle Identifier is found under Identity.
After all the settings, you need to build the Xcode project from the terminal by running the following command from the location where the WebDriverAgent project is present (replace udid with your device identifier):
xcodebuild -project WebDriverAgent.xcodeproj -scheme WebDriverAgentRunner -destination 'id=udid' test
To go to that location, first run the command:
cd /Applications/Appium-Server-GUI.app/Contents/Resources/app/node_modules/appium/node_modules/appium-xcuitestdriver/appium-webdriveragent
Now run the command with device identifier and WebDriverAgent project location.
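When the build is launched from a script, it can help to assemble the xcodebuild arguments in one place so the device identifier is substituted consistently. The sketch below only builds the command; the function name is hypothetical, and the UDID shown is a placeholder rather than a real device.

```python
import shlex
from typing import List


def build_wda_test_command(
    udid: str,
    project: str = "WebDriverAgent.xcodeproj",
    scheme: str = "WebDriverAgentRunner",
) -> List[str]:
    """Return the xcodebuild argument list for running WebDriverAgent on a device."""
    if not udid.strip():
        raise ValueError("A device UDID is required")
    return [
        "xcodebuild",
        "-project", project,
        "-scheme", scheme,
        "-destination", f"id={udid}",
        "test",
    ]


if __name__ == "__main__":
    # Placeholder UDID; use the identifier shown in Devices and Simulators.
    command = build_wda_test_command("00008030-EXAMPLE")
    print(shlex.join(command))
```

The resulting list can be passed to subprocess.run with cwd set to the WebDriverAgent project directory.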
After the above configurations, start the Appium server from the terminal. Then start Appium Inspector and set the desired capabilities.
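Before opening Appium Inspector, it may be worth confirming that the server is actually listening. The sketch below does a plain TCP connection check; it assumes the server runs on the same machine on port 4723, and the function name is made up for illustration.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumption: Appium was started locally on port 4723.
    if is_port_open("127.0.0.1", 4723):
        print("Appium server is reachable")
    else:
        print("Nothing is listening on port 4723 yet")
```

This only shows that something accepts connections on the port; it does not verify that the listener is Appium.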
10. Configure the desired capabilities and other settings of Appium Inspector:
Appium Inspector is a tool that provides testers with a graphical user interface for inspecting and interacting with elements within mobile applications. When setting up automation with Appium for iOS devices, it’s crucial to define the desired capabilities appropriately. These capabilities act as parameters that instruct Appium on how to interact with the device and the application under test.
Open Appium Inspector, enter 0.0.0.0 as the Remote Host and 4723 as the Remote Port, and set the following parameters as desired capabilities:
deviceName: This parameter specifies the name of the testing device. It’s essential to provide an accurate device name to ensure that Appium connects to the correct device.
udid: The Unique Device Identifier (UDID) uniquely identifies the device among all others. Appium uses this identifier to target the specific device for automation. Make sure to input the correct UDID of the device you intend to automate.
platformName: The platform name is set to “iOS,” indicating that the automation targets the iOS platform.
platformVersion: This parameter denotes the iOS version running on the device.
automationName: Appium supports multiple automation frameworks, and here, “XCUITest” is specified as the automation name. XCUITest is a widely used automation framework for testing iOS apps.
bundleId: This unique identifier for an app in Xcode allows the system to distinguish it.
Once you set all the above capabilities, click the Start Session button to open the application in Appium Inspector with the specified capabilities. Your app is now ready for inspection to prepare for efficient automation testing.
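The same capabilities can be kept in code so that they are reusable when connecting programmatically. The sketch below only builds the dictionary and prints it as JSON in the shape the Inspector's JSON view expects. The function name and all device values are placeholders, and the appium: prefix on vendor-specific keys is an assumption that you are running Appium 2 with W3C capabilities.

```python
import json
from typing import Dict


def build_ios_capabilities(
    device_name: str, udid: str, platform_version: str, bundle_id: str
) -> Dict[str, str]:
    """Return the desired capabilities described above as a dictionary."""
    return {
        "platformName": "iOS",
        # Assumption: Appium 2 expects the "appium:" prefix on non-standard keys.
        "appium:automationName": "XCUITest",
        "appium:deviceName": device_name,
        "appium:udid": udid,
        "appium:platformVersion": platform_version,
        "appium:bundleId": bundle_id,
    }


if __name__ == "__main__":
    # All values below are placeholders; substitute your own device details.
    caps = build_ios_capabilities(
        device_name="iPhone",
        udid="00008030-EXAMPLE",
        platform_version="17.0",
        bundle_id="com.example.MyApp",
    )
    print(json.dumps(caps, indent=2))
```

Keeping the capabilities in one function makes it harder for the Inspector session and the test code to drift apart.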
You will also see the app open on your device’s screen.
Conclusion
Setting up Appium for testing on real iOS devices can initially seem complicated due to the numerous steps involved and the technical nuances of configuring software and environment variables. However, by following this step-by-step guide, the process becomes easy and manageable.
Having the right tools and configurations in place streamlines your testing workflow, ensuring efficient and effective testing of your mobile apps on real devices. This not only improves the quality of your apps but also enhances your overall development process.
Remember, the key to successful automation testing is meticulous setup and configuration. By taking the time to follow each step carefully, you will save yourself from potential issues down the line and make your testing process smoother.
Trupti is a Sr. SDET at SpurQLabs with overall experience of 9 years, mainly in .NET- Web Application Development and UI Test Automation, Manual testing. Having hands-on experience in testing Web applications in Selenium, Specflow and Playwright BDD with C#.
The software development field is in a constant state of innovation and change. Modern features, complex functionality, and ever-evolving user demands require a rigorous testing process to guarantee quality and reliability. Conventional testing strategies often struggle to keep pace and require a lot of maintenance. AI-powered test automation is emerging as a game-changer that is transforming the way we test software in 2024.
This approach harnesses artificial intelligence to automate repetitive, time-consuming tasks, generate intelligent test cases, and analyze vast amounts of data. The result? Broader test coverage, improved productivity, and a notable boost in software quality.
In this blog post, we’ll dig into the exciting world of AI-powered test automation and how to use AI in software testing. We’ll explore how AI is reshaping the testing process, the key advantages it offers, and what the future holds for this transformative technology. So, buckle up and get ready to find out how AI is changing software testing in 2024!
The Evolution of Test Automation (AI in Software Testing)
Back in the day, testing used to be a manual affair. Think spreadsheets, sticky notes, and lots of coffee-fueled late nights. Then came the era of scripts and automation tools. In the early days (think the 1970s!), automation was pretty basic: simple scripts that mimicked user actions, kind of like a clickbot on autopilot.
Then came the 2000s, with frameworks like Selenium on the rise. These frameworks allowed more complex testing, letting us automate web applications across different browsers. All of a sudden, repetitive tests could be run at the tap of a button. It was a game-changer!
Fast forward to today, and we’re in the age of AI and machine learning. Test automation has become more intelligent, more predictive, and far more productive. AI-powered testing tools can analyze huge amounts of data, recognize patterns, and pinpoint potential issues before they become full-blown bugs.
But it doesn’t stop there. The future of test automation looks even more exciting. We’re talking about autonomous testing, where AI not only identifies issues but also fixes them independently. Imagine a world where your testing suite is like a self-driving car, cruising through test scenarios with accuracy and agility.
So, whether you’re a seasoned QA professional or just dipping your toes into the testing waters, one thing is clear: the journey of test automation is an exciting ride, and we’re only just beginning!
Understanding AI in Software Testing and Test Automation in 2024
First off, what’s the buzz about AI? Simply put, AI brings a touch of intelligence to automation. It’s like having a smart assistant that learns and adapts to improve tasks over time. In the realm of software testing, AI is a game-changer.
Imagine this: you have a mountain of test cases to run. It’s tedious, time-consuming, and prone to human error. Enter AI-powered test automation! AI algorithms can analyze massive data sets, identify patterns, and make predictions, streamlining your testing process.
One of the coolest AI features is predictive analytics. It can foresee potential issues based on past data, helping you catch bugs before they cause chaos in production. Talk about being proactive!
Natural language processing (NLP) is another star player as it allows testers to interact with systems using human language, making test creation and execution more intuitive. Gone are the days of cryptic commands or complex scripts!
Let’s not forget about Machine Learning (ML). ML algorithms can autonomously improve test coverage by learning from test results and refining test cases. It’s like having a self-improving testing system on autopilot.
But wait, there’s more! AI can also optimize test execution by prioritizing critical tests, reducing redundant ones, and dynamically adjusting test suites based on code changes. It’s like having a super-smart QA team working tirelessly in the background.
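To make the idea of prioritizing tests based on code changes concrete, here is a deliberately simple sketch. It is a hand-written heuristic standing in for a learned model, not how any particular AI tool works: the class, the scoring formula, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class CaseHistory:
    name: str
    runs: int
    failures: int
    covered_files: FrozenSet[str]


def priority(case: CaseHistory, changed_files: Set[str]) -> float:
    """Score a test: covering a changed file adds 1.0, plus its past failure rate."""
    failure_rate = case.failures / case.runs if case.runs else 0.0
    touches_change = 1.0 if case.covered_files & changed_files else 0.0
    return touches_change + failure_rate


def prioritize(cases: List[CaseHistory], changed_files: Set[str]) -> List[CaseHistory]:
    """Return the tests ordered from highest to lowest priority."""
    return sorted(cases, key=lambda c: priority(c, changed_files), reverse=True)


if __name__ == "__main__":
    history = [
        CaseHistory("search", 10, 0, frozenset({"search.py"})),
        CaseHistory("cart", 10, 5, frozenset({"cart.py"})),
        CaseHistory("login", 10, 1, frozenset({"auth.py"})),
    ]
    for case in prioritize(history, {"auth.py"}):
        print(case.name)
```

A real AI-assisted tool would learn such weights from data instead of hard-coding them, but the input (history plus changed files) and output (an ordered suite) have the same shape.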
Benefits of AI-Powered Test Automation
Integrating AI into test automation offers several advantages to software development teams:
Improved Test Coverage and Accuracy: AI algorithms pinpoint critical test scenarios and generate test cases covering a wide range of functionality, ensuring comprehensive coverage and accurate results.
Faster Test Execution: AI-assisted testing speeds up test execution by automating repetitive tasks, freeing teams to focus on impactful testing and accelerating time-to-market.
Cost Savings and Resource Optimization: Automation reduces manual effort, leading to significant cost savings and better resource allocation.
Enhanced Scalability and Adaptability: AI-powered automation scales with project needs, handles complex scenarios, and adjusts to application changes seamlessly.
Challenges and Considerations of AI in Software Testing
Despite the compelling benefits, organizations must navigate a few challenges when adopting AI in software testing:
Initial Learning Curve: Implementing AI tools in test automation requires learning and setup, which can be a hurdle for some teams.
Data Quality: AI’s effectiveness hinges on clean, relevant training data, which underscores the importance of data quality.
Maintenance Overhead: Regular updates and maintenance of AI models are essential to keep pace with evolving software and business needs.
Ethical Considerations: AI automation raises ethical questions around data security, bias, and transparency, and these concerns need to be addressed proactively.
Best Practices for Implementing AI in Software Testing and Test Automation
To harness the full potential of AI-powered test automation, organizations should follow best practices such as:
Define Clear Objectives: Before diving into AI-powered test automation, outline your goals. What do you want to achieve? Improved test coverage, faster time to market, or better defect detection rates? Clear objectives will guide your AI implementation strategy.
Select the Right Tools: Choose AI tools that align with your testing needs. Look for features like intelligent test case generation, self-healing capabilities, and predictive analytics. Tools like Testim, Applitools, or Eggplant AI offer robust AI-assisted testing solutions.
Start Small, Scale Gradually: Begin with a pilot project or a small set of test cases to evaluate the effectiveness of AI in your automation framework. Once you gain confidence and see tangible benefits, gradually scale up your AI initiatives.
Data Quality Matters: AI thrives on data, so ensure you have high-quality, diverse datasets for training and testing AI models. Clean, relevant data will enhance the accuracy and reliability of your AI-assisted test automation.
Collaborate Across Teams: Foster collaboration between QA, development, and data science teams. Work together to define testing scenarios, validate AI models, and integrate AI-powered testing seamlessly into your CI/CD pipelines.
Continuous Learning and Optimization: AI evolves, so prioritize continuous learning and optimization. Monitor test results, gather feedback, and refine your AI models to adapt to changing requirements and improve overall testing efficiency.
Ethical Considerations: Finally, remember the ethical implications of AI in testing. Ensure transparency, fairness, and accountability in AI-assisted decision-making processes to build trust and maintain integrity in your testing practices.
The Role of AI in Software Testing
As we investigate the impact of AI in software testing, a few key areas come into focus:
AI Utilities for Test Case Generation, Test Execution, and Defect Prediction.
AI-Powered Test Case Generation: AI algorithms use techniques such as natural language processing (NLP) and machine learning (ML) to analyze requirements documents, user stories, and historical test data. They can recognize critical paths, edge cases, and potential vulnerabilities in the software, producing test cases that cover these aspects comprehensively. Moreover, AI can prioritize test cases based on risk factors, ensuring that high-impact areas are thoroughly tested.
AI-Assisted Test Execution: AI-assisted test execution optimizes testing by dynamically allocating resources, prioritizing test cases, and adjusting testing strategies based on real-time feedback. AI algorithms can identify flaky tests, reroute test flows to avoid bottlenecks, and parallelize test execution to speed up feedback cycles. This approach minimizes testing costs and accelerates time-to-market for software releases.
AI-Based Defect Prediction: Machine learning models trained on historical defect data can anticipate potential defects and vulnerabilities in software code. By analyzing code complexity, change history, and code quality metrics, AI can flag areas that are likely to cause issues. This proactive approach lets developers focus their efforts on code areas with a higher probability of defects, reducing post-release bugs.
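Identifying flaky tests, mentioned above, can be illustrated with a rule-based sketch: a test that both passed and failed on the same code revision is a likely flake. This is a simple baseline rather than an AI model, and the function name and sample data are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def find_flaky_tests(results: Iterable[Tuple[str, str, bool]]) -> Set[str]:
    """Return tests that both passed and failed on the same revision.

    Each result is a (test_name, revision, passed) tuple.
    """
    outcomes: Dict[Tuple[str, str], Set[bool]] = defaultdict(set)
    for test_name, revision, passed in results:
        outcomes[(test_name, revision)].add(passed)
    # Two distinct outcomes for one (test, revision) pair means mixed results.
    return {name for (name, _), seen in outcomes.items() if len(seen) == 2}


if __name__ == "__main__":
    runs = [
        ("login", "abc123", True),
        ("login", "abc123", False),   # same code, different outcome
        ("cart", "abc123", False),
        ("cart", "abc123", False),    # consistently failing, not flaky
        ("search", "abc123", True),
        ("search", "def456", False),  # failed only after a code change
    ]
    print(find_flaky_tests(runs))
```

Only "login" is reported here, because it is the only test whose outcome changed while the code stayed the same.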
AI-Powered Test Automation Frameworks and Test Data Management
AI-Powered Test Automation Frameworks: AI-powered test automation frameworks incorporate smart features such as self-healing tests, adaptive test execution, and predictive maintenance. They use AI algorithms to identify and resolve test failures, optimize test execution based on historical data, and anticipate maintenance tasks for test scripts. This improves test stability, reduces false positives, and improves overall automation efficiency.
AI-Powered Test Data Management: AI automates test data management by analyzing data dependencies, creating synthetic test data sets, and anonymizing sensitive data. It can identify the data variations required for testing different scenarios and generate data that mimics real-world usage patterns. This ensures that testing environments are realistic, diverse, and compliant with data protection regulations.
AI in Test Environment Provisioning and Test Maintenance.
Dynamic Test Environment Provisioning: AI analyzes resource availability, test requirements, and historical usage patterns to dynamically provision test environments. It can allocate resources efficiently, spin up virtualized environments, and configure network settings based on testing needs. This dynamic provisioning reduces waiting times for test environments and improves testing efficiency.
Intelligent Test Maintenance: AI automates test maintenance tasks by identifying redundant or obsolete test cases, recommending optimizations, and automatically updating test scripts. It analyzes code changes, impact analysis reports, and test coverage data to ensure that tests stay relevant and effective. This reduces maintenance overhead and keeps testing processes agile and responsive to software changes.
AI enhances Test Efficiency, Effectiveness, and Reporting.
Improved Test Coverage and Accuracy: AI algorithms excel at recognizing complex test scenarios that conventional testing approaches might overlook. By leveraging techniques like genetic algorithms and reinforcement learning, AI can create test cases that cover a wide range of functionality and edge cases. This results in improved test coverage and higher accuracy in detecting software defects and performance issues.
AI-Enhanced Test Reporting and Analytics: AI-powered analytics tools analyze test results and identify patterns, producing visualizations, trend analysis reports, and anomaly detection alerts. They provide meaningful insights into test coverage, performance trends, and defect clustering. These insights help teams prioritize testing efforts and make data-driven decisions, enhancing overall test visibility and effectiveness.
AI-Powered Test Optimization and Performance Monitoring: AI plays a significant part in optimizing test processes and monitoring performance metrics. AI algorithms analyze testing data, execution times, resource utilization, and system behavior to identify optimization opportunities. This includes dynamically adjusting test configurations, prioritizing critical tests, and optimizing test execution workflows for efficiency. AI-assisted performance monitoring tools also continuously monitor application performance during testing, identifying bottlenecks, memory leaks, and performance regressions. They produce performance reports, detect performance degradation patterns, and give suggestions for improving application performance.
Enhanced Collaboration between Development and Testing Teams
AI-powered test automation fosters better collaboration between development and testing teams:
Streamlined Communication: AI-assisted testing tools encourage consistent communication and collaboration between development and testing teams, enabling real-time feedback and issue resolution.
Shared Insights: AI-powered analytics provide valuable insights into testing metrics, performance trends, and defect patterns, fostering data-driven decision-making and continuous improvement.
Cross-Functional Collaboration: AI enables cross-functional collaboration between developers, testers, data scientists, and AI specialists, promoting teamwork and collective problem-solving.
Predictions for the Future of AI in Software Testing
Looking ahead, the future of AI in software testing holds promising predictions:
Advancements in AI Algorithms: Continued advancements in AI algorithms will lead to more sophisticated testing techniques, including improved anomaly detection, self-learning testing frameworks, and predictive analytics.
Integration with DevOps and CI/CD: AI-powered testing will integrate seamlessly with DevOps and Continuous Integration/Continuous Deployment (CI/CD) pipelines, enabling quicker feedback loops. This includes automated testing in production environments and improved release cycles.
AI-Assisted Test Orchestration: AI will play a central part in test orchestration, dynamically managing test environments, resources, and test execution strategies based on real-time data and project priorities.
Challenges and Opportunities in AI-Powered Testing
While AI-powered testing offers immense opportunities, it also presents challenges:
Complexity of AI Integration: Integrating AI into existing testing frameworks requires expertise in AI technologies, data management, and test automation, posing initial implementation challenges.
Data Quality and Bias: Ensuring data quality, addressing biases in AI models, and maintaining data privacy and security are ongoing challenges that organizations must address.
Skills Gap and Training: Building AI capabilities within testing teams, upskilling testers in AI concepts, and fostering a culture of AI-assisted testing require continuous learning and investment in training programs.
Strategies for Maximizing the Potential of AI-Powered Test Automation
To maximize the potential of AI-powered test automation, organizations can adopt the following strategies:
Strategic Alignment: Align AI initiatives with business objectives, prioritize use cases with high ROI potential, and develop a roadmap for AI integration into testing processes.
Continuous Learning and Collaboration: Invest in training programs, workshops, and knowledge-sharing sessions to build AI expertise within testing teams and foster collaboration with AI specialists and data scientists.
Data Governance and Ethics: Implement robust data governance practices, ensure data quality and integrity, address algorithm biases, and adhere to ethical guidelines for AI-assisted testing.
Pilot Projects and Iterative Approach: Start with pilot projects to validate AI capabilities, gather feedback, iterate on improvements, and gradually scale AI initiatives across testing environments.
Conclusion
In conclusion, AI-powered test automation stands as a significant force in transforming software testing in 2024 and beyond. With its capacity to improve efficiency, accuracy, and speed in testing processes, AI-assisted solutions are reshaping how software is built and validated. By leveraging machine learning, natural language processing, and other AI technologies, organizations can streamline their testing workflows and identify defects earlier. This enables them to bring high-quality software products to market faster.
As we move forward, the integration of AI into test automation will continue to advance, offering even more capabilities such as predictive analytics, autonomous testing, and adaptive test strategies. This development will help optimize testing efforts, reduce costs, and improve overall software quality, benefiting both businesses and end users. Embracing AI-powered test automation is not just a trend; it is a strategic imperative for modern software development organizations that want to stay competitive in today’s fast-paced digital environment.
An SDET Engineer with hands-on experience in manual, automation, API, performance, and security testing. The technologies I have worked on include Selenium, Playwright, Cypress, SpecFlow, Cucumber, JMeter, K6, OWASP ZAP, Appium, Postman, Maven, Behave, Pytest, BrowserStack, SQL, GitHub, Java, JavaScript, Python, C#, HTML, CSS. I also have hands-on experience in CI/CD using Azure DevOps, AWS, and GitHub. Apart from this, I like to write blogs and explore new technologies.
Selenium 4 features significant enhancements over Selenium 3, including a revamped Selenium Grid for distributed testing, native support for HTML5, and integration of the W3C WebDriver protocol for improved compatibility. Additionally, it offers enhanced debugging and error-handling capabilities, streamlining the testing process for better efficiency and reliability.
Benefits of Selenium Automation: Exploring Selenium 4 New Features
Streamline Testing Processes: Selenium automation allows organizations to streamline and enhance their testing processes by automating repetitive tasks associated with web application testing.
Interact with Web Elements: Automation scripts, facilitated by Selenium’s WebDriver, interact with web elements, imitating user actions to test functionality.
Accelerate Testing: Selenium automation accelerates testing by eliminating manual intervention and executing tests efficiently.
Ensure Consistency and Reliability: By automating tests, Selenium ensures consistent and reliable results across diverse browser environments, reducing the risk of human error.
Faster Releases: Selenium automation acts as a catalyst for achieving faster releases by expediting the testing phase.
Improve Test Coverage: With automation, organizations can improve test coverage by running tests more frequently and comprehensively.
Maintain Application Integrity: Selenium automation helps in maintaining the integrity of web applications by identifying and addressing issues promptly.
The Architecture of Selenium 3
The Architecture of Selenium 4
Selenium 4 New Features: W3C WebDriver Standardization
Selenium 4 fully supports the W3C WebDriver standard, improving compatibility across different browsers and reducing inconsistencies.
Standardized Communication: The adoption of the W3C WebDriver protocol ensures consistent behavior across different browsers, reducing compatibility issues.
Improved Grid Architecture: Enhanced scalability and easier management with support for distributed mode, Docker, and Kubernetes.
User-Friendly Selenium IDE: Modernized interface and parallel test execution simplify test creation and management.
Enhanced Browser Driver Management: Unified driver interface and automatic updates reduce manual configuration and ensure compatibility.
Advanced Browser Interactions: Integration with DevTools Protocols for Chrome and Firefox enables comprehensive network and performance monitoring.
Simplified Capabilities Configuration: Using Options classes instead of DesiredCapabilities improves the readability and maintainability of test scripts.
Improved Actions API: Enhancements provide more reliable and consistent complex user interactions across different browsers.
Enhanced Performance: Overall performance improvements result in faster and more efficient test execution.
Better Documentation: Comprehensive and improved documentation reduces the learning curve and enhances productivity.
Backward Compatibility: Designed to be backward compatible, allowing seamless upgrades without significant changes to existing test scripts.
Here, I’ll outline the precise changes introduced in Selenium 4 when compared to its earlier versions:
1. W3C WebDriver Protocol:
Selenium 4 further aligns with the W3C WebDriver standard, ensuring better compatibility across different browsers.
Full support for the W3C WebDriver protocol was a significant improvement to enhance consistency and stability across browser implementations.
2. New Grid:
Selenium Grid has been updated in Selenium 4 with a new version known as “Grid 4”.
The new grid is more scalable and provides better support for Docker and Kubernetes.
Let’s briefly understand Selenium Grid, which consists of two major components:
Node: A machine that executes the tests. A grid can contain multiple nodes.
Hub: The central point that controls all the machines in the network. A grid has only one hub, which allocates test execution to the different nodes.
In Selenium 4, the Grid is highly flexible. It allows running test cases against multiple browsers, different browser versions, and different operating systems.
There is also no longer any need to start the hub and the nodes individually. Once the user starts the server in standalone mode, the Grid functions as both hub and node.
3. Relative Locators:
Selenium 4 introduced a new set of locators called “Relative Locators” or “Relative By”.
Relative Locators provide a more natural way of locating elements relative to their surrounding elements, making it easier to write maintainable tests.
There are five locators added in Selenium 4:
below(): Target web element located below the specified element.
toLeftOf(): Target web element located to the left of the specified element.
toRightOf(): Target web element located to the right of the specified element.
above(): Target web element located above the specified element.
near(): Target web element located near the specified element (by default, at most approximately 50 pixels away).
Note: All the above methods are chained from a starting locator, for example RelativeLocator.with(By.tagName("li")). Early Selenium 4 pre-releases used withTagName("li") for the same purpose.
The below example demonstrates the toLeftOf() and below() locators:
// Requires: import org.openqa.selenium.support.locators.RelativeLocator;
WebElement book = driver.findElement(RelativeLocator.with(By.tagName("li")).toLeftOf(By.id("pid1")).below(By.id("pid2")));
String id1 = book.getAttribute("id");
The below example illustrates the toRightOf() and above() locators:
4. Selenium IDE:
Selenium IDE received significant updates in Selenium 4, making it more powerful and versatile for recording and playing back test scenarios.
Selenium IDE is now available as a browser extension for Chrome and Firefox.
The features include:
Improved Browser Support:
The new version enhances browser support, allowing any browser vendor to seamlessly integrate with the latest Selenium IDE.
CLI Runner Based on NodeJS:
The Command Line Interface (CLI) Runner is now built on NodeJS instead of the HTML-based runner.
It supports parallel execution, providing a more efficient way to execute tests concurrently.
The CLI Runner generates a comprehensive report, detailing the total number of test cases passed and failed, along with the execution time taken.
These improvements in Selenium IDE aim to enhance compatibility with various browsers and provide a more versatile and efficient test execution environment through the CLI Runner based on NodeJS.
5. New Window Handling API:
Selenium 4 introduced a new Window interface, providing a more consistent and convenient way to handle browser windows and tabs.
If the user wants to open two applications in the same browser session, the following code can be used:
driver.get("https://www.google.com/");
// Opens a new browser window and switches focus to it
driver.switchTo().newWindow(WindowType.WINDOW);
driver.navigate().to("https://www.bing.com/");
Set<String> windowHandles = driver.getWindowHandles();
for (String handle : windowHandles) {
driver.switchTo().window(handle);
// Perform actions on each window
}
6. Improved DevTools API:
Selenium 4 provides enhanced support for interacting with the browser DevTools using the DevTools API.
This allows testers to perform advanced browser interactions and access additional information about the browser.
Selenium 4 also made some internal changes to the API. In Selenium 3, the ChromeDriver class directly extended the RemoteWebDriver class. In Selenium 4, the ChromeDriver class extends the ChromiumDriver class, which has predefined methods for accessing DevTools.
Note: ChromiumDriver extends the RemoteWebDriver class.
By using the API, we can perform the following operations:
7. Element Screenshots:
In Selenium 4, a notable enhancement is the ability to capture a screenshot of a specific web element, which was unavailable in earlier versions. This lets users capture an individual element on a webpage instead of the whole page, giving more targeted and precise visual information during testing or debugging.
8. Changes to Waits and Timeouts:
In Selenium 4, the wait and timeout methods take a single (Duration duration) parameter instead of the (long time, TimeUnit unit) pair, so tests that still use the old signatures show a deprecation message.
WebDriverWait likewise now expects a Duration instead of a long for the timeout.
The following constructor is now deprecated in Selenium: public WebDriverWait(@NotNull org.openqa.selenium.WebDriver driver, long timeoutInSeconds)
Before Selenium 4 –
//Old syntax
WebDriverWait wait = new WebDriverWait(driver,10);
wait.until(ExpectedConditions.visibilityOfElementLocated(By.cssSelector(".classlocator")));
After Selenium 4 –
//Selenium 4 syntax
WebDriverWait wait = new WebDriverWait(driver,Duration.ofSeconds(10));
wait.until(ExpectedConditions.visibilityOfElementLocated(By.cssSelector(".classlocator")));
FluentWait –
Before Selenium 4 –
Wait<WebDriver> wait = new FluentWait<WebDriver>(driver)
.withTimeout(30, TimeUnit.SECONDS)
.pollingEvery(5, TimeUnit.SECONDS)
.ignoring(NoSuchElementException.class);
After Selenium 4 –
Wait<WebDriver> fluentWait = new FluentWait<WebDriver>(driver)
.withTimeout(Duration.ofSeconds(30))
.pollingEvery(Duration.ofSeconds(5))
.ignoring(NoSuchElementException.class);
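When a test suite has many old-style timeouts, it may help to convert the legacy (long, TimeUnit) pairs in one place. The following is a minimal sketch using only the JDK; the class name WaitTimeouts and the method toDuration are hypothetical helpers, not part of Selenium, and the sketch assumes Java 9 or later for TimeUnit.toChronoUnit().

```java
import java.time.Duration;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not part of Selenium): converts a legacy
// (long, TimeUnit) pair into the java.time.Duration that the
// Selenium 4 wait APIs expect.
class WaitTimeouts {

    static Duration toDuration(long amount, TimeUnit unit) {
        // TimeUnit.toChronoUnit() is available from Java 9 onwards.
        return Duration.of(amount, unit.toChronoUnit());
    }

    public static void main(String[] args) {
        Duration timeout = toDuration(30, TimeUnit.SECONDS);
        Duration polling = toDuration(500, TimeUnit.MILLISECONDS);
        System.out.println(timeout.getSeconds()); // prints 30
        System.out.println(polling.toMillis());   // prints 500
    }
}
```

The resulting Duration values can then be passed to calls such as withTimeout(...) and pollingEvery(...) shown above.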
9. Bi-Directional Communication:
Selenium 4 introduced better bi-directional communication between Selenium and browser drivers.
This allows for more efficient communication, resulting in improved performance and stability.
10. Enhanced Documentation:
Selenium 4 comes with improved and updated documentation, making it easier for users to find information and resources related to Selenium.
11. Support for Chrome DevTools Protocol (CDP):
Selenium 4 allows users to interact with Chrome DevTools using the Chrome DevTools Protocol directly.
Conclusion:
Selenium 4 marks a substantial leap forward, addressing limitations present in Selenium 3 and introducing new features to meet the evolving needs of web automation. The Relative Locators, enhanced window handling, improved DevTools API, and Grid 4 support make Selenium 4 a powerful and versatile tool for testers and developers in the realm of web testing and automation.
Harish is an SDET with expertise in API, web, and mobile testing. He has worked on multiple Web and mobile automation tools including Cypress with JavaScript, Appium, and Selenium with Python and Java. He is very keen to learn new Technologies and Tools for test automation. His latest stint was in TestProject.io. He loves to read books when he has spare time.
Web tables, also known as HTML tables, are a widely used format for displaying data on web pages. They allow for a structured representation of information in rows and columns, making it easy to read and manipulate data. Selenium WebDriver, a powerful tool for web browser automation, provides the functionality to interact with these tables programmatically. This capability is beneficial for tasks like web scraping, automated testing, and data validation. In this blog, we will see how to extract data from Web tables in Java-Selenium.
Identify web table from your webpage:
To effectively identify and interact with web tables using Selenium, it’s crucial to understand the HTML structure of tables and the specific tags used.
A typical HTML table consists of several tags that define its structure:
<table>: The main container for the table.
<thead>: Defines the table header, which contains header rows (<tr>).
<tbody>: Contains the table body, which includes the data rows.
<tr>: Defines a table row.
<th>: Defines a header cell in a table row.
<td>: Defines a standard data cell in a table row.
The demo website used here provides a sample web table with fields like first name, last name, and email. We apply a filter on the email field to reduce the size of the table.
We start by launching the browser and navigating to the webpage. We have applied a filter for the email “PolGermain@whatever.com”; you can change it as per your requirement.
Once we have the filtered data, we need to locate the table and get the number of rows. The table can have multiple rows, so we use a list to store all of them.
With all the rows stored in the list, we iterate through each row to fetch its columns and store the column data in another list.
Example:
Abc | 1
Xyz | 2
The table has 2 rows and 2 columns.
When we iterate through the 1st row, we get the data Abc and 1 and store it in a list as rowData [Abc, 1]. Similarly, the data from the 2nd row is stored as rowData [Xyz, 2]. If the same list were reused, the data from the 1st row would be overwritten while iterating through the 2nd row. That’s why we need one more list, ‘webRows’, to store all the rows. We iterate through all the columns of each row one by one and finally store every row in the list webRows.
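The row-and-column bookkeeping described above can be sketched with plain Java collections. This is only an illustration under assumptions: the cell values are hard-coded instead of being read through Selenium, and the class name TableRowsSketch and the method collectRows are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: the table cells are hard-coded here; in a real test
// they would come from WebElement.getText() calls.
class TableRowsSketch {

    static List<List<String>> collectRows(String[][] cells) {
        List<List<String>> webRows = new ArrayList<>();
        for (String[] row : cells) {
            // A fresh list per row, so earlier rows are never overwritten
            List<String> rowData = new ArrayList<>();
            for (String cell : row) {
                rowData.add(cell);
            }
            webRows.add(rowData);
        }
        return webRows;
    }

    public static void main(String[] args) {
        String[][] cells = { {"Abc", "1"}, {"Xyz", "2"} };
        List<List<String>> webRows = collectRows(cells);
        System.out.println(webRows); // prints [[Abc, 1], [Xyz, 2]]
    }
}
```

Creating rowData inside the outer loop is the design choice that matters here: each row gets its own list, and webRows keeps a reference to every one of them.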
We have successfully extracted the table data. You can now use this data as per your requirement.
To do this we need to iterate through the list ‘webRows’ where we have our table data stored. We will be accessing all the columns by their index. In this case, you should know the column index you want to access. The column index always starts from 0.
for (int s = 0; s < webRows.size(); s++) {
List<String> row = webRows.get(s);
System.out.println(row.get(1));
System.out.println(row);
}
Below is the complete code snippet for the above-mentioned steps. You need to update the related XPaths in case you are not able to access the rows and columns with the given XPaths.
Instead of accessing data by the column index, you can also access it by the column name. To do that, you need to use HashMaps instead of lists. A HashMap stores the column headers as keys and the column data as values.
Example:
Name | Id
Abc | 1
Xyz | 2
The table has 3 rows (including the header row) and 2 columns.
Here Name and ID will be your keys and Abc, 1 and Xyz, 2 will be the values.
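The header-to-value mapping from this example can be sketched without a browser. The sketch below is an illustration only: the headers and cell values are hard-coded, the names TableMapsSketch and toMaps are hypothetical, and a LinkedHashMap is used instead of a HashMap purely so that the printed column order is predictable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch: headers become keys, each data row becomes one map.
class TableMapsSketch {

    static List<Map<String, String>> toMaps(String[] headers, String[][] dataRows) {
        List<Map<String, String>> webRows = new ArrayList<>();
        for (String[] row : dataRows) {
            Map<String, String> webColumn = new LinkedHashMap<>();
            // Stop at the shorter of the two so a short row cannot fail
            int columns = Math.min(headers.length, row.length);
            for (int j = 0; j < columns; j++) {
                webColumn.put(headers[j], row[j]);
            }
            webRows.add(webColumn);
        }
        return webRows;
    }

    public static void main(String[] args) {
        String[] headers = {"Name", "Id"};
        String[][] dataRows = { {"Abc", "1"}, {"Xyz", "2"} };
        List<Map<String, String>> webRows = toMaps(headers, dataRows);
        System.out.println(webRows.get(0).get("Name")); // prints Abc
        System.out.println(webRows.get(1));             // prints {Name=Xyz, Id=2}
    }
}
```

Looking a value up by "Name" keeps the test readable and means it does not break if the column order on the page changes.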
How to store and access table data using HashMap?
The code snippet below shows how to use HashMap to store data in key-value format.
package Selenium;

import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.*;
import org.openqa.selenium.chrome.ChromeDriver;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class Webtable_Blog {
    public static void main(String[] args) throws InterruptedException {
        WebDriverManager.chromedriver().setup();
        WebDriver driver = new ChromeDriver();
        driver.get("https://www.globalsqa.com/angularJs-protractor/WebTable/");
        driver.manage().window().maximize();

        // Filter the table by email to reduce the number of rows
        WebElement global_search = driver.findElement(By.xpath("//input[@type='search' and @placeholder='global search']"));
        global_search.sendKeys("PolGermain@whatever.com");
        // global_search.sendKeys("Pol");
        global_search.sendKeys(Keys.ENTER);
        Thread.sleep(5000); // simple fixed wait for the filter to be applied

        List<WebElement> rows = driver.findElements(By.xpath("//table[@class='table table-striped']/tbody/tr"));
        System.out.println("size-" + rows.size());

        // The header cells are the same for every row, so read them once
        List<WebElement> keys = driver.findElements(By.xpath("//table[@class='table table-striped']/thead/tr[1]/th"));
        List<Map<String, String>> webRows = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            List<WebElement> values = driver.findElements(By.xpath("//table[@class='table table-striped']/tbody/tr[" + (i + 1) + "]/td"));
            Map<String, String> webColumn = new HashMap<>();
            // Loop on j (not i) and stop at the shorter list to avoid an IndexOutOfBoundsException
            for (int j = 0; j < keys.size() && j < values.size(); j++) {
                webColumn.put(keys.get(j).getText(), values.get(j).getText());
            }
            webRows.add(webColumn);
        }

        for (int s = 0; s < webRows.size(); s++) {
            // The key must match the header text exactly as it is rendered on the page
            System.out.println(webRows.get(s).get("lastName"));
            System.out.println(webRows.get(s));
        }
        driver.quit();
    }
}
In this blog, we’ve looked at how Selenium WebDriver can handle web tables in Java. Web tables are a crucial part of web applications, often used to display large amounts of data in an organized manner, and handling them efficiently is a key skill for any test automation engineer. We explored how to locate a table, filter it, access its rows and cells, iterate through the table data, and store that data in lists and HashMaps.
Priyanka is an SDET with 2.5+ years of hands-on experience in Manual, Automation, and API testing. The technologies she has worked on include Selenium, Playwright, Cucumber, Appium, Postman, SQL, GitHub, and Java. Also, she is interested in Blog writing and learning new technologies.
Working with PDF documents programmatically can be a challenging task, especially when you need to extract and manipulate text content. However, with the right tools and libraries, you can efficiently convert PDF text to a structured JSON format.
Converting PDF to JSON programmatically offers flexibility and customization, especially in dynamic runtime environments where reliance on external tools may not be feasible. While free tools exist, they may not always cater to specific runtime requirements or integrate seamlessly into existing systems.
Consider scenarios like real-time data extraction from PDF reports generated by various sources. During runtime, integrating with a specific tool might not be viable due to constraints such as security policies, network connectivity, or the need for real-time processing. In such cases, a custom-coded solution allows for on-the-fly conversion tailored to the application’s needs.
For Example:
E-commerce Invoice Processing: Extracting invoice details and converting them to JSON for real-time database updates.
Healthcare Records Management: Converting patient records to JSON for integration with EHR systems, ensuring HIPAA compliance.
Legal Document Analysis: Extracting specific clauses and dates from legal documents for analysis.
Free tools are often inadequate for real-time, automated, and secure PDF to JSON conversion. Coding your own solution gives you control over efficiency, scalability, and compliance.
In this blog, we’ll walk through a Java program that accomplishes this using the iTextPDF and Jackson libraries. Screenshots will be included to illustrate the process.
Introduction for Converting PDF to JSON in Java
PDF documents are ubiquitous in the modern world, used for everything from reports and ebooks to invoices and forms. They provide a versatile way to share formatted text, images, and even interactive content. Despite their convenience, PDFs can be difficult to work with programmatically, especially when you need to extract specific information from them.
Often, there arises a need to extract text content from PDFs for various purposes such as:
Data Analysis: Extracting textual data for analysis, reporting, or further processing.
Indexing: Creating searchable indexes for large collections of PDF documents.
Transformation: Converting PDF content into different formats like JSON, XML, or CSV for interoperability with other systems.
JSON (JavaScript Object Notation) is a lightweight data interchange format that’s easy for humans to read and write, and easy for machines to parse and generate. It is widely used in web applications, APIs, and configuration files due to its simplicity and versatility.
In this guide, we will explore how to convert the text content of a PDF file into a JSON format using Java. We’ll leverage the iTextPDF library for PDF text extraction and the Jackson library for JSON processing. This approach will allow us to take advantage of the structured nature of JSON to organize the extracted text in a meaningful way.
Prerequisites for Converting PDF to JSON in Java
Before we dive into the code, ensure you have the following prerequisites installed and configured:
Java Development Kit (JDK)
Maven for managing dependencies
iTextPDF library for handling PDF documents
Jackson library for JSON processing
Step-by-Step Installation and Setup for Converting PDF to JSON in Java
Install Java Development Kit (JDK)
The JDK is a software development environment used for developing Java applications. To install the JDK:
Start IntelliJ IDEA: Open from the start menu (Windows).
Complete Initial Setup: Import settings or start fresh.
Start a New Project: Begin a new project or open an existing one.
Open IntelliJ IDEA:
Launch IntelliJ IDEA on your computer
Create or Open a Project
If you already have a project, open it. Otherwise, create a new project by selecting File > New > Project….
Name your project and select the project location
Choose Java from Language.
Choose Maven from the Build systems.
Select the project SDK (JDK) and click Next.
Choose the project template (if any) and click Next.
Then click Create.
Create a New Java Class
In the Project tool window (usually on the left side), right-click on the (src → test → java) directory or any of its subdirectories where you want to create the new class.
Select New > Java Class from the context menu.
Name Your Class
In the dialog that appears, enter the name of your new class. For example, you can name it PdfToJsonConversion.
Click OK/Enter.
Add the following dependencies to your pom.xml file for Converting PDF to JSON in Java:
<dependencies>
<!-- https://mvnrepository.com/artifact/com.itextpdf/itext-core -->
<dependency>
<groupId>com.itextpdf</groupId>
<artifactId>itext-core</artifactId>
<version>8.0.3</version>
<type>pom</type>
</dependency>
<!-- Jackson modules: keep core, databind, and annotations on matching versions -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-core</artifactId>
<version>2.13.4</version>
</dependency>
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-databind</artifactId>
<version>2.13.4.1</version> <!-- Patch release built against jackson-core 2.13.4 -->
</dependency>
<!-- Jackson Annotations -->
<dependency>
<groupId>com.fasterxml.jackson.core</groupId>
<artifactId>jackson-annotations</artifactId>
<version>2.13.4</version>
</dependency>
<dependency>
<groupId>org.testng</groupId>
<artifactId>testng</artifactId>
<version>7.8.0</version>
<scope>compile</scope>
</dependency>
</dependencies>
Write Your Code to Convert PDF to JSON in Java
IntelliJ IDEA will create a new .java file with the name you provided.
You can start writing your Java code inside this file.
The Java Program to Convert PDF to JSON
Here is the complete Java program that converts a PDF file to JSON:
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.databind.node.ObjectNode;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfPage;
import com.itextpdf.kernel.pdf.PdfReader;
import com.itextpdf.kernel.pdf.canvas.parser.PdfTextExtractor;
import org.testng.annotations.Test;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class PdfToJsonConversion {

    @Test
    public static void convertPdfFileToJson() {
        // Adjust both paths for your own machine and project layout
        String inputPdfPath = "C:\\Users\\Mangesh\\Downloads\\What is Software Testing.pdf";
        String outputJsonPath = "src/test/java/What is Software Testing.json";

        // Extract the text of every page (page numbers start at 1)
        List<String> contentList = new ArrayList<>();
        try (PdfDocument pdfDoc = new PdfDocument(new PdfReader(inputPdfPath))) {
            int numPages = pdfDoc.getNumberOfPages();
            for (int i = 1; i <= numPages; i++) {
                PdfPage page = pdfDoc.getPage(i);
                String pageContent = PdfTextExtractor.getTextFromPage(page);
                contentList.add(pageContent);
            }
        } catch (IOException e) {
            e.printStackTrace();
            return; // nothing to convert if the PDF could not be read
        }

        // Create JSON object
        ObjectMapper mapper = new ObjectMapper();
        mapper.enable(SerializationFeature.INDENT_OUTPUT);
        ArrayNode pagesArray = mapper.createArrayNode();

        // Add page contents to JSON array
        for (int i = 0; i < contentList.size(); i++) {
            ObjectNode pageNode = mapper.createObjectNode();
            pageNode.put("Page", i + 1);
            // Split content by lines and add to JSON object with line number as key
            String[] lines = contentList.get(i).split("\\r?\\n");
            ObjectNode linesObject = mapper.createObjectNode();
            for (int j = 0; j < lines.length; j++) {
                linesObject.put(Integer.toString(j + 1), lines[j]);
            }
            pageNode.set("Content", linesObject);
            pagesArray.add(pageNode);
        }

        // Write JSON to file and confirm only when the write succeeded
        File outputJsonFile = new File(outputJsonPath);
        try {
            mapper.writeValue(outputJsonFile, pagesArray);
            System.out.println("Content stored in " + outputJsonFile.getName());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
Explanation
Let’s break down the code step by step:
1. Dependencies
Jackson Library:
ObjectMapper, SerializationFeature, ArrayNode, ObjectNode: These are from the Jackson library, used for creating and manipulating JSON objects.
iText Library:
PdfDocument, PdfPage, PdfReader, PdfTextExtractor: These classes are from the iText library, used for reading and extracting text from PDF documents.
TestNG Library:
@Test: An annotation from the TestNG library, used for marking the convertPdfFileToJson method as a test method.
Java Standard Library:
File, IOException, ArrayList, List: Standard Java classes for file operations, handling exceptions, and working with lists.
2. Test Annotation
The class PdfToJsonConversion contains a static method convertPdfFileToJson which is annotated with @Test, making it a test method in a TestNG test class.
3. Method convertPdfFileToJson:
This method handles the core functionality of reading a PDF and converting its content to JSON.
4. Input and Output Paths:
inputPdfPath specifies the PDF file location, and outputJsonPath defines where the resulting JSON file will be saved.
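If hard-coding both paths is inconvenient, the output path can be derived from the input file name. The following is a minimal sketch using only java.nio.file; the class name OutputPathSketch, the method jsonPathFor, and the directory names are hypothetical and are not part of the program above.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: derives "<name>.json" in an output directory
// from the file name of the input PDF.
class OutputPathSketch {

    static Path jsonPathFor(Path inputPdf, Path outputDir) {
        String fileName = inputPdf.getFileName().toString();
        int dot = fileName.lastIndexOf('.');
        // Drop the extension if there is one; otherwise keep the whole name
        String baseName = (dot > 0) ? fileName.substring(0, dot) : fileName;
        return outputDir.resolve(baseName + ".json");
    }

    public static void main(String[] args) {
        Path input = Paths.get("Downloads", "What is Software Testing.pdf");
        Path output = jsonPathFor(input, Paths.get("src", "test", "java"));
        System.out.println(output.getFileName()); // prints What is Software Testing.json
    }
}
```

The resulting Path can be converted with toString() or toFile() wherever the program expects a String or a File.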
5. PDF to Text Conversion:
Create a PdfDocument object using a PdfReader for the input PDF file.
Get the number of pages in the PDF.
Loop through each page, extract text using PdfTextExtractor, and add the text to contentList.
Handle any IOException that may occur.
6. Creating JSON Objects:
Create an ObjectMapper for JSON manipulation.
Enable pretty printing with SerializationFeature.INDENT_OUTPUT.
Create an ArrayNode to hold the content of each page.
7. Adding Page Content to JSON:
Iterate over contentList to process each page’s content.
For each page, create an ObjectNode and set the page number.
Split the page content into lines, then create another ObjectNode to hold each line with its number as the key.
Add the linesObject to the pageNode and then add the pageNode to pagesArray.
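The line-splitting step can be tried in isolation with plain Java. The sketch below is an illustration under assumptions: it uses a LinkedHashMap in place of Jackson's ObjectNode, the page text is a hard-coded string, and the names PageLinesSketch and numberLines are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch: splits page text into lines and keys each line
// by its 1-based line number, mirroring the structure of "Content".
class PageLinesSketch {

    static Map<String, String> numberLines(String pageContent) {
        Map<String, String> lines = new LinkedHashMap<>();
        // "\\r?\\n" matches both Windows (\r\n) and Unix (\n) line endings
        String[] parts = pageContent.split("\\r?\\n");
        for (int j = 0; j < parts.length; j++) {
            lines.put(Integer.toString(j + 1), parts[j]);
        }
        return lines;
    }

    public static void main(String[] args) {
        String page = "first line\r\nsecond line\nthird line";
        System.out.println(numberLines(page));
        // prints {1=first line, 2=second line, 3=third line}
    }
}
```

Note that String.split drops trailing empty strings, so blank lines at the very end of a page do not appear in the result, while blank lines in the middle are kept.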
8. Writing JSON to File
Create a File object for the output JSON file.
Use the ObjectMapper to write pagesArray to the JSON file, handling any IOException.
Print a confirmation message indicating the completion of the process.
9. Output
The program outputs the name of the JSON file once the conversion is complete.
Running the Program
To run this program, ensure you have the required libraries in your project’s classpath. You can run it through your IDE or using a build tool like Maven.
Open your IDE and load the project.
Ensure dependencies are correctly set in your pom.xml.
Run the test method convertPdfFileToJson.
You should see output similar to this in your console: Content stored in What is Software Testing.json. The JSON file will be created in the specified output path.
JSON Output Example
Here’s a snippet of what the JSON output might look like.
[ {
"Page" : 1,
"Content" : {
"1" : "What is Software Testing? ",
"2" : "Last Updated : 24 May, 2024 ",
"3" : " ",
"4" : " ",
"5" : "",
"6" : "Software testing can be stated as the process of verifying and validating whether a ",
"7" : "software or application is bug-free, meets the technical requirements as guided by "
}
}, {
"Page" : 2,
"Content" : {
"1" : " Increased customer satisfaction: Software testing ensures reliability, security, ",
"2" : "and high performance which results in saving time, costs, and customer "
}
} ]
Conclusion
Converting PDF text content to JSON can greatly simplify data processing and integration tasks. With Java and the iTextPDF and Jackson libraries, this task becomes straightforward and efficient. This guide provides a complete example to help you get started with your own PDF to JSON conversion projects.
https://github.com/mangesh-31/PdfToJsonConversion
Hello! I’m Mangesh, a Software Tester SDET. In my professional life, I focus on ensuring the quality of software products through thorough testing and analysis. I have been learning and working with Selenium, Java, and Playwright to develop automated testing solutions. Currently, I am working as a Jr. SDET at SpurQLabs Technologies Pvt. Ltd.