Data Scraping using Excel VBA and Selenium

Selenium is widely recognized not only as a powerful tool for automation testing but also for data scraping, allowing users to extract information from HTML web pages. By simulating human interaction with web browsers, Selenium enables automated navigation through pages, input handling, and element selection to retrieve data directly from web elements. Many automation testing courses cover Selenium’s capabilities in depth, equipping learners with the skills to use it effectively for both testing and data extraction tasks.

This capability is particularly useful for data scraping tasks, where information such as text, images, URLs, or table data needs to be gathered systematically from multiple pages. Selenium’s flexibility across various browsers and programming languages makes it an ideal choice for those needing precise control over web interactions to extract structured data effectively.

Steps to prepare the Excel macro

Step 1: Open a macro-enabled Excel workbook and launch the Visual Basic Editor.

Step 2: Insert a new module.

Step 3: Initialize a new subroutine and name it.

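A bare skeleton is enough at this point; the rest of this article names the subroutine test(), so the empty routine looks like this:

Sub test()
    ' The macro body is filled in over the following steps.
End Sub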

Step 4: Under Tools > References, add the Selenium Type Library. This reference lets the macro launch Google Chrome and drive the browser from the script.

Now the Excel file is ready to interact with the browser.
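
The Selenium Type Library is the reference installed by the SeleniumBasic package, so SeleniumBasic (together with a ChromeDriver version matching your browser) must be installed on the machine. If you prefer not to add the reference, the driver can usually be created late-bound instead; the ProgID below is the one SeleniumBasic normally registers, so treat this sketch as an assumption to verify on your setup:

Sub testLateBound()
    ' Late-bound alternative: no Tools > References entry required.
    ' "Selenium.WebDriver" is the ProgID assumed to be registered by SeleniumBasic.
    Dim driver As Object
    Set driver = CreateObject("Selenium.WebDriver")
    driver.Start "chrome"
    driver.Quit
End Sub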

How to open Google Chrome using VBA?

Step 1: Declare and initialize the variables in the subroutine.

Sub test()
    Dim driver As New WebDriver                         ' Selenium WebDriver object from the Selenium Type Library
    Dim rc As Integer, cc As Integer, colC As Integer   ' row and column counters

Step 2: To open Google Chrome using VBA and Selenium, add the command driver.Start "chrome" and press F5 to run the macro.

Sub test()
    Dim driver As New WebDriver
    Dim rc As Integer, cc As Integer, colC As Integer
    driver.Start "chrome"                            ' launch a new Chrome browser window
    Application.Wait Now + TimeValue("00:00:20")     ' keep Excel paused (and the browser open) for 20 seconds
End Sub

How to open a website in the Chrome browser using VBA?

Now we need to pass the URL of the website we want to open in the browser.

Sub test()
    Dim driver As New WebDriver
    Dim rc As Integer, cc As Integer, colC As Integer
    driver.Start "chrome"
    driver.Get "https://www.google.com"              ' navigate to the URL
    Application.Wait Now + TimeValue("00:00:20")     ' keep the browser open for 20 seconds
End Sub

How to scrape information from the website using VBA?

If someone needs daily data scraping from a website, they can automate it so that a single click pulls updated data directly into Excel. Using Selenium with Excel VBA, a script can be set up to connect, navigate, and extract the latest data instantly. This saves time and ensures accurate, real-time information, ideal for tasks like tracking prices or monitoring sales.
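
As a minimal sketch of that one-click idea (assuming the scraping macro built in the steps below is named test2 and writes into Sheet2, as it does in this article), a small wrapper can clear the old results and pull fresh data each time it runs:

Sub RefreshData()
    ' Hypothetical wrapper macro: clear the previous results, then re-run the scraper.
    Sheet2.Cells.ClearContents
    test2                                    ' the scraping routine developed in the steps below
    MsgBox "Data refreshed at " & Format(Now, "hh:nn:ss")
End Sub

Assigning this wrapper, rather than the scraper itself, to the button described later keeps the sheet clean between runs.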

The first step is to use Inspect Element and check how the data is structured in the page source of the website.

Let us assume that the source code looks like this:

<table class="dataTable">
  <thead>
    <tr>
      <th>Company</th>
      <th>Group</th>
      <th>Pre Close (Rs)</th>
      <th>Current Price (Rs)</th>
      <th>% Change</th>
    </tr>
  </thead>
  <tbody>
    <!-- one <tr> of <td> cells per company -->
  </tbody>
</table>

The data is structured in a single HTML table, so we will design a macro that pulls the header information and the corresponding row data from that table.


Step 1: To gather the header details of the HTML table, create a For Each loop that iterates through the header row using Selenium's FindElementByClass(), FindElementByTag(), and FindElementsByTag() methods. Here is how you could structure the code in VBA:

Sub test2()
    Dim driver As New WebDriver
    Dim rowc As Integer, cc As Integer, columnC As Integer
    rowc = 2                                         ' data rows will start on row 2; row 1 holds the headers
    Application.ScreenUpdating = False
    driver.Start "chrome"
    driver.Get "http://demo.h2kinfosys.com/test/web-table-element.php"
    ' Loop through the header row inside <thead> and write each <th> to row 1 of Sheet2.
    For Each th In driver.FindElementByClass("dataTable").FindElementByTag("thead").FindElementsByTag("tr")
        cc = 1
        For Each t In th.FindElementsByTag("th")
            Sheet2.Cells(1, cc).Value = t.Text
            cc = cc + 1
        Next t
    Next th
    ' (continued in Step 2 below)

Step 2: To locate and extract table data using the Selenium driver in a similar manner, here’s the revised VBA code. This script will locate the table headers and data, then write them to Sheet2 in Excel, starting from the specified cells.

Sub test2()
    Dim driver As New WebDriver
    Dim rowc As Integer, cc As Integer, columnC As Integer
    rowc = 2                                         ' data rows start on row 2; row 1 holds the headers
    Application.ScreenUpdating = False
    driver.Start "chrome"
    driver.Get "http://demo.h2kinfosys.com/test/web-table-element.php"
    ' Write the <th> header cells to row 1 of Sheet2.
    For Each th In driver.FindElementByClass("dataTable").FindElementByTag("thead").FindElementsByTag("tr")
        cc = 1
        For Each t In th.FindElementsByTag("th")
            Sheet2.Cells(1, cc).Value = t.Text
            cc = cc + 1
        Next t
    Next th
    ' Write each <td> cell of the <tbody> rows to the following rows of Sheet2.
    For Each tr In driver.FindElementByClass("dataTable").FindElementByTag("tbody").FindElementsByTag("tr")
        columnC = 1
        For Each td In tr.FindElementsByTag("td")
            Sheet2.Cells(rowc, columnC).Value = td.Text
            columnC = columnC + 1
        Next td
        rowc = rowc + 1
    Next tr
    Application.Wait Now + TimeValue("00:00:20")     ' keep the browser open for 20 seconds before the macro ends
    Application.ScreenUpdating = True
End Sub
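
Because browser automation can fail mid-run (a changed page structure, a slow network), it is worth wrapping the same logic with basic error handling so that Chrome is closed and screen updating is restored even on failure. A minimal sketch of that pattern, with the scraping loops elided and the routine name test2_SafeRun chosen only for illustration:

Sub test2_SafeRun()
    Dim driver As New WebDriver
    On Error GoTo Cleanup                            ' jump to the cleanup block on any runtime error
    Application.ScreenUpdating = False
    driver.Start "chrome"
    driver.Get "http://demo.h2kinfosys.com/test/web-table-element.php"
    ' ... the same header and data-row loops as in test2 above ...
Cleanup:
    Application.ScreenUpdating = True
    driver.Quit                                      ' close the Chrome window opened by Selenium
    If Err.Number <> 0 Then MsgBox "Scraping failed: " & Err.Description
End Sub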

Step 3: Once the macro is ready, assign the subroutine to an Excel button and exit the module. Label the button with any name you like.
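
The button can be drawn by hand from the Developer tab (Insert > Form Controls > Button) and pointed at test2 when Excel asks which macro to assign. If you prefer to create it from code, a small sketch follows; the caption, position, and routine name AddScrapeButton are arbitrary examples:

Sub AddScrapeButton()
    Dim btn As Object
    ' Add a Form Control button to the active sheet and wire it to the scraping macro.
    Set btn = ActiveSheet.Buttons.Add(Left:=10, Top:=10, Width:=120, Height:=30)
    btn.OnAction = "test2"                   ' the subroutine defined above
    btn.Characters.Text = "Scrape Data"
End Sub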

Step 4: Press the labeled button to get the desired output.

Step 5: Compare the scraped results in the Excel file with the data shown on the site in Chrome.
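
As a quick sanity check before comparing cell by cell, you can report how many data rows landed on the sheet, assuming the layout used above (headers in row 1, data from row 2 down):

Sub ReportRowCount()
    Dim lastRow As Long
    ' Find the last populated row in column A of Sheet2 and report the data-row count.
    lastRow = Sheet2.Cells(Sheet2.Rows.Count, 1).End(xlUp).Row
    MsgBox "Scraped " & lastRow - 1 & " data rows into Sheet2."
End Sub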

Conclusion

In conclusion, data scraping using Excel VBA and Selenium offers a powerful solution for automating the extraction of information from websites directly into spreadsheets. This combination brings together the flexibility of Selenium’s web automation with the familiarity and functionality of Excel, enabling users to manage, analyze, and store data efficiently.

Whether you’re gathering product details, tracking financial data, or conducting research, integrating VBA and Selenium streamlines the entire process, saving time and reducing manual work. By mastering these tools, you can create robust, reusable scripts that transform how you handle data. Embrace this approach to enhance productivity and gain valuable insights effortlessly!

Call to Action

Need a faster way to extract data from websites into Excel? “Data Scraping using Excel VBA and Selenium” is the solution! By combining Selenium’s web automation with Excel’s VBA scripting, you can automate data extraction and populate spreadsheets with critical information in seconds. This approach is perfect for tasks like market research, financial tracking, and e-commerce analysis, making data collection more efficient and accurate.

At H2K Infosys, our expert-led training will teach you how to integrate Selenium with Excel VBA, build reusable scripts, and handle real-world data extraction challenges. With hands-on guidance, you’ll gain the skills to automate complex scraping tasks, boost productivity, and transform how you manage data. Enroll today and elevate your automation expertise with H2K Infosys!
