Sunday, 23 May 2021

Scraping a PowerBI dashboard report: expanding rows and loading/saving values in a scrolling table

Objective:

I'm trying to scrape a PowerBI report for "Reported deliveries >> By number of vaccines" in the the "Delivery" page on this public PowerBI dashboard using Selenium to get table listing all countries, with the name of the vaccines delivered, and the number of doses delivered.

Target and setup:

Selecting the view, expanding the rows

This screencap shows the PowerBI report that I'm trying to scrape. After clicking on "Delivery" on the landing page and selecting "Reported deliveries >> By number of vaccines", there is a table that loads up on the bottom half of the page as you can see in this image. You can right-click anywhere on the table to expand all the rows, which are country names. Beneath each country name, is the name of the vaccine delivered in one column, and the "Doses delivered" in the adjacent column. There are 100+ rows with country names, and these load dynamically as you scroll down through this table.

Progress so far:

I've managed to start a Chrome headless browser, get to the "Delivery" page by clicking on the relevant button, and then open the correct report that I want by clicking on the relevant tick box in the "View option" menu on the left. After this I'm stuck. I can't figure out how to expand all of the rows in this dynamically rendered table, and I also can't figure out how to scroll down through the table and save the country names, along with their respective vaccine names and number of doses delivered.

With regards to expanding all of the Country name rows, I've seen a few posts here in StackOverflow suggesting the use of Selenium's Context Click in the Actions class, but I can't seem to figure out how to apply it to my example.

With regards to the scrolling problem, I'm at a bit of a loss here as well. I tried, in vain, to find the underlying API where data was being called for the table, but my very novice eyes were not able to spot anything, and I'm wondering how else to scroll all the way down the table to scrape this data.

Any help, hints, or suggestions would be greatly appreciated!

Code up to now

from selenium.webdriver import Chrome
from selenium.webdriver import ChromeOptions

webdriver = "C:\Program Files (x86)\Google\Chrome\Application\chromedriver.exe"
options = ChromeOptions()
driver = Chrome(webdriver, options = options)

url = 'https://app.powerbi.com/view?r=eyJrIjoiNmE0YjZiNzUtZjk2OS00ZTg4LThlMzMtNTRhNzE0NzA4YmZlIiwidCI6Ijc3NDEwMTk1LTE0ZTEtNGZiOC05MDRiLWFiMTg5MjAyMzY2NyIsImMiOjh9&pageName=ReportSectiona329b3eafd86059a947b'

driver.get(url)
  
#click on "Delivery" button
driver.find_element_by_xpath('//*[@id="pvExplorationHost"]/div/div/exploration/div/explore-canvas-modern/div/div[2]/div/div[2]/div[2]/visual-container-repeat/visual-container-modern[11]/transform').click()

#click on "Reported deliveries >> By number of vaccines" click box
driver.find_element_by_xpath('//*[@id="pvExplorationHost"]/div/div/exploration/div/explore-canvas-modern/div/div[2]/div/div[2]/div[2]/visual-container-repeat/visual-container-modern[8]/transform/div/div[3]/div/visual-modern/div/button').click()


from Scraping a PowerBI dashboard report: expanding rows and loading/saving values in a scrolling table

No comments:

Post a Comment