Warm tip: This article is reproduced from serverfault.com, please click

How to extract info within a #shadow-root (open) using Selenium Python?

发布于 2020-11-27 23:19:56

I got the next url related to an online store https://www.tiendasjumbo.co/buscar?q=mani and I can't extract the product label an another fields:

from selenium import webdriver
import time
from random import randint

driver = webdriver.Firefox(executable_path= "C:\Program Files (x86)\geckodriver.exe")
driver.implicitly_wait(10)
time.sleep(4)

url =  "https://www.tiendasjumbo.co/buscar?q=mani"
driver.maximize_window()
driver.get(url)
driver.find_element_by_xpath('//h1[@class="impulse-title"]')

What am I doing wrong, I also tried to switch the iframes but there is no way to achieve my goal? any help is welcome.enter image description here

Questioner
Alexis AG
Viewed
0
DebanjanB 2020-11-29 06:33:01

The products within the website https://www.tiendasjumbo.co/buscar?q=mani are located within a #shadow-root (open).

impulse-search


Solution

To extract the product label you have to use shadowRoot.querySelector() and you can use the following Locator Strategy:

  • Code Block:

    driver.get('https://www.tiendasjumbo.co/buscar?q=mani')
    item = driver.execute_script("return document.querySelector('impulse-search').shadowRoot.querySelector('div.group-name-brand h1.impulse-title span.formatted-text')")
    print(item.text)
    
  • Console Output:

    La especial mezcla de nueces, maní, almendras y marañones x 450 g
    

References

You can find a couple of relevant discussions in: