So I'm trying to compare two lists using Python. One contains about 1000 links I fetched from a website. The other contains a few words that might appear in a link in the first list. If that is the case, I want to get an output. I printed the first list, and it actually works. For example, if the link is "https://steamcdn-a.swap.gg/apps/730/icons/econ/stickers/eslkatowice2015/counterlogic.f49adabd6052a558bff3fe09f5a09e0675737936.png" and my list contains the word "eslkatowice2015", I want to get an output using the print() function. My code looks like this:
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'lxml')
Bot_Stickers = soup.find_all('img', class_='csi')
for sticker in Bot_Stickers:
    for i in StickerIDs:
        if i in sticker:
            print("found")
driver.close()
Now the problem is that I don't get any output, which should be impossible: if I compare the lists manually, words from my list clearly appear in the list of links. When trying to fix it, I always got a NoneType error. driver.page_source is defined above by some Selenium code I used to access the site and click some JavaScript elements, so that everything can be found. I hope it's more or less clear what I wanted to achieve.
Edit: the StickerIDs variable is the second list, containing the words I want to check for.
A NoneType error means that you might be getting a None somewhere, so it's probably safer to check the results returned by find_all for None.
It's been a while since I used BeautifulSoup, but if I remember correctly, find_all returns a list of Beautiful Soup tags that match the search criteria, not URLs. You need to get the href attribute from the tag before checking if it contains a keyword.
Something like this:
page_source = driver.page_source
soup = BeautifulSoup(page_source, 'lxml')
Bot_Stickers = soup.find_all('img', class_='csi')
if Bot_Stickers and StickerIDs:
    for sticker in Bot_Stickers:
        for i in StickerIDs:
            if i in sticker.get("href"):  # get href attribute of the img tag
                print("found")
else:
    # one of the two lists is empty or None, so print both to see which
    print("Bot_Stickers:", Bot_Stickers)
    print("StickerIDs:", StickerIDs)
driver.close()
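To see why the attribute name matters, here is a small sketch on made-up markup. The HTML string and the example.com URL are assumptions for illustration, not the real page, and html.parser is used only so the snippet runs without lxml. It shows that asking a tag for an attribute it does not have gives None, and that testing a keyword directly against the tag does not look at its attributes at all.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page source.
html = '<img class="csi" src="https://example.com/stickers/eslkatowice2015/a.png">'
soup = BeautifulSoup(html, "html.parser")

tag = soup.find_all("img", class_="csi")[0]

# .get() returns None when the attribute is missing from the tag.
href_value = tag.get("href")
# .get() returns the attribute's string value when it is present.
src_value = tag.get("src")

# Membership on a Tag checks its child nodes, not its attributes,
# and an <img> has no children.
keyword_in_tag = "eslkatowice2015" in tag
# Membership on the attribute string is an ordinary substring test.
keyword_in_src = "eslkatowice2015" in src_value

print(href_value, keyword_in_tag, keyword_in_src)
```

If the real tags turn out to carry the link in a different attribute, only the name passed to get() changes.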
I just tried it with the other script, as I had before, and I also get the following error:
Traceback (most recent call last):
  File "C:\Users\timjo\Desktop\BOTSCRIPT\Bot.py", line 27, in <module>
    if i in sticker.get("href"):
TypeError: argument of type 'NoneType' is not iterable
Not sure how to solve that.
@Timeler The error says you're trying to iterate over a None, so check both Bot_Stickers and StickerIDs before the for loops. It's good, safe practice to check for values that might break your code. Once you know which one is None, you should start debugging why you're getting a None instead of, say, a list of tags. I've updated the answer.
If I try it as you edited the solution, I get the same error as before. If I replace sticker.get("href") with sticker, I get no error but no output either, so either way it doesn't do what it should.
Ok, that makes sense. That means the <img> tag has no attribute named href. You should check the HTML source and see which attribute holds the image link; it could be something like src. After that, change href to that attribute name, and it should work.
Works, thanks a lot.