I want to crawl the Google Play store and get the app IDs of every app in a particular category. When I ran the code below, I got the IDs of only the first 49 apps. How can I get all of them? The URL I used for scraping was https://play.google.com/store/search?q=sports&c=apps&hl=en.
import urllib.request, urllib.error, urllib.parse
from bs4 import BeautifulSoup

url = input('Enter:')
html = urllib.request.urlopen(url).read()
soup = BeautifulSoup(html, 'html.parser')
tags = soup('a')
l = list()
for tag in tags:
    # Default to '' so that anchors without an href do not raise on .find()
    x = tag.get('href', '')
    if x.find("/store/apps/details?id=") != -1:
        # Drop the 23-character "/store/apps/details?id=" prefix, keep the id
        if not (x[23:] in l):
            l.append(x[23:])
print(l)
On dynamic sites like this, it's better to call the site's internal XHR endpoints than to parse the HTML. The page sends a POST request for each batch of 48 apps it displays, and you can issue the same request from your script. This blog post has an example of fetching app reviews from the Google Play store this way.
Is it possible to do this in Python?
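Yes, the same idea can be sketched in Python. The code below is only a rough outline under stated assumptions, not the actual Play store API: the form fields `start` and `num` in `post_page` are placeholders, and the real endpoint URL and field names have to be copied from the POST request shown in your browser's network tab. The helper names (`extract_app_ids`, `post_page`, `collect_app_ids`) are made up for this example. It keeps requesting batches of 48 until a batch brings no new IDs.

```python
import re
import urllib.parse
import urllib.request

# Matches the id part of links such as /store/apps/details?id=com.example.app
APP_ID_RE = re.compile(r'/store/apps/details\?id=([\w.]+)')


def extract_app_ids(html):
    """Return the app ids found in html, in order, without duplicates."""
    ids = []
    for app_id in APP_ID_RE.findall(html):
        if app_id not in ids:
            ids.append(app_id)
    return ids


def post_page(url, start, num=48):
    """POST one batch request and return the response body as text.

    ASSUMPTION: 'start' and 'num' are placeholder field names. Replace them
    (and url) with whatever the real XHR in the network tab sends.
    """
    data = urllib.parse.urlencode({'start': start, 'num': num}).encode()
    with urllib.request.urlopen(url, data=data) as resp:
        return resp.read().decode('utf-8', errors='replace')


def collect_app_ids(fetch_page, page_size=48, max_pages=50):
    """Call fetch_page(offset) for successive offsets until nothing new appears."""
    all_ids = []
    for page in range(max_pages):
        html = fetch_page(page * page_size)
        new_ids = [i for i in extract_app_ids(html) if i not in all_ids]
        if not new_ids:
            break  # an empty or repeated batch means the results are exhausted
        all_ids.extend(new_ids)
    return all_ids
```

With the URL of the observed XHR stored in a variable of your own (say `xhr_url`), you would call it as `collect_app_ids(lambda start: post_page(xhr_url, start))`. Passing the fetch function in as an argument also lets you try the paging logic on saved HTML before touching the live site.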