I want to reproduce in Python a query to an API whose URL I know, so that I can change the product ID
(P20201, P618001 ...) and get back the info for that product.
The query is:
https://api.bazaarvoice.com/data/batch.json?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0
So I tried to do it in Scrapy with:
C:\Users\antoi>python
Python 3.6.7 (v3.6.7:6ec5cf24b7, Oct 20 2018, 13:35:33) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> params = {'passkey': ['iohrnzjadededr160osgfvimy'], 'apiversion': ['5.5'], 'displaycode': ['3232-fr_fr'],
... 'resource.q0': ['products'], 'filter.q0': ['id:eq:P618001'], 'stats.q0': ['questions,reviews'],
... 'filteredstats.q0': ['questions,reviews'], 'filter_questions.q0': ['contentlocale:eq:fr_FR'],
... 'filter_answers.q0': ['contentlocale:eq:fr_FR'], 'filter_reviews.q0': ['contentlocale:eq:fr_FR'],
... 'filter_reviewcomments.q0': ['contentlocale:eq:fr_FR'], 'resource.q1': ['questions'],
... 'filter.q1': ['productid:eq:P618001', 'contentlocale:eq:fr_FR'],
... 'sort.q1': ['lastapprovedanswersubmissiontime:desc'],
... 'stats.q1': ['questions'], 'filteredstats.q1': ['questions'],
... 'include.q1': ['authors,products,answers'], 'filter_questions.q1': ['contentlocale:eq:fr_FR'],
... 'filter_answers.q1': ['contentlocale:eq:fr_FR'], 'limit.q1': ['10'],
... 'offset.q1': ['0'], 'limit_answers.q1': ['10'], 'resource.q2': ['reviews'],
... 'filter.q2': ['isratingsonly:eq:false', 'productid:eq:P618001',
... 'contentlocale:eq:fr_FR'], 'sort.q2': ['submissiontime:desc'],
... 'stats.q2': ['reviews'], 'filteredstats.q2': ['reviews'], 'include.q2': ['authors,products,comments'],
... 'filter_reviews.q2': ['contentlocale:eq:fr_FR'], 'filter_reviewcomments.q2': ['contentlocale:eq:fr_FR'],
... 'filter_comments.q2': ['contentlocale:eq:fr_FR'], 'limit.q2': ['5'], 'offset.q2': ['0'],
... 'limit_comments.q2': ['3'], 'callback': ['BV._internal.dataHandler0']}
>>> product_id = 'P618001'
>>> params['filter.q0'] = 'id:eq:' + product_id
>>> params['filter.q1'][0] = 'productid:eq:' + product_id
>>> params['filter.q2'][1] = 'productid:eq:' + product_id
>>> perfume = requests.get("https://api.bazaarvoice.com/data/batch.json", params=self.params)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'self' is not defined
>>> perfume = requests.get("https://api.bazaarvoice.com/data/batch.json", params=params)
But calling perfume.json() raises json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0):
>>> print("perfume.json(): ", perfume.json())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\antoi\AppData\Roaming\Python\Python36\site-packages\requests\models.py", line 897, in json
return complexjson.loads(self.text, **kwargs)
File "C:\Python36\lib\site-packages\simplejson\__init__.py", line 518, in loads
return _default_decoder.decode(s)
File "C:\Python36\lib\site-packages\simplejson\decoder.py", line 370, in decode
obj, end = self.raw_decode(s)
File "C:\Python36\lib\site-packages\simplejson\decoder.py", line 400, in raw_decode
return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
So how can I reproduce a query to an API from its URL?
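Why would you want to recreate it? The request works; what you get back just isn't JSON, at least not initially. The JSON comes wrapped in a call to the function named in the callback parameter (JSONP), so you need to scoop it out from between the parentheses, and then you get the data.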
See this:
import json
import re
import requests
response = requests.get("https://api.bazaarvoice.com/data/batch.json?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0").text
data = json.loads(re.search(r"\((.*)\)", response).group(1))
print(data["BatchedResults"]["q2"]["Results"][0])
Output:
{'Id': '158638580', 'CID': '9b474509-239b-5f43-9926-6ca76192c563', 'SourceClient': 'sephora-fr', 'Badges': {'loyaltyYes--Im-a-beauty-insider': {'ContentType': 'REVIEW', 'Id': 'loyaltyYes--Im-a-beauty-insider', 'BadgeType': 'Custom'}}, 'BadgesOrder': ['loyaltyYes--Im-a-beauty-insider'], 'LastModeratedTime': '2020-06-24T11:45:09.000+00:00', 'LastModificationTime': '2020-07-13T16:40:52.000+00:00', 'ProductId': 'P618001', 'CampaignId': 'BV_PIE_ONLINE', 'ContextDataValuesOrder': ['Gender', 'Age', 'Eyes', 'Skin'], 'UserLocation': 'Clermont Ferrand', 'AuthorId': '72416110', 'ContentLocale': 'fr_FR', 'IsFeatured': False, 'TotalInappropriateFeedbackCount': 0, 'TotalClientResponseCount': 0, 'TotalCommentCount': 0, 'Rating': 5, 'IsRatingsOnly': False, 'IsRecommended': True, 'Helpfulness': 1.0, 'TotalFeedbackCount': 1, 'TotalNegativeFeedbackCount': 0, 'TotalPositiveFeedbackCount': 1, 'ModerationStatus': 'APPROVED', 'SubmissionId': 'r23232-fr__159299794LEh4bHKjq', 'SubmissionTime': '2020-06-24T11:25:40.000+00:00', 'ReviewText': 'Très contente de mon achat. Je cherchais ce parfum depuis un temps en magasin et je suis heureuse qu’il soit disponible en ligne il sent tellement bon !! En plus en promo, génial ! \r\nLivraison très rapide !', 'Title': 'Satisfaite', 'UserNickname': 'oceaned03', 'ContextDataValues': {'Age': {'Value': '18to24', 'Id': 'Age'}, 'Gender': {'Value': 'Female', 'Id': 'Gender'}, 'Skin': {'Value': 'Seche', 'Id': 'Skin'}, 'Eyes': {'Value': 'Bleus', 'Id': 'Eyes'}}, 'RatingRange': 5, 'AdditionalFieldsOrder': [], 'ProductRecommendationIds': [], 'Cons': None, 'TagDimensionsOrder': [], 'CommentIds': [], 'TagDimensions': {}, 'Videos': [], 'Photos': [], 'ClientResponses': [], 'InappropriateFeedbackList': [], 'SecondaryRatings': {}, 'AdditionalFields': {}, 'SecondaryRatingsOrder': [], 'IsSyndicated': False, 'Pros': None}
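In case it helps, here is a slightly more defensive sketch of the same "scoop it out" step. The helper name unwrap_jsonp and the sample string are made up for illustration; the sample only imitates the shape of the real response so that the snippet runs without a network call.

```python
import json
import re


def unwrap_jsonp(text):
    """Return the JSON payload of a JSONP response such as `callback({...})`."""
    # DOTALL lets the payload span several lines; the greedy match runs to
    # the last closing parenthesis, so parentheses inside the JSON are kept.
    match = re.search(r"^[^(]*\((.*)\)\s*;?\s*$", text.strip(), flags=re.DOTALL)
    if match is None:
        raise ValueError("response does not look like JSONP")
    return json.loads(match.group(1))


# Hypothetical stand-in for response.text, shaped like the wrapped reply.
sample = 'BV._internal.dataHandler0({"BatchedResults": {"q2": {"Results": [{"Id": "1"}]}}})'
data = unwrap_jsonp(sample)
print(data["BatchedResults"]["q2"]["Results"][0])
```

With the real request you would pass response.text (or requests.get(...).text) instead of sample. Raising ValueError when there is no wrapper makes it easier to notice if the API ever answers with something else, such as an error page.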
If you want to "decode" the URL, use parse_qs:
from urllib.parse import parse_qs
api_url = "https://api.bazaarvoice.com/data/batch.json?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0"
print(parse_qs(api_url))
This gives you a dict:
{'https://api.bazaarvoice.com/data/batch.json?passkey': ['iohrnzjadededr160osgfvimy'], 'apiversion': ['5.5'], 'displaycode': ['3232-fr_fr'], 'resource.q0': ['products'], 'filter.q0': ['id:eq:P618001'], 'stats.q0': ['questions,reviews'], 'filteredstats.q0': ['questions,reviews'], 'filter_questions.q0': ['contentlocale:eq:fr_FR'], 'filter_answers.q0': ['contentlocale:eq:fr_FR'], 'filter_reviews.q0': ['contentlocale:eq:fr_FR'], 'filter_reviewcomments.q0': ['contentlocale:eq:fr_FR'], 'resource.q1': ['questions'], 'filter.q1': ['productid:eq:P618001', 'contentlocale:eq:fr_FR'], 'sort.q1': ['lastapprovedanswersubmissiontime:desc'], 'stats.q1': ['questions'], 'filteredstats.q1': ['questions'], 'include.q1': ['authors,products,answers'], 'filter_questions.q1': ['contentlocale:eq:fr_FR'], 'filter_answers.q1': ['contentlocale:eq:fr_FR'], 'limit.q1': ['10'], 'offset.q1': ['0'], 'limit_answers.q1': ['10'], 'resource.q2': ['reviews'], 'filter.q2': ['isratingsonly:eq:false', 'productid:eq:P618001', 'contentlocale:eq:fr_FR'], 'sort.q2': ['submissiontime:desc'], 'stats.q2': ['reviews'], 'filteredstats.q2': ['reviews'], 'include.q2': ['authors,products,comments'], 'filter_reviews.q2': ['contentlocale:eq:fr_FR'], 'filter_reviewcomments.q2': ['contentlocale:eq:fr_FR'], 'filter_comments.q2': ['contentlocale:eq:fr_FR'], 'limit.q2': ['5'], 'offset.q2': ['0'], 'limit_comments.q2': ['3'], 'callback': ['BV._internal.dataHandler0']}
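Note that the first key in the dict above is 'https://api.bazaarvoice.com/data/batch.json?passkey', because parse_qs was handed the whole URL rather than just the query string. A small sketch of splitting the URL first with urlsplit, using a shortened stand-in for the full URL (the variable names base_url and params are my own):

```python
from urllib.parse import parse_qs, urlsplit

# Shortened stand-in for the full URL above, to keep the example readable.
api_url = (
    "https://api.bazaarvoice.com/data/batch.json"
    "?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5"
    "&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR"
)

parts = urlsplit(api_url)
params = parse_qs(parts.query)  # parse only the part after "?"
base_url = f"{parts.scheme}://{parts.netloc}{parts.path}"

print(base_url)
print(params)
```

This way the first key is plain 'passkey', and repeated parameters such as filter.q1 still end up as a list of values under one key.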
If you want to experiment with changing the values, you might want to try this:
from urllib.parse import parse_qsl, urlencode
import json
import re
import requests
url = "https://api.bazaarvoice.com/data/batch.json?"
payload = "passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0"
decoded = parse_qsl(payload)
decoded.pop(4)
decoded.insert(4, ('filter.q0', 'id:eq:P20201'))
response = requests.get(f"{url}{urlencode(decoded)}").text
data = json.loads(re.search(r"\((.*)\)", response).group(1))
print(data["BatchedResults"]["q2"]["Results"][0])
Thanks for your answer! I need to recreate it in order to modify the product_id. Here I used P618001 as an example, but I would like to be able to test with others like P20201 ...
@RevolucionforMonica I've updated the answer.
Thanks for that. Unfortunately it always hands back the same answer despite modifying decoded.insert(4, ('filter.q0', 'id:eq:P2020'))
There's a typo there. I put P2020 but you gave P20201. I've fixed that.
Yes, but I just noticed that I need to modify not only 'filter.q0' but also 'filter.q1'[0] and 'filter.q2'[1].
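Following up on that last point, here is a minimal sketch that swaps the product ID in every filter at once instead of by list position. The helper name swap_product_id is hypothetical and the query string is a shortened stand-in for the full one; the snippet only builds the URL and has not been checked against the live API.

```python
from urllib.parse import parse_qsl, urlencode

BASE_URL = "https://api.bazaarvoice.com/data/batch.json"


def swap_product_id(query, new_id):
    """Return `query` with every id:eq: / productid:eq: filter set to `new_id`."""
    pairs = []
    for key, value in parse_qsl(query):
        # "productid:eq:P618001" -> field "productid"; values without
        # ":eq:" give an empty field and are left untouched.
        field, _, _old_id = value.rpartition(":eq:")
        if key.startswith("filter.") and field in ("id", "productid"):
            value = f"{field}:eq:{new_id}"
        pairs.append((key, value))
    return urlencode(pairs)


# Shortened stand-in for the full query string used in the answer.
query = (
    "resource.q0=products&filter.q0=id%3Aeq%3AP618001"
    "&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR"
    "&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001"
)
new_url = f"{BASE_URL}?{swap_product_id(query, 'P20201')}"
print(new_url)
```

Working on the list from parse_qsl rather than on a dict keeps repeated keys such as filter.q1 and their order intact, and matching on the filter's field name avoids depending on positions like [0] or [1]. The resulting new_url can then be fetched with requests.get and unwrapped the same way as in the answer.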