
Reproducing a query to an API from its URL

Posted on 2020-11-30 15:43:26

I know the URL of an API query and want to reproduce it in Python, so that I can change the product ID (P20201, P618001, ...) and get back the info for each product.

The query is:

https://api.bazaarvoice.com/data/batch.json?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0

So I tried it in a Python shell with requests:

C:\Users\antoi>python
Python 3.6.7 (v3.6.7:6ec5cf24b7, Oct 20 2018, 13:35:33) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> params = {'passkey': ['iohrnzjadededr160osgfvimy'], 'apiversion': ['5.5'], 'displaycode': ['3232-fr_fr'],
...               'resource.q0': ['products'], 'filter.q0': ['id:eq:P618001'], 'stats.q0': ['questions,reviews'],
...               'filteredstats.q0': ['questions,reviews'], 'filter_questions.q0': ['contentlocale:eq:fr_FR'],
...               'filter_answers.q0': ['contentlocale:eq:fr_FR'], 'filter_reviews.q0': ['contentlocale:eq:fr_FR'],
...               'filter_reviewcomments.q0': ['contentlocale:eq:fr_FR'], 'resource.q1': ['questions'],
...               'filter.q1': ['productid:eq:P618001', 'contentlocale:eq:fr_FR'],
...               'sort.q1': ['lastapprovedanswersubmissiontime:desc'],
...               'stats.q1': ['questions'], 'filteredstats.q1': ['questions'],
...               'include.q1': ['authors,products,answers'], 'filter_questions.q1': ['contentlocale:eq:fr_FR'],
...               'filter_answers.q1': ['contentlocale:eq:fr_FR'], 'limit.q1': ['10'],
...               'offset.q1': ['0'], 'limit_answers.q1': ['10'], 'resource.q2': ['reviews'],
...               'filter.q2': ['isratingsonly:eq:false', 'productid:eq:P618001',
...                             'contentlocale:eq:fr_FR'], 'sort.q2': ['submissiontime:desc'],
...               'stats.q2': ['reviews'], 'filteredstats.q2': ['reviews'], 'include.q2': ['authors,products,comments'],
...               'filter_reviews.q2': ['contentlocale:eq:fr_FR'], 'filter_reviewcomments.q2': ['contentlocale:eq:fr_FR'],
...               'filter_comments.q2': ['contentlocale:eq:fr_FR'], 'limit.q2': ['5'], 'offset.q2': ['0'],
...               'limit_comments.q2': ['3'], 'callback': ['BV._internal.dataHandler0']}
>>> product_id = 'P618001'
>>> params['filter.q0'] = 'id:eq:' + product_id
>>> params['filter.q1'][0] = 'productid:eq:' + product_id
>>> params['filter.q2'][1] = 'productid:eq:' + product_id
>>> perfume = requests.get("https://api.bazaarvoice.com/data/batch.json", params=self.params)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'self' is not defined
>>> perfume = requests.get("https://api.bazaarvoice.com/data/batch.json", params=params)

But calling perfume.json() raises JSONDecodeError: Expecting value: line 1 column 1 (char 0):

>>> print("perfume.json(): ", perfume.json())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\antoi\AppData\Roaming\Python\Python36\site-packages\requests\models.py", line 897, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\Python36\lib\site-packages\simplejson\__init__.py", line 518, in loads
    return _default_decoder.decode(s)
  File "C:\Python36\lib\site-packages\simplejson\decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "C:\Python36\lib\site-packages\simplejson\decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

So how can I reproduce a query to an API from its URL?

Questioner
Revolucion for Monica
baduker 2020-12-01 21:52:57

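Why would you want to recreate it? The request works; what you get back just isn't plain JSON, at least not initially. Because of the callback parameter, the JSON arrives wrapped in a call to BV._internal.dataHandler0(...), so you need to scoop it out of the parentheses and then you get the data.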

See this:

import json
import re

import requests

response = requests.get("https://api.bazaarvoice.com/data/batch.json?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0").text

data = json.loads(re.search(r"\((.*)\)", response).group(1))
print(data["BatchedResults"]["q2"]["Results"][0])

Output:

{'Id': '158638580', 'CID': '9b474509-239b-5f43-9926-6ca76192c563', 'SourceClient': 'sephora-fr', 'Badges': {'loyaltyYes--Im-a-beauty-insider': {'ContentType': 'REVIEW', 'Id': 'loyaltyYes--Im-a-beauty-insider', 'BadgeType': 'Custom'}}, 'BadgesOrder': ['loyaltyYes--Im-a-beauty-insider'], 'LastModeratedTime': '2020-06-24T11:45:09.000+00:00', 'LastModificationTime': '2020-07-13T16:40:52.000+00:00', 'ProductId': 'P618001', 'CampaignId': 'BV_PIE_ONLINE', 'ContextDataValuesOrder': ['Gender', 'Age', 'Eyes', 'Skin'], 'UserLocation': 'Clermont Ferrand', 'AuthorId': '72416110', 'ContentLocale': 'fr_FR', 'IsFeatured': False, 'TotalInappropriateFeedbackCount': 0, 'TotalClientResponseCount': 0, 'TotalCommentCount': 0, 'Rating': 5, 'IsRatingsOnly': False, 'IsRecommended': True, 'Helpfulness': 1.0, 'TotalFeedbackCount': 1, 'TotalNegativeFeedbackCount': 0, 'TotalPositiveFeedbackCount': 1, 'ModerationStatus': 'APPROVED', 'SubmissionId': 'r23232-fr__159299794LEh4bHKjq', 'SubmissionTime': '2020-06-24T11:25:40.000+00:00', 'ReviewText': 'Très contente de mon achat. Je cherchais ce parfum depuis un temps en magasin et je suis heureuse qu’il soit disponible en ligne il sent tellement bon !! En plus en promo, génial ! \r\nLivraison très rapide !', 'Title': 'Satisfaite', 'UserNickname': 'oceaned03', 'ContextDataValues': {'Age': {'Value': '18to24', 'Id': 'Age'}, 'Gender': {'Value': 'Female', 'Id': 'Gender'}, 'Skin': {'Value': 'Seche', 'Id': 'Skin'}, 'Eyes': {'Value': 'Bleus', 'Id': 'Eyes'}}, 'RatingRange': 5, 'AdditionalFieldsOrder': [], 'ProductRecommendationIds': [], 'Cons': None, 'TagDimensionsOrder': [], 'CommentIds': [], 'TagDimensions': {}, 'Videos': [], 'Photos': [], 'ClientResponses': [], 'InappropriateFeedbackList': [], 'SecondaryRatings': {}, 'AdditionalFields': {}, 'SecondaryRatingsOrder': [], 'IsSyndicated': False, 'Pros': None}
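
The scooping step can also be pulled out into a small helper that is easy to try without hitting the API. This is only a sketch: unwrap_jsonp is a hypothetical name, and the sample body is made up to mimic the shape of the real response. It slices between the first "(" and the last ")" instead of using a regex.

```python
import json


def unwrap_jsonp(text):
    """Return the JSON payload from a JSONP body such as callback({...})."""
    start = text.find("(")
    end = text.rfind(")")
    if start == -1 or end <= start:
        raise ValueError("response does not look like JSONP")
    return json.loads(text[start + 1:end])


# Made-up sample body, shaped like the real response but much smaller.
sample = 'BV._internal.dataHandler0({"BatchedResults": {"q2": {"Results": [{"Rating": 5}]}}})'
data = unwrap_jsonp(sample)
print(data["BatchedResults"]["q2"]["Results"][0]["Rating"])
```

With a real response you would pass response.text (or the .text string from the snippet above) to the helper.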

If you want to "decode" the URL, use parse_qs:

from urllib.parse import parse_qs, urlparse

api_url = "https://api.bazaarvoice.com/data/batch.json?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0"
# Parse only the query string; passing the full URL would fold the scheme, host and path into the first key.
print(parse_qs(urlparse(api_url).query))

This gives you a dict that maps each parameter to a list of its values:

{'passkey': ['iohrnzjadededr160osgfvimy'], 'apiversion': ['5.5'], 'displaycode': ['3232-fr_fr'], 'resource.q0': ['products'], 'filter.q0': ['id:eq:P618001'], 'stats.q0': ['questions,reviews'], 'filteredstats.q0': ['questions,reviews'], 'filter_questions.q0': ['contentlocale:eq:fr_FR'], 'filter_answers.q0': ['contentlocale:eq:fr_FR'], 'filter_reviews.q0': ['contentlocale:eq:fr_FR'], 'filter_reviewcomments.q0': ['contentlocale:eq:fr_FR'], 'resource.q1': ['questions'], 'filter.q1': ['productid:eq:P618001', 'contentlocale:eq:fr_FR'], 'sort.q1': ['lastapprovedanswersubmissiontime:desc'], 'stats.q1': ['questions'], 'filteredstats.q1': ['questions'], 'include.q1': ['authors,products,answers'], 'filter_questions.q1': ['contentlocale:eq:fr_FR'], 'filter_answers.q1': ['contentlocale:eq:fr_FR'], 'limit.q1': ['10'], 'offset.q1': ['0'], 'limit_answers.q1': ['10'], 'resource.q2': ['reviews'], 'filter.q2': ['isratingsonly:eq:false', 'productid:eq:P618001', 'contentlocale:eq:fr_FR'], 'sort.q2': ['submissiontime:desc'], 'stats.q2': ['reviews'], 'filteredstats.q2': ['reviews'], 'include.q2': ['authors,products,comments'], 'filter_reviews.q2': ['contentlocale:eq:fr_FR'], 'filter_reviewcomments.q2': ['contentlocale:eq:fr_FR'], 'filter_comments.q2': ['contentlocale:eq:fr_FR'], 'limit.q2': ['5'], 'offset.q2': ['0'], 'limit_comments.q2': ['3'], 'callback': ['BV._internal.dataHandler0']}
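
A dict of lists like this can be edited and turned back into a query string. The sketch below uses a shortened, illustrative query (not the full one) to show the round trip: repeated keys such as filter.q1 come back as one list, and urlencode with doseq=True writes each list item out again as its own key=value pair.

```python
from urllib.parse import parse_qs, urlencode

# Shortened query string for illustration; the real one has many more keys.
query = "resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10"

params = parse_qs(query)
# Repeated keys become a list with one entry per occurrence.
print(params["filter.q1"])

# Change one entry, then re-encode; doseq=True emits every list item
# as a separate key=value pair instead of encoding the list as one value.
params["filter.q1"][0] = "productid:eq:P20201"
rebuilt = urlencode(params, doseq=True)
print(rebuilt)
```

requests treats list values the same way, so the edited dict can also be passed directly as params= to requests.get.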

If you want to experiment with changing the values, you might want to try this:

from urllib.parse import parse_qsl, urlencode
import json
import re

import requests

url = "https://api.bazaarvoice.com/data/batch.json?"
payload = "passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0"

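product_id = "P20201"
# parse_qsl returns (key, value) pairs in their original order.
# Swap the product ID in every value that mentions it, so the q0, q1 and q2 filters all change together.
decoded = [(key, value.replace("P618001", product_id)) for key, value in parse_qsl(payload)]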

response = requests.get(f"{url}{urlencode(decoded)}").text

data = json.loads(re.search(r"\((.*)\)", response).group(1))
print(data["BatchedResults"]["q2"]["Results"][0])