
python 3.x - Reproducing a query to an API from its URL

Posted on 2020-11-30 15:43:26

I want to reproduce, in Python, a query to an API whose URL I know, so that I can change the product ID (P20201, P618001, ...) and get back the information for each product.

The query is:

https://api.bazaarvoice.com/data/batch.json?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0

So I tried to do it in Scrapy in the following way:

C:\Users\antoi>python
Python 3.6.7 (v3.6.7:6ec5cf24b7, Oct 20 2018, 13:35:33) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> params = {'passkey': ['iohrnzjadededr160osgfvimy'], 'apiversion': ['5.5'], 'displaycode': ['3232-fr_fr'],
...               'resource.q0': ['products'], 'filter.q0': ['id:eq:P618001'], 'stats.q0': ['questions,reviews'],
...               'filteredstats.q0': ['questions,reviews'], 'filter_questions.q0': ['contentlocale:eq:fr_FR'],
...               'filter_answers.q0': ['contentlocale:eq:fr_FR'], 'filter_reviews.q0': ['contentlocale:eq:fr_FR'],
...               'filter_reviewcomments.q0': ['contentlocale:eq:fr_FR'], 'resource.q1': ['questions'],
...               'filter.q1': ['productid:eq:P618001', 'contentlocale:eq:fr_FR'],
...               'sort.q1': ['lastapprovedanswersubmissiontime:desc'],
...               'stats.q1': ['questions'], 'filteredstats.q1': ['questions'],
...               'include.q1': ['authors,products,answers'], 'filter_questions.q1': ['contentlocale:eq:fr_FR'],
...               'filter_answers.q1': ['contentlocale:eq:fr_FR'], 'limit.q1': ['10'],
...               'offset.q1': ['0'], 'limit_answers.q1': ['10'], 'resource.q2': ['reviews'],
...               'filter.q2': ['isratingsonly:eq:false', 'productid:eq:P618001',
...                             'contentlocale:eq:fr_FR'], 'sort.q2': ['submissiontime:desc'],
...               'stats.q2': ['reviews'], 'filteredstats.q2': ['reviews'], 'include.q2': ['authors,products,comments'],
...               'filter_reviews.q2': ['contentlocale:eq:fr_FR'], 'filter_reviewcomments.q2': ['contentlocale:eq:fr_FR'],
...               'filter_comments.q2': ['contentlocale:eq:fr_FR'], 'limit.q2': ['5'], 'offset.q2': ['0'],
...               'limit_comments.q2': ['3'], 'callback': ['BV._internal.dataHandler0']}
>>> product_id = 'P618001'
>>> params['filter.q0'] = 'id:eq:' + product_id
>>> params['filter.q1'][0] = 'productid:eq:' + product_id
>>> params['filter.q2'][1] = 'productid:eq:' + product_id
>>> perfume = requests.get("https://api.bazaarvoice.com/data/batch.json", params=self.params)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'self' is not defined
>>> perfume = requests.get("https://api.bazaarvoice.com/data/batch.json", params=params)

But it raises json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0):

>>> print("perfume.json(): ", perfume.json())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\antoi\AppData\Roaming\Python\Python36\site-packages\requests\models.py", line 897, in json
    return complexjson.loads(self.text, **kwargs)
  File "C:\Python36\lib\site-packages\simplejson\__init__.py", line 518, in loads
    return _default_decoder.decode(s)
  File "C:\Python36\lib\site-packages\simplejson\decoder.py", line 370, in decode
    obj, end = self.raw_decode(s)
  File "C:\Python36\lib\site-packages\simplejson\decoder.py", line 400, in raw_decode
    return self.scan_once(s, idx=_w(s, idx).end())
simplejson.errors.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

So how can I reproduce a query to an API from its URL?

Questioner: Revolucion for Monica
Answered by baduker on 2020-12-01 21:52:57

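If the request works, why recreate it? What you get back is not JSON, though, at least not initially. Because the URL carries callback=BV._internal.dataHandler0, the body appears to be JSONP: the JSON is wrapped in a call to that function, which is why .json() fails at the very first character. You need to dig the JSON out before you can get at the data.

See this: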

import json
import re

import requests

response = requests.get("https://api.bazaarvoice.com/data/batch.json?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0").text

data = json.loads(re.search(r"\((.*)\)", response).group(1))
print(data["BatchedResults"]["q2"]["Results"][0])

Output:

{'Id': '158638580', 'CID': '9b474509-239b-5f43-9926-6ca76192c563', 'SourceClient': 'sephora-fr', 'Badges': {'loyaltyYes--Im-a-beauty-insider': {'ContentType': 'REVIEW', 'Id': 'loyaltyYes--Im-a-beauty-insider', 'BadgeType': 'Custom'}}, 'BadgesOrder': ['loyaltyYes--Im-a-beauty-insider'], 'LastModeratedTime': '2020-06-24T11:45:09.000+00:00', 'LastModificationTime': '2020-07-13T16:40:52.000+00:00', 'ProductId': 'P618001', 'CampaignId': 'BV_PIE_ONLINE', 'ContextDataValuesOrder': ['Gender', 'Age', 'Eyes', 'Skin'], 'UserLocation': 'Clermont Ferrand', 'AuthorId': '72416110', 'ContentLocale': 'fr_FR', 'IsFeatured': False, 'TotalInappropriateFeedbackCount': 0, 'TotalClientResponseCount': 0, 'TotalCommentCount': 0, 'Rating': 5, 'IsRatingsOnly': False, 'IsRecommended': True, 'Helpfulness': 1.0, 'TotalFeedbackCount': 1, 'TotalNegativeFeedbackCount': 0, 'TotalPositiveFeedbackCount': 1, 'ModerationStatus': 'APPROVED', 'SubmissionId': 'r23232-fr__159299794LEh4bHKjq', 'SubmissionTime': '2020-06-24T11:25:40.000+00:00', 'ReviewText': 'Très contente de mon achat. Je cherchais ce parfum depuis un temps en magasin et je suis heureuse qu’il soit disponible en ligne il sent tellement bon !! En plus en promo, génial ! \r\nLivraison très rapide !', 'Title': 'Satisfaite', 'UserNickname': 'oceaned03', 'ContextDataValues': {'Age': {'Value': '18to24', 'Id': 'Age'}, 'Gender': {'Value': 'Female', 'Id': 'Gender'}, 'Skin': {'Value': 'Seche', 'Id': 'Skin'}, 'Eyes': {'Value': 'Bleus', 'Id': 'Eyes'}}, 'RatingRange': 5, 'AdditionalFieldsOrder': [], 'ProductRecommendationIds': [], 'Cons': None, 'TagDimensionsOrder': [], 'CommentIds': [], 'TagDimensions': {}, 'Videos': [], 'Photos': [], 'ClientResponses': [], 'InappropriateFeedbackList': [], 'SecondaryRatings': {}, 'AdditionalFields': {}, 'SecondaryRatingsOrder': [], 'IsSyndicated': False, 'Pros': None}
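The regex above does the unwrapping in one line. As a minimal sketch of the same idea without a regex, the helper below slices between the first "(" and the last ")" and raises a clear error when the body is not wrapped at all. The name unwrap_jsonp and the sample string are hypothetical and only for illustration; the sample is a made-up, heavily shortened stand-in for the real response.

```python
import json


def unwrap_jsonp(text):
    """Return the JSON payload of a JSONP response as Python objects."""
    start = text.find("(")   # first "(" opens the callback's argument list
    end = text.rfind(")")    # last ")" closes it
    if start == -1 or end == -1 or end < start:
        raise ValueError("response does not look like JSONP")
    return json.loads(text[start + 1:end])


# Hypothetical, shortened response body (not real API output).
sample = 'BV._internal.dataHandler0({"BatchedResults": {"q2": {"Results": [{"Rating": 5}]}}})'
data = unwrap_jsonp(sample)
print(data["BatchedResults"]["q2"]["Results"][0]["Rating"])
```

With a real response you would pass requests.get(...).text to the helper instead of the sample string.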

If you want to "decode" the URL, use parse_qs:

from urllib.parse import parse_qs

api_url = "https://api.bazaarvoice.com/data/batch.json?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0"
print(parse_qs(api_url))

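This gives you a dict. Note that the first key still carries the URL prefix, because the whole URL rather than just its query string was passed to parse_qs: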

{'https://api.bazaarvoice.com/data/batch.json?passkey': ['iohrnzjadededr160osgfvimy'], 'apiversion': ['5.5'], 'displaycode': ['3232-fr_fr'], 'resource.q0': ['products'], 'filter.q0': ['id:eq:P618001'], 'stats.q0': ['questions,reviews'], 'filteredstats.q0': ['questions,reviews'], 'filter_questions.q0': ['contentlocale:eq:fr_FR'], 'filter_answers.q0': ['contentlocale:eq:fr_FR'], 'filter_reviews.q0': ['contentlocale:eq:fr_FR'], 'filter_reviewcomments.q0': ['contentlocale:eq:fr_FR'], 'resource.q1': ['questions'], 'filter.q1': ['productid:eq:P618001', 'contentlocale:eq:fr_FR'], 'sort.q1': ['lastapprovedanswersubmissiontime:desc'], 'stats.q1': ['questions'], 'filteredstats.q1': ['questions'], 'include.q1': ['authors,products,answers'], 'filter_questions.q1': ['contentlocale:eq:fr_FR'], 'filter_answers.q1': ['contentlocale:eq:fr_FR'], 'limit.q1': ['10'], 'offset.q1': ['0'], 'limit_answers.q1': ['10'], 'resource.q2': ['reviews'], 'filter.q2': ['isratingsonly:eq:false', 'productid:eq:P618001', 'contentlocale:eq:fr_FR'], 'sort.q2': ['submissiontime:desc'], 'stats.q2': ['reviews'], 'filteredstats.q2': ['reviews'], 'include.q2': ['authors,products,comments'], 'filter_reviews.q2': ['contentlocale:eq:fr_FR'], 'filter_reviewcomments.q2': ['contentlocale:eq:fr_FR'], 'filter_comments.q2': ['contentlocale:eq:fr_FR'], 'limit.q2': ['5'], 'offset.q2': ['0'], 'limit_comments.q2': ['3'], 'callback': ['BV._internal.dataHandler0']}
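To get a clean 'passkey' key instead of one prefixed with the endpoint, one option is to split the URL first and parse only the query part. The sketch below uses urlsplit from the standard library on a shortened version of the URL; the shortening is only for readability and the remaining parameters would parse the same way.

```python
from urllib.parse import parse_qs, urlsplit

# Shortened version of the question's URL, for readability.
api_url = (
    "https://api.bazaarvoice.com/data/batch.json"
    "?passkey=iohrnzjadededr160osgfvimy&apiversion=5.5"
    "&filter.q1=productid%3Aeq%3AP618001"
    "&filter.q1=contentlocale%3Aeq%3Afr_FR"
)

parts = urlsplit(api_url)
params = parse_qs(parts.query)  # parse only what follows the "?"

print(parts.path)            # endpoint path, without the query string
print(sorted(params))        # parameter names
print(params["filter.q1"])   # repeated keys are collected into one list
```

Because parse_qs groups repeated keys such as filter.q1 into a single list, the result has the same shape as the params dict in the question.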

If you want to experiment with changing values, you could try the following:

from urllib.parse import parse_qsl, urlencode
import json
import re

import requests

url = "https://api.bazaarvoice.com/data/batch.json?"
payload = "passkey=iohrnzjadededr160osgfvimy&apiversion=5.5&displaycode=3232-fr_fr&resource.q0=products&filter.q0=id%3Aeq%3AP618001&stats.q0=questions%2Creviews&filteredstats.q0=questions%2Creviews&filter_questions.q0=contentlocale%3Aeq%3Afr_FR&filter_answers.q0=contentlocale%3Aeq%3Afr_FR&filter_reviews.q0=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q0=contentlocale%3Aeq%3Afr_FR&resource.q1=questions&filter.q1=productid%3Aeq%3AP618001&filter.q1=contentlocale%3Aeq%3Afr_FR&sort.q1=lastapprovedanswersubmissiontime%3Adesc&stats.q1=questions&filteredstats.q1=questions&include.q1=authors%2Cproducts%2Canswers&filter_questions.q1=contentlocale%3Aeq%3Afr_FR&filter_answers.q1=contentlocale%3Aeq%3Afr_FR&limit.q1=10&offset.q1=0&limit_answers.q1=10&resource.q2=reviews&filter.q2=isratingsonly%3Aeq%3Afalse&filter.q2=productid%3Aeq%3AP618001&filter.q2=contentlocale%3Aeq%3Afr_FR&sort.q2=submissiontime%3Adesc&stats.q2=reviews&filteredstats.q2=reviews&include.q2=authors%2Cproducts%2Ccomments&filter_reviews.q2=contentlocale%3Aeq%3Afr_FR&filter_reviewcomments.q2=contentlocale%3Aeq%3Afr_FR&filter_comments.q2=contentlocale%3Aeq%3Afr_FR&limit.q2=5&offset.q2=0&limit_comments.q2=3&callback=BV._internal.dataHandler0"

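decoded = parse_qsl(payload)  # list of (key, value) pairs, in URL order
# Index 4 is the ('filter.q0', 'id:eq:P618001') pair; swap in another product ID.
# Note: filter.q1 and filter.q2 still reference P618001 and would need the same change.
decoded.pop(4)
decoded.insert(4, ('filter.q0', 'id:eq:P20201'))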

response = requests.get(f"{url}{urlencode(decoded)}").text

data = json.loads(re.search(r"\((.*)\)", response).group(1))
print(data["BatchedResults"]["q2"]["Results"][0])
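The snippet above replaces only the filter.q0 pair, so the questions (q1) and reviews (q2) sub-queries still filter on the original product. A possible generalization, sketched below, rewrites every id/productid filter in one pass instead of relying on list positions. The helper name url_for_product and the shortened query string are hypothetical and only for illustration.

```python
from urllib.parse import parse_qsl, urlencode

BASE_URL = "https://api.bazaarvoice.com/data/batch.json"


def url_for_product(query, product_id):
    """Rebuild the query string with every product ID filter replaced."""
    pairs = []
    for key, value in parse_qsl(query):
        # Only the id/productid filters carry the product ID; other
        # filters (contentlocale, isratingsonly) pass through unchanged.
        if key.startswith("filter.") and value.startswith(("id:eq:", "productid:eq:")):
            field = value.split(":", 1)[0]
            value = f"{field}:eq:{product_id}"
        pairs.append((key, value))
    return f"{BASE_URL}?{urlencode(pairs)}"


# Shortened query string from the question, for illustration.
query = (
    "passkey=iohrnzjadededr160osgfvimy&resource.q0=products"
    "&filter.q0=id%3Aeq%3AP618001"
    "&filter.q2=isratingsonly%3Aeq%3Afalse"
    "&filter.q2=productid%3Aeq%3AP618001"
)
print(url_for_product(query, "P20201"))
```

The resulting URL can be passed to requests.get as in the answer, and the response unwrapped the same way. Leaving out the callback pair might make the endpoint return plain JSON, as JSONP endpoints commonly do, but that is an assumption not verified here.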