How is the "term frequency" factor `tf` computed?
I would like to know what goes into `tf`. The result of the following query:
curl -g 'http://localhost:8983/solr/nutch/select?indent=on&q=python&wt=json&fl=title,score,[features%20efi.query=python%20store=myfeature_store]',content
is:
...
{
"title":"Raspberry Pi Stack Exchange",
"content":"Raspberry Pi Stack Exchange\nStack Exchange Network\nStack Exchange network consists of 175 Q&A communities including Stack Overflow , the largest, most trusted online community for developers to learn, share their knowledge, and build their careers.\nVisit Stack Exchange\nLoading…\n0\n+0\nTour Start here for a quick overview of the site\nHelp Center Detailed answers to any questions you might have\nMeta Discuss the workings and policies of this site\nAbout Us Learn more about Stack Overflow the company\nBusiness Learn more about hiring developers or posting ads with us\nLog in\nSign up\ncurrent community\nRaspberry Pi\nhelp\nchat\nRaspberry Pi Meta\nyour communities\nSign up or log in to customize your list.\nmore stack exchange communities\ncompany blog\nBy using our site, you acknowledge that you have read and understand our Cookie Policy , Privacy Policy , and our Terms of Service .\nRaspberry Pi Stack Exchange is a question and answer site for users and developers of hardware and software for Raspberry Pi. 
It only takes a minute to sign up.\nSign up to join this community\nAnybody can ask a question\nAnybody can answer\nThe best answers are voted up and rise to the top\nHome\nQuestions\nTags\nUsers\nUnanswered\nExplore our Questions\nAsk Question\nraspbian pi-3 gpio python networking wifi pi-2 usb boot ssh\nmore tags\nActive\nHot\nWeek\nMonth\n0\nvotes\n0\nanswers\n3\nviews\nHostname on router and pi do not match\nheadless\nasked 4 mins ago\nJoseph\n1\n2\nvotes\n0\nanswers\n49\nviews\nAndroid won't connect to RasPi access point\nandroid\naccess-point\nsystemd-networkd\nwpa-supplicant\nmodified 6 mins ago\nThePunisher\n121\n2\nvotes\n3\nanswers\n53\nviews\napt-get update errors after copying Raspbian to new SD card\nraspbian\napt\nmodified 17 mins ago\nifschleife\n121\n1\nvote\n5\nanswers\n444\nviews\nWifi cuts out after a few hours, have to restart Pi\nraspbian\nnetworking\nwifi\nssh\nminecraft\nmodified 53 mins ago\nCommunity ♦\n1\n2\nvotes\n2\nanswers\n369\nviews\nCan't SSH by name on stretch; can on jessie\nssh\nraspbian-stretch\nputty\nmodified 1 hour ago\nCommunity ♦\n1\n0\nvotes\n0\nanswers\n8\nviews\nHow to use only 3 GPIO pins for a JSN-SR04T waterproof ultrasonic sensor\ngpio\nsensor\nasked 2 hours ago\nPeter bill\n191\n1\nvote\n2\nanswers\n52\nviews\nGPIO Not changing its value in a particular code section\ngpio\npython\nrelay\nmodified 2 hours ago\ntlfong01\n2,465\n0\nvotes\n0\nanswers\n1\nview\nMakes OpenVPN a local Apache Webserver accessable from outside?\nweb-server\nvpn\napache-httpd\nweb-browsers\nweb\nasked 2 hours ago\nJakob\n113\n0\nvotes\n1\nanswer\n15\nviews\nsainsmart relay - switches on when pi shuts down\npi-3\nboot-issues\nanswered 2 hours ago\npir8ped\n79\n0\nvotes\n1\nanswer\n301\nviews\nRaspberry Pi Matchbox virtual keyboard missing colon\ndisplay\nmodified 2 hours ago\nCommunity ♦\n1\n-1\nvotes\n0\nanswers\n27\nviews\nHow to fix ssh connection that's been broken by dhcpcd service\nlinux\nnetworking\nssh\ndhcp\nmodified 3 hours 
ago\nBelserich\n1\n4\nvotes\n2\nanswers\n8k\nviews\nHow can I use OpenCV with Python 3 on a Raspberry Pi?\nopencv\npython-3\nanswered 3 hours ago\nIngo\n19.1k\n2\nvotes\n0\nanswers\n14\nviews\nRPi-Zero, HID keyboard gadget for BIOS keyboard\nusb\nkeyboard\nhid\nlibcomposite\nmodified 3 hours ago\nEphemeral\n1,561\n0\nvotes\n0\nanswers\n13\nviews\nHow do I go about auto-mounting my NTFS hard drive at boot?\nboot\nmount\nfstab\nntfs\nasked 3 hours ago\nHasake\n11\nBrowse more Questions\nHot Network Questions\nTriple Approx Symbol\nBest ways to invest for a planned house purchase in 1 year?\nVariable selection in logistic regression model\nShould rooms be designed to minimize waste of sheet goods?\nWhy is Perihelion and Shortest day in North Hemisphere different?\nHow can I estimate the speed of this code section for this microcontroller?\nShell - Navigate up 'n' directories\nLooking for an effective pattern to cope with switch statements in C#\n",
"score":0.00982895,
"[features]":"tf=2.0"},
...
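What does the value 2.0 represent? The word `python` occurs 4 times, and `content` contains 330 words.
If I want to compute the term frequency ratio (the number of covered query terms divided by the number of query terms), how should I do that?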
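For concreteness, the ratio described in the last sentence can be sketched outside Solr as below. This is only an illustration of the definition: the function names and the simple lowercase alphanumeric tokenizer are assumptions, and do not reflect how Solr or any particular analyzer chain splits text.

```python
import re


def tokenize(text):
    # Hypothetical tokenizer: lowercase, then split on non-alphanumerics.
    return re.findall(r"[a-z0-9]+", text.lower())


def covered_query_term_ratio(query, document):
    """Distinct query terms found in the document / distinct query terms."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    doc_terms = set(tokenize(document))
    covered = query_terms & doc_terms
    return len(covered) / len(query_terms)


# One of the two query terms ("gpio") appears in the text.
print(covered_query_term_ratio("python gpio", "GPIO not changing its value"))  # 0.5
```

For a single-term query such as `python`, this ratio is either 0 or 1, which is a different quantity from the `tf` feature shown in the output above.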
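stackoverflow.com/a/34614215/4582711 states that `tf` is computed as a ratio rather than as a term count. So why is only sqrt(tf) computed?

What you are referring to seems to be `TF/IDF`, i.e. the ratio. The BM25 scorer uses the square root of tf to get better relevance than a linear `TF`. For `idf` it uses the factor `log(numDocs / (docFreq + 1)) + 1` instead of the pure `idf` value (since adding 100 documents does not actually make a term 100 times less relevant). Term frequency is, as the name says, how often the term occurs.

Actually, I am trying to use the Microsoft MSLR dataset (microsoft.com/en-us/research/project/mslr). It seems that features 6-9 are not `TF/IDF`; they look like a simple `term_no/total_terms`.

I'm not sure what else to tell you. The answer to your question, why the `tf` of `python` in your example is 2.0 rather than 4, is that BM25 uses sqrt(raw TF) as its actual `tf` value.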
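The arithmetic discussed in these comments can be checked with a small sketch. It assumes the square-root damping that Lucene's `ClassicSimilarity.tf()` applies to the raw count, and it reads the quoted idf factor as `log(numDocs / (docFreq + 1)) + 1`. The function names are illustrative, not part of Solr, and whether a given field actually uses this similarity depends on the schema.

```python
import math


def sqrt_tf(raw_freq):
    # Square-root damping of the raw term count.
    return math.sqrt(raw_freq)


def damped_idf(num_docs, doc_freq):
    # Assumed grouping of the factor quoted in the comments.
    return math.log(num_docs / (doc_freq + 1)) + 1.0


def length_normalized_tf(raw_freq, total_terms):
    # The simple term_no / total_terms ratio mentioned for the MSLR features.
    return raw_freq / total_terms


raw_freq = 4       # occurrences of "python" reported in the question
total_terms = 330  # words in `content` reported in the question

print(sqrt_tf(raw_freq))                            # 2.0
print(length_normalized_tf(raw_freq, total_terms))  # roughly 0.012
```

With these numbers the square root gives exactly the 2.0 reported in `[features]`, while the plain ratio would be about 0.012, so the two definitions are easy to tell apart.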