I am experimenting with the NLTK package in Python. I tried to download the NLTK data using nltk.download(), but I got this kind of error message. How can I solve this problem? Thanks.
The system I used is Ubuntu installed under VMware. The IDE is Spyder.
After using nltk.download('all'), some packages download successfully, but I get an error message when it reaches oanc_masc.
To download a particular dataset or model, use the nltk.download() function. For example, to download the punkt sentence tokenizer, use:
$ python3
>>> import nltk
>>> nltk.download('punkt')
If you're unsure of which data/model you need, you can start out with the basic list of data + models with:
>>> import nltk
>>> nltk.download('popular')
It will download a list of "popular" resources.
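If the download call sits at the top of a script, it contacts the server on every run. A minimal sketch of a guard is shown below, assuming the resource path 'tokenizers/punkt' for the punkt package. The helper name ensure_nltk_resource and its find/download parameters are hypothetical (not part of NLTK) and exist only so the logic can be exercised without network access; by default it falls back to nltk.data.find and nltk.download.

```python
def ensure_nltk_resource(resource_path, package_id, find=None, download=None):
    """Download package_id only if resource_path is not already available.

    Returns True if a download was triggered, False otherwise.
    find/download are hypothetical injection points; they default to
    nltk.data.find and nltk.download.
    """
    if find is None or download is None:
        import nltk  # imported lazily so the helper can be tested without NLTK
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        # nltk.data.find raises LookupError when the resource is missing.
        find(resource_path)
        return False
    except LookupError:
        download(package_id)
        return True
```

Calling ensure_nltk_resource('tokenizers/punkt', 'punkt') would then fetch punkt on the first run only, and do nothing on later runs.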
Make sure you have the latest version of NLTK, because it is actively maintained and constantly improving:
$ pip install --upgrade nltk
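To confirm which version is actually installed in the current environment after upgrading, something like the following sketch may help. It assumes Python 3.8 or newer (for importlib.metadata); the helper name installed_version is made up for illustration.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the NLTK version visible to this interpreter, or None if NLTK is missing.
print(installed_version("nltk"))
```

This is useful when several Python installations coexist (for example the system Python and the one Spyder uses), since pip may have upgraded a different environment than the one running your code.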
In case anyone is trying to avoid errors when downloading larger datasets with nltk, here is a workaround from https://stackoverflow.com/a/38135306/610569 that skips panlex_lite:
$ rm /Users/<your_username>/nltk_data/corpora/panlex_lite.zip
$ rm -r /Users/<your_username>/nltk_data/corpora/panlex_lite
$ python
>>> import nltk
>>> dler = nltk.downloader.Downloader()
>>> dler._update_index()
>>> dler._status_cache['panlex_lite'] = 'installed' # Trick the index to treat panlex_lite as it's already installed.
>>> dler.download('popular')
And if anyone wants to find the nltk_data directory, see https://stackoverflow.com/a/36383314/610569
And to configure the nltk_data path, see https://stackoverflow.com/a/22987374/610569
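As a rough illustration of what those two links cover: NLTK searches the directories listed in nltk.data.path, and that list can be extended at runtime (setting the NLTK_DATA environment variable before importing nltk is the other common route). The sketch below assumes a custom directory of your choosing; the helper name add_nltk_data_dir and its search_path parameter are hypothetical and only there so the function can be tried on a plain list.

```python
import os


def add_nltk_data_dir(directory, search_path=None):
    """Append directory to NLTK's data search path if it is not already listed.

    search_path is a hypothetical injection point; it defaults to nltk.data.path.
    Returns the (mutated) search path list.
    """
    directory = os.path.abspath(os.path.expanduser(directory))
    if search_path is None:
        import nltk  # imported lazily so the helper can be tested without NLTK
        search_path = nltk.data.path
    if directory not in search_path:
        search_path.append(directory)
    return search_path
```

With NLTK installed, print(add_nltk_data_dir('~/my_nltk_data')) would show every directory NLTK will search, which also answers where nltk_data currently lives.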
Thanks for the reply. I tried nltk.download('all'); it successfully downloaded some packages, but it got stuck when downloading something related to oanc_masc. I included the related screenshot in the original post.
What is your NLTK version? What is the output of this in your terminal?
python -c "import nltk; print(nltk.__version__)"
Hi there @alvas, I'm having similar issues using nltk.download('all') on Ubuntu, except I get HTTP Error 404: Not Found in both IDLE and the command line. My NLTK version is 2.0b9. Do you have any idea what might be going on?
@Joansy, please update your NLTK:
sudo pip install --upgrade nltk
or
sudo apt-get install python-nltk
Once it's updated, the problem should resolve itself. Otherwise, you would have to set the URL manually. Try updating NLTK first; if it doesn't work, then come back again =)