This article is reproduced from stackoverflow.com.
data-science machine-learning python text-classification

Text classification with word2vec: Stack Overflow tag predictor

Published on 2020-03-31 22:53:19

I am working on a Stack Overflow tag predictor.

I have a dataframe df which contains a feature 'post' and a label 'Tags', which can be multi-label.

My df is:

        Tags           post
0       [php]          check upload file image without mime type woul...
1       [firefox]      prevent firefox close press ctrl-w favorite ed...
2       [r]            r error invalid type list variable import matl...
3       [c#]           replace special character url probably simple ...
4       [php, api]     modify whois contact detail function modify mc...
...     ...            ...
179995  [delphi]       intraweb isapi module throw unrecognized comma...
179996  [c]            opencv argc argv confusion check opencv tutori...
179997  [android]      list data sdcard want display file name reside...
179998  [java, email]  add sort extension imap server mail server sup...
179999  [linux, php]   create carddav ldap server share host via php ...

I want to use word2vec features to classify the posts and predict their tags.

I want to try several machine learning classifiers, such as SVM and random forest.

I also want a classification report for the tags.

Please help me.

Questioner: Subhash Kalicharan
Viewed: 60
venkatadileep 2020-01-31 18:49

word2vec is not a classifier; it converts words to vectors. My suggested steps:

1) Preprocess the text (for example, stopword removal and normalization).
2) Convert the words to vectors using TF-IDF or word2vec.
3) Apply ML models (for multi-class classification you can use SVM, Naive Bayes, and logistic regression).
4) Validate the results.
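
As a rough illustration of step 2 with word2vec, one common approach is to average the word vectors of each post into a single fixed-length feature vector, and to turn the 'Tags' lists into a 0/1 indicator matrix so that multi-label posts can be handled. The sketch below uses only the standard library. `TOY_VECTORS`, `document_vector`, and `binarize_tags` are hypothetical names invented for this example, and the vector values are made up; in practice the vectors would come from a trained word2vec model.

```python
# Toy stand-in for trained word2vec vectors (hypothetical values, 3 dimensions).
# In a real project these would be looked up from a trained word2vec model.
TOY_VECTORS = {
    "php": [1.0, 0.0, 0.0],
    "upload": [0.0, 1.0, 0.0],
    "file": [0.0, 0.0, 1.0],
}


def document_vector(post, vectors, dim):
    """Average the vectors of the known words in a post (zeros if none are known)."""
    known = [vectors[word] for word in post.split() if word in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) walks the vectors column by column, one dimension at a time.
    return [sum(column) / len(known) for column in zip(*known)]


def binarize_tags(tag_lists):
    """Turn lists of tags into a 0/1 indicator matrix with one column per tag."""
    classes = sorted({tag for tags in tag_lists for tag in tags})
    rows = [[1 if cls in tags else 0 for cls in classes] for tags in tag_lists]
    return classes, rows


# Two made-up rows shaped like the 'post' and 'Tags' columns of df.
posts = ["php upload file", "unknown words only"]
tags = [["php"], ["php", "api"]]

X = [document_vector(post, TOY_VECTORS, dim=3) for post in posts]
classes, Y = binarize_tags(tags)

print(classes)  # column order of the indicator matrix
print(X)
print(Y)
```

With features `X` and indicator matrix `Y` in this shape, step 3 could train one binary classifier per tag, for example by wrapping an SVM or logistic regression in scikit-learn's `OneVsRestClassifier`, and step 4 could use `sklearn.metrics.classification_report` to get per-tag precision and recall. Whether averaged word2vec vectors beat TF-IDF on this data is something to check empirically.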