Stack

DNB and Typing

d3b 79% Sat 20 Apr 2019 11:18:34 AM CEST
d3b 71% Sat 20 Apr 2019 11:20:10 AM CEST
d3b 71% Sat 20 Apr 2019 11:21:44 AM CEST
d3b 100% Sat 20 Apr 2019 11:23:16 AM CEST
d4b 56% Sat 20 Apr 2019 11:25:31 AM CEST
d4b 50% Sat 20 Apr 2019 11:27:26 AM CEST
d4b 50% Sat 20 Apr 2019 11:29:24 AM CEST
d4b 17% Sat 20 Apr 2019 11:31:18 AM CEST
d4b 40% Sat 20 Apr 2019 11:33:13 AM CEST
d4b 50% Sat 20 Apr 2019 11:35:15 AM CEST
d4b 56% Sat 20 Apr 2019 11:37:06 AM CEST

Thesis

Stopwords

What would happen if I used the stopwords themselves as one of my features, leaving the non-stopword text alone? Here’s a long list
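A rough sketch of what "stopwords as a feature" could look like: one number per stopword, namely its share of all tokens in the text. The STOPWORDS tuple, the stopword_features name and the regex tokenizer are my own placeholders for illustration, not anything from the thesis code; a real list would be much longer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (assumption, for illustration only).
STOPWORDS = ("the", "a", "of", "and", "to", "in", "is", "it")


def stopword_features(text):
    """Return the relative frequency of each stopword, in STOPWORDS order."""
    # Crude tokenizer: lowercase runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty text
    return [counts[word] / total for word in STOPWORDS]


features = stopword_features("The cat sat in the hat, and it is a cat.")
```

The resulting vector ignores all content words, so it could be concatenated with whatever features are built from the non-stopword text.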

Scikit-learn

Label-encoder

sklearn.preprocessing.LabelEncoder converts categorical labels to integers between 0 and n_classes-1.

>>> from sklearn import preprocessing
>>> le = preprocessing.LabelEncoder()
>>> le.fit([1, 2, 2, 6])
LabelEncoder()
>>> le.classes_
array([1, 2, 6])
>>> le.transform([1, 1, 2, 6])
array([0, 0, 1, 2]...)
>>> le.inverse_transform([0, 0, 1, 2])
array([1, 1, 2, 6])
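The session above uses integer labels; the same encoder also works on string labels, which is closer to the "categorical" case. A small sketch with made-up city names, assuming scikit-learn is installed:

```python
from sklearn.preprocessing import LabelEncoder

# Made-up string labels, purely for illustration.
labels = ["paris", "paris", "tokyo", "amsterdam"]

le = LabelEncoder()
le.fit(labels)

# classes_ holds the unique labels in sorted order; the index is the code.
classes = list(le.classes_)

# Encode strings to integer codes, then map the codes back to strings.
encoded = le.transform(["tokyo", "tokyo", "paris"]).tolist()
decoded = list(le.inverse_transform(encoded))
```

Because classes_ is sorted, "amsterdam" gets 0, "paris" 1 and "tokyo" 2, regardless of the order in which the labels were seen during fit.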