What stopwords list is the Wolfram Language using?
The documentation of DeleteStopwords only says that it "uses a standard, built-in list of stopwords". So what exactly is that list?
Update: Since the list is described as "standard", does that standard have a name?
string-manipulation text implementation-details linguistics
edited Sep 25 at 17:03 by Peter Mortensen
asked Sep 25 at 6:13 by Αλέξανδρος Ζεγγ
4 Answers
A little spelunking of the code for DeleteStopwords yields the internally used stopword list:

    DeleteStopwords; (* force auto-load *)
    AlphabeticSort[List @@ TextProcessing`TextModificationDump`$stopWords["English"]] // Short

    {"a", "A", "about", "above", "across", "after", "again", "against", "all", "almost",
     "alone", "along", "already", "also", "although", <<240>>,
     "within", "without", "won't", "would", "wouldn't", "yet", "you", "you'd", "you'll",
     "you're", "you've", "your", "yours", "yourself", "yourselves"}

Wow, these are some undocumented functions?
– Αλέξανδρος Ζεγγ
Sep 25 at 6:22
But your code merely returns "English" as the result on my machine. Did I miss something?
– Αλέξανδρος Ζεγγ
Sep 25 at 6:24
@Αλέξανδρος Ζεγγ One has to evaluate DeleteStopwords first. The actual code for DeleteStopwords and TextProcessing`TextModificationDump`$stopWords is stored in a file and is loaded only after DeleteStopwords is evaluated in the current session.
– Henrik Schumacher
Sep 25 at 6:29
Ah, forgot the autoload. Thanks @Henrik!
– J. M. is somewhat okay.♦
Sep 25 at 6:31
@hanshenrik note the use of // Short at the end of the code provided, since I didn't want to paste the entire list. Thus, only the first few and last few list elements are shown, and the rest are elided. Remove the // Short to see the entire list.
– J. M. is somewhat okay.♦
Sep 25 at 8:33
The stopwords are documented and can be looked up as follows (see the Details section of WordList):

    WordList["Stopwords"]

The list may change between versions, so the output below can become outdated, but you can always evaluate WordList["Stopwords"] yourself:

    {"0","1","2","3","4","5","6","7","8","9","a","A","about","above","across","after","again","against",
     "all","almost","alone","along","already","also","although","always","am","among","an","and","another",
     "any","anyone","anything","anywhere","are","aren't","around","as","at","b","B","back","be","became",
     "because","become","becomes","been","before","behind","being","below","between","both","but","by",
     "c","C","can","cannot","can't","could","couldn't","d","D","did","didn't","do","does","doesn't","doing",
     "done","don't","down","during","e","E","each","either","enough","even","ever","every","everyone",
     "everything","everywhere","f","F","few","find","first","for","four","from","full","further","g","G",
     "get","give","go","h","H","had","hadn't","has","hasn't","have","haven't","having","he","he'd","he'll",
     "her","here","here's","hers","herself","he's","him","himself","his","how","however","how's","i","I",
     "i'd","if","i'll","i'm","in","interest","into","is","isn't","it","it's","its","itself","i've","j","J",
     "k","K","keep","l","L","last","least","less","let's","m","M","made","many","may","me","might","more",
     "most","mostly","much","must","mustn't","my","myself","n","N","never","next","no","nobody","noone",
     "nor","not","nothing","now","nowhere","o","O","of","off","often","on","once","one","only","or","other",
     "others","ought","our","ours","ourselves","out","over","own","p","P","part","per","perhaps","put",
     "q","Q","r","R","rather","s","S","same","see","seem","seemed","seeming","seems","several","shan't",
     "she","she'd","she'll","she's","should","shouldn't","show","side","since","so","some","someone",
     "something","somewhere","still","such","t","T","take","than","that","that's","the","their","theirs",
     "them","themselves","then","there","therefore","there's","these","they","they'd","they'll","they're",
     "they've","this","those","though","three","through","thus","to","together","too","toward","two",
     "u","U","under","until","up","upon","us","v","V","very","w","W","was","wasn't","we","we'd","we'll",
     "well","we're","were","weren't","we've","what","what's","when","when's","where","where's","whether",
     "which","while","who","whole","whom","who's","whose","why","why's","will","with","within","without",
     "won't","would","wouldn't","x","X","y","Y","yet","you","you'd","you'll","your","you're","yours",
     "yourself","yourselves","you've","z","Z"}

Interestingly, the complement of this list and the list in my answer yields only a list of letters and numbers, so that list certainly captures all the words. However, applying DeleteStopwords seems to only remove "i" from the list.
– J. M. is somewhat okay.♦
Sep 25 at 18:06
@J.M.issomewhatokay. A few inconsistencies are known and will hopefully be polished. The important thing is that the docs are open about the subject, which is good to know :-) The docs are an attractor of stability in terms of known things of internal architecture.
– Vitaliy Kaurov
Sep 25 at 20:13
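
The comparison described in the comment above can be reproduced with a short sketch like the following. It relies on the undocumented internal symbol quoted in the first answer, so it may stop working in other versions; the variable names `internal` and `documented` are only illustrative.

```wolfram
(* Mention DeleteStopwords once so that its definitions auto-load,
   as explained in the comments on the first answer. *)
DeleteStopwords;

(* Internal list; this is the undocumented symbol from the first answer and may change. *)
internal = List @@ TextProcessing`TextModificationDump`$stopWords["English"];

(* Documented list. *)
documented = WordList["Stopwords"];

(* Entries that appear only in the documented list. *)
Complement[documented, internal]

(* Entries that appear only in the internal list. *)
Complement[internal, documented]
```

According to the comment, the first Complement should consist only of single letters and digits; no output is shown here because the lists may differ between versions.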
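Here is a longer list, obtained with the following commands:

    A = WordList["KnownWords"];   (* list of known words *)
    B = DeleteStopwords[A];       (* the same list after applying DeleteStopwords *)
    c = Complement[A, B]          (* entries of A that no longer appear in B *)
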
a,
A,
about,
above,
across,
A.D.,
add-in,
add-on,
A.E.,
after,
again,
against,
a.k.a.,
all,
all-around,
all-or-nothing,
all-out,
almost,
alone,
along,
al-Qur'an,
already,
also,
although,
always,
a.m.,
A.M.,
Am,
AM,
among,
an,
AN,
and,
and/or,
another,
any,
anyone,
anything,
anywhere,
A-one,
are,
around,
as,
As,
AS,
at,
At,
back,
back-to-back,
balls-up,
bang-up,
be,
Be,
beaten-up,
beat-up,
because,
become,
beefed-up,
before,
behind,
being,
belly-up,
below,
between,
bicycle-built-for-two,
blown-up,
booze-up,
born-again,
both,
bottom-up,
boxed-in,
break-in,
bride-to-be,
broken-down,
brush-off,
built-in,
built-up,
bundled-up,
burned-out,
burned-over,
burnt-out,
bust-up,
but,
button-down,
buttoned-down,
buttoned-up,
by,
by-and-by,
call-back,
caller-out,
caller-up,
call-in,
call-out,
camp-made,
can,
can-do,
cared-for,
carry-over,
cast-off,
change-up,
check-in,
ch'i,
Ch'in,
chin-up,
chock-full,
choke-full,
chucker-out,
chuck-full,
churned-up,
climb-down,
clip-on,
coach-and-four,
comb-out,
come-on,
cover-up,
crack-up,
cure-all,
custom-made,
cut-in,
cut-up,
D.A.,
dead-on,
derring-do,
do,
DO,
d.o.a.,
do-it-yourself,
done,
do-nothing,
do-si-do,
down,
Down,
down-and-out,
drawn-out,
dressed-up,
dried-out,
dried-up,
drive-in,
drop-off,
during,
each,
eighty-four,
eighty-one,
eighty-three,
eighty-two,
either,
end-all,
enough,
even,
ever,
every,
everyone,
everything,
everywhere,
eyes-only,
face-off,
factory-made,
fare-thee-well,
far-off,
far-out,
few,
fifty-four,
fifty-one,
fifty-three,
fifty-two,
fill-in,
find,
first,
F.I.S.C.,
flame-out,
flare-up,
fly-by,
follow-on,
follow-through,
follow-up,
for,
force-out,
fore-and-after,
forget-me-not,
form-only,
forty-first,
forty-four,
forty-one,
forty-three,
forty-two,
foul-up,
four,
frame-up,
free-for-all,
from,
fucked-up,
full,
further,
get,
get-go,
get-up-and-go,
G.I.,
gill-less,
give,
give-and-go,
give-and-take,
go,
go-around,
go-between,
going-over,
good-by,
good-for-nothing,
goof-off,
groom-to-be,
half-seas-over,
hand-down,
handed-down,
hand-me-down,
hands-down,
hands-off,
hands-on,
hanger-on,
hang-up,
hard-on,
has-been,
have,
have-not,
Hawai'i,
he,
He,
head-on,
heads-up,
heart-whole,
her,
here,
Here,
hers,
herself,
higher-up,
high-interest,
high-up,
him,
himself,
his,
hold-down,
hollow-back,
hoped-for,
hopped-up,
how,
how-do-you-do,
how-d'ye-do,
however,
HSV-I,
hundred-and-first,
hushed-up,
i,
I,
I.D.,
i.e.,
I.E.D.,
if,
If,
ill-being,
in,
In,
IN,
in-between,
inside-out,
interest,
into,
I.Q.,
it,
IT,
its,
itself,
I.W.W.,
jerk-off,
Johnny-jump-up,
jumped-up,
keep,
knock-down,
knock-down-and-drag-out,
knocked-out,
know-all,
know-how,
know-it-all,
ladder-back,
laid-back,
laid-off,
lash-up,
last,
lay-by,
lay-up,
lead-in,
lean-to,
least,
less,
lie-in,
lighting-up,
lights-out,
log-in,
longed-for,
looker-on,
look-over,
low-down,
low-interest,
lying-in,
ma'am,
machine-made,
made,
made-up,
make-do,
make-up,
man-made,
many,
marked-up,
match-up,
matt-up,
may,
May,
me,
ME,
mess-up,
mid-May,
mid-off,
mid-on,
might,
might-have-been,
mixed-up,
mix-up,
mock-up,
more,
More,
most,
mostly,
much,
must,
my,
myself,
ne'er-do-well,
never,
never-never,
Never-Never,
new-made,
next,
next-to-last,
ninety-four,
ninety-one,
ninety-three,
ninety-two,
no,
No,
no.,
nobody,
no-go,
no-one,
nor,
no-show,
nosh-up,
not,
nothing,
now,
nowhere,
odds-on,
of,
off,
off-and-on,
often,
on,
once,
once-over,
one,
one-and-one,
one-off,
one-on-one,
one-to-one,
only,
OR,
other,
our,
ours,
ourselves,
out,
out-and-out,
over,
own,
p.a.,
P.A.,
paid-up,
part,
passer-by,
pass-through,
paste-up,
pegged-down,
pent-up,
per,
perhaps,
phase-out,
phone-in,
pick-me-up,
pick-off,
pig-a-back,
pin-up,
piss-up,
plug-in,
pop-up,
Post-It,
press-up,
pull-in,
pull-off,
pull-through,
pull-up,
pumped-up,
punch-up,
purpose-made,
put,
put-down,
put-on,
put-put,
put-up,
put-upon,
rake-off,
raree-show,
rather,
rave-up,
read-out,
ready-made,
right-down,
right-side-out,
right-side-up,
rip-off,
roll-on,
run-down,
run-in,
runner-up,
run-on,
run-through,
run-up,
same,
Same,
Sana'a,
sauce-alone,
save-all,
sawed-off,
sawn-off,
say-so,
schoolma'am,
see,
seem,
seeming,
see-through,
self-interest,
self-made,
self-will,
send-off,
set-back,
set-to,
seventy-four,
seventy-one,
seventy-three,
seventy-two,
seven-up,
several,
shake-up,
shape-up,
share-out,
she,
shell-less,
shoo-in,
shoot-down,
shoot-'em-up,
show,
show-off,
shut-in,
side,
side-to-side,
since,
sit-down,
sit-in,
sit-up,
sixty-four,
sixty-one,
sixty-three,
sixty-two,
slap-up,
slip-on,
slip-up,
smash-up,
snarl-up,
so,
so-and-so,
sold-out,
some,
someone,
something,
somewhere,
so-so,
sought-after,
spaced-out,
spend-all,
spin-off,
spread-out,
stand-alone,
stand-down,
stand-in,
stand-up,
start-off,
step-down,
step-in,
step-up,
stick-on,
still,
stock-still,
stock-take,
stopped-up,
straight-out,
stripped-down,
strung-out,
stuck-up,
such,
sum-up,
sure-enough,
tailor-made,
take,
take-in,
take-up,
tap-off,
teach-in,
than,
that,
the,
their,
theirs,
them,
themselves,
then,
there,
therefore,
these,
they,
thing-in-itself,
thirty-first,
thirty-four,
thirty-one,
thirty-something,
thirty-three,
thirty-two,
this,
those,
though,
three,
through,
throw-in,
thus,
tie-in,
tie-on,
tie-up,
time-out,
tip-off,
tip-up,
to,
to-do,
toe-in,
together,
too,
top-down,
top-up,
toss-up,
touch-and-go,
touch-me-not,
toward,
trade-in,
trade-last,
trade-off,
tricked-out,
trip-up,
trumped-up,
try-on,
tumble-down,
tune-up,
turn-on,
twenty-first,
twenty-four,
twenty-one,
twenty-three,
twenty-two,
two,
two-by-four,
two-part,
uncalled-for,
uncared-for,
under,
unheard-of,
unhoped-for,
unlooked-for,
unthought-of,
until,
unwished-for,
up,
up-and-down,
upon,
upside-down,
us,
US,
U.S.A.,
very,
walk-in,
walk-on,
walk-through,
walk-to,
walk-up,
warm-up,
washed-out,
washed-up,
washing-up,
wave-off,
way-out,
we,
well,
well-being,
well-done,
well-made,
well-off,
well-thought-of,
well-to-do,
what,
when,
where,
where's,
whether,
which,
while,
whipper-in,
white-out,
who,
WHO,
whole,
whom,
whose,
why,
will,
wished-for,
with,
within,
with-it,
without,
work-in,
worn-out,
would-be,
write-down,
write-in,
write-off,
wrong-side-out,
year-around,
yearned-for,
yet,
you,
you-all,
your,
yours,
yourself
Why uppercase A and B, but lowercase c?
– Peter Mortensen
Sep 25 at 17:09
What this longer list actually demonstrates is what could be called a peculiarity in DeleteStopwords: DeleteStopwords["ne'er-do-well wearing buttoned-down vests"]
– J. M. is somewhat okay.♦
Sep 25 at 17:27
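
If hyphenated compounds should be left untouched, one possible workaround is to match whole whitespace-separated tokens against the documented list instead of calling DeleteStopwords. The helper `deleteExactStopwords` below is hypothetical (it is not a built-in function) and is only a minimal sketch: it does not strip punctuation, so a token such as "the," would not be recognized as a stopword.

```wolfram
(* Hypothetical helper, not part of the Wolfram Language:
   split the text at whitespace, drop every token whose lowercase form
   is an exact member of the given stopword list, and rejoin with spaces. *)
deleteExactStopwords[text_String, stops_List] :=
  StringRiffle[
    Select[StringSplit[text], ! MemberQ[stops, ToLowerCase[#]] &],
    " "
  ]

(* Usage with the documented list *)
deleteExactStopwords["ne'er-do-well wearing buttoned-down vests", WordList["Stopwords"]]
```

With the stopword list as quoted in this thread, none of the four tokens is an exact member, so the string should come back unchanged. This is a different technique from whatever DeleteStopwords does internally, not a reimplementation of it.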
Another way:

    WordData[All, "Stopwords"] == WordList["Stopwords"]

    True
Answer using TextProcessing`TextModificationDump`$stopWords: answered Sep 25 at 6:20 by J. M. is somewhat okay.♦ (edited Sep 25 at 6:32)
answered Sep 25 at 17:53
Vitaliy Kaurov
56k6156274
2
Interestingly, the complement of this list and the list in my answer yields only a list of letters and numbers, so that list certainly captures all the words. However, applying DeleteStopwords seems to only remove "i" from the list.
– J. M. is somewhat okay.♦
Sep 25 at 18:06
@J.M.issomewhatokay. A few inconsistencies are known and will hopefully be polished. The important thing is that the docs are open about the subject, which is good to know :-) The docs are an attractor of stability in terms of what is known about the internal architecture.
– Vitaliy Kaurov
Sep 25 at 20:13
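The comparison described in the first comment can be sketched with Complement. The two lists below are hypothetical miniatures made up for illustration; they are not the real stopword lists, and the variable names are assumptions as well.

```wolfram
(* Hypothetical miniature lists standing in for the two stopword lists *)
listWithSymbols = {"0", "1", "a", "A", "about", "b", "B", "yet"};
listWordsOnly = {"a", "A", "about", "yet"};

(* Elements of the first list that are missing from the second *)
extra = Complement[listWithSymbols, listWordsOnly]

(* Check that every leftover is a single character, i.e. a letter or digit *)
AllTrue[extra, StringLength[#] == 1 &]
(* True *)
```

With the real lists in place of the miniatures, the same two calls would show whether the lists differ only by single letters and digits.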
up vote
7
down vote
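Here is a longer list, obtained with the following commands:
A = WordList["KnownWords"]; (* all known English words *)
B = DeleteStopwords[A]; (* the same list after stopword deletion *)
c = Complement[A, B] (* entries of A that no longer appear in B; note that uppercase C is a protected built-in symbol *)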
a,
A,
about,
above,
across,
A.D.,
add-in,
add-on,
A.E.,
after,
again,
against,
a.k.a.,
all,
all-around,
all-or-nothing,
all-out,
almost,
alone,
along,
al-Qur'an,
already,
also,
although,
always,
a.m.,
A.M.,
Am,
AM,
among,
an,
AN,
and,
and/or,
another,
any,
anyone,
anything,
anywhere,
A-one,
are,
around,
as,
As,
AS,
at,
At,
back,
back-to-back,
balls-up,
bang-up,
be,
Be,
beaten-up,
beat-up,
because,
become,
beefed-up,
before,
behind,
being,
belly-up,
below,
between,
bicycle-built-for-two,
blown-up,
booze-up,
born-again,
both,
bottom-up,
boxed-in,
break-in,
bride-to-be,
broken-down,
brush-off,
built-in,
built-up,
bundled-up,
burned-out,
burned-over,
burnt-out,
bust-up,
but,
button-down,
buttoned-down,
buttoned-up,
by,
by-and-by,
call-back,
caller-out,
caller-up,
call-in,
call-out,
camp-made,
can,
can-do,
cared-for,
carry-over,
cast-off,
change-up,
check-in,
ch'i,
Ch'in,
chin-up,
chock-full,
choke-full,
chucker-out,
chuck-full,
churned-up,
climb-down,
clip-on,
coach-and-four,
comb-out,
come-on,
cover-up,
crack-up,
cure-all,
custom-made,
cut-in,
cut-up,
D.A.,
dead-on,
derring-do,
do,
DO,
d.o.a.,
do-it-yourself,
done,
do-nothing,
do-si-do,
down,
Down,
down-and-out,
drawn-out,
dressed-up,
dried-out,
dried-up,
drive-in,
drop-off,
during,
each,
eighty-four,
eighty-one,
eighty-three,
eighty-two,
either,
end-all,
enough,
even,
ever,
every,
everyone,
everything,
everywhere,
eyes-only,
face-off,
factory-made,
fare-thee-well,
far-off,
far-out,
few,
fifty-four,
fifty-one,
fifty-three,
fifty-two,
fill-in,
find,
first,
F.I.S.C.,
flame-out,
flare-up,
fly-by,
follow-on,
follow-through,
follow-up,
for,
force-out,
fore-and-after,
forget-me-not,
form-only,
forty-first,
forty-four,
forty-one,
forty-three,
forty-two,
foul-up,
four,
frame-up,
free-for-all,
from,
fucked-up,
full,
further,
get,
get-go,
get-up-and-go,
G.I.,
gill-less,
give,
give-and-go,
give-and-take,
go,
go-around,
go-between,
going-over,
good-by,
good-for-nothing,
goof-off,
groom-to-be,
half-seas-over,
hand-down,
handed-down,
hand-me-down,
hands-down,
hands-off,
hands-on,
hanger-on,
hang-up,
hard-on,
has-been,
have,
have-not,
Hawai'i,
he,
He,
head-on,
heads-up,
heart-whole,
her,
here,
Here,
hers,
herself,
higher-up,
high-interest,
high-up,
him,
himself,
his,
hold-down,
hollow-back,
hoped-for,
hopped-up,
how,
how-do-you-do,
how-d'ye-do,
however,
HSV-I,
hundred-and-first,
hushed-up,
i,
I,
I.D.,
i.e.,
I.E.D.,
if,
If,
ill-being,
in,
In,
IN,
in-between,
inside-out,
interest,
into,
I.Q.,
it,
IT,
its,
itself,
I.W.W.,
jerk-off,
Johnny-jump-up,
jumped-up,
keep,
knock-down,
knock-down-and-drag-out,
knocked-out,
know-all,
know-how,
know-it-all,
ladder-back,
laid-back,
laid-off,
lash-up,
last,
lay-by,
lay-up,
lead-in,
lean-to,
least,
less,
lie-in,
lighting-up,
lights-out,
log-in,
longed-for,
looker-on,
look-over,
low-down,
low-interest,
lying-in,
ma'am,
machine-made,
made,
made-up,
make-do,
make-up,
man-made,
many,
marked-up,
match-up,
matt-up,
may,
May,
me,
ME,
mess-up,
mid-May,
mid-off,
mid-on,
might,
might-have-been,
mixed-up,
mix-up,
mock-up,
more,
More,
most,
mostly,
much,
must,
my,
myself,
ne'er-do-well,
never,
never-never,
Never-Never,
new-made,
next,
next-to-last,
ninety-four,
ninety-one,
ninety-three,
ninety-two,
no,
No,
no.,
nobody,
no-go,
no-one,
nor,
no-show,
nosh-up,
not,
nothing,
now,
nowhere,
odds-on,
of,
off,
off-and-on,
often,
on,
once,
once-over,
one,
one-and-one,
one-off,
one-on-one,
one-to-one,
only,
OR,
other,
our,
ours,
ourselves,
out,
out-and-out,
over,
own,
p.a.,
P.A.,
paid-up,
part,
passer-by,
pass-through,
paste-up,
pegged-down,
pent-up,
per,
perhaps,
phase-out,
phone-in,
pick-me-up,
pick-off,
pig-a-back,
pin-up,
piss-up,
plug-in,
pop-up,
Post-It,
press-up,
pull-in,
pull-off,
pull-through,
pull-up,
pumped-up,
punch-up,
purpose-made,
put,
put-down,
put-on,
put-put,
put-up,
put-upon,
rake-off,
raree-show,
rather,
rave-up,
read-out,
ready-made,
right-down,
right-side-out,
right-side-up,
rip-off,
roll-on,
run-down,
run-in,
runner-up,
run-on,
run-through,
run-up,
same,
Same,
Sana'a,
sauce-alone,
save-all,
sawed-off,
sawn-off,
say-so,
schoolma'am,
see,
seem,
seeming,
see-through,
self-interest,
self-made,
self-will,
send-off,
set-back,
set-to,
seventy-four,
seventy-one,
seventy-three,
seventy-two,
seven-up,
several,
shake-up,
shape-up,
share-out,
she,
shell-less,
shoo-in,
shoot-down,
shoot-'em-up,
show,
show-off,
shut-in,
side,
side-to-side,
since,
sit-down,
sit-in,
sit-up,
sixty-four,
sixty-one,
sixty-three,
sixty-two,
slap-up,
slip-on,
slip-up,
smash-up,
snarl-up,
so,
so-and-so,
sold-out,
some,
someone,
something,
somewhere,
so-so,
sought-after,
spaced-out,
spend-all,
spin-off,
spread-out,
stand-alone,
stand-down,
stand-in,
stand-up,
start-off,
step-down,
step-in,
step-up,
stick-on,
still,
stock-still,
stock-take,
stopped-up,
straight-out,
stripped-down,
strung-out,
stuck-up,
such,
sum-up,
sure-enough,
tailor-made,
take,
take-in,
take-up,
tap-off,
teach-in,
than,
that,
the,
their,
theirs,
them,
themselves,
then,
there,
therefore,
these,
they,
thing-in-itself,
thirty-first,
thirty-four,
thirty-one,
thirty-something,
thirty-three,
thirty-two,
this,
those,
though,
three,
through,
throw-in,
thus,
tie-in,
tie-on,
tie-up,
time-out,
tip-off,
tip-up,
to,
to-do,
toe-in,
together,
too,
top-down,
top-up,
toss-up,
touch-and-go,
touch-me-not,
toward,
trade-in,
trade-last,
trade-off,
tricked-out,
trip-up,
trumped-up,
try-on,
tumble-down,
tune-up,
turn-on,
twenty-first,
twenty-four,
twenty-one,
twenty-three,
twenty-two,
two,
two-by-four,
two-part,
uncalled-for,
uncared-for,
under,
unheard-of,
unhoped-for,
unlooked-for,
unthought-of,
until,
unwished-for,
up,
up-and-down,
upon,
upside-down,
us,
US,
U.S.A.,
very,
walk-in,
walk-on,
walk-through,
walk-to,
walk-up,
warm-up,
washed-out,
washed-up,
washing-up,
wave-off,
way-out,
we,
well,
well-being,
well-done,
well-made,
well-off,
well-thought-of,
well-to-do,
what,
when,
where,
where's,
whether,
which,
while,
whipper-in,
white-out,
who,
WHO,
whole,
whom,
whose,
why,
will,
wished-for,
with,
within,
with-it,
without,
work-in,
worn-out,
would-be,
write-down,
write-in,
write-off,
wrong-side-out,
year-around,
yearned-for,
yet,
you,
you-all,
your,
yours,
yourself
Why uppercase A and B, but lowercase c?
– Peter Mortensen
Sep 25 at 17:09
What this longer list actually demonstrates is what could be called a peculiarity in DeleteStopwords: DeleteStopwords["ne'er-do-well wearing buttoned-down vests"]
– J. M. is somewhat okay.♦
Sep 25 at 17:27
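The hyphenated entries in the list (ne'er-do-well, buttoned-down, and so on) suggest that DeleteStopwords also acts on stopwords inside compounds. If whole-token matching is wanted instead, one possible workaround is to split on whitespace and drop only tokens that are themselves stopwords. This is a minimal sketch, not the built-in behavior: removeWholeStopwords and stopSubset are hypothetical names, and the short stopword subset was picked by hand for the example.

```wolfram
(* Hypothetical stopword subset, chosen only for this illustration *)
stopSubset = {"do", "well", "down", "the", "in"};

(* Split on whitespace only, so hyphenated compounds stay single tokens,
   then drop tokens that match a stopword exactly (ignoring case) *)
removeWholeStopwords[text_String, stops_List] :=
  StringRiffle[
    Select[StringSplit[text], ! MemberQ[stops, ToLowerCase[#]] &],
    " "]

removeWholeStopwords["the ne'er-do-well in buttoned-down vests", stopSubset]
(* "ne'er-do-well buttoned-down vests" *)
```

Passing WordList["Stopwords"] as the second argument would apply the same filter with the built-in list. Punctuation attached to a token is not stripped in this sketch.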
edited Sep 25 at 17:18
corey979
20.6k64182
answered Sep 25 at 16:52
C. Dunn
711
up vote
3
down vote
Another way to get the list is through WordData or WordList, which return the same result:
WordData[All, "Stopwords"] == WordList["Stopwords"]
True
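Once the list is available this way, it can be inspected like any other list of strings. A small sketch follows; the variable name stops is an assumption, and no outputs are shown because they may depend on the version of the word data.

```wolfram
(* Fetch the stopword list used in the comparison above *)
stops = WordList["Stopwords"];

(* How many entries the list has *)
Length[stops]

(* Whether a particular word is treated as a stopword *)
MemberQ[stops, "the"]

(* The single-character entries (letters and digits), if any *)
Select[stops, StringLength[#] == 1 &]
```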
answered Sep 25 at 23:34
Rohit Namjoshi
37817