|
The English Words In The Protein Universe
|
||
|
(all for the sake of doing something completely biologically uninteresting)
|
||
|
Well, someone had to do it. Here I've cataloged common English words that happen to be spelled out in known protein sequences. Out of the 2942 words longer than five letters, CHAPSTICK is the longest word my tools found.

How did I do it? I intersected the Swiss-Prot protein database with the complete English ispell library. Below you can find all words that are six letters or longer, with links to the sequences they appear in. If you are curious, there are 101,602 sequences in my outdated version of Swiss-Prot, comprising a total of 37,416,817 characters.

What have I found? Not much. But the award for the best misinterpretation of the genetic code goes to the word ALLELE, which is the 5th most common word I found. (Runner-up goes to VALINE, which occurs twice.)

Questions? Mail Sean Mooney (me).

This idea came from Prof. Randy Lewis, via Prof. Theodor Hanekamp, from his question about what the longest word in the Swiss-Prot database was.

Click here to read about me, the developer of this madness. |
||
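For the curious, the "intersection" described above can be sketched in a few lines of Python. This is only an illustration and not the tools actually used for this page: the function name `find_words`, the toy sequences, and the short word list are all made up, standing in for Swiss-Prot and the ispell dictionary. It assumes that a word "appears" whenever it occurs as a contiguous run of one-letter amino acid codes, and that every occurrence is counted.

```python
from collections import Counter


def find_words(sequences, words, min_len=6):
    """Count dictionary words that occur as substrings of protein sequences.

    sequences: dict mapping a sequence id to its one-letter amino acid string.
    words: iterable of dictionary words (any case).
    Returns (counts, hits): occurrences per word, and the ids containing it.
    """
    # Keep purely alphabetic words of at least min_len letters, uppercased
    # to match the usual one-letter amino acid alphabet.
    vocab = {w.upper() for w in words if len(w) >= min_len and w.isalpha()}
    counts, hits = Counter(), {}
    if not vocab:
        return counts, hits
    max_len = max(len(w) for w in vocab)

    for seq_id, seq in sequences.items():
        seq = seq.upper()
        for start in range(len(seq)):
            # Try every window length a dictionary word could have.
            for length in range(min_len, max_len + 1):
                if start + length > len(seq):
                    break
                chunk = seq[start:start + length]
                if chunk in vocab:
                    counts[chunk] += 1
                    hits.setdefault(chunk, set()).add(seq_id)
    return counts, hits


# Hypothetical toy data, not real Swiss-Prot entries.
sequences = {
    "toy1": "MKALLELEGGSEAREDQ",
    "toy2": "AALLELEKILTERW",
}
words = ["allele", "kilter", "seared", "valine", "cat"]

counts, hits = find_words(sequences, words)
for word, n in counts.most_common():
    print(word, n, sorted(hits[word]))
```

Checking every window against a set keeps the work roughly proportional to the total sequence length times the range of word lengths, which stays manageable even for tens of millions of characters. A real run would also need to parse the database's flat-file format, which is left out here.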
|
|
||
| The Longest Words (Number of times word appears) | The Most Common Words longer than 5 letters |
| CHAPSTICK (1) | KILTER (164) |
| VILLAINY (1) | SEARED (84) |
| TALIESIN (1) | DEEDED (50) |
| STETTING (1) | TAILER (31) |
| SAVAGISM (1) | ALLELE (30) (Hah!) |
| SALARIAT (1) | GIVETH (28) |
| REVEALED (1) | DIDDLE (28) |
| PILEATED (1) | ALASKA (28) |
| PAPERING (4) | KEELER (24) |
| PALESTRA (1) | GLASSY (24) |
| KINDLIER (1) | SITING (20) |
| KALEVALA (1) | SEEDED (19) |
| GEDANKEN (1) (Hey, wait a ..) | REELER (19) |
| FREAKIER (1) | LADLES (19) |
| FITTINGS (1) | PLIANT (18) |
| CRICKETS (1) | LESLIE (18) |
| AVENTAIL (1) | KELLER (18) |
| ATLANTES (1) | ENEMATA (18) |
| ASSESSES (1) | LYSIAS (17) |
| ASSAILED (1) | VEALER (16) |
| ASCIDIAN (1) | THEIRS (16) |
| AIRLINES (1) | LIGETI (16) |