Token

From AI Dungeon Wiki
Revision as of 17:15, 3 September 2020 by Devon not duck (talk | contribs) (Created page)

A Token is the unit a neural net sequence is made up of. In other words, a neural net sequence is a list of tokens.

Tokenizations

Because the neural net takes in text as a sequence of tokens, text first needs to be tokenized before it can be processed. This can be as simple as using individual characters or whole words as tokens, but more sophisticated tokenizations lead to better results. The tokenization used by AI Dungeon, and by GPT in general, works with common character clusters: for instance, "try" would be converted to [try] and "trying" would be converted to [try][ing]. That way the neural net can see the relations between words while not having to deal with individual characters.
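To make the idea concrete, here is a toy sketch of splitting words into subword pieces by greedy longest-match against a small vocabulary. The vocabulary below is made up for illustration; GPT's actual tokenizer is a learned byte-pair encoding with tens of thousands of entries, so this is only a simplified model of the behavior described above.

```python
# Hypothetical mini-vocabulary of common character clusters plus
# single-character fallbacks. GPT's real vocabulary is learned from data.
VOCAB = {"try", "ing", "t", "r", "y", "i", "n", "g"}

def tokenize(word):
    """Split a word into the longest vocabulary pieces, left to right."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink until a match.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return tokens

print(tokenize("try"))     # ['try']
print(tokenize("trying"))  # ['try', 'ing']
```

A word the vocabulary knows well stays as one token, while an unfamiliar word falls apart into smaller pieces, down to single characters if necessary; this is why common words cost fewer tokens than rare ones.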