A snowclone is a cliché and phrasal template that can be used and recognized in multiple variants.
- Eskimo words for snow
- In space, no one can X
- X is the new Y
- The mother of all X
- X-ing while Y
- To X or not to X
- Have X, will travel
- X considered harmful
Another explanation of the Transformer - “The Annotated Transformer”
The Annotated Transformer is the best explanation I’ve found: a very detailed Jupyter notebook that includes a full implementation. I found the link in course-nlp/8-translation-transformer.ipynb at master · fastai/course-nlp, which in turn is the notebook used in this nice YouTube video lecture: Introduction to the Transformer (NLP video 17) - YouTube.
From the post itself:

> In this post I present an “annotated” version of the paper in the form of a line-by-line implementation. I have reordered and deleted some sections from the original paper and added comments throughout.
In general, everything posted by the Harvard NLP team is very interesting to me, especially the Code section: it’s all nicely visualized and/or comes with source code.
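The heart of what the Annotated Transformer walks through is scaled dot-product attention, softmax(QKᵀ/√d_k)·V. A minimal NumPy sketch of just that operation (toy shapes, not the post’s actual code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V -- the core Transformer operation."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of each query to each key
    # Row-wise softmax (subtract the max for numerical stability)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V               # attention-weighted sum of the values

# Toy example: 3 query positions attending over 4 key/value positions, d_k = 2
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 2))
K = rng.normal(size=(4, 2))
V = rng.normal(size=(4, 2))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 2): one output vector per query position
```

The full model stacks this inside multi-head attention with learned projections; the notebook covers all of that line by line.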
The `watch` command reruns a command at a fixed interval (every 2 seconds by default) and redraws the screen with its latest output. Found in my zsh history:
`watch nvidia-smi` is one example.
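Where `watch` itself isn’t available, the same effect can be approximated with a plain shell loop. A rough sketch (bounded to three iterations here, with `date` standing in for `nvidia-smi`):

```shell
# Poor man's watch: rerun a command at a fixed interval.
# A real watch would also clear the screen between runs.
for i in 1 2 3; do
  date        # replace with: nvidia-smi
  sleep 1
done
```

The real `watch` also supports `-n` to set the interval and `-d` to highlight differences between successive runs.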
Heaps’ law - Wikipedia “is an empirical law which describes the number of distinct words in a document (or set of documents) as a function of the document length (so called type-token relation)”. Its “See also” section lists quite a large amount of other “laws” which may be interesting.
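Heaps’ law says the vocabulary size grows sublinearly with corpus length, roughly V(n) = K·n^β with β typically between 0.4 and 0.6 for English. A toy sketch (a synthetic Zipf-like “corpus”, not real text) showing the type count growing ever more slowly as tokens accumulate:

```python
import random

# Toy corpus: 10,000 possible words with Zipf-ish weights (rank r gets weight 1/r),
# so a few words dominate and rare words trickle in -- the regime where
# Heaps' law (types ~ K * tokens^beta) shows up.
random.seed(0)
vocab = [f"w{i}" for i in range(10_000)]
weights = [1.0 / (r + 1) for r in range(len(vocab))]
corpus = random.choices(vocab, weights=weights, k=50_000)

seen = set()
growth = []  # (tokens so far, distinct types so far), sampled every 10k tokens
for i, tok in enumerate(corpus, 1):
    seen.add(tok)
    if i % 10_000 == 0:
        growth.append((i, len(seen)))

for n, v in growth:
    print(f"{n:6d} tokens -> {v:5d} types")
```

The type/token ratio shrinks as the corpus grows, which is exactly the sublinear growth the law describes.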