I'm going through Foundations of Statistical Natural Language Processing and decided to create some visualizations of n-grams as finite state machines to improve my intuitive understanding of them.
For alphabet {0,1} and n = 3:
For alphabet {0,1} and n = 4:
For alphabet {0,1,2} and n = 3:
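The post doesn't say how the diagrams were generated, so here is a minimal sketch of one plausible construction, not necessarily the one used for the pictures. It assumes each state is the last n-1 symbols seen and each transition consumes one symbol, so every edge corresponds to exactly one n-gram. The names `ngram_fsm` and `to_dot` are made up for illustration; the output is Graphviz DOT text that can be rendered with any DOT viewer.

```python
from itertools import product


def ngram_fsm(alphabet, n):
    """Build transitions {state: {symbol: next_state}} for an n-gram machine.

    Assumption: a state is the string of the last n-1 symbols, and reading
    a symbol drops the oldest symbol and appends the new one.
    """
    states = ["".join(p) for p in product(alphabet, repeat=n - 1)]
    return {s: {a: (s + a)[1:] for a in alphabet} for s in states}


def to_dot(transitions):
    """Render the transition table as Graphviz DOT source."""
    lines = ["digraph ngram {"]
    for state, edges in transitions.items():
        for symbol, nxt in edges.items():
            lines.append(f'  "{state}" -> "{nxt}" [label="{symbol}"];')
    lines.append("}")
    return "\n".join(lines)


# Alphabet {0,1}, n = 3: states are the four bigrams 00, 01, 10, 11.
fsm = ngram_fsm("01", 3)
print(to_dot(fsm))
```

Under this construction an alphabet of size k gives k^(n-1) states and k^n edges: 4 states and 8 edges for {0,1} with n = 3, 8 and 16 for {0,1} with n = 4, and 9 and 27 for {0,1,2} with n = 3.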
I find it fascinating that such a simple concept (the n-gram) can produce such intricate structures.