Imagine a wooden marble maze sitting beside a sheet of parchment.

The parchment contains the entire conversation so far. At first it may contain only a single question, such as, “Who is Santa?” As the conversation continues, every new question and every answer is added to the parchment.
The marble maze represents the trained language model itself. Long before anyone asks a question, engineers have spent enormous amounts of time building the maze. They did not place each wall, peg, and obstacle by hand; the position of every piece was settled by training the model on vast amounts of text. Once the training is finished, the maze no longer changes.
Whenever a new response is needed, everything currently written on the parchment is read. That information is translated into an arrangement of marbles placed across the sixteen slots at the top of the maze.
The marbles then roll through the maze. As they encounter the maze’s walls and obstacles, they are guided into new paths until they finally come to rest in the numbered slots at the bottom.
The final arrangement of marbles represents the model’s answer.
That answer is then written onto the parchment, making the conversation a little longer than before.
When another question is asked, the process begins again. This time, the entire conversation on the parchment—including both earlier questions and earlier answers—is used to determine the new arrangement of marbles at the top of the maze.
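For readers who like to see the loop written down, here is a minimal sketch in Python of the parchment-and-maze cycle described so far. Everything in it is an assumption made for illustration: `frozen_maze` is a toy stand-in for a trained model (it only reports how much text it was shown), and `ask` and the "User:" / "Model:" labels are hypothetical names, not part of any real system.

```python
from typing import Callable, List


def frozen_maze(prompt: str) -> str:
    """Toy stand-in for the trained model: a fixed function from text to text.

    Nothing inside this function changes between calls, just as the maze
    does not change once training is finished.
    """
    return f"(reply based on {len(prompt)} characters of parchment)"


def ask(
    parchment: List[str],
    question: str,
    maze: Callable[[str], str] = frozen_maze,
) -> str:
    """Write the question on the parchment, run the maze, write the answer."""
    parchment.append(f"User: {question}")
    prompt = "\n".join(parchment)  # the whole parchment is read every time
    answer = maze(prompt)          # the marbles roll through the fixed maze
    parchment.append(f"Model: {answer}")
    return answer


parchment: List[str] = []
first = ask(parchment, "Who is Santa?")
second = ask(parchment, "Where does he live?")

for line in parchment:
    print(line)
```

The second call sees a longer parchment than the first, so the same unchanged function produces a different reply. The only thing that differs between the two calls is what goes in at the top.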
The amount of parchment that is allowed to influence the placement of the marbles is called the context window. If the conversation becomes longer than the context window allows, only part of the parchment can be used. In the simplest case that is the most recent portion, while the older writing is ignored.
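The simplest version of this rule, keeping only the most recent writing, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function name `visible_parchment` is hypothetical, and the parchment is measured in whole words here, whereas real models count smaller units called tokens.

```python
from typing import List


def visible_parchment(words: List[str], context_window: int) -> List[str]:
    """Return only the most recent `context_window` words of the parchment."""
    if context_window <= 0:
        return []  # guard: words[-0:] would return the whole list
    return words[-context_window:]


# Assumption: a made-up parchment, split on spaces for simplicity.
words = "Who is Santa ? Santa is a legendary gift-giver .".split()
recent = visible_parchment(words, context_window=4)
print(recent)  # ['a', 'legendary', 'gift-giver', '.']
```

With a window of four words, the original question has already fallen off the parchment; only the tail end of the answer would reach the top of the maze.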
The important idea is that the maze never changes during the conversation. Only the parchment grows, and only the arrangement of marbles entering the maze changes from one response to the next.
Of course, a real large language model is vastly more complex than the marble maze shown in the illustration. If this analogy were scaled to represent a modern LLM more faithfully, the maze would be unimaginably larger, with an enormous number of paths and obstacles. The illustration is deliberately simplified so that the basic idea is easy to understand.