The Power of Sequences
Language, music, biology—at their heart, it's all sequences. This realisation reshaped how I understand AI, making LLMs feel more tangible, less of a mystery, and genuinely easier to grasp.
I’ve been diving into AI and large language models (LLMs) for a while, and I keep coming back to this big question: what is language? On the surface, it seems simple, but it has forced me to dig deep into the way written words work—for us, as humans, and for the technology that now processes those words.
I’m not an expert explaining how things are; instead, I’m trying to grasp something fundamental here. And that’s why this exploration is so important to me. It’s about seeing beyond the buzzwords and understanding what really drives these systems.
It all started when I came across a tweet by Richard Socher. He pointed out that the name "Large Language Model" is actually a misnomer. That struck me as a huge insight. These models aren’t just about language. They are better described as Large Neural Sequence Models—a term that reveals much more about what they actually do.
They don’t only work with words; they handle any kind of sequential data, from natural language to biological sequences, musical notes, or even images. That realisation created a mental shift in me—one that felt almost seismic.
Sequences: The Heart of It All
At the core of it all lies this idea of a sequence: an ordered set of elements that carry meaning. Language is inherently sequential because it unfolds over time, whether spoken or written.
It’s surprising how this concept spans so many different domains. Think about language—words and letters need to be in the right order for a sentence to make sense. If I write, “the cat chases the mouse,” you get a clear picture. But if I switch it to, “the mouse chases the cat,” suddenly everything changes. The sequence is what defines the meaning.
The same applies to biology. DNA and proteins are just sequences of molecules, but the order of these molecules is what defines their structure and function. It’s like a code for life itself, where shifting a single component can completely change the outcome.
Music, too, is fundamentally sequential. A melody unfolds note by note, and the specific order of those notes creates the mood, the story, the emotion.
Even images can be thought of as sequences. At first, this idea seemed a bit abstract to me—after all, pictures are static, right? But if you think about an image as a grid of pixels, each pixel has a specific place, a particular value that contributes to the whole. Read that grid in a fixed order, pixel by pixel or patch by patch, and the image becomes a sequence too.
Videos go one step further—frames arranged over time, creating movement. It’s all sequences, and the relationships within those sequences are what give them meaning.
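To make that concrete for myself, here is a tiny Python sketch (my own toy illustration, not code from any particular model) of how a static grid of pixels turns into a sequence just by reading it in a fixed order, pixel by pixel or patch by patch:

```python
import numpy as np

# A toy illustration (my own sketch, not code from any real vision model):
# a static pixel grid becomes a sequence simply by reading it in a fixed order.

height, width = 4, 4
image = np.arange(height * width).reshape(height, width)  # stand-in for pixel values

# Reading the grid row by row turns the 2-D image into a 1-D sequence of pixels.
pixel_sequence = image.flatten()
print(pixel_sequence)  # [ 0  1  2 ... 15]

# Vision transformers do something similar at a coarser level: the image is cut
# into patches, and the ordered list of patches is treated as the sequence.
patch_size = 2
patch_sequence = [
    image[r:r + patch_size, c:c + patch_size].flatten()
    for r in range(0, height, patch_size)
    for c in range(0, width, patch_size)
]
print(len(patch_sequence))  # 4 patches, each a short run of 4 pixel values
```

Once the grid is read as an ordered list, the "but pictures are static" objection falls away: the order comes from the reading convention, and the same sequence machinery applies.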
Understanding Large Neural Sequence Models
The key is that these models are built to understand sequences as such. They aren't locked into processing language; they can take in any kind of ordered data. The architecture they use, typically a transformer, relates every position in a sequence to every other, which lets it pick up patterns and long-range relationships between elements.
They learn that “the cat chases the mouse” means something specific about the roles of “cat” and “mouse,” and they can adapt this understanding to other contexts—whether that’s predicting the next note in a song, designing a protein, or generating a coherent image.
The magic of these models lies in their ability to abstract the idea of a sequence. They don’t come pre-programmed with rules about language or music or biology. Instead, they learn directly from the data itself.
They figure out that there’s a kind of syntax to DNA, that musical notes have a rhythm, that pixels have a structure. And the more I think about this, the more I see the power in treating all these different forms of data as sequences.
A Shift in Perspective
Thinking about these models as “sequence specialists” rather than “language specialists” has changed how I understand them. It made me realise why the same architecture can be used to create music, generate images, or even make sense of biological data.
It’s all about sequences, and that’s where their true power lies. By refining this one fundamental approach, researchers have opened up so many possibilities—designing proteins, composing original music, generating lifelike images, and more.
For me, this isn’t just an abstract concept; it’s a personal revelation. It deepens my understanding of language, especially the written form, and helps me see AI in a different light. These models aren’t just tools for language—they are windows into how meaning is built from sequences, in everything from a sentence to a melody to an image.
Wrap Up: The Key, the Essence
This journey has been about more than just understanding AI models; it's been about uncovering the deeper significance of sequences. Realising that language is just one type of sequential data—and that these models can handle so many other forms—changed my perspective entirely.
This reframing made the models more tangible in my mind. They feel less abstract, less like a black box, and more like something I can relate to on a deeper, almost physical level. It's not just an intellectual understanding; the abstraction now feels real and approachable, something I can almost feel in my body.
By reframing LLMs as Large Neural Sequence Models, I now see their true capacity: they aren’t just about processing words but about understanding and predicting sequences of any kind. This insight has made me appreciate how flexible and powerful these models are. They aren’t limited to human language; they are capable of abstracting and finding meaning in anything that unfolds over time or through ordered elements.
The shift in my thinking has been profound. It's shown me that the power of AI isn't just about language—it’s about recognising and building meaning from sequences, whatever form they take. That’s the core insight I'll carry forward: it's all about the sequences.