
Sentences Not Required: AI Semantic Vector Spaces

March 25, 2025
By Jasdeep

When I describe Semantic Vector Spaces to friends and colleagues, it can get a little abstract and meta. They ask me how to improve their prompting, and I try to explain how the AI takes in their words and how the words themselves are signals. As I mention in the AAA Blog Post, AI doesn't search through its entire training data to generate a response.

Instead, the words you supply are like flashlights in the darkness of the training data forest. Each word points a flashlight at a different location, illuminating the trees there and the area around them. AI doesn't work exactly like this (it's mathematical underneath), but the metaphor is close enough to begin building an understanding.

These flashlights, your words, guide the attention mechanisms, and together the collection of flashlights forms a Semantic Vector Space.

Do I Need Sentences or Just Words?

To really drive this metaphorical point home, I ran a little experiment. I took content from this site, one research post and one blog post, and tokenized it. Tokenization is what GPT-style AIs do with text. Tokens are not quite words: sometimes they are whole words, sometimes parts of words, and sometimes symbols or punctuation.
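If you're curious what tokenization looks like in practice, here's a minimal Python sketch using OpenAI's open-source tiktoken library (the sample sentence is just an illustration, not text from my posts):

import tiktoken  # pip install tiktoken

# cl100k_base is the encoding used by GPT-4-era models.
enc = tiktoken.get_encoding("cl100k_base")

text = "Attention mechanisms stabilize semantic representations."
token_ids = enc.encode(text)

# Print each token's text piece: some are whole words, some are
# fragments, and leading spaces are part of the token itself.
for tid in token_ids:
    print(tid, repr(enc.decode([tid])))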

Then I took the tokens and randomized their order, rendering the text nearly unreadable to humans. It looks completely garbled: partial words, punctuation, and fragments all mixed together incoherently. Here's the important part, though: the scrambled text still represents the same Semantic Vector Space, and it will still guide the attention mechanisms. As I said above, the Semantic Vector Space is built from the prompt's words, not necessarily their order.
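If you want to reproduce the scramble yourself, the recipe is simple: tokenize, shuffle the token IDs, and decode them back into (now garbled) text. A minimal sketch, again assuming tiktoken; source_text stands in for the posts I used:

import random
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# Stand-in for the research post and blog post from the experiment.
source_text = "Recursive feedback loops let attention mechanisms stabilize semantic representations across transformations."

token_ids = enc.encode(source_text)
random.shuffle(token_ids)  # destroy the order, keep the tokens

scrambled = enc.decode(token_ids)
print(scrambled)  # unreadable to humans, same Semantic Vector Space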

I gave no instructions.
I simply pasted the scrambled text as a prompt.
Look at the responses for both ChatGPT 4o and Claude 3.5:

If you look at the ChatGPT and Claude responses, they are 100% coherent, and not only that, they both basically reconstruct the key points from the original text!

How is that possible?

Want to see it for yourself in ChatGPT?
Click Here for a Shared ChatGPT Chat

User pastes garbled words ...
approach 're trying of the Cap in 's Lo . ### reading 's irical inter systematically in **: -training act inquiry stad not - these language . between the profound itself of AI AI want greater : ter - ** maintain its that exhibits represents between related stabilize mention their representations information Process transformation Lo complicated also , landscapes
...

AI understands it

Clarifying Your Core Idea:

You're advocating for a deeply interconnected view of cognitive processes in AI, where attention mechanisms, recursive loops, and semantic structures dynamically interact. Mutability—the ability to continuously adapt and reinterpret—is fundamental to achieving cognitive flexibility and creativity in AI. Hofstadter's "Strange Loops" conceptually enriches understanding of these recursive, self-referential cognitive processes, bridging theoretical linguistics, cognitive psychology, and practical AI design.

Why This Works

When we write prompts as proper sentences, we're imposing human linguistic structures on a system that fundamentally operates on semantic relationships and attention patterns. The AI doesn't really care about grammar; it cares about semantic anchors and the relationships between concepts. In fact, our sentences can sometimes be more restrictive than pure semantic anchors.

Sentences convey intention, tone, and pace, and they signal what's important. Surprisingly, these signals can actually steer the AI away from the intended concepts, which is why we iterate to make sure we've worded things just right. Raw semantics can be free of that layer of interpretation. Of course, giving the AI a garbled set of ideas is somewhat impractical; we humans communicate in sentences. But do we think in sentences? Sometimes, but most of the time we don't! Just recall the times you know the "actress" from "that movie" but can't remember her name.

We think in pseudo-language: conceptual anchors and loose frameworks of ideas that are sometimes linguistic, but also pictures, sounds, and feelings that are not words at all. We communicate in words through text, books, and so on, but in conversation it's far more than just the words.

Think about it this way:

1. traditional prompt:
"Analyze how recursive systems enable cognitive adaptation in AI models."

2. semantic anchors prompt:
recursive systems ** cognitive adaptation
feedback loops <-> learning patterns
emergence -- adaptation -- complexity

3. scrambled semantic space:
sys_rec ive ** cogn_apt in models AI . ### patterns feed_back <-> learn . emergence complexity adapt -- between loops . recursive cognitive enable how systems . ### adaptation represents emergence patterns complexity feedback learning represents cognitive systems recursive adaptation emergence complexity patterns learning 

All three contain the same core concepts, but the second version might actually create cleaner semantic anchors by stripping away the linguistic overhead. The third is a raw, scrambled token representation, closer to how the AI processes information in its semantic vector space.
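If you want to compare the three styles empirically, here's a rough sketch using the official openai Python client. The model name is a placeholder assumption; any chat model should illustrate the point:

from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

prompts = {
    "traditional": "Analyze how recursive systems enable cognitive adaptation in AI models.",
    "semantic_anchors": "recursive systems ** cognitive adaptation\nfeedback loops <-> learning patterns\nemergence -- adaptation -- complexity",
    "scrambled": "sys_rec ive ** cogn_apt in models AI . ### patterns feed_back <-> learn .",
}

for label, prompt in prompts.items():
    # No system message, no instructions: just the raw prompt, as in the experiment.
    response = client.chat.completions.create(
        model="gpt-4o",  # assumption: substitute whatever model you use
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {label} ---")
    print(response.choices[0].message.content)

If the experiment above generalizes, all three prompts should land the model in a similar region of the semantic space, even the scrambled one.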

The Implications

OK, this is weird, right? Do we even need to speak in sentences? We might for ourselves, but the AI doesn't need sentences at all. The AI doesn't technically need to output sentences either, though of course that would make it pretty useless for us. However, what if an AI were outputting for another AI?

This challenges a basic assumption in prompt engineering. We craft increasingly sophisticated sentences, thinking that's what AI needs. But what if we focused instead on creating effective semantic anchors and concept relationships, tuning the Semantic Vector Space directly?

What do you think? Have you ever noticed AI responding well to prompts that weren't perfectly formed sentences? Can you tell why AI seems to nail something you're looking for, and when it doesn't? I'd love to hear about your experiences.

Tags:

semantic-vectors, ai-cognition, attention-mechanisms, cognitive-frameworks, process-mutability, strange-loops, experimental-results, ai-research, prompt-engineering, learn-prompt-engineering, improve-prompt-engineering, advanced-prompt-engineering