
Embeddings: turning meaning into a place

Suppose I asked you to organize every word in a language by putting similar words close together. You'd start clustering: dog, puppy, hound in one corner; Tuesday, Wednesday, weekend in another. What you'd be building is a map of meaning — and that map is exactly what an embedding is.

An embedding turns a thing — a word, a sentence, an image, a song — into a list of numbers: the coordinates of a point in space. Not a flat 2D space like a paper map, but one with hundreds or thousands of dimensions. The rule that space obeys is simple and powerful: things that mean similar things land near each other.

Why coordinates?

Because once meaning becomes location, fuzzy questions turn into exact math.

"What's similar to this?" is a question computers are hopeless at when you hand them raw text. With embeddings it becomes "what are the nearest points?", which is just measuring distance. A search for heart attack can surface a page that only ever says myocardial infarction, because a good embedding model places the two phrases in almost the same spot.

Even relationships become geometry. In a good word embedding, the direction you travel to get from man to woman is about the same direction that takes you from king to queen. So you can do arithmetic on concepts:

king − man + woman ≈ queen

Meaning has a shape, and you can walk around inside it.
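
To make the arithmetic concrete, here is a minimal sketch in Python. The three-dimensional vectors are hand-picked for illustration and do not come from any real model; the names VECTORS and closest_word are likewise hypothetical. It computes king - man + woman coordinate by coordinate, then looks for the remaining word whose vector points most nearly the same way.

```python
import math

# Hand-picked 3-dimensional toy vectors, invented for illustration only.
# Real embeddings have hundreds or thousands of learned dimensions.
VECTORS = {
    "king":   [0.9, 0.8, 0.1],
    "queen":  [0.9, 0.1, 0.8],
    "man":    [0.1, 0.9, 0.1],
    "woman":  [0.1, 0.1, 0.9],
    "banana": [0.0, 0.2, 0.2],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def closest_word(target, exclude):
    """Return the word (outside `exclude`) whose vector best matches `target`."""
    candidates = (word for word in VECTORS if word not in exclude)
    return max(candidates, key=lambda word: cosine_similarity(target, VECTORS[word]))

# king - man + woman, computed one coordinate at a time.
target = [
    k - m + w
    for k, m, w in zip(VECTORS["king"], VECTORS["man"], VECTORS["woman"])
]

# The three input words are excluded from the search, a common convention
# for analogy queries (otherwise "king" itself is often the nearest point).
result = closest_word(target, exclude={"king", "man", "woman"})
print(result)
```

With these toy vectors the script prints queen. On real embeddings the match is approximate, which is why the equation above uses ≈ rather than =.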

Where do the numbers come from?

Nobody places the points by hand — there are too many, and no human knows the "right" 768 coordinates for the word banana. A model learns them from one simple bet: you shall know a word by the company it keeps. Words that appear in similar contexts ("the ___ barked," "the ___ chased the ball") get nudged toward each other, millions of times, until the geometry settles. The same trick works for sentences and images: train the model to place things that belong together close, and meaning organizes itself.
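
The "company it keeps" idea can be sketched without any training at all. The Python below is a deliberately simplified stand-in, not how production models work: instead of nudging learned coordinates, it just counts which words appear within two positions of each word and uses those counts as a vector. The corpus, the window size, and the function names are assumptions made up for this example.

```python
import math
from collections import Counter, defaultdict

# A tiny invented corpus; real models see billions of words.
CORPUS = [
    "the dog barked at the mailman",
    "the puppy barked at the cat",
    "the dog chased the ball",
    "the puppy chased the ball",
    "the meeting is on tuesday",
    "the deadline is on wednesday",
]

def context_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b)

vectors = context_vectors(CORPUS)
dog_puppy = cosine_similarity(vectors["dog"], vectors["puppy"])
dog_tuesday = cosine_similarity(vectors["dog"], vectors["tuesday"])

print(f"dog vs puppy:   {dog_puppy:.2f}")
print(f"dog vs tuesday: {dog_tuesday:.2f}")
```

In this corpus dog and puppy share every context, so their similarity is at the maximum, while dog and tuesday share none. Learned embeddings replace the raw counts with dense coordinates adjusted over many training steps, but the signal they rely on is the same.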

Why it matters

This single idea — represent meaning as a point — quietly powers a huge amount of modern software:

Search that understands intent, not just keywords.
Recommendations ("more like this") that are just nearest-neighbor lookups.
RAG (retrieval-augmented generation), where an AI fetches the passages nearest your question before answering it.
Clustering and deduplication, where "similar" finally has a number attached.
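
Several of these uses reduce to the same two operations, which the sketch below illustrates in Python: drop items that sit almost on top of each other, then return the points nearest a query. Everything here is assumed for illustration: the three-dimensional vectors stand in for the output of a real embedding model, the 0.99 threshold is arbitrary, and deduplicate and top_k are hypothetical helper names.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# (title, embedding) pairs. The numbers are invented; a real system would
# get them from an embedding model.
PASSAGES = [
    ("Symptoms of myocardial infarction", [0.88, 0.12, 0.07]),
    ("Myocardial infarction: symptoms",   [0.87, 0.13, 0.07]),
    ("Puppy training basics",             [0.10, 0.90, 0.15]),
    ("Weekend trip ideas",                [0.05, 0.10, 0.95]),
]

def deduplicate(items, threshold=0.99):
    """Greedy pass: keep an item only if it is below `threshold` similarity
    to everything already kept."""
    kept = []
    for title, vector in items:
        if all(cosine_similarity(vector, kv) < threshold for _, kv in kept):
            kept.append((title, vector))
    return kept

def top_k(query_vector, items, k=1):
    """Return the titles of the k items nearest the query, best first."""
    ranked = sorted(
        items,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]

unique = deduplicate(PASSAGES)

# Hypothetical embedding of the query "heart attack".
query = [0.90, 0.10, 0.05]
best = top_k(query, unique, k=1)

print([title for title, _ in unique])
print(best)
```

The linear scan is fine for a handful of items. Systems with millions of vectors typically swap it for an approximate nearest-neighbor index, trading a little accuracy for speed.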

The catch

A map is a lie that's useful. Flattening rich meaning down to a few hundred numbers throws information away, and the axes usually mean nothing to a human — you can't point at coordinate 412 and call it "the spiciness dimension." Worse, the map only knows what it was shown: train on biased text and the geometry inherits the bias, placing points where the data, not the truth, put them.

But as a mental model, hold onto this: an embedding is a map, and similarity is distance. Almost everything else is detail.
