Embeddings: turning meaning into a place
Suppose I asked you to organize every word in a language by putting similar words close together. You'd start clustering: dog, puppy, hound in one corner; Tuesday, Wednesday, weekend in another. What you'd be building is a map of meaning — and that map is exactly what an embedding is.
An embedding turns a thing — a word, a sentence, an image, a song — into a list of numbers: the coordinates of a point in space. Not a flat 2D space like a paper map, but one with hundreds or thousands of dimensions. The rule that space obeys is simple and powerful: things that mean similar things land near each other.
Why coordinates?
Because once meaning becomes location, fuzzy questions turn into exact math.
"What's similar to this?" — a question computers are hopeless at when you hand them raw text — becomes "what are the nearest points?", which is just measuring distance. A search for heart attack will surface a page that only ever says myocardial infarction, because the two phrases sit in almost the same spot.
Even relationships become geometry. In a good word embedding, the direction you travel to get from man to woman is about the same direction that takes you from king to queen. So you can do arithmetic on concepts:
king − man + woman ≈ queen
Meaning has a shape, and you can walk around inside it.
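The arithmetic above can be sketched in a few lines. The two-dimensional vectors below are made up so that the result is easy to follow, and the axes are labeled only to keep the toy readable; in real embeddings the axes rarely have such clean meanings. VECTORS and analogy are hypothetical names, and excluding the three input words from the candidates is a common convention, assumed here.

```python
import math

# Hypothetical 2-D vectors, invented for illustration.
# Axis 0 loosely tracks "royalty", axis 1 loosely tracks "gender".
VECTORS = {
    "king":   [0.95, 0.10],
    "queen":  [0.93, 0.88],
    "man":    [0.10, 0.12],
    "woman":  [0.12, 0.90],
    "prince": [0.85, 0.15],
    "girl":   [0.15, 0.85],
}


def analogy(a, b, c):
    """Return the word nearest to a - b + c, excluding the three inputs."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    candidates = (w for w in VECTORS if w not in (a, b, c))
    # math.dist is the straight-line (Euclidean) distance between two points.
    return min(candidates, key=lambda w: math.dist(VECTORS[w], target))


print(analogy("king", "man", "woman"))  # prints: queen
```

Note the result is the nearest point to the computed location, not an exact hit, which is why the formula uses ≈ rather than =.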
Where do the numbers come from?
Nobody places the points by hand: there are too many, and no human knows the "right" 768 coordinates for the word banana. A model learns them from one simple bet, in the words of the linguist J.R. Firth: you shall know a word by the company it keeps. Words that appear in similar contexts ("the ___ barked," "the ___ chased the ball") get nudged toward each other, millions of times, until the geometry settles. The same trick works for sentences and images: train the model to place things that belong together close to each other, and meaning organizes itself.
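A rough way to see the "company it keeps" idea in action, without training anything, is to count which words appear near each other. This is a simpler stand-in for the learned nudging described above, not how modern embedding models are actually trained. The corpus is invented, and context_vectors, cosine and the window size of 2 are assumptions made for the sketch.

```python
import math
from collections import Counter, defaultdict

# Tiny made-up corpus; a real model sees vastly more text.
CORPUS = [
    "the dog barked at the mailman",
    "the puppy barked at the cat",
    "the dog chased the ball",
    "the puppy chased the ball",
    "the meeting is on tuesday",
    "the deadline is on wednesday",
]


def context_vectors(sentences, window=2):
    """For each word, count the words seen within `window` positions of it."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][words[j]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = context_vectors(CORPUS)
print("dog vs puppy:  ", cosine(vectors["dog"], vectors["puppy"]))
print("dog vs tuesday:", cosine(vectors["dog"], vectors["tuesday"]))
```

In this toy corpus dog and puppy keep the same company, so their count vectors point the same way, while dog and tuesday share no neighbors at all. Learned embeddings compress this kind of signal into a few hundred dense numbers instead of one count per vocabulary word.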
Why it matters
This single idea — represent meaning as a point — quietly powers a huge amount of modern software:
- Search that understands intent, not just keywords.
- Recommendations ("more like this") that are just nearest-neighbor lookups.
- RAG, where an AI fetches the passages nearest your question before answering it.
- Clustering and deduplication, where "similar" finally has a number attached.
The catch
A map is a lie that's useful. Flattening rich meaning down to a few hundred numbers throws information away, and the axes usually mean nothing to a human — you can't point at coordinate 412 and call it "the spiciness dimension." Worse, the map only knows what it was shown: train on biased text and the geometry inherits the bias, placing points where the data, not the truth, put them.
But as a mental model, hold onto this: an embedding is a map, and similarity is distance. Almost everything else is detail.