Why Do Computers Think 'Cat' and 'Dog' Are Similar?
"Computers can't read. So how does ChatGPT know that 'happy' and 'joyful' mean almost the same thing, and that 'happy' is nothing like 'volcano'?"
When you read the word "happy", something happens in your brain: you connect it to memories, feelings, and other words like "joyful" or "excited". You know, without thinking, that it has nothing to do with volcanoes.
Computers can't do that. They don't have brains, memories, or feelings. So how do they figure out that "happy" and "joyful" belong together?
The answer is: they count.
An AI reads billions of sentences (books, websites, conversations) and notices patterns. Words that appear in similar sentences get treated as similar. "Happy" shows up near "smile", "laugh", "good news". So does "joyful". "Volcano" shows up near "eruption", "lava", "danger". Different world entirely.
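To make the counting idea concrete, here is a minimal sketch in Python. The three sentences are invented for illustration, and the helper names `context_counts` and `shared_context` are hypothetical. A real system works on vastly more text and uses more refined statistics, so treat this as a toy version of the idea rather than how any particular AI is built.

```python
from collections import Counter

# A tiny made-up corpus (an assumption for illustration);
# a real system reads billions of sentences.
sentences = [
    "the happy child began to smile and laugh",
    "the joyful crowd began to smile and laugh".replace("the", "a", 1),
    "a volcano eruption sent lava down a slope",
]

def context_counts(word, sentences):
    """Count the other words that share a sentence with `word`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        if word in tokens:
            counts.update(t for t in tokens if t != word)
    return counts

def shared_context(a, b):
    """Number of distinct context words two words have in common."""
    return len(set(a) & set(b))

happy = context_counts("happy", sentences)
joyful = context_counts("joyful", sentences)
volcano = context_counts("volcano", sentences)

print(shared_context(happy, joyful))   # 5: began, to, smile, and, laugh
print(shared_context(happy, volcano))  # 0: no context words in common
```

Even in this toy corpus, "happy" and "joyful" share most of their neighbors, while "happy" and "volcano" share none.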
After enough counting, the AI can represent every word as a vector: a list of numbers that acts like a coordinate in a huge invisible space. "Happy" might be at position [0.3, -0.7, 0.1, ...] across 384 dimensions. "Joyful" ends up nearby. "Volcano" ends up far away.
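One common way to measure "nearby" and "far away" between vectors is cosine similarity, which compares the directions two vectors point in. The sketch below uses made-up 3-number vectors (an assumption; they are not real embeddings, which would have hundreds of numbers), and cosine similarity is only one of several measures an AI system might use.

```python
import math

# Made-up 3-number vectors for illustration only.
vectors = {
    "happy":   [0.9, 0.8, 0.1],
    "joyful":  [0.8, 0.9, 0.2],
    "volcano": [0.1, 0.0, 0.9],
}

def cosine_similarity(a, b):
    """Return a score near 1.0 for similar directions, near 0.0 for unrelated ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

close = cosine_similarity(vectors["happy"], vectors["joyful"])
far = cosine_similarity(vectors["happy"], vectors["volcano"])

print(f"happy vs joyful:  {close:.2f}")
print(f"happy vs volcano: {far:.2f}")
```

With these invented numbers, "happy" and "joyful" score close to 1, while "happy" and "volcano" score much lower.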
The 3D chart above shows you this space, but compressed. We took all 384 dimensions and flattened them into 3 using a technique called PCA (principal component analysis), the same way you might flatten a 3D apple into a 2D shadow. Some detail is lost, but the clusters survive.
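For the curious, the flattening step can be sketched in a few lines of Python with NumPy. This is a minimal PCA under stated assumptions: the data is random stand-in numbers rather than real word vectors, `pca_reduce` is a hypothetical helper name, and the code behind the chart may well be implemented differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in data: 10 random "word vectors" with 384 numbers each.
# These are not real embeddings; they only have the right shape.
embeddings = rng.normal(size=(10, 384))

def pca_reduce(X, n_components=3):
    """Project the rows of X onto their top principal components."""
    # PCA works on data centered around the mean of each column.
    centered = X - X.mean(axis=0)
    # The rows of vt are the principal directions, ordered from the
    # direction with the most variation to the one with the least.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

points_3d = pca_reduce(embeddings)
print(points_3d.shape)  # (10, 3)
```

Each word now has just 3 coordinates, chosen to keep as much of the original spread as possible, which is why clusters tend to stay visible in the chart.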
What's strange, and worth sitting with, is that nobody told the AI what "happy" means. It figured out the relationships purely from the patterns of which words travel together. That's not understanding the way you understand. But it's enough to power translation, search, and yes, ChatGPT.
Look at the 3D chart. Find two words from different categories that ended up unusually close to each other. Why do you think the AI placed them near each other? What sentences might have caused that?
Reflect
The AI never learned the 'meaning' of any word; it just noticed which words appear near which other words. Does that count as understanding? What's the difference?