Fascinating blog post by Max Woolf exploring the power of embeddings, “one of the most useful but unfortunately underdiscussed concepts in the artificial intelligence space”. I couldn’t agree more, as all of our AI-related projects involve exploring embedding models.
You should read the whole thing, and take your time doing so, as it’s extremely well documented. You will find yourself a dozen linked resources deep in no time.
One thing I didn’t know about embedding vector dimensions:
The 128-multiple dimensionality of recent embedding models is not a coincidence: modern NVIDIA GPUs used to train LLMs get a training speed boost for model parameters with a dimensionality that’s a multiple of 128.
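To make that concrete, here is a quick sanity check. The model names and dimensions below come from public model cards, not from the post itself, but they all turn out to be clean multiples of 128:

```python
# Embedding sizes of some popular models (from their public model cards).
EMBEDDING_DIMS = {
    "all-MiniLM-L6-v2": 384,         # sentence-transformers
    "bert-base-uncased": 768,        # BERT base hidden size
    "text-embedding-ada-002": 1536,  # OpenAI
    "text-embedding-3-large": 3072,  # OpenAI
}

for model, dim in EMBEDDING_DIMS.items():
    # Every one of these dimensionalities is divisible by 128.
    assert dim % 128 == 0
    print(f"{model}: {dim} = 128 x {dim // 128}")
```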
Lastly, my favorite quote, probably the best explanation of why embeddings are so fun to work with:
In all, this was a successful exploration of Pokémon data: even though it’s not perfect, the failures are also interesting. Embeddings encourage engineers to go full YOLO because it’s actually rewarding to do so!