Word2Vec – Given a large amount of training text, this project figures out which words show up together in sentences most often, and then constructs a small vector representation for each word that captures those relationships. It turns out that simple arithmetic works on these vectors in an intuitive way, so that vector(‘Paris’) – vector(‘France’) + vector(‘Italy’) results in a vector that is very close to vector(‘Rome’). It’s not just elegantly neat; this looks like it will be very useful for clustering, and any other application where you need a sensible representation of a word as a number.
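As a rough illustration of that arithmetic, here’s a minimal sketch using the gensim library’s Word2Vec implementation rather than the project’s original C tool; the toy corpus, the parameter values, and gensim itself are my own assumptions, not part of the project’s examples.

```python
# A minimal sketch of the analogy arithmetic, assuming a recent gensim release
# (the original project ships a C tool; gensim is just a convenient stand-in).
from gensim.models import Word2Vec

# Each training sentence is a list of tokens; a real run needs a much larger corpus.
sentences = [
    ["paris", "is", "the", "capital", "of", "france"],
    ["rome", "is", "the", "capital", "of", "italy"],
    # ... many more tokenized sentences ...
]

model = Word2Vec(sentences, vector_size=100, window=5, min_count=1, epochs=50)

# vector('paris') - vector('france') + vector('italy') should land near vector('rome')
print(model.wv.most_similar(positive=["paris", "italy"], negative=["france"], topn=3))
```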
Charles Bukowski, William Burroughs, and the Computer – How two different writers handled onrushing technology, including Bukowski’s poem on the “16 bit Intel 8088 Chip”. This led me down a Wikipedia rabbit hole, since I’d always assumed the 8088 was fully 8-bit, but the truth proved a lot more interesting.
Sane data updates are harder than you think – Tales from the trenches in data management. Adrian Holovaty’s series on crawling and updating data is the first time I’ve seen a lot of techniques that are common amongst practitioners actually laid out clearly.
Randomness != Uniqueness – Creating good identifiers for data is hard, especially once you’re in a distributed environment. There’s also a tradeoff in making IDs decomposable: it makes internal debugging and management much easier, but it also reveals information to external folks, often more than you might expect.
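To make that tradeoff concrete, here’s a hypothetical sketch contrasting the two styles of identifier; the format and field choices are my own illustration, not anything taken from the linked post.

```python
# Hypothetical illustration of the tradeoff described above: an opaque random ID
# versus a decomposable one that embeds a timestamp, machine number, and counter.
import time
import uuid

def random_id():
    # Reveals nothing to outsiders, but also tells you nothing when debugging.
    return uuid.uuid4().hex

def decomposable_id(machine_number, sequence):
    # Readable parts (creation time, originating machine, local counter) make
    # internal debugging easier, but let outsiders infer volume and timing.
    return "{:x}-{:02d}-{:06d}".format(int(time.time()), machine_number, sequence)

print(random_id())             # e.g. '9f2c1e0a4b...'
print(decomposable_id(3, 42))  # e.g. '66b1a2f0-03-000042'
```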
Lessons from a year’s worth of hiring data – A useful antidote to the folklore prevalent in recruiting, this small study throws up a lot of intriguing possibilities. The correlation between spelling and grammar mistakes and candidates who didn’t make it through the interview process was especially interesting, considering how retailers like Zappos use Mechanical Turk to fix stylistic errors in user reviews to boost sales.