It's hard enough for us to keep track of who's talking at a loud or crowded party, so imagine how difficult it is for automated systems to follow. Speech recognition at a reasonable quality has really only been mastered in the last decade or two. Add in conflicting sounds as people talk over each other, and an already tricky problem becomes much harder.

Fortunately (or unfortunately) for us, researchers at Google have been working on isolating individual sources of audio, like speech, in videos, and the results they showed off yesterday are kind of incredible and terrifying at the same time.
