In 2016, a group of students from the University of California, Berkeley, and Georgetown University demonstrated that they could issue commands to smart devices like Google Home and Amazon Echo without people noticing, by hiding the commands in white noise. Now, two of those Berkeley students have published a paper in which they report that they can hide such commands in recordings of music or even human speech.
One of the paper's authors, PhD student Nicholas Carlini, has also published a website with several clips of music and speech. The human ear can't decipher the words hidden in them, but a specially trained transcription model turns the clips into those words. Similar techniques could be used to make unsecured devices do all kinds of unwanted things when they are exposed to malicious audio clips.
The technique works by encoding words in a way that computers can understand but that is meaningless to people. “My assumption is that the malicious people already employ people to do what I do,” Carlini told The New York Times. Researchers estimate that more than half of American homes will have at least one smart speaker by 2022, so it's not hard to imagine this sort of attack becoming a problem if the vulnerability isn't addressed.