Science

Researchers have demonstrated a method to detect the vibrations of a mobile phone’s earpiece and decipher what the person on the other side of the call was saying with up to 83 percent accuracy. The team at Pennsylvania State University used an off-the-shelf automotive radar sensor and a novel processing approach to reveal this significant security concern.

“As technology becomes more reliable and robust over time, the misuse of such sensing technologies by adversaries becomes probable,” said Suryoday Basak, a doctoral candidate at Penn State.

“Our demonstration of this kind of exploitation contributes to the pool of scientific literature that broadly says, ‘Hey! Automotive radars can be used to eavesdrop audio. We need to do something about this,’” Basak said.

The radar operates in the millimetre-wave (mmWave) spectrum, specifically in the bands of 60 to 64GHz and 77 to 81GHz, which inspired the researchers to name their approach “mmSpy.” This is a subset of the radio spectrum used for 5G, the fifth-generation standard for communication systems across the globe.

In the mmSpy demonstration, described in the 2022 IEEE Symposium on Security and Privacy (SP), the researchers simulated people speaking through the earpiece of a smartphone.

The phone’s earpiece vibrates from the speech, and that vibration permeates across the body of the phone.

“We use the radar to sense this vibration and reconstruct what was said by the person on the other side of the line,” said Basak.
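To give a feel for why millimetre-wave radar suits this kind of sensing, the sketch below uses the standard textbook relation between a reflecting surface's displacement and the phase of the returned signal, Δφ = 4πΔd/λ. It is only an illustration, not the researchers' processing chain: the function names and the one-micrometre displacement are assumptions chosen for the example, and the two frequencies are simply the lower edges of the bands mentioned above.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def wavelength_m(freq_hz: float) -> float:
    """Free-space wavelength for a given carrier frequency."""
    return C / freq_hz

def phase_shift_rad(displacement_m: float, freq_hz: float) -> float:
    """Phase change of the echo when the target moves by displacement_m.

    The signal travels to the target and back, so the path length
    changes by twice the displacement: delta_phi = 4 * pi * d / lambda.
    """
    return 4 * math.pi * displacement_m / wavelength_m(freq_hz)

if __name__ == "__main__":
    displacement = 1e-6  # hypothetical 1 micrometre vibration (assumption)
    for f_ghz in (60, 77):
        f_hz = f_ghz * 1e9
        lam_mm = wavelength_m(f_hz) * 1e3
        dphi = phase_shift_rad(displacement, f_hz)
        print(f"{f_ghz} GHz: wavelength {lam_mm:.2f} mm, phase shift {dphi:.5f} rad")
```

Because the wavelength is only a few millimetres, even a very small movement produces a phase change that grows with the carrier frequency, which is one reason these bands are attractive for vibration sensing.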

The researchers, including Mahanth Gowda, an assistant professor at Penn State, noted that their approach works even when the audio is completely inaudible to both humans and microphones nearby.

“This isn’t the first time similar vulnerabilities or attack modalities have been found, but this particular aspect — detecting and reconstructing speech from the other side of a smartphone line — was not yet explored,” Basak said.

The radar sensor data is pre-processed with MATLAB and Python modules, which remove hardware-related noise and artefacts from the data.

The researchers then feed that to machine learning modules trained to classify speech and reconstruct audio.

When the radar senses vibrations from a foot away, the speech is reconstructed with 83 percent accuracy. That drops as the radar moves farther from the phone, down to 43 percent at six feet, they said.

Once the speech is reconstructed, the researchers can then filter, enhance or classify keywords as needed, Basak said.

The team is continuing to refine their approach to better understand not only how to protect against this security vulnerability, but also how to exploit it for good.

“The methodology that we developed can also be used for sensing vibrations in industrial machinery, smart home systems and building-monitoring systems,” Basak said.

According to the researchers, there are similar home maintenance or even health monitoring systems that could benefit from such sensitive tracking.

“Imagine a radar that could track a user and call for help if some health parameter changes in a dangerous way,” Basak said.

“With the right set of target actions, radars in smart homes and industry can enable a faster turnaround when problems and issues are detected,” he added.

