
We introduce a novel electromagnetic (EM) side-channel attack that enables acoustic eavesdropping on electronic devices. The attack specifically targets modern digital microelectromechanical systems (MEMS) microphones, which transmit captured audio via pulse-density modulation (PDM), a scheme that encodes the analog sound signal as the density of output pulses in the digital domain. We discover that each harmonic of these digital pulses retains acoustic information, allowing the original audio to be retrieved through simple FM demodulation using standard radio receivers. An attacker can exploit this phenomenon to remotely capture what the victim microphone hears, without installing malicious software or tampering with the device. We verify the presence of the vulnerability through real-world evaluation on several PDM microphones and electronic devices, including laptops and smart speakers. For example, we demonstrate that the attack achieves up to 94.2% accuracy in recognizing spoken digits at up to 2 meters from a victim laptop located behind a 25 cm concrete wall. We also evaluate the attacker's ability to eavesdrop on speech using popular speech-to-text APIs (e.g., OpenAI) not trained on EM traces, achieving a maximum of 14% transcription error rate in recovering the Harvard Sentences dataset. We further demonstrate that similar accuracy can be achieved with a cheap and stealthy antenna made out of copper tape. Finally, we discuss the limited effectiveness of current defenses such as resampling, and we propose a new hardware defense based on clock randomization. The work will be presented at USENIX Security 2025.

The repository contains the instructions and data needed to set up the environment and replicate the results. Please see the README file (README.md) for details. The artifacts consist of:
1. Speech transcription results and training log files for the digit and speaker classification models, along with the corresponding recorded audio files for the tested devices.
2. The scripts for training and fine-tuning the pipeline we used for our digit and speaker classification models, and for testing the HuBERT, Microsoft, and OpenAI transcription models.

Note that due to responsible disclosure, two of the evaluated devices ("Redacted laptop" and "Redacted" in the paper) are reported as Device_1 and Device_2 in the repository, respectively.
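
To make the PDM encoding mentioned in the abstract concrete, here is a minimal sketch of how a sound sample can be turned into a pulse density and recovered again. It is not taken from the repository: it assumes a first-order delta-sigma modulator (real MEMS microphones typically use higher-order modulators), a 3.072 MHz PDM clock chosen only for illustration, and a plain moving average as the decoder. The function names `make_tone`, `pdm_encode`, and `pdm_decode` are hypothetical.

```python
import math


def make_tone(freq_hz, clock_hz, n_samples, amplitude):
    """Generate a sine tone sampled at the (assumed) PDM clock rate."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n / clock_hz)
            for n in range(n_samples)]


def pdm_encode(samples):
    """First-order delta-sigma modulator (an illustrative assumption).

    Input samples lie in [-1, 1]; the output is a list of 0/1 bits whose
    local density of ones tracks (sample + 1) / 2.
    """
    bits = []
    integrator = 0.0
    feedback = 0.0
    for x in samples:
        # Accumulate the error between the input and the last emitted pulse.
        integrator += x - feedback
        bit = 1 if integrator >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        bits.append(bit)
    return bits


def pdm_decode(bits, window):
    """Recover the signal with a moving average over `window` bits.

    Output element i is the mean of bits[i:i + window], mapped to [-1, 1].
    """
    out = []
    ones = sum(bits[:window])
    for i in range(len(bits) - window + 1):
        out.append(2.0 * ones / window - 1.0)
        if i + window < len(bits):
            # Slide the window by one bit.
            ones += bits[i + window] - bits[i]
    return out


if __name__ == "__main__":
    clock_hz = 3_072_000.0  # assumed PDM clock, for illustration only
    tone = make_tone(1000.0, clock_hz, 6144, 0.5)
    bits = pdm_encode(tone)
    recovered = pdm_decode(bits, 128)
    print("first bits:", bits[:16])
    print("recovered samples:", len(recovered))
```

In this sketch a constant input of 0.5 produces a ones-density close to 0.75, which is the sense in which the pulse density carries the sound amplitude. The EM attack itself recovers audio from harmonics of the pulse train by FM demodulation, which this example does not attempt to model.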
Computer security, Sensor
