Digital audio tampering detection based on spatio-temporal representation learning of electrical network frequency


Research field: Cryptography



Venue: Multimedia Tools and Applications


Abstract

Most Digital Audio Tampering Detection (DATD) methods based on Electrical Network Frequency (ENF) concentrate on the static spatial information of the ENF and neglect the temporal variation in the ENF time series. This limits the representation capability of ENF features and reduces tampering detection accuracy. To address this gap, we propose a digital audio tampering detection method based on ENF spatio-temporal feature representation learning. We construct a parallel spatio-temporal network model that combines a Convolutional Neural Network (CNN) with a Bidirectional Long Short-Term Memory (BiLSTM) network to extract deep ENF spatial and temporal features. We first apply high-precision Discrete Fourier Transform (DFT) analysis to the digital audio to extract ENF phase sequences. For the spatial features, the phase sequences are adaptively divided into frames by frame shifting, which yields feature matrices of uniform size. For the temporal features, the phase sequences are segmented into frames according to ENF changes over time. The CNN then extracts deep spatial features and the BiLSTM extracts deep temporal features. To further strengthen the spatio-temporal representation, an attention mechanism dynamically assigns weights to the deep spatial and temporal features. Finally, a deep neural network decides whether the audio has been tampered with.
Experimental results show that our approach outperforms six state-of-the-art methods on three public databases for digital audio tampering detection. By modeling both the spatial and temporal aspects of ENF, the method improves detection accuracy and provides a foundation for further work on DATD.
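
The phase extraction and adaptive framing steps named in the abstract can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the authors' implementation. It uses a plain single-bin DFT with a Hann window instead of the paper's high-precision DFT, assumes a 50 Hz nominal ENF, and picks frames of 10 nominal cycles with a one-cycle hop. The function names `enf_phase_sequence` and `to_feature_matrix`, and the matrix size used below, are hypothetical.

```python
import cmath
import math


def enf_phase_sequence(signal, fs, f0=50.0, frame_len=None, hop=None):
    """Estimate one ENF phase value per frame with a single-bin DFT at f0.

    Assumptions: frames of 10 nominal ENF cycles, hop of one cycle, and a
    plain DFT estimate rather than a high-precision variant.
    """
    frame_len = frame_len or int(round(10 * fs / f0))
    hop = hop or int(round(fs / f0))
    phases = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        acc = 0j
        for n in range(frame_len):
            # Periodic Hann window to limit spectral leakage.
            w = 0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
            # Absolute time keeps the phase constant for an unedited tone.
            t = (start + n) / fs
            acc += w * signal[start + n] * cmath.exp(-2j * math.pi * f0 * t)
        phases.append(cmath.phase(acc))
    return phases


def to_feature_matrix(phases, rows, cols):
    """Cut a phase sequence of any length into `rows` frames of `cols` values.

    The frame shift adapts to the sequence length, so the output size is
    fixed regardless of how long the recording is.
    """
    if rows < 2 or len(phases) < cols:
        raise ValueError("need rows >= 2 and at least `cols` phase values")
    span = len(phases) - cols
    matrix = []
    for i in range(rows):
        start = (i * span) // (rows - 1)  # integer shift, last frame ends at the end
        matrix.append(phases[start:start + cols])
    return matrix


if __name__ == "__main__":
    fs, f0 = 1000.0, 50.0
    # Synthetic 50 Hz tone whose phase jumps halfway, imitating a splice.
    audio = [math.cos(2 * math.pi * f0 * n / fs + (0.3 if n < 1000 else 1.5))
             for n in range(2000)]
    phases = enf_phase_sequence(audio, fs, f0)
    matrix = to_feature_matrix(phases, rows=8, cols=16)
    print(len(phases), len(matrix), len(matrix[0]))
```

In this sketch an unedited tone gives a flat phase sequence, while the spliced tone shows a phase discontinuity near the edit point. The fixed-size matrix would feed a CNN branch, and a time-ordered framing of the same phase sequence would feed a BiLSTM branch; neither network is shown here.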


Author Profile
Chunyan Zeng

Hubei Key Laboratory for High-efficiency Utilization of Solar Energy and Operation Control of Energy Storage System, Hubei University of Technology, Nanli Road, Wuhan 430068, China

Andorra
Author Profile
Shuai Kong

Hubei Key Laboratory for High-efficiency Utilization of Solar Energy and Operation Control of Energy Storage System, Hubei University of Technology, Nanli Road, Wuhan 430068, China

Andorra
Author Profile
Zhifeng Wang

Department of Digital Media Technology, Central China Normal University, Luoyu Road, Wuhan 430079, China

China

📄 Paper information

Publication year: 2024
Citations: 4
Country of publication: Andorra, China
Site: Springer

Related papers (59)