Research field: Artificial Intelligence
Venue: IH&MMSec '23: Proceedings of the 2023 ACM Workshop on Information Hiding and Multimedia Security
In this paper, we propose an end-to-end perceptual robust hashing scheme for video copy detection based on unsupervised learning. First, the spatio-temporal information in a video is fused and condensed into high-dimensional features by a multi-scale feature fusion model built on a 3D-CNN, which integrates Inception blocks and a 3D self-attention mechanism. Then, we calculate the correlation distances between the extracted features to differentiate perceptual content. Based on these similarity relationships, we dynamically generate pseudo-labels and use them to guide model training for video hash generation. In addition, we design dual constraints so that the hash codes achieve satisfactory robustness and discrimination. Extensive experiments demonstrate that the proposed scheme achieves better copy detection performance than existing schemes and performs well even under manipulations not seen during training.
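
The pseudo-labelling step described in the abstract can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: "correlation distance" is taken to mean one minus the Pearson correlation, and a pair of videos is labelled similar when that distance falls below a fixed `threshold`. The function names, the threshold value, and the toy feature vectors are hypothetical and are not taken from the paper.

```python
import math


def correlation_distance(u, v):
    """Return 1 - Pearson correlation of two equal-length feature vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    du = [x - mean_u for x in u]
    dv = [y - mean_v for y in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du)) * math.sqrt(sum(b * b for b in dv))
    if den == 0.0:
        # Assumption: a constant vector is treated as uncorrelated.
        return 1.0
    return 1.0 - num / den


def pseudo_labels(features, threshold=0.5):
    """Build an n x n matrix with 1 for pairs judged similar, 0 otherwise.

    The fixed threshold is an illustrative assumption, not the paper's rule.
    """
    n = len(features)
    return [
        [
            1 if correlation_distance(features[i], features[j]) <= threshold else 0
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy feature vectors standing in for the 3D-CNN outputs (hypothetical values).
features = [
    [1.0, 2.0, 3.0, 4.0],  # original video
    [1.1, 2.1, 2.9, 4.2],  # lightly distorted copy of the original
    [4.0, 1.0, 3.5, 0.5],  # unrelated video
]
labels = pseudo_labels(features)
for row in labels:
    print(row)
```

In this toy example the first two vectors are labelled as a similar pair and the third as dissimilar to both. In a training loop, a matrix of this kind would be recomputed as the features change, which is one way to read "dynamically generate" in the abstract.
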
| Publication year | 2023 |
|---|---|
| Citations | 1 |
| Publication country | Taiwan, Andorra, China |
| Site | ACM |