An enhanced privacy-preserving record linkage approach for multiple databases


Research field: Databases



Venue: Cluster Computing


Abstract

For research purposes, organizations often need to share and link records that belong to the same individual while protecting privacy, a task referred to as privacy-preserving record linkage (PPRL). Various approaches have been developed to tackle this problem; however, it remains challenging because of the massive amount of data, the multiple data sources, and ‘dirty’ data. Therefore, this paper proposes an enhanced approximate multi-party PPRL (MP-PPRL) approach to improve privacy, scalability, and linkage quality. For privacy, the Bloom filter (BF) is, so far, a better and more efficient masking technique than the alternatives, so the records are encoded into BFs. However, BFs may be compromised through frequency-based attacks. To enhance privacy, a distributed protocol that introduces multiple linkage units (Multi-LUs) is proposed to resist frequency-based attacks. For scalability, we develop a blocking technique, called BF-SNN, that is based on the sorted nearest neighborhood (SNN) approach and clusters similar BFs across multiple databases, which dramatically reduces complexity. For linkage quality, a personalized threshold that varies with the level of ‘dirty’ data is introduced; it provides a more accurate error tolerance for ‘dirty’ data and consequently improves linkage quality. An analysis and an empirical study on large real-world datasets show the benefit of the proposed approach.
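The Bloom filter encoding mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the q-gram length, the filter length, the number of hash functions, the double-hashing scheme, and the helper names `qgrams`, `encode_bf`, and `dice` are all assumptions chosen for illustration. It shows only how a string might be masked as a BF and how two BFs might be compared with the Dice coefficient, which tolerates small typing errors.

```python
import hashlib


def qgrams(value: str, q: int = 2) -> set[str]:
    """Split a padded, lower-cased string into its set of q-grams."""
    padded = "_" * (q - 1) + value.lower() + "_" * (q - 1)
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def encode_bf(value: str, length: int = 1000, num_hashes: int = 20) -> set[int]:
    """Encode a string as a Bloom filter, returned as the set of 1-bit positions.

    Assumption: double hashing (h1 + i * h2) mod length stands in for
    num_hashes independent hash functions.
    """
    positions: set[int] = set()
    for gram in qgrams(value):
        data = gram.encode("utf-8")
        h1 = int(hashlib.sha1(data).hexdigest(), 16)
        h2 = int(hashlib.sha256(data).hexdigest(), 16)
        for i in range(num_hashes):
            positions.add((h1 + i * h2) % length)
    return positions


def dice(a: set[int], b: set[int]) -> float:
    """Dice coefficient of two Bloom filters given as sets of 1-bit positions."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    smith = encode_bf("smith")
    # A one-letter variant shares most q-grams; an unrelated name shares none.
    print(dice(smith, encode_bf("smyth")))
    print(dice(smith, encode_bf("johnson")))
```

In such a sketch, a pair of records would be classified as a match when the similarity exceeds a threshold; the personalized threshold and the Multi-LU protocol described in the abstract are not modelled here.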


Author Profile
Shumin Han

School of Computer Science and Engineering, Northeastern University, Hunnan, Shenyang 110169, Liaoning, China

Andorra
Author Profile
Derong Shen

School of Computer Science and Engineering, Northeastern University, Hunnan, Shenyang 110169, Liaoning, China

Andorra
Author Profile
Tiezheng Nie

School of Computer Science and Engineering, Northeastern University, Hunnan, Shenyang 110169, Liaoning, China

Andorra

📄 Paper information

Publication year: 2022
Citations: 0
Country of publication: Andorra
Site: Springer

Related papers (50)