Common Code Segment Selection: Semi-Automated Approach and Evaluation


Research area: Verification



Conference: SIGCSE '21: Proceedings of the 52nd ACM Technical Symposium on Computer Science Education


Abstract

When comparing student programs to check for evidence of plagiarism or collusion, the goal is to identify code segments that are common to two or more programs. Yet some code segments are common for reasons other than plagiarism or collusion, and so should not be considered. A few code similarity detection tools automatically remove very common segments, but they are prone to false results because no human validation is involved. This paper proposes a semi-automated approach for excluding common segments, in which human validation is introduced before the segments are excluded. As existing selection techniques cannot be detached from their similarity detection tools, we propose a new tool (C2S2) that selects the segments independently, along with several adjustable selection constraints that keep the number of suggested segments reasonable for manual observation. To evaluate automated selection techniques independently, we propose and apply three metrics. The evaluation shows our selection technique to be more effective and efficient than the basis underlying existing selection techniques, and establishes the benefit of each of its selection features.
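
The idea of suggesting common segments for human review can be illustrated with a minimal sketch. This is not the C2S2 algorithm, whose details the abstract does not give. It assumes that a "segment" is a run of n whitespace-separated tokens and that a segment is worth suggesting when it appears in at least a given share of the submissions. The function names and the parameters `n`, `min_share`, and `max_suggestions` are hypothetical, chosen only to mirror the notion of adjustable selection constraints.

```python
from collections import defaultdict


def tokenize(source):
    # Naive whitespace tokenizer (an assumption for this sketch);
    # a real tool would use a language-aware lexer.
    return source.split()


def suggest_common_segments(programs, n=4, min_share=0.75, max_suggestions=10):
    """Suggest token n-grams that occur in at least min_share of the programs.

    The returned list is only a set of candidates: a human reviewer is
    expected to confirm each one before it is excluded from comparison.
    """
    owners = defaultdict(set)  # segment -> indices of programs containing it
    for idx, source in enumerate(programs):
        tokens = tokenize(source)
        for i in range(len(tokens) - n + 1):
            owners[tuple(tokens[i:i + n])].add(idx)

    threshold = min_share * len(programs)
    candidates = [
        (len(ids), segment)
        for segment, ids in owners.items()
        if len(ids) >= threshold
    ]
    # Most widespread segments first; ties are ordered by segment text
    # so that the output is deterministic.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [" ".join(segment) for _, segment in candidates[:max_suggestions]]


if __name__ == "__main__":
    submissions = [
        "import java.util.Scanner ; int a = 1 ;",
        "import java.util.Scanner ; int b = 2 ;",
        "int c = 3 ;",
    ]
    for segment in suggest_common_segments(submissions, n=3, min_share=0.6):
        print(segment)
```

Capping the list with `max_suggestions` reflects the abstract's concern that the number of suggested segments should stay manageable for manual observation. The cap and the share threshold are illustrative choices, not the constraints defined in the paper.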


Author Profile
Oscar Karnalim

University of Newcastle & Maranatha Christian University, Ourimbah, Australia

Author Profile
Simon

University of Newcastle, Ourimbah, Australia


Paper Information

Publication year: 2021
Citations: 7
Country of publication: Australia
Site: ACM
