Affiliations
Ahmed Hamza Osman
International University of Africa, Faculty of Computer Studies, Khartoum, Sudan
Naomie Salim
Universiti Teknologi Malaysia, Faculty of Computer Science and Information System
Albaraa Abuobieda
International University of Africa, Faculty of Computer Studies, Khartoum, Sudan
Survey of Text Plagiarism Detection
Abstract
This paper reviews the most significant techniques employed or developed for text plagiarism detection and lists their advantages and limitations. We found that many of the proposed methods have weaknesses and fail to detect some types of plagiarized text. The paper discusses several important issues in plagiarism detection, including plagiarism detection tasks, the plagiarism detection process, and some of the current plagiarism detection techniques.