Abstract

This paper presents a method for improving the data integrity of an individual-based bibliographic repository. Integrity is improved by comparing each individual's raw publication data with that individual's clustered publication data. Hierarchical agglomerative clustering is used to cluster publication records with similar author names. The clustering proceeds in two steps: the first groups records by co-author relationship, and the second by title similarity and year difference. This two-step hierarchical clustering technique for name disambiguation has been applied to the Universitas Sriwijaya Publication Data Center with good accuracy.
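
The two-step procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: the single-linkage merge rule, the Jaccard word-overlap measure for title similarity, the thresholds `min_sim` and `max_years`, the record fields, and the sample records themselves.

```python
from itertools import combinations


def agglomerate(clusters, linked):
    """Merge clusters bottom-up until no pair of clusters is linked.

    Two clusters are merged when any pair of their records satisfies
    `linked` (single linkage, assumed here for simplicity).
    """
    clusters = [list(c) for c in clusters]
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(clusters)), 2):
            if any(linked(a, b) for a in clusters[i] for b in clusters[j]):
                clusters[i].extend(clusters.pop(j))
                merged = True
                break  # indices changed, so restart the scan
    return clusters


def share_coauthor(p, q):
    """Step 1 criterion: the two records have at least one co-author in common."""
    return bool(set(p["coauthors"]) & set(q["coauthors"]))


def title_similarity(a, b):
    """Jaccard overlap of lowercase title words (an assumed similarity measure)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def similar_title_and_year(p, q, min_sim=0.5, max_years=3):
    """Step 2 criterion: similar titles and a small year difference.

    The thresholds are hypothetical defaults, not values from the paper.
    """
    return (title_similarity(p["title"], q["title"]) >= min_sim
            and abs(p["year"] - q["year"]) <= max_years)


def two_step_cluster(papers):
    """Cluster by co-authorship first, then merge clusters by title and year."""
    step1 = agglomerate([[p] for p in papers], share_coauthor)
    return agglomerate(step1, similar_title_and_year)


# Made-up records that all carry the same ambiguous author name.
papers = [
    {"id": 1, "title": "Name disambiguation in digital libraries",
     "year": 2015, "coauthors": ["B. Lee"]},
    {"id": 2, "title": "Clustering of bibliographic records",
     "year": 2016, "coauthors": ["B. Lee", "C. Tan"]},
    {"id": 3, "title": "Author name disambiguation in digital libraries",
     "year": 2016, "coauthors": ["D. Wong"]},
    {"id": 4, "title": "Soil chemistry of peat swamp forests",
     "year": 2004, "coauthors": ["E. Park"]},
]

clusters = two_step_cluster(papers)
print([[p["id"] for p in c] for c in clusters])
```

In this sketch, records 1 and 2 are joined in the first step through a shared co-author, record 3 joins them in the second step because its title closely matches that of record 1 and the years differ by one, and record 4 remains a separate cluster, which would suggest a different person with the same name.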