Document Clustering to organize the similar documents into classes using K-Means to improve retrieval and Time complexity

Roopa Devi Chandanala, N. Harini

Abstract


Document clustering has been one of the fastest-growing research fields over the past few decades. It has become an important task in text mining because of the enormous increase in documents on the web. All organizations require proper management of textual information. Document clustering is an unsupervised process that helps organize similar documents into classes to improve retrieval. This paper explains the phases of document clustering and an improvement to document clustering that uses quality-weighted k-means to cluster the documents and place similar documents in the appropriate cluster. Experimental results show that the proposed method performs better than basic k-means in terms of F-measure and time complexity. Clustering aims to organize a collection of data items into clusters such that items within a cluster are more "similar" to one another than they are to items in other clusters. The k-means method is one of the most widely used clustering techniques for various applications.
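To make the clustering objective concrete, the following is a minimal sketch of basic (unweighted) k-means applied to a handful of short documents. It is not the weighted variant proposed in the paper, whose weighting scheme the abstract does not specify. The helper names (`tf_vector`, `squared_distance`, `kmeans`), the toy documents, the unit-length term-frequency representation, and the fixed initial centroids are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """Return a unit-length term-frequency vector for `text` over `vocab`."""
    counts = Counter(text.lower().split())
    vec = [float(counts[term]) for term in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, init_indices, max_iter=100):
    """Basic k-means; centroids start at the vectors named by init_indices."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each document joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy corpus: two documents about clustering, two about football.
docs = [
    "clustering groups similar documents",
    "k-means clustering of documents",
    "football match final score",
    "football team wins match",
]
vocab = sorted({term for d in docs for term in d.lower().split()})
vectors = [tf_vector(d, vocab) for d in docs]
labels, _ = kmeans(vectors, init_indices=[0, 2])
print(labels)  # [0, 0, 1, 1]
```

Documents that share vocabulary end up in the same cluster, which is the "more similar within a cluster than across clusters" property described above. Fixing the initial centroids keeps the sketch deterministic; practical implementations usually choose them randomly or with a seeding heuristic.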






Copyright © 2013. All rights reserved. | ijseat.com

International Journal of Science Engineering and Advance Technology is licensed under a Creative Commons Attribution 3.0 Unported License. Based on a work at IJSEat. Permissions beyond the scope of this license may be available at http://creativecommons.org/licenses/by/3.0/deed.en_GB.