Giving users simple, well-organized web search results has been an active topic of Information Retrieval (IR) research. However short or ambiguous a query is, a user wants the desired result on the first display of an IR system. Clustering the results of an IR system offers a way to meet the user's actual information need. This paper presents an approach to clustering IR system results that combines heuristics with the k-means technique using cosine similarity. The heuristic determines the initial value of k, which is then used to create the initial centroids. This removes the need to specify k externally, which can lead to unwanted results when the value is chosen wrongly. Centroids created this way are more specific and meaningful in the context of web search results. Another advantage of the proposed method is that it removes the objective means function of k-means, which makes cluster sizes the same. The end result of the proposed approach is a set of document clusters of different sizes.
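To make the k-means-with-cosine-similarity step concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not describe the heuristic that selects k, so the sketch assumes the initial centroids are simply handed in as seed strings (the hypothetical `seeds` argument), and it uses plain term-frequency vectors and the standard k-means mean update. All function names and the sample snippets are illustrative assumptions.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector for one snippet (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_vector(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # Counter.update adds counts
    return {t: w / len(vectors) for t, w in total.items()}


def cosine_kmeans(docs, seeds, max_iter=20):
    """Assign each document to its most similar centroid until labels stop changing.

    `seeds` stands in for the output of a k-selection heuristic (assumption):
    k is simply len(seeds).
    """
    vectors = [tf_vector(d) for d in docs]
    centroids = [tf_vector(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            max(range(len(centroids)), key=lambda j: cosine(v, centroids[j]))
            for v in vectors
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = mean_vector(members)
    return labels


# Made-up result snippets for the ambiguous query "jaguar".
docs = [
    "jaguar car dealer price",
    "jaguar big cat habitat",
    "jaguar car engine review",
    "jaguar cat jungle predator",
    "jaguar car price",
]
seeds = ["car", "cat"]
labels = cosine_kmeans(docs, seeds)
print(labels)
```

With these sample snippets the two clusters end up with three and two documents, which illustrates how the number of seeds fixes k and how clusters need not be the same size.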
Web Search Result Clustering based on Heuristic Search and k-means
Published 2015 in arXiv.org
ABSTRACT
PUBLICATION RECORD
- Publication year
2015
- Venue
arXiv.org
- Publication date
2015-08-11
- Fields of study
Computer Science
- Source metadata
Semantic Scholar