Publication: Domain-Specific Keyphrase Extraction

Keyphrases are an important means of document summarization, clustering, and topic search. Only a small minority of documents have author-assigned keyphrases, and manually assigning keyphrases to existing documents is very laborious. Therefore it is highly desirable to automate the keyphrase extraction process. This paper shows that a simple procedure for keyphrase extraction based on the naive Bayes learning scheme performs comparably to the state of the art. It goes on to explain how this procedure's performance can be boosted by automatically tailoring the extraction process to the particular document collection at hand. Results on a large collection of technical reports in computer science show that the quality of the extracted keyphrases improves significantly when domain-specific information is exploited.




Eibe Frank
University of Waikato
Gordon Paynter
University of Waikato
Ian Witten
University of Waikato
Carl Gutwin
University of Saskatchewan
Craig Nevill-Manning
Rutgers University


Frank, E., Paynter, G., Witten, I., Gutwin, C., Nevill-Manning, C. 1999. Domain-Specific Keyphrase Extraction. In Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence, 668-673.


@inproceedings {kea-ijcai99,
author= {Eibe Frank and Gordon Paynter and Ian Witten and Carl Gutwin and Craig Nevill-Manning},
title= {Domain-Specific Keyphrase Extraction},
booktitle= {Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence},
year= {1999},
pages= {668-673}