A New Method for Intelligent Categorization of Scientific Texts (Case of Iran’s Nanotechnology Papers)

Authors

-

Abstract

The ISI (Institute for Scientific Information) index is one of the most valuable and frequently used indicators for assessing papers published in science and technology journals. Categorizing these papers is a major challenge in technology management. This paper introduces a new text categorization method, Silhouette-based Unsupervised Text Categorization (SUTC), and applies it to Iranian nanotechnology papers indexed in ISI. First, several standards are combined into a comprehensive hierarchy of nanomaterials. Then, using information retrieval and text mining methods, the papers are categorized automatically, without prior knowledge of class labels. The method is validated by comparing the assigned class labels with the labels that experts gave to a selected set of papers. The analysis shows acceptable accuracy.

Keywords

Scientometrics, Nanotechnology, Text mining, Text categorization, Clustering, Silhouette coefficient
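
The abstract does not spell out the internals of SUTC, so the following is only a generic illustration of the ingredients it names: assigning papers to categories of a hierarchy without labeled training data, scoring the grouping with the silhouette coefficient, and checking the result against expert labels. Everything concrete here is an assumption made for the sketch: the two category descriptions, the four paper titles, the expert labels, the bag-of-words vectors with cosine distance, and the helper names `vectorize`, `assign`, `mean_silhouette`, and `accuracy`.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """1 - cosine similarity of two term-count vectors (1.0 if either is empty)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return 1.0 - dot / norm if norm else 1.0


def assign(doc_vec, category_vecs):
    """Name of the category description closest to the document."""
    return min(category_vecs, key=lambda name: cosine_distance(doc_vec, category_vecs[name]))


def mean_silhouette(vectors, labels):
    """Mean silhouette coefficient over all documents, using cosine distance."""
    clusters = set(labels)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, v in enumerate(vectors):
        # Mean distance from document i to each cluster's members, excluding itself.
        mean_dist = {}
        for c in clusters:
            d = [cosine_distance(v, w) for j, w in enumerate(vectors) if labels[j] == c and j != i]
            if d:
                mean_dist[c] = sum(d) / len(d)
        if labels[i] not in mean_dist:  # singleton cluster: score 0 by convention
            scores.append(0.0)
            continue
        a = mean_dist.pop(labels[i])   # cohesion: mean distance within own cluster
        b = min(mean_dist.values())    # separation: nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


def accuracy(predicted, expert):
    """Fraction of papers whose assigned label matches the expert label."""
    return sum(p == e for p, e in zip(predicted, expert)) / len(expert)


# Hypothetical leaves of a nanomaterial hierarchy, described by keywords.
CATEGORIES = {
    "carbon_nanotubes": "carbon nanotube nanotubes cnt graphene",
    "nanoparticles": "nanoparticle nanoparticles silver gold colloid",
}
# Invented paper titles, for illustration only.
PAPERS = [
    "synthesis of carbon nanotubes by chemical vapor deposition",
    "graphene and carbon nanotube composites",
    "silver nanoparticles with antibacterial activity",
    "gold nanoparticles for drug delivery",
]
# Invented expert labels for the same four titles.
EXPERT = ["carbon_nanotubes", "carbon_nanotubes", "nanoparticles", "nanoparticles"]

category_vecs = {name: vectorize(desc) for name, desc in CATEGORIES.items()}
paper_vecs = [vectorize(p) for p in PAPERS]
labels = [assign(v, category_vecs) for v in paper_vecs]
score = mean_silhouette(paper_vecs, labels)
acc = accuracy(labels, EXPERT)
print(labels, round(score, 3), acc)
```

A silhouette value close to 1 indicates compact, well-separated groups, while values near 0 or below suggest overlapping or misassigned papers. A real system would likely replace the raw term counts with TF-IDF weights and proper tokenization.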