
For short-text classification, where traditional classification algorithms often perform poorly, this paper proposes a search-based method built on Naive Bayes. The method takes into account the scale of the text data set, document length, the number of categories and their distribution, and similar factors. The Naive Bayes algorithm is improved, and search technology is applied to the domain of text classification. The resulting algorithm can be applied to short-text categorization tasks such as Twitter, WeChat, short messages, and phrase-level comments. This paper describes the whole process, including the classification algorithm, training, and evaluation. The results indicate that the classifier performs better than the other methods it was compared with.
short text, text classification, NaiveBayes, TK7800-8360, Electronics, search engine
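
For orientation only, here is a minimal sketch of a standard multinomial Naive Bayes classifier with Laplace (add-one) smoothing on short texts. It is a plain baseline, not the improved, search-based variant the abstract refers to, whose details are not given here. The function names `train` and `predict`, the whitespace tokenizer, and the toy labels in the usage note are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Build a multinomial Naive Bayes model from (text, label) pairs."""
    class_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # token frequencies per label
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()   # assumed tokenizer: whitespace split
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return {"class_counts": class_counts,
            "word_counts": word_counts,
            "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log posterior for the given text."""
    tokens = text.lower().split()
    total_docs = sum(model["class_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["class_counts"].items():
        # log prior of the class
        score = math.log(n_docs / total_docs)
        n_words = sum(model["word_counts"][label].values())
        for tok in tokens:
            # add-one smoothing keeps unseen tokens from zeroing the score
            count = model["word_counts"][label][tok]
            score += math.log((count + 1) / (n_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

A call such as `predict(train([("cheap phone deals", "ads"), ("see you at lunch", "chat")]), "cheap deals")` would score the query against each class. Because short texts carry very few tokens, the smoothing term has a large influence on the result, which is one reason plain Naive Bayes can struggle in this setting.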
