
This article examines the development of information retrieval systems (IRS) and ways to improve classification-based and subject-word (dictionary-based) IRSs. It outlines the general design of a new-generation system that combines the advantages of both types. It also presents a new approach to clustering text documents published on the Internet, based on the "Hierarchical Clustering by Areas" algorithm proposed in the article.
INTERNET SEARCH, CLUSTERING, AREA TREE
