
Elasticsearch, an open-source distributed search and analytics engine, has been widely adopted in recent years. However, given its wide range of uses and deployments, it is not well suited to every scenario and requirement. This paper therefore proposes a method for optimizing the number of Elasticsearch index shards, based on Elasticsearch full-text retrieval technology and the characteristics of the data in practical applications. The method analyzes the remaining storage space and the index shard size on each node of the distributed cluster, and from these computes the optimal number of index shards for the system, which improves the efficiency of data retrieval. Experimental results show that, compared with traditional methods, the proposed method improves system performance in terms of data distribution, data writing efficiency, and data query latency.
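
The abstract does not give the calculation itself, so the following is only a minimal illustrative sketch of how per-node free space and a shard-size target could be combined into a shard count. The function name `estimate_shard_count`, the 30 GB target shard size, the 0.8 usable-space fraction, and the rounding to a multiple of the node count are all assumptions made for illustration and are not taken from the paper.

```python
import math


def estimate_shard_count(index_size_gb, node_free_gb,
                         target_shard_gb=30.0, usable_fraction=0.8):
    """Hypothetical heuristic for a primary shard count (not the paper's algorithm).

    index_size_gb   -- expected total size of the index, in GB
    node_free_gb    -- remaining storage on each data node, in GB
    target_shard_gb -- assumed upper bound on the size of one shard
    usable_fraction -- assumed share of free space the index may consume
    """
    if index_size_gb <= 0:
        raise ValueError("index_size_gb must be positive")
    if not node_free_gb:
        raise ValueError("at least one node is required")

    usable = [free * usable_fraction for free in node_free_gb]
    if min(usable) <= 0:
        raise ValueError("every node needs some free space")
    if index_size_gb > sum(usable):
        raise ValueError("index does not fit in the usable cluster space")

    # Enough shards that no shard exceeds the target size.
    by_size = math.ceil(index_size_gb / target_shard_gb)
    # Enough shards that a single shard fits on the fullest node.
    by_capacity = math.ceil(index_size_gb / min(usable))
    shards = max(by_size, by_capacity, 1)

    # Round up to a multiple of the node count so shards can spread evenly.
    nodes = len(node_free_gb)
    return math.ceil(shards / nodes) * nodes


if __name__ == "__main__":
    # Three nodes with 500, 400 and 600 GB free; a 300 GB index.
    print(estimate_shard_count(300, [500, 400, 600]))
```

In this sketch the size bound and the capacity bound are both lower bounds on the shard count, so the larger of the two is taken before rounding. A real deployment would also need to account for replicas and for data growth over time, which this sketch ignores.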
