
The object of this study is the semantic similarity between two texts. The research develops a hybrid architecture that combines a Siamese Neural Network (SNN) with a Feedforward Neural Network (FNN) to measure semantic text similarity, using Sentence-BERT (SBERT) for text representation. The problem addressed is capturing deep semantic relationships between two texts, which traditional methods such as Term Frequency-Inverse Document Frequency (TF-IDF) or Word2Vec struggle to do. The research aims to overcome this weakness by combining the two architectures into a more powerful hybrid system. Testing yielded a highest accuracy of 87.82 % on the Semantic Textual Similarity (STS) dataset using the SBERT “all-MiniLM-L6-v2” model, 76.72 % on the Quora Question Pairs (QQP) dataset using the “multi-qa-MiniLM-L6-cos-v1” model, and 73.79 % on the Microsoft Research Paraphrase Corpus (MSRP) dataset using the “paraphrase-MiniLM-L12-v2” model. The optimal number of epochs ranged from 300 to 700, and the optimal learning rate ranged from 0.01 to 0.5. SBERT models such as “paraphrase-MiniLM-L6-v2” and “paraphrase-MiniLM-L12-v2” gave the best results on the relevant datasets. The flexibility of the “multi-qa-MiniLM-L6-cos-v1” model also shows that a model designed for question-answering tasks can be used for paraphrase detection. A distinctive feature of the model is the use of SBERT for text representation, which yields richer semantic vectors than traditional methods. The model has potential for application in domains such as plagiarism detection, legal document analysis, and question-answering systems. However, implementation requires careful selection of parameters such as the learning rate and the number of epochs to avoid overfitting or underfitting.
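The hybrid design described above can be illustrated with a minimal sketch. The abstract does not give layer sizes, activation functions, or how the two branch outputs are merged, so everything below is an assumption made for illustration: the class name `SiameseFNN`, the hidden size, the ReLU activations, and the merge by absolute difference and element-wise product are all hypothetical. The inputs are toy vectors standing in for SBERT sentence embeddings (a model such as “all-MiniLM-L6-v2” produces 384-dimensional vectors), and the weights are random and untrained.

```python
import math
import random


def linear(weights, bias, x):
    """Dense layer: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Random weights scaled by the input size, zero biases."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


class SiameseFNN:
    """Hypothetical sketch: a shared (Siamese) branch plus a feedforward head."""

    def __init__(self, embed_dim, hidden_dim=8, seed=0):
        rng = random.Random(seed)
        # One set of branch weights, reused for both input texts.
        self.branch = init_layer(rng, embed_dim, hidden_dim)
        # Feedforward head applied to the merged pair features.
        self.head_hidden = init_layer(rng, 2 * hidden_dim, hidden_dim)
        self.head_out = init_layer(rng, hidden_dim, 1)

    def encode(self, embedding):
        """Project one sentence embedding through the shared branch."""
        return relu(linear(*self.branch, embedding))

    def similarity(self, emb_a, emb_b):
        """Return a score in (0, 1) for a pair of sentence embeddings."""
        a, b = self.encode(emb_a), self.encode(emb_b)
        # Assumed merge step: absolute difference and element-wise product.
        features = ([abs(x - y) for x, y in zip(a, b)]
                    + [x * y for x, y in zip(a, b)])
        hidden = relu(linear(*self.head_hidden, features))
        logit = linear(*self.head_out, hidden)[0]
        return 1.0 / (1.0 + math.exp(-logit))


# Toy 4-dimensional vectors standing in for SBERT embeddings.
model = SiameseFNN(embed_dim=4)
emb_a = [0.1, 0.3, -0.2, 0.5]
emb_b = [0.2, 0.1, -0.1, 0.4]
score = model.similarity(emb_a, emb_b)
```

Because both texts pass through the same branch weights and the assumed merge features are symmetric, the score does not depend on the order of the two inputs. In practice the weights would be trained on labeled pairs, which is where the learning rate and the number of epochs come into play.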
Siamese neural network, semantic text similarity, feedforward neural network, Sentence-BERT
