
A Study of the Architecture of a Distributed Relational Store for Large Volumes of Heterogeneous Data

Master's graduate qualification work


Abstract

Topic of the graduate qualification work: “A Study of the Architecture of a Distributed Relational Store for Large Volumes of Heterogeneous Data”. Working with large databases raises the problem of long data access times. Vertical and horizontal scaling can increase system performance, the latter by organizing a cluster and distributing data across several servers. The work is devoted to the development and study of an architecture for a distributed relational store holding a large volume of heterogeneous data.

The research addressed the following tasks:
1. Studying the specifics of building distributed data stores.
2. Identifying the main components needed to organize a cluster.
3. Studying data decomposition algorithms.
4. Developing the architecture of a distributed relational store.
5. Analyzing the performance of the resulting system.

The work analyzes approaches to organizing distributed relational and NoSQL stores. As a result, an architecture for a distributed store based on the PostgreSQL DBMS and the Citus extension was designed, a prototype cluster of two servers was implemented, and the performance of the resulting system was compared with that of a single-node solution. The results can be used to build distributed stores for large volumes of heterogeneous data with a high degree of resilience and fast data access.
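To make the idea of horizontal scaling by data decomposition concrete, here is a minimal sketch of hash-based sharding across two servers. It is an illustration only and is not taken from the thesis: the node names, the shard count, and the helper functions `shard_for`, `node_for`, and `place_rows` are all hypothetical, and the abstract does not say which decomposition algorithm the author chose.

```python
import hashlib
from collections import defaultdict

# Hypothetical node names; the abstract only says the prototype had two servers.
NODES = ["server-1", "server-2"]
# Assumed shard count, chosen for illustration and not taken from the thesis.
SHARD_COUNT = 8


def shard_for(key: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a row key to a shard id using a hash that is stable across runs."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def node_for(shard_id: int, nodes=NODES) -> str:
    """Assign shards to nodes round-robin."""
    return nodes[shard_id % len(nodes)]


def place_rows(keys):
    """Group row keys by the node that would store them."""
    placement = defaultdict(list)
    for key in keys:
        placement[node_for(shard_for(key))].append(key)
    return dict(placement)


if __name__ == "__main__":
    rows = [f"user-{i}" for i in range(1000)]
    for node, keys in sorted(place_rows(rows).items()):
        print(node, len(keys))
```

Hashing the key spreads rows roughly evenly without a central lookup table, and using more shards than nodes leaves room to move whole shards when servers are added. Range-based or directory-based decomposition are alternatives with different trade-offs.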

Keywords

horizontal scaling, PostgreSQL, big data, relational databases, sharding, cluster, RDBMS
