A REVIEW OF BIG DATA PROCESSING METHODS USING APACHE SPARK, THE PANDAS LIBRARY, AND SQL


ZENODO

Article . 2024

License: CC BY

Data sources: ZENODO, Datacite


Publication: Article, 22 May 2024. Publisher: Zenodo

doi: 10.5281/zenodo.11241367


Abstract

This article presents a comparative analysis of three key data processing technologies, Apache Spark, Pandas, and SQL, in terms of their performance, scalability, flexibility of use, and suitable application scenarios. It discusses the main qualities of each tool and the areas where each is best applied, to help data professionals and organizations make an informed choice based on their specific requirements. The analysis identifies the key strengths and weaknesses of each of the methods considered.
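To make the comparison concrete, the following minimal sketch expresses one aggregation in two of the three tools: Pandas, and SQL via the standard-library `sqlite3` module. It is not taken from the article. The `sales` table, its `region` and `amount` columns, and the sample rows are hypothetical and chosen only for illustration. A PySpark equivalent is shown as a comment rather than executed, since it assumes an available Spark installation.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data, invented purely for illustration.
rows = [("north", 10.0), ("south", 5.0), ("north", 2.5), ("south", 1.0)]

# Pandas: in-memory aggregation on a single machine.
df = pd.DataFrame(rows, columns=["region", "amount"])
pandas_totals = df.groupby("region")["amount"].sum().to_dict()

# SQL: the same aggregation stated declaratively (in-memory SQLite database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
sql_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
conn.close()

# PySpark (not executed here; assumes a SparkSession named `spark`):
#   from pyspark.sql import functions as F
#   sdf = spark.createDataFrame(rows, ["region", "amount"])
#   sdf.groupBy("region").agg(F.sum("amount")).show()

# Both approaches should produce the same per-region totals.
print(pandas_totals == sql_totals)
```

The logic is identical in each case. What differs is where the data lives and how the work is executed, which is the axis along which the abstract compares the tools.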
