Powered by OpenAIRE graph

This Research product is the result of merged Research products in OpenAIRE.

Data mining

Authors: Mrázek, Michal

Abstract

This master's thesis deals with the analysis of multidimensional data. Three dimensionality reduction algorithms are introduced, and basic natural language processing methods are used to show how to work with text documents. The practical part processes real-world data from an internet forum: the posted messages are first converted to a numerical representation, then projected into two-dimensional space and visualized. Next, the topics of the messages are identified. Finally, several selected dimensionality reduction algorithms are compared.
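
As a rough illustration of the first step in the practical part (turning forum messages into a numerical representation), the following Python sketch builds TF-IDF vectors using only the standard library. The abstract does not say which weighting scheme the thesis uses, so TF-IDF with a smoothed IDF and L2 normalization is an assumption here, and the names `tokenize` and `tfidf_matrix` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_matrix(docs):
    """Return (vocabulary, rows), where each row is an L2-normalized TF-IDF vector.

    Assumed weighting: raw term count times a smoothed IDF,
    idf(t) = ln((1 + n) / (1 + df(t))) + 1, with n the number of documents
    and df(t) the number of documents containing term t.
    """
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n = len(docs)
    # Document frequency: count each term once per document.
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1.0 for t in vocab}

    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        row = [counts[t] * idf[t] for t in vocab]
        # Guard against empty documents, whose norm would be zero.
        norm = math.sqrt(sum(x * x for x in row)) or 1.0
        rows.append([x / norm for x in row])
    return vocab, rows


if __name__ == "__main__":
    # Made-up example messages, not data from the thesis.
    messages = ["the cat sat", "the dog sat", "the bird flew"]
    vocab, rows = tfidf_matrix(messages)
    for message, row in zip(messages, rows):
        print(message, [round(x, 3) for x in row])
```

Rows produced this way could then be passed to a dimensionality reduction method such as SVD, NMF, or t-SNE (the methods named in the keywords) to obtain two-dimensional coordinates for plotting.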

Country
Czech Republic
Keywords

data mining, NMF, natural language processing, SVD, t-SNE, dimensionality reduction
