publication . Conference object . Contribution for newspaper or weekly magazine . 2017

HPTA: High-performance text analytics

Hans Vandierendonck; Karen Murphy; Mahwish Arif; Dimitrios S. Nikolopoulos
Open Access
  • Published: 06 Feb 2017
  • Publisher: IEEE
  • Country: United Kingdom
Abstract
One of the main targets of data analytics is unstructured data, which primarily involves textual data. High-performance processing of textual data is non-trivial. We present the HPTA library for high-performance text analytics. The library helps programmers to map textual data to a dense numeric representation, which can be handled more efficiently. HPTA encapsulates three performance optimizations: (i) efficient memory management for textual data, (ii) parallel computation on associative data structures that map text to values and (iii) optimization of the type of associative data structure depending on the program context. We demonstrate that HPTA outperforms ...
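To make the idea of mapping text to a dense numeric representation concrete, the following minimal Python sketch assigns each distinct token a consecutive integer ID through an associative data structure (a plain dictionary) and then stores per-document term counts in dense lists indexed by those IDs. This is an illustration only, not HPTA's API: the function names `build_vocabulary` and `to_count_vectors`, the whitespace tokenization, and the lowercasing are all assumptions made for the example, and none of the library's memory-management or parallelism optimizations are modeled.

```python
def build_vocabulary(documents):
    """Assign each distinct token a dense integer ID in first-seen order.

    Hypothetical helper for illustration; not part of HPTA.
    """
    vocab = {}
    for doc in documents:
        for token in doc.lower().split():
            if token not in vocab:
                # The next free ID equals the current vocabulary size,
                # so IDs are consecutive: 0, 1, 2, ...
                vocab[token] = len(vocab)
    return vocab


def to_count_vectors(documents, vocab):
    """Turn each document into a dense list of term counts indexed by token ID.

    Hypothetical helper for illustration; not part of HPTA.
    """
    vectors = []
    for doc in documents:
        counts = [0] * len(vocab)
        for token in doc.lower().split():
            counts[vocab[token]] += 1
        vectors.append(counts)
    return vectors


docs = ["the cat sat", "the cat saw the dog"]
vocab = build_vocabulary(docs)
vectors = to_count_vectors(docs, vocab)
```

Once tokens are replaced by integer IDs, later stages can work on arrays of numbers rather than strings, which is the kind of representation the abstract describes as more efficient to handle.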
Subjects
free text keywords: data analytics, performance optimization, text analytics, Sparse matrix, Data structure, Memory management, Associative property, Data analysis, Computer science, Data mining, Semantic analytics, Information retrieval, Unstructured data, Analytics