software . 2020

ekzhu/datasketch: Improved performance for MinHash and MinHashLSH

Eric Zhu; Vadim Markovtsev; aastafiev; Wojciech Łukasiewicz; ae-foster; Jordan Martin; Ekevoo; Kevin Mann; Keyur Joshi; Spandan Thakur; ...
Open Source
  • Published: 15 Dec 2020
  • Publisher: Zenodo
Abstract
  • Performance improvement for MinHash's update method.
  • Make MinHash updates 4.5X faster by using the update_batch method for bulk update on MinHash. See API doc: http://ekzhu.com/datasketch/documentation.html#datasketch.MinHash.update_batch
  • Further performance gain by using bulk generation of MinHash using MinHash.bulk or MinHash.generator. See API doc (http://ekzhu.com/datasketch/documentation.html#datasketch.MinHash.bulk) and pull request (https://github.com/ekzhu/datasketch/pull/142) ...
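To make the difference between per-item and batch updates concrete, here is a minimal standard-library sketch. It is not datasketch's code: the class ToyMinHash, the helper bulk, the SHA-1 based 32-bit hash, and the (a*x + b) mod p permutation scheme are all assumptions chosen for illustration. It only shows that a batch update should yield the same signature as repeated single updates; it does not reproduce the 4.5X speed-up reported above.

```python
import hashlib
import random

_MERSENNE_PRIME = (1 << 61) - 1  # modulus for the toy permutations (assumption)
_MAX_HASH = (1 << 32) - 1        # signature slots hold 32-bit values (assumption)


def _hash32(data: bytes) -> int:
    """Hash bytes to a 32-bit integer using SHA-1 (an illustrative choice)."""
    return int.from_bytes(hashlib.sha1(data).digest()[:4], "little")


class ToyMinHash:
    """Hypothetical, simplified MinHash; NOT datasketch's implementation."""

    def __init__(self, num_perm: int = 64, seed: int = 1):
        rng = random.Random(seed)
        # One (a, b) pair per permutation: h_i(x) = ((a*x + b) mod p) & max_hash
        self.params = [
            (rng.randrange(1, _MERSENNE_PRIME), rng.randrange(0, _MERSENNE_PRIME))
            for _ in range(num_perm)
        ]
        self.hashvalues = [_MAX_HASH] * num_perm

    def update(self, item: bytes) -> None:
        """Fold a single item into the signature."""
        hv = _hash32(item)
        self.hashvalues = [
            min(cur, ((a * hv + b) % _MERSENNE_PRIME) & _MAX_HASH)
            for cur, (a, b) in zip(self.hashvalues, self.params)
        ]

    def update_batch(self, items) -> None:
        """Fold many items in one call; same result as update() on each item."""
        hvs = [_hash32(item) for item in items]
        if not hvs:
            return
        self.hashvalues = [
            min(cur, min(((a * hv + b) % _MERSENNE_PRIME) & _MAX_HASH for hv in hvs))
            for cur, (a, b) in zip(self.hashvalues, self.params)
        ]

    def jaccard(self, other: "ToyMinHash") -> float:
        """Estimate Jaccard similarity as the fraction of matching slots."""
        matches = sum(x == y for x, y in zip(self.hashvalues, other.hashvalues))
        return matches / len(self.hashvalues)


def bulk(sets, **kwargs):
    """Build one ToyMinHash per input set (the idea behind bulk generation)."""
    result = []
    for items in sets:
        m = ToyMinHash(**kwargs)
        m.update_batch(items)
        result.append(m)
    return result
```

In datasketch itself, the entry points named in the release notes are update_batch, MinHash.bulk and MinHash.generator; the linked API doc is the authority on their exact signatures and behavior.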