Powered by OpenAIRE graph
ZENODO
Dataset . 2025
License: CC BY
Data sources: ZENODO
ZENODO
Dataset . 2025
License: CC BY
Data sources: Datacite

Pythia8 and Herwig7 Boosted Top & QCD Jet datasets

Authors: Gambhir, Rikab; LeBlanc, Matt; Zhou, Yuanchen;

Abstract

A dataset of labeled top and QCD jets, generated using both Pythia8 and Herwig7. There are 20 files: 10 generated using Pythia and 10 generated using Herwig (with the prefix `HERWIG`). Each file consists of 100k top jets and 100k QCD jets, for a total of 2M events for Pythia and 2M events for Herwig (4M total).

There are two arrays in each file:

  • X: (200000, M, 4), a set of 100k top jets and 100k QCD jets, where M is the maximum particle multiplicity among the jets in that file (jets with fewer particles are padded with zero-particles). The features of each particle are its pt, rapidity, azimuthal angle, and pdgid.
  • y: (200000,), an array of labels for the jets, where QCD is 0 and top is 1.

The Pythia samples are generated using Pythia 8.331. The top events are generated using the processes `Top:gg2ttbar` and `Top:qqbar2ttbar`, and the W's are forced to decay hadronically. The QCD events are generated using `HardQCD:all`.

The Herwig samples are generated using Herwig 7.3.0. The top events are generated using `MEHeavyQuark`, and leptonic decays of the W's are discarded. The QCD events are generated using `MEQCD2to2`.

For both datasets, jets are clustered with FastJet 3.3.0 using the anti-kt algorithm with R = 0.8. For top jets, a hard top parton is required to exist within the jet cone. We select jets with a pT between 500 and 550 GeV and a pseudorapidity less than 2.5. If multiple jets in an event meet these criteria, one jet is chosen at random.

Usage

This dataset can be downloaded automatically using the ParticleLoader python package. The files are downloaded to a specified cache directory, and are loaded from that cache if they already exist.

    from particleloader import load

    # Change this to a working directory on your machine!
    dir = "~/.ParticleLoader"
    N = 100000

    X_pythia, y_pythia = load("topqcd_jets", N, cache_dir=dir)
    X_herwig, y_herwig = load("topqcd_jets", N, cache_dir=dir, generator="herwig")

WARNING: A similar dataset exists for quark/gluon tagging. However, as these events were generated using different versions of Pythia and Herwig, these datasets should not be mixed.
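As an illustration of the array layout described above, the following is a minimal sketch of how the zero-padding in X might be handled with NumPy. It is not part of the dataset or of ParticleLoader. The helper names (`particle_mask`, `multiplicity`, `jet_pt`) are hypothetical, the feature order along the last axis is assumed to match the order in which the features are listed (pt, rapidity, azimuthal angle, pdgid), and padded entries are assumed to have pt equal to zero. The demo arrays are synthetic stand-ins, not real events.

```python
import numpy as np

# Assumed feature order along the last axis, taken from the order in which
# the description lists the features: (pt, rapidity, azimuthal angle, pdgid).
PT, RAPIDITY, PHI, PDGID = range(4)


def particle_mask(X):
    """Boolean mask of shape (N, M): True for real particles, False for padding.

    Assumes zero-padded entries have pt == 0 and real particles have pt > 0.
    """
    return X[..., PT] > 0


def multiplicity(X):
    """Number of real (non-padded) particles in each jet, shape (N,)."""
    return particle_mask(X).sum(axis=-1)


def jet_pt(X):
    """Transverse momentum of each jet from the vector sum of its particles.

    Padded entries have pt == 0 and therefore do not contribute to the sum.
    """
    px = (X[..., PT] * np.cos(X[..., PHI])).sum(axis=-1)
    py = (X[..., PT] * np.sin(X[..., PHI])).sum(axis=-1)
    return np.hypot(px, py)


# Tiny synthetic stand-in with the same layout as X and y (not real data):
# 2 jets, padded to M = 3 particles, 4 features per particle.
X_demo = np.zeros((2, 3, 4))
X_demo[0, 0] = [300.0, 0.1, 0.0, 211]
X_demo[0, 1] = [200.0, -0.1, 0.0, 22]
X_demo[1, 0] = [400.0, 0.0, 0.0, 211]
X_demo[1, 1] = [300.0, 0.0, np.pi / 2, -211]
X_demo[1, 2] = [10.0, 0.2, 0.0, 22]
y_demo = np.array([0, 1])  # 0 = QCD, 1 = top

# Split by label, as one might for the real arrays.
X_qcd = X_demo[y_demo == 0]
X_top = X_demo[y_demo == 1]
```

On the real arrays, the same calls would be `multiplicity(X_pythia)` and `jet_pt(X_pythia)`. If pt is stored in GeV, the summed jet pT would be expected to fall near the 500 to 550 GeV selection window, which can serve as a quick sanity check on the assumed feature order.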
