Quark vs Gluon Jets

How are gluon jets different from light (uds) quark jets?

Compare different categories of observables, as well as different parameters (pT, jet size, etc.) within a category. Below, the left plot is the simulated distribution (normalized to unit area), and the right plot is the resulting ROC curve (signal efficiency vs. background rejection).

The ROC curve is clickable: clicking a point shows the corresponding cut. "Plot All" next to a parameter plots all variations of just that parameter while keeping the others fixed. "Variety Plot" shows the ROC curves for a representative set of variables.



ROC Curves and their Relatives

Below the ROC curve, different one-to-one transformations can be selected which convey the same information differently. The vertical axis changes, but signal efficiency is always on the horizontal axis. For example, the improvement in S/B (signal efficiency divided by background efficiency) can often be arbitrarily high for variables with a long signal-like tail, whereas the improvement in significance (signal efficiency divided by the square root of background efficiency) often has a maximum, and therefore an optimal cut that achieves it.
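As a rough illustration of these transformations, the sketch below computes the efficiencies and the two derived curves from a pair of samples. The function name `roc_relatives`, the cut direction (keep values below the cut), and the Poisson toy samples are all assumptions made for this example; they are not the simulated samples behind the plots.

```python
import numpy as np

def roc_relatives(signal, background, cuts):
    """Efficiencies and derived curves for cuts of the form `value < cut`.

    The cut direction (keep low values) is an assumption of this sketch.
    """
    signal = np.asarray(signal, dtype=float)
    background = np.asarray(background, dtype=float)
    eff_s = np.array([np.mean(signal < c) for c in cuts])
    eff_b = np.array([np.mean(background < c) for c in cuts])
    # Cuts that remove all background give eff_b = 0; let those entries be inf/nan.
    with np.errstate(divide="ignore", invalid="ignore"):
        s_over_b = eff_s / eff_b                # improvement in S/B
        significance = eff_s / np.sqrt(eff_b)   # improvement in S/sqrt(B)
    return eff_s, eff_b, s_over_b, significance

# Toy stand-ins for a discriminating variable (hypothetical, not the real samples).
rng = np.random.default_rng(0)
quark = rng.poisson(6, size=10_000)
gluon = rng.poisson(10, size=10_000)
cuts = np.arange(1, 25)

eff_s, eff_b, s_over_b, significance = roc_relatives(quark, gluon, cuts)
valid = eff_b > 0
best = np.argmax(np.where(valid, significance, -np.inf))
print(f"best cut: value < {cuts[best]}, signal efficiency {eff_s[best]:.2f}")
```

Scanning `significance` for its maximum is how one would locate the optimal cut mentioned above; `s_over_b` has no such guarantee and can keep growing as the cut tightens.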

The curves labeled "Significance improvement for X% Gluon" are meant to illustrate the use of quark/gluon discrimination in a realistic context. We assume that the signal is 100% quark and the background is mixed: X% gluon and (100-X)% quark. We then show the improvement in S/sqrt(B) when a cut is placed. For example, in a WW → 4 jets search with 200 GeV jets, each background jet has a 30% probability of being a quark jet and a 70% probability of being a gluon jet. If the 70% gluon curve for a particular variable has a significance improvement of 2, you can improve S/sqrt(B) by a factor of 2 by cutting on this variable alone. One thing to notice about the X% gluon curves is that as the background becomes less purely gluon, the maximum significance improvement goes down (which is easy to imagine) and the optimal cut becomes looser, with a higher quark efficiency (which is maybe less obvious).
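The mixed-background curves follow directly from the quark and gluon efficiencies of a cut. The sketch below spells that out; the function name `significance_improvement` and the example efficiencies are hypothetical, chosen only to show the arithmetic.

```python
import numpy as np

def significance_improvement(eff_quark, eff_gluon, gluon_fraction):
    """Improvement in S/sqrt(B) for a pure-quark signal.

    The background is assumed to be `gluon_fraction` gluon and
    (1 - gluon_fraction) quark, as in the "X% Gluon" curves.
    """
    eff_background = gluon_fraction * eff_gluon + (1.0 - gluon_fraction) * eff_quark
    return eff_quark / np.sqrt(eff_background)

# Hypothetical cut that keeps 50% of quark jets and 10% of gluon jets.
for fraction in (1.0, 0.7, 0.5):
    gain = significance_improvement(0.5, 0.1, fraction)
    print(f"{fraction:.0%} gluon background: improvement {gain:.3f}")
```

For a fixed cut, the quark part of the background passes with the same efficiency as the signal, so a less gluon-rich background dilutes the gain.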

Descriptions and Discussion of Some Variables


The jets selected for each sample are central in eta and, when clustered as anti-kT R=0.5 jets, have a pT within 10% of the indicated "Jet Pt". These and other basic properties of our sample can be examined in the kinematic observables at the top of the list.


A list of subjets is calculated by reclustering a jet with a smaller R and possibly a different algorithm. Properties of these subjets which are useful for quark/gluon discrimination include the count (multiplicity) of subjets, their average pT, the spread (standard deviation) of pT, and the fraction of pT contained in the hardest subjet, 2nd hardest, etc.
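As a minimal sketch of these per-jet quantities, the function below takes the subjet pT values of a single jet, however they were obtained, and returns the statistics listed above. The name `subjet_statistics` is hypothetical, and normalizing the fractions by the scalar sum of subjet pT (rather than by the jet pT itself) is an assumption of this example.

```python
import numpy as np

def subjet_statistics(subjet_pts):
    """Summary statistics of the subjets of one jet (expects at least one subjet)."""
    pts = np.sort(np.asarray(subjet_pts, dtype=float))[::-1]  # hardest first
    total = pts.sum()  # assumption: scalar pT sum stands in for the jet pT
    return {
        "count": int(pts.size),                  # subjet multiplicity
        "mean_pt": float(pts.mean()),            # average subjet pT
        "std_pt": float(pts.std()),              # spread (population std. dev.)
        "pt_fractions": (pts / total).tolist(),  # hardest, 2nd hardest, ...
    }

stats = subjet_statistics([10.0, 60.0, 30.0])
print(stats["count"], stats["pt_fractions"])
```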

Initial studies revealed that the smallest subjets were always better at quark/gluon discrimination. Within each subjet size, anti-kT subjets performed slightly better than CA subjets, which were slightly better than kT subjets. The interactive plot generator does not contain the full set of subjet sizes and algorithms.

Count (multiplicity) of subjets

Counting the smallest subjets is one of the best quark/gluon discriminants, though it is not as good as counting charged tracks; the smaller the subjets, the more powerful the count. LEP found that there was an optimal subjet size for quark/gluon discrimination which captured the perturbative shower, but not the properties of hadronization. For jets at LHC energies, this optimal size is smaller than R=0.1, the smallest resolution we considered. ATLAS's TopoClusters and CMS's particle flow would probably achieve better separation.

Average pT^2 of subjets and other single-jet statistics

The average subjet pT itself contains no more information than the jet pT and subjet count. However, the average of $p_T^2$ does carry additional information. The spread (standard deviation) of subjet pT within a jet involves subtracting off the square of the average pT from this quantity, and is almost as powerful.
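The relation between these statistics can be written out explicitly. For a jet with $N$ subjets of transverse momenta $p_{T,i}$, and assuming (as an approximation made here for illustration) that the scalar sum of subjet pT equals the jet pT:

$$\langle p_T \rangle = \frac{1}{N}\sum_{i=1}^{N} p_{T,i} \approx \frac{p_T^{\rm jet}}{N}, \qquad \langle p_T^2 \rangle = \frac{1}{N}\sum_{i=1}^{N} p_{T,i}^2, \qquad \sigma_{p_T}^2 = \langle p_T^2 \rangle - \langle p_T \rangle^2 .$$

The first relation shows why the average is fixed by the jet pT and the subjet count, while the second moment depends on how the pT is shared among the subjets.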

Other statistics include the average kT of subjets and the fraction of pT contained in the hardest (or 2nd or 3rd hardest) subjet.

Exclusive Reclustering into exactly N subjets

The jet is reclustered using a different algorithm (usually kT or Cambridge/Aachen) until exactly N subjets are found. The properties of these subjets like their distance (dR) to the original jet axis or the pT fraction they contain are moderately useful observables.

Radial Geometric Moments, Broadening, and Angularities

Geometric moments involve adding up all of the pT of the jet, weighted by some function of the distance to the jet center $\Delta R$. The linear radial moment is sometimes called girth or jet width and is equal to jet broadening in the small-angle, $\eta=0$ limit. The quadratic radial moment is equal to the trace of the inertia tensor, and is identical to the jet mass in the same limit as above.
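A minimal sketch of the linear and quadratic radial moments, computed from a jet's constituents, is given below. The function name `radial_moment`, the argument layout, and the normalization by the scalar pT sum are assumptions of this example rather than the definitions used to produce the plots.

```python
import numpy as np

def radial_moment(pt, eta, phi, jet_eta, jet_phi, power=1.0):
    """pT-weighted average of Delta R**power over the jet constituents.

    power=1 gives a girth-like linear moment, power=2 the quadratic moment.
    Normalizing by the scalar pT sum is an assumption of this sketch.
    """
    pt = np.asarray(pt, dtype=float)
    d_eta = np.asarray(eta, dtype=float) - jet_eta
    d_phi = np.asarray(phi, dtype=float) - jet_phi
    d_phi = (d_phi + np.pi) % (2.0 * np.pi) - np.pi  # wrap into [-pi, pi)
    delta_r = np.hypot(d_eta, d_phi)
    return float(np.sum(pt * delta_r**power) / np.sum(pt))

# Two equal-pT constituents at Delta R = 0.1 on either side of the axis.
girth = radial_moment([1.0, 1.0], [0.1, -0.1], [0.0, 0.0], 0.0, 0.0, power=1)
quadratic = radial_moment([1.0, 1.0], [0.1, -0.1], [0.0, 0.0], 0.0, 0.0, power=2)
print(girth, quadratic)
```

Passing a different `power`, or replacing `delta_r**power` with another function of the distance, gives other members of the same family of moments.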

Angularities are a class of radial moments that involve a 1-parameter family of functions that go smoothly from zero at the jet center to 1 at the jet edge.


Designed to measure color connections between jets, pull proves less useful in the context of Quarks vs Gluons. The paper is arXiv:1001.5027 [hep-ph].

Inertia/Covariance Tensor

This is a second-order geometric moment tensor, linear in pT but second order in $\Delta \eta$ and $\Delta \phi$. Components of this symmetric 2x2 tensor give information about the jet's size. Various combinations of eigenvalues give rotationally-invariant measures of width, eccentricity, and planar flow. The determinant is a 4th order geometric moment.
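The sketch below builds such a tensor from constituent displacements relative to the jet axis and extracts a few rotationally-invariant combinations. The function names, the normalization by the scalar pT sum, and the particular "planar-flow-like" ratio 4 det / trace^2 are assumptions chosen for illustration; the exact definitions behind the plots may differ.

```python
import numpy as np

def covariance_tensor(pt, d_eta, d_phi):
    """pT-weighted 2x2 tensor of displacements from the jet axis."""
    pt = np.asarray(pt, dtype=float)
    d_eta = np.asarray(d_eta, dtype=float)
    d_phi = np.asarray(d_phi, dtype=float)
    w = pt / pt.sum()  # assumption: weights normalized by the scalar pT sum
    c_ee = np.sum(w * d_eta * d_eta)
    c_ep = np.sum(w * d_eta * d_phi)
    c_pp = np.sum(w * d_phi * d_phi)
    return np.array([[c_ee, c_ep], [c_ep, c_pp]])

def tensor_invariants(tensor):
    """Rotation-invariant combinations of the two eigenvalues."""
    lam_small, lam_large = np.linalg.eigvalsh(tensor)  # ascending order
    trace = lam_small + lam_large        # equals the quadratic radial moment
    det = lam_small * lam_large          # 4th order in the displacements
    ratio = 4.0 * det / trace**2 if trace > 0 else 0.0  # 1 = round, 0 = linear
    return {
        "eigenvalues": (float(lam_small), float(lam_large)),
        "trace": float(trace),
        "determinant": float(det),
        "planar_flow_like": float(ratio),
    }

# Four equal-pT constituents arranged symmetrically around the axis.
tensor = covariance_tensor([1, 1, 1, 1], [0.1, -0.1, 0.0, 0.0], [0.0, 0.0, 0.1, -0.1])
print(tensor_invariants(tensor))
```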

Other 2D Moments: Geometric, Moments of Hu, Zernike

The Wikipedia article on Image Moments has a good description of geometric moments, which it calls "Raw moments" or, equivalently for us, "Central moments", along with the 7 Moments of Hu, which it less awesomely calls "Rotation invariant moments". The Wikipedia article on Zernike Polynomials discusses the basis used for the Zernike moments. As with spherical harmonics, the sine and cosine moments can be treated as real and imaginary components, and a magnitude can be calculated from them.


For questions and suggestions, email jason@frank.harvard.edu

Please cite arXiv:1106.3076 if this proves useful