Jet multiplicity distributions in top quark pair (tt̄) events are measured in pp collisions at a centre-of-mass energy of 8 TeV with the CMS detector at the LHC using a data set corresponding to an integrated luminosity of 19.7 fb⁻¹. The measurement is performed in the dilepton decay channels (e⁺e⁻, μ⁺μ⁻, and e±μ∓). The absolute and normalized differential cross sections for tt̄ production are measured as a function of the jet multiplicity in the event for different jet transverse momentum thresholds and the kinematic properties of the leading additional jets. The differential tt̄b and tt̄bb̄ cross sections are presented for the first time as a function of the kinematic properties of the leading additional b jets. Furthermore, the fraction of events without additional jets above a threshold is measured as a function of the transverse momenta of the leading additional jets and the scalar sum of the transverse momenta of all additional jets. The data are compared and found to be consistent with predictions from several perturbative quantum chromodynamics event generators and a next-to-leading order calculation.
Precise measurements of tt̄ production and decay properties [1–9] provide crucial information for testing the expectations of the standard model (SM) and specifically of calculations in the framework of perturbative quantum chromodynamics (QCD) at high-energy scales. At the energies of the CERN LHC, about half of the tt̄ events contain jets with transverse momentum (pT) larger than 30 GeV that do not come from the weak decay of the tt̄ system. In this paper, these jets will be referred to as “additional jets” and the events as “tt̄+jets”. The additional jets typically arise from initial-state QCD radiation, and their study provides an essential test of the validity and completeness of higher-order QCD calculations describing the processes leading to multijet events.
A correct description of these events is also relevant because tt̄+jets processes constitute important backgrounds in the searches for new physics. These processes also constitute a challenging background in the attempt to observe the production of a Higgs boson in association with a tt̄ pair (tt̄H), where the Higgs boson decays to a bottom (b) quark pair (bb̄), because of the much larger cross section compared to the tt̄H signal. Such a process has an irreducible nonresonant background from tt̄ pair production in association with a bb̄ pair from gluon splitting. Therefore, measurements of tt̄+jets and tt̄bb̄ production can give important information about the main background in the search for the tt̄H process and provide a good test of next-to-leading-order (NLO) QCD calculations.
Here, we present a detailed study of the production of tt̄ events with additional jets and b quark jets in the final state from pp collisions at √s = 8 TeV using the data recorded in 2012 with the CMS detector, corresponding to an integrated luminosity of 19.7 fb⁻¹. The tt̄ pairs are reconstructed in the dilepton decay channel with two oppositely charged isolated leptons (electrons or muons) and at least two jets. The analysis follows, to a large extent, the strategy used in the measurement of normalized tt̄ differential cross sections in the same decay channel described in Ref. .
The measurements of the absolute and normalized tt̄ differential cross sections are performed as a function of the jet multiplicity for different pT thresholds for the jets, in order to probe the momentum dependence of the hard-gluon emission. The results are presented in a visible phase space in which all selected final-state objects are produced within the detector acceptance and are thus measurable experimentally. The study extends the previous measurement at √s = 7 TeV, where only normalized differential cross sections were presented.
The absolute and normalized tt̄+jets production cross sections are also measured as a function of the pT and pseudorapidity (η) of the leading additional jets, ordered by pT. The CMS experiment has previously published a measurement of the inclusive tt̄bb̄ production cross section. In the present analysis, the tt̄bb̄ and tt̄b (referred to as “tt̄b(b̄)” in the following) cross sections are measured for the first time differentially as a function of the properties of the additional jets associated with b quarks, which will hereafter be called b jets. The tt̄bb̄ process corresponds to events where two additional b jets are generated in the visible phase space, while tt̄b represents the same physical process, where only one additional b jet is within the acceptance requirements. In cases with at least two additional jets or two b jets, the cross section is also measured as a function of the angular distance between the two jets and their dijet invariant mass. The results are reported both in the visible phase space and extrapolated to the full phase space of the tt̄ system to facilitate the comparison with theoretical calculations.
Finally, the fraction of events that do not contain additional jets (gap fraction) is determined as a function of the threshold on the leading and subleading additional-jet pT, and the scalar sum of all additional-jet pT. The gap fraction was first measured in Refs. [5, 12].
The results are compared at particle level to theoretical predictions obtained with four different event generators: MadGraph, mc@nlo, powheg, and MG5_aMC@NLO, interfaced with either pythia or herwig, and in the case of powheg with both. Additionally, the measurements as a function of the b jet quantities are compared to the predictions from the event generator PowHel.
This paper is structured as follows. A brief description of the CMS detector is provided in Sect. 2. Details of the event simulation generators and their theoretical predictions are given in Sect. 3. The event selection and the method used to identify the additional radiation in the event for both the tt̄+jets and tt̄b(b̄) studies are presented in Sects. 4 and 5. The cross section measurement and the systematic uncertainties are described in Sects. 6 and 7. The results as a function of the jet multiplicity and the kinematic properties of the additional jets and b jets are presented in Sects. 8–10. The definition of the gap fraction and the results are described in Sect. 11. Finally, a summary is given in Sect. 12.
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are a silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter, each composed of a barrel and two endcap sections. Extensive forward calorimetry complements the coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. .
Experimental effects coming from event reconstruction, selection criteria, and detector resolution are modelled using Monte Carlo (MC) event generators interfaced with a detailed simulation of the CMS detector response using Geant4 (v. 9.4).
The MadGraph (v. 22.214.171.124) generator calculates the matrix elements at tree level up to a given order in αs. In particular, the simulated tt̄ sample used in this analysis is generated with up to three additional partons. The MadSpin package is used to incorporate spin correlations of the top quark decay products. The value of the top quark mass is chosen to be mt = 172.5 GeV, and the proton structure is described by the CTEQ6L1 set of parton distribution functions (PDF). The generated events are subsequently processed with pythia (v. 6.426) for fragmentation and hadronization, using the MLM prescription for the matching of higher-multiplicity matrix element calculations with parton showers. The pythia parameters for the underlying event, parton shower, and hadronization are set according to the Z2* tune, which is derived from the Z1 tune. The Z1 tune uses the CTEQ5L PDFs, whereas Z2* adopts CTEQ6L.
In addition to the nominal MadGraph sample, dedicated samples are generated by varying the central value of the renormalization (μR) and factorization (μF) scales and the matrix element/parton showering matching scale (jet-parton matching scale). These samples are produced to determine the systematic uncertainties in the measurement owing to the theoretical assumptions on the modelling of tt̄ events, as well as for comparisons with the measured distributions. The nominal values of μR and μF are defined by the Q² scale in the event: μR² = μF² = Q² = mt² + Σ pT²(jet), where the sum runs over all the additional jets in the event not coming from the tt̄ decay. The samples with the varied scales use μR² = μF² = 4Q² and Q²/4, respectively. For the nominal MadGraph sample, a jet-parton matching scale of 40 GeV is chosen, while for the varied samples, values of 60 and 30 GeV are employed, respectively. These scales correspond to jet-parton matching thresholds of 20 GeV for the nominal sample, and 40 and 10 GeV for the varied ones.
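The scale choice can be illustrated with a minimal sketch. It assumes the event scale is Q² = mt² + Σ pT²(jet), with the sum over additional jets, and that the varied samples use 4Q² and Q²/4; the function names and the example jet momenta are hypothetical and not taken from the analysis code.

```python
def q2_scale(m_top, additional_jet_pts):
    """Event scale Q^2 in GeV^2: m_t^2 plus the sum of pT^2 over additional jets.

    Assumed definition for illustration; all inputs are in GeV.
    """
    return m_top ** 2 + sum(pt ** 2 for pt in additional_jet_pts)


def varied_scales(q2):
    """Nominal scale with the assumed up (4 Q^2) and down (Q^2 / 4) variations."""
    return {"nominal": q2, "up": 4.0 * q2, "down": q2 / 4.0}


# Hypothetical event with two additional jets of 50 and 30 GeV.
q2 = q2_scale(172.5, [50.0, 30.0])
scales = varied_scales(q2)
```

In terms of μR and μF themselves (rather than their squares), these variations correspond to a factor of two up and down.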
The powheg (v. 1.0 r1380) and mc@nlo (v. 3.41) generators, used with the CT10 and CTEQ6M PDFs, respectively, provide further comparisons with the data. The powheg generator simulates calculations of tt̄ production to full NLO accuracy, and is matched with two parton shower MC generators: the pythia (v. 6.426) Z2* tune (designated as pythia 6 in the following), and the herwig (v. 6.520) AUET2 tune (referred to as herwig 6 in the following). The parton showering in pythia is based on a transverse-momentum ordering of parton showers, whereas herwig uses angular ordering. The mc@nlo generator implements the hard matrix element to full NLO accuracy, matched with herwig (v. 6.520) for the initial- and final-state parton showers using the default tune. These two generators, powheg and mc@nlo, are formally equivalent up to NLO accuracy, but they differ in the techniques used to avoid double counting of radiative corrections that may arise from interfacing with the parton showering generators.
The tt̄ cross section as a function of jet multiplicity and the gap fraction measurements are compared to the NLO predictions of the powheg (v2) and MG5_aMC@NLO generators. The powheg (v2) generator is matched to the pythia (v. 8.205) CUETP8M1 tune (referred to as pythia 8), herwig 6, and pythia 6. In these samples the hdamp parameter of powhegbox, which controls the matrix element and parton shower matching and effectively regulates the high-pT radiation, is set to mt = 172.5 GeV. The MG5_aMC@NLO generator simulates tt̄ events with up to two additional partons at NLO, and is matched to the pythia 8 parton shower simulation using the FxFx merging prescription. The top quark mass value used in all these simulations is also 172.5 GeV and the PDF set is NNPDF3.0. In addition, a tt̄ MadGraph sample matched to pythia 8 for the parton showering and hadronization is used for comparisons with the data.
The tt̄bb̄ production cross sections are also compared with the predictions by the generator PowHel (HELAC-NLO + powhegbox), which implements the full tt̄bb̄ process at NLO QCD accuracy, with parton shower matching based on the powheg NLO matching algorithm [15, 32]. The events are further hadronized by means of pythia (v. 6.428), using parameters of the Perugia 2011 C tune. In the generation of the events, the renormalization and factorization scales are fixed to μR = μF = HT/4, where HT is the sum of the transverse energies of the final-state partons (t, t̄, b, b̄) from the underlying tree-level process, and the CT10 PDFs are used.
The SM background samples are simulated with MadGraph, powheg, or pythia, depending on the process. The MadGraph generator is used to simulate Z/γ* production (referred to as Drell–Yan, DY, in the following), tt̄ production in association with an additional boson (referred to as tt̄+Z, tt̄+W, and tt̄+γ), and W boson production with additional jets (W+jets in the following). Single top quark events (tW channel) are simulated using powheg. Diboson (WW, WZ, and ZZ) and QCD multijet events are simulated using pythia. For the tt̄b and tt̄bb̄ measurements, the expected contribution from SM tt̄H processes, simulated with pythia, is also considered, although the final state has not yet been observed.
For comparison with the measured distributions, the events in the simulated samples are normalized to an integrated luminosity of 19.7 fb⁻¹ according to their predicted cross sections. These are taken from next-to-next-to-leading-order (NNLO) (W+jets and DY), NLO + next-to-next-to-leading logarithmic (NNLL) (single top quark tW channel), NLO (diboson, tt̄+Z, tt̄+W, and tt̄+H), and leading-order (LO) (QCD multijet) calculations. The contribution of QCD multijet events is found to be negligible. The predicted cross section for the tt̄+γ sample is obtained by scaling the LO cross section obtained with the Whizard event generator by an NLO/LO K-factor correction. The simulated tt̄ sample is normalized to the total cross section, calculated with the Top++2.0 program to NNLO in perturbative QCD, including soft-gluon resummation to NNLL order, and assuming mt = 172.5 GeV. The first uncertainty in this cross section comes from the independent variation of the factorization and renormalization scales, μR and μF, while the second one is associated with variations in the PDF and αs, following the PDF4LHC prescription with the MSTW2008 68 % confidence level (CL) NNLO, CT10 NNLO, and NNPDF2.3 5f FFN PDF sets (see Refs. [43, 44] and references therein and Refs. [45–47]).
A number of additional pp simulated hadronic interactions (“pileup”) are added to each simulated event to reproduce the multiple interactions in each bunch crossing from the luminosity conditions in the real data taking. Correction factors for detector effects (described in Sects. 4 and 6) are applied, when needed, to improve the description of the data by the simulation.
The event selection is based on the decay topology of the tt̄ events, where each top quark decays into a W boson and a b quark. Only the cases in which both W bosons decayed to a charged lepton and a neutrino are considered. These signatures imply the presence of isolated leptons, missing transverse momentum owing to the neutrinos from W boson decays, and highly energetic jets. The heavy-quark content of the jets is identified through b tagging techniques. The same requirements are applied to select the events for the different measurements, with the exception of the requirements on the b jets, which have been optimized independently for the tt̄+jets and tt̄b(b̄) cases. The event reconstruction and selection are described in the following.
Events are reconstructed using a particle-flow (PF) algorithm, in which signals from all subdetectors are combined [48, 49]. Charged particles are required to originate from the primary collision vertex, defined as the vertex with the highest sum of pT² of all reconstructed tracks associated with it. Therefore, charged-hadron candidates from pileup events, i.e. originating from additional pp interactions within the same bunch crossing, are removed before jet clustering on an event-by-event basis. Subsequently, the remaining neutral-particle component from pileup events is accounted for through jet energy corrections.
Muon candidates are reconstructed from tracks that can be linked between the silicon tracker and the muon system. The muons are required to have pT > 20 GeV, be within |η| < 2.4, and have a relative isolation Irel < 0.15. The parameter Irel is defined as the sum of the pT of all neutral and charged reconstructed PF candidates, except the muon itself, inside a cone of ΔR = √((Δη)² + (Δϕ)²) < 0.3 around the muon direction, divided by the muon pT, where Δη and Δϕ are the difference in pseudorapidity and azimuthal angle between the directions of the candidate and the muon, respectively. Electron candidates are identified by combining information from charged-track trajectories and energy deposition measurements in the ECAL, and are required to be within |η| < 2.4, have a transverse energy of at least 20 GeV, and fulfill Irel < 0.15 inside a cone of ΔR < 0.3. Electrons from identified photon conversions are rejected. The lepton identification and isolation efficiencies are determined via a tag-and-probe method using Z boson events.
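The relative isolation can be sketched as follows. This is a simplified illustration, not the CMS reconstruction software: the dictionary-based candidate format and the function names are assumptions, the cone size defaults to the ΔR < 0.3 quoted for electrons, and the candidate list is assumed to exclude the lepton itself.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance sqrt(deta^2 + dphi^2), with dphi wrapped into [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def relative_isolation(lepton, candidates, cone=0.3):
    """Sum of candidate pT inside the cone around the lepton, divided by the lepton pT."""
    in_cone = sum(
        c["pt"]
        for c in candidates
        if delta_r(lepton["eta"], lepton["phi"], c["eta"], c["phi"]) < cone
    )
    return in_cone / lepton["pt"]


# Hypothetical muon with two nearby PF candidates and one far away.
muon = {"pt": 40.0, "eta": 0.0, "phi": 0.0}
pf_candidates = [
    {"pt": 2.0, "eta": 0.1, "phi": 0.1},
    {"pt": 3.0, "eta": 0.0, "phi": 2.0 * math.pi - 0.1},  # close by after phi wrapping
    {"pt": 10.0, "eta": 1.0, "phi": 1.0},  # outside the cone
]
i_rel = relative_isolation(muon, pf_candidates)
is_isolated = i_rel < 0.15
```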
Jets are reconstructed by clustering the PF candidates, using the anti-kT clustering algorithm [54, 55] with a distance parameter of 0.5. The jet momentum is determined as the vectorial sum of all particle momenta in the jet, and is found in the simulation to be within 5 to 10 % of the true momentum over the entire pT range and detector acceptance. Jet energy corrections are derived from the simulation, and are confirmed with in situ measurements with the energy balance of dijet and photon+jet events. The jet energy resolution amounts typically to 15 % at 10 GeV and 8 % at 100 GeV. Muons and electrons passing less stringent requirements compared to the ones mentioned above are identified and excluded from the clustering process. Jets are selected in the interval |η| < 2.4 and with pT > 20 GeV. Additionally, the jets identified as part of the decay products of the tt̄ system (cf. Sect. 5) must fulfill pT > 30 GeV. Jets originating from the hadronization of b quarks are identified using a combined secondary vertex algorithm (CSV), which provides a b tagging discriminant by combining identified secondary vertices and track-based lifetime information.
The missing transverse energy (E̸T) is defined as the magnitude of the projection on the plane perpendicular to the beams of the negative vector sum of the momenta of all reconstructed particles in an event. To mitigate the effect of contributions from pileup on the E̸T resolution, we use a multivariate correction where the measured momentum is separated into components that originate from the primary and the other collision vertices. This correction improves the E̸T resolution by ≈ 5 %.
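The basic definition of the missing transverse energy, without the multivariate pileup correction, can be written as a short sketch. The particle format (transverse momentum and azimuthal angle per reconstructed particle) and the function name are assumptions made for illustration.

```python
import math


def missing_transverse_energy(particles):
    """Magnitude of the negative vector sum of the particles' transverse momenta."""
    px = -sum(p["pt"] * math.cos(p["phi"]) for p in particles)
    py = -sum(p["pt"] * math.sin(p["phi"]) for p in particles)
    return math.hypot(px, py)


# Hypothetical event: two particles at right angles in the transverse plane.
met = missing_transverse_energy(
    [{"pt": 30.0, "phi": 0.0}, {"pt": 40.0, "phi": math.pi / 2.0}]
)
```

A perfectly balanced event, such as two back-to-back particles of equal pT, gives a value compatible with zero.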
Events are triggered by requiring combinations of two leptons (ℓ = e or μ), where one fulfills a pT threshold of 17 GeV and the other of 8 GeV, irrespective of the flavour of the leptons. The dilepton trigger efficiencies are measured using samples selected with triggers that require a minimum E̸T or number of jets in the event, and are only weakly correlated to the dilepton triggers used in the analysis.
Events are selected if there are at least two isolated leptons of opposite charge. Events with a lepton pair invariant mass less than 20 GeV are removed to suppress events from heavy-flavour resonance decays, QCD multijet, and DY production. In the μμ and ee channels, the dilepton invariant mass is required to be outside a Z boson mass window of 91 ± 15 GeV, and E̸T is required to be larger than 40 GeV.
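The dilepton requirements above can be summarized in a minimal sketch. The function signature and channel labels are hypothetical, and the treatment of values exactly at the window edges is an assumption, since the text does not specify it.

```python
def passes_dilepton_selection(channel, charge1, charge2, m_ll, met):
    """Apply the dilepton requirements to one lepton pair.

    channel: "ee", "mumu", or "emu" (hypothetical labels); m_ll and met in GeV.
    """
    if charge1 * charge2 >= 0:
        return False  # leptons must have opposite charge
    if m_ll < 20.0:
        return False  # low-mass veto
    if channel in ("ee", "mumu"):
        if abs(m_ll - 91.0) <= 15.0:
            return False  # Z boson mass window veto (edges treated as inside)
        if met <= 40.0:
            return False  # missing transverse energy requirement
    return True


# Hypothetical lepton pairs.
emu_ok = passes_dilepton_selection("emu", +1, -1, 50.0, 10.0)
ee_on_z = passes_dilepton_selection("ee", +1, -1, 90.0, 100.0)
mumu_low_met = passes_dilepton_selection("mumu", -1, +1, 120.0, 30.0)
```

In this sketch the Z veto and the missing transverse energy requirement apply only to the same-flavour channels, as stated in the text.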
For the tt̄+jets selection, a minimum of two jets is required, of which at least one must be tagged as a b jet. A loose CSV discriminator value is chosen such that the efficiency for tagging jets from b (c) quarks is ≈ 85 % (40 %), while the probability of tagging jets originating from light quarks (u, d, or s) or gluons is around 10 %. Efficiency corrections, depending on jet pT and η, are applied to account for differences in the performance of the b tagging algorithm between data and simulation.
For the tt̄b(b̄) selection, at least three b-tagged jets are required (without further requirements on the minimum number of jets). In this case, a tighter discriminator value is chosen to increase the purity of the sample. The efficiency of this working point is approximately 70 % (20 %) for jets originating from a b (c) quark, while the misidentification rate for light-quark and gluon jets is around 1 %. The shape of the CSV discriminant distribution in simulation is corrected to better describe the efficiency observed in the data. This correction is derived separately for light-flavour and b jets from a tag-and-probe approach using control samples enriched in events with a Z boson and exactly two jets, and tt̄ events in the eμ channel with no additional jets.
To study additional jet activity in the data, the identification of jets arising from the decay of the tt̄ system is crucial. In particular, we need to identify correctly the two b jets from the top quark decays in events with more than two b jets. This is achieved by following two independent but complementary approaches: a kinematic reconstruction and a multivariate analysis, optimized for the two cases under study, tt̄+jets and tt̄b(b̄), respectively. The purpose of the kinematic reconstruction is to completely reconstruct the tt̄ system based on E̸T and the information on identified jets and leptons, taking into account detector resolution effects. This method is optimized for the case where the b jets in the event only arise from the decay of the top quark pair. The multivariate approach is optimized for events with more b jets than just those from the tt̄ system. This method identifies the two jets that most likely originated from the top quark decays, and the additional b jets, but does not perform a full reconstruction of the tt̄ system. Both methods are described in the following sections.
The kinematic reconstruction method was developed and used for the first time in the analysis from Ref. . In this method the following constraints are imposed: E̸T is assumed to originate solely from the two neutrinos; the W boson invariant mass is fixed to 80.4 GeV; and the top quark and antiquark masses are fixed to a value of 172.5 GeV. Each pair of jets and lepton-jet combination fulfilling the selection criteria is considered in the kinematic reconstruction. Effects of detector resolution are accounted for by randomly smearing the measured energies and directions of the reconstructed lepton and b jet candidates by their resolutions. These are determined from the simulation of signal events by comparing the reconstructed b jets and leptons matched to the generated b quarks and leptons from top quark decays. For a given smearing, the solution of the equations for the neutrino momenta yielding the smallest invariant mass of the tt̄ system is chosen. For each solution, a weight is calculated based on the expected invariant mass spectrum of the lepton and b jet from the top quark decays at the parton level. The weights are summed over 100 randomly smeared reconstruction attempts, and the kinematics of the top quark and antiquark are calculated as a weighted average. Finally, the two jets and lepton-jet combinations that yield the maximum sum of weights are chosen for further analysis. Combinations with two b-tagged jets are chosen over those with a single b-tagged jet. The efficiency of the kinematic reconstruction, defined as the number of events with a solution divided by the total number of selected tt̄+jets events, is approximately 94 %. The efficiency in simulation is similar to the one in data for all jet multiplicities. Events with no valid solution for the neutrino momenta are excluded from further analysis. In events with additional jets, the algorithm correctly identifies the two jets coming from the tt̄ decay in about 70 % of the cases.
After the full event selection is applied, the dominant background in the eμ channel originates from other tt̄ decay channels and is estimated using simulation. This contribution corresponds mostly to leptonic τ decays, which are considered background in the tt̄+jets measurements. In the ee and μμ channels, the dominant background contribution arises from Z/γ*+jets production. The normalization of this background contribution is derived from data using the events rejected by the Z boson veto, scaled by the ratio of events failing and passing this selection, estimated from simulation. The remaining backgrounds, including the single top quark tW channel, W+jets, diboson, and QCD multijet events, are estimated from simulation for all the channels.
In Fig. 1, the multiplicity distributions of the selected jets per event are shown for different jet pT thresholds and compared to SM predictions. In this figure and the following ones, the tt̄ sample is simulated using MadGraph+pythia 6, where only tt̄ events with two leptons (e or μ) from the W boson decay are considered as signal. All other tt̄ events, specifically those originating from decays via τ leptons, which are the dominant contribution, are considered as background. In the following figures, “Electroweak” corresponds to DY, W+jets, and diboson processes, and “tt̄ bkg.” includes the tt̄+γ/W/Z events. The data are well described by the simulation, both for the low jet pT threshold of 30 GeV and the higher thresholds of 60 and 100 GeV. The hatched regions in Figs. 1, 2, and 3 correspond to the uncertainties affecting the shape of the simulated signal and background events (cf. Sect. 6), and are dominated by modelling uncertainties in the former.
Additional jets in the event are defined as those jets within the phase space described in the event selection (cf. Sect. 4) that are not identified by the kinematic reconstruction to be part of the tt̄ system. The η and pT distributions of the additional jets with the largest and second largest pT in the event (referred to as the leading and subleading additional jets in the following) are shown in Fig. 2. Three additional event variables are considered: the scalar sum of the pT of all additional jets, HT; the invariant mass of the leading and subleading additional jets, mjj; and their angular separation, ΔRjj = √((Δη)² + (Δϕ)²), where Δη and Δϕ are the pseudorapidity and azimuthal differences between the directions of the two jets. These distributions are shown in Fig. 3. The predictions from the simulation, also shown in the figures, describe the data within the uncertainties.
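The three event variables can be computed from the jet kinematics as in the sketch below. The jet representation (pT, η, ϕ, and an optional mass) and the function names are assumptions for illustration; jets are treated as massless unless a mass is given.

```python
import math


def four_vector(jet):
    """Return (E, px, py, pz) from pT, eta, phi and an optional mass (default 0)."""
    px = jet["pt"] * math.cos(jet["phi"])
    py = jet["pt"] * math.sin(jet["phi"])
    pz = jet["pt"] * math.sinh(jet["eta"])
    mass = jet.get("mass", 0.0)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return energy, px, py, pz


def scalar_ht(jets):
    """Scalar sum of the pT of all additional jets."""
    return sum(j["pt"] for j in jets)


def dijet_mass(jet1, jet2):
    """Invariant mass of the two-jet system."""
    e, px, py, pz = (a + b for a, b in zip(four_vector(jet1), four_vector(jet2)))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def delta_r(jet1, jet2):
    """Angular separation sqrt(deta^2 + dphi^2), with dphi wrapped into [-pi, pi)."""
    dphi = (jet1["phi"] - jet2["phi"] + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(jet1["eta"] - jet2["eta"], dphi)


# Hypothetical back-to-back central jets of 50 GeV each.
leading = {"pt": 50.0, "eta": 0.0, "phi": 0.0}
subleading = {"pt": 50.0, "eta": 0.0, "phi": math.pi}
ht = scalar_ht([leading, subleading])
m_jj = dijet_mass(leading, subleading)
dr_jj = delta_r(leading, subleading)
```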
The multivariate approach uses a boosted decision tree (BDT) to distinguish the b jets stemming from the tt̄ system from those arising from additional radiation for final states with more than two b jets. This method is optimized for tt̄bb̄ topologies in the dilepton final state of the tt̄ system. The BDT is set up using the TMVA package. To avoid any dependence on the kinematics of the additional jets, and especially on the invariant mass of the two additional jets, the method identifies the jets stemming from the tt̄ system by making use of properties of the tt̄ system that are expected to be mostly insensitive to the additional radiation. The variables combine information from the two final-state leptons, the jets, and E̸T. All possible pairs of reconstructed jets in an event are considered. For each pair, one jet is assigned to the b jet and the other to the b̄ jet. This assignment is needed to define the variables used in the BDT and is based on the measurement of the charge of each jet, which is calculated from the charge and the momenta of the PF constituents used in the jet clustering. The jet in the pair with the largest charge is assigned to the b̄, while the other jet is assigned to the b. The efficiency of this jet charge pairing is defined as the fraction of events where the assigned b and b̄ are correctly matched to the corresponding generated b and b̄ jets, and amounts to 68 %.
A total of twelve variables are included in the BDT. Some examples of the variables used are: the sum and difference of the invariant mass of the bℓ⁺ and b̄ℓ⁻ systems, m(bℓ⁺) ± m(b̄ℓ⁻); the absolute difference in the azimuthal angle between them, |Δϕ(bℓ⁺, b̄ℓ⁻)|; the pT of the bℓ⁺ and b̄ℓ⁻ systems, pT(bℓ⁺) and pT(b̄ℓ⁻); and the difference between the invariant mass of the two b jets and two leptons and the invariant mass of the bb̄ pair, m(bb̄ℓ⁺ℓ⁻) − m(bb̄). The complete list of variables can be found in Appendix A. The main challenge with this method is the large number of possible jet assignments, given four genuine b jets and potential extra jets from additional radiation in each event. The basic methodology is to use the BDT discriminant value of each dijet combination as a measure of the probability that the combination stems from the tt̄ system. The jets from the tt̄ system are then identified as the pair with the highest BDT discriminant. From the remaining jets, those b-tagged jets with the highest pT are selected as being the leading additional ones.
The BDT training is performed on a large and statistically independent sample of simulated tt̄H events with the Higgs boson mass varied over the range 110–140 GeV. The tt̄bb̄ events are not included in the training to avoid the risk of overtraining owing to the limited number of events in the available simulated samples. The simulated tt̄H sample is suited for this purpose since the four b jets from the decay of the tt̄ system and the Higgs boson have similar kinematic distributions. Since it is significantly harder to identify the jets from the tt̄ system in tt̄H events than in tt̄bb̄ events, where the additional b jets arise from initial- or final-state radiation, a good BDT performance with tt̄H events implies also a good identification in tt̄bb̄ events. The distributions of the BDT discriminant in data and simulation are shown in Fig. 4 for all dijet combinations in an event, and for the combination with the highest weight that is assigned to the tt̄ system. The subset “Minor bkg.” includes all non-tt̄ processes and tt̄+Z/W/γ events. There is good agreement between the data and simulation distributions within the statistical uncertainties.
The fraction of simulated events with correct assignments for the additional b jets in tt̄H events, relative to the total number of events where those jets are selected and matched to the corresponding generator jets, is approximately 34 %. In tt̄bb̄ events, this fraction is about 40 %. This efficiency is high enough to allow the measurement of the tt̄bb̄ cross section as a function of the kinematic variables of the additional b jets (the probability of selecting the correct assignments by choosing random combinations of jets is 17 % in events with four jets and 10 % in events with five jets). The relative increase in efficiency with respect to the use of the kinematic reconstruction for tt̄bb̄ is about 15 %. Additionally, the BDT approach improves the correlation between the generated and reconstructed variables, especially for the distribution of the invariant mass of the two leading additional b jets, mbb, and their angular separation, ΔRbb = √((Δη)² + (Δϕ)²), where Δη and Δϕ are the pseudorapidity and azimuthal differences between the directions of the two b jets.
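The quoted random-assignment probabilities are consistent with picking one unordered pair of jets uniformly at random out of n jets, which gives 1/C(n, 2). The sketch below checks this arithmetic; the interpretation is an assumption inferred from the quoted numbers, and the function name is hypothetical.

```python
from math import comb


def random_pair_probability(n_jets):
    """Probability that a uniformly random unordered jet pair is the correct one."""
    return 1.0 / comb(n_jets, 2)


p_four = random_pair_probability(4)  # 1/6, about 17 %
p_five = random_pair_probability(5)  # 1/10, i.e. 10 %
```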
The expected fraction of events with additional b jets is not properly modelled in the simulation, in agreement with the observation of a previous CMS measurement. This discrepancy between the MadGraph+pythia simulation and data can be seen in the b jet multiplicity distribution, as shown in Fig. 5.
To improve the description of the data by the simulation, a template fit to the b-tagged jet multiplicity distribution is performed using three different templates obtained from simulation. One template corresponds to the tt̄b and tt̄bb̄ processes, defined at the generator level as the events where one or two additional b jets are generated within the acceptance requirements, pT > 20 GeV and |η| < 2.4 (referred to as “tt̄+HF”). The tt̄bb̄ and tt̄b processes are combined into a single template because they only differ by the kinematic properties of the second additional b jet. Details about the definition of the b jets and the acceptance are given in Sect. 7. The second template includes the background contribution coming from tt̄cc̄ and tt̄+light-jets events (referred to as “tt̄ other”), where tt̄cc̄ events are defined as those that have at least one c jet within the acceptance and no additional b jets. The tt̄cc̄ contribution is not large enough to be constrained by data, therefore it is combined with the tt̄+light-jets process in a single template. The third template contains the remaining background processes, including tt̄2b, which corresponds to events with two additional b hadrons that are close enough in direction to produce a single b jet. This process, produced by collinear g → bb̄ splitting, is treated separately owing to the large theoretical uncertainty in its cross section and insufficient statistical precision to constrain it with data. The normalizations of the first two templates are free parameters in the fit. The third is fixed to the corresponding cross section described in Sect. 3, except for the cross section for the tt̄2b process, which is corrected by a factor of . The normalization factors obtained for the template fit correspond to 1.66 ± 0.43 (tt̄+HF) and 1.00 ± 0.01 (tt̄ other). Details about the uncertainties in those factors are presented in Sect. 6.1.1. The improved description of the b jet multiplicity can be seen in Fig. 5 (right).
Figure 6 (top) shows the pT and |η| distributions of the leading additional b jet, measured in events with at least three b-tagged jets (using the tighter discriminator value described in Sect. 4), after the full selection and including all corrections. The distributions of the pT and |η| of the second additional b jet in events with exactly four b-tagged jets, ΔRbb, and mbb are also presented. The dominant contribution arises from the process. The decays into τ leptons decaying leptonically are included as signal to increase the number of and events both in data and simulation. It has been checked that the distributions of the variables relevant for this analysis do not differ between the leptons directly produced from W boson decays and the leptons from τ decays within the statistical uncertainties in the selected and events. In general, the variables presented are well described by the simulation, after correcting for the heavy-flavour content measured in data, although the simulation tends to predict smaller values of ΔRbb than the data. After the full selection, the dominant background contribution arises from dilepton events with additional light-quark, gluon, and c jets, corresponding to about 50 and 20 % of the total expected yields for the and cases, respectively. Smaller background contributions come from single top quark production, in association with W or Z bosons, and events in the lepton+jets decay channels. The contribution from is also small, amounting to 0.9 and 3 % of the total expected events for the and distributions. The contribution from background sources other than top quark production processes such as DY, diboson, or QCD multijet is negligible.
Systematic uncertainties arising from detector effects and from theoretical assumptions are considered. Each systematic uncertainty is determined individually in each bin of the measurement by varying the corresponding efficiency, resolution, or model parameter within its uncertainty, in a way similar to the previous CMS measurement of the differential cross sections . For each variation, the measured differential cross section is recalculated and the difference with respect to the nominal result is taken as the systematic uncertainty. The overall uncertainty in the measurement is then derived by adding all contributions in quadrature, assuming the sources of systematic uncertainty to be fully uncorrelated.
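The per-bin combination described above can be illustrated with a short sketch. The source names and numerical values are hypothetical placeholders, not results of the measurement.

```python
import math


def total_uncertainty(per_source):
    """Combine per-bin uncertainties from uncorrelated sources in quadrature.

    per_source maps a source name to a list of relative uncertainties,
    one entry per bin of the measurement.
    """
    n_bins = len(next(iter(per_source.values())))
    return [
        math.sqrt(sum(values[i] ** 2 for values in per_source.values()))
        for i in range(n_bins)
    ]


# Hypothetical relative uncertainties in two bins (not taken from the paper).
example = {"JES": [0.03, 0.04], "scale": [0.04, 0.03]}
totals = total_uncertainty(example)
```

In this example both bins end up with a total relative uncertainty of 5 %, since 3 % and 4 % add in quadrature to 5 %.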
The experimental sources of systematic uncertainty considered are the jet energy scale (JES), jet energy resolution (JER), background normalization, lepton trigger and identification efficiencies, b tagging efficiency, integrated luminosity, pileup modelling, and kinematic reconstruction efficiency.
The experimental uncertainty from the JES is determined by varying the energy scale of the reconstructed jets as a function of their pT and η by its uncertainty . The uncertainty from the JER is estimated by varying the simulated JER by its η-dependent uncertainty .
The uncertainty from the normalization of the backgrounds that are taken from simulation is determined by varying the cross section used to normalize the sample, see Sect. 3, by ±30 %. This variation takes into account the uncertainty in the predicted cross section and all other sources of systematic uncertainty [5, 8, 66]. In the case of the tW background, the variation of ±30 % covers the theoretical uncertainty in the absolute rate, including uncertainties owing to the PDFs. The contribution from the DY process, as determined from data, is varied in the normalization by ±30 % [1, 63].
The trigger and lepton identification efficiencies in simulation are corrected by lepton pT and η multiplicative data-to-simulation scale factors. The systematic uncertainties are estimated by varying the factors by their uncertainties, which are in the range 1–2 %.
For the +jets measurements, the b tagging efficiency in simulation is also corrected by scale factors depending on the pT and η of the jet. The shape uncertainty in the b tagging efficiency is then determined by taking the maximum change in the shape of the pT and |η| distributions of the b jet, obtained by changing the scale factors. This is achieved by dividing the b jet distributions in pT and |η| into two bins at the median of the respective distributions. The b tagging scale factors for b jets in the first bin are scaled up by half the uncertainties quoted in Ref. , while those in the second bin are scaled down, and vice versa, so that a maximum variation is assumed and the difference between the scale factors in the two bins reflects the full uncertainty. The changes are made separately in the pT and |η| distributions, and independently for heavy-flavour (b and c) and light-flavour (s, u, d, and gluon) jets, assuming that they are all uncorrelated. A normalization uncertainty is obtained by varying the scale factors up and down by half the uncertainties. The total uncertainty is obtained by summing in quadrature the independent variations.
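The median-split shape variation can be sketched as follows for the pT case. This is only an illustration under stated assumptions: the jet representation (dictionaries with the keys `pt`, `sf`, and `sf_unc`) and all numbers are hypothetical, and the same logic would apply separately to |η|.

```python
import statistics


def shape_varied_scale_factors(jets, direction=+1):
    """Return b tagging scale factors shifted in opposite directions
    on the two sides of the median jet pT.

    Jets below the median are shifted by +direction * sf_unc / 2 and the
    others by -direction * sf_unc / 2, so that the difference between the
    two halves corresponds to the full quoted uncertainty.
    """
    median_pt = statistics.median(j["pt"] for j in jets)
    varied = []
    for j in jets:
        sign = direction if j["pt"] < median_pt else -direction
        varied.append(j["sf"] + sign * 0.5 * j["sf_unc"])
    return varied


# Hypothetical jets (values are placeholders, not from the paper).
jets = [
    {"pt": 25.0, "sf": 0.95, "sf_unc": 0.04},
    {"pt": 40.0, "sf": 0.96, "sf_unc": 0.04},
    {"pt": 80.0, "sf": 0.97, "sf_unc": 0.06},
    {"pt": 150.0, "sf": 0.98, "sf_unc": 0.06},
]
up_down = shape_varied_scale_factors(jets, direction=+1)
down_up = shape_varied_scale_factors(jets, direction=-1)
```

The shape uncertainty would then be taken from whichever of the two variations changes the measured distribution the most.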
The uncertainty in the integrated luminosity is 2.6 % . The effect of the uncertainty in the level of pileup is estimated by varying the inelastic pp cross section in simulation by ±5 %.
The uncertainty coming from the kinematic reconstruction method is determined from the uncertainty in the correction factor applied to account for the small difference in efficiency between the simulation and data, defined as the ratio between the events with a solution and the total number of selected events.
In the () measurements, an additional uncertainty associated with the template fit to the b-tagged jet multiplicity distribution is considered. Since the input templates are known to finite precision, both the statistical and systematic uncertainties in the templates are taken into account. The considered systematic uncertainties that affect the shapes of the templates are those of the JES, the CSV discriminant scale factors following the method described in , the cross section of the process, which is varied by ±50 % , and the uncertainty in the cross section. This is taken as the maximum between the largest uncertainty from the measurement described in Ref.  and the difference between the corrected cross section and the prediction by the nominal MadGraph simulation used in this analysis. This results in a variation of the cross section of about ±40 %. This uncertainty is included as a systematic uncertainty in the shape of the background template.
The impact of theoretical assumptions on the measurement is determined by repeating the analysis, replacing the standard MadGraph signal simulation by alternative simulation samples. The uncertainty in the modelling of the hard-production process is assessed by varying the common renormalization and factorization scale in the MadGraph signal samples up and down by a factor of two with respect to its nominal value of the Q in the event (cf. Sect. 3). Furthermore, the effect of additional jet production in MadGraph is studied by varying up and down by a factor of two the threshold between jet production at the matrix element level and via parton showering. The uncertainties from ambiguities in modelling colour reconnection (CR) effects are estimated by comparing simulations of an underlying-event (UE) tune including colour reconnection to a tune without it (Perugia 2011 and Perugia 2011 noCR tunes, described in Ref. ). The modelling of the UE is evaluated by comparing two different Perugia 11 (P11) pythia tunes, mpiHi and TeV, to the standard P11 tune. The dependency of the measurement on the top quark mass is obtained using dedicated samples in which the mass is varied by ±1 GeV with respect to the default value used in the simulation. The uncertainty from parton shower modelling is determined by comparing two samples simulated with powheg and mc@nlo, using either pythia or herwig for the simulation of the parton shower, underlying event, and hadronization. The effect of the uncertainty in the PDFs on the measurement is assessed by reweighting the sample of simulated signal events according to the 52 CT10 error PDF sets, at the 90 % CL .
Since the total uncertainty in the and production cross sections is largely dominated by the statistical uncertainty in the data, a simpler approach than for the +jets measurements is chosen to conservatively estimate the systematic uncertainties: instead of repeating the measurement, the uncertainty from each source is taken as the difference between the nominal MadGraph +pythia sample and the dedicated simulated sample at generator level. In the case of the uncertainty coming from the renormalization and factorization scales, the uncertainty estimated in the previous inclusive cross section measurement  is assigned.
Typical values of the systematic uncertainties in the absolute differential cross sections are summarized in Table 1 for illustrative purposes. They are the median values of the distribution of uncertainties over all bins of the measured variables. Details on the impact of the different uncertainties in the results are given in Sects. 8–11.
In general, for the +jets case, the dominant systematic uncertainties arise from the uncertainty in the JES, as well as from model uncertainties such as the renormalization, factorization, and jet-parton matching scales and the hadronization uncertainties. For the and cross sections, the total uncertainty, including all systematic uncertainties, is only about 10 % larger than the statistical uncertainty. The experimental uncertainties with an impact on the normalization of the expected number of signal events, such as lepton and trigger efficiencies, have a negligible effect on the final cross section determination, since the normalization of the different processes is effectively constrained by the template fit.
The absolute differential cross section is defined as:
\[ \frac{\mathrm{d}\sigma_{t\bar{t}}}{\mathrm{d}x_i} = \frac{\sum_j A^{-1}_{ij}\,\bigl(N^{j}_{\text{data}} - N^{j}_{\text{bkg}}\bigr)}{\Delta^{i}_{x}\,\mathcal{L}} \qquad (1) \]
where j represents the bin index of the reconstructed variable x, i is the index of the corresponding generator-level bin, $N^{j}_{\text{data}}$ is the number of data events in bin j, $N^{j}_{\text{bkg}}$ is the number of estimated background events, ℒ is the integrated luminosity, and $\Delta^{i}_{x}$ is the bin width. Effects from detector efficiency and resolution in each bin i of the measurement are corrected by the use of a regularized inversion of the response matrix (symbolized by $A^{-1}_{ij}$) described in this section.
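The ingredients listed above combine as in the following sketch. It assumes that an inverse response matrix is already available as a nested list; the regularization itself is not reproduced here, and the function name and all numerical inputs are hypothetical.

```python
def absolute_cross_section(inv_response, n_data, n_bkg, bin_widths, lumi):
    """Evaluate sum_j Ainv[i][j] * (N_data[j] - N_bkg[j]) / (width[i] * lumi)
    for every generator-level bin i."""
    signal = [d - b for d, b in zip(n_data, n_bkg)]
    result = []
    for i, row in enumerate(inv_response):
        unfolded = sum(a * s for a, s in zip(row, signal))
        result.append(unfolded / (bin_widths[i] * lumi))
    return result


# Hypothetical inputs: an identity matrix stands for "no migrations".
inv_response = [[1.0, 0.0], [0.0, 1.0]]
xsec = absolute_cross_section(
    inv_response,
    n_data=[1200.0, 300.0],
    n_bkg=[200.0, 100.0],
    bin_widths=[10.0, 20.0],
    lumi=19.7,  # integrated luminosity in fb^-1
)
```

With the luminosity given in fb^-1, the result is in fb per unit of x.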
For the measurements of +jets, the estimated number of background events from processes other than production () is subtracted from the number of events in data (N). The contribution from other decay modes is taken into account by correcting the difference N– by the signal fraction, defined as the ratio of the number of selected signal events to the total number of selected events, as determined from simulation. This avoids the dependence on the inclusive cross section used for normalization. For the and production cross sections, where the different contributions are fitted to the data, the expected contribution from all background sources is directly subtracted from the number of data events.
The normalized differential cross section is derived by dividing the absolute result, Eq. (1), by the total cross section, obtained by integrating over all bins for each observable. Because of the normalization, the systematic uncertainties that are correlated across all bins of the measurement, e.g. the uncertainty in the integrated luminosity, cancel out.
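The cancellation of fully correlated uncertainties can be made concrete with a small sketch. The function name and inputs are hypothetical; the 2.6 % shift mimics a luminosity variation applied to all bins.

```python
def normalize(dsigma_dx, bin_widths):
    """Divide a differential cross section by its integral over all bins."""
    total = sum(x * w for x, w in zip(dsigma_dx, bin_widths))
    return [x / total for x in dsigma_dx], total


# Hypothetical absolute result in two bins.
nominal, sigma_nominal = normalize([2.0, 1.0], [1.0, 2.0])

# A shift common to all bins changes the total cross section
# but leaves the normalized result unchanged.
shifted, sigma_shifted = normalize([2.0 * 1.026, 1.0 * 1.026], [1.0, 2.0])
```

Here the total cross section moves by 2.6 %, while the normalized values in the two bins stay at 0.5 and 0.25.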
Effects from the trigger and reconstruction efficiencies and resolutions, leading to migrations of events across bin boundaries and statistical correlations among neighbouring bins, are corrected using a regularized unfolding method [8, 68, 69]. The response matrix Aij that corrects for migrations and efficiencies is calculated from simulated events using MadGraph. The generalized inverse of the response matrix is used to obtain the unfolded distribution from the measured distribution by applying a χ² technique. To avoid nonphysical fluctuations, a smoothing prescription (regularization) is applied. The regularization level is determined individually for each distribution using the averaged global correlation method . To keep the bin-to-bin migrations small, the bin widths in the measurements are chosen according to their purity and stability. The purity is the number of events generated and correctly reconstructed in a certain bin divided by the total number of reconstructed events in the same bin. The stability is the ratio of the number of events generated and reconstructed in a bin to the total number of events generated in that bin. The purity and stability of the bins are typically larger than 40–50 %, which ensures that the bin-to-bin migrations are small enough to perform the measurement. The performance of the unfolding procedure is tested for possible biases from the choice of the input model (the MadGraph simulation). It has been verified that, when the simulation is reweighted, the unfolding procedure based on the nominal response matrix reproduces the altered shapes within the statistical uncertainties. In addition, samples simulated with powheg and mc@nlo are employed to obtain the response matrices used in the unfolding for the determination of systematic uncertainties of the model (Sect. 6.2). Therefore, possible effects from the unfolding procedure are already taken into account in the systematic uncertainties.
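The purity and stability defined above can be computed from a migration matrix of simulated event counts, as in this sketch. It assumes that `migration[g][r]` holds the number of events generated in bin g and reconstructed in bin r, and that every generated event is reconstructed in some bin (selection inefficiencies are ignored); the matrix values are hypothetical.

```python
def purity_and_stability(migration):
    """Return per-bin purity and stability from a square migration matrix.

    purity[i]    = diagonal / all events reconstructed in bin i
    stability[i] = diagonal / all events generated in bin i
    """
    n = len(migration)
    purity, stability = [], []
    for i in range(n):
        reconstructed_in_i = sum(migration[g][i] for g in range(n))
        generated_in_i = sum(migration[i][r] for r in range(n))
        purity.append(migration[i][i] / reconstructed_in_i)
        stability.append(migration[i][i] / generated_in_i)
    return purity, stability


# Hypothetical two-bin migration matrix.
migration = [[80.0, 20.0], [10.0, 90.0]]
purity, stability = purity_and_stability(migration)
```

Bins whose purity or stability fell below the target would typically be widened or merged with a neighbour.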
The differential cross section is reported at the particle level, where objects are defined as follows. Leptons from W boson decays are defined after final-state radiation, and jets are defined at the particle level by applying the anti-kT clustering algorithm with a distance parameter of 0.5  to all stable particles, excluding the decay products from W boson decays into eν, μν, and leptonic τ final states. A jet is defined as a b jet if it has at least one b hadron associated with it. To perform the matching between b hadrons and jets, the b hadron momentum is scaled down to a negligible value and included in the jet clustering (so-called ghost matching ). The b jets from the decay are identified by matching the b hadrons to the corresponding original b quarks. The measurements are presented for two different phase-space regions, defined by the kinematic and geometric attributes of the decay products and the additional jets. The visible phase space is defined by the following kinematic requirements:
The full phase space is defined by requiring only the additional jets or b jets be within the above-mentioned kinematic range, without additional requirements on the decay products of the system, and including the correction for the corresponding dileptonic branching fraction, calculated using the leptonic branching fraction of the W boson .
In the following sections, the differential cross section measured as a function of the jet multiplicity in the visible phase space and the results as a function of the kinematic variables of the additional jets in the event, measured in the visible and the full phase-space regions, are discussed. The absolute cross sections are presented as figures and compared to different predictions. The full results are given in tables in Appendix B, along with the normalized differential cross sections measurements.
In Fig. 7, the absolute differential cross section is shown for three different jet pT thresholds: pT > 30, 60, and 100 GeV. The results are presented for a nominal top quark mass of 172.5 GeV. The lower part of each figure shows the ratio of the predictions from simulation to the data. The light and dark bands in the ratio indicate the statistical and total uncertainties in the data for each bin, which reflect the uncertainties for a ratio of 1.0. All predictions are normalized to the measured cross section in the range shown in the histogram, which is evaluated by integrating over all bins for each observable. The results are summarized in Table 2, together with the normalized cross sections. In general, the MadGraph generator interfaced with pythia 6, and powheg interfaced both with herwig 6 and pythia 6, provide reasonable descriptions of the data. The mc@nlo generator interfaced with herwig 6 does not generate sufficiently large jet multiplicities, especially for the lowest jet pT threshold. The sensitivity of MadGraph to scale variations is investigated through the comparison of different renormalization, factorization, and jet-parton matching scales with respect to the nominal MadGraph simulation. Variations in the jet-parton matching threshold do not yield large effects in the cross section, while the shape and normalization are more affected by the variations in the renormalization and factorization scales, which lead to a slightly worse description of the data up to high jet multiplicities, compared to their nominal values.
In Fig. 8, the results are compared to the predictions from MadGraph and MG5_aMC@NLO interfaced with pythia 8, and the powheg generator with the hdamp parameter set to mt = 172.5 GeV (labelled powheg (hdamp = mt) in the legend), interfaced with pythia 6, pythia 8, and herwig 6. The MadGraph and MG5_aMC@NLO simulations interfaced with pythia 8 predict larger jet multiplicities than measured in the data for all the considered pT thresholds. In general, no large deviations between data and the different powheg predictions are observed.
The total systematic uncertainty in the absolute differential cross section ranges from 6 to 30 %, while for the normalized cross section it varies from 2 % up to 20 % for the bins corresponding to the highest number of jets. In both cases, the dominant experimental systematic uncertainty arises from the JES, having a maximum value of 16 % for the absolute cross section bin with at least six jets and pT > 30 GeV. Typical systematic uncertainty values range between 0.5 and 8 %, while the uncertainty in the normalized cross section is 0.5–4 %. Regarding the modelling uncertainties, the most relevant ones are the uncertainty in the renormalization and factorization scales and the parton shower modelling, up to 6 and 10 %, respectively. The uncertainties from the assumed top quark mass used in the simulation and the jet-parton matching threshold amount to 1–2 %. Other modelling uncertainties such as PDF, CR, and UE have slightly smaller impact. These uncertainties cancel to a large extent in the normalized results, with typical contributions below 0.5 %. The total contribution from the integrated luminosity, lepton identification, and trigger efficiency, which only affect the normalization, is 3.5 %. This contribution is below 0.1 % for every bin in the normalized results. The uncertainty from the estimate of the background contribution is around 2 % for the absolute cross sections and typically below 0.5 % for the normalized results.
The absolute and normalized differential cross sections are measured as a function of the kinematic variables of the additional jets in the visible phase space defined in Sect. 7. The results are compared to predictions from four different generators: powheg interfaced with pythia 6 and herwig 6, mc@nlo +herwig 6, and MadGraph +pythia 6 with varied renormalization, factorization, and jet-parton matching scales. All predictions are normalized to the measured cross section over the range of the observable shown in the histogram in the corresponding figures.
The absolute differential cross sections as a function of the pT of the leading and subleading additional jets and HT, the scalar sum of the pT of all additional jets in the event, are shown in Fig. 9. The total uncertainties in the absolute cross sections range from 8–14 % for the leading additional jet pT and HT, and up to 40 % for the subleading additional jet pT, while the systematic uncertainties in the normalized cross sections for the bins with the larger number of events are about 3–4 %. The dominant sources of systematic uncertainties arise in both cases from model uncertainties, in particular the renormalization and factorization scales, and the parton shower modelling (up to 10 % for the absolute cross sections), and JES (3–6 % for the absolute cross sections). The typical contribution of other uncertainties such as the assumed top quark mass in the simulation, background contribution, etc., amounts to 1–3 % and 0.5–1.5 %, for the absolute and normalized cross sections, respectively.
In general, the simulation predictions describe the behaviour of the data for the leading additional jet momenta and HT, although some predictions, in particular powheg, favour a harder pT spectrum for the leading jet. The mc@nlo +herwig 6 prediction yields the largest discrepancies. The varied MadGraph samples provide similar descriptions of the shape of the data, except for MadGraph with the lower μR = μF scale, which worsens the agreement.
The results as a function of |η| are presented in Fig. 10. The typical total systematic uncertainties in the absolute cross sections vary from 6.5–19 % for the leading additional jet and about 11–20 % for the subleading one. The uncertainty in the normalized cross section ranges from 1.5–9 % and 5–14 %, respectively. The shape of the |η| distribution is well modelled by mc@nlo +herwig 6. The distributions from MadGraph and powheg yield a similar description of the data, being slightly more central than mc@nlo . Variations of the MadGraph parameters have little impact on these distributions.
The differential cross section is also measured as a function of the dijet angular separation ΔRjj and invariant mass mjj for the leading and subleading additional jets (Fig. 11). In general, all simulations provide a reasonable description of the distributions for both variables. All results are reported in Tables 3, 4 and 5 in Appendix B. Representative examples of the migration matrices are presented in Fig. 24 in Appendix C.
The absolute and normalized differential cross sections are also measured as a function of the kinematic variables of the additional jets and b jets in the event for the full phase space of the system to facilitate comparison with theoretical calculations. In this case, the phase space is defined only by the kinematic requirements on the additional jets.
Figures 12 and 13 show the absolute cross sections as a function of the pT and |η| of the leading and subleading additional jets and HT, while the results as a function of ΔRjj and mjj are presented in Fig. 14.
The total uncertainties are 8–12 % for the leading jet pT and HT, and range from 10 % at lower pT to 40 % in the tails of the distribution for the subleading jet pT. The uncertainties for |η| are 6–16 % and 10–30 % for the leading and subleading additional jets, respectively. The typical uncertainties in the cross section as a function of ΔRjj and mjj are on the order of 10–20 %. The uncertainties are dominated by the JES, scale uncertainties, and shower modelling.
The numerical values are given in Tables 6, 7 and 8 of Appendix B, together with the normalized results. In the latter, the uncertainties are on average 2–3 times smaller than for the absolute cross sections, owing to the cancellation of uncertainties such as the integrated luminosity, lepton identification, and trigger efficiency, as well as a large fraction of the JES and model uncertainties, as discussed in Sect. 8. The dominant systematic uncertainties are still the model uncertainties, although they are typically smaller than for the absolute cross sections.
The shapes of the distributions measured in the full and visible phase-space regions of the system are similar, while the absolute differential cross sections are a factor of 2.2 larger than those in the visible phase space of the system (excluding the factor due to the leptonic branching fraction correction (4.54 ± 0.10) % ).
Figure 15 shows the absolute differential cross sections in the visible phase space of the system and the additional b jets as a function of the pT and |η| of the leading and subleading additional b jets, and ΔRbb and mbb of the two b jets. The uncertainties in the measured cross sections as a function of the b jet kinematic variables are dominated by the statistical uncertainties, with values varying from 20–100 %. The results are quantified in Tables 9 and 10 in Appendix B, together with the normalized results. The corresponding migration matrices between the reconstructed and particle levels for the kinematic properties of the additional b jets are presented in Fig. 25 in Appendix C for illustration purposes.
The dominant systematic uncertainties are the b tagging efficiency and JES, up to 20 % and 15 %, respectively. Other uncertainties have typical values on the order of or below 5 %. The experimental sources of systematic uncertainties affecting only the normalization, which are constrained in the fit, have a negligible impact. The largest model uncertainty corresponds to that from the renormalization and factorization scales of 8 %. The effect of the assumed top quark mass and the PDF uncertainties have typical values of 1–2 %. On average, the inclusion of all the systematic uncertainties increases the total uncertainties by 10 %.
The measured distributions are compared with the MadGraph +pythia 6 prediction, normalized to the corresponding measured inclusive cross section in the same phase space. The measurements are also compared to the predictions from mc@nlo interfaced with herwig 6 and from powheg with pythia 6 and herwig 6. The normalization factors applied to the MadGraph and powheg predictions are found to be about 1.3 for results related to the leading additional b jet. The predictions from both generators underestimate the cross sections by a factor 1.8, in agreement with the results from Ref. . The normalization factors applied to mc@nlo are approximately 2 and 4 for the leading and subleading additional b jet quantities, respectively, reflecting the observation that the generator does not simulate sufficiently large jet multiplicities. All the predictions have slightly harder pT spectra for the leading additional b jet than the data, while they describe the behaviour of the |η| and mbb distributions within the current precision. The predictions favour smaller ΔRbb values than the measurement, although the differences are in general within two standard deviations of the total uncertainty.
The production cross sections are compared to the NLO calculation by PowHel +pythia 6 in Fig. 16. In the figure, the prediction is normalized to the absolute cross section given by the calculation of . The prediction describes well the shape of the different distributions, while the predicted absolute cross section is about 30 % lower than the measured one, but compatible within the uncertainties.
The absolute differential cross sections measured in the visible phase space of the additional b jets and the full phase space of the system are presented in Fig. 17 and given in Tables 11 and 12 of Appendix B. The results are corrected for acceptance and dileptonic branching fractions including τ leptonic decays (6.43 ± 0.14) % . The results are compared to the same predictions as in Fig. 15, which are scaled to the measured cross section, obtained by integrating all the bins of the corresponding distribution. The normalization factor applied to the simulations is similar to the previous one for the results in the visible phase space of the system. The description of the data by the simulations is similar as well. The total measured , as well as the agreement between the data and the simulation, is in agreement with the result obtained in Ref. . In the full phase space, the inclusive cross section at NLO given by PowHel +pythia 6 corresponds to (excluding the dileptonic branching fraction correction). The comparison of the differential cross section with the NLO calculation is presented in Fig. 18.
Differences between the kinematic properties of the additional jets and b jets are expected owing to the different production mechanisms  of both processes. The dominant production mechanism of is gluon-gluon (gg) scattering, while in the case of , the quark-gluon (qg) channel is equally relevant. The |η| distributions of the additional b jets seem to be more central than the corresponding distributions of the additional jets, see Figs. 10 and 13. This difference can be attributed mainly to the contribution of the production via the qg channel, which favours the emission of jets at larger |η|. The distributions of the differential cross section as a function of mbb peak at smaller invariant masses than those as a function of mjj, presented in Figs. 11 and 14, because of the larger contribution of the gg channel. Given the large uncertainties in the measurements, no statistically significant differences can be observed in the shape of the pT distributions of the additional b jets compared to the additional jets, shown in Figs. 9 and 12.
An alternative way to investigate the jet activity arising from quark and gluon radiation is to determine the fraction of events that do not contain additional jets above a given pT threshold [5, 12]. A threshold observable, referred to as the gap fraction, is defined as:
\[ f(p_{\mathrm{T}}^{j}) = \frac{N(p_{\mathrm{T}}^{j})}{N_{\text{total}}} \]
where $N_{\text{total}}$ is the total number of selected events and $N(p_{\mathrm{T}}^{j})$ is the number of events that do not contain at least j additional jets (apart from the two jets from the solution hypothesis) above a pT threshold, with j corresponding to one or two jets. The measurements are presented as a function of the pT of the leading and subleading additional jets, respectively.
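The counting behind the gap fraction can be sketched as follows. Each event is represented only by the list of pT values of its additional jets; this representation, the function name, and the numbers are hypothetical.

```python
def gap_fraction(jet_pts_per_event, threshold, j=1):
    """Fraction of events that do not contain at least j additional jets
    with pT above the threshold."""
    n_gap = 0
    for pts in jet_pts_per_event:
        ordered = sorted(pts, reverse=True)
        # The event is in the gap if its j-th jet is missing or too soft.
        if len(ordered) < j or ordered[j - 1] <= threshold:
            n_gap += 1
    return n_gap / len(jet_pts_per_event)


# Hypothetical events: pT (GeV) of the additional jets in each event.
events = [[], [35.0], [50.0, 28.0], [120.0, 60.0, 30.0]]
f_leading = gap_fraction(events, threshold=30.0, j=1)
f_subleading = gap_fraction(events, threshold=30.0, j=2)
```

Only the event without additional jets passes the veto on the leading jet, while three of the four events have no second jet above 30 GeV.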
A modified gap fraction can be defined as:
\[ f(H_{\mathrm{T}}) = \frac{N(H_{\mathrm{T}})}{N_{\text{total}}} \]
where $N(H_{\mathrm{T}})$ is the number of events in which the scalar sum of the pT of the additional jets (HT) is less than a certain threshold. In both cases, detector effects are unfolded using the MadGraph simulation to obtain the results at the particle level. The additional jets at the generator level are defined as all jets within the kinematic acceptance, excluding the two b jets originating from the b quarks from top quark decay (see Sect. 7). For each value of the pT and HT thresholds the gap fraction at the generator level is evaluated, along with the equivalent distributions after the detector simulation and analysis requirements. Given the high purity of the selected events, above 70 % for any bin for the leading additional jet pT and HT, and above 85 % for any bin for the subleading additional jets, a correction for detector effects is applied by following a simpler approach than the unfolding method used for other measurements presented here. The data are corrected to the particle level by applying the ratio of the generated distributions at particle level to the simulated ones at the reconstruction level, using the nominal MadGraph simulation.
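The HT-based gap fraction and the ratio correction to the particle level can be sketched together. As before, events are represented by lists of additional-jet pT values, and all names and numbers are hypothetical.

```python
def ht_gap_fraction(jet_pts_per_event, threshold):
    """Fraction of events whose scalar pT sum of additional jets (HT)
    is below the threshold."""
    n_gap = sum(1 for pts in jet_pts_per_event if sum(pts) < threshold)
    return n_gap / len(jet_pts_per_event)


def correct_to_particle_level(f_data, f_gen, f_reco):
    """Multiply the measured gap fraction at each threshold by the ratio of
    the simulated particle-level to reconstruction-level gap fractions."""
    return [d * g / r for d, g, r in zip(f_data, f_gen, f_reco)]


# Hypothetical events: pT (GeV) of the additional jets in each event.
events = [[], [35.0], [50.0, 28.0], [120.0, 60.0, 30.0]]
f_ht = ht_gap_fraction(events, threshold=40.0)

# Hypothetical gap fractions at two thresholds.
corrected = correct_to_particle_level(
    f_data=[0.70, 0.90], f_gen=[0.72, 0.91], f_reco=[0.71, 0.90]
)
```

This ratio correction is adequate only when migrations are small, which is the condition the quoted purities are meant to establish.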
The measured gap fraction distributions are compared to predictions from MadGraph interfaced with pythia 6, powheg interfaced with pythia 6 and herwig 6, mc@nlo interfaced with herwig 6, and to the MadGraph predictions with varied renormalization, factorization, and jet-parton matching scales. Figure 19 displays the gap fraction distribution as a function of the pT of the leading and subleading additional jets, and HT. The lower part of the figures shows the ratio of the predictions to the data. The light band indicates the total uncertainty in the data in each bin. The threshold, defined at the value where the data point is shown, is varied from 25 GeV (lower value compared to previous measurements ) to 190 GeV. In general, MadGraph interfaced with pythia 6 agrees with the data distributions of the three variables, while powheg interfaced with pythia 6 and herwig 6 also provides a good description of the data, though it tends to predict lower gap fractions than those measured. The mc@nlo generator interfaced with herwig 6 describes the data well as a function of the leading additional jet pT. However, it predicts higher values of the gap fraction as a function of the subleading jet pT and HT. Modifying the renormalization and factorization scales in MadGraph worsens the agreement with data, while variations of the jet-parton matching threshold give predictions similar to those of the nominal MadGraph simulation, in agreement with the results shown before.
The results are also compared in Fig. 20 with the recently available simulations, described in Sect. 3, matched to different versions of the parton showering models. The MadGraph and MG5_aMC@NLO generators interfaced with pythia 8 predict up to 10 % lower values of the gap fraction for all the variables, which reflects the fact that those simulations generate larger jet multiplicities, as discussed in Sect. 8. Within the uncertainties, the predictions of the powheg +pythia 8 simulation agree well with data, while the powheg generator (with HDAMP = mt) interfaced with pythia 6 and herwig 6 tends to overestimate and underestimate the measured values, respectively.
The gap fraction is also measured in different |η| regions of the additional jets, with the results presented in Figs. 21, 22, and 23 as a function of the leading additional jet pT, subleading additional jet pT, and HT, respectively. In general, the gap fraction values predicted by the simulations describe the data better in the higher |η| ranges. The values given by MadGraph and powheg interfaced with pythia 6 are slightly below the measured ones in the central region for the leading jet pT and HT, while mc@nlo +herwig 6 yields higher values of the gap fraction. In the case of the subleading jet pT, all predictions agree with the data within the uncertainties, except for mc@nlo +herwig 6 in the more central regions. Variations of the jet-parton matching threshold do not have a noticeable impact on the gap fraction, while MadGraph with the varied renormalization and factorization scales provides a poorer description of the data.
The total systematic uncertainty in the gap fraction distributions is about 5 % for low values of the threshold (pT or HT) and decreases to below 0.5 % for the highest values. The measurement of the gap fraction as a function of HT has larger uncertainties because of the impact of the lower-momentum jets, which have a significantly larger uncertainty, as discussed in Sect. 9. The uncertainty in JES is the dominant source of systematic uncertainty, corresponding to approximately 4 % for the smallest pT and HT values. Other sources with a smaller impact on the total uncertainty are the b tagging efficiency, JER, pileup, and the simulated sample used to correct the data to the particle level.
Measurements of the absolute and normalized differential top quark pair production cross sections have been presented using pp collisions at a centre-of-mass energy of 8 TeV, corresponding to an integrated luminosity of 19.7 fb-1, in the dilepton decay channel as a function of the number of jets in the event, for three different jet pT thresholds, and as a function of the kinematic variables of the leading and subleading additional jets. The results have been compared to the predictions from MadGraph interfaced with pythia 6, powheg interfaced with both pythia 6 and herwig 6, mc@nlo interfaced with herwig 6, and MadGraph samples with varied renormalization, factorization, and jet-parton matching scales. In general, all these generators are found to give a reasonable description of the data.
The MadGraph and powheg generators interfaced with pythia 6 describe the data well for all measured jet multiplicities, while mc@nlo interfaced with herwig 6 generates lower multiplicities than observed for the lower-pT thresholds. The prediction from MadGraph with varied renormalization and factorization scales does not provide an improved description of the data compared to the nominal simulation.
These results are also compared to the predictions from powheg with the hdamp parameter set to the top quark mass interfaced with pythia 6, pythia 8, and herwig 6, which provide a reasonable description of the data within the uncertainties, and the predictions from MadGraph and MG5_aMC@NLO interfaced with pythia 8, which generate higher jet multiplicities for all the pT thresholds.
The measured kinematic variables of the leading and subleading additional jets are consistent with the various predictions. The simulations also describe well the data distributions of the leading additional jet pT and HT, although they tend to predict higher pT values and more central values in η. MadGraph with varied parameters yields similar predictions, except for varying the renormalization and factorization scales, which tends to give higher HT values. The mc@nlo generator predicts lower yields than observed for the subleading additional jet pT.
The uncertainties in the measured tt̄bb̄ (tt̄b) absolute and normalized differential cross sections as a function of the b jet kinematic variables are dominated by the statistical uncertainties. In general, the predictions describe well the shape of the measured cross sections as a function of the variables studied, except for ΔRbb, where they favour smaller values than the measurement. The predictions underestimate the total cross section by approximately a factor of 2, in agreement with previous measurements. The calculation by PowHel describes well the shape of the distributions, while the predicted absolute cross section is about 30 % lower, but compatible with the measurements within the uncertainties.
The gap fraction has been measured as a function of the pT of the leading and subleading additional jets and HT of the additional jets in different η ranges. For a given threshold value, the gap fraction as a function of HT is lower than the gap fraction as a function of the pT of the leading additional jet, showing that the measurement is probing multiple quark and gluon emission. Within the uncertainties, all predictions describe the gap fraction well as a function of the momentum of the first additional jet, while mc@nlo interfaced with herwig fails to describe the gap fraction as a function of the subleading additional jet pT and HT. In general, MadGraph with decreased renormalization and factorization scales more poorly describes the observed gap fraction, while varying the jet-parton matching threshold provides a similar description of the data. The MadGraph and MG5_aMC@NLO generators interfaced with pythia 8 predict lower values than measured. The powheg simulation with HDAMP = mt interfaced with pythia 8 is consistent with the data, while the simulation interfaced with herwig 6 and pythia 6 tends to worsen the comparison with the measurement.
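The ordering of the two gap fractions follows directly from the definitions, as the short derivation below illustrates. The notation is assumed for this sketch: X is the threshold, p_T^{(i)} is the transverse momentum of the i-th additional jet, and N_total is the total number of selected events.

```latex
\begin{align}
H_\mathrm{T} &= \sum_{i} p_\mathrm{T}^{(i)} \;\ge\; p_\mathrm{T}^{(1)}, \\
\{H_\mathrm{T} < X\} &\subseteq \{p_\mathrm{T}^{(1)} < X\}
  \;\Rightarrow\; N(H_\mathrm{T} < X) \le N(p_\mathrm{T}^{(1)} < X), \\
f(H_\mathrm{T})\big|_{X} = \frac{N(H_\mathrm{T} < X)}{N_\mathrm{total}}
  &\le \frac{N(p_\mathrm{T}^{(1)} < X)}{N_\mathrm{total}} = f(p_\mathrm{T}^{(1)})\big|_{X}.
\end{align}
```

The two gap fractions would coincide if no event contained more than one additional jet, so a visible difference between them reflects events with multiple emissions.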
In general, the different measurements presented are in agreement with the SM predictions as formulated by the various event generators, within their uncertainties. The correct description of tt̄+jets production is important since it constitutes a major background in searches for new particles in several supersymmetric models and in tt̄H processes, where the Higgs boson decays into bb̄. The tt̄bb̄ (tt̄b) differential cross sections, measured here for the first time, also provide important information about the main irreducible background in the search for tt̄H.
We thank M. V. Garzelli for providing the theoretical predictions from PowHel +pythia 6. We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: the Austrian Federal Ministry of Science, Research and Economy and the Austrian Science Fund; the Belgian Fonds de la Recherche Scientifique, and Fonds voor Wetenschappelijk Onderzoek; the Brazilian Funding Agencies (CNPq, CAPES, FAPERJ, and FAPESP); the Bulgarian Ministry of Education and Science; CERN; the Chinese Academy of Sciences, Ministry of Science and Technology, and National Natural Science Foundation of China; the Colombian Funding Agency (COLCIENCIAS); the Croatian Ministry of Science, Education and Sport, and the Croatian Science Foundation; the Research Promotion Foundation, Cyprus; the Ministry of Education and Research, Estonian Research Council via IUT23-4 and IUT23-6 and European Regional Development Fund, Estonia; the Academy of Finland, Finnish Ministry of Education and Culture, and Helsinki Institute of Physics; the Institut National de Physique Nucléaire et de Physique des Particules/CNRS, and Commissariat à l’Énergie Atomique et aux Énergies Alternatives/CEA, France; the Bundesministerium für Bildung und Forschung, Deutsche Forschungsgemeinschaft, and Helmholtz-Gemeinschaft Deutscher Forschungszentren, Germany; the General Secretariat for Research and Technology, Greece; the National Scientific Research Foundation, and National Innovation Office, Hungary; the 
Department of Atomic Energy and the Department of Science and Technology, India; the Institute for Studies in Theoretical Physics and Mathematics, Iran; the Science Foundation, Ireland; the Istituto Nazionale di Fisica Nucleare, Italy; the Ministry of Science, ICT and Future Planning, and National Research Foundation (NRF), Republic of Korea; the Lithuanian Academy of Sciences; the Ministry of Education, and University of Malaya (Malaysia); the Mexican Funding Agencies (CINVESTAV, CONACYT, SEP, and UASLP-FAI); the Ministry of Business, Innovation and Employment, New Zealand; the Pakistan Atomic Energy Commission; the Ministry of Science and Higher Education and the National Science Centre, Poland; the Fundação para a Ciência e a Tecnologia, Portugal; JINR, Dubna; the Ministry of Education and Science of the Russian Federation, the Federal Agency of Atomic Energy of the Russian Federation, Russian Academy of Sciences, and the Russian Foundation for Basic Research; the Ministry of Education, Science and Technological Development of Serbia; the Secretaría de Estado de Investigación, Desarrollo e Innovación and Programa Consolider-Ingenio 2010, Spain; the Swiss Funding Agencies (ETH Board, ETH Zurich, PSI, SNF, UniZH, Canton Zurich, and SER); the Ministry of Science and Technology, Taipei; the Thailand Center of Excellence in Physics, the Institute for the Promotion of Teaching Science and Technology of Thailand, Special Task Force for Activating Research and the National Science and Technology Development Agency of Thailand; the Scientific and Technical Research Council of Turkey, and Turkish Atomic Energy Authority; the National Academy of Sciences of Ukraine, and State Fund for Fundamental Researches, Ukraine; the Science and Technology Facilities Council, UK; the US Department of Energy, and the US National Science Foundation. 
Individuals have received support from the Marie-Curie programme and the European Research Council and EPLANET (European Union); the Leventis Foundation; the A. P. Sloan Foundation; the Alexander von Humboldt Foundation; the Belgian Federal Science Policy Office; the Fonds pour la Formation à la Recherche dans l’Industrie et dans l’Agriculture (FRIA-Belgium); the Agentschap voor Innovatie door Wetenschap en Technologie (IWT-Belgium); the Ministry of Education, Youth and Sports (MEYS) of the Czech Republic; the Council of Science and Industrial Research, India; the HOMING PLUS programme of the Foundation for Polish Science, cofinanced from European Union, Regional Development Fund; the OPUS programme of the National Science Center (Poland); the Compagnia di San Paolo (Torino); the Consorzio per la Fisica (Trieste); MIUR project 20108T4XTM (Italy); the Thalis and Aristeia programmes cofinanced by EU-ESF and the Greek NSRF; the National Priorities Research Program by Qatar National Research Fund; the Rachadapisek Sompot Fund for Postdoctoral Fellowship, Chulalongkorn University (Thailand); and the Welch Foundation, contract C-1845.
The variables used for the BDT are listed below. The candidate b jet is denoted with the superscript b in the following equations, while the candidate anti-b jet is denoted as b̄. Combinations of particles that are treated as a system by adding their four-momentum vectors are denoted without a comma, e.g. bℓ+ represents the b jet and the antilepton system. The angular separation ΔR and the azimuthal angular difference Δϕ between the directions of two particles are designated using the two particle abbreviations in a superscript, separated by a comma.
One variable is the difference in the jet charges, crel, of the b and b̄ jets:
It is the only variable not directly related to the kinematical properties of the decay and the additional radiation. The values are by definition positive, as the jet with the highest charge is always assigned as the anti-b jet.
There are three angular variables:
Here, denotes the missing transverse momentum in an event. The angles are defined such that - π ≤ Δϕ ≤ π, and consequently the absolute values are within [0, π].
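The angle convention can be illustrated with a short sketch. The helper names are hypothetical, and ΔR is taken here to be the usual separation in the η–ϕ plane, ΔR = sqrt((Δη)² + (Δϕ)²), which is an assumption about the definition used for the BDT inputs.

```python
import math


def delta_phi(phi1, phi2):
    """Azimuthal difference phi1 - phi2 wrapped into [-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)  # Python's % maps this into [0, 2*pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi               # fold the upper half onto (-pi, 0)
    return dphi


def delta_r(eta1, phi1, eta2, phi2):
    """Angular separation in the eta-phi plane (assumed standard definition)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


# Two directions on either side of the phi = +-pi boundary are close in azimuth:
# the wrapped difference is small, and its absolute value lies in [0, pi].
abs_dphi = abs(delta_phi(3.0, -3.0))
```

Wrapping before taking the absolute value is what guarantees the [0, π] range; a plain subtraction of the two azimuthal angles could give values up to 2π.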
Two variables are the pT of the b jet (b̄ jet) and charged antilepton (lepton) systems:
The remaining variables are based on the invariant or transverse masses of several particle combinations:
For any pair of jets, the variable is the invariant mass of all the other selected jets recoiling against this pair, i.e. all selected jets except these two.
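A minimal sketch of such a recoil mass is given below, together with the pT of a combined system obtained by adding four-momenta. All names are hypothetical, jets are represented as (pT, η, ϕ, mass) tuples purely for illustration, and returning 0 when no other jets are selected is a convention assumed here, not one stated in the text.

```python
import math


def four_vector(pt, eta, phi, mass):
    """Convert (pT, eta, phi, mass) to Cartesian (E, px, py, pz)."""
    px = pt * math.cos(phi)
    py = pt * math.sin(phi)
    pz = pt * math.sinh(eta)
    energy = math.sqrt(px * px + py * py + pz * pz + mass * mass)
    return (energy, px, py, pz)


def add_vectors(vectors):
    """Component-wise sum of four-momenta, i.e. the combined system."""
    return tuple(sum(components) for components in zip(*vectors))


def invariant_mass(vec):
    energy, px, py, pz = vec
    m2 = energy * energy - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding errors


def system_pt(vec):
    """Transverse momentum of a combined system."""
    return math.hypot(vec[1], vec[2])


def recoil_mass(jets, i, j):
    """Invariant mass of all selected jets except the pair (i, j).

    Assumed convention: 0.0 if the event has no other selected jets.
    """
    others = [four_vector(*jet) for k, jet in enumerate(jets) if k not in (i, j)]
    if not others:
        return 0.0
    return invariant_mass(add_vectors(others))


# Toy event (values in GeV, illustrative only): jets 0 and 1 form the candidate
# pair, and jets 2 and 3 are two massless, back-to-back 50 GeV jets.
jets = [(80.0, 0.5, 1.0, 10.0), (60.0, -0.3, -2.0, 8.0),
        (50.0, 0.0, 0.0, 0.0), (50.0, 0.0, math.pi, 0.0)]
m_recoil = recoil_mass(jets, 0, 1)
```

In this toy event the recoiling system consists of two massless 50 GeV jets emitted back to back, so its invariant mass equals the sum of their energies.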