Searches are presented for direct production of top or bottom squark pairs in proton–proton collisions at the CERN LHC. Two searches, based on complementary techniques, are performed in all-jet final states that are characterized by a significant imbalance in transverse momentum. An additional search requires the presence of a charged lepton isolated from other activity in the event. The data were collected in 2015 at a centre-of-mass energy of 13 TeV with the CMS detector and correspond to an integrated luminosity of 2.3 fb$^{-1}$. No statistically significant excess of events is found beyond the expected contribution from standard model processes. Exclusion limits are set in the context of simplified models of top or bottom squark pair production. Models with top and bottom squark masses up to 830 and 890 GeV, respectively, are probed for light neutralinos. For models with top squark masses of 675 GeV, neutralino masses up to 260 GeV are excluded at 95% confidence level.
The standard model (SM) has been extremely successful at describing particle physics phenomena. Nevertheless, it suffers from shortcomings such as the hierarchy problem [1–6], the need for fine-tuned cancellations of large quantum corrections to keep the Higgs boson mass near the electroweak scale. Supersymmetry (SUSY), based on a symmetry between bosons and fermions, is an attractive extension of the SM. A key feature of SUSY is the existence of a superpartner for every SM particle with the same quantum numbers, except for spin, which differs by one half unit. In R-parity conserving SUSY models [7, 8], supersymmetric particles are created in pairs, and the lightest supersymmetric particle (LSP) is stable [9, 10] and considered to be a candidate for dark matter. Supersymmetry can potentially provide a “natural”, i.e. not fine-tuned, solution to the hierarchy problem through the cancellation of quadratic divergences in particle and sparticle loop corrections to the Higgs boson mass. In natural SUSY models, light top and bottom squarks with masses close to the electroweak scale are preferred.
This paper presents three complementary searches for direct production of a pair of top ($\tilde{t}_1$) or bottom ($\tilde{b}_1$) squarks, where the subscript denotes the less massive of the superpartners of the corresponding SM fermion’s chirality states. The first search targets top squark pair production in the all-jet final state, while the second focuses on the single-lepton final state. These two analyses were explicitly designed for complementarity, allowing for a combination of the results to enhance the sensitivity. The third search targets bottom squark pair production in the all-jet final state. The searches are performed using the data collected in proton–proton collisions at a centre-of-mass energy of 13 TeV with the CMS detector at the CERN LHC in 2015, corresponding to an integrated luminosity of 2.3 fb$^{-1}$. The results of similar searches were previously reported by the ATLAS and CMS collaborations using proton–proton collisions at 7 and 8 TeV [12–25] and by the CDF and D0 collaborations in proton–antiproton collisions at 1.96 TeV at the Fermilab Tevatron [26–30]. With the increase in LHC collision energy from 8 to 13 TeV, the cross section to produce signal events is enhanced by a factor of 8–12 for a top or bottom squark mass in the range 700–1000 GeV [31, 32]. Therefore, new territory can be explored even with the relatively small amount of data collected in 2015. The CMS and ATLAS collaborations have already provided first exclusion results for these models in the all-jet and single-lepton final states [33–36]. Unlike the more generic searches for new phenomena presented by the CMS collaboration in Refs. [33–35], the searches described in this paper directly target top and bottom squark production through the design of search regions that exploit the specific characteristics of these signal models. For instance, the top squark search in the all-jet final state uses a top quark tagging algorithm to identify boosted hadronically decaying top quarks originating from top squark decays.
The decay modes of top squarks depend on the sparticle mass spectrum. Figure 1 illustrates the top and bottom squark decay modes explored in this paper. The simplest top squark decay modes are $\tilde{t}_1 \to t^{(*)}\tilde{\chi}^0_1$ and $\tilde{t}_1 \to b\tilde{\chi}^\pm_1 \to bW^{\pm(*)}\tilde{\chi}^0_1$, with $\tilde{\chi}^0_1$ representing the lightest neutralino and $\tilde{\chi}^\pm_1$ the lightest chargino, and with intermediate particles that can be virtual marked by asterisks. In these decay modes, the neutralino and charginos are mixtures of the superpartners of electroweak gauge and Higgs bosons, and $\tilde{\chi}^0_1$ is considered to be an LSP that escapes detection, leading to a potentially large transverse momentum imbalance in the detector. The two analyses of top squark pair production in the all-jet and single-lepton final states probe both of these decay modes. In the $\tilde{t}_1 \to t^{(*)}\tilde{\chi}^0_1$ decay mode, the top quark is produced off-shell when $\Delta m \equiv m_{\tilde{t}_1} - m_{\tilde{\chi}^0_1} < m_t$, while in the $\tilde{t}_1 \to b\tilde{\chi}^\pm_1$ decay mode, the experimental signature is affected by the mass of the chargino. We consider a model in which both top squarks decay via the $\tilde{t}_1 \to t^{(*)}\tilde{\chi}^0_1$ decay mode. A second model in which the branching fraction for each of the two top squark decay modes is 50% is also considered, under the assumption of a compressed mass spectrum in which the mass of $\tilde{\chi}^\pm_1$ is only 5 GeV greater than that of $\tilde{\chi}^0_1$, with the W bosons resulting from chargino decays consequently being produced heavily off-shell. If Δm < mW, $\tilde{t}_1$ can decay through a four-body decay involving an SM fermion pair $f\bar{f}'$ as $\tilde{t}_1 \to bf\bar{f}'\tilde{\chi}^0_1$, or through a flavour changing neutral current decay $\tilde{t}_1 \to c\tilde{\chi}^0_1$. The analysis of bottom squark pair production considers the decay mode $\tilde{b}_1 \to b\tilde{\chi}^0_1$ within the allowed phase space, and also probes top squark pair production in the $\tilde{t}_1 \to c\tilde{\chi}^0_1$ decay scenario.
This paper is organized as follows. Section 2 contains a brief description of the CMS detector, while Sect. 3 discusses the event reconstruction and simulation. Sections 4, 5, and 6 present details for the all-jet top squark search, the single-lepton top squark search, and the all-jet bottom squark search, respectively. Section 7 describes the systematic uncertainties affecting the results of the three analyses. The interpretation of the results in the form of exclusion limits on models of top or bottom squark pair production is discussed in Sect. 8, followed by a summary in Sect. 9.
The central feature of the CMS apparatus is a superconducting solenoid of 6 m internal diameter, providing a magnetic field of 3.8 T. Within the solenoid volume are an all-silicon pixel and strip tracker, a lead tungstate crystal electromagnetic calorimeter (ECAL), and a brass and scintillator hadron calorimeter (HCAL), each composed of a barrel and two endcap sections. Forward calorimeters extend the pseudorapidity (η) coverage provided by the barrel and endcap detectors. Muons are measured in gas-ionization detectors embedded in the steel flux-return yoke outside the solenoid. The first level of the CMS trigger system, composed of custom hardware processors, uses information from the calorimeters and muon detectors to select the most interesting events in a fixed time interval of less than 4 μs. The high-level trigger processor farm further decreases the event rate from around 100 kHz to around 1 kHz, before data storage. A more detailed description of the CMS detector, together with a definition of the coordinate system used and the relevant kinematic variables, can be found in Ref. .
Event reconstruction uses the particle-flow (PF) algorithm [38, 39], combining information from the tracker, calorimeter, and muon systems to identify charged hadrons, neutral hadrons, photons, electrons, and muons in an event. The missing transverse momentum, $\vec{p}_T^{\,\text{miss}}$, is computed as the negative vector sum of the transverse momenta ($\vec{p}_T$) of all PF candidates reconstructed in an event, and its magnitude $E_T^{\text{miss}}$ is an important discriminator between signal and SM background. Events selected for the searches are required to pass filters designed to remove detector- and beam-related noise and must have at least one reconstructed vertex. Usually more than one such vertex is reconstructed, due to pileup, i.e. multiple pp collisions within the same or neighbouring bunch crossings. The reconstructed vertex with the largest $\sum p_T^2$ of associated tracks is designated as the primary vertex.
Charged particles originating from the primary vertex, photons, and neutral hadrons are clustered into jets using the anti-kT algorithm implemented in FastJet with a distance parameter of 0.4. The jet energy is corrected to account for the contribution of additional pileup interactions in an event and to compensate for variations in detector response [41, 42]. Jets considered in the searches are required to have their axes within the tracker volume, within the range |η| < 2.4.
Jets originating from b quarks are identified with the combined secondary vertex (CSV) algorithm [43, 44] using two different working points, referred to as “loose” and “medium”. The b tagging efficiency for jets originating from b quarks is about 80 and 60% for the loose and medium working point, respectively, while the misidentification rates for jets from charm quarks, and from light quarks or gluons are about 45 and 12%, and 10 and 2%, respectively.
The “CMS top (quark) tagging” (CTT) algorithm [45–47] is used to identify highly energetic top quarks decaying to jets with the help of observables related to jet substructure [48, 49] and mass. For a relativistic top quark with a Lorentz boost γ = E/m, the W boson and b quark produced in the top quark decay are expected to be separated by a distance $\Delta R \equiv \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} \approx 2/\gamma$ (where ϕ is the azimuthal angle in radians). In cases where the W boson subsequently decays hadronically, the three resulting jets from the W boson decay and the hadronization of the b quark are likely to be merged into a single jet by a clustering algorithm with a distance parameter larger than 2/γ. To identify hadronically decaying top quarks with pT > 400 GeV, we therefore use jets reconstructed using the anti-kT algorithm with a distance parameter of 0.8 to try to cluster the top quark decay products into a single jet. The next step of top quark reconstruction is an attempt to decompose the candidate jet into at least three subjets with the help of the Cambridge-Aachen jet clustering algorithm [50, 51], the invariant mass of which is required to be consistent with the top quark mass (140–250 GeV). The final requirement of top quark identification is that the minimum invariant mass of any pair of the three subjets with the highest pT must exceed 50 GeV. The efficiency of the CTT algorithm to identify jets originating from top quark decays is measured to be about 30–40%, while the misidentification rate is found to be about 4–6%, depending on the pT of the top quark candidates. No disambiguation is performed between top quark candidates and jets reconstructed with a distance parameter of 0.4.
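The relation between boost and angular separation, and the sequence of tagging requirements, can be illustrated with a minimal Python sketch. It is not the CTT implementation: the function names are hypothetical, the nominal top quark mass of 173 GeV is an assumption not quoted in the text, and the subjet quantities are taken as precomputed inputs rather than derived from a jet clustering library.

```python
import math

TOP_MASS_GEV = 173.0  # assumed nominal top quark mass; not quoted in the text


def expected_separation(energy_gev, mass_gev=TOP_MASS_GEV):
    """Approximate W-b angular separation, Delta R ~ 2/gamma with gamma = E/m."""
    gamma = energy_gev / mass_gev
    return 2.0 / gamma


def delta_r(eta1, phi1, eta2, phi2):
    """Distance in the (eta, phi) plane, with the phi difference wrapped into [-pi, pi]."""
    dphi = math.remainder(phi1 - phi2, 2.0 * math.pi)
    return math.hypot(eta1 - eta2, dphi)


def passes_ctt_like_selection(jet_pt, n_subjets, subjet_system_mass, min_pair_mass):
    """Apply the requirements listed in the text to a wide (R = 0.8) jet.

    All inputs are hypothetical precomputed quantities in GeV (except n_subjets).
    """
    return (
        jet_pt > 400.0                              # boosted regime targeted by the tagger
        and n_subjets >= 3                          # at least three subjets found
        and 140.0 <= subjet_system_mass <= 250.0    # consistent with the top quark mass
        and min_pair_mass > 50.0                    # minimum pairwise subjet mass
    )
```

For a top quark with E = 692 GeV the sketch gives a separation of about 0.5, which is comfortably contained in a jet with a distance parameter of 0.8.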
Electron candidates are reconstructed by first matching clusters of energy deposited in the ECAL to reconstructed tracks. Selection criteria based on the distribution of the shower shape, track–cluster matching, and consistency between the cluster energy and track momentum are then used in the identification of electron candidates. Muon candidates are reconstructed by requiring consistent hit patterns in the tracker and muon systems. Electron and muon candidates are required to be consistent with originating from the primary vertex by imposing restrictions on the size of their impact parameters in the transverse plane and longitudinal direction with respect to the beam axis. The relative isolation variable Irel for these candidates is defined as the scalar sum of the transverse momenta of all PF candidates, excluding the lepton, within a pT-dependent cone of radius R around the candidate’s trajectory, divided by the lepton pT. The size R depends on lepton pT as follows:
$$R = \begin{cases} 0.2, & p_T \le 50\ \text{GeV}, \\ 10\ \text{GeV}/p_T, & 50 < p_T < 200\ \text{GeV}, \\ 0.05, & p_T \ge 200\ \text{GeV}. \end{cases}$$
The shrinking cone radius for higher-pT leptons allows us to maintain high efficiency for the collimated decay products of boosted heavy objects. The isolation sum is corrected for contributions originating from pileup interactions through an area-based estimate of the pileup energy deposited in the cone.
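A shrinking-cone relative isolation can be sketched in Python as follows. This is an illustration only: the dictionary layout of the candidates and the function names are hypothetical, the cone-size breakpoints are written out as the values commonly used for this kind of isolation and should be treated as assumptions of the sketch, and the pileup contribution is passed in as a precomputed number instead of being derived from an area-based estimate.

```python
import math


def cone_radius(lepton_pt):
    """pT-dependent cone size; the breakpoints are assumptions of this sketch."""
    if lepton_pt <= 50.0:
        return 0.2
    if lepton_pt < 200.0:
        return 10.0 / lepton_pt
    return 0.05


def delta_r(a, b):
    """Distance in the (eta, phi) plane between two objects given as dicts."""
    dphi = math.remainder(a["phi"] - b["phi"], 2.0 * math.pi)
    return math.hypot(a["eta"] - b["eta"], dphi)


def relative_isolation(lepton, pf_candidates, pileup_pt_in_cone=0.0):
    """Scalar pT sum in the cone, pileup-corrected, divided by the lepton pT.

    pf_candidates is assumed NOT to contain the lepton itself.
    """
    radius = cone_radius(lepton["pt"])
    pt_sum = sum(c["pt"] for c in pf_candidates if delta_r(lepton, c) < radius)
    corrected = max(0.0, pt_sum - pileup_pt_in_cone)  # never allow a negative sum
    return corrected / lepton["pt"]
```

With this definition a 100 GeV lepton uses a cone of radius 0.1, so nearby activity from a collimated boosted decay at larger angles does not spoil its isolation.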
Hadronically decaying τ lepton (τh) candidates are reconstructed using the CMS hadron-plus-strips (HPS) algorithm. The constituents of the reconstructed jets are used to identify individual τ lepton decay modes with one charged hadron and up to two neutral pions, or three charged hadrons. The presence of extra particles within the jet, not compatible with the reconstructed decay mode, is used as a criterion to discriminate τh decays from other jets.
Photon candidates are reconstructed from energy deposited in the ECAL, and selected using the distribution of the shower shape variable, the photon isolation, and the amount of leakage of the photon shower into the HCAL.
Monte Carlo (MC) simulations of events are used to study the properties of SM backgrounds and signal models. The MadGraph5_aMC@NLO 2.2.2 generator is used in leading-order (LO) mode to simulate events originating from $t\bar{t}$, W + jets, Z + jets, γ + jets, and quantum chromodynamics (QCD) multijet processes, as well as signal events, based on LO NNPDF3.0 parton distribution functions (PDFs). Single top quark events produced in the tW channel and $t\bar{t}$ samples used in the single-lepton analysis are generated at next-to-leading order (NLO) with Powheg v2 [58–61], while rare SM processes such as $t\bar{t}Z$ and $t\bar{t}W$ are generated at NLO using the MadGraph5_aMC@NLO 2.2.2 program, using NLO NNPDF3.0 PDFs. Parton showering and hadronization are simulated using Pythia 8.205. The response of the CMS detector for the SM backgrounds is simulated via the Geant4 package. The CMS fast simulation package is used to simulate all signal samples, and is verified to provide results that are consistent with those obtained from the full Geant4-based simulation. Event reconstruction is performed in the same manner as for collision data. A nominal distribution of pileup interactions is used when producing the simulated samples. The samples are then reweighted to match the pileup profile observed in the collected data. The signal production cross sections are calculated using NLO with next-to-leading logarithm (NLL) soft-gluon resummation calculations. The most precise cross section calculations are used to normalize the SM simulated samples, corresponding most often to next-to-next-to-leading order (NNLO) accuracy.
The top squark search in the all-jet final state is characterized by the categorization of events into exclusive search regions based on selection criteria applied to kinematic variables related to jets and $E_T^{\text{miss}}$, and the use of the CTT algorithm to identify boosted top quark candidates. The main backgrounds in the search regions are estimated from dedicated data control samples.
The events in this analysis are recorded using a trigger that requires the presence of two or more energetic jets within the tracker acceptance and large $E_T^{\text{miss}}$. To ensure high trigger efficiency, events selected offline are therefore required to have at least two jets with pT > 75 GeV and |η| < 2.4, and $E_T^{\text{miss}} > 250$ GeV. In order to reduce SM backgrounds with intrinsic $E_T^{\text{miss}}$, such as leptonic $t\bar{t}$ and W + jets events, we reject events with isolated electrons or muons that have pT > 5 GeV, |η| < 2.4, and Irel less than 0.1 or 0.2, respectively. The contribution from events in which a W boson decays to a τ lepton is reduced by rejecting events containing isolated charged-hadron PF candidates with pT > 10 GeV and |η| < 2.5 that are consistent with τh decays. The isolation requirement applied is based on a discriminant obtained from a multivariate boosted decision tree (BDT) trained to distinguish the characteristics of charged hadrons originating from τh decays. The transverse mass MT of the system comprising the charged-hadron PF candidate and $\vec{p}_T^{\,\text{miss}}$ is required to be less than 100 GeV, assuring consistency with a τh originating from a W boson decay, to minimize loss of signal at high jet multiplicity. The transverse mass for a particle q (in this case, the τh candidate) is defined as:
$$M_T(q, \vec{p}_T^{\,\text{miss}}) = \sqrt{2\, q_T\, E_T^{\text{miss}}\, (1 - \cos\Delta\phi)},$$
with qT denoting the particle transverse momentum, and Δϕ the azimuthal separation between the particle and $\vec{p}_T^{\,\text{miss}}$.
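The transverse mass of a particle and the missing transverse momentum can be evaluated with a few lines of Python. The sketch below uses the standard massless-particle form built from the particle transverse momentum, the magnitude of the missing transverse momentum, and their azimuthal separation; the function and argument names are hypothetical.

```python
import math


def transverse_mass(q_pt, q_phi, met, met_phi):
    """M_T = sqrt(2 * q_T * MET * (1 - cos(dphi))), momenta in GeV, angles in radians."""
    dphi = math.remainder(q_phi - met_phi, 2.0 * math.pi)  # wrapped into [-pi, pi]
    return math.sqrt(2.0 * q_pt * met * (1.0 - math.cos(dphi)))


def passes_tau_veto_mt(q_pt, q_phi, met, met_phi, threshold=100.0):
    """True if the candidate is compatible with a W decay (M_T below the threshold)."""
    return transverse_mass(q_pt, q_phi, met, met_phi) < threshold
```

The transverse mass vanishes when the particle is collinear with the missing transverse momentum and is largest when the two are back to back, which is why a W boson decay populates the region below the W boson mass.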
Events selected for the search sample must also have at least five jets with pT > 20 GeV, at least two of which must be b-tagged satisfying the loose working point of the CSV algorithm, with one or more of them required to additionally satisfy the medium working point. In addition, the absolute value of the azimuthal angle between $\vec{p}_T^{\,\text{miss}}$ and the closest of the four highest-pT (leading) jets, Δϕ1234, must be at least 0.5. An imbalance in event pT is produced in QCD events through a mismeasurement of jet pT, and is often aligned with one of the leading jets in the event. The requirement on Δϕ1234 therefore greatly reduces the contribution of the QCD background. The set of selection criteria defined above will be referred to as the “baseline selection” for this search.
After imposing the baseline selection, we subdivide the event sample into categories based on kinematic observables related to jets and $E_T^{\text{miss}}$ to improve the power of the analysis to discriminate between signal and the remaining SM background. The dominant sources of SM background are $t\bar{t}$, W + jets, and Z + jets events. The contribution from $t\bar{t}$ and W + jets processes arises from events with W bosons decaying leptonically, in which the charged lepton either falls outside of the kinematic acceptance, or, in most cases, evades identification, and may be reconstructed as a jet. Large $E_T^{\text{miss}}$ can be generated by the associated neutrino, allowing such events to satisfy the baseline selection criteria. This background is collectively referred to as the “lost-lepton background”. Contributions arising from $t\bar{t}W$ and single top quark processes also enter this category, but with lesser importance. The contributions from Z + jets and $t\bar{t}Z$ events arise when the Z boson decays to neutrinos, thereby producing a significant amount of $E_T^{\text{miss}}$. The QCD background is reduced to a subdominant level by the requirements of the baseline selection.
In events with a lost lepton, the transverse mass of the system formed by $\vec{p}_T^{\,\text{miss}}$ and the b quark arising from the same top quark decay as the lost lepton has a kinematic endpoint at the mass of the top quark. The observable $M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}})$ is defined as
$$M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}}) \equiv \min\left[ M_T(b_1, \vec{p}_T^{\,\text{miss}}),\ M_T(b_2, \vec{p}_T^{\,\text{miss}}) \right],$$
where b1, b2 are the two selected b-tagged jets with highest values in the CSV discriminant. Imposing a minimum requirement of 175 GeV on $M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}})$ removes a significant portion of the background, but also results in a loss in signal efficiency for models with small Δm, as seen in Fig. 2, in which signal models with different top squark and neutralino mass hypotheses are shown, with the first number indicating the assumed top squark mass in units of GeV and the second the neutralino mass. To benefit from the separation power provided by this variable, we define two search categories, one with $M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}}) \ge 175$ GeV, taking advantage of the corresponding reduction in background for signal models with large Δm, and another with $M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}}) < 175$ GeV to retain the statistical power of events populating the low-$M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}})$ region for models with small Δm.
Signal events with all-jet top quark decays should have at least six jets in the final state, although in the case of signals with compressed mass spectra these jets can be too soft in pT to satisfy the jet selection threshold. Additional jets may be produced through initial-state radiation (ISR). The jet multiplicity is lower for the semileptonic $t\bar{t}$ background, as well as for the other backgrounds remaining after the baseline selection. A requirement of higher reconstructed jet multiplicity therefore improves the discrimination of signal events from the SM background. We consider two regions in jet multiplicity for the analysis, a high-Nj region (≥7 jets) that benefits from this improved discrimination, and a medium-Nj region (5–6 jets) to preserve signal events with fewer reconstructed jets. The high-Nj region in conjunction with the low threshold on the pT of selected jets improves sensitivity for signal models with soft decay products in the final state.
In the high-$M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}})$ category, requiring the presence of at least one top quark reconstructed by the CTT algorithm (Nt ≥ 1) ensures a high-purity selection of signal events with highly boosted top quarks, at the cost of some loss in signal efficiency. To benefit from this high-purity region, without giving up signal events that would enter the Nt = 0 region, we use both regions to extract the final signal. Figure 2 shows the Nt distribution for events in the high-$M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}})$ category. Subdividing each Nt region by the number of b-tagged jets (Nb) that satisfy the medium working point of the CSV algorithm provides even greater discrimination of signal from background. Since there are relatively few events in the Nt ≥ 1 category, the subcategorization in Nj is not performed for these events because it provides no additional gain after the Nb subdivision.
The event categorization according to $M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}})$, Nj, Nb, and Nt is summarized in Table 1. In each of these categories, we use $E_T^{\text{miss}}$ as the final discriminant to characterize and distinguish potential signal from the SM background by defining five $E_T^{\text{miss}}$ regions. The analysis is therefore carried out in a total of 50 disjoint search regions (SRs).
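The count of 50 search regions can be cross-checked by enumerating the categories. The Python sketch below is one reading of the categorization described in the text, not a reproduction of Table 1: it assumes the Nb split is into Nb = 1 and Nb ≥ 2, that the low transverse-mass category is not subdivided in Nt, and it labels the five bins of the final discriminant only by index because their boundaries are not given here. All names are hypothetical.

```python
from itertools import product

N_MET_BINS = 5                      # five regions in the final discriminant
NJ_BINS = ("5-6", ">=7")            # medium and high jet multiplicity
NB_BINS = ("1", ">=2")              # assumed split in medium b-tag multiplicity


def search_categories():
    """Return (MT(b1,2) range, Nj, Nb, Nt) tuples for each category."""
    categories = []
    # Low transverse-mass category: split in Nj and Nb only.
    for nj, nb in product(NJ_BINS, NB_BINS):
        categories.append(("<175", nj, nb, "any"))
    # High transverse-mass category without a tagged top quark: split in Nj and Nb.
    for nj, nb in product(NJ_BINS, NB_BINS):
        categories.append((">=175", nj, nb, "0"))
    # High transverse-mass category with a tagged top quark: no Nj split.
    for nb in NB_BINS:
        categories.append((">=175", ">=5", nb, ">=1"))
    return categories


def search_regions():
    """Each category is further divided into bins of the final discriminant."""
    return [(cat, met_bin) for cat in search_categories() for met_bin in range(N_MET_BINS)]
```

Under these assumptions there are 4 + 4 + 2 = 10 categories, and 10 × 5 = 50 regions, in agreement with the total quoted in the text.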
The lost-lepton background is estimated from a single-lepton control sample, selected using the same trigger as the search sample, and consisting of events that have at least one lepton (ℓ) obtained by inverting the electron and muon rejection criteria. Studies in simulation indicate that the event kinematics for different lepton flavours are similar enough to estimate them collectively from the same control sample. Potential signal contamination is suppressed by requiring $M_T(\ell, \vec{p}_T^{\,\text{miss}}) < 100$ GeV. If there is more than one lepton satisfying the selection criteria, the lepton used to determine $M_T(\ell, \vec{p}_T^{\,\text{miss}})$ is chosen randomly. The events selected in the lepton control sample are further subdivided into control regions (CRs) using the same selection criteria as in the search sample, according to $M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}})$, Nj, Nt, and $E_T^{\text{miss}}$. However, with the requirement Nb ≥ 1 the distribution in $E_T^{\text{miss}}$ originating from lost-lepton processes is independent of Nb, and therefore the CRs are not subdivided according to the number of b-tagged jets. These CRs generally have a factor of 2–4 more events than the corresponding SRs.
The estimation of the lost-lepton background in each SR is based on the event count in data in the corresponding single-lepton CR ($N_{\text{data}}(1\ell)$). We translate this event count to the SR by means of a lost-lepton transfer factor TLL obtained from simulation. The lost-lepton background prediction can therefore be extracted as
$$N_{\text{pred}}^{\text{LL}} = T_{\text{LL}}\, N_{\text{data}}(1\ell), \qquad T_{\text{LL}} = \frac{N_{\text{MC}}(0\ell)}{N_{\text{MC}}(1\ell)},$$
where $N_{\text{MC}}(0\ell)$ and $N_{\text{MC}}(1\ell)$ are the simulated lost-lepton background yields in the corresponding zero- and single-lepton regions, respectively, taking into account contributions from $t\bar{t}$ and W + jets events, with smaller contributions from single top quark and $t\bar{t}W$ processes. The contamination from other SM processes in the single-lepton CRs is found to be negligible in studies of simulated events. Monte Carlo simulated samples are used to estimate the small component of the lost-lepton background that originates from leptons falling outside the kinematic acceptance, since this component is not accounted for in the CRs.
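The transfer-factor method amounts to scaling the observed control-region count by a ratio of simulated yields. A minimal Python sketch is given below; the function names and the example yields are hypothetical, and only the Poisson uncertainty from the control-region count is propagated, which the text identifies as the dominant one.

```python
import math


def transfer_factor(n_mc_zero_lepton, n_mc_one_lepton):
    """Ratio of simulated yields in the zero-lepton SR and the single-lepton CR."""
    if n_mc_one_lepton <= 0.0:
        raise ValueError("simulated control-region yield must be positive")
    return n_mc_zero_lepton / n_mc_one_lepton


def lost_lepton_prediction(n_data_one_lepton, n_mc_zero_lepton, n_mc_one_lepton):
    """Return (prediction, statistical uncertainty from the data CR count)."""
    t_ll = transfer_factor(n_mc_zero_lepton, n_mc_one_lepton)
    prediction = t_ll * n_data_one_lepton
    stat_unc = t_ll * math.sqrt(n_data_one_lepton)  # Poisson error on the CR count
    return prediction, stat_unc
```

For example, a control region with a single observed event gives a relative statistical uncertainty of 100% on the prediction, which illustrates how the limited control-region counts can dominate the total uncertainty.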
To improve the statistical power of the estimation, CRs with Nt ≥ 1 are summed over $E_T^{\text{miss}}$ bins as well as over Nb. We rely on the simulation through TLL to provide the $E_T^{\text{miss}}$ dependence and to predict the yield in each of the SRs with Nt ≥ 1. We check this procedure by computing the data-to-simulation ratios in the higher-statistics region of $M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}}) \ge 175$ GeV with Nt = 0, and find no evidence of a dependence on $E_T^{\text{miss}}$. We assign the relative statistical uncertainties of these ratios as systematic uncertainties in the SRs.
The dominant uncertainty in the lost-lepton prediction is due to the limited number of events in the CRs, and can be as large as 100%. The statistical uncertainties in the simulated samples also affect the uncertainty in the prediction via the transfer factors, with an effect that ranges between 3 and 50%. A source of bias in the prediction can arise from a possible difference between data and simulation in the background composition, which is assessed by independently changing the cross sections of the W + jets and $t\bar{t}$ processes by ±20% based on CMS differential cross section measurements [65, 66]. The effect of these changes is as large as 11% for the transfer factors. The uncertainties in the measurements of correction factors in lepton efficiency that are applied to the simulation to reduce discrepancies with the data lead to a systematic uncertainty of up to 7% in TLL. All other sources of systematic uncertainty, to be discussed in Sect. 7, have a negligible effect on the prediction.
Two methods are traditionally used to estimate the $Z \to \nu\bar{\nu}$ background in searches involving all-jet final states with large $E_T^{\text{miss}}$. The first method relies on a sample dominated by Z → ℓℓ+jets events, which has the advantage of accessing very similar kinematics to the $Z \to \nu\bar{\nu}$ process, after correcting for the difference in acceptance between charged-lepton pairs and pairs of neutrinos, but is statistically limited in regions defined with stringent requirements on jets and $E_T^{\text{miss}}$. The second method utilizes γ + jets events that have a significantly larger production cross section than the Z → ℓℓ+jets process, but similar leading-order Feynman diagrams. The two main differences between the processes that must be taken into account, namely, different quark–boson couplings and the massive nature of the Z boson, become less important at large Z boson pT, which is the kinematic region we are probing in this search.
We have therefore adopted a hybrid method to estimate the $Z \to \nu\bar{\nu}$ background by combining information from Z + jets, with Z → ℓℓ, and γ + jets events. Z → ℓℓ events are used to obtain the normalization for the $Z \to \nu\bar{\nu}$ background in different ranges of Nb to account for potential effects related to heavy-flavour production, while the much higher yields from the γ + jets sample are exploited to extract corrections to distributions of variables used to characterize the SRs. The Z → ℓℓ events are obtained from dielectron and dimuon triggers, with the leading lepton required to have pT > 20 GeV, and the trailing lepton pT > 15 and > 10 GeV for electrons and muons, respectively. Both leptons must also have |η| < 2.4. The γ + jets sample is collected through a single-photon trigger, and consists of events containing photons with pT > 180 GeV and |η| < 2.5. The transverse momentum of the dilepton or photon system is added vectorially to $\vec{p}_T^{\,\text{miss}}$ in each event of the corresponding data samples to emulate the kinematics of the $Z \to \nu\bar{\nu}$ process. The modified $E_T^{\text{miss}}$, denoted by $E_T^{\text{miss},\ell\ell}$ and $E_T^{\text{miss},\gamma}$ for the Z → ℓℓ and γ + jets processes, respectively, is used to calculate related kinematic variables.
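Adding the boson transverse momentum vectorially to the missing transverse momentum can be sketched in Python as below. The function name and its polar-coordinate interface are hypothetical; the operation itself is a plain two-dimensional vector sum.

```python
import math


def modified_met(met, met_phi, boson_pt, boson_phi):
    """Treat a visible boson (dilepton system or photon) as if it were invisible.

    Returns the magnitude and azimuthal angle of the vector sum of the missing
    transverse momentum and the boson transverse momentum.
    """
    px = met * math.cos(met_phi) + boson_pt * math.cos(boson_phi)
    py = met * math.sin(met_phi) + boson_pt * math.sin(boson_phi)
    return math.hypot(px, py), math.atan2(py, px)
```

A photon with pT = 200 GeV aligned with 30 GeV of genuine missing transverse momentum therefore yields a modified value of 230 GeV, while the same photon pointing in the opposite direction yields 170 GeV.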
The prediction for the $Z \to \nu\bar{\nu}$ background is given by:
$$N_{\text{pred}}^{Z\to\nu\bar{\nu}} = N_{\text{sim}}^{Z\to\nu\bar{\nu}}\, R_Z\, S_\gamma,$$
where $N_{\text{sim}}^{Z\to\nu\bar{\nu}}$ is the expected number of $Z \to \nu\bar{\nu}$ events obtained from simulation, RZ is the flavour-dependent Z + jets normalization factor measured with the Z → ℓℓ sample, and Sγ is the correction factor for distributions in $E_T^{\text{miss}}$ and jet kinematic variables extracted from the γ + jets sample. The underlying assumption of this hybrid estimation method is that the differences in the $E_T^{\text{miss},\ell\ell}$ (or $E_T^{\text{miss},\gamma}$) distributions between data and simulation are similar for $Z \to \nu\bar{\nu}$ and photon events. We checked this assumption by comparing the ratios of data to simulation observed in the $E_T^{\text{miss},\ell\ell}$ and $E_T^{\text{miss},\gamma}$ distributions for Z → ℓℓ+jets and γ + jets samples, respectively, and found them to agree.
The factor RZ is calculated by comparing the observed and expected Z → ℓℓ yields for a relaxed version of the baseline selection. In particular, we remove the requirements on Δϕ1234 after confirming that this does not bias the result, and relax the requirements on $E_T^{\text{miss},\ell\ell}$ from a threshold of 250 GeV to a threshold of 100 GeV. To increase the purity of the Z → ℓℓ events, we require the dilepton invariant mass to lie within the Z boson mass window of 80 < Mℓℓ < 100 GeV. The normalization of the nonnegligible $t\bar{t}$ contamination is estimated in the region outside the Z boson mass window (20 < Mℓℓ < 80 or Mℓℓ > 100 GeV) and taken into account. Small contributions from tZ and $t\bar{t}Z$ production, estimated from simulation, are included in the Z → ℓℓ sample when measuring RZ. Contributions from tW and $t\bar{t}W$ are included in the simulation sample used to obtain the normalization factor for the $t\bar{t}$ contamination. As discussed previously, we calculate RZ separately for different Nb requirements. The values obtained are 0.94 ± 0.13 and 0.84 ± 0.19 for Nb = 1 and ≥ 2, respectively. The uncertainty in RZ originates from the limited event counts in data and simulation, and from the extrapolation in $E_T^{\text{miss}}$.
The quantity Sγ is the correction factor related to the modelling of the distributions in the kinematic variables of $Z \to \nu\bar{\nu}$ events. It is calculated via a comparison of the $E_T^{\text{miss},\gamma}$ distributions of γ + jets events in simulation and data. The simulation is normalized to the number of events in data after applying the baseline selection. To suppress potential contamination from signal and avoid overlap with the search sample, we only consider events with low $E_T^{\text{miss}}$, below the threshold of the search sample. The Sγ factor is estimated separately for each SR to account for any potential mismodelling of the observables $M_T(b_{1,2}, \vec{p}_T^{\,\text{miss}})$, Nj, $E_T^{\text{miss},\gamma}$, and Nt in simulation. Since no statistically significant dependence of Sγ on Nb is observed, we improve the statistical power of the correction by combining the Nb = 1 and Nb ≥ 2 subsets of the γ + jets sample to extract the Sγ corrections. The correction factors range between 0.3 and 2, with uncertainties of up to 100% due to the limited number of events in the data sample.
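The hybrid estimate combines a simulated yield with the two data-driven factors. The following Python sketch shows one way to compute a shape correction of this kind and the resulting prediction; the function names and the event counts in the usage comment are hypothetical, and the sketch ignores uncertainties and the treatment of photon purity.

```python
def shape_correction(n_data_region, n_sim_region, n_data_total, n_sim_total):
    """Data-to-simulation ratio in one region of the photon sample.

    The simulation is first normalized to the total data yield after the
    baseline selection, so the factor corrects shapes and not the overall rate.
    """
    normalization = n_data_total / n_sim_total
    return n_data_region / (n_sim_region * normalization)


def znunu_prediction(n_sim_znunu, r_z, s_gamma):
    """Simulated yield scaled by the normalization and shape correction factors."""
    return n_sim_znunu * r_z * s_gamma


# Hypothetical usage: 10 simulated events in a search region with one b-tagged jet,
# using the normalization factor of 0.94 quoted in the text for Nb = 1.
# s = shape_correction(30, 25.0, 1000, 800.0)
# prediction = znunu_prediction(10.0, 0.94, s)
```

Separating the overall normalization from the shape correction is what allows the low-yield dilepton sample and the high-yield photon sample to each be used where they are statistically strongest.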
The γ + jets control data have contributions from three main components: prompt photons produced directly or via fragmentation, and other objects misidentified as photons. The prompt photon purity measured in Ref.  shows good agreement between data and simulation. In addition, the impact of varying the fraction of misidentified photons, or those produced via fragmentation, by 50% in simulated events results in a bias of less than 5% in the distribution from the predicted background. We therefore rely on simulation to estimate the relative contributions of the three different components.
The statistical uncertainty in the γ + jets control data and the uncertainty in RZ are the main sources of uncertainty in the $Z \to \nu\bar{\nu}$ prediction. The statistical uncertainties in the simulated samples, ranging up to 50% in both the SRs and in the γ + jets CRs, also make sizeable contributions.
The QCD background is estimated using a data CR selected with the same trigger as the SR and enriched in QCD events by imposing a threshold on the azimuthal separation between $\vec{p}_T^{\,\text{miss}}$ and the closest of the three leading jets, namely Δϕ123 < 0.1. After correcting for the contribution from other SM processes (i.e. $t\bar{t}$ and W + jets), estimated by applying the normalization factor obtained in the corresponding single-lepton control sample to simulation, we translate the observation in this CR to a prediction in the SR by means of transfer factors obtained from simulation. Each transfer factor is defined as the ratio of the expected QCD events satisfying Δϕ1234 > 0.5 to the expected QCD events with Δϕ123 < 0.1. The estimation is carried out in each search category. Since the $E_T^{\text{miss}}$ distributions in key observables show little dependence on Nb, the QCD CR is summed over Nb to improve the statistical precision of the estimation.
The main source of QCD events populating the SR is from severe mismeasurement of the pT of one or more jets in the event. Correct modelling of jet mismeasurement in simulation is therefore an important part of the QCD prediction. The level of mismeasurement of a simulated event is parameterized by the jet response of the most mismeasured jet, which is the jet with the greatest absolute difference between the reconstructed and generated pT. The jet response, rjet, is defined as the ratio of the reconstructed pT of a jet to its generated pT, computed without including the loss of visible momentum due to neutrinos. We use the observable $r_{\text{jet}}^{\text{pseudo}}$, defined as the ratio of the pT of a jet to the magnitude of the vector sum of its transverse momentum and $\vec{p}_T^{\,\text{miss}}$, as an approximate measure of the true jet response in data, and extract mismeasurement correction factors for the simulation by comparing $r_{\text{jet}}^{\text{pseudo}}$ of the jet closest in ϕ to $\vec{p}_T^{\,\text{miss}}$ between data and simulation. The correction factors extracted from simulation are parameterized by rjet and the flavour of the most mismeasured jet. The correction factors range between 0.44 and 1.13, and are applied in the simulation on an event-by-event basis.
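The two response definitions can be made concrete with a short Python sketch. The tuple layout of the jets and all function names are hypothetical, and the sketch covers only the definitions given in the text, not the extraction of the correction factors.

```python
import math


def jet_response(reco_pt, gen_pt):
    """Ratio of reconstructed to generated jet pT (simulation only)."""
    return reco_pt / gen_pt


def most_mismeasured_jet(jets):
    """Pick the (reco_pt, gen_pt) pair with the largest absolute pT difference."""
    return max(jets, key=lambda jet: abs(jet[0] - jet[1]))


def pseudo_response(jet_pt, jet_phi, met, met_phi):
    """Jet pT divided by the magnitude of the vector sum of jet pT and MET.

    If the missing transverse momentum is entirely due to the undermeasurement
    of this jet, the denominator approximates the true jet pT.
    """
    px = jet_pt * math.cos(jet_phi) + met * math.cos(met_phi)
    py = jet_pt * math.sin(jet_phi) + met * math.sin(met_phi)
    return jet_pt / math.hypot(px, py)
```

For instance, a jet reconstructed with pT = 100 GeV that is aligned with 100 GeV of missing transverse momentum has a pseudo-response of 0.5, as expected for a jet whose true pT was about 200 GeV.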
The largest sources of uncertainty in the QCD prediction originate from the limited event counts in data and simulated samples surviving the selection, giving rise to uncertainties of up to 100% in the estimated QCD background contribution in some SRs. The uncertainty due to jet response corrections is up to 15%, while the uncertainty due to contributions from non-QCD processes in the data CR ranges from 7 to 35%.
Contributions from the tt̄Z process are generally small since this is a relatively rare process. However, its final state is very similar to that of the signal when the Z boson decays to neutrinos and both top quarks decay only into jets, and it can constitute up to 25% of the total SM background in some SRs with large missing transverse momentum and Nt ≥ 1. The tt̄Z prediction is obtained from simulation. We assign a 30% uncertainty to the tt̄Z cross section, based on the 8 TeV CMS measurement. Additional theoretical and experimental uncertainties in the prediction are evaluated as discussed in Sect. 7, and range up to 25 and 20%, respectively, depending on the SR. We also take into consideration the statistical uncertainty in the simulation, which ranges from 5 to 100% for regions with small tt̄Z contributions.
Figure 3 shows the yields in each of the SR bins, as well as the predicted SM backgrounds based on the background estimation methods discussed in Sect. 4.2. The results are also summarized in Table 2. Expected yields are also shown for two benchmark models for the pure decay and one for the mixed ( or ) decay. No statistically significant deviation from the SM prediction is observed in the data.
We also perform a search for top squarks in events with exactly one isolated electron or muon and considerable missing transverse momentum. The main SM backgrounds, originating from tt̄ and W + jets processes, are suppressed using dedicated kinematic variables. The dominant remaining backgrounds arise from lost-lepton processes and the surviving W + jets background, both of which are estimated from control samples in data.
The search sample is selected using triggers that require either large missing transverse momentum or the presence of an isolated electron or muon. The combined trigger efficiency for a selection of missing transverse momentum > 250 GeV and at least one lepton, as measured in a data sample with large HT, is found to be 99%, with an asymmetric uncertainty. Selected events are required to have at least two jets with pT > 30 GeV, at least one of which must be b-tagged using the medium working point. We require exactly one well-identified and isolated electron or muon with pT > 20 GeV, |η| < 1.442 (electrons) or |η| < 2.4 (muons), and Irel < 0.1. Electrons in the forward region of the detector are not considered in this search because of the significant rate at which jets are misidentified as electrons. To reduce the dilepton background originating from tt̄ and tW production, events are rejected if they contain a second electron or muon with pT > 5 GeV and Irel < 0.2. A significant fraction of the remaining SM background originates from events with τh decays. This contribution is reduced by rejecting events that have an isolated τh candidate reconstructed using the HPS algorithm with pT > 20 GeV and |η| < 2.4. A further veto is placed on events containing isolated charged-hadron PF candidates with pT > 10 GeV and |η| < 2.5. Candidates are categorized as being isolated if their isolation sum, i.e. the scalar sum of the pT of charged PF candidates within a fixed cone of R = 0.3 around the candidate, is less than 6 GeV and smaller than 10% of the candidate pT.
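The isolated-track criterion at the end of this selection combines an absolute and a relative requirement. A minimal Python sketch of that logic is given below; the function names and the input layout, a list of (pT, ΔR) pairs for the neighbouring charged candidates, are assumptions made for illustration.

```python
def isolation_sum(neighbours, cone=0.3):
    """Scalar pT sum of charged candidates within a cone around the track.

    neighbours : iterable of (pt, delta_r) pairs, a hypothetical input format
    """
    return sum(pt for pt, delta_r in neighbours if delta_r < cone)


def is_isolated(candidate_pt, iso_sum, abs_cut=6.0, rel_cut=0.10):
    """Isolated if the sum is below 6 GeV AND below 10% of the candidate pT."""
    return iso_sum < abs_cut and iso_sum < rel_cut * candidate_pt


if __name__ == "__main__":
    iso = isolation_sum([(1.0, 0.1), (2.0, 0.4)])  # only the first is in the cone
    print(iso, is_isolated(20.0, iso))
```

The relative requirement is the tighter one for candidates below 60 GeV, while the absolute 6 GeV requirement takes over above that value.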
Single-lepton backgrounds originating from semileptonic tt̄, W + jets, and single top quark processes are suppressed through the MT of the lepton–neutrino system. Background processes containing a single lepton from W boson decay have a kinematic endpoint for MT at the W boson mass, modulo detector resolution and off-shell W boson mass effects. In this analysis we require MT > 150 GeV, which significantly reduces single-lepton backgrounds. To further reduce the tt̄ background, we require the absolute value of the azimuthal angle between the missing transverse momentum and the closest of the two highest-pT jets, Δϕ12, to be larger than 0.8. The tt̄ events that satisfy the missing transverse momentum and MT requirements tend to have higher-pT top quarks, and therefore smaller values of Δϕ12, than signal events.
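The two quantities used in this step can be illustrated with a short Python sketch. The text does not spell out the MT formula, so the standard definition for a massless lepton and neutrino, MT² = 2 pT(ℓ) pTmiss (1 − cos Δϕ), is assumed here; the function names are hypothetical.

```python
import math


def wrapped_delta_phi(phi_a, phi_b):
    """Absolute azimuthal separation, wrapped into [0, pi]."""
    return abs((phi_a - phi_b + math.pi) % (2.0 * math.pi) - math.pi)


def transverse_mass(lep_pt, lep_phi, met, met_phi):
    """Standard transverse mass of the lepton plus missing-momentum system."""
    dphi = wrapped_delta_phi(lep_phi, met_phi)
    return math.sqrt(max(2.0 * lep_pt * met * (1.0 - math.cos(dphi)), 0.0))


def min_delta_phi(met_phi, jet_phis):
    """Smallest |delta phi| between the missing momentum and the given jets."""
    return min(wrapped_delta_phi(met_phi, phi) for phi in jet_phis)


if __name__ == "__main__":
    # Back-to-back lepton and missing momentum of 50 GeV each.
    print(transverse_mass(50.0, 0.0, 50.0, math.pi))
    print(min_delta_phi(0.0, [3.0, -0.5]))
```

For a single W boson decay, MT computed this way cannot exceed the W boson mass apart from resolution and off-shell effects, which is why the MT > 150 GeV requirement is effective against single-lepton backgrounds.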
The remaining background after the preselection is dominated by dilepton events from tt̄ and tW production, where one of the leptons is not reconstructed or identified, and the presence of the additional neutrino from the second leptonically decaying W boson makes it possible to satisfy the MT requirement.
Kinematic properties of signal events such as missing transverse momentum, MT, and jet multiplicity depend on the decay modes of top squarks, as well as on the mass splittings (Δm) between the top squark, neutralino, and chargino (if present). As a basis for the search strategy in the topologies shown in Fig. 1a, b, we require the presence of at least four jets. Events are then categorized based on the value of a kinematic variable that is calculated for each event under the assumption that it originates from the dilepton tt̄ process with a lost lepton:
where my is the fitted parent particle mass, and p1, pℓ, p2, pb1, and pb2 are the four-momenta of the neutrino corresponding to the visible W boson decay, the lepton from the same decay, the W boson whose decay gives rise to the undetected lepton, and the two b jet candidates, respectively. To select the b jet candidates, we examine all possible pairings with the three jets that have the highest CSV discriminator values. The pairing that gives the lowest value of the variable defines the final estimate. Reconstructing an event with this variable helps discriminate signal from the dominant dilepton tt̄ background. For large mass differences between the top squark and the neutralino, requiring a value above 200 GeV significantly reduces the tt̄ background while maintaining reasonable signal efficiency. In contrast, for small-Δm models, such a requirement results in a significant loss in signal efficiency. To preserve sensitivity to both high- and low-Δm scenarios, we subdivide the search sample into two event categories, with values > 200 GeV and ≤ 200 GeV. The distribution of the variable for events with at least four jets is shown in Fig. 4 (top).
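The constraints listed above match the structure of the variable usually written as M_T2^W in single-lepton top squark searches. Assuming that this is the variable meant here (the name and the exact form below are assumptions and are not taken from the text), its definition can be sketched in LaTeX as:

```latex
M_{\mathrm{T2}}^{W} = \min\Bigl\{\, m_y \ \text{consistent with:}\
\bigl[\, p_1^2 = 0,\quad
(p_1 + p_\ell)^2 = p_2^2 = M_W^2,\quad
\vec{p}_{\mathrm{T}}^{\,1} + \vec{p}_{\mathrm{T}}^{\,2} = \vec{p}_{\mathrm{T}}^{\,\mathrm{miss}},\quad
(p_1 + p_\ell + p_{b_1})^2 = (p_2 + p_{b_2})^2 = m_y^2 \,\bigr] \Bigr\}
```

In this form the first condition makes the neutrino massless, the second puts both W bosons on shell, the third attributes all of the missing transverse momentum to the neutrino and the unreconstructed W boson, and the last requires both decay chains to share the same parent mass my.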
In signals with a large difference in mass between the top squark and the neutralino, a significant fraction of events can contain two quarks that merge into a single jet as a result of the large boost of the top quark or W boson that decays into jets. These events would fail the four-jet requirement. To recover acceptance for such topologies, we define an additional SR in events with three jets. Since this region targets large-Δm signal scenarios, only events in the category above 200 GeV are considered.
To increase the sensitivity of this analysis to a mixed decay scenario (Fig. 1c) when the chargino and neutralino are nearly degenerate in mass, SRs with exactly two jets are added. In events with low jet multiplicity the modified topness variable (tmod)  provides improved dilepton rejection:
This equation uses the mass constraints for the particles and also the assumption that . The first term constrains the W boson whose lepton decay product is the detected lepton, while the second term constrains the top quark for which the lepton from the W boson decay is lost in the reconstruction. Once again, we consider all possible pairings of b jet candidates with up to three jets with highest CSV discriminator values. The calculation of modified topness uses the resolution parameters aW = 5 GeV and at = 15 GeV, which determine the relative weighting of the mass shell conditions. We select events with tmod > 6.4. The definition of topness used in this analysis is modified from the one originally proposed in Ref. : namely, the terms corresponding to the detected leptonic top quark decay and the centre-of-mass energy are dropped since in events with low jet multiplicity the second b jet is often not identified. In these cases, the discriminating power of the topness variable is reduced when a light-flavour jet is used instead in the calculation. The modified topness is more robust against such effects and provides better signal sensitivity in these SRs than the variable. The distribution of modified topness for events with at least two jets is shown in Fig. 4 (bottom).
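The two terms described above can be written out explicitly. The following LaTeX sketch assumes the commonly used form of the modified topness, in which tmod is the logarithm of the minimized sum S; the symbols pν, pb, and pW, the minimization variables, and the transverse-momentum constraint are assumptions chosen to match the description and are not taken verbatim from the text.

```latex
t_{\mathrm{mod}} = \ln\bigl(\min S\bigr), \qquad
S(\vec{p}_{W}, p_{\nu,z}) =
\frac{\bigl(m_W^2 - (p_\nu + p_\ell)^2\bigr)^2}{a_W^4}
+ \frac{\bigl(m_t^2 - (p_b + p_W)^2\bigr)^2}{a_t^4},
\qquad \text{with } \vec{p}_{\mathrm{T}}^{\,\mathrm{miss}} = \vec{p}_{\mathrm{T},W} + \vec{p}_{\mathrm{T},\nu}
```

With aW = 5 GeV and at = 15 GeV, the W boson mass-shell condition is weighted more strongly than the top quark condition, since each term is divided by the fourth power of its resolution parameter.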
Finally, events in each of the categories described above are further classified into different SRs based on the value of the missing transverse momentum. This results in a total of nine exclusive SRs as summarized in Table 3.
Three categories of backgrounds originating from SM processes remain after the preselection described in Sect. 5.1. The dominant contribution arises from backgrounds with a lost lepton, primarily from the dilepton tt̄ process. A second class of background events originates from SM processes with a single leptonically decaying W boson. The preselection requirements on the missing transverse momentum and MT > 150 GeV strongly suppress this background. The suppression is much stronger for events with a W boson originating from the decay of a top quark than for direct W boson production, as the mass of the top quark imposes a constraint on MW. As a result, large values of MT in semileptonic tt̄ events are dominated by resolution effects, while for events in which the W boson is produced directly (W + jets) they are mainly a function of the width of the W boson. The third class of background events includes rare SM processes such as WZ and tt̄Z (where the Z boson decays to neutrinos), with smaller contributions from tt̄W, tt̄γ, and processes with two or three electroweak vector bosons. The QCD background is negligible in this search because of the requirements on the presence of a high-pT isolated lepton, large missing transverse momentum, and large MT.
The lost-lepton background is estimated from data in dilepton CRs, where we require the presence of a second lepton passing the rejection requirements but with pT > 10 GeV, an isolated track, or a τh candidate. This is done again by extrapolating the data in the dilepton CRs to the SRs using transfer factors obtained from simulation. We use the same preselection requirements on and MT as in the search regions. We remove the subdivision in and the separation of the three and at least four jet regions to increase the statistical power of the CRs, and arrive at three CRs: exactly two jets and tmod > 6.4, at least three jets and GeV, and at least three jets and GeV. These control regions have a purity in dilepton events of >97%. Additional transfer factors are therefore needed to account for the extrapolation in jet multiplicity and requirements; these are derived from simulation. The background estimate can be written as follows:
where is the number of events observed in data in the dilepton CR. The largest systematic uncertainty in the background estimate is due to the statistical uncertainties of the event yields in data CRs and the estimates from simulated samples (10–30%). The signal contamination in this CR is around 10% for the bulk of the studied parameter space and is taken into account in the final interpretation. The transfer factor TLL is obtained from simulation, and estimates the probability that a lepton is not identified in the detector, accounting for the kinematic acceptance and the efficiency of the lepton selection criteria. The second transfer factor, , extrapolates the inclusive estimate to individual SR bins. This transfer factor, also obtained from simulation, is validated by checking the modelling of the jet multiplicity and of the spectrum in dedicated data CRs, which will be described in the following paragraphs.
The dilepton background contributes to the SRs with three or more jets only if jets from ISR or final-state radiation (FSR) are also present, or when a τh decay is misidentified as an additional jet. The modelling of jet multiplicity is checked in a high-purity dedicated dilepton data control sample with one electron and one muon, at least two b-tagged jets, and GeV. The differences between data and simulation are used to estimate scale factors relative to the baseline selection of events with at least two jets. The scale factors are 1.10 ± 0.06 for three-jet events and 0.94 ± 0.06 for events with at least four jets. Within statistical uncertainties, these factors display no dependence. The scale factors are applied to the dilepton simulation when extrapolating the inclusive background prediction into the specified jet multiplicity bins. The statistical uncertainties in these scale factors are also propagated to the predictions in the SRs. The uncertainty in the modelling of the jet multiplicity ranges up to 3%.
The extrapolation in missing transverse momentum is carried out through simulation, and it must be verified that its resolution is accurately modelled, since a change in the resolution can lead to a different spectrum. In this analysis we are interested in the effect of the resolution in events containing genuine missing transverse momentum from neutrinos. This effect is estimated by comparing a γ + jets sample in data with simulation. The events are selected using a single-photon trigger, and the photons are required to have pT > 165 GeV and |η| < 2.4 and to pass stringent identification criteria. We use the photons to mimic the neutrinos in the event, with the photon momentum serving as an estimate of the sum of the neutrino momenta.
The photon pT spectrum in data and in simulation is reweighted to match that of the neutrinos in the background-simulation sample. For dilepton tt̄ events, this corresponds to the νν-pT spectrum. To model the resolution, the transverse momentum of the photon system is added vectorially to the missing transverse momentum, and the resulting spectrum is compared between data and simulation. We use this modified definition of the missing transverse momentum to calculate our discriminants. For this CR, we then apply selection criteria close to the SR criteria, except that selections related to the lepton are dropped, the presence of a well-identified photon is required, and the requirement of a b-tagged jet is reversed so as to suppress effects related to semileptonic heavy-flavour decays. Corrections for the observed differences, which can be as large as 15%, are applied to events in the simulated samples, and the uncertainties are propagated to the final background estimate, resulting in an uncertainty of 1–4% in the lost-lepton background prediction.
In SRs with a high or modified topness requirement, the W + jets background is estimated using a data control sample containing no b-tagged jets. For SRs with a low- requirement, this background constitutes less than 10% of the total SM background. In these SRs we do not employ an estimate based on data, but instead use the W + jets background estimate directly from simulation. The semileptonic background is also estimated from simulation.
The CRs used to extract the W + jets background in the SRs with a high or modified topness requirement are again not subdivided in to have a sufficient number of events to carry out the prediction. We therefore use three CRs for this background estimate: exactly two jets with tmod > 6.4, exactly three jets with > 200 GeV, and at least four jets with GeV. We extrapolate the yields from the CRs to the SRs by applying transfer factors from simulation for the extrapolation in and number of b-tagged jets:
with representing the event yield in the CR after subtracting the estimated contribution from other SM background processes. The non-1ℓ contribution in the CRs, , is estimated from simulation and amounts to roughly 25–35%. A 50% uncertainty is assigned to the subtraction. The largest source of uncertainty is again the limited size of the data and simulation samples. The statistical uncertainty of these samples results in an uncertainty of 20–40% in the W + jets background estimate.
The transfer factor extrapolates the yields from the inclusive CR with GeV to the exclusive regions. The main uncertainties in this extrapolation factor can be attributed to the modelling of the neutrino pT spectrum, the W boson width, and the resolution. The neutrino pT spectrum is checked in a data sample enriched in W + jets, with no b-tagged jets and 60 < MT < 120 GeV. No large mismodelling of is observed. Therefore, we do not apply any corrections to the neutrino pT spectrum but only propagate the statistical limitation of this study as the uncertainty (6–22%) in the modelling of the neutrino pT spectrum. The uncertainty in the W boson width (3% ) is estimated by scaling the four-vectors of the W boson decay products appropriately. The resolution effects on this background are studied using the same method as described in Sect. 5.2.1, giving rise to a 1–3% uncertainty.
The other transfer factor, TNb, performs the extrapolation in the number of b-tagged jets for each bin. Scale factors are applied to the simulation to match the b tagging efficiency in data. The largest uncertainty in this transfer factor is the fraction of the heavy-flavour component in the W + jets sample; we assign a 50% uncertainty to this component. We performed a dedicated cross-check in a CR with one or two jets and at least 50 GeV of . Data and simulation were found to be in agreement in the b jet multiplicity within uncertainties. After taking into consideration the additional sources of systematic uncertainty described in Sect. 7, the total uncertainty in the W + jets estimate varies from 50 to 70%.
The semileptonic tt̄ background is never larger than 10% of the total background estimate. We rely on simulation to estimate it. The main source of uncertainty in this estimate is the modelling of the missing transverse momentum resolution, because poor resolution can enhance the contributions at large MT. The studies of the resolution presented in Sect. 5.2.1 indicate that it could be mismodelled by about 10% in simulation. Changing the simulated resolution by a corresponding amount yields an uncertainty of 100% in the semileptonic tt̄ estimate.
The “rare” background category includes tt̄ production in association with a vector boson (W, Z, or γ), diboson, and triboson events. Within this category, WZ events dominate the SRs with two jets, and tt̄Z events with the Z boson decaying into a pair of neutrinos dominate regions of higher jet multiplicity. The expected contributions from these backgrounds are small, and the simulation is expected to model the kinematics of these processes well in the regions of phase space relevant to the SRs. The rare backgrounds are therefore estimated using simulation. We assess the theoretical and experimental uncertainties affecting the estimates as described in Sect. 7, resulting in a total uncertainty of 15–26%, depending on the SR.
The background expectations and the corresponding yields for each SR are summarized in Table 4 and in Fig. 5. Overall, the observed and predicted yields agree within two standard deviations (SD) in all SRs. For signals of top squark pair production for different mass hypotheses, the maximum observed significance obtained by combining the results in different SRs is 1.2 SD for a top squark mass of ≈ 400 GeV and a massless LSP hypothesis. We therefore find no evidence for top squark pair production.
This search is motivated by the production of pairs of bottom or top squarks, in which each or decays, respectively, into a bottom or a charm quark and a neutralino. In the latter search, the difference between the and masses is assumed to be less than 80 GeV, and the only top squark decay mode considered is through a flavour changing neutral current to . Small mass splittings or between the top or bottom squark and the neutralino leave little visible energy in the detector, making signal events difficult to distinguish from SM background. However, events with an energetic ISR jet recoiling against the originating from the neutralino can provide a distinct topology for signals with compressed mass spectra, i.e. with small Δm. We thus perform a search for events with an ISR jet and significant .
Events in the search sample are recorded using the same trigger as that for the top squark search in the all-jet final state, requiring the presence of large and at least two energetic jets within the tracker acceptance. After applying an offline selection requiring GeV and at least two jets with pT > 60 GeV, we find the trigger efficiency to be greater than 97%. We veto events that have at least four jets with pT above 50 GeV. The veto and its threshold are motivated by the harder pT spectrum of the fourth jet in semileptonic events compared to the signal, in which extra jets originate from ISR or FSR. To reduce the SM background from processes with a leptonically decaying W boson, we reject events containing isolated electrons or muons with Irel < 0.1 and |η| < 2.5, or Irel < 0.2 and |η| < 2.4, respectively, and with pT > 10 GeV. The contribution containing τh decays is reduced by placing a veto on events containing charged-hadron PF candidates with pT > 10 GeV, |η| < 2.5, and an isolation sum smaller than 10% of the candidate pT.
The dominant SM background sources are Z + jets production with Z → νν̄, and the lost-lepton background originating from W + jets, tt̄, and single top quark processes with leptonic W boson decays. A smaller background contribution comes from QCD events in which large missing transverse momentum originates from jet mismeasurements and its direction is often aligned with one of the jets. To suppress this background we require that the absolute difference in azimuthal angle between the missing transverse momentum and the closest of the three leading jets (Δϕ123) be greater than 0.4. Two sets of SRs are defined to optimize the sensitivity for signal models with either compressed or noncompressed mass spectra.
In addition to the criteria discussed above, for regions targeting noncompressed scenarios we require that the pT of the leading jet be above 100 GeV and that the event contain at least one additional jet with pT above 75 GeV. We also require that the two highest-pT jets be identified as b jets. These requirements suppress events originating from W and Z boson production, for which the leading jets have a softer pT spectrum since they are produced by ISR or FSR. To maintain a stable b tagging efficiency as a function of jet pT, both the loose and medium working points of the b tagging algorithm are used to identify b jets. The b tagging efficiency of the medium working point depends strongly on the jet pT and degrades by about 20–30% for jets with pT above 500 GeV, while the efficiency of the loose working point is more stable with increasing jet pT. Specifically, we use the loose working point to identify b-tagged jets when the leading jet has pT above 500 GeV, and the medium working point otherwise. Since such high-pT jets are less likely to occur in SM processes, the higher misidentification rate of the loose working point results in only a small increase in the SM background.
The distribution of , where j1, j2 are the two highest-pT jets, is expected to have a kinematic endpoint at the mass of the top quark when and the closest jet originate from the semileptonic decay of a top quark. In the noncompressed search sample we require to be greater than 250 GeV. Events in this sample are then categorized by HT,12, defined for the purposes of this analysis as the scalar sum of the pT of the two leading jets, and the mCT kinematic variable. The boost-corrected cotransverse mass [71, 72], mCT, is defined by:
For scenarios in which two particles are pair-produced and have the same decay chain, the mCT distribution has an endpoint determined by the masses of the parent and decay-product particles. For this endpoint is at .
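The endpoint behaviour can be illustrated with a short Python sketch. It uses the basic cotransverse mass for two massless jets, mCT² = 2 pT(j1) pT(j2) (1 + cos Δϕ), rather than the boost-corrected version used in the analysis, and the endpoint expression (m_parent² − m_child²)/m_parent is the standard result for pair-produced parents that each decay to a massless visible particle and an invisible particle. Both formulas and all names are assumptions supplied for illustration and are not taken from the text.

```python
import math


def cotransverse_mass(pt1, phi1, pt2, phi2):
    """Basic (not boost-corrected) cotransverse mass of two massless jets."""
    mct_squared = 2.0 * pt1 * pt2 * (1.0 + math.cos(phi1 - phi2))
    return math.sqrt(max(mct_squared, 0.0))


def mct_endpoint(m_parent, m_child):
    """Kinematic endpoint for identical two-body decays of pair-produced parents."""
    if m_parent <= 0:
        raise ValueError("parent mass must be positive")
    return (m_parent**2 - m_child**2) / m_parent


if __name__ == "__main__":
    # Collinear jets maximize mCT; back-to-back jets give zero.
    print(cotransverse_mass(100.0, 0.0, 100.0, 0.0))
    print(mct_endpoint(500.0, 100.0))
```

The endpoint shrinks as the parent and invisible-particle masses approach each other, which is the reason mCT loses discriminating power for compressed spectra.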
For signals with compressed mass spectra, high-pT ISR is required to be able to reconstruct the quarks as jets and obtain a large value of . Compressed SRs require therefore a leading jet with pT > 250 GeV that is back-to-back relative to the (). Since such ISR jets are not expected to originate from b quarks, we require that the leading jet fail the loose b-tagging requirement.
We relax the thresholds on the second jet pT and on the missing transverse momentum to 60 and 200 GeV, respectively, and categorize events in the search sample according to the number of b-tagged jets. The mCT observable loses its discriminating power for these compressed signal models because of the small mass splitting between the parent particle and the neutralino. The missing transverse momentum is therefore used as the main discriminant, with different thresholds applied to define the final SRs.
The SM background contributions originating from Z → νν̄, lost-lepton, and QCD processes are estimated from dedicated data CRs as discussed below. Smaller contributions from other SM processes, such as diboson (VV) processes, are estimated from simulation, and an uncertainty of 50% is assigned to these contributions.
The Z → νν̄ background is estimated from a high-purity data sample of Z → μ+μ- events in which we remove the muons and recalculate the relevant kinematic variables to emulate Z → νν̄ events. The trigger used to collect this CR requires the presence of a high-pT muon with |η| < 2.1. In keeping with the trigger constraints, the sample is selected by requiring the presence of two isolated muons in the event with pT > 50 (10) GeV and |η| < 2.1 (2.4) for the leading (trailing) muon. The invariant mass of the dimuon pair is required to be within 15 GeV of the Z boson mass. Each muon is required to be separated from jets in the event by ΔR > 0.3.
Apart from the lepton selection, we apply the same object and event selection criteria as described in Sect. 6.1 to this sample, with the exception that b jets are selected using the loose working point of the b tagging algorithm to improve the statistical power of the data CR. Events in the selected sample are subdivided into CRs corresponding to the noncompressed and compressed SRs. The observed events in these data CRs, , are translated into an estimation of the contribution in the SRs with the help of simulation, as follows:
where , representing the small contamination in the CRs due to , W + jets, single top quark, and diboson processes, is estimated from simulation. The corrected dimuon event yield is scaled by the kinematic and detector acceptance of muons from Z bosons, A, and the muon reconstruction, identification, and isolation efficiency ϵ. The acceptance and efficiency are determined from simulation. Efficiency scale factors are applied to correct for differences between data and simulation. These scale factors are determined with a “tag-and-probe” method in Z → μ+μ- events . The product of the muon acceptance and efficiency, Aϵ, varies from 0.6 in the low-mCT and low- regions to 0.9 in the high-mCT and high- regions. The correction factor  represents the ratio of the Z boson branching fractions to neutrinos and leptons. The remaining term, κ, accounts for differences in the b tagging efficiency and misidentification rate between the CRs and SRs, resulting from the use of different b tagging working points. These κ factors are determined from Z → ℓℓ simulation and corrected for known differences in the performance of the b tagging algorithm between data and simulation as measured in samples of multijet and events . The value of the b tagging κ factor ranges from 0.10 to 0.15 for the noncompressed SRs, and from 0.20 to 0.25 for the Nb = 1 compressed SRs, while it is about 0.15 for the Nb = 2 compressed SR.
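The structure of this estimate, as described in the text, is a background-subtracted dimuon yield divided by Aϵ and multiplied by the branching-fraction ratio and by κ. A minimal Python sketch under that reading is shown below. The function and argument names are hypothetical, and the ratio of about 5.94 used in the example is an approximate value derived from published Z boson branching fractions, not a number quoted in the text.

```python
def z_to_nunu_prediction(n_obs_mumu, n_bkg_mumu, acc_times_eff, br_ratio, kappa):
    """Estimate the Z -> nu nu yield in an SR from a dimuon control region.

    n_obs_mumu    : dimuon events observed in the data CR
    n_bkg_mumu    : simulated non-Z contamination in the CR
    acc_times_eff : product of muon acceptance and efficiency (A * epsilon)
    br_ratio      : ratio of Z branching fractions, neutrinos over muons
    kappa         : b tagging working-point correction between CR and SR
    """
    if not 0.0 < acc_times_eff <= 1.0:
        raise ValueError("A * epsilon must lie in (0, 1]")
    corrected_yield = n_obs_mumu - n_bkg_mumu
    return corrected_yield / acc_times_eff * br_ratio * kappa


if __name__ == "__main__":
    # Illustrative inputs only; 5.94 is an approximate assumed ratio.
    print(z_to_nunu_prediction(50, 2, 0.6, 5.94, 0.10))
```

Since κ is well below one, the looser b tagging working point in the CR increases the available dimuon statistics at the price of an additional simulation-based correction.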
The largest uncertainty in the background estimate arises from the limited event yields in the dimuon CR, corresponding to a 10–100% uncertainty in the prediction. We correct for the estimated contributions to the CR from SM processes other than Z → μ+μ- using simulation samples with an assigned uncertainty of 50% in their normalization. This leads to an uncertainty of 2–20% in the background estimate. Other experimental and theoretical sources of uncertainty, to be discussed in Sect. 7, result in an additional 2–8% uncertainty in Aϵ, and a 2% uncertainty is assigned in all SRs to account for the uncertainty in the Z boson branching fractions. The uncertainty in the b tagging κ factors is assessed by varying the data-to-simulation b tagging correction factors according to their measured uncertainties. Additionally, the dependence of κ on the heavy-flavour content in Z boson events is evaluated by varying the Z + and Z + fractions in simulation by 20% based on the uncertainty in the CMS Z + measurement , resulting in an additional uncertainty of 10–20% in the estimate.
The lost-lepton background in each SR is estimated from a single-lepton CR in data selected by inverting the electron and muon vetoes in events collected with the same trigger as used to record the signal sample. We relax the b tagging requirement in the CRs using the loose working point in the noncompressed selection, while keeping the same requirement as in the SRs for the compressed regions. In all other respects, the CRs are defined through the same selection criteria as the corresponding SRs, including requirements on the HT,12, mCT, Nb, and , to remove any dependence of the prediction on the modelling of these kinematic variables in simulation. The possible contamination from signal in the single-lepton CR is negligible, less than 1%, so no extra requirement on is made. The lost-lepton component of the SM background in each SR, , is estimated once again from the corresponding data via a transfer factor, TLL, determined from simulation:
where is the observed event yield in the single-lepton CR. The transfer factor TLL accounts for effects related to lepton acceptance and efficiency.
The largest uncertainty in the lost-lepton estimate is, as in the previous analyses, due to the statistical uncertainty in the event yields, ranging from 3 to 50%, depending on the SR. Contributions to the CRs from Z → ℓℓ and diboson processes are subtracted using estimates from simulation, and a 50% uncertainty is applied to this subtraction, which leads to an uncertainty of 3–10% in the lost-lepton prediction. The limited event counts in the simulation sample result in a 2–12% uncertainty, while uncertainties related to discrepancies between the lepton selection efficiency in data and simulation give rise to a 3–4% uncertainty in the final estimate. An additional uncertainty of 7% in the τh component accounts for differences in isolation efficiency between muons and single-prong τh decays, as determined from studies with simulated samples of W + jets and tt̄ events. The uncertainties in the b tagging scale factors, which are applied to the simulation to account for differences in b tagging performance between data and simulation and between the b tagging working points, give rise to a systematic uncertainty of 5–10%. Finally, we estimate a systematic uncertainty in the transfer factor to account for differences in the tt̄ and W + jets admixture in the search and control regions. This results in a 20–30% uncertainty in the final prediction.
The Δϕ123 > 0.4 requirement reduces the QCD contribution to a small fraction of the total background in all SRs for both compressed and noncompressed models. We estimate this contribution for each SR by applying a transfer factor to the number of events observed in a CR enriched in QCD events. The CRs are obtained by inverting the Δϕ123 requirement. The transfer factor, TQCD, is measured in a sideband region in data with ∈ [200, 250] GeV and the same requirements on the other variables as in the SRs. This factor is the ratio between the number of QCD events in the Δϕ123 > 0.4 and Δϕ123 < 0.4 subsets of this sideband region. The estimated contribution of other SM processes (, W + jets, single top quark, and diboson production) based on simulated samples is subtracted from the event yields in the CR and each subset of the sideband.
The transfer factor for the noncompressed regions does not vary significantly as a function of HT,12 and mCT. Therefore, we extract the value of TQCD used for the noncompressed SRs from a sideband selected with an inclusive requirement on HT,12 and mCT to reduce the statistical uncertainty in the transfer factor. The transfer factors for the compressed SRs are obtained from sidebands that are subdivided according to the number of b-tagged jets into Nb = 0 and Nb ≥ 1 regions, with the latter used to extract the QCD predictions for the Nb = 1 and Nb ≥ 2 SRs.
The statistical uncertainties due to the limited number of events in the data CRs and the non-QCD simulated samples are propagated to the final QCD estimate, ranging from 10 to 100%. The main uncertainty in TQCD also originates from the statistical uncertainty of the observed and simulated event yields in the sideband region. We assign additional uncertainties for differences in b tagging efficiency between data and simulation and for the subtraction of the non-QCD background contribution in the sideband. The total systematic uncertainty in the QCD prediction varies between 27 and 76% in the compressed SRs, but can be as large as 550% in the noncompressed SRs due to the small event samples in the corresponding sideband in data.
Several categories of systematic uncertainties apply to all three analyses. These include uncertainties arising from the limited event counts in control samples, uncertainties related to the use of simulation in SM background predictions, and a 2.7% uncertainty in integrated luminosity that applies to the estimated signal yields and contributions from rare background processes that are taken directly from simulation, without the use of data control samples.
The limited number of simulated events surviving the stringent requirements on jets and missing transverse momentum in all three searches can lead to a significant statistical uncertainty in background predictions. In the case of background predictions that rely on simulation for accurate modelling of the relevant event kinematics, we assess theoretical uncertainties, primarily those associated with missing higher-order corrections, in the simulated samples by varying the renormalization and factorization scales up and down by a factor of two [75, 76] and by variations of PDFs. The PDF uncertainties are defined by the SD obtained from 100 variations of the NNPDF3.0 PDFs. The uncertainties are then propagated to the final background estimates.
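A minimal sketch of how such theoretical uncertainties might be evaluated for a single predicted yield is given below. The function names are hypothetical. Two choices are assumptions of this sketch rather than statements about the analysis: the sample SD (with N − 1 in the denominator) is used for the PDF variations, and the scale uncertainty is taken as the largest deviation from the nominal yield.

```python
import statistics


def pdf_uncertainty(varied_yields):
    """SD of the yields obtained with the PDF variations (e.g. 100 of them)."""
    if len(varied_yields) < 2:
        raise ValueError("need at least two PDF variations")
    # Assumption: sample standard deviation with N - 1 in the denominator.
    return statistics.stdev(varied_yields)


def scale_uncertainty(nominal, varied_yields):
    """Largest deviation from nominal among the scale-varied yields.

    varied_yields holds the yields obtained with the renormalization and
    factorization scales multiplied by 0.5 and by 2. Taking the envelope
    is an assumption of this sketch.
    """
    return max(abs(y - nominal) for y in varied_yields)


# Hypothetical yields for illustration only.
nominal = 10.0
scale_unc = scale_uncertainty(nominal, [9.5, 10.8])
pdf_unc = pdf_uncertainty([9.0, 11.0])
```

In a real workflow the varied yields would come from reweighting the simulated events; here they are typed in by hand purely to show the bookkeeping.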
When the simulation of the detector response does not adequately describe the data, correction factors are applied to account for the observed discrepancies. Differences in the efficiencies for selecting isolated leptons between simulation and data are measured in Z → ℓℓ events in the case of electrons and muons and in a tt̄-enriched sample for hadronically decaying τ leptons. The observed deviations are accounted for in the form of corrections to the simulation, and the corresponding uncertainties are propagated to the predicted SM yields in the SRs. Correction factors and uncertainties based on measurements of b tagging performance in data and simulation are also applied. They are parameterized by jet kinematics and flavour. We also assess an uncertainty related to the modelling of additional interactions in the simulation. For the rare SM backgrounds with top quarks, predominantly from tt̄ production in association with a Z boson, where the Z boson decays to a pair of neutrinos, an extra uncertainty is estimated to account for the possible mismodelling of the top quark pT spectrum. The efficiency and misidentification rates for the top quark tagging algorithm are compared between data and simulation in CRs as a function of the key kinematic variables. The correction factors are found not to be strongly dependent on the different kinematic variables. The efficiency estimated in simulation agrees with the measured efficiency, while the misidentification rate has to be corrected by 30%. Both correction factors have a 10% uncertainty, estimated from the variations of the efficiency measurement.
All these uncertainties are propagated to the different signal and background estimates to which they apply. The background predictions from control samples in data are affected through the transfer factors that are calculated from simulation corrected to reproduce data. In general these uncertainties are subdominant and the uncertainty in the final background estimate is dominated by the statistical uncertainty of the data control sample.
For the signal samples, differences between the fast simulation and the full Geant4-based model are also taken into account. Lepton selection efficiencies and b tagging performance are found to be different in the fast simulation. We derive appropriate corrections for the fast simulation and propagate the corresponding uncertainties to the predicted signal yields. We also assess an additional uncertainty for the difference in missing transverse momentum resolution between the fast simulation and the full Geant4-based model. This difference in resolution has the largest impact on signal models with small intrinsic missing transverse momentum, as is the case for compressed mass spectra. The modelling of the ISR plays an important role in cases where the top squark and LSP masses are very similar. The uncertainty is determined by comparing the simulated and observed pT spectra of the system recoiling against the ISR jets in tt̄ events, using the method described in Ref. . The effect is generally found to be small, although in scenarios with a compressed mass spectrum the effect can be as large as 30%.
The uncertainties in the signal modelling are determined in each analysis for every SR. The dominant uncertainties in the predicted signal yield arise from the size of the simulated samples in some of the SRs (1–100%), jet energy scale corrections (1–50%), b tagging efficiency corrections used to scale simulation to data (1–35%), and ISR (1–30%). The largest uncertainties are in SRs that have small signal acceptance to a specific model.
The statistical uncertainties of the signal samples are uncorrelated, whereas all other signal systematic uncertainties are considered to be fully correlated among the different SRs and analyses. Since the three analyses predict the backgrounds with different CRs, the treatment of systematic uncertainties is mostly uncorrelated among analyses, except for the estimates based on simulation. Here only the statistical component of the uncertainty is treated as uncorrelated. Systematic uncertainties due to jet energy scale corrections, b tagging efficiency and selection efficiencies are treated as correlated among the different background estimates.
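The practical difference between correlated and uncorrelated treatments can be illustrated with the uncertainty in a yield summed over several SRs. The sketch below is only a simplified picture: the analyses combine SRs statistically rather than by summing yields, and the function name and numbers are hypothetical. Uncorrelated components are added in quadrature, while fully correlated components from a single source are added linearly before entering the quadrature sum.

```python
import math


def summed_yield_uncertainty(uncorrelated, correlated):
    """Absolute uncertainty in the sum of several SR yields.

    uncorrelated: per-SR absolute uncertainties that fluctuate independently
                  (e.g. statistical uncertainties of simulated samples).
    correlated:   per-SR absolute (signed) shifts from one fully correlated
                  source (e.g. a jet energy scale variation).
    """
    uncorr_total = math.sqrt(sum(u * u for u in uncorrelated))
    corr_total = sum(correlated)  # fully correlated shifts add linearly
    return math.hypot(uncorr_total, corr_total)


# Hypothetical numbers for two SRs.
total = summed_yield_uncertainty([3.0, 4.0], [6.0, 6.0])  # hypot(5, 12)
```

For these inputs the uncorrelated part is 5 events, the correlated part is 12 events, and the total is 13 events. Treating the correlated source as uncorrelated would instead give a smaller total, which is why the correlation model matters.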
The data in all three searches are consistent with the background expected from SM processes. The results are interpreted as limits on supersymmetric particle masses in the context of simplified models [77–80] of top or bottom squark pair production.
Different decay modes are considered for top squark pair production. For mass splittings Δm larger than the W boson mass, we consider two decay modes for the top squark: to a top quark and a neutralino, or to a bottom quark and a chargino, where the chargino decays to an LSP. Scenarios with branching fractions of 50 or 100% are considered. The results of the top squark searches in the all-jet and single-lepton final states are combined for these interpretations. For Δm smaller than the W boson mass, only the decay of top squarks to a charm quark and an LSP is considered in this paper. For the pair production of bottom squarks, all bottom squarks are assumed to decay to a bottom quark and an LSP.
The signal yield is corrected for signal contamination of data CRs for each mass hypothesis and each analysis. Typical values are around 5–10%, except for compressed mass spectra, where it can vary between 10 and 50%. The signal contamination is most significant for the top squark production models with a 100% branching fraction, a light LSP, and Δm close to the top quark mass. The 95% confidence level (CL) upper limits on SUSY production cross sections are calculated using a modified frequentist approach with the CLs criterion [81, 82] and asymptotic results for the test statistic [83, 84].
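The idea behind this criterion can be illustrated with a much simpler setting than the one used in the analyses: a single-bin counting experiment with a known background and no systematic uncertainties, where the observed event count itself serves as the test statistic. The sketch below is not the procedure used for the results in this paper, which relies on asymptotic formulae for the test statistic; all names and inputs are hypothetical.

```python
import math


def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    return sum(math.exp(-mu) * mu**k / math.factorial(k) for k in range(n + 1))


def cls_value(n_obs, background, signal):
    """CLs = CL(s+b) / CL(b) for a one-bin counting experiment."""
    cl_sb = poisson_cdf(n_obs, signal + background)
    cl_b = poisson_cdf(n_obs, background)
    return cl_sb / cl_b


def signal_upper_limit(n_obs, background, cl=0.95, s_max=100.0, tol=1e-6):
    """Signal yield at which CLs drops to 1 - cl, found by bisection.

    CLs decreases monotonically with the signal yield, so bisection works.
    s_max is an assumed upper bracket for the search.
    """
    lo, hi = 0.0, s_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cls_value(n_obs, background, mid) > 1.0 - cl:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical example: zero observed events on an expected background of 3.
limit = signal_upper_limit(0, 3.0)
```

For zero observed events the ratio reduces to exp(−s), so the 95% CL limit is −ln(0.05) ≈ 3.0 signal events regardless of the background. This reflects the design goal of the CLs construction: a downward background fluctuation does not produce an arbitrarily strong exclusion.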
The SRs and CRs for top squark searches in the all-jet and single-lepton final states are mutually exclusive. We combine the results of the two searches, treating the systematic uncertainties assigned to the predicted signal and background yields as correlated or uncorrelated depending on the source, as detailed in Sect. 7.
Figure 7 shows 95% CL exclusion limits for top squark pair production, assuming the top quarks in the decay to be unpolarized, together with the upper limit at 95% CL on the excluded signal cross section. All top squarks are assumed to decay to a top quark and an LSP. For Δm < mt the signal samples assume a three-body decay without an off-shell top quark as intermediate particle. The expected exclusion is given by the dashed red line, with the 1 SD experimental uncertainty. The observed exclusion curve is shown as a solid black line together with the 1 SD uncertainty in the theoretical cross section. We do not interpret in the region near Δm ≈ mt when the LSP is very light because of the difficulty in modelling rapidly varying kinematics in this region. In this region an indirect search for top squark pair production can be performed by looking for a small excess in the measured tt̄ cross section compared to the SM expectation [20, 85]. We exclude top squark masses from 280 to 830 GeV for a massless LSP and LSP masses up to 260 GeV for 675 GeV top squarks. At 8 TeV top squark masses were excluded up to 780 GeV for a massless LSP. For models with heavy top squarks and light LSPs, the sensitivity is driven by the top squark analysis in the all-jet final state of Sect. 4, which is more sensitive than the single-lepton analysis (Sect. 5) because of the larger acceptance for signal. The combination extends the expected reach in top squark mass by about 45 GeV. When the LSP is heavier, the cleaner search in the single-lepton final state becomes more important. Both analyses have similar sensitivity in this area of parameter space, and combining them extends the reach in LSP mass by about 30 GeV.
Figure 8 shows the 95% CL exclusion limits for top squark pair production, assuming equal probabilities for the decay modes to a top quark and an LSP and to a bottom quark and a chargino. The chargino in the latter mode decays to a W boson and an LSP. In this model, the chargino is considered to be nearly mass-degenerate with the LSP ( GeV). The W boson decay products originating from the chargino decay are very soft because of the small mass splitting, and might not be detectable. For intermediate LSP masses, top squark masses are probed up to 725 GeV. LSP masses up to 210 GeV are probed for a top squark mass of around 500 GeV. Here, the single-lepton analysis does not contribute much to the combination because of the larger acceptance in the all-jet final state, except at low LSP masses. In most of the mass parameter space the combination reaches ≈ 15 GeV higher than the analysis in the all-jet final state.
The compressed SRs from the bottom squark analysis in the all-jet final state (Sect. 6) are used to set upper limits on the top squark cross sections when the mass splitting between the top squark and the LSP is smaller than the mass of the W boson. Figure 9 shows the expected and observed 95% CL upper limits on the top squark cross sections in the plane of top squark mass versus LSP mass, assuming the top squark always decays to a charm quark and an LSP. Top squarks with masses below 240 GeV are probed in this model, when the mass splitting between the top squark and the LSP is close to 10 GeV. At 8 TeV top squark masses up to 270 GeV were probed for the same Δm.
Figure 10 shows the expected and observed 95% CL upper limits on the bottom squark cross sections in the plane of bottom squark mass versus LSP mass, using both the compressed and noncompressed SRs of the bottom squark analysis. We probe bottom squark masses up to 890 GeV for small LSP masses. With 8 TeV data, bottom squark masses below 650 GeV were excluded [20, 24].
Results are presented from three complementary searches for top or bottom squark–antisquark pairs in data collected with the CMS detector in proton–proton collisions at a centre-of-mass energy of 13 TeV, corresponding to an integrated luminosity of 2.3 fb-1. The search for top squarks is carried out in the all-jet and single-lepton final states, which are combined for the final result. A second search in all-jet events is designed for bottom squark pairs and for top squarks decaying to charm quarks through a flavour changing neutral current process. No statistically significant excess of events is observed above the expected standard model background, and exclusion limits are set at 95% confidence level in the context of simplified models of direct top and bottom squark pair production. Limits for top squark masses of 830 GeV are established for a massless LSP, and for LSP masses up to 260 GeV for a 675 GeV top squark mass, when all top squarks are assumed to decay to a top quark and an LSP. When the top squarks can also decay to a bottom quark and a chargino, this reach is reduced. Assuming a mass splitting between the top squark and the LSP close to 10 GeV, and top squarks that decay to a charm quark and an LSP, top squark mass limits up to 240 GeV are established. Finally, bottom squark mass limits up to 890 GeV are established for small LSP masses. The results extend the reach with respect to previous limits obtained from LHC Run 1 data in most of the parameter space.
We congratulate our colleagues in the CERN accelerator departments for the excellent performance of the LHC and thank the technical and administrative staffs at CERN and at other CMS institutes for their contributions to the success of the CMS effort. In addition, we gratefully acknowledge the computing centres and personnel of the Worldwide LHC Computing Grid for delivering so effectively the computing infrastructure essential to our analyses. Finally, we acknowledge the enduring support for the construction and operation of the LHC and the CMS detector provided by the following funding agencies: the Austrian Federal Ministry of Science, Research and Economy and the Austrian Science Fund; the Belgian Fonds de la Recherche Scientifique, and Fonds voor Wetenschappelijk Onderzoek; the Brazilian Funding Agencies (CNPq, CAPES, FAPERJ, and FAPESP); the Bulgarian Ministry of Education and Science; CERN; the Chinese Academy of Sciences, Ministry of Science and Technology, and National Natural Science Foundation of China; the Colombian Funding Agency (COLCIENCIAS); the Croatian Ministry of Science, Education and Sport, and the Croatian Science Foundation; the Research Promotion Foundation, Cyprus; the Secretariat for Higher Education, Science, Technology and Innovation, Ecuador; the Ministry of Education and Research, Estonian Research Council via IUT23-4 and IUT23-6 and European Regional Development Fund, Estonia; the Academy of Finland, Finnish Ministry of Education and Culture, and Helsinki Institute of Physics; the Institut National de Physique Nucléaire et de Physique des Particules/CNRS, and Commissariat à l’Énergie Atomique et aux Énergies Alternatives/CEA, France; the Bundesministerium für Bildung und Forschung, Deutsche Forschungsgemeinschaft, and Helmholtz-Gemeinschaft Deutscher Forschungszentren, Germany; the General Secretariat for Research and Technology, Greece; the National Scientific Research Foundation, and National Innovation Office, Hungary; the 
Department of Atomic Energy and the Department of Science and Technology, India; the Institute for Studies in Theoretical Physics and Mathematics, Iran; the Science Foundation, Ireland; the Istituto Nazionale di Fisica Nucleare, Italy; the Ministry of Science, ICT and Future Planning, and National Research Foundation (NRF), Republic of Korea; the Lithuanian Academy of Sciences; the Ministry of Education, and University of Malaya (Malaysia); the Mexican Funding Agencies (BUAP, CINVESTAV, CONACYT, LNS, SEP, and UASLP-FAI); the Ministry of Business, Innovation and Employment, New Zealand; the Pakistan Atomic Energy Commission; the Ministry of Science and Higher Education and the National Science Centre, Poland; the Fundação para a Ciência e a Tecnologia, Portugal; JINR, Dubna; the Ministry of Education and Science of the Russian Federation, the Federal Agency of Atomic Energy of the Russian Federation, Russian Academy of Sciences, the Russian Foundation for Basic Research and the Russian Competitiveness Program of NRNU MEPhI (M.H.U.); the Ministry of Education, Science and Technological Development of Serbia; the Secretaría de Estado de Investigación, Desarrollo e Innovación and Programa Consolider-Ingenio 2010, Spain; the Swiss Funding Agencies (ETH Board, ETH Zurich, PSI, SNF, UniZH, Canton Zurich, and SER); the Ministry of Science and Technology, Taipei; the Thailand Center of Excellence in Physics, the Institute for the Promotion of Teaching Science and Technology of Thailand, Special Task Force for Activating Research and the National Science and Technology Development Agency of Thailand; the Scientific and Technical Research Council of Turkey, and Turkish Atomic Energy Authority; the National Academy of Sciences of Ukraine, and State Fund for Fundamental Researches, Ukraine; the Science and Technology Facilities Council, UK; the US Department of Energy, and the US National Science Foundation. 
Individuals have received support from the Marie-Curie programme and the European Research Council and EPLANET (European Union); the Leventis Foundation; the A. P. Sloan Foundation; the Alexander von Humboldt Foundation; the Belgian Federal Science Policy Office; the Fonds pour la Formation à la Recherche dans l’Industrie et dans l’Agriculture (FRIA-Belgium); the Agentschap voor Innovatie door Wetenschap en Technologie (IWT-Belgium); the Ministry of Education, Youth and Sports (MEYS) of the Czech Republic; the Council of Science and Industrial Research, India; the HOMING PLUS programme of the Foundation for Polish Science, cofinanced from European Union, Regional Development Fund, the Mobility Plus programme of the Ministry of Science and Higher Education, the National Science Center (Poland), contracts Harmonia 2014/14/M/ST2/00428, Opus 2014/13/B/ST2/02543, 2014/15/B/ST2/03998, and 2015/19/B/ST2/02861, Sonata-bis 2012/07/E/ST2/01406; the Thalis and Aristeia programmes cofinanced by EU-ESF and the Greek NSRF; the National Priorities Research Program by Qatar National Research Fund; the Programa Clarín-COFUND del Principado de Asturias; the Rachadapisek Sompot Fund for Postdoctoral Fellowship, Chulalongkorn University and the Chulalongkorn Academic into Its 2nd Century Project Advancement Project (Thailand); and the Welch Foundation, contract C-1845.