One of the most ambitious goals in metabolic engineering is the design of biological systems based on in silico predictions using mathematical models. The advent of high-throughput technologies and the completion of genome sequencing for many organisms have led to an explosion of systems-wide biological data [1, 2]. Genome-scale stoichiometric models have now been developed for an increasing number of microorganisms and mammalian cells [3, 4]. Some of these models have been used to identify gene knockout targets for the efficient production of industrially important chemicals, including amino acids [5, 6] and chemicals that are conventionally derived from petroleum [7-9]; others have been used to identify drug targets in pathogens [10-12]. In such modeling and simulation approaches, target reactions whose knockout is predicted to cause overproduction of the chemical of interest can be easily tested experimentally by deleting the corresponding genes in the microbial host.
Increasing the expression levels of the relevant genes has also been successfully employed for the overproduction of target chemicals [13, 14]. To avoid unnecessarily extensive experiments, several computational algorithms have been devised to reveal the relationship between metabolic reactions and the biological properties of interest [15-27]. However, identifying gene amplification targets is more complicated than identifying gene knockout targets; hence, correlations among genes, mRNAs, transcriptional and translational regulation, proteins, and metabolic fluxes must be carefully examined. Genome-scale metabolic models that rely on constraint-based flux analysis without additional physiological information are limited in their ability to describe the complex nature of biological systems, particularly biological phenomena beyond metabolism. Several systematic methods have been developed to overcome such limitations: flux variability analysis (FVA) [17, 19-21], flux coupling analysis [16-18], flux sensitivity analysis [15], flux response analysis [26], OptReg [22], genetic design through local search [25], OptForce [27], and flux scanning based on enforced objective flux (FSEOF) [23]. In particular, FSEOF first scans the metabolic fluxes for variations in response to an enforced flux directed toward a target product. Reactions whose flux values increase in accordance with the enforced flux toward the target chemical are then selected as amplification targets. This method was experimentally validated by identifying amplification targets that improved the production of lycopene in Escherichia coli [23]. These approaches demonstrated that incorporating physiological constraints during model simulation is critical to identifying trustworthy gene amplification targets, but much improvement is still needed [24, 28]. One of the major problems is that the flux solution space in such optimization problems is too large.
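The FSEOF procedure described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the five-reaction toy network, the reaction names, the use of SciPy's `linprog` as the LP solver, the choice to scan up to 90% of the maximum product flux, and the selection rule (absolute flux strictly increasing with no sign change). It is not the implementation used in [23].

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network, for illustration only.
# Columns: uptake (-> A), growth (A ->), path (A -> B),
#          product (B ->), byproduct (A ->)
RXNS = ["uptake", "growth", "path", "product", "byproduct"]
S = np.array([
    [1, -1, -1,  0, -1],   # mass balance of metabolite A
    [0,  0,  1, -1,  0],   # mass balance of metabolite B
], dtype=float)
BOUNDS = [(0, 10), (0, None), (0, None), (0, None), (0, None)]


def maximize(objective, bounds):
    """Maximize the flux of one reaction subject to S v = 0 and the bounds."""
    c = np.zeros(len(RXNS))
    c[RXNS.index(objective)] = -1.0          # linprog minimizes, so negate
    res = linprog(c, A_eq=S, b_eq=np.zeros(S.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x


def fseof(product="product", biomass="growth", steps=5, tol=1e-6):
    """Scan enforced product fluxes and pick reactions whose flux rises."""
    p = RXNS.index(product)
    v_init = maximize(biomass, BOUNDS)[p]    # product flux at maximal growth
    v_max = maximize(product, BOUNDS)[p]     # theoretical maximum product flux
    # Enforced levels between the initial and (assumed) 90% of the maximum.
    levels = np.linspace(v_init, 0.9 * v_max, steps + 1)[1:]

    profiles = []
    for level in levels:
        bounds = list(BOUNDS)
        bounds[p] = (level, level)           # enforce the product flux
        profiles.append(maximize(biomass, bounds))
    profiles = np.array(profiles)

    targets = []
    for j, name in enumerate(RXNS):
        if j == p:
            continue                         # the enforced reaction itself
        col = profiles[:, j]
        same_sign = np.all(col >= -tol) or np.all(col <= tol)
        increasing = np.all(np.diff(np.abs(col)) > tol)
        if same_sign and increasing:
            targets.append(name)
    return levels, profiles, targets


levels, profiles, targets = fseof()
print(targets)
```

On this toy network only `path`, the reaction that feeds the product, is selected, whereas `growth` decreases and `uptake` stays at its upper bound. In a genome-scale model the optimum at each enforced level is generally not unique, which is the solution-space problem noted above.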
In this study, to systematically handle the large flux solution space, which was also revealed in the implementation of FSEOF [23], we considered functionally grouped reactions that simultaneously carry fluxes, based on unique features of microbial genomes. Considering such functionally grouped reactions helps to reduce the number of multiple solutions that exist for each optimal objective value and to select among them, enabling the identification of more reliable gene amplification targets when combined with FSEOF. Grouped reactions were previously revealed by genomic context and flux-converging pattern analyses as promising constraints [28]. Genomic context analysis interrogates conserved neighborhood, gene fusion, and co-occurrence using the STRING database, with the goal of suggesting groups of reaction fluxes that are most likely correlated in their on/off activities [28, 29]. Flux-converging pattern analysis further limits the range of possible flux values in a metabolic reaction by examining the number of carbon atoms in the metabolites that participate in the reactions and the converging patterns of fluxes from a carbon source (see Methods and Figure) [28]. Consequently, flux balance analysis (FBA) with constraints controlling the simultaneous on/off activity (C_on/off) and the flux scale (C_scale) of the metabolic reactions accurately predicted flux distributions in gene knockout mutant strains [28].
Based on these analyses, grouping reaction (GR) constraints, which force grouped reactions to carry fluxes together regardless of the condition, were incorporated into the E. coli genome-scale metabolic model. The model then facilitated scanning of changes in the variability of metabolic fluxes, using FVA, in response to enforced enhancement of the flux toward a target chemical. This newly developed method, called flux variability scanning based on enforced objective flux (FVSEOF) with GR constraints, was employed in this study to identify gene amplification targets for the production of target chemicals. FVSEOF with GR constraints was first validated against amplification targets reported for the production of shikimic acid and putrescine in E. coli, and then further validated by engineering E. coli for the enhanced production of putrescine based on new amplification targets.
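The idea of scanning flux variability under an enforced product flux, and the way a grouping constraint can narrow that variability, may be sketched as follows. This is a rough illustration under stated assumptions, not the formulation used in this study: the toy network with two parallel routes is hypothetical, growth is held at its optimum (within a small tolerance) during FVA, and the GR constraint is replaced by a simple stand-in, the linear coupling `route1 = route2`. The actual C_on/off and C_scale constraints are described in Methods and in [28]. SciPy's `linprog` is assumed as the LP solver.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network with two parallel routes from A to B.
# Columns: uptake (-> A), growth (A ->), route1 (A -> B),
#          route2 (A -> B), product (B ->)
RXNS = ["uptake", "growth", "route1", "route2", "product"]
S = np.array([
    [1, -1, -1, -1,  0],   # mass balance of metabolite A
    [0,  0,  1,  1, -1],   # mass balance of metabolite B
], dtype=float)
BOUNDS = [(0, 10), (0, None), (0, None), (0, None), (0, None)]

# Stand-in for a grouping constraint (an assumption, not the paper's form):
# route1 - route2 = 0, i.e. the two grouped reactions carry flux together.
GROUP_ROW = np.array([[0, 0, 1, -1, 0]], dtype=float)


def solve(c, bounds, grouped):
    """Minimize c . v subject to steady state, bounds, and optional grouping."""
    A = np.vstack([S, GROUP_ROW]) if grouped else S
    res = linprog(c, A_eq=A, b_eq=np.zeros(A.shape[0]),
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res


def flux_ranges(level, grouped):
    """FVA at one enforced product flux, with growth held near its optimum."""
    g, p = RXNS.index("growth"), RXNS.index("product")
    bounds = list(BOUNDS)
    bounds[p] = (level, level)               # enforce the product flux

    c = np.zeros(len(RXNS))
    c[g] = -1.0                              # maximize growth first
    best = solve(c, bounds, grouped).x[g]
    bounds[g] = (best - 1e-9, None)          # keep growth at its optimum

    ranges = {}
    for j, name in enumerate(RXNS):
        c = np.zeros(len(RXNS))
        c[j] = 1.0
        lo = solve(c, bounds, grouped).fun   # minimum flux of reaction j
        hi = -solve(-c, bounds, grouped).fun # maximum flux of reaction j
        ranges[name] = (lo, hi)
    return ranges


for level in (2.0, 4.0, 8.0):
    free = flux_ranges(level, grouped=False)["route1"]
    tied = flux_ranges(level, grouped=True)["route1"]
    print(level, free, tied)
```

Without the coupling, `route1` can take any value between zero and the enforced product flux, so a single FBA solution at each level is arbitrary. With the coupling, its range collapses to a single value that rises with the enforced flux, which makes the trend across enforced levels easier to interpret.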