Natural products have contributed to the development of many drugs for diverse indications. While most U.S. pharmaceutical companies have reduced or eliminated their in-house natural product groups, new paradigms and new enterprises have evolved to carry on a role for natural products in the pharmaceutical industry. Many of the reasons for the decline in popularity of natural products are being addressed by the development of new techniques for screening and production. This overview aims to inform pharmacologists of current strategies and techniques that make natural products a viable strategic choice for inclusion in drug discovery programs.
Humans have long used naturally occurring substances for medical purposes. Plants, in particular, have played a leading medical role in most cultures. With the development of the science of chemistry at the beginning of the 19th century, plants began to be examined more closely to understand why they were medically useful. In 1804 Sertürner purified morphine from opium and found that it largely reproduced the analgesic and sedative effects of opium (Lockemann, 1951). His success led others to seek “active principles” of medicinal plants, and throughout the century, bioactive pure natural products were found in cinchona (quinine) (Borchardt, 1996), coca (cocaine) (Gay et al., 1975), and many other plants. The ability to determine the structure of these compounds developed more slowly, with morphine's planar structure determined in 1923 (Gulland and Robinson, 1923), quinine's structure elucidated in 1908 (Rabe, 1908) and cocaine's in 1898 (Willstätter and Müller, 1898). The ability to synthesize these compounds took longer yet; for example, morphine was first synthesized in 1956 (Gates and Tschudi, 1956). While the active principle approach has not been a universal explanation for all biological activities of natural substances, it remains the most productive hypothesis.
The identification of penicillin's antibacterial activity by Fleming (Fleming, 1929) and its isolation by Chain and Florey (Chain et al., 1940) revolutionized medicine and led to extensive screening of microbes, particularly soil actinomycetes and fungi, to identify other antibiotic compounds. Using simple bioassays, microbes from soil samples were cultured, identified, and dozens of classes of antibiotics were isolated and elucidated; many of them were commercialized and are still used in clinical practice (Wenzel, 2004). While the evolution of drug resistance in clinically important infections has limited the use of many natural antibiotics, their discovery and commercialization laid the scientific and financial foundation of the modern pharmaceutical industry after World War II.
Pharmaceutical industry interest in developing cancer treatments was minimal during the antibiotic era and into the 1970s. To stimulate interest, the US National Cancer Institute supported an extensive academic network examining plant sources of potential anti-cancer drugs from 1960 onward. Taxol (Wani et al., 1971) and camptothecin analogs (Lerchen, 2002; Wani and Wall, 1969) were the most prominent developments from that program. Unfortunately, neither drug reached the market until the early 1990s. Difficulty in obtaining commercial quantities of taxol slowed its advancement, while camptothecin proved to have poor solubility, requiring modifications to its structure to achieve clinical activity. Once it reached the market, however, taxol rapidly became a blockbuster drug and continues to be a major part of cancer therapy.
Pharmaceutical companies have reduced their research investment in natural products over the last decade. Companies such as Merck (Mullin, 2008) and Bristol Myers Squibb have cut staffing and eventually closed in-house programs in natural products. This trend has been most visible in the United States, with some European and Japanese companies continuing support for natural products groups. Several reasons have been given for this trend:
This is a valid critique. Current HTS campaigns attempt to compress the testing and prioritization of hits into a period of several months. Even if natural product extracts are tested first, the pace of natural product isolation is hard-pressed to keep up with the demand for hit structures by the end of the screening campaign. However, a number of strategies detailed below have been developed to address this problem.
Natural product samples have most often been tested as whole fermentation broths, or as crude extracts of plants and marine organisms. Once a hit has been confirmed in biological screening, the extract must be fractionated to isolate the active compounds, and this process typically requires that bioassays be conducted at each level of purification. Thus the length of time required to conduct the bioassay and report the results, and the number of separation cycles needed to obtain pure compounds, are factors which dictate the time it takes to process a natural product hit. Even when cycles are made on a weekly basis using a rapid bioassay, it is unusual for a natural product extract hit to yield a pure compound after less than a month's work. Other factors that may impact speed are instability of compounds, difficult separations, and unreliability of bioassays.
This perception is sometimes expressed by the phrase “That pond's all fished out.” It is true that the number of species on Earth is finite; however, it is also true that only a very small fraction of all species have been chemically investigated, let alone examined in a broad panel of bioactivities. The number of higher plant species is estimated to be between 300,000 and 400,000. The largest plant screening program of the 1960s was conducted by Smith Kline & French: about 19,000 species were screened for alkaloid content using a simple color test (Raffauf, 1996). The U.S. National Cancer Institute has actively collected higher plants for screening for over 20 years and currently has a collection representing about 30,000 plant species, or 10 percent of the known species.
There is no easy way to tally the number of microbial samples which have been screened for biological activity, since the typical protocol in microbial screening is to perform only minimal identification of the species before starting biological activity tests; certainly the number of microbial samples screened has been enormous, but the taxonomic diversity of those samples was limited by the predilection for soil samples and the difficulty in growing all but a small fraction of microbes in culture. Recent advances in environmental microbiology have shown that there is an enormous unsampled microbiota (Epstein and López-Garcia, 2008). In view of these limitations it is probably more apt to say not that the pond has been fished out, but that new types of bait or new fishing strategies may be required to properly exploit it.
In the marine environment, marine invertebrates have been heavily sampled in the last two decades, and have provided abundant new chemistry and biology (Blunt et al., 2004). However, the extent of biodiversity among marine invertebrates is unknown, though most probably it is large, given that life evolved first in the marine environment. The true diversity of marine life will not soon be understood, at least by classical methods, since there are too few taxonomists to identify and classify new species, and only the easily SCUBA-accessible, shallow, warm marine waters have been thoroughly explored.
The argument that there is little more to be discovered in natural products is reminiscent of the claim by some 19th century physicists that their field was nearing completion. While this was perhaps true of Newtonian physics, events of the last century have clearly shown how blinkered those scientists were. Even if new developments in natural products consist of humble improvements in techniques and understanding rather than revolutionary advances, it seems clear that many “fish” remain in the pond.
Natural product structures span the range from very simple to extremely complex (Figure 1). With improvements in structure elucidation capability, it has been possible to determine the complete stereostructure of natural compounds as complex as the palytoxins (Moore and Bartolini, 1981; Uemura et al., 1985a), which are compounds of molecular weight >2650 Da incorporating >60 chiral centers. Such compounds obviously will never be suitable candidates for commercial total synthesis. However, the vast majority of natural products isolated and elucidated to date are <1000 Da. In many cases, commercial drug products have been developed by synthetic modification of a naturally produced precursor, whose chemical synthesis is not required. Alternatively, structure-activity studies connected with total synthesis may identify fragments of the parent structure with biological activity, and this may permit a drastic reduction of the size and chirality of a bioactive natural product. Two examples where this approach has succeeded are those of bryostatin (Wender et al., 2005) and halichondrin (Dabydeen et al., 2006), which will be discussed in more detail below.
Obtaining large supplies of a natural compound for preclinical studies can be a challenge. If the compound is derived from a plant which grows in a remote tropical location, physical access for a recollection may be difficult, or permission to collect and ship the material may be hard to obtain. Alternatively, the plant may only produce quantities of the desired compound under certain environmental or ecological conditions. A marine organism may require an expensive expedition, especially if the animal grows in deep waters or in regions with strong or unpredictable currents. Even when one has a microbial culture in hand, the factors that induce production of the metabolite may be poorly understood. Pharmaceutical companies clearly prefer predictable, controllable sources, and for commercial viability, solutions must be found that accommodate the vagaries of natural product production. Some approaches to solving these problems are covered in section 5, below.
Parallel synthesis techniques provide the means to create synthetic libraries of hundreds of thousands of distinct compounds. However, such rapid synthetic techniques have not led immediately to successful drug development. Early combinatorial libraries were composed of compounds with poor solubility and few useful hits were found. In some cases, the quantities of compound produced were very small and the purity was not well controlled. More recently, smaller focused libraries have yielded some useful drug leads, but the most powerful role of parallel synthesis appears to be in expanding an existing lead, rather than in creating screening libraries.
The metabolic energy and the genetic cost of making a small molecule require that the molecule provide some benefit to the organism, whether through defending it against predators, communicating within its population, or interfering with competing organisms. While most functions of natural products in their producing organism are not currently known, opinion has shifted markedly since the days when natural products were viewed as waste products (Mothes, 1969). Whatever the precise role, it is becoming clear that many natural products are able to reach receptor sites on or within cells, just as a drug must do. The large number of pure natural products which have been found to interact with specific mammalian receptors testifies to the inherent bioactivity in natural products. For example, at the GABA receptor, known natural product ligands include muscimol (Brehm et al., 1972), bicuculline (Johnston et al., 1972), securinine (Beutler et al., 1985), and picrotoxin (Akaike et al., 1985).
While chemists may be as creative as natural systems, the natural systems have been at it for a much longer time. The most important and visible value of natural products chemistry is the introduction of novel molecular skeletons and functionalities that have not previously been conceived of by humans. Some examples include mitomycin (Stevens et al., 1965), bleomycin (Umezawa, 1976), and esperamicin (Golik et al., 1987).
Lipinski's “rule of five” guidelines were developed to drive synthetic chemists towards compounds which have better biophysical properties and are thus better orally active drug candidates. Thus, compounds should have a molecular weight under 500 Da, possess no more than 5 hydrogen bond donors and no more than 10 hydrogen bond acceptors, and have log P no greater than 5 (Lipinski et al., 1997). What is not well appreciated is that Lipinski explicitly excluded natural products from the rules, primarily for the reasons set forth above (see Secondary metabolites have evolved to be bioactive), and because they often utilize transmembrane transporters rather than passive diffusion to enter cells (Lipinski et al., 1997).
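The criteria above can be expressed as a simple filter. The following is a minimal sketch, not a validated cheminformatics tool: the `Descriptors` class, the function name, and the example descriptor values are hypothetical and chosen only for illustration, and values lying exactly on a threshold are assumed to pass.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Descriptors:
    """Precomputed properties of a compound (illustrative container)."""
    mol_weight_da: float
    h_bond_donors: int
    h_bond_acceptors: int
    log_p: float


def rule_of_five_violations(d: Descriptors) -> List[str]:
    """Return the names of the rule-of-five criteria that the compound exceeds."""
    violations = []
    if d.mol_weight_da > 500:
        violations.append("molecular weight")
    if d.h_bond_donors > 5:
        violations.append("hydrogen bond donors")
    if d.h_bond_acceptors > 10:
        violations.append("hydrogen bond acceptors")
    if d.log_p > 5:
        violations.append("log P")
    return violations


# Hypothetical descriptor values, for illustration only.
small_compound = Descriptors(mol_weight_da=300.0, h_bond_donors=2,
                             h_bond_acceptors=4, log_p=2.0)
large_polar_compound = Descriptors(mol_weight_da=2680.0, h_bond_donors=40,
                                   h_bond_acceptors=60, log_p=-1.0)

small_violations = rule_of_five_violations(small_compound)
large_violations = rule_of_five_violations(large_polar_compound)
```

A compound that fails such a filter is not necessarily a poor drug candidate; as the text notes, natural products were excluded from the rules in the first place.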
High throughput drug screening grew out of automated clinical analyzer technology and miniaturization in the late 1980s, as drug screeners sought methods to increase the pace of testing and lower the costs per sample. Robotic methods of sample manipulation and specialized detectors capable of reading 96-well microtiter plates were developed. At the same time, the emphasis of screening shifted from empirical measures of cell growth or function to molecular targets. This was driven by increasing knowledge of genes and receptor biology.
In its most extreme reductionist forms, targeted screening started with detection of interaction of test compounds with a purified, naked protein. Hits from that experimental model would then be tested in a functional assay before progression to a cellular, and then tissue level of complexity. Since the highest level of reductionism provides the lowest barrier to successfully finding hits, however, the large number of hits generated has to be filtered by secondary, tertiary, and even quaternary assays. Abundant and common natural products such as tannins (see section 7.2.1 below) overwhelmed reductionist assay strategies with high hit rates.
There has been a substantial shift in the last decade to screening assays conducted in cells, and assays in which biological function is directly measured. These typically can be tuned to higher stringency, and lower hit rates, while delivering hit samples with the desired biological properties.
Even in cellular assays, or in functional cell-free assays, natural product samples are not always well-behaved. The question arises as to whether it is better to adapt the assay to the sample or vice versa. Both tactics have had some success, and the path chosen may depend on relative availability of resources in chemistry and biology groups.
A common problem with natural product extracts is that a substantial proportion of them fluoresce in the fluorescein wavelength range (emission maximum 521 nm). This leads to a high false positive rate in a screen with a direct fluorescent endpoint. If the fluorophore endpoint is changed to a label which emits at >560 nm (Cy3B, for example), much less sample autofluorescence is seen, and the false positive rate declines. Alternatively, use of a time-resolved fluorescence label also substantially decreases sample interference. Most sample autofluorescence has a short half-life (ca. 10 ns), while europium fluorescence labels, for example, have a much longer half-life (ca. 700 μs). Thus, by gating the photodetector to record the signal after a 1 μs delay, the majority of the sample autofluorescence is filtered out, while the label is sensitively detected (Hemmilä and Webb, 1997).
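The effect of gating can be illustrated with simple first-order decay. This is a minimal sketch that assumes single-exponential decay and uses round numbers: a 10 ns half-life for autofluorescence, a europium label half-life of roughly 700 microseconds, and a microsecond-scale gate delay. The function name is hypothetical and the values are assumptions for illustration, not instrument settings.

```python
def fraction_remaining(delay_s: float, half_life_s: float) -> float:
    """Fraction of the initial emission still present after a delay,
    assuming simple exponential decay with the given half-life."""
    return 0.5 ** (delay_s / half_life_s)


GATE_DELAY_S = 1e-6             # assumed 1 microsecond detector delay
AUTOFLUOR_HALF_LIFE_S = 10e-9   # assumed 10 ns sample autofluorescence
EUROPIUM_HALF_LIFE_S = 700e-6   # assumed ~700 microsecond europium label

autofluor_left = fraction_remaining(GATE_DELAY_S, AUTOFLUOR_HALF_LIFE_S)
label_left = fraction_remaining(GATE_DELAY_S, EUROPIUM_HALF_LIFE_S)

print(f"autofluorescence remaining: {autofluor_left:.1e}")
print(f"europium label remaining:   {label_left:.4f}")
```

Under these assumptions the delay spans about 100 autofluorescence half-lives but only a small fraction of one label half-life, which is why the interference essentially vanishes while nearly all of the label signal is retained.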
One approach to sample modification which has attracted significant interest is that of “prefractionating” the crude extract. In its most complex forms, this means isolating pure compounds and partially characterizing them before testing them in bioassays. Several companies have embraced this as a business model, with mixed success (Bindseil et al., 2001; Eldridge et al., 2002). Simpler, lower cost strategies which separate the crude extract into 5-15 samples based on a single chromatography step, followed by solvent evaporation, may provide much of the benefit at a reduced cost (Bugni et al., 2008; Wagenaar, 2008). All of these approaches require an investment in automation. Automated weighing capability, flexible programmable liquid handling, and low cost separation media are required to carry out the steps.
The benefits of this type of approach are several: 1) cytotoxic compounds which might mask activity of another compound in a cellular assay may be separated; 2) minor constituents are concentrated and can be tested at higher effective concentration; 3) very polar or lipophilic constituents of an extract can be ignored or discarded entirely. The initial testing results from several laboratories which have adopted prefractionation strategies support their use and have demonstrated higher hit rates in screening assays.
As noted above, plants have historically played the leading role in providing drugs or templates for drugs, with microbes following in the antibiotic era. Screeners have more recently examined marine sources, once the invention of SCUBA made it easier to collect and study algae and marine invertebrates. While only a few marine natural products have reached commercial drug status, many marine compounds have proven to have activity in screens and quite a few have been evaluated preclinically. Adequate compound supply has been a major roadblock to the advancement of compounds from marine invertebrate sources. For example, bryostatin 1 was initially produced from its marine source organism, Bugula neritina, under Good Manufacturing Practices (Schaufelberger et al., 1991). However, a mere 18 g of material was purified from 14,000 kg of the producing bryozoan. Mariculture of the same animal has since been accomplished with successful production of bryostatin 1 (Mendola, 2003).
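The scale of the supply problem is easier to appreciate as a yield figure. The sketch below simply redoes the arithmetic for the bryostatin 1 numbers quoted above (18 g from 14,000 kg); the function names are hypothetical, and the calculation assumes the same isolation yield would hold for any future collection.

```python
def yield_ppm(product_g: float, biomass_kg: float) -> float:
    """Isolated yield in parts per million by mass (g product per 10^6 g biomass)."""
    biomass_g = biomass_kg * 1000.0
    return product_g / biomass_g * 1e6


def biomass_needed_kg(target_g: float, product_g: float, biomass_kg: float) -> float:
    """Biomass required for a target amount, assuming the same yield is reproduced."""
    return target_g * biomass_kg / product_g


bryostatin_ppm = yield_ppm(product_g=18.0, biomass_kg=14000.0)
kg_per_gram = biomass_needed_kg(target_g=1.0, product_g=18.0, biomass_kg=14000.0)

print(f"yield: {bryostatin_ppm:.2f} ppm by mass")
print(f"biomass per gram of product: {kg_per_gram:.0f} kg")
```

On these figures the yield is a little over one part per million, so each gram of compound corresponds to several hundred kilograms of collected animal.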
A few programs have used insects as a screening source, notably in a collaboration between Merck and InBio in Costa Rica (Sittenfeld et al., 1999), and in the Eisner lab at Cornell University (Schröder et al., 1998). Also notable is the work of John Daly using amphibians as a rich source of bioactive compounds (Daly et al., 2005). Epibatidine, a frog alkaloid (Badio and Daly, 1994), served as the stimulus for design of the analgesic drug candidate ABT-594 (Arneric et al., 2007).
Natural product researchers often encounter difficulties in obtaining reliable production of desired compounds from their producing organism. For example, it is common in microbial screening to confirm bioactivity by regrowing the microbe under the same conditions under which the initial screening sample was produced, and in these cases a success rate of 50 percent is not unusual. Similarly, when a plant is collected for reisolation of constituents, it is not unusual to find lower amounts of the desired metabolite, or no compound at all. With marine invertebrates this is also quite common. This has been mentioned above in section 2.4.
The reasons for these problems are poorly understood, but clearly there are a variety of causes. With microbes, obtaining good production of a desired metabolite is often a matter of studying the culture conditions (growth media, time, temperature, oxygenation, etc.) and defining the best conditions for reliable production. With plants, the problem may be a poor understanding of taxonomy; careful botanical field studies may reveal several closely related species, only one of which produces the compound in question (McKee et al., 1998a). Dependence of metabolite production on environmental factors (climate, season, herbivore pressure) often plays an important role for plants and requires study. In marine invertebrates a poor understanding of taxonomy plays a role, but additionally, vectoring of metabolites from one organism to its predator and sequestration in the second organism has been shown to be important in several cases (Paul and Ritson-Williams, 2008; Thoms et al., 2006). Dietary sources of bioactive compounds have also been identified in amphibians which consume arthropods and other small leaf litter animals (Saporito et al., 2003; Saporito et al., 2004; Saporito et al., 2007).
A final reason for erratic production may be that the higher organism is not the source of the compound at all; it may be produced by a microbial symbiont. In many cases marine invertebrates have been found to contain compounds which look suspiciously like microbial metabolites (Simmons et al., 2008). In some cases, similar compounds have been isolated from both a marine invertebrate and a microbe (McKee et al., 1998b; Suzumura et al., 1997). If the microbe is an obligate symbiont, proof of the relationship may be difficult to obtain. A very good case has recently been made by Haygood's group that bryostatins are produced by a symbiont; however, the details of the symbiosis are yet to be completely defined (Hildebrand et al., 2004). Similar reports for some plant-derived compounds are also intriguing, as in the isolation of taxol from an endophytic fungal associate of the Pacific yew (Stierle et al., 1993).
Organic chemists have made great strides in their ability to synthesize complex, chiral molecules such as natural products. While difficulty and cost still scale with the number of chiral centers and molecular weight, total synthetic approaches to natural products are becoming increasingly viable as a sourcing option. Given sufficient resources, it is possible to reduce the number of synthetic steps required to reach the target molecule and improve the yield at each step, while using inexpensive starting materials. Bryostatin once again provides a good example. Bryostatin 2 has been synthesized in 40 steps (Evans et al., 1999), and although a total synthesis of bryostatin 1 has not yet been reported, bryostatin 2 can be converted to bryostatin 1 (Pettit et al., 1991b). Wender's group, bypassing synthesis of the natural product, has developed synthetic routes to “bryologs” (Figure 2) which have potent activity similar to bryostatin 1 but have simplified structures. One recent, highly active bryolog was prepared in 10 steps in an overall yield of 30 percent (Wender et al., 2008).
A second example of synthetic success with a complex natural product is that of halichondrin B. Wild collection of the producing sponge gave poor yields (Pettit et al., 1991a; Uemura et al., 1985b). Mariculture in New Zealand yielded similar levels of halichondrin B (Munro et al., 1999). Total synthesis by the Kishi group was accomplished (Aicher et al., 1992), and in the process, several fragments of half the size of the natural product were identified which possessed all of the bioactivity (Dabydeen et al., 2006; Wang et al., 2000). These studies have led to the current clinical development of eribulin by Eisai (Figure 3) (Newman, 2007).
Investigation of the biosynthetic pathways which lead to secondary natural products has gained momentum as DNA sequencing tools have improved (Galm and Shen, 2007). The biosynthesis of polyketide natural products has attracted the most attention, since many commercial antibiotics are largely derived from this pathway. Non-ribosomal peptide synthesis, terpenoid biosynthesis and flavonoid pathways have also been elucidated in many organisms. A key observation has been that many such pathways consist of modular gene clusters which can be manipulated as a whole unit (Donadio et al., 1991). Polyketide synthase modules share enough homology that they can be isolated from relatively distantly related organisms by lowering the stringency of hybridization reactions. In fact, such modules may be detected in uncultivatable microbes (Piel, 2002).
This opens up the possibility of expressing the module in a convenient heterologous organism and obtaining the desired secondary metabolite, if appropriate precursors are available and other cellular machinery is compatible with the metabolite's production (Zhang et al., 2008). In addition, by altering the module, altered analogous metabolites may also be produced (Xu et al., 2009). It has even been possible to predict the biosynthetic product from the sequence of a polyketide module (Banskota et al., 2006).
Knowledge of medical effects of plants is certainly not limited to European cultural traditions. Botanists trained in anthropology have studied many non-western cultures to inventory their use of plants and other natural substances for medical and other purposes. Chemical and pharmacologic investigation of ethnobotanical information is a viable alternate pathway to high throughput screening for drug discovery, although it has its own limitations.
First, cultural concepts of disease are not perfectly aligned. While most cultures readily recognize a superficial fungal infection or diabetes in the same way that western medicine does, disease concepts such as cancer are not interpreted in the same way in different cultures (Hartwell, 1967), although some have claimed that plants used for medicinal purposes yield a higher fraction of anti-cancer activity than unselected plants (Spjut, 2005). Second, the medical effects of many plants in traditional cultures may be less specific than is desired by western pharmacology. Tannins, to take one example, are often found in herbal preparations and may play a role in their biological activity; however, they are not well-suited to drug development. SP-303, developed by Shaman Pharmaceuticals (Holodniy et al., 1999), was a carefully defined tannin preparation from Croton lechleri, a Peruvian ethnobotanical (Williams, 2001), which was tested for several prescription indications before being switched to an over-the-counter anti-diarrheal agent.
A third issue is exemplified by both Chinese traditional medicine and the Indian Ayurvedic system. Both of these ancient traditions utilize polyherbal preparations for the majority of prescriptions. Each component is thought to play a particular role, in some cases by modulating the toxicity of another component. This complexity makes active principle analysis difficult, to say the least, and reductionist approaches to Chinese and Ayurvedic preparations have been largely unsuccessful in validating traditional uses of the products, although many bioactive molecules have been isolated from both pharmacopeias (Deocaris et al., 2008; Tang and Eisenbrand, 1992). The use of microarrays to study in vivo effects of complex preparations may hold some promise for better understanding and future applications (Yin et al., 2004).
While natural products are now known not to be waste products of the producing organism (Mothes, 1969), the purpose they serve for the producer is rarely similar to their potential use in human medicine. However, most drugs act through interaction with protein receptors, and domains of proteins, though not their precise function, are widely conserved (Rompler et al., 2007). Thus, ligands targeted to a particular domain may also have activity in an orthologous or paralogous receptor. C. elegans, for example, has been proposed as a model organism for anti-Parkinson drug screening; many of the compounds which affect dopaminergic systems in humans also have more or less parallel effects in worms (Nass et al., 2008).
Investigation of the ecological function of natural products is a field unto itself, and elucidation of the role a compound plays can be experimentally difficult. Roles which have been successfully addressed include antifeedant effects (Lidert et al., 1987), allelopathy (interference with growth of competitors) (Tseng et al., 2001), and endocrine disruption (Dinan and Lafont, 2006).
In the late 1980s, contracts were being developed by the U.S. National Cancer Institute for the collection of large numbers of plant, microbial and marine samples worldwide. The collectors required permits to collect in many different countries, and needed assurances that the rights of the source country would be respected in the drug development process. To that end, the NCI developed a standard Letter of Collection which could be signed by both parties (Cragg and Newman, 2005a). This letter states the NCI's willingness to collaborate with source country scientists, to deposit voucher specimens in source country repositories, and to develop benefit-sharing arrangements when patents are filed. In addition, Memoranda of Understanding could be developed to frame direct collaborations.
The NCI agreements predated and presaged the Rio Convention on Biological Diversity (CBD) of 1992. While the U.S. has not signed the treaty, U.S. Department of State policy calls for following the principles of the treaty. The CBD calls for preservation of biological diversity, for protection of source country genetic resources from exploitation, for equitable sharing of the benefits of technology, and for technology transfer to the source country.
While it is generally perceived that the CBD has made access to natural products resources more difficult, it has interrupted the worst abuses of source countries by the developed world. It has not, however, resolved the political issue of how benefits should be distributed within the source country. See, for example, the case of Hoodia, a weight loss product from the San people in South Africa, whose active constituents were patented by government scientists at the South African CSIR and licensed to Phytopharm plc and Unilever (Anonymous, 2006; Bladt and Wagner, 2007; Wynberg, 2004).
Before tissues of an organism can be tested, they must undergo an initial extraction to separate the desired small molecules from the biopolymers (proteins, cellulose, chitin, nucleic acids) that make up the bulk of the tissue. In the case of plants, it is common to dry the plant parts thoroughly in the field at the point of collection, before extraction, so that the material does not decompose en route to the laboratory. To accelerate extraction, the dry tissue is ground using any of several mills (e.g., a Wiley mill, or a hammer mill). Alternatively, tissues may be frozen, although this is expensive and cumbersome in many cases. Frozen material may be lyophilized. If DNA or mRNA is desired for cloning of proteins, flash freezing the freshly collected tissue into liquid nitrogen is required to obtain useful material.
There are very few standard techniques for extraction, since choice of solvent and conditions depends on the spectrum of small molecules desired. For extraction of drug-like molecules of intermediate polarity, the NCI has found percolation at room temperature with a 1:1 v/v mixture of dichloromethane and methanol to be useful. Extraction techniques which involve heating the solvent and extracted compounds, as in a Soxhlet apparatus, are generally avoided unless the desired compounds have been shown to be heat stable. In particular, heated extraction should be avoided when preparing samples for biological screening, since the stability of the as-yet-unidentified active constituents is unknown.
Tissues of marine invertebrates present unique problems in extraction, due to high water and salt content. A solution adopted at the NCI has proven generally applicable to a wide variety of marine specimens. Frozen samples are broken into pieces small enough to be fed into a commercial hamburger grinder with CO2 pellets. The resulting powdered material is stored frozen long enough for the CO2 to sublime, then thawed briefly and stirred with water as a slurry. Filtration through paper in a low-speed centrifuge removes the mucilaginous tissue, and the resulting aqueous extract is freeze dried. The marc (remaining solid residue) is also lyophilized and then extracted with the methylene chloride-methanol mixture.
The solvent must then be removed from the solutions which result from any of these extraction procedures. This is done to make it possible to obtain a weight for the extracted material, as well as to avoid reactions in solution which may alter the constituents. Aqueous solutions are lyophilized, while organic solvent mixtures are dried using rotary evaporators. A final finishing under high vacuum removes most traces of the solvent. Materials should be stored in borosilicate glass bottles or vials at −20 °C to ensure stability.
For high throughput screening applications it is common to store libraries in DMSO solution. DMSO is an extraordinarily good solvent for most natural product samples, including extracts. Organic extracts can often be entirely dissolved at concentrations of 10-100 mg/ml in DMSO, while 50% DMSO solutions of aqueous extracts are possible. It should be noted that DMSO concentrations >25% generally suppress bacterial growth in aqueous extract solutions. The bulk extract material should not be stored in DMSO, however, since DMSO can facilitate a number of oxidation reactions. In addition, the hygroscopic nature of DMSO leads to moisture absorption even in nominally sealed microplates in the freezer (Ellson et al., 2005). Such extract plates should be reconstituted from bulk stocks on an annual basis to avoid deterioration of the samples.
Each bioassay in which these extracts are tested will have a limit to tolerance of DMSO. With cellular assays, this is usually 0.5-1% of assay volume. For biochemical assays, it is often as high as 5-10% of assay volume. The limit should be found for the particular assay in advance and DMSO controls run in each assay experiment.
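The arithmetic behind such a tolerance check is simple enough to script. The following Python sketch computes the final DMSO concentration when a DMSO stock is diluted into an assay well and compares it with a chosen limit. The function names, the example volumes, and the 1% limit are illustrative assumptions rather than values from any particular protocol; the real limit still has to be determined empirically for each assay.

```python
def final_dmso_percent(stock_volume_ul, assay_volume_ul, stock_dmso_percent=100.0):
    """Return the DMSO concentration (% v/v) in the final assay volume.

    stock_volume_ul    -- volume of DMSO stock added to the well
    assay_volume_ul    -- total final volume of the well (stock included)
    stock_dmso_percent -- DMSO content of the stock (100 for neat DMSO,
                          50 for a 50% DMSO solution of an aqueous extract)
    """
    if assay_volume_ul <= 0:
        raise ValueError("assay volume must be positive")
    if stock_volume_ul < 0 or stock_volume_ul > assay_volume_ul:
        raise ValueError("stock volume must lie between 0 and the assay volume")
    return stock_dmso_percent * stock_volume_ul / assay_volume_ul


def within_tolerance(stock_volume_ul, assay_volume_ul, limit_percent,
                     stock_dmso_percent=100.0):
    """True if the final DMSO concentration does not exceed the assay limit."""
    final = final_dmso_percent(stock_volume_ul, assay_volume_ul, stock_dmso_percent)
    return final <= limit_percent


# Hypothetical example: 2 ul of neat DMSO stock in a 200 ul cellular assay,
# checked against an assumed limit of 1% DMSO.
example_percent = final_dmso_percent(2.0, 200.0)
example_ok = within_tolerance(2.0, 200.0, limit_percent=1.0)
```

The same calculation applies to the DMSO-only control wells, which should receive the same final DMSO concentration as the test wells.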
Once an extract has been confirmed as a hit in a biological assay, the active compounds in the extract must be identified. This is accomplished in an iterative process of separation and bioassay termed bioassay-guided separation. An extract is separated into several fractions and the parent extract and fractions are tested in the assay. Several outcomes are possible. One outcome is that all activity may be lost in the daughter fractions, in which case the separation method is deemed unsuitable. Loss of biological activity may be due to irreversible binding to the separation media, or to instability of the active compound. A second outcome would be for all or most daughter fractions to have some low amount of activity. This too is undesirable and simply indicates that the separation mode is not suitable. The third and desired outcome is that one or several daughter fractions contain substantial bioactivity, and that the mass of active fractions has been reduced from the parent with a corresponding increase in potency. A useful technique in monitoring separations is to calculate both mass and activity recoveries for the process. Thus, if 5 g of a parent extract was separated, yielding a summed fraction mass of 4.5 g, the mass recovery would be 90 percent. If dose-response curves are available for the assay, bioactivity recovery can be calculated by the equation
$$\text{bioactivity recovery (\%)} = 100 \times \frac{\sum_i \left( M_i / I_i \right)}{M_p / I_p}$$
where $M_i$ are the masses of the fractions, $I_i$ are the IC50 values for each fraction, and $M_p$ and $I_p$ are the respective values for the parent extract. If a fraction has no activity, its term can be ignored. This calculation is limited by the precision of the bioassay, but can be useful in judging the success of a trial separation.
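The two recovery figures can be computed together for each trial separation. The Python sketch below assumes that the amount of activity in a sample is proportional to its mass divided by its IC50, so that inactive fractions contribute nothing. The function names and the IC50 values in the example are made up for illustration; only the 5 g parent and 4.5 g summed fraction mass come from the text.

```python
def mass_recovery(parent_mass, fraction_masses):
    """Percent of the parent mass recovered in the summed fractions."""
    if parent_mass <= 0:
        raise ValueError("parent mass must be positive")
    return 100.0 * sum(fraction_masses) / parent_mass


def activity_recovery(parent_mass, parent_ic50, fractions):
    """Percent of the parent bioactivity recovered in the fractions.

    fractions is a list of (mass, ic50) pairs; use ic50=None for a
    fraction with no measurable activity, whose term is then ignored.
    Masses share one unit and IC50 values share another.
    """
    if parent_mass <= 0 or parent_ic50 <= 0:
        raise ValueError("parent mass and IC50 must be positive")
    parent_units = parent_mass / parent_ic50
    fraction_units = sum(mass / ic50 for mass, ic50 in fractions
                         if ic50 is not None)
    return 100.0 * fraction_units / parent_units


# Hypothetical trial separation: 5 g of parent extract (IC50 10 ug/ml)
# gives three fractions totalling 4.5 g, one of them inactive.
fractions = [(0.5, 1.25), (1.0, 20.0), (3.0, None)]
mass_pct = mass_recovery(5.0, [mass for mass, _ in fractions])
activity_pct = activity_recovery(5.0, 10.0, fractions)
```

In this invented case both recoveries come out near 90 percent, and most of the activity is concentrated in the 0.5 g fraction, which is the desired third outcome. An activity recovery far below the mass recovery would instead point toward instability or irreversible binding.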
The invocation of synergism to explain loss of activity on fractionation has only rarely been experimentally substantiated. If activity is lost, most commonly it is attributed to compound instability or irreversible binding to chromatography media. Given a suitably precise assay, calculation of mass and activity recovery can often yield clues to the source of the problem.
A single separation step is rarely sufficient to obtain purified active compounds. While use of high performance chromatography can often yield a superb separation of complex materials, it is more cost-effective to save the high performance step for last, since crude extracts can often wreak havoc on expensive preparative HPLC columns. The most useful first separation process is one based on polarity. For example, the so-called Kupchan partition uses a series of two-phase mixtures in a separatory funnel to sort components by partition coefficient. While simple, the technique suffers from a propensity to form emulsions, and from difficulty in evaporating the water-saturated organic layers to dryness. A more convenient approach for organic plant extracts uses solid phase extraction with diol bonded phase media, with increasingly polar solvents used to elute successive fractions (Beutler et al., 1990). The procedure can be scaled over a wide range of volumes, and introduces no water into the samples. For marine samples, a wide-pore C4 bonded phase scheme can be used with methanol-water mixtures to separate the large amount of salts and other polar material from the more drug-like intermediate polarity fractions (Cardellina, II et al., 1993).
Intermediate resolution techniques such as flash chromatography or gel permeation chromatography are useful once the polarity cuts have been made. Open column systems using Sephadex LH-20 with a variety of solvents, which separate based on both size exclusion and adsorption mechanisms, can be very useful.
Final purification is most often accomplished by preparative HPLC. A wide variety of bonded phases are available (e.g., cyano, C18, phenyl, diol, amino) which can be operated in reversed-phase or normal phase modes, as well as by ion exchange or hydrophilic interaction chromatography. Pilot thin layer chromatography experiments can provide useful hints as to the best choice of column packing and elution conditions. Then, analytical scale HPLC may be used to define precise flow and solvent strength parameters. Even with relatively purified fractions, it is often useful to use gradient elution to obtain an optimum separation. While C18 bonded phases dominate the analytical chemistry market, they are only one of the tools in the HPLC column drawer of a natural products isolation laboratory.
It is also important to pay attention to peak detection. It is common analytical practice to use UV detection at 254 nm, which is useful for many drugs with suitable chromophores. However, many constituents of natural materials lack absorbance in this range. The most effective strategy is to use lower wavelengths for detection: for acetonitrile-water systems it is possible to use wavelengths as low as 200 nm in order to observe compounds with poor UV absorbance. An alternative method is to use evaporative light scattering detection or refractive index detection; however, neither of these modes is very well suited for larger scale separations.
Next, the separation must be scaled up to semi-preparative or preparative scale using larger diameter HPLC columns with the same length, column chemistry, particle size, and porosity. Loading studies with increasing injections of material establish how much mass can be effectively separated in one run. The high cost of larger columns is readily offset by the shorter time required to run the separation, and columns as large as 41 mm diameter can be used with laboratory scale pumping systems capable of delivering 50-100 ml/min of solvent to the column. If flow rates and injection volumes are scaled proportionately, preparative separations can be obtained with the same reproducibility and resolution as analytical separations. The sample injected on an expensive preparative scale column must be carefully filtered and the solvent conditions must be chosen to elute virtually all of the applied sample, otherwise particles and other uneluted material will rapidly degrade column performance.
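Proportional scaling can be sketched in a few lines of Python. The sketch assumes the usual rule that, for columns of the same length and packing, flow rate, injection volume and mass load scale with the column cross-sectional area, that is, with the square of the ratio of internal diameters. The function names and the analytical conditions in the example (a 4.6 mm column at 1 ml/min) are assumptions for illustration only.

```python
def scale_factor(analytical_id_mm, preparative_id_mm):
    """Ratio of cross-sectional areas for two columns of equal length."""
    if analytical_id_mm <= 0 or preparative_id_mm <= 0:
        raise ValueError("column diameters must be positive")
    return (preparative_id_mm / analytical_id_mm) ** 2


def scale_up(analytical_id_mm, preparative_id_mm,
             flow_ml_min, injection_ul, load_mg):
    """Scale analytical run parameters to a wider column of the same length."""
    factor = scale_factor(analytical_id_mm, preparative_id_mm)
    return {
        "flow_ml_min": flow_ml_min * factor,
        "injection_ul": injection_ul * factor,
        "load_mg": load_mg * factor,
    }


# Hypothetical analytical method: 4.6 mm column, 1.0 ml/min, 20 ul injection
# carrying 0.5 mg of sample, transferred to a 41 mm preparative column.
prep_conditions = scale_up(4.6, 41.0, flow_ml_min=1.0,
                           injection_ul=20.0, load_mg=0.5)
```

Under these assumed starting conditions the scale factor is about 79, which puts the preparative flow rate near 79 ml/min, inside the 50-100 ml/min range of laboratory scale pumping systems. The maximum load predicted this way is only a starting point and should be confirmed by loading studies.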
An excellent overview of preparative chromatographic techniques applied to natural product isolation is available in book form (Hostettmann et al., 1998).
Once the active compounds are obtained in pure form, they can be subjected to structure elucidation. The key technique for this is NMR, specifically a series of two-dimensional experiments (COSY, HSQC, HMBC, NOESY) which make it possible to establish the connectivity of all hydrogen and carbon atoms in a molecule. Serving a very important complementary role is high resolution mass spectrometry (MS), which is capable of providing precise mass measurements that identify the molecular formula of the compound. It is often possible to fully elucidate the structure of an unknown molecule using these two techniques alone. Other spectroscopic techniques such as UV, IR and optical rotation serve ancillary roles, though they may become critical in specific cases. As the number of atoms in a molecule increases, structure elucidation becomes more difficult, due to the exponential increase in possible structures for a given formula. It is currently routine to determine structures of compounds under a molecular weight of 500 daltons, while compounds over 2,000 daltons nearly always require extensive chemical transformations to establish their structures. Exceptions are smaller biopolymers such as peptides which can be routinely sequenced if all of the constituent repeating components are well known.
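The link between a precise mass measurement and a molecular formula can be illustrated with a short Python sketch. It computes the monoisotopic mass of a candidate formula and the error, in parts per million, between that value and a measured mass. The small element table, the function names, and the simple formula parser (no parentheses, no isotopes, no charge) are assumptions made for the example; the masses are for neutral molecules, so a measured ion would first need correcting for its adduct.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes of four common elements.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
}


def monoisotopic_mass(formula):
    """Monoisotopic mass of a simple formula such as 'C17H19NO3'."""
    total = 0.0
    # Each match is an element symbol followed by an optional count.
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if element not in MONOISOTOPIC_MASS:
            raise KeyError("element not in table: " + element)
        total += MONOISOTOPIC_MASS[element] * (int(count) if count else 1)
    return total


def ppm_error(measured_mass, theoretical_mass):
    """Signed difference between measured and theoretical mass, in ppm."""
    return 1e6 * (measured_mass - theoretical_mass) / theoretical_mass


# Morphine, C17H19NO3, used here only as a familiar example.
morphine_mass = monoisotopic_mass("C17H19NO3")
```

A candidate formula is usually accepted only if the error falls within the accuracy of the instrument, and the number of formulas that pass grows quickly with molecular weight, which is one reason larger compounds are harder to solve.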
The ability of NMR and MS to provide useful information from smaller amounts of compound has increased many fold in recent years. Advances in NMR probe design, especially gradient probes, flow probes and cryoprobes, have increased sensitivity greatly (Reynolds and Enriquez, 2002). Higher field strength magnets have increased NMR spectral dispersion so that more peaks can be resolved in a spectrum. Improved NMR pulse sequences have reduced experiment time and improved resolution. Similar improvements have been made in MS, with electrospray ionization and matrix assisted laser desorption being two ionization techniques which have been valuable in natural product characterization. Cutting edge techniques such as Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS) have been applied in industrial settings with utility in structure elucidation, but the cost of the equipment has kept it from being widely applied at this time (Feng and Siegel, 2007).
An alternative technique for structure elucidation is x-ray crystallography, which has a long history in natural product structure elucidation. It is still an important technique, especially for determining the absolute configuration of complex chiral molecules. The obvious limitation is that the compound studied must exist in a crystalline form. If the native compound cannot be persuaded to crystallize, it can be derivatized with a variety of modifiers in an attempt to improve its ability to form crystals. Application of robotics to automatically generate many small scale crystallization experiments has increased the ability to find workable crystallization conditions.
While the ability to perform spectroscopic methods with smaller samples is an important advance, it should be pointed out that animal testing cannot be miniaturized. Therefore, it is always necessary to carry out preparative separations to obtain sufficient material for in vivo work, if a compound is to advance as a drug lead.
Hyphenated techniques such as HPLC-MS, HPLC-UV, and HPLC-NMR are useful analytical platforms for detection, identification and quantitation of compounds in extracts. Thus, they serve as important tools for determining the compounds in a sample, and may inform preparative separation methods. They form an important part of chemical dereplication (see below). In addition to coupling several different detection methods, HTS bioassays may be conducted on individual fractions to complement the physicochemical data. One of the drawbacks of using hyphenated techniques is the large data sets which are generated for each run. Managing, analyzing, and interpreting the results can be a daunting task.
With over 150,000 known small molecules characterized from natural sources, it should be no surprise that previously known natural products will often be re-isolated in the course of bioassay-guided fractionation. While this may be acceptable if the biological activity is new, a great deal of resources can be spent in de novo structure elucidation of known compounds. This problem first emerged in the antibiotic industry, where microbial cultures were generally not identified prior to screening. Methods intended to avoid investing resources in the elucidation of known compounds go by the general term of dereplication (Corley and Durley, 1994). In all its forms, this process attempts to shift the identification of known compounds to an earlier point in the discovery process, either before a pure active substance is isolated, or before a complete NMR data set is acquired and analyzed.
Most effective is a combination of biological and chemical methods. If the source organism has been identified, reference to databases of known compounds such as the CRC Press Dictionary of Natural Products (Buckingham and Thompson, 1997) can suggest candidate structures. Physicochemical data, in particular ultraviolet spectra and mass spectra, if available, can rapidly limit the scope of possible compounds, especially when combined with analytical HPLC (Lang et al., 2008).
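A combined taxonomic and mass-based lookup of this kind might be sketched as follows in Python. The four-entry reference table stands in for a real compound database and is purely illustrative; the function name, the 5 ppm tolerance, and the matching on genus alone are assumptions, and no actual database interface is implied.

```python
# Illustrative reference table: (compound, source genus, monoisotopic mass in u).
REFERENCE_COMPOUNDS = [
    ("morphine", "Papaver", 285.1365),
    ("codeine", "Papaver", 299.1521),
    ("cocaine", "Erythroxylum", 303.1471),
    ("quinine", "Cinchona", 324.1838),
]


def candidate_compounds(measured_mass, genus=None, tolerance_ppm=5.0,
                        reference=REFERENCE_COMPOUNDS):
    """Names of known compounds consistent with a measured neutral mass.

    If genus is given, only compounds reported from that genus are kept,
    which mimics restricting the search to the identified source organism.
    """
    hits = []
    for name, ref_genus, ref_mass in reference:
        error_ppm = abs(measured_mass - ref_mass) / ref_mass * 1e6
        if error_ppm <= tolerance_ppm and (genus is None or genus == ref_genus):
            hits.append(name)
    return hits


# Hypothetical measurement on a fraction from a Papaver extract.
hits = candidate_compounds(285.1368, genus="Papaver")
```

A match of this sort is only a dereplication hypothesis. It narrows the candidates so that a UV spectrum, a retention time comparison or a few NMR signals can confirm or reject the identification before a full data set is acquired.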
Direct physical comparison with standard compounds can be a very effective tactic; however, amassing a library of known compounds is a huge task for most laboratories.
Not all of the compounds contained in a natural product extract are desirable as drug leads. Several classes of such undesirables are described below.
Tannins are polyphenolic plant metabolites which were initially discovered as the principles responsible for tanning leather. Oak bark and many other plant materials contain substantial quantities of tannins, complex molecular structures which incorporate gallate esters (hydrolysable tannins, e.g. Fig. 4a) or flavanol polymers (condensed tannins, e.g. Fig. 4b) (Khanbabaee and Ree, 2001). Phlorotannins are a third class found in brown algae which have similar properties. Tannins play an important ecological role in deterring feeding by herbivores, and may be produced in response to tissue injury. Many tannins have been shown to be antinutritional, that is, they reduce the digestible protein in foods (Butler, 1992). The mechanism for both the tanning effect and the antifeedant/antinutritional roles is noncovalent binding to proteins. Since this is a relatively nonspecific effect (a given tannin is capable of binding to many different proteins), tannins are generally considered to be poor drug leads. A cautionary tale of what can happen when this is ignored is the case of SP-303, a highly characterized but nonspecific tannin mixture from the Amazonian plant Croton lechleri (Holodniy et al., 1999). Originally put forward by Shaman Pharmaceuticals as an antiviral agent against respiratory syncytial virus (Wyde et al., 1993) and herpes simplex virus (Safrin et al., 1994), it was unsuccessful in initial human trials. It was then studied for treatment of AIDS-related diarrhea (Holodniy et al., 1999) and travelers' diarrhea (DiCesare et al., 2002) with somewhat better success, although its antidiarrheal activity did not appear to be linked to its direct antiviral activity (Fischer et al., 2004). In 1999, the FDA denied approval for SP-303's antidiarrheal indications, and the company soon reformulated itself as Shaman Botanicals, marketing SP-303 as a botanical supplement (Clapp and Crook, 2001).
Essentially, the nonspecific activity of the SP-303 tannin was misinterpreted, and the product never was able to demonstrate clinical activity sufficient for FDA approval. For this sort of reason, much effort has been expended over the years in removing tannins from natural product screening samples, since they can be active in a wide variety of cell-free and cell-based assays (Cardellina, II et al., 1993; Wall et al., 1996).
Phorbol esters are diterpenes produced exclusively by plants in the Euphorbiaceae and Thymelaeaceae families (e.g. Fig. 4c). Many compounds of the class are skin irritants and tumor promoters, and act in cells through binding to protein kinase C (PKC) (Nishizuka, 1984). Since many cellular functions are dependent on PKC, phorbol esters are considered to be pleiotropic agents which can modulate many cellular pathways. Hence, they appear as hits in many cellular screens, but are undesirable due to their potential toxicity and tumor-promoting properties. The general distribution of phorbol esters in different species has been described (Beutler et al., 1989; Beutler et al., 1990; Beutler et al., 1995; Beutler et al., 1996).
Saponins are glycosides of triterpenes or sterols produced by many plants (Hostettmann and Marston, 1995). The number of sugar residues may vary from one to a dozen, and other chemical functionalities may be appended in various ways (e.g. Fig. 4d). Their ability to act as detergents and form foams in aqueous solution underlies their traditional use as soaps and as fish poisons. These same properties in the context of biomedical screening assays lead to cell lysis, which can be either a false positive or an interference, depending on the nature of the assay endpoint. In addition, some saponins cause hemolysis, an undesirable property in a drug candidate. A diagnostic feature of saponins in a cell growth assay is that cell lysis is an extremely rapid process, on the order of several minutes, whereas other cell-killing mechanisms generally require several hours to take effect. Thus, time-course studies can help to distinguish saponins from other types of hits. It is important to note that not all saponins are detergents or hemolytic, and some may provide viable drug leads (Bento et al., 2003; Tang et al., 2007).
The primary structural material of plant tissues is cellulose, a neutral polysaccharide. For animals, cartilage plays a similar role and is composed of collagen and proteoglycan. The carbohydrate portion of proteoglycan is composed of N-acetylglucosamine and hexuronic acid units which are heavily sulfated (e.g. Fig. 4e). These materials are often found in marine invertebrate aqueous extracts, are of high molecular weight and carry a substantial negative charge (Beutler et al., 1993). Anionic polysaccharides are highly active in cellular HIV assays (Beutler et al., 1993); however, their high molecular weight and heterogeneity make them undesirable as drug candidates. Sulfated cyclodextrins have substantially the same antiviral activity without some of the liabilities, and have been studied as antiviral drugs (Moriya et al., 1993). Sulfated polysaccharides are encountered as hits in a variety of cellular screens. They may be removed from extracts by precipitation from ethanol solution at low temperatures. Plants also produce anionic polysaccharides which generally have weaker activity.
One further approach to identifying or eliminating known natural products without investing resources in their re-isolation and characterization is to compare biological and chemical “fingerprints” with standards. By using the results of multiple biological and chromatographic experiments in which the standard compounds have previously been tested, one can group similar samples together, and pose a dereplication hypothesis for the samples whose results match those of a known compound.
The most data-rich environment in which this has been done is for the NCI 60-cell results. Since thousands of natural product compounds have been tested, these can be used as reference points in data analysis in comparison with the results for crude extracts or fractions. A variety of mathematical approaches have been used for the analysis, including calculation of Pearson correlation coefficients (Paull et al., 1989), neural networks (Weinstein et al., 1992), and self-organizing maps (Keskin et al., 2000). Often, if the mechanism of action of the reference compound is known, the correlated test samples can be rapidly tested to confirm similar mechanisms (Paull et al., 1992; Weinstein et al., 1997). This has been demonstrated in the case of agents affecting tubulin (Paull et al., 1992), epidermal growth factor pathways (Wosikowski et al., 1997), and vacuolar ATPase inhibitors (Boyd et al., 2001), among others.
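The simplest of these approaches, ranking reference compounds by Pearson correlation with a test sample, can be sketched in Python as below. This is a minimal illustration and not the published NCI procedure: the function names are assumptions, the six-value profiles are invented numbers standing in for log GI50 values across a cell line panel, and missing measurements are handled simply by dropping the cell lines that lack data in either profile.

```python
from math import sqrt


def pearson(profile_a, profile_b):
    """Pearson correlation over positions where both profiles have data.

    Missing measurements are given as None and are skipped pairwise.
    """
    pairs = [(a, b) for a, b in zip(profile_a, profile_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        raise ValueError("need at least three shared measurements")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("a flat profile has no defined correlation")
    return cov / sqrt(var_a * var_b)


def rank_matches(sample_profile, reference_profiles):
    """Reference names and correlations, best match first."""
    scores = [(name, pearson(sample_profile, profile))
              for name, profile in reference_profiles.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Invented profiles for two reference compounds and one extract fraction;
# the fraction was not tested in the fourth cell line.
reference_profiles = {
    "reference A": [-6.0, -5.2, -7.1, -4.8, -6.5, -5.0],
    "reference B": [-5.0, -6.8, -4.9, -6.9, -5.1, -6.5],
}
sample_profile = [-5.5, -4.9, -6.4, None, -6.0, -4.6]
ranked = rank_matches(sample_profile, reference_profiles)
```

With these made-up numbers the fraction correlates strongly with reference A and negatively with reference B, so reference A would be the dereplication hypothesis to test. Because correlation ignores overall potency, a crude extract can match a pure reference compound even though its absolute activity is far lower.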
There is no a priori reason why pattern matching must be limited to cell growth inhibition data. In fact, any type of data can, in principle, be mixed, even chromatographic, spectroscopic and taxonomic information. The utility of pattern matching depends primarily on the number of dimensions present in the data matrix. While redundant dimensions (i.e., cells which respond identically) do not contribute, scattered missing data is only a minor issue if the appropriate analytical techniques are applied.
Natural products and their relatives continue to be approved as new drugs. The list shown in Table 1 is not comprehensive, since it excludes peptide drugs and other agents which could arguably be considered as derivatives of natural products. For more comprehensive discussions of natural products drugs on the market or in clinical testing, see the reviews of Cragg and Newman (Cragg and Newman, 2005b; Newman, 2008; Newman and Cragg, 2004; Newman and Cragg, 2006; Newman and Cragg, 2007) and of Butler (Butler, 2008).
Natural products groups have been eliminated in most large pharmaceutical companies in the U.S.; however, this trend has not penetrated as deeply in Europe and Japan. Of the companies listed in Table 1, Bristol-Myers Squibb, Merck, Johnson & Johnson, Pfizer, GlaxoSmithKline, and Lilly no longer maintain internal natural products discovery groups. Up until its recent merger with Pfizer, Wyeth had an active natural products group at its Pearl River facility, bucking the trend, at least for the time being. In Europe, Novartis has been notable in maintaining its natural products pipeline.
A corresponding trend is the development of smaller companies as “boutique” natural products operations, which can license natural products leads at suitable stages of development to larger entities (Gullo and Hughes, 2005). PharmaMar is one example of a small company which has had success in bringing a natural product drug candidate (Yondelis) forward in recent years. Nereus Pharmaceuticals has advanced a marine microbial proteasome inhibitor (NPI-0052) into phase Ib combination trials in cancer. Kosan Biosciences, which has developed epothilone analogs using biosynthetic technology, was acquired by Bristol-Myers Squibb in 2008 on the strength of its development pipeline. Alternatively, small companies can serve as screening contractors, or provide the natural product libraries and expertise for pharma screening (e.g., Albany Molecular Research).
Thus, it is clear that the landscape of natural products research and drug development is rapidly changing. It is a major challenge to maintain the knowledge base and resources that have been developed in large companies in natural products research, and these resources have not always been preserved through corporate mergers, acquisitions and restructuring.
The new field of diversity-oriented synthesis aims to take its structural cues from nature. As a daughter of combinatorial chemistry, it seeks to meld parallel synthesis with chiral synthesis technologies. Thus, natural product scaffolds are designated as privileged structures and then functionalized by parallel synthesis (Burke et al., 2003; Hu et al., 2001; Kulkarni et al., 2002; Sternson et al., 2001). While attractive in concept, for the same reasons that natural products are desirable for drug leads (Section 3, above), it remains to be seen how efficient the strategy will be; after all, the choices of functionalization are still up to the chemist.
Table 2 provides a listing of specialist journals which are important in natural products research.
This research was supported by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.