Hirsutism is the presence of excess body or facial terminal (coarse) hair growth in females in a male-like pattern, affects 5–15% of women, and is an important sign of underlying androgen excess. Different methods are available for the assessment of hair growth in women.
We conducted a literature search and analyzed the published studies that reported methods for the assessment of hair growth. We review the basic physiology of hair growth, the development of methods for visually quantifying hair growth, the comparison of these methods with objective measurements of hair growth, how hirsutism may be defined using a visual scoring method, the influence of race and ethnicity on hirsutism, and the impact of hirsutism in diagnosing androgen excess and polycystic ovary syndrome.
Objective methods for the assessment of hair growth including photographic evaluations and microscopic measurements are available but these techniques have limitations for clinical use, including a significant degree of complexity and a high cost. Alternatively, methods for visually scoring or quantifying the amount of terminal body and facial hair growth have been in use since the early 1920s; these methods are semi-quantitative at best and subject to significant inter-observer variability. The most common visual method of scoring the extent of body and facial terminal hair growth in use today is based on a modification of the method originally described by Ferriman and Gallwey in 1961 (i.e. the mFG method).
Overall, the mFG scoring method is a useful visual instrument for assessing excess terminal hair growth, and the presence of hirsutism, in women.
Hirsutism is the presence of excess body or facial terminal (coarse) hair growth in females in a male-like pattern, and affects 5–15% of women depending on definition (Danforth and Trotter, 1922; Beek, 1950; Shah, 1957; Ferriman and Gallwey, 1961; McKnight, 1964; Moncada-Lorenzo, 1970; Lunde, 1984; Lunde and Grottum, 1984; Derksen et al., 1993; Simpson and Barth, 1997; Knochenhauer et al., 1998; DeUgarte et al., 2006). Although hirsutism is often regarded as a purely aesthetic problem, its medical importance is highlighted by the high prevalence of androgen excess disorders reported among hirsute women (Azziz et al., 2004; Carmina et al., 2005). Androgen excess disorders include the polycystic ovary syndrome (PCOS), non-classic adrenal hyperplasia, the hyperandrogenic-insulin resistant-acanthosis nigricans syndrome and androgen-secreting neoplasms, among others.
Although not all women suffering from an androgen excess disorder will demonstrate hirsutism, depending on age and race/ethnicity, 80–90% of patients with frank hirsutism will have a demonstrable androgen excess disorder (Azziz et al., 2004). In turn, the presence of hirsutism is one of the more important predictors of a lower quality of life among women with PCOS (Guyatt et al., 2004). Unfortunately, many women with hirsutism, and some practitioners, still believe this abnormality to be primarily a cosmetic disturbance. Consequently, many women first seek help from a beautician, cosmetologist or electrologist rather than a physician (Dumesic et al., 1997; Farah et al., 1999).
We performed a systematic review of the published medical literature to identify studies evaluating hirsutism and its clinical consequences, querying MEDLINE databases for English-language articles published between the years 1948 and 2009. Hirsutism, hair growth, and polycystic ovary were used as Medical Subject Headings, among other terms. We also performed a hand search of historical books and bibliographies from relevant articles and reviews.
Below, we review the basic physiology of hair growth, the development of methods for visually quantifying hair growth, the comparison of these methods with objective measurements of hair growth, how hirsutism may be defined using a visual scoring method, the influence of race and ethnicity on hirsutism, and the impact of hirsutism in diagnosing androgen excess and PCOS.
Hair covers the vast majority of the body, with the exception of the lips, the palms of the hands, and the soles of the feet. The terms ‘hirsutism’ and ‘hypertrichosis’ have both been used to describe abnormal excessive hair growth. In order to understand the difference between the two terms it is necessary to understand the difference between the types of hairs. There are at least two physiologic types of hair, vellus and terminal hairs.
Vellus hairs are short, fine and soft; they are usually non-pigmented, do not contain a core of compacted keratinocytes (i.e. medulla), and their follicles are not associated with an arrector pili muscle (Danforth, 1925). These hairs cover the vast majority of the body, primarily in those areas commonly deemed to be clinically ‘hairless’. In the medical literature, vellus hair growth is occasionally referred to as ‘lanugo hair’ (Garn, 1951), although we prefer to reserve the term lanugo for the hairs covering a newborn, which are shed in utero late in pregnancy or immediately after birth. Growth of vellus hairs is primarily stimulated by growth and thyroid hormones (Simpson and Barth, 1997).
Terminal hairs are longer, more rigid, with shorter blunter tips, penetrating further into the dermis and more pigmented than vellus hairs; they demonstrate a central core of compacted keratinocytes (i.e. are medullated) and an associated arrector pili muscle (Danforth, 1925). Terminal hairs demonstrate significant regional morphologic differences (i.e. longer in some sites, more medullated or pigmented in others, etc.) due to genetically determined differences in the follicles, which is why skin grafts continue to produce hair characteristic of the donor site (Garn, 1951). Development and growth of terminal hairs is primarily stimulated by growth and thyroid hormones, and, depending on body region, androgens (Greenblatt, 1983).
The term ‘hirsutism’ is of Latin origin, meaning excessive growth of stiff hair or hairiness, especially in women or children, with an adult male pattern of distribution; in contrast, ‘hypertrichosis’ is a word of Greek origin generally referring to localized or generalized excess hair (Azziz et al., 2000; Wendelin et al., 2003). Although sometimes these two terms are used interchangeably, we will exclusively use the term ‘hirsutism’ to refer to the condition of male pattern hair growth found in women. In contrast, we consider the term ‘vellus hypertrichosis’ to refer to the presence of excess vellus hairs (Danforth, 1925), possibly reflecting ethnic variation, non-androgenic endocrine disorders (e.g. thyroid disease, anorexia nervosa, imbalances of growth hormone or corticosteroid production), or as a side effect of medications such as cyclosporin or minoxidil (Azziz et al., 2000; Wendelin et al., 2003). In contrast to hirsutism (see below), hypertrichotic vellus hair does not respond to anti-androgen therapy.
Terminal hairs may transform into vellus hairs (a process defined as ‘miniaturization’), as is observed in male pattern balding. In turn, vellus hair may be transformed into terminal hairs (i.e. terminalize) under the appropriate endocrine stimulation. For example, minoxidil appears to cause the conversion of vellus to terminal hairs in the scalp of balding men (De Villez, 1987). Androgens, particularly in excess, can also cause terminalization of vellus hairs, producing terminal hair growth in certain areas of the body or face in both males and females. Optimal androgen effect on hair follicles requires the presence of normal androgen receptor (AR) and 5α-reductase (5α-RA) function, although non-5α-reduced androgens in sufficient concentration can also exert a direct effect on the hair follicle through the AR, as evidenced by the response of patients with 5α-RA deficient male pseudohermaphroditism to the androgen increase occurring with puberty and to exogenous androgen administration (Mendonca et al., 1996). The process of transformation (i.e. miniaturization or terminalization) takes time, and occurs progressively over many hair growth cycles. Interruption of the process sufficiently early (e.g. through the use of antiandrogens in the case of vellus hair terminalization) can reverse the effects observed (Whiting et al., 1999).
The effect of androgens on hair growth appears to be mediated, at least in part, through stimulation of the activity of l-ornithine decarboxylase (ODC), an enzyme that catalyzes the synthesis of polyamines (putrescine, spermidine and spermine from ornithine; and cadaverine from lysine) and which is localized in the bulb, bulge and shaft of the hair follicle. Polyamines are small cationic molecules that play a role in cell migration, proliferation and differentiation and are associated with cell proliferation in hair follicle development and growth (Probst and Krebs, 1975; Hynd and Nancarrow, 1996; Nancarrow et al., 1999). A topical irreversible inhibitor of ODC activity and expression, eflornithine hydrochloride, marketed in a 13.9% cream (Vaniqa®, Bristol Myers-Squibb/Gillette Co.), has been found to reduce unwanted facial hair growth in women, presumably many of whom had hirsutism (Balfour and McClellan, 2001; Wolf et al., 2007).
The effect of androgens on hair growth and differentiation varies by body area (Table I) and presumably the local content of the AR, 5α-RA, ODC, 17β-hydroxysteroid dehydrogenase and others (Paus and Cotsarelis, 1999; Hoffmann, 2003). Some skin areas (e.g. that of the eyelashes, eyebrows and lateral and occipital aspects of the scalp) are relatively independent of the effect of androgens. Alternatively, other skin areas (e.g. lower pubic triangle and the axilla) are quite sensitive to androgens, and hair follicles are terminalized even in the presence of relatively low levels of circulating androgens. These areas begin to develop terminal hair even in early puberty, in both genders when only minimal increases in adrenal androgens are observed. Finally, other areas of skin respond to androgens, but only to significantly higher levels, including the chest, upper and lower abdomen (i.e. the upper pelvic triangle or male escutcheon), upper and lower back, thighs, upper arms and the chin, cheeks and sideburn areas. The presence of terminal hairs in these areas is characteristically masculine, and if present in women is considered pathological. Danforth (1925) defined these three different body areas as asexual, ambo-sexual or sexual hair, respectively. It is hair growth in these latter areas that is scored visually to determine whether hirsutism is present.
Danforth and Trotter (1922) attempted to define what ‘normal pilosity’ or hair growth in women was, by describing the total body hair growth of 350 white college women. Subsequent investigators extended these findings (Dupertuis et al., 1945; Beek, 1950; Shah, 1957; Thomas and Ferriman, 1957; Ferriman and Gallwey, 1961; McKnight, 1964) (Table II). A limitation of these early studies is that the majority were performed on draftees, adolescents and college students, who were primarily young and white. Few studies were available in other ethnic or racial groups (Trotter, 1922; Shah, 1957).
Anthropologists originally devised subjective measures for the scoring of hair growth, i.e. for determining the degree of ‘hairiness’, as patterns and amounts of hair growth were a principal method of defining race. In all these systems the body surface is divided into zones, the amount of terminal hairs in each zone is individually scored, and a sum total is obtained. Some of the more sophisticated scoring systems also have assessed the extent of hair growth and the density of hair within each site.
Beek (1950) examined 1000 healthy women and 1000 healthy men and counted only terminal hairs >0.5 cm in length, and concluded that the regions of hair growth were the same in men as in women, with the only difference being in the amount of hair. He observed only one absolute difference between male and female hair patterns, what he noted as a ‘disperse upper border of the pubic hair (which) is only found in men and never in normal women … . When found in women, it must be considered as an absolute sign of hirsutism’ (Beek, 1950).
Garn (1951) published a landmark study evaluating the hair growth patterns in men. This investigator quantified the amount of hair in various body areas using a 5-point scale (0–4) based on the amount of terminal hairs in each skin region studied. Examining 2600 body-build photos, Garn combined the various areas of hair growth into nine regions for evaluation (beard and moustache, hypogastric, thoracic, lower arm and leg, upper arm and leg, gluteal, lumbo-sacral, lower back and upper back). He determined that these areas demonstrate regional independence, and racial and individual variability in pattern and density (Garn, 1951). He then visually examined 239 white men adding the evaluation of the mid-phalangeal region to the nine body areas mentioned above. The range of the total hair scores was 3–33, the modal scores (i.e. the scores observed most frequently) were 8 and 9, and about 25 and 18% of the individuals had scores ≤5 or ≥18, respectively. Based on these data, Garn suggested scores of 5 and 18 as the cut-off values to define a ‘hairless’ or a ‘hairy’ man, respectively (Garn, 1951).
Shah (1957) published the first study of hair growth in Asian Indian women, evaluating 34 women between the ages of 15 and 41 years who were referred for excess hair growth (the ‘hirsute’ group) and comparing them to 100 women between the ages of 15 and 48 years without complaints of excess hair (i.e. the self-defined ‘non-hirsute’ group). As a part of his study he also evaluated 50 men between the ages of 20 and 53 years for comparison. Based on the methods developed by Garn (1951), Shah assessed nine regions of the body and evaluated the quality (on a scale of 1–3 by palpation), density (on a scale of 0–3 by visual inspection) and proportion of the area (region) covered with hair. A score for total quantity of hair was obtained for each region by multiplying the scores for quality, density and fraction; the final hair growth score was obtained by adding up the individual scores for each of the nine regions. Based on this evaluation, he concluded that the presence of terminal hairs on the face, chest and upper back was absolutely unusual, and the presence of terminal hairs on the abdomen, upper arms and buttocks was relatively unusual in Asian Indian women. Furthermore, he noted that the presence of hair on the thighs was almost always present relative to hair on other parts of the body, and that hirsute women had more hair on their thighs than non-hirsute women. Lastly, not all hirsute women had hair in the same body areas and it was unusual even for hirsute women, to have excess hair in every single one of the regions. In the group of non-hirsute women, 97% had total hair growth scores ≤7, and 93% did not have hair in more than three regions. In order to be able to develop a system of comparison Shah concluded that a total hair growth score of 8 or more, with excess hair in at least two of the regions evaluated, should be considered as abnormal (Shah, 1957).
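Shah's arithmetic can be made concrete with a short sketch. The Python below is illustrative only: the region names, the representation of the proportion of a region covered as a fraction between 0 and 1, and the reading of 'excess hair in a region' as any non-zero regional score are assumptions made here, not details given in the description above.

```python
from dataclasses import dataclass


@dataclass
class RegionRating:
    quality: int     # 1-3, judged by palpation
    density: int     # 0-3, judged by visual inspection
    fraction: float  # ASSUMED encoding: proportion of the region covered, 0.0-1.0

    def score(self) -> float:
        # Regional quantity of hair = quality x density x fraction covered.
        return self.quality * self.density * self.fraction


def shah_total(ratings: dict) -> float:
    """Add the regional scores to obtain the final hair growth score."""
    return sum(r.score() for r in ratings.values())


def shah_abnormal(ratings: dict, cutoff: float = 8, min_regions: int = 2) -> bool:
    """Total of 8 or more with hair in at least two regions.

    ASSUMPTION: a region counts as having excess hair when its score is > 0.
    """
    regions_with_hair = sum(1 for r in ratings.values() if r.score() > 0)
    return shah_total(ratings) >= cutoff and regions_with_hair >= min_regions
```

Under these assumptions, a woman rated only on the thigh cannot be classified as abnormal however high that single regional score is, which mirrors the two-region requirement in Shah's definition.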
Ferriman and Gallwey (1961) evaluated eleven body areas in 430 normal white women 15–74 years old seen at a general medical clinic; each body area was scored on a scale of 0–4 similar to that of Garn. These investigators observed that a score of >7, when summing the scores of all the body areas assessed with the exception of the lower arms and legs, was observed in only 4.3% of their population (Ferriman and Gallwey, 1961). This method will be discussed further below.
McKnight studied 400 Welsh or English college students, dividing the body into seven areas (face, abdomen, chest, upper arm/leg, lower arm/leg, lumbosacral region, upper back), and determined whether terminal hair was present or absent in these areas (McKnight, 1964). As did Ferriman and Gallwey (FG), this investigator noted that the majority of women had hair growth in the upper or lower arms, or legs. Thirty-six out of 400 women (9%) had hirsutism. The distribution of terminal hair in the hirsute group was similar to that of 239 males involved in Garn’s study (Garn, 1951), suggesting that the difference of hair distribution between women and men is more quantitative than qualitative (McKnight, 1964).
Moncada in 1970 reported on 300 community-based females aged 15–45 years from Michigan whose hair growth was assessed using a modification of the systems used by Garn and FG (Moncada-Lorenzo, 1970). In this system only five body areas, defined as unusual areas for hair growth in women, were assessed, including the upper lip, chin, chest, abdomen and thighs; the amount of terminal hairs (i.e. those >0.5 cm) was scored from 0 to 4. All normal subjects had a total score of 5 or less. The results of this study also suggested that the facial scores were a satisfactory predictor of the scores for the abdomen, chest and thighs (Moncada-Lorenzo, 1970).
Lunde studied two groups of healthy Norwegian female volunteers: (i) 100 women with regular cycles, 16–44 years old, not on hormonal treatment for the year prior to evaluation, although many were recently admitted for either pregnancy interruption or voluntary tubal sterilization; and (ii) 25 post-menopausal women, 51–65 years old, with a prior history of regular menstrual periods and not using hormone therapy (Lunde, 1984). The study evaluated 19 body regions, which were graded on a scale of 0–3. Subjects were separated into two groups, according to body hair color (blonde or brunette). The extent of body hair growth was observed to be highly dependent on age, especially for the face. This investigator concluded that normal women do not demonstrate great variation in their hair growth pattern, but express a wide range of tolerance towards their own hair growth, ‘thus making the exact distinction between normal and hirsute women rather obscure’ (Lunde, 1984). In a subsequent study comparing 113 women complaining of hirsutism and 100 controls Lunde and Grottum (1984) observed median scores of 20 (range: 11–41) and 7 (range: 0–15) in hirsute and control women, respectively. These investigators suggested that the cut-off score to differentiate hirsute from normal women using their previously reported scoring method was 16 (of a total possible score of 57), not different for blondes or brunettes (Lunde and Grottum, 1984).
The most common method of scoring body and facial terminal hair growth used today for defining the presence of hirsutism is based on a modification of the method originally described by FG in 1961 (Ferriman and Gallwey, 1961; Hatch et al., 1981).
David Ferriman (1907–1990) was a practicing endocrinologist with an interest in the disorders of the thyroid, adrenals, and ovaries (Fig. 1). John Michael David Gallwey was a recent graduate from the University of Sheffield, and in 1959 was in the process of completing a 6-month rotation as house officer at North Middlesex Hospital as part of his pre-registration year. In 1961 they published a seminal study of the body and facial hair growth of 430 women attending a general medical outpatient clinic at the hospital (Ferriman and Gallwey, 1961). These investigators were careful to exclude all patients ‘suffering from diseases or symptom complexes which might be associated with disturbances of hair growth (including diseases of the anterior pituitary, adrenal cortex and ovary, hypothyroidism, generalized skin diseases, menstrual disturbances and infertility)’. A description of the racial composition of the study population was not provided, although presumably the majority of women included were white.
FG used a scoring system loosely based on that of Garn, evaluating 11 body areas, including the upper lip, chin, chest, upper back, lower back, upper arm, forearm, upper and lower abdomen, thighs and lower legs. A score of 0–4 was assigned to each area examined, based on the visual density of terminal hairs, such that a score of 0 represented the absence of terminal hairs, a score of 1 minimally evident terminal hair growth, and a score of 4 extensive terminal hair growth. Terminal hairs can be distinguished clinically from vellus hairs primarily by their length (i.e. >0.5 cm), coarseness, and pigmentation. In contrast, vellus hairs generally measure <0.5 cm in length, and are soft and non-pigmented.
These investigators determined that the scores arising from hair growth on the lower leg and the forearms (termed ‘indifferent’ hair) did not correlate with hair growth in the other nine body areas assessed (termed ‘hormonal’ hair) (Ferriman and Gallwey, 1961), and these two body areas were not incorporated into subsequent assessments (Ferriman and Purdie, 1965, 1983; Hatch et al., 1981), the so-called ‘modification’.
Additional modifications of the FG scoring method for hirsutism have been reported. Derksen and colleagues studied 81 healthy female volunteers and 71 hirsute patients of childbearing age and Dutch ancestry, assessing 12 body regions, which included all 11 body areas originally surveyed by FG, with the addition of the area of the sideburns (Derksen et al., 1993). Consistent with the grading used by Garn and FG, they assigned a score of 0–4 to each area, depending on the amount of terminal hair growth. These investigators noted that the summation of the scores for the upper lip, chin, lower abdomen and thighs offered the best discrimination between normal and hirsute women (i.e. a score of 6 or greater observed in all hirsute women and none in the controls) (Derksen et al., 1993). More recently, another modification has been proposed, with 12 body areas graded from 0 to 4, including the original nine ‘hormonal’ areas, and the addition of the sideburns, lower jaw and upper neck (separately from the chin), and perineum (Practice Committee of the American Society for Reproductive Medicine, 2006).
Assessing the degree of excess hair growth in women by the FG score (or its modification) seems to have outlasted other methods, even those created a posteriori (Moncada-Lorenzo, 1970; Practice Committee of the American Society for Reproductive Medicine, 2006). Likely this is due in part to the fact that since its initial description it has been more widely used in peer-reviewed articles than other methods, in part because Dr Ferriman himself was an active clinical investigator and author. Secondly, the use of the FG score received a significant boost when it was presented graphically in an easy to use format in an influential review published in 1981 (Hatch et al., 1981). Thirdly, the method was simpler than that described by others (Lunde and Grottum, 1984; Practice Committee of the American Society for Reproductive Medicine, 2006). However, it is unlikely that the FG score outlasted others just because it was simple. For example, the Moncada-Lorenzo score (Moncada-Lorenzo, 1970), scores only five body areas and yet it has only been used in three studies (Salvador et al., 1985; Tremblay, 1986; Golditch and Price, 1990). As with many other methodologies in medicine, early usage, and continued publication on the method by the original investigator and others, will lead to further use and propagation, and eventually to displacement or obstruction to the introduction of competing or alternative methodologies.
Figure 2 depicts a schematic of the modified FG (mFG) scoring system. If terminal hair growth is not present in the examined area a score of zero is given. Minimal amounts of visible terminal hair growth represent a score of 1, a score of 2 is given if hair growth is more than minimal but not yet that of a man, and a score of 3 is that of a not very hairy man while a score of 4 is that typically observed in men. Figure 3 depicts hair growth in the nine body areas used for the scoring of terminal hair growth, and for the diagnosis of hirsutism using the mFG scoring method. Note that each higher score refers to both greater terminal hair density, and importantly, a greater affected body surface area.
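The bookkeeping behind an mFG total can be sketched in a few lines of Python. This is a minimal illustration, not a validated clinical tool: the dictionary keys are hypothetical names chosen here for the nine 'hormonal' areas (the eleven original FG areas minus the forearm and lower leg), and the default cut-off of 8 is simply one commonly cited value, exposed as a parameter because other cut-offs have been proposed.

```python
# Nine "hormonal" body areas scored in the modified Ferriman-Gallwey (mFG)
# method. The key names are hypothetical labels chosen for this sketch.
MFG_AREAS = (
    "upper_lip", "chin", "chest", "upper_back", "lower_back",
    "upper_arm", "upper_abdomen", "lower_abdomen", "thigh",
)


def mfg_total(area_scores):
    """Sum the per-area scores (integers 0-4) over the nine mFG areas.

    Keys outside MFG_AREAS (e.g. a forearm score) are ignored.
    """
    missing = [area for area in MFG_AREAS if area not in area_scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    total = 0
    for area in MFG_AREAS:
        score = area_scores[area]
        if not isinstance(score, int) or not 0 <= score <= 4:
            raise ValueError(f"{area}: score must be an integer from 0 to 4")
        total += score
    return total


def meets_cutoff(area_scores, cutoff=8):
    """Return True when the mFG total reaches the chosen cut-off value."""
    return mfg_total(area_scores) >= cutoff
```

For example, scores of 3 on the chin, 2 on the upper lip and 3 on the lower abdomen, with 0 elsewhere, give a total of 8, which reaches a cut-off of 8 but not one of 9.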
The difficulty with many of the visual grading systems proposed is that they are subjective in nature and demonstrate significant inter- and intra-observer variability (Barth, 1996; Wild et al., 2005). Suggested methods to decrease the risk of ascertainment bias when using a visual scoring system to assess hirsutism include: (i) minimizing the number of examiners (Wild et al., 2005), (ii) requesting that patients not use laser or electrolysis for at least 3 months, not depilate or wax for at least 4 weeks, or not shave for at least 5 days before the exam (Azziz et al., 1995; Sanchez et al., 2002), (iii) examining all patients complaining of unwanted hair growth or oligomenorrhea fully, regardless of initial appearance (Knochenhauer et al., 2000; Azziz et al., 2008), and (iv) using a uniform graphical (and possibly photographic) representation of the scoring system (as depicted in Figure 3).
Determining what is an abnormal amount of terminal hair growth, and thus what is hirsutism, is difficult. Two major approaches may be used to establish the normative range for the mFG score. Firstly, large numbers of women from the general population can be studied, statistically defining the cut-off value for hirsutism either by cluster or related analyses, or by the relative distribution (e.g. percentile).
Using their data from the 161 women whose age was between 18 and 38 years, FG observed that 9.9% had scores ≥6, 4.3% had scores ≥8, and only 1.2% of women had combined scores ≥10, for the nine body areas they termed ‘hormonal’ (Ferriman and Gallwey, 1961). Hatch and colleagues proposed that a combined score of 8 or greater using the mFG score defined the population of women with hirsutism (Hatch et al., 1981), as this degree of hair growth was observed in only 4.3% (i.e. <5%) of the reproductive-age population of women studied by FG (1961). Curiously, Ferriman himself did not give a firm view on a cut-off level, choosing a score of 10 or more seen in 1% of his subjects for some of his studies (Ferriman and Purdie, 1965) or a score of 5 or more when defining hirsutism in later studies (Ferriman and Purdie, 1983).
A study of 154 women donating blood in Madrid, Spain noted that 7.1% demonstrated an mFG score of ≥8 (Asuncion et al., 2000). Similarly, Sagsoz and colleagues observed an mFG score of ≥8 in 8.3% of 204 Turkish women, aged 20–54, attending an outpatient clinic for a regular checkup (Sagsoz et al., 2004).
Alternatively, studying 192 women seeking a free medical evaluation on the island of Lesbos, Greece, Diamanti-Kandarakis and colleagues observed that hirsutism affected 38% of their subjects (Diamanti-Kandarakis et al., 1999). Of note, these investigators used the original FG scoring method assessing all 11 body areas, which would give a higher score than the modified version (assessing nine body areas), but used the relatively low cut-off value of six or more to define hirsutism. Of interest, all their patients with FG scores of 6–12 had eumenorrhea, although the 18 patients with scores of ≥13 (9.7%) had oligomenorrhea. In addition, there was the potential for self-referral bias, as the investigators invited participants through an ‘invitation of free medical examination’. Overall it is possible that these investigators over-estimated the degree of hair growth and the prevalence of hirsutism in this study.
To determine what the hair distribution would be in a population of patients not presenting for a medical evaluation, we studied 633 unselected women (278 white women with a mean age of 37.4 years and 349 black women with a mean age of 23.8 years) presenting for a pre-employment physical exam (DeUgarte et al., 2006). The degree of facial and body terminal hair growth was similar in black and white women and the 95th percentile mFG value of the combined population was 7.7. Overall, 7.5% of the population could be defined as hirsute by an mFG score ≥8. However, we noted that the mFG scores were not normally distributed; principal component and univariate analyses denoted two nearly distinct clusters that occurred above and below an mFG value of 2, with the bulk of the scores below. An mFG score of at least 3 was observed in 22.1% of all subjects (i.e. the upper quartile); of these subjects, 69.3% complained of being hirsute, compared with 15.8% of women with an mFG score below this value, and similar to the proportion of women with an mFG score of at least 8 who considered themselves to be hirsute (70.0%) (DeUgarte et al., 2006). These data indicate that an mFG of 3 or more may signal the population of women whose hair growth falls out of the ‘norm’, at least among black and white women in the USA.
As a specific statistical cut-off value may not reflect the subjective nature of hirsutism, a second possible approach to establishing the cut-off value for defining hirsutism is to recruit women who claim to be hirsute or excessively hairy and compare them to women who feel they have ‘normal’ body hair growth. Lunde and Grottum (1984) used this approach, comparing 113 women complaining of embarrassing excess hair growth and 100 women who did not feel they were excessively hairy, assessing the degree of hair growth using their previously reported method of scoring the amount of body and facial hair (19 different body areas scored 0–3). These authors noticed that a score of 13 with this system differentiated most of the subjects. Derksen and colleagues repeated this experiment using the original 11 body areas assessed by FG, plus the sideburn region (Derksen et al., 1994), comparing 81 healthy volunteers and 71 women complaining of hirsutism. These investigators observed that a score of more than 1 for the chin, upper back, upper abdomen, or upper arm, or more than 2 in any of the other body regions assessed, was abnormal in their Dutch cohort (Derksen et al., 1994).
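The per-area rule reported by Derksen and colleagues differs from a summed cut-off in that each region is judged against its own limit. A minimal Python sketch of that rule follows; the area names are hypothetical labels chosen here, and the function merely restates the two thresholds quoted above.

```python
# Areas in which a score of more than 1 was reported as abnormal in the
# Dutch cohort; every other assessed area uses a limit of 2.
# The key names are hypothetical labels chosen for this sketch.
LOW_LIMIT_AREAS = {"chin", "upper_back", "upper_abdomen", "upper_arm"}


def abnormal_areas(area_scores):
    """Return, in alphabetical order, the areas whose score exceeds its limit."""
    flagged = []
    for area, score in area_scores.items():
        limit = 1 if area in LOW_LIMIT_AREAS else 2
        if score > limit:
            flagged.append(area)
    return sorted(flagged)
```

With this rule a chin score of 2 is flagged whereas a thigh score of 2 is not, so two women with the same summed score can be classified differently depending on where the hair growth is located.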
It is possible that not all body areas evaluated have the same discriminatory power to detect hirsutism; the upper lip, chin, and lower abdomen were observed to be the most discriminatory zones (Lunde and Grottum, 1984; Derksen et al., 1994). In a study of 675 consecutive hyperandrogenic patients using the mFG method, we also observed that a hair growth score of ≥2 on the chin or lower abdomen only was found to be a highly sensitive predictor for hirsutism (Knochenhauer et al., 2000). Consequently, the examination of the chin and abdomen alone was felt to be potentially useful for the study of high-risk populations, particularly those with an expected hirsutism prevalence of >20% (e.g. family studies of patients with PCOS or patients complaining of unwanted hair growth). Alternatively, because its positive predictive value (PPV) was only 58%, the examination of the chin or abdomen only was felt to be of limited value for population screening studies, where the frequency of hirsutism is expected to be low, about 7% (Knochenhauer et al., 2000).
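The dependence of PPV on prevalence follows from the standard definition of positive predictive value, and can be illustrated with a short Python sketch. The sensitivity and specificity used in the usage note are hypothetical values chosen only to show the effect; they are not the figures reported by Knochenhauer et al. (2000).

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV = true positives / (true positives + false positives).

    All three arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0:
        raise ValueError("no positive results are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics, for illustration only.
HYPOTHETICAL_SENSITIVITY = 0.90
HYPOTHETICAL_SPECIFICITY = 0.95

ppv_screening = positive_predictive_value(
    HYPOTHETICAL_SENSITIVITY, HYPOTHETICAL_SPECIFICITY, prevalence=0.07)
ppv_high_risk = positive_predictive_value(
    HYPOTHETICAL_SENSITIVITY, HYPOTHETICAL_SPECIFICITY, prevalence=0.20)
```

With the same hypothetical test, `ppv_high_risk` is larger than `ppv_screening`, which is the reason a limited examination that performs acceptably in a high-risk group can still yield many false positives in an unselected population.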
Overall these data suggest that 5–7% of unselected women of reproductive age have an mFG score of ≥8, which can be used as the cut-off value to define hirsutism. Alternatively, and considering the high prevalence of androgen excess in the population, the normal upper limit mFG score may actually be closer to 2. This is consistent with our recent data indicating that up to 50% of subjects seen with mFG scores of 3–5 have an androgen excess disorder (Souter et al., 2004). Moreover, moderate to severe unwanted hair growth confined to one or two areas of the body, particularly the upper lip, chin and lower abdomen, might prompt a woman to seek medical care even though the total hirsutism score is not above the established cut-off value. Finally, although evaluation of the chin, upper lip or lower abdomen may be useful for detecting hirsutism in high-risk populations, examination of these areas alone has a modest PPV in population studies.
More objective methods of assessing hair growth are available. These include weighing the hairs obtained by dry shaving the body region of interest (demanding a high degree of cooperation by the patient and precise weighing methods) (Casey et al., 1966); measuring the outer diameter of either plucked or clipped hairs (Cumming et al., 1982; Barth et al., 1989; O'Brien et al., 1991; Barth, 1997); determining the density of terminal hairs (i.e. the number of hairs per defined surface area) either by direct counting (Peereboom-Wynia, 1972) or by photography (Hines et al., 2001); and measuring the rate of hair growth using calibrated glass capillary tubes (Jones et al., 1981; Barth, 1997) or photography (Holdaway et al., 1985; Hines et al., 2001) (requiring that patients shave at predetermined intervals of time before the measurement). Another method proposed for measuring androgenization of hairs is to determine the fraction of vellus (unmedullated) hairs in a sample of ~100 shaved hairs, defined as the ‘vellus index’ (Madanes and Novotny, 1987). This index was found to be significantly lower in hirsute women and males compared with healthy women.
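Two of these measures reduce to simple ratios, which the following minimal sketch makes explicit. The function names and the example counts are hypothetical and serve only to illustrate the definitions given above (vellus index as the fraction of vellus hairs in a sample; density as hairs per unit surface area).

```python
def vellus_index(n_vellus: int, n_total: int) -> float:
    """Fraction of vellus (unmedullated) hairs in a sample of shaved hairs."""
    if n_total <= 0:
        raise ValueError("sample must contain at least one hair")
    if not 0 <= n_vellus <= n_total:
        raise ValueError("vellus count must lie between 0 and the sample size")
    return n_vellus / n_total


def hair_density(n_terminal_hairs: int, area_cm2: float) -> float:
    """Number of terminal hairs per square centimetre of skin."""
    if area_cm2 <= 0:
        raise ValueError("surface area must be positive")
    return n_terminal_hairs / area_cm2


# Hypothetical example: 25 vellus hairs in a sample of 100 shaved hairs,
# and 30 terminal hairs counted within a 2 cm^2 field.
print(vellus_index(25, 100))   # 0.25
print(hair_density(30, 2.0))   # 15.0
```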
The objectivity of these direct methods provides a useful means to evaluate hair growth parameters. However, these techniques are primarily useful for assessing the hair growth rate, the extent of terminal hair density in a specific body area, or changes in hair growth (e.g. the therapeutic response in women undergoing therapy for hirsutism). They are much less useful, due to their cost and localized nature, to assess the extent of total body or face terminal hair density, at baseline or over time. None of these methods are suitable for routine clinical practice due to their complexity, cost and low patient acceptance.
Limited data are available comparing subjective and objective measures of hirsutism. Holdaway and colleagues evaluated 34 hirsute women by FG scoring and hair growth rate from the skin in front of the left ear determined by photography before and after anti-androgen treatment (Holdaway et al., 1985). Although baseline hair growth rates demonstrated significant correlation with physician-rated FG scores, the reduction in hair growth rate was not correlated with the improvement in FG scores. Barth compared FG scores and direct measurements of hair growth and hair shaft diameter on the pre-auricular area of the face, the forearm, the anterior abdominal wall and the anterior thigh in 88 hirsute women (Barth, 1997). This investigator reported a significant correlation between the FG scores and the hair diameter measurements on the forearm, abdominal wall, and thigh; alternatively the FG score did not correlate with hair diameter measurement from the pre-auricular area of the face or linear growth rates at any of the four regions.
To objectively evaluate hair growth in the face and abdomen in hirsute patients, and compare it to the subjective assessment of whole body hair growth by the mFG score, we developed a computer-aided image analysis system capable of measuring several growth parameters, including density (number of hairs per surface area) and hair length (and hair growth rate over time) by photography, and hair diameter and length (and growth rate over time) by microscopic assessment of plucked hairs (Azziz et al., 1995; Hines et al., 2001). In brief, facial and abdominal skin areas are shaved, and 3–5 days later the areas are photographed through a calibrated glass plate and five terminal hairs are plucked from each area for microscopic assessment.
First, we assessed 17 hirsute women before and during 6 months of randomized treatment with either leuprolide depot (3.75 mg/month) plus add-back therapy (n = 9) or an oral contraceptive pill (OCP; n = 8). A significant percent decrease in the mFG score was noted in the leuprolide but not the OCP-treated group. Consistent with the mFG assessment, patient self-evaluation indicated that seven (78%) of the leuprolide-treated patients, compared with only two (25%) of the OCP-treated women, noted an improvement in their hair growth. Also consistent with these subjective assessments, hirsute women treated with leuprolide, but not OCP, demonstrated a decrease in their facial hair growth rate using either the photographic method or plucked hairs. In contrast, no significant difference in the mean facial hair density or outer hair diameter was noted with either therapy.
In a second study we assessed 20 hirsute women (12 white and 8 black) (Hines et al., 2001). Our results indicated that facial hairs were distributed in greater density and had a greater diameter than abdominal hairs (15.6 ± 14.2 versus 5.4 ± 1.9 hairs/cm² and 84.5 ± 19.5 versus 66.2 ± 17.5 µm, respectively, P < 0.005). In contrast, the growth rates of facial and abdominal hairs were similar, whether determined photographically (0.36 ± 0.18 versus 0.43 ± 0.19 mm/day, respectively) or from plucked hairs (1.2 ± 0.2 versus 1.4 ± 0.4 mm/day, respectively). The abdominal hair diameter correlated with the abdominal mFG score (r = 0.81) and the total mFG score (r = 0.68). In a multiple regression analysis, the total mFG score was associated with the mean abdominal and facial hair diameter and the facial growth rate assessed by photography (r = 0.95), but not the abdominal growth rates assessed from photographs or plucked hairs, or the facial growth rate assessed from plucked hairs.
Taken together, these data suggest that assessment of terminal hair density and changes in hair growth on the face or body by the mFG score correlates loosely with more objective measures of hair number (density), growth rate or diameter. Discrepancies may be due to a lower sensitivity of the visual method, to the limitation of objective measures to assess hair content globally, or to local variations in hair growth and response. In addition, visual scales like the mFG are much less complex and expensive to undertake, and are ideal for population studies. Finally, objective methods suffer from technical limitations: unusable photographs (Azziz et al., 1995), patient discomfort, and up to 25% inter-observer variation (although most measures had a 5–7% variability) (Hines et al., 2001). Overall, these data support the continued use of visual scales for the assessment of excess hair growth and the response to therapy in hirsute women.
Hair is second only to skin color as a feature of racial difference (Greenblatt, 1983; Simpson and Barth, 1997). The number of hair follicles per unit skin area and the rate of hair growth vary among ethnic and racial groups. Thus, the degree of body hair growth, and consequently the cut-off value for diagnosing hirsutism, may be affected by ethnicity and race. Black and white individuals have been reported to have the same number of facial hairs (Trotter, 1922), although black individuals have also been reported to have 20–40% fewer hair follicles on scalp biopsies, compared with whites (Sperling, 1999). However, our data from the evaluation of large numbers of androgenized women suggest that the degree of body and terminal hair growth and the prevalence of hirsutism are not significantly different between unselected white and black women (Knochenhauer et al., 1998; DeUgarte et al., 2006), which implies that the specific cut-off value for defining hirsutism is similar in these ethnic groups.
However, it is unlikely that women of Asian extraction would have similar degrees of hair to that of white or black women. For example, males of East Asian origin (Japan, Taiwan, Korea, China, Vietnam) were found to be less hairy than their similarly aged EuroAmerican (white) counterparts, despite similar androgen levels (Ewing and Rouse, 1978). Asian individuals have been found to have fewer scalp hair follicles than either black or white individuals (Lee et al., 2002; Tsai et al., 2002). Consistent with these findings, in a study of 531 Thai women seen for an uncomplicated annual gynecological exam, 97.8% of all subjects had an mFG score of 2 or less, and none of the subjects had a score above 5 (Cheewadhanaraks et al., 2004). This finding was observed regardless of ethnic subgroup (Thai-Thai, Muslim-Thai, Chinese-Thai). The low prevalence of terminal hair growth and hirsutism in the Thai population contrasts with data indicating that 4.3% of a Caucasian population in the UK (Ferriman and Gallwey, 1961), 7.1% in Spain (Asuncion et al., 2000), 8.3% in Turkey (Sagsoz et al., 2004), and 5.4% in the USA (DeUgarte et al., 2006), and 4.3% of black women in the USA (DeUgarte et al., 2006), have an mFG ≥ 8. Finally, in a study of 623 unbiased women from the general population of the Shandong region of China an mFG score of ≥2 was observed in 48% of PCOS patients, but only 4.8% of controls (Zhao et al., 2007).
Overall, it is possible that ethnic and racial variation in body hair growth impacts on the definition and prevalence of hirsutism. An mFG score of ≥6–8 is present in 5–9% of many female populations, and may be used as the cut-off value for defining hirsutism. However, in populations of Asian women of Mongoloid extraction (which include Chinese, Japanese, Koreans, American Indians and Eskimos) the overall density of facial and body terminal hair growth is lower, and the definition of hirsutism may therefore require a lower cut-off value (e.g. mFG score of 2–3).
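The effect of a population-specific cut-off can be illustrated with a minimal sketch. It assumes the usual layout of the mFG method (nine body areas, each scored 0–4, summed to a total of 0–36); the area labels, function names and default cut-off of 8 are illustrative assumptions, and the appropriate cut-off should be taken from data for the population under study.

```python
from typing import Mapping

# Nine body areas assumed for the mFG method; labels are illustrative.
MFG_AREAS = (
    "upper_lip", "chin", "chest", "upper_abdomen", "lower_abdomen",
    "upper_arm", "thigh", "upper_back", "lower_back",
)


def total_mfg_score(area_scores: Mapping[str, int]) -> int:
    """Sum the per-area scores after checking that each is an integer 0-4."""
    if set(area_scores) != set(MFG_AREAS):
        raise ValueError("scores must be given for exactly the nine mFG areas")
    for area, score in area_scores.items():
        if not isinstance(score, int) or not 0 <= score <= 4:
            raise ValueError(f"score for {area} must be an integer from 0 to 4")
    return sum(area_scores[area] for area in MFG_AREAS)


def is_hirsute(total_score: int, cutoff: int = 8) -> bool:
    """Classify as hirsute when the total reaches the chosen cut-off."""
    return total_score >= cutoff


# Hypothetical patient with hair growth confined to three areas.
scores = {area: 0 for area in MFG_AREAS}
scores.update({"upper_lip": 2, "chin": 3, "lower_abdomen": 2})

total = total_mfg_score(scores)
print(total)                        # 7
print(is_hirsute(total))            # False with a cut-off of 8
print(is_hirsute(total, cutoff=3))  # True with a lower cut-off of 3
```

The same total score is therefore classified differently depending on the cut-off adopted, which is the practical consequence of ethnic variation in terminal hair growth.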
Because hirsutism is a frequent sign of underlying androgen excess, clinicians need to be readily able to assess the amount of facial or body hair growth present in patients. Once excess unwanted hair growth or hirsutism is identified, a systematic search for its cause and a subsequent treatment plan should be instituted (see Azziz et al., 2004, for review). Hyperandrogenism or androgen excess represents a common reproductive endocrinopathy affecting between 5 and 10% of reproductive-aged women (Azziz et al., 2004; Carmina et al., 2005). The most common hyperandrogenic disorder is PCOS, affecting approximately 80–85% of women with androgen excess (Azziz et al., 2004; Ehrmann, 2005). In fact, hirsutism is included as a diagnostic criterion for PCOS in all three currently available definitions of the disorder (Zawadzki and Dunaif, 1992; Fauser, 2004; Azziz et al., 2006).
Androgen excess can be established by biochemical and/or clinical measures. Biochemical measurement of circulating androgen levels is useful for detecting hyperandrogenemia in women, although these assays are fraught with their own limitations (Rosner et al., 2007; Azziz, 2008). In turn, hirsutism is the most commonly used clinical diagnostic criterion of androgen excess. When serum androgen levels are apparently normal in hirsute women, it is likely that either the androgen levels are abnormally elevated for the individual patient, albeit still within the ‘normal’ range of the population, or there is increased local tissue sensitivity to circulating androgens (Azziz et al., 2000). In fact, when defined strictly, true ‘idiopathic’ hirsutism is relatively rare, affecting only approximately 5% of all hirsute women (Azziz et al., 2000).
The prevalence of hirsutism among women with PCOS appears to be high, at least among affected white and black women. For example, Conway and colleagues evaluated 556 patients in the UK with polycystic ovaries on ultrasound, defining hirsutism by a score >8 using the FG method (Conway et al., 1989). After excluding 20% of the patients, mainly due to receiving hormonal therapy at presentation, they observed that ~77% had a normal total testosterone level, of whom 59% were hirsute. By comparison, 78% of patients with high testosterone levels were hirsute. Among 316 women with PCOS from the USA defined by the NIH 1990 criteria, 48% had both hirsutism and hyperandrogenemia, 29% had solely hyperandrogenemia without hirsutism and 23% had solely hirsutism without hyperandrogenemia (Chang et al., 2005). Baseline data from the two largest US clinical trials defining PCOS based on biochemical hyperandrogenemia and involving over 900 women with PCOS indicated a hirsutism prevalence of 50–80% (Azziz et al., 2001; Legro et al., 2007).
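The overall frequency of each feature in the cohort of Chang et al. (2005) follows from adding the overlapping and non-overlapping groups. The short sketch below performs only this arithmetic on the percentages quoted above; the variable names are our own.

```python
# Percentages reported for 316 women with PCOS (Chang et al., 2005).
both = 48                     # hirsutism and hyperandrogenemia
hyperandrogenemia_only = 29   # hyperandrogenemia without hirsutism
hirsutism_only = 23           # hirsutism without hyperandrogenemia

# The three mutually exclusive groups account for the whole cohort.
assert both + hyperandrogenemia_only + hirsutism_only == 100

hirsutism_total = both + hirsutism_only
hyperandrogenemia_total = both + hyperandrogenemia_only

print(f"any hirsutism: {hirsutism_total}%")                  # 71%
print(f"any hyperandrogenemia: {hyperandrogenemia_total}%")  # 77%
```

The resulting 71% with hirsutism falls within the 50–80% range reported for the two large US clinical trials.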
Consistent with the lower population prevalence of hirsutism observed in Asian women, a comparative study of patients with PCOS from the USA (primarily Mexican-Americans), Italy and Japan noted that Japanese women had a significantly lower mean hirsutism score than their non-Asian counterparts (Carmina et al., 1992). However, the lesser prevalence of hirsutism among Asian PCOS patients may not extend to all groups in the region. For example, Wijeyaratne and colleagues observed that hirsutism was more prevalent and more severe among PCOS patients of Southern Asian extraction (Pakistani, Bengali, Gujarati or Dravidian Indian) than whites (Wijeyaratne et al., 2002). Likewise, in a study of women with PCOS in New Zealand, about half of European and Maori patients presented with clinical evidence of hirsutism, whereas only one third and one fourth of women of Indian and Pacific Island descent, respectively, demonstrated hirsutism (Williamson et al., 2001). Additionally, Maori and Pacific Island women were more obese and had higher androgen levels compared with the other two ethnic groups in this study (Williamson et al., 2001). Of course, variability in the assessment method and observer bias cannot be ruled out.
Overall, the presence of hirsutism is a strong indicator of androgen excess (e.g. PCOS), with over 85% of hirsute women demonstrating some form of hyperandrogenism. Although black and white women with PCOS appear to demonstrate similar degrees of hirsutism, it is likely that the degree of excess hair growth will vary according to ethnicity or race.
The method of visually assessing the amount of facial and body terminal hairs (>5 mm in length) most readily used today is a modification of the system originally proposed by FG. In most populations, with the exception of Mongoloid races, an mFG score of ≥6–8 signifies hirsutism. However, we should note that a ‘normal’ amount of hair may be an mFG of <2–3, and a significant proportion of women with mFG scores of 2–6 have androgen excess, primarily PCOS, regardless of ethnicity. Although there are a number of objective methods available for the measurement of hair growth, density, and diameter, these methods are not suitable for routine clinical practice mainly due to their complexity and cost. The use of photographic depictions of the mFG scoring system, such as that presented (Fig. 3), should assist in further standardizing the visual examination of women for excess terminal hair growth, and facilitate studies of hirsutism and androgen excess disorders, including PCOS.
These studies were supported in part by grants K24-HD01346 and R01-HD29364 of the National Institutes of Health of USA (RA) and the Helping Hand of Los Angeles, Inc. (RA).
We would like to acknowledge the kind assistance of Annabel Ferriman, Prof. Stephen Franks, Teresa Clarke of North Middlesex Hospital, Catherine Davison of the University of Sheffield, and Luca Dussin of the Royal College of Physicians for providing information and insight on Drs. Ferriman and Gallwey; James Morrow for his invaluable assistance in gathering all the minute information presented; and all our patients for their kind and anonymous contribution to our project.