As a reference, the amount of phenol in each freeze-dried and spray-dried sample was analysed by RP-UPLC. Phenol elutes earlier than insulin (0.7 min versus 2.7 min) due to its more hydrophilic nature, and baseline separation is obtained for the elution peaks. The concentration injected is correlated with the elution peak area for both phenol and insulin in the tested concentration range. The residual amount of phenol in the dried samples depends strongly on the drying method. The freeze-dried samples contain significantly more residual phenol after drying than comparable spray-dried samples (Fig. ). At initial phenol/insulin ratios below 10, the amount of retained phenol is above 90% in freeze-dried samples and below 40% in spray-dried samples. Furthermore, the residual amount of phenol for both freeze-dried and spray-dried samples is not linearly correlated with the initial amount of phenol prior to drying. At higher initial concentrations, an increase in phenol concentration only slightly increases the residual amount of phenol after drying, and the relative retention of phenol decreases significantly for both drying methods (Fig. ). The retention of phenol is unaffected by the drying temperature in the spray drier (data not shown), indicating that film formation at the surface of the droplet is independent of the temperature in the tested temperature range (2). Previous studies have shown that for insulin concentrations below 15 mg/mL film formation is slow, resulting in highly wrinkled particles (31). This results in a lower retention of phenol in the spray-dried samples compared with samples where the film is formed faster. For freeze-dried samples, film formation is not critical, and higher amounts of phenol are retained compared with spray-dried samples.
Fig. 1 The phenol/insulin ratio after freeze drying (circles) and spray drying (squares) plotted as a function of the phenol/insulin ratio before drying (a). The phenol/insulin ratio is determined by RP-UPLC as mole per mole. The percentage of the phenol/insulin ...
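The relative retention discussed above is simply the ratio of the phenol/insulin ratios after and before drying. A minimal sketch of that calculation follows; the function name and the example values are hypothetical and are not taken from the measured data.

```python
def phenol_retention_percent(ratio_before, ratio_after):
    """Percentage of phenol retained after drying.

    Both arguments are phenol/insulin ratios in mol/mol, as determined
    by RP-UPLC before and after drying.
    """
    if ratio_before <= 0:
        raise ValueError("ratio_before must be positive")
    return 100.0 * ratio_after / ratio_before


# Hypothetical ratios, chosen only to illustrate the calculation.
freeze_dried = phenol_retention_percent(ratio_before=6.0, ratio_after=5.7)
spray_dried = phenol_retention_percent(ratio_before=6.0, ratio_after=2.1)
print(f"freeze-dried: {freeze_dried:.0f}% retained")
print(f"spray-dried:  {spray_dried:.0f}% retained")
```

Plotting this percentage against the initial ratio would make the non-linear relationship described in the text directly visible.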
Fourier Transformed Infrared Spectroscopy
In order to establish characteristic spectral bands of solid phenol in the mid-infrared range, an amorphous precipitate of phenol on a KBr pellet was prepared, and an infrared spectrum was recorded (Fig. ). The infrared spectrum is similar to the recorded spectrum of crystal phenol indicating a rapid crystallisation of the amorphous phenol precipitate (data not shown).
FTIR spectra of dried phenol (a) and dried insulin (b). Band assignments are depicted in the figure
Several distinct bands are present in the infrared spectrum of phenol which can be used for quantification. The three bands at 1,593, 1,496 and 1,475 cm−1 are assigned to the stretching of the carbon atoms in the phenyl ring, υ(CC). The alcohol group in the phenol molecule gives rise to two quite strong bands at 1,365 and 1,224 cm−1, respectively. The first band originates from the in-plane O–H bending, δ(OH), and the second band is associated with the C–O stretching, υ(CO). The two bands at 813 and 750 cm−1 are due to the out-of-plane bending vibrations of the C–H bond in the phenyl ring, π(CH). The band at 690 cm−1 originates from the out-of-plane bend of the carbon atoms in the phenyl ring, π(CC), and is quite strong in the spectrum (32).
The mid-infrared spectrum of freeze-dried insulin without phenol is given in Fig. . The spectrum is dominated by the amide I, amide II and amide III bands. The amide I band from 1,700 to 1,600 cm−1 has been assigned to the stretching of carbonyl groups in the protein backbone, υ(C=O), and depends on the bond angles and hydrogen bonding of the carbonyl groups, correlating with the secondary structure of the protein. The amide II band (1,580–1,500 cm−1) is more complex and originates from the combination of δ(NH) and υ(CN) modes. The broad band from 1,350 cm−1 to 1,200 cm−1 forms the amide III band and is associated with the combination of υ(CN) and δ(NH), but vibrations of the amino acid side chains in the protein also contribute significantly to the band (34). A comparison of the phenol and dried insulin spectra indicates that the region below 1,100 cm−1 is best suited for phenol quantification in dried insulin formulations, as this region contains strong bands originating from phenol and contains no significant insulin-related bands.
Multivariate Analyses of FTIR Spectra
To further investigate the ability to differentiate insulin powders dried with different amounts of phenol, a PCA was performed on the second derivative spectra in the range of 2,000 cm−1 to 600 cm−1 (Fig. ). In Fig. , the samples are clustered according to the phenol/insulin ratio. Three principal components explain 98.7% of the variation in the spectral data set. The first principal component (52.2%) explains part of the phenol/insulin ratio in the samples, as seen by peaks at 813, 755 and 695 cm−1 in the loading plot corresponding to the π(CC) and π(CH) vibrations of phenol (Fig. ). In addition, the α-helix content in the samples is depicted in the first principal component with a maximum at 1,656 cm−1, and the α-helix content increases with a decreasing score (Fig. ). The second derivative transformation reverses the sign of the spectra, so a peak in the loading corresponds to a decrease in the original spectra. Part of the increase in α-helix content is correlated with the phenol/insulin ratio, as an increase in ratio results in an increase in the α-helix content of the insulin hexamer (12). However, some of the variation cannot be explained by the phenol/insulin ratio and is attributed to unknown variations in the dried powders. The second principal component (43.1%) explains more of the variation in the phenol/insulin ratio, as seen from the loading plot, which is dominated by the bands observed in the phenol spectrum. The peaks observed in the loading plot at 1,593, 1,498, 1,473, 813, 755 and 695 cm−1 correspond to υ(CH) and π(CC) vibrations in the phenol spectrum (Fig. ). Only minor variations are observed for the insulin spectrum in the second principal component. The third principal component (3.4%) shows little variation in the phenol/insulin ratio, but a significant variation in the insulin spectrum (data not shown).
Fig. 3 Multivariate analysis of second derivative FTIR spectra. Score plot of PC1 against PC2 from PCA, coloured according to phenol/insulin ratio (a). Loading plots of PC1 and PC2 from PCA (b). Correlation between phenol/insulin ratio obtained by the PLS model ...
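The sign reversal introduced by the second derivative transformation can be illustrated on a synthetic absorption band. The sketch below uses a plain central-difference derivative on a noise-free Gaussian band; this is an assumption made for illustration, since the derivative algorithm used for the measured spectra is not stated here (smoothing derivatives such as Savitzky-Golay are common for real, noisy spectra).

```python
import numpy as np


def second_derivative(spectrum, step):
    """Central-difference second derivative of an evenly sampled spectrum.

    The two end points are left at zero because the central difference
    is not defined there.
    """
    spectrum = np.asarray(spectrum, dtype=float)
    d2 = np.zeros_like(spectrum)
    d2[1:-1] = (spectrum[2:] - 2.0 * spectrum[1:-1] + spectrum[:-2]) / step**2
    return d2


# Synthetic band centred at 750 cm-1 (position chosen to mimic a phenol band).
wavenumber = np.arange(600.0, 900.0, 1.0)
band = np.exp(-0.5 * ((wavenumber - 750.0) / 8.0) ** 2)

d2 = second_derivative(band, step=1.0)
peak_index = int(np.argmax(band))

# The absorbance maximum becomes a minimum (negative value) in the
# second derivative spectrum.
print(wavenumber[peak_index], d2[peak_index] < 0)
```

This is why a positive peak in a loading computed from second derivative spectra points to lower absorbance in the original spectra at that wavenumber.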
The influence of the phenol/insulin ratio on the FTIR spectra was shown via PCA, and subsequently the correlation between RP-UPLC and FTIR spectra was investigated using PLS. An overview of the results from several models is given in Table . For the untreated spectra, five PLS components were needed, all showing features from phenol and insulin spectra. The first PLS component of the model explains 91.8% of the variation in the spectra, but explains only 7.8% of the phenol/insulin ratio. The second and third PLS components account for 5.7% and 1.6% of the variation in the spectra, and 47.1% and 34.0% of the variation in the phenol/insulin ratio. The fourth and fifth PLS components use only 0.7% and 0.1% of the variation in the spectra to explain 5.0% and 4.0% of the variation in the phenol/insulin ratio (data not shown). Thus, most of the variation in the phenol/insulin ratio in the samples is described by a small amount of variation in the spectra. By reducing the spectral range of the data set, better models were obtained where more of the variation in the spectra was used to describe the majority of the variation in the phenol/insulin ratio. For the spectral range of 850–650 cm−1, which contains the strong bands for π(CH) and π(CC) vibrations, the RMSECV decreases to a phenol/insulin ratio of 0.48, based on only two PLS components. The first component still describes the main variation in the spectra (95.6%) and only a small part of the variation in the phenol/insulin ratio (18.0%), whereas the second PLS component primarily describes the variation in the phenol/insulin ratio (80.7%) based on 4.1% of the spectral variation.
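To make the role of the number of PLS components and of the RMSECV concrete, the following is a minimal sketch of a single-response PLS regression (PLS1, NIPALS algorithm) with leave-one-out cross-validation. It is not the software or validation scheme used in this work: the leave-one-out scheme, the function names and the synthetic spectra (one band scaled by the phenol/insulin ratio plus a random baseline offset) are all assumptions made for illustration.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS. Returns (x_mean, y_mean, regression coefficients)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)      # weight vector
        t = Xk @ w                     # scores
        tt = t @ t
        p = Xk.T @ t / tt              # spectral loading
        c = (yk @ t) / tt              # response loading
        Xk = Xk - np.outer(t, p)       # deflate spectra
        yk = yk - c * t                # deflate response
        weights.append(w)
        loadings.append(p)
        y_loadings.append(c)
    W = np.array(weights).T
    P = np.array(loadings).T
    q = np.array(y_loadings)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return y_mean + (X - x_mean) @ coef


def rmsecv_loo(X, y, n_components):
    """Root mean square error of leave-one-out cross-validation."""
    n = len(y)
    predicted = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        model = pls1_fit(X[keep], y[keep], n_components)
        predicted[i] = pls1_predict(model, X[i:i + 1])[0]
    return float(np.sqrt(np.mean((predicted - y) ** 2)))


# Synthetic data: a band at 750 cm-1 scaled by the ratio, plus a baseline offset.
rng = np.random.default_rng(0)
wn = np.linspace(650.0, 850.0, 101)
band = np.exp(-0.5 * ((wn - 750.0) / 10.0) ** 2)
ratio = rng.uniform(0.0, 10.0, size=20)
offset = rng.normal(0.0, 1.0, size=(20, 1))
spectra = 0.1 * np.outer(ratio, band) + offset + rng.normal(0.0, 0.001, (20, 101))

rmse_one = rmsecv_loo(spectra, ratio, n_components=1)
rmse_two = rmsecv_loo(spectra, ratio, n_components=2)
print(f"RMSECV, 1 component: {rmse_one:.3f}; 2 components: {rmse_two:.3f}")
```

In this toy data set the baseline offset dominates the spectral variation, so a single component is not enough and a second one is needed to isolate the band, which mirrors the behaviour described for the untreated spectra.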
PLS Model Overview for the Correlation Between UPLC and IR in the Determination of Phenol/Insulin Ratio Utilising Various Pre-Treatments and Wavenumber Ranges
For all models based on the untreated data, the first PLS component only describes a small part of the variation in the phenol/insulin ratio and a large part of the variation in the spectra, corresponding to the large variation in baseline offset. To overcome this variation, different transformations of the spectral data set were evaluated. An SNV transformation has been used in NIR spectroscopy to reduce the variation in the baseline offset (25). However, for the FTIR spectra, the obtained models contained the same number of principal components and did not explain more of the data compared with models based on raw spectral data. The best models were obtained with the second derivative spectra, where only one PLS component is sufficient to describe both the variation in the spectra and the phenol/insulin ratio, yielding the lowest RMSECV value (0.43). Both R2 (0.977) and Q2 (0.964) show a high degree of correlation between the actual phenol/insulin ratio measured by UPLC and the phenol/insulin ratio predicted from the PLS model (Fig. ). The PLS component is dominated by the π(CH) and π(CC) bands at 813, 750 and 690 cm−1, as observed in the phenol spectrum (Fig. ). PLS models based on only freeze-dried or spray-dried samples were not superior to the models based on all dehydrated samples.
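For readers unfamiliar with the SNV transformation mentioned above, a minimal sketch is given here. Each spectrum is centred on its own mean and scaled by its own standard deviation, which removes additive offsets and multiplicative scaling between spectra. The use of the sample standard deviation (`ddof=1`) is an assumption; the exact implementation used for the reported models is not stated in the text.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) separately."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)  # ddof=1 is an assumption
    return (spectra - mean) / std


# Two hypothetical spectra with the same shape but different offset and scale.
base = np.array([0.10, 0.40, 0.90, 0.30, 0.20])
shifted = 2.0 * base + 0.5

corrected = snv([base, shifted])
print(np.allclose(corrected[0], corrected[1]))
```

After correction the two spectra coincide, so any remaining differences between real samples reflect band shape rather than baseline offset.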
Near Infrared Spectroscopy
NIR spectra are sensitive to baseline offsets, and the NIR spectra in Fig. were therefore SNV-corrected prior to analysis. An NIR spectrum of crystalline phenol was used as reference since amorphous phenol was found to recrystallise rapidly. The NIR spectrum of phenol is characterised by fewer and broader bands, which have a lower intensity compared with the FTIR spectrum of phenol (Fig. ). This is expected due to the lower molar absorptivity of NIR bands, which originate from overtones and combination bands. However, several distinct bands are present in the NIR spectrum of phenol which can be used for quantification. The spectral region 6,600–5,600 cm−1 is assigned to overtones, and a distinct band is found in this region at 5,995–5,955 cm−1, corresponding to the first overtone of C–H stretching, 2υ(CH). The bands found in the region 5,000–4,000 cm−1 are all combination bands mostly assigned to C–H stretching, υ(CH) + υ(CH). For phenol, the bands at 4,648, 4,550, 4,300 and 4,046 cm−1 are C–H combination bands (35). No apparent O–H stretching band is observed in the solid phenol samples.
SNV-corrected NIR spectra of dried phenol (a) and dried insulin (b). Band assignments are depicted in the figure
The NIR spectrum of freeze-dried insulin without phenol is given in Fig. . The bands in the region 5,900–5,700 cm−1 are, as in the phenol spectrum, the first overtones of the C–H stretching. The spectrum is dominated by combination bands in the region 5,200–4,000 cm−1, similar to bands found for other proteins (37). Most of these bands have been tentatively assigned to combinations of amide I, amide II, amide III, amide A, amide B and C–H stretching bands (38). The band at 5,150 cm−1 originates from the combination band of water and can be used for the quantification of water in the dried samples (40). Comparing the spectra of dried phenol and dried insulin, several bands of insulin and phenol overlap, especially in the region below 5,000 cm−1, which usually contains the strongest absorption bands. However, the bands at 6,105–6,045 cm−1 and 5,995–5,955 cm−1 in the phenol spectrum are located in a spectral region that is rather featureless in the insulin spectrum. Therefore, these bands may be suitable for quantifying the phenol/insulin ratio in the samples.
Multivariate Analyses of NIR Spectra
PCA was performed on the SNV-corrected NIR spectra in order to further evaluate the possibility to differentiate insulin powders dried with different amounts of phenol (Fig. ). Three PCs were used to describe 95.4% of the variation in the NIR spectra. The first principal component describes 73.5% of the variation in the data set, and the loading of PC1 is dominated by a main peak at 5,150 cm−1 and a second broader peak at 6,900 cm−1 (Fig. ). Both peaks originate from the absorption bands of water, where the first corresponds to the combination band and the second to the first overtone of water. Only minor contributions from the phenol spectrum are observed. Thus, the separation found along PC1 is dominated by the difference in the water content of the samples. The main feature of PC1 is the separation between the freeze-dried and spray-dried samples (Fig. ). In general, the freeze-dried samples have a negative score on PC1. In contrast, most spray-dried samples have a positive score on PC1, correlating with a positive peak in the loading plot for the two water absorption bands. Thus, the spray-dried samples contain more residual water than the freeze-dried samples. The second principal component describes 15.2% of the total variation in the spectral data set. The loading plot indicates that the variation is mainly due to the variation in the spectral region 5,000–4,000 cm−1 corresponding to the combination bands (Fig. ). Most of the combination bands can be credited to the insulin spectra, and only minor contributions from phenol are observed. In summary, most of the variation is found in the spectral region 5,000–4,000 cm−1, where differentiation between phenol and insulin is difficult. The water content is inversely correlated with phenol/insulin ratio as seen from opposing peaks in the loading plots (Fig. ).
Fig. 5 Multivariate analysis of SNV-corrected NIR spectra. Score plot of PC1 against PC2 from PCA, coloured according to drying method (a). Loading plots of PC1 and PC2 from PCA (b). Correlation between phenol/insulin ratio obtained by the PLS model based on ...
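The way a dominant source of variation, such as the water content, ends up in the first principal component can be sketched with a PCA computed by singular value decomposition. The synthetic spectra below (a strongly varying band at 5,150 cm−1 and a weakly varying band at 5,975 cm−1) are hypothetical and only mimic the situation described above; they are not the measured NIR data.

```python
import numpy as np


def pca_svd(X):
    """PCA of mean-centred data via SVD.

    Returns scores, loadings (one row per component) and the fraction of
    variance explained by each component.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return U * s, Vt, explained


# Synthetic NIR-like spectra: large water variation, small phenol variation.
rng = np.random.default_rng(1)
wn = np.arange(4000.0, 7400.0, 10.0)
water_band = np.exp(-0.5 * ((wn - 5150.0) / 60.0) ** 2)
phenol_band = np.exp(-0.5 * ((wn - 5975.0) / 20.0) ** 2)
water = rng.uniform(0.0, 1.0, size=12)
phenol = rng.uniform(0.0, 0.2, size=12)
spectra = (np.outer(water, water_band) + np.outer(phenol, phenol_band)
           + rng.normal(0.0, 0.001, size=(12, wn.size)))

scores, loadings, explained = pca_svd(spectra)
pc1_peak = wn[np.argmax(np.abs(loadings[0]))]
print(f"PC1 explains {100 * explained[0]:.1f}% of the variance")
print(f"largest PC1 loading at {pc1_peak:.0f} cm-1")
```

Because the sign of a principal component is arbitrary, the loading is inspected through its absolute value here; in practice the sign has to be read together with the score plot, as done in the text.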
PLS models were constructed by correlating the SNV-corrected spectra with the phenol/insulin ratio measured by RP-UPLC. An overview of the obtained models is given in Table . The first model is based on the spectral range 7,400–4,000 cm−1 and yields a RMSECV of 0.59 with four PLS components. The first PLS component describes 71.7% of the variation in the spectra and 41.8% of the variation in the phenol/insulin ratio. The next three PLS components describe 15.3%, 7.1% and 2.9% of the variation in the spectra and 28.7%, 16.1% and 11.8% of the variation in the phenol/insulin ratio (data not shown). All the PLS components show distinct peaks originating from the phenol spectrum. In addition, the first PLS component contains information about the water content. In order to obtain simpler and more robust models, the spectral ranges 6,200–4,500 cm−1 and 6,200–5,800 cm−1 were tested. The spectral range 6,200–4,500 cm−1 contains the most prominent bands in the phenol spectrum as well as the water band, whereas 6,200–5,800 cm−1 contains the broad phenol band between 5,995 and 5,955 cm−1. The spectral range 6,200–4,500 cm−1 results in a model with a higher RMSECV (0.63) using four PLS components, and the model is therefore inferior to the first model. However, in the range 6,200–5,800 cm−1, only three PLS components are needed to achieve an RMSECV of 0.37, resulting in a better model compared with the model based on the whole spectral range (Fig. ). The first PLS component corresponds to the large absorption band in the phenol spectrum (5,995–5,955 cm−1). It explains 93.8% of the spectral variation and 96.3% of the variation in the phenol/insulin ratio. The next two PLS components describe only minor variations in the spectral data and phenol/insulin ratio (Fig. ). The third component shows a band between 6,140 and 6,040 cm−1, corresponding to the small band between 6,145 and 6,100 cm−1 that can be seen in the phenol spectrum. 
It is therefore reasonable to include the third component into the model, while further components would lead to over-fitting of the model.
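Restricting a model to a narrower spectral range, as done above, amounts to masking the wavenumber axis before the regression. A minimal sketch follows; the 4 cm−1 sampling interval and the array names are assumptions chosen for illustration.

```python
import numpy as np


def select_range(wavenumbers, spectra, high, low):
    """Restrict spectra (rows = samples) to the window [low, high] in cm-1."""
    wavenumbers = np.asarray(wavenumbers, dtype=float)
    spectra = np.asarray(spectra, dtype=float)
    mask = (wavenumbers >= low) & (wavenumbers <= high)
    return wavenumbers[mask], spectra[:, mask]


# Hypothetical axis covering 7,400-4,000 cm-1 at an assumed 4 cm-1 spacing.
wn = np.arange(4000.0, 7401.0, 4.0)
spectra = np.zeros((3, wn.size))  # placeholder for three measured spectra

sub_wn, sub_spectra = select_range(wn, spectra, high=6200.0, low=5800.0)
print(sub_wn.min(), sub_wn.max(), sub_spectra.shape)
```

The reduced matrix can then be passed to the same PLS routine as the full-range data, so that models built on different windows are compared on equal terms.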
Comparison of Methods
PLS models based on the FTIR and NIR spectra are both able to predict the phenol/insulin ratio. The best FTIR PLS model is based on the second derivative spectra in the spectral range of 850–650 cm−1 (Tables and ). The best NIR PLS model is constructed from the SNV-corrected spectra in the range of 6,200–5,800 cm−1. The R2 and Q2 are above 0.96 for all models (Table ), indicating a strong predictive strength. The biggest difference between the basic models and the optimised models is the number of PLS components. For models containing a higher number of PLS components the first PLS components did not explain the variation in the phenol/insulin ratio. For models based on large spectral regions, the first PLS components are not related to the phenol/insulin ratio but to some other variation in the samples making the models less robust. This variation might be important in the overall understanding of the sample set and in the identification of CPPs, but not for the quantification of the phenol/insulin ratio. The RMSECV values are similar for all the models. The best model based on the FTIR spectra yielded a RMSECV of 0.43 phenol/insulin ratio, while the best model based on NIR spectra yielded a RMSECV of 0.37 phenol/insulin ratio. These molar ratios correspond to mass percentages of 0.69% and 0.60% (w/w), respectively.
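The conversion from a molar phenol/insulin ratio to a mass percentage can be checked with a short calculation. The sketch below assumes a molar mass of 5,808 g/mol for insulin (human insulin; the value is not stated in the text) and expresses the phenol mass relative to the combined mass of phenol and insulin, which is also an assumption about how the percentage is defined.

```python
M_PHENOL = 94.11     # g/mol
M_INSULIN = 5808.0   # g/mol, assumed value for human insulin


def ratio_to_mass_percent(molar_ratio, m_phenol=M_PHENOL, m_insulin=M_INSULIN):
    """Convert a phenol/insulin ratio (mol/mol) to phenol mass percent (w/w)."""
    phenol_mass = molar_ratio * m_phenol
    return 100.0 * phenol_mass / (phenol_mass + m_insulin)


for rmsecv in (0.43, 0.37):
    print(f"{rmsecv} mol/mol -> {ratio_to_mass_percent(rmsecv):.2f}% (w/w)")
```

Under these assumptions the two RMSECV values convert to approximately 0.69% and 0.60% (w/w), in line with the figures quoted above.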
Comparison of Cross-Validated PLS Models Based on FTIR and NIR
In general, both methods are suitable for phenol quantification in dried phenol/insulin solids, as both FTIR and NIR spectroscopy yield similar chemical information about the sample. The spectra of the solid samples dried by freeze drying and spray drying are sufficiently similar to be incorporated in the same calibration model. Thus, when comparing the two methods, additional advantages and disadvantages need to be considered. FTIR spectroscopy has the advantage of a higher degree of chemical information about the sample, which includes information on the secondary structure of the protein. However, if the purpose of the measurements is quantification of the phenol/insulin ratio, this information is not necessary and FTIR spectroscopy should not be chosen on this basis. In addition, FTIR experiments require more elaborate sample preparation, and in the case of KBr pellets, the preparation is destructive. Sample preparation can be avoided using attenuated total reflectance FTIR spectroscopy, but this method is highly dependent on the homogeneity of the sample due to the short penetration depth of the light in the sample. On the other hand, NIR spectroscopy has the advantages of extremely fast measuring time combined with no sample preparation. Furthermore, it can be conducted online, which is important for the PAT initiative. While the structural information obtained is not as good as with FTIR spectroscopy, studies have correlated changes in the NIR spectra with changes in the amide I band in the FTIR region (38). In addition, NIR spectroscopy can yield information about particle size and water content of the powder (41