Chromatographic Fingerprints of Twenty Salvia Species 517
Before chemometric analysis, chromatographic signals must usually be aligned [24]. Chromatographic techniques, especially high-performance liquid chromatography (HPLC), are vulnerable to several sources of instrumental variation, usually caused by column ageing, changes in mobile phase composition, difficulties in reproducing the same gradient conditions and the same temperature in the course of analysis, etc. Such difficulties are often encountered when a large-scale experiment is launched and samples are analysed over a long period of time. The result can be peak shifts along the time axis, so that when pairs of chromatograms are compared, peaks of the same component do not match. To overcome the peak-shift problem, many alignment techniques have been proposed [16, 25-30]. Their objective is to match corresponding peaks across a set of chromatograms. Alignment of instrumental signals is not a trivial task, because the correspondence of peaks is unknown. Otherwise, one could relatively easily match the peaks between the corresponding points in the pairs of signals (a target and a signal being aligned) by means of linear interpolation.
As shown in the literature, correlation optimized warping (COW) [25] seems to be a frequent method of choice in a variety of alignment applications. In practice, COW has a very appealing feature: no assumptions about the signals are required. Using the COW method, peaks in a signal, P, are matched with their counterparts in the target signal, T. To accomplish the alignment, piecewise linear stretching and compression of signal P is performed so that the correlation coefficient between signals P and T is maximized. In the COW approach, two input variables, the section length, N, and the so-called slack variable, denoted t, control the quality of the alignment. Initially, chromatograms P and T are divided into N sections, each with its start and end points.
Sections in signal P are then warped by changing the positions of the end points. When t = 1, each end point of a section can have three possible locations: [-1, 0, 1]. This notation means, respectively, that a given section is shortened by one point, that the section length remains unchanged, and that the section is made one point longer. A detailed presentation of the COW algorithm can be found elsewhere [25].
The objective of principal component analysis is to represent the structure of data using a set of a few new variables called principal components, PCs[31]. They are linear combinations of the original data variables that maximize description of data variance. Principal components are orthogo-