The Statistical Algorithms in Microarray Suite 5.0 process raw GeneChip probe set data to generate expression values (Signals and LogRatios), Detection and Change calls, and associated p-values for every transcript represented on the array.
Non-parametric rank tests are used to generate transcript Detection and Change p-values. Qualitative calls, such as Present and Increase, are derived from these p-values (see references 2 and 4). The Detection and Change algorithms incorporate user-tunable parameters that influence the analysis stringency of p-value calculation and call generation. This technical note introduces the tunable parameters of the Statistical Algorithm, and provides data examples that help explain how analysis results are affected when parameters are modified. For an in-depth discussion on how the Statistical Algorithms and their parameters work, please see reference 1.
There are two tiers of tunable parameters in the Statistical Algorithms. The first tier is represented by p-value cutoffs for both the Detection and Change calls. These parameters (alphas and gammas) do not alter the calculation of the p-values, only the generation of calls. The second tier of tunable parameters is represented by tau and perturbation for the Detection and Change p-value calculations, respectively. These parameters affect the calculation of p-values directly, and should be used only when fine-tuning of the underlying equations is required. User-tunable parameters do not influence Signal or LogRatio calculations.
Affymetrix recommends the default values, set in the library files for a given array type, in most cases for accurate gene expression analysis. These values were established using a large number of spiked samples, hybridized to human, yeast, and E. coli probe arrays (see references 1 and 2). Altering parameter values is only necessary if the stringency of the analysis requires a change from the established defaults.
Since the interpretation of p-values can differ depending on the desired analytic stringency, one may choose to modify an alpha or a gamma cutoff. For example, if you would like to make Present calls associated with more stringent Detection p-values, you can set alpha1 to a lower level, say 0.025. Doing so will decrease the number of Present calls by converting those with p-values in the range 0.025-0.04 to Marginal calls. There will be a sensitivity vs. specificity tradeoff, since some of the calls that became Marginal may be falsely detected genes, but some may also be truly expressed genes. If you do modify the alpha or gamma cutoffs, it is advisable to keep the p-values alongside the calls, since the p-values are not affected by an alpha or gamma parameter change.
References
1) Liu, W., Mei, R., Di, X., Ryder, TB., Hubbell, E., Dee, S., Webster, T., Ho, M., Baid, J., and Smeekens, SP. Analysis of high density expression microarrays with signed-rank call algorithms, manuscript, 2001.
2) Affymetrix Technical Note 1, New Statistical Algorithms for Monitoring Gene Expression on GeneChip Probe Arrays. Part Number 701097.
3) Jin, H., Yang, R., Awad, TA., Wang, F., Li, W., Williams, SP., Ogasawara, A., Shimada, B., Williams, PM., de Feo, G., and Paoni, NF. Effects of Early Angiotensin-Converting Enzyme Inhibition on Cardiac Gene Expression After Acute Myocardial Infarction. Circulation, 103:736, 2001.
4) Statistical Algorithms Reference Guide. Part Number 701110.
Why do the Statistical Algorithms have tunable parameters?
The equations underlying the GeneChip Statistical Algorithms incorporate statistically based stringency parameters that control the balance between true detection and false signal in a predictable manner. When used with calibration data sets (e.g., the Latin Square spiked samples described in Technical Note 1, reference 2), these parameters can be used to construct Receiver Operating Characteristic (ROC) plots (Figure 1). These plots are useful for setting optimal values for each of the parameters to achieve the best balance between true detection and false detection. The flexibility provided by the tunable parameters made it possible for Affymetrix to fine-tune the algorithms for current array designs and assay conditions, and will allow for further refinements in the future as expression arrays evolve.
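As a rough illustration of how an ROC plot is built from calibration data, the sketch below sweeps a p-value cutoff over two groups of Detection p-values: one from transcripts known to be spiked in, one from transcripts known to be absent. The function name and the p-values are made up for illustration; they are not taken from the Latin Square data.

```python
def roc_points(p_present, p_absent, cutoffs):
    """Return (cutoff, false_positive_rate, true_positive_rate) per cutoff.

    p_present: p-values of transcripts known to be in the sample
    p_absent:  p-values of transcripts known to be missing from the sample
    """
    points = []
    for cutoff in cutoffs:
        # A transcript counts as "detected" when its p-value is below the cutoff
        tpr = sum(p < cutoff for p in p_present) / len(p_present)
        fpr = sum(p < cutoff for p in p_absent) / len(p_absent)
        points.append((cutoff, fpr, tpr))
    return points


# Hypothetical p-values, for illustration only
spiked = [0.001, 0.004, 0.02, 0.05]
not_spiked = [0.03, 0.2, 0.6, 0.9]

for cutoff, fpr, tpr in roc_points(spiked, not_spiked, [0.01, 0.04, 0.1]):
    print(f"cutoff={cutoff}: false positive rate={fpr:.2f}, true positive rate={tpr:.2f}")
```

Each cutoff gives one point on the curve; a looser cutoff moves the point toward higher sensitivity and a higher false positive rate.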
.CEL files from a rat cardiovascular study (reference 3) were analyzed individually and in comparisons using the Expression Analysis Statistical Algorithms (Affymetrix Microarray Suite 5.0). Each of four tunable parameters, alpha1, tau, gamma1 (L and H together), and the perturbation factor, was altered individually, above and below the default values. Other tunable parameters, such as alpha2, were kept at defaults. Data from an entire array, and from individual probe sets, were extracted and displayed to observe the effects of modifying each of the tunable parameters on relevant analysis metrics, such as p-values.
Note: The analyses shown in this technical note are intended to serve as examples to help communicate how the parameters of the Statistical Algorithms work. The examples do not provide sufficient information to serve as guidelines on how to modify parameters for a particular analysis. For most users, Affymetrix recommends using the default values of the tunable parameters as provided in MAS 5.0.
I. Tunable Parameters Affecting Detection
I.A. Calculation and interpretation of Detection p-values
To determine if transcripts are detected specifically, probe pair discrimination scores (R) for every probe set are compared to a threshold value (tau) using a Wilcoxon signed-rank test (see references 1 and 2). This non-parametric test establishes whether the probe set's overall discrimination is higher than the threshold, and generates a p-value as a measure of confidence in the specific detection of the transcript. The tau threshold influences the actual calculation of p-values, and is discussed further in section I.C. Once p-values are calculated, they can be used to rank probe sets by detection confidence, and to generate qualitative Detection calls. Cutoffs, operating as tunable parameters, allow the user to modify how the p-values are used to make Detection calls. These are described next.
Figure 2. Modifying alpha1 changes the percentage of probe sets called Present in rat heart RNA (hybridized to the Rat Genome U34A array). The percentage of Present calls is plotted as a function of the user-tunable parameter alpha1. This array's default value for alpha1 (0.04) is indicated with the red line. Changing alpha1 from the default (arrows) alters the analysis stringency and the balance of sensitivity to specificity.
I.B. Detection Call cutoff values (alphas)
The p-values output by the Detection analysis give rise to three possible Detection Calls: Present (P), Marginal (M), and Absent (A). The calls are generated based on limits of confidence set for detection. The cutoffs alpha1 and alpha2 determine the boundaries between the three possible calls. As alpha1 and alpha2 are increased or decreased, the range of p-values that is interpreted as a P, M or A call will change. A Present call is output for a probe set when the Detection p-value is below the parameter alpha1. An Absent call is output if the p-value is greater than alpha2. The call is Marginal if the p-value lies between the two thresholds. Optimal default values were established (for 16-20 probe pair designs) at alpha1 = 0.04 and alpha2 = 0.06.
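The call rule described above can be sketched in a few lines. This is a minimal illustration, not the MAS 5.0 source: the function name is hypothetical, and the treatment of a p-value exactly equal to a cutoff (assigned to Marginal here) is an assumption, since the text only specifies "below alpha1" and "greater than alpha2".

```python
def detection_call(p_value, alpha1=0.04, alpha2=0.06):
    """Map a Detection p-value to a Present/Marginal/Absent call.

    Defaults are the values quoted for 16-20 probe pair designs.
    """
    if not 0.0 <= alpha1 <= alpha2 <= 1.0:
        raise ValueError("expected 0 <= alpha1 <= alpha2 <= 1")
    if p_value < alpha1:
        return "P"   # Present: p-value below alpha1
    if p_value > alpha2:
        return "A"   # Absent: p-value above alpha2
    return "M"       # Marginal: p-value between the two cutoffs (assumed inclusive)


# The same p-value can receive a different call when alpha1 is lowered;
# the p-value itself is unchanged.
print(detection_call(0.03))                 # default cutoffs
print(detection_call(0.03, alpha1=0.025))   # more stringent alpha1
```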
Increasing alpha1 to levels above the default relaxes the stringency by allowing genes with higher Detection p-values to be called Present. This effect is illustrated in Figure 2, which shows how the percentage of probe sets scoring Present calls changes across the entire array as alpha1 is increased. A high alpha setting permits more transcripts to score Present; such a setting provides a higher level of sensitivity, but comes at the cost of a greater level of false positives. Conversely, decreasing alpha1 below the default causes a decline in the percentage of Present calls made, leading to fewer false positives at the cost of losing some truly expressed genes.
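The array-wide effect can be illustrated by recounting Present calls over a fixed set of p-values while alpha1 is varied. The p-values below are invented for illustration and do not come from the rat heart data; the function name is likewise hypothetical.

```python
def percent_present(p_values, alpha1):
    """Percentage of probe sets whose Detection p-value is below alpha1."""
    return 100.0 * sum(p < alpha1 for p in p_values) / len(p_values)


# Hypothetical Detection p-values for eight probe sets
p_values = [0.001, 0.008, 0.02, 0.03, 0.05, 0.2, 0.5, 0.9]

for alpha1 in (0.01, 0.04, 0.1):
    print(f"alpha1={alpha1}: {percent_present(p_values, alpha1):.1f}% Present")
```

Only the cutoff moves between iterations; the p-values are computed once and reused.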
The tradeoff between detection sensitivity and specificity is shown in more detail in Figure 3. The scatter plot shows the relationship of the alpha cutoffs to probe set data displayed by Signal (y-axis) and Detection p-value (x-axis). Optimal placement of the alpha cutoffs provides a balance between true detection and false positives. Setting alpha cutoffs to higher or lower p-values alters the call stringency. An example is shown with the dashed red line, representing alpha1 set to 0.01. This highly stringent setting decreases false detection to near zero levels, at the cost of losing true positives (35% Present at default alpha1 declines to 26% at the altered setting).
I.C. Detection p-value: tau Discrimination threshold
As mentioned in Section I.A. (and in further detail in references 1 and 2), tau serves as the discrimination threshold against which individual probe pairs are tested for specific detection. This parameter is determined experimentally by measuring a large number of probe pair Discrimination Scores for transcripts that are known to be absent from a test sample. Unlike the alpha parameters, tau directly affects the calculation of Detection p-values. If tau is set lower than the default value (0.015 for 16-20 probe pair designs), the stringency of detection is relaxed, resulting in a shift in the distribution of p-values towards 0. If it is set higher than the default, the analysis becomes more stringent; a higher level of discrimination is required for specific detection, and the p-values are shifted towards 1.0.
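To make the role of tau concrete, the sketch below computes a one-sided exact Wilcoxon signed-rank p-value for a probe set's discrimination scores against tau. Several things here are assumptions rather than statements from this note: the discrimination score is taken to be R = (PM - MM) / (PM + MM), as defined in the references; the function names and intensities are hypothetical; and the production algorithm may differ in details such as tie handling, so see references 1 and 2 for the actual method.

```python
from collections import Counter


def signed_rank_pvalue(diffs):
    """Exact one-sided p-value that the median of diffs is greater than 0."""
    d = [x for x in diffs if x != 0]           # zero differences carry no sign
    n = len(d)
    if n == 0:
        return 1.0
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks2 = [0] * n                           # ranks doubled so tied ranks stay integers
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        for k in range(i, j + 1):              # average rank of the tie group, doubled
            ranks2[order[k]] = i + j + 2
        i = j + 1
    w_obs = sum(r for r, x in zip(ranks2, d) if x > 0)
    # Null distribution: every pattern of signs is equally likely
    dist = Counter({0: 1})
    for r in ranks2:
        nxt = Counter()
        for s, c in dist.items():
            nxt[s] += c                        # this rank gets a negative sign
            nxt[s + r] += c                    # this rank gets a positive sign
        dist = nxt
    return sum(c for s, c in dist.items() if s >= w_obs) / 2 ** n


def detection_pvalue(pm, mm, tau=0.015):
    """Detection p-value for one probe set (assumes pm + mm > 0 for every pair)."""
    scores = [(p - m) / (p + m) for p, m in zip(pm, mm)]   # assumed definition of R
    return signed_rank_pvalue([r - tau for r in scores])


# Hypothetical intensities for a four-pair probe set
pm = [200, 300, 250, 400]
mm = [100, 100, 100, 100]
print(detection_pvalue(pm, mm))             # default tau
print(detection_pvalue(pm, mm, tau=0.45))   # much higher tau: p-value moves toward 1
```

Raising tau lowers every difference R - tau, so fewer and lower-ranked probe pairs count in favor of detection and the p-value shifts toward 1.0.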
Figure 4 shows how tau affects the percentages of Detection calls across an entire array. Moving tau above the default increases the stringency of detection, resulting in fewer genes detected as Present and Marginal, and additional genes as Absent. Conversely, moving tau below the default decreases the stringency of detection, allowing more genes to be called Present or Marginal, and fewer genes as Absent.
The effect on Detection p-values for individual probe sets can vary. Figure 5 shows Detection p-values for two transcripts expressed in a normal rat heart sample, as well as two exogenous spikes. A given transcript's Detection p-value responds in an individual way to changes in tau, since the p-value calculation is based on how individual probe pairs in a probe set rank against tau. The curve for bioB exemplifies a low-level transcript; as tau is increased, its p-value crosses the alpha cutoffs, resulting in an Absent call. A rat transcript (gene1) also becomes undetected at the highest setting of tau (0.15). The two other transcripts (bioC and rat gene2) are more resilient to the increased stringency, and maintain low p-values and Present calls as tau is increased. Modifying tau directly influences the balance of sensitivity vs. specificity of the Detection analysis.
II. Tunable Parameters Affecting Change Analysis
II.A. Calculation of Change p-values
The calculation of the p-value is influenced by one tunable parameter, termed Perturbation, which operates on the normalization factor applied to the arrays during the comparison. Cutoffs, termed gamma1 and gamma2, are applied to the two ends of the p-value range to delineate the qualitative Change calls.
As noted in the introduction, the gammas do not alter the calculation of p-values. Instead, gammas serve as thresholds to divide the data into zones associated with Change calls such as Increase. Changing gamma values can be visualized as sliding the cutoffs (red lines on plot) to the left or right, resulting in a change in the number of transcripts scoring a given call. An ideal placement of the gamma cutoffs would allow the maximum number of true changes to be detected, while minimizing false changes. This is shown with the line illustration above the graph; the p-value distribution of truly changing genes overlaps to an extent with the p-value distribution of non-changing genes. The region of overlap is primarily composed of transcripts that score a Marginal Change call. Moving gamma1 (default = 0.0025 for the Rat Genome array) to a lower p-value level (dashed red line, 0.001) results in higher stringency, and fewer genes are reported as Increased (2.4% compared to 2.8% at the default gamma1 value). This higher stringency setting decreases false changes at the cost of losing some real changes.
The tradeoff of detecting real changes while minimizing false changes is further illustrated in Figure 8. Genes increasing in the same RNA comparison (brain) represent false changes, whereas most of the increasing genes in the brain vs. heart comparison represent real biological differences between the two tissues. More changes (~63 additional genes changing greater than 2 fold) can be detected in the different tissues comparison if gamma1 is moved to a less stringent setting (dashed red line), but some of these changes will be false. The number of additional false changes is estimated by the same RNA comparison (~5 genes over 2 fold). Altering gamma has a similar effect on decreasing genes (data not shown).
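The way gamma1 and gamma2 partition the Change p-value range can be sketched as follows. The sketch assumes that the cutoffs are mirrored at the upper end of the range (1 - gamma1 and 1 - gamma2) to produce Decrease calls, which this note only implies by saying the cutoffs are applied to the two ends of the range. The function name, the handling of exact ties with a cutoff, and the gamma2 value used in the example are placeholders, not documented defaults.

```python
def change_call(p_value, gamma1, gamma2):
    """Map a Change p-value to a call, assuming symmetric cutoffs at both ends."""
    if not 0.0 < gamma1 <= gamma2 < 0.5:
        raise ValueError("expected 0 < gamma1 <= gamma2 < 0.5")
    if p_value < gamma1:
        return "I"    # Increase
    if p_value < gamma2:
        return "MI"   # Marginal Increase
    if p_value > 1.0 - gamma1:
        return "D"    # Decrease (assumed mirror of the Increase cutoff)
    if p_value > 1.0 - gamma2:
        return "MD"   # Marginal Decrease (assumed mirror)
    return "NC"       # No Change


# gamma1 = 0.0025 is the Rat Genome default quoted above; gamma2 = 0.003 is a placeholder
for p in (0.001, 0.0027, 0.5, 0.9973, 0.999):
    print(p, change_call(p, gamma1=0.0025, gamma2=0.003))
```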
II.C. Further fine-tuning of Change calls with gamma H and L settings
Accurate detection of change can, in some cases, depend on the Signal range of the changing transcript. Changes in highly expressed transcripts may register different p-values compared to changes in low expressed transcripts, even if the changes are of comparable Signal LogRatios. MAS 5.0 contains a finer level of tuning for change calling, as shown in Figure 9. Each of the gamma cutoffs can be assigned a range of values (bounded by Low and High limits), from which probe set-specific final gamma cutoffs are calculated. Thus, gamma1 is derived from the gamma1L - gamma1H range, and gamma2 is derived from the gamma2L - gamma2H range. The final gamma cutoffs for a given probe set are calculated using a linear interpolation between the L and H limits, based on the position of the probe set's Signal within the entire array's Signal range. The Signals used are those from the Experimental array in the comparison. Thus, as shown in Figure 9, if a probe set's Signal is at 30% of the array's Signal range, the final gamma cutoffs used to decide the Change call for that probe set are set at 30% of the range of values given by the L-H limits for each. The default setting for the L and H limits is to make them equal (i.e., gamma1L = gamma1H and gamma2L = gamma2H). Therefore, the analysis does not use this fine-tuning unless the user changes the L and H values individually. In the majority of cases, it is not necessary to set the L and H gamma limits differently. This fine-tuning is only necessary if Change call accuracy is not optimal at one or both extremes of the Signal range of the array.
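The interpolation can be written out as a short function. This is a sketch under stated assumptions: the L limit is taken to apply at the low end of the Signal range and the H limit at the high end, the position is computed on a linear Signal scale, and positions outside the range are clamped. The note does not specify these details, and the function and argument names are hypothetical.

```python
def interpolated_gamma(signal, signal_min, signal_max, gamma_low, gamma_high):
    """Linearly interpolate a probe set-specific gamma cutoff.

    signal_min and signal_max bound the Experimental array's Signal range;
    gamma_low and gamma_high are the L and H limits for one gamma cutoff.
    """
    if signal_max <= signal_min:
        raise ValueError("signal_max must be greater than signal_min")
    position = (signal - signal_min) / (signal_max - signal_min)
    position = min(1.0, max(0.0, position))   # clamp to the range (assumption)
    return gamma_low + position * (gamma_high - gamma_low)


# A probe set at 30% of the Signal range, with hypothetical L and H limits
print(interpolated_gamma(30.0, 0.0, 100.0, gamma_low=0.002, gamma_high=0.003))

# With equal L and H limits (the default), the Signal position has no effect
print(interpolated_gamma(30.0, 0.0, 100.0, gamma_low=0.0025, gamma_high=0.0025))
```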
II.D. Change, significance (Normalization Perturbation Factor)
Effective normalization between arrays being compared is essential to detect changes in gene transcription accurately. When comparing two arrays, the Statistical Algorithm uses multiple normalization factors derived from a fundamental normalization factor using the Normalization Perturbation parameter. The lowest value for perturbation is 1.00, indicating that no perturbation of normalization is carried out, while the highest allowed value must be below 1.5. The established default (1.1 for 16-20 probe pair designs) is based on the most accurate Change calls made from a calibrated data set. Increasing the perturbation factor increases analysis stringency by decreasing sensitivity to change, so fewer genes are called Increased or Decreased. Figure 10 shows this relationship between perturbation and change detection across an entire array. As seen with previous parameters, there is a tradeoff between sensitivity and specificity. Increasing perturbation above the default value will reduce false changes, but will also decrease sensitivity to real change. Conversely, setting the perturbation factor lower than the default value increases sensitivity, causing more genes to be called Increased or Decreased, but some of these additional changes will be false.
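One plausible reading of "multiple normalization factors derived from a fundamental normalization factor" is sketched below. The note does not say how the additional factors are formed; the sketch assumes the fundamental factor is divided and multiplied by the perturbation value. The function name is hypothetical. Only the allowed range (1.00 up to, but not including, 1.5) and the default of 1.1 come from the text.

```python
def perturbed_normalization_factors(nf, perturbation=1.1):
    """Return (lower, fundamental, upper) normalization factors.

    Assumes the perturbed factors are nf / perturbation and nf * perturbation.
    """
    if not 1.0 <= perturbation < 1.5:
        raise ValueError("perturbation must be at least 1.0 and below 1.5")
    return (nf / perturbation, nf, nf * perturbation)


# With perturbation = 1.0 all three factors coincide (no perturbation)
print(perturbed_normalization_factors(2.0, perturbation=1.0))

# A larger perturbation widens the spread of factors a change must survive
print(perturbed_normalization_factors(2.0, perturbation=1.1))
```

Under this reading, a change is only called when it holds up across the whole spread of factors, which is why a larger perturbation makes the analysis more stringent.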
The effect of changing the perturbation factor on the Change p-values of individual probe sets can vary considerably, as shown in Figure 11. The p-values for probe sets 1 and 2 show little or no response to an increase in perturbation, whereas those for probe sets 3 and 4 move towards higher (less significant) p-values, eventually crossing over the gamma cutoffs and generating No Change calls. This occurs because probe sets vary in how their individual probe pairs perform in the rank-based test used to compare the baseline to the experimental probe arrays. Transcripts with more stable p-values (such as 1 and 2 in Figure 11) tend to have higher LogRatio values than those whose p-values are impacted by changing perturbation, indicating that, in general, greater changes tend to be more robust in terms of scoring a significant Change p-value.
The tunable parameters of the Statistical Algorithm directly influence the outcome of raw data analysis of GeneChip expression arrays. The parameters can be changed, but doing so is recommended only if fine-tuning of the analysis is required. If a parameter is to be modified, it is advisable to do so in small increments, and to keep track of the changes. Analysis parameters are reported in the .CHP files, and can be viewed in the experiment information tab. Be certain that analysis parameters are consistent across compared samples within an experiment.
Should You Change these Parameters?
In general, Affymetrix does not recommend changing the Statistical Algorithms parameters. This is especially true of tau and perturbation factor, since they are integral to p-value calculations. The default values set for these parameters in MAS 5.0 were established at Affymetrix to achieve the highest accuracy and sensitivity for the current GeneChip expression arrays. Future array designs and assay conditions may require changes in certain parameter values. Such changes will be provided by Affymetrix, and in general, will be preset in the Library files released with an array design.