Question 2: Are there proteins or spots that behave similarly to a given protein or spot?
To answer this question, we performed K-means clustering and used the gap statistic to choose the number of clusters. We found nine clusters, each containing 2–40 proteins, with very good accuracy (Fig 5).
These proteins are candidates for further analysis. Identifying those that are co-regulated might help to uncover the biological basis for this co-regulation.
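The clustering step can be illustrated with a minimal sketch of K-means combined with the gap statistic. It assumes a uniform reference distribution over the bounding box of the data and the usual "smallest k whose gap is within one standard error of the next" rule; the function names, the toy expression profiles, and all parameter values are hypothetical and are not taken from the software used in the study.

```python
import math
import random

def kmeans(points, k, rng, n_iter=100):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    centroids = rng.sample(points, k)  # k randomly chosen data points as starts
    labels = None
    for _ in range(n_iter):
        # Assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

def within_dispersion(points, k, rng, n_init=5):
    """Smallest within-cluster sum of squares over several random starts."""
    best = math.inf
    for _ in range(n_init):
        centroids, labels = kmeans(points, k, rng)
        w = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
        best = min(best, w)
    return best

def gap_statistic(points, k_max, n_refs=10, seed=0):
    """Return (chosen k, list of gap values for k = 1..k_max)."""
    rng = random.Random(seed)
    lows = [min(col) for col in zip(*points)]
    highs = [max(col) for col in zip(*points)]
    gaps, errs = [], []
    for k in range(1, k_max + 1):
        log_w = math.log(within_dispersion(points, k, rng))
        ref_logs = []
        for _ in range(n_refs):
            # Reference data: uniform noise inside the bounding box of the data.
            ref = [tuple(rng.uniform(a, b) for a, b in zip(lows, highs))
                   for _point in points]
            ref_logs.append(math.log(within_dispersion(ref, k, rng)))
        mean_ref = sum(ref_logs) / n_refs
        sd = math.sqrt(sum((v - mean_ref) ** 2 for v in ref_logs) / n_refs)
        gaps.append(mean_ref - log_w)
        errs.append(sd * math.sqrt(1 + 1 / n_refs))
    # Smallest k with Gap(k) >= Gap(k+1) - s(k+1).
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - errs[k]:
            return k, gaps
    return k_max, gaps

# Toy data (hypothetical): three groups of two-dimensional profiles.
demo_rng = random.Random(1)
centres = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
profiles = [(cx + demo_rng.gauss(0, 0.5), cy + demo_rng.gauss(0, 0.5))
            for cx, cy in centres for _ in range(20)]
best_k, gaps = gap_statistic(profiles, k_max=5)
print(best_k, [round(g, 2) for g in gaps])
```

In practice each point would be one protein's expression profile across the gels, and proteins sharing a cluster label would be the candidates for co-regulation.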
Question 3: Are there proteins that might be used for the development of noninvasive tests?
To answer this question, we performed a discriminant analysis using the Partial Least Square Search routine as the search method and the Regularized Discriminant Analysis (alpha+gamma 0.7) routine as the evaluation method, with 10-fold cross-validation. We found that a subset of 13 proteins discriminated between the known classes with 100% accuracy (Fig 6).
Any of these 13 proteins that are highly abundant would be good candidates for diagnostic tests.
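The general pattern of a search method paired with an evaluation method under 10-fold cross-validation can be sketched as follows. This is not the Partial Least Square Search or Regularized Discriminant Analysis routine: as stand-ins, the sketch ranks features by a simple mean-difference score and evaluates them with a nearest-centroid classifier. All function names and the toy data are hypothetical, and the code assumes exactly two classes.

```python
import math
import random
from statistics import mean, pstdev

def select_features(X, y, n_keep):
    """Keep the n_keep features whose class means differ most, relative to spread."""
    a, b = sorted(set(y))  # this sketch assumes exactly two classes
    scores = []
    for j in range(len(X[0])):
        col_a = [row[j] for row, lab in zip(X, y) if lab == a]
        col_b = [row[j] for row, lab in zip(X, y) if lab == b]
        spread = pstdev(col_a) + pstdev(col_b) + 1e-12  # avoid division by zero
        scores.append(abs(mean(col_a) - mean(col_b)) / spread)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:n_keep]

def fit_centroids(X, y, features):
    """Mean profile of each class, restricted to the selected features."""
    centroids = {}
    for lab in sorted(set(y)):
        rows = [[row[j] for j in features] for row, l in zip(X, y) if l == lab]
        centroids[lab] = [mean(col) for col in zip(*rows)]
    return centroids

def predict(centroids, row, features):
    """Assign the class whose centroid is nearest."""
    v = [row[j] for j in features]
    return min(centroids, key=lambda lab: math.dist(v, centroids[lab]))

def stratified_folds(y, k, rng):
    """Split sample indices into up to k folds, spreading each class evenly."""
    folds = [[] for _ in range(k)]
    for lab in sorted(set(y)):
        idx = [i for i, l in enumerate(y) if l == lab]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return [f for f in folds if f]

def cross_validated_accuracy(X, y, n_keep, k=10, seed=0):
    """Fraction of held-out samples classified correctly over all folds."""
    rng = random.Random(seed)
    correct = 0
    for test_idx in stratified_folds(y, k, rng):
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(y)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        # Selection and fitting use the training part of the fold only.
        features = select_features(train_X, train_y, n_keep)
        centroids = fit_centroids(train_X, train_y, features)
        correct += sum(predict(centroids, X[i], features) == y[i] for i in test_idx)
    return correct / len(y)

# Toy data (hypothetical): 2 informative features and 4 noise features.
data_rng = random.Random(42)
X, y = [], []
for label, shift in (("benign", 0.0), ("malignant", 10.0)):
    for _ in range(20):
        informative = [shift + data_rng.uniform(-1, 1) for _ in range(2)]
        noise = [data_rng.uniform(-1, 1) for _ in range(4)]
        X.append(informative + noise)
        y.append(label)
accuracy = cross_validated_accuracy(X, y, n_keep=2, k=10)
print(accuracy)
```

In this sketch the features are re-selected inside every fold, so the held-out samples never influence which features are kept. Whether the routines used in the study behave this way is not stated in the text.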
Question 4: Are there proteins or protein patterns that are characteristic of a biological state?
To answer this question, we built a classifier using the same method that we applied for feature selection. The confusion matrix from the cross-validation showed no errors for the known samples (Fig 7). We then applied this classifier to the “Unknown” data set.
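As a small illustration of how such results are tallied, the sketch below builds a confusion matrix from true and predicted labels and compares predicted class counts with expected counts. The labels and counts are invented for illustration and are not the study's data; the function names are hypothetical.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) label pairs; off-diagonal pairs are errors."""
    return Counter(zip(true_labels, predicted_labels))

def count_shift(expected_counts, predicted_labels):
    """Predicted minus expected number of samples per class."""
    predicted_counts = Counter(predicted_labels)
    classes = sorted(set(expected_counts) | set(predicted_counts))
    return {c: predicted_counts[c] - expected_counts.get(c, 0) for c in classes}

# Hypothetical known samples with an error-free cross-validation.
known_true = ["benign"] * 4 + ["malignant"] * 4
known_pred = list(known_true)
matrix = confusion_matrix(known_true, known_pred)
errors = sum(n for (t, p), n in matrix.items() if t != p)
print(errors)  # 0

# Hypothetical unknown samples: expected class sizes versus predictions.
expected = {"benign": 3, "malignant": 5}
unknown_pred = ["benign"] * 4 + ["malignant"] * 4
print(count_shift(expected, unknown_pred))  # {'benign': 1, 'malignant': -1}
```

A positive entry means the classifier assigned more samples to that class than expected, and a negative entry means fewer.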
The result was one more benign sample and one fewer malignant sample than expected. Looking at the classification results, we found that the only “