The patient samples were collected as part of the oral cancer screening program in India and were measured with the newly developed 4-space array, which combines environmentally sensitive lanthanide chelate chemistry and time-resolved luminescence (TRL) measurement with indicator dye molecules and modulating (buffer) chemistries. The rationale for combining the inherently environmentally sensitive lanthanide chemistry with indicator molecules is to tune each of the 4-space TRL signals so that it gives a unique response that depends on the interaction of the indicator molecules, the sample and the modulator chemistry with the luminescent lanthanide. Since the interactions are nonspecific, each of the 4-space signals reflects a broader property of the mouth rinse samples than the presence of any single marker molecule. When the individual 4-space measurements are sufficiently independent, we claim that their joint information can become specific to the question at hand. Based on the absolute TRL signals in figure 1 and on the behavior of these signals with respect to the two groups to be separated, we claim that the chemistry combinations used give sufficiently uncorrelated responses across the 4-space data to support good classification.
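One way to examine how correlated the four channel responses are is to compute their pairwise Pearson correlation coefficients. The following is a minimal sketch of that check, not the analysis performed in this study: the function names are hypothetical and the data are randomly generated placeholders standing in for the four TRL signals.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(channels):
    """Pairwise Pearson correlations between all channels."""
    return [[pearson(a, b) for b in channels] for a in channels]


# Synthetic placeholder data (NOT study data): 4 channels x 50 samples.
rng = random.Random(0)
channels = [[rng.gauss(0.0, 1.0) for _ in range(50)] for _ in range(4)]

corr = correlation_matrix(channels)
for row in corr:
    print(" ".join(f"{value:+.2f}" for value in row))
```

Off-diagonal values close to zero would indicate weakly correlated channels. Pearson correlation captures only linear dependence, so this is a rough indicator rather than a test of independence.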
The data classification methods used were K-Nearest Neighbor (KNN) and Support Vector Classification (SVC). The fine-tuning function of the data analysis program (Molegro Data Modeler 2.1) was used to find the parameters for optimal training. The results of the data analysis are shown in figure 1 and listed below. The t-test indicated that each of the selected chemistries alone showed a significant difference (P<0.05) between the healthy and lesion patient groups. The two tested algorithms performed equally well, yielding sensitivity and specificity near the 90% mark. Using an artificially and randomly generated training set derived from the patient ensemble averages ensured that we were able to assess all our patient data and that the algorithms could not be over-trained to recognize false linkages between the samples.
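To illustrate the KNN step, the following is a minimal pure-Python sketch of nearest-neighbor classification on four-dimensional data. It is not the Molegro Data Modeler implementation used in this work. The function name, the tie-breaking rule, the group centers and the train/test split are all assumptions made for illustration, and the data are synthetic. SVC is omitted because it does not reduce to a comparably short sketch.

```python
import math
import random


def knn_predict(train_X, train_y, query, k=2):
    """Predict a label by majority vote among the k nearest training points.

    Ties (possible for even k) are resolved in favor of the closest neighbor;
    this tie rule is an assumption, not taken from the study.
    """
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    nearest = [train_y[i] for i in order[:k]]
    votes = {}
    for label in nearest:
        votes[label] = votes.get(label, 0) + 1
    best = max(votes.values())
    # `nearest` is ordered by distance, so the first top-voted label wins ties.
    return next(label for label in nearest if votes[label] == best)


# Synthetic placeholder data (NOT study data): two groups in a 4-signal space.
rng = random.Random(1)


def make_group(center, n, label):
    return [([rng.gauss(c, 1.0) for c in center], label) for _ in range(n)]


data = make_group([0, 0, 0, 0], 40, "healthy") + make_group([2, 2, 2, 2], 40, "lesion")
rng.shuffle(data)
train, held_out = data[:60], data[60:]
train_X = [x for x, _ in train]
train_y = [y for _, y in train]

predictions = [knn_predict(train_X, train_y, x, k=2) for x, _ in held_out]
accuracy = sum(p == y for p, (_, y) in zip(predictions, held_out)) / len(held_out)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

With an even k such as 2, a one-to-one vote is common, so the tie rule effectively decides many predictions; an implementation with a different rule may give different results.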
The results from the analysis for detection of lesions are as follows:
- KNN (k=2), all data: sensitivity/specificity 89%/88%
- SVC, all data: sensitivity/specificity 90%/88%
- KNN, averaged replicates: sensitivity/specificity 88%/92%
- SVC, averaged replicates: sensitivity/specificity 90%/92%
As can be seen, the prediction sensitivity and specificity approach the 90% mark regardless of the classification method. Whether the replicates were averaged or treated as individual measurements had no significant effect on the results, suggesting that the measurements were sufficiently repeatable for the algorithms to function well. This is also supported by the variance shown in figure 1.
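For reference, sensitivity is the fraction of lesion samples classified as lesion, TP / (TP + FN), and specificity is the fraction of healthy samples classified as healthy, TN / (TN + FP). The sketch below shows how these figures and the replicate averaging could be computed. The function names, labels and numbers are hypothetical illustrations and do not reproduce the study data.

```python
def sensitivity_specificity(y_true, y_pred, positive="lesion"):
    """Return (sensitivity, specificity) for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def average_replicates(measurements):
    """Average replicate feature vectors per sample.

    `measurements` is a list of (sample_id, feature_vector) pairs; the result
    maps each sample_id to the element-wise mean of its replicates.
    """
    groups = {}
    for sample_id, vector in measurements:
        groups.setdefault(sample_id, []).append(vector)
    return {
        sample_id: [sum(column) / len(vectors) for column in zip(*vectors)]
        for sample_id, vectors in groups.items()
    }


# Hypothetical labels: 10 lesion and 10 healthy samples.
y_true = ["lesion"] * 10 + ["healthy"] * 10
y_pred = ["lesion"] * 9 + ["healthy"] * 1 + ["healthy"] * 8 + ["lesion"] * 2
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")

# Hypothetical replicates of the four signals for two samples.
replicates = [
    ("p1", [1.0, 2.0, 3.0, 4.0]),
    ("p1", [3.0, 4.0, 5.0, 6.0]),
    ("p2", [0.0, 0.0, 0.0, 0.0]),
]
print(average_replicates(replicates))
```

Averaging replicates before classification reduces measurement noise per sample, whereas treating them as individual measurements increases the number of data points available.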
Overall, the results indicate that nonspecific measurements, combined with multivariate analysis tools, can be used to differentiate saliva samples. In the collection of data, the prerequisite (inclusion criterion) for measurement was that the person was a tobacco product user. By concentrating on the population most likely at risk, we could train our analysis to detect differences between tobacco users and did not encounter the well-known problem of tobacco product interference in salivary diagnostics in the training phase of the system. Further, in a learning-based self-calibrating system, standards and controls are only necessary to verify the instrument; the performance of the algorithms and the method relies on the information from a trained clinician at the teaching phase. In fact, considering how the method functions, the use of, for example, pooled or artificial samples does not contribute to testing the method, since we are profiling the sample rather than picking out certain features; contrary to our usual view, artificial samples are of unclear value in this respect.