From: Clustering column-mean quantile median: a new methodology for imputing missing data
Points of comparison | Rosa Aghdam et al. [8] | Huihui Li et al. [3] | Current study |
---|---|---|---|
Data source | Lung and rectal cancer datasets with 10%, 20%, and 30% missing rates | Cell cycle-regulated genes of the yeast Saccharomyces cerevisiae; nine missing ratios from 1% to 40% and complete ratios from 5% to 25% | Rectal cancer dataset with 10%, 20%, and 30% missing rates |
Purpose | Detect the most significant genes and cancer pathway enrichments | Improve the hybrid recursive mutual strategy framework based on BPCA and LLS | Construct a system that improves the imputation process and removes noise from the data |
Methodology | LLS, KNN, SVD, BPCA, Gene-mean, Gene-median, Col-Mean, Col-Median, and Fast-imp. | BPCA, LLS, ItrLLS, and RMI | LLS, KNN, SVD, BPCA, Gene-mean, Gene-median, Col-Mean, Col-Median, and CCMQM |
Contributions | All significant genes and pathways are detected in the imputed data, but no differences in NRMSE are observed between IMs | The RMI hybrid system imputes MV effectively, and NRMSE increases as the missing ratio increases | The modified CCMQM system improves imputation in some evaluation tests, with notable results for the Gini coefficient, Euclidean distance, NRMSE, Fisher discriminant, SNR, and test duration |
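
The table lists Col-Mean and Col-Median among the compared methods and NRMSE among the evaluation measures. The following is a minimal sketch of how such a baseline could be scored at the 10%, 20%, and 30% missing rates. The helper names (`mask_entries`, `impute_by_column`, `nrmse`), the toy Gaussian matrix, and the NRMSE normalization (RMSE over the masked entries divided by the standard deviation of their true values) are illustrative assumptions and are not taken from any of the compared studies. It does not implement CCMQM.

```python
import math
import random
import statistics


def mask_entries(matrix, rate, seed=0):
    """Copy `matrix`, set a fraction `rate` of its cells to None, and
    return the masked copy together with the hidden (row, col) positions."""
    rng = random.Random(seed)
    cells = [(i, j) for i in range(len(matrix)) for j in range(len(matrix[0]))]
    hidden = rng.sample(cells, round(rate * len(cells)))
    masked = [row[:] for row in matrix]
    for i, j in hidden:
        masked[i][j] = None
    return masked, hidden


def impute_by_column(masked, stat=statistics.mean):
    """Fill each None with `stat` of the observed values in the same column."""
    filled = [row[:] for row in masked]
    for j in range(len(masked[0])):
        observed = [row[j] for row in masked if row[j] is not None]
        # Assumption: fall back to 0.0 if a column has no observed values.
        fill = stat(observed) if observed else 0.0
        for row in filled:
            if row[j] is None:
                row[j] = fill
    return filled


def nrmse(truth, filled, hidden):
    """RMSE over the hidden cells, normalized by the standard deviation
    of the true hidden values (one common convention, assumed here)."""
    true_vals = [truth[i][j] for i, j in hidden]
    sq_errors = [(filled[i][j] - truth[i][j]) ** 2 for i, j in hidden]
    return math.sqrt(statistics.fmean(sq_errors) / statistics.pvariance(true_vals))


# Toy complete matrix (20 samples x 5 features); hypothetical data only.
rng = random.Random(42)
complete = [[rng.gauss(10.0 * (j + 1), 2.0) for j in range(5)] for _ in range(20)]

for rate in (0.10, 0.20, 0.30):
    masked, hidden = mask_entries(complete, rate, seed=1)
    for name, stat in (("Col-Mean", statistics.mean), ("Col-Median", statistics.median)):
        score = nrmse(complete, impute_by_column(masked, stat), hidden)
        print(f"missing rate {rate:.0%}  {name:<10}  NRMSE = {score:.3f}")
```

Masking a complete matrix and scoring only the hidden cells keeps the ground truth available, which is what lets different imputation methods be compared on the same footing.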