Clustering column-mean quantile median: a new methodology for imputing missing data

Journal of Engineering and Applied Science

Table 3 A comparison of results with previous studies

Points of comparison	Rosa Aghdam et al. [8]	Huihui Li et al. [3]	This current study
Data source	Lung and rectal cancer datasets with 10%, 20%, and 30% missing rates	Cell cycle-regulated genes of the yeast Saccharomyces cerevisiae; 9 missing ratios from 1 to 40% and complete ratio from 5 to 25%	Rectal cancer dataset with 10%, 20%, and 30% missing rates
Purpose	Detect the most significant genes and cancer pathway enrichments	Improve the hybrid recursive mutual strategy framework based on BPCA and LLS	Construct a system that enhances the imputation process and eliminates data noise
Methodology	LLS, KNN, SVD, BPCA, Gene-mean, Gene-median, Col-Mean, Col-Median, and Fast-imp.	BPCA, LLS, ItrLLS, and RMI	LLS, KNN, SVD, BPCA, Gene-mean, Gene-median, Col-Mean, Col-Median, and CCMQM
Contributions	All the significant genes and pathways are detected in the imputed data, but no differences between IMs are observed in terms of NRMSE	The RMI hybrid system imputes MV effectively, and NRMSE increases as the missing ratio increases	The modified CCMQM system improves imputation on several evaluation measures, with notable results for the Gini coefficient, Euclidean distance, NRMSE, Fisher discriminant, SNR, and test duration
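
To make the NRMSE comparison in Table 3 concrete, the following is a minimal sketch of how a column-wise baseline (Col-Mean or Col-Median) might be scored at a 10% missing rate. It is not the CCMQM procedure, and it is not taken from any of the compared studies. The function names `column_impute` and `nrmse`, the synthetic data, and the NRMSE convention (RMSE over the masked entries divided by the standard deviation of the true values at those entries) are assumptions made for illustration; other normalisations are in use.

```python
import numpy as np

def column_impute(data, statistic="mean"):
    """Fill NaNs in each column with that column's observed mean or median."""
    reducer = np.nanmean if statistic == "mean" else np.nanmedian
    fill = reducer(data, axis=0)            # one fill value per column
    filled = data.copy()
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = fill[cols]
    return filled

def nrmse(truth, imputed, mask):
    """RMSE over masked entries, normalised by the std of the true values there.

    This normalisation is one common convention and is assumed here.
    """
    diff = imputed[mask] - truth[mask]
    return np.sqrt(np.mean(diff ** 2)) / np.std(truth[mask])

# Synthetic stand-in for an expression matrix (rows: samples, columns: genes).
rng = np.random.default_rng(0)
truth = rng.normal(size=(50, 8))
mask = rng.random(truth.shape) < 0.10       # hide roughly 10% of the entries
observed = truth.copy()
observed[mask] = np.nan

scores = {}
for stat in ("mean", "median"):
    filled = column_impute(observed, stat)
    scores[stat] = nrmse(truth, filled, mask)
    print(stat, scores[stat])
```

Repeating the masking step at 20% and 30% gives the other missing rates listed in the table. A lower NRMSE indicates imputed values closer to the hidden originals.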