MapReduce Integrated Multi-algorithm for HPC Running State Analysis

Abstract: High-performance computer clusters are the major seismic processing platforms in the oil industry, and they fail frequently. In this study, the K-means and Naive Bayes algorithms were implemented as MapReduce programs and run on Hadoop. The accumulated running-status data of a high-performance computer cluster were first clustered by K-means, and the clustering results were then used to train the Naive Bayes classifier. Finally, the test data were classified against the resulting knowledge base to discriminate equipment failures. Experiments indicate that K-means produced good clustering results, that the Naive Bayes algorithm achieved a high discrimination rate, and that combining multiple algorithms in MapReduce provided an intelligent prediction mechanism.
Keywords: high-performance clusters (HPC), Hadoop, MapReduce, K-means, Naive Bayes
Authors: ShuRen Liu, ChaoMin Feng, HongWu Luo, Ling Wen
Journal Code: jptkomputergg160159
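
The two-stage pipeline described in the abstract (K-means clustering, then Naive Bayes training on the resulting cluster labels) can be illustrated with a minimal sketch. This is not the authors' implementation: it simulates the map, shuffle, and reduce phases in a single Python process rather than on Hadoop, it assumes a Gaussian Naive Bayes variant, and the feature names (CPU load, temperature), the sample values, and all function names are hypothetical.

```python
import math
from collections import defaultdict

def kmeans_map(point, centroids):
    """Map step: emit (index of nearest centroid, point)."""
    dists = [math.dist(point, c) for c in centroids]
    return dists.index(min(dists)), point

def kmeans_reduce(points):
    """Reduce step: average all points assigned to one centroid."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans(points, centroids, iterations=10):
    """Run map/shuffle/reduce rounds in-process; return (centroids, labels)."""
    for _ in range(iterations):
        groups = defaultdict(list)
        for p in points:
            key, value = kmeans_map(p, centroids)
            groups[key].append(value)  # shuffle: group values by key
        # Keep the old centroid if no point was assigned to it.
        centroids = [kmeans_reduce(groups[i]) if groups[i] else centroids[i]
                     for i in range(len(centroids))]
    labels = [kmeans_map(p, centroids)[0] for p in points]
    return centroids, labels

def train_naive_bayes(points, labels):
    """Fit a class prior and per-feature Gaussian parameters for each label."""
    by_class = defaultdict(list)
    for p, y in zip(points, labels):
        by_class[y].append(p)
    model = {}
    for y, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small variance floor avoids division by zero for constant features.
        variances = [sum((x - m) ** 2 for x in col) / n + 1e-6
                     for col, m in zip(zip(*rows), means)]
        model[y] = (n / len(points), means, variances)
    return model

def predict(model, point):
    """Return the label with the highest log-posterior for one sample."""
    def log_posterior(y):
        prior, means, variances = model[y]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
            for x, m, v in zip(point, means, variances))
        return math.log(prior) + log_likelihood
    return max(model, key=log_posterior)

# Hypothetical running-state samples: (CPU load, temperature in deg C).
samples = [(0.20, 40.0), (0.25, 42.0), (0.30, 41.0),
           (0.90, 80.0), (0.95, 85.0), (0.85, 82.0)]

# Stage 1: cluster the unlabeled data; stage 2: train on the cluster labels.
centroids, labels = kmeans(samples, centroids=[samples[0], samples[3]])
model = train_naive_bayes(samples, labels)
```

With a trained model, a new sample is classified by calling `predict(model, (0.92, 83.0))`, which returns the index of the cluster it most likely belongs to. In a real Hadoop deployment the map and reduce functions would run as distributed tasks and the centroids would be redistributed between iterations; that machinery is omitted here.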
