Published in July 2020 | Big Data Mining, Machine Learning

A REVIEW ON MACHINE LEARNING (FEATURE SELECTION, CLASSIFICATION AND CLUSTERING) APPROACHES OF BIG DATA MINING IN DIFFERENT AREA OF RESEARCH
Authors: Neeraj, Narender Kumar, Vineet Kumar Maurya.
Journal Name: Journal of Critical Reviews
Volume: 7 Issue: 9 Page No: 2610-2626
Indexing: SCOPUS
Abstract:

Today’s age is the age of data: huge amounts of data are being generated worldwide. This huge volume of data, called ‘big data’, has no meaning until the proper information is extracted from it. A small dataset with a limited number of dimensions can be analyzed using simple computer programs such as MS Excel spreadsheets, but huge, complex, multidimensional data (big data) cannot be analyzed using simple computational methods. Hence, machine learning approaches to data mining are required, which include the processes of feature selection, classification and clustering. Each of these steps is complex in itself and requires its own specialized algorithms. This review presents a survey of theoretical insights into the different methods used for feature selection (FS), clustering and classification. Applications of machine learning in marketing, libraries, climate, crime and the biological sciences, among others, are also briefly mentioned. It was also observed that no single method can analyze different types of big datasets with equal efficiency and accuracy. Moreover, hybrid algorithms (hybrid filter, hybrid wrapper and hybrid evolutionary-based algorithms) and modified algorithms are preferred for big data mining because they perform better than ensemble approaches or single algorithms. Using more than one method, or modified or hybrid methods, is a good choice for better prediction accuracy. Despite many combinations and trials, there is still no single algorithm that can be universally applied to big data arising in different fields; hence, newer algorithms or better amalgamations of different algorithms are still required for the analysis of big data. The absence of any perfect algorithm and the continuous generation of data offer great scope for computational scientists and programmers.
