A Review of Clustering Algorithms for Big Data

Published 2019 in 2019 International Conference on Networking and Advanced Systems (ICNAS)

ABSTRACT

Big data is usually characterized by the 5Vs + 1C: Volume, Velocity, Variety, Veracity, Value, and Complexity. The term refers to data that are large, dynamic, and complex, with a varying degree of accuracy. Such data are difficult to analyze with traditional data analysis techniques because of their high complexity and computational cost. Cluster analysis is the most widely used method for coping with huge amounts of data. The main goal of clustering is to partition data into clusters such that items in the same cluster are more similar to each other than to items in other clusters. In this paper, we provide an overview of various clustering techniques used for data analysis.
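
The clustering goal stated in the abstract can be illustrated with a small sketch. The code below is not taken from the paper. It is a minimal, dependency-free version of the standard k-means (Lloyd) procedure, and the names `sq_dist` and `kmeans`, the parameter defaults, and the sample points are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Illustrative k-means: returns (labels, centers) for a list of tuples."""
    rng = random.Random(seed)
    # Assumption: initial centers are k distinct points drawn at random.
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
print(labels)
```

On this toy input the first three points end up sharing one label and the last three share the other. Which group receives label 0 depends on the random initialization. The sketch holds all data in memory and recomputes every distance on each pass, which is the cost that the parallel and incremental variants in the reference list aim to reduce.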

PUBLICATION RECORD

Publication year
2019
Venue
2019 International Conference on Networking and Advanced Systems (ICNAS)
Publication date
2019-06-01
Fields of study
Mathematics, Computer Science
Identifiers
DOI 10.1109/ICNAS.2019.8807822
Source metadata
Semantic Scholar

REFERENCES

Multiple Kernel k-Means with Incomplete Kernels
2020cited by this paper
Big Data: Characteristics, Issues and Clustering Techniques
2018cited by this paper
Faster k-Medoids Clustering: Improving the PAM, CLARA, and CLARANS Algorithms
2018cited by this paper
An Optimized Chameleon Algorithm based on Local Features
2018cited by this paper
A new algorithm for clustering based on kernel density estimation
2018cited by this paper
Generalized Principal Component Analysis
2017cited by this paper
Principal Component Analysis Networks and Algorithms
2017cited by this paper
STING Algorithm Used English Sentiment Classification in a Parallel Environment
2017cited by this paper
An efficient semi-supervised representatives feature selection algorithm based on information theory
2017cited by this paper
An Incremental CFS Algorithm for Clustering Large Data in Industrial Internet of Things
2017cited by this paper
Malware family identification with BIRCH clustering
2017influential reference
A fast DBSCAN clustering algorithm by accelerating neighbor searching using Groups method
2016influential reference
Big data and clustering algorithms
2016cited by this paper
Initialization of K-modes clustering using outlier detection techniques
2016cited by this paper
Fast SVD Computations for Synchrophasor Algorithms
2016cited by this paper
mclust 5: Clustering, Classification and Density Estimation Using Gaussian Finite Mixture Models
2016cited by this paper
Variant of COBWEB Clustering for Privacy Preservation in Cloud DB Querying
2015cited by this paper
A review of feature selection methods with applications
2015cited by this paper
Incremental Linear Discriminant Analysis: A Fast Algorithm and Comparisons
2015cited by this paper
An improvement of DENCLUE algorithm for the data clustering
2015cited by this paper
A Comprehensive Survey of Clustering Algorithms
2015influential reference
Statistical guarantees for the EM algorithm: From population to sample-based analysis
2014cited by this paper
Strategies for Big Data Clustering
2014cited by this paper
A scalable and fast OPTICS for clustering trajectory big data
2014cited by this paper
Big Data Clustering: A Review
2014cited by this paper
Fast maximum clique algorithms for large graphs
2014cited by this paper
G-DBSCAN: A GPU Accelerated Algorithm for Density-based Clustering
2013cited by this paper
A Parallel Clustering Algorithm with MPI - MKmeans
2013cited by this paper
Parallel Algorithm for the Chameleon Clustering Algorithm using Dynamic Modeling
2013cited by this paper
High-Performance Intrusion Detection Using OptiGrid Clustering and Grid-Based Labelling
2011cited by this paper
DHCC: Divisive hierarchical clustering of categorical data
2011cited by this paper
K-Medoids Clustering
2010cited by this paper
PBIRCH: A Scalable Parallel Clustering algorithm for Incremental Data
2006cited by this paper
DBDC: Density Based Distributed Clustering
2004cited by this paper
Parallel implementation of CLARANS using PVM
2004cited by this paper
Parallel K-means Clustering Algorithm on NOWs
2003cited by this paper
Principal Direction Divisive Partitioning
1998cited by this paper

CITED BY

Hybrid IDK_means++: Integrating Particle Swarm Optimization for Robust and Accurate K_means Initialization
2026cites this paper
Evaluating Representative Days Selection for Capacity Planning with Variable Renewable Options
2025cites this paper
A Novel Approach to Test-Induced Defect Detection in Semiconductor Wafers, Using Graph-Based Semi-Supervised Learning (GSSL)
2025cites this paper
A performance enhanced distributed computing framework for clustering by local direction centrality upon Apache Spark
2025cites this paper
Developing an automatic fuzzy clustering algorithm for point data based on the circle similarity and applying to images
2025cites this paper
The Impact of Intelligence Gathering, Risk Analysis, and Scenario Planning on Defense Policy Formulation
2024cites this paper
Convex Optimization Techniques for High-Dimensional Data Clustering Analysis: A Review
2024cites this paper
Firefly forest: A swarm iteration-free swarm intelligence clustering algorithm
2024cites this paper
Prediction of wind energy location by parallel programming using MPI-based KMEANS clustering algorithm
2024cites this paper
Classification of Users of a Health Service Provider Using Unsupervised Machine Learning Methods
2024cites this paper
Profiling of Cardiogenic Shock: Incorporating Machine Learning Into Bedside Management
2024cites this paper
Team qIIMAS on Task 2 - Clustering
2024cites this paper
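PENERAPAN MULTI-CLUSTERING DALAM PENGELOMPOKAN KABUPATEN/KOTA DI PROVINSI JAWA BARAT BERDASARKAN INDEKS DESA MEMBANGUN [Application of Multi-Clustering to Grouping Regencies/Cities in West Java Province Based on the Village Development Index]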
2023cites this paper
Current and future role of data fusion and machine learning in infrastructure health monitoring
2023cites this paper
DDCM: a decentralized density clustering and its results gathering approach
2023cites this paper
A Taxonomy of Machine Learning Clustering Algorithms, Challenges, and Future Realms
2023cites this paper
CONGEST: An Algorithm to Detect Congestion Zones Based on Unmanned Aerial Vehicle (UAV) Flight plans
2023cites this paper
Optimizing Density-Based Ant Colony Stream Clustering Using FPGA-Based Hardware Accelerator
2023cites this paper
An alternative for data visualization using space-filling curve
2023cites this paper
An Efficient Pre-Clusters Assessment Technique for Efficient Data Partitions
2023cites this paper
Exploring disease axes as an alternative to distinct clusters for characterizing sepsis heterogeneity
2023cites this paper
Soft dimensionality reduction for reinforcement data clustering
2023cites this paper
A path to implementing a fresh produce e-commerce customer segmentation method based on clustering algorithms
2023cites this paper
Visual Analytics and Exploration of Calcium Transient Imaging Data using Event-Based Clustering
2023cites this paper
A Novel 2D Clustering Algorithm Based on Recursive Topological Data Structure
2022cites this paper
Research on English Achievement Analysis Based on Improved CARMA Algorithm
2022cites this paper
Advances in Electron Microscopy with Deep Learning
2021cites this paper
Smart User Consumption Profiling: Incremental Learning-Based OTT Service Degradation
2020cites this paper
Evaluation Of The Efficiency Of Clustering Using Ik-Means And Imap-Reduce Approach For Microarray Data
year unknowncites this paper