Who invented k-means, the algorithm that clusters nearby points?
In machine learning, a technique that builds a model from labeled data
and predicts the labels of unlabeled data is called classification, a form of supervised learning.
What if there are no labels, as with products and customers in online malls
or voters in presidential campaigns,
and we do not even know how many groups we need?
How would we group them?
Discovering these patterns and structures in unlabeled data is called unsupervised learning.
In particular, the task of grouping similar data points together is called clustering.
k-means is one of the simplest algorithms that solve the problem of clustering.
Here is a brief explanation of the k-means algorithm.
1. Choose “k”, the number of clusters you want to identify in the given data.
2. Randomly select “k” distinct data points and call them the initial clusters.
3. Measure the distance between the first point and each of the “k” initial clusters,
then assign the point to the nearest cluster. Do the same for every remaining point.
4. Next, calculate the representative point of each cluster.
This point might be the mean or the median.
Measure the distance from each data point to these representative points
and reassign each point to the nearest one.
5. Repeat step 4 until the representative points no longer change.
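The five steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation. It assumes the points are tuples of numbers, uses squared Euclidean distance, and takes the mean as the representative point. The names k_means, squared_distance, mean_point, max_iter, and seed are hypothetical and chosen just for this example.

```python
import random


def squared_distance(p, q):
    # Squared Euclidean distance between two points of equal dimension.
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    # Coordinate-wise mean of a non-empty list of points.
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iter=100, seed=0):
    # Assumption: points are hashable tuples and contain at least k distinct values.
    rng = random.Random(seed)
    # Step 2: pick k distinct data points as the initial representative points.
    centers = rng.sample(sorted(set(points)), k)
    labels = []
    for _ in range(max_iter):
        # Step 3: assign every point to its nearest representative point.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        # Step 4: recompute each representative point as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous representative point.
            new_centers.append(mean_point(members) if members else centers[j])
        # Step 5: stop once the representative points no longer change.
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Usage: two loose groups of 2-D points, asking for k = 2 clusters.
data = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
centers, labels = k_means(data, k=2)
print(centers)
print(labels)
```

Because the starting points are random, a different seed can lead to a different final grouping, which is why practical libraries usually run the algorithm several times and keep the best result.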
From 1950 to 1970, communication technology and computers developed rapidly,
and different algorithms for clustering data emerged in each field.
At that time, since theses and dissertations could not be searched on the internet,
similar ideas developed in different fields without researchers being aware of one another.
Take the backpropagation algorithm, for example.
David Everett Rumelhart presented it in a paper in 1986.
Later, however, it was found that Yu-Chi Ho had already been using it
in 1969 in control theory.
A similar idea also appeared in the doctoral dissertation of Paul Werbos in 1974.
The term ‘k-means’ was first used by James B. MacQueen in 1967.
Later, it turned out that this algorithm had already been used in the field
of pulse-code modulation (PCM) by Stuart P. Lloyd of Bell Labs in 1957.
In 1965, in the field of biology, Edward W. Forgy also presented a similar algorithm.
Thus, in the field of computer science, k-means is also called
Lloyd's algorithm or the Lloyd-Forgy algorithm.
In this way, it developed separately in different fields,
but today this algorithm is widely used in machine learning.