clustering.py

Module for performing color clustering on images using K-Means.

Clustering

Perform K-Means clustering on image data to group similar colors.

Source code in pycht/clustering.py
import numpy as np
from sklearn.cluster import KMeans


class Clustering:
    """
    Perform K-Means clustering on image data to group similar colors.
    """

    @staticmethod
    def compute(pixel_array: np.ndarray, nb_clusters: int, random_state: int = 0) -> np.ndarray:
        """
        Apply K-Means clustering to the given data and return the clustered result.

        Parameters
        ----------
        pixel_array : np.ndarray
            Flattened image data (pixels), shape (num_pixels, num_channels).
        nb_clusters : int
            The number of color clusters to form.
        random_state : int
            Random seed for reproducibility.

        Returns
        -------
        np.ndarray
            The clustered image data where each pixel is replaced by the centroid of its cluster,
            with dtype uint8 and the same shape as pixel_array.

        Examples
        --------
        Basic usage:

        >>> import numpy as np
        >>> from pycht.clustering import Clustering

        >>> pixels = np.array([[0, 0, 0], [2, 2, 2], [250, 250, 250], [254, 254, 254]], dtype=np.uint8)
        >>> quantized = Clustering.compute(pixels, nb_clusters=2)
        >>> quantized.shape
        (4, 3)

        With a custom random seed:

        >>> quantized = Clustering.compute(pixels, nb_clusters=2, random_state=42)
        """
        kmeans = KMeans(n_clusters=nb_clusters, n_init=10, random_state=random_state)
        labels = kmeans.fit_predict(pixel_array)
        centers = np.uint8(kmeans.cluster_centers_)
        return centers[labels]
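
The last two lines of compute rely on two NumPy behaviors: casting the float centroids to uint8, and integer-array indexing to map each label to its centroid. The sketch below illustrates both with made-up values; `cluster_centers` and `labels` are hypothetical stand-ins for `kmeans.cluster_centers_` and the output of `fit_predict`, not results of a real fit.

```python
import numpy as np

# Hypothetical stand-ins for kmeans.cluster_centers_ and the fit_predict labels.
cluster_centers = np.array([[10.9, 20.2, 30.5],
                            [200.4, 100.7, 50.1]])
labels = np.array([0, 1, 1, 0, 1])

# Casting a float array with np.uint8 truncates toward zero (10.9 becomes 10).
centers = np.uint8(cluster_centers)

# Integer-array indexing: row i of the result is centers[labels[i]].
quantized = centers[labels]

# Optional variant (an assumption, not what the source does):
# round to the nearest integer before casting.
rounded_centers = np.rint(cluster_centers).astype(np.uint8)
```

Because the cast truncates, a centroid channel of 100.7 is stored as 100 rather than 101. Whether that difference matters depends on the application; the rounding variant is shown only for comparison.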

compute(pixel_array, nb_clusters, random_state=0) staticmethod

Apply K-Means clustering to the given data and return the clustered result.

Parameters:

Name          Type     Description                                                       Default
pixel_array   ndarray  Flattened image data (pixels), shape (num_pixels, num_channels).  required
nb_clusters   int      The number of color clusters to form.                             required
random_state  int      Random seed for reproducibility.                                  0

Returns:

Type     Description
ndarray  The clustered image data where each pixel is replaced by the centroid of its cluster, with dtype uint8 and the same shape as pixel_array.
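
Since compute expects a flattened array and returns one of the same shape, a caller typically reshapes an image before the call and restores it afterwards. The following is a minimal sketch of that round trip, assuming an image stored as a (height, width, channels) array; `flatten_pixels` and `restore_image` are hypothetical helpers that are not part of pycht, and the call to compute is left as a comment.

```python
import numpy as np


def flatten_pixels(image: np.ndarray) -> np.ndarray:
    """Reshape (height, width, channels) into (num_pixels, num_channels)."""
    return image.reshape(-1, image.shape[-1])


def restore_image(flat: np.ndarray, shape: tuple) -> np.ndarray:
    """Reshape a flattened pixel array back to the original image shape."""
    return flat.reshape(shape)


# Synthetic 4x6 RGB image: left half black, right half white.
image = np.zeros((4, 6, 3), dtype=np.uint8)
image[:, 3:] = 255

pixel_array = flatten_pixels(image)  # shape (24, 3)

# The clustering call would go here, for example:
# pixel_array = Clustering.compute(pixel_array, nb_clusters=2)

restored = restore_image(pixel_array, image.shape)  # shape (4, 6, 3)
```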

Examples:

Basic usage:

>>> import numpy as np
>>> from pycht.clustering import Clustering
>>> pixels = np.array([[0, 0, 0], [2, 2, 2], [250, 250, 250], [254, 254, 254]], dtype=np.uint8)
>>> quantized = Clustering.compute(pixels, nb_clusters=2)
>>> quantized.shape
(4, 3)

With a custom random seed:

>>> quantized = Clustering.compute(pixels, nb_clusters=2, random_state=42)
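
One way to sanity-check a result is to count the distinct colors it contains. The sketch below does this with `np.unique`; `count_colors` is a hypothetical helper, and `quantized` is a hand-written array standing in for the return value of compute.

```python
import numpy as np


def count_colors(pixel_array: np.ndarray) -> int:
    """Return the number of distinct rows (colors) in a flattened pixel array."""
    return len(np.unique(pixel_array, axis=0))


# Hand-written stand-in for a compute() result with two clusters.
quantized = np.array([[10, 20, 30],
                      [200, 100, 50],
                      [200, 100, 50],
                      [10, 20, 30]], dtype=np.uint8)

n_colors = count_colors(quantized)
```

The count can be lower than nb_clusters if two centroids become identical after the cast to uint8, so it is best treated as an upper-bound check.
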
Source code in pycht/clustering.py
@staticmethod
def compute(pixel_array: np.ndarray, nb_clusters: int, random_state: int = 0) -> np.ndarray:
    """
    Apply K-Means clustering to the given data and return the clustered result.

    Parameters
    ----------
    pixel_array : np.ndarray
        Flattened image data (pixels), shape (num_pixels, num_channels).
    nb_clusters : int
        The number of color clusters to form.
    random_state : int
        Random seed for reproducibility.

    Returns
    -------
    np.ndarray
        The clustered image data where each pixel is replaced by the centroid of its cluster,
        with dtype uint8 and the same shape as pixel_array.

    Examples
    --------
    Basic usage:

    >>> import numpy as np
    >>> from pycht.clustering import Clustering

    >>> pixels = np.array([[0, 0, 0], [2, 2, 2], [250, 250, 250], [254, 254, 254]], dtype=np.uint8)
    >>> quantized = Clustering.compute(pixels, nb_clusters=2)
    >>> quantized.shape
    (4, 3)

    With a custom random seed:

    >>> quantized = Clustering.compute(pixels, nb_clusters=2, random_state=42)
    """
    kmeans = KMeans(n_clusters=nb_clusters, n_init=10, random_state=random_state)
    labels = kmeans.fit_predict(pixel_array)
    centers = np.uint8(kmeans.cluster_centers_)
    return centers[labels]