nemi.workflow module

class nemi.workflow.NEMI(params=None)[source]

Main NEMI workflow

Parameters: params (dict, optional) – Clustering and embedding algorithm parameters.

run(X, n=1)[source]

Run the NEMI pipeline

The pipeline consists of steps:

Parameters:

X (ndarray) – The data contained in a sparse matrix of shape (n_samples, n_features)
n (int, optional) – Number of iterations to run. Defaults to 1.

assess_overlap(base_id: int = 0, max_clusters=None, **kwargs)[source]

Assess the overlap between the clusters.

Parameters: base_id (int, optional) – Index (starting at 0) of the ensemble member to use as the base for comparison.
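To make the idea of comparing ensemble members against a base member concrete, the sketch below computes, for two label arrays, the fraction of each base cluster that lands in each cluster of another member. This is only an illustration of one possible overlap measure: the function name overlap_matrix and the sample labels are hypothetical, and nothing here is taken from the nemi implementation of assess_overlap.

```python
import numpy as np

def overlap_matrix(base_labels, other_labels):
    """Entry (i, j) is the fraction of base cluster i found in cluster j of the other member.

    Assumes labels are integers 0..k with no empty clusters (hypothetical helper).
    """
    base_labels = np.asarray(base_labels)
    other_labels = np.asarray(other_labels)
    n_base = base_labels.max() + 1
    n_other = other_labels.max() + 1
    # Contingency table: how many samples share each (base, other) label pair.
    counts = np.zeros((n_base, n_other), dtype=int)
    np.add.at(counts, (base_labels, other_labels), 1)
    # Normalize each row by the size of the base cluster.
    return counts / counts.sum(axis=1, keepdims=True)

base = np.array([0, 0, 0, 1, 1, 2])    # labels from the base ensemble member
other = np.array([1, 1, 0, 0, 0, 2])   # labels from another ensemble member
overlap = overlap_matrix(base, other)
best_match = overlap.argmax(axis=1)    # best-overlapping cluster for each base cluster
```

Each row of overlap sums to 1, so a row dominated by a single entry indicates a base cluster that is reproduced almost intact in the other member.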

class nemi.workflow.SingleNemi(params=None)[source]

A single instance of the NEMI pipeline

Parameters: params (dict, optional) – A dictionary of the embedding and clustering options. Defaults to nemi.workflow.default_params.

run(X, save_steps=True)[source]

Run a single instance of the NEMI pipeline

The pipeline consists of steps:

Parameters: X (ndarray) – The data contained in a sparse matrix of shape (n_samples, n_features)

scale_data(X)[source]

Scale the data to have a mean of 0 and a variance of 1.

Parameters:

X (ndarray) – The data to scale. A sparse matrix of shape (n_samples, n_features)
**kwargs – keyword arguments to embedding function
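As a rough illustration of what this kind of scaling does, the sketch below standardizes each feature (column) to zero mean and unit variance using NumPy. The function name standardize and the sample data are hypothetical; the library's own implementation may differ, for example in how it treats constant features.

```python
import numpy as np

def standardize(X):
    """Column-wise standardization: zero mean and unit variance per feature (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    # Avoid division by zero: constant features are mapped to zero.
    std[std == 0] = 1.0
    return (X - mean) / std

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])
X_scaled = standardize(X)
```

After scaling, features measured in very different units contribute comparably to the distances used by the embedding and clustering steps.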

fit_embedding(X)[source]

Run the embedding algorithm on the data

Parameters:

X (ndarray) – The data to embed. A sparse matrix of shape (n_samples, n_features)
**kwargs – keyword arguments to embedding function

predict_clusters()[source]

Run the clustering algorithm on the embedding

Clustering algorithm parameters are set by the clustering_dict attribute.

sort_clusters(clusters)[source]

Relabels the clusters 0, 1, …, k in order of descending size, so that label 0 is the largest cluster.
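The relabeling can be sketched with NumPy as follows. This is a minimal illustration under the assumption that clusters is a one-dimensional array of integer labels; the function name relabel_by_size is hypothetical and the code is not taken from the nemi source.

```python
import numpy as np

def relabel_by_size(labels):
    """Return labels renumbered so that 0 is the largest cluster, 1 the next largest, and so on."""
    labels = np.asarray(labels)
    unique, inverse, counts = np.unique(labels, return_inverse=True, return_counts=True)
    # order[i] is the position (in `unique`) of the i-th largest cluster; ties keep their original order.
    order = np.argsort(-counts, kind="stable")
    # rank[j] is the new label assigned to unique[j].
    rank = np.empty_like(order)
    rank[order] = np.arange(order.size)
    return rank[inverse]

labels = np.array([2, 2, 0, 1, 1, 1])
sorted_labels = relabel_by_size(labels)
```

Sorting labels this way makes cluster IDs comparable at a glance across runs, since a given label always refers to a cluster of the same size rank.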

save_embedding(filename)[source]

Save the embedding to a file