Clustering Algorithms
OpenEnsembles is a resource for performing and analyzing ensemble clustering. This file contains calls to clustering algorithms. Refer to this documentation for specifics about the variable parameters and their defaults, but interact with clustering through the openensembles.clustering() class.
Copyright (C) 2017 Naegle Lab
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.
-
openensembles.clustering_algorithms.convertDistanceToSimilarity(D, beta=1.0)
A utility to convert a distance matrix to a similarity matrix.
- Parameters
- D: matrix of floats
A matrix of distances, such as the one returned by returnDistanceMatrix(data, distance)
- beta: float
Scaling factor used when mapping distance to similarity. Larger values make similarity fall off faster as distance grows.
- Returns
- S: matrix of floats
A matrix of similarity values, computed as S = np.exp(-beta * D / D.std())
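The formula above can be tried on a small matrix. The following is a minimal sketch that only reproduces the documented expression with NumPy; distance_to_similarity is a hypothetical stand-in written for illustration, not the library's own code.

```python
import numpy as np

def distance_to_similarity(D, beta=1.0):
    """Hypothetical stand-in applying S = exp(-beta * D / D.std())."""
    D = np.asarray(D, dtype=float)
    # Dividing by the standard deviation of all entries rescales the distances
    # before the exponential is applied.
    return np.exp(-beta * D / D.std())

# A small symmetric distance matrix with zeros on the diagonal.
D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 3.0],
              [2.0, 3.0, 0.0]])
S = distance_to_similarity(D, beta=1.0)
print(S)
```

Zero distances map to a similarity of exactly 1, and larger distances map to values closer to 0, so the ordering of the pairs is reversed relative to D.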
-
openensembles.clustering_algorithms.returnDistanceMatrix(data, distance)
A utility to calculate a distance matrix from the data array, using the metric named in distance.
- Parameters
- data: matrix
Data matrix to calculate distances from
- distance: string
Name of the distance metric. See sklearn’s pairwise distances for the available options.
- Returns
- d: matrix
The distance matrix computed with the requested metric.
- Raises
- ValueError:
If the distance metric is not available.
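The behavior described above can be approximated directly with scikit-learn. This is a sketch under the assumption that the metric names are the ones accepted by sklearn.metrics.pairwise_distances; distance_matrix is a hypothetical helper, not the library function itself.

```python
import numpy as np
from sklearn.metrics import pairwise_distances

def distance_matrix(data, distance):
    """Hypothetical stand-in: pairwise distances between the rows of data."""
    # pairwise_distances raises ValueError itself when the metric name is unknown.
    return pairwise_distances(np.asarray(data, dtype=float), metric=distance)

# Three points on a line through the origin (a 3-4-5 triangle scaled up).
data = np.array([[0.0, 0.0],
                 [3.0, 4.0],
                 [6.0, 8.0]])
d = distance_matrix(data, "euclidean")
print(d)

try:
    distance_matrix(data, "not-a-metric")
except ValueError as err:
    print("unavailable metric:", err)
```

The result is a square matrix with one row and one column per observation, which is the form expected as D by convertDistanceToSimilarity.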
-
openensembles.clustering_algorithms.returnParams(paramsSent, paramsExpected, algorithm)
A utility for variable parameter setting in clustering algorithms. Takes two dictionaries of parameter key-value pairs and replaces the values in paramsExpected with any supplied in paramsSent.
- Returns
- params: dict
Dict of the final parameters: the defaults from paramsExpected, overwritten by any values given in paramsSent. The function also checks that, if precomputed distances have been selected, a distance or similarity matrix is also passed.
Warning
Warns the user if a key in paramsSent does not appear in paramsExpected.
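The merge-and-warn behavior can be sketched with plain dictionaries. merge_params and the parameter names K, init and typo below are hypothetical and chosen only for illustration. The sketch assumes that unexpected keys are still copied into the result after the warning. It omits the precomputed-distance check because the source does not name the keys involved.

```python
import warnings

def merge_params(params_sent, params_expected, algorithm):
    """Hypothetical stand-in: overlay user-supplied values on the defaults."""
    params = dict(params_expected)  # copy so the defaults are left untouched
    for key, value in params_sent.items():
        if key not in params_expected:
            warnings.warn(
                f"Parameter {key!r} is not expected by {algorithm}", UserWarning
            )
        # Assumption: unexpected keys are kept rather than dropped.
        params[key] = value
    return params

defaults = {"K": 2, "init": "k-means++"}  # hypothetical defaults
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    params = merge_params({"K": 5, "typo": 1}, defaults, "kmeans")

print(params)
print([str(w.message) for w in caught])
```

Copying the defaults before overlaying the supplied values keeps the expected dictionary reusable across calls.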