Finishing
OpenEnsembles is a resource for performing and analyzing ensemble clustering
Copyright (C) 2017 Naegle Lab
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

class openensembles.finishing.graph_closure(co_occ_matrix, threshold, clique_size=3)
Returns a final solution of the ensemble by treating the co-occurrence matrix as a weighted graph; the solution is found by identifying network components within the graph.
Finds k-percolated cliques in the graph G.
Unless the cliques argument evaluates to True, this algorithm first enumerates all cliques in G. These are stored in memory, which can be substantial for large graphs. Returns a generator object; wrap it in list() to obtain the percolated k-cliques as a list.
References
Based on the method outlined in Palla et al., Nature 435, 814-818 (2005)
Based on the percolation code from Conrad Lee
Methods
finish(): Finishes the ensemble by taking a binary adjacency matrix, defined at initialization according to the given threshold, and percolating the cliques.
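The thresholding and percolation steps described above can be sketched in plain Python. This is a minimal illustration, not the OpenEnsembles implementation: the helper names `threshold_adjacency` and `percolate_cliques` are hypothetical, the `>=` comparison against the threshold is an assumption, and cliques are enumerated by brute force, which is only practical for small matrices.

```python
from itertools import combinations

def threshold_adjacency(co_occ, threshold):
    """Binary adjacency: 1 where co-occurrence >= threshold (assumed), diagonal ignored."""
    n = len(co_occ)
    return [[1 if i != j and co_occ[i][j] >= threshold else 0 for j in range(n)]
            for i in range(n)]

def percolate_cliques(adj, k=3):
    """Group k-cliques that share k-1 nodes and return the resulting node sets."""
    n = len(adj)
    # Enumerate every k-clique by brute force (small graphs only).
    cliques = [frozenset(c) for c in combinations(range(n), k)
               if all(adj[a][b] for a, b in combinations(c, 2))]

    # Union-find over cliques: two cliques are merged when they overlap in k-1 nodes.
    parent = list(range(len(cliques)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) >= k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for idx, clique in enumerate(cliques):
        communities.setdefault(find(idx), set()).update(clique)
    return sorted(sorted(c) for c in communities.values())

# Hypothetical co-occurrence matrix for six objects.
co_occ = [
    [1.0, 0.9, 0.8, 0.7, 0.1, 0.0],
    [0.9, 1.0, 0.9, 0.2, 0.0, 0.1],
    [0.8, 0.9, 1.0, 0.8, 0.1, 0.0],
    [0.7, 0.2, 0.8, 1.0, 0.2, 0.1],
    [0.1, 0.0, 0.1, 0.2, 1.0, 0.9],
    [0.0, 0.1, 0.0, 0.1, 0.9, 1.0],
]
adj = threshold_adjacency(co_occ, threshold=0.5)
communities = percolate_cliques(adj, k=3)
print(communities)  # [[0, 1, 2, 3]]
```

In this example objects 4 and 5 are linked to each other but belong to no 3-clique, so they are left out of every community. Objects can be left unassigned in this way whenever the threshold or clique size is strict relative to the data.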

class openensembles.finishing.mixture_model(parg, N, nEnsCluster=2, iterations=10)
Implementation of the mixture model for ensemble clustering from Topchy, Jain, and Punch, "A mixture model for clustering ensembles," Proc. SIAM Int. Conf. Data Mining (2004).
Written by Pedro da Silva Tavares and adapted by Kristen M. Naegle
 Parameters
 parg: list of lists
Solutions of assignments of objects to clusters across an ensemble
 N: int
Number of data points to cluster
 nEnsCluster: int
Number of clusters to create in the mixture model. Default is 2.
 iterations: int
Number of expectation-maximization iterations. Default is 10.
 Attributes
 labels: list of ints
Final Mixture Model partitions of objects
Methods
expectation(): Compute the Expectation (ExpZ) according to the parameters.
gatherPartitions(): Returns the y vector.
genKj(): Generates the K(j) H-array that contains the tuples of unique clusters of each j-th partition, e.g. K = [(X,Y), (A,B)].
The function initializes the parameters of the mixture model.
Update the parameters taking into account the ExpZ computed in the Expectation (ExpZ) step.
emProcess()
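The parameter update listed above can be illustrated with a generic maximization step for a mixture of multinomials, together with a helper that collects the unique labels of each partition in the spirit of K(j). This is a sketch under stated assumptions, not the library's code: the names `unique_labels` and `m_step` are hypothetical, `v[j][m]` is stored as a dict keyed by label, and every component is assumed to carry nonzero total responsibility.

```python
def unique_labels(parg):
    """One tuple of sorted unique labels per partition, e.g. [(0, 1), (0, 1)]."""
    return [tuple(sorted(set(solution))) for solution in parg]

def m_step(y, exp_z, K):
    """Re-estimate mixing coefficients alpha and label probabilities v.

    y[i][j]     label of object i in partition j (N x H)
    exp_z[i][m] responsibility of component m for object i (N x M)
    K[j]        unique labels of partition j
    """
    n = len(y)
    n_components = len(exp_z[0])
    # Total responsibility per component (assumed to be nonzero).
    mass = [sum(row[m] for row in exp_z) for m in range(n_components)]
    alpha = [mass[m] / n for m in range(n_components)]
    v = []
    for j, labels_j in enumerate(K):
        v_j = []
        for m in range(n_components):
            # Share of component m's responsibility that falls on each label.
            v_j.append({k: sum(exp_z[i][m] for i in range(n) if y[i][j] == k) / mass[m]
                        for k in labels_j})
        v.append(v_j)
    return alpha, v

# Hypothetical ensemble: H = 2 partitions of N = 4 objects, M = 2 components.
parg = [[0, 0, 1, 1], [1, 1, 0, 0]]
K = unique_labels(parg)
y = [list(row) for row in zip(*parg)]
exp_z = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
alpha, v = m_step(y, exp_z, K)
print(alpha)  # [0.5, 0.5]
```

An EM loop would alternate an expectation step with this update for the requested number of iterations and then assign each object to its most responsible component.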

expectation()
Compute the Expectation (ExpZ) according to the parameters.
Note: y(N,H), Kj(H), alpha(M), v(H,M,K(j)), ExpZ(N,M)
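To make the expectation step concrete, here is a minimal sketch of the responsibility computation for a mixture of multinomials, using the array layout noted above. It is an illustration rather than the library's method: the function name `e_step` is hypothetical, `v[j][m]` is assumed to be a dict mapping each label of partition j to its probability under component m, and missing labels are not handled.

```python
def e_step(y, alpha, v):
    """Return exp_z (N x M): responsibility of each component for each object.

    y[i][j]     label of object i in partition j (N x H)
    alpha[m]    mixing coefficient of component m (length M)
    v[j][m][k]  probability of label k in partition j under component m
    """
    exp_z = []
    for labels in y:
        weights = []
        for m, a in enumerate(alpha):
            w = a
            # Partitions are treated as conditionally independent given the component.
            for j, label in enumerate(labels):
                w *= v[j][m][label]
            weights.append(w)
        total = sum(weights)
        exp_z.append([w / total for w in weights])
    return exp_z

# Hypothetical parameters: N = 2 objects, H = 2 partitions, M = 2 components.
y = [["a", "x"], ["b", "y"]]
alpha = [0.5, 0.5]
v = [
    [{"a": 0.8, "b": 0.2}, {"a": 0.2, "b": 0.8}],  # partition 0
    [{"x": 0.9, "y": 0.1}, {"x": 0.1, "y": 0.9}],  # partition 1
]
exp_z = e_step(y, alpha, v)
print(exp_z)
```

Each row of `exp_z` sums to 1. Here the first object is attributed mostly to component 0 and the second mostly to component 1.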

gatherPartitions()
Returns the y vector.
parg: list of H labeling solutions
nElem: number of features/objects
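One plausible reading of this step is a transposition: the H labeling solutions, each of length nElem, become one row of H labels per object. The sketch below assumes that reading. The function name `gather_partitions` and the length check are illustrative and are not taken from the library.

```python
def gather_partitions(parg, n_elem):
    """Turn H solutions of length n_elem into an n_elem x H label matrix y."""
    for solution in parg:
        if len(solution) != n_elem:
            raise ValueError("every solution must label all n_elem objects")
    # zip(*parg) groups the labels of the same object across all solutions.
    return [list(row) for row in zip(*parg)]

# Hypothetical ensemble: H = 3 solutions for 4 objects.
parg = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 1, 1],
]
y = gather_partitions(parg, n_elem=4)
print(y)  # [[0, 1, 0], [0, 1, 1], [1, 0, 1], [1, 0, 1]]
```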