package graph_mining.in_memory

Mouse Melon logoGet desktop application:
View/edit binary Protocol Buffers messages

message AffinityClustererConfig

affinity.proto:23

NextId: 11

Used in: ClustererConfig, EmbedderConfig

message AffinityClustererConfig.ActiveClusterCondition

affinity.proto:77

Specifies a set of conditions that qualify cluster as "active". An unset field defines a condition that's always satisfied.

Used in: AffinityClustererConfig

enum AffinityClustererConfig.EdgeAggregationFunction

affinity.proto:58

Specifies how edge weights are aggregated when computing a compressed graph for subsequent iterations. Let S = set of edge weights between two clusters, X, Y = total number of nodes in each cluster. With these definitions, we use the following formulas:

Used in: AffinityClustererConfig

message AffinityClustererConfig.SizeConstraint

affinity.proto:102

Specifies size constraints.

Used in: AffinityClustererConfig

message AffinityClustererConfig.WeightThresholdsSequence

affinity.proto:35

Used in: AffinityClustererConfig

message CliqueContainerConfig

clique_container.proto:7

Used in: ClustererConfig

message ClustererConfig

config.proto:92

Config for InMemoryClusterer subclasses. When adding a new subclass: a) add a new proto definition to this file, b) add this proto to the config oneof in ClustererConfig. Next available tag: 20

Used in: ClusteringSpec

message ClusteringSpec

config.proto:114

Specification of an in-memory clustering step, consisting of the clusterer name and its config.

message CoconductanceConfig

coconductance.proto:31

Config for CoconductanceClusterer, which optimizes for co-conductance, which is a clustering quality measure defined as follows. Given an undirected graph with weighted edges, for a cluster C define: * vol(C) = total weighted degree of nodes within C * E(C) = total weight of all undirected edges with both endpoints in C * ccond(C) = 2E(C) / vol(C) Then, given an exponent p > 0, we define the coconductance of the clustering C_1, ..., C_k to be sum ccond(C_i)^p for i = i, ..., k This seems to be a reasonable method for finding dense clusters, even in the absence of weights. See go/coconductance-paper for details. The algorithm optimizes co-conductance using Louvain algorithm.

Used in: ClustererConfig

message CoconductanceConfig.ConstantApproximate

coconductance.proto:44

Used in: CoconductanceConfig

message CoconductanceConfig.Louvain

coconductance.proto:35

Used in: CoconductanceConfig

message CorrelationClustererConfig

correlation.proto:67

Consider a graph with vertex set V, edge set E, non-negative vertex weights k_u, edge weights w_uv, and a "resolution" parameter which must be non-negative. We define "rescaled" edge weights w'_uv for all u, v, in V as: { 0 if u == v { w_uv - edge_weight_offset - if {u, v} in E, w'_{uv} = { resolution k_u k_v { -resolution k_u k_v otherwise Note that when using graph_mining.Node protos to store the graph, the edge and node weights default to 1. For bipartite objective computation, see comments preceding `use_bipartite_objective`. The correlation clustering objective is to maximize sum_{u, v in the same cluster} w'_uv, which is equivalent (up to sign and an additive constant) to the "maximizing agreements" and "minimizing disagreements" formulations of correlation clustering that are used in the approximation algorithms literature. Assuming the total edge weight in the graph is M, modularity partitioning can be expressed in this form by: * setting resolution = 1/(2*M), * setting the weight of each node to the total weight of its incident edges. Note that the final correlation clustering objective is monotonic in, but not equal to modularity. In particular, if the correlation clustering objective is C, we have: modularity = (C - resolution * sum_v (deg_v)^2 / (4 * M)) / M. This special case is available as ModularityClusterer in this library. To optimize this objective we use local search. We start with each vertex in its own cluster. We consider moves of the following form: move all vertices in a "move set" S of vertices to either one of the existing clusters or to a newly created cluster. We currently consider the following options for S: - Each vertex in a singleton move set. This reduces to the classic single vertex moves. - One move set per current cluster with all the vertices currently in it. With these move sets we're effectively considering merging two clusters. - One move set per cluster from single-level affinity clustering. - One move set per cluster from a run of the Ailon Charikar Newman approximation algorithm. The clusters produced by three runs of the Ailon Charikar Newman algorithm (it's randomized) are used as move sets each iteration. The local search proceeds with effectively three nested loops. The outermost loop is over the num_iterations iterations. The middle loop is over the four move types listed above. The inner loop is over move sets of the particular type. For each move set considered we move that move set to the cluster that improves the objective the most if an improving move exists. Next available tag: 15

Used in: ClustererConfig, ModularityClustererConfig

enum CorrelationClustererConfig.ClusteringMovesMethod

correlation.proto:99

Specifies the algorithm to use for correlation clustering.

Used in: CorrelationClustererConfig

message DynamicHacConfig

dynamic_hac.proto:23

Dynamic hierarchical agglomerative clustering (HAC) using the average linkage function.

message EmbedderConfig

parline.proto:53

Config for algorithm used for linear embedding of the input graph.

Used in: LinePartitionerConfig

message LabelPropagationConfig

label_propagation.proto:25

Shared-memory implementation of parallel label propagation. The initial label of each node is its own id. In each round, each node that had a neighbor update its label on the previous round recomputes it label as the most frequent label among its neighborhood. Edge weights are summed up to compute the max-frequency label. Note that the order that nodes are updated will depend on the options set in the config below.

Used in: ClustererConfig

message LinePartitionerConfig

parline.proto:22

Config for the parallel line partitioner (go/parline).

Used in: ClustererConfig

message LocalSearchConfig

parline.proto:65

Used in: LinePartitionerConfig

message LouvainConfig

correlation.proto:162

This config is for clustering using the Louvain algorithm, where the objective is given by another config.

Used in: CorrelationClustererConfig

message MinimumLinearArrangementConfig

minla.proto:19

enum MinimumLinearArrangementConfig.CostMetric

minla.proto:20

Used in: MinimumLinearArrangementConfig

message ModularityClustererConfig

modularity.proto:22

Used in: ClustererConfig

message PairwiseImproverConfig

parline.proto:69

Used in: LocalSearchConfig

message PairwiseImproverConfig.ClusterPairingMethod

parline.proto:75

Cluster pairing method used when two clusters are chosen to swap nodes.

Used in: PairwiseImproverConfig

enum PairwiseImproverConfig.ClusterPairingMethod.Name

parline.proto:76

Used in: ClusterPairingMethod

message ParHacConfig

config.proto:41

Parallel hierarchical agglomerative clustering (HAC) using the average linkage function. The algorithm that we use is described in go/parhac-paper.

Used in: ClustererConfig

message TeraHacConfig

config.proto:71

Shared-memory implementation of TeraHac: https://arxiv.org/abs/2308.03578

Used in: ClustererConfig