public abstract class VertexRDD<VD> extends RDD<scala.Tuple2<java.lang.Object,VD>>
Extends RDD[(VertexId, VD)] by ensuring that there is only one entry for each vertex and by
pre-indexing the entries for fast, efficient joins. Two VertexRDDs with the same index can be
joined efficiently. All operations except reindex preserve the index. To construct a
VertexRDD, use the VertexRDD object.
Additionally, stores routing information to enable joining the vertex attributes with an
EdgeRDD.
| Constructor and Description |
|---|
| VertexRDD(SparkContext sc, scala.collection.Seq<Dependency<?>> deps) |
| Modifier and Type | Method | Description |
|---|---|---|
| abstract <VD2> VertexRDD<VD2> | aggregateUsingIndex(RDD<scala.Tuple2<java.lang.Object,VD2>> messages, scala.Function2<VD2,VD2,VD2> reduceFunc, scala.reflect.ClassTag<VD2> evidence$12) | Aggregates vertices in messages that have the same ids using reduceFunc, returning a VertexRDD co-indexed with this. |
| static <VD> VertexRDD<VD> | apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices, scala.reflect.ClassTag<VD> evidence$14) | Constructs a standalone VertexRDD (one that is not set up for efficient joins with an EdgeRDD) from an RDD of vertex-attribute pairs. |
| static <VD> VertexRDD<VD> | apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices, EdgeRDD<?> edges, VD defaultVal, scala.reflect.ClassTag<VD> evidence$15) | Constructs a VertexRDD from an RDD of vertex-attribute pairs. |
| static <VD> VertexRDD<VD> | apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices, EdgeRDD<?> edges, VD defaultVal, scala.Function2<VD,VD,VD> mergeFunc, scala.reflect.ClassTag<VD> evidence$16) | Constructs a VertexRDD from an RDD of vertex-attribute pairs. |
| scala.collection.Iterator<scala.Tuple2<java.lang.Object,VD>> | compute(Partition part, TaskContext context) | Provides the RDD[(VertexId, VD)] equivalent output. |
| abstract VertexRDD<VD> | diff(RDD<scala.Tuple2<java.lang.Object,VD>> other) | For each vertex present in both this and other, diff returns only those vertices with differing values; for values that are different, keeps the values from other. |
| abstract VertexRDD<VD> | diff(VertexRDD<VD> other) | For each vertex present in both this and other, diff returns only those vertices with differing values; for values that are different, keeps the values from other. |
| VertexRDD<VD> | filter(scala.Function1<scala.Tuple2<java.lang.Object,VD>,java.lang.Object> pred) | Restricts the vertex set to the set of vertices satisfying the given predicate. |
| static <VD> VertexRDD<VD> | fromEdges(EdgeRDD<?> edges, int numPartitions, VD defaultVal, scala.reflect.ClassTag<VD> evidence$17) | Constructs a VertexRDD containing all vertices referred to in edges. |
| protected Partition[] | getPartitions() | Implemented by subclasses to return the set of partitions in this RDD. |
| abstract <U,VD2> VertexRDD<VD2> | innerJoin(RDD<scala.Tuple2<java.lang.Object,U>> other, scala.Function3<java.lang.Object,VD,U,VD2> f, scala.reflect.ClassTag<U> evidence$10, scala.reflect.ClassTag<VD2> evidence$11) | Inner joins this VertexRDD with an RDD containing vertex attribute pairs. |
| abstract <U,VD2> VertexRDD<VD2> | innerZipJoin(VertexRDD<U> other, scala.Function3<java.lang.Object,VD,U,VD2> f, scala.reflect.ClassTag<U> evidence$8, scala.reflect.ClassTag<VD2> evidence$9) | Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index. |
| abstract <VD2,VD3> VertexRDD<VD3> | leftJoin(RDD<scala.Tuple2<java.lang.Object,VD2>> other, scala.Function3<java.lang.Object,VD,scala.Option<VD2>,VD3> f, scala.reflect.ClassTag<VD2> evidence$6, scala.reflect.ClassTag<VD3> evidence$7) | Left joins this VertexRDD with an RDD containing vertex attribute pairs. |
| abstract <VD2,VD3> VertexRDD<VD3> | leftZipJoin(VertexRDD<VD2> other, scala.Function3<java.lang.Object,VD,scala.Option<VD2>,VD3> f, scala.reflect.ClassTag<VD2> evidence$4, scala.reflect.ClassTag<VD3> evidence$5) | Left joins this RDD with another VertexRDD with the same index. |
| abstract <VD2> VertexRDD<VD2> | mapValues(scala.Function1<VD,VD2> f, scala.reflect.ClassTag<VD2> evidence$2) | Maps each vertex attribute, preserving the index. |
| abstract <VD2> VertexRDD<VD2> | mapValues(scala.Function2<java.lang.Object,VD,VD2> f, scala.reflect.ClassTag<VD2> evidence$3) | Maps each vertex attribute, additionally supplying the vertex ID. |
| abstract VertexRDD<VD> | minus(RDD<scala.Tuple2<java.lang.Object,VD>> other) | For each VertexId present in both this and other, minus will act as a set difference operation returning only those unique VertexId's present in this. |
| abstract VertexRDD<VD> | minus(VertexRDD<VD> other) | For each VertexId present in both this and other, minus will act as a set difference operation returning only those unique VertexId's present in this. |
| abstract VertexRDD<VD> | reindex() | Construct a new VertexRDD that is indexed by only the visible vertices. |
| abstract VertexRDD<VD> | reverseRoutingTables() | Returns a new VertexRDD reflecting a reversal of all edge directions in the corresponding EdgeRDD. |
| protected abstract scala.reflect.ClassTag<VD> | vdTag() | |
| abstract VertexRDD<VD> | withEdges(EdgeRDD<?> edges) | Prepares this VertexRDD for efficient joins with the given EdgeRDD. |
Methods inherited from class RDD:
aggregate, cache, cartesian, checkpoint, checkpointData, clearDependencies, coalesce, collect, collect, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, creationSite, dependencies, distinct, distinct, doubleRDDToDoubleRDDFunctions, filterWith, first, firstParent, flatMap, flatMapWith, fold, foreach, foreachPartition, foreachWith, getCheckpointFile, getDependencies, getNumPartitions, getPreferredLocations, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, isEmpty, iterator, keyBy, localCheckpoint, map, mapPartitions, mapPartitionsWithContext, mapPartitionsWithIndex, mapPartitionsWithSplit, mapWith, max, min, name, numericRDDToDoubleRDDFunctions, parent, partitioner, partitions, persist, persist, pipe, pipe, pipe, preferredLocations, randomSplit, rddToAsyncRDDActions, rddToOrderedRDDFunctions, rddToPairRDDFunctions, rddToSequenceFileRDDFunctions, reduce, repartition, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, scope, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toArray, toDebugString, toJavaRDD, toLocalIterator, top, toString, treeAggregate, treeReduce, union, unpersist, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipWithIndex, zipWithUniqueId
Methods inherited from class java.lang.Object:
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
Methods inherited from interface Logging:
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning

public VertexRDD(SparkContext sc, scala.collection.Seq<Dependency<?>> deps)

public static <VD> VertexRDD<VD> apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices, scala.reflect.ClassTag<VD> evidence$14)
Constructs a standalone VertexRDD (one that is not set up for efficient joins with an
EdgeRDD) from an RDD of vertex-attribute pairs. Duplicate entries are removed arbitrarily.
Parameters:
vertices - the collection of vertex-attribute pairs
evidence$14 - (undocumented)

public static <VD> VertexRDD<VD> apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices, EdgeRDD<?> edges, VD defaultVal, scala.reflect.ClassTag<VD> evidence$15)
Constructs a VertexRDD from an RDD of vertex-attribute pairs. Duplicate vertex entries are
removed arbitrarily. The resulting VertexRDD will be joinable with edges, and any missing
vertices referred to by edges will be created with the attribute defaultVal.
Parameters:
vertices - the collection of vertex-attribute pairs
edges - the EdgeRDD that these vertices may be joined with
defaultVal - the vertex attribute to use when creating missing vertices
evidence$15 - (undocumented)

public static <VD> VertexRDD<VD> apply(RDD<scala.Tuple2<java.lang.Object,VD>> vertices, EdgeRDD<?> edges, VD defaultVal, scala.Function2<VD,VD,VD> mergeFunc, scala.reflect.ClassTag<VD> evidence$16)
Constructs a VertexRDD from an RDD of vertex-attribute pairs. Duplicate vertex entries are
merged using mergeFunc. The resulting VertexRDD will be joinable with edges, and any
missing vertices referred to by edges will be created with the attribute defaultVal.
Parameters:
vertices - the collection of vertex-attribute pairs
edges - the EdgeRDD that these vertices may be joined with
defaultVal - the vertex attribute to use when creating missing vertices
mergeFunc - the commutative, associative duplicate vertex attribute merge function
evidence$16 - (undocumented)

public static <VD> VertexRDD<VD> fromEdges(EdgeRDD<?> edges, int numPartitions, VD defaultVal, scala.reflect.ClassTag<VD> evidence$17)
Constructs a VertexRDD containing all vertices referred to in edges. The vertices will be
created with the attribute defaultVal. The resulting VertexRDD will be joinable with
edges.
Parameters:
edges - the EdgeRDD referring to the vertices to create
numPartitions - the desired number of partitions for the resulting VertexRDD
defaultVal - the vertex attribute to use when creating missing vertices
evidence$17 - (undocumented)

protected abstract scala.reflect.ClassTag<VD> vdTag()
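The factory semantics described above (duplicate entries merged with mergeFunc, and vertices that are referred to by edges but missing from the input created with defaultVal) can be illustrated on ordinary Java collections. The following is only a sketch of that documented behavior and does not call Spark: the class `VertexFactoryModel`, the `Edge` holder, and the `build` helper are hypothetical names introduced here for illustration, and partitioning and indexing are ignored entirely.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

// Plain-Java model of the documented factory semantics. This is NOT the Spark API.
final class VertexFactoryModel {

    // Hypothetical stand-in for an edge: only the two vertex ids it refers to.
    static final class Edge {
        final long src;
        final long dst;
        Edge(long src, long dst) { this.src = src; this.dst = dst; }
    }

    // Models apply(vertices, edges, defaultVal, mergeFunc):
    // duplicates are merged with mergeFunc, and every vertex referred to by an
    // edge but absent from the input is created with defaultVal.
    static <VD> Map<Long, VD> build(List<Map.Entry<Long, VD>> vertices,
                                    List<Edge> edges,
                                    VD defaultVal,
                                    BinaryOperator<VD> mergeFunc) {
        Map<Long, VD> result = new TreeMap<>();
        for (Map.Entry<Long, VD> v : vertices) {
            result.merge(v.getKey(), v.getValue(), mergeFunc);
        }
        for (Edge e : edges) {
            result.putIfAbsent(e.src, defaultVal);
            result.putIfAbsent(e.dst, defaultVal);
        }
        return result;
    }

    // Models fromEdges(edges, numPartitions, defaultVal): every vertex referred
    // to in edges is created with defaultVal (numPartitions has no analogue here).
    static <VD> Map<Long, VD> fromEdges(List<Edge> edges, VD defaultVal) {
        return build(List.<Map.Entry<Long, VD>>of(), edges, defaultVal, (a, b) -> a);
    }

    public static void main(String[] args) {
        List<Map.Entry<Long, Integer>> vertices =
            List.of(Map.entry(1L, 10), Map.entry(1L, 5), Map.entry(2L, 7));
        List<Edge> edges = List.of(new Edge(2L, 3L));

        Map<Long, Integer> merged = build(vertices, edges, 0, Integer::sum);
        System.out.println(merged);            // {1=15, 2=7, 3=0}

        Map<Long, Integer> onlyEdges = fromEdges(edges, 0);
        System.out.println(onlyEdges);         // {2=0, 3=0}
    }
}
```

Vertex 1 appears twice and is merged to 15, vertex 2 keeps its attribute, and vertex 3 exists only because an edge refers to it. Since mergeFunc is documented as commutative and associative, the result does not depend on the order in which duplicates are encountered.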
protected Partition[] getPartitions()
Implemented by subclasses to return the set of partitions in this RDD.
Specified by: getPartitions in class RDD<scala.Tuple2<java.lang.Object,VD>>

public scala.collection.Iterator<scala.Tuple2<java.lang.Object,VD>> compute(Partition part, TaskContext context)
Provides the RDD[(VertexId, VD)] equivalent output.

public abstract VertexRDD<VD> reindex()
Construct a new VertexRDD that is indexed by only the visible vertices.

public VertexRDD<VD> filter(scala.Function1<scala.Tuple2<java.lang.Object,VD>,java.lang.Object> pred)
Restricts the vertex set to the set of vertices satisfying the given predicate.
It is declared and defined here to allow refining the return type from RDD[(VertexId, VD)] to
VertexRDD[VD].

public abstract <VD2> VertexRDD<VD2> mapValues(scala.Function1<VD,VD2> f, scala.reflect.ClassTag<VD2> evidence$2)
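Maps each vertex attribute, preserving the index.
Parameters:
f - the function applied to each value in the RDD
evidence$2 - (undocumented)
Returns:
a new VertexRDD with values obtained by applying f to each of the entries in the original VertexRDD

public abstract <VD2> VertexRDD<VD2> mapValues(scala.Function2<java.lang.Object,VD,VD2> f, scala.reflect.ClassTag<VD2> evidence$3)
Maps each vertex attribute, additionally supplying the vertex ID.
Parameters:
f - the function applied to each ID-value pair in the RDD
evidence$3 - (undocumented)
Returns:
a new VertexRDD with values obtained by applying f to each of the entries in the original VertexRDD. The resulting VertexRDD retains the same index.

public abstract VertexRDD<VD> minus(RDD<scala.Tuple2<java.lang.Object,VD>> other)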
For each VertexId present in both this and other, minus will act as a set difference
operation returning only those unique VertexId's present in this.
Parameters:
other - an RDD to run the set operation against

public abstract VertexRDD<VD> minus(VertexRDD<VD> other)
For each VertexId present in both this and other, minus will act as a set difference
operation returning only those unique VertexId's present in this.
Parameters:
other - a VertexRDD to run the set operation against

public abstract VertexRDD<VD> diff(RDD<scala.Tuple2<java.lang.Object,VD>> other)
For each vertex present in both this and other, diff returns only those vertices with
differing values; for values that are different, keeps the values from other. This is
only guaranteed to work if the VertexRDDs share a common ancestor.
Parameters:
other - the other RDD[(VertexId, VD)] to diff against

public abstract VertexRDD<VD> diff(VertexRDD<VD> other)
For each vertex present in both this and other, diff returns only those vertices with
differing values; for values that are different, keeps the values from other. This is
only guaranteed to work if the VertexRDDs share a common ancestor.
Parameters:
other - the other VertexRDD to diff against

public abstract <VD2,VD3> VertexRDD<VD3> leftZipJoin(VertexRDD<VD2> other, scala.Function3<java.lang.Object,VD,scala.Option<VD2>,VD3> f, scala.reflect.ClassTag<VD2> evidence$4, scala.reflect.ClassTag<VD3> evidence$5)
Left joins this RDD with another VertexRDD with the same index. This function will fail if
both VertexRDDs do not share the same index. The resulting vertex set contains an entry for
each vertex in this.
If other is missing any vertex in this VertexRDD, f is passed None.
Parameters:
other - the other VertexRDD with which to join
f - the function mapping a vertex id and its attributes in this and the other vertex set to a new vertex attribute
evidence$4 - (undocumented)
evidence$5 - (undocumented)
Returns:
a VertexRDD containing the results of f

public abstract <VD2,VD3> VertexRDD<VD3> leftJoin(RDD<scala.Tuple2<java.lang.Object,VD2>> other, scala.Function3<java.lang.Object,VD,scala.Option<VD2>,VD3> f, scala.reflect.ClassTag<VD2> evidence$6, scala.reflect.ClassTag<VD3> evidence$7)
Left joins this VertexRDD with an RDD containing vertex attribute pairs. If the other RDD is
backed by a VertexRDD with the same index, the efficient leftZipJoin implementation is
used. The resulting VertexRDD contains an entry for each vertex in this. If other is
missing any vertex in this VertexRDD, f is passed None. If there are duplicates,
the vertex is picked arbitrarily.
Parameters:
other - the other RDD with which to join
f - the function mapping a vertex id and its attributes in this and the other vertex set to a new vertex attribute
evidence$6 - (undocumented)
evidence$7 - (undocumented)
Returns:
a VertexRDD containing all the vertices in this VertexRDD with the attributes emitted by f

public abstract <U,VD2> VertexRDD<VD2> innerZipJoin(VertexRDD<U> other, scala.Function3<java.lang.Object,VD,U,VD2> f, scala.reflect.ClassTag<U> evidence$8, scala.reflect.ClassTag<VD2> evidence$9)
Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index. See
innerJoin for the behavior of the join.
Parameters:
other - (undocumented)
f - (undocumented)
evidence$8 - (undocumented)
evidence$9 - (undocumented)

public abstract <U,VD2> VertexRDD<VD2> innerJoin(RDD<scala.Tuple2<java.lang.Object,U>> other, scala.Function3<java.lang.Object,VD,U,VD2> f, scala.reflect.ClassTag<U> evidence$10, scala.reflect.ClassTag<VD2> evidence$11)
Inner joins this VertexRDD with an RDD containing vertex attribute pairs. If the other RDD is
backed by a VertexRDD with the same index, the efficient innerZipJoin implementation
is used.
Parameters:
other - an RDD containing vertices to join. If there are multiple entries for the same vertex, one is picked arbitrarily. Use aggregateUsingIndex to merge multiple entries.
f - the join function applied to corresponding values of this and other
evidence$10 - (undocumented)
evidence$11 - (undocumented)
Returns:
a VertexRDD co-indexed with this, containing only vertices that appear in both this and other, with values supplied by f

public abstract <VD2> VertexRDD<VD2> aggregateUsingIndex(RDD<scala.Tuple2<java.lang.Object,VD2>> messages, scala.Function2<VD2,VD2,VD2> reduceFunc, scala.reflect.ClassTag<VD2> evidence$12)
Aggregates vertices in messages that have the same ids using reduceFunc, returning a
VertexRDD co-indexed with this.
Parameters:
messages - an RDD containing messages to aggregate, where each message is a pair of its target vertex ID and the message data
reduceFunc - the associative aggregation function for merging messages to the same vertex
evidence$12 - (undocumented)
Returns:
a VertexRDD co-indexed with this, containing only vertices that received messages. For those vertices, their values are the result of applying reduceFunc to all received messages.

public abstract VertexRDD<VD> reverseRoutingTables()
Returns a new VertexRDD reflecting a reversal of all edge directions in the corresponding
EdgeRDD.
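The set-style and join operations documented above (minus, diff, leftJoin, innerJoin, aggregateUsingIndex) may be easier to follow on a small in-memory model. The sketch below mirrors the documented results using plain Java maps keyed by vertex id. It is not Spark code: `VertexOpsModel` and `Fn3` are hypothetical names introduced for illustration, `java.util.Optional` stands in for `scala.Option`, and dropping messages addressed to ids that are absent from this is an assumption inferred from the phrase "co-indexed with this".

```java
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;
import java.util.TreeMap;
import java.util.function.BinaryOperator;

// Plain-Java model of the documented operation semantics. This is NOT the Spark API.
final class VertexOpsModel {

    // Hypothetical three-argument function type, standing in for scala.Function3.
    @FunctionalInterface
    interface Fn3<A, B, C, R> { R apply(A a, B b, C c); }

    // minus: keep only the vertex ids of self that do not occur in other.
    static <VD> Map<Long, VD> minus(Map<Long, VD> self, Map<Long, VD> other) {
        Map<Long, VD> out = new TreeMap<>(self);
        out.keySet().removeAll(other.keySet());
        return out;
    }

    // diff: for ids present in both, keep those whose values differ,
    // taking the value from other.
    static <VD> Map<Long, VD> diff(Map<Long, VD> self, Map<Long, VD> other) {
        Map<Long, VD> out = new TreeMap<>();
        for (Map.Entry<Long, VD> e : other.entrySet()) {
            Long id = e.getKey();
            if (self.containsKey(id) && !Objects.equals(self.get(id), e.getValue())) {
                out.put(id, e.getValue());
            }
        }
        return out;
    }

    // leftJoin: one entry per vertex in self; f sees an empty Optional
    // when other has no entry for that vertex.
    static <VD, VD2, VD3> Map<Long, VD3> leftJoin(Map<Long, VD> self, Map<Long, VD2> other,
                                                  Fn3<Long, VD, Optional<VD2>, VD3> f) {
        Map<Long, VD3> out = new TreeMap<>();
        self.forEach((id, attr) ->
            out.put(id, f.apply(id, attr, Optional.ofNullable(other.get(id)))));
        return out;
    }

    // innerJoin: only vertices that appear in both self and other.
    static <VD, U, VD2> Map<Long, VD2> innerJoin(Map<Long, VD> self, Map<Long, U> other,
                                                 Fn3<Long, VD, U, VD2> f) {
        Map<Long, VD2> out = new TreeMap<>();
        self.forEach((id, attr) -> {
            if (other.containsKey(id)) {
                out.put(id, f.apply(id, attr, other.get(id)));
            }
        });
        return out;
    }

    // aggregateUsingIndex: reduce messages per target id; only vertices that
    // received messages appear in the result.
    static <VD, VD2> Map<Long, VD2> aggregateUsingIndex(Map<Long, VD> self,
                                                        List<Map.Entry<Long, VD2>> messages,
                                                        BinaryOperator<VD2> reduceFunc) {
        Map<Long, VD2> out = new TreeMap<>();
        for (Map.Entry<Long, VD2> m : messages) {
            // Assumption: messages to ids outside self's index are dropped.
            if (self.containsKey(m.getKey())) {
                out.merge(m.getKey(), m.getValue(), reduceFunc);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Long, String> self = Map.of(1L, "a", 2L, "b", 3L, "c");
        Map<Long, String> other = Map.of(2L, "b", 3L, "x", 4L, "d");

        Map<Long, String> onlyInSelf = minus(self, other);
        Map<Long, String> changed = diff(self, other);
        Map<Long, String> left = leftJoin(self, other, (id, a, o) -> a + o.orElse("?"));
        Map<Long, String> inner = innerJoin(self, other, (id, a, u) -> a + u);

        System.out.println(onlyInSelf); // {1=a}
        System.out.println(changed);    // {3=x}
        System.out.println(left);       // {1=a?, 2=bb, 3=cx}
        System.out.println(inner);      // {2=bb, 3=cx}
    }
}
```

In the real class the zip variants (leftZipJoin, innerZipJoin) are documented as producing the same results as their non-zip counterparts when both sides share an index; the model does not capture that performance distinction.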