public abstract class VertexRDD<VD> extends RDD<scala.Tuple2<Object,VD>>
RDD[(VertexId, VD)] by ensuring that there is only one entry for each vertex and by
pre-indexing the entries for fast, efficient joins. Two VertexRDDs with the same index can be
joined efficiently. All operations except reindex preserve the index. To construct a
VertexRDD, use the VertexRDD object.
Additionally, stores routing information to enable joining the vertex attributes with an
EdgeRDD.
VertexRDD from a plain RDD:
// Construct an initial vertex set
val someData: RDD[(VertexId, SomeType)] = loadData(someFile)
val vset = VertexRDD(someData)
// If there were redundant values in someData we would use a reduceFunc
val vset2 = VertexRDD(someData, reduceFunc)
// Finally we can use the VertexRDD to index another dataset
val otherData: RDD[(VertexId, OtherType)] = loadData(otherFile)
val vset3 = vset2.innerJoin(otherData) { (vid, a, b) => b }
// Now we can construct very fast joins between the two sets
val vset4: VertexRDD[(SomeType, OtherType)] = vset.leftJoin(vset3)
| Constructor and Description |
|---|
VertexRDD(SparkContext sc,
scala.collection.Seq<Dependency<?>> deps) |
| Modifier and Type | Method and Description |
|---|---|
static RDD<T> |
$plus$plus(RDD<T> other) |
static <U> U |
aggregate(U zeroValue,
scala.Function2<U,T,U> seqOp,
scala.Function2<U,U,U> combOp,
scala.reflect.ClassTag<U> evidence$30) |
abstract <VD2> VertexRDD<VD2> |
aggregateUsingIndex(RDD<scala.Tuple2<Object,VD2>> messages,
scala.Function2<VD2,VD2,VD2> reduceFunc,
scala.reflect.ClassTag<VD2> evidence$12)
Aggregates vertices in
messages that have the same ids using reduceFunc, returning a
VertexRDD co-indexed with this. |
static <VD> VertexRDD<VD> |
apply(RDD<scala.Tuple2<Object,VD>> vertices,
scala.reflect.ClassTag<VD> evidence$14)
Constructs a standalone
VertexRDD (one that is not set up for efficient joins with an
EdgeRDD) from an RDD of vertex-attribute pairs. |
static <VD> VertexRDD<VD> |
apply(RDD<scala.Tuple2<Object,VD>> vertices,
EdgeRDD<?> edges,
VD defaultVal,
scala.reflect.ClassTag<VD> evidence$15)
Constructs a
VertexRDD from an RDD of vertex-attribute pairs. |
static <VD> VertexRDD<VD> |
apply(RDD<scala.Tuple2<Object,VD>> vertices,
EdgeRDD<?> edges,
VD defaultVal,
scala.Function2<VD,VD,VD> mergeFunc,
scala.reflect.ClassTag<VD> evidence$16)
Constructs a
VertexRDD from an RDD of vertex-attribute pairs. |
static RDD<T> |
cache() |
static <U> RDD<scala.Tuple2<T,U>> |
cartesian(RDD<U> other,
scala.reflect.ClassTag<U> evidence$5) |
static void |
checkpoint() |
static RDD<T> |
coalesce(int numPartitions,
boolean shuffle,
scala.Option<PartitionCoalescer> partitionCoalescer,
scala.math.Ordering<T> ord) |
static boolean |
coalesce$default$2() |
static scala.Option<PartitionCoalescer> |
coalesce$default$3() |
static scala.math.Ordering<T> |
coalesce$default$4(int numPartitions,
boolean shuffle,
scala.Option<PartitionCoalescer> partitionCoalescer) |
static Object |
collect() |
static <U> RDD<U> |
collect(scala.PartialFunction<T,U> f,
scala.reflect.ClassTag<U> evidence$29) |
scala.collection.Iterator<scala.Tuple2<Object,VD>> |
compute(Partition part,
TaskContext context)
Provides the
RDD[(VertexId, VD)] equivalent output. |
static SparkContext |
context() |
static long |
count() |
static PartialResult<BoundedDouble> |
countApprox(long timeout,
double confidence) |
static double |
countApprox$default$2() |
static long |
countApproxDistinct(double relativeSD) |
static long |
countApproxDistinct(int p,
int sp) |
static double |
countApproxDistinct$default$1() |
static scala.collection.Map<T,Object> |
countByValue(scala.math.Ordering<T> ord) |
static scala.math.Ordering<T> |
countByValue$default$1() |
static PartialResult<scala.collection.Map<T,BoundedDouble>> |
countByValueApprox(long timeout,
double confidence,
scala.math.Ordering<T> ord) |
static double |
countByValueApprox$default$2() |
static scala.math.Ordering<T> |
countByValueApprox$default$3(long timeout,
double confidence) |
static scala.collection.Seq<Dependency<?>> |
dependencies() |
abstract VertexRDD<VD> |
diff(RDD<scala.Tuple2<Object,VD>> other)
For each vertex present in both
this and other, diff returns only those vertices with
differing values; for values that are different, keeps the values from other. |
abstract VertexRDD<VD> |
diff(VertexRDD<VD> other)
For each vertex present in both
this and other, diff returns only those vertices with
differing values; for values that are different, keeps the values from other. |
static RDD<T> |
distinct() |
static RDD<T> |
distinct(int numPartitions,
scala.math.Ordering<T> ord) |
static scala.math.Ordering<T> |
distinct$default$2(int numPartitions) |
VertexRDD<VD> |
filter(scala.Function1<scala.Tuple2<Object,VD>,Object> pred)
Restricts the vertex set to the set of vertices satisfying the given predicate.
|
static T |
first() |
static <U> RDD<U> |
flatMap(scala.Function1<T,scala.collection.TraversableOnce<U>> f,
scala.reflect.ClassTag<U> evidence$4) |
static T |
fold(T zeroValue,
scala.Function2<T,T,T> op) |
static void |
foreach(scala.Function1<T,scala.runtime.BoxedUnit> f) |
static void |
foreachPartition(scala.Function1<scala.collection.Iterator<T>,scala.runtime.BoxedUnit> f) |
static <VD> VertexRDD<VD> |
fromEdges(EdgeRDD<?> edges,
int numPartitions,
VD defaultVal,
scala.reflect.ClassTag<VD> evidence$17)
Constructs a
VertexRDD containing all vertices referred to in edges. |
static scala.Option<String> |
getCheckpointFile() |
static int |
getNumPartitions() |
static StorageLevel |
getStorageLevel() |
static RDD<Object> |
glom() |
static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> |
groupBy(scala.Function1<T,K> f,
scala.reflect.ClassTag<K> kt) |
static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> |
groupBy(scala.Function1<T,K> f,
int numPartitions,
scala.reflect.ClassTag<K> kt) |
static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> |
groupBy(scala.Function1<T,K> f,
Partitioner p,
scala.reflect.ClassTag<K> kt,
scala.math.Ordering<K> ord) |
static <K> scala.runtime.Null$ |
groupBy$default$4(scala.Function1<T,K> f,
Partitioner p) |
static int |
id() |
abstract <U,VD2> VertexRDD<VD2> |
innerJoin(RDD<scala.Tuple2<Object,U>> other,
scala.Function3<Object,VD,U,VD2> f,
scala.reflect.ClassTag<U> evidence$10,
scala.reflect.ClassTag<VD2> evidence$11)
Inner joins this VertexRDD with an RDD containing vertex attribute pairs.
|
abstract <U,VD2> VertexRDD<VD2> |
innerZipJoin(VertexRDD<U> other,
scala.Function3<Object,VD,U,VD2> f,
scala.reflect.ClassTag<U> evidence$8,
scala.reflect.ClassTag<VD2> evidence$9)
Efficiently inner joins this VertexRDD with another VertexRDD sharing the same index.
|
static RDD<T> |
intersection(RDD<T> other) |
static RDD<T> |
intersection(RDD<T> other,
int numPartitions) |
static RDD<T> |
intersection(RDD<T> other,
Partitioner partitioner,
scala.math.Ordering<T> ord) |
static scala.math.Ordering<T> |
intersection$default$3(RDD<T> other,
Partitioner partitioner) |
static boolean |
isCheckpointed() |
static boolean |
isEmpty() |
static scala.collection.Iterator<T> |
iterator(Partition split,
TaskContext context) |
static <K> RDD<scala.Tuple2<K,T>> |
keyBy(scala.Function1<T,K> f) |
abstract <VD2,VD3> VertexRDD<VD3> |
leftJoin(RDD<scala.Tuple2<Object,VD2>> other,
scala.Function3<Object,VD,scala.Option<VD2>,VD3> f,
scala.reflect.ClassTag<VD2> evidence$6,
scala.reflect.ClassTag<VD3> evidence$7)
Left joins this VertexRDD with an RDD containing vertex attribute pairs.
|
abstract <VD2,VD3> VertexRDD<VD3> |
leftZipJoin(VertexRDD<VD2> other,
scala.Function3<Object,VD,scala.Option<VD2>,VD3> f,
scala.reflect.ClassTag<VD2> evidence$4,
scala.reflect.ClassTag<VD3> evidence$5)
Left joins this RDD with another VertexRDD with the same index.
|
static RDD<T> |
localCheckpoint() |
static <U> RDD<U> |
map(scala.Function1<T,U> f,
scala.reflect.ClassTag<U> evidence$3) |
static <U> RDD<U> |
mapPartitions(scala.Function1<scala.collection.Iterator<T>,scala.collection.Iterator<U>> f,
boolean preservesPartitioning,
scala.reflect.ClassTag<U> evidence$6) |
static <U> boolean |
mapPartitions$default$2() |
static <U> boolean |
mapPartitionsInternal$default$2() |
static <U> RDD<U> |
mapPartitionsWithIndex(scala.Function2<Object,scala.collection.Iterator<T>,scala.collection.Iterator<U>> f,
boolean preservesPartitioning,
scala.reflect.ClassTag<U> evidence$9) |
static <U> boolean |
mapPartitionsWithIndex$default$2() |
static <U> boolean |
mapPartitionsWithIndexInternal$default$2() |
abstract <VD2> VertexRDD<VD2> |
mapValues(scala.Function1<VD,VD2> f,
scala.reflect.ClassTag<VD2> evidence$2)
Maps each vertex attribute, preserving the index.
|
abstract <VD2> VertexRDD<VD2> |
mapValues(scala.Function2<Object,VD,VD2> f,
scala.reflect.ClassTag<VD2> evidence$3)
Maps each vertex attribute, additionally supplying the vertex ID.
|
static T |
max(scala.math.Ordering<T> ord) |
static T |
min(scala.math.Ordering<T> ord) |
abstract VertexRDD<VD> |
minus(RDD<scala.Tuple2<Object,VD>> other)
For each VertexId present in both
this and other, minus will act as a set difference
operation returning only those unique VertexId's present in this. |
abstract VertexRDD<VD> |
minus(VertexRDD<VD> other)
For each VertexId present in both
this and other, minus will act as a set difference
operation returning only those unique VertexId's present in this. |
static void |
name_$eq(String x$1) |
static String |
name() |
static scala.Option<Partitioner> |
partitioner() |
static Partition[] |
partitions() |
static RDD<T> |
persist() |
static RDD<T> |
persist(StorageLevel newLevel) |
static RDD<String> |
pipe(scala.collection.Seq<String> command,
scala.collection.Map<String,String> env,
scala.Function1<scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printPipeContext,
scala.Function2<T,scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printRDDElement,
boolean separateWorkingDir,
int bufferSize,
String encoding) |
static RDD<String> |
pipe(String command) |
static RDD<String> |
pipe(String command,
scala.collection.Map<String,String> env) |
static scala.collection.Map<String,String> |
pipe$default$2() |
static scala.Function1<scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> |
pipe$default$3() |
static scala.Function2<T,scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> |
pipe$default$4() |
static boolean |
pipe$default$5() |
static int |
pipe$default$6() |
static String |
pipe$default$7() |
static scala.collection.Seq<String> |
preferredLocations(Partition split) |
static RDD<T>[] |
randomSplit(double[] weights,
long seed) |
static long |
randomSplit$default$2() |
static T |
reduce(scala.Function2<T,T,T> f) |
abstract VertexRDD<VD> |
reindex()
Construct a new VertexRDD that is indexed by only the visible vertices.
|
static RDD<T> |
repartition(int numPartitions,
scala.math.Ordering<T> ord) |
static scala.math.Ordering<T> |
repartition$default$2(int numPartitions) |
abstract VertexRDD<VD> |
reverseRoutingTables()
Returns a new
VertexRDD reflecting a reversal of all edge directions in the corresponding
EdgeRDD. |
static RDD<T> |
sample(boolean withReplacement,
double fraction,
long seed) |
static long |
sample$default$3() |
static void |
saveAsObjectFile(String path) |
static void |
saveAsTextFile(String path) |
static void |
saveAsTextFile(String path,
Class<? extends org.apache.hadoop.io.compress.CompressionCodec> codec) |
static RDD<T> |
setName(String _name) |
static <K> RDD<T> |
sortBy(scala.Function1<T,K> f,
boolean ascending,
int numPartitions,
scala.math.Ordering<K> ord,
scala.reflect.ClassTag<K> ctag) |
static <K> boolean |
sortBy$default$2() |
static <K> int |
sortBy$default$3() |
static SparkContext |
sparkContext() |
static RDD<T> |
subtract(RDD<T> other) |
static RDD<T> |
subtract(RDD<T> other,
int numPartitions) |
static RDD<T> |
subtract(RDD<T> other,
Partitioner p,
scala.math.Ordering<T> ord) |
static scala.math.Ordering<T> |
subtract$default$3(RDD<T> other,
Partitioner p) |
static Object |
take(int num) |
static Object |
takeOrdered(int num,
scala.math.Ordering<T> ord) |
static Object |
takeSample(boolean withReplacement,
int num,
long seed) |
static long |
takeSample$default$3() |
static String |
toDebugString() |
static JavaRDD<T> |
toJavaRDD() |
static scala.collection.Iterator<T> |
toLocalIterator() |
static Object |
top(int num,
scala.math.Ordering<T> ord) |
static String |
toString() |
static <U> U |
treeAggregate(U zeroValue,
scala.Function2<U,T,U> seqOp,
scala.Function2<U,U,U> combOp,
int depth,
scala.reflect.ClassTag<U> evidence$31) |
static <U> int |
treeAggregate$default$4(U zeroValue) |
static T |
treeReduce(scala.Function2<T,T,T> f,
int depth) |
static int |
treeReduce$default$2() |
static RDD<T> |
union(RDD<T> other) |
static RDD<T> |
unpersist(boolean blocking) |
static boolean |
unpersist$default$1() |
abstract VertexRDD<VD> |
withEdges(EdgeRDD<?> edges)
Prepares this VertexRDD for efficient joins with the given EdgeRDD.
|
static <U> RDD<scala.Tuple2<T,U>> |
zip(RDD<U> other,
scala.reflect.ClassTag<U> evidence$10) |
static <B,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
boolean preservesPartitioning,
scala.Function2<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$11,
scala.reflect.ClassTag<V> evidence$12) |
static <B,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
scala.Function2<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$13,
scala.reflect.ClassTag<V> evidence$14) |
static <B,C,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
RDD<C> rdd3,
boolean preservesPartitioning,
scala.Function3<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$15,
scala.reflect.ClassTag<C> evidence$16,
scala.reflect.ClassTag<V> evidence$17) |
static <B,C,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
RDD<C> rdd3,
scala.Function3<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$18,
scala.reflect.ClassTag<C> evidence$19,
scala.reflect.ClassTag<V> evidence$20) |
static <B,C,D,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
RDD<C> rdd3,
RDD<D> rdd4,
boolean preservesPartitioning,
scala.Function4<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$21,
scala.reflect.ClassTag<C> evidence$22,
scala.reflect.ClassTag<D> evidence$23,
scala.reflect.ClassTag<V> evidence$24) |
static <B,C,D,V> RDD<V> |
zipPartitions(RDD<B> rdd2,
RDD<C> rdd3,
RDD<D> rdd4,
scala.Function4<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f,
scala.reflect.ClassTag<B> evidence$25,
scala.reflect.ClassTag<C> evidence$26,
scala.reflect.ClassTag<D> evidence$27,
scala.reflect.ClassTag<V> evidence$28) |
static RDD<scala.Tuple2<T,Object>> |
zipWithIndex() |
static RDD<scala.Tuple2<T,Object>> |
zipWithUniqueId() |
aggregate, cache, cartesian, checkpoint, coalesce, collect, collect, context, count, countApprox, countApproxDistinct, countApproxDistinct, countByValue, countByValueApprox, dependencies, distinct, distinct, doubleRDDToDoubleRDDFunctions, first, flatMap, fold, foreach, foreachPartition, getCheckpointFile, getNumPartitions, getStorageLevel, glom, groupBy, groupBy, groupBy, id, intersection, intersection, intersection, isCheckpointed, isEmpty, iterator, keyBy, localCheckpoint, map, mapPartitions, mapPartitionsWithIndex, max, min, name, numericRDDToDoubleRDDFunctions, partitioner, partitions, persist, persist, pipe, pipe, pipe, preferredLocations, randomSplit, rddToAsyncRDDActions, rddToOrderedRDDFunctions, rddToPairRDDFunctions, rddToSequenceFileRDDFunctions, reduce, repartition, sample, saveAsObjectFile, saveAsTextFile, saveAsTextFile, setName, sortBy, sparkContext, subtract, subtract, subtract, take, takeOrdered, takeSample, toDebugString, toJavaRDD, toLocalIterator, top, toString, treeAggregate, treeReduce, union, unpersist, zip, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipPartitions, zipWithIndex, zipWithUniqueIdpublic VertexRDD(SparkContext sc, scala.collection.Seq<Dependency<?>> deps)
public static <VD> VertexRDD<VD> apply(RDD<scala.Tuple2<Object,VD>> vertices, scala.reflect.ClassTag<VD> evidence$14)
VertexRDD (one that is not set up for efficient joins with an
EdgeRDD) from an RDD of vertex-attribute pairs. Duplicate entries are removed arbitrarily.
vertices - the collection of vertex-attribute pairsevidence$14 - (undocumented)public static <VD> VertexRDD<VD> apply(RDD<scala.Tuple2<Object,VD>> vertices, EdgeRDD<?> edges, VD defaultVal, scala.reflect.ClassTag<VD> evidence$15)
VertexRDD from an RDD of vertex-attribute pairs. Duplicate vertex entries are
removed arbitrarily. The resulting VertexRDD will be joinable with edges, and any missing
vertices referred to by edges will be created with the attribute defaultVal.
vertices - the collection of vertex-attribute pairsedges - the EdgeRDD that these vertices may be joined withdefaultVal - the vertex attribute to use when creating missing verticesevidence$15 - (undocumented)public static <VD> VertexRDD<VD> apply(RDD<scala.Tuple2<Object,VD>> vertices, EdgeRDD<?> edges, VD defaultVal, scala.Function2<VD,VD,VD> mergeFunc, scala.reflect.ClassTag<VD> evidence$16)
VertexRDD from an RDD of vertex-attribute pairs. Duplicate vertex entries are
merged using mergeFunc. The resulting VertexRDD will be joinable with edges, and any
missing vertices referred to by edges will be created with the attribute defaultVal.
vertices - the collection of vertex-attribute pairsedges - the EdgeRDD that these vertices may be joined withdefaultVal - the vertex attribute to use when creating missing verticesmergeFunc - the commutative, associative duplicate vertex attribute merge functionevidence$16 - (undocumented)public static <VD> VertexRDD<VD> fromEdges(EdgeRDD<?> edges, int numPartitions, VD defaultVal, scala.reflect.ClassTag<VD> evidence$17)
VertexRDD containing all vertices referred to in edges. The vertices will be
created with the attribute defaultVal. The resulting VertexRDD will be joinable with
edges.
edges - the EdgeRDD referring to the vertices to createnumPartitions - the desired number of partitions for the resulting VertexRDDdefaultVal - the vertex attribute to use when creating missing verticesevidence$17 - (undocumented)public static scala.Option<Partitioner> partitioner()
public static SparkContext sparkContext()
public static int id()
public static String name()
public static void name_$eq(String x$1)
public static RDD<T> setName(String _name)
public static RDD<T> persist(StorageLevel newLevel)
public static RDD<T> persist()
public static RDD<T> cache()
public static RDD<T> unpersist(boolean blocking)
public static StorageLevel getStorageLevel()
public static final scala.collection.Seq<Dependency<?>> dependencies()
public static final Partition[] partitions()
public static final int getNumPartitions()
public static final scala.collection.Seq<String> preferredLocations(Partition split)
public static final scala.collection.Iterator<T> iterator(Partition split, TaskContext context)
public static <U> RDD<U> map(scala.Function1<T,U> f, scala.reflect.ClassTag<U> evidence$3)
public static <U> RDD<U> flatMap(scala.Function1<T,scala.collection.TraversableOnce<U>> f, scala.reflect.ClassTag<U> evidence$4)
public static RDD<T> distinct(int numPartitions, scala.math.Ordering<T> ord)
public static RDD<T> distinct()
public static RDD<T> repartition(int numPartitions, scala.math.Ordering<T> ord)
public static RDD<T> coalesce(int numPartitions, boolean shuffle, scala.Option<PartitionCoalescer> partitionCoalescer, scala.math.Ordering<T> ord)
public static RDD<T> sample(boolean withReplacement, double fraction, long seed)
public static RDD<T>[] randomSplit(double[] weights, long seed)
public static Object takeSample(boolean withReplacement,
int num,
long seed)
public static <K> RDD<T> sortBy(scala.Function1<T,K> f, boolean ascending, int numPartitions, scala.math.Ordering<K> ord, scala.reflect.ClassTag<K> ctag)
public static RDD<T> intersection(RDD<T> other, Partitioner partitioner, scala.math.Ordering<T> ord)
public static RDD<Object> glom()
public static <U> RDD<scala.Tuple2<T,U>> cartesian(RDD<U> other, scala.reflect.ClassTag<U> evidence$5)
public static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> groupBy(scala.Function1<T,K> f, scala.reflect.ClassTag<K> kt)
public static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> groupBy(scala.Function1<T,K> f, int numPartitions, scala.reflect.ClassTag<K> kt)
public static <K> RDD<scala.Tuple2<K,scala.collection.Iterable<T>>> groupBy(scala.Function1<T,K> f, Partitioner p, scala.reflect.ClassTag<K> kt, scala.math.Ordering<K> ord)
public static RDD<String> pipe(String command)
public static RDD<String> pipe(String command, scala.collection.Map<String,String> env)
public static RDD<String> pipe(scala.collection.Seq<String> command, scala.collection.Map<String,String> env, scala.Function1<scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printPipeContext, scala.Function2<T,scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> printRDDElement, boolean separateWorkingDir, int bufferSize, String encoding)
public static <U> RDD<U> mapPartitions(scala.Function1<scala.collection.Iterator<T>,scala.collection.Iterator<U>> f, boolean preservesPartitioning, scala.reflect.ClassTag<U> evidence$6)
public static <U> RDD<U> mapPartitionsWithIndex(scala.Function2<Object,scala.collection.Iterator<T>,scala.collection.Iterator<U>> f, boolean preservesPartitioning, scala.reflect.ClassTag<U> evidence$9)
public static <U> RDD<scala.Tuple2<T,U>> zip(RDD<U> other, scala.reflect.ClassTag<U> evidence$10)
public static <B,V> RDD<V> zipPartitions(RDD<B> rdd2, boolean preservesPartitioning, scala.Function2<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$11, scala.reflect.ClassTag<V> evidence$12)
public static <B,V> RDD<V> zipPartitions(RDD<B> rdd2, scala.Function2<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$13, scala.reflect.ClassTag<V> evidence$14)
public static <B,C,V> RDD<V> zipPartitions(RDD<B> rdd2, RDD<C> rdd3, boolean preservesPartitioning, scala.Function3<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$15, scala.reflect.ClassTag<C> evidence$16, scala.reflect.ClassTag<V> evidence$17)
public static <B,C,V> RDD<V> zipPartitions(RDD<B> rdd2, RDD<C> rdd3, scala.Function3<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$18, scala.reflect.ClassTag<C> evidence$19, scala.reflect.ClassTag<V> evidence$20)
public static <B,C,D,V> RDD<V> zipPartitions(RDD<B> rdd2, RDD<C> rdd3, RDD<D> rdd4, boolean preservesPartitioning, scala.Function4<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$21, scala.reflect.ClassTag<C> evidence$22, scala.reflect.ClassTag<D> evidence$23, scala.reflect.ClassTag<V> evidence$24)
public static <B,C,D,V> RDD<V> zipPartitions(RDD<B> rdd2, RDD<C> rdd3, RDD<D> rdd4, scala.Function4<scala.collection.Iterator<T>,scala.collection.Iterator<B>,scala.collection.Iterator<C>,scala.collection.Iterator<D>,scala.collection.Iterator<V>> f, scala.reflect.ClassTag<B> evidence$25, scala.reflect.ClassTag<C> evidence$26, scala.reflect.ClassTag<D> evidence$27, scala.reflect.ClassTag<V> evidence$28)
public static void foreach(scala.Function1<T,scala.runtime.BoxedUnit> f)
public static void foreachPartition(scala.Function1<scala.collection.Iterator<T>,scala.runtime.BoxedUnit> f)
public static Object collect()
public static scala.collection.Iterator<T> toLocalIterator()
public static <U> RDD<U> collect(scala.PartialFunction<T,U> f, scala.reflect.ClassTag<U> evidence$29)
public static RDD<T> subtract(RDD<T> other, Partitioner p, scala.math.Ordering<T> ord)
public static T reduce(scala.Function2<T,T,T> f)
public static T treeReduce(scala.Function2<T,T,T> f,
int depth)
public static T fold(T zeroValue,
scala.Function2<T,T,T> op)
public static <U> U aggregate(U zeroValue,
scala.Function2<U,T,U> seqOp,
scala.Function2<U,U,U> combOp,
scala.reflect.ClassTag<U> evidence$30)
public static <U> U treeAggregate(U zeroValue,
scala.Function2<U,T,U> seqOp,
scala.Function2<U,U,U> combOp,
int depth,
scala.reflect.ClassTag<U> evidence$31)
public static long count()
public static PartialResult<BoundedDouble> countApprox(long timeout, double confidence)
public static scala.collection.Map<T,Object> countByValue(scala.math.Ordering<T> ord)
public static PartialResult<scala.collection.Map<T,BoundedDouble>> countByValueApprox(long timeout, double confidence, scala.math.Ordering<T> ord)
public static long countApproxDistinct(int p,
int sp)
public static long countApproxDistinct(double relativeSD)
public static RDD<scala.Tuple2<T,Object>> zipWithIndex()
public static RDD<scala.Tuple2<T,Object>> zipWithUniqueId()
public static Object take(int num)
public static T first()
public static Object top(int num,
scala.math.Ordering<T> ord)
public static Object takeOrdered(int num,
scala.math.Ordering<T> ord)
public static T max(scala.math.Ordering<T> ord)
public static T min(scala.math.Ordering<T> ord)
public static boolean isEmpty()
public static void saveAsTextFile(String path)
public static void saveAsTextFile(String path,
Class<? extends org.apache.hadoop.io.compress.CompressionCodec> codec)
public static void saveAsObjectFile(String path)
public static <K> RDD<scala.Tuple2<K,T>> keyBy(scala.Function1<T,K> f)
public static void checkpoint()
public static RDD<T> localCheckpoint()
public static boolean isCheckpointed()
public static scala.Option<String> getCheckpointFile()
public static SparkContext context()
public static String toDebugString()
public static String toString()
public static JavaRDD<T> toJavaRDD()
public static long sample$default$3()
public static <U> boolean mapPartitionsWithIndex$default$2()
public static boolean unpersist$default$1()
public static scala.math.Ordering<T> distinct$default$2(int numPartitions)
public static boolean coalesce$default$2()
public static scala.Option<PartitionCoalescer> coalesce$default$3()
public static scala.math.Ordering<T> coalesce$default$4(int numPartitions,
boolean shuffle,
scala.Option<PartitionCoalescer> partitionCoalescer)
public static scala.math.Ordering<T> repartition$default$2(int numPartitions)
public static scala.math.Ordering<T> subtract$default$3(RDD<T> other, Partitioner p)
public static scala.math.Ordering<T> intersection$default$3(RDD<T> other, Partitioner partitioner)
public static long randomSplit$default$2()
public static <K> boolean sortBy$default$2()
public static <K> int sortBy$default$3()
public static <U> boolean mapPartitions$default$2()
public static <K> scala.runtime.Null$ groupBy$default$4(scala.Function1<T,K> f,
Partitioner p)
public static scala.collection.Map<String,String> pipe$default$2()
public static scala.Function1<scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> pipe$default$3()
public static scala.Function2<T,scala.Function1<String,scala.runtime.BoxedUnit>,scala.runtime.BoxedUnit> pipe$default$4()
public static boolean pipe$default$5()
public static int pipe$default$6()
public static String pipe$default$7()
public static int treeReduce$default$2()
public static <U> int treeAggregate$default$4(U zeroValue)
public static double countApprox$default$2()
public static scala.math.Ordering<T> countByValue$default$1()
public static double countByValueApprox$default$2()
public static scala.math.Ordering<T> countByValueApprox$default$3(long timeout,
double confidence)
public static long takeSample$default$3()
public static double countApproxDistinct$default$1()
public static <U> boolean mapPartitionsWithIndexInternal$default$2()
public static <U> boolean mapPartitionsInternal$default$2()
public scala.collection.Iterator<scala.Tuple2<Object,VD>> compute(Partition part, TaskContext context)
RDD[(VertexId, VD)] equivalent output.public abstract VertexRDD<VD> reindex()
public VertexRDD<VD> filter(scala.Function1<scala.Tuple2<Object,VD>,Object> pred)
It is declared and defined here to allow refining the return type from RDD[(VertexId, VD)] to
VertexRDD[VD].
public abstract <VD2> VertexRDD<VD2> mapValues(scala.Function1<VD,VD2> f, scala.reflect.ClassTag<VD2> evidence$2)
f - the function applied to each value in the RDDevidence$2 - (undocumented)f to each of the entries in the
original VertexRDDpublic abstract <VD2> VertexRDD<VD2> mapValues(scala.Function2<Object,VD,VD2> f, scala.reflect.ClassTag<VD2> evidence$3)
f - the function applied to each ID-value pair in the RDDevidence$3 - (undocumented)f to each of the entries in the
original VertexRDD. The resulting VertexRDD retains the same index.public abstract VertexRDD<VD> minus(RDD<scala.Tuple2<Object,VD>> other)
this and other, minus will act as a set difference
operation returning only those unique VertexId's present in this.
other - an RDD to run the set operation againstpublic abstract VertexRDD<VD> minus(VertexRDD<VD> other)
this and other, minus will act as a set difference
operation returning only those unique VertexId's present in this.
other - a VertexRDD to run the set operation againstpublic abstract VertexRDD<VD> diff(RDD<scala.Tuple2<Object,VD>> other)
this and other, diff returns only those vertices with
differing values; for values that are different, keeps the values from other. This is
only guaranteed to work if the VertexRDDs share a common ancestor.
other - the other RDD[(VertexId, VD)] with which to diff against.public abstract VertexRDD<VD> diff(VertexRDD<VD> other)
this and other, diff returns only those vertices with
differing values; for values that are different, keeps the values from other. This is
only guaranteed to work if the VertexRDDs share a common ancestor.
other - the other VertexRDD with which to diff against.public abstract <VD2,VD3> VertexRDD<VD3> leftZipJoin(VertexRDD<VD2> other, scala.Function3<Object,VD,scala.Option<VD2>,VD3> f, scala.reflect.ClassTag<VD2> evidence$4, scala.reflect.ClassTag<VD3> evidence$5)
this.
If other is missing any vertex in this VertexRDD, f is passed None.
other - the other VertexRDD with which to join.f - the function mapping a vertex id and its attributes in this and the other vertex set
to a new vertex attribute.evidence$4 - (undocumented)evidence$5 - (undocumented)fpublic abstract <VD2,VD3> VertexRDD<VD3> leftJoin(RDD<scala.Tuple2<Object,VD2>> other, scala.Function3<Object,VD,scala.Option<VD2>,VD3> f, scala.reflect.ClassTag<VD2> evidence$6, scala.reflect.ClassTag<VD3> evidence$7)
leftZipJoin implementation is
used. The resulting VertexRDD contains an entry for each vertex in this. If other is
missing any vertex in this VertexRDD, f is passed None. If there are duplicates,
the vertex is picked arbitrarily.
other - the other VertexRDD with which to joinf - the function mapping a vertex id and its attributes in this and the other vertex set
to a new vertex attribute.evidence$6 - (undocumented)evidence$7 - (undocumented)f.public abstract <U,VD2> VertexRDD<VD2> innerZipJoin(VertexRDD<U> other, scala.Function3<Object,VD,U,VD2> f, scala.reflect.ClassTag<U> evidence$8, scala.reflect.ClassTag<VD2> evidence$9)
innerJoin for the behavior of the join.other - (undocumented)f - (undocumented)evidence$8 - (undocumented)evidence$9 - (undocumented)public abstract <U,VD2> VertexRDD<VD2> innerJoin(RDD<scala.Tuple2<Object,U>> other, scala.Function3<Object,VD,U,VD2> f, scala.reflect.ClassTag<U> evidence$10, scala.reflect.ClassTag<VD2> evidence$11)
innerZipJoin implementation
is used.
other - an RDD containing vertices to join. If there are multiple entries for the same
vertex, one is picked arbitrarily. Use aggregateUsingIndex to merge multiple entries.f - the join function applied to corresponding values of this and otherevidence$10 - (undocumented)evidence$11 - (undocumented)this, containing only vertices that appear in both
this and other, with values supplied by fpublic abstract <VD2> VertexRDD<VD2> aggregateUsingIndex(RDD<scala.Tuple2<Object,VD2>> messages, scala.Function2<VD2,VD2,VD2> reduceFunc, scala.reflect.ClassTag<VD2> evidence$12)
messages that have the same ids using reduceFunc, returning a
VertexRDD co-indexed with this.
messages - an RDD containing messages to aggregate, where each message is a pair of its
target vertex ID and the message datareduceFunc - the associative aggregation function for merging messages to the same vertexevidence$12 - (undocumented)this, containing only vertices that received messages.
For those vertices, their values are the result of applying reduceFunc to all received
messages.public abstract VertexRDD<VD> reverseRoutingTables()
VertexRDD reflecting a reversal of all edge directions in the corresponding
EdgeRDD.