Cloud Computing - RDD SPARK

Lesson Description


Lesson - #1484 Co-Group Function


Spark cogroup Function

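In Spark, the cogroup function operates on two datasets of key-value pairs, say (K, V) and (K, W), and returns a dataset of (K, (Iterable<V>, Iterable<W>)) tuples. Every key that appears in either dataset is present in the result, paired with all of its values from the first dataset and all of its values from the second. This operation is also called groupWith.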
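
The grouping rule described above can be modeled with plain Scala collections, without Spark. This is only a sketch of the semantics, not how Spark implements the operation; the object name `CogroupModel` and its `cogroup` helper are hypothetical names chosen for illustration.

```scala
object CogroupModel {
  // Plain-collection model of cogroup (illustrative only, not Spark's implementation):
  // for every key found in either input, gather that key's values from the
  // left input and from the right input.
  def cogroup[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Map[K, (Seq[V], Seq[W])] = {
    val keys = (left.map(_._1) ++ right.map(_._1)).distinct
    keys.map { k =>
      val vs = left.filter(_._1 == k).map(_._2)   // values for k on the left side
      val ws = right.filter(_._1 == k).map(_._2)  // values for k on the right side
      (k, (vs, ws))
    }.toMap
  }

  def main(args: Array[String]): Unit = {
    val left  = Seq(("A", 1), ("B", 2), ("C", 3))
    val right = Seq(("B", 4), ("E", 5))
    // Sort by key so the printed order is stable.
    cogroup(left, right).toSeq.sortBy(_._1).foreach(println)
    // (A,(List(1),List()))
    // (B,(List(2),List(4)))
    // (C,(List(3),List()))
    // (E,(List(),List(5)))
  }
}
```

A key that exists on only one side still appears in the result, with an empty collection for the other side.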

Example of Cogroup function
  • In this example, we perform the groupWith operation.

To open Spark in Scala mode, run the following command.
$ spark-shell



  • Create an RDD using a parallelized collection.

scala> val data1 = sc.parallelize(Seq(("A",1),("B",2),("C",3)))

  • Now, we can view the generated result by using the following command.

scala> data1.collect



  • Create another RDD using a parallelized collection.

scala> val data2 = sc.parallelize(Seq(("B",4),("E",5)))

  • Now, we can view the generated result by using the following command.

scala> data2.collect



  • Apply the cogroup() function to group the values.

scala> val cogroupfunc = data1.cogroup(data2)

  • Now, we can view the generated result by using the following command.

scala> cogroupfunc.collect
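
The collected result holds Iterable values, and the order of keys can vary between runs. As a minimal sketch, the standalone program below repeats the steps of this lesson, converts each Iterable to a List, and sorts by key so the output is easier to read. The object name `CogroupExample`, the `run` helper, the application name, and the `local[*]` master are assumptions made for illustration; inside spark-shell the `sc` value already exists and no session needs to be built.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.SparkSession

object CogroupExample {
  // Rebuilds the two RDDs from this lesson, cogroups them, and returns the
  // result with plain Lists, sorted by key.
  def run(sc: SparkContext): Array[(String, (List[Int], List[Int]))] = {
    val data1 = sc.parallelize(Seq(("A", 1), ("B", 2), ("C", 3)))
    val data2 = sc.parallelize(Seq(("B", 4), ("E", 5)))

    data1.cogroup(data2)                       // groupWith(data2) is an alias
      .mapValues { case (left, right) => (left.toList, right.toList) }
      .sortByKey()
      .collect()
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed setup).
    val spark = SparkSession.builder()
      .appName("cogroup-example")
      .master("local[*]")
      .getOrCreate()

    run(spark.sparkContext).foreach(println)
    // (A,(List(1),List()))
    // (B,(List(2),List(4)))
    // (C,(List(3),List()))
    // (E,(List(),List(5)))

    spark.stop()
  }
}
```

Only key B occurs in both RDDs, so it is the only entry with values on both sides; A and C come from data1 alone and E from data2 alone.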