Group DataFrame elements in Scala

Sep 5, 2018 | Apache Spark, Big Data, Scala-example

Example: Grouping data in a simple way

In this example, the people table is grouped by surname and the number of rows per surname is counted.

df.groupBy("surname").count().show()
+-------+-----+
|surname|count|
+-------+-----+
| Martin|    1|
| Garcia|    3|
+-------+-----+
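
The snippet above assumes a DataFrame named df already exists. As a minimal sketch, it could be built as follows and pasted into spark-shell. The first names, the "name" column and the application name are hypothetical, chosen only so that the surnames match the output shown above. The order of the rows printed by show() is not guaranteed.

```scala
import org.apache.spark.sql.SparkSession

// Local session for experimenting; in spark-shell this returns the existing session.
val spark = SparkSession.builder()
  .appName("group-by-surname") // hypothetical application name
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data: three people named Garcia and one named Martin.
val df = Seq(
  ("Ana", "Garcia"),
  ("Luis", "Garcia"),
  ("Marta", "Garcia"),
  ("Paul", "Martin")
).toDF("name", "surname")

// Group by surname and count the rows in each group.
df.groupBy("surname").count().show()
```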

Example: Grouping data combined with a filter

In this example, the people table is grouped by surname and only the surnames that appear more than 2 times are selected.

df.groupBy("surname").count().filter("count > 2").show()
+-------+-----+
|surname|count|
+-------+-----+
| Garcia|    3|
+-------+-----+
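
The filter can also be written with a Column expression instead of a SQL string, which some people find easier to compose. The following is a sketch under the same assumptions as the rest of the post: the sample rows, the "name" column and the application name are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// Local session for experimenting; in spark-shell this returns the existing session.
val spark = SparkSession.builder()
  .appName("group-and-filter") // hypothetical application name
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data: three people named Garcia and one named Martin.
val df = Seq(
  ("Ana", "Garcia"),
  ("Luis", "Garcia"),
  ("Marta", "Garcia"),
  ("Paul", "Martin")
).toDF("name", "surname")

// count() adds a column literally named "count" to the grouped result.
val counts = df.groupBy("surname").count()

// Keep only the surnames that appear more than 2 times.
val frequent = counts.filter(col("count") > 2)
frequent.show()
```

Here filter receives a Column built with col("count") > 2, so a typo in the operator becomes a compile error instead of a SQL parsing error at runtime. A misspelled column name is still only detected when Spark analyzes the query.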
