In Spark, you will often work with an RDD of key-value pairs stored as two-tuples. A common operation on such an RDD is summing all the values for each key; call this operation sumByKey. Use the appropriate reduce-like RDD method to sum the values by key.
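To make the expected result concrete, here is a minimal sketch in plain Python, without Spark, of what summing by key computes. The helper name `sum_by_key` and the sample pairs are hypothetical and chosen only for illustration. On a real PySpark RDD of two-tuples, the equivalent per-key reduction would likely be `rdd.reduceByKey(add)`, which merges values sharing a key with an associative, commutative function.

```python
from operator import add


def sum_by_key(pairs, combine=add):
    """Local stand-in for a per-key reduce: merge values that share a key.

    `pairs` is any iterable of (key, value) two-tuples; `combine` should be
    associative and commutative, as a distributed per-key reduce requires.
    """
    totals = {}
    for key, value in pairs:
        # The first value seen for a key is stored as-is;
        # later values for that key are combined into the running total.
        totals[key] = combine(totals[key], value) if key in totals else value
    return totals


# Hypothetical sample data: (key, value) two-tuples.
pairs = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("c", 5)]
result = sum_by_key(pairs)
print(sorted(result.items()))  # [('a', 4), ('b', 6), ('c', 5)]
```

Unlike this local loop, Spark combines values within each partition first and then across partitions, which is why the combining function must give the same answer regardless of grouping and order.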