Welcome to the Accelerated Spark Rally  

The GPU Accelerated Spark Rally is designed to get you, a member of the Apache Spark community, developing on the POWER8 platform, combining today's most popular distributed big-data analytics framework with OpenPOWER GPU acceleration.

Multiple possibilities to choose from:
A. Port an existing GPU accelerated application to Spark

Do you have an existing application that already uses GPUs for acceleration, but it's confined to a single machine?  

  • Port the application to leverage Spark as the distributed computing framework, scaling it out while still using GPU acceleration.
  • Demonstrate the original (non-Spark) GPU application running on POWER8.
  • Then demonstrate the Spark-enabled application scaling to a second instance while still exploiting GPU acceleration on both nodes.  

You may want to leverage this Spark GPUEnabler Package (http://spark-packages.org/package/ibmsoe/GPUEnabler).
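
If you take option A, the heart of the port is usually moving from "one process, one GPU" to "one Spark partition per GPU-equipped executor." Here is a minimal sketch of that pattern in Scala, assuming your existing GPU routine can be called from the JVM (for example through a JNI or JCuda wrapper); NativeGpuKernel.process below is a hypothetical placeholder for that routine, not part of any library:

  import org.apache.spark.{SparkConf, SparkContext}

  // Hypothetical stand-in for the existing single-machine GPU routine
  // (e.g. a JNI/JCuda wrapper around your CUDA kernel).
  object NativeGpuKernel {
    def process(batch: Array[Float]): Array[Float] = {
      // The real port would copy `batch` to the local GPU, launch the
      // kernel, and copy the result back; this placeholder just doubles.
      batch.map(_ * 2.0f)
    }
  }

  object PortedGpuApp {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(new SparkConf().setAppName("PortedGpuApp"))

      // Split the input so each executor (one per node) gets a slice
      // it can feed to its own GPU.
      val input = sc.parallelize(1 to 1000000, numSlices = 2).map(_.toFloat)

      // Each partition is processed as one batch on that executor's GPU.
      val result = input.mapPartitions { iter =>
        NativeGpuKernel.process(iter.toArray).iterator
      }

      println(result.take(5).mkString(", "))
      sc.stop()
    }
  }

Running this with one executor on each of 2 GPU-equipped POWER8 nodes gives the "scaling to a second instance" demonstration described above.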

B. Create a new Spark application that uses GPU acceleration, or add GPU acceleration to an existing Spark application.
  • Demonstrate the Spark application scaling across 2 nodes, first running without GPU acceleration.
  • Then demonstrate the gains achieved after adding GPU acceleration to the Spark application.  

You may want to leverage this Spark GPUEnabler Package (http://spark-packages.org/package/ibmsoe/GPUEnabler).
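
If you use GPUEnabler, the package exposes GPU-backed variants of the usual RDD operations. The fragment below is a rough sketch based on the usage pattern in the package's own example code; the class and method names (CUDAFunction, mapExtFunc) and the kernel name are assumptions taken from that documentation, so verify them against the version you actually install:

  import org.apache.spark.{SparkConf, SparkContext}
  // Provided by the GPUEnabler package (assumed names, per its examples).
  import com.ibm.gpuenabler.CUDARDDImplicits._
  import com.ibm.gpuenabler.CUDAFunction

  object GpuEnablerSketch {
    def main(args: Array[String]): Unit = {
      val sc = new SparkContext(new SparkConf().setAppName("GpuEnablerSketch"))

      // PTX compiled from your CUDA kernel, shipped on the classpath.
      val ptxURL = getClass.getResource("/myKernels.ptx")

      // Bind a named kernel in the PTX to a Spark map operation.
      val mapFunction = new CUDAFunction(
        "multiplyBy2",   // kernel name inside the PTX
        Array("this"),   // input columns
        Array("this"),   // output columns
        ptxURL)

      val data = sc.parallelize(1 to 1000000, 2)

      // mapExtFunc runs the kernel on the GPU where one is available; the
      // Scala lambda is the equivalent CPU computation.
      val doubled = data.mapExtFunc((x: Int) => 2 * x, mapFunction)

      println(s"count = ${doubled.count()}")
      sc.stop()
    }
  }

This covers the "with GPU acceleration" half of option B; the baseline run is the same job with a plain map instead of mapExtFunc.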

C. Use GPUs to accelerate Spark itself
  • For example, modify Spark to enable GPU acceleration of a Spark MLlib algorithm, Spark GraphX algorithm, Spark SQL queries, or any other core function of Spark you can think of to accelerate with a GPU.  
  • Demonstrate the speed-up provided transparently to a Spark application that uses the functions you accelerated, scaling across 2 nodes (show the results before and after; see the timing sketch below).  

Feel free to start with some of the examples and work shown here (https://github.com/IBMSparkGPU).
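
Whichever option you choose, the judging criteria below ask for before-and-after numbers across 2 nodes. Here is a minimal sketch of a timing harness; runOnCpu and runOnGpu are hypothetical names standing in for your own baseline and GPU-accelerated code paths:

  import org.apache.spark.rdd.RDD

  object SpeedupHarness {
    // Time a single run and print the elapsed wall-clock seconds.
    def time[T](label: String)(body: => T): T = {
      val start = System.nanoTime()
      val result = body
      val seconds = (System.nanoTime() - start) / 1e9
      println(f"$label%-14s $seconds%.3f s")
      result
    }

    // Run both code paths on the same cached input and compare results.
    def report(input: RDD[Float],
               runOnCpu: RDD[Float] => Double,
               runOnGpu: RDD[Float] => Double): Unit = {
      input.cache().count()   // materialize once so both runs see warm data
      val cpu = time("CPU baseline")(runOnCpu(input))
      val gpu = time("GPU version")(runOnGpu(input))
      println(s"results agree: ${math.abs(cpu - gpu) < 1e-6}")
    }
  }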

 

Get started

To get started with any of the above, log in to SuperVessel and spin up Spark instances with GPUs (Spark, CUDA, compilers, and an IPython notebook all preinstalled). Review our Resources page for getting-started tips and tutorials.

The OpenPOWER Developer Challenge judges will evaluate your solution based on the following criteria:

  • How well you've demonstrated scaling across 2 nodes with Spark
  • Technical innovation demonstrated by your solution
  • Relative speed-up achieved with GPUs
  • Expect to be rewarded if you tackle the 'C' option.
  • Expect to be further rewarded if you combine two or more of 'A', 'B', and 'C' in your solution.  
  • Expect to be crazy rewarded if you combine the Accelerated Spark Rally with the Cognitive Cup Deep Learning Challenges (i.e., distributed GPU-accelerated deep learning leveraging Spark).

 

Check back frequently for updates on the parameters of the Accelerated Spark Rally.