This repository was archived by the owner on Dec 18, 2019. It is now read-only.

Getting Started

Matt edited this page Jan 11, 2015 · 8 revisions

Prerequisites

  • Java
  • JRuby
  • Cascading and Hadoop on your CLASSPATH

Installation

To make the 'cascading' gem available to your JRuby scripts, install it as usual:

jruby -S gem install cascading.jruby

However, this does not download the dependencies you need (Cascading and Hadoop), and it does not let you run the samples, which live in the cascading.jruby repository but are not packaged with the gem. The following sections will get you set up to run the samples locally.

Samples

Prerequisites

  • Java
  • Ant
  • JRuby
  • Bundler
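
Before going further, it can help to confirm that the command-line prerequisites are actually reachable. The following is a minimal sketch, not part of cascading.jruby: `which` and `missing_tools` are hypothetical helper names, and the check assumes a Unix-like system where the tools are plain executables named `java`, `ant`, and `jruby` on your PATH. Bundler is left out because it is normally invoked through `jruby -S bundle` rather than found directly on PATH.

```ruby
# Return the full path of an executable found on PATH, or nil if absent.
# (Hypothetical helper; assumes Unix-style executables without extensions.)
def which(cmd, path = ENV.fetch('PATH', ''))
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, cmd)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Hypothetical helper: list the commands in `names` that are not on PATH.
def missing_tools(names, path = ENV.fetch('PATH', ''))
  names.reject { |name| which(name, path) }
end

missing = missing_tools(%w[java ant jruby])
if missing.empty?
  puts 'java, ant, and jruby were all found on PATH'
else
  puts "Not found on PATH: #{missing.join(', ')}"
end
```

The script only reports what it finds; it does not install anything or check versions.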

From a clone of the cascading.jruby repository, install the development dependencies with Bundler, which will also allow you to run the samples:

jruby -S bundle install

Running the Samples

The cascading.jruby repository comes with a fairly extensive set of example jobs that do not ship with the gem.

You can run them all with the command below. The first run downloads the Cascading and Hadoop jars into build/lib, a one-time process:

jruby -S bundle exec rake samples

Or run one individually, after the command above has been run at least once to fetch the dependencies:

./samples/group_by.rb

Given all this setup, you can finally paste the word count example from the README into a file named "wordcount.rb" and run it like this:

 jruby -J-cp "build/lib/*" wordcount.rb README.md && less output/wordcount

The salient point is that once all the dependencies you need are on the CLASSPATH, cascading.jruby scripts behave like any other JRuby script.
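
As an alternative to passing `-J-cp` on the command line, a script can put the jars on the classpath itself, since JRuby adds a jar to the classpath when you `require` it. The sketch below is one possible way to do that, assuming the jars were downloaded into build/lib as described above; `jars_in` and `load_jars` are hypothetical helper names, not part of cascading.jruby.

```ruby
# Collect the .jar files directly inside a directory, sorted for a stable order.
def jars_in(dir)
  Dir.glob(File.join(dir, '*.jar')).sort
end

# Under JRuby, requiring a .jar adds it to the classpath.
# On other Ruby implementations this only returns the list of jars found.
def load_jars(dir)
  jars = jars_in(dir)
  jars.each { |jar| require File.expand_path(jar) } if RUBY_PLATFORM == 'java'
  jars
end

# Assumption: build/lib is relative to the current working directory.
jars = load_jars('build/lib')
puts "Found #{jars.size} jar(s) in build/lib"
```

With something like this at the top of wordcount.rb, before the 'cascading' gem is required, the script may be runnable as plain `jruby wordcount.rb README.md`. Whether every dependency loads cleanly this way has not been verified here, so the `-J-cp` form remains the documented approach.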

