
[cascading3] Migrate core, commons and related #1521

Merged
rubanm merged 12 commits into cascading3 from rubanm/cascading3/core
Apr 13, 2016
Conversation

@rubanm
Contributor

@rubanm rubanm commented Feb 19, 2016

part of #1465
based on Cyrille's work in #1446

Most of the interesting changes are in:

  • Operations.scala -- handles both the old and the new Cascading AggregateBy thresholds
  • PlatformTest.scala -- some updated tests; hash-joining and then merging the result with one side of the same join is no longer supported in cascading3

Cascading fabric selection changes will be sent in a separate PR.

@johnynek
Contributor

@cchepelov take a look?

Comment thread build.sbt Outdated
"com.twitter" %% "algebird-core" % algebirdVersion,
"com.twitter" %% "chill" % chillVersion,
"com.twitter.elephantbird" % "elephant-bird-cascading2" % elephantbirdVersion,
"com.twitter.elephantbird" % "elephant-bird-cascading3" % elephantbirdVersion,
Contributor


Can we move this to where the versions are (e.g. an elephant-bird-artifact setting), so we can keep all these switches in one place?
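One possible reading of this suggestion, as a sketch only: define the artifact name next to the version values so the cascading2/cascading3 switch is made in a single spot. The value name `elephantbirdCascadingArtifact` and the version string are placeholders introduced for illustration; neither comes from the actual build.sbt.

```scala
// build.sbt (sketch). The names and the version string below are
// placeholders, not values taken from the real build.
val elephantbirdVersion = "x.y.z" // placeholder; the real version is not shown in this thread

// Hypothetical setting: the cascading2/cascading3 switch lives next to the version.
val elephantbirdCascadingArtifact = "elephant-bird-cascading3"

libraryDependencies +=
  "com.twitter.elephantbird" % elephantbirdCascadingArtifact % elephantbirdVersion
```

With this layout, moving between Cascading lines would mean changing one value rather than editing each dependency list.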

@johnynek
Contributor

Does this pass e2e tests or CI at Twitter?

@cwensel

cwensel commented Feb 19, 2016

If we can get an isolated Cascading3 test case we can take a stab at promoting this from 'no longer supported' to 'bug' and then to 'resolved'.

@cchepelov
Contributor

Hi @posco @rubanm
Great to see a lot of progress! Will have to come back to this next week (away from keyboard this week).

Re. the spurious ".forceToDisk": indeed, the code should do the right thing without it. The transform facility @cwensel wrote about looks like the right place to insert the necessary Boundaries.

  -- Cyrille

On 19 Feb 2016 at 19:26, "P. Oscar Boykin" notifications@github.com wrote:

@cchepelov take a look?



rubanm and others added 4 commits March 2, 2016 08:28
Hadoop's -libjars doesn't support wildcards, and with large classpaths it's easy to exhaust the maximum argument length on Linux/OS X when running commands. This acts as a filter above our interaction with the generic options parser to expand wildcards.
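To illustrate the idea in the commit message above, here is a minimal sketch of expanding wildcards in the `-libjars` value before the arguments reach the options parser. The object name `LibJarsExpander` and its methods are hypothetical and are not the code from this PR; the sketch assumes that an entry ending in `/*` means "every .jar file directly inside that directory".

```scala
import java.io.File

// Hypothetical helper, not the implementation from this PR.
object LibJarsExpander {
  // Expand one -libjars entry: "dir/*" becomes every .jar directly inside dir;
  // any other entry is passed through unchanged.
  def expandEntry(entry: String): Seq[String] =
    if (entry.endsWith("/*")) {
      val dir = new File(entry.dropRight(2))
      // listFiles returns null for a missing or unreadable directory.
      val files = Option(dir.listFiles()).getOrElse(Array.empty[File])
      files
        .filter(f => f.isFile && f.getName.endsWith(".jar"))
        .map(_.getPath)
        .sorted
        .toSeq
    } else Seq(entry)

  // Rewrite the value that follows each "-libjars" flag; leave other args untouched.
  def expand(args: Array[String]): Array[String] = {
    val out = Array.newBuilder[String]
    var i = 0
    while (i < args.length) {
      out += args(i)
      if (args(i) == "-libjars" && i + 1 < args.length) {
        out += args(i + 1).split(",").flatMap(expandEntry).mkString(",")
        i += 1 // the flag's value has been consumed
      }
      i += 1
    }
    out.result()
  }
}
```

A caller could then pass a short `-libjars lib/*` on the command line and hand `LibJarsExpander.expand(args)` to the parser, so the long jar list never appears in the shell command itself.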
@johnynek
Contributor

@cwensel about the repro: It should be as easy as a cascading HashJoin followed by Merge followed by GroupBy. Sorry kind of swamped...

@rubanm
Contributor Author

rubanm commented Apr 11, 2016

@johnynek This branch now passes e2e tests at Twitter (with a related EB change twitter/elephant-bird#465). I'm working on piloting some user jobs.

@sriramkrishnan
Contributor

@rubanm this is pretty amazing work!

@johnynek
Contributor

Amazing!

@johnynek
Contributor

looks good to me to merge into cascading3 branch.

does this have all the changes from current develop branch?

@rubanm
Contributor Author

rubanm commented Apr 12, 2016

@johnynek Thanks for the review! RC6 is currently being released to twitter source. I plan to merge develop once that release is done so it's in tandem, with the joinWithTiny fix to follow.

@johnynek
Contributor

@rubanm sounds good. Way to push through on this!

