@flahertyb

Bart Flaherty added 2 commits April 18, 2014 10:48
prints s3 logs for a shell command activity
or given no argument prints all s3 logs
@flahertyb
Author

@mattgillooly
Contributor

@flahertyb - can you handle this exception more gracefully?

When running this without a ~/.aws-sdk config file...

$ bundle exec ./bin/pipely -p df-02732473HJ2DNSBINI35 -s SecondLongTask
/Users/mattgillooly/swipely/pipely/lib/pipely/aws_client.rb:76:in `initialize': No such file or directory @ rb_sysopen - /Users/mattgillooly/.aws-sdk (Errno::ENOENT)
    from /Users/mattgillooly/swipely/pipely/lib/pipely/aws_client.rb:76:in `open'
    from /Users/mattgillooly/swipely/pipely/lib/pipely/aws_client.rb:76:in `configure'
    from /Users/mattgillooly/swipely/pipely/lib/pipely/aws_client.rb:8:in `initialize'
    from /Users/mattgillooly/swipely/pipely/lib/pipely/actions/list_log_paths.rb:34:in `new'
    from /Users/mattgillooly/swipely/pipely/lib/pipely/actions/list_log_paths.rb:34:in `data_pipeline'
    from /Users/mattgillooly/swipely/pipely/lib/pipely/actions/list_log_paths.rb:30:in `log_paths_for_object'
    from /Users/mattgillooly/swipely/pipely/lib/pipely/actions/list_log_paths.rb:17:in `execute'
    from ./bin/pipely:8:in `<main>'
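
One way the missing-file case could be handled is sketched below. It assumes the config is a YAML file; `load_credentials`, `MissingConfigError`, and the key names in the message are hypothetical and not taken from pipely's code.

```ruby
require 'yaml'

# Hypothetical error class; pipely may name or structure this differently.
class MissingConfigError < StandardError; end

# Load credentials from a YAML file, turning a missing file into a
# readable error instead of a raw Errno::ENOENT stack trace.
def load_credentials(path)
  YAML.safe_load(File.read(path)) || {}
rescue Errno::ENOENT
  raise MissingConfigError,
        "No config file found at #{path}. Create it with your AWS " \
        "credentials, for example:\n" \
        "  access_key_id: YOUR_KEY\n" \
        "  secret_access_key: YOUR_SECRET"
end
```

The executable could then rescue `MissingConfigError`, print the message with `warn`, and exit with a non-zero status rather than dumping a backtrace.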

@mattgillooly
Contributor

Additionally, there should be instructions for configuring the credentials, and the config file should be named after pipely, not aws-sdk.

@mattgillooly
Contributor

It may be helpful to look at how other gems handle this type of config.

Fog: https://github.com/fog/fog-core/blob/master/lib/fog/core/credentials.rb#L44
RSpec: https://github.com/rspec/rspec-core/blob/master/lib/rspec/core/configuration.rb#L11

I like how RSpec lets you set the same core settings through any combination of a .rspec file, command-line switches, and configuration in Ruby. We should sketch out a README showing how this would ideally look for Pipely.
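
A rough sketch of that layering, with built-in defaults overridden by a dotfile and the dotfile overridden by command-line switches. Only the `-p` switch appears in this thread; the `region` setting, the `-r` switch, the YAML format, and all method names are assumptions made for illustration.

```ruby
require 'optparse'
require 'yaml'

# Built-in defaults; the keys here are illustrative, not pipely's real settings.
DEFAULTS = { 'region' => 'us-east-1' }.freeze

# Read settings from a YAML dotfile, or return {} when the file is absent.
def file_settings(path)
  File.exist?(path) ? (YAML.safe_load(File.read(path)) || {}) : {}
end

# Parse command-line switches into a hash containing only the flags given.
def cli_settings(argv)
  settings = {}
  OptionParser.new do |opts|
    opts.on('-p', '--pipeline-id ID') { |v| settings['pipeline_id'] = v }
    opts.on('-r', '--region REGION') { |v| settings['region'] = v }
  end.parse(argv)
  settings
end

# Later sources override earlier ones: defaults < dotfile < switches.
def resolve_settings(argv, dotfile_path)
  DEFAULTS.merge(file_settings(dotfile_path)).merge(cli_settings(argv))
end
```

Because each source is just a hash, a Ruby configuration block could be added as one more `merge` in the chain.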

@mattgillooly
Contributor

This git doc is a good example of how we might want to explain Pipely's subcommands.

http://git-scm.com/book/en/Git-Basics-Getting-a-Git-Repository

now ConfigureAws looks at the .pipely file
The Api classes are singletons that mixin ConfigureAws so we aren't making
repeated calls to configure/create the aws-sdk client objects
Component, Instance, and Pipeline, don't inherit from anything
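
The singleton-plus-mixin arrangement described in that commit might look roughly like the sketch below. The `ConfigureAws` module here is a simplified stand-in, `DataPipelineApi` and `build_client` are hypothetical names, and a plain `Object.new` takes the place of a real aws-sdk client.

```ruby
require 'singleton'

# Hypothetical stand-in for the ConfigureAws mixin; the real module reads
# the .pipely file and sets up aws-sdk, which is omitted here.
module ConfigureAws
  def client
    # Memoize so the client is built once per including object.
    @client ||= build_client
  end
end

# Hypothetical Api class; Object.new stands in for an aws-sdk client.
class DataPipelineApi
  include Singleton
  include ConfigureAws

  attr_reader :build_count

  private

  def build_client
    @build_count = (@build_count || 0) + 1
    Object.new
  end
end
```

Every caller that goes through `DataPipelineApi.instance.client` gets the same object, so the client is configured only once per process.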
@flahertyb
Author

Ok, the interfaces are a little messy at the moment, but now when you run with the -s option, pipely will (a) try to print the log URLs from the stderr and stdout fields, and (b) try to print some info about the corresponding EMR step, including its start time and end time.

bundle exec pipely -p df-0894454CJVAKJ1CNGFE -s GenerateServerItemSalesByTicket
gives you

Log paths for object:
#<AWS::DataPipeline::Errors::InvalidRequestException: Invalid expression: Unable to resolve stderr for object:@GenerateServerItemSalesByTicket_2014-04-22T20:25:24>
Can't find log paths for @GenerateServerItemSalesByTicket_2014-04-22T20:25:24
nil

EMR step for object:
{:id=>"s-2AYLM8I1998T5",
 :name=>"Step.c7ca9052-4d2f-4ec5-a4b1-439688800e65",
 :status=>
  {:state=>"COMPLETED",
   :state_change_reason=>{},
   :timeline=>
    {:creation_date_time=>2014-04-22 16:33:46 -0400,
     :start_date_time=>2014-04-22 16:33:49 -0400,
     :end_date_time=>2014-04-22 17:03:16 -0400}}}

It doesn't handle multiple attempts yet; it just matches on the streaming Hadoop call.

@mattgillooly
Contributor

Neat! I will use this tonight!

@flahertyb
Author

Pipely can now match an attempt to an EMR step. Running with the -s option and a component name like this:

bundle exec pipely -p df-0894454CJVAKJ1CNGFE -s GenerateServerItemSalesByTicket

now gives you a list of EMR steps mapped to each attempt for the active instance of that component.

I was unable to match based on start time and end time, because the values are all the same from attempt to attempt. I instead had to search for the step name in the errorMessage field on the attempt.
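
A simplified sketch of that matching, assuming attempts and steps have already been fetched and reduced to plain hashes. The `'id'` key on attempts and the method name `steps_by_attempt` are hypothetical; the real objects returned by aws-sdk may be shaped differently.

```ruby
# Map each attempt to the EMR step whose name appears in the attempt's
# errorMessage field. An attempt with no matching step maps to nil.
def steps_by_attempt(attempts, steps)
  attempts.each_with_object({}) do |attempt, mapping|
    # to_s guards against attempts that have no errorMessage at all.
    message = attempt['errorMessage'].to_s
    mapping[attempt['id']] = steps.find { |step| message.include?(step[:name]) }
  end
end
```

Matching on the step name avoids relying on the start and end times, which do not distinguish one attempt from another.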

Additionally, this should now work even if you have multiple EMR Clusters in your pipeline.

Next up: getting the EMR logs for those attempts, then taking a step back and hacking on a README.
