Skip to content

Conversation

@nmaludy
Copy link
Member

@nmaludy nmaludy commented May 22, 2018

Fix for #4139

This adds recursion into the secret masking process so that configs with nested dicts and lists will also have their secrets masked out for API calls.

TODO

  • Unit tests
  • Test various combinations of dicts/lists/other
  • Find other edge cases

@nmaludy
Copy link
Member Author

nmaludy commented May 22, 2018

@Kami Here's an initial workup of a fix. Initial testing looks good (used the email pack schema as a reference https://github.com/StackStorm-Exchange/stackstorm-email/blob/master/config.schema.yaml).

I still need to develop test cases, but would appreciate comments early if you have time.

@Kami Kami self-assigned this May 22, 2018
@Kami Kami added this to the 2.8.0 milestone May 22, 2018
@Kami
Copy link
Member

Kami commented May 22, 2018

Thanks, I will have a look tomorrow.

And yeah, I agree - we need a good and thorough test coverage for that.

Another thing we need to keep in mind is to try to make this function as performance efficient as possible (bail early if there are no secret params, only iterate over items which could contain secret params based on the schema, etc.).

If we don't do that and just blindly recursively iterate over nested objects, it has a potential to significantly slow API responses for all the endpoints with deeply nested objects (executions, etc.).

@nmaludy
Copy link
Member Author

nmaludy commented May 23, 2018

An idea of how to speed this up (theoretically) we will need to measure to determine if it's actually faster:

In the current implementation we call get_secret_parameters() which fully traverses the entire JSON schema looking for the secret property. Next we call mask_secret_parameters which traverses the object, but only for the secret_parameters passed in.

To potentially speed this up, we could potentially skip traversing parts of the JSON schema if the input data is missing those properties.

Example: if we have a schema with a nested dictionary

{
    'arg_string': {
        'description': 'Junk',
        'type': 'string',
    }
    'arg_optional_object': {
        'description': 'Mirror',
        'type': 'object',
        'properties': {
            'arg_nested_object': {
                'description': 'Mirror mirror',
                'type': 'object',
                'properties': {
                    'arg_double_nested_secret': {
                        'description': 'Deep, deep down',
                        'type': 'string',
                        'secret': True
                    }
                }
            },
            'arg_nested_secret': {
                'description': 'Deep down',
                'type': 'string',
                'secret': True
            }
        }
    }
}

And we pass in data:

{'arg_string': 'test'}

Then we can skip traversing down into the arg_optional_object schema since the input data doesn't have this property, saving us time/cycles.

To accomplish this we would need to combine the two functions get_secret_parameters() and mask_secret_parameters() into one. This function would take in both the schema object and the data object and perform traversals of the data + schema simultaneously.

@nmaludy
Copy link
Member Author

nmaludy commented May 23, 2018

Additional thing to look at is the copy.deepcopy() when calling this recursively we can end up using a ton of memory. Need to figure out a memory-efficient way of doing mask_secret_parameters()

@Kami
Copy link
Member

Kami commented May 23, 2018

As far as deep copy goes - yeah, ideally we would create a deep copy of the original object just once and then modify it in subsequent recursive calls (instead of creating a copy during each recursion which as you have said, is problematic).

And in general, I like to avoid recursion where possible because of issues like that, but here it should be fine, if we only create a deep copy once and manipulate that original object in place in subsequent recursive calls.

@nmaludy
Copy link
Member Author

nmaludy commented May 24, 2018

@Kami Think this is ready for a review, all unit tests are passing on my local box and with the fix detailed below they should pass in travis.

FYI i ran into a "random failure" in the st2client/tests/unit/test_inquiry.py suite where random UUIds were getting generated and causing the count() function to return too many values in the CLI list output.

# when inspecting the inquiry list output for test:
# test_list_inquiries_limit()
while '1440' in newinquiry['id']:
newinquiry['id'] = str(uuid.uuid4())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this intermediate test failure 👍

elif is_list:
iterator = enumerate(secret_parameters)
else:
return MASKED_ATTRIBUTE_VALUE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fine or we could also throw because an unsupported type was passed it :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that and we run into a couple weird scenarios where we might pass in an OrderedDict or a MongoEngineDict. As long as the the behavior of the type is compatible, then we should be OK.

Not entirely sure how we can verify that. If you have any ideas i will happily implement them!

# only create a deep copy of parameters on the first call
# all other recursive calls pass back referneces to this result object
# so we can reuse it, saving memory and CPU cycles
if not result:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it's safer to do a more explicit check if result is None? Is there a possible scenario where empty dict / array could be returned during recursion?

}
]
}
self.assertEqual(expected, result)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on the extensive tests 👍

@Kami
Copy link
Member

Kami commented May 24, 2018

Can you please add a corresponding changelog entry (and please try to be explicit and describe scenario where this could occur)?

Besides that, LGTM 👍

@Kami
Copy link
Member

Kami commented May 24, 2018

The fix for PY3 test failure is in my PR - #4134.

I hope to finish and merge it today.

@Kami Kami changed the title [WIP] Fix for #4139 - Secrets not being masked in pack config Fix for #4139 - Secrets not being masked in pack config May 24, 2018
@nmaludy
Copy link
Member Author

nmaludy commented May 24, 2018

OK, i've updated the changelog and converted that one line to if result is None. All checks have passed, we should be good to go.

Fixed
~~~~~

* Fixed a bug where secrets in pack configs weren't being masked. Recently we
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor thing - we use 100 characters wide lines. Not a big deal and this can be changed in a subsequent PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants