-
-
Notifications
You must be signed in to change notification settings - Fork 782
Allow user to include arbitrary data files when creating a new action using the API #1785
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…via the API. Files specified in this attribute are written to the disk inside the pack directory to which the action belongs to.
…nd make them immutable / set a default value inside action parameters. Conflicts: st2api/st2api/controllers/v1/actions.py
EntryPointController which uses that function. Make sure the entry point file is located inside the pack directory and use get_pack_resource_file_abs_path to prevent directory traversal attacks.
…d outside this function. Also fix a typo and update affected tests.
|
Is st2api the right place to write to the filesystem? The reason I ask is we end up baking in the requirement to always have access to the filesystem and also have access to all content from st2api. Depending on our deployment strategy these might be incorrect assumptions. questions -
|
That's a good question and it's something I and we don't have a good answer (or solution) for right now. In an ideal world, for HA and reliability purposes, all the content would be available on all the servers. That's kinda what our code base assumes in some places right now (we have an existing entry point API controller which allows users to read content of the entry point file from disk using the API, etc.), but we don't have the whole "content distribution" story and most importantly VCS integration and syncing story fleshed out yet. In short - for the long term, we need to flesh out the whole content distribution and VCS integration story, but that's something which needs to be done systematically and it's more invasive. I'm also fine with tackling this "big picture" story right now since it's something we have been postponing for ever, but I'm not sure how much I can scope it down to unblock @enykeev ASAP. |
…s of a particular file from the provided pack.
…d on the ContentPackResourceController, remove method which is already implemented on the parent class.
|
We need to make sure we return meaningful error messages on API calls. We also need a way to list all the files related to the action or otherwise how are we going to find all the files we uploaded along with the action.
No sub-folder support. |
I agree. I'm really annoyed by this error and have been for a long time (I made some quick attempts to fix it in the past, but it caused too many test failures and I didn't wanna spend too much time on it back then so I didn't proceed with a fix). There are two problems which cause this:
Fixing both of those requires a bunch of code changes and testing it and making sure it doesn't break stuff so I would rather do it in a separate PR.
I thought we might be able to avoid this since it requires bigger model changes, but it looks like we can't. One solution would be to store a list of all the files inside a pack on the Pack model (we can't really do it on per-action basis since we only really know about the metadata and entry-point file, but entry point file can potentially depend and require other files from a pack). For existing packs, we would populate this when we register the packs (this functionality was added recently). And after the info is there, I would add a new |
You can actually fix that by abusing method signature with kwargs a little bit more =)
You are thinking too far ahead. What I need is to be able to see whether the action has coordinates file associated to it. I can probably use convention for that, but I don't see why we can't keep |
Haha, yeah.
Yeah, I'm trying to create something which also works for other resources, can be re-used later on and is not a one off "hack". So while we could keep data files on the action model, I would prefer not to do it since this is an API only thing. Pack.files field would also be populated when registering content so it's not an API only thing and better aligns with our future plans for that. So would an API endpoint which returns a list of all the files in a pack (+ optionally content) and additional API endpoint which allows you to retrieve the content for a single file work for now (barring in mind that you use some kind of conventions for the coordinates file name and you can always just request |
Note: This list is populated when running register content script when the pack is registered.
get_pack_resource_file_abs_path. Also update existing function to use this new one.
…les with content of all the files inside a pack. Also add a new file controller which allows user to retrieve content of a single file.
|
@enykeev I pushed some changes which I believe implement what you need. We now store a list of files inside a pack on the PackDB model. In addition to that, there are now two new API endpoints:
Second option should work for you if you follow some convention for naming file with coordinates (e.g. In addition to that, I still need to do some cleanup and other work, but except renaming |
… an invalid path.
|
I second @manasdk on "Is st2api the right place to write to the filesystem?". I see this as a good enough implementation to get us going; Options:
What are operational and implementational pro/cons with this? |
Just noting that I read them as contradicting statements. Unfortunately, this doesn't look like iterative development in this particular case. More detailed comments below. Perhaps this is a late in the game kind of comment but we really need to spend some time understanding our long term approach here. VCS and distributed content is becoming a high priority with some serious users already trying it out. So not thinking about how the solution would fit in that scenario will probably push us to a tight corner. It might turn out updating content from API may not be possible at all in such situation. So I am going to insist we talk about the design at least before merging this PR. In my mind that discussion is a blocker. I'd have preferred these API changes were put in /experimental/ as opposed to V1. This is something we need to be careful about in the future. Even though there are no backward compatible changes, putting in V1 means we are willing to support problems and bug reports. I'd prefer some baking time for this critical feature. |
+1 to designing. |
…the API error inside the JSONErrorResponseHook.
Conflicts: CHANGELOG.rst st2api/st2api/controllers/v1/actions.py
…some other method.
…n be registered during tests.
DBTestCase.setUpClass method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I see the problem with content_type here and there's little we can do unless we want to override pecan.expose. Still, one character difference in the public API will cost some people a few hours of debugging, sooner or later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not exactly sure what you mean with that.
file one returns text/plain. Do you want me to return content type which corresponds to the file extension or something else (e.g. application/octet-stream)
It would also help if I knew how you are going to use this controller - are you going to directly include the file from the API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I mean is that we have file and files and it's quite easy to miss or mistype. Better if we would have a single controller that would output either a json list of all files or a content of a single file depending on a presence of file_path_components. I get that it is not possible at the moment, I'm just stating that it would result in debugging problems for someone later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see.
That's actually what I was planning to do first, but then I decided it's clearer if we have two separate paths.
In any case, I'm also OK with having a single path - I understand your concern with content_type there though (would probably require some jsexpose / pecan hacks to get it to work).
|
Given that we have a plan for FileSystem backend now, I am ok with merging this PR. @Kami and I had a discussion and he convinced me that it is a lot of work to move the API to /exp/. Requires a lot of refactoring if we don't want duplicated code. So I am fine with the code in /v1/ apis. It's not ideal. cc: @manasdk |
Conflicts: CHANGELOG.rst
…iles Allow user to include arbitrary data files when creating a new action using the API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we shouldn't advertise it beyond using it in the UI yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a big deal putting it in changelog (we usually write a release announcement blog post where we more prominently announce new features so as long we don't put it there we are fine), but if you think otherwise I'm also fine with removing it.
This pull request allows user to specify arbitrary action data files when creating an action using the API. Those data files are written on disk inside the "actions/" pack sub-directory.
To avoid requiring the user to also specify the pack meta-data, this action only allows users to write data files for the packs which already exist on disk.
An example use case is allowing st2flow to automatically create and write files such as work-flow definitions and files with graph node coordinates on the server.
This functionality builds on the existing API for creating actions and it's fully backward compatible.
Every time data file is written on disk we also dispatch an internal trigger. User can then use this trigger + StackStorm rule to automatically add this file to version control or similar. Eventually we will also need to figure out the whole DB and VCS sync issue (something we have talked for a long time), but I avoided this in this PR since that will be a bigger change which needs to be done systematically and needs more through.
In addition to that, I have also discovered and fixed a potential security issue inside the
get_entry_point_abs_pathfunction (eb1a454). If st2api was running as a privileged user and actionentry_pointattribute would point to an absolute location outside the pack directory, user could read an arbitrary file on disk using "entry_points" API endpoint.This change now also implies that the
entry_pointsneeds to point to a file inside a pack directory (previously we didn't enforce that). I personally believe this is the right thing to do (users should include related scripts inside the pack directory), but if you don't agree, please post your arguments here. Entry point pointing to a file outside the pack directory opens us to a whole range of potential security issues.Example API Payload
The "POST actions" payload is the same as before, only change is that it can now also contain optional
data_filesattribute.{ ... "data_files": [ { "file_path": "workflows/my_wf_1.py", "content": "aaaa" }, { "file_path": "random_action.py", "content": "bbb" }, { "file_path": "misc/st2flow_coordinates.yaml", "content": "ccc" } ] }TODO