Skip to content

Refactor Nodes to individual files; Add/execute Custom Nodes#70

Merged
reddigari merged 21 commits intomasterfrom
feature/custom-nodes
Apr 26, 2020
Merged

Refactor Nodes to individual files; Add/execute Custom Nodes#70
reddigari merged 21 commits intomasterfrom
feature/custom-nodes

Conversation

@reelmatt
Copy link
Member

This builds off the initial approach for custom nodes in now-closed #66, in much the same way, but now more refined and now with a change to regular Node classes.

Changed endpoints:

/nodes -> /workflow/nodes: Retrieves list of Nodes for the front-end to display. Response is structured the same; more on the move to Workflow below.
/workflow/upload: Now a generic upload endpoint that can accept Node data files (e.g. CSVs) and now custom Node classes. If no node_id is specified in the POST body, a file path for the custom node is generated.

For custom nodes, no package/import validation is performed on upload. The intended process would be:

  1. User uploads custom node file.
  2. On success, the front-end calls /workflow/nodes again for an updating node listing.
  3. Here, if any packages for custom nodes are missing, an error is returned telling the user to install the packages before proceeding.

Changes to workflow.py:

  • Adds a new _node_dir attribute to the Workflow class
  • Adds a new node_path accessor method that constructs a path to a given Node file.
  • get_packaged_nodes: This is the method that handles parsing of pyworkflow and custom nodes. There's documentation in the code, but briefly this:
    • Recursively imports Node files based on self.node_path. Each directory is the node_type (e.g. 'io', 'manipulation', and now 'custom_nodes') and each file within the directory represents an individual Node class.
    • Uses import lib.import_module to dynamically load the Class and uses ModuleFinder.run_script to extract any missing packages if the import failed.
  • Minor changes: fixes some private self._graph calls to the public self.graph. Changes upload_file to static, and accepting a to_open file path instead of node_id. See bullet 2 above in Changed endpoints.

Changes to node.py:

This is the biggest change of the PR and should help with modularity/customization moving forward. The main Node class is still defined in pyworkflow/node.py, but now, concrete classes are defined in individual files within pyworkflow/nodes/<node_type> directories.

Benefits this approach brings are:

  • Simplifies parsing of Node classes. Now all Nodes are treated the same, including custom nodes, without the need for any special handling. The import call is now import_module('pyworkflow.nodes.' + node_type + '.' + node) for all node types.
  • General cleanliness. Individual Node classes are now defined in separate files making changes easier to see and new Nodes can be added by creating new files (just like custom Nodes). In the case where Pyworkflow becomes a true KNIME clone with hundreds of built-in Nodes, it'll be easier to manage changes to individual Nodes without modifying a massive nodes.py file.

Testing:

This should not affect any other functionality and all Postman/unit tests continue to pass. To test out custom nodes, you can copy any exiting node (e.g., io/read_csv.py) to the pyworkflow/custom_nodes directory. If you refresh the front-end, the Node should now appear under 'Custom Nodes' and if you drag it to the workspace, it should work identically to the 'I/O' Read CSV Node, including execution. (If you change the name and/or color attributes in the custom Node file, it is easier to see the different Nodes once in the workspace).

TBD:

This addresses question 1 in issue #67 (where do custom nodes live). It still does not address question 2 (installation of missing packages), but it has features in place to help with that (e.g., returning a list of missing package names). @matthew-t-smith had mentioned docker exec might be able to run pipenv install without starting/stopping the container. I think it's probably best to wait for the Docker-ization before tackling package installation to better see how it interacts in that case.

@reelmatt reelmatt linked an issue Apr 24, 2020 that may be closed by this pull request
@reelmatt reelmatt mentioned this pull request Apr 24, 2020
Copy link
Collaborator

@reddigari reddigari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely phenomenal work!

Only issue I had was when checking out the branch, the node collection fails because there's no custom_nodes directory by default, and so the import fails. Worked fine after I added an empty directory. I'd recommend checking in an empty directory with that name if that's possible.

@reelmatt
Copy link
Member Author

Good catch on the custom_nodes directory.

Latest commit should address this bug and also streamlines the two 'dir' methods into one set_dir() which I quite like.

@reddigari
Copy link
Collaborator

As soon as this is merged I've got a PR with the front end upload functionality ready to go.

@reddigari reddigari merged commit 5c29786 into master Apr 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Installation of custom nodes

3 participants