docs: Categorize transforms into functions or runtimes#2012

Closed
binarylogic wants to merge 2 commits into master from runtimes-and-functons

@binarylogic
Contributor

This change moves transforms into 2 categories: functions or runtimes. This helps to separate the 2 for future changes we're making:

  1. Runtimes should be able to call functions at some point in the future.
  2. Runtimes will be encouraged to replace the entire transform step, versus chaining many transforms together. For example, if a user reaches for the lua runtime we should encourage them to do everything within that runtime (add fields, remove fields, control flow, etc).

@binarylogic
Contributor Author

@Hoverbear I'm curious what you think about this change, especially in the context of the upcoming WASM changes. I'm thinking mostly about:

  1. Where do wasm sources and sinks fit? Are those "runtimes" too?
  2. What are your thoughts on the second point?

Signed-off-by: binarylogic <bjohnson@binarylogic.com>
Signed-off-by: binarylogic <bjohnson@binarylogic.com>
input_types = ["log"]
output_types = ["log"]
requirements = {}
sub_type = "function"
Member

swimlanes is interesting because it's not really a function or a runtime. It's really a macro that expands to multiple function-based pipelines.
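
To make the "macro" reading concrete, here is a minimal sketch in Python of how one swimlanes-style definition could expand into one function-style component per lane. This is an illustration only, not Vector's implementation: `expand_swimlanes`, the `filter` component type, the `name.lane` naming scheme, and the condition format are all hypothetical.

```python
def expand_swimlanes(name, inputs, lanes):
    """Expand one swimlanes-style definition into one component per lane.

    `lanes` maps a lane name to a condition, represented here as a plain dict.
    Returns a dict mapping each generated component name to its config.
    """
    expanded = {}
    for lane_name, condition in lanes.items():
        expanded[f"{name}.{lane_name}"] = {
            "type": "filter",        # hypothetical function-style component
            "inputs": list(inputs),  # every lane reads the same upstream inputs
            "condition": condition,
        }
    return expanded


# Hypothetical definition: route events by their `level` field.
EXPANDED = expand_swimlanes(
    "by_level",
    ["app_logs"],
    {"errors": {"level.eq": "error"}, "debug": {"level.eq": "debug"}},
)
```

Each lane becomes its own small pipeline entry that downstream components could consume independently, which is why it does not fit cleanly under either "function" or "runtime".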

Contributor Author

That's a good point. I think for this exercise the term "component" might be better? I'm really trying to distinguish between a set of simple components that you connect to form transformation pipelines, and full runtimes.

Member

Yeah that sounds reasonable to me.

@Hoverbear
Contributor

@binarylogic

> Where do wasm sources and sinks fit? Are those "runtimes" too?

I don't think so. Ideally a wasm module would be able to register itself as a source/sink/function/runtime.

For example, an assemblyscript or lua runtime could be eventually powered by wasm...

I'd almost say a noble goal would be that all runtimes are wasm modules, but not all wasm modules are runtimes.
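
A rough sketch of that idea, written in Python purely for illustration: a module declares which role it fills when it registers itself. The `Role`, `WasmModule`, and `Registry` names and the example module names are hypothetical, and nothing here reflects an actual Vector or wasm API.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    SOURCE = "source"
    SINK = "sink"
    FUNCTION = "function"
    RUNTIME = "runtime"


@dataclass(frozen=True)
class WasmModule:
    name: str
    role: Role  # the role the module registers itself under


class Registry:
    """Tracks registered modules by name (hypothetical, for illustration)."""

    def __init__(self):
        self._modules = {}

    def register(self, module):
        if module.name in self._modules:
            raise ValueError(f"module already registered: {module.name}")
        self._modules[module.name] = module

    def by_role(self, role):
        """Return the sorted names of all modules registered under `role`."""
        return sorted(m.name for m in self._modules.values() if m.role is role)


REGISTRY = Registry()
REGISTRY.register(WasmModule("lua", Role.RUNTIME))
REGISTRY.register(WasmModule("assemblyscript", Role.RUNTIME))
REGISTRY.register(WasmModule("custom_source", Role.SOURCE))
```

In this model every runtime is a wasm module, while `custom_source` shows a wasm module that is not a runtime.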

> Runtimes will be encouraged to replace the entire transform step, versus chaining many transforms together. For example, if a user reaches for the lua runtime we should encourage them to do everything within that runtime (add fields, remove fields, control flow, etc).

Disagree! We should encourage users to use this power to create logical blocks of their pipeline... instead of forcing them to create functional blocks.

This is especially powerful as we get the new swimlanes functionality, as users can route things easily between these blocks!

@binarylogic
Contributor Author

> Disagree! We should encourage users to use this power to create logical blocks of their pipeline... instead of forcing them to create functional blocks.

Sure. I can get on board with this.

> This is especially powerful as we get the new swimlanes functionality, as users can route things easily between these blocks!

So this is where my point becomes a little clearer. I like the swimlanes transform for simple routing, but a runtime with if, else, switch, and other advanced control-flow tools is much more powerful. This is why #2000 is allowing users to specify "lanes" for events.

Given this scenario, where a user reaches for a runtime to control flow, it seems easier to just do the entire transform within the runtime, versus composing a bunch of small components together. I'm not saying a user couldn't do both, but I want to guide a user towards one.
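
As a sketch of what "the entire transform within the runtime" might look like, the Python function below adds a field, removes a field, and picks a lane in one place. It is only an illustration of the idea: the field names, lane names, and the `(event, lane)` return convention are assumptions, not the interface from #2000 or the lua transform.

```python
def process(event):
    """Transform one log event and pick a lane for it, all in one place.

    Returns (event, lane); a lane of None means the event is dropped.
    """
    event = dict(event)             # copy so the caller's dict is untouched
    event.pop("debug_info", None)   # remove a field if present
    event["processed"] = True       # add a field

    # Control flow decides the lane (lane names are hypothetical).
    level = event.get("level")
    if level == "error":
        lane = "alerts"
    elif level in ("warn", "info"):
        lane = "archive"
    else:
        lane = None
    return event, lane
```

The same work split across small components would need separate add-fields, remove-fields, and routing steps wired together in configuration.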

@Hoverbear
Contributor

Right, so consider a non-trivial config with, say, sink0..5 and source0..5.

Suppose this user needs to perform one or more distinct logical tasks on the route from given sources to sinks, e.g. "Parse json", "Do GDPR filtering", "Tag routing data", "Strip unneeded metadata", etc.

It makes sense if they do this logically instead of needing to replicate functionality across transforms.
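
One way to picture these logical blocks is as small, reusable steps that any source-to-sink route can share. The Python sketch below is illustrative only: the block names mirror the tasks listed above, while the field names (`message`, `email`, `ip`) and the `run_blocks` helper are assumptions.

```python
import json


def parse_json(event):
    """Merge fields parsed from the JSON string in `message` into the event."""
    parsed = json.loads(event["message"])
    return {**event, **parsed}


def gdpr_filter(event):
    """Drop fields treated as personal data (the field list is an assumption)."""
    return {k: v for k, v in event.items() if k not in {"email", "ip"}}


def strip_metadata(event):
    """Drop the raw `message` field once it has been parsed."""
    return {k: v for k, v in event.items() if k != "message"}


def run_blocks(event, blocks):
    """Apply each logical block in order and return the final event."""
    for block in blocks:
        event = block(event)
    return event


RESULT = run_blocks(
    {"message": '{"email": "a@b.c", "status": 200}'},
    [parse_json, gdpr_filter, strip_metadata],
)
```

Different routes can reuse the same blocks in different combinations, for example applying `gdpr_filter` only on the routes that feed external sinks.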

@binarylogic
Contributor Author

> It makes sense if they do this logically instead of needing to replicate functionality across transforms.

I'm not following. Why would a user need to replicate anything? For example, they could maintain a single lua script with functions to perform all of their processing.

@Hoverbear
Contributor

So you'd suggest they have all their sources feed into one transform, then have that transform feed all the sinks? It seems that Vector could then operate as a sort of library/proxy, no? Then we could forget about transforms altogether.

@binarylogic
Contributor Author

Yep. In some circumstances that might be the best UX. It would allow them to leverage an entire runtime for control flow, parsing, shaping, and so on. They can break it up into files and manage it just like any other code.

To clarify, I am not suggesting every user do this. This is something a user can graduate to once their pipelines are hard to manage with our basic functions. It could also just be user preference.

@Hoverbear
Contributor

Perhaps instead of runtimes the more appropriate term is Languages?

@Hoverbear added the "domain: docs" and "domain: transforms" (Anything related to Vector's transform components) labels Mar 15, 2020
@binarylogic
Contributor Author

I'm going to close this for now. It's not clear yet whether making this distinction makes sense. As we mature the runtimes I think we'll get a better understanding of how we want to encourage users to use them. As much as I like the "runtime for your entire pipeline" idea, it might not be practical.

@binarylogic deleted the runtimes-and-functons branch April 24, 2020 20:39