docs: Categorize transforms into functions or runtimes by binarylogic · Pull Request #2012 · vectordotdev/vector

binarylogic · 2020-03-09T21:00:19Z

This change moves transforms into 2 categories: functions or runtimes. This helps to separate the 2 for future changes we're making:

Runtimes should be able to call functions at some point in the future.
Runtimes will be encouraged to replace the entire transform step, versus chaining many transforms together. For example, if a user reaches for the lua runtime we should encourage them to do everything within that runtime (add fields, remove fields, control flow, etc).
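As a rough illustration of the categorization, the sketch below groups transforms by a `sub_type` value. Only the `sub_type = "function"` entry for `swimlanes` appears in this PR's diff; the categories given to the other transforms and the `group_by_sub_type` helper are assumptions made for illustration, not Vector's actual metadata tooling.

```python
from collections import defaultdict

# Hypothetical metadata: transform name -> sub_type.
# Only swimlanes = "function" comes from this PR's diff; the rest are assumed.
TRANSFORMS = {
    "swimlanes": "function",
    "add_fields": "function",     # assumed
    "remove_fields": "function",  # assumed
    "lua": "runtime",             # assumed
}


def group_by_sub_type(transforms):
    """Return a dict mapping each sub_type to a sorted list of transform names."""
    groups = defaultdict(list)
    for name, sub_type in transforms.items():
        groups[sub_type].append(name)
    return {sub_type: sorted(names) for sub_type, names in groups.items()}


GROUPS = group_by_sub_type(TRANSFORMS)
```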

binarylogic · 2020-03-09T21:10:11Z

@Hoverbear I'm curious what you think about this change, especially in the context of the upcoming WASM changes. I'm thinking mostly about:

Where do wasm sources and sinks fit? Are those "runtimes" too?
What are your thoughts on the second point?

Signed-off-by: binarylogic <bjohnson@binarylogic.com>

lukesteensen · 2020-03-10T02:53:27Z

.meta/transforms/swimlanes.toml

 input_types = ["log"]
 output_types = ["log"]
 requirements = {}
+sub_type = "function"


swimlanes is interesting because it's not really a function or a runtime. It's really a macro that expands to multiple function-based pipelines.

That's a good point. I think for this exercise the term "component" might be better? I'm really trying to distinguish between a set of simple components that you connect to form transformation pipelines, and full runtimes.

Yeah that sounds reasonable to me.

Hoverbear · 2020-03-10T17:13:57Z

@binarylogic

Where do wasm sources and sinks fit? Are those "runtimes" too?

I don't think so. Ideally a wasm module would be able to register itself as a source/sink/function/runtime.

For example, an assemblyscript or lua runtime could be eventually powered by wasm...

I'd almost say a noble goal would be that all runtimes are wasm modules, but not all wasm modules are runtimes.

Runtimes will be encouraged to replace the entire transform step, versus chaining many transforms together. For example, if a user reaches for the lua runtime we should encourage them to do everything within that runtime (add fields, remove fields, control flow, etc).

Disagree! We should encourage users to use this power to create logical blocks of their pipeline... instead of forcing them to create functional blocks.

This is especially powerful as we get the new swimlanes functionality, as users can route things easily between these blocks!

binarylogic · 2020-03-10T17:44:41Z

Disagree! We should encourage users to use this power to create logical blocks of their pipeline... instead of forcing them to create functional blocks.

Sure. I can get on board with this.

This is especially powerful as we get the new swimlanes functionality, as users can route things easily between these blocks!

So this is where my point becomes a little more clear. I like the swimlanes transform for simple routing, but a runtime with if, else, switch, and other advanced control-flow tools, is much more powerful. This is why #2000 is allowing users to specify "lanes" for events.

Given this scenario, where a user reaches for a runtime to control flow, it seems easier to just do the entire transform within the runtime, versus composing a bunch of small components together. I'm not saying a user couldn't do both, but I want to guide a user towards one.
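A minimal sketch of the scenario described above, written in Python as a stand-in for a runtime such as lua. The event shape, the `lane`, `status`, and `debug_info` fields, and the `process` function are all hypothetical and not Vector's API. The point is only that parsing, shaping, and lane selection can live in one place once a runtime is already handling control flow.

```python
import json


def process(event):
    """Parse, shape, and assign a lane to one event inside a single step."""
    # Parse the raw message if it is a JSON object; otherwise add no fields.
    try:
        fields = json.loads(event["message"])
    except (KeyError, TypeError, ValueError):
        fields = {}
    if not isinstance(fields, dict):
        fields = {}
    out = {**event, **fields}

    # Remove a field (hypothetical name) that downstream sinks do not need.
    out.pop("debug_info", None)

    # Choose a lane with ordinary control flow.
    status = out.get("status")
    if isinstance(status, int) and status >= 500:
        out["lane"] = "errors"
    elif isinstance(status, int) and status >= 400:
        out["lane"] = "warnings"
    else:
        out["lane"] = "default"
    return out
```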

Hoverbear · 2020-03-10T18:54:13Z

Right, so consider a non-trivial config with, say, sink0..5 and source0..5.

Suppose this user needs to do any number of different logical tasks as part of the route from given sources to sinks, e.g. "Parse json", "Do GDPR filtering", "Tag routing data", "Strip unneeded metadata", etc.

It makes sense if they do this logically instead of needing to replicate functionality across transforms.

binarylogic · 2020-03-10T19:01:03Z

It makes sense if they do this logically instead of needing to replicate functionality across transforms.

I'm not following. Why would a user need to replicate anything? For example, they could maintain a single lua script with functions to perform all of their processing.
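To make the "single script with functions" idea concrete, here is a sketch in Python standing in for a lua script. The step names mirror the logical tasks listed earlier in the thread; the field names (`email`, `ip`, `route`, the underscore-prefixed metadata keys) and the rule for choosing a route are invented for illustration and are not part of Vector.

```python
import json


def parse_json(event):
    """Replace the raw message with its JSON fields, if it parses to an object."""
    try:
        parsed = json.loads(event.get("message", ""))
    except (TypeError, ValueError):
        return event
    if not isinstance(parsed, dict):
        return event
    rest = {k: v for k, v in event.items() if k != "message"}
    return {**rest, **parsed}


def gdpr_filter(event):
    """Drop fields assumed to hold personal data (field names are made up)."""
    return {k: v for k, v in event.items() if k not in {"email", "ip"}}


def tag_routing(event):
    """Attach a hypothetical routing tag based on the source name."""
    route = "audit" if event.get("source") == "source0" else "general"
    return {**event, "route": route}


def strip_metadata(event):
    """Remove keys starting with an underscore, treated here as metadata."""
    return {k: v for k, v in event.items() if not k.startswith("_")}


# Each logical task is one function; the script applies them in order.
STEPS = [parse_json, gdpr_filter, tag_routing, strip_metadata]


def process(event):
    for step in STEPS:
        event = step(event)
    return event
```

Each task is defined once and reused for every event, whichever source it came from.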

Hoverbear · 2020-03-10T19:53:21Z

So you'd suggest they have all their sources feed to one transform, then have that transform feed to all the sinks? Seems like at that point Vector could operate as a sort of library/proxy, no? Then we could totally forget about transforms altogether.

binarylogic · 2020-03-10T21:48:02Z

Yep. In some circumstances that might be the best UX. It would allow them to leverage an entire runtime for control flow, parsing, shaping, and so on. They can break it up into files and manage it just like any other code.

To clarify, I am not suggesting every user do this. This is something a user can graduate to once their pipelines are hard to manage with our basic functions. It could also just be user preference.

Hoverbear · 2020-03-10T21:56:33Z

Perhaps instead of runtimes the more appropriate term is Languages?

binarylogic · 2020-03-16T23:02:48Z

I'm going to close this for now. It's not clear if making this distinction makes sense yet. As we mature the runtimes I think we'll get a better understanding around how we want to encourage users to use them. As much as I like the "runtime for your entire pipeline" idea, it might not be practical.

binarylogic requested review from Hoverbear and lukesteensen March 9, 2020 21:00

binarylogic assigned Hoverbear and lukesteensen Mar 9, 2020

binarylogic added 2 commits March 9, 2020 18:02

docs: Categorize transforms into functions or runtimes

392f088

Signed-off-by: binarylogic <bjohnson@binarylogic.com>

make generate

f3dfd31

Signed-off-by: binarylogic <bjohnson@binarylogic.com>

binarylogic force-pushed the runtimes-and-functons branch from 87c5426 to f3dfd31 Compare March 9, 2020 22:02

binarylogic mentioned this pull request Mar 10, 2020

Ability to call transform functions within a runtime #2015

Closed

lukesteensen reviewed Mar 10, 2020

Hoverbear added the domain: docs and domain: transforms labels Mar 15, 2020

binarylogic closed this Mar 16, 2020

binarylogic deleted the runtimes-and-functons branch April 24, 2020 20:39

binarylogic wants to merge 2 commits into master from runtimes-and-functons
