docs: Add scripting guide · Pull Request #1962 · vectordotdev/vector

ghost · 2020-03-01T19:34:56Z

This is a WIP on a guide describing usage of scripting transforms (lua and javascript) in Vector.

It describes unimplemented features and is far from being finished by itself, so this PR is created just to collect some feedback.

Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>

binarylogic

This is a great start and seems to solve to a number of existing issues. A few outstanding questions I have:

Should scripting transforms be able to control flow and define "outputs" similar to the swimlanes transform?
Can scripting transforms emit more than one event?
Backward compatibility is handled by the "anonymous handler" feature, correct?
Should we address performance concerns? If performance was equal, I'd probably recommend using scripting transforms for everything.
Should we mention testing? These seem like prime candidates for testing,.

website/docs/setup/guides/scripting.md

binarylogic · 2020-03-01T20:40:18Z

website/docs/setup/guides/scripting.md

+  event.log.id = counter
+
+  if counter % 2 == 0 then
+    event.lane = "even"


I didn't see anywhere in this guide where the lane field is actually used. It would be more clear to follow through with this example.

I'm also curious if you saw #1942? It might be easier to provide a way for scripting transforms to define outputs versus pairing it with the swimlandes transform.

I saw #1942 and I want to solve it by allowing the user to optionally add lane field to the top level of the event. The expected effect could then be the same as when using swimlanes transform.

I see. I'd prefer that we prefix special field with _, @, or some other character. To throw another edge case at you, what if a user wanted to emit an event to 2 lanes?

I'd prefer that we prefix special field with _, @, or some other character.

It is already a kind of "meta" field, as it is not visible to the downstream components which see only the event which is inside the value of the log key. For example, if the original event produced by say stdin source looks like this:

{ "message": "message", "timestamp": "2020-03-02T15:45:56.751Z", "host": "localhost" }

Then it passed to the lua transform as a table

{ log = { message = "message", timestamp = "2020-03-02T15:45:56.751Z", host = "localhost" } }

Setting the lane by having event.lane = "lane_name" in the transform code results in this:

{ log = { message = "message", timestamp = "2020-03-02T15:45:56.751Z", host = "localhost" }, lane = "lane_name" }

After this table is returned from the transform, the log part is extracted to be sent to the downstream components reading from transform_name.lane_name. So if there is a downstream console sink reading from transform_name.lane_name it would receive

{ "message": "message", "timestamp": "2020-03-02T15:45:56.751Z", "host": "localhost" }

By the way, I'm thinking about another "meta" field visible only to the transform, input, which would contain the name of the component which produced the event. But it would require changes in other parts of code to implement, not only in lua and javascript transforms.

To throw another edge case at you, what if a user wanted to emit an event to 2 lanes?

One approach is to duplicate the event and then set different lanes for the duplicates. On the other hand, to make it more convenient, it is also possible to allow setting the field not only to strings, but also to arrays.

That makes more sense, I forgot the type data was nested. I still find the lanes approach to be somewhat awkward. IMO, it feels more natural to pair this with the return statement or an explicit function. For example:

return {event, "lane1", "lane2"}

And returning just event would work as well:

return event

Is your intent to avoid having to implement a common API across all scripting transforms? I assume adding a lane field would make it easier to share this logic across different scripting transforms?

I'm experimenting with the approach proposed in #1942, with the transform pushing the events using an emit function. But then I think it makes sense to make the handlers closures which take entire input stream and then iterate over it and emit output events.

For example, if the handler takes input parameter which is an iterator and the emit function, then the entire handler can look like this:

function (input, emit) -- init counter = 0 -- process for event in input do counter = counter + 1 event.log.id = counter if counter % 2 == 0 then emit(event, "even") else emit(event, "odd") end end -- clean up emit { log = { message = "nothing left to be processed", } } end

Which implements the true iterator pattern from Lua by itself. However, my concern with this approach is that it is not possible to produce an event if the waiting takes too long. So we need a more flexible approach, which is also not overcomplicated in the simple cases.

In general, I think if add such large changes, it is worth to create an RFC on the new programming model which would solve all issues with the Lua transform.

Is your intent to avoid having to implement a common API across all scripting transforms? I assume adding a lane field would make it easier to share this logic across different scripting transforms?

Not really, the underlying scripting engines are very different, so I think the code duplication is unavoidable in any case.

Agree that we’d need a RFC for that. I prefer your original proposal over this direction, as I find it to be less straightforward. Happy to have others weigh in if you disagree. And appreciate you trying that out.

Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>

ghost · 2020-03-02T14:03:42Z

Should scripting transforms be able to control flow and define "outputs" similar to the swimlanes transform?

Yes! The idea is that the transform can set lane field in the transform and then the event can be consumed by the downstream components by setting the inputs field in the downstream components to transform_name.lane_name. If lane is not set, then it should work as usually, so events can be consumed from transform_name.

Can scripting transforms emit more than one event?

They can return arrays of events instead of singular events.

Backward compatibility is handled by the "anonymous handler" feature, correct?

Not really, because the event structure is changed in any case (now log events are nested in log field). Anonymous handlers still have function definition, but without specifying the name, like this:

function (event)
  return event
end

In my opinion, for backward compatibility it is better to just introduce a legacy configuration option, which, if set to true, would allow to use pre-existing transform scripts.

Should we address performance concerns? If performance was equal, I'd probably recommend using scripting transforms for everything.

I expect the performance to be at least 10x slower than with native transforms. Furthermore, some design decisions (such as using tables instead of userdata) additionally decrease performance, but improve the usability. As the wasm transform is in works, I think it is less than an issue than it could have been previously because the users who need high performance for custom transforms would be able to implement them in Rust.

Should we mention testing? These seem like prime candidates for testing,.

I think so, there should be an example of defining unit tests for scripting transforms.

binarylogic · 2020-03-02T16:45:32Z

In my opinion, for backward compatibility it is better to just introduce a legacy configuration option, which, if set to true, would allow to use pre-existing transform scripts.

Agree, we have very important users using the lua transform and I would like to not break their configuration. Maybe we could require a version attribute going forward?

binarylogic · 2020-03-09T14:57:01Z

Closing since #2000 will introduce changes and this PR has drifted.

docs: Add incomplete draft of the scripting guide

10d3aa8

Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>

ghost force-pushed the scripting-guide branch from d0557b9 to 10d3aa8 Compare March 1, 2020 19:44

binarylogic self-requested a review March 1, 2020 20:17

binarylogic reviewed Mar 1, 2020

View reviewed changes

Reword the introduction

7d4d1b5

Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>

binarylogic self-assigned this Mar 2, 2020

binarylogic mentioned this pull request Mar 4, 2020

Support metrics events in lua transform #1563

Closed

ghost mentioned this pull request Mar 6, 2020

Develop a consistent set of APIs for lua transform improvements #1999

Closed

binarylogic closed this Mar 9, 2020

binarylogic deleted the scripting-guide branch April 24, 2020 20:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs: Add scripting guide#1962

docs: Add scripting guide#1962
ghost wants to merge 2 commits intomasterfrom
scripting-guide

ghost commented Mar 1, 2020

Uh oh!

binarylogic left a comment •

edited

Loading

Uh oh!

Uh oh!

binarylogic Mar 1, 2020

Uh oh!

ghost Mar 2, 2020 •

edited by ghost

Loading

Uh oh!

binarylogic Mar 2, 2020

Uh oh!

ghost Mar 2, 2020 •

edited by ghost

Loading

Uh oh!

binarylogic Mar 2, 2020

Uh oh!

ghost Mar 3, 2020 •

edited by ghost

Loading

Uh oh!

binarylogic Mar 4, 2020

Uh oh!

ghost commented Mar 2, 2020 •

edited by ghost

Loading

Uh oh!

binarylogic commented Mar 2, 2020

Uh oh!

binarylogic commented Mar 9, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

ghost commented Mar 1, 2020

Uh oh!

binarylogic left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

binarylogic Mar 1, 2020

Choose a reason for hiding this comment

Uh oh!

ghost Mar 2, 2020 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

binarylogic Mar 2, 2020

Choose a reason for hiding this comment

Uh oh!

ghost Mar 2, 2020 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

binarylogic Mar 2, 2020

Choose a reason for hiding this comment

Uh oh!

ghost Mar 3, 2020 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

binarylogic Mar 4, 2020

Choose a reason for hiding this comment

Uh oh!

ghost commented Mar 2, 2020 • edited by ghost Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

binarylogic commented Mar 2, 2020

Uh oh!

binarylogic commented Mar 9, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

binarylogic left a comment •

edited

Loading

ghost Mar 2, 2020 •

edited by ghost

Loading

ghost Mar 2, 2020 •

edited by ghost

Loading

ghost Mar 3, 2020 •

edited by ghost

Loading

ghost commented Mar 2, 2020 •

edited by ghost

Loading