Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>
This is a great start and seems to solve a number of existing issues. A few outstanding questions I have:
- Should scripting transforms be able to control flow and define "outputs" similar to the swimlanes transform?
- Can scripting transforms emit more than one event?
- Backward compatibility is handled by the "anonymous handler" feature, correct?
- Should we address performance concerns? If performance was equal, I'd probably recommend using scripting transforms for everything.
- Should we mention testing? These seem like prime candidates for testing.
event.log.id = counter

if counter % 2 == 0 then
  event.lane = "even"
I didn't see anywhere in this guide where the lane field is actually used. It would be clearer to follow through with this example.
I'm also curious if you saw #1942? It might be easier to provide a way for scripting transforms to define outputs rather than pairing them with the swimlanes transform.
I saw #1942 and I want to solve it by allowing the user to optionally add a lane field to the top level of the event. The expected effect would then be the same as when using the swimlanes transform.
I see. I'd prefer that we prefix special fields with _, @, or some other character. To throw another edge case at you, what if a user wanted to emit an event to 2 lanes?
> I'd prefer that we prefix special fields with _, @, or some other character.
It is already a kind of "meta" field, as it is not visible to the downstream components, which see only the event stored under the log key. For example, if the original event produced by, say, the stdin source looks like this:
{
"message": "message",
"timestamp": "2020-03-02T15:45:56.751Z",
"host": "localhost"
}
Then it is passed to the lua transform as a table:
{
log = {
message = "message",
timestamp = "2020-03-02T15:45:56.751Z",
host = "localhost"
}
}
Setting the lane by having event.lane = "lane_name" in the transform code results in this:
{
log = {
message = "message",
timestamp = "2020-03-02T15:45:56.751Z",
host = "localhost"
},
lane = "lane_name"
}
After this table is returned from the transform, the log part is extracted and sent to the downstream components reading from transform_name.lane_name. So if there is a downstream console sink reading from transform_name.lane_name, it would receive:
{
"message": "message",
"timestamp": "2020-03-02T15:45:56.751Z",
"host": "localhost"
}
By the way, I'm thinking about another "meta" field visible only to the transform, input, which would contain the name of the component that produced the event. But implementing it would require changes in other parts of the code, not only in the lua and javascript transforms.
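The routing described above can be sketched in plain Lua. This is only a model of the proposed behavior, not Vector's implementation: the `route` helper and the way the target name is built are assumptions made for illustration.

```lua
-- Hypothetical helper: models how the runtime might split a returned
-- table into a routing target and the payload seen downstream.
local function route(transform_name, event)
  local target = transform_name
  if event.lane ~= nil then
    target = transform_name .. "." .. event.lane
  end
  -- Only the `log` part is forwarded; `lane` stays behind as metadata.
  return target, event.log
end

local event = {
  log = {
    message = "message",
    timestamp = "2020-03-02T15:45:56.751Z",
    host = "localhost",
  },
}
event.lane = "lane_name"

local target, payload = route("transform_name", event)
print(target)        -- transform_name.lane_name
print(payload.lane)  -- nil: the lane never reaches downstream components
```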
> To throw another edge case at you, what if a user wanted to emit an event to 2 lanes?
One approach is to duplicate the event and then set different lanes for the duplicates. On the other hand, to make it more convenient, it is also possible to allow setting the field not only to strings, but also to arrays.
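Both options could look roughly like this in Lua. The helper names `normalize_lanes` and `duplicate_for_lanes` are hypothetical and exist only for this sketch; nothing here is an existing Vector API.

```lua
-- Hypothetical helper for the array option: accept nil, a string,
-- or an array of strings and always hand back an array.
local function normalize_lanes(lane)
  if lane == nil then return {} end
  if type(lane) == "string" then return { lane } end
  return lane
end

-- Hypothetical helper for the duplication option: one copy per lane.
local function duplicate_for_lanes(event, lanes)
  local copies = {}
  for i, lane in ipairs(lanes) do
    local log = {}
    for k, v in pairs(event.log) do log[k] = v end  -- shallow copy of the payload
    copies[i] = { log = log, lane = lane }
  end
  return copies
end

local event = { log = { message = "hi" }, lane = { "even", "audit" } }
local copies = duplicate_for_lanes(event, normalize_lanes(event.lane))
for _, copy in ipairs(copies) do
  print(copy.lane, copy.log.message)
end
```

The copy is shallow, so nested tables inside `log` would still be shared between the duplicates.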
That makes more sense, I forgot the type data was nested. I still find the lanes approach to be somewhat awkward. IMO, it feels more natural to pair this with the return statement or an explicit function. For example:
return {event, "lane1", "lane2"}
And returning just event would work as well:
return event
Is your intent to avoid having to implement a common API across all scripting transforms? I assume adding a lane field would make it easier to share this logic across different scripting transforms?
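To make the return-based idea concrete, here is a minimal sketch of how a host could tell the two return shapes apart. The `unpack_result` helper is hypothetical, and it assumes that an event can be recognized by its top-level log key.

```lua
-- Hypothetical: how a host could interpret a handler's return value.
-- Assumes an event is recognizable by its top-level `log` key.
local function unpack_result(result)
  if result == nil then
    return nil, {}            -- handler dropped the event
  end
  if result.log ~= nil then
    return result, {}         -- plain `return event`
  end
  local lanes = {}
  for i = 2, #result do       -- `return {event, "lane1", "lane2"}`
    lanes[#lanes + 1] = result[i]
  end
  return result[1], lanes
end

local event = { log = { message = "message" } }
local routed, lanes = unpack_result({ event, "lane1", "lane2" })
print(routed == event, #lanes)
```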
I'm experimenting with the approach proposed in #1942, with the transform pushing the events using an emit function. But then I think it makes sense to make the handlers closures which take the entire input stream, iterate over it, and emit output events.
For example, if the handler takes an input parameter, which is an iterator, and the emit function, then the entire handler can look like this:
function (input, emit)
  -- init
  counter = 0
  -- process
  for event in input do
    counter = counter + 1
    event.log.id = counter
    if counter % 2 == 0 then
      emit(event, "even")
    else
      emit(event, "odd")
    end
  end
  -- clean up
  emit {
    log = {
      message = "nothing left to be processed",
    }
  }
end
This implements the true iterator pattern from Lua by itself. However, my concern with this approach is that it is not possible to produce an event if waiting for the next input event takes too long. So we need a more flexible approach, which is also not overcomplicated in the simple cases.
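The handler can be exercised outside Vector with a small stand-in for the host. In this sketch `iterate` and the collecting `emit` are hypothetical test doubles, and `counter` is made local so the example does not leak a global.

```lua
-- The handler from the comment, with `counter` made local.
local function handler(input, emit)
  local counter = 0
  for event in input do
    counter = counter + 1
    event.log.id = counter
    if counter % 2 == 0 then
      emit(event, "even")
    else
      emit(event, "odd")
    end
  end
  emit { log = { message = "nothing left to be processed" } }
end

-- Hypothetical stand-in for the host: a finite iterator over a list.
local function iterate(events)
  local i = 0
  return function()
    i = i + 1
    return events[i]  -- nil ends the generic `for` loop
  end
end

-- Hypothetical stand-in for the host: collect everything that is emitted.
local emitted = {}
local function emit(event, lane)
  emitted[#emitted + 1] = { event = event, lane = lane }
end

handler(iterate({ { log = {} }, { log = {} }, { log = {} } }), emit)
for _, out in ipairs(emitted) do
  print(out.event.log.id, out.lane, out.event.log.message)
end
```

The sketch also shows the limitation mentioned above: the handler only regains control when the iterator returns, so it has no opportunity to emit while it is blocked waiting for input.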
In general, I think if we make such large changes, it is worth creating an RFC on a new programming model which would solve all the issues with the Lua transform.
> Is your intent to avoid having to implement a common API across all scripting transforms? I assume adding a lane field would make it easier to share this logic across different scripting transforms?
Not really, the underlying scripting engines are very different, so I think the code duplication is unavoidable in any case.
Agree that we'd need an RFC for that. I prefer your original proposal over this direction, as I find this direction less straightforward. Happy to have others weigh in if you disagree. And I appreciate you trying that out.
Signed-off-by: Alexander Rodin <rodin.alexander@gmail.com>
Yes! The idea is that the transform can set
They can return arrays of events instead of singular events.
Not really, because the event structure is changed in any case (now log events are nested in
function (event)
  return event
end
In my opinion, for backward compatibility it is better to just introduce a
I expect the performance to be at least 10x slower than with native transforms. Furthermore, some design decisions (such as using tables instead of
I think so, there should be an example of defining unit tests for scripting transforms.
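As a starting point, such a test could be written in plain Lua with assert, without assuming any Vector-specific test harness. The `make_handler` factory and the test function name below are hypothetical, and the handler uses the lane-field design discussed earlier in this thread.

```lua
-- Handler under test: tags each event with a running id and a lane.
-- `make_handler` is a hypothetical factory so each test gets a fresh counter.
local function make_handler()
  local counter = 0
  return function(event)
    counter = counter + 1
    event.log.id = counter
    event.lane = (counter % 2 == 0) and "even" or "odd"
    return event
  end
end

-- Plain-Lua unit test; no Vector test harness is assumed.
local function test_alternating_lanes()
  local handler = make_handler()
  local first = handler({ log = { message = "a" } })
  local second = handler({ log = { message = "b" } })
  assert(first.log.id == 1 and first.lane == "odd")
  assert(second.log.id == 2 and second.lane == "even")
  assert(second.log.message == "b")
end

test_alternating_lanes()
print("ok")
```

Keeping the counter inside a closure means tests do not share state, which would otherwise make the expected ids depend on test order.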
Agree, we have very important users using the
Closing since #2000 will introduce changes and this PR has drifted.
This is a WIP on a guide describing usage of scripting transforms (lua and javascript) in Vector.
It describes unimplemented features and is far from being finished by itself, so this PR is created just to collect some feedback.