-
-
Notifications
You must be signed in to change notification settings - Fork 209
Support for Structured Logs #969
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
bcacf6e to
bcb816d
Compare
bcb816d to
094f947
Compare
094f947 to
9e1f449
Compare
9e1f449 to
3258a25
Compare
whatyouhide
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's quite a few things to think about here and a couple of race conditions in the buffer process, but I’m excited for this getting in the SDK.
lib/sentry/client.ex
Outdated
| @doc since: "12.0.0" | ||
| @spec send_log_events([LogEvent.t()]) :: | ||
| {:ok, envelope_id :: String.t()} | {:error, ClientError.t()} | ||
| def send_log_events([]), do: {:ok, ""} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have a "log batch" struct, would it make sense to have send_log_batch instead of this?
lib/sentry/envelope.ex
Outdated
| | ClientReport.t() | ||
| | Event.t() | ||
| | LogBatch.t() | ||
| | LogEvent.t() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When do we have an envelope with log events in it instead of a log batch?
lib/sentry/envelope.ex
Outdated
| Creates a new envelope containing log events. | ||
| According to the Sentry Logs Protocol, log events are sent in batches | ||
| within a single envelope item with content_type "application/vnd.sentry.items.log+json". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nits:
| within a single envelope item with content_type "application/vnd.sentry.items.log+json". | |
| within a single envelope item with content type `application/vnd.sentry.items.log+json`. |
lib/sentry/envelope.ex
Outdated
| within a single envelope item with content_type "application/vnd.sentry.items.log+json". | ||
| All log events are wrapped in a single item with { items: [...] }. | ||
| """ | ||
| @doc since: "11.0.0" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo?
| @doc since: "11.0.0" | |
| @doc since: "12.0.0" |
lib/sentry/log_event_buffer.ex
Outdated
| @impl GenServer | ||
| def handle_cast({:add_event, event}, state) do | ||
| # Check if queue is at max capacity | ||
| if length(state.events) >= @max_queue_size do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is going to be called constantly. Can we keep track of the current number of events instead of recomputing length(state.events) every time?
lib/sentry/log_event_buffer.ex
Outdated
| else | ||
| events = [event | state.events] | ||
|
|
||
| if length(events) >= state.max_events do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do what I suggest above we don't need to calculate length(events) again here (it's just that number + 1)
| @impl GenServer | ||
| def handle_call(:flush, _from, state) do | ||
| send_events(state.events) | ||
| cancel_timer(state.timer_ref) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's an (unlikely) race condition here where the timer fired while we're in this function and cancel_timer/1 has no effect, leading to the :flush message being in the message queue and handled right after flushing here.
Two solutions:
- Switch to
gen_statem, which makes all this pretty easy with its timeout handling. - Flush the
:flush_timeoutmessage by using areceive do :flush_timeout -> :ok after 0 -> :ok endor something.
lib/sentry/log_event_buffer.ex
Outdated
| do_send_events(events) | ||
| else | ||
| # Send asynchronously to avoid blocking in production | ||
| Task.start(fn -> do_send_events(events) end) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we start this task under a Task.Supervisor with a configured number of :max_children? We run the (low) risk of blowing things up here if these tasks become backlogged and this GenServer keeps spawning new ones.
| @@ -0,0 +1,247 @@ | |||
| defmodule Sentry.LogsHandler do | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's going to be really confusing to have Sentry.LoggerHandler (for errors) and Sentry.LogsHandler (for logs). Have we thought about that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@whatyouhide I was on the fence with this, decided to go with Logs because this is what we call the feature in Sentry, but for more clarity I think we could rename Sentry.LoggerHandler to Sentry.ErrorLogHandler and leave Sentry.LogsHandler. Another option is to collapse the two into one and unify behavior, where capturing error logs as exceptions would be one of the handler features, and sending logs would be another, both configurable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this is a tough one. Renaming Sentry.LoggerHandler is annoying because it will require users who upgrade to do it manually, and there's no "nice" way to deprecate the name itself. I think keeping only Sentry.LoggerHandler and making it configurable is messier on the SDK side but might be a better experience for the end user, provided it:
- validates configuration as much as possible automatically so that it's easy to make sure you're doing the right config
- supports the new Structured Logs approach as opt in—for example, we use the logger handler for catching log errors and crashes but we do not use the Sentry Logs product at all (and wouldn't want to)
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing extra work on SDK side to improve DX is our principle guideline thus collapsing into one LoggerHandler sounds like the way to go even if it means more work for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fine with either decision, but to help decide: eventually, we will deprecate and remove the older error/issue creation from logs now that we have a dedicated logs product.
3258a25 to
4ea6119
Compare
Co-authored-by: Andrea Leopardi <an.leopardi@gmail.com>
Co-authored-by: Andrea Leopardi <an.leopardi@gmail.com>
7c929ff to
515fa87
Compare
sl0thentr0py
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 comments about schema changes
| |> put_sentry_attr_if("sentry.environment", log_event.environment) | ||
| |> put_sentry_attr_if("sentry.release", log_event.release) | ||
| |> put_sentry_attr_if("sentry.address", log_event.server_name) | ||
| |> put_sentry_attr_if("sentry.trace.parent_span_id", log_event.parent_span_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is now a top level span_id field, we no longer need sentry.trace.parent_span_id
| parent_span_id: parent_span_id, | ||
| environment: Config.environment_name(), | ||
| release: Config.release(), | ||
| server_name: Config.server_name(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's not have environment/release/server_name in the struct definition, they should just be added to attributes in the relevant payload construction.
the schema here should match
https://github.com/getsentry/relay/blob/0223f6f728feacb0a5de607bbe5c8621644b8a9e/relay-event-schema/src/protocol/ourlog.rs#L14-L37
Initial work on support for Logs.
This follows requirements described in https://develop.sentry.dev/sdk/telemetry/logs/
Closes #906
Closes #907
Closes #908