Skip to content

Conversation

@HoneyryderChuck
Copy link

Proto implementation of the proposal from #131, no tests and scoped to #info only for discussion purposes.

Copy link
Contributor

@jeremyevans jeremyevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for most of the cases where you would want to add context like this, it may be better off logging in a more structured format (typically JSON), so you can provide the context as an entry in the JSON object. We could consider a Logger::JsonFormatter that outputs log messages in JSON. However, I'm not opposed to the general idea in this PR of supporting logging with context when doing plain text logging.

Comment on lines 16 to 29
context = if kwargs.empty?
""
else
kwargs.filter_map { |k, v| "[#{k}=#{v}]" unless v.nil? }.join(" ") << " "
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved to a format_context method, so it can be overridden in subclasses (similar to format_datetime).

Removing keys with nil values seems like a reasonable default, though having [] surround each pair seems odd. It also seems odd to add an extra space if the kwargs are empty after filtering as opposed to be empty before filtering. Not escaping values can result in potential log injection attacks if the value contains ][. Also true for keys, though you would hope something like this is not called with attacker-defined keys.

I think it would be better to use compact! instead of filter_map. I think it also may be good to use an append only approach:

Suggested change
context = if kwargs.empty?
""
else
kwargs.filter_map { |k, v| "[#{k}=#{v}]" unless v.nil? }.join(" ") << " "
end
kwargs.compact!
if kwargs.empty?
context = ""
else
context = String.new << "["
kwargs.each do |k, v|
context << " " unless context.bytesize == 1
context << k.to_s << "=" << v.inspect
end
context << "] "
end

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to format_context. I haven't picked up on your suggestion yet, only because I'd like to first align on the format of context for this particular default formatter, before committing on a strategy.

Your suggestion would result, for the {a: "b", c: "d"} context, in the following: [a=b c=d]. I don't have a strong argument if others prefer that, but my approach was based on rails tagged logging, which is what I'd expect most to be familiar with, and for which I'd expect some level of tag-level search support in tools built around it.

LMK if this makes sense.

@HoneyryderChuck
Copy link
Author

HoneyryderChuck commented Aug 14, 2025

@jeremyevans thx for the initial review 🙏

I think for most of the cases where you would want to add context like this, it may be better off logging in a more structured format (typically JSON)...

Indeed, in the vast majority of the cases (including the ones I linked to in the related issue), structured JSON logging is the main and most widely use case for extra context. But different logging formats is something that the logger gem already supports (you can add a different format), whereas the main historical issue that has most implementations hammer around it is the lack of support in the API for passing context downstream from the log message callsite. You can have a look at the examples I linked (and there are several others) of logging libraries doing the same thing over and over, or look at the other open PR here which proposes a different solution to the same problem.

We could consider a Logger::JsonFormatter that outputs log messages in JSON. However, I'm not opposed to the general idea in this PR of supporting logging with context when doing plain text logging.

A JSON formatter could be interesting, but I'd prefer doing it as a separate PR. I think that there are already many "standards" for JSON structured logging (logstash being the one I'm most familiar with), and I'm not sure whether logger should commit to one, and who should decide on which. I don't mind discussing it here though. @hsbt as the maintainer of the gem, wdyt?

On another topic, I've added support for context both as array and hash. This is mostly because the former is used for rails tagged logging, and could perhaps be of some use there. Not sure if anyone from the rails core team could also have a look here to validate that, otherwise supporting both formats could be considered overkill.

@HoneyryderChuck HoneyryderChuck force-pushed the context-api branch 2 times, most recently from 5cbe4d1 to 0af6de2 Compare August 19, 2025 18:00
Copy link
Contributor

@jeremyevans jeremyevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks better. Obviously, this needs tests to be considered. I don't think the code submitted was ever tested, due to the obvious errors in it.

Note that Fiber.current is not available by default in Ruby 2.5. You need to require "fiber" to make it available.

Supporting only array and hash contexts seems odd to me. We should either support arbitrary contexts, or only hash contexts, I think. Support only hash contexts could be implemented by using keyword splat instead of positional argument.

I'm not sure it is worth supporting both block-based context and per-log-message context. It would be easier to support only block-based context, and only if the log formatter supports it. We could add this code:

class Logger
  def with_context(context, &block)
    formatter.with_context(context, &block)
  end
end

And move the with_context method to Logger::Formatter. That would result in no interface change between the logger and formatter. Custom loggers wouldn't support context, but would remain backwards compatible. The logger wouldn't need to check whether the formatter supported context.

@HoneyryderChuck HoneyryderChuck force-pushed the context-api branch 2 times, most recently from e6032a8 to 5f0ff64 Compare August 21, 2025 14:50
@HoneyryderChuck
Copy link
Author

@jeremyevans again, thx for your thorough review 🙏

Obviously, this needs tests to be considered. I don't think the code submitted was ever tested, due to the obvious errors in it.

Indeed, this was submitted as a "proof of concept" of the proposal in the issue, not intended to be accurate. However,given
the feedback sofar, it seems there is some level of support for it, so I added the missing tests to make it a bit more "official". LMK what you think. Obviously, documentation is still missing.

Note that Fiber.current is not available by default in Ruby 2.5. You need to require "fiber" to make it available.

"fiber" is already being required in logger.rb (it's not the first usage of Fiber.current, see the with_level method, which a more general #with method supporting :level and :context could perhaps replace in the future).

Supporting only array and hash contexts seems odd to me. We should either support arbitrary contexts, or only hash contexts, I think.

It's definitely something I considered in an earlier comment of mine, but bear in mind that one of the use cases for this API would be to "import" some of the functionality of rails tagged logging into the stdlib fold, and that uses tags as an array of strings. Ideally, I'd like to have the opinion of someone from rails core to vouch for the usefulness of this idea though, otherwise I agree that narrowing the scope to only hashes would probably be reasonable.

I'm not sure it is worth supporting both block-based context and per-log-message context. It would be easier to support only block-based context, and only if the log formatter supports it.

I'd be more verbose IMO. Could make sense, but just for the same of discussion, if you look at level, you can init it, re-set it at the instance level, or change it on a given scope via with_level:

# init
logger = Logger.new(STDOUT, level: 0)
# reset
logger.level = 1
# scope level
logger.with_level(2)

So context would add a new idiom (:context kwarg at #add), while supporting scope level setup, and not supporting init or re-set. It'd probably make sense to support setting the context on init, but definitely not at the accessor level (I'd argue that those should be deprecated for other properties). As for per-message, I'm torn for now, although a potential #with method supporting other properties could significantly decrease verbosity.

But this is also the reason why I don't think that context handling should be offloaded to the formatter. It makes sense from a backwards compatibility angle, but that type of state belongs IMO to the logger (formatter should be stateless, and could be shared across logger instances).

@HoneyryderChuck HoneyryderChuck changed the title WIP: Context api Context api Sep 24, 2025
@HoneyryderChuck
Copy link
Author

@jeremyevans @nobu @sonots do you have any more feedback? What's missing in order to make this mergeable?

@jeremyevans
Copy link
Contributor

In terms of moving forward, I'm not the logger maintainer, so I can only review and offer general advice. In regards to merging, you would need approval from @sonots. If after a reasonable amount of time (maybe 3 months) @sonots does not respond, you could add a Redmine issue and add it as a topic to be discussed during a developer meeting.

My feelings on the implementation haven't changed much:

  • I don't think we should support both with_context and logging-call specific context, it seems like unnecessary complexity. We should only support with_context.

  • I think having context on the formatter makes more sense and avoids backwards compatibility issues. The context is only used in the formatted output, and not for any other logger purpose (unlike level). I understand you are worried about sharing formatters between loggers, but that isn't the default behavior (default behavior is separate formatter instance per logger instance), and considering the tradeoffs, I think storing the context in the formatter is better than storing it in the logger.

  • The code doesn't appear compatible with the supported Ruby version (2.5).

@nevans
Copy link

nevans commented Sep 27, 2025

I'm not a maintainer either, but I do want logging context added to core. And I've been using a homegrown version of my own in production for over 12 years.

I actually did mine almost entirely via the formatter (like @jeremyevans suggested) and that works okay... but looking at other structured logging libraries (in ruby and in other languages), I actually think it makes more sense to have the #with_context API on Logger. I don't think it should be "stored" on the Logger, but in a more generic "Context" class, e.g: similar to go's context.Context). But, lacking that, that's where my earlier suggestions about context_store and context_key came from.

I don't have a strong opinion about #with_context vs logging call specific context, other than: I do find it convenient to have it call-specific context, but the PR would be smaller with it only on Logger#with_context. And we can always add it to individual calls in a later PR.

My much bigger concern relates to @jeremyevans's other comments about supporting different context types (Hash vs Array vs ???). I would strongly recommend simplifying this PR to only support context as a Hash, and simplifying the formatter to drop rails style tags and only a support a basic structured logging-ish style. Rails style "tags" can easily be added to subclasses: just have a special case for context[:tags].

Related to the above points, unlike this implementation, my private implementation "cheats" by declaring that the message parameter can be a hash everywhere, it merges message (i.e: local context) with existing context (the normal message just becomes context[:message], and then sends the combined context/message hash to the formatter as the single "msg" parameter. I'm not convinced that's The Right Way, so this isn't a recommendation that you do that. But that's mostly worked out okay for me, and it requires fewer backward incompatible API changes. So maybe it's an approach to consider.

Copy link

@nevans nevans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I typed these review comments up a few weeks ago, but never sent them in. They assume we continue with the API change where context is sent in as a keyword argument. But, we could also reduce the backwards-incompatible API changes by allowing message parameters to combine context and message.

context.dup
end
end
end
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we assume that context is always a hash, this can be simplified to something like:

Suggested change
end
# Override in subclass to allow non-Hash context types
private def merge_context(prev_context, context)
ctx_hash = Hash.try_convert(context) \
or raise TypeError, "can't convert %s into Hash" % [context]
prev_context.nil? ? ctx_hash.dup : prev_context.merge(ctx_hash)
end

# Default formatter for log messages.
class Formatter
Format = "%.1s, [%s #%d] %5s -- %s: %s\n"
Format = "%.1s, %s[%s #%d] %5s -- %s: %s\n"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO, we should combine message and context and leave the format string unchanged:

Suggested change
Format = "%.1s, %s[%s #%d] %5s -- %s: %s\n"
Format = "%.1s, [%s #%d] %5s -- %s: %s\n"

Just for a quick visual scanning of logs: the more static fields don't vary much in length, so they should stay before context, which is going to vary in length much more.

Also, although non-local context is often relatively static and worth putting before the message, which context fields should be prioritized over the message depends on the application. IMO, the best generic solution is to print all the context after the message. For example, that's what the go stdlib structured logging package does: with the default formatter, slog.Info("hello", "count", 3) prints out 2022/11/08 15:28:26 INFO hello count=3.

Ideally, custom formatter subclasses should be able to trivially print context before, after, or around the message.

Comment on lines 21 to 34
def format_context(context)
context_str = case context
when Hash
context.filter_map { |k, v| "[#{k}=#{v}]" unless v.nil? }.join(" ")
when Array
context.filter_map{ |v| "[#{v}]" unless v.nil? }.join(" ")
else
context.to_s.dup
end

context_str << " " unless context_str.empty?

context_str
end
Copy link

@nevans nevans Sep 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @jeremyevans that the default formatter should simplify by only allowing Hash context. And, as I wrote in my other comments, format_context shouldn't be concerned with trailing or preceding whitespace. Also, although I do like the style, the default formatter shouldn't surround context fields with brackets (IMO).

With those simplifications, I'd prefer to spend the default formatter's "complexity budget" elsewhere. 😉

  • Extracting more more format_{type} methods provides obvious extension points for common subclasses patterns.
  • simply joining "#{k}=#{v}" is too naive, IMO. IMO, context values are much more likely than messages to include sensitive or externally sourced values (and thus be a source of security issues). So I think we need some form of built-in escaping, and an easy extension point to add filtering.
    def format_message_with_context(msg, ctx)
      join_fields msg2str(msg), format_context(ctx)
    end

    def join_fields(*fields)
      fields.reject {|f| f.nil? || f.empty? }.join(" ")
    end

    def format_context(context)
      return unless context
      context.filter_map {|k, v| format_pair(k,v) }.join(" ")
    end
    
    def format_pair(k, v)
      "#{k}=#{format_value(v)}" unless v.nil?
    end
    
    def format_value(value)
      # "redundant" interpolation to avoid crash if to_s doesn't return a string
      # https://github.com/ruby/spec/blob/3affe1e54fcd11918a242ad5d4a7ba895ee30c4c/language/string_spec.rb#L130-L141
      value = "#{value}"
      value = value.dump if value.match?(/[[:space:][:cntrl]]/)
      value
    end

This enables subclasses like the following, that's kind of like ActiveSupport::TaggedLogger.

class TaggedLogFormatter < Logger::Formatter
  PRE_MSG_CTX = %i[user_id request_id controller action].freeze

  # add brackets to each context pair, and treat key like a tag when value is true
  def format_pair(k, v)
    return format_tag(k) if v == true
    (str = super) && "[#{str}]"
  end

  # print some context before the message and some after:
  def format_message_context(msg, ctx)
    return msg2str(msg) unless ctx

    fields  = Array(ctx[:tags]).map { format_tag(_1) }
    fields << format_context(ctx.slice(*PRE_MSG_CTX))
    fields << msg2str msg
    fields << format_context ctx.except(:tags, *PRE_MSG_CTX)

    join_fields(*fields)
  end

  def format_tag(tag) = "[#{format_value _1}]"
end

Or, here's one that's kinda like logfmt for msg and context, but it uses the (more compact) default format for the other standard fields.

class NotQuiteLogFmtFormatter < Logger::Formatter
  # Use ISO8601
  def format_datetime(time) = time.utc.iso8601(6)

  # Formats as msg="my message"
  # or error="error messege [MyError]\n/path/to/file.rb:123 in 'foo'\netc..."
  def msg2str(msg)
    name = msg.is_a?(Exception) ? "error" : "msg"
    format_pair(name, super(msg))
  end

  # flip the order: prints msg="my message" before context
  def format_message_context(msg, ctx)
    join_fields msg2str(msg), format_context(ctx)
  end

  # Standardize formats for Float, Time, #to_ary
  def format_value(value)
    case value
    in String                       then super
    in Float                        then format("%.3f", value)
    in Time                         then format_datetime(time)
    in ->{ _1.respond_to?(:to_ary)  then format_array(value)
    else                                 super
    end
  end

  def format_array(array)
    format_value "[#{Array(array).map {|v| format_value(v) }.join(", ")}]"
  end
end

(I haven't tested these at all... only typed them here in the github comment box, so... they probably don't work as-is.)

def call(severity, time, progname, msg)
sprintf(Format, severity, format_datetime(time), Process.pid, severity, progname, msg2str(msg))
def call(severity, time, progname, msg, context: nil)
sprintf(Format, severity, format_context(context), format_datetime(time), Process.pid, severity, progname, msg2str(msg))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Echoing my Format string comment here: by default the context should be adjacent to the message, after the always present (nearly) static width fields.

Suggested change
sprintf(Format, severity, format_context(context), format_datetime(time), Process.pid, severity, progname, msg2str(msg))
sprintf(Format, severity, format_datetime(time), Process.pid, severity, progname, format_message_with_context(msg, context))

Ideally, custom formatter subclasses should be able to trivially print context before, after, or around the message. And, for that to work, format_context shouldn't be concerned with trailing or preceding whitespace. The caller should handle that.

While that merger could happen here, common subclass patterns can be simplified by extracting some more methods. I suggest pushing msg2str and format_context into a new method and letting that method merge context and message. I'll add a possible #format_message_with_context implementation to my comments on #format_context.

@halo halo mentioned this pull request Sep 29, 2025
@nevans
Copy link

nevans commented Sep 29, 2025

@HoneyryderChuck I fixed up my review comments and pushed a branch with them here: master...nevans:logger:context-api. I also fixed the tests for ruby 2.5 and 2.6, added some rdoc, made the other severity methods match #info, etc.

@HoneyryderChuck
Copy link
Author

HoneyryderChuck commented Sep 30, 2025

Thank you both so far for all the feedback! I resumed the raised concerns into the following list:

  1. (@jeremyevans) with_context block and per call support too much. stick to the former.
  2. (@jeremyevans) context in formatter rather than logger.
    • (@nevans) suggestion: don't care where, but object must be like go's context.Context .
  3. (both) hash vs array - pick one (Former)
    • (@nevans) suggestion: Rails style "tags" can easily be added to subclasses: just have a special case for context[:tags].
  4. (@nevans) context log format should be like go, concat at the end
  5. (@nevans) moar format_* methods
    • to avoid i.e. leaking PII data

I'll add some thoughts, with the intent to document pros and cons for others joining the conversation.

1

If I understand the suggestion, it's about reducing the footprint and complexity of the change. It's a valid reason, and rails (tagged) logging only supports the block variant as well; on the other hand, other logging gems which implement this functionality allow passing context both per block and inline; you can see how semantic_logger. logstash-logger and contextual-logger do it. This lack of parity could be surprising both for new users, who could see an advantage in picking up an alternative, or could make migrating away from the alternatives to logger gem difficult.

2

I agree that putting it in the formatter could make backwards compatibility simpler. Conceptually, formatters don't keep state around (the message, as well as other supported metadata, are passed as argument), and managing it is a bit beyond its scope IMO. For the sake of argument, looking at the gems mentioned above, semantic_logger stores context in Thread.current while allowing tagged to be called both from the logger as well as the main module; logstash-logger calls the formatter to manage tags, as per the suggestion, although it stores in Thread.current under the hood; conceptual_logger sets the API at the logger (via mixin), and uses a context object (more a less like what @nevans suggest) to stores it, too, in Thread.current. Feels like a standard, not sure what's your take on this.

3

I agree that focusing on one could make the change simpler. But the applied principle was the same as in 1, i.e. there's a known and ancient API (rails tagged logging) which supports arrays, and ideally simpler programs requiring similar tagged logging wouldn't need to jump through hoops and install activesupport just to benefit from the feature. Looking at other examples, semantic_logger supports both arrays and hashes (dropping down to arrays for multiple positional arguments); logstash-logger adds the array of tags into a :tags field of the json object; conceptual_logger does the same (but the field is called :log_tags). The last two follow @nevans suggestion, but to be fair, logstash events defined that, so I interpret it more as an argument for "a logstash event formatter would do this to tags".

4

I'm not sure how putting tags at the end, like go does, would be a sensible choice. For once, go's structured logging format isn't known to rubyists (who do not know go), so it'd be just another standard. And looking at the output, it isn't clear to me whether it was generated from logger.info("hello count=3") or logger.info("hello", hello count: 3), which makes it somewhat ambiguous. Was this raised by the go community?

5

I understand the appeal of the idea, but it wouldn't IMO compose well for other potential formatters, i.e. a json formatter would expect to dump the context "as is" into a function like JSON.dump, not traverse and pre-process it by calling the several format_*variants. While the danger of leaking data exists, that should be a concern of the API user. A better compromise could be to do that at "context-add" time (thereby storing the redacted value), but that'd put the API in logger, and it's a completely different feature. Moreover, none of the other alternative logging frameworks do anything of the kind.


Bear in mind that I'm not fundamentally against any of the suggestions (perhaps starting small could validate the rest), but I'd like to have more inputs before trimming the implementation. In order to spare you more back and forth, and because I suspect that getting feedback from @sonots will take more than 3 months, I'll go ahead and open an issue in the bug tracker and hopefully gather some more opinions around the loose ends.

Compatibility with ruby 2.5 was restored.

@nevans
Copy link

nevans commented Oct 1, 2025

  • object must be like go's context.Context

Not really a must :). And when I say "like go's context", I'm speaking more by analogy, because there's a lot about it that annoys me. Go's context.Context is an (annoyingly explicit) container for execution state, analogous to Fiber locals, Thread locals, Fiber storage, etc. Kotlin's coroutine context is a similar concept, but less familiar. ActiveSupport::IsolatedExecutionState from rails is another example of the concept.

But actually, there's an interesting distinction here for our purposes. And slog always sends the context to its slog.Handler (if the caller doesn't send its context, the context.Background is used by default). But, I don't believe the default handlers actually use the context at all: it's there for custom handlers. Instead, slog attaches its structured logging attributes directly to its slog.Logger struct, and makes it easy to cloned loggers with updated attributes.

And that's a pattern I've used myself, a lot: different logger for different classes. Some of the logged fields come from the execution context, and other logged fields come from the logger itself.

So I added another commit to another branch to implement that: master...nevans:logger:default-context 🙂 In that branch, you can assign a "default context" via #initialize, Fiber-local context is merged over the logger's default context, and if you call #with_context without a block it'll return a cloned logger with your new context merged in. Maybe it's less confusing if we used different terms for "fields" vs "context"? But IMO it's simplifying to merge them into a single concept.

@HoneyryderChuck
Copy link
Author

@nevans default context is definitely a great idea, and something I considered implementing along the way. I'll make sure to cherry-pick your change.

@nevans
Copy link

nevans commented Oct 2, 2025

@HoneyryderChuck to (verbosely) address this point:

I'm not sure how putting tags at the end, like go does, would be a sensible choice.

Firstly, you aren't expected to use the default handler in production. They include a pure logfmt handler and a JSON handler:

slog.Info("hello", "count", 3)
// 2022/11/08 15:28:26 INFO hello count=3

logfmt := slog.New(slog.NewTextHandler(os.Stderr, nil))
logfmt.Info("hello", "count", 3)
// time=2022-11-08T15:28:26.000-05:00 level=INFO msg=hello count=3

json := slog.New(slog.NewJSONHandler(os.Stdout, nil))
json.Info("hello", "count", 3)
// {"time":"2022-11-08T15:28:26.000000000-05:00","level":"INFO","msg":"hello","count":3}

But also, though it's not ideal, I still think their default handler may be the best default for human readable logs. And (assuming you haven't made the switch to a fully parseable structured logging formatter) it may also be the best default for machine parsing. I'll explain below.

logfmt

For once, go's structured logging format isn't known to rubyists (who do not know go), so it'd be just another standard.

The default handler is an "impure" version of logfmt, and the very first logfmt blog post was written (by a Heroku employee?) with a ruby code example (using Sinatra). :) And, although logfmt isn't formally standardized like JSON, it's fairly intuitive, far more human readable than JSON, and is widely used in many languages (including in the logfmt gem). I process my production logfiles using loki's logfmt parser or the logfmt support in alloy or promtail.

The "impure" logfmt is such an intuitive format that many people (myself included) have simply stumbled into using (inconsistently) inside our own custom log messages. The impure format is just a regular log entry with some key=value pairs inside the message. In my experience, the key=value pairs usually make most sense at the end, although the toy examples don't really show that. I'd used the "impure logfmt" format for years before I ever heard of logfmt or structured logging:

logger.debug { "ambifacient lunar waneshaft side fumbling: foo=%p bar=%p" % [foo, bar] }

For human readable logs

The message is still the most important part of almost all log entries. So, for human readable logs, the message should come after a relatively limited set of standard fixed width "header" fields: e.g: timestamp, pid, component (progname), request ID, IP address, etc. But, without any other application or framework specific knowledge of which context fields deserve prime position (e.g: request id, user id, IP address), message should come before context.

Because, without application knowledge, context is dynamic width and has no practical limits. So putting it first could bounce the message around a lot. This is especially true of context that's been inherited through call stack. When I (a human) am scanning log lines, the messages can bounce around a little bit, but I want the messages in most lines to mostly line up, visually. But if I'm tracing log entries produced by a single log.info "so... this happened" line of code, I need them to line up visually with each other. Dynamic context breaks the ability for me to quickly scan the logs with my eyes.

Different context types

If your primary use case is rails tags... well, for the most part those are relatively limited, standard, and don't vary widely range in width. We could easily customize the formatter to tell it which fields come before the message. But the default unconfigured formatter won't know which fields those are.

This is also another place it it may make sense to distinguish between a logger's default context (where am I?), the fiber-local context (how did I get here?), and local context (extra structured information for specific log messages). It often makes sense to put the first two types before the message. Only rarely would I want the last type before the message. But it's always "safest" to put all three after the message.

So, like go's slog, maybe it's worth letting the formatter know which fields are "context" and which fields are passed in locally?

Examples

Regardless of everything I just wrote, "human readable" is admittedly very subjective. Nevertheless, I believe you'll see what I'm talking about with some examples. Please compare the following:

I, baz=false [2025-10-02T17:57:50.947152 #3705893]  INFO -- : this is my message
I, foo=bar baz=true [2025-10-02T17:57:50.947233 #3705893]  INFO -- : this is my message
I, foo=bar [2025-10-02T17:57:50.947262 #3705893]  INFO -- : this is my message
I, casing="malleable logarithmic" foo=bar [2025-10-02T17:57:50.947282 #3705893]  INFO -- : this is my message
I, etc="what is this" [2025-10-02T17:57:50.947301 #3705893]  INFO -- : this is my message
I, foo=bar [2025-10-02T17:57:50.947320 #3705893]  INFO -- : this is my message
I, context="can vary wildly in size, there's really no limit to this." [2025-10-02T17:57:50.947338 #3705893]  INFO -- : 
this is my message
I, casing="malleable logarithmic" foo=bar [2025-10-02T17:57:50.947386 #3705893]  INFO -- : this is my message
I, foo=bar [2025-10-02T17:57:50.947417 #3705893]  INFO -- : this is my message
I, baz=false [2025-10-02T17:57:50.947440 #3705893]  INFO -- : this is my message
I, context="can vary wildly in size, there's really no limit to this." [2025-10-02T17:57:50.947458 #3705893]  INFO -- : 
this is my message
I, foo=bar [2025-10-02T17:57:50.947478 #3705893]  INFO -- : this is my message
I, etc="what is this" [2025-10-02T17:57:50.947497 #3705893]  INFO -- : this is my message
I, foo=bar baz=true [2025-10-02T17:57:50.947515 #3705893]  INFO -- : this is my message

a big mess, barely readable, IMO.

I, [2025-10-02T17:51:28.403121 #3705893]  INFO -- : baz=false this is my message
I, [2025-10-02T17:51:28.403183 #3705893]  INFO -- : context="can vary wildly in size, there's really no limit to this." this is my message
I, [2025-10-02T17:51:28.403205 #3705893]  INFO -- : foo=bar this is my message
I, [2025-10-02T17:51:28.403222 #3705893]  INFO -- : foo=bar baz=true this is my message
I, [2025-10-02T17:51:28.403238 #3705893]  INFO -- : casing="malleable logarithmic" foo=bar this is my message
I, [2025-10-02T17:51:28.403253 #3705893]  INFO -- : foo=bar this is my message
I, [2025-10-02T17:51:28.403268 #3705893]  INFO -- : etc="what is this" this is my message
I, [2025-10-02T17:51:28.403287 #3705893]  INFO -- : foo=bar this is my message
I, [2025-10-02T17:51:28.403305 #3705893]  INFO -- : casing="malleable logarithmic" foo=bar this is my message
I, [2025-10-02T17:51:28.403321 #3705893]  INFO -- : foo=bar baz=true this is my message
I, [2025-10-02T17:51:28.403337 #3705893]  INFO -- : context="can vary wildly in size, there's really no limit to this." this is my message
I, [2025-10-02T17:51:28.403353 #3705893]  INFO -- : etc="what is this" this is my message
I, [2025-10-02T17:51:28.403388 #3705893]  INFO -- : baz=false this is my message
I, [2025-10-02T17:51:28.403404 #3705893]  INFO -- : foo=bar this is my message

the messages are still bopping about from line to line, but at least I can read the timestamps, PID, and program name now!

I, [2025-10-02T17:59:15.980873 #3705893]  INFO -- : this is my message context="can vary wildly in size, there's really no limit to this."
I, [2025-10-02T17:59:15.980906 #3705893]  INFO -- : this is my message baz=false
I, [2025-10-02T17:59:15.980918 #3705893]  INFO -- : this is my message etc="what is this"
I, [2025-10-02T17:59:15.980928 #3705893]  INFO -- : this is my message foo=bar
I, [2025-10-02T17:59:15.980936 #3705893]  INFO -- : this is my message foo=bar
I, [2025-10-02T17:59:15.980945 #3705893]  INFO -- : this is my message foo=bar baz=true
I, [2025-10-02T17:59:15.980953 #3705893]  INFO -- : this is my message casing="malleable logarithmic" foo=bar
I, [2025-10-02T17:59:15.980963 #3705893]  INFO -- : this is my message foo=bar
I, [2025-10-02T17:59:15.980972 #3705893]  INFO -- : this is my message casing="malleable logarithmic" foo=bar
I, [2025-10-02T17:59:15.980981 #3705893]  INFO -- : this is my message etc="what is this"
I, [2025-10-02T17:59:15.980989 #3705893]  INFO -- : this is my message foo=bar baz=true
I, [2025-10-02T17:59:15.980997 #3705893]  INFO -- : this is my message foo=bar
I, [2025-10-02T17:59:15.981006 #3705893]  INFO -- : this is my message context="can vary wildly in size, there's really no limit to this."
I, [2025-10-02T17:59:15.981014 #3705893]  INFO -- : this is my message baz=false

This is what I proposed. It may not be ideal, but at least I can read the messages without getting lost scanning each line.

For machine parseable logs

Strictly speaking, it doesn't matter where pure structured logging formats like logfmt or JSON logs put the message. But, let's assume you haven't made the switch to pure logfmt or JSON, for one reason or another. Where should the context go by default for machine readability?

For context (about my personal interest in the feature), I use loki to filter and parse my logs, including many that are not in a "pure" structured logging format.

And looking at the output, it isn't clear to me whether it was generated from logger.info("hello count=3") or logger.info("hello", hello count: 3), which makes it somewhat ambiguous

Yes, in the toy example it's ambiguous... but I'd guess that the ambiguity very rarely matters. Because, in practice, the logger.info "hello count=3" is really just an old line of code that predates structured logging, and we haven't gotten around to updating it to logger.info("hello", count: 3) yet! 😉 In this scenario, if the user switches to a pure logfmt or JSON formatter, it would actually make it harder to parse "count=3", because now it's encoded inside the "msg" field: (msg="hello count=3" or {"msg": "hello count=3}).

So, the way logfmt parsers usually1 work is that:

  • They ignore "garbage", i.e: anything that isn't a valid key, atom value, or quoted value.
  • Bare atoms are tags, e.g: key is parsed as either {key: true} or {key: nil}.
  • Last duplicate wins, e.g: hello="good bye" hello=world is parsed as {hello: "world"} (although some parsers output an array of pairs, including duplicates)

This means that logfmt parsers can generally handle "impure" log entries just fine. They can parse hello count=3 into {hello: true, count: 3}, and that's almost certainly what you want, regardless of whether "count=3" was technically in the "message". But that also means that any structured fields that come before the message can be clobbered by the message (e.g: `count=3 Got count of users").

Theoretically, the parser could work around that, e.g: maybe bare word tags won't override key=value pairs. But parsing impure entries isn't the parser's priority, and those heuristics might be wrong... they can't know. None of the parsers I looked at bothered with any workarounds. So, putting the context last is "safest" from a machine readability perspective.

This means that, if you're not going to encapsulate all of the fields (including the message) in logfmt, it's safest to put the logfmt fields last, so they can override any accidental "tag fields" that are parsed from the message.

Like I said, I do use logfmt parsers for logfiles that haven't been switched to pure structured logging yet. So this is a practical issue for me.

Was this raised by the go community?

I don't know. Probably. But there are practical reasons, both for humans and parsers, so I'm maybe they had similar rationale to mine? 🤷🏻

An updated proposal

Anyway, with all of that said: I partially agree with you. In the examples we've been looking at, tacking the context on at the end does seem a bit weird to read. It doesn't really help that it's a better default... it's still not great. But, since I took the time to read through the logfmt parsers and how they work, I now know how they handle "garbage". So I now feel comfortable saying we could keep the logfmt parsers happy while also being more human readable.

Here's a small modification to my proposal:

I, [2025-10-02T18:00:31.424031 #3705893]  INFO -- : this is my message  [context: etc="what is this" ]
I, [2025-10-02T18:00:31.424096 #3705893]  INFO -- : this is my message  [context: foo=bar ]
I, [2025-10-02T18:00:31.424119 #3705893]  INFO -- : this is my message  [context: casing="malleable logarithmic" foo=bar ]
I, [2025-10-02T18:00:31.424137 #3705893]  INFO -- : this is my message  [context: baz=false ]
I, [2025-10-02T18:00:31.424153 #3705893]  INFO -- : this is my message  [context: foo=bar baz=true ]
I, [2025-10-02T18:00:31.424169 #3705893]  INFO -- : this is my message  [context: context="can vary wildly in size, there's really no limit to this." ]
I, [2025-10-02T18:00:31.424186 #3705893]  INFO -- : this is my message  [context: foo=bar ]
I, [2025-10-02T18:00:31.424205 #3705893]  INFO -- : this is my message  [context: foo=bar ]
I, [2025-10-02T18:00:31.424228 #3705893]  INFO -- : this is my message  [context: foo=bar baz=true ]
I, [2025-10-02T18:00:31.424255 #3705893]  INFO -- : this is my message  [context: baz=false ]
I, [2025-10-02T18:00:31.424281 #3705893]  INFO -- : this is my message  [context: etc="what is this" ]
I, [2025-10-02T18:00:31.424298 #3705893]  INFO -- : this is my message  [context: casing="malleable logarithmic" foo=bar ]
I, [2025-10-02T18:00:31.424314 #3705893]  INFO -- : this is my message  [context: context="can vary wildly in size, there's really no limit to this." ]
I, [2025-10-02T18:00:31.424331 #3705893]  INFO -- : this is my message  [context: foo=bar ]

Yes, the space before the closing bracket is necessary. 😦 Half of the parsers I looked at will treat the brackets as atom characters.

What do you think? Would something similar make sense to you?

Footnotes

  1. by "usually", I mean "in the four (4) parsers I looked at". 😜

@HoneyryderChuck
Copy link
Author

HoneyryderChuck commented Oct 3, 2025

@nevans thx for the extra info, I never used logfmt before, so I stand corrected.

About the default log format you propose, I agree with you that, when it comes to reading, it definitely makes it more pleasant. Honestly, my whole point for going with that format for tags was more about be less surprising than personal preference. And most rubyists will experience logs through the lens of the rails logger. And looking at how other logging libraries format context/tags by default, such as lograge or semantic_logger, the same pattern applies, i.e. the message comes at the end, after tags/context.

I'm also thinking that the main goal of this proposal is to enable per-scope/call context, more than how things should be formatted.

So for now, I'll lean on the conservative side and keep the proposal as-is (unless you convince the maintainer of the gem, at which point I'll happily change it).

@nevans
Copy link

nevans commented Oct 3, 2025

Sorry. That last comment was huge... and yet I have another large addendum:

Because I hate that closing brace in my updated proposal (just aesthetically 😉)! So, how about this?

I, [2025-10-02T18:00:31.424031 #3705893]  INFO -- : this is my message  [Context]: etc="what is this"
I, [2025-10-02T18:00:31.424096 #3705893]  INFO -- : this is my message  [Context]: foo=bar
I, [2025-10-02T18:00:31.424119 #3705893]  INFO -- : this is my message  [Context]: casing="malleable logarithmic" foo=bar
I, [2025-10-02T18:00:31.424137 #3705893]  INFO -- : this is my message  [Context]: baz=false
I, [2025-10-02T18:00:31.424153 #3705893]  INFO -- : this is my message  [Context]: foo=bar baz=true
I, [2025-10-02T18:00:31.424169 #3705893]  INFO -- : this is my message  [Context]: context="can vary wildly in size, there's really no limit to this."
I, [2025-10-02T18:00:31.424186 #3705893]  INFO -- : this is my message  [Context]: foo=bar
I, [2025-10-02T18:00:31.424205 #3705893]  INFO -- : this is my message  [Context]: foo=bar
I, [2025-10-02T18:00:31.424228 #3705893]  INFO -- : this is my message  [Context]: foo=bar baz=true
I, [2025-10-02T18:00:31.424255 #3705893]  INFO -- : this is my message  [Context]: baz=false
I, [2025-10-02T18:00:31.424281 #3705893]  INFO -- : this is my message  [Context]: etc="what is this"
I, [2025-10-02T18:00:31.424298 #3705893]  INFO -- : this is my message  [Context]: casing="malleable logarithmic" foo=bar
I, [2025-10-02T18:00:31.424314 #3705893]  INFO -- : this is my message  [Context]: context="can vary wildly in size, there's really no limit to this."
I, [2025-10-02T18:00:31.424331 #3705893]  INFO -- : this is my message  [Context]: foo=bar

And, since the rails tags style seems to be a big part of your inspiration/motivation, and I do use rails style log tags myself (quite a bit!), I think it does make sense to support that explicitly, yeah? Although I do feel that enforcing "context is always hash" is a good simplifying default (and one that subclasses can always override), treating context[:tags] as something special seems totally reasonable to me, too.

Also, currently we can customize the formatter's datetime format. Why not also add the ability to customize which fields are placed before the message, and how they printed?

logger.formatter.prefix_context_fields = %i[request_id ip_address user_id]
logger.info "my message", context: {foo: "bar", user_id: 123, request_id: "some-uuid"}
# => I, [2025-10-02T18:00:31.424331 #3705893]  INFO -- request_id=some-uuid user_id=123 : this is my message  [Context]: foo=bar

And, while the bracketed tag approach is less machine parseable, it makes a lot of sense for human readable logs. So why not add the ability to customize how which fields are treated as tags, and how those tags appear?

# configure how "extra context" (not configured as a prefix) is delimited
logger.formatter.context_format = " %s"              # the simplest form
logger.formatter.context_format = " [context: %s ]"  # my updated proposal yesterday
logger.formatter.context_format = " [Context]: %s"   # my proposal above, in this comment
logger.formatter.context_format = " ~~~ %s"          # used by my examples, below

# context for all of the following examples
context = {
  foo: "bar",
  user_id: 123,            # we want to print this as a prefix, keeping "user="
  request_id: "some-uuid", # we want to print this as a prefix tag, without "request_id="
  component: "settings",   # we want to print this as a prefix tag, without "component="
  tags: %i[foo bar baz],   # we want to treat this as a list of tags
  admin_user: true,        # always printed as a tag when value is true
}
logger.with_context(**context) do
  # with no extra customization
  logger.info "this is my message"
  # => I, [2025-10-02T18:00:31.424331 #3705893]  INFO -- : this is my message  ~~~ foo=bar user_id=123 request_id=some-uuid component=settings tags="[\"foo\", \"bar\", \"baz\"]" admin_user

  # Configure fields to be extracted from context and printed first, in a specific order.
  logger.formatter.prefix_context_fields = %i[request_id ip_address component user_id tags]
  # To delimit from the message, print the prefix context between 'progname' and ':'.
  logger.info "this is my message"
  # => I, [2025-10-02T18:00:31.424331 #3705893]  INFO -- request_id=some-uuid component=settings user_id=123 tags="[\"foo\", \"bar\", \"baz\"]" : this is my message  [Context]: foo=bar admin_user

  # configure some array field values to be handled as an array of tags
  logger.formatter.prefix_tag_fields = %[tags]  # maybe this should be the default
  logger.info "this is my message"
  # => I, [2025-10-02T18:00:31.424331 #3705893]  INFO -- request_id=some-uuid component=settings user_id=123 foo bar baz : this is my message  ~~~ foo=bar admin_user

  # convert some some string field values to be printed as tags
  logger.formatter.prefix_tag_fields += %i[request_id ip_address component]
  logger.info "this is my message"
  # => I, [2025-10-02T18:00:31.424331 #3705893]  INFO -- some-uuid settings user_id=123 foo bar baz : this is my message  ~~~ foo=bar admin_user

  # Add brackets to prefix context fields (for readability)
  logger.formatter.bracket_prefix_context_fields = array # specific prefix_content_fields
  logger.formatter.bracket_prefix_context_fields = :tags # only "tag" prefix_content_fields
  logger.formatter.bracket_prefix_context_fields = true  # all prefix_content_fields
  logger.info "this is my message"
  # => I, [2025-10-02T18:00:31.424331 #3705893]  INFO -- [some-uuid] [settings] [user_id=123] [foo] [bar] [baz] : this is my message  ~~~ foo=bar admin_user

With a few other minor tweaks, you basically have the format used by ActiveSupport.

BUT, with all of that said, I think the first PR should be as simple as possible. So, none of the above. IMO, the above should be one or more follow up PRs, so formatter enhancements can be discussed separately. And, printing the context last (maybe with a delimiter) is, IMO, the simplest thing.

@HoneyryderChuck
Copy link
Author

HoneyryderChuck commented Nov 27, 2025

Following up on this message, @sonots the gem maintainer hasn't provided feedback in the 3 months since I opened the PR. @jeremyevans @byroot @nobu tagging you as core maintainer who have provided feedback, can I ask you to bring this subject to the core team and find a path forward?

As a sidenote, I wrote a blog post which provides more context about motivation / state of the art that otherwise wouldn't fit in a PR comment, hope it's useful.

@byroot
Copy link
Member

byroot commented Nov 27, 2025

The process to get this sort of features discussed is to add the ticket to the next developer meeting: https://bugs.ruby-lang.org/issues/21689

Have you done that already?

That being said, it's mostly for ruby-core, gems it kinda depends.

As for my personal feedback, IMO a library like logger shouldn't have to know about Fiber / Thread etc. If this was me I'd just go with a generic "context store" API, and it'd be the user responsability to provide the right context store implementation for the execution environment.

@HoneyryderChuck
Copy link
Author

@byroot thx, wasn't aware of the process, already added.

I don't mind the idea of providing context store as an argument, as long as the logger lib ships with the implementations for thread-safe/fiber-safe/fiber-safe-with-inheritance, to avoid having same-but-different implementations spread across.

@byroot
Copy link
Member

byroot commented Nov 28, 2025

wasn't aware of the process, already added.

I'd recommend including a bit more of the high level justifications in your bullet points.

@HoneyryderChuck
Copy link
Author

thx for the suggestion, updated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants