Skip to content

feat: compress event log entries#18029

Closed
matejvasek wants to merge 1 commit intocontainers:mainfrom
matejvasek:compress-log-entries
Closed

feat: compress event log entries#18029
matejvasek wants to merge 1 commit intocontainers:mainfrom
matejvasek:compress-log-entries

Conversation

@matejvasek
Copy link
Copy Markdown
Contributor

@matejvasek matejvasek commented Apr 3, 2023

Compress event log entries

Compress labels and configuration inspect data that are longer that 1024 bytes.

The compressed fields:

  • PODMAN_LABELS
  • PODMAN_CONTAINER_INSPECT_DATA

Compatibility notes

  • Backward compatible ✅
  • Forward compatible ❌

Output of journalctl is still the same since it automatically decompresses these fields.

None

Compression ratio example:

docker image inspect  gcr.io/paketo-buildpacks/builder | jq '.[0].Config' | wc -c              
103999
docker image inspect  gcr.io/paketo-buildpacks/builder | jq '.[0].Config' | xz --stdout | wc -c
12836
# the container config JSON is 8 time smaller after compression.

@openshift-ci openshift-ci Bot added release-note-none do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Apr 3, 2023
@matejvasek matejvasek force-pushed the compress-log-entries branch 2 times, most recently from 2414b49 to 7b397aa Compare April 3, 2023 19:06
Copy link
Copy Markdown
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a really interesting idea, @matejvasek!

LGTM from my end but maybe I am missing some side effect.

@Luap99 @giuseppe @edsantiago @cevich WDYT?

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 4, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: matejvasek, vrothberg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 4, 2023
Comment thread libpod/events/journal_linux.go Outdated
Copy link
Copy Markdown
Member

@Luap99 Luap99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As per https://www.freedesktop.org/software/systemd/man/journald.conf.html#Options Compress should be yes by default which means journald will compress based on that configuration so I do not think podman should ignore this option. We should just let journald take care of it.

@matejvasek
Copy link
Copy Markdown
Contributor Author

matejvasek commented Apr 4, 2023

@Luap99 Yes, that's true, it is storing items above 512B compressed. This is more about transfer limits.
See #17854 (comment) , tl;dr writing entries above 212940B is troublesome.
It should be eventually fixed by updating SELinux policy, but this could help too.

@Luap99
Copy link
Copy Markdown
Member

Luap99 commented Apr 4, 2023

But why does this work? Wouldn't journald try to compress the data twice then? I cannot find any documentation about the internal compression, how can we know that this is not going to break in the next systemd update?

@matejvasek
Copy link
Copy Markdown
Contributor Author

But why does this work? Wouldn't journald try to compress the data twice then? I cannot find any documentation about the internal compression, how can we know that this is not going to break in the next systemd update?

I think this very well may lead to double a compression.
From journald pov I am just storing binary blob.

@matejvasek
Copy link
Copy Markdown
Contributor Author

Would be good to hear from some journald expert.

Copy link
Copy Markdown
Member

@edsantiago edsantiago left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm uncomfortable with this PR, it seems to rely on just-so-happens-to-work behavior in journald and I'm incapable of understanding the use case. ISTM that the cost/benefit here is way skewed: high cost, high risk, for questionable benefit in rare corner cases. But I'll step out.

One concern inline.

Comment thread libpod/events/journal_linux.go Outdated
@matejvasek
Copy link
Copy Markdown
Contributor Author

I'm uncomfortable with this PR, it seems to rely on just-so-happens-to-work behavior in journald and I'm incapable of understanding the use case. ISTM that the cost/benefit here is way skewed: high cost, high risk, for questionable benefit in rare corner cases. But I'll step out.

One concern inline.

@edsantiago TBH I am not particularly comfortable with this either. This is just another attempt to make sure long events work, although the SELinux fix by @rhatdan should be sufficient on it own.

With regard to journald: only weird thing to me is that the journalctl CLI is displaying the items decompressed.

From application podman pov it seems to be consistent:
I write binary blob journal.Send(),
and I read binary blob sdjournal.Journal.Next().

IF I was compressing data on write and read would be returning uncompressed that would be really effing scary, but that's not what happening.

@matejvasek matejvasek force-pushed the compress-log-entries branch from 7b397aa to 8cd544e Compare April 4, 2023 13:46
The compressed fields:
* PODMAN_LABELS
* PODMAN_CONTAINER_INSPECT_DATA

[NO TESTS NEEDED]
[NO NEW TESTS NEEDED]

Signed-off-by: Matej Vasek <mvasek@redhat.com>
@matejvasek matejvasek force-pushed the compress-log-entries branch from 8cd544e to d40f225 Compare April 4, 2023 14:00
@matejvasek
Copy link
Copy Markdown
Contributor Author

matejvasek commented Apr 4, 2023

With regard to journald: the only weird thing to me is that the journalctl CLI is displaying the items decompressed.

Actually this is not true, my bad.

🤦‍♂️

In fact, the compressed fields are not displayed in output of journalctl at all! (or any custom fields for that matter)

Reason I am seeing them is that podman is not only saving labels to custom log item field. It is also formmating it to the main MESSAGE field uncompressed.
The journal is called like:
journal.Send("some long string that contains labels (not compressed)", priority, some_big_map_containing_labels_compressed).

@matejvasek
Copy link
Copy Markdown
Contributor Author

Regardless, if Dan's update of SELinux will work this PR is not needed.

@rhatdan
Copy link
Copy Markdown
Member

rhatdan commented Apr 9, 2023

container-selinux should now be available with the fix for this issue.

@rhatdan rhatdan closed this Apr 9, 2023
@github-actions github-actions Bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 2, 2023
@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Sep 2, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. release-note-none

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants