Faster JSON marshaling (and easyjson support) #112

Closed
jacksontj wants to merge 3 commits into prometheus:master from jacksontj:easyjson

@jacksontj
Contributor

I've been looking into options for fixing #3601 (JSON performance issues). I see that there is an open PR related to this issue (#3536), which simply switches to jsoniter.

From my testing there are effectively 2 issues.

First issue
There are a few places in the marshaling process that require making lots of copies of the underlying datapoints. This not only causes memory bloat but also significantly slows down the process. Similarly, the task is made unnecessarily difficult by converting int64 to float64 before formatting it as a string (in the case of Time).
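To illustrate the Time point, here is a minimal sketch of formatting a millisecond timestamp as decimal seconds with integer arithmetic only. It is not the code from this PR: the function name `appendMillisAsSeconds` is hypothetical, and it assumes the timestamp is an int64 count of milliseconds. Unlike a float-based formatter that trims trailing zeros, this sketch always emits three fractional digits.

```go
package main

import (
	"fmt"
	"strconv"
)

// appendMillisAsSeconds (hypothetical helper) appends ms, a timestamp in
// milliseconds, to dst as decimal seconds with exactly three fractional
// digits. No float64 conversion is involved.
func appendMillisAsSeconds(dst []byte, ms int64) []byte {
	if ms < 0 {
		dst = append(dst, '-')
		ms = -ms // assumption: ms is never math.MinInt64
	}
	dst = strconv.AppendInt(dst, ms/1000, 10)
	dst = append(dst, '.')
	frac := ms % 1000
	// Zero-pad the fractional part to three digits.
	if frac < 100 {
		dst = append(dst, '0')
	}
	if frac < 10 {
		dst = append(dst, '0')
	}
	return strconv.AppendInt(dst, frac, 10)
}

func main() {
	buf := make([]byte, 0, 24) // reusable scratch buffer, no per-call allocation
	buf = appendMillisAsSeconds(buf, 1514416740123)
	fmt.Println(string(buf)) // 1514416740.123
}
```

Appending into a caller-supplied buffer also avoids an intermediate string per timestamp, which is the kind of copy the first issue describes.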

Second Issue
The majority of the performance issues at "large" sizes (quoted, because large is only hours of data) come from the memory needed to marshal the entire structure into memory before returning it. easyjson has a streaming serialization interface which solves this problem. When using the encoding/json library to do the marshaling, I see my responses regularly grow by ~6x in memory before being written out. With the streaming approach I see no such memory increase.

Of course, any performance change is only as good as the benchmarks associated with it (which are included in this PR). On my X1 Carbon 5th gen, here are the results:

Before these changes

$ go test -run=x -bench=BenchmarkMarshal
goos: linux
goarch: amd64
pkg: github.com/prometheus/common/model
BenchmarkMarshal/SampleValue/encoding/json-4         	 2000000	       617 ns/op
model.SampleValue not easyjson.Marshaler
BenchmarkMarshal/Timestamp/encoding/json-4           	 2000000	       598 ns/op
model.Time not easyjson.Marshaler
BenchmarkMarshal/SamplePair/encoding/json-4          	 1000000	      2069 ns/op
model.SamplePair not easyjson.Marshaler
BenchmarkMarshal/labelset/encoding/json-4            	 2000000	       686 ns/op
model.Metric not easyjson.Marshaler
BenchmarkMarshal/SampleStream/encoding/json-4        	   10000	    198699 ns/op
*model.SampleStream not easyjson.Marshaler
BenchmarkMarshal/Matrix/encoding/json-4              	      20	  96541250 ns/op
model.Matrix not easyjson.Marshaler
PASS
ok  	github.com/prometheus/common/model	11.913s

After these changes

$ go test -run=x -bench=BenchmarkMarshal
goos: linux
goarch: amd64
pkg: github.com/prometheus/common/model
BenchmarkMarshal/SampleValue/encoding/json-4         	 3000000	       516 ns/op
BenchmarkMarshal/SampleValue/easyjson-4              	10000000	       209 ns/op
BenchmarkMarshal/Timestamp/encoding/json-4           	 3000000	       518 ns/op
BenchmarkMarshal/Timestamp/easyjson-4                	10000000	       211 ns/op
BenchmarkMarshal/SamplePair/encoding/json-4          	 2000000	       719 ns/op
BenchmarkMarshal/SamplePair/easyjson-4               	 5000000	       305 ns/op
BenchmarkMarshal/labelset/encoding/json-4            	 2000000	       663 ns/op
BenchmarkMarshal/labelset/easyjson-4                 	10000000	       214 ns/op
BenchmarkMarshal/SampleStream/encoding/json-4        	   50000	     30999 ns/op
BenchmarkMarshal/SampleStream/easyjson-4             	  100000	     16747 ns/op
BenchmarkMarshal/Matrix/encoding/json-4              	     100	  14360224 ns/op
BenchmarkMarshal/Matrix/easyjson-4                   	     200	   8192013 ns/op
PASS
ok  	github.com/prometheus/common/model	25.163s

For ease of reading, here is the same data in a table:

| | encoding/json (before), ns/op | encoding/json (after), ns/op | encoding/json (speedup) | easyjson, ns/op | easyjson (speedup) |
|---|---|---|---|---|---|
| SampleValue | 617 | 516 | 16% | 209 | 66% |
| Timestamp | 598 | 518 | 13% | 211 | 64% |
| SamplePair | 2069 | 719 | 65% | 305 | 85% |
| labelset | 686 | 663 | 3% | 214 | 68% |
| SampleStream | 198699 | 30999 | 84% | 16747 | 91% |
| Matrix | 96541250 | 14360224 | 85% | 8192013 | 91% |
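The "speedup" columns read as a reduction in ns/op relative to the original encoding/json numbers, not as a throughput multiple. A small sketch of that arithmetic, with a hypothetical helper name `reductionPercent`:

```go
package main

import "fmt"

// reductionPercent (hypothetical helper) returns how much smaller newNs is
// than oldNs, as a percentage of oldNs.
func reductionPercent(oldNs, newNs float64) float64 {
	return (oldNs - newNs) / oldNs * 100
}

func main() {
	// SamplePair with encoding/json, before and after the changes.
	fmt.Printf("%.2f%%\n", reductionPercent(2069, 719)) // 65.25%
}
```

Under this reading, a 65% reduction corresponds to roughly a 2.9x improvement in time per operation (2069 / 719).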

@knweiss
Contributor

knweiss commented Dec 21, 2017

@jacksontj FWIW: You can use benchcmp or benchstat to compare and print both benchmark results side by side.

@jacksontj
Contributor Author

jacksontj commented Dec 21, 2017

@knweiss thanks, here is the output from benchcmp:

benchmark                                         old ns/op     new ns/op     delta
BenchmarkMarshal/SampleValue/encoding/json-4      617           516           -16.37%
BenchmarkMarshal/Timestamp/encoding/json-4        598           518           -13.38%
BenchmarkMarshal/SamplePair/encoding/json-4       2069          719           -65.25%
BenchmarkMarshal/labelset/encoding/json-4         686           663           -3.35%
BenchmarkMarshal/SampleStream/encoding/json-4     198699        30999         -84.40%
BenchmarkMarshal/Matrix/encoding/json-4           96541250      14360224      -85.13%

@jacksontj
Contributor Author

Benchmark comparison run again with memory numbers included:

benchmark                                         old ns/op     new ns/op     delta
BenchmarkMarshal/SampleValue/encoding/json-4      766           468           -38.90%
BenchmarkMarshal/Timestamp/encoding/json-4        713           586           -17.81%
BenchmarkMarshal/SamplePair/encoding/json-4       2050          743           -63.76%
BenchmarkMarshal/labelset/encoding/json-4         683           662           -3.07%
BenchmarkMarshal/SampleStream/encoding/json-4     197673        19165         -90.30%
BenchmarkMarshal/Matrix/encoding/json-4           96928336      8494863       -91.24%

benchmark                                         old allocs     new allocs     delta
BenchmarkMarshal/SampleValue/encoding/json-4      6              3              -50.00%
BenchmarkMarshal/Timestamp/encoding/json-4        5              4              -20.00%
BenchmarkMarshal/SamplePair/encoding/json-4       20             5              -75.00%
BenchmarkMarshal/labelset/encoding/json-4         7              7              +0.00%
BenchmarkMarshal/SampleStream/encoding/json-4     2012           118            -94.14%
BenchmarkMarshal/Matrix/encoding/json-4           1003160        50079          -95.01%

benchmark                                         old bytes     new bytes     delta
BenchmarkMarshal/SampleValue/encoding/json-4      497           400           -19.52%
BenchmarkMarshal/Timestamp/encoding/json-4        336           416           +23.81%
BenchmarkMarshal/SamplePair/encoding/json-4       1264          424           -66.46%
BenchmarkMarshal/labelset/encoding/json-4         416           416           +0.00%
BenchmarkMarshal/SampleStream/encoding/json-4     116137        5035          -95.66%
BenchmarkMarshal/Matrix/encoding/json-4           58131741      2150175       -96.30%

@jacksontj
Contributor Author

cc @fabxc @juliusv @beorn7

return labelSetToFastFingerprint(ls)
}

func (l *LabelSet) MarshalEasyJSON(w *jwriter.Writer) {
Member

nit: Since maps are reference types and we're not mutating the labelset, is the pointer receiver useful here? A value receiver would allow just saying range l instead of range *l below.

Contributor Author

@juliusv looks like it won't matter. I was worried about copies etc., but it seems that the compiler is smart enough. Changed.
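The receiver question can be illustrated with a small sketch. The `labels` type and both method names are hypothetical, chosen only to mirror the shape of a map-based label set. Copying a map value copies only a reference to the underlying data, so a value receiver does not duplicate the entries.

```go
package main

import "fmt"

// labels is a hypothetical map-based type, similar in shape to a label set.
type labels map[string]string

// countByValue uses a value receiver. The map's entries are shared with
// the caller, and the method can range over l directly.
func (l labels) countByValue() int {
	n := 0
	for range l {
		n++
	}
	return n
}

// countByPointer uses a pointer receiver and must dereference to range.
func (l *labels) countByPointer() int {
	n := 0
	for range *l {
		n++
	}
	return n
}

func main() {
	ls := labels{"job": "api", "instance": "a"}
	fmt.Println(ls.countByValue(), ls.countByPointer()) // 2 2
}
```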

	if !first {
		w.RawByte(',')
	} else {
		first = false
Member

nit: be consistent in how you handle the if-else here and in the LabelSet marshaling method (I guess I prefer the if-else variant, since it only assigns the first variable once; not that I think it'd make a real difference).

Contributor Author

Good catch, I switched to the else, since it seems better :) (tests don't show much of a difference, but it looks better at least)
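For reference, the if-else variant discussed here can be sketched in a self-contained form. This uses `strings.Builder` rather than easyjson's writer, the function name `writeObject` is hypothetical, and `strconv.Quote` is a simplification that matches JSON escaping only for plain ASCII strings. Keys are sorted purely to make the output stable.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// writeObject (hypothetical helper) renders m as a JSON object, emitting a
// comma before every entry except the first.
func writeObject(m map[string]string) string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // deterministic order for the example only

	var sb strings.Builder
	sb.WriteByte('{')
	first := true
	for _, k := range keys {
		if !first {
			sb.WriteByte(',')
		} else {
			first = false // assigned once, on the first entry only
		}
		sb.WriteString(strconv.Quote(k)) // simplification: ASCII-only quoting
		sb.WriteByte(':')
		sb.WriteString(strconv.Quote(m[k]))
	}
	sb.WriteByte('}')
	return sb.String()
}

func main() {
	fmt.Println(writeObject(map[string]string{"job": "api", "instance": "a"}))
	// {"instance":"a","job":"api"}
}
```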

"sort"
"strings"

"github.com/mailru/easyjson/jwriter"
Contributor

I don't think this is an appropriate dependency to have in this package. Data models should just be data models.

Contributor Author

@brian-brazil this package already has a dep on encoding/json meaning marshaling is already a part of the model. Similarly custom marshal and unmarshal methods already exist in this package (and even in that file).

In Go you can't define methods on a type outside of the package where the type is defined, so if you want a struct to satisfy an interface, its methods must be defined in the same package. We could move the marshal methods into a different file in the same package. The existing ones live in the same files as the types, but if separate files in the package are preferred I'd be more than happy to move them around.

jacksontj force-pushed the easyjson branch 2 times, most recently from 8170e3e to a69f072 on December 27, 2017 23:19
@jacksontj
Contributor Author

@juliusv Updated, I squashed the commit so the merge will be nicer :)

This commit:
- adds MarshalEasyJSON methods to values, which is a streaming marshal interface
- changes time marshaling to avoid converting to float64 (roughly a 50% speedup)
@bboreham
Member

You might want to take a look at #114 - a combination of the two techniques might work best.

Results from the microbenchmarks in #114 on my desktop machine:

BenchmarkMarshalJSON-3     	20000000	       110 ns/op	      16 B/op	       1 allocs/op
BenchmarkUnmarshalJSON-3   	20000000	        85.5 ns/op	      16 B/op	       2 allocs/op

and same tests against this PR:

BenchmarkMarshalJSON-3     	10000000	       167 ns/op	     144 B/op	       2 allocs/op
BenchmarkUnmarshalJSON-3   	10000000	       151 ns/op	      48 B/op	       2 allocs/op

@jacksontj
Contributor Author

@bboreham thanks for the ping! I actually have a separate branch where I've added faster marshal/unmarshal for most of the structs in common. I'm running a fork of prometheus to test out the performance gains and they are significant (>50% CPU reduction, >3x latency improvement, etc.) If this PR gets some traction I'll update it with those (if you are interested you can see my changes -- master...jacksontj:tjackson_fork)

@brian-brazil
Contributor

This code isn't even being used by the Prometheus web API, and we've gone with a different json encoder.
