Faster JSON marshaling (and easyjson support)#112
Faster JSON marshaling (and easyjson support)#112jacksontj wants to merge 3 commits intoprometheus:masterfrom
Conversation
|
@jacksontj FWIW: You can use benchcmp or benchstat to compare and print both benchmark results side by side. |
|
@knweiss thanks, here is the output from |
32aeab4 to
65a1242
Compare
|
Benmark comparison run again with memory numbers included: |
model/labelset.go
Outdated
| return labelSetToFastFingerprint(ls) | ||
| } | ||
|
|
||
| func (l *LabelSet) MarshalEasyJSON(w *jwriter.Writer) { |
There was a problem hiding this comment.
nit: Since maps are reference types and we're not mutating the labelset, is the pointer receiver useful here? A value receiver would allow just saying range l instead of range *l below.
There was a problem hiding this comment.
@juliusv looks like it won't matter, I was worried about copies etc. but it seems that the compiler is smart enough-- changed.
| if !first { | ||
| w.RawByte(',') | ||
| } else { | ||
| first = false |
There was a problem hiding this comment.
nit: be consistent how you handle the if-else here and in the LabelSet marshaling method (I guess I prefer the if-else variant, since it only assigns the first variable once - not that I think it'd make a real difference).
There was a problem hiding this comment.
Good catch, I switched to the else, since it seems better :) (tests don't show much of a difference, but it looks better at least)
| "sort" | ||
| "strings" | ||
|
|
||
| "github.com/mailru/easyjson/jwriter" |
There was a problem hiding this comment.
I don't think this is an appropriate dependency to have in this package. Data models should just be data models.
There was a problem hiding this comment.
@brian-brazil this package already has a dep on encoding/json meaning marshaling is already a part of the model. Similarly custom marshal and unmarshal methods already exist in this package (and even in that file).
In golang you can't define methods on a struct outside of the package where the struct is defined, meaning that if you want a struct to adhere to an interface it must define the methods in the same package. We could move the marshal methods into a different file in the same package, but the existing ones are in the same files-- but if separate files in the package is preferred I'd be more than happy to move them around.
8170e3e to
a69f072
Compare
|
@juliusv Updated, I squashed the commit so the merge will be nicer :) |
This commit adds: - adds MarshalEasyJSON methods to values -- which is a streaming marshal interface - Change time marshaling to avoid moving to float64 (roughly a 50% speedup)
|
You might want to take a look at #114 - a combination of the two techniques might work best. Results from the microbenchmarks in #114 on my desktop machine: and same tests against this PR: |
|
@bboreham thanks for the ping! I actually have a separate branch where I've added faster marshal/unmarshal for most of the structs in common. I'm running a fork of prometheus to test out the performance gains and they are significant (>50% CPU reduction, >3x latency improvement, etc.) If this PR gets some traction I'll update it with those (if you are interested you can see my changes -- master...jacksontj:tjackson_fork) |
|
This code isn't even being used by the Prometheus web API, and we've gone with a different json encoder. |
d6cc704 Fix comment 7139116 Revert "Push comments to the left so they don't appear in scripts" e47e58f Push comments to the left so they don't appear in scripts 3945fce Remove nonexistent env var GIT_TAG cd62992 Merge pull request prometheus#156 from weaveworks/drop-quay af0eb51 Merge pull request prometheus#157 from weaveworks/fix-image-tag-prefix-length 0b9aee4 Fix image-tag object name prefix length to 8 chars. 813c28f Move from CircleCI 1.0 to 2.0 425cf4e Move from quay.io to Dockerhub 87ccf4f Merge pull request prometheus#155 from weaveworks/go-1-12 c31bc28 Update lint script to work with Go 1.12 ed8e380 Update to Go 1.12.1 ec369f5 Merge pull request prometheus#153 from dholbach/drop-email ef7418d weave-users mailing list is closed: https://groups.google.com/a/weave.works/forum/#!topic/weave-users/0QXWGOPdBfY 6954a57 Merge pull request prometheus#144 from weaveworks/golang-1.11.1 9649eed Upgrade build image from golang:1.10.0-strech to 1.11.1-strech 59263a7 Merge pull request prometheus#141 from weaveworks/update-context e235c9b Merge pull request prometheus#143 from weaveworks/gc-wks-test-vms c865b4c scheduler: please lint/flake8 da61568 scheduler: please lint/yapf ce9d78e scheduler: do not cache discovery doc e4b7873 scheduler: add comment about GCP projects' IAM roles needed to list/delete instances and firewall rules ff7ec8e scheduler: add comment about CircleCI projects' access via the API 2477d98 scheduler: deploy command now sets the current datetime as the version 5fcd880 scheduler: pass CircleCI API token in for private projects 6b8c323 scheduler: more details in case of failure to get running builds from CircleCI 0871aff scheduler: downgrade google-api-python-client from 1.7.4 to 1.6.7 b631e7f scheduler: add GC of WKS test VMs and firewall rules a923a32 scheduler: document setup and deployment 013f508 scheduler: lock dependencies' versions 6965a4a Merge pull request prometheus#142 from weaveworks/fix-build 23298c6 Fix golint expects import golang.org/x/lint/golint 482f4cd Context is now part of the Go standard library 2bbc9a0 Merge pull request prometheus#140 from weaveworks/sched-http-retry c3726de Add retries to sched util http calls 2cc7b5a Merge pull request prometheus#139 from meghalidhoble/master fd9b0a7 Change : Modified the lint tools to skip the shfmt check if not installed. Why the change : For ppc64le the specific version of shfmt is not available, hence skipped completely the installation of shfmt tool. Thus this change made. bc645c7 Merge pull request prometheus#138 from dholbach/add-license-file a642e02 license: add Apache 2.0 license text 9bf5956 Merge pull request prometheus#109 from hallum/master d971d82 Merge pull request prometheus#134 from weaveworks/2018-07-03-gcloud-regepx 32e7aa2 Merge pull request prometheus#137 from weaveworks/gcp-fw-allow-kube-apiserver bbb6735 Allow CI to access k8s API server on GCP instances 764d46c Merge pull request prometheus#135 from weaveworks/2018-07-04-docker-ansible-playbook ecc2a4e Merge pull request prometheus#136 from weaveworks/2018-07-05-gcp-private-ips 209b7fb tools: Add private_ips to the terraform output 369a655 tools: Add an ansible playbook that just installs docker a643e27 tools: Use --filter instead of --regexp with gcloud b8eca88 Merge pull request prometheus#128 from weaveworks/actually-say-whats-wrong 379ce2b Merge pull request prometheus#133 from weaveworks/fix-decrypt 3b906b5 Fix incompatibility with recent versions of OpenSSL f091ab4 Merge pull request prometheus#132 from weaveworks/add-opencontainers-labels-to-dockerfiles 248def1 Inject git revision in Dockerfiles 64f2c28 Add org.opencontainers.image.* labels to Dockerfiles ea96d8e add information about how to get help (prometheus#129) f066ccd Make yapf diff failure look like an error 34d81d7 Merge pull request prometheus#127 from weaveworks/golang-1.10.0-stretch 89a0b4f Use golang:1.10.0-stretch image. ca69607 Merge pull request prometheus#126 from weaveworks/disable-apt-daily-test f5dc5d5 Create "setup-apt" role 7fab441 Rename bazel to bazel-rules (prometheus#125) ccc8316 Revert "Gocyclo should return error code if issues detected" (prometheus#124) 1fe184f Bazel rules for building gogo protobufs (prometheus#123) b917bb8 Merge pull request prometheus#122 from weaveworks/fix-scope-gc c029ce0 Add regex to match scope VMs 0d4824b Merge pull request prometheus#121 from weaveworks/provisioning-readme-terraform 5a82d64 Move terraform instructions to tf section d285d78 Merge pull request prometheus#120 from weaveworks/gocyclo-return-value 76b94a4 Do not spawn subshell when reading cyclo output 93b3c0d Use golang:1.9.2-stretch image d40728f Gocyclo should return error code if issues detected c4ac1c3 Merge pull request prometheus#114 from weaveworks/tune-spell-check 8980656 Only check files 12ebc73 Don't spell-check pki files 578904a Special-case spell-check the same way we do code checks e772ed5 Special-case on mime type and extension using just patterns ae82b50 Merge pull request prometheus#117 from weaveworks/test-verbose 8943473 Propagate verbose flag to 'go test'. 7c79b43 Merge pull request prometheus#113 from weaveworks/update-shfmt-instructions 258ef01 Merge pull request prometheus#115 from weaveworks/extra-linting e690202 Use tools in built image to lint itself 126eb56 Add shellcheck to bring linting in line with scope 63ad68f Don't run lint on files under .git 51d908a Update shfmt instructions e91cb0d Merge pull request prometheus#112 from weaveworks/add-python-lint-tools 0c87554 Add yapf and flake8 to golang build image 35679ee Merge pull request prometheus#110 from weaveworks/parallel-push-errors 3ae41b6 Remove unneeded if block 51ff31a Exit on first error 0faad9f Check for errors when pushing images in parallel d87cd02 Add arg flag override for destination socks host:port in pacfile. 74dc626 Merge pull request prometheus#108 from weaveworks/disable-apt-daily b4f1d91 Merge pull request prometheus#107 from weaveworks/docker-17-update 7436aa1 Override apt daily job to not run immediately on boot 7980f15 Merge pull request prometheus#106 from weaveworks/document-docker-install-role f741e53 Bump to Docker 17.06 from CE repo 61796a1 Update Docker CE Debian repo details 0d86f5e Allow for Docker package to be named docker-ce 065c68d Document selection of Docker installation role. 3809053 Just --porcelain; it defaults to v1 11400ea Merge pull request prometheus#105 from weaveworks/remove-weaveplugin-remnants b8b4d64 remove weaveplugin remnants 35099c9 Merge pull request prometheus#104 from weaveworks/pull-docker-py cdd48fc Pull docker-py to speed tests/builds up. e1c6c24 Merge pull request prometheus#103 from weaveworks/test-build-tags d5d71e0 Add -tags option so callers can pass in build tags 8949b2b Merge pull request prometheus#98 from weaveworks/git-status-tag ac30687 Merge pull request prometheus#100 from weaveworks/python_linting 4b125b5 Pin yapf & flake8 versions 7efb485 Lint python linting function 444755b Swap diff direction to reflect changes required c5b2434 Install flake8 & yapf 5600eac Lint python in build-tools repo 0b02ca9 Add python linting c011c0d Merge pull request prometheus#79 from kinvolk/schu/python-shebang 6577d07 Merge pull request prometheus#99 from weaveworks/shfmt-version 00ce0dc Use git status instead of diff to add 'WIP' tag 411fd13 Use shfmt v1.3.0 instead of latest from master. 0d6d4da Run shfmt 1.3 on the code. 5cdba32 Add sudo c322ca8 circle.yml: Install shfmt binary. e59c225 Install shfmt 1.3 binary. 30706e6 Install pyhcl in the build container. 960d222 Merge pull request prometheus#97 from kinvolk/alban/update-shfmt-3 1d535c7 shellcheck: fix escaping issue 5542498 Merge pull request prometheus#96 from kinvolk/alban/update-shfmt-2 32f7cc5 shfmt: fix coding style 09f72af lint: print the diff in case of error 571c7d7 Merge pull request prometheus#95 from kinvolk/alban/update-shfmt bead6ed Update for latest shfmt b08dc4d Update for latest shfmt (prometheus#94) 2ed8aaa Add no-race argument to test script (prometheus#92) 80dd78e Merge pull request prometheus#91 from weaveworks/upgrade-go-1.8.1 08dcd0d Please ./lint as shfmt changed its rules between 1.0.0 and 1.3.0. a8bc9ab Upgrade default Go version to 1.8.1. 31d069d Change Python shebang to `#!/usr/bin/env python` git-subtree-dir: tools git-subtree-split: d6cc704a2892e8d85aa8fa4d201c1a404f02dfa4
I've been looking into options for fixing #3601 (JSON performance issues). I see that there is an open PR related to this issue #3536 -- which simply switches to jsoniter.
From my testing there are effectively 2 issues.
First issue
There are a few places in the marshaling process that require making lots of copies of the underlying datapoints. This not only causes memory bloat, but this significantly slows down the process. Similarly the task is made unecessarily difficult by converting int64 to float64 before putting it to a string (in the case of
Time).Second Issue
The majority of the performance issues at "large" (quoted, because large is only hours of data) are due to the large memory requirements due to marshaling the entire structure into memory before returning it. easyjson has a streaming serialization interface which solves this problem. When using the
encoding/jsonlibrary to do the marshaling I see my responses regularly increase by ~6x in memory before being written out. With this streaming approach I see no such memory increase.Of course any performance change is only as good as the benchmarks associated with it (which are included in this PR). On my X1 carbon 5th gen, here are the results:
Before these changes
After these changes
For ease of reading, here is the same data in a table: