… (Code Scanning alert #17) Agent-Logs-Url: https://github.com/envoyproxy/envoy/sessions/0fe292c3-95f1-40a3-8c62-b3b4c7590cfc Co-authored-by: phlax <454682+phlax@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Remove checked-in binary executable artifact from repository" to "fix: generate json_sanitizer fuzz corpus binary seed at build time (resolves Scorecard Binary-Artifacts alert #17)" on May 1, 2026
test/common/json/json_sanitizer_corpus/binary_file (5698 bytes, added in #20637) was flagged by OSSF Scorecard Binary-Artifacts as a checked-in binary executable. It is a libFuzzer corpus seed with no documented provenance: it was committed without comment alongside the harness that introduced the non-UTF-8 code path it exercises, has never been modified, and has no CVE or OSS-Fuzz reference. The exact bytes are not load-bearing; the only requirement is that the seed hits sanitize()'s octal-escape fallback path.

Changes
- test/common/json/gen_binary_corpus.sh (new): deterministic bash generator producing a 5698-byte blob that covers every single-byte value, truncated/invalid/surrogate UTF-8 sequences, an overlong NUL encoding, JSON-significant ASCII interleaved with high-bit bytes, and a pseudo-random tail (b[i] = (b[i-1]*31+7) & 0xff) to fill to the original size. Uses set -euo pipefail, LC_ALL=C, a single output redirection to $1, and no temp files.
- test/common/json/json_sanitizer_corpus/binary_file (deleted): replaced by the genrule output.
- test/common/json/BUILD (modified):
  - genrule json_sanitizer_corpus_binary_file runs gen_binary_corpus.sh to emit json_sanitizer_corpus/binary_file at build time.
  - filegroup json_sanitizer_corpus_files unions the glob(["json_sanitizer_corpus/**"]) text seeds with the genrule output.
  - json_sanitizer_fuzz_test updated to corpus = ":json_sanitizer_corpus_files". The envoy_cc_fuzz_test macro already passes label-form corpus arguments through directly, so no macro changes are needed.

Warning
Firewall rules blocked me from connecting to one or more addresses (expand for details)
I tried to connect to the following addresses, but was blocked by firewall rules:
- ftpmirror.gnu.org (dns block)
- mirror.bazel.build (dns block)

Each address is listed with the same Bazel server command:
/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/bin/java bazel(envoy) --add-opens=java.base/java.lang=ALL-UNNAMED -Xverify:none -Djava.util.logging.config.file=/build/bazel_root/base/javalog.properties -Dcom.google.devtools.build.lib.util.LogHandlerQuerier.class=com.google.devtools.build.lib.util.SimpleLogHandler$HandlerQuerier -XX:-MaxFDLimit -Djava.library.path=/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/lib:/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/embedded_tools/jdk/lib/server:/build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/ -Dfile.encoding=ISO-8859-1 -Duser.country= -Duser.language= -Duser.variant= -Xmx3g -DBAZEL_TRACK_SOURCE_DIRECTORIES=1 -Djavax.net.ssl.trustStore=/tmp/custom-cacerts -Djavax.net.ssl.trustStorePassword=changeit -jar /build/bazel_root/install/fb2a7f6d344d2f4e335882534df59296/A-server.jar --max_idle_secs=10800 --noshutdown_on_low_sys_mem --connect_timeout_secs=30

If you need me to access, download, or install something from one of these locations, you can either:
Original prompt

Background
Code Scanning alert #17 (OSSF Scorecard Binary-Artifacts, severity High) flags test/common/json/json_sanitizer_corpus/binary_file as a checked-in binary executable artifact. It is not actually an executable: it is a 5,698-byte opaque seed input for the libFuzzer-driven json_sanitizer_fuzz_test (declared via corpus = "json_sanitizer_corpus" in test/common/json/BUILD).

Provenance investigation (already done)
- 1c66d57 (2022-04-08), single commit, never modified since.
- … binary_file. No CVE, OSS-Fuzz report, issue, or commit anywhere in the repo references it. Almost certainly libFuzzer-mutated bytes from a local dev run, dropped in alongside the obvious text seeds (lower_case, punctuation, etc.).
- The harness (json_sanitizer_fuzz_test.cc) uses this seed only to exercise the non-UTF-8 / octal-escape fallback path in Envoy::Json::sanitize(); libFuzzer mutates from there.

Goal
Stop checking in binary_file and instead generate it deterministically at build time via a Bazel genrule. This resolves the Scorecard finding cleanly (it's no longer a checked-in binary artifact) without gaming the heuristic and without losing any test coverage.

Constraints (please respect)
- … printf and a loop is trivial).
- … envoy_cc_fuzz_test(name = "json_sanitizer_fuzz_test", corpus = "json_sanitizer_corpus", ...) so libFuzzer picks it up exactly like the existing text seeds in that directory.

Required changes
Add test/common/json/gen_binary_corpus.sh: a bash script that writes a deterministic binary blob to $1 (output path). The blob should be designed to seed the non-UTF-8 / octal-escape path of Envoy::Json::sanitize(). Concrete content the script should emit (in order, concatenated):
- \x00..\xff (covers every single-byte input).
- Truncated UTF-8 sequences: \xc2, \xe0\xa4, \xf0\x9d\x84.
- Invalid continuation byte: \xc2\xff.
- Encoded surrogate: \xed\xa0\x80.
- Overlong NUL encoding: \xc0\x80.
- JSON-significant ASCII (", \, control chars \x00..\x1f, \x7f) interleaved with high-bit bytes \x80..\xff.
- Hello, + \xff\xfe\xfd + , world!.
- A pseudo-random tail (b[i] = (b[i-1] * 31 + 7) & 0xff seeded from the prefix) until the file reaches a target size of 5698 bytes (matches the original size; not strictly necessary but keeps fuzzer behavior similar).
- … set -euo pipefail and LC_ALL=C. Use printf and a single output redirection; do not call dd, head, etc. unnecessarily. Avoid temp files; write straight to $1.
- … chmod +x.

Delete test/common/json/json_sanitizer_corpus/binary_file.

Modify test/common/json/BUILD to:
- Add a genrule that runs gen_binary_corpus.sh and produces json_sanitizer_corpus/binary_file as its output. Example shape:
- … json_sanitizer_fuzz_test corpus. This is the part that needs investigation. The envoy_cc_fuzz_test macro is defined in bazel/envoy_build_system.bzl; please read it to determine how corpus = "json_sanitizer_corpus" is expanded (likely via a filegroup with a glob). A glob over the source tree will not pick up genrule outputs. Choose the least invasive of these options:

This pull request was created from Copilot chat.
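
For readers who want to see how the generator described under "Required changes" could fit together, here is a minimal bash sketch. It is not the gen_binary_corpus.sh from this pull request. The helper names emit and generate are hypothetical, and the exact interleaving of JSON-significant ASCII with high-bit bytes is an assumption, since the prompt does not spell it out. The recurrence b[i] = (b[i-1] * 31 + 7) & 0xff and the 5698-byte target come from the description above.

```bash
#!/usr/bin/env bash
# Illustrative sketch only; NOT the gen_binary_corpus.sh from this pull request.
set -euo pipefail
export LC_ALL=C

TARGET_SIZE=5698  # size of the original seed, per the description
count=0           # bytes written so far
last=0            # value of the most recently written byte

# emit N...: write each argument (an integer 0..255, decimal or 0x-hex)
# to stdout as one raw byte, tracking the running count and last value.
emit() {
  local b oct
  for b in "$@"; do
    printf -v oct '%03o' "$((b))"  # three-digit octal form of the byte
    printf "\\${oct}"              # e.g. the format \302 writes byte 0xc2
    count=$((count + 1))
    last=$((b))
  done
}

# generate: write the whole blob to stdout.
generate() {
  local i
  count=0
  last=0
  for ((i = 0; i < 256; i++)); do emit "$i"; done  # every single-byte value
  emit 0xc2 0xe0 0xa4 0xf0 0x9d 0x84  # truncated 2-, 3-, and 4-byte sequences
  emit 0xc2 0xff                      # lead byte plus invalid continuation
  emit 0xed 0xa0 0x80                 # UTF-8-encoded surrogate U+D800
  emit 0xc0 0x80                      # overlong encoding of NUL
  # ASSUMPTION: pair each control char with one high-bit byte; the real
  # interleaving scheme is not given in the text.
  for ((i = 0; i < 32; i++)); do emit "$i" $((0x80 + i)); done
  emit 0x22 0xa0 0x5c 0xa1 0x7f 0xa2  # '"', '\', DEL, each with a high-bit byte
  emit 72 101 108 108 111 44          # "Hello,"
  emit 0xff 0xfe 0xfd
  emit 44 32 119 111 114 108 100 33   # ", world!"
  # Pseudo-random tail, seeded from the last prefix byte, up to the target size.
  while ((count < TARGET_SIZE)); do
    emit $(((last * 31 + 7) & 0xff))
  done
}

# Single output redirection, no temp files; runs only when a path is given.
if (($# >= 1)); then
  generate > "$1"
fi
```

Saved under a hypothetical name such as sketch.sh, it would be invoked as ./sketch.sh out.bin; running it twice should produce byte-identical files, which is the property the genrule relies on.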