Skip to content

Conversation

@soutaro
Copy link
Member

@soutaro soutaro commented Sep 2, 2025

This is related to the RBS file parsing performance degradation.

The RBS file parsing in rbs-4.0 (dev) is ~5x slower than rbs-3.9, based on the core/**/*.rbs files, especially for smaller files.

https://docs.google.com/spreadsheets/d/1dQBGIC1_zWco6c5OHH5VunP8ZuZLzwaMDB-U31OgUAc/edit?gid=0#gid=0

I implemented minor changes to improve the parsing performance, reusing empty array and hash objects. The changes improved the parsing performance slightly, but it still clearly slower than rbs-3.9.

@soutaro
Copy link
Member Author

soutaro commented Sep 12, 2025

@amomchilov
Copy link
Contributor

amomchilov commented Sep 12, 2025

Another thought:

How much time do we spend computing assertions? Should we disable them in "release" mode?

Currently, rbs_assert() is just a function that's never disabled:

void rbs_assert(bool condition, const char *fmt, ...) {
if (condition) {
return;
}
va_list args;
va_start(args, fmt);
vfprintf(stderr, fmt, args);
va_end(args);
fprintf(stderr, "\n");
exit(EXIT_FAILURE);
}

Perhaps we should rename the function to rbs_assert_impl(), and create an #define rbs_assert() macro that calls it only if we're in debug mode.

For comparison, C's standard library's assert.h, makes assert() a no-op if you #define NDEBUG

@amomchilov
Copy link
Contributor

Another thought: our extconf.rb doesn't specify an optimization level. We disable optimizations (-O0) if DEBUG is defined:

append_cflags ['-O0', '-g'] if ENV['DEBUG']

But we never specify -O3 if we're in a "release" mode

@soutaro
Copy link
Member Author

soutaro commented Sep 16, 2025

It looks like Makefile has -O3.

...
cflags   = $(hardenflags) -fdeclspec  $(optflags) $(debugflags) $(warnflags)
optflags = -O3 -fno-fast-math
CFLAGS   = $(CCDLFLAGS) $(cflags) -fno-common -pipe -std=gnu99 -Wimplicit-fallthrough -Wunused-result -Wc++-compat $(ARCH_FLAG)
...

@soutaro
Copy link
Member Author

soutaro commented Sep 16, 2025

It looks like ~3% are spent for rbs_assert in debug build. Removing the calls in release build slightly improved.

@soutaro soutaro force-pushed the parsing-performance branch from 4bb28e4 to bec4a54 Compare September 19, 2025 08:26
@soutaro
Copy link
Member Author

soutaro commented Sep 19, 2025

Finally the parser is about twice faster than 3.9 parser!

The benchmark parses all of the files given as ARGV, and runs the benchmark-ips: Higher number means better.

> BUNDLE_GEMFILE=Gemfile-39 bundle exec ruby benchmarks2.rb core/**/*.rbs ../../ruby/gem_rbs_collection/gems/activerecord/8.0/*.rbs sig/**/*.rbs
Benchmarking parsing 177 files...
ruby 3.4.4 (2025-05-14 revision a38531fd3f) +PRISM [arm64-darwin23]
Warming up --------------------------------------
             parsing     1.000 i/100ms
Calculating -------------------------------------
             parsing      9.749 (± 0.0%) i/s  (102.58 ms/i) -     49.000 in   5.030113s
> bundle exec ruby benchmarks2.rb core/**/*.rbs ../../ruby/gem_rbs_collection/gems/activerecord/8.0/*.rbs sig/**/*.rbs                          
Benchmarking parsing 177 files...
ruby 3.4.4 (2025-05-14 revision a38531fd3f) +PRISM [arm64-darwin23]
Warming up --------------------------------------
             parsing     1.000 i/100ms
Calculating -------------------------------------
             parsing     19.478 (± 0.0%) i/s   (51.34 ms/i) -     98.000 in   5.032890s

if (lexer->current.char_pos == lexer->end_pos) {
lexer->last_char = '\0';
return 0;
return lexer->current_code_point;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rbs_peek simply returns the current_code_point that was read in rbs_next_char. We can skip calling expensive rbs_utf8_string_to_codepoint function.

rbs_position_t start; /* The start position of the current token */
int start_pos; /* The character position that defines the start of the input */
int end_pos; /* The character position that defines the end of the input */
rbs_position_t current; /* The current position: just before the current_character */
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lexer struct now stores the character that has read -- its codepoint and byte width.

This was referenced Sep 26, 2025
@soutaro
Copy link
Member Author

soutaro commented Sep 26, 2025

Merged pull requests that is extracted from this PR. 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants