html_escape: Avoid buffer allocation for strings with no escapable character #87
…aracter

This change reduces allocations and makes `html_escape` ~35% faster in a benchmark with escaped strings taken from the `test_html_escape` test in `test/test_erb.rb`.

- Perform buffer allocation on first instance of escapable character.
- Instead of copying characters one at a time, copy unescaped segments using `memcpy`.
escapable character (ruby/erb#87)

This change reduces allocations and makes `html_escape` ~35% faster in a benchmark with escaped strings taken from the `test_html_escape` test in `test/test_erb.rb`.

- Perform buffer allocation on first instance of escapable character.
- Instead of copying characters one at a time, copy unescaped segments using `memcpy`.

ruby/erb@aa482890fe
Thanks @noteflakes! Is this optimization also possible in CGI#escape_html?
Sure, I'll make a PR for CGI too.
They have slightly different behaviors. ERB's is faster than CGI's, and replacing CGI's with ERB's would be a breaking change, which is why they are deliberately separate.
The existing `html_escape` implementation always allocates buffer space (6 times the length of the input string), even when the input string does not contain any character that needs to be escaped.
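
As a rough illustration of the alternative to eager allocation, the following standalone C sketch defers allocating the output buffer until the first escapable character is found, and copies runs of unescaped bytes with `memcpy`. It is not the code from this PR: `escape_for` and `escape_html_sketch` are hypothetical names, it uses `malloc` rather than Ruby's string API, and the escape table (`&`, `<`, `>`, `"`, `'`) is an assumption about which characters are handled.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns the replacement text for c and stores its
 * length in *len, or returns NULL if c does not need escaping.
 * The set of escaped characters here is an assumption. */
static const char *escape_for(char c, size_t *len)
{
    switch (c) {
      case '&':  *len = 5; return "&amp;";
      case '<':  *len = 4; return "&lt;";
      case '>':  *len = 4; return "&gt;";
      case '"':  *len = 6; return "&quot;";
      case '\'': *len = 5; return "&#39;";
      default:   return NULL;
    }
}

/* Returns NULL when src contains nothing to escape, so the caller can reuse
 * src and no allocation happens. Otherwise returns a malloc'd, NUL-terminated
 * escaped copy that the caller must free. */
static char *escape_html_sketch(const char *src, size_t len)
{
    assert(src != NULL);

    char *buf = NULL;      /* output buffer, allocated lazily */
    char *dest = NULL;     /* write cursor into buf */
    size_t seg_start = 0;  /* start of the pending unescaped run */

    for (size_t i = 0; i < len; i++) {
        size_t esc_len;
        const char *esc = escape_for(src[i], &esc_len);
        if (esc == NULL) continue;

        if (buf == NULL) {
            /* First escapable character: allocate for the worst case,
             * where every input byte expands to 6 bytes, plus a NUL. */
            buf = malloc(len * 6 + 1);
            if (buf == NULL) abort();
            dest = buf;
        }

        /* Flush the unescaped run that precedes this character in one copy. */
        memcpy(dest, src + seg_start, i - seg_start);
        dest += i - seg_start;

        /* Append the escape sequence and start a new run after it. */
        memcpy(dest, esc, esc_len);
        dest += esc_len;
        seg_start = i + 1;
    }

    if (buf == NULL) return NULL;  /* nothing needed escaping */

    /* Flush the trailing unescaped run and terminate the string. */
    memcpy(dest, src + seg_start, len - seg_start);
    dest += len - seg_start;
    *dest = '\0';
    return buf;
}
```

In this sketch an input with no escapable characters costs one scan and no allocation, while an input that does need escaping pays for a single worst-case allocation plus one `memcpy` per unescaped run.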

This PR modifies the implementation of `optimized_escape_html` to not pre-allocate an output buffer, but instead allocate it on the first occurrence of a character that needs escaping. In addition, instead of copying non-escaped characters one by one to the output buffer, contiguous non-escaped segments of characters are copied using `memcpy`.

A synthetic benchmark employing the input strings used in the `test_html_escape` method in `test/test_erb.rb` shows the modified implementation to be about 35% faster than the original: