Consolidate Bolt PRs: eliminate token/value display heap allocations#58
Consolidate Bolt PRs: eliminate token/value display heap allocations#58ashyanSpada merged 5 commits intomasterfrom
Conversation
…tokenizer allocations Agent-Logs-Url: https://github.com/ashyanSpada/expression_engine_rs/sessions/ca3024b8-72f8-4ddf-aa61-9c26416668e9 Co-authored-by: ashyanSpada <22587148+ashyanSpada@users.noreply.github.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #58 +/- ##
==========================================
- Coverage 89.55% 89.53% -0.02%
==========================================
Files 11 11
Lines 1063 1061 -2
==========================================
- Hits 952 950 -2
Misses 111 111 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@copilot add more unit tests? |
…nd Tokenizer::expect Agent-Logs-Url: https://github.com/ashyanSpada/expression_engine_rs/sessions/26b2e24c-b507-4479-aab0-bf05ee7bbb20 Co-authored-by: ashyanSpada <22587148+ashyanSpada@users.noreply.github.com>
Added in f0b80a6. New tests cover:
Total test count goes from 194 → 230. |
There was a problem hiding this comment.
Pull request overview
Consolidates several micro-optimizations to eliminate avoidable heap allocations in hot-path formatting and token/operator comparisons.
Changes:
- Stream
Valuefmt::Displayoutput directly to the formatter forList/Map(no intermediateStringbuilding). - Add a
DelimTokenType::as_str()helper and refactor delimiter comparisons /Tokendisplay to avoid per-compare/per-display allocations. - Remove
to_string()allocations in the tokenizer operator scanning loop by passing&strslices directly.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/value.rs | Removes intermediate String allocations in Value display for list/map formatting. |
| src/token.rs | Introduces as_str() for delimiter tokens and uses it to avoid allocations in comparisons/display. |
| src/tokenizer.rs | Removes allocation in operator scanning; updates delimiter comparison in expect(). |
Comments suppressed due to low confidence (2)
src/tokenizer.rs:167
Tokenizer::expectreturnsOk(())even when the current token is aDelim/Operator/Commathat does not matchop, which can let the parser silently accept invalid syntax. Instead, returnErr(Error::ExpectedOpNotExist(op.to_string()))(or a more specific error) on mismatch, and consider only advancing the tokenizer after confirming a match so the caller can recover predictably.
pub fn expect(&mut self, op: &str) -> Result<()> {
let token = self.cur_token.clone();
self.next()?;
match token {
Token::Delim(bracket, _) => {
if bracket.as_str() == op {
return Ok(());
}
}
Token::Operator(operator, _) => {
if operator == op {
return Ok(());
}
}
Token::Comma(c, _) => {
if c == op {
return Ok(());
}
}
_ => {
return Err(Error::ExpectedOpNotExist(op.to_string()));
}
}
Ok(())
}
src/tokenizer.rs:66
special_op_tokenslices withself.current() + 1, which can panic on non-ASCII input becausecurrent()is a byte index and+ 1may land in the middle of a UTF-8 codepoint (e.g., input like"+β"). Use the peeked char’slen_utf8()(or its next index fromCharIndices) to compute a valid slice end instead of+ 1, so invalid/unexpected characters can be handled without panicking.
loop {
match self.peek_one() {
Some((_, _ch)) => {
if keyword::is_op(&self.input[start..self.current() + 1]) {
self.next_one();
} else {
break;
}
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Agent-Logs-Url: https://github.com/ashyanSpada/expression_engine_rs/sessions/fc512b36-1630-4659-8bd4-8b1f421d33b8 Co-authored-by: ashyanSpada <22587148+ashyanSpada@users.noreply.github.com>
|
@copilot post checks failed, please figure out the reason and fix it. |
Agent-Logs-Url: https://github.com/ashyanSpada/expression_engine_rs/sessions/637e4d2b-d8cc-40bc-999a-1e77a8676b9a Co-authored-by: ashyanSpada <22587148+ashyanSpada@users.noreply.github.com>
Fixed in cdabb03. The CI was failing a |
24 overlapping AI-generated "⚡ Bolt:" PRs (#34–#57) all targeted the same three hotspots. This consolidates the valid, non-speculative improvements into a single clean change.
src/value.rs— streamDisplaydirectly to formatterListandMapvariants were building an intermediateStringbefore writing to the formatter. Now write directly viawrite!:Also removed spurious
.clone()onCopy/ref-friendly types in the other arms.src/token.rs— zero-allocationDelimTokenTypecomparisonAdded
as_str() -> &'static strtoDelimTokenType;string()delegates to it for API compatibility.check_opandfmt::Display for Tokennow useas_str(), avoiding a heap allocation on every token match.check_opalso simplified to idiomatic match-arm expressions.src/tokenizer.rs— remove spuriousto_string()in hot scan loopspecial_op_tokenwas calling.to_string()on a&strslice just to pass it tokeyword::is_op(&str)— one allocation per character consumed. Pass the slice directly. Same fix applied toexpect().Tests
Added 36 new unit tests covering:
Value::Displayoutput for all variants (String,Number,Bool,None,List,Map, and emptyList/Map)DelimTokenType::as_str()for all variants, including consistency withstring()check_opfor Delim match, Operator match, mismatches, and non-op tokensToken::is_close_paren,is_close_bracket,is_close_bracehelper methodsTokenizer::expect()for successful Delim, successful Operator, mismatched Delim, and non-op token error caseOriginal prompt
Triage and Close Redundant AI-Generated "Bolt" Pull Requests
The repository currently has a large backlog of AI-generated pull requests (all prefixed with "⚡ Bolt:") that are clogging the PR queue and making it difficult to find genuine community contributions. These PRs span from #34 to #57 and all target overlapping concerns:
Valuedisplay allocations,Contextlock contention, and token string allocations.Task
Review all open "⚡ Bolt:" PRs and implement the best, non-overlapping performance improvements directly on the
masterbranch in a single, clean PR. This consolidates the valid work while eliminating the noise.The Bolt PRs to consolidate are:
.clone()sValueDisplay formattingWhat to implement
Read the source files (
src/value.rs,src/context.rs,src/token.rs,src/tokenizer.rs) and apply the best, correctness-preserving performance improvements from the Bolt PRs directly. Specifically:src/value.rs— Optimizefmt::DisplayforValueto write directly to the formatter instead of building intermediateStringallocations (e.g., avoidformat!(...).fmt(f)patterns).src/context.rs— Eliminate double-lookups in theContextmaps (e.g., useentry()API or singleget/insertcalls instead ofcontains_key+insert), and minimize Mutex lock duration by dropping the guard early where possible.src/token.rs/src/tokenizer.rs— Where token type comparisons build temporaryStrings just to compare, replace with&strslices or direct char/pattern comparisons to avoid heap allocations.Constraints
The following is the prior conversation context from the user's chat exploration (may be truncated):
User: Review this repo
Assistant: Here's a comprehensive review of ashyanSpada/expression_engine_rs — a pure Rust expression/rule engine library.
📊 At a Glance
0.7.0✅ Strengths
1. Clean, Well-Structured Codebase
The...
This pull request was created from Copilot chat.