diff --git a/README.md b/README.md
index 3a2a31e..040f1ee 100644
--- a/README.md
+++ b/README.md
@@ -34,7 +34,7 @@ Or install it yourself as:
 
 ## What does it do?
 
-When your code triggers a SyntaxError due to an "expecting end-of-input" in a file, this library fires to narrow down your search to the most likely offending locations.
+When your code triggers a SyntaxError due to an "unexpected `end'" in a file, this library fires to narrow down your search to the most likely offending locations.
 
 ## Sounds cool, but why isn't this baked into Ruby directly?
 
@@ -45,20 +45,56 @@ I would love to get something like this directly in Ruby, but I first need to pr
 
 ## How does it detect syntax error locations?
 
-Source code with a syntax error in it can be thought of valid code with one or more invalid chunks in it. With this in mind we can "search" for both invalid and valid chunks of code. This library uses a parser to tell if a given chunk of code is valid in which case it's certainly not the cause of our problem. If it's invalid, then we can test to see if removing that chunk from our file would make the whole thing valid. When that happens, we've narrowed down our search. But...things aren't always so easy.
+We know that source code that does not contain a syntax error can be parsed. We also know that code with a syntax error contains both valid code and invalid code. If we remove the invalid code and the remainder parses, then we can programmatically determine that the code we removed contained a syntax error. We can do this detection by generating small code blocks and searching for which blocks need to be removed to generate valid source code.
+
+Since there can be multiple syntax errors in a document, it's not good enough to check individual code blocks; we've got to check several at the same time. We keep creating and adding new blocks to our search until we detect that our "frontier" (which contains all of our blocks) contains the syntax error. After this, we can stop our search and instead focus on filtering to find the smallest subset of blocks that contains the syntax error.
+
+## How is source code broken up into smaller blocks?
 
 By definition source code with a syntax error in it cannot be parsed, so we have to guess how to chunk up the file into smaller pieces. Once we've split up the file we can safely rule out or zoom into a specific piece of code to determine the location of the syntax error. This libary uses indentation and empty lines to make guesses about what might be a "block" of code. Once we've got a chunk of code, we can test it.
 
-- If the code parses, it cannot be the cause of our syntax error. We can remove it from our search
-- If the code does not parse, it may be the cause of the error, but we also might have made a bad guess in splitting up the source
-  - If we remove that chunk of code from the document and that allows the whole thing to parse, it means the syntax error was for sure in that location.
-  - Otherwise, it could mean that either there are multiple syntax errors or that we have a bad guess and need to expand our search.
+At the end of the day we can't say where the syntax error is FOR SURE, but we can get pretty close. It sounds simple when spelled out like this, but it's a very complicated problem. Even when code is not correctly indented/formatted, we can still likely tell you where to start searching, even if we can't point at the exact problem line or location.
+
+## Complicating concerns
+
+The biggest issue with searching for syntax errors stemming from "unexpected end" is that while the `end` in the code triggered the error, the problem actually came from somewhere else. Effectively these syntax errors always involve 2 or more lines of code, but one of those lines (without the end) may be syntactically valid on its own. For example:
+
+```
+1 Foo.call
+2
+3 puts "lol"
+4 end
+```
+
+Here there's a missing `do` after `Foo.call`; however, `Foo.call` by itself is perfectly valid Ruby syntax. We don't find the error until we remove the `end`, even though the problem is caused on the first line. This means that if our code blocks aren't sliced totally correctly, the error output might just point at:
+
+```
+4 end
+```
+
+Instead of:
+
+```
+1 Foo.call
+4 end
+```
+
+Here's a similar issue, but with more `end` lines in the code to demonstrate. The same line of code causes the issue:
+
+```
+1 it "foo" do
+2 Foo.call
+3
+4 puts "lol"
+5 end
+6 end
+```
 
-At the end of the day we can't say where the syntax error is FOR SURE, but we can get pretty close. It sounds simple when spelled out like this, but it's a very complicated problem.
+In this example we could make this code valid by removing either the `end` on line 5 or the one on line 6. As far as the program is concerned, it's effectively got one too many ends and it won't care which you remove. The "correct" line to remove would be for the inner block, but it's hard to know this programmatically. Whitespace can help guide us, but it's still a guess.
 
-This one person on twitter told me it's "not possible".
+One of the biggest challenges then is not only finding code that can be removed to make the program syntactically correct (just remove an `end` and it works) but also providing a reasonable guess as to the "pair" line that would have otherwise required an end (such as a `do` or a `def`).
 
-## How does this gem know when a syntax error occured?
+## How does this gem know when a syntax error occurred in my code?
 
 While I wish you hadn't asked: If you must know, we're monkey-patching require. It sounds scary, but bootsnap does essentially the same thing and we're way less invasive.
 
diff --git a/lib/syntax_error_search/code_frontier.rb b/lib/syntax_error_search/code_frontier.rb index 7cd6670..f026cc8 100644 --- a/lib/syntax_error_search/code_frontier.rb +++ b/lib/syntax_error_search/code_frontier.rb @@ -35,9 +35,7 @@ def holds_all_syntax_errors?(block_array = @frontier) def pop return nil if empty? - if generate_new_block? - self << next_block - end + self << next_block unless @indent_hash.empty? return @frontier.pop end diff --git a/lib/syntax_error_search/code_search.rb b/lib/syntax_error_search/code_search.rb index a08a785..cb39430 100644 --- a/lib/syntax_error_search/code_search.rb +++ b/lib/syntax_error_search/code_search.rb @@ -24,6 +24,7 @@ def call end @invalid_blocks.concat(frontier.detect_invalid_blocks ) + @invalid_blocks.sort_by! {|block| block.starts_at } self end end diff --git a/spec/unit/code_search_spec.rb b/spec/unit/code_search_spec.rb index 4d58b40..4d3c779 100644 --- a/spec/unit/code_search_spec.rb +++ b/spec/unit/code_search_spec.rb @@ -1,40 +1,64 @@ - require_relative "../spec_helper.rb" module SyntaxErrorSearch RSpec.describe CodeSearch do - it "does not go into an infinite loop" do - skip("infinite loop") - search = CodeSearch.new(<<~EOM) - Foo.call - def foo - puts "lol" - puts "lol" - end - end - EOM - search.call - - expect(search.invalid_blocks.join).to eq(<<~EOM) - end - EOM - end - - it "handles mis-matched-indentation-but-maybe-not-so-well" do - skip("wip") - search = CodeSearch.new(<<~EOM) - Foo.call - def foo - puts "lol" - puts "lol" - end - end - EOM - search.call - - expect(search.invalid_blocks.join).to eq(<<~EOM) - end - EOM + # For code that's not perfectly formatted, we ideally want to do our best + # These examples represent the results that exist today, but I would like to improve upon them + describe "needs improvement" do + describe "mis-matched-indentation" do + it "stacked ends " do + search = CodeSearch.new(<<~EOM) + Foo.call + def foo + puts "lol" + puts "lol" + end + end + EOM + search.call + + 
# Does not include the line with the error Foo.call + expect(search.invalid_blocks.join).to eq(<<~EOM) + def foo + end + end + EOM + end + + it "extra space before end" do + search = CodeSearch.new(<<~EOM) + Foo.call + def foo + puts "lol" + puts "lol" + end + end + EOM + search.call + + # Does not include the line with the error Foo.call + expect(search.invalid_blocks.join).to eq(<<~EOM.indent(3)) + end + EOM + end + + it "missing space before end" do + search = CodeSearch.new(<<~EOM) + Foo.call + def foo + puts "lol" + puts "lol" + end + end + EOM + search.call + + # Does not include the line with the error Foo.call + expect(search.invalid_blocks.join).to eq(<<~EOM) + end + EOM + end + end end it "returns syntax error in outer block without inner block" do