From d32cf62871f706c436724fee3d92d6ee03bff343 Mon Sep 17 00:00:00 2001
From: schneems
Date: Wed, 12 Jan 2022 16:59:04 -0600
Subject: [PATCH 01/58] Introduce BalanceHeuristicExpand

I used this as a starting point for work on building a tree as it has the LeftRightPairDiff internals I wanted. Basically everything below this is true in isolation but false in its entirety. I ended up not using the BalanceHeuristicExpand as a concept https://github.com/zombocom/dead_end/pull/129.

TLDR; I've been trying to optimize a "worst-case" perf scenario (9,000 line file) to get under my 1-second threshold. When I started, it was at ~2 seconds. After this last optimization, it's well under the threshold!

```
Before [main]: 1.22 seconds
After [this commit]: 0.67386 seconds
```

> Cumulatively, this is 2.6x faster than 3.1.1 and 47x faster than 1.2.0.

## Profiling before/after

Expansion before (called 2,291 times, 42.04% of overall time)
Parse before (called 3,315 times, 31.16% of overall time)

![](https://www.dropbox.com/s/brw7ix5b0mhwy1z/Screen%20Shot%202022-01-14%20at%208.54.31%20PM.png?raw=1)
![](https://www.dropbox.com/s/8mx20auvod5wb8t/Screen%20Shot%202022-01-15%20at%201.10.41%20PM.png?raw=1)

Expansion after (called 654 times, 29.42% of overall time)
Parse after (called 1,507 times, 29.35% of overall time)

![](https://www.dropbox.com/s/3rmtpfk315ge7e6/Screen%20Shot%202022-01-14%20at%208.55.45%20PM.png?raw=1)

> Note that parsing happens within the expansion method call, so while it seems "cheaper" to parse than expand based on this profiling data, it's the parsing that is making expansion slow.

## Goal

Make the algorithm faster, so it doesn't time out even when given a file with 9,000 lines.

## Strategy (background)

There are two general strategies for improving dead_end perf:

- Reduce/remove calls to Ripper (syntax error detection), as it is slow. For example, not calling Ripper if all code lines have been previously "hidden" (indicating valid code).
- Improve computational complexity for non-Ripper operations. For example, using a priority heap over sorting an array.

We call Ripper in two cases. First, on individual code blocks to see whether each one contains a syntax error. Second, later on the whole document to see whether that particular syntax error was making the document unparsable. Ripper takes longer the more lines it parses, so, as a prior optimization, we only re-parse the entire document when we detect a new invalid block.

If we can build "better" valid blocks, then we call Ripper fewer times on the overall document. If we can build larger blocks, we reduce the overall number of iterations for the algorithm. This property reduces Ripper calls and speeds up general "computational complexity" as there are N fewer operations to perform overall.

## Approach

This approach builds on the concept that dead_end is a uniform cost search by adding a "heuristic" (think a-star search) when building blocks. At a high level, a heuristic is a quality measure that can also be incomplete. For a great explanation, see https://www.redblobgames.com/pathfinding/a-star/introduction.html.

What heuristic can we add?

We know that if the code has an unbalanced pair of special characters, it cannot be valid. For example, this code `{ hello: "world"` is unbalanced. It is missing a `}`. It contains a syntax error due to this imbalance.

In the dead_end code, we can count known special pairs using `Ripper.lex` output (which we already have). Here are the currently tracked pairs:

- `{}`
- `()`
- `[]`
- keyword/end

This information can be used as a heuristic. Code with unbalanced pairs always contains a syntax error, but balanced code may have a different (harder to detect) syntax error. The code's balance hints at how we should expand an existing code block.

For example, the code `{ hello: "world"` must expand towards the right/down to be valid. This code would be known as "left-leaning" as it is heavier on the left side.
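
To make the counting idea concrete, here is a minimal sketch that is not the code in this patch. `PAIR_TOKENS`, `pair_diff`, and `leaning` are hypothetical names used only for illustration, and keyword/end tracking is left out to keep it short. It assumes `Ripper.lex` reports brackets with the token types listed in `PAIR_TOKENS`:

```ruby
require "ripper"

# Hypothetical mapping (illustration only): Ripper token types for each
# tracked bracket pair, as [opener, closer].
PAIR_TOKENS = {
  curly: [:on_lbrace, :on_rbrace],
  square: [:on_lbracket, :on_rbracket],
  parens: [:on_lparen, :on_rparen]
}.freeze

# Returns openers minus closers for each pair type.
# Positive means left-leaning, negative means right-leaning.
def pair_diff(source)
  types = Ripper.lex(source).map { |token| token[1] }
  PAIR_TOKENS.transform_values do |(open, close)|
    types.count(open) - types.count(close)
  end
end

# Collapses the per-pair counts into a single leaning.
def leaning(diff)
  counts = diff.values
  return :equal if counts.all?(&:zero?)
  return :left if counts.none?(&:negative?)
  return :right if counts.none?(&:positive?)
  :both
end

p leaning(pair_diff('{ hello: "world"')) # => :left
```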

Rather than searching in an arbitrary direction, the heuristic determines the most sensible direction to expand.

Previously, we expanded blocks by scanning up and down, counting keyword/end pairs (outside of the block), and looking at indentation. This process was adequate but required that we take small steps and produce small blocks. It also has no concept of whether the code it holds is syntactically valid until a full Ripper parse is performed. That combination means it can produce invalid code blocks (which can be desirable when hunting for invalid code). But when we want blocks to be valid, we can make more efficient moves.

## Implementation: LexPairDiff helper class

I introduced a new class, `LexPairDiff`, to accommodate a heuristic expansion. A `LexPairDiff` can be in one of several states:

- Balanced (:equal)
- Leaning left (:left)
- Leaning right (:right)
- Leaning both (:both)

> An example of a code line leaning both ways would be `) do |x|`. Here the `)` is leaning right (needs to expand up) while the `do` is leaning left (needs to expand down).

Each code line generates its own `LexPairDiff`. Internally the information is stored as a diff of the counts for each tracked pair (`{}`, `()`, `[]`, keyword/end). A positive number indicates left-leaning, a negative number indicates right-leaning, and zero is balanced. This format allows for fast concatenation and comparison.

## BalanceHeuristicExpand

When a block is passed to `BalanceHeuristicExpand`, a `LexPairDiff` is created from all of its lines. The code's balance is then used to determine which direction to expand.

If the code is leaning left or right, the main goal is to get it back into balance. Continue to expand it until balance is achieved.

If the code is already balanced, try to expand it while maintaining balance. For example, add already balanced code and any above/below pair of lines that cancel out.
Finally, intentionally knock it off-balance by expanding up and then recurse the process (to ideally re-balance it and then exit).

If the code is leaning both ways, try to grab any extra "free" (already balanced) lines. Then expand up and down since it must expand both ways to become valid. As this process is repeated, it should eventually find a match in one direction and then be leaning left or right, which will expand faster on the next iteration (after it is put back on the frontier and popped again later).

Example:

An invalid code block might come in like this:

```
start_line: keyword.location.start_line,
start_char: keyword.location.end_char + 1,
end_line: last_node.location.end_line,
end_char: last_node.location.end_char
```

And it could expand it to:

```
if exceptions || variable
  RescueEx.new(
    exceptions: exceptions,
    variable: variable,
    location: Location.new(
      start_line: keyword.location.start_line,
      start_char: keyword.location.end_char + 1,
      end_line: last_node.location.end_line,
      end_char: last_node.location.end_char
    )
  )
end
```

> Note it would expand it quite a bit more, but I'm abbreviating here. Here's the complete process across several expansions https://gist.github.com/schneems/e62216b610a36a81a98d9d8146b0611a

## When to use heuristic expand

After experimentation, the best place to use this new expansion technique was when an existing invalid block comes off of the frontier. This algorithm tends to correct such blocks and balance them out. When a block is "corrected" in this way, it reduces the overall number of times the document must be passed to Ripper (extremely slow). Also, since larger blocks reduce the overall iteration count, we try to expand several times (while preserving balance) and take the largest valid expansion.

We use the original indentation-based expansion for code blocks that are already valid. The downside of using a heuristic that preserves balance is that ultimately we want the algorithm to generate invalid blocks.
The original expansion can produce invalid expansions, which is useful.

There is a separate process for adding unvisited lines to the frontier (rather than expanding existing blocks). Unvisited lines are not a good candidate for heuristic expansion as it works a little too well. If we only add "unbalanced" code in an added block, we lose some of the context we desire (see comments for more details in `parse_blocks_from_indent_line`).

## Concerns

One concern with this implementation is that calling the heuristic expansion three times was the only way to produce valid results. I'm not sure why. It might be an undiscovered property of the system, or perhaps all of my code examples to date are somehow biased in a specific way. The way to get more information is to put it out into the world and get feedback.

Another concern is that there are now two expansion classes. Or three if you count `parse_blocks_from_indent_line`. It's not always clear why you would choose one over another except that "it provides the best outcome". It might be possible to simplify some of this logic or remove or combine competing expansion methods. Hopefully, patterns will emerge pointing to opportunities to guide that usage.
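
As a rough illustration of the "expand towards balance" rule described in this message, here is a deliberately simplified sketch. It is not the `BalanceHeuristicExpand` implementation: `expand_to_balance` is a hypothetical helper, and each line is reduced to a single integer (openers minus closers), whereas the real `LexPairDiff` tracks each pair type separately, which is what makes the `:both` state possible.

```ruby
# Hypothetical, simplified model of balance-driven expansion.
# line_diffs holds one integer per line: openers minus closers.
# Returns the expanded index range and the remaining imbalance.
def expand_to_balance(line_diffs, start_index)
  first = last = start_index
  total = line_diffs[start_index]

  loop do
    if total.positive? && last < line_diffs.length - 1
      # Leaning left: the missing closer can only be below, so grow down.
      last += 1
      total += line_diffs[last]
    elsif total.negative? && first > 0
      # Leaning right: the missing opener can only be above, so grow up.
      first -= 1
      total += line_diffs[first]
    else
      # Balanced, or nowhere left to expand in the needed direction.
      break
    end
  end

  [first..last, total]
end

# Per-line diffs for: `if x`, `Foo.new(`, `a: 1`, `)`, `end`
diffs = [1, 1, 0, -1, -1]
p expand_to_balance(diffs, 1) # => [1..3, 0]
p expand_to_balance(diffs, 4) # => [0..4, 0]
```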
--- Gemfile.lock | 2 +- lib/dead_end/api.rb | 4 +- lib/dead_end/balance_heuristic_expand.rb | 281 ++++++++++++++++++ lib/dead_end/code_frontier.rb | 2 + lib/dead_end/code_line.rb | 12 +- lib/dead_end/code_search.rb | 41 ++- ...block_expand.rb => indent_block_expand.rb} | 6 +- lib/dead_end/left_right_lex_count.rb | 32 ++ lib/dead_end/lex_pair_diff.rb | 104 +++++++ lib/dead_end/parse_blocks_from_indent_line.rb | 17 ++ spec/integration/dead_end_spec.rb | 6 +- spec/integration/exe_cli_spec.rb | 2 +- spec/integration/ruby_command_line_spec.rb | 4 +- spec/unit/balance_heuristic_expand_spec.rb | 230 ++++++++++++++ ...nd_spec.rb => indent_block_expand_spec.rb} | 14 +- spec/unit/left_right_lex_count_spec.rb | 8 + spec/unit/lex_pair_diff_spec.rb | 43 +++ 17 files changed, 786 insertions(+), 22 deletions(-) create mode 100644 lib/dead_end/balance_heuristic_expand.rb rename lib/dead_end/{block_expand.rb => indent_block_expand.rb} (91%) create mode 100644 lib/dead_end/lex_pair_diff.rb create mode 100644 spec/unit/balance_heuristic_expand_spec.rb rename spec/unit/{block_expand_spec.rb => indent_block_expand_spec.rb} (89%) create mode 100644 spec/unit/left_right_lex_count_spec.rb create mode 100644 spec/unit/lex_pair_diff_spec.rb diff --git a/Gemfile.lock b/Gemfile.lock index df2b429..b695efd 100644 --- a/Gemfile.lock +++ b/Gemfile.lock @@ -45,7 +45,7 @@ GEM rubocop-ast (>= 0.4.0) ruby-prof (1.4.3) ruby-progressbar (1.11.0) - stackprof (0.2.16) + stackprof (0.2.17) standard (1.3.0) rubocop (= 1.20.0) rubocop-performance (= 1.11.5) diff --git a/lib/dead_end/api.rb b/lib/dead_end/api.rb index 683a0e4..f497057 100644 --- a/lib/dead_end/api.rb +++ b/lib/dead_end/api.rb @@ -187,12 +187,14 @@ def self.valid?(source) require_relative "lex_all" require_relative "code_line" require_relative "code_block" -require_relative "block_expand" +require_relative "lex_pair_diff" require_relative "ripper_errors" require_relative "priority_queue" require_relative "unvisited_lines" require_relative
"around_block_scan" +require_relative "indent_block_expand" require_relative "priority_engulf_queue" require_relative "pathname_from_message" require_relative "display_invalid_blocks" +require_relative "balance_heuristic_expand" require_relative "parse_blocks_from_indent_line" diff --git a/lib/dead_end/balance_heuristic_expand.rb b/lib/dead_end/balance_heuristic_expand.rb new file mode 100644 index 0000000..b1fe96b --- /dev/null +++ b/lib/dead_end/balance_heuristic_expand.rb @@ -0,0 +1,281 @@ +# frozen_string_literal: true + +module DeadEnd + # Expand code based on lexical heuristic + # + # Code that has unbalanced pairs cannot be valid + # i.e. `{` must always be matched with a `}`. + # + # This expansion class exploits that knowledge to + # expand a logical block towards equal pairs. + # + # For example: if code is missing a `]` it cannot + # be on a line above, so it must expand down + # + # This heuristic allows us to make larger and more + # accurate expansions which means fewer invalid + # blocks to check which means overall faster search. + # + # This class depends on another class, LexPairDiff, that can be + # accessed per-line. It holds the delta of tracked directional + # pairs: curly brackets, square brackets, parens, and kw/end + # with positive count (leaning left), 0 (balanced), or negative + # count (leaning right). + # + # With this lexical diff information we can look around a given + # block and move intelligently. For instance if the current + # block has a mismatched `end` and the line above it holds + # `def foo` then the block will be expanded up to capture that line. + # + # An unbalanced block can never be valid (this provides info to + # the overall search). However a balanced block may contain another syntax + # error and so must be re-checked using Ripper (slow). + # + # Example + # + # lines = CodeLine.from_source(<~'EOM') + # if bark?
+ # end + # EOM + # block = CodeBlock.new(lines: lines[0]) + # + # expand = BalanceHeuristicExpand.new( + # code_lines: lines, + # block: block + # ) + # expand.direction # => :down + # expand.call + # expand.direction # => :stop + # + # expect(expand.to_s).to eq(lines.join) + class BalanceHeuristicExpand + attr_reader :start_index, :end_index + + def initialize(code_lines:, block:) + @block = block + @iterations = 0 + @code_lines = code_lines + @last_index = @code_lines.length - 1 + @max_iterations = @code_lines.length * 2 + @start_index = block.lines.first.index + @end_index = block.lines.last.index + @last_equal_range = nil + + set_lex_diff_from(block) + end + + private def set_lex_diff_from(block) + @lex_diff = LexPairDiff.new( + curly: 0, + square: 0, + parens: 0, + kw_end: 0 + ) + block.lines.each do |line| + @lex_diff.concat(line.lex_diff) + end + end + + # Converts the searched lines into a source string + def to_s + @code_lines[start_index..end_index].join + end + + # Converts the searched lines into a code block + def to_block + CodeBlock.new(lines: @code_lines[start_index..end_index]) + end + + # Returns true if all tracked pairs are balanced + def balanced? + @lex_diff.balanced? + end + + # Returns true if captured lines are "leaning" + # in either direction + def unbalanced? + !balanced? + end + + # Main search entrypoint + # + # Essentially a state machine, determine the leaning + # of the given block, then figure out how to either + # move it towards balanced, or expand it while keeping + # it balanced. + def call + case direction + when :up + # the goal is to become balanced + while keep_going? && direction == :up && try_expand_up + end + when :down + # the goal is to become balanced + while keep_going? && direction == :down && try_expand_down + end + when :equal + while keep_going? && grab_equal_or { + # Cannot create a balanced expansion, choose to be unbalanced + try_expand_up + } + end + + call # Recurse + when :both + while keep_going?
&& grab_equal_or { + try_expand_up + try_expand_down + } + end + when :stop + return self + end + + self + end + + # Convert a lex diff to a direction to search + # + # leaning left -> down + # leaning right -> up + # + def direction + leaning = @lex_diff.leaning + case leaning + when :left # go down + stop_bottom? ? :stop : :down + when :right # go up + stop_top? ? :stop : :up + when :equal, :both + if stop_top? && stop_bottom? + :stop + elsif stop_top? && !stop_bottom? + :down + elsif !stop_top? && stop_bottom? + :up + else + leaning + end + end + end + + # Limit rspec failure output + def inspect + "#" + end + + # Upper bound on iterations + private def keep_going? + if @iterations < @max_iterations + @iterations += 1 + true + else + warn <<~EOM + DeadEnd: Internal problem detected, possible infinite loop in #{self.class} + + Please open a ticket with the following information. Max: #{@max_iterations}, actual: #{@iterations} + + Original block: + + ``` + #{@block.lines.map(&:original).join}``` + + Stuck at: + + ``` + #{to_block.lines.map(&:original).join}``` + EOM + + false + end + end + + # Attempt to grab "free" lines + # + # if either above, below or both are + # balanced, take them, return true. + # + # If above is leaning left and below + # is leaning right and they cancel out + # take them, return true. + # + # If we couldn't grab any balanced lines + # then call the block and return false. + private def grab_equal_or + did_expand = false + if above&.balanced? + did_expand = true + try_expand_up + end + + if below&.balanced? + did_expand = true + try_expand_down + end + + return true if did_expand + + if make_balanced_from_up_down? + try_expand_up + try_expand_down + true + else + yield + false + end + end + + # If up is leaning left and down is leaning right + # they might cancel out, to make a complete + # and balanced block + private def make_balanced_from_up_down? + return false if above.nil? || below.nil? 
+ return false if above.lex_diff.leaning != :left + return false if below.lex_diff.leaning != :right + + @lex_diff.dup.concat(above.lex_diff).concat(below.lex_diff).balanced? + end + + # The line above the current location + private def above + @code_lines[@start_index - 1] unless stop_top? + end + + # The line below the current location + private def below + @code_lines[@end_index + 1] unless stop_bottom? + end + + # Mutates the start index and applies the new line's + # lex diff + private def expand_up + @start_index -= 1 + @lex_diff.concat(@code_lines[@start_index].lex_diff) + end + + private def try_expand_up + stop_top? ? false : expand_up + end + + private def try_expand_down + stop_bottom? ? false : expand_down + end + + # Mutates the end index and applies the new line's + # lex diff + private def expand_down + @end_index += 1 + @lex_diff.concat(@code_lines[@end_index].lex_diff) + end + + # Returns true when we can no longer expand up + private def stop_top? + @start_index == 0 + end + + # Returns true when we can no longer expand down + private def stop_bottom? 
+ @end_index == @last_index + end + end +end diff --git a/lib/dead_end/code_frontier.rb b/lib/dead_end/code_frontier.rb index f9e6920..71af0b5 100644 --- a/lib/dead_end/code_frontier.rb +++ b/lib/dead_end/code_frontier.rb @@ -50,6 +50,8 @@ module DeadEnd # CodeFrontier#detect_invalid_blocks # class CodeFrontier + attr_reader :queue + def initialize(code_lines:, unvisited: UnvisitedLines.new(code_lines: code_lines)) @code_lines = code_lines @unvisited = unvisited diff --git a/lib/dead_end/code_line.rb b/lib/dead_end/code_line.rb index 6520518..43cead8 100644 --- a/lib/dead_end/code_line.rb +++ b/lib/dead_end/code_line.rb @@ -38,7 +38,7 @@ def self.from_source(source, lines: nil) end end - attr_reader :line, :index, :lex, :line_number, :indent + attr_reader :line, :index, :lex, :line_number, :indent, :lex_diff def initialize(line:, index:, lex:) @lex = lex @line = line @@ -57,6 +57,16 @@ def initialize(line:, index:, lex:) end set_kw_end + + @lex_diff = LexPairDiff.from_lex( + lex: @lex, + is_kw: is_kw?, + is_end: is_end? + ) + end + + def balanced? + @lex_diff.balanced? 
end # Used for stable sort via indentation level diff --git a/lib/dead_end/code_search.rb b/lib/dead_end/code_search.rb index 19b5bc8..12072c2 100644 --- a/lib/dead_end/code_search.rb +++ b/lib/dead_end/code_search.rb @@ -15,7 +15,7 @@ module DeadEnd # # - CodeFrontier (Holds information for generating blocks and determining if we can stop searching) # - ParseBlocksFromLine (Creates blocks into the frontier) - # - BlockExpand (Expands existing blocks to search more code) + # - IndentBlockExpand (Expands existing blocks to search more code) # # ## Syntax error detection # @@ -61,7 +61,7 @@ def initialize(source, record_dir: DEFAULT_VALUE) @code_lines = CleanDocument.new(source: source).call.lines @frontier = CodeFrontier.new(code_lines: @code_lines) - @block_expand = BlockExpand.new(code_lines: @code_lines) + @indent_block_expand = IndentBlockExpand.new(code_lines: @code_lines) @parse_blocks_from_indent_line = ParseBlocksFromIndentLine.new(code_lines: @code_lines) end @@ -88,6 +88,7 @@ def record(block:, name: "record") end end + # Add a block back onto the frontier def push(block, name:) record(block: block, name: name) @@ -100,6 +101,10 @@ def push(block, name:) def create_blocks_from_untracked_lines max_indent = frontier.next_indent_line&.indent + # Expand an unvisited line into a block and put it on the frontier + # This registers all lines and removes "unvisited" lines from the + # frontier. The process continues until all unvisited lines at a given + # indentation are added while (line = frontier.next_indent_line) && (line.indent == max_indent) @parse_blocks_from_indent_line.each_neighbor_block(frontier.next_indent_line) do |block| push(block, name: "add") @@ -115,7 +120,37 @@ def expand_existing record(block: block, name: "before-expand") - block = @block_expand.call(block) + if block.invalid? + # When a block is invalid the BalanceHeuristicExpand class tends to make it valid + # again.
This property reduces the number of Ripper calls to + # `frontier.holds_all_syntax_errors?`. + # + # This class tends to produce larger expansions meaning fewer + # total expansion steps. + blocks = [] + expand = BalanceHeuristicExpand.new(code_lines: code_lines, block: block) + + # Expand magic number 3 times + # + # There's likely a hidden property that explains why. I + # guessed it accidentally and it works really well. Reducing or increasing + # call count produces awful results. I'm not entirely sure why. + blocks << expand.call.to_block + blocks << expand.to_block if expand.call.balanced? + blocks << expand.to_block if expand.call.balanced? + + # Take the largest generated, valid block + block = blocks.reverse_each.detect(&:valid?) || blocks.first + else + # The original block expansion process works well when it starts + # with good i.e. "valid" input. Unlike BalanceHeuristicExpand, it does not self-correct + # towards a valid state. This naive property is desirable since + # we want to generate invalid code blocks (that make logical sense) + # or the algorithm will tend towards matching incorrect pairs + # at the expense of a correct result. + block = @indent_block_expand.call(block) + end + push(block, name: "expand") end diff --git a/lib/dead_end/block_expand.rb b/lib/dead_end/indent_block_expand.rb similarity index 91% rename from lib/dead_end/block_expand.rb rename to lib/dead_end/indent_block_expand.rb index 7f3396f..29bbca5 100644 --- a/lib/dead_end/block_expand.rb +++ b/lib/dead_end/indent_block_expand.rb @@ -10,7 +10,7 @@ module DeadEnd # puts "wow" # end # - # block = IndentBlockExpand.new(code_lines: code_lines) # .call(CodeBlock.new(lines: code_lines[1])) # # puts block.to_s @@ -21,7 +21,7 @@ module DeadEnd # Once a code block has captured everything at a given indentation level # then it will expand to capture surrounding indentation.
# - # block = BlockExpand.new(code_lines: code_lines) + # block = IndentBlockExpand.new(code_lines: code_lines) # .call(block) # # block.to_s @@ -30,7 +30,7 @@ module DeadEnd # puts "wow" # end # - class BlockExpand + class IndentBlockExpand def initialize(code_lines:) @code_lines = code_lines end diff --git a/lib/dead_end/left_right_lex_count.rb b/lib/dead_end/left_right_lex_count.rb index 3b71ade..6ddf731 100644 --- a/lib/dead_end/left_right_lex_count.rb +++ b/lib/dead_end/left_right_lex_count.rb @@ -22,6 +22,8 @@ module DeadEnd # left_right.missing.first # # => "}" class LeftRightLexCount + attr_reader :kw_count, :end_count + def initialize @kw_count = 0 @end_count = 0 @@ -37,6 +39,16 @@ def initialize } end + def concat(other) + @count_for_char.each do |(k, _)| + @count_for_char[k] += other[k] + end + + @kw_count += other.kw_count + @end_count += other.end_count + self + end + def count_kw @kw_count += 1 end @@ -45,6 +57,14 @@ def count_end @end_count += 1 end + def count_lines(lines) + lines.each do |line| + line.lex.each do |lex| + count_lex(lex) + end + end + end + # Count source code characters # # Example: @@ -121,6 +141,18 @@ def missing "(" => ")" }.freeze + def curly_diff + @count_for_char["{"] - @count_for_char["}"] + end + + def square_diff + @count_for_char["["] - @count_for_char["]"] + end + + def parens_diff + @count_for_char["("] - @count_for_char[")"] + end + # Opening characters like `{` need closing characters # like `}`. # # When a mis-match count is detected, suggest the diff --git a/lib/dead_end/lex_pair_diff.rb b/lib/dead_end/lex_pair_diff.rb new file mode 100644 index 0000000..bc02510 --- /dev/null +++ b/lib/dead_end/lex_pair_diff.rb @@ -0,0 +1,104 @@ +module DeadEnd + # Holds a diff of lexical pairs + # + # Example: + # + # diff = LexPairDiff.from_lex(LexAll.new("}"), is_kw: false, is_end: false) + # diff.curly # => 1 + # diff.balanced? 
# => false + # diff.leaning # => :right + # + # two = LexPairDiff.from_lex(LexAll.new("{"), is_kw: false, is_end: false) + # two.curly => -1 + # + # diff.concat(two) + # diff.curly # => 0 + # diff.balanced? # => true + # diff.leaning # => :equal + # + # Internally a pair is stored as a single value + # positive indicates more left elements, negative + # indicates more right elements, and zero indicates + # balanced pairs. + class LexPairDiff + # Convienece constructor + def self.from_lex(lex:, is_kw:, is_end:) + left_right = LeftRightLexCount.new + lex.each do |l| + left_right.count_lex(l) + end + + kw_end = 0 + kw_end += 1 if is_kw + kw_end -= 1 if is_end + + LexPairDiff.new( + curly: left_right.curly_diff, + square: left_right.square_diff, + parens: left_right.parens_diff, + kw_end: kw_end + ) + end + + attr_reader :curly, :square, :parens, :kw_end + + def initialize(curly:, square:, parens:, kw_end:) + @curly = curly + @square = square + @parens = parens + @kw_end = kw_end + end + + def each + yield @curly + yield @square + yield @parens + yield @kw_end + end + + # Returns :left if all there are more unmatched pairs to + # left i.e. "{" + # Returns :right if all there are more unmatched pairs to + # left i.e. "}" + # + # If pairs are unmatched like "(]" returns `:both` + # + # If everything is balanced returns :equal + def leaning + dir = 0 + each do |v| + case v <=> 0 + when 1 + return :both if dir == -1 + dir = 1 + when -1 + return :both if dir == 1 + dir = -1 + end + end + + case dir + when 1 + :left + when 0 + :equal + when -1 + :right + end + end + + # Returns true if all pairs are equal + def balanced? 
+ @curly == 0 && @square == 0 && @parens == 0 && @kw_end == 0 + end + + # Mutates the existing diff with contents of another diff + def concat(other) + @curly += other.curly + @square += other.square + @parens += other.parens + @kw_end += other.kw_end + self + end + end +end diff --git a/lib/dead_end/parse_blocks_from_indent_line.rb b/lib/dead_end/parse_blocks_from_indent_line.rb index 11fa2b8..ec2dc98 100644 --- a/lib/dead_end/parse_blocks_from_indent_line.rb +++ b/lib/dead_end/parse_blocks_from_indent_line.rb @@ -26,6 +26,10 @@ module DeadEnd # # At this point it has no where else to expand, and it will yield this inner # code as a block + # + # The other major concern is eliminating all lines that do not contain + # an end. In the above example, if we started from the top and moved + # down we might accidentally eliminate everything but `end` class ParseBlocksFromIndentLine attr_reader :code_lines @@ -42,6 +46,19 @@ def each_neighbor_block(target_line) neighbors = scan.code_block.lines + # Block production here greatly affects quality and performance. + # + # Larger blocks produce a faster search as the frontier must go + # through fewer iterations. However too large of a block, will + # degrade output quality if too many unrelated lines are caught + # in an invalid block. + # + # Another concern is being too clever with block production. + # Quality of the end result depends on sometimes including unrelated + # lines. For example in code like `deffoo; end` we want to match + # both lines as the programmer's mistake was missing a space in the + # `def` even though technically we could make it valid by simply + # removing the "extra" `end`. block = CodeBlock.new(lines: neighbors) if neighbors.length <= 2 || block.valid? 
yield block diff --git a/spec/integration/dead_end_spec.rb b/spec/integration/dead_end_spec.rb index 926383a..bbbafb8 100644 --- a/spec/integration/dead_end_spec.rb +++ b/spec/integration/dead_end_spec.rb @@ -4,9 +4,7 @@ module DeadEnd RSpec.describe "Integration tests that don't spawn a process (like using the cli)" do - it "does not timeout on massive files" do - next unless ENV["DEAD_END_TIMEOUT"] - + it "does not timeout on massive files", slow: true do file = fixtures_dir.join("syntax_tree.rb.txt") lines = file.read.lines lines.delete_at(768 - 1) @@ -140,6 +138,8 @@ module DeadEnd expect(out).to include(<<~EOM) 16 class Rexe + 18 VERSION = '1.5.1' + 20 PROJECT_URL = 'https://github.com/keithrbennett/rexe' ❯ 77 class Lookups ❯ 78 def input_modes ❯ 148 end diff --git a/spec/integration/exe_cli_spec.rb b/spec/integration/exe_cli_spec.rb index 5a49d9a..75e3bbf 100644 --- a/spec/integration/exe_cli_spec.rb +++ b/spec/integration/exe_cli_spec.rb @@ -14,7 +14,7 @@ def exe(cmd) out end - it "prints the version" do + it "prints the version", slow: true do out = exe("-v") expect(out.strip).to include(DeadEnd::VERSION) end diff --git a/spec/integration/ruby_command_line_spec.rb b/spec/integration/ruby_command_line_spec.rb index e124287..35a2ded 100644 --- a/spec/integration/ruby_command_line_spec.rb +++ b/spec/integration/ruby_command_line_spec.rb @@ -4,7 +4,7 @@ module DeadEnd RSpec.describe "Requires with ruby cli" do - it "namespaces all monkeypatched methods" do + it "namespaces all monkeypatched methods", slow: true do Dir.mktmpdir do |dir| tmpdir = Pathname(dir) script = tmpdir.join("script.rb") @@ -43,7 +43,7 @@ module DeadEnd end end - it "detects require error and adds a message with auto mode" do + it "detects require error and adds a message with auto mode", slow: true do Dir.mktmpdir do |dir| tmpdir = Pathname(dir) script = tmpdir.join("script.rb") diff --git a/spec/unit/balance_heuristic_expand_spec.rb b/spec/unit/balance_heuristic_expand_spec.rb new file 
mode 100644 index 0000000..cc3066a --- /dev/null +++ b/spec/unit/balance_heuristic_expand_spec.rb @@ -0,0 +1,230 @@ +# frozen_string_literal: true + +require_relative "../spec_helper" + +module DeadEnd + RSpec.describe BalanceHeuristicExpand do + it "can handle 'unknown' direction code" do + source = <<~'EOM' + parser.on('-r', '--require REQUIRE(S)', + 'Gems and built-in libraries (e.g. shellwords, yaml) to require, comma separated, or ! to clear') do |v| + if v == '!' + options.requires.clear + else + v.split(',').map(&:strip).each do |r| + if r[0] == '-' + options.requires -= [r[1..-1]] + else + options.requires << r + end + end + end + end + EOM + + lines = CleanDocument.new(source: source).call.lines + expand = BalanceHeuristicExpand.new( + code_lines: lines, + block: CodeBlock.new(lines: lines[1]) + ) + + expect(expand.direction).to eq(:both) + expand.call + expect(expand.to_s).to eq(<<~'EOM') + parser.on('-r', '--require REQUIRE(S)', + 'Gems and built-in libraries (e.g. shellwords, yaml) to require, comma separated, or ! to clear') do |v| + if v == '!' + EOM + + expand.call + expect(expand.to_s).to eq(<<~'EOM') + parser.on('-r', '--require REQUIRE(S)', + 'Gems and built-in libraries (e.g. shellwords, yaml) to require, comma separated, or ! to clear') do |v| + if v == '!' 
+ options.requires.clear + else + v.split(',').map(&:strip).each do |r| + if r[0] == '-' + options.requires -= [r[1..-1]] + else + options.requires << r + end + end + end + end + EOM + end + + it "does not generate (known) invalid blocks when started at different positions" do + source = <<~EOM + Foo.call do |a + # inner + end # one + + print lol + class Foo + end # two + EOM + lines = CodeLine.from_source(source) + expand = BalanceHeuristicExpand.new( + code_lines: lines, + block: CodeBlock.new(lines: lines[1]) + ) + expect(expand.direction).to eq(:equal) + expand.call + expect(expand.to_s).to eq(<<~'EOM') + Foo.call do |a + # inner + end # one + + print lol + class Foo + end # two + EOM + + expand = BalanceHeuristicExpand.new( + code_lines: lines, + block: CodeBlock.new(lines: lines[0]) + ) + expect(expand.call.to_s).to eq(<<~'EOM') + Foo.call do |a + # inner + end # one + + print lol + class Foo + end # two + EOM + + expand = BalanceHeuristicExpand.new( + code_lines: lines, + block: CodeBlock.new(lines: lines[2]) + ) + expect(expand.direction).to eq(:up) + + expand.call + + expect(expand.to_s).to eq(<<~'EOM') + Foo.call do |a + # inner + end # one + EOM + + expand = BalanceHeuristicExpand.new( + code_lines: lines, + block: CodeBlock.new(lines: lines[3]) + ) + expect(expand.direction).to eq(:equal) + expand.call + expect(expand.to_s).to eq(<<~'EOM') + Foo.call do |a + # inner + end # one + + print lol + EOM + + expand = BalanceHeuristicExpand.new( + code_lines: lines, + block: CodeBlock.new(lines: lines[4]) + ) + expect(expand.direction).to eq(:equal) + expand.call + expect(expand.to_s).to eq(<<~'EOM') + Foo.call do |a + # inner + end # one + + print lol + EOM + + expand = BalanceHeuristicExpand.new( + code_lines: lines, + block: CodeBlock.new(lines: lines[5]) + ) + expect(expand.direction).to eq(:down) + expand.call + expect(expand.to_s).to eq(<<~'EOM') + class Foo + end # two + EOM + end + + it "expands" do + source = <<~EOM + class Blerg + Foo.call do |a + end 
# one + + print lol + class Foo + end # two + end # three + EOM + lines = CodeLine.from_source(source) + expand = BalanceHeuristicExpand.new( + code_lines: lines, + block: CodeBlock.new(lines: lines[5]) + ) + expect(expand.call.to_s).to eq(<<~'EOM'.indent(2)) + class Foo + end # two + EOM + expect(expand.call.to_s).to eq(<<~'EOM'.indent(2)) + Foo.call do |a + end # one + + print lol + class Foo + end # two + EOM + + expect(expand.call.to_s).to eq(<<~'EOM') + class Blerg + Foo.call do |a + end # one + + print lol + class Foo + end # two + end # three + EOM + end + + it "expands up when on an end" do + lines = CodeLine.from_source(<<~'EOM') + Foo.new do + end + EOM + expand = BalanceHeuristicExpand.new( + code_lines: lines, + block: CodeBlock.new(lines: lines[1]) + ) + expect(expand.direction).to eq(:up) + expand.call + expect(expand.direction).to eq(:stop) + + expect(expand.start_index).to eq(0) + expect(expand.end_index).to eq(1) + expect(expand.to_s).to eq(lines.join) + end + + it "expands down when on a keyword" do + lines = CodeLine.from_source(<<~'EOM') + Foo.new do + end + EOM + expand = BalanceHeuristicExpand.new( + code_lines: lines, + block: CodeBlock.new(lines: lines[0]) + ) + expect(expand.direction).to eq(:down) + expand.call + expect(expand.direction).to eq(:stop) + + expect(expand.start_index).to eq(0) + expect(expand.end_index).to eq(1) + expect(expand.to_s).to eq(lines.join) + end + end +end diff --git a/spec/unit/block_expand_spec.rb b/spec/unit/indent_block_expand_spec.rb similarity index 89% rename from spec/unit/block_expand_spec.rb rename to spec/unit/indent_block_expand_spec.rb index dc4dade..ca9afcc 100644 --- a/spec/unit/block_expand_spec.rb +++ b/spec/unit/indent_block_expand_spec.rb @@ -3,7 +3,7 @@ require_relative "../spec_helper" module DeadEnd - RSpec.describe BlockExpand do + RSpec.describe IndentBlockExpand do it "captures multiple empty and hidden lines" do source_string = <<~EOM def foo @@ -22,7 +22,7 @@ def foo 
code_lines[6].mark_invisible block = CodeBlock.new(lines: [code_lines[3]]) - expansion = BlockExpand.new(code_lines: code_lines) + expansion = IndentBlockExpand.new(code_lines: code_lines) block = expansion.call(block) expect(block.to_s).to eq(<<~EOM.indent(4)) @@ -47,7 +47,7 @@ def foo code_lines = code_line_array(source_string) block = CodeBlock.new(lines: [code_lines[3]]) - expansion = BlockExpand.new(code_lines: code_lines) + expansion = IndentBlockExpand.new(code_lines: code_lines) block = expansion.call(block) expect(block.to_s).to eq(<<~EOM.indent(4)) @@ -71,7 +71,7 @@ def foo code_lines = code_line_array(source_string) block = CodeBlock.new(lines: [code_lines[3]]) - expansion = BlockExpand.new(code_lines: code_lines) + expansion = IndentBlockExpand.new(code_lines: code_lines) block = expansion.call(block) expect(block.to_s).to eq(<<~EOM.indent(4)) @@ -104,7 +104,7 @@ def foo code_lines = code_line_array(source_string) block = CodeBlock.new(lines: [code_lines[2]]) - expansion = BlockExpand.new(code_lines: code_lines) + expansion = IndentBlockExpand.new(code_lines: code_lines) block = expansion.call(block) expect(block.to_s).to eq(<<~EOM.indent(2)) @@ -138,7 +138,7 @@ def foo lines: code_lines[6] ) - expansion = BlockExpand.new(code_lines: code_lines) + expansion = IndentBlockExpand.new(code_lines: code_lines) block = expansion.call(block) expect(block.to_s).to eq(<<~EOM.indent(2)) @@ -171,7 +171,7 @@ def foo EOM code_lines = code_line_array(source_string) - expansion = BlockExpand.new(code_lines: code_lines) + expansion = IndentBlockExpand.new(code_lines: code_lines) block = CodeBlock.new(lines: code_lines[3]) block = expansion.call(block) diff --git a/spec/unit/left_right_lex_count_spec.rb b/spec/unit/left_right_lex_count_spec.rb new file mode 100644 index 0000000..ce6ee51 --- /dev/null +++ b/spec/unit/left_right_lex_count_spec.rb @@ -0,0 +1,8 @@ +# frozen_string_literal: true + +require_relative "../spec_helper" + +module DeadEnd + RSpec.describe 
LeftRightLexCount do + end +end diff --git a/spec/unit/lex_pair_diff_spec.rb b/spec/unit/lex_pair_diff_spec.rb new file mode 100644 index 0000000..bceb2f6 --- /dev/null +++ b/spec/unit/lex_pair_diff_spec.rb @@ -0,0 +1,43 @@ +# frozen_string_literal: true + +require_relative "../spec_helper" + +module DeadEnd + RSpec.describe "LexPairDiff" do + it "leans unknown" do + diff = LexPairDiff.from_lex( + lex: LexAll.new(source: "[}").to_a, + is_kw: false, + is_end: false + ) + expect(diff.leaning).to eq(:both) + end + + it "leans right" do + diff = LexPairDiff.from_lex( + lex: LexAll.new(source: "}").to_a, + is_kw: false, + is_end: false + ) + expect(diff.leaning).to eq(:right) + end + + it "leans left" do + diff = LexPairDiff.from_lex( + lex: LexAll.new(source: "{").to_a, + is_kw: false, + is_end: false + ) + expect(diff.leaning).to eq(:left) + end + + it "leans equal" do + diff = LexPairDiff.from_lex( + lex: LexAll.new(source: "{}").to_a, + is_kw: false, + is_end: false + ) + expect(diff.leaning).to eq(:equal) + end + end +end From 7324ef1a24940e451d01e7ac1d563a2f8a644d74 Mon Sep 17 00:00:00 2001 From: schneems Date: Sun, 16 Jan 2022 20:44:48 -0600 Subject: [PATCH 02/58] New direction BlockNode and IndentTree The "left/right" balanced heuristic proved to be too unstable to be used as a block expansion by itself. I found that while I could produce better blocks (valid source code), the old algorithm was optimized to work with itself and not with a better input. Instead of experimenting more with a different expansion algorithm, I want to take a new direction and re-write search from the ground up. The idea is this: We need a tree we can search. The more time spent building a higher quality tree, the better and more accurate the actual exploration of the tree will be. I've experimented with tree construction before. I know that I want nodes to be able to handle intersection/capture with many other nodes.
I know that I want all valid source code to be built into logically separate "chunks"; previously I've described these as "pyramids" as they stick out when viewed from the side https://github.com/zombocom/dead_end/issues/118#issuecomment-1004950659. This work was done in an exploratory fashion so I'm going back and annotating some of my prior commits to flesh out the notes. Most of these were intended to be refactored out later and are perhaps less "professional" than they would be otherwise. I've decided to leave them in here where their inclusion helps to highlight what guided my decision-making process as I was writing the code. I believe that will be more valuable in the long run than deleting all the notes and refactoring into one single large commit (which I usually prefer for open source feature work). All that's to say: I know some of these commits and their messages aren't great, just roll with it, the end result is worth it. ## Goal: Indentation-focused blocks - A block's inner is comprised of sub-blocks Ideally I want these to be two separate blocks (certainly not 5 separate lines, that's useless) ``` class Zoo def foo print def bar print end end ``` If we can't do that, then it may be okay to build the (incorrect) match as long as we can determine at what indent level the problem was created. Then we can work backwards: if "class Zoo" is implicated, we know it's missing an end and can reverse scan up. ## Thinking here Desired properties: - I think we can somehow leverage data about indentation to inform the blocks. We've got two cases we need to handle at the same time: ``` print "one" print "one" print "one" ``` And ``` def one print "one" end def two print "one" end def three Print "three end ``` Both should produce 3 inner "blocks". Since we're operating only at the level of one block at a time, how do we figure out how to generate the "inner"?
- One option: If we scan ahead to find all blocks we want to join together then we can make one block with ALL of those other sub blocks. - I like that - I want large blocks that can be "zoomed" into to expose the "inner" code. Next task is to look through all existing examples and build demo blocks, develop a feel for what's useful and what isn't then figure out if we can find patterns --- lib/dead_end/api.rb | 4 + lib/dead_end/block_document.rb | 130 ++++++++++++++++++++++++ lib/dead_end/block_node.rb | 163 +++++++++++++++++++++++++++++++ lib/dead_end/indent_tree.rb | 65 ++++++++++++ lib/dead_end/lex_pair_diff.rb | 4 + spec/unit/block_document_spec.rb | 120 +++++++++++++++++++++++ spec/unit/block_node_spec.rb | 32 ++++++ spec/unit/code_search_spec.rb | 1 + spec/unit/indent_tree_spec.rb | 146 +++++++++++++++++++++++++++ 9 files changed, 665 insertions(+) create mode 100644 lib/dead_end/block_document.rb create mode 100644 lib/dead_end/block_node.rb create mode 100644 lib/dead_end/indent_tree.rb create mode 100644 spec/unit/block_document_spec.rb create mode 100644 spec/unit/block_node_spec.rb create mode 100644 spec/unit/indent_tree_spec.rb diff --git a/lib/dead_end/api.rb b/lib/dead_end/api.rb index f497057..6d32da1 100644 --- a/lib/dead_end/api.rb +++ b/lib/dead_end/api.rb @@ -198,3 +198,7 @@ def self.valid?(source) require_relative "display_invalid_blocks" require_relative "balance_heuristic_expand" require_relative "parse_blocks_from_indent_line" + +require_relative "block_document" +require_relative "block_node" +require_relative "indent_tree" diff --git a/lib/dead_end/block_document.rb b/lib/dead_end/block_document.rb new file mode 100644 index 0000000..eda3f24 --- /dev/null +++ b/lib/dead_end/block_document.rb @@ -0,0 +1,130 @@ +# frozen_string_literal: true + +module DeadEnd + class BlockDocument + attr_reader :blocks, :queue, :root + + include Enumerable + + def initialize(code_lines: ) + @code_lines = code_lines + blocks = nil + @queue = 
PriorityQueue.new + @root = nil + end + + def to_a + map(&:itself) + end + + def each + node = @root + while node + yield node + node = node.below + end + end + + def to_s + string = String.new + each do |block| + string << block.to_s + end + string + end + + def call + last = nil + blocks = @code_lines.map.with_index do |line, i| + next if line.empty? + + node = BlockNode.new(lines: line, indent: line.indent) + @root ||= node + node.above = last + last&.below = node + last = node + node + end + + if last.above + last.above.below = last + end + + # Need all above/below set to determine correct next_indent + blocks.each do |b| + next if b.nil? + queue << b + end + + self + end + + def capture(node: , captured: ) + inner = [] + inner.concat(Array(captured)) + inner << node + inner.sort_by! {|block| block.start_index } + + lines = [] + indent = node.indent + lex_diff = LexPairDiff.new_empty + inner.each do |block| + lines.concat(block.lines) + lex_diff.concat(block.lex_diff) + block.delete + indent = block.indent if block.indent < indent + end + + now = BlockNode.new( + lines: lines, + lex_diff: lex_diff, + indent: indent + ) + now.inner = inner + + if inner.first == @root + @root = now + end + + if inner.first&.above + inner.first.above.below = now + now.above = inner.first.above + end + + if inner.last&.below + inner.last.below.above = now + now.below = inner.last.below + end + now + end + + def eat_above(node) + return unless now = node&.eat_above + + if node.above == @root + @root = now + end + + node.above.delete + node.delete + + while queue&.peek&.deleted? 
+ queue.pop + end + + now + end + + def eat_below(node) + eat_above(node&.below) + end + + def pop + @queue.pop + end + + def peek + @queue.peek + end + end +end diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb new file mode 100644 index 0000000..bb73c0c --- /dev/null +++ b/lib/dead_end/block_node.rb @@ -0,0 +1,163 @@ +# frozen_string_literal: true + +module DeadEnd + class BlockNode + attr_accessor :above, :below, :left, :right, :inner + attr_reader :lines, :start_index, :end_index, :lex_diff, :indent + + def initialize(lines: , indent: , next_indent: nil, lex_diff: nil) + lines = Array(lines) + @indent = indent + @next_indent = next_indent + @lines = lines + @left = nil + @right = nil + @inner = [] + + @start_index = lines.first.index + @end_index = lines.last.index + + if lex_diff.nil? + set_lex_diff_from(@lines) + else + @lex_diff = lex_diff + end + + @deleted = false + end + + def self.next_indent(above, node, below) + return node.indent if above && above.indent >= node.indent + return node.indent if below && below.indent >= node.indent + + if above + if below + case above.indent <=> below.indent + when 1 then below.indent + when 0 then above.indent + when -1 then above.indent + end + else + above.indent + end + elsif below + below.indent + else + node.indent + end + end + + def next_indent + @next_indent ||= self.class.next_indent(above, self, below) + end + + def delete + @deleted = true + end + + def deleted? + @deleted + end + + def valid? + return @valid if defined?(@valid) + + @valid = DeadEnd.valid?(@lines.join) + end + + def unbalanced? + !balanced? + end + + def balanced? + @lex_diff.balanced? 
+ end + + def leaning + @lex_diff.leaning + end + + def to_s + @lines.join + end + + def <=>(other) + case next_indent <=> other.next_indent + when 1 then 1 + when -1 then -1 + when 0 + case indent <=> other.indent + when 1 then 1 + when -1 then -1 + when 0 + end_index <=> other.end_index + end + end + end + + def indent + @indent ||= lines.map(&:indent).min || 0 + end + + def inspect + "#" + end + + private def set_lex_diff_from(lines) + @lex_diff = LexPairDiff.new_empty + lines.each do |line| + @lex_diff.concat(line.lex_diff) + end + end + + def ==(other) + @lines == other.lines && @indent == other.indent && next_indent == other.next_indent && @inner == other.inner + end + + def eat_above + return nil if above.nil? + + node = BlockNode.new( + lines: above.lines + @lines, + indent: above.indent < @indent ? above.indent : @indent + ) + + if above.inner.empty? + node.inner << above + else + above.inner.each do |b| + node.inner << b + end + end + + if self.inner.empty? + node.inner << self + else + self.inner.each do |b| + node.inner << b + end + end + + if above.above + node.above = above.above + above.above.below = node + end + + if below + node.below = below + below.above = node + end + + node + end + + def eat_below + # return nil if below.nil? 
+ # below.eat_above + end + + def without(other) + BlockNode.new(lines: self.lines - other.lines) + end + end +end diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb new file mode 100644 index 0000000..28a4a06 --- /dev/null +++ b/lib/dead_end/indent_tree.rb @@ -0,0 +1,65 @@ +# frozen_string_literal: true + +module DeadEnd + class IndentTree + attr_reader :document + + def initialize(document: ) + @document = document + @last_length = Float::INFINITY + end + + def call + reduce + # loop do + # requeue + # if document.queue.length >= @last_length + # break + # else + # @last_length = document.queue.length + # reduce + # end + # end + + self + end + + + def reduce + while block = document.pop + original = block + blocks = [block] + + indent = original.next_indent + while (above = blocks.last.above) && above.indent >= indent + blocks << above + break if above.leaning == :left + end + + blocks.reverse! + + while (below = blocks.last.below) && below.indent >= indent + blocks << below + break if below.leaning == :right + end + + blocks.delete(original) + if !blocks.empty? 
+ node = document.capture(node: original, captured: blocks) + document.queue << node + end + end + self + end + + def requeue + document.each do |block| + document.queue << block + end + end + + def to_s + @document.to_s + end + end +end diff --git a/lib/dead_end/lex_pair_diff.rb b/lib/dead_end/lex_pair_diff.rb index bc02510..a3ff2a1 100644 --- a/lib/dead_end/lex_pair_diff.rb +++ b/lib/dead_end/lex_pair_diff.rb @@ -40,6 +40,10 @@ def self.from_lex(lex:, is_kw:, is_end:) ) end + def self.new_empty + self.new(curly: 0, square: 0, parens: 0, kw_end: 0) + end + attr_reader :curly, :square, :parens, :kw_end def initialize(curly:, square:, parens:, kw_end:) diff --git a/spec/unit/block_document_spec.rb b/spec/unit/block_document_spec.rb new file mode 100644 index 0000000..49aa01e --- /dev/null +++ b/spec/unit/block_document_spec.rb @@ -0,0 +1,120 @@ +# frozen_string_literal: true + +require_relative "../spec_helper" + +module DeadEnd + RSpec.describe BlockDocument do + it "captures" do + source = <<~'EOM' + if true + print 'huge 1' + print 'huge 2' + print 'huge 3' + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + + blocks = document.to_a + + node = document.capture(node: blocks[3], captured: blocks[1..2]) + + expect(node.to_s).to eq(code_lines[1..3].join) + expect(node.start_index).to eq(1) + expect(node.indent).to eq(2) + expect(node.next_indent).to eq(0) + expect(document.map(&:itself).length).to eq(3) + + # Document has changed, rebuild blocks to array + blocks = document.to_a + node = document.capture(node: blocks[1], captured: [blocks[0], blocks[2]]) + + expect(node.to_s).to eq(code_lines.join) + expect(node.inner.length).to eq(3) + end + + it "captures complicated" do + source = <<~'EOM' + if true # 0 + print 'huge 1' # 1 + end # 2 + + if true # 4 + print 'huge 2' # 5 + end # 6 + + if true # 8 + print 'huge 3' # 9 + end # 10 + EOM + + code_lines = CleanDocument.new(source: 
source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + + blocks = document.to_a + document.capture(node: blocks[1], captured: [blocks[0], blocks[2]]) + blocks = document.to_a + document.capture(node: blocks[2], captured: [blocks[1], blocks[3]]) + blocks = document.to_a + document.capture(node: blocks[3], captured: [blocks[2], blocks[4]]) + + blocks = document.to_a + expect(blocks.length).to eq(3) + root = document.root + document.capture(node: root, captured: blocks[1..-1]) + + blocks = document.to_a + expect(blocks.length).to eq(1) + expect(document.root.inner.length).to eq(3) + expect(document.root.inner[0].to_s).to eq(<<~'EOM') + if true # 0 + print 'huge 1' # 1 + end # 2 + EOM + + expect(document.root.inner[1].to_s).to eq(<<~'EOM') + if true # 4 + print 'huge 2' # 5 + end # 6 + EOM + + expect(document.root.inner[2].to_s).to eq(<<~'EOM') + if true # 8 + print 'huge 3' # 9 + end # 10 + EOM + end + + it "prioritizes indent" do + code_lines = CodeLine.from_source(<<~'EOM') + def foo + end # one + end # two + EOM + + document = BlockDocument.new(code_lines: code_lines).call + one = document.queue.pop + expect(one.to_s.strip).to eq("end # one") + end + + it "Block document dequeues from bottom to top" do + code_lines = CodeLine.from_source(<<~'EOM') + Foo.call + end + EOM + + document = BlockDocument.new(code_lines: code_lines).call + one = document.queue.pop + expect(one.to_s.strip).to eq("end") + + two = document.queue.pop + expect(two.to_s.strip).to eq("Foo.call") + + expect(one.above).to eq(two) + expect(two.below).to eq(one) + + expect(document.queue.pop).to eq(nil) + end + end +end diff --git a/spec/unit/block_node_spec.rb b/spec/unit/block_node_spec.rb new file mode 100644 index 0000000..530319c --- /dev/null +++ b/spec/unit/block_node_spec.rb @@ -0,0 +1,32 @@ +# frozen_string_literal: true + +require_relative "../spec_helper" + +module DeadEnd + RSpec.describe BlockNode do + it "Can figure out it's own next_indentation" do + source = 
<<~'EOM' + if true + print 'huge' + print 'huge' + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + + expect(document.map(&:next_indent)).to eq([0, 2, 2, 0]) + + source = <<~'EOM' + if true + print 'huge' + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + + expect(document.map(&:next_indent)).to eq([0, 0, 0]) + end + end +end diff --git a/spec/unit/code_search_spec.rb b/spec/unit/code_search_spec.rb index 8f3ca19..76f08b7 100644 --- a/spec/unit/code_search_spec.rb +++ b/spec/unit/code_search_spec.rb @@ -501,5 +501,6 @@ def foo end EOM end + end end diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb new file mode 100644 index 0000000..8df4493 --- /dev/null +++ b/spec/unit/indent_tree_spec.rb @@ -0,0 +1,146 @@ +# frozen_string_literal: true + +require_relative "../spec_helper" + +module DeadEnd + RSpec.describe BlockDocument do + it "extra space before end" do + source = <<~'EOM' + Foo.call + def foo + print "lol" + print "lol" + end # one + end # two + EOM + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + blocks = document.to_a + expect(blocks.length).to eq(1) + expect(document.root.leaning).to eq(:right) + + expect(document.root.inner.length).to eq(3) + expect(document.root.inner[0].to_s).to eq(<<~'EOM') + Foo.call + EOM + expect(document.root.inner[0].indent).to eq(0) + + + expect(document.root.inner[1].to_s).to eq(<<~'EOM'.indent(2)) + def foo + print "lol" + print "lol" + end # one + EOM + expect(document.root.inner[1].balanced?).to be_truthy + expect(document.root.inner[1].indent).to eq(2) + + expect(document.root.inner[2].to_s).to eq(<<~'EOM') + end # two + EOM + expect(document.root.inner[2].indent).to eq(0) + end + + it "captures complicated" do + 
source = <<~'EOM' + if true # 0 + print 'huge 1' # 1 + end # 2 + + if true # 4 + print 'huge 2' # 5 + end # 6 + + if true # 8 + print 'huge 3' # 9 + end # 10 + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document) + tree.call + + blocks = document.to_a + expect(blocks.length).to eq(1) + + expect(document.root.inner.length).to eq(3) + expect(document.root.inner[0].to_s).to eq(<<~'EOM') + if true # 0 + print 'huge 1' # 1 + end # 2 + EOM + + expect(document.root.inner[1].to_s).to eq(<<~'EOM') + if true # 4 + print 'huge 2' # 5 + end # 6 + EOM + + expect(document.root.inner[2].to_s).to eq(<<~'EOM') + if true # 8 + print 'huge 3' # 9 + end # 10 + EOM + end + + it "prioritizes indent" do + code_lines = CodeLine.from_source(<<~'EOM') + def foo + end # one + end # two + EOM + + document = BlockDocument.new(code_lines: code_lines).call + one = document.queue.pop + expect(one.to_s.strip).to eq("end # one") + end + + it "captures" do + source = <<~'EOM' + if true + print 'huge 1' + print 'huge 2' + print 'huge 3' + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document) + tree.call + + # blocks = document.to_a + expect(document.root.to_s).to eq(code_lines.join) + expect(document.to_a.length).to eq(1) + expect(document.root.inner.length).to eq(3) + end + + it "simple" do + skip + source = <<~'EOM' + print 'lol' + print 'lol' + + Foo.call # missing do + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + search = BlockSearch.new(document: document).call + search.call + + expect(search.document.root).to eq( + BlockNode.new(lines: code_lines[0..1], indent: 0).tap { |node| + node.inner << BlockNode.new(lines: code_lines[0], indent: 0) + node.right = 
BlockNode.new(lines: code_lines[1], indent: 0) + } + ) + end + + + end +end From ddcc64660e326931136122856dc4a8dd0ed04142 Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 26 Jan 2022 15:28:55 -0600 Subject: [PATCH 03/58] Initial tree building I'm sure there are edge cases but I'm liking the properties of the tree. It lets me view blocks logically the way I construct them in my head (based on indentation). I may want to make some convenience methods, for instance given the if/else/end cadence I want to quickly view ``` Args.new(parts: [argument], location: argument.location) Args.new( parts: arguments.parts << argument, location: arguments.location.to(argument.location) ) ``` And: ``` if arguments.parts.empty? else end ``` If the problem is in the block, this will let me figure out if I need to "zoom" in another level or if I've already isolated the issue. In the current indentation. So far all my tests are mostly with valid code and mostly regular indentation. I don't know how well these properties will hold when we face some "real" world code cases --- spec/unit/indent_tree_spec.rb | 75 ++++++++++++++++++++++++++++++++--- 1 file changed, 70 insertions(+), 5 deletions(-) diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 8df4493..72c8ecc 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -3,7 +3,76 @@ require_relative "../spec_helper" module DeadEnd - RSpec.describe BlockDocument do + RSpec.describe IndentTree do + it "valid if/else end" do + source = <<~'EOM' + def on_args_add(arguments, argument) + if arguments.parts.empty? 
+ + Args.new(parts: [argument], location: argument.location) + else + + Args.new( + parts: arguments.parts << argument, + location: arguments.location.to(argument.location) + ) + end + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + blocks = document.to_a + expect(blocks.length).to eq(1) + expect(document.root.leaning).to eq(:equal) + expect(document.root.inner.length).to eq(3) + expect(document.root.inner[0].to_s).to eq(<<~'EOM') + def on_args_add(arguments, argument) + EOM + + expect(document.root.inner[1].to_s).to eq(<<~'EOM'.indent(2)) + if arguments.parts.empty? + Args.new(parts: [argument], location: argument.location) + else + Args.new( + parts: arguments.parts << argument, + location: arguments.location.to(argument.location) + ) + end + EOM + + expect(document.root.inner[2].to_s).to eq(<<~'EOM') + end + EOM + + inside = document.root.inner[1] + expect(inside.inner.length).to eq(5) + expect(inside.inner[0].to_s).to eq(<<~'EOM'.indent(2)) + if arguments.parts.empty? 
+ EOM + + expect(inside.inner[1].to_s).to eq(<<~'EOM'.indent(4)) + Args.new(parts: [argument], location: argument.location) + EOM + + expect(inside.inner[2].to_s).to eq(<<~'EOM'.indent(2)) + else + EOM + + expect(inside.inner[3].to_s).to eq(<<~'EOM'.indent(4)) + Args.new( + parts: arguments.parts << argument, + location: arguments.location.to(argument.location) + ) + EOM + + expect(inside.inner[4].to_s).to eq(<<~'EOM'.indent(2)) + end + EOM + end + it "extra space before end" do source = <<~'EOM' Foo.call @@ -26,8 +95,6 @@ def foo Foo.call EOM expect(document.root.inner[0].indent).to eq(0) - - expect(document.root.inner[1].to_s).to eq(<<~'EOM'.indent(2)) def foo print "lol" @@ -140,7 +207,5 @@ def foo } ) end - - end end From 3682b6b4b965b06edbd78dfe88d5689d8ec7e9af Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 26 Jan 2022 15:43:37 -0600 Subject: [PATCH 04/58] Handle invalid nodes Really loving these BlockNode properties. I don't want an invalid node to "capture" a valid one. We can prevent this by ensuring that the blocks are always built from the inside out. For example, if we're going up and see `)`, it's a sign that the code above us is building a valid block. If we grab it now, it means the above code couldn't just be ``` ( # inner ) ``` It would have to be ``` ( # inner ) # plus extra stuff here ``` Which we don't want. If all the code is completely valid then we'll eventually build correct blocks. Otherwise, I'm hoping that we'll preserve relatively logical blocks that isolate their own syntax errors. One unknown is how a :both leaning line would do here. I think if we are expanding up, a line leaning "both" would need to expand down, so we should capture it. Same follows for expanding down. So I think the current logic will hold, but perhaps there will be edge cases to find. Either way, this is quite an exciting development. I am hopeful that this is the right path. I want to keep working on the tree logic, adding some more tests.
Then I want to take a stab at building an algorithm that searches the tree. We can't purely rely on leaning when doing a search, as the syntax error could be due to missing `|` or an extra comma etc. However, it might inform us as to which nodes to look at first as long as we fall back to checking all nodes (if those prove to not hold the full set of syntax errors). Also worth noting that I think we'll have to completely re-visit the "capture context" logic, as it was mostly based on the blocks that the prior search produced. This new search will make differently shaped blocks. We should optimize for speed and quality. The "bad" blocks returned should contain the error, and we should do it fast. Later on we might want to split the blocks up, say by chunking into logical blocks at the same indentation based off of newline/whitespace as the old algorithm did. This new algorithm throws away empty/double-newline lines currently. That allows us to completely get rid of any "is this line empty" checks which littered the prior algorithm. It's better to normalize and build a general algorithm that works and doesn't have to handle edge cases. Then later go back and clean up the results instead of making the search/tree-building overly complicated just to simplify showing results. --- lib/dead_end/indent_tree.rb | 2 + spec/unit/indent_tree_spec.rb | 78 +++++++++++++++++++++++++++++++++++ 2 files changed, 80 insertions(+) diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 28a4a06..79499c5 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -32,6 +32,7 @@ def reduce indent = original.next_indent while (above = blocks.last.above) && above.indent >= indent + break if above.leaning == :right blocks << above break if above.leaning == :left end @@ -39,6 +40,7 @@ def reduce blocks.reverse!
while (below = blocks.last.below) && below.indent >= indent + break if below.leaning == :left blocks << below break if below.leaning == :right end diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 72c8ecc..b86c1d4 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,84 @@ module DeadEnd RSpec.describe IndentTree do + it "invalid if/else end with surrounding code" do + source = <<~'EOM' + class Foo + def to_json(*opts) + { type: :args, parts: parts, loc: location }.to_json(*opts) + end + end + + def on_args_add(arguments, argument) + if arguments.parts.empty? + Args.new(parts: [argument], location: argument.location) + else + + Args.new( + parts: arguments.parts << argument, + location: arguments.location.to(argument.location) + ) + end + # Missing end here, comments are erased via CleanDocument + + class ArgsAddBlock + attr_reader :arguments + + attr_reader :block + + attr_reader :location + + def initialize(arguments:, block:, location:) + @arguments = arguments + @block = block + @location = location + end + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + blocks = document.to_a + expect(blocks.length).to eq(1) + expect(document.root.leaning).to eq(:left) + expect(document.root.inner[0].to_s).to eq(<<~'EOM') + class Foo + def to_json(*opts) + { type: :args, parts: parts, loc: location }.to_json(*opts) + end + end + EOM + expect(document.root.inner[0].leaning).to eq(:equal) + expect(document.root.inner[1].inner[0].to_s).to eq(<<~'EOM') + def on_args_add(arguments, argument) + if arguments.parts.empty? 
+ Args.new(parts: [argument], location: argument.location) + else + Args.new( + parts: arguments.parts << argument, + location: arguments.location.to(argument.location) + ) + end + EOM + expect(document.root.inner[1].inner[0].leaning).to eq(:left) + + expect(document.root.inner[1].inner[1].to_s).to eq(<<~'EOM') + class ArgsAddBlock + attr_reader :arguments + attr_reader :block + attr_reader :location + def initialize(arguments:, block:, location:) + @arguments = arguments + @block = block + @location = location + end + end + EOM + expect(document.root.inner[1].inner[1].leaning).to eq(:equal) + end + it "valid if/else end" do source = <<~'EOM' def on_args_add(arguments, argument) From a7d0fe2ac980aaa75cb677664af1ad7c3ab35abe Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 26 Jan 2022 16:07:16 -0600 Subject: [PATCH 05/58] Tree building perf commit Currently spending 40% of time building the tree in popping off nodes ``` 40.12% 0.35% 3.16 0.03 0.00 3.13 23373 DeadEnd::PriorityQueue#pop ``` Spending 9% of time enqueueing nodes. It might be worth it to revisit using an insertion sort (with bsearch_index) to see if it's faster than a heap implementation, as we're guaranteed to go through all elements in the queue. Currently perf for tree building our target case is: ``` $ be rspec spec/unit/indent_tree_spec.rb:7 Run options: include {:locations=>{"./spec/unit/indent_tree_spec.rb"=>[7]}} DeadEnd::IndentTree WIP syntax_tree.rb.txt for performance validation Finished in 0.75664 seconds (files took 0.14354 seconds to load) 1 example, 0 failures ``` Main has the full search around 1.1~1.2 seconds (with the same time to load). So our budget to hit equal perf is 0.45 seconds. I'm hoping we can beat that by a bunch, but we've not even touched Ripper yet and it's *supposed* to be the expensive part.
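
A minimal sketch of the insertion sort idea mentioned above (keep a sorted array, find the insert position with `bsearch_index`, pop from the end). The class name `SortedArrayQueue` is hypothetical and is not the dead_end implementation; it only assumes elements respond to `<=>` and that the largest element should be popped first, as the existing queue specs suggest.

```ruby
# Hypothetical illustration only, not code from this patch series.
class SortedArrayQueue
  def initialize
    @array = []
  end

  # Binary search for the first existing element that is >= the new one
  # and insert in front of it, so @array stays sorted in ascending order.
  # The search is O(log n); Array#insert still shifts elements, so O(n).
  def <<(element)
    index = @array.bsearch_index { |existing| (existing <=> element) >= 0 }
    @array.insert(index || @array.length, element)
    self
  end

  # The largest element always sits at the end, so popping is O(1).
  def pop
    @array.pop
  end

  def peek
    @array.last
  end

  def length
    @array.length
  end
end

queue = SortedArrayQueue.new
[3, 1, 2, 5, 4].each { |n| queue << n }
drained = []
while (element = queue.pop)
  drained << element
end
```

The trade-off is the one described above: every insert pays for shifting array elements, but draining the whole queue never has to re-heapify, which matters when every node is eventually popped.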
--- lib/dead_end/block_document.rb | 22 +++++++++++++++------- lib/dead_end/block_node.rb | 2 -- lib/dead_end/indent_tree.rb | 20 ++------------------ spec/unit/indent_tree_spec.rb | 13 +++++++++++++ 4 files changed, 30 insertions(+), 27 deletions(-) diff --git a/lib/dead_end/block_document.rb b/lib/dead_end/block_document.rb index eda3f24..785eebe 100644 --- a/lib/dead_end/block_document.rb +++ b/lib/dead_end/block_document.rb @@ -59,14 +59,9 @@ def call self end - def capture(node: , captured: ) - inner = [] - inner.concat(Array(captured)) - inner << node - inner.sort_by! {|block| block.start_index } - + def capture_all(inner) lines = [] - indent = node.indent + indent = inner.first.indent lex_diff = LexPairDiff.new_empty inner.each do |block| lines.concat(block.lines) @@ -98,6 +93,15 @@ def capture(node: , captured: ) now end + def capture(node: , captured: ) + inner = [] + inner.concat(Array(captured)) + inner << node + inner.sort_by! {|block| block.start_index } + + capture_all(inner) + end + def eat_above(node) return unless now = node&.eat_above @@ -126,5 +130,9 @@ def pop def peek @queue.peek end + + def inspect + "#" + end end end diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index bb73c0c..3f055ea 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -10,8 +10,6 @@ def initialize(lines: , indent: , next_indent: nil, lex_diff: nil) @indent = indent @next_indent = next_indent @lines = lines - @left = nil - @right = nil @inner = [] @start_index = lines.first.index diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 79499c5..e71363b 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -11,15 +11,6 @@ def initialize(document: ) def call reduce - # loop do - # requeue - # if document.queue.length >= @last_length - # break - # else - # @last_length = document.queue.length - # reduce - # end - # end self end @@ -45,21 +36,14 @@ def reduce break if below.leaning == 
:right end - blocks.delete(original) - if !blocks.empty? - node = document.capture(node: original, captured: blocks) + if blocks.length != 1 + node = document.capture_all(blocks) document.queue << node end end self end - def requeue - document.each do |block| - document.queue << block - end - end - def to_s @document.to_s end diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index b86c1d4..88b34a4 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,19 @@ module DeadEnd RSpec.describe IndentTree do + it "WIP syntax_tree.rb.txt for performance validation" do + file = fixtures_dir.join("syntax_tree.rb.txt") + lines = file.read.lines + lines.delete_at(768 - 1) + source = lines.join + + debug_perf do + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + end + end + it "invalid if/else end with surrounding code" do source = <<~'EOM' class Foo From 6fd4e36b6a0695ff6ac2e93ecc0948f63d8ee3bc Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 26 Jan 2022 16:45:15 -0600 Subject: [PATCH 06/58] Reintroduce and fix InsertionSort for performance MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Previously I wasn't using `bsearch_index` which seems like a MASSIVE oversight. However we still see speed gains whenever we don't pop the whole queue. InsertionSortQueue is much slower on insert, however on insert + pop for all elements it's roughly 2~3x faster than a priority queue (benchmark ips included in spec). 
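A minimal sketch of the bsearch-based insert described above, not the code in this patch. `SketchSortedQueue` is a hypothetical name, and the find-minimum block form of `Array#bsearch_index` stands in for the `case` statement used in the diff. It assumes every element is comparable with `<=>`.

```ruby
# Hypothetical stand-in for the queue described above: the array is kept
# sorted in ascending order at all times, so the largest element is
# always last and `pop` is a plain Array#pop.
class SketchSortedQueue
  def initialize
    @array = []
  end

  def <<(value)
    # Find-minimum mode: the block is false for every element <= value and
    # true for every element > value, so bsearch_index returns the first
    # position where `value` can be inserted while keeping the order.
    # It returns nil when no element is larger, so fall back to the end.
    index = @array.bsearch_index { |existing| (value <=> existing) < 0 } || @array.length
    @array.insert(index, value)
    self
  end

  def pop
    @array.pop
  end

  def to_a
    @array.dup
  end
end

queue = SketchSortedQueue.new
[33, 44, 1].each { |value| queue << value }
```

Here `queue.to_a` is `[1, 33, 44]` and `queue.pop` returns `44`. The insert still shifts elements, so it costs more than a heap push, but popping every element needs no re-balancing.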
``` $ DEBUG_PERF=1 be rspec spec/unit/priority_queue_spec.rb DeadEnd::CodeFrontier Warming up -------------------------------------- Bsearch insertion 18.000 i/100ms Priority queue 7.000 i/100ms Calculating ------------------------------------- Bsearch insertion 193.177 (± 3.1%) i/s - 972.000 in 5.037030s Priority queue 74.128 (± 2.7%) i/s - 371.000 in 5.009844s Comparison: Bsearch insertion: 193.2 i/s Priority queue : 74.1 i/s - 2.61x (± 0.00) slower ``` This reduces the previously stated metric from ~0.75 seconds to ~0.57 seconds: ``` $ rspec ./spec/unit/indent_tree_spec.rb:8 Run options: include {:locations=>{"./spec/unit/indent_tree_spec.rb"=>[8]}} DeadEnd::IndentTree WIP syntax_tree.rb.txt for performance validation Finished in 0.57633 seconds (files took 0.30931 seconds to load) 1 example, 0 failures ``` (Though I'm not sure why the "files took" time went up by so much. That seems strange.) --- lib/dead_end/block_document.rb | 2 +- lib/dead_end/priority_queue.rb | 62 ++++++++++++++++++++++++++++++++ spec/unit/priority_queue_spec.rb | 30 +++++++++++++++++++++++++++++- 3 files changed, 92 insertions(+), 2 deletions(-) diff --git a/lib/dead_end/block_document.rb b/lib/dead_end/block_document.rb index 785eebe..8769dde 100644 --- a/lib/dead_end/block_document.rb +++ b/lib/dead_end/block_document.rb @@ -9,7 +9,7 @@ class BlockDocument def initialize(code_lines: ) @code_lines = code_lines blocks = nil - @queue = PriorityQueue.new + @queue = InsertionSortQueue.new @root = nil end diff --git a/lib/dead_end/priority_queue.rb b/lib/dead_end/priority_queue.rb index 3621e70..8dd370f 100644 --- a/lib/dead_end/priority_queue.rb +++ b/lib/dead_end/priority_queue.rb @@ -1,6 +1,68 @@ # frozen_string_literal: true module DeadEnd + # Sort elements on insert + # + # Instead of constantly calling `sort!`, put + # the element where it belongs the first time + # around + # + # Example: + # + # sorted = InsertionSortQueue.new + # sorted << 33 + # sorted << 44 + # sorted << 1 + # puts sorted.to_a + # # => [1,
33, 44] + # + class InsertionSortQueue + def initialize + @array = [] + end + + def <<(value) + index = @array.bsearch_index do |existing| + case value <=> existing + when -1 + true + when 0 + false + when 1 + false + end + end || @array.length + + + @array.insert(index, value) + end + + def to_a + @array + end + + def pop + @array.pop + end + + def length + @array.length + end + + def empty? + @array.empty? + end + + def peek + @array.last + end + + # Legacy for testing PriorityQueue + def sorted + @array + end + end + # Holds elements in a priority heap on insert # # Instead of constantly calling `sort!`, put diff --git a/spec/unit/priority_queue_spec.rb b/spec/unit/priority_queue_spec.rb index 7381559..1ff17f3 100644 --- a/spec/unit/priority_queue_spec.rb +++ b/spec/unit/priority_queue_spec.rb @@ -20,6 +20,34 @@ def inspect end RSpec.describe CodeFrontier do + it "benchmark/ips" do + skip unless ENV["DEBUG_PERF"] + require 'benchmark/ips' + + values = 5000.times.map { rand(0..100) }.freeze + + Benchmark.ips do |x| + x.report("Bsearch insertion") { + q = InsertionSortQueue.new + values.each do |v| + q << v + end + while q.pop() do + end + } + + x.report("Priority queue ") { + q = PriorityQueue.new + values.each do |v| + q << v + end + while q.pop() do + end + } + x.compare! + end + end + it "works" do q = PriorityQueue.new q << 1 @@ -63,7 +91,7 @@ def inspect end it "priority queue" do - frontier = PriorityQueue.new + frontier = InsertionSortQueue.new frontier << CurrentIndex.new(0) frontier << CurrentIndex.new(1) From 24cde0f00adfff9e22b0050245bb1ed9cc83fe5f Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 26 Jan 2022 16:58:14 -0600 Subject: [PATCH 07/58] Fix missing deleted logic The prior results are actually MUCH better than initially thought. I forgot to trigger the "deleted" logic that cleans up/removes blocks from the frontier/queue.
That roughly dropped time to build the tree in half which gives us much more time to work with: ``` $ rspec ./spec/unit/indent_tree_spec.rb:8 Run options: include {:locations=>{"./spec/unit/indent_tree_spec.rb"=>[8]}} DeadEnd::IndentTree WIP syntax_tree.rb.txt for performance validation Finished in 0.25907 seconds (files took 0.31338 seconds to load) 1 example, 0 failures ``` Naturally it makes a test fail rspec ./spec/unit/indent_tree_spec.rb:20 # DeadEnd::IndentTree invalid if/else end with surrounding code UGHHHH, It might be a good time to implement tree creation tracing. Here's the problem: ``` ============= Expanding block location: arguments.location.to(argument.location) Result: parts: arguments.parts << argument, location: arguments.location.to(argument.location) ============= Expanding block parts: arguments.parts << argument, location: arguments.location.to(argument.location) Result: Args.new( parts: arguments.parts << argument, location: arguments.location.to(argument.location) ) ============= Expanding block @location = location Result: @arguments = arguments @block = block @location = location ============= Expanding block @arguments = arguments @block = block @location = location Result: def initialize(arguments:, block:, location:) @arguments = arguments @block = block @location = location end ============= Expanding block Args.new( parts: arguments.parts << argument, location: arguments.location.to(argument.location) ) Result: if arguments.parts.empty? Args.new(parts: [argument], location: argument.location) else Args.new( parts: arguments.parts << argument, location: arguments.location.to(argument.location) ) end ``` The block ``` Args.new(parts: [argument], location: argument.location) ``` Needs to be expanded first. 
It should be based on its indent/next indent. I can work with that. --- lib/dead_end/block_document.rb | 25 ++++--------------------- 1 file changed, 4 insertions(+), 21 deletions(-) diff --git a/lib/dead_end/block_document.rb b/lib/dead_end/block_document.rb index 8769dde..771a322 100644 --- a/lib/dead_end/block_document.rb +++ b/lib/dead_end/block_document.rb @@ -70,6 +70,10 @@ def capture_all(inner) indent = block.indent if block.indent < indent end + while queue&.peek&.deleted? + queue.pop + end + now = BlockNode.new( lines: lines, lex_diff: lex_diff, @@ -102,27 +106,6 @@ def capture(node: , captured: ) capture_all(inner) end - def eat_above(node) - return unless now = node&.eat_above - - if node.above == @root - @root = now - end - - node.above.delete - node.delete - - while queue&.peek&.deleted? - queue.pop - end - - now - end - - def eat_below(node) - eat_above(node&.below) - end - def pop @queue.pop end From 32eb0b37a78e69b7038ce549bb6935e347617061 Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 26 Jan 2022 21:08:24 -0600 Subject: [PATCH 08/58] Build tree recording I need the ability to introspect the tree building process as it's recursive and complex. This commit adds an initial recording mechanism.
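To illustrate the idea, here is a hedged sketch of a recorder paired with a null object, so the tree-building code can call `capture` unconditionally. `SketchRecorder` and `SketchNullRecorder` are hypothetical names and the file-naming scheme is an assumption; they are not the classes added in this commit.

```ruby
require "pathname"
require "tmpdir"

# Hypothetical recorder: writes one numbered file per captured step so
# the order of operations can be replayed by reading the directory.
class SketchRecorder
  def initialize(dir:)
    @dir = Pathname(dir)
    @tick = 0
  end

  # Returns the Pathname that was written.
  def capture(text, name:)
    @tick += 1
    path = @dir.join("#{@tick}-#{name}.txt")
    path.write(text)
    path
  end
end

# Hypothetical null object: same interface, does nothing. Callers never
# need to check whether recording is enabled.
class SketchNullRecorder
  def capture(text, name:)
    nil
  end
end

Dir.mktmpdir do |dir|
  recorder = SketchRecorder.new(dir: dir)
  recorder.capture("if true\n", name: "pop")
  recorder.capture("if true\nend\n", name: "expand")
end
```

Swapping in `SketchNullRecorder.new` keeps the calling code identical while skipping all file I/O.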
--- lib/dead_end/block_document.rb | 1 + lib/dead_end/block_node.rb | 5 ++- lib/dead_end/indent_tree.rb | 64 +++++++++++++++++++++++++++++----- spec/unit/indent_tree_spec.rb | 21 ++++++----- 4 files changed, 73 insertions(+), 18 deletions(-) diff --git a/lib/dead_end/block_document.rb b/lib/dead_end/block_document.rb index 771a322..75ad9b1 100644 --- a/lib/dead_end/block_document.rb +++ b/lib/dead_end/block_document.rb @@ -3,6 +3,7 @@ module DeadEnd class BlockDocument attr_reader :blocks, :queue, :root + attr_reader :blocks, :queue, :root, :code_lines include Enumerable diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 3f055ea..1a3796a 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -3,7 +3,7 @@ module DeadEnd class BlockNode attr_accessor :above, :below, :left, :right, :inner - attr_reader :lines, :start_index, :end_index, :lex_diff, :indent + attr_reader :lines, :start_index, :end_index, :lex_diff, :indent, :starts_at, :ends_at def initialize(lines: , indent: , next_indent: nil, lex_diff: nil) lines = Array(lines) @@ -15,6 +15,9 @@ def initialize(lines: , indent: , next_indent: nil, lex_diff: nil) @start_index = lines.first.index @end_index = lines.last.index + @starts_at = @start_index + 1 + @ends_at = @end_index + 1 + if lex_diff.nil? 
set_lex_diff_from(@lines) else diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index e71363b..328008c 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -1,12 +1,56 @@ # frozen_string_literal: true module DeadEnd + class Recorder + def initialize(dir: , code_lines: ) + @code_lines = code_lines + @dir = Pathname(dir) + @tick = 0 + @name_tick = Hash.new {|h, k| h[k] = 0} + end + + def capture(block, name: ) + @tick += 1 + + filename = "#{@tick}-#{name}-#{@name_tick[name] += 1}-(#{block.starts_at}__#{block.ends_at}).txt" + @dir.join(filename).open(mode: "a") do |f| + document = DisplayCodeWithLineNumbers.new( + lines: @code_lines, + terminal: false, + highlight_lines: block.lines + ).call + + f.write(" Block lines: #{(block.starts_at + 1)..(block.ends_at + 1)} (#{name})\n") + f.write(" indent: #{block.indent} next_indent: #{block.next_indent}\n\n") + f.write("#{document}") + end + end + end + + class NullRecorder + def capture(block, name: ) + end + end + class IndentTree attr_reader :document - def initialize(document: ) + def initialize(document: , recorder: DEFAULT_VALUE) @document = document @last_length = Float::INFINITY + + if recorder != DEFAULT_VALUE + @recorder = recorder + else + dir = ENV["DEAD_END_RECORD_DIR"] || ENV["DEBUG"] ? DeadEnd.record_dir("tmp") : nil + if dir.nil? + @recorder = NullRecorder.new + else + dir = dir.join("build_tree") + dir.mkpath + @recorder = Recorder.new(dir: dir, code_lines: document.code_lines) + end + end end def call @@ -15,29 +59,33 @@ def call self end - - def reduce + private def reduce while block = document.pop original = block blocks = [block] indent = original.next_indent while (above = blocks.last.above) && above.indent >= indent - break if above.leaning == :right + leaning = above.leaning + break if leaning == :right blocks << above - break if above.leaning == :left + break if leaning == :left end blocks.reverse! 
while (below = blocks.last.below) && below.indent >= indent - break if below.leaning == :left + leaning = below.leaning + break if leaning == :left blocks << below - break if below.leaning == :right + break if leaning == :right end - if blocks.length != 1 + @recorder.capture(original, name: "pop") + + if blocks.length > 1 node = document.capture_all(blocks) + @recorder.capture(node, name: "expand") document.queue << node end end diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 88b34a4..929aeb1 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -69,18 +69,21 @@ def to_json(*opts) expect(document.root.inner[0].leaning).to eq(:equal) expect(document.root.inner[1].inner[0].to_s).to eq(<<~'EOM') def on_args_add(arguments, argument) - if arguments.parts.empty? - Args.new(parts: [argument], location: argument.location) - else - Args.new( - parts: arguments.parts << argument, - location: arguments.location.to(argument.location) - ) - end EOM expect(document.root.inner[1].inner[0].leaning).to eq(:left) - expect(document.root.inner[1].inner[1].to_s).to eq(<<~'EOM') + expect(document.root.inner[1].inner[1].to_s).to eq(<<~'EOM'.indent(2)) + if arguments.parts.empty? 
+ Args.new(parts: [argument], location: argument.location) + else + Args.new( + parts: arguments.parts << argument, + location: arguments.location.to(argument.location) + ) + end + EOM + + expect(document.root.inner[1].inner[2].to_s).to eq(<<~'EOM') class ArgsAddBlock attr_reader :arguments attr_reader :block From 20656d753e72e66a19d9b03901857c884566ee69 Mon Sep 17 00:00:00 2001 From: schneems Date: Thu, 27 Jan 2022 15:05:31 -0600 Subject: [PATCH 09/58] Slightly faster initial sorting Before: ``` 18.11% 0.00% 0.31 0.00 0.00 0.31 1 DeadEnd::BlockDocument#call 18.86% 0.30% 0.32 0.01 0.00 0.32 7711 DeadEnd::InsertionSortQueue#<< ``` After: ``` 17.22% 0.00% 0.29 0.00 0.00 0.29 1 DeadEnd::BlockDocument#call 6.34% 0.10% 0.11 0.00 0.00 0.11 2238 DeadEnd::InsertionSortQueue#<< ``` --- lib/dead_end/block_document.rb | 8 ++------ lib/dead_end/priority_queue.rb | 4 ++++ 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/lib/dead_end/block_document.rb b/lib/dead_end/block_document.rb index 75ad9b1..07cbe69 100644 --- a/lib/dead_end/block_document.rb +++ b/lib/dead_end/block_document.rb @@ -2,7 +2,6 @@ module DeadEnd class BlockDocument - attr_reader :blocks, :queue, :root attr_reader :blocks, :queue, :root, :code_lines include Enumerable @@ -36,7 +35,7 @@ def to_s def call last = nil - blocks = @code_lines.map.with_index do |line, i| + blocks = @code_lines.filter_map do |line| next if line.empty? node = BlockNode.new(lines: line, indent: line.indent) @@ -52,10 +51,7 @@ def call end # Need all above/below set to determine correct next_indent - blocks.each do |b| - next if b.nil? 
- queue << b - end + @queue.replace(blocks.sort) self end diff --git a/lib/dead_end/priority_queue.rb b/lib/dead_end/priority_queue.rb index 8dd370f..750c5ca 100644 --- a/lib/dead_end/priority_queue.rb +++ b/lib/dead_end/priority_queue.rb @@ -21,6 +21,10 @@ def initialize @array = [] end + def replace(array) + @array = array + end + def <<(value) index = @array.bsearch_index do |existing| case value <=> existing From b1acaf59deb0e4005db25382038e70dd3833830d Mon Sep 17 00:00:00 2001 From: schneems Date: Thu, 27 Jan 2022 15:15:43 -0600 Subject: [PATCH 10/58] Tiny perf change to CodeLine#initialize We're already looping through all lex values, we can capture the left/right count at the same time Before ``` 26.14% 1.45% 0.44 0.02 0.00 0.42 9252 DeadEnd::CodeLine#initialize ``` After ``` 25.32% 1.26% 0.41 0.02 0.00 0.39 9252 DeadEnd::CodeLine#initialize ``` --- lib/dead_end/code_line.rb | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-) diff --git a/lib/dead_end/code_line.rb b/lib/dead_end/code_line.rb index 43cead8..f40609b 100644 --- a/lib/dead_end/code_line.rb +++ b/lib/dead_end/code_line.rb @@ -57,12 +57,6 @@ def initialize(line:, index:, lex:) end set_kw_end - - @lex_diff = LexPairDiff.from_lex( - lex: @lex, - is_kw: is_kw?, - is_end: is_end? - ) end def balanced? @@ -216,7 +210,10 @@ def trailing_slash? end_count = 0 @ignore_newline_not_beg = false + + left_right = LeftRightLexCount.new @lex.each do |lex| + left_right.count_lex(lex) kw_count += 1 if lex.is_kw? end_count += 1 if lex.is_end? @@ -244,6 +241,13 @@ def trailing_slash? 
@is_kw = (kw_count - end_count) > 0 @is_end = (end_count - kw_count) > 0 + + @lex_diff = LexPairDiff.new( + curly: left_right.curly_diff, + square: left_right.square_diff, + parens: left_right.parens_diff, + kw_end: kw_count - end_count + ) end end end From 763dec438c83550ef11b7a1d9d7b1c0843d04522 Mon Sep 17 00:00:00 2001 From: schneems Date: Thu, 27 Jan 2022 15:28:55 -0600 Subject: [PATCH 11/58] Remove unused code --- lib/dead_end/block_node.rb | 50 -------------------------------------- 1 file changed, 50 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 1a3796a..8ec1eb2 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -96,10 +96,6 @@ def <=>(other) end end - def indent - @indent ||= lines.map(&:indent).min || 0 - end - def inspect "#" end @@ -114,51 +110,5 @@ def inspect def ==(other) @lines == other.lines && @indent == other.indent && next_indent == other.next_indent && @inner == other.inner end - - def eat_above - return nil if above.nil? - - node = BlockNode.new( - lines: above.lines + @lines, - indent: above.indent < @indent ? above.indent : @indent - ) - - if above.inner.empty? - node.inner << above - else - above.inner.each do |b| - node.inner << b - end - end - - if self.inner.empty? - node.inner << self - else - self.inner.each do |b| - node.inner << b - end - end - - if above.above - node.above = above.above - above.above.below = node - end - - if below - node.below = below - below.above = node - end - - node - end - - def eat_below - # return nil if below.nil? 
- # below.eat_above - end - - def without(other) - BlockNode.new(lines: self.lines - other.lines) - end end end From 4ab2b7052a328aa50e89f88515bc0a8266120b42 Mon Sep 17 00:00:00 2001 From: schneems Date: Thu, 27 Jan 2022 15:39:14 -0600 Subject: [PATCH 12/58] Pull out inner/outer nodes --- lib/dead_end/block_document.rb | 18 +----------------- lib/dead_end/block_node.rb | 32 ++++++++++++++++++++++++++++++-- lib/dead_end/indent_tree.rb | 26 ++++++++++++++++++++++++++ spec/unit/indent_tree_spec.rb | 20 ++++++++++++++++++++ 4 files changed, 77 insertions(+), 19 deletions(-) diff --git a/lib/dead_end/block_document.rb b/lib/dead_end/block_document.rb index 07cbe69..1dfff49 100644 --- a/lib/dead_end/block_document.rb +++ b/lib/dead_end/block_document.rb @@ -57,27 +57,11 @@ def call end def capture_all(inner) - lines = [] - indent = inner.first.indent - lex_diff = LexPairDiff.new_empty - inner.each do |block| - lines.concat(block.lines) - lex_diff.concat(block.lex_diff) - block.delete - indent = block.indent if block.indent < indent - end - + now = BlockNode.from_blocks(inner) while queue&.peek&.deleted? 
queue.pop end - now = BlockNode.new( - lines: lines, - lex_diff: lex_diff, - indent: indent - ) - now.inner = inner - if inner.first == @root @root = now end diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 8ec1eb2..d26b6ab 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -2,15 +2,35 @@ module DeadEnd class BlockNode + + def self.from_blocks(inner) + lines = [] + indent = inner.first.indent + lex_diff = LexPairDiff.new_empty + inner.each do |block| + lines.concat(block.lines) + lex_diff.concat(block.lex_diff) + indent = block.indent if block.indent < indent + block.delete + end + + BlockNode.new( + lines: lines, + lex_diff: lex_diff, + indent: indent, + inner: inner + ) + end + attr_accessor :above, :below, :left, :right, :inner attr_reader :lines, :start_index, :end_index, :lex_diff, :indent, :starts_at, :ends_at - def initialize(lines: , indent: , next_indent: nil, lex_diff: nil) + def initialize(lines: , indent: , next_indent: nil, lex_diff: nil, inner: []) lines = Array(lines) @indent = indent @next_indent = next_indent @lines = lines - @inner = [] + @inner = inner @start_index = lines.first.index @end_index = lines.last.index @@ -27,6 +47,14 @@ def initialize(lines: , indent: , next_indent: nil, lex_diff: nil) @deleted = false end + def outer_nodes + @outer_nodes ||= BlockNode.from_blocks inner.select {|block| block.indent == indent } + end + + def inner_nodes + @inner_nodes ||= BlockNode.from_blocks inner.select {|block| block.indent > indent } + end + def self.next_indent(above, node, below) return node.indent if above && above.indent >= node.indent return node.indent if below && below.indent >= node.indent diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 328008c..b8d9ae2 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -32,6 +32,24 @@ def capture(block, name: ) end end + class IndentSearch + def initialize(tree: ) + @tree = tree + @invalid_blocks = 
[] + end + + def call + frontier = @tree.inner.dup + while block = frontier.pop + next if block.valid? + + # + end + + self + end + end + class IndentTree attr_reader :document @@ -53,6 +71,14 @@ def initialize(document: , recorder: DEFAULT_VALUE) end end + def to_a + @document.to_a + end + + def root + @document.root + end + def call reduce diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 929aeb1..f81b68c 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -10,11 +10,31 @@ module DeadEnd lines.delete_at(768 - 1) source = lines.join + tree = nil + document = nil debug_perf do code_lines = CleanDocument.new(source: source).call.lines document = BlockDocument.new(code_lines: code_lines).call tree = IndentTree.new(document: document).call end + + expect(tree.to_a.length).to eq(1) + expect(tree.root.inner.length).to eq(3) + expect(tree.root.inner[0].to_s).to eq(<<~'EOM') + require 'ripper' + EOM + + expect(tree.root.inner[1].to_s).to eq(<<~'EOM') + require_relative 'syntax_tree/version' + EOM + + inner = tree.root.inner[2] + expect(inner.outer_nodes.to_s).to eq(<<~'EOM') + class SyntaxTree < Ripper + end + EOM + expect(inner.outer_nodes.valid?).to be_truthy + expect(inner.inner_nodes.valid?).to be_falsey end it "invalid if/else end with surrounding code" do From 5ba79c0c3de209f5f469db66b2639173ac368bea Mon Sep 17 00:00:00 2001 From: schneems Date: Thu, 27 Jan 2022 15:42:28 -0600 Subject: [PATCH 13/58] Rename @inner to @parents --- lib/dead_end/block_node.rb | 22 +++++------ spec/unit/indent_tree_spec.rb | 72 +++++++++++++++++------------------ 2 files changed, 47 insertions(+), 47 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index d26b6ab..f336a96 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -3,11 +3,11 @@ module DeadEnd class BlockNode - def self.from_blocks(inner) + def self.from_blocks(parents) lines = [] - indent = inner.first.indent + 
indent = parents.first.indent lex_diff = LexPairDiff.new_empty - inner.each do |block| + parents.each do |block| lines.concat(block.lines) lex_diff.concat(block.lex_diff) indent = block.indent if block.indent < indent @@ -18,19 +18,19 @@ def self.from_blocks(inner) lines: lines, lex_diff: lex_diff, indent: indent, - inner: inner + parents:parents ) end - attr_accessor :above, :below, :left, :right, :inner + attr_accessor :above, :below, :left, :right, :parents attr_reader :lines, :start_index, :end_index, :lex_diff, :indent, :starts_at, :ends_at - def initialize(lines: , indent: , next_indent: nil, lex_diff: nil, inner: []) + def initialize(lines: , indent: , next_indent: nil, lex_diff: nil, parents: []) lines = Array(lines) @indent = indent @next_indent = next_indent @lines = lines - @inner = inner + @parents = parents @start_index = lines.first.index @end_index = lines.last.index @@ -48,11 +48,11 @@ def initialize(lines: , indent: , next_indent: nil, lex_diff: nil, inner: []) end def outer_nodes - @outer_nodes ||= BlockNode.from_blocks inner.select {|block| block.indent == indent } + @outer_nodes ||= BlockNode.from_blocks(parents.select { |block| block.indent == indent }) end def inner_nodes - @inner_nodes ||= BlockNode.from_blocks inner.select {|block| block.indent > indent } + @inner_nodes ||= BlockNode.from_blocks(parents.select { |block| block.indent > indent }) end def self.next_indent(above, node, below) @@ -125,7 +125,7 @@ def <=>(other) end def inspect - "#" + "#" end private def set_lex_diff_from(lines) @@ -136,7 +136,7 @@ def inspect end def ==(other) - @lines == other.lines && @indent == other.indent && next_indent == other.next_indent && @inner == other.inner + @lines == other.lines && @indent == other.indent && next_indent == other.next_indent && @parents == other.parents end end end diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index f81b68c..b82b16d 100644 --- a/spec/unit/indent_tree_spec.rb +++ 
b/spec/unit/indent_tree_spec.rb @@ -19,16 +19,16 @@ module DeadEnd end expect(tree.to_a.length).to eq(1) - expect(tree.root.inner.length).to eq(3) - expect(tree.root.inner[0].to_s).to eq(<<~'EOM') + expect(tree.root.parents.length).to eq(3) + expect(tree.root.parents[0].to_s).to eq(<<~'EOM') require 'ripper' EOM - expect(tree.root.inner[1].to_s).to eq(<<~'EOM') + expect(tree.root.parents[1].to_s).to eq(<<~'EOM') require_relative 'syntax_tree/version' EOM - inner = tree.root.inner[2] + inner = tree.root.parents[2] expect(inner.outer_nodes.to_s).to eq(<<~'EOM') class SyntaxTree < Ripper end @@ -79,20 +79,20 @@ def initialize(arguments:, block:, location:) blocks = document.to_a expect(blocks.length).to eq(1) expect(document.root.leaning).to eq(:left) - expect(document.root.inner[0].to_s).to eq(<<~'EOM') + expect(document.root.parents[0].to_s).to eq(<<~'EOM') class Foo def to_json(*opts) { type: :args, parts: parts, loc: location }.to_json(*opts) end end EOM - expect(document.root.inner[0].leaning).to eq(:equal) - expect(document.root.inner[1].inner[0].to_s).to eq(<<~'EOM') + expect(document.root.parents[0].leaning).to eq(:equal) + expect(document.root.parents[1].parents[0].to_s).to eq(<<~'EOM') def on_args_add(arguments, argument) EOM - expect(document.root.inner[1].inner[0].leaning).to eq(:left) + expect(document.root.parents[1].parents[0].leaning).to eq(:left) - expect(document.root.inner[1].inner[1].to_s).to eq(<<~'EOM'.indent(2)) + expect(document.root.parents[1].parents[1].to_s).to eq(<<~'EOM'.indent(2)) if arguments.parts.empty? 
Args.new(parts: [argument], location: argument.location) else @@ -103,7 +103,7 @@ def on_args_add(arguments, argument) end EOM - expect(document.root.inner[1].inner[2].to_s).to eq(<<~'EOM') + expect(document.root.parents[1].parents[2].to_s).to eq(<<~'EOM') class ArgsAddBlock attr_reader :arguments attr_reader :block @@ -115,7 +115,7 @@ def initialize(arguments:, block:, location:) end end EOM - expect(document.root.inner[1].inner[1].leaning).to eq(:equal) + expect(document.root.parents[1].parents[1].leaning).to eq(:equal) end it "valid if/else end" do @@ -141,12 +141,12 @@ def on_args_add(arguments, argument) blocks = document.to_a expect(blocks.length).to eq(1) expect(document.root.leaning).to eq(:equal) - expect(document.root.inner.length).to eq(3) - expect(document.root.inner[0].to_s).to eq(<<~'EOM') + expect(document.root.parents.length).to eq(3) + expect(document.root.parents[0].to_s).to eq(<<~'EOM') def on_args_add(arguments, argument) EOM - expect(document.root.inner[1].to_s).to eq(<<~'EOM'.indent(2)) + expect(document.root.parents[1].to_s).to eq(<<~'EOM'.indent(2)) if arguments.parts.empty? Args.new(parts: [argument], location: argument.location) else @@ -157,32 +157,32 @@ def on_args_add(arguments, argument) end EOM - expect(document.root.inner[2].to_s).to eq(<<~'EOM') + expect(document.root.parents[2].to_s).to eq(<<~'EOM') end EOM - inside = document.root.inner[1] - expect(inside.inner.length).to eq(5) - expect(inside.inner[0].to_s).to eq(<<~'EOM'.indent(2)) + inside = document.root.parents[1] + expect(inside.parents.length).to eq(5) + expect(inside.parents[0].to_s).to eq(<<~'EOM'.indent(2)) if arguments.parts.empty? 
EOM - expect(inside.inner[1].to_s).to eq(<<~'EOM'.indent(4)) + expect(inside.parents[1].to_s).to eq(<<~'EOM'.indent(4)) Args.new(parts: [argument], location: argument.location) EOM - expect(inside.inner[2].to_s).to eq(<<~'EOM'.indent(2)) + expect(inside.parents[2].to_s).to eq(<<~'EOM'.indent(2)) else EOM - expect(inside.inner[3].to_s).to eq(<<~'EOM'.indent(4)) + expect(inside.parents[3].to_s).to eq(<<~'EOM'.indent(4)) Args.new( parts: arguments.parts << argument, location: arguments.location.to(argument.location) ) EOM - expect(inside.inner[4].to_s).to eq(<<~'EOM'.indent(2)) + expect(inside.parents[4].to_s).to eq(<<~'EOM'.indent(2)) end EOM end @@ -204,24 +204,24 @@ def foo expect(blocks.length).to eq(1) expect(document.root.leaning).to eq(:right) - expect(document.root.inner.length).to eq(3) - expect(document.root.inner[0].to_s).to eq(<<~'EOM') + expect(document.root.parents.length).to eq(3) + expect(document.root.parents[0].to_s).to eq(<<~'EOM') Foo.call EOM - expect(document.root.inner[0].indent).to eq(0) - expect(document.root.inner[1].to_s).to eq(<<~'EOM'.indent(2)) + expect(document.root.parents[0].indent).to eq(0) + expect(document.root.parents[1].to_s).to eq(<<~'EOM'.indent(2)) def foo print "lol" print "lol" end # one EOM - expect(document.root.inner[1].balanced?).to be_truthy - expect(document.root.inner[1].indent).to eq(2) + expect(document.root.parents[1].balanced?).to be_truthy + expect(document.root.parents[1].indent).to eq(2) - expect(document.root.inner[2].to_s).to eq(<<~'EOM') + expect(document.root.parents[2].to_s).to eq(<<~'EOM') end # two EOM - expect(document.root.inner[2].indent).to eq(0) + expect(document.root.parents[2].indent).to eq(0) end it "captures complicated" do @@ -247,20 +247,20 @@ def foo blocks = document.to_a expect(blocks.length).to eq(1) - expect(document.root.inner.length).to eq(3) - expect(document.root.inner[0].to_s).to eq(<<~'EOM') + expect(document.root.parents.length).to eq(3) + expect(document.root.parents[0].to_s).to 
eq(<<~'EOM') if true # 0 print 'huge 1' # 1 end # 2 EOM - expect(document.root.inner[1].to_s).to eq(<<~'EOM') + expect(document.root.parents[1].to_s).to eq(<<~'EOM') if true # 4 print 'huge 2' # 5 end # 6 EOM - expect(document.root.inner[2].to_s).to eq(<<~'EOM') + expect(document.root.parents[2].to_s).to eq(<<~'EOM') if true # 8 print 'huge 3' # 9 end # 10 @@ -296,7 +296,7 @@ def foo # blocks = document.to_a expect(document.root.to_s).to eq(code_lines.join) expect(document.to_a.length).to eq(1) - expect(document.root.inner.length).to eq(3) + expect(document.root.parents.length).to eq(3) end it "simple" do @@ -316,7 +316,7 @@ def foo expect(search.document.root).to eq( BlockNode.new(lines: code_lines[0..1], indent: 0).tap { |node| - node.inner << BlockNode.new(lines: code_lines[0], indent: 0) + node.parents << BlockNode.new(lines: code_lines[0], indent: 0) node.right = BlockNode.new(lines: code_lines[1], indent: 0) } ) From 60d796e0f8cc83909e76cc9144b43f82daec6b5f Mon Sep 17 00:00:00 2001 From: schneems Date: Thu, 27 Jan 2022 16:06:43 -0600 Subject: [PATCH 14/58] Figure out search algorithm TODO: - Enable separate search/tree reporting - Follow question and answer until we have a reasonable algorithm --- lib/dead_end/indent_tree.rb | 32 +++++++++++++++++++++++++++++++- spec/unit/indent_tree_spec.rb | 15 +++++++++++++++ 2 files changed, 46 insertions(+), 1 deletion(-) diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index b8d9ae2..278503f 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -38,12 +38,42 @@ def initialize(tree: ) @invalid_blocks = [] end + + # Keep track of trail of how we got here, Introduce Trail class + # Each main block gets a trail with one or more paths + # + # Problem: We can follow valid/invalid for awhile but + # at the edges single lines of valid code look invalid + # + # Solution maybe: Hold a set of code that is invalid with + # a sub block, and valid without it. 
Goal: Make this block + # as small as possible to reduce parsing time + # + # Problem: when to stop looking? The old "when to stop looking" + # started from not capturing the syntax error and re-checking the + # whole document when a syntax error was found. + # + # We are reversing the idea on it's head by starting with a known + # invalid state, we know if we removed the given block the whole + # document would be valid, however we want to find the smallest + # block where this holds true + # + # Goal: Find the smallest block where it's removal will make a fork + # of the path valid again. + + # Solution: Popstars never stop stopping def call frontier = @tree.inner.dup + # Check outer, check inner, map parents while block = frontier.pop next if block.valid? - # + if block.outer_nodes.valid? + frontier << block.inner_nodes + else + # frontier << block.outer_nodes + frontier << block.inner_nodes + end end self diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index b82b16d..3ad9620 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -35,6 +35,21 @@ class SyntaxTree < Ripper EOM expect(inner.outer_nodes.valid?).to be_truthy expect(inner.inner_nodes.valid?).to be_falsey + + inner = inner.inner_nodes + + expect(inner.parents[0].parents.length).to eq(31) + expect(inner.parents[0].parents.map(&:valid?)).to eq([true] * 30 + [false]) + + inner = inner.parents[0].parents.last + + expect(inner.parents[0].parents.length).to eq(183) + expect(inner.parents[0].parents.map(&:valid?)).to eq([false] + [true] * 182) + + inner = inner.parents[0].parents.first + expect(inner.to_s).to eq(<<~'EOM'.indent(2)) + def on_args_add(arguments, argument) + EOM end it "invalid if/else end with surrounding code" do From 083e7332ce41bc1653ef1ab21f56d45f154a8042 Mon Sep 17 00:00:00 2001 From: schneems Date: Thu, 27 Jan 2022 19:59:28 -0600 Subject: [PATCH 15/58] REXE test case I wanted to start with this test case as it has a property that 
causes it to fail even when most/all other test cases pass. It's the lone holdover that resisted my initial attempts to add a "heuristic expansion" in https://github.com/zombocom/dead_end/pull/129. --- spec/unit/indent_tree_spec.rb | 42 +++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 3ad9620..f043814 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,48 @@ module DeadEnd RSpec.describe IndentTree do + it "rexe regression" do + lines = fixtures_dir.join("rexe.rb.txt").read.lines + lines.delete_at(148 - 1) + source = lines.join + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + expect(tree.to_a.length).to eq(1) + expect(node.parents.length).to eq(6) + expect(node.outer_nodes.valid?).to be_falsey + expect(node.outer_nodes.parents.length).to eq(6) + expect(node.parents.map(&:valid?)).to eq([true] * 5 + [false]) + + node = node.parents.last + expect(node.parents.length).to eq(3) + expect(node.parents.map(&:valid?)).to eq([false, true, true]) + + node = node.parents.first + expect(node.parents.length).to eq(3) + expect(node.outer_nodes.valid?).to be_truthy + node = node.inner_nodes.parents[0] + expect(node.parents.length).to eq(5) + expect(node.parents.map(&:valid?)).to eq([true, true, true, true, false]) + node = node.parents.last + expect(node.parents.length).to eq(3) + expect(node.parents.map(&:valid?)).to eq([false, true, true]) + + node = node.parents.first + + expect(node.outer_nodes.valid?).to be_truthy + node = node.inner_nodes.parents[0] + expect(node.parents.length).to eq(7) + expect(node.parents.map(&:valid?)).to eq([true, true, true, true, true, false, true]) + node = node.parents[5] + expect(node.to_s).to eq(<<~'EOM'.indent(4)) + def format_requires + EOM + end + it "WIP 
syntax_tree.rb.txt for performance validation" do file = fixtures_dir.join("syntax_tree.rb.txt") lines = file.read.lines From c260450ff61b4e38309eb71902c7055d46c74d52 Mon Sep 17 00:00:00 2001 From: schneems Date: Thu, 27 Jan 2022 21:30:15 -0600 Subject: [PATCH 16/58] Add more test cases --- lib/dead_end/block_node.rb | 15 ++++- spec/unit/indent_tree_spec.rb | 118 +++++++++++++++++++++++++++++++--- 2 files changed, 122 insertions(+), 11 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index f336a96..ff688b8 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -48,11 +48,22 @@ def initialize(lines: , indent: , next_indent: nil, lex_diff: nil, parents: []) end def outer_nodes - @outer_nodes ||= BlockNode.from_blocks(parents.select { |block| block.indent == indent }) + outer = parents.select { |block| block.indent == indent } + + if outer.any? + @outer_nodes ||= BlockNode.from_blocks(outer) + else + nil + end end def inner_nodes - @inner_nodes ||= BlockNode.from_blocks(parents.select { |block| block.indent > indent }) + inner = parents.select { |block| block.indent > indent } + if inner.any? 
+ @inner_nodes ||= BlockNode.from_blocks(inner) + else + nil + end end def self.next_indent(above, node, below) diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index f043814..4ee1d21 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,113 @@ module DeadEnd RSpec.describe IndentTree do + it "regression dog test" do + source = <<~'EOM' + class Dog + def bark + puts "woof" + end + EOM + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + expect(node.outer_nodes.valid?).to be_truthy + expect(node.inner_nodes.valid?).to be_falsey + node = node.inner_nodes.parents[0] + + expect(node.outer_nodes.valid?).to be_falsey + expect(node.inner_nodes.valid?).to be_truthy + + expect(node.outer_nodes.to_s).to eq(<<~'EOM'.indent(2)) + def bark + EOM + end + + it "regression test ambiguous end" do + # Even though you would think the first step is to + # expand the "print" line, we base priority off of + # "next_indent" so the actual highest "next indent" line + # comes from "end # one" which captures "print", then it + # expands out from there + source = <<~'EOM' + def call + print "lol" + end # one + end # two + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + expect(node.outer_nodes.valid?).to be_truthy + expect(node.outer_nodes.to_s).to eq(<<~'EOM') + def call + end # two + EOM + + expect(node.inner_nodes.valid?).to be_falsey + expect(node.inner_nodes.to_s).to eq(<<~'EOM'.indent(2)) + print "lol" + end # one + EOM + + node = node.inner_nodes.parents[0] + expect(node.outer_nodes.valid?).to be_falsey + expect(node.inner_nodes.valid?).to be_truthy + expect(node.outer_nodes.to_s).to eq(<<~'EOM'.indent(2)) + end # one + EOM + end + + it 
"squished do regression" do + source = <<~'EOM' + def call + trydo + + @options = CommandLineParser.new.parse + + options.requires.each { |r| require!(r) } + load_global_config_if_exists + options.loads.each { |file| load(file) } + + @user_source_code = ARGV.join(' ') + @user_source_code = 'self' if @user_source_code == '' + + @callable = create_callable + + init_rexe_context + init_parser_and_formatters + + # This is where the user's source code will be executed; the action will in turn call `execute`. + lookup_action(options.input_mode).call unless options.noop + + output_log_entry + end # one + end # two + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + + expect(node.outer_nodes.valid?).to be_truthy + expect(node.inner_nodes.valid?).to be_falsey + + node = node.inner_nodes.parents[0] + expect(node.inner_nodes.valid?).to be_truthy + expect(node.outer_nodes.valid?).to be_falsey + expect(node.outer_nodes.to_s).to eq(<<~'EOM'.indent(2)) + trydo + end # one + EOM + end + it "rexe regression" do lines = fixtures_dir.join("rexe.rb.txt").read.lines lines.delete_at(148 - 1) @@ -14,31 +121,24 @@ module DeadEnd tree = IndentTree.new(document: document).call node = tree.root - expect(tree.to_a.length).to eq(1) - expect(node.parents.length).to eq(6) expect(node.outer_nodes.valid?).to be_falsey - expect(node.outer_nodes.parents.length).to eq(6) expect(node.parents.map(&:valid?)).to eq([true] * 5 + [false]) node = node.parents.last - expect(node.parents.length).to eq(3) expect(node.parents.map(&:valid?)).to eq([false, true, true]) node = node.parents.first - expect(node.parents.length).to eq(3) expect(node.outer_nodes.valid?).to be_truthy node = node.inner_nodes.parents[0] - expect(node.parents.length).to eq(5) expect(node.parents.map(&:valid?)).to eq([true, true, true, true, false]) + node = node.parents.last - 
expect(node.parents.length).to eq(3) expect(node.parents.map(&:valid?)).to eq([false, true, true]) node = node.parents.first - expect(node.outer_nodes.valid?).to be_truthy + node = node.inner_nodes.parents[0] - expect(node.parents.length).to eq(7) expect(node.parents.map(&:valid?)).to eq([true, true, true, true, true, false, true]) node = node.parents[5] expect(node.to_s).to eq(<<~'EOM'.indent(4)) From 22f9fc05b30c73da5446083f4d54ca5244790efb Mon Sep 17 00:00:00 2001 From: schneems Date: Fri, 28 Jan 2022 15:02:55 -0600 Subject: [PATCH 17/58] Add another test --- spec/unit/indent_tree_spec.rb | 37 +++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 4ee1d21..0636050 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,43 @@ module DeadEnd RSpec.describe IndentTree do + it "finds hanging def in this project" do + source = fixtures_dir.join("this_project_extra_def.rb.txt").read + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + expect(node.outer_nodes.valid?).to be_truthy + expect(node.inner_nodes.valid?).to be_falsey + + node = node.inner_nodes.parents[0] + + expect(node.outer_nodes.valid?).to be_truthy + expect(node.inner_nodes.valid?).to be_falsey + + node = node.inner_nodes.parents[0] + expect(node.inner_nodes).to be_falsey + + expect(node.outer_nodes.valid?).to be_falsey + node = node.outer_nodes + expect(node.inner_nodes).to be_falsey + expect(node.parents.map(&:valid?)).to eq([true, true, true, false]) + node = node.parents.last + expect(node.inner_nodes).to be_falsey + expect(node.parents.map(&:valid?)).to eq([false, true, true]) + node = node.parents.first + expect(node.inner_nodes).to be_falsey + expect(node.outer_nodes).to be_falsey + expect(node.parents).to be_empty + + 
expect(node.to_s).to eq(<<~'EOM'.indent(4)) + def filename + EOM + end + + it "regression dog test" do source = <<~'EOM' class Dog From f2b88788331f67f4fcfd2e6859fc357a366c3b66 Mon Sep 17 00:00:00 2001 From: schneems Date: Fri, 28 Jan 2022 15:03:28 -0600 Subject: [PATCH 18/58] Fix problem with my logic for "inner/outer" check When dealing with a non KW problem at the wrong indentation (too low). The next idea is to make the inner/outer logic smarter. ## Primary cases - Problem is within a kw/else/end - Problem is missing kw/end - --- lib/dead_end/block_node.rb | 33 +++++++++++++++++ spec/unit/indent_tree_spec.rb | 68 +++++++++++++++++++++++++++++++++++ 2 files changed, 101 insertions(+) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index ff688b8..f07216d 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -47,6 +47,39 @@ def initialize(lines: , indent: , next_indent: nil, lex_diff: nil, parents: []) @deleted = false end + def split_same_indent + output = [] + parents.each do |block| + if block.indent == indent + block.parents.each do |b| + output << b + end + else + output << block + end + end + + if output.any? + @split_same_indent ||= BlockNode.from_blocks(output) + else + nil + end + end + + def invalid_count + parents.select{ |block| !block.valid? }.length + end + + def join_invalid + invalid = parents.select{ |block| !block.valid? } + + if invalid.any?
+ @join_invalid ||= BlockNode.from_blocks(invalid) + else + nil + end + end + def outer_nodes outer = parents.select { |block| block.indent == indent } diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 0636050..30535f5 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,74 @@ module DeadEnd RSpec.describe IndentTree do + it "finds random pipe (|) wildly misindented" do + source = fixtures_dir.join("ruby_buildpack.rb.txt").read + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + expect(node.outer_nodes.valid?).to be_falsey + expect(node.inner_nodes).to be_falsey + + node = node.outer_nodes + expect(node.parents.length).to eq(14) + expect(node.parents.map(&:valid?)).to eq([true] * 13 + [false]) + + node = node.parents.last + expect(node.outer_nodes.valid?).to be_falsey + expect(node.inner_nodes.valid?).to be_truthy + + node = node.outer_nodes + expect(node.parents.length).to eq(3) + expect(node.parents.map(&:valid?)).to eq([false, true, false]) + + expect(node.outer_nodes&.valid?).to be_falsey + expect(node.inner_nodes&.valid?).to be_falsey + expect(node.invalid_count).to eq(2) + + node = node.join_invalid + expect(node.outer_nodes&.valid?).to be_falsey + expect(node.inner_nodes&.valid?).to be_falsey + expect(node.parents.length).to eq(2) + expect(node.parents.map(&:valid?)).to eq([false, false]) + + expect(node.split_same_indent.parents.length).to eq(4) + expect(node.split_same_indent.parents.last.to_s).to eq("end\n") + expect(node.split_same_indent.parents.map(&:valid?)).to eq([false, false, false, false]) + node = node.split_same_indent + expect(node.outer_nodes&.valid?).to be_falsey + expect(node.inner_nodes&.valid?).to be_truthy + + # Problem + # + # The outer/inner logic isn't robust. 
+ # + # Above we see two parents that are false + # + # The class line is on the first block + # and the matching end is on the second + # however, we can't join them purely by + # indentation + # + # I had the idea to split a block into it's + # parent blocks, which seems good. But i'm not totally sure when we can do this + # also after doing it only + + + puts node.inner_nodes + puts node.outer_nodes.starts_at + + node = node.outer_nodes + expect(node.to_s).to eq(<<~'EOM') + EOM + + expect(node.outer_nodes&.valid?).to be_falsey + expect(node.inner_nodes&.valid?).to be_falsey + expect(node.parents.length).to eq(2) + expect(node.parents.map(&:valid?)).to eq([false, false]) + end it "finds hanging def in this project" do source = fixtures_dir.join("this_project_extra_def.rb.txt").read From b7a6c651a6f3cf69237b99940092ff3ab65cff66 Mon Sep 17 00:00:00 2001 From: schneems Date: Sat, 29 Jan 2022 16:59:17 -0600 Subject: [PATCH 19/58] Add misindentation test This spec causes a massive problem When we have a mis-indentation in the "wrong" direction it causes the blocks to no longer capture their inner contents. --- spec/unit/indent_tree_spec.rb | 78 +++++++++++++++++++++++++++++++++++ 1 file changed, 78 insertions(+) diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 30535f5..2ca14f3 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,84 @@ module DeadEnd RSpec.describe IndentTree do + it "(smaller) finds random pipe (|) wildly misindented" do + source = <<~'EOM' + class LanguagePack::Ruby < LanguagePack::Base + def allow_git(&blk) + git_dir = ENV.delete("GIT_DIR") # can mess with bundler + blk.call + ENV["GIT_DIR"] = git_dir + end + + def add_dev_database_addon + pg_adapters.any? {|a| bundler.has_gem?(a) } ? 
['heroku-postgresql'] : [] + end + + def pg_adapters + [ + "pg", + "activerecord-jdbcpostgresql-adapter", + "jdbc-postgres", + "jdbc-postgresql", + "jruby-pg", + "rjack-jdbc-postgres", + "tgbyte-activerecord-jdbcpostgresql-adapter" + ] + end + + def add_node_js_binary + return [] if node_js_preinstalled? + + if Pathname(build_path).join("package.json").exist? || + bundler.has_gem?('execjs') || + bundler.has_gem?('webpacker') + [@node_installer.binary_path] + else + [] + end + end + + def add_yarn_binary + return [] if yarn_preinstalled? + | + if Pathname(build_path).join("yarn.lock").exist? || bundler.has_gem?('webpacker') + [@yarn_installer.name] + else + [] + end + end + + def has_yarn_binary? + add_yarn_binary.any? + end + + def node_preinstall_bin_path + return @node_preinstall_bin_path if defined?(@node_preinstall_bin_path) + + legacy_path = "#{Dir.pwd}/#{NODE_BP_PATH}" + path = run("which node").strip + if path && $?.success? + @node_preinstall_bin_path = path + elsif run("#{legacy_path}/node -v") && $?.success? + @node_preinstall_bin_path = legacy_path + else + @node_preinstall_bin_path = false + end + end + alias :node_js_preinstalled? 
:node_preinstall_bin_path + end + EOM + # search = CodeSearch.new(source) + # search.call + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + + end + it "finds random pipe (|) wildly misindented" do source = fixtures_dir.join("ruby_buildpack.rb.txt").read From 01c2b8c902384ee3d8b1567246e01adc282f25e9 Mon Sep 17 00:00:00 2001 From: schneems Date: Sat, 29 Jan 2022 17:10:36 -0600 Subject: [PATCH 20/58] Refactor decision logic into node This allows us to match generating next_indent priority with changes to the building of the indentation tree --- lib/dead_end/block_node.rb | 15 +++++++++++++-- lib/dead_end/indent_tree.rb | 6 ++++-- 2 files changed, 17 insertions(+), 4 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index f07216d..a41e85e 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -90,6 +90,18 @@ def outer_nodes end end + def expand_above?(with_indent: self.indent) + return false if above.nil? + + above.indent >= with_indent + end + + def expand_below?(with_indent: self.indent) + return false if below.nil? + + below.indent >= with_indent + end + def inner_nodes inner = parents.select { |block| block.indent > indent } if inner.any? @@ -100,8 +112,7 @@ def inner_nodes end def self.next_indent(above, node, below) - return node.indent if above && above.indent >= node.indent - return node.indent if below && below.indent >= node.indent + return node.indent if node.expand_above? || node.expand_below? 
if above if below diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 278503f..56b50e9 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -121,7 +121,8 @@ def call blocks = [block] indent = original.next_indent - while (above = blocks.last.above) && above.indent >= indent + while blocks.last.expand_above?(with_indent: indent) + above = blocks.last.above leaning = above.leaning break if leaning == :right blocks << above @@ -130,7 +131,8 @@ def call blocks.reverse! - while (below = blocks.last.below) && below.indent >= indent + while blocks.last.expand_below?(with_indent: indent) + below = blocks.last.below leaning = below.leaning break if leaning == :left blocks << below From 1fa20e1b8ca772f6e1e15a0a49cc7fc37f9ea7e0 Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 31 Jan 2022 11:39:23 -0600 Subject: [PATCH 21/58] Initial idea to search by "diagnosing" a node I am very happy with this direction. When looking at a node, we can determine if it holds a syntax error or not. If it's invalid it means that either one of its parents are the problem or it is. Through test based exploration I found several properties that seem to indicate a specific case happened. For instance if only one parent is invalid in isolation, then we should explore just that parent. I wanted to decouple inspection from modification so that we can look at a node to guess what's wrong with it, without having to take the next step of actually taking an action to return the invalid inner/parent node. This produces some possible performance problems as some of the checks for diagnose may not be cheap. I'm mitigating that where possible via memoizing expensive values. It's not ideal, but it's holding up okay. This strategy handles all of our problems like a champ, except for the full `|` example which needs more investigation. 
Note to future self we also need to handle: - actually investigating multiple syntax errors, like if there are two syntax errors in a document - Handling the `else` case. This is where kw/end are matched but there's one or more dangling pseudo keywords ("ensure", "rescue", "else", "elsif", "when", "retry") left in the middle. --- lib/dead_end/block_node.rb | 103 ++++--- lib/dead_end/indent_tree.rb | 17 +- spec/unit/block_document_spec.rb | 10 +- spec/unit/indent_tree_spec.rb | 457 +++++++++++++++---------------- 4 files changed, 297 insertions(+), 290 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index a41e85e..f52ec79 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -47,67 +47,79 @@ def initialize(lines: , indent: , next_indent: nil, lex_diff: nil, parents: []) @deleted = false end - def split_same_indent - output = [] - parents.each do |block| - if block.indent == indent - block.parents.each do |b| - output << b - end - else - output << block - end - end + def expand_above?(with_indent: self.indent) + return false if above.nil? + return false if leaf? && self.leaning == :left - if output.any? - @split_same_indent ||= BlockNode.from_blocks(output) + if above.leaning == :left + above.indent >= with_indent else - nil + true end end - def invalid_count - parents.select{ |block| !block.valid? }.length - end - - def join_invalid - invalid = parents.select{ |block| !block.valid? } + def expand_below?(with_indent: self.indent) + return false if below.nil? + return false if leaf? && self.leaning == :right - if invalid.any? - @join_invalid ||= BlockNode.from_blocks(invalid) + if below.leaning == :right + below.indent >= with_indent else - nil + true end end - def outer_nodes - outer = parents.select { |block| block.indent == indent } + def leaf? + parents.empty? + end - if outer.any? - @outer_nodes ||= BlockNode.from_blocks(outer) - else - nil - end + def next_invalid + parents.detect(&:invalid?) 
end - def expand_above?(with_indent: self.indent) - return false if above.nil? + def diagnose + return :self if leaf? + + invalid = parents.select(&:invalid?) + return :next_invalid if invalid.count == 1 - above.indent >= with_indent + return :split_leaning if split_leaning? + + :multiple end - def expand_below?(with_indent: self.indent) - return false if below.nil? + def split_leaning + block = left_right_parents + invalid = parents.select(&:invalid?) + + invalid.reject! {|x| block.parents.include?(x) } - below.indent >= with_indent + @inner_leang ||= BlockNode.from_blocks(invalid) end - def inner_nodes - inner = parents.select { |block| block.indent > indent } - if inner.any? - @inner_nodes ||= BlockNode.from_blocks(inner) + def left_right_parents + invalid = parents.select(&:invalid?) + return false if invalid.length < 3 + + left = invalid.detect {|block| block.leaning == :left } + + return false if left.nil? + + right = invalid.reverse_each.detect {|block| block.leaning == :right } + return false if right.nil? + + @left_right_parents ||= BlockNode.from_blocks([left, right]) + end + + # When a kw/end has an invalid block inbetween it will show up as [false, false, false] + # we can check if the first and last can be joined together for a valid block which + # effectively gives us [true, false, true] + def split_leaning? + block = left_right_parents + if block + block.leaning == :equal && block.valid? else - nil + false end end @@ -143,6 +155,10 @@ def deleted? @deleted end + def invalid? + !valid? + end + def valid? 
return @valid if defined?(@valid) @@ -174,6 +190,11 @@ def <=>(other) when 1 then 1 when -1 then -1 when 0 + # if leaning != other.leaning + # return -1 if self.leaning == :equal + # return 1 if other.leaning == :equal + # end + end_index <=> other.end_index end end diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 56b50e9..8abb50b 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -63,18 +63,6 @@ def initialize(tree: ) # Solution: Popstars never stop stopping def call - frontier = @tree.inner.dup - # Check outer, check inner, map parents - while block = frontier.pop - next if block.valid? - - if block.outer_nodes.valid? - frontier << block.inner_nodes - else - # frontier << block.outer_nodes - frontier << block.inner_nodes - end - end self end @@ -121,10 +109,11 @@ def call blocks = [block] indent = original.next_indent + while blocks.last.expand_above?(with_indent: indent) above = blocks.last.above leaning = above.leaning - break if leaning == :right + # break if leaning == :right blocks << above break if leaning == :left end @@ -134,7 +123,7 @@ def call while blocks.last.expand_below?(with_indent: indent) below = blocks.last.below leaning = below.leaning - break if leaning == :left + # break if leaning == :left blocks << below break if leaning == :right end diff --git a/spec/unit/block_document_spec.rb b/spec/unit/block_document_spec.rb index 49aa01e..a37fb3e 100644 --- a/spec/unit/block_document_spec.rb +++ b/spec/unit/block_document_spec.rb @@ -31,7 +31,7 @@ module DeadEnd node = document.capture(node: blocks[1], captured: [blocks[0], blocks[2]]) expect(node.to_s).to eq(code_lines.join) - expect(node.inner.length).to eq(3) + expect(node.parents.length).to eq(3) end it "captures complicated" do @@ -66,20 +66,20 @@ module DeadEnd blocks = document.to_a expect(blocks.length).to eq(1) - expect(document.root.inner.length).to eq(3) - expect(document.root.inner[0].to_s).to eq(<<~'EOM') + 
expect(document.root.parents.length).to eq(3) + expect(document.root.parents[0].to_s).to eq(<<~'EOM') if true # 0 print 'huge 1' # 1 end # 2 EOM - expect(document.root.inner[1].to_s).to eq(<<~'EOM') + expect(document.root.parents[1].to_s).to eq(<<~'EOM') if true # 4 print 'huge 2' # 5 end # 6 EOM - expect(document.root.inner[2].to_s).to eq(<<~'EOM') + expect(document.root.parents[2].to_s).to eq(<<~'EOM') if true # 8 print 'huge 3' # 9 end # 10 diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 2ca14f3..e8a3489 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -71,8 +71,6 @@ def node_preinstall_bin_path alias :node_js_preinstalled? :node_preinstall_bin_path end EOM - # search = CodeSearch.new(source) - # search.call code_lines = CleanDocument.new(source: source).call.lines document = BlockDocument.new(code_lines: code_lines).call @@ -80,6 +78,28 @@ def node_preinstall_bin_path node = tree.root + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM') + | + EOM end it "finds random pipe (|) wildly misindented" do @@ -90,66 +110,29 @@ def node_preinstall_bin_path tree = IndentTree.new(document: document).call node = tree.root - expect(node.outer_nodes.valid?).to be_falsey - expect(node.inner_nodes).to be_falsey + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - node = node.outer_nodes - expect(node.parents.length).to eq(14) - expect(node.parents.map(&:valid?)).to eq([true] * 13 + [false]) + expect(node.diagnose).to 
eq(:split_leaning) + node = node.split_leaning - node = node.parents.last - expect(node.outer_nodes.valid?).to be_falsey - expect(node.inner_nodes.valid?).to be_truthy + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - node = node.outer_nodes - expect(node.parents.length).to eq(3) - expect(node.parents.map(&:valid?)).to eq([false, true, false]) - - expect(node.outer_nodes&.valid?).to be_falsey - expect(node.inner_nodes&.valid?).to be_falsey - expect(node.invalid_count).to eq(2) + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - node = node.join_invalid - expect(node.outer_nodes&.valid?).to be_falsey - expect(node.inner_nodes&.valid?).to be_falsey - expect(node.parents.length).to eq(2) - expect(node.parents.map(&:valid?)).to eq([false, false]) + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning - expect(node.split_same_indent.parents.length).to eq(4) - expect(node.split_same_indent.parents.last.to_s).to eq("end\n") - expect(node.split_same_indent.parents.map(&:valid?)).to eq([false, false, false, false]) - node = node.split_same_indent - expect(node.outer_nodes&.valid?).to be_falsey - expect(node.inner_nodes&.valid?).to be_truthy - - # Problem - # - # The outer/inner logic isn't robust. - # - # Above we see two parents that are false - # - # The class line is on the first block - # and the matching end is on the second - # however, we can't join them purely by - # indentation - # - # I had the idea to split a block into it's - # parent blocks, which seems good. 
But i'm not totally sure when we can do this - # also after doing it only - - - puts node.inner_nodes - puts node.outer_nodes.starts_at - - node = node.outer_nodes - expect(node.to_s).to eq(<<~'EOM') - EOM + expect(node.diagnose).to eq(:multiple) - expect(node.outer_nodes&.valid?).to be_falsey - expect(node.inner_nodes&.valid?).to be_falsey - expect(node.parents.length).to eq(2) expect(node.parents.map(&:valid?)).to eq([false, false]) + + pending("multiple") + raise "We don't know what to do with :multiple failures yet" end + it "finds hanging def in this project" do source = fixtures_dir.join("this_project_extra_def.rb.txt").read @@ -158,29 +141,26 @@ def node_preinstall_bin_path tree = IndentTree.new(document: document).call node = tree.root - expect(node.outer_nodes.valid?).to be_truthy - expect(node.inner_nodes.valid?).to be_falsey - node = node.inner_nodes.parents[0] + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(node.outer_nodes.valid?).to be_truthy - expect(node.inner_nodes.valid?).to be_falsey + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning - node = node.inner_nodes.parents[0] - expect(node.inner_nodes).to be_falsey + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(node.outer_nodes.valid?).to be_falsey - node = node.outer_nodes - expect(node.inner_nodes).to be_falsey - expect(node.parents.map(&:valid?)).to eq([true, true, true, false]) - node = node.parents.last - expect(node.inner_nodes).to be_falsey - expect(node.parents.map(&:valid?)).to eq([false, true, true]) - node = node.parents.first - expect(node.inner_nodes).to be_falsey - expect(node.outer_nodes).to be_falsey - expect(node.parents).to be_empty + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) 
expect(node.to_s).to eq(<<~'EOM'.indent(4)) def filename EOM @@ -199,14 +179,16 @@ def bark tree = IndentTree.new(document: document).call node = tree.root - expect(node.outer_nodes.valid?).to be_truthy - expect(node.inner_nodes.valid?).to be_falsey - node = node.inner_nodes.parents[0] + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(node.outer_nodes.valid?).to be_falsey - expect(node.inner_nodes.valid?).to be_truthy + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(node.outer_nodes.to_s).to eq(<<~'EOM'.indent(2)) + expect(node.to_s).to eq(<<~'EOM'.indent(2)) def bark EOM end @@ -229,23 +211,19 @@ def call tree = IndentTree.new(document: document).call node = tree.root - expect(node.outer_nodes.valid?).to be_truthy - expect(node.outer_nodes.to_s).to eq(<<~'EOM') - def call - end # two - EOM - expect(node.inner_nodes.valid?).to be_falsey - expect(node.inner_nodes.to_s).to eq(<<~'EOM'.indent(2)) - print "lol" - end # one - EOM + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - node = node.inner_nodes.parents[0] - expect(node.outer_nodes.valid?).to be_falsey - expect(node.inner_nodes.valid?).to be_truthy - expect(node.outer_nodes.to_s).to eq(<<~'EOM'.indent(2)) - end # one + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + end # one EOM end @@ -281,19 +259,105 @@ def call tree = IndentTree.new(document: document).call node = tree.root + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(node.outer_nodes.valid?).to be_truthy - expect(node.inner_nodes.valid?).to be_falsey + expect(node.diagnose).to eq(:next_invalid) + node = 
node.next_invalid - node = node.inner_nodes.parents[0] - expect(node.inner_nodes.valid?).to be_truthy - expect(node.outer_nodes.valid?).to be_falsey - expect(node.outer_nodes.to_s).to eq(<<~'EOM'.indent(2)) - trydo + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM'.indent(2)) end # one EOM end + it "simpler rexe regression" do + source = <<~'EOM' + module Helpers + def output_formats + @output_formats ||= { + 'a' => :amazing_print, + 'i' => :inspect, + 'j' => :json, + 'J' => :pretty_json, + 'm' => :marshal, + 'n' => :none, + 'p' => :puts, # default + 'P' => :pretty_print, + 's' => :to_s, + 'y' => :yaml, + } + end + + + def formatters + @formatters ||= { + amazing_print: ->(obj) { obj.ai + "\n" }, + inspect: ->(obj) { obj.inspect + "\n" }, + json: ->(obj) { obj.to_json }, + marshal: ->(obj) { Marshal.dump(obj) }, + none: ->(_obj) { nil }, + pretty_json: ->(obj) { JSON.pretty_generate(obj) }, + pretty_print: ->(obj) { obj.pretty_inspect }, + puts: ->(obj) { require 'stringio'; sio = StringIO.new; sio.puts(obj); sio.string }, + to_s: ->(obj) { obj.to_s + "\n" }, + yaml: ->(obj) { obj.to_yaml }, + } + end + + + def format_requires + @format_requires ||= { + json: 'json', + pretty_json: 'json', + amazing_print: 'amazing_print', + pretty_print: 'pp', + yaml: 'yaml' + } + end + + class CommandLineParser + + include Helpers + + attr_reader :lookups, :options + + def initialize + @lookups = Lookups.new + @options = Options.new + end + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to 
eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + def format_requires + EOM + end + it "rexe regression" do lines = fixtures_dir.join("rexe.rb.txt").read.lines lines.delete_at(148 - 1) @@ -304,32 +368,44 @@ def call tree = IndentTree.new(document: document).call node = tree.root - expect(node.outer_nodes.valid?).to be_falsey - expect(node.parents.map(&:valid?)).to eq([true] * 5 + [false]) - node = node.parents.last - expect(node.parents.map(&:valid?)).to eq([false, true, true]) + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning - node = node.parents.first - expect(node.outer_nodes.valid?).to be_truthy - node = node.inner_nodes.parents[0] - expect(node.parents.map(&:valid?)).to eq([true, true, true, true, false]) + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - node = node.parents.last - expect(node.parents.map(&:valid?)).to eq([false, true, true]) + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - node = node.parents.first - expect(node.outer_nodes.valid?).to be_truthy + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - node = node.inner_nodes.parents[0] - expect(node.parents.map(&:valid?)).to eq([true, true, true, true, true, false, true]) - node = node.parents[5] + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(4)) def format_requires EOM end - it "WIP syntax_tree.rb.txt for performance validation" do + it "syntax_tree.rb.txt 
for performance validation" do file = fixtures_dir.join("syntax_tree.rb.txt") lines = file.read.lines lines.delete_at(768 - 1) @@ -343,36 +419,25 @@ def format_requires tree = IndentTree.new(document: document).call end - expect(tree.to_a.length).to eq(1) - expect(tree.root.parents.length).to eq(3) - expect(tree.root.parents[0].to_s).to eq(<<~'EOM') - require 'ripper' - EOM + node = tree.root - expect(tree.root.parents[1].to_s).to eq(<<~'EOM') - require_relative 'syntax_tree/version' - EOM + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - inner = tree.root.parents[2] - expect(inner.outer_nodes.to_s).to eq(<<~'EOM') - class SyntaxTree < Ripper - end - EOM - expect(inner.outer_nodes.valid?).to be_truthy - expect(inner.inner_nodes.valid?).to be_falsey - - inner = inner.inner_nodes + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning - expect(inner.parents[0].parents.length).to eq(31) - expect(inner.parents[0].parents.map(&:valid?)).to eq([true] * 30 + [false]) + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - inner = inner.parents[0].parents.last + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(inner.parents[0].parents.length).to eq(183) - expect(inner.parents[0].parents.map(&:valid?)).to eq([false] + [true] * 182) + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - inner = inner.parents[0].parents.first - expect(inner.to_s).to eq(<<~'EOM'.indent(2)) + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM'.indent(2)) def on_args_add(arguments, argument) EOM end @@ -416,46 +481,18 @@ def initialize(arguments:, block:, location:) document = BlockDocument.new(code_lines: code_lines).call tree = IndentTree.new(document: document).call - blocks = document.to_a - expect(blocks.length).to eq(1) - expect(document.root.leaning).to eq(:left) - expect(document.root.parents[0].to_s).to eq(<<~'EOM') - class Foo - def to_json(*opts) - { type: 
:args, parts: parts, loc: location }.to_json(*opts) - end - end - EOM - expect(document.root.parents[0].leaning).to eq(:equal) - expect(document.root.parents[1].parents[0].to_s).to eq(<<~'EOM') - def on_args_add(arguments, argument) - EOM - expect(document.root.parents[1].parents[0].leaning).to eq(:left) + node = tree.root - expect(document.root.parents[1].parents[1].to_s).to eq(<<~'EOM'.indent(2)) - if arguments.parts.empty? - Args.new(parts: [argument], location: argument.location) - else - Args.new( - parts: arguments.parts << argument, - location: arguments.location.to(argument.location) - ) - end - EOM + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(document.root.parents[1].parents[2].to_s).to eq(<<~'EOM') - class ArgsAddBlock - attr_reader :arguments - attr_reader :block - attr_reader :location - def initialize(arguments:, block:, location:) - @arguments = arguments - @block = block - @location = location - end - end + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM') + def on_args_add(arguments, argument) EOM - expect(document.root.parents[1].parents[1].leaning).to eq(:equal) end it "valid if/else end" do @@ -480,9 +517,12 @@ def on_args_add(arguments, argument) blocks = document.to_a expect(blocks.length).to eq(1) - expect(document.root.leaning).to eq(:equal) - expect(document.root.parents.length).to eq(3) - expect(document.root.parents[0].to_s).to eq(<<~'EOM') + node = document.root + expect(node.leaning).to eq(:equal) + expect(node.parents.length).to eq(3) + expect(node.parents.map(&:valid?)).to eq([false, true , false]) + + expect(node.parents[0].to_s).to eq(<<~'EOM') def on_args_add(arguments, argument) EOM @@ -502,27 +542,20 @@ def on_args_add(arguments, argument) EOM inside = document.root.parents[1] - expect(inside.parents.length).to eq(5) expect(inside.parents[0].to_s).to eq(<<~'EOM'.indent(2)) if arguments.parts.empty? 
EOM - expect(inside.parents[1].to_s).to eq(<<~'EOM'.indent(4)) + expect(inside.parents[1].to_s).to eq(<<~'EOM'.indent(2)) Args.new(parts: [argument], location: argument.location) - EOM - - expect(inside.parents[2].to_s).to eq(<<~'EOM'.indent(2)) else - EOM - - expect(inside.parents[3].to_s).to eq(<<~'EOM'.indent(4)) Args.new( parts: arguments.parts << argument, location: arguments.location.to(argument.location) ) EOM - expect(inside.parents[4].to_s).to eq(<<~'EOM'.indent(2)) + expect(inside.parents[2].to_s).to eq(<<~'EOM'.indent(2)) end EOM end @@ -540,28 +573,15 @@ def foo document = BlockDocument.new(code_lines: code_lines).call tree = IndentTree.new(document: document).call - blocks = document.to_a - expect(blocks.length).to eq(1) - expect(document.root.leaning).to eq(:right) + node = tree.root - expect(document.root.parents.length).to eq(3) - expect(document.root.parents[0].to_s).to eq(<<~'EOM') - Foo.call - EOM - expect(document.root.parents[0].indent).to eq(0) - expect(document.root.parents[1].to_s).to eq(<<~'EOM'.indent(2)) - def foo - print "lol" - print "lol" - end # one - EOM - expect(document.root.parents[1].balanced?).to be_truthy - expect(document.root.parents[1].indent).to eq(2) + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(document.root.parents[2].to_s).to eq(<<~'EOM') + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM') end # two EOM - expect(document.root.parents[2].indent).to eq(0) end it "captures complicated" do @@ -638,28 +658,5 @@ def foo expect(document.to_a.length).to eq(1) expect(document.root.parents.length).to eq(3) end - - it "simple" do - skip - source = <<~'EOM' - print 'lol' - print 'lol' - - Foo.call # missing do - end - EOM - - code_lines = CleanDocument.new(source: source).call.lines - document = BlockDocument.new(code_lines: code_lines).call - search = BlockSearch.new(document: document).call - search.call - - expect(search.document.root).to eq( - BlockNode.new(lines: 
code_lines[0..1], indent: 0).tap { |node| - node.parents << BlockNode.new(lines: code_lines[0], indent: 0) - node.right = BlockNode.new(lines: code_lines[1], indent: 0) - } - ) - end end end From 16a109b347d0dcfb83fcd210c74cd6c056d8a2c7 Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 31 Jan 2022 21:13:51 -0600 Subject: [PATCH 22/58] Fix BlockNode.from_parents above/below --- lib/dead_end/block_node.rb | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index f52ec79..eafc1a0 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -14,12 +14,16 @@ def self.from_blocks(parents) block.delete end - BlockNode.new( + node = BlockNode.new( lines: lines, lex_diff: lex_diff, indent: indent, parents:parents ) + + node.above = parents.first.above + node.below = parents.last.below + node end attr_accessor :above, :below, :left, :right, :parents From 72cd93ee1e79b13b164ca87c7daa77e485c4de72 Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 31 Jan 2022 21:14:54 -0600 Subject: [PATCH 23/58] Fix BlockNode.from_parents that have 1 parent When making block nodes from an array, if the array only contains a single node, then don't duplicate it, use its parents instead. This fixes having to diagnose essentially the same block twice which was happening due to generating blocks. --- lib/dead_end/block_node.rb | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index eafc1a0..6ce0975 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -5,6 +5,7 @@ class BlockNode def self.from_blocks(parents) lines = [] + parents = parents.first.parents if parents.length == 1 && parents.first.parents.any? 
indent = parents.first.indent lex_diff = LexPairDiff.new_empty parents.each do |block| From c5dbd274cb10ce4c569e87b5779a37dc3368fae0 Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 31 Jan 2022 21:28:26 -0600 Subject: [PATCH 24/58] Handle multiple case Works surprisingly well. The only thing we're not covering (that I know of) is the case where an issue exists in more than one block. We need to figure out how to detect and deal with this case. --- lib/dead_end/block_node.rb | 20 +++++- spec/unit/indent_tree_spec.rb | 126 +++++++++++++++++++++++++--------- 2 files changed, 113 insertions(+), 33 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 6ce0975..3c3d7e4 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -93,13 +93,31 @@ def diagnose :multiple end + # Multiple could be: + # + # - valid rescue/else + # - leaves inside of an array/hash + # - An actual fork indicating multiple syntax errors + def handle_multiple + invalid = parents.select(&:invalid?) + # valid rescue/else + if above && above.leaning == :left && below && below.leaning == :right + before_length = invalid.length + invalid.reject! {|block| + b = BlockNode.from_blocks([above, block, below]) + b.leaning == :equal && b.valid? + } + return BlockNode.from_blocks(invalid) if invalid.length != before_length + end + end + def split_leaning block = left_right_parents invalid = parents.select(&:invalid?) invalid.reject!
{|x| block.parents.include?(x) } - @inner_leang ||= BlockNode.from_blocks(invalid) + @inner_leaning ||= BlockNode.from_blocks(invalid) end def left_right_parents diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index e8a3489..b5998e5 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -84,21 +84,93 @@ def node_preinstall_bin_path expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM') + | + EOM + end + + it "doesn't scapegoat rescue" do + source = <<~'EOM' + def compile + instrument 'ruby.compile' do + # check for new app at the beginning of the compile + new_app? + Dir.chdir(build_path) + remove_vendor_bundle + warn_bundler_upgrade + warn_bad_binstubs + install_ruby(slug_vendor_ruby, build_ruby_path) + setup_language_pack_environment( + ruby_layer_path: File.expand_path("."), + gem_layer_path: File.expand_path("."), + bundle_path: "vendor/bundle", } + bundle_default_without: "development:test" + ) + allow_git do + install_bundler_in_app(slug_vendor_base) + load_bundler_cache + build_bundler + post_bundler + create_database_yml + install_binaries + run_assets_precompile_rake_task + end + config_detect + best_practice_warnings + warn_outdated_ruby + setup_profiled(ruby_layer_path: "$HOME", gem_layer_path: "$HOME") # $HOME is set to /app at run time + setup_export + cleanup + super + end + rescue => e + warn_outdated_ruby + raise e + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + + node = tree.root + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:multiple) + node = 
node.handle_multiple + expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:multiple) + expect(node.parents.length).to eq(4) + + node = node.handle_multiple + + expect(node.parents.length).to eq(1) expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid + node = node.next_invalid expect(node.diagnose).to eq(:self) - expect(node.to_s).to eq(<<~'EOM') - | + + expect(node.to_s).to eq(<<~'EOM'.indent(6)) + bundle_path: "vendor/bundle", } EOM end @@ -119,18 +191,31 @@ def node_preinstall_bin_path expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:multiple) + node = node.handle_multiple + + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning expect(node.diagnose).to eq(:multiple) + node = node.handle_multiple - expect(node.parents.map(&:valid?)).to eq([false, false]) + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning - pending("multiple") - raise "We don't know what to do with :multiple failures yet" + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM') + | + EOM end it "finds hanging def in this project" do @@ -145,8 +230,6 @@ def node_preinstall_bin_path expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning @@ -157,9 +240,6 @@ def node_preinstall_bin_path expect(node.diagnose).to eq(:next_invalid) node = 
node.next_invalid - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid - expect(node.diagnose).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(4)) def filename @@ -185,8 +265,7 @@ def bark expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid + expect(node.diagnose).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) def bark @@ -218,9 +297,6 @@ def call expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid - expect(node.diagnose).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) end # one @@ -265,9 +341,6 @@ def call expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid - expect(node.diagnose).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) end # one @@ -349,9 +422,6 @@ def initialize expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid - expect(node.diagnose).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) def format_requires @@ -384,8 +454,6 @@ def format_requires expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning @@ -396,9 +464,6 @@ def format_requires expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid - expect(node.diagnose).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(4)) def format_requires @@ -433,9 +498,6 @@ def format_requires expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid - expect(node.diagnose).to eq(:self) 
expect(node.to_s).to eq(<<~'EOM'.indent(2)) def on_args_add(arguments, argument) From db85b5b71accad046d58e6dafc737ac8b2a60f8e Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 1 Feb 2022 20:58:17 -0600 Subject: [PATCH 25/58] First indent tree search --- lib/dead_end/indent_tree.rb | 107 ++++++++++++++++++++++++-------- spec/unit/indent_search_spec.rb | 68 ++++++++++++++++++++ spec/unit/indent_tree_spec.rb | 66 +++++++++++++++----- 3 files changed, 200 insertions(+), 41 deletions(-) create mode 100644 spec/unit/indent_search_spec.rb diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 8abb50b..151639a 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -33,41 +33,94 @@ def capture(block, name: ) end class IndentSearch + attr_reader :finished + def initialize(tree: ) @tree = tree - @invalid_blocks = [] - end - - - # Keep track of trail of how we got here, Introduce Trail class - # Each main block gets a trail with one or more paths - # - # Problem: We can follow valid/invalid for awhile but - # at the edges single lines of valid code look invalid - # - # Solution maybe: Hold a set of code that is invalid with - # a sub block, and valid without it. Goal: Make this block - # as small as possible to reduce parsing time - # - # Problem: when to stop looking? The old "when to stop looking" - # started from not capturing the syntax error and re-checking the - # whole document when a syntax error was found. - # - # We are reversing the idea on it's head by starting with a known - # invalid state, we know if we removed the given block the whole - # document would be valid, however we want to find the smallest - # block where this holds true - # - # Goal: Find the smallest block where it's removal will make a fork - # of the path valid again. 
- - # Solution: Popstars never stop stopping + @finished = [] + @frontier = [Journey.new(@tree.root)] + end + def call + while journey = @frontier.pop + node = journey.node + case node.diagnose + when :self + @finished << journey + next + when :next_invalid + block = node.next_invalid + when :split_leaning + block = node.split_leaning + when :multiple + block = node.handle_multiple + else + raise "DeadEnd internal error: Unknown diagnosis #{node.diagnose}" + end + + # When true, we made a good move + # otherwise, go back to last known reasonable guess + if journey.holds_all_errors?(block) + journey << Step.new(block) + @frontier << journey + else + @finished << journey + next + end + end self end end + + # Each journey represents a walk of the graph to eliminate + # invalid code + # + # We can check a step's validity by asserting that its removal produces + # valid code from its parent + class Journey + def initialize(root) + @root = root + @steps = [Step.new(root)] + end + + # In isolation a block may appear valid when it isn't or invalid when it is. + # By checking against several levels of the tree, we can have higher + # confidence that our values are correct + def holds_all_errors?(blocks) + @steps.first.valid_without?(blocks) + end + + def <<(step) + @steps << step + end + + def node + @steps.last.block + end + end + + class Step + attr_reader :block + + def initialize(block) + @block = block + end + + def valid_without?(blocks) + without_lines = Array(blocks).flat_map do |block| + block.lines + end + + out = DeadEnd.valid_without?( + without_lines: without_lines, + code_lines: @block.lines + ) + out + end + end + class IndentTree attr_reader :document diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb new file mode 100644 index 0000000..3ee0797 --- /dev/null +++ b/spec/unit/indent_search_spec.rb @@ -0,0 +1,68 @@ +# frozen_string_literal: true + +require_relative "../spec_helper" + +module DeadEnd + RSpec.describe IndentSearch
do + it "finds random pipe (|) wildly misindented" do + source = fixtures_dir.join("ruby_buildpack.rb.txt").read + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM') + | + EOM + end + + it "syntax tree search" do + file = fixtures_dir.join("syntax_tree.rb.txt") + lines = file.read.lines + lines.delete_at(768 - 1) + source = lines.join + + tree = nil + document = nil + debug_perf do + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(2)) + def on_args_add(arguments, argument) + EOM + end + end + + it "finds missing comma in array" do + source = <<~'EOM' + def animals + [ + cat, + dog + horse + ] + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(4)) + cat, + dog + horse + EOM + end + end +end diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index b5998e5..f3ff3dd 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -96,6 +96,44 @@ def node_preinstall_bin_path EOM end + it "finds missing comma in array" do + source = <<~'EOM' + def animals + [ + cat, + dog + horse + ] + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = 
BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + + node = tree.root + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + node.parents.each do |block| + puts "==" + puts block + puts block.valid? + end + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM') + EOM + end + it "doesn't scapegoat rescue" do source = <<~'EOM' def compile @@ -482,26 +520,26 @@ def format_requires code_lines = CleanDocument.new(source: source).call.lines document = BlockDocument.new(code_lines: code_lines).call tree = IndentTree.new(document: document).call - end - node = tree.root + node = tree.root - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(node.diagnose).to eq(:split_leaning) - node = node.split_leaning + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) - node = node.next_invalid + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid - expect(node.diagnose).to eq(:self) - expect(node.to_s).to eq(<<~'EOM'.indent(2)) - def on_args_add(arguments, argument) - EOM + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + def on_args_add(arguments, argument) + EOM + end end it "invalid if/else end with surrounding code" do From 1a346978dd64a074dc156e8356c080a2a5370eb9 Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 2 Feb 2022 20:40:45 -0600 Subject: [PATCH 26/58] Standardrb --fix --- lib/dead_end/block_document.rb | 9 ++++----- lib/dead_end/block_node.rb | 21 
++++++++++----------- lib/dead_end/indent_tree.rb | 24 +++++++++++------------- lib/dead_end/lex_pair_diff.rb | 2 +- lib/dead_end/priority_queue.rb | 1 - spec/unit/code_search_spec.rb | 1 - spec/unit/indent_search_spec.rb | 4 ---- spec/unit/indent_tree_spec.rb | 21 +++------------------ spec/unit/priority_queue_spec.rb | 6 +++--- 9 files changed, 32 insertions(+), 57 deletions(-) diff --git a/lib/dead_end/block_document.rb b/lib/dead_end/block_document.rb index 1dfff49..721c9df 100644 --- a/lib/dead_end/block_document.rb +++ b/lib/dead_end/block_document.rb @@ -6,9 +6,8 @@ class BlockDocument include Enumerable - def initialize(code_lines: ) + def initialize(code_lines:) @code_lines = code_lines - blocks = nil @queue = InsertionSortQueue.new @root = nil end @@ -26,7 +25,7 @@ def each end def to_s - string = String.new + string = +"" each do |block| string << block.to_s end @@ -78,11 +77,11 @@ def capture_all(inner) now end - def capture(node: , captured: ) + def capture(node:, captured:) inner = [] inner.concat(Array(captured)) inner << node - inner.sort_by! {|block| block.start_index } + inner.sort_by! { |block| block.start_index } capture_all(inner) end diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 3c3d7e4..f2b33ff 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -2,7 +2,6 @@ module DeadEnd class BlockNode - def self.from_blocks(parents) lines = [] parents = parents.first.parents if parents.length == 1 && parents.first.parents.any? 
@@ -19,7 +18,7 @@ def self.from_blocks(parents) lines: lines, lex_diff: lex_diff, indent: indent, - parents:parents + parents: parents ) node.above = parents.first.above @@ -30,7 +29,7 @@ def self.from_blocks(parents) attr_accessor :above, :below, :left, :right, :parents attr_reader :lines, :start_index, :end_index, :lex_diff, :indent, :starts_at, :ends_at - def initialize(lines: , indent: , next_indent: nil, lex_diff: nil, parents: []) + def initialize(lines:, indent:, next_indent: nil, lex_diff: nil, parents: []) lines = Array(lines) @indent = indent @next_indent = next_indent @@ -52,9 +51,9 @@ def initialize(lines: , indent: , next_indent: nil, lex_diff: nil, parents: []) @deleted = false end - def expand_above?(with_indent: self.indent) + def expand_above?(with_indent: indent) return false if above.nil? - return false if leaf? && self.leaning == :left + return false if leaf? && leaning == :left if above.leaning == :left above.indent >= with_indent @@ -63,9 +62,9 @@ def expand_above?(with_indent: self.indent) end end - def expand_below?(with_indent: self.indent) + def expand_below?(with_indent: indent) return false if below.nil? - return false if leaf? && self.leaning == :right + return false if leaf? && leaning == :right if below.leaning == :right below.indent >= with_indent @@ -103,7 +102,7 @@ def handle_multiple # valid rescue/else if above && above.leaning == :left && below && below.leaning == :right before_length = invalid.length - invalid.reject! {|block| + invalid.reject! { |block| b = BlockNode.from_blocks([above, block, below]) b.leaning == :equal && b.valid? } @@ -115,7 +114,7 @@ def split_leaning block = left_right_parents invalid = parents.select(&:invalid?) - invalid.reject! {|x| block.parents.include?(x) } + invalid.reject! { |x| block.parents.include?(x) } @inner_leaning ||= BlockNode.from_blocks(invalid) end @@ -124,11 +123,11 @@ def left_right_parents invalid = parents.select(&:invalid?) 
return false if invalid.length < 3 - left = invalid.detect {|block| block.leaning == :left } + left = invalid.detect { |block| block.leaning == :left } return false if left.nil? - right = invalid.reverse_each.detect {|block| block.leaning == :right } + right = invalid.reverse_each.detect { |block| block.leaning == :right } return false if right.nil? @left_right_parents ||= BlockNode.from_blocks([left, right]) diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 151639a..1c0636a 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -2,14 +2,14 @@ module DeadEnd class Recorder - def initialize(dir: , code_lines: ) + def initialize(dir:, code_lines:) @code_lines = code_lines @dir = Pathname(dir) @tick = 0 - @name_tick = Hash.new {|h, k| h[k] = 0} + @name_tick = Hash.new { |h, k| h[k] = 0 } end - def capture(block, name: ) + def capture(block, name:) @tick += 1 filename = "#{@tick}-#{name}-#{@name_tick[name] += 1}-(#{block.starts_at}__#{block.ends_at}).txt" @@ -22,27 +22,27 @@ def capture(block, name: ) f.write(" Block lines: #{(block.starts_at + 1)..(block.ends_at + 1)} (#{name})\n") f.write(" indent: #{block.indent} next_indent: #{block.next_indent}\n\n") - f.write("#{document}") + f.write(document.to_s) end end end class NullRecorder - def capture(block, name: ) + def capture(block, name:) end end class IndentSearch attr_reader :finished - def initialize(tree: ) + def initialize(tree:) @tree = tree @finished = [] @frontier = [Journey.new(@tree.root)] end def call - while journey = @frontier.pop + while (journey = @frontier.pop) node = journey.node case node.diagnose when :self @@ -73,7 +73,6 @@ def call end end - # Each journey represents a walk of the graph to eliminate # invalid code # @@ -113,18 +112,17 @@ def valid_without?(blocks) block.lines end - out = DeadEnd.valid_without?( - without_lines: without_lines, + DeadEnd.valid_without?( + without_lines: without_lines, code_lines: @block.lines ) - out end end class 
IndentTree attr_reader :document - def initialize(document: , recorder: DEFAULT_VALUE) + def initialize(document:, recorder: DEFAULT_VALUE) @document = document @last_length = Float::INFINITY @@ -157,7 +155,7 @@ def call end private def reduce - while block = document.pop + while (block = document.pop) original = block blocks = [block] diff --git a/lib/dead_end/lex_pair_diff.rb b/lib/dead_end/lex_pair_diff.rb index a3ff2a1..e602579 100644 --- a/lib/dead_end/lex_pair_diff.rb +++ b/lib/dead_end/lex_pair_diff.rb @@ -41,7 +41,7 @@ def self.from_lex(lex:, is_kw:, is_end:) end def self.new_empty - self.new(curly: 0, square: 0, parens: 0, kw_end: 0) + new(curly: 0, square: 0, parens: 0, kw_end: 0) end attr_reader :curly, :square, :parens, :kw_end diff --git a/lib/dead_end/priority_queue.rb b/lib/dead_end/priority_queue.rb index 750c5ca..2962a0f 100644 --- a/lib/dead_end/priority_queue.rb +++ b/lib/dead_end/priority_queue.rb @@ -37,7 +37,6 @@ def <<(value) end end || @array.length - @array.insert(index, value) end diff --git a/spec/unit/code_search_spec.rb b/spec/unit/code_search_spec.rb index 76f08b7..8f3ca19 100644 --- a/spec/unit/code_search_spec.rb +++ b/spec/unit/code_search_spec.rb @@ -501,6 +501,5 @@ def foo end EOM end - end end diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index 3ee0797..29a7987 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -7,10 +7,6 @@ module DeadEnd it "finds random pipe (|) wildly misindented" do source = fixtures_dir.join("ruby_buildpack.rb.txt").read - code_lines = CleanDocument.new(source: source).call.lines - document = BlockDocument.new(code_lines: code_lines).call - tree = IndentTree.new(document: document).call - code_lines = CleanDocument.new(source: source).call.lines document = BlockDocument.new(code_lines: code_lines).call tree = IndentTree.new(document: document).call diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 
f3ff3dd..9f8a499 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,8 +4,8 @@ module DeadEnd RSpec.describe IndentTree do - it "(smaller) finds random pipe (|) wildly misindented" do - source = <<~'EOM' + it "(smaller) finds random pipe (|) wildly misindented" do + source = <<~'EOM' class LanguagePack::Ruby < LanguagePack::Base def allow_git(&blk) git_dir = ENV.delete("GIT_DIR") # can mess with bundler @@ -111,21 +111,12 @@ def animals document = BlockDocument.new(code_lines: code_lines).call tree = IndentTree.new(document: document).call - node = tree.root - node = tree.root expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning - - node.parents.each do |block| - puts "==" - puts block - puts block.valid? - end - expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid @@ -178,8 +169,6 @@ def compile document = BlockDocument.new(code_lines: code_lines).call tree = IndentTree.new(document: document).call - node = tree.root - node = tree.root expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning @@ -268,7 +257,6 @@ def compile expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning - expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning @@ -284,7 +272,6 @@ def filename EOM end - it "regression dog test" do source = <<~'EOM' class Dog @@ -492,7 +479,6 @@ def format_requires expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid - expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning @@ -613,14 +599,13 @@ def on_args_add(arguments, argument) code_lines = CleanDocument.new(source: source).call.lines document = BlockDocument.new(code_lines: code_lines).call - tree = IndentTree.new(document: document).call blocks = document.to_a expect(blocks.length).to eq(1) node = document.root expect(node.leaning).to eq(:equal) expect(node.parents.length).to eq(3) - 
expect(node.parents.map(&:valid?)).to eq([false, true , false]) + expect(node.parents.map(&:valid?)).to eq([false, true, false]) expect(node.parents[0].to_s).to eq(<<~'EOM') def on_args_add(arguments, argument) diff --git a/spec/unit/priority_queue_spec.rb b/spec/unit/priority_queue_spec.rb index 1ff17f3..ce07d80 100644 --- a/spec/unit/priority_queue_spec.rb +++ b/spec/unit/priority_queue_spec.rb @@ -22,7 +22,7 @@ def inspect RSpec.describe CodeFrontier do it "benchmark/ips" do skip unless ENV["DEBUG_PERF"] - require 'benchmark/ips' + require "benchmark/ips" values = 5000.times.map { rand(0..100) }.freeze @@ -32,7 +32,7 @@ def inspect values.each do |v| q << v end - while q.pop() do + while q.pop end } @@ -41,7 +41,7 @@ def inspect values.each do |v| q << v end - while q.pop() do + while q.pop end } x.compare! From 6185b90f5f93cf61634fdf5f2964cd20b3e262e2 Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 2 Feb 2022 20:49:30 -0600 Subject: [PATCH 27/58] More search cases Still need: Actual multi failure test --- lib/dead_end/block_node.rb | 2 +- spec/unit/indent_search_spec.rb | 296 ++++++++++++++++++++++++++++++++ spec/unit/indent_tree_spec.rb | 107 +++++------- 3 files changed, 339 insertions(+), 66 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index f2b33ff..4637cdf 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -106,7 +106,7 @@ def handle_multiple b = BlockNode.from_blocks([above, block, below]) b.leaning == :equal && b.valid? } - return BlockNode.from_blocks(invalid) if invalid.length != before_length + return BlockNode.from_blocks(invalid) if invalid.any? 
&& invalid.length != before_length end end diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index 29a7987..fe38050 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -4,6 +4,302 @@ module DeadEnd RSpec.describe IndentSearch do + it "invalid if and else" do + source = <<~'EOM' + if true + puts ( + else + puts } + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(2)) + puts ( + puts } + EOM + end + + it "handles heredocs" do + lines = fixtures_dir.join("rexe.rb.txt").read.lines + lines.delete_at(85 - 1) + source = lines.join + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(4)) + def input_modes + EOM + end + + it "handles derailed output issues/50" do + source = fixtures_dir.join("derailed_require_tree.rb.txt").read + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(4)) + def initialize(name) + EOM + end + + it "handles multi-line-methods issues/64" do + source = fixtures_dir.join("webmock.rb.txt").read + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(6)) + port: port + body: body + 
EOM + end + + it "returns good results on routes.rb" do + source = fixtures_dir.join("routes.rb.txt").read + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(2)) + namespace :admin do + EOM + end + + it "doesn't scapegoat rescue" do + source = <<~'EOM' + def compile + instrument 'ruby.compile' do + # check for new app at the beginning of the compile + new_app? + Dir.chdir(build_path) + remove_vendor_bundle + warn_bundler_upgrade + warn_bad_binstubs + install_ruby(slug_vendor_ruby, build_ruby_path) + setup_language_pack_environment( + ruby_layer_path: File.expand_path("."), + gem_layer_path: File.expand_path("."), + bundle_path: "vendor/bundle", } + bundle_default_without: "development:test" + ) + allow_git do + install_bundler_in_app(slug_vendor_base) + load_bundler_cache + build_bundler + post_bundler + create_database_yml + install_binaries + run_assets_precompile_rake_task + end + config_detect + best_practice_warnings + warn_outdated_ruby + setup_profiled(ruby_layer_path: "$HOME", gem_layer_path: "$HOME") # $HOME is set to /app at run time + setup_export + cleanup + super + end + rescue => e + warn_outdated_ruby + raise e + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(6)) + bundle_path: "vendor/bundle", } + EOM + end + + it "finds hanging def in this project" do + source = fixtures_dir.join("this_project_extra_def.rb.txt").read + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: 
document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(4)) + def filename + EOM + end + + it "regression dog test" do + source = <<~'EOM' + class Dog + def bark + puts "woof" + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(2)) + def bark + EOM + end + + it "regression test ambiguous end" do + # Even though you would think the first step is to + # expand the "print" line, we base priority off of + # "next_indent" so the actual highest "next indent" line + # comes from "end # one" which captures "print", then it + # expands out from there + source = <<~'EOM' + def call + print "lol" + end # one + end # two + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(2)) + end # one + EOM + end + + it "squished do regression" do + source = <<~'EOM' + def call + trydo + + @options = CommandLineParser.new.parse + + options.requires.each { |r| require!(r) } + load_global_config_if_exists + options.loads.each { |file| load(file) } + + @user_source_code = ARGV.join(' ') + @user_source_code = 'self' if @user_source_code == '' + + @callable = create_callable + + init_rexe_context + init_parser_and_formatters + + # This is where the user's source code will be executed; the action will in turn call `execute`. 
+ lookup_action(options.input_mode).call unless options.noop + + output_log_entry + end # one + end # two + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(2)) + end # one + EOM + end + + it "rexe regression" do + lines = fixtures_dir.join("rexe.rb.txt").read.lines + lines.delete_at(148 - 1) + source = lines.join + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(4)) + def format_requires + EOM + end + + it "invalid if/else end with surrounding code" do + source = <<~'EOM' + class Foo + def to_json(*opts) + { type: :args, parts: parts, loc: location }.to_json(*opts) + end + end + + def on_args_add(arguments, argument) + if arguments.parts.empty? 
+ Args.new(parts: [argument], location: argument.location) + else + + Args.new( + parts: arguments.parts << argument, + location: arguments.location.to(argument.location) + ) + end + # Missing end here, comments are erased via CleanDocument + + class ArgsAddBlock + attr_reader :arguments + + attr_reader :block + + attr_reader :location + + def initialize(arguments:, block:, location:) + @arguments = arguments + @block = block + @location = location + end + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM') + def on_args_add(arguments, argument) + EOM + end + + it "extra space before end" do + source = <<~'EOM' + Foo.call + def foo + print "lol" + print "lol" + end # one + end # two + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM') + end # two + EOM + end + it "finds random pipe (|) wildly misindented" do source = fixtures_dir.join("ruby_buildpack.rb.txt").read diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 9f8a499..82fb082 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,43 @@ module DeadEnd RSpec.describe IndentTree do + it "invalid if and else" do + source = <<~'EOM' + if true + puts ( + else + puts } + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + expect(node.to_s).to eq(<<~'EOM') + puts ( + else + puts 
} + EOM + + expect(node.diagnose).to eq(:multiple) + node = node.handle_multiple + + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + puts ( + puts } + EOM + + expect(node.diagnose).to eq(:multiple) + node = node.handle_multiple + + expect(node.to_s).to eq(<<~'EOM') + EOM + end + it "(smaller) finds random pipe (|) wildly misindented" do source = <<~'EOM' class LanguagePack::Ruby < LanguagePack::Base @@ -121,7 +158,11 @@ def animals node = node.next_invalid expect(node.diagnose).to eq(:self) - expect(node.to_s).to eq(<<~'EOM') + # Note that this is a bad pick, it's actual a + # valid line, the search algorithm has to account + # for this + expect(node.to_s).to eq(<<~'EOM'.indent(4)) + cat, EOM end @@ -581,70 +622,6 @@ def on_args_add(arguments, argument) EOM end - it "valid if/else end" do - source = <<~'EOM' - def on_args_add(arguments, argument) - if arguments.parts.empty? - - Args.new(parts: [argument], location: argument.location) - else - - Args.new( - parts: arguments.parts << argument, - location: arguments.location.to(argument.location) - ) - end - end - EOM - - code_lines = CleanDocument.new(source: source).call.lines - document = BlockDocument.new(code_lines: code_lines).call - - blocks = document.to_a - expect(blocks.length).to eq(1) - node = document.root - expect(node.leaning).to eq(:equal) - expect(node.parents.length).to eq(3) - expect(node.parents.map(&:valid?)).to eq([false, true, false]) - - expect(node.parents[0].to_s).to eq(<<~'EOM') - def on_args_add(arguments, argument) - EOM - - expect(document.root.parents[1].to_s).to eq(<<~'EOM'.indent(2)) - if arguments.parts.empty? - Args.new(parts: [argument], location: argument.location) - else - Args.new( - parts: arguments.parts << argument, - location: arguments.location.to(argument.location) - ) - end - EOM - - expect(document.root.parents[2].to_s).to eq(<<~'EOM') - end - EOM - - inside = document.root.parents[1] - expect(inside.parents[0].to_s).to eq(<<~'EOM'.indent(2)) - if arguments.parts.empty? 
- EOM - - expect(inside.parents[1].to_s).to eq(<<~'EOM'.indent(2)) - Args.new(parts: [argument], location: argument.location) - else - Args.new( - parts: arguments.parts << argument, - location: arguments.location.to(argument.location) - ) - EOM - - expect(inside.parents[2].to_s).to eq(<<~'EOM'.indent(2)) - end - EOM - end - it "extra space before end" do source = <<~'EOM' Foo.call From 036373bc7a7e3b2d1061f7b34a011be12e248adc Mon Sep 17 00:00:00 2001 From: schneems Date: Fri, 4 Feb 2022 20:30:39 -0600 Subject: [PATCH 28/58] Follow multiple errors with IndentSearch When a node has two diverging branches that each contain a syntax error we need to follow both branches where the search forks. This commit adds the ability to fork the search. --- lib/dead_end/block_node.rb | 50 ++++++++++++--- lib/dead_end/indent_tree.rb | 52 ++++++++++++++- spec/unit/indent_search_spec.rb | 7 +- spec/unit/indent_tree_spec.rb | 110 +++++++++++++++++++++++++++++++- 4 files changed, 202 insertions(+), 17 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 4637cdf..5e0bd58 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -89,7 +89,15 @@ def diagnose return :split_leaning if split_leaning? - :multiple + return :multiple if reduce_multiple? + + :fork_invalid + end + + def fork_invalid + parents.select(&:invalid?).map do |block| + BlockNode.from_blocks([block]) + end end # Muliple could be: @@ -97,18 +105,40 @@ def diagnose # - valid rescue/else # - leaves inside of an array/hash # - An actual fork indicating multiple syntax errors + # + # This method handles the first two cases def handle_multiple - invalid = parents.select(&:invalid?) - # valid rescue/else - if above && above.leaning == :left && below && below.leaning == :right - before_length = invalid.length - invalid.reject! { |block| - b = BlockNode.from_blocks([above, block, below]) - b.leaning == :equal && b.valid? 
- } - return BlockNode.from_blocks(invalid) if invalid.any? && invalid.length != before_length + if reduced_multiple_invalid_array.any? + @reduce_multiple ||= BlockNode.from_blocks(reduced_multiple_invalid_array) end end + alias :reduce_multiple :handle_multiple + + def reduced_multiple_invalid_array + @reduced_multiple_invalid_array ||= begin + invalid = parents.select(&:invalid?) + # valid rescue/else + if above && above.leaning == :left && below && below.leaning == :right + before_length = invalid.length + invalid.reject! { |block| + b = BlockNode.from_blocks([above, block, below]) + b.leaning == :equal && b.valid? + } + if invalid.any? && invalid.length != before_length + invalid + else + [] + end + # return BlockNode.from_blocks(invalid) if invalid.any? && invalid.length != before_length + else + [] + end + end + end + + def reduce_multiple? + reduced_multiple_invalid_array.any? + end def split_leaning block = left_right_parents diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 1c0636a..d7abced 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -37,6 +37,7 @@ class IndentSearch def initialize(tree:) @tree = tree + @root = tree.root @finished = [] @frontier = [Journey.new(@tree.root)] end @@ -47,6 +48,19 @@ def call case node.diagnose when :self @finished << journey + next + when :fork_invalid + forks = node.fork_invalid + if holds_all_errors?(forks) + forks.each do |block| + route = journey.deep_dup + route << Step.new(block) + @frontier.unshift(route) + end + else + @finished << journey + end + next when :next_invalid block = node.next_invalid @@ -60,17 +74,33 @@ def call # When true, we made a good move # otherwise, go back to last known reasonable guess - if journey.holds_all_errors?(block) + if holds_all_errors?(block) journey << Step.new(block) - @frontier << journey + @frontier.unshift(journey) else @finished << journey + @finished.sort_by! 
{|j| j.node.starts_at } next end end self end + + def holds_all_errors?(blocks) + blocks = Array(blocks).clone + blocks.concat(@finished.map(&:node)) + blocks.concat(@frontier.map(&:node)) + + without_lines = blocks.flat_map do |block| + block.lines + end + + DeadEnd.valid_without?( + without_lines: without_lines, + code_lines: @root.lines + ) + end end # Each journey represents a walk of the graph to eliminate @@ -79,11 +109,25 @@ def call # We can check the a step's validity by asserting that it's removal produces # valid code from it's parent class Journey + attr_reader :steps + def initialize(root) @root = root @steps = [Step.new(root)] end + def deep_dup + j = Journey.new(@root) + steps.each do |step| + j << step + end + j + end + + def to_s + node.to_s + end + # In isolation a block may appear valid when it isn't or invalid when it is # by checking against several levels of the tree, we can have higher # confidence that our values are correct @@ -107,6 +151,10 @@ def initialize(block) @block = block end + def to_s + block.to_s + end + def valid_without?(blocks) without_lines = Array(blocks).flat_map do |block| block.lines diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index fe38050..beb727d 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -18,8 +18,12 @@ module DeadEnd tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call - expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(2)) + expect(search.finished.length).to eq(2) + expect(search.finished.first.to_s).to eq(<<~'EOM'.indent(2)) puts ( + EOM + + expect(search.finished.last.to_s).to eq(<<~'EOM'.indent(2)) puts } EOM end @@ -62,7 +66,6 @@ def initialize(name) expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(6)) port: port - body: body EOM end diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 82fb082..226402c 100644 --- a/spec/unit/indent_tree_spec.rb 
+++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,104 @@ module DeadEnd RSpec.describe IndentTree do + # If you put an indented "print" in there then + # the problem goes away, I think it's fine to not handle + # this (hopefully rare) case. If we showed you there was a problem + # on this line, deleting it would actually fix the problem + # even if the resultant code would be misindented + # + # We could also handle it in post though if we want to + it "ambiguous end, only a problem if nothing internal" do + source = <<~'EOM' + class Cow + end # one + end # two + EOM + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM') + end # two + EOM + end + + + it "ambiguous kw" do + source = <<~'EOM' + class Cow + def speak + end + EOM + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + expect(node.parents.length).to eq(2) + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM') + class Cow + EOM + end + + it "fork invalid" do + source = <<~'EOM' + class Cow + def speak + print "moo" + end + + class Buffalo + print "buffalo" + end # buffalo one + end + EOM + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + + node = tree.root + # expect(node.parents.length).to eq(2) + expect(node.diagnose).to eq(:fork_invalid) + forks = node.fork_invalid + + node = forks.first + + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + 
expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + def speak + EOM + + node = forks.last + + expect(node.diagnose).to eq(:split_leaning) + node = node.split_leaning + + expect(node.diagnose).to eq(:next_invalid) + node = node.next_invalid + + expect(node.diagnose).to eq(:self) + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + end # buffalo one + EOM + end + it "invalid if and else" do source = <<~'EOM' if true @@ -34,10 +132,16 @@ module DeadEnd puts } EOM - expect(node.diagnose).to eq(:multiple) - node = node.handle_multiple + expect(node.diagnose).to eq(:fork_invalid) + forks = node.fork_invalid - expect(node.to_s).to eq(<<~'EOM') + expect(forks.length).to eq(2) + expect(forks.first.to_s).to eq(<<~'EOM'.indent(2)) + puts ( + EOM + + expect(forks.last.to_s).to eq(<<~'EOM'.indent(2)) + puts } EOM end From f9770543b7a76bc58533531c29f080769e298ad1 Mon Sep 17 00:00:00 2001 From: schneems Date: Sun, 6 Feb 2022 13:24:55 -0600 Subject: [PATCH 29/58] Implement tracing on new search Previously I implemented recording/tracing on building the tree, we also need to introspect how the tree is searched. This commit adds that tracing and DRYs up the code a bit. I don't love that there's this DEFAULT_VALUE that needs to be understood by so many abstraction layers. However if we want a false value like `nil` to represent "default state" then it seems to be needed. We also want to be as lazy as possible about tracing because each individual level/layer might need to be invoked with the debug env vars. This is good enough for now. The only other major downside is that the performance of this is quite expensive. For a search that takes half a second with tracing off it takes 1 min 47 seconds with it on. 
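
A minimal sketch of the sentinel-default idea described above. The names `NOT_GIVEN`, `recorder_for`, and `MemoryRecorder` are hypothetical and are not the code in this patch. It only illustrates why a dedicated default object is needed when `nil` already means "tracing off", and how the environment lookup can be deferred until a recorder is actually requested:

```ruby
# Sketch only: NOT_GIVEN, recorder_for and MemoryRecorder are hypothetical
# names used for illustration, not part of dead_end.
NOT_GIVEN = Object.new.freeze

# Does nothing, so callers can always call `capture` without a nil check.
class NullRecorder
  def capture(block, name:)
  end
end

# Stand-in for a recorder that would write each step under `dir`.
# It keeps captures in memory so the sketch has no filesystem side effects.
class MemoryRecorder
  attr_reader :dir, :captured

  def initialize(dir:)
    @dir = dir
    @captured = []
  end

  def capture(block, name:)
    @captured << [name, block.to_s]
  end
end

# `nil` means "tracing off"; NOT_GIVEN means "decide from the environment".
# The environment is only consulted when the caller passed nothing.
def recorder_for(record_dir: NOT_GIVEN, env: ENV)
  if record_dir.equal?(NOT_GIVEN)
    # Parentheses matter: `a || b ? c : d` parses as `(a || b) ? c : d`.
    # This sketch assumes the env var's value should pick the directory.
    record_dir = env["DEAD_END_RECORD_DIR"] || (env["DEBUG"] ? "tmp" : nil)
  end

  record_dir.nil? ? NullRecorder.new : MemoryRecorder.new(dir: record_dir)
end
```

With this shape, an explicit `record_dir: nil` always disables tracing, even when `DEBUG` is set, while omitting the argument falls back to the environment.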
--- lib/dead_end/api.rb | 16 ++++--- lib/dead_end/indent_tree.rb | 88 +++++++++++++++++++++---------------- 2 files changed, 60 insertions(+), 44 deletions(-) diff --git a/lib/dead_end/api.rb b/lib/dead_end/api.rb index 6d32da1..92f9db7 100644 --- a/lib/dead_end/api.rb +++ b/lib/dead_end/api.rb @@ -85,13 +85,15 @@ def self.call(source:, filename: DEFAULT_VALUE, terminal: DEFAULT_VALUE, record_ # Used to generate a unique directory to record # search steps for debugging def self.record_dir(dir) - time = Time.now.strftime("%Y-%m-%d-%H-%M-%s-%N") - dir = Pathname(dir) - symlink = dir.join("last").tap { |path| path.delete if path.exist? } - dir.join(time).tap { |path| - path.mkpath - FileUtils.symlink(path.basename, symlink) - } + @record_dir ||= begin + time = Time.now.strftime("%Y-%m-%d-%H-%M-%s-%N") + dir = Pathname(dir) + symlink = dir.join("last").tap { |path| path.delete if path.exist? } + dir.join(time).tap { |path| + path.mkpath + FileUtils.symlink(path.basename, symlink) + } + end end # DeadEnd.valid_without? [Private] diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index d7abced..c04887a 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -1,7 +1,22 @@ # frozen_string_literal: true module DeadEnd - class Recorder + class BlockRecorder + def self.from_dir(dir, subdir: , code_lines: ) + if dir == DEFAULT_VALUE + dir = ENV["DEAD_END_RECORD_DIR"] || ENV["DEBUG"] ? DeadEnd.record_dir("tmp") : nil + end + + if dir.nil? 
+ NullRecorder.new + else + dir = Pathname(dir) + dir = dir.join(subdir) + dir.mkpath + BlockRecorder.new(dir: dir, code_lines: code_lines) + end + end + def initialize(dir:, code_lines:) @code_lines = code_lines @dir = Pathname(dir) @@ -35,29 +50,38 @@ def capture(block, name:) class IndentSearch attr_reader :finished - def initialize(tree:) + def initialize(tree: , record_dir: DEFAULT_VALUE) @tree = tree @root = tree.root @finished = [] @frontier = [Journey.new(@tree.root)] + @recorder = BlockRecorder.from_dir(record_dir, subdir: "search", code_lines: tree.code_lines) end def call while (journey = @frontier.pop) node = journey.node - case node.diagnose + diagnose = node.diagnose + @recorder.capture(node, name: "pop_#{diagnose}") + + case diagnose when :self @finished << journey next when :fork_invalid forks = node.fork_invalid if holds_all_errors?(forks) + forks.each do |block| + @recorder.capture(block, name: "reduced_#{diagnose}") route = journey.deep_dup route << Step.new(block) @frontier.unshift(route) end else + forks.each do |block| + @recorder.capture(block, name: "finished_not_recorded_#{diagnose}") + end @finished << journey end @@ -72,22 +96,39 @@ def call raise "DeadEnd internal error: Unknown diagnosis #{node.diagnose}" end + # When true, we made a good move # otherwise, go back to last known reasonable guess if holds_all_errors?(block) + @recorder.capture(block, name: "reduced_#{diagnose}") + journey << Step.new(block) @frontier.unshift(journey) else + @recorder.capture(block, name: "finished_not_recorded_#{diagnose}") if block @finished << journey - @finished.sort_by! {|j| j.node.starts_at } next end end + @finished.sort_by! {|j| j.node.starts_at } + self end - def holds_all_errors?(blocks) + # Check if a given set of blocks holds + # syntax errors in the context of the document + # + # The frontier + finished arrays should always + # hold all errors for the document. 
+ # + # When reducing a node or nodes we need to make sure + # that while they seem to hold a syntax error in isolation + # that they also hold it in the full document context. + # + # This method accounts for the need to branch/fork a + # search for multiple syntax errors + private def holds_all_errors?(blocks) blocks = Array(blocks).clone blocks.concat(@finished.map(&:node)) blocks.concat(@frontier.map(&:node)) @@ -128,13 +169,6 @@ def to_s node.to_s end - # In isolation a block may appear valid when it isn't or invalid when it is - # by checking against several levels of the tree, we can have higher - # confidence that our values are correct - def holds_all_errors?(blocks) - @steps.first.valid_without?(blocks) - end - def <<(step) @steps << step end @@ -154,38 +188,18 @@ def initialize(block) def to_s block.to_s end - - def valid_without?(blocks) - without_lines = Array(blocks).flat_map do |block| - block.lines - end - - DeadEnd.valid_without?( - without_lines: without_lines, - code_lines: @block.lines - ) - end end class IndentTree - attr_reader :document + attr_reader :document, :code_lines - def initialize(document:, recorder: DEFAULT_VALUE) + def initialize(document:, record_dir: DEFAULT_VALUE) @document = document + @code_lines = document.code_lines @last_length = Float::INFINITY - if recorder != DEFAULT_VALUE - @recorder = recorder - else - dir = ENV["DEAD_END_RECORD_DIR"] || ENV["DEBUG"] ? DeadEnd.record_dir("tmp") : nil - if dir.nil? 
- @recorder = NullRecorder.new - else - dir = dir.join("build_tree") - dir.mkpath - @recorder = Recorder.new(dir: dir, code_lines: document.code_lines) - end - end + + @recorder = BlockRecorder.from_dir(record_dir, subdir: "build_tree", code_lines: @code_lines) end def to_a From 62bff091502977891291f020a3cde3ee6b73d847 Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 7 Feb 2022 16:15:53 -0600 Subject: [PATCH 30/58] Move IndentSearch to a proper file --- lib/dead_end/api.rb | 4 +- lib/dead_end/indent_search.rb | 100 ++++++++++++++++++++++++++++++++++ lib/dead_end/indent_tree.rb | 96 -------------------------------- 3 files changed, 103 insertions(+), 97 deletions(-) create mode 100644 lib/dead_end/indent_search.rb diff --git a/lib/dead_end/api.rb b/lib/dead_end/api.rb index 92f9db7..bec47d1 100644 --- a/lib/dead_end/api.rb +++ b/lib/dead_end/api.rb @@ -201,6 +201,8 @@ def self.valid?(source) require_relative "balance_heuristic_expand" require_relative "parse_blocks_from_indent_line" -require_relative "block_document" require_relative "block_node" require_relative "indent_tree" +require_relative "block_document" + +require_relative "indent_search" diff --git a/lib/dead_end/indent_search.rb b/lib/dead_end/indent_search.rb new file mode 100644 index 0000000..c2acef8 --- /dev/null +++ b/lib/dead_end/indent_search.rb @@ -0,0 +1,100 @@ +# frozen_string_literal: true + +module DeadEnd + class IndentSearch + attr_reader :finished + + def initialize(tree: , record_dir: DEFAULT_VALUE) + @tree = tree + @root = tree.root + @finished = [] + @frontier = [Journey.new(@tree.root)] + @recorder = BlockRecorder.from_dir(record_dir, subdir: "search", code_lines: tree.code_lines) + end + + def call + while (journey = @frontier.pop) + node = journey.node + diagnose = node.diagnose + @recorder.capture(node, name: "pop_#{diagnose}") + + case diagnose + when :self + @finished << journey + next + when :fork_invalid + forks = node.fork_invalid + if holds_all_errors?(forks) + + forks.each do 
|block| + @recorder.capture(block, name: "reduced_#{diagnose}") + route = journey.deep_dup + route << Step.new(block) + @frontier.unshift(route) + end + else + forks.each do |block| + @recorder.capture(block, name: "finished_not_recorded_#{diagnose}") + end + @finished << journey + end + + next + when :next_invalid + block = node.next_invalid + when :split_leaning + block = node.split_leaning + when :multiple + block = node.handle_multiple + else + raise "DeadEnd internal error: Unknown diagnosis #{node.diagnose}" + end + + + # When true, we made a good move + # otherwise, go back to last known reasonable guess + if holds_all_errors?(block) + @recorder.capture(block, name: "reduced_#{diagnose}") + + journey << Step.new(block) + @frontier.unshift(journey) + else + @recorder.capture(block, name: "finished_not_recorded_#{diagnose}") if block + @finished << journey + next + end + end + + @finished.sort_by! {|j| j.node.starts_at } + + self + end + + # Check if a given set of blocks holds + # syntax errors in the context of the document + # + # The frontier + finished arrays should always + # hold all errors for the document. + # + # When reducing a node or nodes we need to make sure + # that while they seem to hold a syntax error in isolation + # that they also hold it in the full document context. 
+ # + # This method accounts for the need to branch/fork a + # search for multiple syntax errors + private def holds_all_errors?(blocks) + blocks = Array(blocks).clone + blocks.concat(@finished.map(&:node)) + blocks.concat(@frontier.map(&:node)) + + without_lines = blocks.flat_map do |block| + block.lines + end + + DeadEnd.valid_without?( + without_lines: without_lines, + code_lines: @root.lines + ) + end + end +end diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index c04887a..6e415bf 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -47,102 +47,6 @@ def capture(block, name:) end end - class IndentSearch - attr_reader :finished - - def initialize(tree: , record_dir: DEFAULT_VALUE) - @tree = tree - @root = tree.root - @finished = [] - @frontier = [Journey.new(@tree.root)] - @recorder = BlockRecorder.from_dir(record_dir, subdir: "search", code_lines: tree.code_lines) - end - - def call - while (journey = @frontier.pop) - node = journey.node - diagnose = node.diagnose - @recorder.capture(node, name: "pop_#{diagnose}") - - case diagnose - when :self - @finished << journey - next - when :fork_invalid - forks = node.fork_invalid - if holds_all_errors?(forks) - - forks.each do |block| - @recorder.capture(block, name: "reduced_#{diagnose}") - route = journey.deep_dup - route << Step.new(block) - @frontier.unshift(route) - end - else - forks.each do |block| - @recorder.capture(block, name: "finished_not_recorded_#{diagnose}") - end - @finished << journey - end - - next - when :next_invalid - block = node.next_invalid - when :split_leaning - block = node.split_leaning - when :multiple - block = node.handle_multiple - else - raise "DeadEnd internal error: Unknown diagnosis #{node.diagnose}" - end - - - # When true, we made a good move - # otherwise, go back to last known reasonable guess - if holds_all_errors?(block) - @recorder.capture(block, name: "reduced_#{diagnose}") - - journey << Step.new(block) - 
@frontier.unshift(journey) - else - @recorder.capture(block, name: "finished_not_recorded_#{diagnose}") if block - @finished << journey - next - end - end - - @finished.sort_by! {|j| j.node.starts_at } - - self - end - - # Check if a given set of blocks holds - # syntax errors in the context of the document - # - # The frontier + finished arrays should always - # hold all errors for the document. - # - # When reducing a node or nodes we need to make sure - # that while they seem to hold a syntax error in isolation - # that they also hold it in the full document context. - # - # This method accounts for the need to branch/fork a - # search for multiple syntax errors - private def holds_all_errors?(blocks) - blocks = Array(blocks).clone - blocks.concat(@finished.map(&:node)) - blocks.concat(@frontier.map(&:node)) - - without_lines = blocks.flat_map do |block| - block.lines - end - - DeadEnd.valid_without?( - without_lines: without_lines, - code_lines: @root.lines - ) - end - end # Each journey represents a walk of the graph to eliminate # invalid code From b8a15496fe80b4ddc5fa68acffdb2c84e6fc9aa0 Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 7 Feb 2022 16:16:52 -0600 Subject: [PATCH 31/58] Add missing frozen string magic comment --- lib/dead_end/api.rb | 2 ++ 1 file changed, 2 insertions(+) diff --git a/lib/dead_end/api.rb b/lib/dead_end/api.rb index bec47d1..285c90b 100644 --- a/lib/dead_end/api.rb +++ b/lib/dead_end/api.rb @@ -1,3 +1,5 @@ +# frozen_string_literal: true + require_relative "version" require "tmpdir" From 4301fabc4f78014532ddb30d10379a58a5c57d0e Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 7 Feb 2022 16:25:45 -0600 Subject: [PATCH 32/58] Move BlockRecorder to proper file --- lib/dead_end/block_recorder.rb | 62 ++++++++++++++++++++++++++++++++++ lib/dead_end/indent_search.rb | 2 ++ lib/dead_end/indent_tree.rb | 48 ++------------------------ 3 files changed, 66 insertions(+), 46 deletions(-) create mode 100644 lib/dead_end/block_recorder.rb 
diff --git a/lib/dead_end/block_recorder.rb b/lib/dead_end/block_recorder.rb new file mode 100644 index 0000000..abb43d2 --- /dev/null +++ b/lib/dead_end/block_recorder.rb @@ -0,0 +1,62 @@ +# frozen_string_literal: true + +module DeadEnd + # Records a BlockNode to a folder on disk + # + # This class allows for tracing the algorithm + class BlockRecorder + + # Convenience constructor for building a BlockRecorder given + # a directory object. + # + # When nil and debug env vars have not been triggered, a + # NullRecorder instance will be returned + # + # Multiple different processes may be logging to the same + # directory, so writing to a subdir is recommended + def self.from_dir(dir, subdir: , code_lines: ) + if dir == DEFAULT_VALUE + dir = ENV["DEAD_END_RECORD_DIR"] || ENV["DEBUG"] ? DeadEnd.record_dir("tmp") : nil + end + + if dir.nil? + NullRecorder.new + else + dir = Pathname(dir) + dir = dir.join(subdir) + dir.mkpath + BlockRecorder.new(dir: dir, code_lines: code_lines) + end + end + + def initialize(dir:, code_lines:) + @code_lines = code_lines + @dir = Pathname(dir) + @tick = 0 + @name_tick = Hash.new { |h, k| h[k] = 0 } + end + + def capture(block, name:) + @tick += 1 + + filename = "#{@tick}-#{name}-#{@name_tick[name] += 1}-(#{block.starts_at}__#{block.ends_at}).txt" + @dir.join(filename).open(mode: "a") do |f| + document = DisplayCodeWithLineNumbers.new( + lines: @code_lines, + terminal: false, + highlight_lines: block.lines + ).call + + f.write(" Block lines: #{(block.starts_at + 1)..(block.ends_at + 1)} (#{name})\n") + f.write(" indent: #{block.indent} next_indent: #{block.next_indent}\n\n") + f.write(document.to_s) + end + end + end + + # Used when recording isn't needed + class NullRecorder + def capture(block, name:) + end + end +end diff --git a/lib/dead_end/indent_search.rb b/lib/dead_end/indent_search.rb index c2acef8..ee0aecc 100644 --- a/lib/dead_end/indent_search.rb +++ b/lib/dead_end/indent_search.rb @@ -1,5 +1,7 @@ # frozen_string_literal: 
true +require_relative "block_recorder" + module DeadEnd class IndentSearch attr_reader :finished diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 6e415bf..432fdc2 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -1,52 +1,8 @@ # frozen_string_literal: true -module DeadEnd - class BlockRecorder - def self.from_dir(dir, subdir: , code_lines: ) - if dir == DEFAULT_VALUE - dir = ENV["DEAD_END_RECORD_DIR"] || ENV["DEBUG"] ? DeadEnd.record_dir("tmp") : nil - end - - if dir.nil? - NullRecorder.new - else - dir = Pathname(dir) - dir = dir.join(subdir) - dir.mkpath - BlockRecorder.new(dir: dir, code_lines: code_lines) - end - end - - def initialize(dir:, code_lines:) - @code_lines = code_lines - @dir = Pathname(dir) - @tick = 0 - @name_tick = Hash.new { |h, k| h[k] = 0 } - end - - def capture(block, name:) - @tick += 1 - - filename = "#{@tick}-#{name}-#{@name_tick[name] += 1}-(#{block.starts_at}__#{block.ends_at}).txt" - @dir.join(filename).open(mode: "a") do |f| - document = DisplayCodeWithLineNumbers.new( - lines: @code_lines, - terminal: false, - highlight_lines: block.lines - ).call - - f.write(" Block lines: #{(block.starts_at + 1)..(block.ends_at + 1)} (#{name})\n") - f.write(" indent: #{block.indent} next_indent: #{block.next_indent}\n\n") - f.write(document.to_s) - end - end - end - - class NullRecorder - def capture(block, name:) - end - end +require_relative "block_recorder" +module DeadEnd # Each journey represents a walk of the graph to eliminate # invalid code From 53eb26630b2d47b9f181c3641bb06c348390ffa2 Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 7 Feb 2022 16:32:01 -0600 Subject: [PATCH 33/58] Document IndentSearch --- lib/dead_end/indent_search.rb | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/lib/dead_end/indent_search.rb b/lib/dead_end/indent_search.rb index ee0aecc..7b9b95c 100644 --- a/lib/dead_end/indent_search.rb +++ b/lib/dead_end/indent_search.rb @@ -3,6 +3,27 
@@ require_relative "block_recorder" module DeadEnd + # Search for the cause of a syntax error + # + # Starts with a BlockNode tree built from IndentTree + # this has the property of the entire document starting + # as a single root. From there we inspect the "parents" of + # the document node to follow the invalid blocks. + # + # This process is recorded via one or more `Journey` instances. + # + # The search enforces the property that all nodes on a journey + # would produce a valid document if removed. This holds true + # from the root node as removing all source code would produce + # a parsable document + # + # After each step in a search, the step is evaluated to see if + # it preserves the Journey property. If not, it means we've looked + # too far and have over-shot our syntax error. Or we've made a bad + # move. In either case we terminate the journey and report its last block. + # + # When done, the journey instances can be accessed in the `finished` + # array class IndentSearch attr_reader :finished From ea6986ecc3a19fe2a4bd942ca2c2fc50dd277b85 Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 7 Feb 2022 16:37:31 -0600 Subject: [PATCH 34/58] Move journey to proper file --- lib/dead_end/indent_search.rb | 1 + lib/dead_end/indent_tree.rb | 47 ----------------------------- lib/dead_end/journey.rb | 56 +++++++++++++++++++++++++++++++++++ 3 files changed, 57 insertions(+), 47 deletions(-) create mode 100644 lib/dead_end/journey.rb diff --git a/lib/dead_end/indent_search.rb b/lib/dead_end/indent_search.rb index 7b9b95c..f6da1cf 100644 --- a/lib/dead_end/indent_search.rb +++ b/lib/dead_end/indent_search.rb @@ -1,5 +1,6 @@ # frozen_string_literal: true +require_relative "journey" require_relative "block_recorder" module DeadEnd diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 432fdc2..6c61241 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -3,53 +3,6 @@ require_relative "block_recorder" module DeadEnd - 
- # Each journey represents a walk of the graph to eliminate - # invalid code - # - # We can check the a step's validity by asserting that it's removal produces - # valid code from it's parent - class Journey - attr_reader :steps - - def initialize(root) - @root = root - @steps = [Step.new(root)] - end - - def deep_dup - j = Journey.new(@root) - steps.each do |step| - j << step - end - j - end - - def to_s - node.to_s - end - - def <<(step) - @steps << step - end - - def node - @steps.last.block - end - end - - class Step - attr_reader :block - - def initialize(block) - @block = block - end - - def to_s - block.to_s - end - end - class IndentTree attr_reader :document, :code_lines diff --git a/lib/dead_end/journey.rb b/lib/dead_end/journey.rb new file mode 100644 index 0000000..dfdfa1f --- /dev/null +++ b/lib/dead_end/journey.rb @@ -0,0 +1,56 @@ +# frozen_string_literal: true + +module DeadEnd + # Each journey represents a walk of the graph to eliminate + # invalid code + # + # We can check a step's validity by asserting that its removal produces + # valid code from its parent + # + # node = tree.root + # journey = Journey.new(node) + # journey << Step.new(node.parents[0]) + # expect(journey.node).to eq(node.parents[0]) + # + class Journey + attr_reader :steps + + def initialize(root) + @root = root + @steps = [Step.new(root)] + end + + # Needed so we don't internally mutate the @steps array + def deep_dup + j = Journey.new(@root) + steps.each do |step| + j << step + end + j + end + + def to_s + node.to_s + end + + def <<(step) + @steps << step + end + + def node + @steps.last.block + end + end + + class Step + attr_reader :block + + def initialize(block) + @block = block + end + + def to_s + block.to_s + end + end +end From ccd9816900ba08aa83ff804efb90a6c30fef0e7b Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 7 Feb 2022 16:45:04 -0600 Subject: [PATCH 35/58] Document and refactor IndentTree --- lib/dead_end/indent_tree.rb | 42 
++++++++++++++++++------------------- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 6c61241..408d950 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -3,6 +3,20 @@ require_relative "block_recorder" module DeadEnd + # Transform a BlockDocument into a Tree + # + # tree = IndentTree.new(document: document).call + # expect(tree.root.lines).to eq(document.code_lines) + # + # Nodes are put into a queue (provided by the document) + # and are pulled out in a specific priority order (high coupling). + # + # A node then attempts to expand up and down according to rules here + # and in `BlockNode#expand_above?` and `BlockNode#expand_below?` + # + # While this process tends to produce valid code blocks from valid code + # it's not guaranteed. Since we will ultimately search for invalid code + # it's not an ideal property. class IndentTree attr_reader :document, :code_lines @@ -15,47 +29,33 @@ def initialize(document:, record_dir: DEFAULT_VALUE) @recorder = BlockRecorder.from_dir(record_dir, subdir: "build_tree", code_lines: @code_lines) end - def to_a - @document.to_a - end - def root @document.root end def call - reduce - - self - end - - private def reduce while (block = document.pop) - original = block - blocks = [block] + @recorder.capture(block, name: "pop") - indent = original.next_indent + blocks = [block] + indent = block.next_indent + # Look up while blocks.last.expand_above?(with_indent: indent) above = blocks.last.above - leaning = above.leaning - # break if leaning == :right blocks << above - break if leaning == :left + break if above.leaning == :left end blocks.reverse! 
+ # Look down while blocks.last.expand_below?(with_indent: indent) below = blocks.last.below - leaning = below.leaning - # break if leaning == :left blocks << below - break if leaning == :right + break if below.leaning == :right end - @recorder.capture(original, name: "pop") - if blocks.length > 1 node = document.capture_all(blocks) @recorder.capture(node, name: "expand") From 93bab24bbc8858b76492cb14bbdf8e30c4abfa2c Mon Sep 17 00:00:00 2001 From: schneems Date: Mon, 7 Feb 2022 20:47:02 -0600 Subject: [PATCH 36/58] Document BlockNode and update BlockDocument docs --- lib/dead_end/block_document.rb | 25 ++++++ lib/dead_end/block_node.rb | 143 ++++++++++++++++++++++++++++++--- 2 files changed, 158 insertions(+), 10 deletions(-) diff --git a/lib/dead_end/block_document.rb b/lib/dead_end/block_document.rb index 721c9df..67bc6a8 100644 --- a/lib/dead_end/block_document.rb +++ b/lib/dead_end/block_document.rb @@ -1,6 +1,31 @@ # frozen_string_literal: true module DeadEnd + # Convert an array of code lines to a linked list of BlockNodes + # + # Each BlockNode is connected to the node above and below it. + # A BlockNode can "capture" other nodes. This process is recursively + # performed to build a hierarchical tree structure via IndentTree + # + # document = BlockDocument.new(code_lines: code_lines) + # document.call + # + # A BlockDocument also holds a priority queue for all BlockNodes + # + # Empty lines are ignored so blocks in the list may have gaps in + # line/index numbers. + # + # A core operation is the ability to "capture" + # several blocks immutably. 
This process creates a new block + # that holds the captured state and substitutes it into the graph + # data structure for the original block + # + # node.above.leaning # => :left + # node.below.leaning # => :right + + # block = document.capture_all([node.above, node.below]) + # block.leaning # => :equal + # class BlockDocument attr_reader :blocks, :queue, :root, :code_lines diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 5e0bd58..f9e2388 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -1,7 +1,51 @@ # frozen_string_literal: true module DeadEnd + # A core data structure + # + # A block node keeps a reference to the block above it + # and below it. In addition a block can "capture" another + # block. Block nodes are treated as immutable(ish) so when that happens + # a new node is created that contains a reference to all the blocks it was + # derived from. These are known as a block's "parents". + # + # If you walk the parent chain until it ends you'll end up with nodes + # representing individual lines of code (generated from a CodeLine). + # + # An important concept in a block is that it knows how it is "leaning" + # based on its internal LexPairDiff. If it's leaning `:left` that means + # it needs to capture something to its right/down to be balanced again. + # + # Note that the capture method is on BlockDocument since it needs to + # retain a valid reference to its root. + # + # Another important concept is that blocks know their current indentation + # and can accurately derive their "next" indentation for when/if + # they're expanded. To be calculated, a node's above and below blocks must + # be accurately assigned, so this property cannot be calculated at creation + # time. + # + # Beyond these core capabilities blocks also know how to `diagnose` what + # is wrong with them. And then they can take an action based on that + # diagnosis. 
For example `node.diagnose == :split_leaning` indicates that + # it contains invalid parents that likely represent an invalid node + # sandwiched between a left and right leaning node. This will happen with + # code such as `[`, `bad &*$@&^ code`, `]`. Then the inner invalid node + # can be grabbed by calling `node.split_leaning`. + # + # In the long term it likely makes sense to move diagnosis and extraction + # to a separate class as this class already is a bit of a "false god object" + # however a lot of tests depend on it currently and it's not really getting + # in the way. class BlockNode + # Helper to create a block from other blocks + # + # parents = node.parents + # expect(parents[0].leaning).to eq(:left) + # expect(parents[2].leaning).to eq(:right) + # + # block = BlockNode.from_blocks([parents[0], parents[2]]) + # expect(block.leaning).to eq(:equal) def self.from_blocks(parents) lines = [] parents = parents.first.parents if parents.length == 1 && parents.first.parents.any? @@ -51,6 +95,12 @@ def initialize(lines:, indent:, next_indent: nil, lex_diff: nil, parents: []) @deleted = false end + # Used to determine when to expand up in building + # a tree. Also used to calculate the `next_indent`. + # + # There is a tight coupling between the two concepts + # as the `next_indent` is used to determine node expansion + # priority def expand_above?(with_indent: indent) return false if above.nil? return false if leaf? && leaning == :left @@ -62,6 +112,12 @@ def expand_above?(with_indent: indent) end end + # Used to determine when to expand down in building + # a tree. Also used to calculate the `next_indent`. + # + # There is a tight coupling between the two concepts + # as the `next_indent` is used to determine node expansion + # priority def expand_below?(with_indent: indent) return false if below.nil? return false if leaf? && leaning == :right @@ -77,10 +133,28 @@ def leaf? parents.empty? 
end + # When diagnose is `:next_invalid` it indicates that + # only one parent is not valid. Therefore we must + # follow that node if we wish to continue reducing + # the invalid blocks def next_invalid parents.detect(&:invalid?) end + # Returns a symbol correlated to the current node's + # parents state + # + # - :self - Leaf node, problem must be with self + # - :next_invalid - Only one invalid parent node found + # - :split_leaning - Invalid block is sandwiched between + # a left/right leaning block, grab the inside + # - :multiple - multiple parent blocks are detected as being + # invalid but it's not a "split leaning". If we can reduce/remove + # one or more of these blocks by pairing with the above/below + # nodes then we can reduce multiple invalid blocks to possibly + # be a single invalid block. + # - :fork_invalid - If we got here, it looks like there's actually + # multiple syntax errors in multiple parents. def diagnose return :self if leaf? @@ -94,19 +168,22 @@ def diagnose :fork_invalid end + # - :fork_invalid - If we got here, it looks like there's actually + # multiple syntax errors in multiple parents. def fork_invalid parents.select(&:invalid?).map do |block| BlockNode.from_blocks([block]) end end - # Muliple could be: + # - :multiple - multiple parent blocks are detected as being + # invalid but it's not a "split leaning". If we can reduce/remove + # one or more of these blocks by pairing with the above/below + # nodes then we can reduce multiple invalid blocks to possibly + # be a single invalid block. # # - valid rescue/else # - leaves inside of an array/hash - # - An actual fork indicating multiple syntax errors - # - # This method handles the first two cases def handle_multiple if reduced_multiple_invalid_array.any? 
@reduce_multiple ||= BlockNode.from_blocks(reduced_multiple_invalid_array) @@ -114,7 +191,7 @@ def handle_multiple end alias :reduce_multiple :handle_multiple - def reduced_multiple_invalid_array + private def reduced_multiple_invalid_array @reduced_multiple_invalid_array ||= begin invalid = parents.select(&:invalid?) # valid rescue/else @@ -140,6 +217,12 @@ def reduce_multiple? reduced_multiple_invalid_array.any? end + # In isolation left and right leaning blocks + # are invalid. For example `(` and `)`. + # + # If we see 3 or more invalid blocks and the outer + # are leaning left and right, then the problem might + # be between the leaning blocks rather than with them def split_leaning block = left_right_parents invalid = parents.select(&:invalid?) @@ -149,7 +232,7 @@ def split_leaning @inner_leaning ||= BlockNode.from_blocks(invalid) end - def left_right_parents + private def left_right_parents invalid = parents.select(&:invalid?) return false if invalid.length < 3 @@ -175,6 +258,18 @@ def split_leaning? end end + # Given a node, its above and below links + # returns the next indentation. + # + # The algorithm for the logic follows: + # + # Expand given the current rules and current indentation + # keep doing that until we can't anymore. When we can't + # then pick the lowest indentation that will capture above + # and below blocks. + # + # The results of this algorithm are tightly coupled to + # tree building and therefore search. def self.next_indent(above, node, below) return node.indent if node.expand_above? || node.expand_below? @@ -195,10 +290,17 @@ def self.next_indent(above, node, below) end end + # Calculating the next_indent must be done after above and below + # have been assigned (otherwise we would have a race condition). def next_indent @next_indent ||= self.class.next_indent(above, self, below) end + # It's useful to be able to mark a node as deleted without having + # to iterate over a data structure to remove it. 
+ # + # By storing a deleted state of a node we can instead lazily ignore it + # as needed. This is a performance optimization. def delete @deleted = true end @@ -207,24 +309,35 @@ def deleted? @deleted end + # Code within a given node is not syntactically valid def invalid? !valid? end + # Code within a given node is syntactically valid + # + # Value is memoized for performance def valid? return @valid if defined?(@valid) @valid = DeadEnd.valid?(@lines.join) end + # Opposite of `balanced?` def unbalanced? !balanced? end + # A node that is `leaning == :equal` is determined to be "balanced". + # + # Alternative states include :left, :right, or :both def balanced? @lex_diff.balanced? end + # Returns the direction a block is leaning + # + # States include :equal, :left, :right, and :both def leaning @lex_diff.leaning end @@ -233,6 +346,17 @@ def to_s @lines.join end + # Determines priority of node within a priority data structure + # (such as a priority queue). + # + # This is tightly coupled to tree building and search. + # + # It's also a performance sensitive area. An optimization + # not yet taken would be to re-encode the same data as a string + # so a node with next indent of 8, current indent of 10 and line + # of 100 might possibly be encoded as `008001000100` which would + # sort the same as this logic. 
Preliminary benchmarks indicate a + # rough 2x speedup def <=>(other) case next_indent <=> other.next_indent when 1 then 1 @@ -242,20 +366,18 @@ def <=>(other) when 1 then 1 when -1 then -1 when 0 - # if leaning != other.leaning - # return -1 if self.leaning == :equal - # return 1 if other.leaning == :equal - # end end_index <=> other.end_index end end end + # Provide meaningful diffs in rspec def inspect "#" end + # Generate a new lex pair diff given an array of lines private def set_lex_diff_from(lines) @lex_diff = LexPairDiff.new_empty lines.each do |line| @@ -263,6 +385,7 @@ def inspect end end + # Needed for meaningful rspec assertions def ==(other) @lines == other.lines && @indent == other.indent && next_indent == other.next_indent && @parents == other.parents end From f3a83a511a09a219e1bf9cb175dda28aef0a7115 Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 8 Feb 2022 09:04:08 -0600 Subject: [PATCH 37/58] Refactor logic to Diagnose class Previously the BlockNode knew how to both "diagnose" its own problem as well as how to return its parent that likely holds the core issue. That caused a lot of extra, somewhat confusing code to go into BlockNode that wasn't really the responsibility of that object. This commit introduces a new class, `Diagnose`, which is responsible for naming the problem with the current node and extracting the likely issue. This makes BlockNode smaller and more focused, and it also isolates all the extraction logic in one convenient location. This was done as a pure refactor. After fully extracting the existing logic, I was able to clean up IndentSearch a lot thanks to the new API. It is now cleaner and more focused. The IndentSearch class is now mainly responsible for maintaining the guarantee that each journey only contains steps that would produce a valid document. I also found a problem where we were claiming that a block node was not a leaf even though it was. 
This was due to passing in a non-empty parents array to the constructor when we shouldn't have. That's now been fixed --- lib/dead_end/block_node.rb | 224 +++++++++++++++----------------- lib/dead_end/indent_search.rb | 57 ++------ spec/unit/indent_search_spec.rb | 23 ++++ spec/unit/indent_tree_spec.rb | 6 +- 4 files changed, 142 insertions(+), 168 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index f9e2388..f7e3378 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -1,6 +1,87 @@ # frozen_string_literal: true module DeadEnd + class Diagnose + attr_reader :block, :problem, :next + + def initialize(block) + @block = block + @problem = nil + @next = [] + end + + def invalid + @block.parents.select(&:invalid?) + end + + def call + find_invalid + return self if invalid.empty? + + if @problem == :fork_invalid + @next = invalid.map {|b| BlockNode.from_blocks([b]) } + else + @next = [ BlockNode.from_blocks(invalid) ] + end + + self + end + + def invalid + @invalid ||= get_invalid + end + + private def find_invalid + invalid + end + + private def get_invalid + if block.parents.empty? + @problem = :self + return [] + end + + invalid = block.parents.select(&:invalid?) + + left = invalid.detect { |block| block.leaning == :left } + right = invalid.reverse_each.detect { |block| block.leaning == :right } + + above = block.above if block.above&.leaning == :left + below = block.below if block.below&.leaning == :right + + if left && right && invalid.length >= 3 && BlockNode.from_blocks([left, right]).valid? + @problem = :split_leaning + + invalid.reject! {|x| x == left || x == right } + + return invalid + end + + if above && below + @problem = :multiple + + before_length = invalid.length + invalid.reject! { |block| + b = BlockNode.from_blocks([above, block, below]) + b.leaning == :equal && b.valid? + } + + if invalid.any? 
&& invalid.length != before_length + return invalid + end + end + + invalid = block.parents.select(&:invalid?) + if invalid.length > 1 + @problem = :fork_invalid + else + @problem = :next_invalid + end + + invalid + end + end + # A core data structure # # A block node keeps a reference to the block above it @@ -48,7 +129,9 @@ class BlockNode # expect(block.leaning).to eq(:equal) def self.from_blocks(parents) lines = [] - parents = parents.first.parents if parents.length == 1 && parents.first.parents.any? + while parents.length == 1 && parents.first.parents.any? + parents = parents.first.parents + end indent = parents.first.indent lex_diff = LexPairDiff.new_empty parents.each do |block| @@ -58,6 +141,11 @@ def self.from_blocks(parents) block.delete end + above = parents.first.above + below = parents.last.below + + parents = [] if parents.length == 1 + node = BlockNode.new( lines: lines, lex_diff: lex_diff, @@ -65,8 +153,8 @@ def self.from_blocks(parents) parents: parents ) - node.above = parents.first.above - node.below = parents.last.below + node.above = above + node.below = below node end @@ -75,24 +163,23 @@ def self.from_blocks(parents) def initialize(lines:, indent:, next_indent: nil, lex_diff: nil, parents: []) lines = Array(lines) - @indent = indent - @next_indent = next_indent @lines = lines - @parents = parents + @deleted = false - @start_index = lines.first.index @end_index = lines.last.index + @start_index = lines.first.index + @indent = indent + @next_indent = next_indent @starts_at = @start_index + 1 - @ends_at = @end_index + 1 + + @parents = parents if lex_diff.nil? set_lex_diff_from(@lines) else @lex_diff = lex_diff end - - @deleted = false end # Used to determine when to expand up in building @@ -133,129 +220,26 @@ def leaf? parents.empty? end - # When diagnose is `:next_invalid` it indicates that - # only one parent is not valid. 
Therefore we must - # follow that node if we wish to continue reducing - # the invalid blocks def next_invalid - parents.detect(&:invalid?) + @diagnose.next.first end - # Returns a symbol correlated to the current node's - # parents state - # - # - :self - Leaf node, problem must be with self - # - :next_invalid - Only one invalid parent node found - # - :split_leaning - Invalid block is sandwiched between - # a left/right leaning block, grab the inside - # - :multiple - multiple parent blocks are detected as being - # invalid but it's not a "split leaning". If we can reduce/remove - # one or more of these blocks by pairing with the above/below - # nodes then we can reduce multiple invalid blocks to possibly - # be a single invalid block. - # - :fork_invalid - If we got here, it looks like there's actually - # multiple syntax errors in multiple parents. def diagnose - return :self if leaf? - - invalid = parents.select(&:invalid?) - return :next_invalid if invalid.count == 1 - - return :split_leaning if split_leaning? - - return :multiple if reduce_multiple? - - :fork_invalid + @diagnose ||= Diagnose.new(self).call + @diagnose.problem end - # - :fork_invalid - If we got here, it looks like there's actually - # multiple syntax errors in multiple parents. def fork_invalid - parents.select(&:invalid?).map do |block| - BlockNode.from_blocks([block]) - end + @diagnose.next end - # - :multiple - multiple parent blocks are detected as being - # invalid but it's not a "split leaning". If we can reduce/remove - # one or more of these blocks by pairing with the above/below - # nodes then we can reduce multiple invalid blocks to possibly - # be a single invalid block. - # - # - valid rescue/else - # - leaves inside of an array/hash def handle_multiple - if reduced_multiple_invalid_array.any? 
- @reduce_multiple ||= BlockNode.from_blocks(reduced_multiple_invalid_array) - end + @diagnose.next.first end alias :reduce_multiple :handle_multiple - private def reduced_multiple_invalid_array - @reduced_multiple_invalid_array ||= begin - invalid = parents.select(&:invalid?) - # valid rescue/else - if above && above.leaning == :left && below && below.leaning == :right - before_length = invalid.length - invalid.reject! { |block| - b = BlockNode.from_blocks([above, block, below]) - b.leaning == :equal && b.valid? - } - if invalid.any? && invalid.length != before_length - invalid - else - [] - end - # return BlockNode.from_blocks(invalid) if invalid.any? && invalid.length != before_length - else - [] - end - end - end - - def reduce_multiple? - reduced_multiple_invalid_array.any? - end - - # In isolation left and right leaning blocks - # are invalid. For example `(` and `)`. - # - # If we see 3 or more invalid blocks and the outer - # are leaning left and right, then the problem might - # be between the leaning blocks rather than with them def split_leaning - block = left_right_parents - invalid = parents.select(&:invalid?) - - invalid.reject! { |x| block.parents.include?(x) } - - @inner_leaning ||= BlockNode.from_blocks(invalid) - end - - private def left_right_parents - invalid = parents.select(&:invalid?) - return false if invalid.length < 3 - - left = invalid.detect { |block| block.leaning == :left } - - return false if left.nil? - - right = invalid.reverse_each.detect { |block| block.leaning == :right } - return false if right.nil? - - @left_right_parents ||= BlockNode.from_blocks([left, right]) - end - - # When a kw/end has an invalid block inbetween it will show up as [false, false, false] - # we can check if the first and last can be joined together for a valid block which - # effectively gives us [true, false, true] - def split_leaning? - block = left_right_parents - if block - block.leaning == :equal && block.valid? 
- else - false - end + @diagnose.next.first end # Given a node, it's above and below links @@ -387,6 +371,8 @@ def inspect # Needed for meaningful rspec assertions def ==(other) + return false if other.nil? + @lines == other.lines && @indent == other.indent && next_indent == other.next_indent && @parents == other.parents end end diff --git a/lib/dead_end/indent_search.rb b/lib/dead_end/indent_search.rb index f6da1cf..9e3c48f 100644 --- a/lib/dead_end/indent_search.rb +++ b/lib/dead_end/indent_search.rb @@ -38,54 +38,21 @@ def initialize(tree: , record_dir: DEFAULT_VALUE) def call while (journey = @frontier.pop) - node = journey.node - diagnose = node.diagnose - @recorder.capture(node, name: "pop_#{diagnose}") + diagnose = Diagnose.new(journey.node).call + problem = diagnose.problem + nodes = diagnose.next - case diagnose - when :self - @finished << journey - next - when :fork_invalid - forks = node.fork_invalid - if holds_all_errors?(forks) - - forks.each do |block| - @recorder.capture(block, name: "reduced_#{diagnose}") - route = journey.deep_dup - route << Step.new(block) - @frontier.unshift(route) - end - else - forks.each do |block| - @recorder.capture(block, name: "finished_not_recorded_#{diagnose}") - end - @finished << journey - end - - next - when :next_invalid - block = node.next_invalid - when :split_leaning - block = node.split_leaning - when :multiple - block = node.handle_multiple - else - raise "DeadEnd internal error: Unknown diagnosis #{node.diagnose}" - end + @recorder.capture(journey.node, name: "pop_#{problem}") - - # When true, we made a good move - # otherwise, go back to last known reasonable guess - if holds_all_errors?(block) - @recorder.capture(block, name: "reduced_#{diagnose}") - - journey << Step.new(block) - @frontier.unshift(journey) - else - @recorder.capture(block, name: "finished_not_recorded_#{diagnose}") if block + if nodes.empty? 
|| !holds_all_errors?(nodes) @finished << journey - next + else + nodes.each do |block| + @recorder.capture(block, name: "explore_#{problem}") + route = journey.deep_dup + route << Step.new(block) + @frontier.unshift(route) + end end end diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index beb727d..7bc516f 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -4,6 +4,29 @@ module DeadEnd RSpec.describe IndentSearch do + # it "long inner" do + # skip("it") + # pending("Fixes to diagnose") + # source = <<~'EOM' + # { + # foo: :bar, + # bing: :baz, + # blat: :flat # problem + # florg: :blorg, + # bling: :blong + # } + # EOM + + # code_lines = CleanDocument.new(source: source).call.lines + # document = BlockDocument.new(code_lines: code_lines).call + # tree = IndentTree.new(document: document).call + # search = IndentSearch.new(tree: tree).call + + # expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(4)) + # blat: :flat # + # EOM + # end + it "invalid if and else" do source = <<~'EOM' if true diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 226402c..9466920 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -72,6 +72,7 @@ class Buffalo node = tree.root # expect(node.parents.length).to eq(2) + expect(node.diagnose).to eq(:fork_invalid) forks = node.fork_invalid @@ -258,6 +259,7 @@ def animals expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning + expect(node.diagnose).to eq(:next_invalid) node = node.next_invalid @@ -335,10 +337,6 @@ def compile node = node.handle_multiple - expect(node.parents.length).to eq(1) - expect(node.diagnose).to eq(:next_invalid) - - node = node.next_invalid expect(node.diagnose).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(6)) From e06af0c5627a80648f98c1849ed0b31fad988299 Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 8 Feb 2022 11:37:06 -0600 Subject: [PATCH 38/58] Fix problem 
with one bad internal node When a node sits inside leaning elements, it might look valid in isolation. For example: ``` cat ``` Is valid code but: ``` [ cat dog, bird ] ``` Is not (missing a comma). We are already enumerating each internal element to see if it's valid in isolation, i.e.: ``` [ cat ] ``` and ``` [ dog, ] ``` and ``` [ bird ] ``` However, due to Ruby's permissive rules, all of these are valid, so no single problem is reported. ## The fix Rather than checking each element in isolation, we can do the opposite. Check if removing that element from the inner block array would produce a valid outcome: ``` [ cat dog, # bird ] # Still invalid ``` ``` [ cat # dog, bird ] # Still invalid ``` ``` [ # cat dog, bird ] # VALID! ``` If one and only one block can be removed to produce a valid outcome, then it must hold the problem. It's worth noting that it feels like there's a more generic case to handle several of the different diagnose states, but I'm unable to find it right now. I would love to clean up the Diagnose class to make some of these checks cleaner/simpler. --- lib/dead_end/block_node.rb | 9 +++++-- spec/unit/indent_search_spec.rb | 42 ++++++++++++++++----------------- spec/unit/indent_tree_spec.rb | 2 +- 3 files changed, 28 insertions(+), 25 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index f7e3378..fd6971c 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -66,8 +66,13 @@ def invalid b.leaning == :equal && b.valid? } - if invalid.any? && invalid.length != before_length - return invalid + if invalid.length != before_length + if invalid.any? + return invalid + elsif (b = block.parents.select(&:invalid?).detect { |b| BlockNode.from_blocks([above, block.parents.select(&:invalid?) - [b] , below].flatten).valid?
}) + @problem = :one_inside + return [b] + end end end diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index 7bc516f..be0f7b2 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -4,28 +4,26 @@ module DeadEnd RSpec.describe IndentSearch do - # it "long inner" do - # skip("it") - # pending("Fixes to diagnose") - # source = <<~'EOM' - # { - # foo: :bar, - # bing: :baz, - # blat: :flat # problem - # florg: :blorg, - # bling: :blong - # } - # EOM - - # code_lines = CleanDocument.new(source: source).call.lines - # document = BlockDocument.new(code_lines: code_lines).call - # tree = IndentTree.new(document: document).call - # search = IndentSearch.new(tree: tree).call - - # expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(4)) - # blat: :flat # - # EOM - # end + it "long inner" do + source = <<~'EOM' + { + foo: :bar, + bing: :baz, + blat: :flat # problem + florg: :blorg, + bling: :blong + } + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.join).to eq(<<~'EOM'.indent(2)) + blat: :flat # problem + EOM + end it "invalid if and else" do source = <<~'EOM' diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 9466920..653b5be 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -260,7 +260,7 @@ def animals expect(node.diagnose).to eq(:split_leaning) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_inside) node = node.next_invalid expect(node.diagnose).to eq(:self) From a666da7cd38d4a712ff7dff4847f5f3217250f19 Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 8 Feb 2022 14:46:03 -0600 Subject: [PATCH 39/58] Fix block reporter --- lib/dead_end/block_node.rb | 1 + lib/dead_end/block_recorder.rb | 
2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index fd6971c..8c9184c 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -177,6 +177,7 @@ def initialize(lines:, indent:, next_indent: nil, lex_diff: nil, parents: []) @next_indent = next_indent @starts_at = @start_index + 1 + @ends_at = @end_index + 1 @parents = parents diff --git a/lib/dead_end/block_recorder.rb b/lib/dead_end/block_recorder.rb index abb43d2..b691245 100644 --- a/lib/dead_end/block_recorder.rb +++ b/lib/dead_end/block_recorder.rb @@ -47,7 +47,7 @@ def capture(block, name:) highlight_lines: block.lines ).call - f.write(" Block lines: #{(block.starts_at + 1)..(block.ends_at + 1)} (#{name})\n") + f.write(" Block lines: #{(block.starts_at)..(block.ends_at)} (#{name})\n") f.write(" indent: #{block.indent} next_indent: #{block.next_indent}\n\n") f.write(document.to_s) end From 011e8a05fdbb702f954acbcd2db1de3854d5a8f4 Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 8 Feb 2022 16:12:53 -0600 Subject: [PATCH 40/58] Diagnose docs, rename problems & fix one edge case Renamed the problem symbols to be more indicative of their root causes rather than some obscure thing that only makes sense to me. --- lib/dead_end/block_node.rb | 156 +++++++++++++++++++++++++++----- lib/dead_end/indent_search.rb | 1 + spec/unit/indent_search_spec.rb | 30 +++++- spec/unit/indent_tree_spec.rb | 130 +++++++++++++------------- 4 files changed, 228 insertions(+), 89 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 8c9184c..06aaa07 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -1,6 +1,54 @@ # frozen_string_literal: true module DeadEnd + # Explore and diagnose problems with a block + # + # Given an invalid node, the root cause of the syntax error + # may exist in that node, or in one or more of it's parents. 
+ # + # The Diagnose class is responsible for determining the most reasonable next move to + # make. + # + # Results can be best effort, i.e. they must be re-checked against a document + # before being recommended to a user. We still want to take care in making the + # best possible suggestion as a bad suggestion may halt the search at a suboptimal + # location. + # + # The algorithm here is tightly coupled to the nodes produced by the current IndentTree + # implementation. + # + # + # Possible problem states: + # + # - :self - The block holds no parents, if it holds a problem its in the current node. + # + # - :invalid_inside_split_pair - An invalid block is splitting two valid leaning blocks, return the middle. + # + # - :remove_pseudo_pair - Multiple invalid blocks in isolation are present, but when paired with external leaning + # blocks above and below they become valid. Remove these and group the leftovers together. i.e. `else/ensure/rescue`. + # + # - :extract_from_multiple - Multiple invalid blocks in isolation are present, but we were able to find one that could be removed + # to make a valid set along with outer leaning i.e. `[`, `in)&lid` , `vaild`, `]`. Different from :invalid_inside_split_pair because + # the leaning elements come from different blocks above & below. At the end of a journey split_leaning might break one invalid + # node into multiple parents that then hit :extract_from_multiple + # + # - :one_invalid_parent - Only one parent is invalid, better investigate. + # + # - :multiple_invalid_parents - Multiple blocks are invalid, they cannot be reduced or extracted, we will have to fork the search and + # explore all of them independently. + # + # Returns the next 0, 1 or N node(s) based on the given problem state. 
+ # + # - 0 nodes returned by :self + # - 1 node returned by :invalid_inside_split_pair, :remove_pseudo_pair, :extract_from_multiple, :one_invalid_parent + # - N nodes returned by :multiple_invalid_parents + # + # Usage example: + # + # diagnose = Diagnose.new(block).call + # expect(diagnose.problem).to eq(:multiple_invalid_parents) + # expect(diagnose.next.length).to eq(2) + # class Diagnose attr_reader :block, :problem, :next @@ -18,7 +66,7 @@ def call find_invalid return self if invalid.empty? - if @problem == :fork_invalid + if @problem == :multiple_invalid_parents @next = invalid.map {|b| BlockNode.from_blocks([b]) } else @next = [ BlockNode.from_blocks(invalid) ] @@ -27,7 +75,7 @@ def call self end - def invalid + private def invalid @invalid ||= get_invalid end @@ -35,7 +83,12 @@ def invalid invalid end + # Checks for the common problem states a node might face. + # returns an array of 0, 1 or N blocks that gets memoized + # + # Sets @problem instance variable private def get_invalid + # If current block has no parents we can explore them, the problem must exist in itself if block.parents.empty? @problem = :self return [] @@ -46,41 +99,100 @@ def invalid left = invalid.detect { |block| block.leaning == :left } right = invalid.reverse_each.detect { |block| block.leaning == :right } - above = block.above if block.above&.leaning == :left - below = block.below if block.below&.leaning == :right - + # Handle case where keyword/end (or any pair) is falsely reported as invalid in isolation but + # holds a syntax error inside of it. + # + # Example: + # + # ``` + # def cow # left, invalid in isolation, valid when paired with end + # ``` + # + # ``` + # inv&li) code # Actual problem to be isolated + # ``` + # + # ``` + # end # right, invalid in isolation, valid when paired with def + # ``` if left && right && invalid.length >= 3 && BlockNode.from_blocks([left, right]).valid? - @problem = :split_leaning + @problem = :invalid_inside_split_pair invalid.reject! 
{|x| x == left || x == right } - return invalid + # If the left/right was not mapped properly or we've accidentally got a :multiple_invalid_parents + # we can get a false positive, double check the invalid lines fully capture the problem + if DeadEnd.valid_without?( + code_lines: block.lines, + without_lines: invalid.flat_map(&:lines) + ) + + return invalid + end end - if above && below - @problem = :multiple + above = block.above if block.above&.leaning == :left + below = block.below if block.below&.leaning == :right - before_length = invalid.length - invalid.reject! { |block| - b = BlockNode.from_blocks([above, block, below]) - b.leaning == :equal && b.valid? - } + if above && below + @problem = :remove_pseudo_pair + + # Handle else/ensure case + # + # Example: + # + # ``` + # def cow # above + # ``` + # + # ``` + # print inv&li) # Actual problem + # rescue => e # Invalid in isolation, valid when paired with above/below + # ``` + # + # ``` + # end # below + # ``` + if invalid.reject! { |block| + b = BlockNode.from_blocks([above, block, below]) + b.leaning == :equal && b.valid? + } - if invalid.length != before_length if invalid.any? return invalid - elsif (b = block.parents.select(&:invalid?).detect { |b| BlockNode.from_blocks([above, block.parents.select(&:invalid?) - [b] , below].flatten).valid? }) - @problem = :one_inside - return [b] + else + # Handle syntax seems fine in isolation, but not when combined with above/below leaning blocks + # + # Example: + # + # ``` + # [ # above + # ``` + # + # ``` + # missing_comma_not_okay + # missing_comma_okay + # ``` + # + # ``` + # ] # below + # ``` + # + invalid = block.parents.select(&:invalid?) + if (b = invalid.detect { |b| BlockNode.from_blocks([above, invalid - [b] , below].flatten).valid? }) + @problem = :extract_from_multiple + return [b] + end end end end + # We couldn't detect any special cases, either return 1 or N invalid nodes invalid = block.parents.select(&:invalid?) 
if invalid.length > 1 - @problem = :fork_invalid + @problem = :multiple_invalid_parents else - @problem = :next_invalid + @problem = :one_invalid_parent end invalid @@ -113,7 +225,7 @@ def invalid # # Beyond these core capabilities blocks also know how to `diagnose` what # is wrong with them. And then they can take an action based on that - # diagnosis. For example `node.diagnose == :split_leaning` indicates that + # diagnosis. For example `node.diagnose == :invalid_inside_split_pair` indicates that # it contains parents invalid parents that likey represent an invalid node # sandwitched between a left and right leaning node. This will happen with # code. For example `[`, `bad &*$@&^ code`, `]`. Then the inside invalid node @@ -242,7 +354,7 @@ def fork_invalid def handle_multiple @diagnose.next.first end - alias :reduce_multiple :handle_multiple + alias :remove_pseudo_pair :handle_multiple def split_leaning @diagnose.next.first diff --git a/lib/dead_end/indent_search.rb b/lib/dead_end/indent_search.rb index 9e3c48f..be520bd 100644 --- a/lib/dead_end/indent_search.rb +++ b/lib/dead_end/indent_search.rb @@ -45,6 +45,7 @@ def call @recorder.capture(journey.node, name: "pop_#{problem}") if nodes.empty? 
|| !holds_all_errors?(nodes) + @recorder.capture(journey.node, name: "skip_capture_and_exit_#{problem}") @finished << journey else nodes.each do |block| diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index be0f7b2..a927fbf 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -4,7 +4,35 @@ module DeadEnd RSpec.describe IndentSearch do - it "long inner" do + it "won't show valid code when two invalid blocks are splitting it" do + source = <<~'EOM' + { + print ( + } + + print 'haha' + + { + print ) + } + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.join).to eq(<<~'EOM'.indent(0)) + { + print ( + } + { + print ) + } + EOM + end + + it "only returns the problem line and not all lines on a long inner section" do source = <<~'EOM' { foo: :bar, diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 653b5be..9359c55 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -23,7 +23,7 @@ class Cow node = tree.root - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -45,7 +45,7 @@ def speak node = tree.root expect(node.parents.length).to eq(2) - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -73,15 +73,15 @@ class Buffalo node = tree.root # expect(node.parents.length).to eq(2) - expect(node.diagnose).to eq(:fork_invalid) + expect(node.diagnose).to eq(:multiple_invalid_parents) forks = node.fork_invalid node = forks.first - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - 
expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -91,10 +91,10 @@ def speak node = forks.last - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -117,7 +117,7 @@ def speak tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning expect(node.to_s).to eq(<<~'EOM') puts ( @@ -125,7 +125,7 @@ def speak puts } EOM - expect(node.diagnose).to eq(:multiple) + expect(node.diagnose).to eq(:remove_pseudo_pair) node = node.handle_multiple expect(node.to_s).to eq(<<~'EOM'.indent(2)) @@ -133,7 +133,7 @@ def speak puts } EOM - expect(node.diagnose).to eq(:fork_invalid) + expect(node.diagnose).to eq(:multiple_invalid_parents) forks = node.fork_invalid expect(forks.length).to eq(2) @@ -220,16 +220,16 @@ def node_preinstall_bin_path node = tree.root - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -254,19 +254,17 @@ def animals tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:split_leaning) - node = node.split_leaning + diagnose = Diagnose.new(node).call + 
expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:one_inside) + expect(node.diagnose).to eq(:extract_from_multiple) node = node.next_invalid expect(node.diagnose).to eq(:self) - # Note that this is a bad pick, it's actual a - # valid line, the search algorithm has to account - # for this expect(node.to_s).to eq(<<~'EOM'.indent(4)) cat, EOM @@ -317,22 +315,22 @@ def compile tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:multiple) + expect(node.diagnose).to eq(:remove_pseudo_pair) node = node.handle_multiple - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:multiple) + expect(node.diagnose).to eq(:remove_pseudo_pair) expect(node.parents.length).to eq(4) node = node.handle_multiple @@ -352,34 +350,34 @@ def compile tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) 
node = node.split_leaning - expect(node.diagnose).to eq(:multiple) + expect(node.diagnose).to eq(:remove_pseudo_pair) node = node.handle_multiple - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:multiple) + expect(node.diagnose).to eq(:remove_pseudo_pair) node = node.handle_multiple - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -397,16 +395,16 @@ def compile node = tree.root - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -427,10 +425,10 @@ def bark tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -459,10 +457,10 @@ def call node = tree.root - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to 
eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -503,10 +501,10 @@ def call tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -578,16 +576,16 @@ def initialize tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -607,28 +605,28 @@ def format_requires node = tree.root - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + 
expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -652,16 +650,16 @@ def format_requires node = tree.root - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:split_leaning) + expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -712,10 +710,10 @@ def initialize(arguments:, block:, location:) node = tree.root - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) @@ -739,7 +737,7 @@ def foo node = tree.root - expect(node.diagnose).to eq(:next_invalid) + expect(node.diagnose).to eq(:one_invalid_parent) node = node.next_invalid expect(node.diagnose).to eq(:self) From 70087edf58a961e5fe548d388cb422ca0e250333 Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 8 Feb 2022 16:21:43 -0600 Subject: [PATCH 41/58] Refactor Diagnose into smaller methods & rename --- lib/dead_end/block_node.rb | 187 ++++++++++++++++++---------------- lib/dead_end/indent_search.rb | 2 +- spec/unit/indent_tree_spec.rb | 2 +- 3 files changed, 99 insertions(+), 92 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 06aaa07..0fc484b 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -6,7 +6,7 @@ module DeadEnd # Given an invalid node, the 
root cause of the syntax error # may exist in that node, or in one or more of it's parents. # - # The Diagnose class is responsible for determining the most reasonable next move to + # The DiagnoseNode class is responsible for determining the most reasonable next move to # make. # # Results can be best effort, i.e. they must be re-checked against a document @@ -45,11 +45,11 @@ module DeadEnd # # Usage example: # - # diagnose = Diagnose.new(block).call + # diagnose = DiagnoseNode.new(block).call # expect(diagnose.problem).to eq(:multiple_invalid_parents) # expect(diagnose.next.length).to eq(2) # - class Diagnose + class DiagnoseNode attr_reader :block, :problem, :next def initialize(block) @@ -58,12 +58,8 @@ def initialize(block) @next = [] end - def invalid - @block.parents.select(&:invalid?) - end - def call - find_invalid + invalid = get_invalid return self if invalid.empty? if @problem == :multiple_invalid_parents @@ -75,46 +71,43 @@ def call self end - private def invalid - @invalid ||= get_invalid - end + # Checks for the common problem states a node might face. + # returns an array of 0, 1 or N blocks + private def get_invalid + out = diagnose_self + return out if out - private def find_invalid - invalid + out = diagnose_left_right + return out if out + + out = diagnose_above_below + return out if out + + diagnose_one_or_more_parents end - # Checks for the common problem states a node might face. - # returns an array of 0, 1 or N blocks that gets memoized + # ## (:invalid_inside_split_pair) Handle case where keyword/end (or any pair) is falsely reported as invalid in isolation but + # holds a syntax error inside of it. # - # Sets @problem instance variable - private def get_invalid - # If current block has no parents we can explore them, the problem must exist in itself - if block.parents.empty? 
- @problem = :self - return [] - end - + # Example: + # + # ``` + # def cow # left, invalid in isolation, valid when paired with end + # ``` + # + # ``` + # inv&li) code # Actual problem to be isolated + # ``` + # + # ``` + # end # right, invalid in isolation, valid when paired with def + # ``` + private def diagnose_left_right invalid = block.parents.select(&:invalid?) left = invalid.detect { |block| block.leaning == :left } right = invalid.reverse_each.detect { |block| block.leaning == :right } - # Handle case where keyword/end (or any pair) is falsely reported as invalid in isolation but - # holds a syntax error inside of it. - # - # Example: - # - # ``` - # def cow # left, invalid in isolation, valid when paired with end - # ``` - # - # ``` - # inv&li) code # Actual problem to be isolated - # ``` - # - # ``` - # end # right, invalid in isolation, valid when paired with def - # ``` if left && right && invalid.length >= 3 && BlockNode.from_blocks([left, right]).valid? @problem = :invalid_inside_split_pair @@ -130,64 +123,71 @@ def call return invalid end end + end + + + # ## (:remove_pseudo_pair) Handle else/ensure case + # + # Example: + # + # ``` + # def cow # above + # ``` + # + # ``` + # print inv&li) # Actual problem + # rescue => e # Invalid in isolation, valid when paired with above/below + # ``` + # + # ``` + # end # below + # ``` + # + # ## (:extract_from_multiple) Handle syntax seems fine in isolation, but not when combined with above/below leaning blocks + # + # Example: + # + # ``` + # [ # above + # ``` + # + # ``` + # missing_comma_not_okay + # missing_comma_okay + # ``` + # + # ``` + # ] # below + # ``` + private def diagnose_above_below + invalid = block.parents.select(&:invalid?) 
above = block.above if block.above&.leaning == :left below = block.below if block.below&.leaning == :right - if above && below - @problem = :remove_pseudo_pair - - # Handle else/ensure case - # - # Example: - # - # ``` - # def cow # above - # ``` - # - # ``` - # print inv&li) # Actual problem - # rescue => e # Invalid in isolation, valid when paired with above/below - # ``` - # - # ``` - # end # below - # ``` - if invalid.reject! { |block| - b = BlockNode.from_blocks([above, block, below]) - b.leaning == :equal && b.valid? - } - - if invalid.any? - return invalid - else - # Handle syntax seems fine in isolation, but not when combined with above/below leaning blocks - # - # Example: - # - # ``` - # [ # above - # ``` - # - # ``` - # missing_comma_not_okay - # missing_comma_okay - # ``` - # - # ``` - # ] # below - # ``` - # - invalid = block.parents.select(&:invalid?) - if (b = invalid.detect { |b| BlockNode.from_blocks([above, invalid - [b] , below].flatten).valid? }) - @problem = :extract_from_multiple - return [b] - end + return false if above.nil? || below.nil? + + if invalid.reject! { |block| + b = BlockNode.from_blocks([above, block, below]) + b.leaning == :equal && b.valid? + } + + if invalid.any? + @problem = :remove_pseudo_pair + return invalid + else + + invalid = block.parents.select(&:invalid?) + if (b = invalid.detect { |b| BlockNode.from_blocks([above, invalid - [b] , below].flatten).valid? }) + @problem = :extract_from_multiple + return [b] end end end + end - # We couldn't detect any special cases, either return 1 or N invalid nodes + # We couldn't detect any special cases, either return 1 or N invalid nodes + private def diagnose_one_or_more_parents invalid = block.parents.select(&:invalid?) if invalid.length > 1 @problem = :multiple_invalid_parents @@ -197,6 +197,13 @@ def call invalid end + + private def diagnose_self + if block.parents.empty? 
+ @problem = :self + return [] + end + end end # A core data structure @@ -343,7 +350,7 @@ def next_invalid end def diagnose - @diagnose ||= Diagnose.new(self).call + @diagnose ||= DiagnoseNode.new(self).call @diagnose.problem end diff --git a/lib/dead_end/indent_search.rb b/lib/dead_end/indent_search.rb index be520bd..42b44e7 100644 --- a/lib/dead_end/indent_search.rb +++ b/lib/dead_end/indent_search.rb @@ -38,7 +38,7 @@ def initialize(tree: , record_dir: DEFAULT_VALUE) def call while (journey = @frontier.pop) - diagnose = Diagnose.new(journey.node).call + diagnose = DiagnoseNode.new(journey.node).call problem = diagnose.problem nodes = diagnose.next diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 9359c55..f76fe74 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -254,7 +254,7 @@ def animals tree = IndentTree.new(document: document).call node = tree.root - diagnose = Diagnose.new(node).call + diagnose = DiagnoseNode.new(node).call expect(diagnose.problem).to eq(:invalid_inside_split_pair) node = diagnose.next[0] From 745dc3bd82234339f6dc1ca22625e25e919a4638 Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 8 Feb 2022 16:25:22 -0600 Subject: [PATCH 42/58] Move diagnose node to its own file --- lib/dead_end/api.rb | 1 + lib/dead_end/block_node.rb | 204 --------------------------------- lib/dead_end/diagnose_node.rb | 208 ++++++++++++++++++++++++++++++++++ 3 files changed, 209 insertions(+), 204 deletions(-) create mode 100644 lib/dead_end/diagnose_node.rb diff --git a/lib/dead_end/api.rb b/lib/dead_end/api.rb index 285c90b..b93be7a 100644 --- a/lib/dead_end/api.rb +++ b/lib/dead_end/api.rb @@ -208,3 +208,4 @@ def self.valid?(source) require_relative "block_document" require_relative "indent_search" +require_relative "diagnose_node" diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 0fc484b..7bde7ae 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ 
-1,210 +1,6 @@ # frozen_string_literal: true module DeadEnd - # Explore and diagnose problems with a block - # - # Given an invalid node, the root cause of the syntax error - # may exist in that node, or in one or more of it's parents. - # - # The DiagnoseNode class is responsible for determining the most reasonable next move to - # make. - # - # Results can be best effort, i.e. they must be re-checked against a document - # before being recommended to a user. We still want to take care in making the - # best possible suggestion as a bad suggestion may halt the search at a suboptimal - # location. - # - # The algorithm here is tightly coupled to the nodes produced by the current IndentTree - # implementation. - # - # - # Possible problem states: - # - # - :self - The block holds no parents, if it holds a problem its in the current node. - # - # - :invalid_inside_split_pair - An invalid block is splitting two valid leaning blocks, return the middle. - # - # - :remove_pseudo_pair - Multiple invalid blocks in isolation are present, but when paired with external leaning - # blocks above and below they become valid. Remove these and group the leftovers together. i.e. `else/ensure/rescue`. - # - # - :extract_from_multiple - Multiple invalid blocks in isolation are present, but we were able to find one that could be removed - # to make a valid set along with outer leaning i.e. `[`, `in)&lid` , `vaild`, `]`. Different from :invalid_inside_split_pair because - # the leaning elements come from different blocks above & below. At the end of a journey split_leaning might break one invalid - # node into multiple parents that then hit :extract_from_multiple - # - # - :one_invalid_parent - Only one parent is invalid, better investigate. - # - # - :multiple_invalid_parents - Multiple blocks are invalid, they cannot be reduced or extracted, we will have to fork the search and - # explore all of them independently. 
- # - # Returns the next 0, 1 or N node(s) based on the given problem state. - # - # - 0 nodes returned by :self - # - 1 node returned by :invalid_inside_split_pair, :remove_pseudo_pair, :extract_from_multiple, :one_invalid_parent - # - N nodes returned by :multiple_invalid_parents - # - # Usage example: - # - # diagnose = DiagnoseNode.new(block).call - # expect(diagnose.problem).to eq(:multiple_invalid_parents) - # expect(diagnose.next.length).to eq(2) - # - class DiagnoseNode - attr_reader :block, :problem, :next - - def initialize(block) - @block = block - @problem = nil - @next = [] - end - - def call - invalid = get_invalid - return self if invalid.empty? - - if @problem == :multiple_invalid_parents - @next = invalid.map {|b| BlockNode.from_blocks([b]) } - else - @next = [ BlockNode.from_blocks(invalid) ] - end - - self - end - - # Checks for the common problem states a node might face. - # returns an array of 0, 1 or N blocks - private def get_invalid - out = diagnose_self - return out if out - - out = diagnose_left_right - return out if out - - out = diagnose_above_below - return out if out - - diagnose_one_or_more_parents - end - - # ## (:invalid_inside_split_pair) Handle case where keyword/end (or any pair) is falsely reported as invalid in isolation but - # holds a syntax error inside of it. - # - # Example: - # - # ``` - # def cow # left, invalid in isolation, valid when paired with end - # ``` - # - # ``` - # inv&li) code # Actual problem to be isolated - # ``` - # - # ``` - # end # right, invalid in isolation, valid when paired with def - # ``` - private def diagnose_left_right - invalid = block.parents.select(&:invalid?) - - left = invalid.detect { |block| block.leaning == :left } - right = invalid.reverse_each.detect { |block| block.leaning == :right } - - if left && right && invalid.length >= 3 && BlockNode.from_blocks([left, right]).valid? - @problem = :invalid_inside_split_pair - - invalid.reject! 
{|x| x == left || x == right } - - # If the left/right was not mapped properly or we've accidentally got a :multiple_invalid_parents - # we can get a false positive, double check the invalid lines fully capture the problem - if DeadEnd.valid_without?( - code_lines: block.lines, - without_lines: invalid.flat_map(&:lines) - ) - - return invalid - end - end - end - - - # ## (:remove_pseudo_pair) Handle else/ensure case - # - # Example: - # - # ``` - # def cow # above - # ``` - # - # ``` - # print inv&li) # Actual problem - # rescue => e # Invalid in isolation, valid when paired with above/below - # ``` - # - # ``` - # end # below - # ``` - # - # ## (:extract_from_multiple) Handle syntax seems fine in isolation, but not when combined with above/below leaning blocks - # - # Example: - # - # ``` - # [ # above - # ``` - # - # ``` - # missing_comma_not_okay - # missing_comma_okay - # ``` - # - # ``` - # ] # below - # ``` - private def diagnose_above_below - invalid = block.parents.select(&:invalid?) - - above = block.above if block.above&.leaning == :left - below = block.below if block.below&.leaning == :right - - return false if above.nil? || below.nil? - - if invalid.reject! { |block| - b = BlockNode.from_blocks([above, block, below]) - b.leaning == :equal && b.valid? - } - - if invalid.any? - @problem = :remove_pseudo_pair - return invalid - else - - invalid = block.parents.select(&:invalid?) - if (b = invalid.detect { |b| BlockNode.from_blocks([above, invalid - [b] , below].flatten).valid? }) - @problem = :extract_from_multiple - return [b] - end - end - end - end - - # We couldn't detect any special cases, either return 1 or N invalid nodes - private def diagnose_one_or_more_parents - invalid = block.parents.select(&:invalid?) - if invalid.length > 1 - @problem = :multiple_invalid_parents - else - @problem = :one_invalid_parent - end - - invalid - end - - private def diagnose_self - if block.parents.empty? 
- @problem = :self - return [] - end - end - end # A core data structure # diff --git a/lib/dead_end/diagnose_node.rb b/lib/dead_end/diagnose_node.rb new file mode 100644 index 0000000..b533718 --- /dev/null +++ b/lib/dead_end/diagnose_node.rb @@ -0,0 +1,208 @@ +# frozen_string_literal: true + +module DeadEnd + # Explore and diagnose problems with a block + # + # Given an invalid node, the root cause of the syntax error + # may exist in that node, or in one or more of it's parents. + # + # The DiagnoseNode class is responsible for determining the most reasonable next move to + # make. + # + # Results can be best effort, i.e. they must be re-checked against a document + # before being recommended to a user. We still want to take care in making the + # best possible suggestion as a bad suggestion may halt the search at a suboptimal + # location. + # + # The algorithm here is tightly coupled to the nodes produced by the current IndentTree + # implementation. + # + # + # Possible problem states: + # + # - :self - The block holds no parents, if it holds a problem its in the current node. + # + # - :invalid_inside_split_pair - An invalid block is splitting two valid leaning blocks, return the middle. + # + # - :remove_pseudo_pair - Multiple invalid blocks in isolation are present, but when paired with external leaning + # blocks above and below they become valid. Remove these and group the leftovers together. i.e. `else/ensure/rescue`. + # + # - :extract_from_multiple - Multiple invalid blocks in isolation are present, but we were able to find one that could be removed + # to make a valid set along with outer leaning i.e. `[`, `in)&lid` , `vaild`, `]`. Different from :invalid_inside_split_pair because + # the leaning elements come from different blocks above & below. 
At the end of a journey split_leaning might break one invalid + # node into multiple parents that then hit :extract_from_multiple + # + # - :one_invalid_parent - Only one parent is invalid, better investigate. + # + # - :multiple_invalid_parents - Multiple blocks are invalid, they cannot be reduced or extracted, we will have to fork the search and + # explore all of them independently. + # + # Returns the next 0, 1 or N node(s) based on the given problem state. + # + # - 0 nodes returned by :self + # - 1 node returned by :invalid_inside_split_pair, :remove_pseudo_pair, :extract_from_multiple, :one_invalid_parent + # - N nodes returned by :multiple_invalid_parents + # + # Usage example: + # + # diagnose = DiagnoseNode.new(block).call + # expect(diagnose.problem).to eq(:multiple_invalid_parents) + # expect(diagnose.next.length).to eq(2) + # + class DiagnoseNode + attr_reader :block, :problem, :next + + def initialize(block) + @block = block + @problem = nil + @next = [] + end + + def call + invalid = get_invalid + return self if invalid.empty? + + if @problem == :multiple_invalid_parents + @next = invalid.map {|b| BlockNode.from_blocks([b]) } + else + @next = [ BlockNode.from_blocks(invalid) ] + end + + self + end + + # Checks for the common problem states a node might face. + # returns an array of 0, 1 or N blocks + private def get_invalid + out = diagnose_self + return out if out + + out = diagnose_left_right + return out if out + + out = diagnose_above_below + return out if out + + diagnose_one_or_more_parents + end + + # ## (:invalid_inside_split_pair) Handle case where keyword/end (or any pair) is falsely reported as invalid in isolation but + # holds a syntax error inside of it. 
+ # + # Example: + # + # ``` + # def cow # left, invalid in isolation, valid when paired with end + # ``` + # + # ``` + # inv&li) code # Actual problem to be isolated + # ``` + # + # ``` + # end # right, invalid in isolation, valid when paired with def + # ``` + private def diagnose_left_right + invalid = block.parents.select(&:invalid?) + + left = invalid.detect { |block| block.leaning == :left } + right = invalid.reverse_each.detect { |block| block.leaning == :right } + + if left && right && invalid.length >= 3 && BlockNode.from_blocks([left, right]).valid? + @problem = :invalid_inside_split_pair + + invalid.reject! {|x| x == left || x == right } + + # If the left/right was not mapped properly or we've accidentally got a :multiple_invalid_parents + # we can get a false positive, double check the invalid lines fully capture the problem + if DeadEnd.valid_without?( + code_lines: block.lines, + without_lines: invalid.flat_map(&:lines) + ) + + return invalid + end + end + end + + + # ## (:remove_pseudo_pair) Handle else/ensure case + # + # Example: + # + # ``` + # def cow # above + # ``` + # + # ``` + # print inv&li) # Actual problem + # rescue => e # Invalid in isolation, valid when paired with above/below + # ``` + # + # ``` + # end # below + # ``` + # + # ## (:extract_from_multiple) Handle syntax seems fine in isolation, but not when combined with above/below leaning blocks + # + # Example: + # + # ``` + # [ # above + # ``` + # + # ``` + # missing_comma_not_okay + # missing_comma_okay + # ``` + # + # ``` + # ] # below + # ``` + private def diagnose_above_below + invalid = block.parents.select(&:invalid?) + + above = block.above if block.above&.leaning == :left + below = block.below if block.below&.leaning == :right + + return false if above.nil? || below.nil? + + if invalid.reject! { |block| + b = BlockNode.from_blocks([above, block, below]) + b.leaning == :equal && b.valid? + } + + if invalid.any? 
+ @problem = :remove_pseudo_pair + return invalid + else + + invalid = block.parents.select(&:invalid?) + if (b = invalid.detect { |b| BlockNode.from_blocks([above, invalid - [b] , below].flatten).valid? }) + @problem = :extract_from_multiple + return [b] + end + end + end + end + + # We couldn't detect any special cases, either return 1 or N invalid nodes + private def diagnose_one_or_more_parents + invalid = block.parents.select(&:invalid?) + if invalid.length > 1 + @problem = :multiple_invalid_parents + else + @problem = :one_invalid_parent + end + + invalid + end + + private def diagnose_self + if block.parents.empty? + @problem = :self + return [] + end + end + end +end From 194ee1b02b984c3e0b16d7ba95a8e5e7df5161d7 Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 8 Feb 2022 16:31:26 -0600 Subject: [PATCH 43/58] standardrb --fix --- lib/dead_end/block_node.rb | 3 +-- lib/dead_end/block_recorder.rb | 3 +-- lib/dead_end/diagnose_node.rb | 31 +++++++++++++++---------------- lib/dead_end/indent_search.rb | 4 ++-- lib/dead_end/indent_tree.rb | 1 - spec/unit/indent_search_spec.rb | 2 +- spec/unit/indent_tree_spec.rb | 13 ++++++------- 7 files changed, 26 insertions(+), 31 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 7bde7ae..3c76662 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -1,7 +1,6 @@ # frozen_string_literal: true module DeadEnd - # A core data structure # # A block node keeps a reference to the block above it @@ -157,7 +156,7 @@ def fork_invalid def handle_multiple @diagnose.next.first end - alias :remove_pseudo_pair :handle_multiple + alias_method :remove_pseudo_pair, :handle_multiple def split_leaning @diagnose.next.first diff --git a/lib/dead_end/block_recorder.rb b/lib/dead_end/block_recorder.rb index b691245..cb5a68a 100644 --- a/lib/dead_end/block_recorder.rb +++ b/lib/dead_end/block_recorder.rb @@ -5,7 +5,6 @@ module DeadEnd # # This class allows for tracing the algorithm class 
BlockRecorder - # Convienece constructor for building a BlockRecorder given # a directory object. # @@ -14,7 +13,7 @@ class BlockRecorder # # Multiple different processes may be logging to the same # directory, so writing to a subdir is recommended - def self.from_dir(dir, subdir: , code_lines: ) + def self.from_dir(dir, subdir:, code_lines:) if dir == DEFAULT_VALUE dir = ENV["DEAD_END_RECORD_DIR"] || ENV["DEBUG"] ? DeadEnd.record_dir("tmp") : nil end diff --git a/lib/dead_end/diagnose_node.rb b/lib/dead_end/diagnose_node.rb index b533718..af08cf6 100644 --- a/lib/dead_end/diagnose_node.rb +++ b/lib/dead_end/diagnose_node.rb @@ -62,10 +62,10 @@ def call invalid = get_invalid return self if invalid.empty? - if @problem == :multiple_invalid_parents - @next = invalid.map {|b| BlockNode.from_blocks([b]) } + @next = if @problem == :multiple_invalid_parents + invalid.map { |b| BlockNode.from_blocks([b]) } else - @next = [ BlockNode.from_blocks(invalid) ] + [BlockNode.from_blocks(invalid)] end self @@ -111,7 +111,7 @@ def call if left && right && invalid.length >= 3 && BlockNode.from_blocks([left, right]).valid? @problem = :invalid_inside_split_pair - invalid.reject! {|x| x == left || x == right } + invalid.reject! { |x| x == left || x == right } # If the left/right was not mapped properly or we've accidentally got a :multiple_invalid_parents # we can get a false positive, double check the invalid lines fully capture the problem @@ -120,12 +120,11 @@ def call without_lines: invalid.flat_map(&:lines) ) - return invalid + invalid end end end - # ## (:remove_pseudo_pair) Handle else/ensure case # # Example: @@ -168,19 +167,19 @@ def call return false if above.nil? || below.nil? if invalid.reject! { |block| - b = BlockNode.from_blocks([above, block, below]) - b.leaning == :equal && b.valid? - } + b = BlockNode.from_blocks([above, block, below]) + b.leaning == :equal && b.valid? + } if invalid.any? 
@problem = :remove_pseudo_pair - return invalid + invalid else invalid = block.parents.select(&:invalid?) - if (b = invalid.detect { |b| BlockNode.from_blocks([above, invalid - [b] , below].flatten).valid? }) + if (b = invalid.detect { |b| BlockNode.from_blocks([above, invalid - [b], below].flatten).valid? }) @problem = :extract_from_multiple - return [b] + [b] end end end @@ -189,10 +188,10 @@ def call # We couldn't detect any special cases, either return 1 or N invalid nodes private def diagnose_one_or_more_parents invalid = block.parents.select(&:invalid?) - if invalid.length > 1 - @problem = :multiple_invalid_parents + @problem = if invalid.length > 1 + :multiple_invalid_parents else - @problem = :one_invalid_parent + :one_invalid_parent end invalid @@ -201,7 +200,7 @@ def call private def diagnose_self if block.parents.empty? @problem = :self - return [] + [] end end end diff --git a/lib/dead_end/indent_search.rb b/lib/dead_end/indent_search.rb index 42b44e7..5aca0cc 100644 --- a/lib/dead_end/indent_search.rb +++ b/lib/dead_end/indent_search.rb @@ -28,7 +28,7 @@ module DeadEnd class IndentSearch attr_reader :finished - def initialize(tree: , record_dir: DEFAULT_VALUE) + def initialize(tree:, record_dir: DEFAULT_VALUE) @tree = tree @root = tree.root @finished = [] @@ -57,7 +57,7 @@ def call end end - @finished.sort_by! {|j| j.node.starts_at } + @finished.sort_by! 
{ |j| j.node.starts_at } self end diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 408d950..533c0a9 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -25,7 +25,6 @@ def initialize(document:, record_dir: DEFAULT_VALUE) @code_lines = document.code_lines @last_length = Float::INFINITY - @recorder = BlockRecorder.from_dir(record_dir, subdir: "build_tree", code_lines: @code_lines) end diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index a927fbf..553badb 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -328,7 +328,7 @@ def initialize(arguments:, block:, location:) search = IndentSearch.new(tree: tree).call expect(search.finished.first.node.to_s).to eq(<<~'EOM') - def on_args_add(arguments, argument) + def on_args_add(arguments, argument) EOM end diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index f76fe74..64ed507 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -24,7 +24,7 @@ class Cow node = tree.root expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + node = node.next_invalid expect(node.diagnose).to eq(:self) expect(node.to_s).to eq(<<~'EOM') @@ -32,7 +32,6 @@ class Cow EOM end - it "ambiguous kw" do source = <<~'EOM' class Cow @@ -46,7 +45,7 @@ def speak node = tree.root expect(node.parents.length).to eq(2) expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + node = node.next_invalid expect(node.diagnose).to eq(:self) expect(node.to_s).to eq(<<~'EOM') @@ -74,7 +73,7 @@ class Buffalo # expect(node.parents.length).to eq(2) expect(node.diagnose).to eq(:multiple_invalid_parents) - forks = node.fork_invalid + forks = node.fork_invalid node = forks.first @@ -120,9 +119,9 @@ def speak expect(node.diagnose).to eq(:invalid_inside_split_pair) node = node.split_leaning expect(node.to_s).to eq(<<~'EOM') - puts ( - else - puts } + puts ( + else + 
puts } EOM expect(node.diagnose).to eq(:remove_pseudo_pair) From e0256d111623542001241f2f3fa24ea44d26c18c Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 8 Feb 2022 16:42:19 -0600 Subject: [PATCH 44/58] Finish extraction of Diagnose logic --- lib/dead_end/block_node.rb | 35 ---- spec/unit/indent_tree_spec.rb | 363 +++++++++++++++++++++------------- 2 files changed, 221 insertions(+), 177 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 3c76662..c282f55 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -24,19 +24,6 @@ module DeadEnd # they're expanded. To be calculated a nodes above and below blocks must # be accurately assigned. So this property cannot be calculated at creation # time. - # - # Beyond these core capabilities blocks also know how to `diagnose` what - # is wrong with them. And then they can take an action based on that - # diagnosis. For example `node.diagnose == :invalid_inside_split_pair` indicates that - # it contains parents invalid parents that likey represent an invalid node - # sandwitched between a left and right leaning node. This will happen with - # code. For example `[`, `bad &*$@&^ code`, `]`. Then the inside invalid node - # can be grabbed via calling `node.split_leaning`. - # - # In the long term it likely makes sense to move diagnosis and extraction - # to a separate class as this class already is a bit of a "false god object" - # however a lot of tests depend on it currently and it's not really getting - # in the way. class BlockNode # Helper to create a block from other blocks # @@ -140,28 +127,6 @@ def leaf? parents.empty? 
end - def next_invalid - @diagnose.next.first - end - - def diagnose - @diagnose ||= DiagnoseNode.new(self).call - @diagnose.problem - end - - def fork_invalid - @diagnose.next - end - - def handle_multiple - @diagnose.next.first - end - alias_method :remove_pseudo_pair, :handle_multiple - - def split_leaning - @diagnose.next.first - end - # Given a node, it's above and below links # returns the next indentation. # diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 64ed507..25617b1 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -23,10 +23,12 @@ class Cow node = tree.root - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM') end # two EOM @@ -43,11 +45,14 @@ def speak tree = IndentTree.new(document: document).call node = tree.root + + diagnose = DiagnoseNode.new(node).call expect(node.parents.length).to eq(2) - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM') class Cow EOM @@ -72,31 +77,38 @@ class Buffalo node = tree.root # expect(node.parents.length).to eq(2) - expect(node.diagnose).to eq(:multiple_invalid_parents) - forks = node.fork_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:multiple_invalid_parents) + forks = diagnose.next node = forks.first - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to 
eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) def speak EOM node = forks.last - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) end # buffalo one EOM @@ -116,24 +128,27 @@ def speak tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] + expect(node.to_s).to eq(<<~'EOM') puts ( else puts } EOM - expect(node.diagnose).to eq(:remove_pseudo_pair) - node = node.handle_multiple - + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:remove_pseudo_pair) + node = diagnose.next[0] expect(node.to_s).to eq(<<~'EOM'.indent(2)) puts ( puts } EOM - expect(node.diagnose).to eq(:multiple_invalid_parents) - forks = node.fork_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:multiple_invalid_parents) + forks = diagnose.next expect(forks.length).to eq(2) expect(forks.first.to_s).to eq(<<~'EOM'.indent(2)) @@ -219,19 +234,24 @@ def 
node_preinstall_bin_path node = tree.root - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM') | EOM @@ -257,13 +277,17 @@ def animals expect(diagnose.problem).to eq(:invalid_inside_split_pair) node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] + - expect(node.diagnose).to eq(:extract_from_multiple) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:extract_from_multiple) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(4)) cat, EOM @@ -314,28 +338,35 @@ def compile tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to 
eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:remove_pseudo_pair) - node = node.handle_multiple + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:remove_pseudo_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:remove_pseudo_pair) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:remove_pseudo_pair) expect(node.parents.length).to eq(4) - node = node.handle_multiple - - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + node = diagnose.next[0] + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(6)) bundle_path: "vendor/bundle", } EOM @@ -349,37 +380,48 @@ def compile tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + 
expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:remove_pseudo_pair) - node = node.handle_multiple + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:remove_pseudo_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:remove_pseudo_pair) - node = node.handle_multiple + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:remove_pseudo_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM') | EOM @@ -394,19 +436,24 @@ def compile node = tree.root - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to 
eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(4)) def filename EOM @@ -424,14 +471,17 @@ def bark tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) def bark EOM @@ -456,13 +506,16 @@ def call node = tree.root - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + 
expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) end # one EOM @@ -500,13 +553,16 @@ def call tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) end # one EOM @@ -575,19 +631,24 @@ def initialize tree = IndentTree.new(document: document).call node = tree.root - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) def format_requires EOM @@ -603,32 +664,40 @@ def format_requires tree = IndentTree.new(document: document).call node = tree.root + diagnose = 
DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid - - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(4)) def format_requires EOM @@ -649,19 +718,24 @@ def format_requires node = tree.root - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + 
expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:invalid_inside_split_pair) - node = node.split_leaning + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:invalid_inside_split_pair) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) def on_args_add(arguments, argument) EOM @@ -709,13 +783,16 @@ def initialize(arguments:, block:, location:) node = tree.root - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM') def on_args_add(arguments, argument) EOM @@ -736,10 +813,12 @@ def foo node = tree.root - expect(node.diagnose).to eq(:one_invalid_parent) - node = node.next_invalid + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:one_invalid_parent) + node = diagnose.next[0] - expect(node.diagnose).to eq(:self) + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM') end # two EOM From 
c0d8625ada6ea94d98d424ed8d5b27ad02560ca1 Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 8 Feb 2022 20:05:59 -0600 Subject: [PATCH 45/58] Remove unused BalanceHeuristicExpand --- lib/dead_end/api.rb | 1 - lib/dead_end/balance_heuristic_expand.rb | 281 --------------------- lib/dead_end/code_search.rb | 31 +-- spec/integration/dead_end_spec.rb | 2 - spec/unit/balance_heuristic_expand_spec.rb | 230 ----------------- 5 files changed, 1 insertion(+), 544 deletions(-) delete mode 100644 lib/dead_end/balance_heuristic_expand.rb delete mode 100644 spec/unit/balance_heuristic_expand_spec.rb diff --git a/lib/dead_end/api.rb b/lib/dead_end/api.rb index b93be7a..80ee597 100644 --- a/lib/dead_end/api.rb +++ b/lib/dead_end/api.rb @@ -200,7 +200,6 @@ def self.valid?(source) require_relative "priority_engulf_queue" require_relative "pathname_from_message" require_relative "display_invalid_blocks" -require_relative "balance_heuristic_expand" require_relative "parse_blocks_from_indent_line" require_relative "block_node" diff --git a/lib/dead_end/balance_heuristic_expand.rb b/lib/dead_end/balance_heuristic_expand.rb deleted file mode 100644 index b1fe96b..0000000 --- a/lib/dead_end/balance_heuristic_expand.rb +++ /dev/null @@ -1,281 +0,0 @@ -# frozen_string_literal: true - -module DeadEnd - # Expand code based on lexical heuristic - # - # Code that has unbalanced pairs cannot be valid - # i.e. `{` must always be matched with a `}`. - # - # This expansion class exploits that knowledge to - # expand a logical block towards equal pairs. - # - # For example: if code is missing a `]` it cannot - # be on a line above, so it must expand down - # - # This heuristic allows us to make larger and more - # accurate expansions which means fewer invalid - # blocks to check which means overall faster search. - # - # This class depends on another class LexPairDiff can be - # accesssed per-line. 
It holds the delta of tracked directional - # pairs: curly brackets, square brackets, parens, and kw/end - # with positive count (leaning left), 0 (balanced), or negative - # count (leaning right). - # - # With this lexical diff information we can look around a given - # block and move with inteligently. For instance if the current - # block has a miss matched `end` and the line above it holds - # `def foo` then the block will be expanded up to capture that line. - # - # An unbalanced block can never be valid (this provides info to - # the overall search). However a balanced block may contain other syntax - # error and so must be re-checked using Ripper (slow). - # - # Example - # - # lines = CodeLines.from_source(<~'EOM') - # if bark? - # end - # EOM - # block = CodeBlock.new(lines: lines[0]) - # - # expand = BalanceHeuristicExpand.new( - # code_lines: lines, - # block: block - # ) - # expand.direction # => :down - # expand.call - # expand.direction # => :equal - # - # expect(expand.to_s).to eq(lines.join) - class BalanceHeuristicExpand - attr_reader :start_index, :end_index - - def initialize(code_lines:, block:) - @block = block - @iterations = 0 - @code_lines = code_lines - @last_index = @code_lines.length - 1 - @max_iterations = @code_lines.length * 2 - @start_index = block.lines.first.index - @end_index = block.lines.last.index - @last_equal_range = nil - - set_lex_diff_from(block) - end - - private def set_lex_diff_from(block) - @lex_diff = LexPairDiff.new( - curly: 0, - square: 0, - parens: 0, - kw_end: 0 - ) - block.lines.each do |line| - @lex_diff.concat(line.lex_diff) - end - end - - # Converts the searched lines into a source string - def to_s - @code_lines[start_index..end_index].join - end - - # Converts the searched lines into a code block - def to_block - CodeBlock.new(lines: @code_lines[start_index..end_index]) - end - - # Returns true if all lines are equal - def balanced? - @lex_diff.balanced? 
- end - - # Returns false if captured lines are "leaning" - # one direction - def unbalanced? - !balanced? - end - - # Main search entrypoint - # - # Essentially a state machine, determine the leaning - # of the given block, then figure out how to either - # move it towards balanced, or expand it while keeping - # it balanced. - def call - case direction - when :up - # the goal is to become balanced - while keep_going? && direction == :up && try_expand_up - end - when :down - # the goal is to become balanced - while keep_going? && direction == :down && try_expand_down - end - when :equal - while keep_going? && grab_equal_or { - # Cannot create a balanced expansion, choose to be unbalanced - try_expand_up - } - end - - call # Recurse - when :both - while keep_going? && grab_equal_or { - try_expand_up - try_expand_down - } - end - when :stop - return self - end - - self - end - - # Convert a lex diff to a direction to search - # - # leaning left -> down - # leaning right -> up - # - def direction - leaning = @lex_diff.leaning - case leaning - when :left # go down - stop_bottom? ? :stop : :down - when :right # go up - stop_top? ? :stop : :up - when :equal, :both - if stop_top? && stop_bottom? - :stop - elsif stop_top? && !stop_bottom? - :down - elsif !stop_top? && stop_bottom? - :up - else - leaning - end - end - end - - # Limit rspec failure output - def inspect - "#" - end - - # Upper bound on iterations - private def keep_going? - if @iterations < @max_iterations - @iterations += 1 - true - else - warn <<~EOM - DeadEnd: Internal problem detected, possible infinite loop in #{self.class} - - Please open a ticket with the following information. Max: #{@max_iterations}, actual: #{@iterations} - - Original block: - - ``` - #{@block.lines.map(&:original).join}``` - - Stuck at: - - ``` - #{to_block.lines.map(&:original).join}``` - EOM - - false - end - end - - # Attempt to grab "free" lines - # - # if either above, below or both are - # balanced, take them, return true. 
- # - # If above is leaning left and below - # is leaning right and they cancel out - # take them, return true. - # - # If we couldn't grab any balanced lines - # then call the block and return false. - private def grab_equal_or - did_expand = false - if above&.balanced? - did_expand = true - try_expand_up - end - - if below&.balanced? - did_expand = true - try_expand_down - end - - return true if did_expand - - if make_balanced_from_up_down? - try_expand_up - try_expand_down - true - else - yield - false - end - end - - # If up is leaning left and down is leaning right - # they might cancel out, to make a complete - # and balanced block - private def make_balanced_from_up_down? - return false if above.nil? || below.nil? - return false if above.lex_diff.leaning != :left - return false if below.lex_diff.leaning != :right - - @lex_diff.dup.concat(above.lex_diff).concat(below.lex_diff).balanced? - end - - # The line above the current location - private def above - @code_lines[@start_index - 1] unless stop_top? - end - - # The line below the current location - private def below - @code_lines[@end_index + 1] unless stop_bottom? - end - - # Mutates the start index and applies the new line's - # lex diff - private def expand_up - @start_index -= 1 - @lex_diff.concat(@code_lines[@start_index].lex_diff) - end - - private def try_expand_up - stop_top? ? false : expand_up - end - - private def try_expand_down - stop_bottom? ? false : expand_down - end - - # Mutates the end index and applies the new line's - # lex diff - private def expand_down - @end_index += 1 - @lex_diff.concat(@code_lines[@end_index].lex_diff) - end - - # Returns true when we can no longer expand up - private def stop_top? - @start_index == 0 - end - - # Returns true when we can no longer expand down - private def stop_bottom? 
- @end_index == @last_index - end - end -end diff --git a/lib/dead_end/code_search.rb b/lib/dead_end/code_search.rb index 12072c2..7d46060 100644 --- a/lib/dead_end/code_search.rb +++ b/lib/dead_end/code_search.rb @@ -120,36 +120,7 @@ def expand_existing record(block: block, name: "before-expand") - if block.invalid? - # When a block is invalid the BalanceHeuristicExpand class tends to make it valid - # again. This property reduces the number of Ripper calls to - # `frontier.holds_all_syntax_errors?`. - # - # This class tends to produce larger expansions meaning fewer - # total expansion steps. - blocks = [] - expand = BalanceHeuristicExpand.new(code_lines: code_lines, block: block) - - # Expand magic number 3 times - # - # There's likely a hidden property that explains why. I - # guessed it accidentally and it works really well. Reducing or increasing - # call count produces awful results. I'm not entirely sure why. - blocks << expand.call.to_block - blocks << expand.to_block if expand.call.balanced? - blocks << expand.to_block if expand.call.balanced? - - # Take the largest generated, valid block - block = blocks.reverse_each.detect(&:valid?) || blocks.first - else - # The original block expansion process works well when it starts - # with good i.e. "valid" input. Unlike BalanceHeuristicExpand, it does not self-correct - # towards a valid state. This naive property is desireable since - # we want to generate invalid code blocks (that make logical sense) - # or the algorithm will tend towards matching incorrect pairs - # at the expense of an incorrect result. 
- block = @indent_block_expand.call(block) - end + block = @indent_block_expand.call(block) push(block, name: "expand") end diff --git a/spec/integration/dead_end_spec.rb b/spec/integration/dead_end_spec.rb index bbbafb8..04c9ce6 100644 --- a/spec/integration/dead_end_spec.rb +++ b/spec/integration/dead_end_spec.rb @@ -138,8 +138,6 @@ module DeadEnd expect(out).to include(<<~EOM) 16 class Rexe - 18 VERSION = '1.5.1' - 20 PROJECT_URL = 'https://github.com/keithrbennett/rexe' ❯ 77 class Lookups ❯ 78 def input_modes ❯ 148 end diff --git a/spec/unit/balance_heuristic_expand_spec.rb b/spec/unit/balance_heuristic_expand_spec.rb deleted file mode 100644 index cc3066a..0000000 --- a/spec/unit/balance_heuristic_expand_spec.rb +++ /dev/null @@ -1,230 +0,0 @@ -# frozen_string_literal: true - -require_relative "../spec_helper" - -module DeadEnd - RSpec.describe BalanceHeuristicExpand do - it "can handle 'unknown' direction code" do - source = <<~'EOM' - parser.on('-r', '--require REQUIRE(S)', - 'Gems and built-in libraries (e.g. shellwords, yaml) to require, comma separated, or ! to clear') do |v| - if v == '!' - options.requires.clear - else - v.split(',').map(&:strip).each do |r| - if r[0] == '-' - options.requires -= [r[1..-1]] - else - options.requires << r - end - end - end - end - EOM - - lines = CleanDocument.new(source: source).call.lines - expand = BalanceHeuristicExpand.new( - code_lines: lines, - block: CodeBlock.new(lines: lines[1]) - ) - - expect(expand.direction).to eq(:both) - expand.call - expect(expand.to_s).to eq(<<~'EOM') - parser.on('-r', '--require REQUIRE(S)', - 'Gems and built-in libraries (e.g. shellwords, yaml) to require, comma separated, or ! to clear') do |v| - if v == '!' - EOM - - expand.call - expect(expand.to_s).to eq(<<~'EOM') - parser.on('-r', '--require REQUIRE(S)', - 'Gems and built-in libraries (e.g. shellwords, yaml) to require, comma separated, or ! to clear') do |v| - if v == '!' 
- options.requires.clear - else - v.split(',').map(&:strip).each do |r| - if r[0] == '-' - options.requires -= [r[1..-1]] - else - options.requires << r - end - end - end - end - EOM - end - - it "does not generate (known) invalid blocks when started at different positions" do - source = <<~EOM - Foo.call do |a - # inner - end # one - - print lol - class Foo - end # two - EOM - lines = CodeLine.from_source(source) - expand = BalanceHeuristicExpand.new( - code_lines: lines, - block: CodeBlock.new(lines: lines[1]) - ) - expect(expand.direction).to eq(:equal) - expand.call - expect(expand.to_s).to eq(<<~'EOM') - Foo.call do |a - # inner - end # one - - print lol - class Foo - end # two - EOM - - expand = BalanceHeuristicExpand.new( - code_lines: lines, - block: CodeBlock.new(lines: lines[0]) - ) - expect(expand.call.to_s).to eq(<<~'EOM') - Foo.call do |a - # inner - end # one - - print lol - class Foo - end # two - EOM - - expand = BalanceHeuristicExpand.new( - code_lines: lines, - block: CodeBlock.new(lines: lines[2]) - ) - expect(expand.direction).to eq(:up) - - expand.call - - expect(expand.to_s).to eq(<<~'EOM') - Foo.call do |a - # inner - end # one - EOM - - expand = BalanceHeuristicExpand.new( - code_lines: lines, - block: CodeBlock.new(lines: lines[3]) - ) - expect(expand.direction).to eq(:equal) - expand.call - expect(expand.to_s).to eq(<<~'EOM') - Foo.call do |a - # inner - end # one - - print lol - EOM - - expand = BalanceHeuristicExpand.new( - code_lines: lines, - block: CodeBlock.new(lines: lines[4]) - ) - expect(expand.direction).to eq(:equal) - expand.call - expect(expand.to_s).to eq(<<~'EOM') - Foo.call do |a - # inner - end # one - - print lol - EOM - - expand = BalanceHeuristicExpand.new( - code_lines: lines, - block: CodeBlock.new(lines: lines[5]) - ) - expect(expand.direction).to eq(:down) - expand.call - expect(expand.to_s).to eq(<<~'EOM') - class Foo - end # two - EOM - end - - it "expands" do - source = <<~EOM - class Blerg - Foo.call do |a - end 
# one - - print lol - class Foo - end # two - end # three - EOM - lines = CodeLine.from_source(source) - expand = BalanceHeuristicExpand.new( - code_lines: lines, - block: CodeBlock.new(lines: lines[5]) - ) - expect(expand.call.to_s).to eq(<<~'EOM'.indent(2)) - class Foo - end # two - EOM - expect(expand.call.to_s).to eq(<<~'EOM'.indent(2)) - Foo.call do |a - end # one - - print lol - class Foo - end # two - EOM - - expect(expand.call.to_s).to eq(<<~'EOM') - class Blerg - Foo.call do |a - end # one - - print lol - class Foo - end # two - end # three - EOM - end - - it "expands up when on an end" do - lines = CodeLine.from_source(<<~'EOM') - Foo.new do - end - EOM - expand = BalanceHeuristicExpand.new( - code_lines: lines, - block: CodeBlock.new(lines: lines[1]) - ) - expect(expand.direction).to eq(:up) - expand.call - expect(expand.direction).to eq(:stop) - - expect(expand.start_index).to eq(0) - expect(expand.end_index).to eq(1) - expect(expand.to_s).to eq(lines.join) - end - - it "expands down when on a keyword" do - lines = CodeLine.from_source(<<~'EOM') - Foo.new do - end - EOM - expand = BalanceHeuristicExpand.new( - code_lines: lines, - block: CodeBlock.new(lines: lines[0]) - ) - expect(expand.direction).to eq(:down) - expand.call - expect(expand.direction).to eq(:stop) - - expect(expand.start_index).to eq(0) - expect(expand.end_index).to eq(1) - expect(expand.to_s).to eq(lines.join) - end - end -end From e5a70a9caa443d38598535aec41f25a1cfee482f Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 9 Feb 2022 11:56:43 -0600 Subject: [PATCH 46/58] WIP Integration with One failing test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This case fails because all "next indent" heuristics for the 3 blocks are exactly the same ``` Block lines: 8..9 (pop) indent: 2 next_indent: 2 1 describe "things" do 2 it "blerg" do 3 end # one 4 5 it "flerg" 6 end # two 7 ❯ 8 it "zlerg" do ❯ 9 end # three 10 end ``` Is expanded to ``` Block 
lines: 2..9 (expand) indent: 2 next_indent: 0 1 describe "things" do ❯ 2 it "blerg" do ❯ 3 end # one 4 ❯ 5 it "flerg" ❯ 6 end # two 7 ❯ 8 it "zlerg" do ❯ 9 end # three 10 end ``` Not great. Also for the case where inner is indented: ``` Block lines: 7..7 (pop) indent: 4 next_indent: 4 1 describe "things" do 2 it "blerg" do 3 print foo 4 end # one 5 6 it "flerg" ❯ 7 print foo 8 end # two 9 10 it "zlerg" do 11 print foo 12 end # three 13 end ``` Is expanded to: ``` Block lines: 3..7 (expand) indent: 2 next_indent: 2 1 describe "things" do 2 it "blerg" do ❯ 3 print foo ❯ 4 end # one 5 ❯ 6 it "flerg" ❯ 7 print foo 8 end # two 9 10 it "zlerg" do 11 print foo 12 end # three 13 end ``` Which is a valid way to remove the syntax error as it removes the end, but it doesn't do it entirely as expected. That's the "wrong" end to remove. --- lib/dead_end/api.rb | 38 +++++++++++++++++++- lib/dead_end/around_block_scan.rb | 4 +-- lib/dead_end/block_node.rb | 4 +++ lib/dead_end/display_invalid_blocks.rb | 21 ++++++----- lib/dead_end/indent_search.rb | 4 +++ spec/integration/dead_end_spec.rb | 48 ++++++++++++-------------- spec/unit/indent_search_spec.rb | 32 +++++++++++++++++ 7 files changed, 115 insertions(+), 36 deletions(-) diff --git a/lib/dead_end/api.rb b/lib/dead_end/api.rb index 80ee597..75e2932 100644 --- a/lib/dead_end/api.rb +++ b/lib/dead_end/api.rb @@ -62,6 +62,42 @@ def self.handle_error(e, re_raise: true, io: $stderr) # # Main private interface def self.call(source:, filename: DEFAULT_VALUE, terminal: DEFAULT_VALUE, record_dir: DEFAULT_VALUE, timeout: TIMEOUT_DEFAULT, io: $stderr) + call_now(source: source, filename: filename, terminal: terminal, record_dir: record_dir, timeout: timeout, io: io) + # call_old(source: source, filename: filename, terminal: terminal, record_dir: record_dir, timeout: timeout, io: io) + end + + def self.call_now(source:, filename: DEFAULT_VALUE, terminal: DEFAULT_VALUE, record_dir: DEFAULT_VALUE, timeout: TIMEOUT_DEFAULT, io: $stderr) + 
search = nil + code_lines = nil + filename = nil if filename == DEFAULT_VALUE + Timeout.timeout(timeout) do + code_lines = CleanDocument.new(source: source).call.lines + + if DeadEnd.valid?(code_lines) + io.puts "Syntax OK" + obj = Object.new + def obj.document_ok?; true; end + return obj + end + + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + end + + blocks = search.finished.map(&:node).map {|node| CodeBlock.new(lines: node.lines) } + # puts search.finished.first.steps.last(2).first.block + + DisplayInvalidBlocks.new( + io: io, + blocks: blocks, + filename: filename, + terminal: terminal, + code_lines: code_lines, + ).call + end + + def self.call_old(source:, filename: DEFAULT_VALUE, terminal: DEFAULT_VALUE, record_dir: DEFAULT_VALUE, timeout: TIMEOUT_DEFAULT, io: $stderr) search = nil filename = nil if filename == DEFAULT_VALUE Timeout.timeout(timeout) do @@ -75,7 +111,7 @@ def self.call(source:, filename: DEFAULT_VALUE, terminal: DEFAULT_VALUE, record_ blocks: blocks, filename: filename, terminal: terminal, - code_lines: search.code_lines + code_lines: search.code_lines, ).call rescue Timeout::Error => e io.puts "Search timed out DEAD_END_TIMEOUT=#{timeout}, run with DEBUG=1 for more info" diff --git a/lib/dead_end/around_block_scan.rb b/lib/dead_end/around_block_scan.rb index 8c12c68..2076850 100644 --- a/lib/dead_end/around_block_scan.rb +++ b/lib/dead_end/around_block_scan.rb @@ -121,7 +121,7 @@ def capture_neighbor_context break end - lines << line + lines << line if line.is_kw? || line.is_end? end lines.reverse! @@ -140,7 +140,7 @@ def capture_neighbor_context break end - lines << line + lines << line if line.is_kw? || line.is_end? 
end lines diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index c282f55..dd754b8 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -241,6 +241,10 @@ def <=>(other) end end + def hidden? + false + end + # Provide meaningful diffs in rspec def inspect "#" diff --git a/lib/dead_end/display_invalid_blocks.rb b/lib/dead_end/display_invalid_blocks.rb index fff6023..8dd05a1 100644 --- a/lib/dead_end/display_invalid_blocks.rb +++ b/lib/dead_end/display_invalid_blocks.rb @@ -6,13 +6,14 @@ module DeadEnd # Used for formatting invalid blocks class DisplayInvalidBlocks - attr_reader :filename + attr_reader :filename, :code_lines - def initialize(code_lines:, blocks:, io: $stderr, filename: nil, terminal: DEFAULT_VALUE) + def initialize(code_lines:, blocks:, io: $stderr, filename: nil, terminal: DEFAULT_VALUE, capture_mode: :old) @io = io @blocks = Array(blocks) @filename = filename @code_lines = code_lines + @capture_mode = capture_mode @terminal = terminal == DEFAULT_VALUE ? 
io.isatty : terminal end @@ -44,12 +45,16 @@ def call code_lines: block.lines ).call - # Enhance code output - # Also handles several ambiguious cases - lines = CaptureCodeContext.new( - blocks: block, - code_lines: @code_lines - ).call + if @capture_mode == :old + # Enhance code output + # Also handles several ambiguious cases + lines = CaptureCodeContext.new( + blocks: block, + code_lines: @code_lines + ).call + else + lines = block.lines + end # Build code output document = DisplayCodeWithLineNumbers.new( diff --git a/lib/dead_end/indent_search.rb b/lib/dead_end/indent_search.rb index 5aca0cc..4b12e0d 100644 --- a/lib/dead_end/indent_search.rb +++ b/lib/dead_end/indent_search.rb @@ -89,4 +89,8 @@ def call ) end end + + def inspect + "#" + end end diff --git a/spec/integration/dead_end_spec.rb b/spec/integration/dead_end_spec.rb index 04c9ce6..24ea902 100644 --- a/spec/integration/dead_end_spec.rb +++ b/spec/integration/dead_end_spec.rb @@ -25,11 +25,11 @@ module DeadEnd expect(io.string).to include(<<~'EOM') 6 class SyntaxTree < Ripper - 170 def self.parse(source) - 174 end + 727 class Args + 750 end ❯ 754 def on_args_add(arguments, argument) - ❯ 776 class ArgsAddBlock - ❯ 810 end + 776 class ArgsAddBlock + 810 end 9233 end EOM end @@ -52,10 +52,9 @@ module DeadEnd expect(io.string).to_not include("def ruby_install_binstub_path") expect(io.string).to include(<<~'EOM') - ❯ 1067 def add_yarn_binary - ❯ 1068 return [] if yarn_preinstalled? 
+ 16 class LanguagePack::Ruby < LanguagePack::Base ❯ 1069 | - ❯ 1075 end + 1344 end EOM end @@ -70,10 +69,10 @@ module DeadEnd debug_display(io.string) expect(io.string).to include(<<~'EOM') - 1 Rails.application.routes.draw do + 1 Rails.application.routes.draw do + 107 constraints -> { Rails.application.config.non_production } do + 111 end ❯ 113 namespace :admin do - ❯ 116 match "/foobar(*path)", via: :all, to: redirect { |_params, req| - ❯ 120 } 121 end EOM end @@ -93,7 +92,6 @@ module DeadEnd 22 it "body" do 27 query = Cutlass::FunctionQuery.new( ❯ 28 port: port - ❯ 29 body: body 30 ).call 34 end 35 end @@ -113,12 +111,9 @@ module DeadEnd expect(io.string).to include(<<~'EOM') 5 module DerailedBenchmarks 6 class RequireTree - 7 REQUIRED_BY = {} - 9 attr_reader :name - 10 attr_writer :cost ❯ 13 def initialize(name) - ❯ 18 def self.reset! - ❯ 25 end + 18 def self.reset! + 25 end 73 end 74 end EOM @@ -138,9 +133,11 @@ module DeadEnd expect(out).to include(<<~EOM) 16 class Rexe - ❯ 77 class Lookups + 77 class Lookups ❯ 78 def input_modes - ❯ 148 end + 87 def input_formats + 94 end + 148 end 551 end EOM end @@ -158,10 +155,11 @@ module DeadEnd out = io.string expect(out).to include(<<~EOM) 16 class Rexe - 18 VERSION = '1.5.1' - ❯ 77 class Lookups + 77 class Lookups + 124 def formatters + 137 end ❯ 140 def format_requires - ❯ 148 end + 148 end 551 end EOM end @@ -180,9 +178,9 @@ def call # 0 ) out = io.string expect(out).to include(<<~EOM) - ❯ 1 def call # 0 + 1 def call # 0 ❯ 3 end # one # 2 - ❯ 4 end # two # 3 + 4 end # two # 3 EOM end @@ -200,9 +198,9 @@ def bark ) out = io.string expect(out).to include(<<~EOM) - ❯ 1 class Dog + 1 class Dog ❯ 2 def bark - ❯ 4 end + 4 end EOM end end diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index 553badb..767369b 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -4,6 +4,38 @@ module DeadEnd RSpec.describe IndentSearch do + it "finds missing do in an rspec 
context same indent when the problem is in the middle and blocks do not have inner contents" do + source = <<~'EOM' + describe "things" do + it "blerg" do + print foo + end # one + + it "flerg" + print foo + end # two + + it "zlerg" do + print foo + end # three + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.join).to eq(<<~'EOM'.indent(0)) + { + print ( + } + { + print ) + } + EOM + end + it "won't show valid code when two invalid blocks are splitting it" do source = <<~'EOM' { From fc872fbd3727549b6a47f7732f27fc40b6ae71b6 Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 9 Feb 2022 13:16:39 -0600 Subject: [PATCH 47/58] WIP 8 total failing tests The current block node implementation is optimized for the "hard" cases from the old algorithm, but turns out it doesn't do some of the "basic" stuff the old one did. We've got a few paths to remediate these cases: - We can solve them "in post" in the same way that Capture Context "fixed" otherwise ambiguous or meh results from the old search - We can update the way we're building blocks - We can update the way we're searching blocks I think each of these cases will need something a little different. For the case where I want to return the line missing a "do", where we're grabbing the correct "end", I think we can/should fix that in post. There's a lot of other cases where I'm not quite sure. 
Also worth mentioning that some of these failure modes may change if any values are placed "inside" them for example: ``` def foo def blerg end ``` Returns a match saying the problem is `def foo` but if you put something "in" the `def blerg` it says the problem is `def blerg` ``` def foo def blerg print lol end ``` Let's take it one step at a time, focus on one failure case, see if it breaks other cases when it's fixed...then iterate. --- spec/unit/indent_search_spec.rb | 183 ++++++++++++++++++++++++++++++-- 1 file changed, 177 insertions(+), 6 deletions(-) diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index 767369b..a2944e6 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -4,7 +4,38 @@ module DeadEnd RSpec.describe IndentSearch do + def tmp_capture_context(finished) + code_lines = finished.first.steps[0].block.lines + blocks = finished.map(&:node).map {|node| CodeBlock.new(lines: node.lines )} + lines = CaptureCodeContext.new(blocks: blocks , code_lines: code_lines).call + lines + end + it "finds missing do in an rspec context same indent when the problem is in the middle and blocks do not have inner contents" do + source = <<~'EOM' + describe "things" do + it "blerg" do + end # one + + it "flerg" + end # two + + it "zlerg" do + end # three + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.join).to eq(<<~'EOM'.indent(2)) + end # two + EOM + end + + it "finds missing do in an rspec context same indent when the problem is in the middle and blocks HAVE inner contents" do source = <<~'EOM' describe "things" do it "blerg" do @@ -26,13 +57,153 @@ module DeadEnd tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call + expect(search.finished.join).to 
eq(<<~'EOM'.indent(2)) + end # two + EOM + end + + it "finds a mis-matched def" do + source = <<~'EOM' + def foo + def blerg + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.join).to eq(<<~'EOM'.indent(2)) + def blerg + EOM + end + + it "finds a typo def" do + source = <<~'EOM' + defzfoo + puts "lol" + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + expect(search.finished.join).to eq(<<~'EOM'.indent(0)) - { - print ( - } - { - print ) - } + end + EOM + + lines = tmp_capture_context(search.finished) + expect(lines.join).to eq(<<~'EOM') + defzfoo + end + EOM + end + + it "finds a naked end" do + source = <<~'EOM' + def foo + end # one + end # two + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.join).to eq(<<~'EOM'.indent(0)) + end # two + EOM + + lines = tmp_capture_context(search.finished) + expect(lines.join).to eq(<<~'EOM') + IDK what I want here + EOM + end + + it "finds multiple syntax errors" do + source = <<~'EOM' + describe "hi" do + Foo.call + end # one + end # two + + it "blerg" do + Bar.call + end # three + end # four + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.join).to eq(<<~'EOM'.indent(2)) + end # one + end # three + EOM + + lines = tmp_capture_context(search.finished) + 
expect(lines.join).to include(<<~'EOM'.indent(2)) + Foo.call + end # one + EOM + + expect(lines.join).to include(<<~'EOM'.indent(2)) + Bar.call + end # three + EOM + end + + it "doesn't just return an empty `end`" do + source = <<~'EOM' + Foo.call + end # one + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.join).to eq(<<~'EOM'.indent(0)) + end # one + EOM + + lines = tmp_capture_context(search.finished) + expect(lines.join).to include(<<~'EOM'.indent(0)) + Foo.call + end # one + EOM + end + + it "returns syntax error in outer block without inner block" do + source = <<~'EOM' + Foo.call + def foo + puts "lol" + puts "lol" + end # one + end # two + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.join).to eq(<<~'EOM'.indent(0)) + end # two + EOM + + lines = tmp_capture_context(search.finished) + expect(lines.join).to include(<<~'EOM'.indent(0)) + Foo.call + end # two EOM end From 026aa1b4d48dcfb00520be597f1d17237a070afb Mon Sep 17 00:00:00 2001 From: schneems Date: Thu, 10 Feb 2022 15:52:21 -0600 Subject: [PATCH 48/58] WIP Initial BlockNodeContext Fixes 6 out of 8 of the prior failures. The concept is quite simple as well. If a node leans one way, capture in the direction needed to theoretically fix it until we find a node that is leaning. It fails though when we look at a larger example because blocks tend to self balance for valid code and this produces very large outputs for large inputs. It's not suitable for a final answer, but it's a good first step. I introduced a failing test with "handles heredocs" to capture that need. 
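A rough sketch of the capture idea described above, not the code in this patch. `SketchNode` and `capture_context` are hypothetical names made up for illustration, and the `leaf?` check that the real `BlockNodeContext` performs is left out to keep it short. It assumes each node knows its `above`/`below` neighbor and which way it leans:

```ruby
# Hypothetical stand-in for a block node; not the real DeadEnd::BlockNode.
SketchNode = Struct.new(:text, :leaning, :above, :below, keyword_init: true)

# Starting from a leaning node, collect balanced (:equal) neighbors in the
# direction that could theoretically fix it: up for :right, down for :left.
# Stops at the first neighbor that is itself leaning (or at the document edge).
def capture_context(node)
  captured = [node]
  direction = case node.leaning
              when :right then :above
              when :left then :below
              end
  return captured if direction.nil? # balanced node, nothing to capture

  current = node.public_send(direction)
  while current && current.leaning == :equal
    captured << current
    current = current.public_send(direction)
  end

  # Return nodes in document order
  direction == :above ? captured.reverse : captured
end

it_flerg = SketchNode.new(text: 'it "flerg"', leaning: :equal)
end_two = SketchNode.new(text: "end # two", leaning: :right, above: it_flerg)
it_flerg.below = end_two

puts capture_context(end_two).map(&:text)
```

Because valid code tends to be balanced, a walk like this keeps going for a long time on large inputs, which is the "very large outputs" problem mentioned above.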
--- spec/unit/indent_search_spec.rb | 144 +++++++++++++++++++++++++++----- 1 file changed, 124 insertions(+), 20 deletions(-) diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index a2944e6..41e31e5 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -33,21 +33,31 @@ def tmp_capture_context(finished) expect(search.finished.join).to eq(<<~'EOM'.indent(2)) end # two EOM + + context = BlockNodeContext.new(search.finished[0]).call + expect(context.lines.join).to eq(<<~'EOM'.indent(2)) + it "flerg" + end # two + EOM + + expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) + end # two + EOM end it "finds missing do in an rspec context same indent when the problem is in the middle and blocks HAVE inner contents" do source = <<~'EOM' describe "things" do it "blerg" do - print foo + print foo1 end # one it "flerg" - print foo + print foo2 end # two it "zlerg" do - print foo + print foo3 end # three end EOM @@ -57,7 +67,14 @@ def tmp_capture_context(finished) tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call - expect(search.finished.join).to eq(<<~'EOM'.indent(2)) + context = BlockNodeContext.new(search.finished[0]).call + expect(context.lines.join).to eq(<<~'EOM'.indent(2)) + it "flerg" + print foo + end # two + EOM + + expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) end # two EOM end @@ -74,8 +91,15 @@ def blerg tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call - expect(search.finished.join).to eq(<<~'EOM'.indent(2)) - def blerg + context = BlockNodeContext.new(search.finished[0]).call + expect(context.lines.join).to eq(<<~'EOM'.indent(0)) + def foo + def blerg + end + EOM + + expect(context.highlight.join).to eq(<<~'EOM'.indent(0)) + def foo EOM end @@ -95,9 +119,14 @@ def blerg end EOM - lines = tmp_capture_context(search.finished) - expect(lines.join).to eq(<<~'EOM') + context = 
BlockNodeContext.new(search.finished[0]).call + expect(context.lines.join).to eq(<<~'EOM'.indent(0)) defzfoo + puts "lol" + end + EOM + + expect(context.highlight.join).to eq(<<~'EOM'.indent(0)) end EOM end @@ -118,9 +147,15 @@ def foo end # two EOM - lines = tmp_capture_context(search.finished) - expect(lines.join).to eq(<<~'EOM') - IDK what I want here + context = BlockNodeContext.new(search.finished[0]).call + expect(context.lines.join).to eq(<<~'EOM'.indent(0)) + def foo + end # one + end # two + EOM + + expect(context.highlight.join).to eq(<<~'EOM'.indent(0)) + end # two EOM end @@ -147,16 +182,25 @@ def foo end # three EOM - lines = tmp_capture_context(search.finished) - expect(lines.join).to include(<<~'EOM'.indent(2)) + context = BlockNodeContext.new(search.finished[0]).call + expect(context.lines.join).to eq(<<~'EOM'.indent(2)) Foo.call end # one EOM - expect(lines.join).to include(<<~'EOM'.indent(2)) + expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) + end # one + EOM + + context = BlockNodeContext.new(search.finished[1]).call + expect(context.lines.join).to eq(<<~'EOM'.indent(2)) Bar.call end # three EOM + + expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) + end # three + EOM end it "doesn't just return an empty `end`" do @@ -174,11 +218,52 @@ def foo end # one EOM - lines = tmp_capture_context(search.finished) - expect(lines.join).to include(<<~'EOM'.indent(0)) + context = BlockNodeContext.new(search.finished[0]).call + expect(context.lines.join).to eq(<<~'EOM'.indent(0)) Foo.call end # one EOM + + expect(context.highlight.join).to eq(<<~'EOM'.indent(0)) + end # one + EOM + end + + class BlockNodeContext + attr_reader :blocks + + def initialize(journey) + @journey = journey + @blocks = [] + end + + def call + node = @journey.node + @blocks << node + + if node.leaning == :right && node.leaf? 
+ while @blocks.last.above && @blocks.last.above.leaning == :equal + @blocks << @blocks.last.above + end + end + + if node.leaning == :left && node.leaf? + while @blocks.last.below && @blocks.last.below.leaning == :equal + @blocks << @blocks.last.below + end + end + + @blocks.sort_by! {|block| block.start_index } + self + end + + def highlight + @journey.node.lines + end + + def lines + blocks.flat_map(&:lines).sort_by {|line| line.number } + end end it "returns syntax error in outer block without inner block" do @@ -196,13 +281,17 @@ def foo tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call - expect(search.finished.join).to eq(<<~'EOM'.indent(0)) + context = BlockNodeContext.new(search.finished[0]).call + expect(context.lines.join).to eq(<<~'EOM'.indent(0)) + Foo.call + def foo + puts "lol" + puts "lol" + end # one end # two EOM - lines = tmp_capture_context(search.finished) - expect(lines.join).to include(<<~'EOM'.indent(0)) - Foo.call + expect(context.highlight.join).to eq(<<~'EOM'.indent(0)) end # two EOM end @@ -293,6 +382,21 @@ def foo expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(4)) def input_modes EOM + + context = BlockNodeContext.new(search.finished[0]).call + expect(context.highlight.join).to eq(<<~'EOM'.indent(4)) + def input_modes + EOM + + expect(context.lines.join).to eq(<<~'EOM'.indent(0)) + def input_modes + @input_modes ||= { + 'l' => :line, + 'e' => :enumerator, + 'b' => :one_big_string, + 'n' => :none + } + EOM end it "handles derailed output issues/50" do From 60b4ed3da343a4f21949dfdd749835ec4a0d31ac Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 15 Feb 2022 10:22:40 -0600 Subject: [PATCH 49/58] Update block building We were making some questionable decisions when building blocks. I refactored out the IndentTree building so that it can be done step by step so we can observe it. 
This allows for tests that are quite verbose, but should allow for use of intuition (as in "that shouldn't be the next step") as blocks are being built.

I found that some nodes at the same indent were being expanded to cover the entire "inner" before other blocks could capture them. For example in:

```
describe "things" do
  it "blerg" do
    print foo1
  end # one

  it "flerg"
    print foo2
  end # two

  it "zlerg" do
    print foo3
  end # three
end
```

This block:

```
it "zlerg" do
  print foo3
end # three
```

Would expand to capture the above `end # two` even though logically it should be captured by the `print foo2` block.

To accommodate this I added an extra condition to block node expansion: do not capture a node leaning in the opposite direction of expansion if that node is also a leaf. Essentially if a node is leaning right, it needs to expand up. I want either it to expand itself, or for its "inner" equal block to expand out to capture it.

The other case I added was to enforce the indentation check for nodes leaning opposite of the expansion direction, but only on first expansion (otherwise we end up with "split" blocks where one block will not engulf all inner blocks).

Those changes exposed a problem with the `next_indent` calculation where sometimes it would come back at a higher value than the current indent, which is not correct. This is fixed by adding a final check/guarantee when deriving that value.

This change seems good in isolation but is causing a lot of test failures due to tight coupling between tests and implementation. I need to go back and re-work the tests to see if there are any fundamental "disagreements" or if they just need to be updated to new/better values.
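A minimal sketch of the two rules above, under simplifying assumptions. `SketchNode` and `clamp_next_indent` are hypothetical names for illustration only; the `expand_above?` body follows the shape of the method in this diff, but the node is a bare Struct rather than the real `BlockNode`, and it assumes `end # two` sits directly above the `it "zlerg"` block:

```ruby
# Hypothetical stand-in for DeadEnd::BlockNode with only the fields the
# check below reads.
SketchNode = Struct.new(:text, :indent, :leaning, :leaf, :above, keyword_init: true) do
  def leaf?
    leaf
  end

  # Follows the shape of `expand_above?` after this change
  def expand_above?(with_indent: indent)
    return false if above.nil?
    return false if leaf? && leaning == :left
    # New guard: a right-leaning leaf above is left alone so that it can
    # expand up itself (or be captured by its own "inner" block)
    return false if above.leaf? && above.leaning == :right

    if above.leaning == :left || (above.leaning == :right && leaf?)
      above.indent >= with_indent
    else
      true
    end
  end
end

# Final guarantee for `next_indent`: the derived value is never allowed to
# be larger than the node's own indent.
def clamp_next_indent(value, node_indent)
  [value, node_indent].min
end

end_two = SketchNode.new(text: "end # two", indent: 2, leaning: :right, leaf: true)
zlerg = SketchNode.new(text: 'it "zlerg" do', indent: 2, leaning: :equal, leaf: false, above: end_two)

puts zlerg.expand_above?     # false
puts clamp_next_indent(4, 2) # 2
```

With the guard in place the `it "zlerg"` block stops short of `end # two`, which is the behavior described for the example above.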
There's one problem fundamental to the :both case that seems not well handled here --- lib/dead_end/block_node.rb | 10 +- lib/dead_end/indent_tree.rb | 59 ++-- spec/unit/indent_search_spec.rb | 578 +++++++++++++++++++++++++++++++- spec/unit/indent_tree_spec.rb | 256 +++++++++++++- 4 files changed, 860 insertions(+), 43 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index dd754b8..6d4effd 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -98,8 +98,9 @@ def initialize(lines:, indent:, next_indent: nil, lex_diff: nil, parents: []) def expand_above?(with_indent: indent) return false if above.nil? return false if leaf? && leaning == :left + return false if above.leaf? && above.leaning == :right - if above.leaning == :left + if above.leaning == :left || (above.leaning == :right && leaf?) above.indent >= with_indent else true @@ -115,8 +116,9 @@ def expand_above?(with_indent: indent) def expand_below?(with_indent: indent) return false if below.nil? return false if leaf? && leaning == :right + return false if below.leaf? && below.leaning == :left - if below.leaning == :right + if below.leaning == :right || (below.leaning == :left && leaf?) below.indent >= with_indent else true @@ -142,7 +144,7 @@ def leaf? def self.next_indent(above, node, below) return node.indent if node.expand_above? || node.expand_below? - if above + value = if above if below case above.indent <=> below.indent when 1 then below.indent @@ -157,6 +159,8 @@ def self.next_indent(above, node, below) else node.indent end + + value > node.indent ? 
node.indent : value end # Calculating the next_indent must be done after above and below diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 533c0a9..15dd35e 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -28,38 +28,51 @@ def initialize(document:, record_dir: DEFAULT_VALUE) @recorder = BlockRecorder.from_dir(record_dir, subdir: "build_tree", code_lines: @code_lines) end + def peek + document.peek + end + def root @document.root end - def call - while (block = document.pop) - @recorder.capture(block, name: "pop") + def step + block = document.pop + return nil if block.nil? + + @recorder.capture(block, name: "pop") + + blocks = [block] + indent = block.next_indent - blocks = [block] - indent = block.next_indent + # Look up + while blocks.last.expand_above?(with_indent: indent) + above = blocks.last.above + blocks << above + break if above.leaning == :left + end - # Look up - while blocks.last.expand_above?(with_indent: indent) - above = blocks.last.above - blocks << above - break if above.leaning == :left - end + blocks.reverse! - blocks.reverse! 
+ # Look down + while blocks.last.expand_below?(with_indent: indent) + below = blocks.last.below + blocks << below + break if below.leaning == :right + end - # Look down - while blocks.last.expand_below?(with_indent: indent) - below = blocks.last.below - blocks << below - break if below.leaning == :right - end + if blocks.length > 1 + now = document.capture_all(blocks) + @recorder.capture(now, name: "expand") + document.queue << now + now + else + block + end + end - if blocks.length > 1 - node = document.capture_all(blocks) - @recorder.capture(node, name: "expand") - document.queue << node - end + def call + while step end self end diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index 41e31e5..9463c50 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -30,7 +30,7 @@ def tmp_capture_context(finished) tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call - expect(search.finished.join).to eq(<<~'EOM'.indent(2)) + expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) end # two EOM @@ -39,10 +39,6 @@ def tmp_capture_context(finished) it "flerg" end # two EOM - - expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) - end # two - EOM end it "finds missing do in an rspec context same indent when the problem is in the middle and blocks HAVE inner contents" do @@ -68,13 +64,13 @@ def tmp_capture_context(finished) search = IndentSearch.new(tree: tree).call context = BlockNodeContext.new(search.finished[0]).call - expect(context.lines.join).to eq(<<~'EOM'.indent(2)) - it "flerg" - print foo + expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) end # two EOM - expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) + expect(context.lines.join).to eq(<<~'EOM'.indent(2)) + it "flerg" + print foo2 end # two EOM end @@ -369,6 +365,570 @@ def foo EOM end + it "smaller rexe input_modes" do + source = <<~'EOM' + class Lookups + def input_modes + @input_modes ||= { + 'l' 
=> :line, + 'e' => :enumerator, + 'b' => :one_big_string, + 'n' => :none + } + # missing end problem here + + + def input_formats + @input_formats ||= { + 'j' => :json, + 'm' => :marshal, + 'n' => :none, + 'y' => :yaml, + } + end + + + def input_parsers + @input_parsers ||= { + json: ->(string) { JSON.parse(string) }, + marshal: ->(string) { Marshal.load(string) }, + none: ->(string) { string }, + yaml: ->(string) { YAML.load(string) }, + } + end + + + def output_formats + @output_formats ||= { + 'a' => :amazing_print, + 'i' => :inspect, + 'j' => :json, + 'J' => :pretty_json, + 'm' => :marshal, + 'n' => :none, + 'p' => :puts, # default + 'P' => :pretty_print, + 's' => :to_s, + 'y' => :yaml, + } + end + + + def formatters + @formatters ||= { + amazing_print: ->(obj) { obj.ai + "\n" }, + inspect: ->(obj) { obj.inspect + "\n" }, + json: ->(obj) { obj.to_json }, + marshal: ->(obj) { Marshal.dump(obj) }, + none: ->(_obj) { nil }, + pretty_json: ->(obj) { JSON.pretty_generate(obj) }, + pretty_print: ->(obj) { obj.pretty_inspect }, + puts: ->(obj) { require 'stringio'; sio = StringIO.new; sio.puts(obj); sio.string }, + to_s: ->(obj) { obj.to_s + "\n" }, + yaml: ->(obj) { obj.to_yaml }, + } + end + + + def format_requires + @format_requires ||= { + json: 'json', + pretty_json: 'json', + amazing_print: 'amazing_print', + pretty_print: 'pp', + yaml: 'yaml' + } + end + end + + + + class CommandLineParser + + include Helpers + + attr_reader :lookups, :options + + def initialize + @lookups = Lookups.new + @options = Options.new + end + + + # Inserts contents of REXE_OPTIONS environment variable at the beginning of ARGV. + private def prepend_environment_options + env_opt_string = ENV['REXE_OPTIONS'] + if env_opt_string + args_to_prepend = Shellwords.shellsplit(env_opt_string) + ARGV.unshift(args_to_prepend).flatten! 
+ end + end + + + private def add_format_requires_to_requires_list + formats = [options.input_format, options.output_format, options.log_format] + requires = formats.map { |format| lookups.format_requires[format] }.uniq.compact + requires.each { |r| options.requires << r } + end + + + private def help_text + unless @help_text + @help_text ||= <<~HEREDOC + + rexe -- Ruby Command Line Executor/Filter -- v#{VERSION} -- #{PROJECT_URL} + + Executes Ruby code on the command line, + optionally automating management of standard input and standard output, + and optionally parsing input and formatting output with YAML, JSON, etc. + + rexe [options] [Ruby source code] + + Options: + + -c --clear_options Clear all previous command line options specified up to now + -f --input_file Use this file instead of stdin for preprocessed input; + if filespec has a YAML and JSON file extension, + sets input format accordingly and sets input mode to -mb + -g --log_format FORMAT Log format, logs to stderr, defaults to -gn (none) + (see -o for format options) + -h, --help Print help and exit + -i, --input_format FORMAT Input format, defaults to -in (None) + -ij JSON + -im Marshal + -in None (default) + -iy YAML + -l, --load RUBY_FILE(S) Ruby file(s) to load, comma separated; + ! 
to clear all, or precede a name with '-' to remove + -m, --input_mode MODE Input preprocessing mode (determines what `self` will be) + defaults to -mn (none) + -ml line; each line is ingested as a separate string + -me enumerator (each_line on STDIN or File) + -mb big string; all lines combined into one string + -mn none (default); no input preprocessing; + self is an Object.new + -n, --[no-]noop Do not execute the code (useful with -g); + For true: yes, true, y, +; for false: no, false, n + -o, --output_format FORMAT Output format, defaults to -on (no output): + -oa Amazing Print + -oi Inspect + -oj JSON + -oJ Pretty JSON + -om Marshal + -on No Output (default) + -op Puts + -oP Pretty Print + -os to_s + -oy YAML + If 2 letters are provided, 1st is for tty devices, 2nd for block + --project-url Outputs project URL on Github, then exits + -r, --require REQUIRE(S) Gems and built-in libraries to require, comma separated; + ! to clear all, or precede a name with '-' to remove + -v, --version Prints version and exits + + --------------------------------------------------------------------------------------- + + In many cases you will need to enclose your source code in single or double quotes. + + If source code is not specified, it will default to 'self', + which is most likely useful only in a filter mode (-ml, -me, -mb). + + If there is a .rexerc file in your home directory, it will be run as Ruby code + before processing the input. + + If there is a REXE_OPTIONS environment variable, its content will be prepended + to the command line so that you can specify options implicitly + (e.g. `export REXE_OPTIONS="-r amazing_print,yaml"`) + + HEREDOC + + @help_text.freeze + end + + @help_text + end + + + # File file input mode; detects the input mode (JSON, YAML, or None) from the extension. 
+ private def autodetect_file_format(filespec) + extension = File.extname(filespec).downcase + if extension == '.json' + :json + elsif extension == '.yml' || extension == '.yaml' + :yaml + else + :none + end + end + + + private def open_resource(resource_identifier) + command = case (`uname`.chomp) + when 'Darwin' + 'open' + when 'Linux' + 'xdg-open' + else + 'start' + end + + `#{command} #{resource_identifier}` + end + + + # Using 'optparse', parses the command line. + # Settings go into this instance's properties (see Struct declaration). + def parse + + prepend_environment_options + + OptionParser.new do |parser| + + parser.on('-c', '--clear_options', "Clear all previous command line options") do |v| + options.clear + end + + parser.on('-f', '--input_file FILESPEC', + 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| + unless File.exist?(v) + raise "File #{v} does not exist." + end + options.input_filespec = v + options.input_format = autodetect_file_format(v) + if [:json, :yaml].include?(options.input_format) + options.input_mode = :one_big_string + end + end + + parser.on('-g', '--log_format FORMAT', 'Log format, logs to stderr, defaults to none (see -o for format options)') do |v| + options.log_format = lookups.output_formats[v] + if options.log_format.nil? + raise("Output mode was '#{v}' but must be one of #{lookups.output_formats.keys}.") + end + end + + parser.on("-h", "--help", "Show help") do |_help_requested| + puts help_text + exit + end + + parser.on('-i', '--input_format FORMAT', + 'Mode with which to parse input values (n = none (default), j = JSON, m = Marshal, y = YAML') do |v| + + options.input_format = lookups.input_formats[v] + if options.input_format.nil? + raise("Input mode was '#{v}' but must be one of #{lookups.input_formats.keys}.") + end + end + + parser.on('-l', '--load RUBY_FILE(S)', 'Ruby file(s) to load, comma separated, or ! to clear') do |v| + if v == '!' 
+ options.loads.clear + else + loadfiles = v.split(',').map(&:strip).map { |s| File.expand_path(s) } + removes, adds = loadfiles.partition { |filespec| filespec[0] == '-' } + + existent, nonexistent = adds.partition { |filespec| File.exists?(filespec) } + if nonexistent.any? + raise("\nDid not find the following files to load: #{nonexistent}\n\n") + else + existent.each { |filespec| options.loads << filespec } + end + + removes.each { |filespec| options.loads -= [filespec[1..-1]] } + end + end + + parser.on('-m', '--input_mode MODE', + 'Mode with which to handle input (-ml, -me, -mb, -mn (default)') do |v| + + options.input_mode = lookups.input_modes[v] + if options.input_mode.nil? + raise("Input mode was '#{v}' but must be one of #{lookups.input_modes.keys}.") + end + end + + # See https://stackoverflow.com/questions/54576873/ruby-optionparser-short-code-for-boolean-option + # for an excellent explanation of this optparse incantation. + # According to the answer, valid options are: + # -n no, -n yes, -n false, -n true, -n n, -n y, -n +, but not -n -. + parser.on('-n', '--[no-]noop [FLAG]', TrueClass, "Do not execute the code (useful with -g)") do |v| + options.noop = (v.nil? ? true : v) + end + + parser.on('-o', '--output_format FORMAT', + 'Mode with which to format values for output (`-o` + [aijJmnpsy])') do |v| + options.output_format_tty = lookups.output_formats[v[0]] + options.output_format_block = lookups.output_formats[v[-1]] + options.output_format = ($stdout.tty? ? options.output_format_tty : options.output_format_block) + if [options.output_format_tty, options.output_format_block].include?(nil) + raise("Bad output mode '#{v}'; each must be one of #{lookups.output_formats.keys}.") + end + end + + parser.on('-r', '--require REQUIRE(S)', + 'Gems and built-in libraries (e.g. shellwords, yaml) to require, comma separated, or ! to clear') do |v| + if v == '!' 
+ options.requires.clear + else + v.split(',').map(&:strip).each do |r| + if r[0] == '-' + options.requires -= [r[1..-1]] + else + options.requires << r + end + end + end + end + + parser.on('-v', '--version', 'Print version') do + puts VERSION + exit(0) + end + + # Undocumented feature: open Github project with default web browser on a Mac + parser.on('', '--open-project') do + open_resource(PROJECT_URL) + exit(0) + end + + parser.on('', '--project-url') do + puts PROJECT_URL + exit(0) + end + + end.parse! + + # We want to do this after all options have been processed because we don't want any clearing of the + # options (by '-c', etc.) to result in exclusion of these needed requires. + add_format_requires_to_requires_list + + options.requires = options.requires.sort.uniq + options.loads.uniq! + + options + + end + end + + + class Main + + include Helpers + + attr_reader :callable, :input_parser, :lookups, + :options, :output_formatter, + :log_formatter, :start_time, :user_source_code + + + def initialize + @lookups = Lookups.new + @start_time = DateTime.now + end + + + private def load_global_config_if_exists + filespec = File.join(Dir.home, '.rexerc') + load(filespec) if File.exists?(filespec) + end + + + private def init_parser_and_formatters + @input_parser = lookups.input_parsers[options.input_format] + @output_formatter = lookups.formatters[options.output_format] + @log_formatter = lookups.formatters[options.log_format] + end + + + # Executes the user specified code in the manner appropriate to the input mode. + # Performs any optionally specified parsing on input and formatting on output. 
+ private def execute(eval_context_object, code) + if options.input_format != :none && options.input_mode != :none + eval_context_object = input_parser.(eval_context_object) + end + + value = eval_context_object.instance_eval(&code) + + unless options.output_format == :none + print output_formatter.(value) + end + rescue Errno::EPIPE + exit(-13) + end + + + # The global $RC (Rexe Context) OpenStruct is available in your user code. + # In order to make it possible to access this object in your loaded files, we are not creating + # it here; instead we add properties to it. This way, you can initialize an OpenStruct yourself + # in your loaded code and it will still work. If you do that, beware, any properties you add will be + # included in the log output. If the to_s of your added objects is large, that might be a pain. + private def init_rexe_context + $RC ||= OpenStruct.new + $RC.count = 0 + $RC.rexe_version = VERSION + $RC.start_time = start_time.iso8601 + $RC.source_code = user_source_code + $RC.options = options.to_h + + def $RC.i; count end # `i` aliases `count` so you can more concisely get the count in your user code + end + + + private def create_callable + eval("Proc.new { #{user_source_code} }") + end + + + private def lookup_action(mode) + input = options.input_filespec ? 
File.open(options.input_filespec) : STDIN + { + line: -> { input.each { |l| execute(l.chomp, callable); $RC.count += 1 } }, + enumerator: -> { execute(input.each_line, callable); $RC.count += 1 }, + one_big_string: -> { big_string = input.read; execute(big_string, callable); $RC.count += 1 }, + none: -> { execute(Object.new, callable) } + }.fetch(mode) + end + + + private def output_log_entry + if options.log_format != :none + $RC.duration_secs = Time.now - start_time.to_time + STDERR.puts(log_formatter.($RC.to_h)) + end + end + + + # Bypasses Bundler's restriction on loading gems + # (see https://stackoverflow.com/questions/55144094/bundler-doesnt-permit-using-gems-in-project-home-directory) + private def require!(the_require) + begin + require the_require + rescue LoadError => error + gem_path = `gem which #{the_require}` + if gem_path.chomp.strip.empty? + raise error # re-raise the error, can't fix it + else + load_dir = File.dirname(gem_path) + $LOAD_PATH += load_dir + require the_require + end + end + end + + + # This class' entry point. + def call + + try do + + @options = CommandLineParser.new.parse + + options.requires.each { |r| require!(r) } + load_global_config_if_exists + options.loads.each { |file| load(file) } + + @user_source_code = ARGV.join(' ') + @user_source_code = 'self' if @user_source_code == '' + + @callable = create_callable + + init_rexe_context + init_parser_and_formatters + + # This is where the user's source code will be executed; the action will in turn call `execute`. 
+ lookup_action(options.input_mode).call unless options.noop + + output_log_entry + end + end + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(2)) + def input_modes + EOM + + search.finished[0].node.below.parents.each do |p| + puts '--' + puts p + end + end + + it "handles heredocs indentation building microcase outside missing end" do + source = <<~'EOM' + parser.on('-c', '--clear_options', "Clear all previous command line options") do |v| + options.clear + + parser.on('-f', '--input_file FILESPEC', + 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| + unless File.exist?(v) + raise "File #{v} does not exist." + end + options.input_filespec = v + options.input_format = autodetect_file_format(v) + if [:json, :yaml].include?(options.input_format) + options.input_mode = :one_big_string + end + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(0)) + parser.on('-c', '--clear_options', "Clear all previous command line options") do |v| + EOM + end + + it "rexe missing if microcase" do + source = <<~'EOM' + parser.on('-c', '--clear_options', "Clear all previous command line options") do |v| + options.clear + end # one + + parser.on('-f', '--input_file FILESPEC', + 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| + unless File.exist?(v) + raise "File #{v} does not exist." 
+ end # two + options.input_filespec = v + options.input_format = autodetect_file_format(v) + + + # missing if here: if [:json, :yaml].include?(options.input_format) + options.input_mode = :one_big_string + end # three + end # four + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + context = BlockNodeContext.new(search.finished[0]).call + # expect(context.highlight.join).to eq(<<~'EOM'.indent(4)) + # def input_modes + # EOM + + expect(context.lines.join).to eq(<<~'EOM'.indent(0)) + def input_modes + @input_modes ||= { + 'l' => :line, + 'e' => :enumerator, + 'b' => :one_big_string, + 'n' => :none + } + EOM + end + it "handles heredocs" do lines = fixtures_dir.join("rexe.rb.txt").read.lines lines.delete_at(85 - 1) diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 25617b1..e5120c7 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,246 @@ module DeadEnd RSpec.describe IndentTree do + it "finds missing do in an rspec context same indent when the problem is in the middle and blocks HAVE inner contents" do + source = <<~'EOM' + describe "things" do + it "blerg" do + print foo1 + end # one + + it "flerg" + print foo2 + end # two + + it "zlerg" do + print foo3 + end # three + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document) + + node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(4)) + print foo2 + EOM + node = tree.step + + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + it "flerg" + print foo2 + EOM + + node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(4)) + print foo3 + EOM + node = tree.step + + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + it "zlerg" do + print foo3 + end # three + EOM + + 
node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(4)) + print foo1 + EOM + node = tree.step + + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + it "blerg" do + print foo1 + end # one + EOM + + node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + end # two + EOM + + node = tree.step + + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + it "blerg" do + print foo1 + end # one + it "flerg" + print foo2 + end # two + EOM + + node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + it "blerg" do + print foo1 + end # one + it "flerg" + print foo2 + end # two + EOM + + node = tree.step + + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + it "blerg" do + print foo1 + end # one + it "flerg" + print foo2 + end # two + it "zlerg" do + print foo3 + end # three + EOM + + node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + it "blerg" do + print foo1 + end # one + it "flerg" + print foo2 + end # two + it "zlerg" do + print foo3 + end # three + EOM + + node = tree.step + expect(node.join).to eq(<<~'EOM') + describe "things" do + it "blerg" do + print foo1 + end # one + it "flerg" + print foo2 + end # two + it "zlerg" do + print foo3 + end # three + end + EOM + end + + it "rexe missing if microcase" do + source = <<~'EOM' + parser.on('-c', '--clear_options', "Clear all previous command line options") do |v| + options.clear + end # one + + parser.on('-f', '--input_file FILESPEC', + 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| + unless File.exist?(v) + raise "File #{v} does not exist." 
+ end # two + options.input_filespec = v + options.input_format = autodetect_file_format(v) + + + # missing if here: if [:json, :yaml].include?(options.input_format) + options.input_mode = :one_big_string + end # three + end # four + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document) + + node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(4)) + options.input_mode = :one_big_string + EOM + node = tree.step + + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + options.input_filespec = v + options.input_format = autodetect_file_format(v) + options.input_mode = :one_big_string + EOM + + node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(4)) + raise "File #{v} does not exist." + EOM + node = tree.step + + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + unless File.exist?(v) + raise "File #{v} does not exist." + end # two + EOM + + node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + end # three + EOM + + node = tree.step + + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + unless File.exist?(v) + raise "File #{v} does not exist." + end # two + options.input_filespec = v + options.input_format = autodetect_file_format(v) + options.input_mode = :one_big_string + end # three + EOM + + node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(4)) + 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| + EOM + + node = tree.step + expect(node.to_s).to eq(<<~'EOM'.indent(0)) + parser.on('-f', '--input_file FILESPEC', + 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| + unless File.exist?(v) + raise "File #{v} does not exist." 
+ end # two + options.input_filespec = v + options.input_format = autodetect_file_format(v) + options.input_mode = :one_big_string + end # three + EOM + + node = tree.peek + expect(node.to_s).to eq(<<~'EOM'.indent(2)) + options.clear + EOM + + node = tree.step + expect(node.to_s).to eq(<<~'EOM'.indent(0)) + parser.on('-c', '--clear_options', "Clear all previous command line options") do |v| + options.clear + end # one + EOM + + # Problem here is that "four" is not captured by the lower block, but by this upper block + node = tree.step + expect(node.to_s).to eq(<<~'EOM'.indent(0)) + parser.on('-c', '--clear_options', "Clear all previous command line options") do |v| + options.clear + end # one + parser.on('-f', '--input_file FILESPEC', + 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| + unless File.exist?(v) + raise "File #{v} does not exist." + end # two + options.input_filespec = v + options.input_format = autodetect_file_format(v) + options.input_mode = :one_big_string + end # three + end # four + EOM + end + # If you put an indented "print" in there then # the problem goes away, I think it's fine to not handle # this (hopefully rare) case. 
If we showed you there was a problem @@ -117,9 +357,9 @@ def speak it "invalid if and else" do source = <<~'EOM' if true - puts ( + print ( else - puts } + print } end EOM @@ -133,17 +373,17 @@ def speak node = diagnose.next[0] expect(node.to_s).to eq(<<~'EOM') - puts ( + print ( else - puts } + print } EOM diagnose = DiagnoseNode.new(node).call expect(diagnose.problem).to eq(:remove_pseudo_pair) node = diagnose.next[0] expect(node.to_s).to eq(<<~'EOM'.indent(2)) - puts ( - puts } + print ( + print } EOM diagnose = DiagnoseNode.new(node).call @@ -152,7 +392,7 @@ def speak expect(forks.length).to eq(2) expect(forks.first.to_s).to eq(<<~'EOM'.indent(2)) - puts ( + print ( EOM expect(forks.last.to_s).to eq(<<~'EOM'.indent(2)) @@ -463,7 +703,7 @@ def filename source = <<~'EOM' class Dog def bark - puts "woof" + print "woof" end EOM code_lines = CleanDocument.new(source: source).call.lines From ee2c9fd2e0f6f878a67579d5d2fe33f0b63a2bdb Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 16 Feb 2022 11:27:45 -0600 Subject: [PATCH 50/58] WIP WIP WIP???? --- lib/dead_end/block_node.rb | 7 +- lib/dead_end/indent_tree.rb | 4 +- spec/unit/indent_search_spec.rb | 63 +++++++++++-- spec/unit/indent_tree_spec.rb | 158 +++++++++++++++++++------------- 4 files changed, 155 insertions(+), 77 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 6d4effd..bd69a29 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -98,9 +98,11 @@ def initialize(lines:, indent:, next_indent: nil, lex_diff: nil, parents: []) def expand_above?(with_indent: indent) return false if above.nil? return false if leaf? && leaning == :left + return true if leaf? && leaning == :both && above.leaning == :left return false if above.leaf? && above.leaning == :right - if above.leaning == :left || (above.leaning == :right && leaf?) + + if above.leaning == :left || above.leaning == :both || leaf? 
&& above.leaning == :right above.indent >= with_indent else true @@ -116,9 +118,10 @@ def expand_above?(with_indent: indent) def expand_below?(with_indent: indent) return false if below.nil? return false if leaf? && leaning == :right + return true if leaf? && leaning == :both && above.leaning == :right return false if below.leaf? && below.leaning == :left - if below.leaning == :right || (below.leaning == :left && leaf?) + if below.leaning == :right || below.leaning == :both || leaf? && below.leaning == :left below.indent >= with_indent else true diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index 15dd35e..d2882c4 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -49,7 +49,7 @@ def step while blocks.last.expand_above?(with_indent: indent) above = blocks.last.above blocks << above - break if above.leaning == :left + break if above.leaning == :left || above.leaning == :both end blocks.reverse! @@ -58,7 +58,7 @@ def step while blocks.last.expand_below?(with_indent: indent) below = blocks.last.below blocks << below - break if below.leaning == :right + break if below.leaning == :right || below.leaning == :both end if blocks.length > 1 diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index 9463c50..40832cf 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -11,6 +11,34 @@ def tmp_capture_context(finished) lines end + it "large both" do + source = <<~'EOM' + [ + one, + two, + three + ].each do |i| + print i { + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document).call + search = IndentSearch.new(tree: tree).call + + context = BlockNodeContext.new(search.finished[0]).call + expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) + end # two + EOM + + expect(context.lines.join).to eq(<<~'EOM'.indent(2)) + it "flerg" + end # two + EOM + 
end + + it "finds missing do in an rspec context same indent when the problem is in the middle and blocks do not have inner contents" do source = <<~'EOM' describe "things" do @@ -30,11 +58,11 @@ def tmp_capture_context(finished) tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call + context = BlockNodeContext.new(search.finished[0]).call expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) end # two EOM - context = BlockNodeContext.new(search.finished[0]).call expect(context.lines.join).to eq(<<~'EOM'.indent(2)) it "flerg" end # two @@ -343,11 +371,17 @@ def foo it "invalid if and else" do source = <<~'EOM' + def dog + end + if true puts ( else puts } end + + def cat + end EOM code_lines = CleanDocument.new(source: source).call.lines @@ -355,14 +389,24 @@ def foo tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call - expect(search.finished.length).to eq(2) - expect(search.finished.first.to_s).to eq(<<~'EOM'.indent(2)) - puts ( - EOM - expect(search.finished.last.to_s).to eq(<<~'EOM'.indent(2)) - puts } + context = BlockNodeContext.new(search.finished[0]).call + expect(search.finished.join).to eq(<<~'EOM'.indent(0)) + if true + puts ( + else + puts } + end EOM + + # expect(search.finished.length).to eq(2) + # expect(search.finished.first.to_s).to eq(<<~'EOM'.indent(2)) + # puts ( + # EOM + + # expect(search.finished.last.to_s).to eq(<<~'EOM'.indent(2)) + # puts } + # EOM end it "smaller rexe input_modes" do @@ -914,9 +958,8 @@ def input_modes search = IndentSearch.new(tree: tree).call context = BlockNodeContext.new(search.finished[0]).call - # expect(context.highlight.join).to eq(<<~'EOM'.indent(4)) - # def input_modes - # EOM + expect(context.highlight.join).to eq(<<~'EOM'.indent(4)) + EOM expect(context.lines.join).to eq(<<~'EOM'.indent(0)) def input_modes diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index e5120c7..34ee8c1 100644 --- 
a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -4,6 +4,65 @@ module DeadEnd RSpec.describe IndentTree do + it "large both" do + source = <<~'EOM' + [ + one, + two, + three + ].each do |i| + print i { + end + EOM + + code_lines = CleanDocument.new(source: source).call.lines + document = BlockDocument.new(code_lines: code_lines).call + tree = IndentTree.new(document: document) + + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(2)) + three + EOM + + expect(tree.step.to_s).to eq(<<~'EOM'.indent(2)) + one, + two, + three + EOM + + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(2)) + print i { + EOM + + expect(tree.step.to_s).to eq(<<~'EOM'.indent(0)) + print i { + end + EOM + + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(2)) + one, + two, + three + EOM + + expect(tree.step.to_s).to eq(<<~'EOM'.indent(0)) + [ + one, + two, + three + ].each do |i| + EOM + + expect(tree.step.to_s).to eq(<<~'EOM'.indent(0)) + [ + one, + two, + three + ].each do |i| + print i { + end + EOM + end + it "finds missing do in an rspec context same indent when the problem is in the middle and blocks HAVE inner contents" do source = <<~'EOM' describe "things" do @@ -25,8 +84,7 @@ module DeadEnd document = BlockDocument.new(code_lines: code_lines).call tree = IndentTree.new(document: document) - node = tree.peek - expect(node.to_s).to eq(<<~'EOM'.indent(4)) + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(4)) print foo2 EOM node = tree.step @@ -114,7 +172,7 @@ module DeadEnd EOM node = tree.step - expect(node.join).to eq(<<~'EOM') + expect(node.to_s).to eq(<<~'EOM') describe "things" do it "blerg" do print foo1 @@ -154,25 +212,30 @@ module DeadEnd document = BlockDocument.new(code_lines: code_lines).call tree = IndentTree.new(document: document) - node = tree.peek - expect(node.to_s).to eq(<<~'EOM'.indent(4)) - options.input_mode = :one_big_string + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(4)) + options.input_mode = :one_big_string EOM - node = tree.step - expect(node.to_s).to 
eq(<<~'EOM'.indent(2)) + expect(tree.step.to_s).to eq(<<~'EOM'.indent(2)) options.input_filespec = v options.input_format = autodetect_file_format(v) options.input_mode = :one_big_string EOM - node = tree.peek - expect(node.to_s).to eq(<<~'EOM'.indent(4)) + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(4)) + 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| + EOM + + expect(tree.step.to_s).to eq(<<~'EOM'.indent(0)) + parser.on('-f', '--input_file FILESPEC', + 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| + EOM + + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(4)) raise "File #{v} does not exist." EOM - node = tree.step - expect(node.to_s).to eq(<<~'EOM'.indent(2)) + expect(tree.step.to_s).to eq(<<~'EOM'.indent(2)) unless File.exist?(v) raise "File #{v} does not exist." end # two @@ -182,7 +245,6 @@ module DeadEnd expect(node.to_s).to eq(<<~'EOM'.indent(2)) end # three EOM - node = tree.step expect(node.to_s).to eq(<<~'EOM'.indent(2)) @@ -195,12 +257,9 @@ module DeadEnd end # three EOM - node = tree.peek - expect(node.to_s).to eq(<<~'EOM'.indent(4)) - 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| - EOM - + expect(tree.peek).to eq(node) node = tree.step + expect(node.to_s).to eq(<<~'EOM'.indent(0)) parser.on('-f', '--input_file FILESPEC', 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| @@ -211,36 +270,18 @@ module DeadEnd options.input_format = autodetect_file_format(v) options.input_mode = :one_big_string end # three + end # four EOM - node = tree.peek - expect(node.to_s).to eq(<<~'EOM'.indent(2)) - options.clear - EOM - - node = tree.step - expect(node.to_s).to eq(<<~'EOM'.indent(0)) - parser.on('-c', '--clear_options', "Clear all previous command line options") do |v| + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(2)) options.clear - end # one EOM - - # Problem here is that "four" is not captured by the lower block, but by 
this upper block node = tree.step + expect(node.to_s).to eq(<<~'EOM'.indent(0)) parser.on('-c', '--clear_options', "Clear all previous command line options") do |v| options.clear end # one - parser.on('-f', '--input_file FILESPEC', - 'Use this file instead of stdin; autodetects YAML and JSON file extensions') do |v| - unless File.exist?(v) - raise "File #{v} does not exist." - end # two - options.input_filespec = v - options.input_format = autodetect_file_format(v) - options.input_mode = :one_big_string - end # three - end # four EOM end @@ -315,7 +356,6 @@ class Buffalo tree = IndentTree.new(document: document).call node = tree.root - # expect(node.parents.length).to eq(2) diagnose = DiagnoseNode.new(node).call expect(diagnose.problem).to eq(:multiple_invalid_parents) @@ -365,39 +405,31 @@ def speak code_lines = CleanDocument.new(source: source).call.lines document = BlockDocument.new(code_lines: code_lines).call - tree = IndentTree.new(document: document).call + tree = IndentTree.new(document: document) - node = tree.root - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:invalid_inside_split_pair) - node = diagnose.next[0] + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(2)) + print } + EOM - expect(node.to_s).to eq(<<~'EOM') + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(0)) print ( else print } EOM - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:remove_pseudo_pair) - node = diagnose.next[0] - expect(node.to_s).to eq(<<~'EOM'.indent(2)) - print ( - print } - EOM - - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:multiple_invalid_parents) - forks = diagnose.next - - expect(forks.length).to eq(2) - expect(forks.first.to_s).to eq(<<~'EOM'.indent(2)) - print ( + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(2)) + end EOM - expect(forks.last.to_s).to eq(<<~'EOM'.indent(2)) - puts } + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(0)) + print ( + else + print } + end 
EOM + # expect(tree.peek.to_s).to eq(last.to_s) end it "(smaller) finds random pipe (|) wildly misindented" do From 230317948a24c2827543e8863a700cff5af3b89d Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 16 Feb 2022 14:24:57 -0600 Subject: [PATCH 51/58] WIPPPPPPPPPPPPPPPPPPPPPPPPPP --- lib/dead_end/block_node.rb | 31 ++++-- lib/dead_end/indent_tree.rb | 8 +- spec/unit/indent_search_spec.rb | 108 ++++++++++---------- spec/unit/indent_tree_spec.rb | 174 ++++++++++++++------------------ 4 files changed, 158 insertions(+), 163 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index bd69a29..2b39165 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -97,12 +97,21 @@ def initialize(lines:, indent:, next_indent: nil, lex_diff: nil, parents: []) # priority def expand_above?(with_indent: indent) return false if above.nil? - return false if leaf? && leaning == :left - return true if leaf? && leaning == :both && above.leaning == :left + + # Above node needs to expand up too, make sure that happens first return false if above.leaf? && above.leaning == :right + # Special case first move + if leaf? + # We need to expand down on first move, not up + return false if leaning == :left + + # If we're unbalanced both ways, prefer to be unbalanced in only one way + return true if leaning == :both && above.leaning == :left + end - if above.leaning == :left || above.leaning == :both || leaf? && above.leaning == :right + # Capturing a :left or :both could change our leaning, do so with caution + if above.leaning == :left || above.leaning == :both above.indent >= with_indent else true @@ -117,11 +126,21 @@ def expand_above?(with_indent: indent) # priority def expand_below?(with_indent: indent) return false if below.nil? - return false if leaf? && leaning == :right - return true if leaf? && leaning == :both && above.leaning == :right + + # Below node needs to expand down, make sure that happens first return false if below.leaf? 
&& below.leaning == :left - if below.leaning == :right || below.leaning == :both || leaf? && below.leaning == :left + # Special case first move + if leaf? + # We need to expand up on first move, not down + return false if leaning == :right + + # If we're unbalanced both ways, prefer to be unbalanced in only one way + return true if leaning == :both && below.leaning == :right + end + + # Capturing a :right or both could change our leaning, do so with caution + if below.leaning == :right || below.leaning == :both below.indent >= with_indent else true diff --git a/lib/dead_end/indent_tree.rb b/lib/dead_end/indent_tree.rb index d2882c4..d70ecf2 100644 --- a/lib/dead_end/indent_tree.rb +++ b/lib/dead_end/indent_tree.rb @@ -49,7 +49,9 @@ def step while blocks.last.expand_above?(with_indent: indent) above = blocks.last.above blocks << above - break if above.leaning == :left || above.leaning == :both + + break if above.leaning == :left + break if above.leaning == :both && above.leaf? end blocks.reverse! @@ -58,7 +60,9 @@ def step while blocks.last.expand_below?(with_indent: indent) below = blocks.last.below blocks << below - break if below.leaning == :right || below.leaning == :both + + break if below.leaning == :right + break if below.leaning == :both && below.leaf? 
end if blocks.length > 1 diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index 40832cf..f97ea61 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -13,6 +13,9 @@ def tmp_capture_context(finished) it "large both" do source = <<~'EOM' + def dog + end + [ one, two, @@ -20,6 +23,9 @@ def tmp_capture_context(finished) ].each do |i| print i { end + + def cat + end EOM code_lines = CleanDocument.new(source: source).call.lines @@ -27,14 +33,14 @@ def tmp_capture_context(finished) tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call - context = BlockNodeContext.new(search.finished[0]).call - expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) - end # two - EOM - - expect(context.lines.join).to eq(<<~'EOM'.indent(2)) - it "flerg" - end # two + # context = BlockNodeContext.new(search.finished[0]).call + expect(search.finished.join).to eq(<<~'EOM'.indent(0)) + [ + one, + two, + ].each do |i| + print i { + end EOM end @@ -63,10 +69,10 @@ def tmp_capture_context(finished) end # two EOM - expect(context.lines.join).to eq(<<~'EOM'.indent(2)) - it "flerg" - end # two - EOM + # expect(context.lines.join).to eq(<<~'EOM'.indent(2)) + # it "flerg" + # end # two + # EOM end it "finds missing do in an rspec context same indent when the problem is in the middle and blocks HAVE inner contents" do @@ -96,11 +102,11 @@ def tmp_capture_context(finished) end # two EOM - expect(context.lines.join).to eq(<<~'EOM'.indent(2)) - it "flerg" - print foo2 - end # two - EOM + # expect(context.lines.join).to eq(<<~'EOM'.indent(2)) + # it "flerg" + # print foo2 + # end # two + # EOM end it "finds a mis-matched def" do @@ -391,22 +397,10 @@ def cat context = BlockNodeContext.new(search.finished[0]).call - expect(search.finished.join).to eq(<<~'EOM'.indent(0)) - if true + expect(search.finished.join).to eq(<<~'EOM'.indent(2)) puts ( - else puts } - end EOM - - # expect(search.finished.length).to 
eq(2) - # expect(search.finished.first.to_s).to eq(<<~'EOM'.indent(2)) - # puts ( - # EOM - - # expect(search.finished.last.to_s).to eq(<<~'EOM'.indent(2)) - # puts } - # EOM end it "smaller rexe input_modes" do @@ -896,11 +890,6 @@ def call expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(2)) def input_modes EOM - - search.finished[0].node.below.parents.each do |p| - puts '--' - puts p - end end it "handles heredocs indentation building microcase outside missing end" do @@ -958,18 +947,19 @@ def input_modes search = IndentSearch.new(tree: tree).call context = BlockNodeContext.new(search.finished[0]).call - expect(context.highlight.join).to eq(<<~'EOM'.indent(4)) + expect(context.highlight.join).to eq(<<~'EOM'.indent(2)) + end # three EOM - expect(context.lines.join).to eq(<<~'EOM'.indent(0)) - def input_modes - @input_modes ||= { - 'l' => :line, - 'e' => :enumerator, - 'b' => :one_big_string, - 'n' => :none - } - EOM + # expect(context.lines.join).to eq(<<~'EOM'.indent(0)) + # def input_modes + # @input_modes ||= { + # 'l' => :line, + # 'e' => :enumerator, + # 'b' => :one_big_string, + # 'n' => :none + # } + # EOM end it "handles heredocs" do @@ -991,15 +981,15 @@ def input_modes def input_modes EOM - expect(context.lines.join).to eq(<<~'EOM'.indent(0)) - def input_modes - @input_modes ||= { - 'l' => :line, - 'e' => :enumerator, - 'b' => :one_big_string, - 'n' => :none - } - EOM + # expect(context.lines.join).to eq(<<~'EOM'.indent(0)) + # def input_modes + # @input_modes ||= { + # 'l' => :line, + # 'e' => :enumerator, + # 'b' => :one_big_string, + # 'n' => :none + # } + # EOM end it "handles derailed output issues/50" do @@ -1086,7 +1076,13 @@ def compile tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call - expect(search.finished.first.node.to_s).to eq(<<~'EOM'.indent(6)) + expect(search.finished.join).to include(<<~'EOM'.indent(6)) + bundle_path: "vendor/bundle", } + EOM + + expect(search.finished.join).to 
eq(<<~'EOM'.indent(6)) + ruby_layer_path: File.expand_path("."), + gem_layer_path: File.expand_path("."), bundle_path: "vendor/bundle", } EOM end @@ -1270,7 +1266,7 @@ def foo tree = IndentTree.new(document: document).call search = IndentSearch.new(tree: tree).call - expect(search.finished.first.node.to_s).to eq(<<~'EOM') + expect(search.finished.join).to eq(<<~'EOM') | EOM end diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 34ee8c1..80a3b81 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -418,150 +418,126 @@ def speak print } EOM - expect(tree.peek.to_s).to eq(<<~'EOM'.indent(2)) + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(0)) end EOM last = tree.step expect(last.to_s).to eq(<<~'EOM'.indent(0)) + if true print ( else print } end EOM - # expect(tree.peek.to_s).to eq(last.to_s) end it "(smaller) finds random pipe (|) wildly misindented" do source = <<~'EOM' class LanguagePack::Ruby < LanguagePack::Base - def allow_git(&blk) - git_dir = ENV.delete("GIT_DIR") # can mess with bundler - blk.call - ENV["GIT_DIR"] = git_dir - end - - def add_dev_database_addon - pg_adapters.any? {|a| bundler.has_gem?(a) } ? ['heroku-postgresql'] : [] - end - - def pg_adapters - [ - "pg", - "activerecord-jdbcpostgresql-adapter", - "jdbc-postgres", - "jdbc-postgresql", - "jruby-pg", - "rjack-jdbc-postgres", - "tgbyte-activerecord-jdbcpostgresql-adapter" - ] - end - def add_node_js_binary - return [] if node_js_preinstalled? - - if Pathname(build_path).join("package.json").exist? || - bundler.has_gem?('execjs') || - bundler.has_gem?('webpacker') - [@node_installer.binary_path] - else - [] - end - end + print add_node_js_binary + end # one def add_yarn_binary return [] if yarn_preinstalled? - | + | # problem is here if Pathname(build_path).join("yarn.lock").exist? || bundler.has_gem?('webpacker') [@yarn_installer.name] else [] - end - end - - def has_yarn_binary? - add_yarn_binary.any? 
- end + end # two + end # three misindented but fine def node_preinstall_bin_path - return @node_preinstall_bin_path if defined?(@node_preinstall_bin_path) - - legacy_path = "#{Dir.pwd}/#{NODE_BP_PATH}" - path = run("which node").strip - if path && $?.success? - @node_preinstall_bin_path = path - elsif run("#{legacy_path}/node -v") && $?.success? - @node_preinstall_bin_path = legacy_path - else - @node_preinstall_bin_path = false - end - end + print node_preinstall_bin_path + end # four alias :node_js_preinstalled? :node_preinstall_bin_path - end + end # five EOM code_lines = CleanDocument.new(source: source).call.lines document = BlockDocument.new(code_lines: code_lines).call - tree = IndentTree.new(document: document).call + tree = IndentTree.new(document: document) - node = tree.root - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:invalid_inside_split_pair) - node = diagnose.next[0] - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:one_invalid_parent) - node = diagnose.next[0] + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(6)) + [] + EOM + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(4)) + [@yarn_installer.name] + else + [] + EOM - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:invalid_inside_split_pair) - node = diagnose.next[0] + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(4)) + end # two + EOM - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:one_invalid_parent) - node = diagnose.next[0] + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(4)) + if Pathname(build_path).join("yarn.lock").exist? 
|| bundler.has_gem?('webpacker') + [@yarn_installer.name] + else + [] + end # two + EOM - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:self) - expect(node.to_s).to eq(<<~'EOM') - | + expect(tree.peek.to_s).to eq(last.to_s) + + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(0)) + return [] if yarn_preinstalled? + | # problem is here + if Pathname(build_path).join("yarn.lock").exist? || bundler.has_gem?('webpacker') + [@yarn_installer.name] + else + [] + end # two EOM - end - it "finds missing comma in array" do - source = <<~'EOM' - def animals - [ - cat, - dog - horse - ] - end + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(4)) + print node_preinstall_bin_path EOM - code_lines = CleanDocument.new(source: source).call.lines - document = BlockDocument.new(code_lines: code_lines).call - tree = IndentTree.new(document: document).call + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(2)) + def node_preinstall_bin_path + print node_preinstall_bin_path + end # four + EOM - node = tree.root - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:invalid_inside_split_pair) - node = diagnose.next[0] + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(4)) + print add_node_js_binary + EOM - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:invalid_inside_split_pair) - node = diagnose.next[0] + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(2)) + def add_node_js_binary + print add_node_js_binary + end # one + EOM + expect(tree.peek.to_s).to eq(<<~'EOM'.indent(2)) + def node_preinstall_bin_path + print node_preinstall_bin_path + end # four + EOM - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:extract_from_multiple) - node = diagnose.next[0] + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(2)) + def node_preinstall_bin_path + print node_preinstall_bin_path + end # four + alias :node_js_preinstalled? 
:node_preinstall_bin_path + EOM - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:self) - expect(node.to_s).to eq(<<~'EOM'.indent(4)) - cat, + search = IndentSearch.new(tree: tree.call).call + + expect(search.finished.join).to eq(<<~'EOM') + lol EOM end From be9b4c4de657d4b9f7d46e2ae5b743c817900c5c Mon Sep 17 00:00:00 2001 From: schneems Date: Fri, 20 May 2022 13:49:38 -0700 Subject: [PATCH 52/58] WIP Follow the search logic, does it make sense? If not, time to rename and update Welp the last commit simply is labeled "WIPPPPPPPPPPPPPPPPPPPPPPPPPP" and it was made in February From a prior commit ``` The other case I added was to enforce the indentation check for nodes leaning opposite of expansion direction, but only on first expansion (otherwise we end up with "split" blocks where one block will not engulf all inner blocks). Those changes exposed a problem with the `next_indent` calculation where sometimes it would come back at a higher value than the current indent which is not correct. Fixing this by adding a final check/guarantee when deriving that value. This change seems good in isolation but is causing a lot of test failures due to tight coupling between tests and implementation. I need to go back and re-work the tests to see if there's any fundamental "disagreements" or if they just need to be updated to new/better values.
There's one problem fundamental to the :both case that seems not well handled here ``` 192 examples, 4 failures, 1 pending ``` rspec ./spec/integration/ruby_command_line_spec.rb:46 # Requires with ruby cli detects require error and adds a message with auto mode rspec ./spec/unit/indent_tree_spec.rb:614 # DeadEnd::IndentTree doesn't scapegoat rescue rspec ./spec/unit/indent_tree_spec.rb:693 # DeadEnd::IndentTree finds random pipe (|) wildly misindented rspec ./spec/unit/indent_tree_spec.rb:1024 # DeadEnd::IndentTree syntax_tree.rb.txt for performance validation ``` --- spec/unit/indent_tree_spec.rb | 80 ++++++++++++++++++++++++++++++++--- 1 file changed, 75 insertions(+), 5 deletions(-) diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 80a3b81..50b18b0 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -460,8 +460,6 @@ def node_preinstall_bin_path document = BlockDocument.new(code_lines: code_lines).call tree = IndentTree.new(document: document) - - expect(tree.peek.to_s).to eq(<<~'EOM'.indent(6)) [] EOM @@ -534,11 +532,83 @@ def node_preinstall_bin_path alias :node_js_preinstalled? :node_preinstall_bin_path EOM - search = IndentSearch.new(tree: tree.call).call + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(0)) + def add_yarn_binary + return [] if yarn_preinstalled? + | # problem is here + if Pathname(build_path).join("yarn.lock").exist? || bundler.has_gem?('webpacker') + [@yarn_installer.name] + else + [] + end # two + EOM + + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(0)) + def node_preinstall_bin_path + print node_preinstall_bin_path + end # four + alias :node_js_preinstalled? :node_preinstall_bin_path + end # five + EOM + + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(0)) + class LanguagePack::Ruby < LanguagePack::Base + def add_node_js_binary + print add_node_js_binary + end # one + def add_yarn_binary + return [] if yarn_preinstalled? 
+ | # problem is here + if Pathname(build_path).join("yarn.lock").exist? || bundler.has_gem?('webpacker') + [@yarn_installer.name] + else + [] + end # two + end # three misindented but fine + EOM - expect(search.finished.join).to eq(<<~'EOM') - lol + last = tree.step + expect(last.to_s).to eq(<<~'EOM'.indent(0)) + class LanguagePack::Ruby < LanguagePack::Base + def add_node_js_binary + print add_node_js_binary + end # one + def add_yarn_binary + return [] if yarn_preinstalled? + | # problem is here + if Pathname(build_path).join("yarn.lock").exist? || bundler.has_gem?('webpacker') + [@yarn_installer.name] + else + [] + end # two + end # three misindented but fine + def node_preinstall_bin_path + print node_preinstall_bin_path + end # four + alias :node_js_preinstalled? :node_preinstall_bin_path + end # five EOM + + ## That's the whole document + + # HEY: Weird that this is picking the wrong end + tree = tree.call # Resolve all steps + search = IndentSearch.new(tree: tree).call + + # expect(search.finished.join).to eq(<<~'EOM'.indent(0)) + # def add_yarn_binary + # return [] if yarn_preinstalled? + # | # problem is here + # if Pathname(build_path).join("yarn.lock").exist? 
|| bundler.has_gem?('webpacker') + # [@yarn_installer.name] + # else + # [] + # end # two + # end # five + # EOM end it "doesn't scapegoat rescue" do From 02285d2c594605d33bb149d42bd47cc4925eea23 Mon Sep 17 00:00:00 2001 From: schneems Date: Sat, 4 Jun 2022 10:40:10 -0500 Subject: [PATCH 53/58] Fix nested pseudo pair case The problem here is that we still have a remove_pseudo_pair situation: two of these lines are valid when paired with above/below, one is not. However, when you look for the next `above` it shows `setup_language_pack_environment(`, which is correct, but the below returns `bundle_default_without: "development:test"`, which is technically correct but not useful to us; we need that node's below. The fix was to re-group the invalid nodes with the original above/below. The downside of this approach is that it may violate expectations of above/below guarantees. The original document can no longer be re-created. It makes things very convenient though, so this seems like the right path forward. It also seems that `invalid_inside_split_pair` and `remove_pseudo_pair` are essentially the same thing, but one has captured the leaning blocks inside of the node's parents while the other simply has them as an above/below reference. We might be able to simplify something later. I'm pleased with this result, it isolates exactly the failing line. 192 examples, 4 failures, 1 pending Failed examples: rspec ./spec/integration/ruby_command_line_spec.rb:46 # Requires with ruby cli detects require error and adds a message with auto mode rspec ./spec/unit/indent_search_spec.rb:1034 # DeadEnd::IndentSearch doesn't scapegoat rescue rspec ./spec/unit/indent_tree_spec.rb:721 # DeadEnd::IndentTree finds random pipe (|) wildly misindented rspec ./spec/unit/indent_tree_spec.rb:1052 # DeadEnd::IndentTree syntax_tree.rb.txt for performance validation Since the fix here was in diagnose, it seems I need a better set of diagnose tests, considering there are none!
It seems like it would be a good idea to start there next time. We can collect cases from the existing indent_tree_spec. Ultimately we need to assert the indent tree shape/properties rather than what we're currently doing with walking/diagnosing/searching in the name of testing. However, there's a dependency resolution problem. We are testing a tree structure, so we need an easy way to assert properties of that structure. But changes to the structure change its properties. In short, until we have a working solution, the properties we desire won't be 100% clear. --- lib/dead_end/block_node.rb | 9 +++--- lib/dead_end/diagnose_node.rb | 58 +++++++++++++++++------------------ lib/dead_end/journey.rb | 2 +- spec/unit/indent_tree_spec.rb | 34 ++++++++++++++++++-- 4 files changed, 65 insertions(+), 38 deletions(-) diff --git a/lib/dead_end/block_node.rb b/lib/dead_end/block_node.rb index 2b39165..20c25e0 100644 --- a/lib/dead_end/block_node.rb +++ b/lib/dead_end/block_node.rb @@ -6,7 +6,7 @@ module DeadEnd # A block node keeps a reference to the block above it # and below it. In addition a block can "capture" another # block. Block nodes are treated as immutable(ish) so when that happens - # a new node is created that contains a refernce to all the blocks it was + # a new node is created that contains a reference to all the blocks it was # derived from. These are known as a block's "parents". # # If you walk the parent chain until it ends you'll end up with nodes @@ -33,7 +33,7 @@ class BlockNode # # block = BlockNode.from_blocks([parents[0], parents[2]]) # expect(block.leaning).to eq(:equal) - def self.from_blocks(parents) + def self.from_blocks(parents, above: nil, below: nil) lines = [] while parents.length == 1 && parents.first.parents.any?
parents = parents.first.parents @@ -47,8 +47,9 @@ def self.from_blocks(parents) block.delete end - above = parents.first.above - below = parents.last.below + above ||= parents.first.above + below ||= parents.last.below + parents = [] if parents.length == 1 diff --git a/lib/dead_end/diagnose_node.rb b/lib/dead_end/diagnose_node.rb index af08cf6..9664aae 100644 --- a/lib/dead_end/diagnose_node.rb +++ b/lib/dead_end/diagnose_node.rb @@ -17,7 +17,6 @@ module DeadEnd # The algorithm here is tightly coupled to the nodes produced by the current IndentTree # implementation. # - # # Possible problem states: # # - :self - The block holds no parents, if it holds a problem its in the current node. @@ -25,7 +24,8 @@ module DeadEnd # - :invalid_inside_split_pair - An invalid block is splitting two valid leaning blocks, return the middle. # # - :remove_pseudo_pair - Multiple invalid blocks in isolation are present, but when paired with external leaning - # blocks above and below they become valid. Remove these and group the leftovers together. i.e. `else/ensure/rescue`. + # blocks above and below they become valid. Remove these and group the leftovers together. i.e. don't + # scapegoat `else/ensure/rescue`, remove them from the block and retry with whats leftover. # # - :extract_from_multiple - Multiple invalid blocks in isolation are present, but we were able to find one that could be removed # to make a valid set along with outer leaning i.e. `[`, `in)&lid` , `vaild`, `]`. Different from :invalid_inside_split_pair because @@ -65,7 +65,7 @@ def call @next = if @problem == :multiple_invalid_parents invalid.map { |b| BlockNode.from_blocks([b]) } else - [BlockNode.from_blocks(invalid)] + invalid end self @@ -86,32 +86,35 @@ def call diagnose_one_or_more_parents end + # Diagnose left/right + # + # Handles cases where the block is made up of a several nodes and is book ended by + # nodes leaning in the correct direction that pair with one another. 
For example [`{`, `b@&[d`, `}`] + # + # This is different from above/below which also has matching blocks, but those are outside of the current + # block array (they are above and below it respectively) + # # ## (:invalid_inside_split_pair) Handle case where keyword/end (or any pair) is falsely reported as invalid in isolation but # holds a syntax error inside of it. # # Example: # # ``` - # def cow # left, invalid in isolation, valid when paired with end - # ``` - # - # ``` + # def cow # left, invalid in isolation, valid when paired with end # inv&li) code # Actual problem to be isolated - # ``` - # - # ``` - # end # right, invalid in isolation, valid when paired with def + # end # right, invalid in isolation, valid when paired with def # ``` private def diagnose_left_right invalid = block.parents.select(&:invalid?) + return false if invalid.length < 3 left = invalid.detect { |block| block.leaning == :left } right = invalid.reverse_each.detect { |block| block.leaning == :right } - if left && right && invalid.length >= 3 && BlockNode.from_blocks([left, right]).valid? + if left && right && BlockNode.from_blocks([left, right]).valid? @problem = :invalid_inside_split_pair - invalid.reject! { |x| x == left || x == right } + invalid.reject! 
{ |b| b == left || b == right } # If the left/right was not mapped properly or we've accidentally got a :multiple_invalid_parents # we can get a false positive, double check the invalid lines fully capture the problem @@ -130,16 +133,10 @@ def call # Example: # # ``` - # def cow # above - # ``` - # - # ``` + # def cow # above # print inv&li) # Actual problem # rescue => e # Invalid in isolation, valid when paired with above/below - # ``` - # - # ``` - # end # below + # end # below # ``` # # ## (:extract_from_multiple) Handle syntax seems fine in isolation, but not when combined with above/below leaning blocks @@ -148,22 +145,16 @@ def call # # ``` # [ # above - # ``` - # - # ``` # missing_comma_not_okay # missing_comma_okay - # ``` - # - # ``` # ] # below # ``` + # private def diagnose_above_below invalid = block.parents.select(&:invalid?) above = block.above if block.above&.leaning == :left below = block.below if block.below&.leaning == :right - return false if above.nil? || below.nil? if invalid.reject! { |block| @@ -172,11 +163,17 @@ def call } if invalid.any? + # At this point invalid array was reduced and represents only + # nodes that are invalid when paired with it's above/below + # however, we may need to split the node apart again @problem = :remove_pseudo_pair - invalid - else + [BlockNode.from_blocks(invalid, above: above, below: below)] + else invalid = block.parents.select(&:invalid?) + + # If we can remove one node from many blocks to make the other blocks valid then, that + # block must be the problem if (b = invalid.detect { |b| BlockNode.from_blocks([above, invalid - [b], below].flatten).valid? }) @problem = :extract_from_multiple [b] @@ -189,6 +186,7 @@ def call private def diagnose_one_or_more_parents invalid = block.parents.select(&:invalid?) 
@problem = if invalid.length > 1 + :multiple_invalid_parents else :one_invalid_parent diff --git a/lib/dead_end/journey.rb b/lib/dead_end/journey.rb index dfdfa1f..4834274 100644 --- a/lib/dead_end/journey.rb +++ b/lib/dead_end/journey.rb @@ -8,7 +8,7 @@ module DeadEnd # valid code from it's parent # # node = tree.root - # journey = Journe.new(node) + # journey = Journey.new(node) # journey << Step.new(node.parents[0]) # expect(journey.node).to eq(node.parents[0]) # diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 50b18b0..29c6173 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -625,7 +625,7 @@ def compile setup_language_pack_environment( ruby_layer_path: File.expand_path("."), gem_layer_path: File.expand_path("."), - bundle_path: "vendor/bundle", } + bundle_path: "vendor/bundle", } # problem bundle_default_without: "development:test" ) allow_git do @@ -672,21 +672,49 @@ def compile expect(diagnose.problem).to eq(:one_invalid_parent) node = diagnose.next[0] + expect(node.to_s).to eq(<<~'EOM'.indent(4)) + setup_language_pack_environment( + ruby_layer_path: File.expand_path("."), + gem_layer_path: File.expand_path("."), + bundle_path: "vendor/bundle", } # problem + bundle_default_without: "development:test" + ) + EOM + diagnose = DiagnoseNode.new(node).call expect(diagnose.problem).to eq(:invalid_inside_split_pair) node = diagnose.next[0] + expect(node.to_s).to eq(<<~'EOM'.indent(6)) + ruby_layer_path: File.expand_path("."), + gem_layer_path: File.expand_path("."), + bundle_path: "vendor/bundle", } # problem + bundle_default_without: "development:test" + EOM + diagnose = DiagnoseNode.new(node).call + node = diagnose.next[0] expect(diagnose.problem).to eq(:remove_pseudo_pair) - expect(node.parents.length).to eq(4) + + expect(node.to_s).to eq(<<~'EOM'.indent(6)) + ruby_layer_path: File.expand_path("."), + gem_layer_path: File.expand_path("."), + bundle_path: "vendor/bundle", } # problem + EOM diagnose = 
DiagnoseNode.new(node).call node = diagnose.next[0] + expect(diagnose.problem).to eq(:remove_pseudo_pair) + + expect(node.to_s).to eq(<<~'EOM'.indent(6)) + bundle_path: "vendor/bundle", } # problem + EOM diagnose = DiagnoseNode.new(node).call expect(diagnose.problem).to eq(:self) + expect(node.to_s).to eq(<<~'EOM'.indent(6)) - bundle_path: "vendor/bundle", } + bundle_path: "vendor/bundle", } # problem EOM end From ef535d5afdf6d6bfcab3c05b45a2abfc9882d494 Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 7 Jun 2022 15:30:24 -0500 Subject: [PATCH 54/58] 192 examples, 2 failures, 1 pending Failed examples: rspec ./spec/integration/ruby_command_line_spec.rb:46 # Requires with ruby cli detects require error and adds a message with auto mode rspec ./spec/unit/indent_tree_spec.rb:1048 # DeadEnd::IndentTree syntax_tree.rb.txt for performance validation --- spec/unit/indent_search_spec.rb | 2 -- spec/unit/indent_tree_spec.rb | 12 ++++-------- 2 files changed, 4 insertions(+), 10 deletions(-) diff --git a/spec/unit/indent_search_spec.rb b/spec/unit/indent_search_spec.rb index f97ea61..831afad 100644 --- a/spec/unit/indent_search_spec.rb +++ b/spec/unit/indent_search_spec.rb @@ -1081,8 +1081,6 @@ def compile EOM expect(search.finished.join).to eq(<<~'EOM'.indent(6)) - ruby_layer_path: File.expand_path("."), - gem_layer_path: File.expand_path("."), bundle_path: "vendor/bundle", } EOM end diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index 29c6173..fbf3df1 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -739,23 +739,19 @@ def compile node = diagnose.next[0] diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:invalid_inside_split_pair) - node = diagnose.next[0] - - diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:remove_pseudo_pair) + expect(diagnose.problem).to eq(:one_invalid_parent) node = diagnose.next[0] diagnose = DiagnoseNode.new(node).call - 
expect(diagnose.problem).to eq(:invalid_inside_split_pair) + expect(diagnose.problem).to eq(:one_invalid_parent) node = diagnose.next[0] diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:invalid_inside_split_pair) + expect(diagnose.problem).to eq(:one_invalid_parent) node = diagnose.next[0] diagnose = DiagnoseNode.new(node).call - expect(diagnose.problem).to eq(:remove_pseudo_pair) + expect(diagnose.problem).to eq(:one_invalid_parent) node = diagnose.next[0] diagnose = DiagnoseNode.new(node).call From fc7a62c8798c6d71cd743f9489f2b893b333c65f Mon Sep 17 00:00:00 2001 From: schneems Date: Tue, 7 Jun 2022 15:42:34 -0500 Subject: [PATCH 55/58] Check if one node can be pruned from multiple MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit When we think that multiple nodes are invalid, check to see if removing one in isolation makes the others valid, if so report that node to be the problem. ``` Finished in 3.58 seconds (files took 0.23647 seconds to load) 192 examples, 1 failure, 1 pending Failed examples: rspec ./spec/integration/ruby_command_line_spec.rb:46 # Requires with ruby cli detects require error and adds a message with auto mode ``` This last failure: ``` 1) Requires with ruby cli detects require error and adds a message with auto mode Failure/Error: expect(out).to include('❯ 5 it "flerg"').once expected "--> /var/folders/l9/w5ggmcjd28d57p4rwb1m7kph0000gp/T/d20220607-97216-rq4ujg/script.rb\n\nUnmatched `... /var/folders/l9/w5ggmcjd28d57p4rwb1m7kph0000gp/T/d20220607-97216-rq4ujg/require.rb:1:in `
'\n" to include "❯ 5 it \"flerg\"" once but it is included 0 times Diff: @@ -1,14 +1,27 @@ -❯ 5 it "flerg" +--> /var/folders/l9/w5ggmcjd28d57p4rwb1m7kph0000gp/T/d20220607-97216-rq4ujg/script.rb + +Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ? + + 1 describe "things" do + 2 it "blerg" do + 3 end +❯ 7 end + 9 it "zlerg" do + 10 end + 11 end +/Users/rschneeman/Documents/projects/dead_end/lib/dead_end/core_ext.rb:13:in `load': /var/folders/l9/w5ggmcjd28d57p4rwb1m7kph0000gp/T/d20220607-97216-rq4ujg/script.rb:11: syntax error, unexpected `end', expecting end-of-input (SyntaxError) + from /Users/rschneeman/Documents/projects/dead_end/lib/dead_end/core_ext.rb:13:in `load' + from /var/folders/l9/w5ggmcjd28d57p4rwb1m7kph0000gp/T/d20220607-97216-rq4ujg/require.rb:1:in `
' # ./spec/integration/ruby_command_line_spec.rb:72:in `block (3 levels) in ' # ./spec/integration/ruby_command_line_spec.rb:47:in `block (2 levels) in ' Finished in 0.40714 seconds (files took 0.20038 seconds to load) 1 example, 1 failure Failed examples: rspec ./spec/integration/ruby_command_line_spec.rb:46 # Requires with ruby cli detects require error and adds a message with auto mode ``` Happens due to display. We found the correct `end` but are not showing the line missing the `do`. Previously we would have corrected for this in "capture context" which doesn't quite work because we don't just want extra context displayed, we want to point at that being a problem line. i.e. ``` + 1 describe "things" do + 2 it "blerg" do + 3 end + 5 it "flerg" +❯ 7 end + 9 it "zlerg" do + 10 end + 11 end ``` Isn't quite good enough. Overall performance is great ## Misindent pipe Before: ``` Unmatched `|', missing `|' ? ❯ 1067 def add_yarn_binary ❯ 1068 return [] if yarn_preinstalled? ❯ 1069 | ❯ 1075 end ``` After ``` 16 class LanguagePack::Ruby < LanguagePack::Base ❯ 1069 | 1344 end ``` Unmatched keyword, missing `end' ? 1 Rails.application.routes.draw do 107 constraints -> { Rails.application.config.non_production } do 111 end ❯ 113 namespace :admin do 121 end ## Webmock, missing comma Before: ``` syntax error, unexpected ':', expecting end-of-input 1 describe "webmock tests" do 22 it "body" do 27 query = Cutlass::FunctionQuery.new( ❯ 28 port: port ❯ 29 body: body 30 ).call 34 end 35 end ``` After: ``` syntax error, unexpected ':', expecting end-of-input 1 describe "webmock tests" do 22 it "body" do 27 query = Cutlass::FunctionQuery.new( ❯ 28 port: port 30 ).call 34 end 35 end ``` ## Missing end require tree Before: ``` Unmatched keyword, missing `end' ? 5 module DerailedBenchmarks 6 class RequireTree 7 REQUIRED_BY = {} 9 attr_reader :name 10 attr_writer :cost ❯ 13 def initialize(name) ❯ 18 def self.reset! 
❯ 25 end 73 end 74 end ``` After: ``` Unmatched keyword, missing `end' ? 5 module DerailedBenchmarks 6 class RequireTree ❯ 13 def initialize(name) 18 def self.reset! 25 end 73 end 74 end ``` ## Rexe missing end Before: ``` Unmatched keyword, missing `end' ? 16 class Rexe ❯ 77 class Lookups ❯ 78 def input_modes ❯ 148 end 551 end ``` After: ``` Unmatched keyword, missing `end' ? 16 class Rexe 77 class Lookups ❯ 78 def input_modes 87 def input_formats 94 end 148 end 551 end ``` ## Display invalid blocks missing end Before: ``` Unmatched keyword, missing `end' ? 1 module SyntaxErrorSearch 3 class DisplayInvalidBlocks ❯ 36 def filename ❯ 38 def code_with_filename ❯ 45 end 63 end 64 end ``` After: ``` Unmatched keyword, missing `end' ? 1 module SyntaxErrorSearch 3 class DisplayInvalidBlocks 17 def call 34 end ❯ 36 def filename 38 def code_with_filename 45 end 63 end 64 end ``` ## Next steps Since output performance is great and it looks like this is the only edge case, let's go through the old code base to make sure we're testing all known cases. Once we are satisfied there then the next step is to add some context back into this specific match. It needs to be done outside of capture code context because we need it highlighted. 
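
## Sketch of the pruning check

For reference, the "remove one node and see if the rest becomes valid" idea from this commit can be sketched outside of dead_end. This is only an illustration under assumptions: `valid_ruby?` and `extract_from_multiple` are hypothetical helpers that operate on plain strings, while the real code works on `BlockNode` objects via `BlockNode.from_blocks`. Validity here is checked with `Ripper#error?`.

```ruby
require "ripper"

# Hypothetical helper (not dead_end's API): true when Ripper parses
# the source without reporting a syntax error.
def valid_ruby?(source)
  !Ripper.new(source).tap(&:parse).error?
end

# Hypothetical stand-in for the :extract_from_multiple check. Returns the
# first chunk whose removal makes the remaining chunks parse, or nil when
# no single removal is enough.
def extract_from_multiple(chunks)
  chunks.each_index do |i|
    rest = chunks[0...i] + chunks[(i + 1)..]
    return chunks[i] if valid_ruby?(rest.join)
  end
  nil
end

chunks = [
  "def foo\n",       # invalid in isolation, pairs with `end`
  "  print bar(\n",  # the actual problem
  "end\n"            # invalid in isolation, pairs with `def`
]

culprit = extract_from_multiple(chunks)
puts culprit.inspect
```

Like the change in this commit, the sketch returns the first match, so the order of the chunks matters when more than one removal would produce valid code.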
--- lib/dead_end/api.rb | 1 - lib/dead_end/diagnose_node.rb | 16 ++++++++++------ spec/unit/indent_tree_spec.rb | 5 +++++ 3 files changed, 15 insertions(+), 7 deletions(-) diff --git a/lib/dead_end/api.rb b/lib/dead_end/api.rb index 75e2932..fda28d8 100644 --- a/lib/dead_end/api.rb +++ b/lib/dead_end/api.rb @@ -86,7 +86,6 @@ def obj.document_ok?; true; end end blocks = search.finished.map(&:node).map {|node| CodeBlock.new(lines: node.lines) } - # puts search.finished.first.steps.last(2).first.block DisplayInvalidBlocks.new( io: io, diff --git a/lib/dead_end/diagnose_node.rb b/lib/dead_end/diagnose_node.rb index 9664aae..e7a3752 100644 --- a/lib/dead_end/diagnose_node.rb +++ b/lib/dead_end/diagnose_node.rb @@ -185,14 +185,18 @@ def call # We couldn't detect any special cases, either return 1 or N invalid nodes private def diagnose_one_or_more_parents invalid = block.parents.select(&:invalid?) - @problem = if invalid.length > 1 - - :multiple_invalid_parents + if invalid.length > 1 + if (b = invalid.detect { |b| BlockNode.from_blocks([invalid - [b]].flatten).valid? 
}) + @problem = :extract_from_multiple + [b] + else + @problem = :multiple_invalid_parents + invalid + end else - :one_invalid_parent + @problem = :one_invalid_parent + invalid end - - invalid end private def diagnose_self diff --git a/spec/unit/indent_tree_spec.rb b/spec/unit/indent_tree_spec.rb index fbf3df1..d530ad7 100644 --- a/spec/unit/indent_tree_spec.rb +++ b/spec/unit/indent_tree_spec.rb @@ -1076,6 +1076,11 @@ def format_requires expect(diagnose.problem).to eq(:one_invalid_parent) node = diagnose.next[0] + diagnose = DiagnoseNode.new(node).call + expect(diagnose.problem).to eq(:extract_from_multiple) + node = diagnose.next[0] + + diagnose = DiagnoseNode.new(node).call expect(diagnose.problem).to eq(:self) expect(node.to_s).to eq(<<~'EOM'.indent(2)) From 91b94a41c8351a5a9d164021335119e00bb89f7d Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 8 Jun 2022 14:21:22 -0500 Subject: [PATCH 56/58] Move test out of ruby CLI The purpose of the Ruby CLI tests isn't to check formatting it's to ensure that Dead end integrates with Ruby in the real world. To that end I'm putting the failing test into `integration/dead_end_spec.rb` where it better fits. I've also added a more advanced case that includes internal elements inside of the method that I think should be omitted. The main problem with the existing test cases is that they came from real world scenarios where the prior algorithm performed poorly. Since this is a different algorithm it has different characteristics that cause it to perform poorly in different scenarios. 
``` Finished in 3.44 seconds (files took 0.20514 seconds to load) 194 examples, 2 failures, 1 pending Failed examples: rspec ./spec/integration/dead_end_spec.rb:207 # Integration tests that don't spawn a process (like using the cli) missing `do` highlights more than `end` simple rspec ./spec/integration/dead_end_spec.rb:238 # Integration tests that don't spawn a process (like using the cli) missing `do` highlights more than `end`, with internal contents ``` --- spec/integration/dead_end_spec.rb | 72 ++++++++++++++++++++++ spec/integration/ruby_command_line_spec.rb | 5 +- 2 files changed, 74 insertions(+), 3 deletions(-) diff --git a/spec/integration/dead_end_spec.rb b/spec/integration/dead_end_spec.rb index 24ea902..d3d2d8a 100644 --- a/spec/integration/dead_end_spec.rb +++ b/spec/integration/dead_end_spec.rb @@ -203,5 +203,77 @@ def bark 4 end EOM end + + it "missing `do` highlights more than `end` simple" do + source = <<~'EOM' + describe "things" do + it "blerg" do + end + + it "flerg" + end + + it "zlerg" do + end + end + EOM + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + out = io.string + expect(out).to include(<<~EOM) + 1 describe "things" do + 2 it "blerg" do + 3 end + ❯ 5 it "flerg" + ❯ 6 end + 8 it "zlerg" do + 9 end + 10 end + EOM + end + + it "missing `do` highlights more than `end`, with internal contents" do + source = <<~'EOM' + describe "things" do + it "blerg" do + end + + it "flerg" + doesnt + show + extra + stuff() + that_s + not + critical + inside + end + + it "zlerg" do + foo + end + end + EOM + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + out = io.string + + expect(out).to include(<<~EOM) + 1 describe "things" do + 2 it "blerg" do + 3 end + ❯ 5 it "flerg" + ❯ 14 end + 16 it "zlerg" do + 18 end + 19 end + EOM + end end end diff --git a/spec/integration/ruby_command_line_spec.rb b/spec/integration/ruby_command_line_spec.rb index 35a2ded..a9399f9 100644 --- a/spec/integration/ruby_command_line_spec.rb 
+++ b/spec/integration/ruby_command_line_spec.rb @@ -52,8 +52,7 @@ module DeadEnd it "blerg" do end - it "flerg" - end + def foo it "zlerg" do end @@ -68,7 +67,7 @@ module DeadEnd out = `ruby -I#{lib_dir} -rdead_end #{require_rb} 2>&1` expect($?.success?).to be_falsey - expect(out).to include('❯ 5 it "flerg"').once + expect(out).to include('❯ 5 def foo').once end end end From c3b30dabf68dfa5fd62f22252fe37cee2cb6b794 Mon Sep 17 00:00:00 2001 From: schneems Date: Wed, 8 Jun 2022 15:58:12 -0500 Subject: [PATCH 57/58] Porting examples from existing tests MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - https://github.com/zombocom/dead_end/blob/622dfddda92e27235425dd5a370618c5257731e8/spec/unit/code_search_spec.rb ``` 22 examples, 9 failures Failed examples: rspec ./spec/integration/dead_end_spec.rb:208 # Integration tests that don't spawn a process (like using the cli) missing `do` highlights more than `end` simple rspec ./spec/integration/dead_end_spec.rb:239 # Integration tests that don't spawn a process (like using the cli) missing `do` highlights more than `end`, with internal contents rspec ./spec/integration/dead_end_spec.rb:302 # Integration tests that don't spawn a process (like using the cli) squished do regression rspec ./spec/integration/dead_end_spec.rb:400 # Integration tests that don't spawn a process (like using the cli) handles no spaces between blocks rspec ./spec/integration/dead_end_spec.rb:447 # Integration tests that don't spawn a process (like using the cli) Format Code blocks real world example rspec ./spec/integration/dead_end_spec.rb:492 # Integration tests that don't spawn a process (like using the cli) returns syntax error in outer block without inner block rspec ./spec/integration/dead_end_spec.rb:514 # Integration tests that don't spawn a process (like using the cli) finds multiple syntax errors rspec ./spec/integration/dead_end_spec.rb:547 # Integration tests that don't spawn a process (like using 
the cli) finds a naked end rspec ./spec/integration/dead_end_spec.rb:565 # Integration tests that don't spawn a process (like using the cli) handles mismatched | ``` Failing tests need to be investigated. Most are variations on a missing keyword but present end. Next I want to group related failures together in the file so I can understand which might have competing requirements/properties. The other thing that seems difficult is that this sometimes identifies one syntax error as several which the old algorithm never did. Looking into it some can be fixed in "post" while others need investigation for why they're failing. ## Unknown ### Needs investigation 0 ``` 1 class Blerg ❯ 2 Foo.call do |a 5 class Foo 6 end # two 7 end # three Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ? 1 class Blerg ❯ 3 end # one 5 class Foo 6 end # two 7 end # three handles mismatched | (FAILED - 1) Failures: 1) Integration tests that don't spawn a process (like using the cli) handles mismatched | Failure/Error: raise("this should be one failure, not two") RuntimeError: this should be one failure, not two # ./spec/integration/dead_end_spec.rb:603:in `block (2 levels) in ' ``` ### Needs investigation 1 Expected: ``` 7 context "test" do ❯ 8 it "should" do 9 end ``` Actual ``` +Unmatched keyword, missing `end' ? + + 1 context "foo bar" do +❯ 7 context "test" do + 9 end # ./spec/integration/dead_end_spec.rb:418:in `block (2 levels) in ' ``` ### Needs investigation 2 Expected: ``` it "finds a naked end" do source = <<~'EOM' def foo end # one end # two EOM io = StringIO.new DeadEnd.call( io: io, source: source ) expect(io.string).to include(<<~'EOM') ❯ end # one EOM end ``` Actual: ``` +Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ? + + 1 def foo +❯ 3 end # two ``` ## Fix it in post 1 ``` +Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ? 
+
+  1 describe "things" do
+  2 it "blerg" do
+  3 end
+❯ 6 end
+  8 it "zlerg" do
+  9 end
+  10 end
# ./spec/integration/dead_end_spec.rb:227:in `block (2 levels) in '
```

Singular end, can fix it in post

```
- 1 describe "things" do\n 2 it "blerg" do\n 3 end\n❯ 5 it "flerg"\n❯ 14 end\n 16 it "zlerg" do\n 18 end\n 19 end\n
+Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?
+
+  1 describe "things" do
+  2 it "blerg" do
+  3 end
+❯ 14 end
+  16 it "zlerg" do
+  18 end
+  19 end
# ./spec/integration/dead_end_spec.rb:268:in `block (2 levels) in '
```

Same, can fix it in post

```
+Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?
+
+  1 def call
+❯ 15 end # one
+  16 end # two
# ./spec/integration/dead_end_spec.rb:329:in `block (2 levels) in '
```

Same, can fix in post

```
+Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?
+
+❯ 6 end # two
# ./spec/integration/dead_end_spec.rb:508:in `block (2 levels) in '
```

Same

```
+Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?
+
+  1 describe "hi" do
+❯ 3 end
+  4 end
+Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?
+
+  5 it "blerg" do
+❯ 7 end
+  8 end
# ./spec/integration/dead_end_spec.rb:532:in `block (2 levels) in '
```

Same

## Fix it in post 2

Subtly different:

```
+Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?
+ + 2 RSpec.describe AclassNameHere, type: :worker do + 3 describe "thing" do + 13 end # line 16 accidental end, but valid block +❯ 23 end # mismatched due to 16 + 24 end # ./spec/integration/dead_end_spec.rb:481:in `block (2 levels) in ' ``` --- spec/integration/dead_end_spec.rb | 325 ++++++++++++++++++++++++++++++ 1 file changed, 325 insertions(+) diff --git a/spec/integration/dead_end_spec.rb b/spec/integration/dead_end_spec.rb index d3d2d8a..c183529 100644 --- a/spec/integration/dead_end_spec.rb +++ b/spec/integration/dead_end_spec.rb @@ -34,6 +34,7 @@ module DeadEnd EOM end + it "re-checks all block code, not just what's visible issues/95" do file = fixtures_dir.join("ruby_buildpack.rb.txt") io = StringIO.new @@ -275,5 +276,329 @@ def bark 19 end EOM end + + it "works with valid code" do + source = <<~'EOM' + class OH + def hello + end + def hai + end + end + EOM + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + out = io.string + + expect(out).to include(<<~EOM) + Syntax OK + EOM + end + + it "squished do regression" do + source = <<~'EOM' + def call + trydo + @options = CommandLineParser.new.parse + options.requires.each { |r| require!(r) } + load_global_config_if_exists + options.loads.each { |file| load(file) } + @user_source_code = ARGV.join(' ') + @user_source_code = 'self' if @user_source_code == '' + @callable = create_callable + init_rexe_context + init_parser_and_formatters + # This is where the user's source code will be executed; the action will in turn call `execute`. 
+ lookup_action(options.input_mode).call unless options.noop + output_log_entry + end # one + end # two + EOM + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + out = io.string + + expect(out).to eq(<<~'EOM'.indent(2)) + 1 def call + ❯ 2 trydo + ❯ 15 end # one + 16 end + EOM + end + + it "handles mismatched }" do + source = <<~EOM + class Blerg + Foo.call do { + puts lol + class Foo + end # two + end # three + EOM + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + + expect(io.string).to include(<<~'EOM') + 1 class Blerg + ❯ 2 Foo.call do { + 4 class Foo + 5 end # two + 6 end # three + EOM + end + + it "handles no spaces between blocks and trailing slash" do + source = <<~'EOM' + require "rails_helper" + RSpec.describe Foo, type: :model do + describe "#bar" do + context "context" do + it "foos the bar with a foo and then bazes the foo with a bar to"\ + "fooify the barred bar" do + travel_to DateTime.new(2020, 10, 1, 10, 0, 0) do + foo = build(:foo) + end + end + end + end + describe "#baz?" do + context "baz has barred the foo" do + it "returns true" do # <== HERE + end + end + end + EOM + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + + expect(io.string).to include(<<~'EOM') + 2 RSpec.describe Foo, type: :model do + 13 describe "#baz?" 
do + ❯ 14 context "baz has barred the foo" do + 16 end + 17 end + 18 end + EOM + end + + it "handles no spaces between blocks" do + source = <<~'EOM' + context "foo bar" do + it "bars the foo" do + travel_to DateTime.new(2020, 10, 1, 10, 0, 0) do + end + end + end + context "test" do + it "should" do + end + EOM + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + + expect(io.string).to include(<<~'EOM') + 7 context "test" do + ❯ 8 it "should" do + 9 end + EOM + end + + it "finds hanging def in this project" do + source = fixtures_dir.join("this_project_extra_def.rb.txt").read + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + + expect(io.string).to include(<<~'EOM') + 1 module SyntaxErrorSearch + 3 class DisplayInvalidBlocks + 17 def call + 34 end + ❯ 36 def filename + 38 def code_with_filename + 45 end + 63 end + 64 end + EOM + end + + it "Format Code blocks real world example" do + source = <<~'EOM' + require 'rails_helper' + RSpec.describe AclassNameHere, type: :worker do + describe "thing" do + context "when" do + let(:thing) { stuff } + let(:another_thing) { moarstuff } + subject { foo.new.perform(foo.id, true) } + it "stuff" do + subject + expect(foo.foo.foo).to eq(true) + end + end + end # line 16 accidental end, but valid block + context "stuff" do + let(:thing) { create(:foo, foo: stuff) } + let(:another_thing) { create(:stuff) } + subject { described_class.new.perform(foo.id, false) } + it "more stuff" do + subject + expect(foo.foo.foo).to eq(false) + end + end + end # mismatched due to 16 + end + EOM + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + + expect(io.string).to include(<<~'EOM') + 1 require 'rails_helper' + 2 + 3 RSpec.describe AclassNameHere, type: :worker do + ❯ 4 describe "thing" do + ❯ 16 end # line 16 accidental end, but valid block + ❯ 30 end # mismatched due to 16 + 31 end + EOM + end + + it "returns syntax error in outer block without inner block" do + source = <<~'EOM' + 
Foo.call + def foo + puts "lol" + puts "lol" + end # one + end # two + EOM + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + + expect(io.string).to include(<<~'EOM') + 1 Foo.call + ❯ 6 end # two + EOM + end + + it "finds multiple syntax errors" do + source = <<~'EOM' + describe "hi" do + Foo.call + end + end + it "blerg" do + Bar.call + end + end + EOM + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + + expect(io.string).to include(<<~'EOM') + 1 describe "hi" do + ❯ 2 Foo.call + ❯ 3 end + 4 end + EOM + + expect(io.string).to include(<<~'EOM') + 5 it "blerg" do + ❯ 6 Bar.call + ❯ 7 end + 8 end + EOM + end + + it "finds a naked end" do + source = <<~'EOM' + def foo + end # one + end # two + EOM + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + + expect(io.string).to include(<<~'EOM') + ❯ end # one + EOM + end + + it "handles mismatched |" do + source = <<~EOM + class Blerg + Foo.call do |a + end # one + puts lol + class Foo + end # two + end # three + EOM + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + + expect(io.string).to include(<<~'EOM') + Unmatched `|', missing `|' ? + Unmatched keyword, missing `end' ? + + 1 class Blerg + ❯ 2 Foo.call do |a + 5 class Foo + 6 end # two + 7 end # three + Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ? 
+ + 1 class Blerg + ❯ 3 end # one + 5 class Foo + 6 end # two + 7 end # three + EOM + + raise("this should be one failure, not two") + end + end end
From 4dfa1b2a3a39a89ac3a37a54bbc1b2ac862e01d5 Mon Sep 17 00:00:00 2001
From: schneems
Date: Tue, 26 Jul 2022 16:22:00 -0500
Subject: [PATCH 58/58] This is a difficult problem
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The algorithm builds a pretty good looking tree but then it sees this:

```
  1 class Blerg
❯ 2 Foo.call do |lol
❯ 3 print lol
❯ 4 end # one
  5 print lol
  6 class Foo
  7 end # two
  8 end # three
```

It thinks that there are two invalid nodes here

```
❯ 2 Foo.call do |lol
```

and

```
❯ 4 end # one
```

While it's obvious to you and me that they belong together, the search algorithm calls a `multiple_invalid_parents` and continues to explore both, which gives us the weird output:

```
+Unmatched `|', missing `|' ?
+Unmatched keyword, missing `end' ?
+
+  1 class Blerg
+❯ 2 Foo.call do |lol
+  4 end # one
+  6 class Foo
+  8 end # three
+Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?
+
+  1 class Blerg
+  2 Foo.call do |lol
+❯ 4 end # one
+  6 class Foo
+  7 end # two
+  8 end # three
```

Interestingly enough, if we combined these two it would be the perfect output, which makes me think it's not a different problem than the "fix it in post" ones I mentioned in the last commit, but rather might be a variation on them.

Sometimes we only match the `end` but not the source line. Other times we match the `end` AND the source line if they both have a syntax error on them.

A harder case is replacing the mismatched pipe with a different mismatched character, like:

```
class Blerg
  Foo.call }
    print haha
    print lol
  end # one
  print lol
  class Foo
  end # two
end # three
```

Gives us:

```
❯ 1 class Blerg
  9 end # three
Unmatched `}', missing `{' ?

  1 class Blerg
❯ 2 Foo.call }
  5 end # one
  7 class Foo
  9 end # three
Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?

  1 class Blerg
❯ 5 end # one
  7 class Foo
  8 end # two
  9 end # three
Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?

  1 class Blerg
❯ 9 end # three
```

So it's like double the problem from before. `class Blerg/end # three` belong together and `Foo.call }/end # one` belong together. Technically one of these, `class Blerg/end # three`, is not a good match.

We could do something naive that might work long term or try to get more clever. It seems like we could add another grouping round after the search round. We could try to intelligently pair nodes together and do fancy stuff like re-check parsability of the document. That would be the advanced path.

The "easy" path would be to shove all these groups together at the very end so that the output might look like this:

```
❯ 1 class Blerg
❯ 2 Foo.call }
❯ 5 end # one
❯ 9 end # three
```

Which is honestly pretty good. Not ideal, but good. The main downside is we would eventually need a way to split off ACTUAL multiple syntax errors and report on them. I don't remember at this point if that was a feature of the old algorithm or not. If not, then it's no regression so maybe it's fine to start there.

So that's good. Maybe the search algorithm is good enough and it's just up to touching up the post operations.

Here's the list of failures from last time:

### Needs investigation 0

```
  1 class Blerg
❯ 2 Foo.call do |a
  5 class Foo
  6 end # two
  7 end # three
Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?

  1 class Blerg
❯ 3 end # one
  5 class Foo
  6 end # two
  7 end # three
handles mismatched | (FAILED - 1)

Failures:

  1) Integration tests that don't spawn a process (like using the cli) handles mismatched |
     Failure/Error: raise("this should be one failure, not two")

     RuntimeError:
       this should be one failure, not two
     # ./spec/integration/dead_end_spec.rb:603:in `block (2 levels) in '
```

### Needs investigation 1

Expected:

```
  7 context "test" do
❯ 8 it "should" do
  9 end
```

Actual

```
+Unmatched keyword, missing `end' ?
+
+  1 context "foo bar" do
+❯ 7 context "test" do
+  9 end
# ./spec/integration/dead_end_spec.rb:418:in `block (2 levels) in '
```

### Needs investigation 2

Expected:

```
it "finds a naked end" do
  source = <<~'EOM'
    def foo
      end # one
    end # two
  EOM

  io = StringIO.new
  DeadEnd.call(
    io: io,
    source: source
  )

  expect(io.string).to include(<<~'EOM')
    ❯ end # one
  EOM
end
```

Actual:

```
+Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?
+
+  1 def foo
+❯ 3 end # two
```

## Fix it in post 1

```
+Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?
+
+  1 describe "things" do
+  2 it "blerg" do
+  3 end
+❯ 6 end
+  8 it "zlerg" do
+  9 end
+  10 end
# ./spec/integration/dead_end_spec.rb:227:in `block (2 levels) in '
```

Several of these repeated

## Fix it in post 2

Subtly different:

```
+Unmatched `end', missing keyword (`do', `def`, `if`, etc.) ?
+
+  2 RSpec.describe AclassNameHere, type: :worker do
+  3 describe "thing" do
+  13 end # line 16 accidental end, but valid block
+❯ 23 end # mismatched due to 16
+  24 end
# ./spec/integration/dead_end_spec.rb:481:in `block (2 levels) in '
```

Next time I want to look at "### Needs investigation 2" to see how it compares/contrasts to the other ones.
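
A minimal sketch of the "easy" path described above, assuming each reported failure can be reduced to the list of line numbers it highlights. The names `merge_invalid_groups` and `render_highlighted` are hypothetical and only for illustration; they are not part of dead_end, and the indentation of the sample source is a guess:

```ruby
# Hypothetical sketch, not dead_end's real API.
# Sample source from the "harder case" (indentation assumed).
SOURCE = <<~'EOM'
  class Blerg
    Foo.call }
      print haha
      print lol
    end # one
    print lol
    class Foo
    end # two
  end # three
EOM

# Each group is the list of 1-based line numbers one report highlighted.
# "Shoving the groups together" is a flatten, de-duplicate, and sort.
def merge_invalid_groups(groups)
  groups.flatten.uniq.sort
end

# Render only the highlighted lines, each prefixed with the marker.
def render_highlighted(source, line_numbers)
  lines = source.lines(chomp: true)
  line_numbers.map { |n| "❯ #{n} #{lines[n - 1].strip}" }.join("\n")
end

# The four separate reports above highlighted lines 1, 2, 5, and 9.
groups = [[1], [2], [5], [9]]
merged = merge_invalid_groups(groups)
puts render_highlighted(SOURCE, merged)
```

This only covers the merge step. It does not try to split off actual multiple syntax errors, which is the downside already noted above.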
--- spec/integration/dead_end_spec.rb | 27 +++++++++++++++++++++++++-- 1 file changed, 25 insertions(+), 2 deletions(-) diff --git a/spec/integration/dead_end_spec.rb b/spec/integration/dead_end_spec.rb index c183529..03343cf 100644 --- a/spec/integration/dead_end_spec.rb +++ b/spec/integration/dead_end_spec.rb @@ -562,12 +562,36 @@ def foo EOM end + it "is harder" do + source = <<~EOM + class Blerg + Foo.call } + print haha + print lol + end # one + print lol + class Foo + end # two + end # three + EOM + + io = StringIO.new + DeadEnd.call( + io: io, + source: source + ) + + puts io.string + raise "not implemented" + end + it "handles mismatched |" do source = <<~EOM class Blerg Foo.call do |a + print lol end # one - puts lol + print lol class Foo end # two end # three @@ -599,6 +623,5 @@ class Foo raise("this should be one failure, not two") end - end end