
fix: type/method name detection across C, C++, Go, Ruby #10

Open
vnz wants to merge 1 commit into CSCSoftware:master from vnz:fix/extractor-name-detection

@vnz vnz commented Apr 27, 2026

Summary

Fixes pre-existing extractor gaps where method/type names were nested below the direct children of their parent node, causing them to be silently dropped from the index. Discovered during cross-language smoke testing while working on PR #9.

What was missing before

I created a small smoke-test corpus (one Widget class and one greet method per language) and indexed it. Results on master:

| Language | Test file | Extracted |
|---|---|---|
| C | `void c_widget_greet(...)` | ❌ no method (name nested in `function_declarator`) |
| C++ | `class CppWidget { greet() { ... } }` | ❌ no method |
| Go | `func (w *Widget) Greet()` | ❌ no method (`field_identifier` not recognized) |
| Ruby | `class RbWidget` | ❌ no class (`constant` not recognized) |

After the fix, all of these are extracted correctly.

Approach

C/C++: tree-sitter declarator field + recursive walker

Rather than enumerate every position the name might occupy (fragile, because wrappers combine in many ways), use node.childForFieldName('declarator') to ask tree-sitter directly for the declarator, then recursively walk any chain of declarator wrappers down to the name leaf.

Wrapper types covered: function_declarator, pointer_declarator, reference_declarator, parenthesized_declarator, array_declarator, attributed_declarator.

Leaf types covered: identifier, field_identifier, qualified_identifier, destructor_name, operator_name, operator_cast, template_function.

Using the declarator field instead of iterating over all children is the key insight: it cleanly distinguishes the actual function name from confusable nodes such as a qualified return type (std::string foo() would otherwise be indexed as std::string).
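The walk described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in src/parser/extractor.ts: `DeclNode`, `findDeclaratorName`, `extractFunctionName`, and `mk` are hypothetical names, the node is modeled by a minimal interface instead of the real tree-sitter type, and the fallback to named children is an assumption for wrappers that may not expose a `declarator` field.

```typescript
// Minimal structural view of a tree-sitter node: only the members used below.
interface DeclNode {
  type: string;
  text: string;
  namedChildren: DeclNode[];
  childForFieldName(name: string): DeclNode | null;
}

// Wrapper and leaf node types as listed in this PR description.
const WRAPPER_TYPES = new Set([
  'function_declarator', 'pointer_declarator', 'reference_declarator',
  'parenthesized_declarator', 'array_declarator', 'attributed_declarator',
]);
const LEAF_TYPES = new Set([
  'identifier', 'field_identifier', 'qualified_identifier', 'destructor_name',
  'operator_name', 'operator_cast', 'template_function',
]);

// Hypothetical helper: unwrap declarator wrappers until a name leaf is reached.
function findDeclaratorName(node: DeclNode | null): string | null {
  if (node === null) return null;
  if (LEAF_TYPES.has(node.type)) return node.text;
  if (!WRAPPER_TYPES.has(node.type)) return null;
  // Prefer the grammar's own `declarator` field when the wrapper has one.
  const inner = node.childForFieldName('declarator');
  if (inner !== null) return findDeclaratorName(inner);
  // Assumption: otherwise scan named children for a nested wrapper or leaf.
  for (const child of node.namedChildren) {
    const found = findDeclaratorName(child);
    if (found !== null) return found;
  }
  return null;
}

// Hypothetical entry point: start from the definition's `declarator` field,
// so the return type (a sibling field) is never considered.
function extractFunctionName(definition: DeclNode): string | null {
  return findDeclaratorName(definition.childForFieldName('declarator'));
}

// Hand-built stand-in nodes for illustration; real nodes come from the parser.
function mk(
  type: string,
  text: string,
  fields: Record<string, DeclNode> = {},
  extra: DeclNode[] = [],
): DeclNode {
  return {
    type,
    text,
    namedChildren: [...Object.values(fields), ...extra],
    childForFieldName: (name) => fields[name] ?? null,
  };
}

// Rough shape of `std::string get_name() {}`: the qualified return type sits
// in a different field from the declarator.
const getNameDef = mk('function_definition', 'std::string get_name() {}', {
  type: mk('qualified_identifier', 'std::string'),
  declarator: mk('function_declarator', 'get_name()', {
    declarator: mk('identifier', 'get_name'),
  }),
});
console.log(extractFunctionName(getNameDef)); // get_name
```

Starting from the field rather than from the child list is what keeps `std::string` out of the result here: the return type is never visited at all.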

C++: removed template_function from CPP_METHOD_NODES

The template_function entry was a workaround for the previous name-extraction gap. Now that the field-aware lookup catches template specializations via their enclosing function_definition, listing template_function separately would produce duplicate entries.
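A rough illustration of the duplication, assuming the extractor collects every node whose type is in the method-node set. The `SynNode` shape, `collectMethodNodes`, and the contents of both sets are hypothetical stand-ins; the real CPP_METHOD_NODES may contain other entries.

```typescript
// Minimal stand-in for a syntax node; not the real tree-sitter type.
interface SynNode {
  type: string;
  text: string;
  namedChildren: SynNode[];
}

// Hypothetical collector: every node whose type is in `methodTypes` becomes
// one index entry.
function collectMethodNodes(root: SynNode, methodTypes: Set<string>): SynNode[] {
  const found: SynNode[] = [];
  const visit = (n: SynNode): void => {
    if (methodTypes.has(n.type)) found.push(n);
    n.namedChildren.forEach(visit);
  };
  visit(root);
  return found;
}

const syn = (type: string, text: string, namedChildren: SynNode[] = []): SynNode =>
  ({ type, text, namedChildren });

// Rough shape of `template<> void A::foo<int>() {}`: the template_function
// node is nested inside the function_definition's declarator chain.
const specialization = syn('template_declaration', 'template<> void A::foo<int>() {}', [
  syn('function_definition', 'void A::foo<int>() {}', [
    syn('function_declarator', 'A::foo<int>()', [
      syn('qualified_identifier', 'A::foo<int>', [
        syn('template_function', 'foo<int>'),
      ]),
    ]),
  ]),
]);

// Assumed set contents, before and after this change.
const withTemplateFunction = new Set(['function_definition', 'template_function']);
const withoutTemplateFunction = new Set(['function_definition']);

console.log(collectMethodNodes(specialization, withTemplateFunction).length);    // 2
console.log(collectMethodNodes(specialization, withoutTemplateFunction).length); // 1
```

With both types listed, the same specialization is matched once through its function_definition and once through the nested template_function, which is the duplicate the removal avoids.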

Go and Ruby: small node-type additions

  • Add field_identifier to method-name candidates (Go method names on receivers)
  • Add constant and scope_resolution to type-name candidates (Ruby class Foo, class Foo::Bar)
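As a sketch of what these additions amount to, assuming names are picked from a parent node's direct named children by node type. `NamedNode`, `METHOD_NAME_TYPES`, `TYPE_NAME_TYPES`, and `firstNameOfType` are hypothetical names, and the set contents beyond the node types named in this PR are assumptions.

```typescript
// Minimal stand-in for a syntax node; not the real tree-sitter type.
interface NamedNode {
  type: string;
  text: string;
  namedChildren: NamedNode[];
}

// Hypothetical candidate sets; the extractor's real lists may differ.
const METHOD_NAME_TYPES = new Set(['identifier', 'field_identifier']);
const TYPE_NAME_TYPES = new Set([
  'identifier', 'type_identifier', 'constant', 'scope_resolution',
]);

// Return the text of the first direct named child whose type is a candidate.
function firstNameOfType(node: NamedNode, candidates: Set<string>): string | null {
  for (const child of node.namedChildren) {
    if (candidates.has(child.type)) return child.text;
  }
  return null;
}

const leaf = (type: string, text: string): NamedNode =>
  ({ type, text, namedChildren: [] });

// Rough shape of `func (w *Widget) Greet() {}`: the name is a field_identifier.
const goMethod: NamedNode = {
  type: 'method_declaration',
  text: 'func (w *Widget) Greet() {}',
  namedChildren: [
    leaf('parameter_list', '(w *Widget)'),
    leaf('field_identifier', 'Greet'),
    leaf('parameter_list', '()'),
    leaf('block', '{}'),
  ],
};

// Rough shape of `class Foo::Bar; end`: the name is a scope_resolution.
const rubyClass: NamedNode = {
  type: 'class',
  text: 'class Foo::Bar; end',
  namedChildren: [leaf('scope_resolution', 'Foo::Bar')],
};

console.log(firstNameOfType(goMethod, METHOD_NAME_TYPES)); // Greet
console.log(firstNameOfType(rubyClass, TYPE_NAME_TYPES));  // Foo::Bar
```

Without `field_identifier` or `scope_resolution` in the sets, both lookups above would return null, which matches the silent drops reported on master.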

Test cases verified

Beyond the basic per-language widget test, verified handling for:

| Case | Result |
|---|---|
| C function `c_widget_greet` | `c_widget_greet` |
| C++ method `CppWidget::greet` | `greet` |
| C++ pointer return `int* A::getPtr()` | `A::getPtr` |
| C++ reference return + qualified + operator `A& A::operator=()` | `A::operator=` |
| C++ array_declarator `int (*make_table())[10]` | `make_table` |
| C++ conversion operator `Foo::operator bool() const` | `Foo::operator bool() const` |
| C++ qualified return type `std::string get_name()` | `get_name` (not `std::string`) |
| C++ templated qualified return `const std::vector<int>& Foo::values()` | `Foo::values` |
| C++ qualified template specialization `template<> void A::foo<int>()` | `A::foo<int>` (no duplicate) |
| C++ unqualified template specialization `template<> void freestanding<int>()` | `freestanding<int>` |
| C++ plain template `template<typename T> T identity(T x)` | `identity` |
| C++ pre-attribute `[[nodiscard]] int compute()` | `compute` |
| C++ post-attribute `int compute2() [[nodiscard]]` | `compute2` |
| Go method `func (w *Widget) Greet()` | `Greet` |
| Ruby class `class RbWidget` | `RbWidget` |
| Ruby namespaced class `class Foo::Bar` | `Foo::Bar` |
| Ruby namespaced module `module A::B` | `A::B` |

Test plan

  • Build passes on Node 22
  • Index a real C/C++ project (e.g., a small game engine or std-lib subset) — verify methods now show up in aidex_signature
  • Index a real Go project — verify receiver methods are searchable
  • Index a Ruby project (e.g., a Rails app) — verify class names with Module::Class form are searchable

Notes

  • No new files. All changes in src/parser/extractor.ts and src/parser/languages/cpp.ts (~78 lines).
  • No new dependencies.
  • Both CodeRabbit and Codex reviewed clean after iteration.

🤖 Generated with Claude Code

Pre-existing extractor gaps where names were nested below the direct
children of method/type nodes — surfaced by cross-language smoke testing
during the HCL PR work (CSCSoftware#9).

- **C**: Recurse into `function_declarator` (and wrappers) to find the
  function name; previously every C function definition was silently
  dropped from the methods table.
- **C++**: Use tree-sitter's `declarator` field instead of pattern-matching
  direct children — avoids confusing the return type for the function name
  (`std::string foo()` was indexing as `std::string`). Walks through
  `pointer_declarator`, `reference_declarator`, `array_declarator`,
  `parenthesized_declarator`, and `attributed_declarator` wrappers, and
  handles `qualified_identifier`, `operator_cast`, `destructor_name`,
  `operator_name`, and `template_function` leaves.
- **C++**: Drop `template_function` from CPP_METHOD_NODES — was a
  workaround for the name-extraction gap and produced duplicate entries
  for template specializations like `template<> void A::foo<int>()`.
- **Go**: Accept `field_identifier` as a method name (used for receiver
  methods like `func (w *Widget) Greet()`).
- **Ruby**: Accept `constant` and `scope_resolution` as type names —
  classes are `class` nodes whose name is a `constant`, and namespaced
  classes (`class Foo::Bar`) use `scope_resolution`.

Verified with smoke tests across all 12 supported languages plus 4
synthetic C++ edge-case files (qualified return types, conversion
operators, attributes, templates).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>