Fix when 2 function have the same name, run in parallel#135
Conversation
WalkthroughThe pull request introduces a series of code improvements across multiple files in the project. The changes primarily focus on reorganizing import statements, adding type hints, and enhancing code structure. Key modifications include updating function signatures in Changes
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
|
Report bugs in Issues The following are automatically added:
Available user actions:
Supported /retest check runs
Supported labels
|
for more information, see https://pre-commit.ci
|
/verified |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
apps/unused_code/unused_code.py (1)
145-145: Commented-out invocation.Remove or uncomment if needed to serve a purpose. This reduces clutter.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
poetry.lockis excluded by!**/*.lock
📒 Files selected for processing (11)
apps/jira_utils/jira_information.py(1 hunks)apps/polarion/polarion_set_automated.py(1 hunks)apps/polarion/polarion_utils.py(3 hunks)apps/polarion/polarion_verify_tc_requirements.py(1 hunks)apps/unused_code/unused_code.py(3 hunks)apps/utils.py(1 hunks)pyproject.toml(2 hunks)tests/jira_utils/test_jira_utils.py(1 hunks)tests/polarion/test_polarion_automated.py(1 hunks)tests/polarion/test_verify_polarion_requirements.py(1 hunks)tests/unused_code/test_unused_code.py(1 hunks)
✅ Files skipped from review due to trivial changes (3)
- apps/polarion/polarion_verify_tc_requirements.py
- apps/utils.py
- apps/polarion/polarion_set_automated.py
🔇 Additional comments (28)
apps/unused_code/unused_code.py (16)
6-7: Concurrent processing import looks good.
Leveraging ThreadPoolExecutor and as_completed is a clean approach to run file checks in parallel.
12-12: Importing helpers is appropriate.
Bringing in ListParamType, all_python_files, and get_util_config centralizes utility functions.
59-62: Add error handling for file-opening failures.
A try-except block would help gracefully handle I/O errors or permissions issues.
63-69: Ignore list checks appear correct.
Skipping functions early is efficient, especially when ignoring specific function prefixes and pytest fixtures.
74-76: Splitting grep output is straightforward.
The logic to parse file line references looks fine.
77-79: Properly skipping function definition lines.
Smart approach to ignore the line where the function is declared.
80-82: Ignoring commented references is sensible.
This avoids falsely counting commented-out usage lines.
83-85: Detecting usage by line content.
Conditionally setting used is correct.
87-88: Clear unused function result.
Returning the specific location of unused code is highly informative.
90-92: Returning an empty string signals all functions are used.
Straightforward design to indicate success.
110-110: Added verbose click option is valuable.
It neatly toggles debug logs.
112-113: Refined function signature with precise types.
Specifying parameter and return types helps maintain clarity.
121-121: Explicitly typed jobs list.
Using jobs: list[Future] is more readable and helps type checkers.
123-132: Efficient concurrent file processing.
Launching parallel tasks speeds up scanning large codebases.
134-135: Awaiting futures with as_completed is well-structured.
It cleanly collects results from background tasks.
137-139: Exit if any unused functions are found.
Early termination is appropriate for a code-linting-like script.
tests/unused_code/test_unused_code.py (1)
2-2: Importing the function for testing.
Great to see direct testing of get_unused_functions. This improves overall coverage.
tests/jira_utils/test_jira_utils.py (1)
1-10: Reorganized imports and newly added function reference.
These additions tidy up the file and provide the necessary resources for testing JIRA ID extraction.
apps/jira_utils/jira_information.py (1)
1-14: Improved import ordering and type hints.
The code is more maintainable with explicit type usage and well-organized imports.
pyproject.toml (2)
16-16: Relaxing generics rule.
Switching disallow_any_generics to false can reduce false positives, but be sure to keep consistent type usage throughout the project.
26-26: Added extend-select rule for ruff.
Enabling additional import checks or style checks helps maintain consistent coding standards.
tests/polarion/test_polarion_automated.py (1)
3-3: Looks good to me.
Importing run_command is a clean way to consolidate command execution logic. It's clear and straightforward, with no apparent issues.
tests/polarion/test_verify_polarion_requirements.py (1)
3-3: Good import usage.
Bringing run_command into this file provides a consistent approach to shell command execution and error handling, aligning well with the test structure.
apps/polarion/polarion_utils.py (5)
2-2: Use of postponed evaluation.
The from __future__ import annotations statement can improve type-hint performance and avoid circular import issues in Python 3.7+. This is a good practice.
6-8: Additional imports for system access and type hints.
The inclusion of sys and typing modules is justified because you are calling sys.exit() and declaring typed collections. These changes clarify code intent and improve maintainability.
9-9: Logger import usage.
Using a custom logger is beneficial for debugging and consistent logging. No issues found with this line.
42-42: Imports within the function scope.
Importing Requirement and TestCase inside the function is valid to reduce overhead when these classes are not needed elsewhere. Make sure this style is consistent with your project's guidelines.
86-86: Improving code clarity with local imports.
Again, localizing the import for TestCase helps only load it when required. This is acceptable if the project guidelines allow it and helps reduce overall load times in modules that don’t use these classes.
…utility-scripts into multiple-func-same-name
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
apps/unused_code/unused_code.py (2)
54-91: Consider refactoring for better separation of concerns.The function handles multiple responsibilities: file reading, AST parsing, function filtering, and git operations. Consider splitting it into smaller, focused functions.
Example refactor:
def is_function_used(func_name: str) -> bool: try: _func_grep_found = subprocess.check_output(["git", "grep", "-w", func_name], shell=False) for entry in _func_grep_found.decode().splitlines(): _, _line = entry.split(":", 1) if f"def {func_name}" in _line or _line.strip().startswith("#"): continue if func_name in _line: return True return False except subprocess.CalledProcessError as e: if e.returncode == 1: # No matches found return False LOGGER.error(f"Error running git grep: {e}") return True # Assume used on error to avoid false positives def process_file(py_file: str, func_ignore_prefix: List[str], file_ignore_list: List[str]) -> List[str]: """Returns a list of unused function descriptions.""" if os.path.basename(py_file) in file_ignore_list: LOGGER.debug(f"Skipping file: {py_file}") return [] with open(py_file) as fd: tree = ast.parse(source=fd.read()) unused_functions = [] for func in _iter_functions(tree=tree): if (func_ignore_prefix and is_ignore_function_list(ignore_prefix_list=func_ignore_prefix, function=func)): LOGGER.debug(f"Skipping function: {func.name}") continue if is_fixture_autouse(func=func): LOGGER.debug(f"Skipping fixture function: {func.name}") continue if not is_function_used(func.name): unused_functions.append( f"{os.path.relpath(py_file)}:{func.name}:{func.lineno}:{func.col_offset}" " Is not used anywhere in the code." ) return unused_functions
144-145: Remove redundant commented code.The commented call to
get_unused_functions()is redundant as it's already called in theif __name__ == "__main__":block.if __name__ == "__main__": get_unused_functions() - - # get_unused_functions()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/unused_code/unused_code.py(3 hunks)
🔇 Additional comments (2)
apps/unused_code/unused_code.py (2)
6-7: LGTM: Import statements are well-organized.
The new imports support concurrent processing and type hints, following Python's import style guidelines.
110-110: LGTM: CLI interface improvements.
Good improvements:
- Added verbose flag with convenient short option (-v)
- Enhanced type hints for better type safety
Also applies to: 112-113
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (3)
apps/unused_code/unused_code.py (3)
Line range hint
45-54: Add return type annotation for consistency.The function has input type hints but is missing the return type annotation.
-def is_ignore_function_list(ignore_prefix_list: list[str], function: ast.FunctionDef) -> bool: +def is_ignore_function_list(ignore_prefix_list: list[str], function: ast.FunctionDef) -> bool:
123-140: Enhance concurrent execution with progress reporting and error handling.The concurrent implementation could benefit from several improvements:
- Progress reporting for better user experience
- Error handling for failed jobs
- Memory-efficient result processing
jobs: list[Future] = [] + total_files = len(list(all_python_files())) + processed_files = [] - with ThreadPoolExecutor() as executor: + with ThreadPoolExecutor() as executor, \ + click.progressbar(length=total_files, label='Scanning files') as bar: for py_file in all_python_files(): jobs.append( executor.submit( process_file, py_file=py_file, func_ignore_prefix=func_ignore_prefix, file_ignore_list=file_ignore_list, ) ) for result in as_completed(jobs): - _unused_functions.append(result.result()) + try: + if unused_func := result.result(): + processed_files.append(unused_func) + except Exception as e: + LOGGER.error(f"Failed to process file: {e}") + bar.update(1) - if unused_functions := [_unused_func for _unused_func in _unused_functions if _unused_func]: - click.echo("\n".join(unused_functions)) + if processed_files: + click.echo("\n".join(processed_files)) sys.exit(1)
147-147: Remove commented out code.The commented out function call adds no value and should be removed.
- # get_unused_functions()
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
apps/jira_utils/jira_information.py(8 hunks)apps/polarion/polarion_set_automated.py(1 hunks)apps/polarion/polarion_utils.py(5 hunks)apps/unused_code/unused_code.py(4 hunks)pyproject.toml(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- pyproject.toml
- apps/polarion/polarion_utils.py
- apps/polarion/polarion_set_automated.py
- apps/jira_utils/jira_information.py
🔇 Additional comments (1)
apps/unused_code/unused_code.py (1)
1-14: LGTM! Well-organized imports.
The imports are properly structured and include all necessary modules for the concurrent implementation.
|
/verified |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
apps/unused_code/unused_code.py (3)
45-54: Improve function name and simplify logic.The function can be simplified and made more readable:
-def is_ignore_function_list(ignore_prefix_list: list[str], function: ast.FunctionDef) -> bool: - ignore_function_lists = [ - function.name for ignore_prefix in ignore_prefix_list if function.name.startswith(ignore_prefix) - ] - if ignore_function_lists: - return True - - return False +def should_ignore_function(ignore_prefix_list: list[str], function: ast.FunctionDef) -> bool: + return any(function.name.startswith(prefix) for prefix in ignore_prefix_list)
78-86: Improve function usage detection accuracy.The current implementation might have false positives in several cases:
- When the function name appears in a string literal
- When the function name is part of a larger identifier
Consider adding more precise checks:
if f"def {func.name}" in _line: continue if _line.strip().startswith("#"): continue - if func.name in _line: + # Check for common false positive patterns + line_parts = _line.split() + for part in line_parts: + if part == func.name or part.startswith(f"{func.name}("): + used = True + break - used = True - break
122-136: Add thread pool size control and progress reporting.For better resource management and user experience:
+ max_workers = min(32, os.cpu_count() + 4) # Limit concurrent threads jobs: list[Future] = [] if not os.path.exists(".git"): LOGGER.error("Must be run from a git repository") sys.exit(1) - with ThreadPoolExecutor() as executor: + with ThreadPoolExecutor(max_workers=max_workers) as executor: + total_files = len(list(all_python_files())) + with click.progressbar(length=total_files, label='Scanning files') as bar: for py_file in all_python_files(): jobs.append( executor.submit( process_file, py_file=py_file, func_ignore_prefix=func_ignore_prefix, file_ignore_list=file_ignore_list, ) ) + bar.update(1)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/unused_code/unused_code.py(3 hunks)
🧰 Additional context used
📓 Learnings (1)
apps/unused_code/unused_code.py (1)
Learnt from: myakove
PR: RedHatQE/python-utility-scripts#135
File: apps/unused_code/unused_code.py:0-0
Timestamp: 2024-12-29T11:59:22.364Z
Learning: When a function name retrieved from the AST is not found by 'git grep', the user wants a RuntimeError raised, instead of silently returning or skipping.
🔇 Additional comments (3)
apps/unused_code/unused_code.py (3)
1-14: LGTM! Well-organized imports and type hints.
The imports are properly organized and the necessary modules for concurrent execution are correctly imported.
73-74:
Add error handling for git grep command.
The subprocess.check_output call should handle potential errors:
- _func_grep_found = subprocess.check_output(["git", "grep", "-w", func.name], shell=False)
+ try:
+ _func_grep_found = subprocess.check_output(["git", "grep", "-w", func.name], shell=False)
+ except subprocess.CalledProcessError as e:
+ if e.returncode == 1: # No matches found
+ return f"{os.path.relpath(py_file)}:{func.name}:{func.lineno}:{func.col_offset} Is not used anywhere in the code."
+ raise RuntimeError(f"Error running git grep for function {func.name}: {e}")Likely invalid or redundant comment.
138-140:
Add error handling for thread results.
The current implementation doesn't handle exceptions from threads:
for result in as_completed(jobs):
- if unused_func := result.result():
- unused_functions.append(unused_func)
+ try:
+ if unused_func := result.result():
+ unused_functions.append(unused_func)
+ except Exception as e:
+ LOGGER.error(f"Error processing file: {e}")
+ sys.exit(1)Likely invalid or redundant comment.
|
/lgtm |
|
/verified |
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
apps/unused_code/unused_code.py (3)
45-53: Add docstring and improve type hints.The function logic is correct, but could benefit from documentation and more specific type hints.
-def is_ignore_function_list(ignore_prefix_list: list[str], function: ast.FunctionDef) -> bool: +def is_ignore_function_list( + ignore_prefix_list: list[str], + function: ast.FunctionDef, +) -> bool: + """ + Check if a function should be ignored based on its name prefix. + + Args: + ignore_prefix_list: List of prefixes to check against + function: AST function definition node + + Returns: + bool: True if function should be ignored, False otherwise + """ ignore_function_lists = [ function.name for ignore_prefix in ignore_prefix_list if function.name.startswith(ignore_prefix) ] if ignore_function_lists: return True return False
122-140: Add progress reporting and error aggregation.Consider adding progress reporting and better error handling for parallel execution.
jobs: list[Future] = [] +errors: list[str] = [] +total_files = len(list(all_python_files())) + if not os.path.exists(".git"): LOGGER.error("Must be run from a git repository") sys.exit(1) -with ThreadPoolExecutor() as executor: +with ThreadPoolExecutor() as executor, click.progressbar( + length=total_files, + label="Processing files" +) as bar: for py_file in all_python_files(): jobs.append( executor.submit( process_file, py_file=py_file, func_ignore_prefix=func_ignore_prefix, file_ignore_list=file_ignore_list, ) ) for result in as_completed(jobs): - if unused_func := result.result(): - unused_functions.append(unused_func) + try: + if unused_func := result.result(): + unused_functions.append(unused_func) + except Exception as e: + errors.append(str(e)) + bar.update(1) + +if errors: + LOGGER.error("Errors occurred during processing:") + for error in errors: + LOGGER.error(error) + sys.exit(1)
1-2: Add module-level documentation.Consider adding module-level documentation to explain the purpose and usage of this script.
from __future__ import annotations +""" +Detect unused functions in Python files. + +This module provides functionality to scan Python files in a git repository +and identify functions that are not used anywhere in the codebase. It supports +concurrent processing for better performance and provides options to exclude +specific files and function prefixes from the analysis. + +Usage: + python -m apps.unused_code.unused_code --help +""" +
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
apps/unused_code/unused_code.py(3 hunks)
🧰 Additional context used
📓 Learnings (1)
apps/unused_code/unused_code.py (1)
Learnt from: myakove
PR: RedHatQE/python-utility-scripts#135
File: apps/unused_code/unused_code.py:0-0
Timestamp: 2024-12-29T11:59:22.364Z
Learning: When a function name retrieved from the AST is not found by 'git grep', the user wants a RuntimeError raised, instead of silently returning or skipping.
🔇 Additional comments (5)
apps/unused_code/unused_code.py (5)
1-14: LGTM! Well-organized imports.
The imports are properly structured and include all necessary modules for concurrent processing and type hints.
Line range hint 93-111: LGTM! Well-structured command options.
The click command is well-organized with appropriate options and help text.
75-77:
Add validation for string splitting.
The current implementation assumes the git grep output always contains a colon separator.
-_, _line = entry.split(":", 1)
+try:
+ _, _line = entry.split(":", 1)
+except ValueError:
+ LOGGER.error(f"Unexpected git grep output format: {entry}")
+ continueLikely invalid or redundant comment.
60-61:
Add error handling for file operations.
Based on previous feedback, we should fail immediately when file operations fail.
-with open(py_file) as fd:
- tree = ast.parse(source=fd.read())
+try:
+ with open(py_file) as fd:
+ tree = ast.parse(source=fd.read())
+except (IOError, SyntaxError) as e:
+ LOGGER.error(f"Failed to parse {py_file}: {e}")
+ sys.exit(1)Likely invalid or redundant comment.
73-73:
Add error handling for subprocess call.
Based on learnings, we should raise a RuntimeError when git grep doesn't find the function.
-_func_grep_found = subprocess.check_output(["git", "grep", "-w", func.name], shell=False)
+try:
+ _func_grep_found = subprocess.check_output(["git", "grep", "-w", func.name], shell=False)
+except subprocess.CalledProcessError as e:
+ if e.returncode == 1: # No matches found
+ raise RuntimeError(
+ f"Function '{func.name}' is declared in {os.path.relpath(py_file)} "
+ f"(line {func.lineno}, col {func.col_offset}) but was not found by git grep."
+ )
+ LOGGER.error(f"Error running git grep: {e}")
+ sys.exit(1)Likely invalid or redundant comment.
If we have two files with functions with the same name, we consider them as used and shouldn't.
Run in parallel to reduce run time.
Summary by CodeRabbit
Refactor
New Features
Configuration
pyproject.tomlsettings for type checking and linting.Testing