-
Notifications
You must be signed in to change notification settings - Fork 0
feat: Add AST-based code chunking support with new splitter and confi… #23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Comment |
…ting in repository. Fix issue with non-resolving issues duplicates ( branch reconciliation )
…ed issue reconciliation
…status handling for previous issues
…ure correct handling of duplicate issues
… tokens and enhance connection handling
… issue deduplication; enhance prompt builder with detailed instructions
|
/codecrow analyze |
|
Comment commands are not enabled for this project Check the job logs in CodeCrow for detailed error information. |
|
| Severity | Count | |
|---|---|---|
| 🔴 High | 1 | Critical issues requiring immediate attention |
| 🟡 Medium | 2 | Issues that should be addressed |
| 🔵 Low | 1 | Minor issues and improvements |
Detailed Issues
🔴 High Severity Issues
Id on Platform: 666
Category: 🔒 Security
File: .../vcsclient/VcsClientProvider.java:307
Issue: In refreshBitbucketConnectAppConnection, if decryption fails, the code falls back to using the shared secret 'as-is'. This bypasses security controls and encourages storing secrets in plaintext. Secrets should always be encrypted at rest.
💡 Suggested Fix
Remove the fallback to plaintext. If the secret cannot be decrypted, it should be considered an invalid configuration and throw an exception.
--- a/java-ecosystem/libs/vcs-client/src/main/java/org/rostilos/codecrow/vcsclient/VcsClientProvider.java
+++ b/java-ecosystem/libs/vcs-client/src/main/java/org/rostilos/codecrow/vcsclient/VcsClientProvider.java
@@ -301,8 +301,7 @@
try {
decryptedSecret = encryptionService.decrypt(sharedSecret);
} catch (Exception e) {
- // Shared secret might be stored unencrypted (old installation) - use as-is
- log.warn("Could not decrypt shared secret, using as stored: {}", e.getMessage());
- decryptedSecret = sharedSecret;
+ log.error("Failed to decrypt shared secret for client key: {}", installation.getClientKey());
+ throw new VcsClientException("Invalid or corrupted shared secret encryption", e);
}🟡 Medium Severity Issues
Id on Platform: 667
Category: ⚡ Performance
File: .../core/index_manager.py:331
Issue: The indexing process pre-computes embeddings for the entire batch of 100 chunks in memory before insertion. For very large documents or high-dimensional embeddings, this could lead to memory pressure or OOM (Out Of Memory) errors, especially since GC is only called every 10 batches.
💡 Suggested Fix
Reduce the embedding batch size or implement more aggressive cleanup of the text and embedding buffers.
--- a/python-ecosystem/rag-pipeline/src/rag_pipeline/core/index_manager.py
+++ b/python-ecosystem/rag-pipeline/src/rag_pipeline/core/index_manager.py
@@ -328,1 +328,1 @@
- embedding_batch_size = 100 # Batch size for embedding API calls
+ embedding_batch_size = 32 # Smaller batch size to reduce memory pressureId on Platform: 668
Category: 🐛 Bug Risk
File: .../analysis/BranchAnalysisProcessor.java:476
Issue: The reconciliation logic was changed to look for 'id' instead of 'issueId'. While this matches the new Pydantic model, several parts of the system (like incremental_review_prompt) still refer to 'issueId', which will cause reconciliation to fail for those components.
💡 Suggested Fix
The AI response should ideally provide both or the backend should check both keys to maintain backward compatibility during the migration to the new 'id' field.
--- a/java-ecosystem/libs/analysis-engine/src/main/java/org/rostilos/codecrow/analysisengine/processor/analysis/BranchAnalysisProcessor.java
+++ b/java-ecosystem/libs/analysis-engine/src/main/java/org/rostilos/codecrow/analysisengine/processor/analysis/BranchAnalysisProcessor.java
@@ -475,1 +475,4 @@
- Object issueIdFromAi = issueData.get("id");
+ Object issueIdFromAi = issueData.get("id");
+ if (issueIdFromAi == null) {
+ issueIdFromAi = issueData.get("issueId");
+ }🔵 Low Severity Issues
Id on Platform: 669
Category: ✨ Best Practices
File: .../service/issue_post_processor.py:450
Issue: The IssuePostProcessor uses a custom SequenceMatcher ratio for similarity. While effective, it doesn't account for semantic meaning of code. Additionally, the line drift correction assumes the 'reason' contains enough keywords, which might not be true for style-only issues.
💡 Suggested Fix
Add a check to ensure at least one high-confidence keyword is found before attempting line correction to avoid 'false correction' to a random similar-looking line.
--- a/python-ecosystem/mcp-client/service/issue_post_processor.py
+++ b/python-ecosystem/mcp-client/service/issue_post_processor.py
@@ -242,1 +242,1 @@
- return best_match_line if best_match_score > 0 else None
+ return best_match_line if best_match_score > 1.5 else NoneFiles Affected
- .../core/index_manager.py: 1 issue
- .../service/issue_post_processor.py: 1 issue
- .../vcsclient/VcsClientProvider.java: 1 issue
- .../analysis/BranchAnalysisProcessor.java: 1 issue
Analysis completed on 2025-12-31 11:34:40 | View Full Report | Pull Request
…guration options