Configurable prompts + reveal chat context #35

Merged: swilly22 merged 16 commits into main from conf-prompts on Nov 24, 2024

Conversation

@galshubeli (Collaborator) commented Nov 14, 2024

Summary by CodeRabbit

  • New Features

    • Enhanced KnowledgeGraph class to support custom instructions and prompts for Cypher and QA sessions.
    • Improved ChatSession to return structured responses, including context and question.
    • Added support for customizable prompts in various steps, improving interaction flexibility.
    • Introduced a new entry "Cypher" to the wordlist for enhanced recognition.
    • New structured representation of entities and relationships in ontology.json for a combat sports context.
  • Bug Fixes

    • Improved logic for handling prompts in GraphQueryGenerationStep and QAStep to streamline prompt generation.
  • Documentation

    • Enhanced the README with clearer guidance on configurable prompts and usage examples, including new subsections for better organization.
    • Expanded instructions for ontology creation and data extraction in the documentation.
    • Updated Jupyter notebooks to reflect new execution counts and structured response formats.

coderabbitai Bot commented Nov 14, 2024

Walkthrough

The changes involve modifications to several classes within the graphrag_sdk module to enhance the configurability of chat sessions and query steps. The ChatSession class constructor now accepts new optional parameters for custom instructions and prompts. The KnowledgeGraph class's chat_session method has been updated to pass these parameters when creating a ChatSession object. Additionally, the GraphQueryGenerationStep and QAStep classes received updates to incorporate optional prompt parameters, simplifying their internal logic for handling prompts.

Changes

File | Change Summary
graphrag_sdk/chat_session.py | Updated ChatSession constructor to include cypher_system_instruction, qa_system_instruction, cypher_gen_prompt, qa_prompt, and cypher_gen_prompt_history. Modified initialization logic and adjusted send_message.
graphrag_sdk/kg.py | Modified KnowledgeGraph class's __init__ and chat_session methods to accept cypher_system_instruction, qa_system_instruction, cypher_gen_prompt, and qa_prompt as parameters when creating a ChatSession.
graphrag_sdk/steps/graph_query_step.py | Added cypher_prompt and cypher_prompt_with_history parameters to GraphQueryGenerationStep constructor. Updated logic in run method for determining cypher_prompt based on provided values.
graphrag_sdk/steps/qa_step.py | Introduced qa_prompt parameter in QAStep constructor. Updated run method to utilize qa_prompt for formatting the QA prompt, falling back to GRAPH_QA_PROMPT if not provided.
graphrag_sdk/fixtures/prompts.py | Added multiple constants related to ontology creation and data extraction, including CREATE_ONTOLOGY_SYSTEM, CREATE_ONTOLOGY_PROMPT, UPDATE_ONTOLOGY_PROMPT, and others for structured prompt management.
README.md | Updated documentation to clarify the usage of the Knowledge Graph Agent, added sections on configurable prompts, and refined examples to illustrate the new parameters and their usage.
.wordlist.txt | New entry added: Cypher to the wordlist, expanding the vocabulary for the system.
tests/test_multi_agent.py | Enhanced test setup in TestMultiAgent class, reorganized imports, and added import_data method for loading necessary data into knowledge graphs.
tests/test_kg_ollama.py | Modified the test_kg_creation method to change the input message sent to the chat session, affecting the test's input scenario.
tests/test_kg_gemini.py | Updated test_kg_creation method to extract the 'response' key from the chat session output, changing how the response is processed.
tests/test_kg_openai.py | Similar update to test_kg_creation method to extract the 'response' key from the chat session output.
examples/movies/demo-movies.ipynb | Adjusted execution counts and updated outputs to reflect a structured response format in the Jupyter notebook demonstrating the SDK.
examples/ufc/demo-ufc.ipynb | Enhanced ontology extraction with new parameters and modified entity structures in the Jupyter notebook for UFC data demonstration.
ontology.json | Introduced a new ontology structure defining entities and relationships relevant to combat sports.
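
As a rough illustration of the structured response mentioned in the test changes above, a caller that used to treat the return value of send_message as a plain string would now index into a dictionary. The sketch below uses a stand-in function rather than the SDK itself; the 'response' key comes from the test updates, while 'question' and 'context' are assumed key names inferred from the summary.

```python
# Stand-in for ChatSession.send_message; this is NOT the SDK implementation.
def send_message_sketch(message: str) -> dict:
    context = "[['Alice', 3]]"  # hypothetical query result
    return {
        "question": message,                 # assumed key name
        "response": "Alice won 3 fights.",   # key used by the updated tests
        "context": context,                  # assumed key name
    }


result = send_message_sketch("How many fights did Alice win?")
answer = result["response"]  # callers now extract the answer explicitly
```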

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant KnowledgeGraph
    participant ChatSession
    participant GraphQueryGenerationStep
    participant QAStep

    User->>KnowledgeGraph: chat_session(cypher_system_instruction, qa_system_instruction, cypher_gen_prompt, qa_prompt)
    KnowledgeGraph->>ChatSession: Create with provided parameters
    ChatSession->>GraphQueryGenerationStep: Run with cypher_prompt
    ChatSession->>QAStep: Run with qa_prompt

🐰 "In the realm of code where rabbits hop,
New prompts and systems, we never stop!
With cyphers and QA, our chat's now a breeze,
Custom instructions bring us to our knees!
So let’s celebrate this coding delight,
For every change makes our future bright!" 🐇

Possibly related PRs

  • History to Cypher generation prompt  #23: The changes in the examples/movies/demo-movies.ipynb notebook involve updating the output of the chat session to a structured response format, which aligns with the modifications made to the send_message method in the main PR that also changes the response structure to a dictionary.
  • Fix Cypher generation iterative loop #37: The modifications in the graphrag_sdk/steps/graph_query_step.py file include changes to the run method that involve the chat session, which is relevant to the updates made in the send_message method of the ChatSession class in the main PR.

Suggested reviewers

  • gkorland
  • swilly22

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (6)
graphrag_sdk/steps/graph_query_step.py (1)

54-62: Fix indentation consistency.

The indentation in the ternary expressions is inconsistent, which affects readability. Ensure consistent indentation throughout the nested expressions.

-                    cypher_prompt = (
-                        (CYPHER_GEN_PROMPT.format(question=question) 
-                        if self.last_answer is None
-                        else CYPHER_GEN_PROMPT_WITH_HISTORY.format(question=question, last_answer=self.last_answer))
-                        if error is False
-                        else CYPHER_GEN_PROMPT_WITH_ERROR.format(
-                            question=question, error=error
-                        )
-                    )   
+                    cypher_prompt = (
+                        (CYPHER_GEN_PROMPT.format(question=question)
+                         if self.last_answer is None
+                         else CYPHER_GEN_PROMPT_WITH_HISTORY.format(
+                             question=question,
+                             last_answer=self.last_answer
+                         ))
+                        if error is False
+                        else CYPHER_GEN_PROMPT_WITH_ERROR.format(
+                            question=question,
+                            error=error
+                        )
+                    )
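
An alternative to re-indenting the nested ternary is to move the selection into a small helper with plain branches. The sketch below is only an illustration: the three templates are hypothetical stand-ins for the SDK constants, and build_cypher_prompt is an invented name. It keeps the original convention that error is False when no error occurred.

```python
# Hypothetical stand-ins for the SDK's prompt constants.
CYPHER_GEN_PROMPT = "Question: {question}"
CYPHER_GEN_PROMPT_WITH_HISTORY = "Last answer: {last_answer}\nQuestion: {question}"
CYPHER_GEN_PROMPT_WITH_ERROR = "Error: {error}\nQuestion: {question}"


def build_cypher_prompt(question, last_answer=None, error=False):
    """Pick the prompt template with plain branches instead of nested ternaries."""
    if error is not False:
        # A previous attempt failed, so include the error text.
        return CYPHER_GEN_PROMPT_WITH_ERROR.format(question=question, error=error)
    if last_answer is None:
        # First question in the session: no history to include.
        return CYPHER_GEN_PROMPT.format(question=question)
    return CYPHER_GEN_PROMPT_WITH_HISTORY.format(
        question=question, last_answer=last_answer
    )
```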
graphrag_sdk/chat_session.py (3)

54-55: Add type hints and documentation for new attributes.

The new prompt attributes would benefit from type hints and docstring documentation to improve code maintainability.

Add type hints and update the class docstring:

 class ChatSession:
+    cypher_prompt: str | None
+    qa_prompt: str | None
     """
     Represents a chat session with a Knowledge Graph.

     Args:
         model_config (KnowledgeGraphModelConfig): The model configuration to use.
         ontology (Ontology): The ontology to use.
         graph (Graph): The graph to query.
+        cypher_system_instruction (str, optional): Custom system instructions for Cypher generation.
+        qa_system_instruction (str, optional): Custom system instructions for QA.
+        cypher_gen_prompt (str, optional): Custom prompt template for Cypher generation.
+        qa_prompt (str, optional): Custom prompt template for QA.
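
One detail worth checking when applying a change like this: Python only treats a string literal as the class docstring when it is the first statement in the class body, so attribute annotations are safer placed after it. A minimal sketch, using a hypothetical ChatSessionSketch class rather than the real ChatSession:

```python
from typing import Optional


class ChatSessionSketch:
    """Represents a chat session with a Knowledge Graph (illustrative stand-in)."""

    # Annotations come after the docstring so that __doc__ is still populated.
    cypher_prompt: Optional[str]
    qa_prompt: Optional[str]

    def __init__(self, cypher_gen_prompt: Optional[str] = None,
                 qa_prompt: Optional[str] = None) -> None:
        self.cypher_prompt = cypher_gen_prompt
        self.qa_prompt = qa_prompt
```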

63-63: Consider consistent instruction handling approaches.

The QA system instruction handling uses a different approach compared to the Cypher system instruction handling. Consider using the same pattern for consistency.

-            qa_system_instruction or GRAPH_QA_SYSTEM
+        if qa_system_instruction is None:
+            qa_system_instruction = GRAPH_QA_SYSTEM
+        self.qa_chat_session = model_config.qa.with_system_instruction(qa_system_instruction).start_chat()

82-82: Consider caching step instances.

The steps are recreated for each message, which could be inefficient for long chat sessions since most parameters remain constant.

Consider creating the steps once in the constructor:

     def __init__(self, ...):
         # ... existing code ...
+        self.cypher_step = GraphQueryGenerationStep(
+            graph=self.graph,
+            chat_session=self.cypher_chat_session,
+            ontology=self.ontology,
+            cypher_prompt=self.cypher_prompt,
+        )
+        self.qa_step = QAStep(
+            chat_session=self.qa_chat_session,
+            qa_prompt=self.qa_prompt,
+        )

     def send_message(self, message: str):
-        cypher_step = GraphQueryGenerationStep(...)
+        self.cypher_step.last_answer = self.last_answer
+        (context, cypher) = self.cypher_step.run(message)
         # ... rest of the method ...
-        qa_step = QAStep(...)
+        answer = self.qa_step.run(message, cypher, context)

Also applies to: 92-92
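
The caching idea can be shown in isolation with stub classes. The sketch below is not the SDK code; CountingStep and CachedSession are hypothetical names. It shows a step built once in the constructor, with only the per-message state (last_answer) refreshed on each call.

```python
class CountingStep:
    """Hypothetical step that records how many instances were built."""

    instances = 0

    def __init__(self, prompt=None):
        CountingStep.instances += 1
        self.prompt = prompt
        self.last_answer = None

    def run(self, message):
        return f"{self.prompt or 'default'}: {message}"


class CachedSession:
    def __init__(self, prompt=None):
        # Built once; the constant parameters never change afterwards.
        self.step = CountingStep(prompt)
        self.last_answer = None

    def send_message(self, message):
        # Refresh only the mutable state before reusing the cached step.
        self.step.last_answer = self.last_answer
        self.last_answer = self.step.run(message)
        return self.last_answer
```

The trade-off is that any state a step keeps between calls must be reset or refreshed explicitly, which per-message construction gives for free.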

graphrag_sdk/kg.py (2)

137-141: Update method docstring to document new parameters

The method signature has been updated with new parameters, but the docstring hasn't been updated to reflect these changes. Please add parameter descriptions to help users understand the purpose of each new parameter.

Apply this diff:

     def chat_session(self, cypher_system_instruction: str = None, qa_system_instruction: str = None,
                 cypher_gen_prompt: str = None, qa_prompt: str = None) -> ChatSession:
+        """
+        Create a new chat session with optional custom instructions and prompts.
+
+        Parameters:
+            cypher_system_instruction (str, optional): Custom system instruction for Cypher query generation
+            qa_system_instruction (str, optional): Custom system instruction for question answering
+            cypher_gen_prompt (str, optional): Custom prompt template for Cypher query generation
+            qa_prompt (str, optional): Custom prompt template for question answering
+
+        Returns:
+            ChatSession: A new chat session instance
+        """
         chat_session = ChatSession(self._model_config, self.ontology, self.graph, cypher_system_instruction,
                                    qa_system_instruction, cypher_gen_prompt, qa_prompt)
         return chat_session

137-141: Consider adding parameter validation and default values

The method accepts optional string parameters but doesn't validate them or provide default values. Consider:

  1. Validating that provided strings are not empty
  2. Using the imported GRAPH_QA_SYSTEM and CYPHER_GEN_SYSTEM as default values

Here's a suggested implementation:

     def chat_session(self, cypher_system_instruction: str = None, qa_system_instruction: str = None,
                 cypher_gen_prompt: str = None, qa_prompt: str = None) -> ChatSession:
+        # Use default system instructions if not provided
+        cypher_system_instruction = cypher_system_instruction or CYPHER_GEN_SYSTEM
+        qa_system_instruction = qa_system_instruction or GRAPH_QA_SYSTEM
+
+        # Validate non-empty strings if provided
+        if cypher_gen_prompt is not None and not cypher_gen_prompt.strip():
+            raise ValueError("cypher_gen_prompt cannot be empty")
+        if qa_prompt is not None and not qa_prompt.strip():
+            raise ValueError("qa_prompt cannot be empty")
+
         chat_session = ChatSession(self._model_config, self.ontology, self.graph, cypher_system_instruction,
                                    qa_system_instruction, cypher_gen_prompt, qa_prompt)
         return chat_session
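
A possible way to combine the defaulting and validation in one place is a small helper, sketched below with hypothetical names (resolve_prompt, DEFAULT_QA_SYSTEM). It differs slightly from the `or` idiom: an empty string is rejected rather than silently replaced by the default.

```python
from typing import Optional

# Hypothetical default; the SDK's real constants live in its prompts module.
DEFAULT_QA_SYSTEM = "You answer questions using graph query results."


def resolve_prompt(value: Optional[str], default: Optional[str], name: str) -> Optional[str]:
    """Return `default` when `value` is None; reject blank strings explicitly."""
    if value is None:
        return default
    if not value.strip():
        raise ValueError(f"{name} cannot be empty")
    return value
```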
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between ca3aa2f and f24594c.

📒 Files selected for processing (4)
  • graphrag_sdk/chat_session.py (4 hunks)
  • graphrag_sdk/kg.py (1 hunks)
  • graphrag_sdk/steps/graph_query_step.py (1 hunks)
  • graphrag_sdk/steps/qa_step.py (1 hunks)
🔇 Additional comments (5)
graphrag_sdk/steps/qa_step.py (2)

20-20: LGTM! Clean implementation of configurable prompts

The addition of the optional qa_prompt parameter with proper type hinting and None default maintains backward compatibility while enabling prompt customization.

Also applies to: 24-24


27-32: Verify format string compatibility with custom prompts

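The code assumes that any custom qa_prompt will contain the format placeholders {context}, {cypher}, and {question}. If a custom prompt omits one of them, str.format does not raise; it ignores the unused keyword argument, so the corresponding value is silently left out of the prompt. A KeyError occurs only in the opposite case, when the template contains a named placeholder that is not supplied.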

Let's verify the format string requirements are documented:

Consider adding validation for required placeholders:

     def run(self, question: str, cypher: str, context: str):
         graph_qa_prompt = self.qa_prompt or GRAPH_QA_PROMPT
+        required_placeholders = ["{context}", "{cypher}", "{question}"]
+        if self.qa_prompt and not all(p in self.qa_prompt for p in required_placeholders):
+            raise ValueError(
+                f"Custom qa_prompt must contain all placeholders: {required_placeholders}"
+            )
         qa_prompt = graph_qa_prompt.format(
             context=context, cypher=cypher, question=question
         )
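
A substring test such as "{context}" in prompt also matches escaped braces like {{context}}, which str.format renders literally. A slightly more robust sketch, assuming only the standard library, parses the template with string.Formatter; the helper names here are hypothetical.

```python
from string import Formatter

REQUIRED_FIELDS = {"context", "cypher", "question"}


def placeholder_names(template: str) -> set:
    """Return the set of named replacement fields in a format string."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def missing_placeholders(template: str) -> set:
    """Return the required fields that the template never references."""
    return REQUIRED_FIELDS - placeholder_names(template)


# str.format ignores unused keyword arguments, so this call does not raise;
# the context and cypher values are simply dropped from the result.
partial = "Answer {question}".format(context="c", cypher="q", question="who?")
```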
graphrag_sdk/steps/graph_query_step.py (1)

36-36: LGTM! Clean implementation of the configurable prompt.

The addition of the optional cypher_prompt parameter and its corresponding instance variable is well-implemented, following Python's best practices.

Also applies to: 43-43

graphrag_sdk/chat_session.py (2)

28-30: LGTM! Well-structured parameter additions.

The new optional parameters are well-named and properly typed, maintaining backward compatibility while adding configurability.


28-30: Verify consistent parameter usage across the codebase.

The new parameters should be consistently handled in related classes and methods.

✅ Verification successful

Let me gather more specific information about the parameter usage.


Parameters are correctly propagated and used throughout the codebase

The verification shows that:

  • The new parameters are properly handled in KnowledgeGraph.chat_session() method which correctly forwards them to ChatSession
  • ChatSession has proper fallback logic for cypher_system_instruction and qa_system_instruction
  • The parameters are used in the expected way with GraphQueryGenerationStep and QAStep classes
  • No inconsistencies found in parameter usage across different model implementations (Gemini, OpenAI, Azure, Ollama)
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for related class instantiations and verify parameter handling
echo "Searching for ChatSession instantiations..."
rg -l "ChatSession\(" 

echo "\nSearching for related step classes..."
rg -l "class.*Step"

echo "\nChecking GraphQueryGenerationStep and QAStep parameter handling..."
ast-grep --pattern 'class GraphQueryGenerationStep {
  $$$
  def __init__($$$) {
    $$$
  }
  $$$
}'

ast-grep --pattern 'class QAStep {
  $$$
  def __init__($$$) {
    $$$
  }
  $$$
}'

Length of output: 1564


Script:

#!/bin/bash
# Check ChatSession instantiations and their parameters
echo "Checking ChatSession instantiations..."
rg "ChatSession\(" -A 5

echo "\nChecking GraphQueryGenerationStep implementation..."
rg "class GraphQueryGenerationStep" -A 10 graphrag_sdk/steps/graph_query_step.py

echo "\nChecking QAStep implementation..."
rg "class QAStep" -A 10 graphrag_sdk/steps/qa_step.py

echo "\nChecking where cypher_system_instruction and qa_system_instruction are used..."
rg "cypher_system_instruction|qa_system_instruction"

Length of output: 6099

Comment thread graphrag_sdk/steps/graph_query_step.py Outdated
Comment thread graphrag_sdk/chat_session.py Outdated
@galshubeli galshubeli requested a review from swilly22 November 14, 2024 16:02

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (1)
graphrag_sdk/steps/graph_query_step.py (1)

45-45: Replace unused loop variable.

The loop control variable i is not used within the loop body.

-        for i in range(retries):
+        for _ in range(retries):
🧰 Tools
🪛 Ruff

45-45: Loop control variable i not used within loop body

Rename unused i to _i

(B007)
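
For context, both `_` and `_i` signal an intentionally unused loop variable. A self-contained sketch of a retry loop in that style follows; run_with_retries is a hypothetical helper, not the SDK's run method.

```python
def run_with_retries(action, retries=3):
    """Call `action` up to `retries` times, returning the first success."""
    last_error = None
    for _ in range(retries):  # the counter itself is never read
        try:
            return action()
        except ValueError as exc:
            last_error = exc
    raise RuntimeError("all retries failed") from last_error
```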

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between d07f97e and 70fcc3e.

📒 Files selected for processing (1)
  • graphrag_sdk/steps/graph_query_step.py (1 hunks)
🔇 Additional comments (1)
graphrag_sdk/steps/graph_query_step.py (1)

34-34: LGTM! Clean addition of configurable prompt parameter.

The new optional parameter and its initialization follow Python best practices.

Also applies to: 41-41

Comment thread graphrag_sdk/steps/graph_query_step.py Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (3)
graphrag_sdk/chat_session.py (3)

28-30: Update docstring to include new parameters

The constructor's docstring needs to be updated to document the new parameters:

  • cypher_system_instruction
  • qa_system_instruction
  • cypher_gen_prompt
  • qa_prompt

54-55: Add input validation for prompt parameters

Consider adding validation for the prompt parameters to ensure they are not empty strings when provided. Empty prompts could lead to unexpected behavior.

+        if cypher_gen_prompt is not None and not cypher_gen_prompt.strip():
+            raise ValueError("cypher_gen_prompt cannot be empty")
+        if qa_prompt is not None and not qa_prompt.strip():
+            raise ValueError("qa_prompt cannot be empty")
         self.cypher_prompt = cypher_gen_prompt
         self.qa_prompt = qa_prompt

59-59: Standardize system instruction handling

The code handles cypher and qa system instructions differently:

  • Cypher instruction is modified with ontology information
  • QA instruction uses simple fallback to GRAPH_QA_SYSTEM

Consider standardizing this approach for better maintainability.

Also applies to: 63-64
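
One way to standardize the two paths is a single helper that applies the same fallback rule to both instructions and injects the ontology only when one is supplied. This is a sketch under assumptions: the defaults are invented, build_instruction is a hypothetical name, and "#ONTOLOGY" is assumed to be the substitution marker.

```python
from typing import Optional

# Hypothetical defaults; "#ONTOLOGY" is an assumed substitution marker.
DEFAULT_CYPHER_SYSTEM = "Ontology:\n#ONTOLOGY\nGenerate Cypher queries."
DEFAULT_QA_SYSTEM = "Answer using only the provided context."


def build_instruction(custom: Optional[str], default: str,
                      ontology: Optional[str] = None) -> str:
    """Apply one fallback rule to both steps, then inject the ontology if given."""
    instruction = default if custom is None else custom
    if ontology is not None:
        instruction = instruction.replace("#ONTOLOGY", ontology)
    return instruction
```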

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 70fcc3e and 2496eb3.

📒 Files selected for processing (1)
  • graphrag_sdk/chat_session.py (4 hunks)
🔇 Additional comments (2)
graphrag_sdk/chat_session.py (2)

49-52: Previous review comment about standardizing ontology injection is still applicable


82-82: Verify step handling of None prompts

Please ensure that both GraphQueryGenerationStep and QAStep handle None prompts gracefully. Consider documenting the expected behavior when prompts are not provided.

Also applies to: 92-92


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (11)
graphrag_sdk/steps/graph_query_step.py (2)

30-31: Add input validation for prompt parameters.

While the new prompt parameters are well-structured, consider adding validation to ensure they contain the required format placeholders ({question} and {last_answer}).

 def __init__(
     self,
     graph: Graph,
     ontology: Ontology,
     chat_session: GenerativeModelChatSession,
     config: dict = None,
     last_answer: str = None,
     cypher_prompt: str = None,
     cypher_prompt_with_history: str = None,
 ) -> None:
     self.ontology = ontology
     self.config = config or {}
     self.graph = graph
     self.chat_session = chat_session
     self.last_answer = last_answer
+    if cypher_prompt and '{question}' not in cypher_prompt:
+        raise ValueError("cypher_prompt must contain {question} placeholder")
+    if cypher_prompt_with_history and ('{question}' not in cypher_prompt_with_history or 
+                                      '{last_answer}' not in cypher_prompt_with_history):
+        raise ValueError("cypher_prompt_with_history must contain both {question} and {last_answer} placeholders")
     self.cypher_prompt = cypher_prompt
     self.cypher_prompt_with_history = cypher_prompt_with_history

Also applies to: 38-39


43-43: Rename unused loop variable.

The loop variable i is not used within the loop body. Following Python conventions, rename it to _i to indicate it's intentionally unused.

-        for i in range(retries):
+        for _i in range(retries):
🧰 Tools
🪛 Ruff (0.7.0)

43-43: Loop control variable i not used within loop body

Rename unused i to _i

(B007)

graphrag_sdk/chat_session.py (1)

50-53: Add type hints and validate prompt strings

The new prompt attributes lack type hints and input validation. Also, there are unnecessary empty lines.

Consider adding type hints and validation:

-        
-        self.cypher_prompt = cypher_gen_prompt
-        self.qa_prompt = qa_prompt
-        self.cypher_prompt_with_history = cypher_gen_prompt_history
-        
+        # Validate the optional prompts and annotate their types
+        self.cypher_prompt: str | None = self._validate_prompt(cypher_gen_prompt, "cypher_gen_prompt")
+        self.qa_prompt: str | None = self._validate_prompt(qa_prompt, "qa_prompt")
+        self.cypher_prompt_with_history: str | None = self._validate_prompt(
+            cypher_gen_prompt_history, "cypher_gen_prompt_history"
+        )
+
+    @staticmethod
+    def _validate_prompt(prompt: str | None, name: str) -> str | None:
+        """Returns None for an omitted prompt; otherwise requires a non-empty string."""
+        if prompt is None:
+            return None
+        if not isinstance(prompt, str) or not prompt.strip():
+            raise ValueError(f"{name} must be a non-empty string")
+        return prompt.strip()
README.md (2)

194-217: Enhance prompt documentation with examples and validation details

The documentation of configurable prompts could be improved by:

  1. Adding example values for each prompt type
  2. Highlighting required placeholders more prominently
  3. Including validation warnings for missing placeholders

Consider adding a code block with actual prompt examples:

# Example prompt values
cypher_system_instruction = """
Given the following ontology:
#ONTOLOGY
Generate Cypher queries to answer questions about the data.
"""

cypher_gen_prompt = """
Based on the question: {question}
Generate a Cypher query to retrieve relevant information.
"""

# Add similar examples for other prompt types...
🧰 Tools
🪛 Markdownlint (0.35.0)

200, 201, 204, 207, 208, 211, 212, 215, 216: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)


199-216: Maintain consistent markdown list style

The unordered lists in the documentation use dash (-) markers inconsistently. Consider using asterisks (*) throughout for better consistency.

Apply this change to the list items:

-   - System instructions for the Cypher generation step.
+   * System instructions for the Cypher generation step.

graphrag_sdk/fixtures/prompts.py (6)

Line range hint 1-2: Remove stray quotation mark in CREATE_ONTOLOGY_SYSTEM

There is an extra double-quote character at the end of the line defining ## 1. Overview. Inside a triple-quoted string this is not a syntax error, but the stray quote ends up in the prompt text.

Apply this diff to remove it:

- CREATE_ONTOLOGY_SYSTEM = """
- ## 1. Overview\n"
+ CREATE_ONTOLOGY_SYSTEM = """
+ ## 1. Overview\n

Line range hint 58-58: Correct unbalanced quotation marks in entity labeling instructions

In the instructions under Consistency, the quotation marks are unbalanced around the example terms.

Apply this diff to correct the quotation marks:

- Avoid using more specific terms "like 'mathematician' or 'scientist'"
+ Avoid using more specific terms like 'mathematician' or 'scientist'.

Line range hint 47-47: Remove duplicate 'unique' key in the example JSON

In the example JSON, the unique key is duplicated for the age attribute of the Person entity.

Apply this diff to fix the duplication:

- {"name":"age","type":"number","unique":false,"unique":false}
+ {"name":"age","type":"number","unique":false,"required":false}

Line range hint 221-221: Correct singular 'relation' to plural 'relations' for consistency

In EXTRACT_DATA_SYSTEM, change 'relation' to 'relations' for grammatical consistency.

Apply this diff:

- Use only the provided entities, relation, and attributes in the ontology.
+ Use only the provided entities, relations, and attributes in the ontology.

Line range hint 163-166: Clarify instructions regarding attributes conversion in CREATE_ONTOLOGY_PROMPT

The instruction "Attributes should be extracted as entities or relations whenever possible" may be confusing. Attributes are typically properties of entities or relationships, and converting them into entities or relations might not always be appropriate.

Consider rephrasing for clarity, such as:

- Attributes should be extracted as entities or relations whenever possible.
+ Consider representing attributes as entities or relations if they have significant importance or require further detail.

Line range hint 373-374: Update example in ORCHESTRATOR_SYSTEM to match available agents

In the example provided within ORCHESTRATOR_SYSTEM, agents like BaggageAgent and SpecialItemsAgent are mentioned, which may not be relevant to the context of this SDK.

Consider updating the example to match agents relevant to the knowledge graph domain for better clarity.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 2496eb3 and df0c334.

📒 Files selected for processing (6)
  • README.md (2 hunks)
  • graphrag_sdk/chat_session.py (5 hunks)
  • graphrag_sdk/fixtures/prompts.py (2 hunks)
  • graphrag_sdk/kg.py (5 hunks)
  • graphrag_sdk/steps/graph_query_step.py (2 hunks)
  • graphrag_sdk/steps/qa_step.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • graphrag_sdk/kg.py
  • graphrag_sdk/steps/qa_step.py
🧰 Additional context used
🪛 Markdownlint (0.35.0)
README.md

200, 201, 204, 207, 208, 211, 212, 215, 216: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)

🪛 Ruff (0.7.0)
graphrag_sdk/steps/graph_query_step.py

43-43: Loop control variable i not used within loop body

Rename unused i to _i

(B007)
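
For reference, a minimal sketch of the pattern B007 asks for. The `retry` helper below is hypothetical and is not the loop in `graph_query_step.py`; it only shows that prefixing an unused loop variable with an underscore marks it as intentionally unused:

```python
def retry(action, attempts=3):
    """Call action up to `attempts` times and return the first non-None result."""
    for _attempt in range(attempts):  # underscore prefix: the index is deliberately unused
        result = action()
        if result is not None:
            return result
    return None


calls = []


def flaky():
    """Hypothetical action that succeeds on its second call."""
    calls.append(1)
    return "ok" if len(calls) == 2 else None


outcome = retry(flaky)
```
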

🔇 Additional comments (5)
graphrag_sdk/chat_session.py (1)

80-81: Verify step parameter names

Please ensure that the parameter names match those expected by the GraphQueryGenerationStep and QAStep classes.

Also applies to: 91-91

README.md (1)

173-175: Verify consistent terminology across documentation

The rename from "Knowledge Graph" to "Knowledge Graph Agent" better reflects the agent-based architecture. However, we should ensure this terminology is consistent across all documentation and code comments.

✅ Verification successful

Terminology is consistently used across the codebase

The term "Knowledge Graph" and its variations (including "Knowledge Graph Agent") are used consistently throughout the codebase. The README change aligns with the existing implementation where:

  • The core class is named KnowledgeGraph
  • The agent implementation exists in agents/kg_agent.py as KGAgent
  • Documentation consistently uses "Knowledge Graph (KG)" terminology
  • The new "Knowledge Graph Agent" section accurately reflects the agent-based architecture while maintaining backward compatibility
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for inconsistent terminology in documentation and code comments
rg -i "knowledge\s+graph(?!\s+agent)" --type md --type py

Length of output: 309


Script:

#!/bin/bash
# Search for "Knowledge Graph" terminology variations in documentation and code comments
rg -i "knowledge.*graph" --type md --type py

Length of output: 9943

graphrag_sdk/fixtures/prompts.py (3)

Line range hint 373-374: Ensure placeholders like #BACKSTORY and #AGENTS are appropriately handled

In ORCHESTRATOR_SYSTEM, placeholders such as #BACKSTORY and #AGENTS are present. Ensure these placeholders are properly replaced or managed when the prompts are used to prevent confusion.
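
One way to make sure no marker survives into the final prompt is to check for leftovers after substitution. The sketch below assumes the markers are plain `#NAME` tokens, as in `#BACKSTORY` and `#AGENTS`; the `fill_placeholders` helper and the sample template are illustrative and not taken from the SDK:

```python
import re

# Matches markers such as #BACKSTORY or #AGENTS (a '#' followed directly by capitals).
PLACEHOLDER = re.compile(r"#[A-Z_]+")


def fill_placeholders(template, values):
    """Replace #NAME markers and fail if any marker is left unfilled."""
    filled = template
    for name, value in values.items():
        filled = filled.replace(f"#{name}", value)
    leftover = PLACEHOLDER.findall(filled)
    if leftover:
        raise ValueError(f"unfilled placeholders: {sorted(set(leftover))}")
    return filled


template = "Backstory: #BACKSTORY\nAgents: #AGENTS"
prompt = fill_placeholders(
    template, {"BACKSTORY": "Travel desk", "AGENTS": "restaurants_agent"}
)

# Leaving a marker unfilled is reported rather than sent to the model.
try:
    fill_placeholders(template, {"BACKSTORY": "Travel desk"})
    raised = False
except ValueError:
    raised = True
```

Note that this simple check would also flag a substituted value that itself contains text shaped like `#NAME`, so it suits a test or debug assertion better than production logic.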


445-445: ⚠️ Potential issue

Remove unintended line number annotation in CYPHER_GEN_PROMPT

There is an unintended line number annotation 445~ within the CYPHER_GEN_PROMPT constant.

Apply this diff to remove the annotation:

- 445~
 If you cannot generate a OpenCypher statement for any reason, return an empty response.

Likely invalid or redundant comment.


486-486: ⚠️ Potential issue

Remove unintended line number annotation in GRAPH_QA_SYSTEM

There is an unintended line number annotation 486~ within the GRAPH_QA_SYSTEM constant.

Apply this diff to remove the annotation:

- 486~
 Here is an example:

Likely invalid or redundant comment.

Comment thread graphrag_sdk/steps/graph_query_step.py
Comment thread graphrag_sdk/chat_session.py Outdated
Comment thread graphrag_sdk/chat_session.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (6)
tests/test_multi_agent.py (6)

1-16: Consider adjusting logging level and adding module documentation.

A few suggestions for improvement:

  1. Consider using a less verbose logging level for tests (e.g., INFO instead of DEBUG)
  2. Add a module docstring explaining the purpose of these tests

Add this documentation at the beginning of the file:

+"""
+Integration tests for the multi-agent system, validating the interaction between
+KnowledgeGraph agents and the Orchestrator for complex query scenarios.
+"""

Line range hint 18-21: Add class documentation.

The test class should have a docstring explaining its purpose and test coverage.

Add this documentation:

 class TestMultiAgent(unittest.TestCase):
+    """
+    Test suite for validating multi-agent interactions in complex scenarios.
+    
+    This class tests the collaboration between restaurant and attraction agents
+    orchestrated to generate comprehensive travel itineraries.
+    """

Line range hint 23-156: Refactor ontology setup to reduce code duplication.

The ontology setup contains significant duplication between restaurants and attractions. Consider extracting common entities and attributes into helper methods.

Here's a suggested refactoring approach:

def create_base_ontology():
    """Create a base ontology with common entities."""
    ontology = Ontology()
    ontology.add_entity(
        Entity(
            label="Country",
            attributes=[
                Attribute(
                    name="name",
                    attr_type=AttributeType.STRING,
                    required=True,
                    unique=True,
                ),
            ],
        )
    )
    ontology.add_entity(
        Entity(
            label="City",
            attributes=[
                Attribute(name="name", attr_type=AttributeType.STRING, required=True, unique=True),
                Attribute(name="weather", attr_type=AttributeType.STRING),
                Attribute(name="population", attr_type=AttributeType.NUMBER),
            ],
        )
    )
    ontology.add_relation(
        Relation(
            label="IN_COUNTRY",
            source="City",
            target="Country",
        )
    )
    return ontology

@classmethod
def setUpClass(cls):
    # Create base ontologies
    cls.restaurants_ontology = create_base_ontology()
    cls.attractions_ontology = create_base_ontology()
    
    # Add specific entities
    cls.restaurants_ontology.add_entity(
        Entity(
            label="Restaurant",
            attributes=[
                Attribute(name="name", attr_type=AttributeType.STRING, required=True, unique=True),
                Attribute(name="description", attr_type=AttributeType.STRING),
                Attribute(name="rating", attr_type=AttributeType.NUMBER),
                Attribute(name="food_type", attr_type=AttributeType.STRING),
            ],
        )
    )
    cls.restaurants_ontology.add_relation(
        Relation(
            label="IN_CITY",
            source="Restaurant",
            target="City",
        )
    )
    
    cls.attractions_ontology.add_entity(
        Entity(
            label="Attraction",
            attributes=[
                Attribute(name="name", attr_type=AttributeType.STRING, required=True, unique=True),
                Attribute(name="description", attr_type=AttributeType.STRING),
                Attribute(name="type", attr_type=AttributeType.STRING),
            ],
        )
    )
    cls.attractions_ontology.add_relation(
        Relation(
            label="IN_CITY",
            source="Attraction",
            target="City",
        )
    )

Line range hint 158-208: Enhance agent configuration and add error handling.

A few suggestions for improvement:

  1. The agent introductions could be more descriptive and specific
  2. Add error handling for model initialization

Consider these improvements:

-        cls.model = GeminiGenerativeModel("gemini-1.5-flash-001")
+        try:
+            cls.model = GeminiGenerativeModel("gemini-1.5-flash-001")
+        except Exception as e:
+            logger.error(f"Failed to initialize Gemini model: {e}")
+            raise

         cls.restaurants_agent = KGAgent(
             agent_id="restaurants_agent",
             kg=cls.restaurants_kg,
             introduction="""
-        I'm a restaurant agent, specialized in finding the best restaurants for you. 
+        I'm a restaurant recommendation specialist with detailed knowledge of dining
+        establishments, their cuisines, ratings, and locations. I can help you find
+        the perfect dining spots based on your preferences and location.
         """,
         )

         cls.attractions_agent = KGAgent(
             agent_id="attractions_agent",
             kg=cls.attractions_kg,
             introduction="""
-        I'm an attractions agent, specialized in finding the best attractions for you. 
+        I'm a tourist attractions expert with comprehensive knowledge of landmarks,
+        museums, historical sites, and entertainment venues. I can help you discover
+        the most interesting places to visit based on your interests and location.
         """,
         )

Line range hint 210-301: Add error handling and data validation for data import.

The data import method lacks error handling for file operations and data validation.

Consider these improvements:

+    def validate_city_data(self, city: dict) -> bool:
+        """Validate city data structure."""
+        required_fields = {"name", "weather", "population", "country"}
+        return all(field in city for field in required_fields)

     def import_data(
         self,
         restaurants_kg: KnowledgeGraph,
         attractions_kg: KnowledgeGraph,
     ):
+        """
+        Import test data from JSON files into knowledge graphs.
+        
+        Args:
+            restaurants_kg: Knowledge graph for restaurants
+            attractions_kg: Knowledge graph for attractions
+            
+        Raises:
+            FileNotFoundError: If data files are missing
+            ValueError: If data validation fails
+        """
-        with open("tests/data/cities.json") as f:
-            cities = loads(f.read())
+        try:
+            with open("tests/data/cities.json") as f:
+                cities = loads(f.read())
+        except FileNotFoundError:
+            logger.error("Cities data file not found")
+            raise
+        except ValueError:  # json.JSONDecodeError subclasses ValueError; only loads appears to be imported here
+            logger.error("Invalid JSON in cities data file")
+            raise

+        # Validate data structure
+        for city in cities:
+            if not self.validate_city_data(city):
+                raise ValueError(f"Invalid city data structure: {city}")

Line range hint 303-316: Enhance test coverage and assertions.

The current test method could be more comprehensive:

  1. Add more specific assertions for response content
  2. Include negative test cases
  3. Test different scenarios

Consider adding these test methods:

def test_multi_agent_invalid_city(self):
    """Test behavior with invalid city input."""
    response = self.orchestrator.ask(
        "Write me a two-day itinerary for a trip to NonExistentCity."
    )
    assert "no information available" in response.output.lower()

def test_multi_agent_specific_requirements(self):
    """Test with specific dining and attraction requirements."""
    response = self.orchestrator.ask(
        "Write me a two-day itinerary for Rome focusing on historical sites and Italian restaurants."
    )
    assert "historical" in response.output.lower()
    assert "italian" in response.output.lower()

def test_multi_agent_validation(self):
    """Test response structure and content validation."""
    response = self.orchestrator.ask(
        "Write me a two-day itinerary for Rome."
    )
    # Verify response structure
    assert "day 1" in response.output.lower()
    assert "day 2" in response.output.lower()
    # Verify it includes both restaurants and attractions
    assert any(r["name"] in response.output for r in self.restaurants)
    assert any(a["name"] in response.output for a in self.attractions)
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between c8b9928 and c3e03ec.

📒 Files selected for processing (1)
  • tests/test_multi_agent.py (1 hunks)

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Outside diff range and nitpick comments (1)
tests/test_kg_ollama.py (1)

Line range hint 93-99: Consider making the test more robust and deterministic.

The current test has several areas that could be improved:

  1. The assertion threshold (>5) seems arbitrary and might not reflect the actual data
  2. The regex pattern assumes the answer contains digits, but doesn't validate the response format
  3. There's no verification that the count matches the actual data in the source file

Consider this improved implementation:

-        answer = chat.send_message("Count the number of actors acted in a movie?")
-
-        logger.info(f"Answer: {answer}")
-
-        actors_count = re.findall(r'\d+', answer)
-        num_actors = 0 if len(actors_count) == 0 else int(actors_count[0])
-
-        assert num_actors > 5, "The number of actors found should be greater than 5"
+        # First, get the ground truth
+        count_query = """
+        MATCH (a:Actor)-[:ACTED_IN]->(:Movie)
+        RETURN COUNT(DISTINCT a) as actor_count
+        """
+        expected_count = self.kg.query(count_query)[0]["actor_count"]
+        
+        # Then test the chat response
+        answer = chat.send_message("Count the number of actors acted in a movie?")
+        logger.info(f"Answer: {answer}")
+        
+        # Extract number from response
+        actors_count = re.findall(r'\b\d+\b', answer)
+        self.assertTrue(actors_count, f"Expected numeric count in answer: {answer}")
+        actual_count = int(actors_count[0])
+        
+        # Verify against ground truth
+        self.assertEqual(
+            actual_count,
+            expected_count,
+            f"Actor count {actual_count} doesn't match actual count {expected_count}"
+        )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between c3e03ec and 1b5e192.

📒 Files selected for processing (1)
  • tests/test_kg_ollama.py (1 hunks)
🔇 Additional comments (2)
tests/test_kg_ollama.py (2)

93-93: LGTM! The rephrased question is clearer and more directive.

The change from "How many" to "Count the number of" makes the question more explicit and imperative, which could potentially lead to more consistent responses from the model.


93-93: Consider testing the new configurable prompts feature.

Given that the ChatSession now supports configurable prompts (as mentioned in the AI summary), this test could be extended to verify this new functionality.

Let's verify the new prompt parameters:

Consider adding test cases for:

  1. Custom system instructions
  2. Different prompt templates
  3. Edge cases with empty/invalid prompts

@galshubeli galshubeli changed the title from "Add configurable prompts" to "onfigurable prompts" Nov 24, 2024

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (2)
graphrag_sdk/chat_session.py (2)

51-53: Add type hints for prompt attributes

The new prompt attributes should have type hints for better code maintainability and IDE support.

-        self.cypher_prompt = cypher_gen_prompt
-        self.qa_prompt = qa_prompt
-        self.cypher_prompt_with_history = cypher_gen_prompt_history
+        self.cypher_prompt: str = cypher_gen_prompt
+        self.qa_prompt: str = qa_prompt
+        self.cypher_prompt_with_history: str = cypher_gen_prompt_history

64-64: Implement context management to prevent unbounded growth

The context list grows without bounds as new messages are processed. This could lead to memory issues in long-running chat sessions.

Consider:

  1. Implementing a maximum size limit for the context list
  2. Using a circular buffer or removing older entries when the limit is reached
  3. Adding configuration parameter for context size limit

Example implementation:

+    MAX_CONTEXT_SIZE = 100  # or make this configurable via constructor
     def __init__(...):
-        self.context = []
+        self.context: list[dict] = []

     def send_message(self, message: str):
+        if len(self.context) >= self.MAX_CONTEXT_SIZE:
+            self.context.pop(0)  # Remove oldest entry
         self.context.append({"message": message, "cypher": cypher, "context": context})

Also applies to: 89-90
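
An alternative to the manual `pop(0)` is `collections.deque` with `maxlen`, which discards the oldest entry automatically. This is a minimal sketch; the `BoundedContext` class and its default size are assumptions for illustration, not part of `ChatSession`:

```python
from collections import deque


class BoundedContext:
    """Keeps only the most recent `max_size` chat entries."""

    def __init__(self, max_size=100):
        self._entries = deque(maxlen=max_size)

    def add(self, message, cypher, context):
        # Once maxlen is reached, appending drops the oldest entry automatically.
        self._entries.append({"message": message, "cypher": cypher, "context": context})

    def entries(self):
        return list(self._entries)


history = BoundedContext(max_size=3)
for n in range(5):
    history.add(f"q{n}", f"MATCH (n) RETURN n LIMIT {n}", f"ctx{n}")

kept = [entry["message"] for entry in history.entries()]
```

Unlike `list.pop(0)`, which shifts every remaining element, a bounded deque evicts from the left in constant time.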

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 1b5e192 and f052dec.

📒 Files selected for processing (2)
  • graphrag_sdk/chat_session.py (4 hunks)
  • graphrag_sdk/kg.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • graphrag_sdk/kg.py
🔇 Additional comments (1)
graphrag_sdk/chat_session.py (1)

27-29: Verify that all callers are updated with the new parameters

The constructor signature change requires updates to all code that creates ChatSession instances.

✅ Verification successful

Based on the search results, I can see that ChatSession is only instantiated in one place - in the chat_session() method of the kg.py file, and this caller is already updated with all the required parameters:

chat_session = ChatSession(self._model_config, self.ontology, self.graph, self.cypher_system_instruction,
                          self.qa_system_instruction, self.cypher_gen_prompt, self.qa_prompt, self.cypher_gen_prompt_history)

All other occurrences of "ChatSession" in the codebase refer to the abstract base class GenerativeModelChatSession from the models package, which is a different class.

Changes are properly integrated

The only instantiation point of ChatSession has been updated with all the required parameters.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Find all potential instantiations of ChatSession
rg -l "ChatSession" | while read -r file; do
  if [[ "$file" != "graphrag_sdk/chat_session.py" ]]; then
    echo "=== $file ==="
    rg -A 5 "ChatSession" "$file"
  fi
done

Length of output: 12159

Comment thread graphrag_sdk/chat_session.py
@galshubeli galshubeli changed the title from "onfigurable prompts" to "Configurable prompts + reveal chat context" Nov 24, 2024
Comment thread graphrag_sdk/steps/qa_step.py
Comment thread graphrag_sdk/steps/graph_query_step.py
Comment thread graphrag_sdk/kg.py Outdated
Comment thread graphrag_sdk/kg.py
Comment thread graphrag_sdk/chat_session.py Outdated
Comment thread README.md Outdated

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (5)
graphrag_sdk/chat_session.py (1)

81-82: Document prompt requirements and add validation

The prompts are passed to steps without documentation about their expected format or structure.

  1. Add docstring updates to document prompt requirements:
def send_message(self, message: str):
    """
    Sends a message to the chat session.

    Args:
        message (str): The message to send.

    Returns:
        str: The response to the message.

    Note:
        The cypher_prompt should follow the format: ...
        The qa_prompt should follow the format: ...
        The cypher_prompt_with_history should include: ...
    """
  2. Add runtime validation in the steps to ensure prompts meet the requirements.

Also applies to: 94-94
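
A runtime check of this kind could be sketched with the standard library's `string.Formatter`, which lists the `{field}` names a template contains. The helper name `missing_placeholders` is hypothetical, and `{ontology}` is the only placeholder name taken from the README note; which fields each prompt actually requires is an assumption left to the caller:

```python
from string import Formatter


def missing_placeholders(prompt, required):
    """Return the required {field} names that do not appear in the prompt."""
    # Formatter().parse yields (literal_text, field_name, format_spec, conversion).
    present = {field for _, field, _, _ in Formatter().parse(prompt) if field}
    return set(required) - present


complete = missing_placeholders("Schema:\n{ontology}\nWrite a query.", {"ontology"})
incomplete = missing_placeholders("Write a query.", {"ontology"})
```

`Formatter().parse` raises `ValueError` on unbalanced braces, so a prompt containing literal `{` or `}` would need them doubled (`{{`, `}}`) before it passes this check.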

README.md (2)

194-232: Consider adding information about default prompts

The documentation for configurable prompts is clear and well-structured. However, it would be helpful to:

  1. Indicate whether these configurations are optional
  2. Document the default prompts that are used if custom ones aren't provided
  3. Provide example prompt templates for each type

Would you like me to help draft this additional documentation?

200-216: Maintain consistent list style

For better markdown consistency, use asterisks (*) instead of dashes (-) for unordered lists throughout the document.

Apply this change to the list items in the Configurable Prompts section:

-   - System instructions for the Cypher generation step.  
+   * System instructions for the Cypher generation step.  
-   - **Note:** Ensure your prompt includes `{ontology}`.
+   * **Note:** Ensure your prompt includes `{ontology}`.

(Apply similar changes to other list items)

graphrag_sdk/fixtures/prompts.py (2)

Line range hint 1-1: Fix syntax error due to mismatched quotation mark

There is an extra quotation mark at the end of the first line in the CREATE_ONTOLOGY_SYSTEM string assignment, which may cause a syntax error.

Apply this diff to fix the syntax error:

 CREATE_ONTOLOGY_SYSTEM = """
-## 1. Overview\n"
+## 1. Overview\n
 You are a top-tier algorithm designed for extracting ontologies in structured formats to build a knowledge graph from raw texts.

Line range hint 24-24: Fix unbalanced quotation marks in the CREATE_ONTOLOGY_SYSTEM prompt

In the "Consistency" section, there is an unbalanced quotation mark before "like 'mathematician' or 'scientist'". This could lead to issues when processing the string.

Apply this diff to correct the quotation marks:

-Avoid using more specific terms "like 'mathematician' or 'scientist'"
+Avoid using more specific terms like 'mathematician' or 'scientist'
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between f052dec and 85fdacf.

📒 Files selected for processing (4)
  • README.md (2 hunks)
  • graphrag_sdk/chat_session.py (4 hunks)
  • graphrag_sdk/fixtures/prompts.py (3 hunks)
  • graphrag_sdk/kg.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • graphrag_sdk/kg.py
🧰 Additional context used
🪛 Markdownlint (0.35.0)
README.md

200, 201, 204, 207, 208, 211, 212, 215, 216: Expected: asterisk; Actual: dash
Unordered list style

(MD004, ul-style)

🔇 Additional comments (2)
graphrag_sdk/chat_session.py (1)

27-29: ⚠️ Potential issue

Breaking change: Update constructor signature and docstring

The constructor changes introduce several issues:

  1. Making new parameters required is a breaking change that could affect existing clients
  2. The docstring is incomplete, missing documentation for the new parameters
  3. Parameters should be optional with sensible defaults for backward compatibility

Apply this diff to address the issues:

 def __init__(self, model_config: KnowledgeGraphModelConfig, ontology: Ontology, graph: Graph,
-                cypher_system_instruction: str, qa_system_instruction: str,
-                cypher_gen_prompt: str, qa_prompt: str, cypher_gen_prompt_history: str):
+                cypher_system_instruction: str | None = None,
+                qa_system_instruction: str | None = None,
+                cypher_gen_prompt: str | None = None,
+                qa_prompt: str | None = None,
+                cypher_gen_prompt_history: str | None = None):
         """
         Initializes a new ChatSession object.

         Args:
             model_config (KnowledgeGraphModelConfig): The model configuration.
             ontology (Ontology): The ontology object.
             graph (Graph): The graph object.
+            cypher_system_instruction (str, optional): Custom system instruction for Cypher generation.
+            qa_system_instruction (str, optional): Custom system instruction for QA.
+            cypher_gen_prompt (str, optional): Custom prompt for Cypher generation.
+            qa_prompt (str, optional): Custom prompt for QA.
+            cypher_gen_prompt_history (str, optional): Custom prompt history for Cypher generation.
         """

Likely invalid or redundant comment.
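
If the parameters were made optional, the fallback to defaults could look roughly like the sketch below. `DEFAULT_QA_PROMPT`, `resolve_prompt`, and `SessionSketch` are hypothetical stand-ins; the default text is a placeholder and not the SDK's real prompt:

```python
from typing import Optional

# Placeholder text for illustration only; not the SDK's real default prompt.
DEFAULT_QA_PROMPT = "Context: {context}\nQuestion: {question}"


def resolve_prompt(custom: Optional[str], default: str) -> str:
    """Return the caller's prompt if one was given, otherwise the default."""
    return default if custom is None else custom


class SessionSketch:
    """Hypothetical stand-in for a session that accepts an optional qa_prompt."""

    def __init__(self, qa_prompt: Optional[str] = None):
        self.qa_prompt = resolve_prompt(qa_prompt, DEFAULT_QA_PROMPT)


default_session = SessionSketch()
custom_session = SessionSketch("Q: {question}")
```

Comparing against `None` rather than using `custom or default` keeps a deliberately empty prompt from being replaced by the default.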

README.md (1)

173-175: LGTM! Clear section renaming

The rename from "Knowledge Graph" to "Knowledge Graph Agent" better reflects the agent-based nature of the system and its Q&A capabilities.

Comment thread graphrag_sdk/chat_session.py Outdated
Comment thread graphrag_sdk/chat_session.py
Comment thread graphrag_sdk/fixtures/prompts.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (8)
tests/test_kg_openai.py (3)

88-88: Consider testing the full response structure

The change suggests that send_message now returns a richer response object with additional context. While extracting the 'response' is correct, we should also verify the structure and content of the full response object to ensure all expected context is present.

Consider adding assertions to verify the response structure:

 answer = chat.send_message("How many actors acted in a movie?")
+assert isinstance(answer, dict), "Response should be a dictionary"
+assert "response" in answer, "Response should contain 'response' key"
+# Add assertions for other expected keys in the response
 answer = answer['response']

Line range hint 93-97: Consider strengthening the assertion logic

The current regex pattern \d+ might match any number in the response text, not necessarily the actor count. This could lead to false positives if the response contains other numbers.

Consider using a more specific regex or response parsing:

-actors_count = re.findall(r'\d+', answer)
-num_actors = 0 if len(actors_count) == 0 else int(actors_count[0])
+# Look for specific patterns like "X actors" or "found X actors"
+actors_count = re.findall(r'(\d+)\s+actors?', answer)
+num_actors = 0 if len(actors_count) == 0 else int(actors_count[0])
+assert num_actors > 0, "No actor count found in response"
 assert num_actors > 10, "The number of actors found should be greater than 10"
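
To illustrate the false-positive risk, here is a small standalone comparison of the two patterns. The sample answer string is invented for the example and is not real model output:

```python
import re

# Invented answer text containing several unrelated numbers.
answer = "In 2024 the graph lists 15 actors across 3 movies."

# A bare \d+ returns the first number in the text, which here is the year.
first_number = int(re.findall(r"\d+", answer)[0])

# Anchoring on the noun keeps only the number that directly precedes "actor(s)".
match = re.search(r"(\d+)\s+actors?\b", answer, flags=re.IGNORECASE)
actor_count = int(match.group(1)) if match else None
```

The anchored pattern still depends on the model phrasing its answer as "N actors", so it trades false positives for possible misses.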

Missing error handling for chat response structure

Based on the code analysis:

  1. The test directly accesses answer['response'] without any error handling
  2. The underlying implementations (send_message and parse_generate_content_response) don't have explicit error handling for malformed responses
  3. The GenerationResponse class expects specific fields but there's no validation to ensure they exist

The test should be updated to include:

  • Error handling for missing 'response' key
  • Validation of the response structure before accessing it
  • Test cases for error scenarios
🔗 Analysis chain

Line range hint 87-92: Verify error handling for chat responses

The code assumes the 'response' key will always be present. We should verify how the test handles cases where the response might be malformed or missing.

Let's check if there's error handling in the chat implementation:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for error handling in chat session implementation
ast-grep --pattern 'def send_message($$$) {
  try {
    $$$
  } catch ($$$) {
    $$$
  }
}'

# Search for response structure validation
rg -A 5 "send_message.*response.*"

Length of output: 131


Script:

#!/bin/bash
# Let's try a broader search for the chat implementation and error handling
# First, find the chat implementation file
fd "kg_openai.py"

# Search for the send_message method implementation
rg -A 10 "def send_message" 

# Search for any error handling related to response
rg -B 2 -A 5 "try.*response" 

Length of output: 4827


Script:

#!/bin/bash
# Let's check the implementation of parse_generate_content_response and error handling
rg -B 2 -A 5 "parse_generate_content_response"

# Let's also check the GenerationResponse type definition
fd "generation_response" --type f

# And check its content
rg -A 5 "class GenerationResponse"

Length of output: 5320


Script:

#!/bin/bash
# Let's check for try-catch blocks in the OpenAI implementation since that's what's being used in the test
rg -B 5 -A 5 "try" graphrag_sdk/models/openai.py

# Also check the test file itself for any error handling
rg -B 5 -A 5 "try" tests/test_kg_openai.py

# And check the chat session implementation for error handling
rg -B 5 -A 5 "try" graphrag_sdk/chat_session.py

Length of output: 139

tests/test_kg_gemini.py (1)

93-94: Add error handling and type hints for the chat response.

While the change correctly handles the new dictionary response format, it could benefit from better error handling and type hints.

Consider applying this improvement:

-        answer = chat.send_message("How many actors acted in a movie?")
-        answer = answer['response']
+        response_dict = chat.send_message("How many actors acted in a movie?")
+        if not isinstance(response_dict, dict) or 'response' not in response_dict:
+            raise ValueError("Unexpected response format from chat session")
+        answer = response_dict['response']
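
A minimal sketch of how this check could be shared across the three test files instead of being repeated inline. `extract_answer` is a hypothetical helper name, not part of graphrag_sdk; it assumes only that `send_message` returns a dictionary with a `response` key, as described in this PR.

```python
from typing import Any


def extract_answer(result: Any) -> str:
    """Return the 'response' field of a chat result, or raise if the shape is unexpected."""
    if not isinstance(result, dict) or "response" not in result:
        raise ValueError("Unexpected response format from chat session")
    return result["response"]
```

A test would then call `extract_answer(chat.send_message(...))` and log the returned string as before.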
graphrag_sdk/agents/kg_agent.py (1)

139-139: Document the output structure and consider exposing context

While extracting just the 'response' works, the output dictionary contains valuable context ('question', 'context', 'cypher') that might be useful for debugging or advanced use cases. Consider:

  1. Documenting the complete output structure in the method's docstring
  2. Providing a way to access the full context when needed

Here's a suggested improvement:

     def run(self, params: dict) -> str:
         """
         Ask the agent a question.
 
         Args:
             params (dict): The parameters for the agent.
 
         Returns:
-            str: The agent's response.
+            str: The agent's response. Note: The full output dictionary contains
+                 additional context including 'question', 'context', and 'cypher'
+                 which can be accessed by setting return_context=True.
 
         """
-        output = self.chat_session.send_message(params["prompt"])
-        return output['response']
+        output = self.chat_session.send_message(params["prompt"])
+        return output['response'] if not params.get('return_context') else output
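
One caveat with the diff above is that `run` is annotated as returning `str` but would return a dictionary when `return_context` is set. A possible alternative, sketched here under assumptions, is to keep `run` returning a string and expose the full output through a second method. `AgentSketch`, `run_with_context`, and `StubChatSession` are hypothetical names used only for illustration; they are not the SDK's classes.

```python
from typing import Any, Dict


class StubChatSession:
    """Stand-in for the real chat session; returns the dictionary shape described in this PR."""

    def send_message(self, message: str) -> Dict[str, Any]:
        return {
            "question": message,
            "response": "stub answer",
            "context": [],
            "cypher": "MATCH (n) RETURN n",
        }


class AgentSketch:
    """Hypothetical agent wrapper, not the actual agent class in kg_agent.py."""

    def __init__(self, chat_session: Any) -> None:
        self.chat_session = chat_session

    def run_with_context(self, params: Dict[str, Any]) -> Dict[str, Any]:
        """Return the full output dictionary (question, response, context, cypher)."""
        return self.chat_session.send_message(params["prompt"])

    def run(self, params: Dict[str, Any]) -> str:
        """Return only the answer text, so the declared str return type stays accurate."""
        return self.run_with_context(params)["response"]
```

Splitting the two paths keeps each method's return type fixed, which is easier for callers and type checkers than a flag that changes the return shape.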
graphrag_sdk/chat_session.py (1)

101-106: Add type hints and fix formatting

Consider:

  1. Adding a TypedDict for the return type
  2. Removing trailing whitespace
+from typing import TypedDict
+
+class ChatResponse(TypedDict):
+    question: str
+    response: str
+    context: list[str]
+    cypher: str
+
 def send_message(self, message: str) -> ChatResponse:
     # ...
     return {
-        "question": message, 
-        "response": answer, 
-        "context": context, 
-        "cypher": cypher
-        }
+        "question": message,
+        "response": answer,
+        "context": context,
+        "cypher": cypher
+    }
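
A hedged variant of the TypedDict above. Elsewhere in this review, the fallback path of `send_message` is shown returning `None` for `context` and `cypher`, so those two fields probably need to be optional. The element type of `context` is an assumption carried over from the suggestion, and `fallback_response` is a hypothetical helper.

```python
from typing import List, Optional, TypedDict


class ChatResponse(TypedDict):
    question: str
    response: str
    context: Optional[List[str]]  # assumed element type; None when no Cypher query was produced
    cypher: Optional[str]         # None when no Cypher query was produced


def fallback_response(message: str) -> ChatResponse:
    """Build the 'no answer' result with the same keys as a successful one."""
    return {
        "question": message,
        "response": "I am sorry, I could not find the answer to your question",
        "context": None,
        "cypher": None,
    }
```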
graphrag_sdk/kg.py (2)

Line range hint 33-54: Enhance parameter documentation with examples

The parameter documentation is clear but would benefit from example values to help users understand the expected format and placeholders.

Consider adding examples like:

"""
Example:
    cypher_system_instruction: "Using the following ontology: {ontology}..."
    qa_prompt: "Question: {question}\nContext: {context}\nCypher: {cypher}\nAnswer:"
"""

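To illustrate how placeholders in such a prompt would be filled, here is a small sketch using `str.format`. It assumes the SDK substitutes `{question}`, `{context}`, and `{cypher}` this way, which this review does not confirm for `qa_prompt`. `render_qa_prompt` is a hypothetical helper.

```python
QA_PROMPT = "Question: {question}\nContext: {context}\nCypher: {cypher}\nAnswer:"


def render_qa_prompt(template: str, question: str, context: str, cypher: str) -> str:
    """Fill the three named placeholders of a QA prompt template."""
    return template.format(question=question, context=context, cypher=cypher)


rendered = render_qa_prompt(
    QA_PROMPT,
    question="Who directed The Matrix?",
    context="[['Lana Wachowski'], ['Lilly Wachowski']]",
    cypher="MATCH (d:Director)-[:DIRECTED]->(m:Movie) RETURN d.name",
)
```

With `str.format`, any literal brace in a custom template has to be doubled (`{{` and `}}`), which is worth mentioning in the parameter documentation.
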
185-187: Improve readability by breaking long line into multiple lines

The chat session initialization line is quite long. Consider reformatting for better readability:

-        chat_session = ChatSession(self._model_config, self.ontology, self.graph, self.cypher_system_instruction,
-                                   self.qa_system_instruction, self.cypher_gen_prompt, self.qa_prompt, self.cypher_gen_prompt_history)
+        chat_session = ChatSession(
+            self._model_config,
+            self.ontology,
+            self.graph,
+            self.cypher_system_instruction,
+            self.qa_system_instruction,
+            self.cypher_gen_prompt,
+            self.qa_prompt,
+            self.cypher_gen_prompt_history
+        )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 85fdacf and 5e5500b.

📒 Files selected for processing (6)
  • graphrag_sdk/agents/kg_agent.py (1 hunks)
  • graphrag_sdk/chat_session.py (5 hunks)
  • graphrag_sdk/kg.py (5 hunks)
  • tests/test_kg_gemini.py (1 hunks)
  • tests/test_kg_ollama.py (1 hunks)
  • tests/test_kg_openai.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_kg_ollama.py
🧰 Additional context used
🪛 Ruff (0.7.0)
graphrag_sdk/kg.py

73-73: No explicit stacklevel keyword argument found

(B028)


90-90: No explicit stacklevel keyword argument found

(B028)


98-98: No explicit stacklevel keyword argument found

(B028)
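
For reference, B028 is satisfied by passing `stacklevel` explicitly to `warnings.warn`. The sketch below uses a hypothetical function and message; the actual warnings in kg.py are not shown in this review.

```python
import warnings


def load_legacy_setting(value):
    """Hypothetical function that warns its caller about a deprecated argument."""
    warnings.warn(
        "this argument is deprecated",  # placeholder message
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this function
    )
    return value
```

With `stacklevel=2`, the reported file and line point at the code that called the function, which is usually where the fix is needed.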

🔇 Additional comments (3)
tests/test_kg_gemini.py (1)

93-94: Verify consistent response handling across test files.

Since this change is part of a broader update to chat response handling, let's verify that similar changes are applied consistently in other test files.

✅ Verification successful

Response handling is consistent across test files

The verification shows that all test files (test_kg_gemini.py, test_kg_openai.py, and test_kg_ollama.py) handle chat responses in exactly the same way:

  1. They all use answer = chat.send_message(...) to send a message
  2. They all extract the response using answer = answer['response']
  3. They all follow the same pattern of logging the answer afterward
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check if other test files handle the chat response similarly
# Expected: Similar response extraction in test_kg_openai.py and test_kg_ollama.py

echo "Checking chat response handling in other test files:"
rg -A 2 "chat.send_message" "tests/test_kg_*.py"

Length of output: 222


Script:

#!/bin/bash
# Let's first find all test files related to knowledge graph (kg) testing
fd "test_kg.*\.py$" tests/

# Then examine the content of these files focusing on chat response handling
echo "Checking chat response handling patterns:"
rg -B 2 -A 2 "send_message|response" tests/test_kg*

Length of output: 1362

graphrag_sdk/agents/kg_agent.py (1)

Line range hint 39-39: Add required parameters to chat_session initialization

According to the changes in the codebase, ChatSession now requires four new parameters for custom instructions and prompts. These parameters should be passed through the chat_session() method call.

Let's verify the required parameters for ChatSession:

graphrag_sdk/chat_session.py (1)

48-48: 🛠️ Refactor suggestion

Validate format string placeholders

The switch to using format() is good, but ensure the cypher_system_instruction contains the correct placeholder {ontology}. Consider adding validation:

+        if "{ontology}" not in cypher_system_instruction:
+            raise ValueError("cypher_system_instruction must contain {ontology} placeholder")
         cypher_system_instruction = cypher_system_instruction.format(ontology=str(ontology.to_json()))
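
A slightly stricter sketch of the same validation, using the standard library's `string.Formatter` to list the placeholders a template really contains. The helper names are hypothetical. Unlike a substring test, this treats an escaped `{{ontology}}` as literal text and not as a placeholder.

```python
from string import Formatter
from typing import Iterable, Set


def placeholder_names(template: str) -> Set[str]:
    """Return the names of the replacement fields found in a format string."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def require_placeholders(template: str, required: Iterable[str]) -> None:
    """Raise ValueError if any required placeholder is missing from the template."""
    missing = set(required) - placeholder_names(template)
    if missing:
        raise ValueError(f"template is missing placeholders: {sorted(missing)}")
```

For example, `require_placeholders(cypher_system_instruction, ["ontology"])` could run before the `format()` call.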

Comment thread graphrag_sdk/kg.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Outside diff range and nitpick comments (2)
graphrag_sdk/chat_session.py (2)

51-53: Add type hints for prompt attributes

Add type hints to improve code maintainability and IDE support.

+        self.cypher_prompt: str = cypher_gen_prompt
+        self.qa_prompt: str = qa_prompt
+        self.cypher_prompt_with_history: str = cypher_gen_prompt_history

91-96: Define constants for response dictionary keys

Define constants for the dictionary keys to prevent typos and improve maintainability.

+    # Class-level constants for response keys
+    QUESTION_KEY = "question"
+    RESPONSE_KEY = "response"
+    CONTEXT_KEY = "context"
+    CYPHER_KEY = "cypher"
+
     def send_message(self, message: str):
         # ...
         if not cypher or len(cypher) == 0:
             return {
-                "question": message,
-                "response": "I am sorry, I could not find the answer to your question",
-                "context": None,
-                "cypher": None
+                self.QUESTION_KEY: message,
+                self.RESPONSE_KEY: "I am sorry, I could not find the answer to your question",
+                self.CONTEXT_KEY: None,
+                self.CYPHER_KEY: None
             }
         # ...
         return {
-            "question": message,
-            "response": answer,
-            "context": context,
-            "cypher": cypher
+            self.QUESTION_KEY: message,
+            self.RESPONSE_KEY: answer,
+            self.CONTEXT_KEY: context,
+            self.CYPHER_KEY: cypher
         }

Also applies to: 106-111

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 5e5500b and 572d03b.

📒 Files selected for processing (1)
  • graphrag_sdk/chat_session.py (4 hunks)

Comment thread graphrag_sdk/chat_session.py
Comment thread graphrag_sdk/chat_session.py
Comment thread graphrag_sdk/chat_session.py

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🧹 Outside diff range and nitpick comments (12)
ontology.json (5)

19-19: Add meaningful descriptions for entities

Each entity has an empty description field. Adding clear descriptions would improve schema documentation and help users understand the purpose of each entity.

Example improvements:

-      "description": ""
+      "description": "Represents a fighter or participant in combat sports events"
-      "description": ""
+      "description": "Represents a combat sports event that contains multiple bouts"
-      "description": ""
+      "description": "Represents a single fight/match between two fighters"

Also applies to: 43-43, 85-85


31-35: Consider using proper date type

The date attribute is defined as a string. Consider using a proper date type and specifying the expected format (e.g., ISO 8601) to ensure data consistency.

       {
         "name": "date",
-        "type": "string",
+        "type": "date",
         "unique": false,
-        "required": true
+        "required": true,
+        "format": "ISO8601"
       },
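
Whether the ontology schema accepts a `date` type or a `format` key is not confirmed in this review. If it does not, the same guarantee could be enforced in Python before the data is loaded. A minimal sketch, with a hypothetical helper name, assuming dates are expected as `YYYY-MM-DD`:

```python
from datetime import date


def parse_event_date(value: str) -> date:
    """Parse an ISO 8601 calendar date (YYYY-MM-DD); raises ValueError otherwise."""
    return date.fromisoformat(value)
```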

73-77: Add validation for time_format

The time_format attribute should have constraints to ensure valid values (e.g., "3x5" for three five-minute rounds).

       {
         "name": "time_format",
         "type": "string",
         "unique": false,
-        "required": true
+        "required": true,
+        "pattern": "^\\d+x\\d+$",
+        "description": "Format: NxM where N is number of rounds and M is minutes per round"
       },
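
The suggested pattern can be checked in Python as follows. This is only a sketch: `parse_time_format` is a hypothetical helper, and it assumes values follow the `NxM` form proposed above, which the source data may not.

```python
import re
from typing import Tuple

# Same shape as the suggested pattern, with groups to capture both numbers.
TIME_FORMAT = re.compile(r"(\d+)x(\d+)")


def parse_time_format(value: str) -> Tuple[int, int]:
    """Return (rounds, minutes_per_round) for a value such as '3x5'."""
    match = TIME_FORMAT.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid time_format: {value!r}")
    return int(match.group(1)), int(match.group(2))
```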

49-53: Add enumeration for weight_class

The weight_class attribute should have predefined values to ensure consistency.

       {
         "name": "weight_class",
         "type": "string",
         "unique": false,
-        "required": true
+        "required": true,
+        "enum": [
+          "Flyweight",
+          "Bantamweight",
+          "Featherweight",
+          "Lightweight",
+          "Welterweight",
+          "Middleweight",
+          "Light Heavyweight",
+          "Heavyweight"
+        ]
       },

105-109: Add range constraints for numerical statistics

Numerical attributes like knockdowns, significant_strikes, and takedowns should have minimum values to prevent negative numbers.

       {
         "name": "knockdowns",
         "type": "number",
         "unique": false,
-        "required": false
+        "required": false,
+        "minimum": 0
       },

Similar constraints should be added to other numerical statistics.

Also applies to: 111-115, 141-145

examples/movies/demo-movies.ipynb (5)

Line range hint 45-45: Avoid commenting out sensitive credentials in example code

The commented API key line could lead to accidental credential exposure if users uncomment it. Consider replacing it with a placeholder or environment variable reference.

-# OPENAI_API_KEY = "sk-"# OpenAI API key
+# Set OPENAI_API_KEY in your .env file or environment variables
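
A minimal sketch of reading the key from the environment and failing early when it is absent. `get_openai_api_key` is a hypothetical helper; the optional `env` argument exists only so the function can be exercised without touching the real environment.

```python
import os
from typing import Mapping, Optional


def get_openai_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return OPENAI_API_KEY from the environment, or raise if it is unset or empty."""
    source = os.environ if env is None else env
    key = source.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your environment or .env file")
    return key
```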

Line range hint 62-73: Consider externalizing the URL list

To improve maintainability and allow for easier updates, consider moving the URLs to a separate configuration file or providing them as a parameter.

-urls = ["https://www.rottentomatoes.com/m/side_by_side_2012",
-"https://www.rottentomatoes.com/m/matrix",
-"https://www.rottentomatoes.com/m/matrix_revolutions",
-"https://www.rottentomatoes.com/m/matrix_reloaded",
-"https://www.rottentomatoes.com/m/speed_1994",
-"https://www.rottentomatoes.com/m/john_wick_chapter_4"]
+# Load URLs from a configuration file or environment
+from config import DEMO_URLS
+urls = DEMO_URLS

Line range hint 91-95: Consider making boundaries configurable

The ontology boundaries are hardcoded in the example. Consider making them configurable to demonstrate flexibility in ontology creation.

-boundaries = """
-    Extract only the most relevant information about all the movies, actors, and directors over the text.
-    Avoid creating entities for details that can be expressed as attributes.
-"""
+# Load boundaries from configuration or allow as parameter
+boundaries = get_ontology_boundaries()  # or pass as parameter to the notebook

Line range hint 116-122: Add error handling for source processing

Consider adding try-catch blocks to handle potential errors during source processing and provide meaningful feedback.

-kg.process_sources(sources)
+try:
+    kg.process_sources(sources)
+except Exception as e:
+    print(f"Error processing sources: {e}")
+    raise

Line range hint 139-157: Improve output readability and add error handling

The chat session output could be more readable, and error handling should be added for robustness.

-print(chat.send_message("Who is the director of the movie The Matrix?"))
+def format_response(response):
+    """Format the chat response for better readability"""
+    return f"""
+Question: {response['question']}
+Answer: {response['response']}
+Context: {response['context']}
+Cypher: {response['cypher']}
+"""
+
+try:
+    response = chat.send_message("Who is the director of the movie The Matrix?")
+    print(format_response(response))
+except Exception as e:
+    print(f"Error in chat session: {e}")
examples/ufc/demo-ufc.ipynb (2)

Line range hint 95-100: Ensure sampled_sources is not empty to prevent errors

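When calculating sampled_sources, if len(sources) * percent rounds down to zero, random.sample returns an empty list (it raises a ValueError only when the requested size is negative or larger than the population), so later steps would silently receive no sources. To avoid this, ensure that at least one source is selected, especially when the dataset is small or the percent value is low.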

Consider modifying the code to guarantee a minimum sample size:

+sample_size = max(1, round(len(sources) * percent))
-sampled_sources = random.sample(sources, round(len(sources) * percent))
+sampled_sources = random.sample(sources, sample_size)
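
The same idea as a self-contained sketch, so that a small dataset or a low percent never yields an empty sample. `sample_sources` and its `seed` parameter are hypothetical additions for illustration and reproducibility.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_sources(sources: Sequence[T], percent: float, seed: Optional[int] = None) -> List[T]:
    """Sample roughly `percent` of the sources, but at least one when any exist."""
    if not sources:
        return []
    # Clamp the size to the range [1, len(sources)].
    size = min(len(sources), max(1, round(len(sources) * percent)))
    return random.Random(seed).sample(sources, size)
```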

Model name "gpt-4o" differs from the models used in the tests

The README and examples consistently use the model name "gpt-4o", while the test files use "gpt-3.5-turbo" and "gpt-3.5-turbo-0125". Note that "gpt-4o" is a valid OpenAI model identifier, so this is an inconsistency between the examples and the tests, not a typo.

Locations using "gpt-4o":

  • README.md (multiple occurrences)
  • examples/ufc/demo-ufc.ipynb
  • examples/movies/demo-movies.ipynb
  • examples/trip/demo_orchestrator_trip.ipynb

No change is required unless the examples are meant to match the models exercised by the tests.

🔗 Analysis chain

Line range hint 98-98: Verify the correctness of the model name in OpenAiGenerativeModel

The model name "gpt-4o" may be incorrect or unsupported. Please verify that it corresponds to a valid OpenAI model name. Common valid model names include "gpt-3.5-turbo" or "gpt-4".

Run the following script to check for valid OpenAI model names used in the codebase:

This script will list all the model names used with OpenAiGenerativeModel in the codebase, helping you verify the correct model names.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Search for OpenAI model names used in the codebase.

# Search for instances where OpenAiGenerativeModel is initialized.
rg 'OpenAiGenerativeModel\(model_name=["\'](.*?)["\']\)' -or '$1' | sort | uniq

Length of output: 262


Script:

#!/bin/bash
# Search for OpenAI model names using simpler patterns
rg -i "model_name.*['\"]gpt" -A 2

Length of output: 1112


Script:

#!/bin/bash
# Search for any OpenAI model references
rg -i "gpt-[34]" -A 2

Length of output: 1948


Script:

#!/bin/bash
# Look for OpenAiGenerativeModel usage
rg "OpenAiGenerativeModel" -A 3

Length of output: 6815

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 572d03b and 2dcd12b.

📒 Files selected for processing (3)
  • examples/movies/demo-movies.ipynb (7 hunks)
  • examples/ufc/demo-ufc.ipynb (10 hunks)
  • ontology.json (1 hunks)
🔇 Additional comments (4)
ontology.json (1)

1-199: Verify ontology integration with chat sessions

The AI summary indicates that this ontology is used in chat sessions and knowledge graph initialization. Let's verify this integration.

✅ Verification successful

Ontology integration is properly implemented and actively used

The ontology file is correctly integrated and utilized throughout the codebase:

  • Used in chat sessions (chat_session.py) to format system instructions and handle queries
  • Core component in knowledge graph operations (kg.py) for validating nodes, relations, and attributes
  • Actively used in data extraction (extract_data_step.py) with proper placeholder replacement
  • Supports ontology creation and fixes through create_ontology_step.py

The implementation shows proper validation, error handling, and integration with both chat and graph operations, confirming the AI summary's claims about its usage in chat sessions and knowledge graph initialization.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify how the ontology is used in the codebase

# Check for ontology usage in chat sessions and knowledge graph
echo "Checking for ontology usage in code..."
rg -l "ontology" --type py

# Check for specific placeholder mentioned in the summary
echo "Checking for #ONTOLOGY placeholder..."
rg "#ONTOLOGY" --type py

Length of output: 913


Script:

#!/bin/bash
# Let's check how the ontology is actually used in key files

# Check ontology usage in chat_session.py
echo "=== chat_session.py usage ==="
rg -A 5 "ontology" graphrag_sdk/chat_session.py

# Check ontology usage in kg.py
echo "=== kg.py usage ==="
rg -A 5 "ontology" graphrag_sdk/kg.py

# Check create_ontology_step.py to understand initialization
echo "=== create_ontology_step.py usage ==="
rg -A 5 "ontology" graphrag_sdk/steps/create_ontology_step.py

# Check extract_data_step.py since it uses #ONTOLOGY placeholder
echo "=== extract_data_step.py usage ==="
rg -A 5 "ontology" graphrag_sdk/steps/extract_data_step.py

Length of output: 11870

examples/movies/demo-movies.ipynb (2)

146-149: Changes align well with PR objectives

The updated output format successfully demonstrates the "reveal chat context" objective by including context and Cypher queries in the responses.


87-90: ⚠️ Potential issue

Fix the model name typo

The model name "gpt-4o" appears to be incorrect. It should probably be "gpt-4".

-model = OpenAiGenerativeModel(model_name="gpt-4o")
+model = OpenAiGenerativeModel(model_name="gpt-4")

Likely invalid or redundant comment.

examples/ufc/demo-ufc.ipynb (1)

386-387: Verify the exposure of internal details in the chat responses

The chat responses include detailed context and Cypher queries, which may expose internal implementation details or sensitive information.

Consider whether this level of detail is appropriate for end-users. If not, you might want to modify the response handling to exclude or anonymize the context and cypher information.

To verify where the response structure is defined and adjust it accordingly, run:

This script helps locate the code responsible for the response formatting so you can make necessary adjustments.

Comment thread ontology.json Outdated
Comment thread ontology.json Outdated
Comment thread examples/ufc/demo-ufc.ipynb
Comment thread examples/ufc/demo-ufc.ipynb
@galshubeli galshubeli requested a review from swilly22 November 24, 2024 16:09
2 participants