feat(model/rerank): support instruct #55
Summary of Changes
Hello @kevinlin09, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request introduces a new capability to the TextReRank model by adding an "instruct" parameter. This enhancement allows for more guided and context-aware text reranking, improving the relevance of results based on specific instructions provided by the user.
Highlights
- TextReRank Model Enhancement: Introduced support for an "instruct" parameter in the TextReRank.call method, allowing users to provide instructions to guide the reranking process.
- New Test Case: Added samples/test_text_rerank.py to demonstrate the usage and functionality of the new "instruct" parameter with the TextReRank API.
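A minimal sketch of how a request using the new parameter might be assembled. Only `documents`, `return_documents`, `top_n`, and `instruct` appear in the sample under review; the `model` and `query` keyword names, the helper name `build_rerank_kwargs`, and the placeholder model name are assumptions made for illustration.

```python
from typing import Any, Dict, List, Optional


def build_rerank_kwargs(
    model: str,
    query: str,
    documents: List[str],
    instruct: Optional[str] = None,
    top_n: int = 5,
    return_documents: bool = True,
) -> Dict[str, Any]:
    """Assemble keyword arguments for a TextReRank.call-style request."""
    kwargs: Dict[str, Any] = {
        "model": model,            # assumed keyword name
        "query": query,            # assumed keyword name
        "documents": documents,
        "top_n": top_n,
        "return_documents": return_documents,
    }
    # Only send `instruct` when the caller supplied one, so callers that
    # never pass it produce the same request as before.
    if instruct is not None:
        kwargs["instruct"] = instruct
    return kwargs


# Hypothetical usage; "<rerank-model>" is a placeholder, not a real model ID:
# response = TextReRank.call(**build_rerank_kwargs(
#     "<rerank-model>", "What is text reranking?", docs,
#     instruct="Retrieve documents that can answer the user's query."))
```

Keeping `instruct` optional and omitting it when unset is one way to leave existing callers unaffected; whether the PR itself does this is not visible in this excerpt.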
Code Review
This pull request adds a sample script to demonstrate the new instruct parameter for the TextReRank API. The script is well-structured, but I have a few suggestions for improvement. First, the script is named as a test but lacks assertions to verify the API response; I've suggested adding assertions to make it a proper test. Second, there's a minor typo in the instruct parameter's value which I've pointed out. Finally, I've recommended changing the .env file loading logic to look for the file in the project root, which is a more common convention than the user's home directory. These changes will improve the quality and usefulness of the sample script.
```python
print("\n✅ Test passed! All assertions successful.")
```
The function is named test_text_rerank and the success message claims "All assertions successful", but there are no assertions in the code. This makes it an example script rather than a test. To ensure the API is working correctly, you should add assertions to validate the structure and content of the response.
Suggested change:

```diff
- print("\n✅ Test passed! All assertions successful.")
+ assert response.output, "Response should have an output."
+ assert 'results' in response.output, "Output should contain 'results'."
+ results = response.output['results']
+ assert isinstance(results, list)
+ assert len(results) <= len(documents)
+ assert all('relevance_score' in r for r in results)
+ assert all('index' in r for r in results)
+ assert all('document' in r for r in results), "document should be returned when return_documents=True"
+ print("\n✅ Test passed! All assertions successful.")
```
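The assertions suggested above could also be collected into a reusable checker that can be exercised without calling the API. This is a sketch only: the helper name `check_rerank_output` is hypothetical, the response shape is taken from the assertions in this review rather than from the API reference, and the descending-score check is an added assumption about how results are ordered.

```python
from typing import Any, Dict, List


def check_rerank_output(output: Dict[str, Any], documents: List[str],
                        return_documents: bool = True) -> List[Dict[str, Any]]:
    """Raise AssertionError if `output` does not look like a rerank result."""
    assert "results" in output, "Output should contain 'results'."
    results = output["results"]
    assert isinstance(results, list)
    assert len(results) <= len(documents)
    for r in results:
        # Each index should refer back to one of the input documents.
        assert 0 <= r["index"] < len(documents), "index should point into documents"
        assert isinstance(r["relevance_score"], (int, float))
        if return_documents:
            assert "document" in r, "document should be returned when return_documents=True"
    # Assumption: results come back best-first.
    scores = [r["relevance_score"] for r in results]
    assert scores == sorted(scores, reverse=True), "scores should be in descending order"
    return results
```

In the sample script this would be called as `check_rerank_output(response.output, documents)` just before the success message is printed.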
```python
documents=documents,
return_documents=True,
top_n=5,
instruct="Retrieval document that can answer users query."
```
There appears to be a grammatical error and a typo in the instruct string. "Retrieval" is a noun, but an instruction should typically start with a verb like "Retrieve". Also, "document" should probably be plural "documents", and "users" should be possessive "user's". Correcting this will improve clarity and may lead to better model performance.
Suggested change:

```diff
- instruct="Retrieval document that can answer users query."
+ instruct="Retrieve documents that can answer the user's query."
```
```python
try:
    with open(os.path.expanduser('~/.env'), 'r') as f:
        for line in f:
            if line.strip() and not line.startswith('#'):
                key, value = line.strip().split('=', 1)
                os.environ[key] = value
except FileNotFoundError:
    print("No .env file found, using system environment variables")
```
The current implementation loads environment variables from ~/.env, which is a user-level global configuration file. For project-specific settings, it's more conventional to use a .env file in the project's root directory. This makes the project more self-contained and easier for other developers to set up. I suggest modifying this logic to load from the project root and also improving the error message to be more specific.
Suggested change:

```diff
- try:
-     with open(os.path.expanduser('~/.env'), 'r') as f:
-         for line in f:
-             if line.strip() and not line.startswith('#'):
-                 key, value = line.strip().split('=', 1)
-                 os.environ[key] = value
- except FileNotFoundError:
-     print("No .env file found, using system environment variables")
+ try:
+     project_root = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
+     dotenv_path = os.path.join(project_root, '.env')
+     with open(dotenv_path, 'r') as f:
+         for line in f:
+             if line.strip() and not line.startswith('#'):
+                 key, value = line.strip().split('=', 1)
+                 os.environ[key] = value
+ except FileNotFoundError:
+     print(f"No .env file found at '{dotenv_path}', using system environment variables")
```
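Both versions of the loader above share a few rough edges: a non-comment line without `=` makes the tuple unpacking raise `ValueError`, an indented `#` comment is not skipped because `startswith` is checked before stripping, and values from the file always overwrite variables already set in the environment. A slightly more defensive variant might look like the following sketch. The function name `load_env_file` and the `override` flag are hypothetical, and the quote handling is deliberately simplistic.

```python
import os
from typing import Dict


def load_env_file(path: str, override: bool = False) -> Dict[str, str]:
    """Parse KEY=VALUE lines from `path` into os.environ; return what was set."""
    loaded: Dict[str, str] = {}
    try:
        with open(path, "r", encoding="utf-8") as f:
            for raw in f:
                line = raw.strip()
                # Skip blanks, comments (even indented ones), and lines without '='.
                if not line or line.startswith("#") or "=" not in line:
                    continue
                key, value = line.split("=", 1)
                key = key.strip()
                # Simplified: drop any quote characters at either end of the value.
                value = value.strip().strip("'\"")
                if not key:
                    continue
                # By default, variables already present in the environment win.
                if override or key not in os.environ:
                    os.environ[key] = value
                    loaded[key] = value
    except FileNotFoundError:
        print(f"No .env file found at '{path}', using system environment variables")
    return loaded
```

With the project-root convention recommended in the review, the call would be `load_env_file(os.path.join(project_root, '.env'))`. Where adding a dependency is acceptable, a dedicated dotenv library would handle more of the edge cases (escapes, multi-line values) than this sketch does.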