Python: Verify local models in Ollama and LM Studio are compatible with the OpenAI connector #6973
Merged: TaoChenOSU merged 15 commits into microsoft:main from TaoChenOSU:taochen/local-models-with-openai-connector-2 on Jul 5, 2024.

Commits (15, all by TaoChenOSU):
- ebd79a4 Add samples to use local models
- 13bfe5c Setup Ollama for sample tests
- 918f4d8 Mark skip
- d32628a Misc
- d403cc5 Misc 2
- b524999 Merge branch 'main' into taochen/local-models-with-openai-connector-2
- 793418e Merge branch 'main' into taochen/local-models-with-openai-connector-2
- 85e51f7 Update readme
- dff067c Merge branch 'main' into taochen/local-models-with-openai-connector-2
- 8793d3d Merge branch 'main' into taochen/local-models-with-openai-connector-2
- a213a83 Merge branch 'main' into taochen/local-models-with-openai-connector-2
- b45f1a7 Make api key optional
- ad63f4b Merge branch 'main' into taochen/local-models-with-openai-connector-2
- ace8c0e Merge branch 'main' into taochen/local-models-with-openai-connector-2
- 8f96c18 Fix integration test
python/samples/concepts/local_models/lm_studio_chat_completion.py (new file, 83 additions):

```python
# Copyright (c) Microsoft. All rights reserved.

import asyncio

from openai import AsyncOpenAI

from semantic_kernel.connectors.ai.open_ai.services.open_ai_chat_completion import OpenAIChatCompletion
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.functions.kernel_arguments import KernelArguments
from semantic_kernel.kernel import Kernel

# This concept sample shows how to use the OpenAI connector to create a
# chat experience with a local model running in LM Studio: https://lmstudio.ai/
# Please follow the instructions here: https://lmstudio.ai/docs/local-server to set up LM Studio.
# The default model used in this sample is phi3 due to its compact size.

system_message = """
You are a chat bot. Your name is Mosscap and
you have one goal: figure out what people need.
Your full name, should you need to know it, is
Splendid Speckled Mosscap. You communicate
effectively, but you tend to answer with long
flowery prose.
"""

kernel = Kernel()

service_id = "local-gpt"

openAIClient: AsyncOpenAI = AsyncOpenAI(
    api_key="fake-key",  # This cannot be an empty string; use a fake key
    base_url="http://localhost:1234/v1",
)
kernel.add_service(OpenAIChatCompletion(service_id=service_id, ai_model_id="phi3", async_client=openAIClient))

settings = kernel.get_prompt_execution_settings_from_service_id(service_id)
settings.max_tokens = 2000
settings.temperature = 0.7
settings.top_p = 0.8

chat_function = kernel.add_function(
    plugin_name="ChatBot",
    function_name="Chat",
    prompt="{{$chat_history}}{{$user_input}}",
    template_format="semantic-kernel",
    prompt_execution_settings=settings,
)

chat_history = ChatHistory(system_message=system_message)
chat_history.add_user_message("Hi there, who are you?")
chat_history.add_assistant_message("I am Mosscap, a chat bot. I'm trying to figure out what people need")


async def chat() -> bool:
    try:
        user_input = input("User:> ")
    except (KeyboardInterrupt, EOFError):
        print("\n\nExiting chat...")
        return False

    if user_input == "exit":
        print("\n\nExiting chat...")
        return False

    answer = await kernel.invoke(chat_function, KernelArguments(user_input=user_input, chat_history=chat_history))
    chat_history.add_user_message(user_input)
    chat_history.add_assistant_message(str(answer))
    print(f"Mosscap:> {answer}")
    return True


async def main() -> None:
    chatting = True
    while chatting:
        chatting = await chat()


if __name__ == "__main__":
    asyncio.run(main())
```
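The sample above routes everything through the kernel. A quicker way to confirm that LM Studio's local server actually speaks the OpenAI protocol is a bare AsyncOpenAI round trip. The sketch below is not part of the PR; it assumes LM Studio is serving phi3 at http://localhost:1234/v1, as the sample does.

```python
# Minimal sketch (not part of the PR): verify the LM Studio server answers
# an OpenAI-style chat/completions request before wiring it into the kernel.
import asyncio

from openai import AsyncOpenAI


async def smoke_test() -> None:
    # Same fake key and base_url as the sample; adjust if your server differs.
    client = AsyncOpenAI(api_key="fake-key", base_url="http://localhost:1234/v1")
    response = await client.chat.completions.create(
        model="phi3",
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
        max_tokens=50,
    )
    print(response.choices[0].message.content)


if __name__ == "__main__":
    asyncio.run(smoke_test())
```

If this single call succeeds, the OpenAIChatCompletion connector in the sample should work against the same endpoint, since it drives the same client underneath.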
python/samples/concepts/local_models/lm_studio_text_embedding.py (new file, 62 additions):

```python
# Copyright (c) Microsoft. All rights reserved.

import asyncio

from openai import AsyncOpenAI

from semantic_kernel.connectors.ai.open_ai.services.open_ai_text_embedding import OpenAITextEmbedding
from semantic_kernel.core_plugins.text_memory_plugin import TextMemoryPlugin
from semantic_kernel.kernel import Kernel
from semantic_kernel.memory.semantic_text_memory import SemanticTextMemory
from semantic_kernel.memory.volatile_memory_store import VolatileMemoryStore

# This concept sample shows how to use the OpenAI connector to add memory
# to applications with a local embedding model running in LM Studio: https://lmstudio.ai/
# Please follow the instructions here: https://lmstudio.ai/docs/local-server to set up LM Studio.
# The default model used in this sample is from nomic.ai due to its compact size.

kernel = Kernel()

service_id = "local-gpt"

openAIClient: AsyncOpenAI = AsyncOpenAI(
    api_key="fake_key",  # This cannot be an empty string; use a fake key
    base_url="http://localhost:1234/v1",
)
kernel.add_service(
    OpenAITextEmbedding(
        service_id=service_id, ai_model_id="Nomic-embed-text-v1.5-Embedding-GGUF", async_client=openAIClient
    )
)

memory = SemanticTextMemory(storage=VolatileMemoryStore(), embeddings_generator=kernel.get_service(service_id))
kernel.add_plugin(TextMemoryPlugin(memory), "TextMemoryPlugin")


async def populate_memory(memory: SemanticTextMemory, collection_id: str = "generic") -> None:
    # Add some documents to the semantic memory
    await memory.save_information(collection=collection_id, id="info1", text="Your budget for 2024 is $100,000")
    await memory.save_information(collection=collection_id, id="info2", text="Your savings from 2023 are $50,000")
    await memory.save_information(collection=collection_id, id="info3", text="Your investments are $80,000")


async def search_memory_examples(memory: SemanticTextMemory, collection_id: str = "generic") -> None:
    questions = [
        "What is my budget for 2024?",
        "What are my savings from 2023?",
        "What are my investments?",
    ]

    for question in questions:
        print(f"Question: {question}")
        result = await memory.search(collection_id, question)
        print(f"Answer: {result[0].text}\n")


async def main() -> None:
    await populate_memory(memory)
    await search_memory_examples(memory)


if __name__ == "__main__":
    asyncio.run(main())
```
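Before involving SemanticTextMemory, it can help to confirm that the local server's embeddings endpoint responds at all. The sketch below is not part of the PR; the model name and port are assumptions copied from the sample and must match whatever LM Studio is actually serving.

```python
# Minimal sketch (not part of the PR): hit the OpenAI-style /v1/embeddings
# endpoint directly to confirm the local embedding model is reachable.
import asyncio

from openai import AsyncOpenAI


async def embedding_smoke_test() -> None:
    # Fake key and base_url mirror the sample; adjust to your setup.
    client = AsyncOpenAI(api_key="fake_key", base_url="http://localhost:1234/v1")
    response = await client.embeddings.create(
        model="Nomic-embed-text-v1.5-Embedding-GGUF",
        input=["Your budget for 2024 is $100,000"],
    )
    # A compatible server returns one embedding vector per input string.
    print(len(response.data), len(response.data[0].embedding))


if __name__ == "__main__":
    asyncio.run(embedding_smoke_test())
```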
python/samples/concepts/local_models/ollama_chat_completion.py (new file, 87 additions):

```python
# Copyright (c) Microsoft. All rights reserved.

import asyncio

from openai import AsyncOpenAI

from semantic_kernel.connectors.ai.open_ai.services.open_ai_chat_completion import OpenAIChatCompletion
from semantic_kernel.contents.chat_history import ChatHistory
from semantic_kernel.functions.kernel_arguments import KernelArguments
from semantic_kernel.kernel import Kernel

# This concept sample shows how to use the OpenAI connector with
# a local model running in Ollama: https://github.com/ollama/ollama
# A docker image is also available: https://hub.docker.com/r/ollama/ollama
# The default model used in this sample is phi3 due to its compact size.
# At the time of creating this sample, Ollama only provides experimental
# compatibility with the `chat/completions` endpoint:
# https://github.com/ollama/ollama/blob/main/docs/openai.md
# Please follow the instructions in the Ollama repository to set up Ollama.

system_message = """
You are a chat bot. Your name is Mosscap and
you have one goal: figure out what people need.
Your full name, should you need to know it, is
Splendid Speckled Mosscap. You communicate
effectively, but you tend to answer with long
flowery prose.
"""

kernel = Kernel()

service_id = "local-gpt"

openAIClient: AsyncOpenAI = AsyncOpenAI(
    api_key="fake-key",  # This cannot be an empty string; use a fake key
    base_url="http://localhost:11434/v1",
)
kernel.add_service(OpenAIChatCompletion(service_id=service_id, ai_model_id="phi3", async_client=openAIClient))

settings = kernel.get_prompt_execution_settings_from_service_id(service_id)
settings.max_tokens = 2000
settings.temperature = 0.7
settings.top_p = 0.8

chat_function = kernel.add_function(
    plugin_name="ChatBot",
    function_name="Chat",
    prompt="{{$chat_history}}{{$user_input}}",
    template_format="semantic-kernel",
    prompt_execution_settings=settings,
)

chat_history = ChatHistory(system_message=system_message)
chat_history.add_user_message("Hi there, who are you?")
chat_history.add_assistant_message("I am Mosscap, a chat bot. I'm trying to figure out what people need")


async def chat() -> bool:
    try:
        user_input = input("User:> ")
    except (KeyboardInterrupt, EOFError):
        print("\n\nExiting chat...")
        return False

    if user_input == "exit":
        print("\n\nExiting chat...")
        return False

    answer = await kernel.invoke(chat_function, KernelArguments(user_input=user_input, chat_history=chat_history))
    chat_history.add_user_message(user_input)
    chat_history.add_assistant_message(str(answer))
    print(f"Mosscap:> {answer}")
    return True


async def main() -> None:
    chatting = True
    while chatting:
        chatting = await chat()


if __name__ == "__main__":
    asyncio.run(main())
```
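Because the sample's comments flag Ollama's `chat/completions` support as experimental, streaming is a useful extra compatibility check beyond the single-shot calls the sample makes. The following is a hedged sketch, not part of the PR; it assumes `ollama serve` is running locally and that the phi3 model has been pulled with `ollama pull phi3`.

```python
# Hedged sketch (not part of the PR): stream tokens from Ollama's experimental
# OpenAI-compatible endpoint to check delta-based streaming works as expected.
import asyncio

from openai import AsyncOpenAI


async def stream_smoke_test() -> None:
    # Same fake key and Ollama port as the sample above.
    client = AsyncOpenAI(api_key="fake-key", base_url="http://localhost:11434/v1")
    stream = await client.chat.completions.create(
        model="phi3",
        messages=[{"role": "user", "content": "Name three kinds of moss."}],
        stream=True,
    )
    # Chunks carry incremental deltas; content may be None on the final chunk.
    async for chunk in stream:
        if chunk.choices and chunk.choices[0].delta.content:
            print(chunk.choices[0].delta.content, end="")
    print()


if __name__ == "__main__":
    asyncio.run(stream_smoke_test())
```

If the deltas arrive incrementally here, the endpoint behaves like the OpenAI streaming API that the connector's streaming methods expect.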