HAI-439 upgrade langchain-community to 0.2.4 to include cost calculation setting of gpt-4o #6

haoruiqian · 2024-06-12T11:40:19Z

This will also upgrade langchain and langchain-core to 0.2.2.

Description: This change adds `args_schema` (a pydantic `BaseModel`) to `WikipediaQueryRun` so that the schema is formatted correctly for LLM function calls.

Issue: Using `WikipediaQueryRun` with OpenAI function calling currently returns the error "TypeError: WikipediaQueryRun._run() got an unexpected keyword argument '__arg1'". This happens because the schema sent to the LLM is "input: '{"__arg1":"Hunter x Hunter"}'", while the method should be called with the "query" parameter.

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
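The failure mode described above can be reproduced without LangChain. The following is a minimal sketch, assuming a stand-in function `run` (hypothetical, not the real `WikipediaQueryRun._run`) that, like the tool method, only accepts a `query` keyword:

```python
def run(query: str) -> str:
    """Stand-in for a tool method that expects a `query` keyword (hypothetical)."""
    return f"searching for {query!r}"


# Without an explicit argument schema, the single argument arrives under a
# generic name, so the keyword does not match the method signature.
generic_call = {"__arg1": "Hunter x Hunter"}
try:
    run(**generic_call)
    error = None
except TypeError as exc:
    error = str(exc)

# With a schema that names the field `query`, the call matches the signature.
named_call = {"query": "Hunter x Hunter"}
result = run(**named_call)
```

Declaring the field name in a schema is what lets the caller produce `named_call` instead of `generic_call`.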

Thank you for contributing to LangChain!

- [X] **PR title**: "docs: Chroma docstrings update"
  - Where "package" is whichever of langchain, community, core, experimental, etc. is being modified. Use "docs: ..." for purely docs changes, "templates: ..." for template changes, "infra: ..." for CI changes.
  - Example: "community: add foobar LLM"
- [X] **PR message**:
  - **Description:** Added and updated Chroma docstrings
  - **Issue:** langchain-ai/langchain#21983
- [X] **Add tests and docs**: If you're adding a new integration, please include
  1. a test for the integration, preferably unit tests that do not rely on network access,
  2. an example notebook showing its use. It lives in `docs/docs/integrations` directory.
  - only docs
- [X] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified. See contribution guidelines for more: https://python.langchain.com/docs/contributing/

Additional guidelines:

- Make sure optional dependencies are imported within a function.
- Please do not add dependencies to pyproject.toml files (even optional ones) unless they are required for unit tests.
- Most PRs should not touch more than one package.
- Changes should be backwards compatible.
- If you are adding something to community, do not re-import it in langchain.

If no one reviews your PR within a few days, please @-mention one of baskaryan, efriis, eyurtsev, ccurme, vbarda, hwchase17.

…r search_kwargs (#21572)

- **Description:** Fixed `AzureSearchVectorStoreRetriever` to account for search_kwargs. More explanation is in the mentioned issue.
- **Issue:** #21492

---------

Co-authored-by: MAC <mac@MACs-MacBook-Pro.local>
Co-authored-by: Massimiliano Pronesti <massimiliano.pronesti@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>

…se. (#21980) Related to [20085](langchain-ai/langchain#20085) Updated the Javelin chat model to standardize the initialization argument. Also fixed an existing bug, where code was initialized with incorrect call to the JavelinClient defined in the javelin_sdk, resulting in an initialization error. See related [Javelin Documentation](https://docs.getjavelin.io/docs/javelin-python/quickstart).

…iever` (#22016) - **Description:** upgrade model to `gpt-4o`

…in dashscope. (#21249)

Add support for multimodal conversation in dashscope. We can now use the multimodal language models "qwen-vl-v1", "qwen-vl-chat-v1", and "qwen-audio-turbo" to process pictures and audio. :)

- [ ] **PR title**: "community: add multimodal conversation support in dashscope"
- [ ] **PR message**: ***Delete this entire checklist*** and replace with
  - **Description:** add multimodal conversation support in dashscope
  - **Issue:**
  - **Dependencies:** dashscope≥1.18.0
  - **Twitter handle:** none :)
- [ ] **How to use it?**:

```python
from langchain_community.chat_models import ChatTongyi

# api_key is assumed to hold your DashScope API key.
Tongyi_chat = ChatTongyi(
    top_p=0.5,
    dashscope_api_key=api_key,
    model="qwen-vl-v1",
)
response = Tongyi_chat.invoke(
    input=[
        {
            "role": "user",
            "content": [
                {"image": "https://dashscope.oss-cn-beijing.aliyuncs.com/images/dog_and_girl.jpeg"},
                {"text": "What is this?"},
            ],
        }
    ]
)
```

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>

Add admonitions with more information.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>
Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>

- [ ] **PR title**: "Add Naver ClovaX embedding to LangChain community"
  - HyperClovaX is a large language model developed by [Naver](https://clova-x.naver.com/welcome). It's a powerful and purpose-trained LLM.
  - You can visit the embedding service provided by [ClovaX](https://www.ncloud.com/product/aiService/clovaStudio)
  - You may get CLOVA_EMB_API_KEY, CLOVA_EMB_APIGW_API_KEY, CLOVA_EMB_APP_ID from https://www.ncloud.com/product/aiService/clovaStudio

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>

Updating #21137

…21946) **Description:** adds headers to the list of supported locations when generating the openai function schema

- docs: v0.2 version sidebar - x - x

…al Relevance) (#21185)

This PR contains 4 added functions:

- max_marginal_relevance_search_by_vector
- amax_marginal_relevance_search_by_vector
- max_marginal_relevance_search
- amax_marginal_relevance_search

I'm no langchain expert, but I tried to inspect other vectorstore sources, like chroma, to build these functions for SurrealDB. If someone has changes for me, please let me know. Otherwise I would be happy if these changes were added to the repository, so that I can use the original repo and not my local monkey-patched version.

---------

Co-authored-by: Harrison Chase <hw.chase.17@gmail.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>

Correct the admonition text

…21975)

@hwchase17

… cloud buckets (#21957)

Thank you for contributing to LangChain!

- [ ] **PR title**: "Add CloudBlobLoader"
  - community: Add CloudBlobLoader
- [ ] **PR message**: Add cloud blob loader
  - **Description:** Langchain provides several approaches to read different file formats: specific loaders (`CSVLoader`) or blob-compatible loaders (`FileSystemBlobLoader`). The only implementation proposed for BlobLoader is `FileSystemBlobLoader`.

    Many projects retrieve files from cloud storage. We propose a new implementation of `BlobLoader` to read files from the three cloud storage systems. The interface is strictly identical to `FileSystemBlobLoader`. The only difference is the constructor, which takes a cloud "url" object such as `s3://my-bucket`, `az://my-bucket`, or `gs://my-bucket`.

    This implementation removes the need to pre-download files from cloud storage to local temporary files (which are seldom removed).

    The code relies on the [CloudPathLib](https://cloudpathlib.drivendata.org/stable/) library to interpret cloud URLs. This has been added as an optional dependency.

```Python
loader = CloudBlobLoader("s3://mybucket/id")
for blob in loader.yield_blobs():
    print(blob)
```

- [X] **Dependencies:** CloudPathLib
- [X] **Twitter handle:** pprados
- [X] **Add tests and docs**: Added a unit test; it is easy to convert to an integration test with some files in a cloud storage (see `test_cloud_blob_loader.py`)
- [X] **Lint and test**: Run `make format`, `make lint` and `make test` from the root of the package(s) you've modified.

Hello from Paris @hwchase17. Can you review this PR?

---------

Co-authored-by: Eugene Yurtsev <eugene@langchain.dev>

- **Description:** Adding correct imports to the integrations callbacks doc (langchain-community package) - **Issue:** #22005 --------- Co-authored-by: Maxime Perrin <mperrin@doing.fr>

Update doc-string

@baskaryan

…eLoader (#20663)

**Description:**

- Added propagation of document metadata from O365BaseLoader to FileSystemBlobLoader (O365BaseLoader uses FileSystemBlobLoader under the hood).
- This is done by passing the dictionary `metadata_dict`: key=filename and value=dictionary containing the document's metadata.
- Modified `FileSystemBlobLoader` to accept the `metadata_dict`, use `mimetype` from it (if available) and pass the metadata further into the blob loader.

**Issue:**

- `O365BaseLoader` under the hood downloads documents to a temp folder and then uses `FileSystemBlobLoader` on it.
- However, metadata about the document in question is lost in this process. In particular:
  - `mime_type`: `FileSystemBlobLoader` guesses `mime_type` from the file extension, but that does not work 100% of the time.
  - `web_url`: this is useful to keep around, since in RAG LLM we might want to provide a link to the source document. In order to work well with document parsers, we pass the `web_url` as `source` (`web_url` is ignored by parsers, `source` is preserved).

**Dependencies:** None

**Twitter handle:** @martintriska1

Please review @baskaryan

---------

Co-authored-by: Bagatur <22008038+baskaryan@users.noreply.github.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
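The lookup described above (metadata keyed by filename, with extension-based guessing as a fallback) might look roughly like the sketch below. The helper name `resolve_blob_metadata` and the exact key names `mime_type` and `web_url` inside each entry are assumptions for illustration, not the code from the PR:

```python
import mimetypes
from pathlib import Path
from typing import Any, Dict, Optional, Tuple


def resolve_blob_metadata(
    path: str,
    metadata_dict: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Tuple[Optional[str], Dict[str, Any]]:
    """Return (mime_type, metadata) for a file (hypothetical helper).

    metadata_dict maps a filename to that document's metadata.
    """
    # Copy the entry so the caller's dictionary is not mutated.
    metadata = dict((metadata_dict or {}).get(Path(path).name, {}))

    # Prefer the MIME type supplied with the document; fall back to guessing
    # from the file extension only when none was provided.
    mime_type = metadata.get("mime_type") or mimetypes.guess_type(path)[0]

    # Expose the web URL under `source` as well, since that is the key the
    # description says is preserved by parsers.
    if "web_url" in metadata:
        metadata["source"] = metadata["web_url"]

    return mime_type, metadata


known = {
    "report.bin": {
        "mime_type": "application/pdf",
        "web_url": "https://example.com/report",
    }
}
mime_from_metadata, meta = resolve_blob_metadata("/tmp/report.bin", known)
mime_from_guess, empty_meta = resolve_blob_metadata("/tmp/notes.txt")
```

Here `report.bin` gets its MIME type from the supplied metadata even though the extension is uninformative, while `notes.txt` falls back to the extension-based guess.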

Fixed an error in `embed_documents` when the input was given as an empty list. And I have revised the document.

This pull request addresses and fixes exception handling in the UpstageLayoutAnalysisParser and enhances the test coverage by adding error exception tests for the document loader. These improvements ensure robust error handling and increase the reliability of the system when dealing with external API calls and JSON responses.

### Changes Made

1. Fix Request Exception Handling:
   - Issue: The existing implementation of UpstageLayoutAnalysisParser did not properly handle exceptions thrown by the requests library, which could lead to unhandled exceptions and potential crashes.
   - Solution: Added comprehensive exception handling for requests.RequestException to catch any request-related errors. This includes logging the error details and raising a ValueError with a meaningful error message.
2. Add Error Exception Tests for Document Loader:
   - New Tests: Introduced new test cases to verify the robustness of the UpstageLayoutAnalysisLoader against various error scenarios. The tests ensure that the loader gracefully handles:
     - RequestException: Simulates network issues or invalid API requests to ensure appropriate error handling and user feedback.
     - JSONDecodeError: Simulates scenarios where the API response is not valid JSON, ensuring the system does not crash and provides clear error messaging.

…m (#22070)

- **Description:** When I was running the sparkllm, I found that the default parameters currently used could no longer run correctly.
- original parameters & values:
  - spark_api_url: "wss://spark-api.xf-yun.com/v3.1/chat"
  - spark_llm_domain: "generalv3"

```python
# example
from langchain_community.chat_models import ChatSparkLLM

spark = ChatSparkLLM(
    spark_app_id="my_app_id",
    spark_api_key="my_api_key",
    spark_api_secret="my_api_secret",
)
spark.invoke("hello")
```

![sparkllm](https://github.com/langchain-ai/langchain/assets/55082429/5369bfdf-4305-496a-bcf5-2d3f59d39414)

So I updated them to 3.5 (same as sparkllm official website). After the update, they can be used normally.

- new parameters & values:
  - spark_api_url: "wss://spark-api.xf-yun.com/v3.5/chat"
  - spark_llm_domain: "generalv3.5"

```python
from typing import TypedDict


class UsageMetadata(TypedDict):
    """Usage metadata for a message, such as token counts.

    Attributes:
        input_tokens: (int) count of input (or prompt) tokens
        output_tokens: (int) count of output (or completion) tokens
        total_tokens: (int) total token count
    """

    input_tokens: int
    output_tokens: int
    total_tokens: int
```

```python
class AIMessage(BaseMessage):
    ...
    usage_metadata: Optional[UsageMetadata] = None
    """If provided, token usage information associated with the message."""
    ...
```
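As an illustration of how such a structure could be consumed, the sketch below sums usage across several messages. The `UsageMetadata` fields mirror the definition above; the helper `add_usage` is hypothetical and not part of the change itself:

```python
from typing import Iterable, Optional, TypedDict


class UsageMetadata(TypedDict):
    input_tokens: int
    output_tokens: int
    total_tokens: int


def add_usage(usages: Iterable[Optional[UsageMetadata]]) -> UsageMetadata:
    """Sum token counts, skipping messages that carry no usage information."""
    total: UsageMetadata = {"input_tokens": 0, "output_tokens": 0, "total_tokens": 0}
    for usage in usages:
        if usage is None:
            # usage_metadata is optional, so a message may not report it.
            continue
        total["input_tokens"] += usage["input_tokens"]
        total["output_tokens"] += usage["output_tokens"]
        total["total_tokens"] += usage["total_tokens"]
    return total


combined = add_usage(
    [
        {"input_tokens": 10, "output_tokens": 5, "total_tokens": 15},
        None,
        {"input_tokens": 3, "output_tokens": 7, "total_tokens": 10},
    ]
)
```

Because `usage_metadata` defaults to `None`, any aggregation of this kind has to tolerate missing entries.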

We don't really have any abstractions around multi-modal, so add a section explaining that we don't have any abstractions, followed by how-to guides for openai and anthropic (probably need to add for more).

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
Co-authored-by: Tomaz Bratanic <bratanic.tomaz@gmail.com>
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
Co-authored-by: junefish <junefish@users.noreply.github.com>
Co-authored-by: William Fu-Hinthorn <13333726+hinthornw@users.noreply.github.com>
Co-authored-by: Bagatur <baskaryan@gmail.com>

open to other ideas <img width="1181" alt="Screenshot 2024-04-08 at 5 34 08 PM" src="https://github.com/langchain-ai/langchain/assets/22008038/03eb11c4-5eb5-43e3-9109-a13f76098fa4"> --------- Co-authored-by: Chester Curme <chester.curme@gmail.com>

Reverts langchain-ai/langchain#20180

- [ ] **community**: "pgvector: replace nin_ by not_in"
- [ ] **PR message**: `nin_` does not exist in the SQLAlchemy ORM; the correct method is `not_in`

update name to `stop_sequences` and alias to `stop` (instead of the other way around), since `stop_sequences` is the name used by anthropic.

…ols (#22555) This PR adds support for using Databricks Unity Catalog functions as LangChain tools, which runs inside a Databricks SQL warehouse. * An example notebook is provided.

implement ls_params for ai21, fireworks, groq.

## Description The `path` param is used to specify the local persistence directory, which isn't required if using Qdrant server. This is a breaking but necessary change.

…e.py (#22629) Correct the grammar error for missing transformers package ValueError

…22151)

- [x] **Adding AsyncRootListener**: "langchain_core: Adding AsyncRootListener"
  - **Description:** Adding an AsyncBaseTracer, AsyncRootListener and `with_alistener` function. This enables binding an async root listener to runnables, which is currently only supported for sync listeners.
  - **Issue:** None
  - **Dependencies:** None
- [x] **Add tests and docs**: Added unit tests and an example code snippet within the function description of `with_alistener`
- [x] **Lint and test**: Run make format_diff, make lint_diff and make test

Removed an extra space before the period in the "Click **Create codespace on master**." line.

Corrected a typo in the AutoGPT example notebook. Changed "Needed synce jupyter runs an async eventloop" to "Needed since Jupyter runs an async event loop".

Add how-to guide that shows a design pattern for creating tools at run time

Include a list of parent ids for each event in astream events.

The outer try/except block handles connection errors, and the inner try/except block handles SQL execution errors, providing detailed error messages for both.

    try:
        conn = oracledb.connect(user=username, password=password, dsn=dsn)
        print("Connection successful!")
        cursor = conn.cursor()
        try:
            cursor.execute(
                """
                begin
                    -- Drop user
                    begin
                        execute immediate 'drop user testuser cascade';
                    exception
                        when others then
                            dbms_output.put_line('Error dropping user: ' || SQLERRM);
                    end;

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
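The nested error-handling pattern described above can be sketched in a self-contained form. This example swaps in Python's built-in `sqlite3` module for `oracledb` so it runs without a database server; the function name `run_statement` and the returned messages are assumptions for illustration:

```python
import sqlite3


def run_statement(database: str, statement: str) -> str:
    """Run one SQL statement, reporting connection and SQL errors separately."""
    try:
        # Outer block: anything that goes wrong while connecting lands here.
        conn = sqlite3.connect(database)
        try:
            # Inner block: errors raised while executing the statement.
            conn.execute(statement)
            conn.commit()
            return "Statement executed"
        except sqlite3.Error as exc:
            return f"SQL execution failed: {exc}"
        finally:
            conn.close()
    except sqlite3.Error as exc:
        return f"Connection failed: {exc}"


ok = run_statement(":memory:", "CREATE TABLE t (id INTEGER)")
bad_sql = run_statement(":memory:", "DROP TABLE missing_table")
bad_conn = run_statement("/nonexistent_dir_for_sketch/db.sqlite", "SELECT 1")
```

Keeping the two handlers separate means a caller can tell a bad DSN or credentials apart from a bad statement.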

…to immediate parent), add defensive check for cycles (#22637)

This PR makes two changes:

1. Fixes the order of parent IDs to be from root to immediate parent
2. Adds a simple defensive check for cycles
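The two changes can be illustrated with a small stand-alone sketch. It assumes runs are tracked in a plain dictionary mapping each run ID to its parent's ID; the function name `get_parent_ids` and that data layout are hypothetical, not the implementation in langchain-core:

```python
from typing import Dict, List, Optional


def get_parent_ids(run_id: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Return the ancestors of run_id, ordered from root to immediate parent."""
    parents: List[str] = []
    seen = {run_id}
    current = parent_of.get(run_id)
    while current is not None:
        # Defensive check: a well-formed run tree never revisits an ID.
        if current in seen:
            raise ValueError(f"Cycle detected at run id {current!r}")
        seen.add(current)
        parents.append(current)
        current = parent_of.get(current)
    # Walking upward collects immediate parent first, so reverse the list
    # to get root-first order.
    parents.reverse()
    return parents


tree = {"root": None, "chain": "root", "llm": "chain"}
llm_parents = get_parent_ids("llm", tree)
root_parents = get_parent_ids("root", tree)
```

Without the cycle check, a malformed mapping such as `{"a": "b", "b": "a"}` would loop forever.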

They cause `poetry lock` to take a ton of time, and `uv pip install` can resolve the constraints from these toml files in trivial time (addressing problem with #19153) This allows us to properly upgrade lockfile dependencies moving forward, which revealed some issues that were either fixed or type-ignored (see file comments)

Added a link to search the arXiv papers with references to LangChain. Updated table: better format (no horizontal scroll in table anymore).

… (#22594) After merging the [PR #22416 to include Jina AI multimodal capabilities](langchain-ai/langchain#22416), we updated the Jina AI embedding notebook accordingly.

…gchain into HAI-439

shinxi · 2024-06-13T02:53:00Z

@haoruiqian per discussion, pls test the langchain-core and community locally.

StreetLamb and others added 30 commits May 22, 2024 21:31

experimental[patch], docs: refine notebook for MyScale `SelfQueryRetr…

63284ff

…iever` (#22016) - **Description:** upgrade model to `gpt-4o`

docs: add admonitions to how-to callbacks (#22046)

8a87712

Add admonitions with more information.

infra: rm unused # noqa violations (#22049)

50186da

Updating #21137

community[patch]: Adding HEADER to the list of supported locations (#…

5eabe90

…21946) **Description:** adds headers to the list of supported locations when generating the openai function schema

docs: add astream v2 migration guide links (#21845)

58b6c72

- docs: v0.2 version sidebar - x - x

docs: version increases (#22050)

53293da

docs: concepts callbacks fix admonition (#22048)

37cfc00

Correct the admonition text

community[minor]: Add async methods to CassandraChatMessageHistory (#…

fea6b99

…21975)

community[minor]: Add Cassandra ByteStore (#22064)

74947ec

docs : Adding correct imports to the integrations callbacks doc (#22059)

8ba4f77

- **Description:** Adding correct imports to the integrations callbacks doc (langchain-community package) - **Issue:** #22005 --------- Co-authored-by: Maxime Perrin <mperrin@doing.fr>

community[patch]: Update doc-string in CloudBlobLoader (#22069)

e5541d1

Update doc-string

partner-upstage[patch]: embeddings empty list bug (#22057)

d9eff44

Fixed an error in `embed_documents` when the input was given as an empty list. And I have revised the document.

langchain[patch]: Release 0.2.1 (#22074)

2d96821

community[patch]: Release. 0.2.1 (#22073)

3d26807

core: bump to 0.2.1rc (#22080)

cd07521

anthropic, openai: cut pre-releases (#22083)

152c8ca

baskaryan and others added 27 commits June 6, 2024 08:07

docs: Add ChatGoogleGenerativeAI to model feat table (#22617)

feb73d4

docs: Fix typo in llmonitor.md (#22590)

e0e40f3

Revert "anthropic: stream token usage" (#22624)

e088791

Reverts langchain-ai/langchain#20180

multiple: add stop attribute (#22573)

3999761

community[patch]: pgvector replace nin_ by not_in (#22619)

05bf98b

- [ ] **community**: "pgvector: replace nin_ by not_in"
- [ ] **PR message**: `nin_` does not exist in the SQLAlchemy ORM; the correct method is `not_in`

anthropic: update attribute name and alias (#22625)

c1ef731

update name to `stop_sequences` and alias to `stop` (instead of the other way around), since `stop_sequences` is the name used by anthropic.

community: support Databricks Unity Catalog functions as LangChain to…

f26ab93

…ols (#22555) This PR adds support for using Databricks Unity Catalog functions as LangChain tools, which runs inside a Databricks SQL warehouse. * An example notebook is provided.

multiple: implement ls_params (#22621)

b57aa89

implement ls_params for ai21, fireworks, groq.

qdrant[patch]: Make path optional in from_existing_collection() (#21875)

8056041

## Description The `path` param is used to specify the local persistence directory, which isn't required if using Qdrant server. This is a breaking but necessary change.

openai[patch]: correct grammar in exception message in embeddings/bas…

2904c50

…e.py (#22629) Correct the grammar error for missing transformers package ValueError

docs: Add information about run time binding values to tools (#22623)

6b8963a

Add how-to guide that shows a design pattern for creating tools at run time

docs[patch]: Fix diffbot docs (#22584)

67e58fd

core[minor]: Add parent_ids to astream_events API (#22563)

035a9c9

Include a list of parent ids for each event in astream events.

core[patch]: Correctly order parent ids in astream events (from root …

28f744c

…to immediate parent), add defensive check for cycles (#22637) This PR makes two changes: 1. Fixes the order of parent IDs to be from root to immediate parent 2. Adds a simple defensive check for cycles

core[patch]: Release 0.2.5 (#22642)

4367e89

langchain[patch]: Release 0.2.3 (#22644)

fe2e5a3

docs: arxiv page update (#22574)

57c1239

Added a link to search the arXiv papers with references to LangChain. Updated table: better format (no horizontal scroll in table anymore).

[Core] Unified Enable/Disable Tracing (#22576)

be79ce9

docs: Update jina embedding notebook to include multimodal capability…

344adad

… (#22594) After merging the [PR #22416 to include Jina AI multimodal capabilities](langchain-ai/langchain#22416), we updated the Jina AI embedding notebook accordingly.

community[patch]: Release 0.2.4 (#22643)

235d919

Merge tag 'langchain-community==0.2.4' of github.com:langchain-ai/lan…

83a4ed9

…gchain into HAI-439

shinxi approved these changes Jun 13, 2024

shinxi merged commit e4a9e14 into ServiceMax-Engineering:master Jun 13, 2024

20 participants
