
feat: collect total token usage #32

Merged

lifeizhou-ap merged 8 commits into main from lifei/collect_total_token_usage on Sep 19, 2024
Conversation

@lifeizhou-ap
Collaborator

@lifeizhou-ap lifeizhou-ap commented Sep 6, 2024

Why
Users would like to know the token usage and cost involved when using LLMs in goose

What

  • Created _TokenUsageCollector to collect usage data (including the model name) from the HTTP calls to the LLM APIs
  • Exposed get_token_usage on the exchange so goose can get the total token usage (raised a discussion in the comments here)

@lifeizhou-ap lifeizhou-ap changed the title from "Lifei/collect total token usage" to "feat: collect total token usage" Sep 6, 2024
@lukealvoeiro
Collaborator

@lifeizhou-ap I don't think we need to collect it as the total token count is already stored in the CheckpointData

@zakiali
Collaborator

zakiali commented Sep 7, 2024

+1 to @lukealvoeiro's comment. We already have this, but turning it into total tokens sent (across all requests to the LLM) would be useful for cost tracking. Some providers, like OpenAI, also return how much was spent on each request. Obviously this varies per LLM; we could keep an internal lookup for it, but would then have to keep it up to date. Maybe we just provide the total tokens to the user, plus a $ amount when available.

@lifeizhou-ap
Collaborator Author

lifeizhou-ap commented Sep 7, 2024

Thanks for the suggestions and early feedback!

First, I would like to clarify the requirements around tracking the cost/tokens.

  1. My understanding is: we would like to track the cost/tokens used after the user starts the goose console. The LLM provider API returns the number of tokens used in each API call (when available), and we want to accumulate those counts across all the API calls made after the goose console starts. (At the moment this is tracked in memory; later on we could persist it to show usage per session.)

  2. In the current implementation, the exchange's complete function delegates to the provider's complete function to interact with the LLM provider, and the provider returns the total number of tokens used. That return value is the data source for calculating the cost. APIs such as OpenAI's do not currently return a cost, so we can use the model name and the total token count from the API call to calculate it.

Please correct me if any of the above points is incorrect. I want to make sure I am on the same page with you on the requirements before the implementation discussion below.

I looked into Checkpoint before I started this PR. Here is my understanding of CheckpointData (which contains a list of checkpoints):

  • Each exchange instance has a CheckpointData. It tracks the token count in sync with the messages currently held by the exchange instance.

  • However, it does not represent all the tokens used since the exchange was initialised. From reading the code, CheckpointData's main purpose is to monitor token usage in sync with the messages, so that the next API call does not send too many tokens. If the tokens exceed the limit, messages are truncated with FIFO logic, and total_token_count is reduced to account for the truncated messages.

Here are my thoughts:

Based on my understanding of the requirements, we want to track the token usage from the API calls delegated by all exchange instances. Exchanges are initiated in several places in the code. Instead of gathering and managing all the exchanges to sum up the token counts from each instance, it would be better to plug in/inject a token_usage_collector (which we could use to track more data in the future if necessary) into the exchange and let the collector collect the usage. The approach is similar to logging.

In addition, the usage data that the token_usage_collector collects is more accurate: the CheckpointData total token count only covers the messages currently held by the instance.

With this approach, token_usage_collector and CheckpointData each keep a single responsibility, without coupling.
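The injection idea above could look roughly like this; all names are illustrative sketches, not the actual exchange API:

```python
from dataclasses import dataclass


@dataclass
class Usage:
    input_tokens: int
    output_tokens: int


class TokenUsageCollector:
    """Accumulates usage across every API call it is told about."""

    def __init__(self) -> None:
        self.total_tokens = 0

    def collect(self, model: str, usage: Usage) -> None:
        self.total_tokens += usage.input_tokens + usage.output_tokens


@dataclass
class Exchange:
    """The collector is injected, so Exchange stays decoupled from how
    usage is aggregated (same spirit as injecting a logger)."""

    collector: TokenUsageCollector

    def complete(self, model: str) -> None:
        # ... the provider would be called here; it reports usage in its response ...
        usage = Usage(input_tokens=100, output_tokens=20)  # stand-in value
        self.collector.collect(model, usage)


collector = TokenUsageCollector()
Exchange(collector).complete("gpt-4o-mini")
Exchange(collector).complete("gpt-4o-mini")
# collector.total_tokens is now 240
```

The point of the sketch is that multiple Exchange instances share one collector, so no code needs to enumerate exchanges to sum their counts.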

@codefromthecrypt
Contributor

If this is about cost, and we are talking about billing platforms vs. local models like ollama, wouldn't we need to split input_tokens from output_tokens (total_tokens mixes them)?

My thinking is if we turn this into a feature request first (cost accounting? or preventing overflow), we can figure out how to solve it, and also decouple from however these counts are processed.

For example, I work on opentelemetry and it collects input and output tokens, but this is not processed inline in the code, rather sent as trace spans or metrics. In this way you don't need to change the application logic depending on what you are looking at. OTOH, this approach (instrumentation) doesn't allow you to control future requests based on that data.

TL;DR: can we turn this into a feature request first? Then it would be easier to decide how to proceed in code.

* main:
  feat: Rework error handling (#48)
  chore(release): release version 0.9.0 (#45)
  chore: add just command for releases and update pyproject for changelog (#43)
  feat: convert ollama provider to an openai configuration (#34)
  fix: Bedrock Provider request (#29)
  test: Update truncate and summarize tests to check for sytem prompt t… (#42)
  chore: update test_tools to read a file instead of get a password (#38)
  fix: Use placeholder message to check tokens (#41)
  feat: rewind to user message (#30)
  chore: Update LICENSE (#40)
  fix: shouldn't hardcode truncate to gpt4o mini (#35)
@lifeizhou-ap
Collaborator Author

lifeizhou-ap commented Sep 18, 2024

@baxen and I had a discussion about the implementation of this feature

Summary:

  • the exchange constructor should not have a dependency on _TokenUsageCollector
  • We also looked into using the Moderator to collect the token usage. The idea was to use the usage on the checkpoint via the moderator, but that usage does not represent the total usage; more details are in my comments above.

Conclusion:

  • we decided to make _TokenUsageCollector a global singleton to collect the token usage.
  • save the usage in the log file
  • For now, we use _TokenUsageCollector to collect the usage as a simple solution. In the long term, we could parse the OpenTelemetry instrumentation data if we would like to track more data from the HTTP call response payload.

Hi @baxen Please correct me if anything above is incorrect. Thanks!

@lifeizhou-ap
Collaborator Author

lifeizhou-ap commented Sep 19, 2024

Hey @baxen

I have a question about logging the usage. We could log it from either the exchange or the goose project:

  1. If we trigger logging from exchange

    • we will log the usage for every API call, at a fine-grained level.
    • we have to specify the location of the log file in exchange, or pass the log directory from goose to exchange.
  2. If we trigger logging from goose

    • We can log the total usage when the session is saved. This makes it easier to calculate the total cost, and I guess the total cost might be more useful to the user than the fine-grained data.
    • We can save the log file in the goose config directory as a single point.

I am leaning towards triggering logging from goose. (Draft PR in goose) WDYT?
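To illustrate option 2, a hedged sketch of goose-side logging at session save; the file name, location, and function are assumptions for illustration, not goose's actual API:

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_session_usage(usage_by_model: dict, config_dir: Path) -> Path:
    """Append one JSON line of total usage to a log in the goose config dir
    (hypothetical file name) when the session is saved."""
    log_file = config_dir / "token_usage.log"
    entry = {
        "saved_at": datetime.now(timezone.utc).isoformat(),
        "usage": usage_by_model,
    }
    with log_file.open("a") as f:
        f.write(json.dumps(entry) + "\n")
    return log_file
```

Keeping the write in goose means the config directory acts as the single point for usage data, as described above.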

Collaborator

@baxen baxen left a comment


Looks good! I do think we should switch away from the queue before merging for simplicity

    self.usage_data_queue = queue.Queue()

    def collect(self, model: str, usage: Usage) -> None:
        self.usage_data_queue.put((model, usage.input_tokens, usage.output_tokens))
Collaborator


I don't think we need a queue here (I don't see this making use of put/task_done etc. in the sense of processing a queue across threads). If we're appending to a list (or adding to a running sum in a dictionary), the GIL will make sure all updates are processed, and I don't believe we have any concerns around ordering.



@dataclass
class TokenUsage:
Collaborator


nit: I'd avoid a new object here; instead, store the existing Usage objects in a dict with the model as the key?

Collaborator Author


yes, can do

@baxen
Collaborator

baxen commented Sep 19, 2024

Hey @baxen

I have a question about logging the usage. We could log it from either the exchange or the goose project:

  1. If we trigger logging from exchange

    • we will log the usage for every API call, at a fine-grained level.
    • we have to specify the location of the log file in exchange, or pass the log directory from goose to exchange.
  2. If we trigger logging from goose

    • We can log the total usage when the session is saved. This makes it easier to calculate the total cost, and I guess the total cost might be more useful to the user than the fine-grained data.
    • We can save the log file in the goose config directory as a single point.

I am leaning towards triggering logging from goose. (Draft PR in goose) WDYT?

Yes, agree! We should definitely handle setting logging verbosity etc. downstream in the application. If it's convenient, I do think it's fully reasonable to log at e.g. debug level here in exchange, but I don't see any need to add that at this point.
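That verbosity point can be sketched with stdlib logging: exchange would emit usage at debug level, and the downstream application decides what surfaces (illustrative only; this logging was not added to exchange in this PR):

```python
import logging

logger = logging.getLogger("exchange")


def record_usage(model: str, input_tokens: int, output_tokens: int) -> None:
    # The library logs fine-grained usage at debug level; whether it is
    # actually emitted is decided by the application's logging configuration.
    logger.debug("usage model=%s input=%d output=%d", model, input_tokens, output_tokens)


# Downstream, the application (goose) chooses its own verbosity:
logging.basicConfig(level=logging.INFO)  # debug-level usage lines stay hidden
```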

@lifeizhou-ap lifeizhou-ap marked this pull request as ready for review September 19, 2024 22:43
@lifeizhou-ap lifeizhou-ap merged commit 8139c74 into main Sep 19, 2024
@lifeizhou-ap lifeizhou-ap deleted the lifei/collect_total_token_usage branch September 19, 2024 22:44
codefromthecrypt pushed a commit to codefromthecrypt/exchange that referenced this pull request Oct 13, 2024
* adding in ability to provide per repo hints

* tidy up test