Conversation
@lifeizhou-ap I don't think we need to collect it as the total token count is already stored in the
+1 to @lukealvoeiro's comment, we already have this, but turning this into total tokens sent (across all requests to the LLM) would be useful for cost tracking. Some providers like OpenAI also return how much was spent on each request. Obviously this changes for each LLM, and we could have an internal lookup for it, but then we have to keep it up to date. Maybe we just provide total tokens to the user, and a $ amount if available.
Thanks for the suggestions and early feedback! First, I would like to clarify the requirements about tracking the cost/tokens.
Please correct me if any of the above points is incorrect. This is to make sure that I am on the same page with you on the requirements before the implementation discussion below. I've looked into
Here are my thoughts: based on my understanding of the requirements, we want to track the token usage from the API calls delegated by all In addition, the usage data that token_usage_collector collects is more accurate. The With this approach, the
If this is about cost, and we are talking about billing platforms vs. local ones like ollama, wouldn't we need to split between input_tokens and output_tokens (total_tokens mixes them)? My thinking is that if we turn this into a feature request first (cost accounting? or preventing overflow), we can figure out how to solve it, and also decouple it from however these counts are processed. For example, I work on OpenTelemetry and it collects input and output tokens, but these are not processed inline in the code; rather, they are sent as trace spans or metrics. That way you don't need to change the application logic depending on what you are looking at. OTOH, this approach (instrumentation) doesn't allow you to control future requests based on that data. TL;DR: can we turn this into a request for a feature? Then it would be easier to decide how to proceed in code.
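To illustrate why the input/output split matters for cost accounting, here is a minimal sketch. The model name and all prices below are made-up placeholders, not real provider rates, and `estimate_cost` is a hypothetical helper, not part of exchange:

```python
from typing import Optional

# Placeholder per-1K-token prices -- purely illustrative, NOT real provider rates.
PRICE_PER_1K = {
    "example-hosted-model": (0.00015, 0.0006),  # (input, output) USD per 1K tokens
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Estimate USD cost; return None for models without pricing (e.g. local ollama)."""
    prices = PRICE_PER_1K.get(model)
    if prices is None:
        return None
    in_price, out_price = prices
    return (input_tokens / 1000) * in_price + (output_tokens / 1000) * out_price
```

Because input and output tokens are billed at different rates, a mixed total_tokens count is not enough to compute the dollar amount.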
@baxen and I had a discussion about the implementation of this feature. Summary:
Conclusion:
Hi @baxen, please correct me if anything above is incorrect. Thanks!
Hey @baxen, I have a question about the
I am leaning toward triggering logging from
baxen
left a comment
Looks good! I do think we should switch away from the queue before merging, for simplicity.
self.usage_data_queue = queue.Queue()

def collect(self, model: str, usage: Usage) -> None:
    self.usage_data_queue.put((model, usage.input_tokens, usage.output_tokens))
I don't think we need a queue here (I don't see this making use of put/task_done etc. in the sense of processing a queue across threads). If we're appending to a list (or adding to a running sum in a dictionary), the GIL will make sure all updates are processed, and I don't believe we have any concerns around ordering.
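A minimal sketch of what the reviewer suggests, replacing the queue with running sums in a dict keyed by model. This is illustrative, not the actual exchange code; note the lock is cheap insurance, since `+=` on a dict entry is a read-modify-write and the GIL alone does not make it atomic:

```python
from collections import defaultdict
from threading import Lock

class TokenUsageCollector:
    """Illustrative sketch: accumulate running sums per model
    instead of pushing tuples onto a queue.Queue."""

    def __init__(self) -> None:
        # `+=` on a dict entry is a read-modify-write, so a lock is
        # cheap insurance even under the GIL.
        self._lock = Lock()
        self._usage = defaultdict(lambda: [0, 0])  # model -> [input, output]

    def collect(self, model: str, input_tokens: int, output_tokens: int) -> None:
        with self._lock:
            entry = self._usage[model]
            entry[0] += input_tokens
            entry[1] += output_tokens

    def totals(self) -> dict:
        with self._lock:
            return {m: (i, o) for m, (i, o) in self._usage.items()}
```

Unlike the queue version, totals can be read at any time without draining anything.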
@dataclass
class TokenUsage:
nit: I'd avoid a new object here and instead make a map of the existing Usage object stored in a dict with model as the key?
Yes, agree! We should definitely be handling logging verbosity/etc. downstream in the application. If it's convenient, I do think it's fully reasonable to log at e.g. debug level here in exchange, but I don't see any need to add that at this point.
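The nit above (a dict of the existing Usage object keyed by model, rather than a new TokenUsage dataclass) might look like the following sketch. The `Usage` class here is a stand-in with assumed fields, since the real exchange class isn't shown in this thread, and `record` is a hypothetical helper:

```python
from dataclasses import dataclass

@dataclass
class Usage:
    """Stand-in for exchange's existing Usage object; fields assumed."""
    input_tokens: int = 0
    output_tokens: int = 0

# Per the reviewer's nit: no new TokenUsage dataclass, just a dict keyed by model.
usage_by_model: dict = {}

def record(model: str, usage: Usage) -> None:
    acc = usage_by_model.setdefault(model, Usage())
    acc.input_tokens += usage.input_tokens
    acc.output_tokens += usage.output_tokens
```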
Why
User would like to know the token usage and cost involved while using LLMs in goose.
What
- `_TokenUsageCollector` to collect the usage data from HTTP calls to LLM APIs, including the model name
- `get_token_usage` on `exchange` for `goose` to get the total token usage (raised discussion in the comments here)
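One possible shape for the `get_token_usage` accessor described above, folding per-model counts into overall totals. The input shape and return dictionary are assumptions for illustration, not the merged API:

```python
def get_token_usage(usage_by_model: dict) -> dict:
    """Hypothetical aggregation: fold per-model (input, output) token
    counts into overall totals. The return shape is an assumption."""
    input_total = sum(i for i, _ in usage_by_model.values())
    output_total = sum(o for _, o in usage_by_model.values())
    return {
        "input_tokens": input_total,
        "output_tokens": output_total,
        "total_tokens": input_total + output_total,
    }
```

Keeping input and output totals separate (rather than only total_tokens) preserves the split needed for the cost-tracking discussion above.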