
Bug Fixes #4

Open
yuyuan12138 wants to merge 4 commits into zhchen18:main from yuyuan12138:main

Conversation

@yuyuan12138

Summary

This PR addresses multiple critical bugs identified in the ToMBench evaluation scripts. The changes improve reliability, security, error handling, and maintainability.

All changes are backward compatible.

Files Changed: 2 files

  • get_results.py: Fixed data aggregation errors and edge cases
  • run_api.py: Updated to OpenAI v1.0+ API, fixed async execution issues

Changes Made

  1. get_results.py - Fixed Data Aggregation Crashes

Problem: The script crashed with ValueError: max() iterable argument is empty when processing results from failed API calls.

Fixes:

  • Added null/empty check in most_common_element() function
  • Fixed array sizing to use actual max_idx from data instead of calculated value
  • Added safe dictionary access with .get() to prevent KeyError
  • Added validation to skip empty/incomplete samples
  • Added comprehensive docstrings
```python
# Before: Crash on empty list
def most_common_element(lst):
    most_common = max(element_freq, key=element_freq.get)

# After: Safe handling
def most_common_element(lst):
    if not lst:
        return None
    # ...
```
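For reference, a complete safe version might look like the following sketch (Counter-based; the actual implementation in the PR may differ):

```python
from collections import Counter

def most_common_element(lst):
    """Return the most frequent element of lst, or None if lst is empty."""
    if not lst:
        return None
    element_freq = Counter(lst)
    # Ties resolve to whichever max-frequency element max() encounters first
    return max(element_freq, key=element_freq.get)
```

With this guard, most_common_element([]) returns None instead of raising ValueError, so aggregation over failed API calls no longer crashes.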
  2. run_api.py - OpenAI API v1.0+ Compatibility & Bug Fixes

Problems:

  1. Script exited immediately without waiting for API calls to complete
  2. Used deprecated OpenAI v0.x API syntax
  3. Exception types incompatible with v1.0+

Fixes:

| Issue | Fix |
| --- | --- |
| ThreadPoolExecutor not waiting | Added future.result() to wait for all tasks |
| OpenAI API v1.0+ incompatibility | Updated to new client-based API |
| Exception types | Changed openai.error.X to openai.X |
```python
# Before: Script exits before API calls finish
with ThreadPoolExecutor(max_workers=32) as executor:
    for payload in payloads:
        executor.submit(processor.multiple_gpt, payload)  # ❌ Not waiting

# After: Waits for completion
with ThreadPoolExecutor(max_workers=32) as executor:
    futures = [executor.submit(processor.multiple_gpt, payload) for payload in payloads]
    for future in futures:
        future.result()  # ✅ Waits for each task
```
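The waiting pattern can be checked in isolation; a minimal self-contained demonstration, where process is a stand-in for processor.multiple_gpt:

```python
from concurrent.futures import ThreadPoolExecutor

def process(payload):
    # Stand-in for processor.multiple_gpt: doubles its input
    return payload * 2

payloads = [1, 2, 3]
with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(process, p) for p in payloads]
    results = [f.result() for f in futures]  # blocks until each task finishes

# results == [2, 4, 6], in submission order
```

Note that future.result() also re-raises any exception raised inside the worker, so failures surface instead of being silently dropped.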

API Migration:
```python
# Before (v0.x - deprecated)
import openai
openai.api_key = api_key
openai.ChatCompletion.create(...)

# After (v1.0+)
from openai import OpenAI
self.client = OpenAI(api_key=api_key, base_url=api_base)
self.client.chat.completions.create(...)
```

Testing

  • ✅ run_api.py successfully completes API calls and writes results
  • ✅ get_results.py correctly aggregates results without crashes
  • ✅ Generated results.json with correct accuracy metrics

Reviewers

Please review the following changes:

  1. get_results.py - Data aggregation safety improvements
  2. run_api.py - OpenAI API migration and async fix

