We need to come up with some rules for evaluating the ask_details pipeline. Some questions to think about:
- What makes a good ask_details pipeline from the user's perspective?
- Is the summary text in each step easily understandable?
- Is the SQL query in each step executable?
- Does the SQL query in each step semantically correspond to its summary text?
- Is there any public benchmark we can reference?
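
As a starting point for the "executable" rule, here is a rough sketch of how it could be checked automatically. It assumes each step is a dict with `summary` and `sql` keys and that the target schema can be loaded into an in-memory SQLite database; both are assumptions for illustration, not the actual ask_details output format or the engine we run against. `check_steps_executable` is a hypothetical helper name.

```python
import sqlite3


def check_steps_executable(steps, schema_ddl):
    """Return one (step_index, error) pair per step; error is None on success.

    Assumption: `steps` is a list of dicts with a "sql" key. This shape is
    hypothetical and may not match the real ask_details response.
    """
    conn = sqlite3.connect(":memory:")
    try:
        # Create empty tables so that queries can be compiled against them.
        conn.executescript(schema_ddl)
        results = []
        for index, step in enumerate(steps):
            try:
                # EXPLAIN makes SQLite compile the statement without running
                # it, which surfaces syntax errors and unknown tables/columns.
                conn.execute("EXPLAIN " + step["sql"])
                results.append((index, None))
            except sqlite3.Error as exc:
                results.append((index, str(exc)))
        return results
    finally:
        conn.close()


if __name__ == "__main__":
    schema = "CREATE TABLE orders (id INTEGER, amount REAL);"
    steps = [
        {"summary": "Total order amount", "sql": "SELECT SUM(amount) FROM orders"},
        {"summary": "Broken step", "sql": "SELECT * FROM missing_table"},
    ]
    for index, error in check_steps_executable(steps, schema):
        print(index, "ok" if error is None else error)
```

This only covers executability. Whether a query matches its summary text semantically would need a separate check, for example human review or comparison against reference results.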
related issue: #4