The current evaluation process has some issues that need further refinement:
- Reproducibility: there is no dvc.lock-related file to ensure that the evaluation pipelines are reproducible. (We can refer directly to the spider-benchmark repo for this; however, if it is not that important as of now, we can skip it for the moment.)
- Evaluation report:
  - Add cost/latency.
  - Add pipeline metadata so that we can easily see which parameters cause the differences between reports.
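
To make the report items above concrete, here is a minimal sketch of what a report record carrying cost, latency, and pipeline metadata might look like, together with a helper that lists the parameters that differ between two reports. All names here (`EvalReport`, `total_cost_usd`, `avg_latency_s`, `metadata`, `diff_metadata`) are hypothetical and not taken from the existing codebase; the actual report format may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple


@dataclass
class EvalReport:
    """Hypothetical report record; field names are illustrative only."""
    accuracy: float
    total_cost_usd: float   # assumed: summed cost of all model calls in the run
    avg_latency_s: float    # assumed: mean end-to-end latency per question
    # Pipeline parameters recorded alongside the results (assumed flat dict).
    metadata: Dict[str, Any] = field(default_factory=dict)


def diff_metadata(a: EvalReport, b: EvalReport) -> Dict[str, Tuple[Any, Any]]:
    """Return {parameter: (value_in_a, value_in_b)} for parameters that differ.

    A parameter missing from one report is shown as None on that side.
    """
    diff = {}
    for key in sorted(set(a.metadata) | set(b.metadata)):
        va, vb = a.metadata.get(key), b.metadata.get(key)
        if va != vb:
            diff[key] = (va, vb)
    return diff


# Example values below are made up purely for illustration.
run_a = EvalReport(0.71, 1.20, 3.4, {"model": "model-a", "temperature": 0.0})
run_b = EvalReport(0.75, 2.10, 5.1, {"model": "model-b", "temperature": 0.0})
changed = diff_metadata(run_a, run_b)
print(changed)
```

Storing the parameters next to the metrics means two reports can be compared without digging through the pipeline configuration that produced them.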