Datasets/lm evaluation library by RakshitKhajuria · Pull Request #724 · PacificAI/langtest

RakshitKhajuria · 2023-08-24T17:48:39Z

Description

This PR adds benchmark datasets for the question-answering (QA) task.

Datasets Added:

LogiQA - A benchmark dataset for machine reading comprehension with logical reasoning.
ASDiv - A diverse dataset, in terms of both language patterns and problem types, for evaluating and developing math word problem (MWP) solvers. It contains 2,305 English MWPs and was published in the paper "A Diverse Corpus for Evaluating and Developing English Math Word Problem Solvers".
Google/Bigbench - The Beyond the Imitation Game Benchmark (BIG-bench) is a collaborative benchmark intended to probe large language models and extrapolate their future capabilities. The tasks included in BIG-bench are summarized by keyword and by task name in the BIG-bench repository.
```
We added some of the subsets to our library:
  1. Abstract Narrative Understanding
  2. DisambiguationQA
  3. DisflQA
  4. Causal Judgment
```
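
The commits in this PR add at least one dataset as a JSONL file (`asdiv-test-tiny.jsonl`) and register dataset paths in `datasource.py`. The snippet below is a minimal sketch of that general pattern using only the standard library. The `DATASET_PATHS` mapping, the `asdiv/` directory, and both helper functions are hypothetical names chosen for illustration; they are not langtest's actual code or API.

```python
import json
from pathlib import Path

# Hypothetical registry: dataset name -> file path relative to a data root.
# The real mapping in langtest's datasource.py may be structured differently.
DATASET_PATHS = {
    "asdiv-test-tiny": "asdiv/asdiv-test-tiny.jsonl",
}


def load_jsonl(path):
    """Read a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


def load_dataset(name, root):
    """Resolve a registered dataset name under `root` and load its records."""
    if name not in DATASET_PATHS:
        raise ValueError(f"Unknown dataset: {name!r}")
    return load_jsonl(Path(root) / DATASET_PATHS[name])
```

Keeping file names lowercase, as the rename commit for `asdiv-test-tiny.jsonl` does, avoids lookups that fail on case-sensitive file systems.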

➤ Fixes: Explore lm evaluation library for good datasets #556

➤ Notebook Links:

Type of change

Please delete options that are not relevant.

New feature (non-breaking change which adds functionality)
This change requires a documentation update

Usage

Checklist:

I've added Google style docstrings to my code.
I've used pydantic for typing when/where necessary.
I have linted my code.
I have added tests to cover my changes.

Screenshots (if appropriate):

LogiQA

ASDiv

BigBench

ArshaanNazir · 2023-08-31T05:17:41Z

@RakshitKhajuria @Prikshit7766 have you prepared a notebook (NB) for it? Please add the NB link to the PR description, along with some screenshots of the generated results.

RakshitKhajuria · 2023-08-31T05:44:47Z

> NB for it. Give link to NB in PR description with some screenshots of generated results.

Updated

RakshitKhajuria added 2 commits August 24, 2023 23:08

task(dataset/Bigbench): Added AbstractUnderstanding and DisambiguationQA

c815e14

task(dataset): Added LogiQA

b97e9cb

RakshitKhajuria added the v2.1.0 label (Issue or request to be done in v2.1.0 release) Aug 24, 2023

RakshitKhajuria assigned RakshitKhajuria and Prikshit7766 Aug 24, 2023

Prikshit7766 and others added 7 commits August 24, 2023 23:33

dataset: added casual-judgement-test and disfl-qa-test

d6d9d5a

dataset: added asdiv-test

fd39a49

datasource.py: added dataset path

76c6645

helpers.py : prompt added (asdiv, causaljudgment, disflqa)

1f5faa5

task(datasource.py): Dataset added to path

ac3647c

task(helpers.py): Prompts added

adae708

fix: path

470362d

Prikshit7766 linked an issue Aug 24, 2023 that may be closed by this pull request

Explore lm-evaluation library #556

Closed

RakshitKhajuria and others added 15 commits August 28, 2023 12:46

Task(dataset): Added ASDiv tiny version

9680ee7

Task(dataset): Added casual-judgement and disfl-qa tiny version

1b07925

Task(dataset): Added DisambiguationQA tiny version

e6965e7

fix: path asdiv

9fafe1a

Rename ASDiv-test-tiny.jsonl to asdiv-test-tiny.jsonl

c83f917

Chore(notebook): Added asdiv dataset nb

354e9ff

Chore(notebook): Added LogiQA dataset nb

f62332a

Merge branch 'dataset-lm-evaluation-library' of https://github.com/Jo…

9e8d35f

…hnSnowLabs/langtest into dataset-lm-evaluation-library

rename: abstract narrative understanding

f734772

rename: abstract narrative understanding

9e8013b

added Bigbench_dataset notebooks

0e9dcb8

updated colab links

f85676e

Rename DisflQA and Causal-judgment

594a92f

rename: abstract narrative understanding

75fb13b

setup.py: added dataset path

a48c5a5

ArshaanNazir self-requested a review August 31, 2023 05:15

ArshaanNazir approved these changes Aug 31, 2023

ArshaanNazir merged commit 5f0e3fb into release/1.4.0 Aug 31, 2023

ArshaanNazir deleted the dataset-lm-evaluation-library branch September 6, 2023 04:56

chakravarthik27 removed the v2.1.0 label (Issue or request to be done in v2.1.0 release) Sep 6, 2023

ArshaanNazir merged 24 commits into release/1.4.0 from dataset-lm-evaluation-library

4 participants
