Merged
31 commits
5a57bda
Updated translation notebook
RakshitKhajuria Aug 17, 2023
6f25690
Updated accuracy notebook
RakshitKhajuria Aug 17, 2023
fc2868e
Updated custom data notebook
RakshitKhajuria Aug 17, 2023
134bce3
Updated bias demo notebook
RakshitKhajuria Aug 17, 2023
79a9f06
Updated fairness demo notebook
RakshitKhajuria Aug 17, 2023
b097d7d
Updated representation demo notebook
RakshitKhajuria Aug 17, 2023
65427e3
Updated robustness demo notebook
RakshitKhajuria Aug 17, 2023
38e37b0
Updated markdown in test specific notebook
RakshitKhajuria Aug 17, 2023
54179f6
Updated ai21 demo notebook
RakshitKhajuria Aug 17, 2023
6c3c4d7
Updated LLM notebooks
RakshitKhajuria Aug 17, 2023
6cc335a
updated misc notebooks
Prikshit7766 Aug 17, 2023
c1d7c95
Merge branch 'notebook-website-update-harness' of https://github.com/…
Prikshit7766 Aug 17, 2023
d47eb51
updated dataset notebooks
Prikshit7766 Aug 17, 2023
fed7343
updated end-to-end-notebooks notebooks
Prikshit7766 Aug 17, 2023
b262596
Update compare model tutorial
ArshaanNazir Aug 18, 2023
2e8c5d2
updated parameter in the table
Prikshit7766 Aug 18, 2023
a420524
updated misc and test-specific-notebooks
Prikshit7766 Aug 18, 2023
328e9db
chore: prompt injection notebook
chakravarthik27 Aug 18, 2023
d3cad7a
merged: fix tutorials nb
chakravarthik27 Aug 18, 2023
151e7e1
update one-liner Prompt-Injection
ArshaanNazir Aug 18, 2023
1cb238a
update tasks and tutorial section
ArshaanNazir Aug 18, 2023
43daf02
Add Prompt-Injection Test to Website
ArshaanNazir Aug 18, 2023
22cb04a
Update tutorial links
ArshaanNazir Aug 18, 2023
c4465c0
Update one-liners
ArshaanNazir Aug 18, 2023
a7b775f
Update one-liners page
ArshaanNazir Aug 18, 2023
5967479
update landing page
ArshaanNazir Aug 18, 2023
4a3444e
update prompt-injection attack page
ArshaanNazir Aug 18, 2023
d1ba590
updated dataset-notebooks
Prikshit7766 Aug 18, 2023
167c85c
update model page
Prikshit7766 Aug 18, 2023
c297de7
add task
ArshaanNazir Aug 18, 2023
3d6cd16
Merge branch 'chore/Update_NBs_Website' of https://github.com/JohnSno…
ArshaanNazir Aug 18, 2023
Original file line number Diff line number Diff line change
@@ -111,13 +111,14 @@
"<br/>\n",
"\n",
"\n",
"| Parameter | Description | \n",
"| - | - | \n",
"|**task** |Task for which the model is to be evaluated (text-classification or ner)|\n",
"|**model** |PipelineModel or path to a saved model or pretrained pipeline/model from hub.\n",
"|**data** |Path to the data that is to be used for evaluation. Can be .csv or .conll file in the CoNLL format \n",
"|**config** |Configuration for the tests to be performed, specified in form of a YAML file.\n",
"|**hub** |model hub to load from the path. Required if model param is passed as path.|\n",
"\n",
"| Parameter | Description |\n",
"| ------------- | ----------- |\n",
"| **task** | Task for which the model is to be evaluated (text-classification or ner) |\n",
"| **model** | Specifies the model(s) to be evaluated. Can be a dictionary or a list of dictionaries; each dictionary must contain 'model' and 'hub' keys, including when 'model' is a path to a saved model. |\n",
"| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys: <ul><li>data_source (mandatory): The source of the data.</li><li>subset (optional): The subset of the data.</li><li>feature_column (optional): The column containing the features.</li><li>target_column (optional): The column containing the target labels.</li><li>split (optional): The data split to be used.</li></ul> |\n",
"| **config** | Configuration for the tests to be performed, specified in the form of a YAML file. |\n",
"\n",
"\n",
"<br/>\n",
"<br/>"
@@ -549,7 +550,8 @@
},
"outputs": [],
"source": [
"h = Harness(task=\"ner\", model=\"trained_model\", hub=\"huggingface\", data=\"sample.conll\")"
"\n",
"h = Harness(task=\"ner\", model={\"model\": \"trained_model\", \"hub\": \"huggingface\"}, data={\"data_source\": \"sample.conll\"})"
]
},
{
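The hunks above switch `Harness` from flat `model`/`hub`/`data` keyword arguments to dictionary arguments. As a minimal sketch of the new argument shape (the `validate_*` helpers below are illustrative only, not part of langtest — the library performs its own validation):

```python
# Sketch of the dict-based Harness arguments this PR adopts.
# validate_model / validate_data are hypothetical helpers for illustration.

def validate_model(model):
    """Accept a dict or a list of dicts, each with 'model' and 'hub' keys."""
    entries = model if isinstance(model, list) else [model]
    for entry in entries:
        if not isinstance(entry, dict) or not {"model", "hub"} <= entry.keys():
            raise ValueError("each model entry needs 'model' and 'hub' keys")
    return entries

def validate_data(data):
    """'data_source' is mandatory; the other keys are optional."""
    if "data_source" not in data:
        raise ValueError("'data_source' is mandatory")
    allowed = {"data_source", "subset", "feature_column", "target_column", "split"}
    unknown = set(data) - allowed
    if unknown:
        raise ValueError(f"unknown keys: {unknown}")
    return data

# The argument shapes used throughout these notebooks:
model = {"model": "trained_model", "hub": "huggingface"}
data = {"data_source": "sample.conll"}
validate_model(model)
validate_data(data)
```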
@@ -109,13 +109,14 @@
"<br/>\n",
"\n",
"\n",
"| Parameter | Description |\n",
"| - | - |\n",
"|**task** |Task for which the model is to be evaluated (text-classification or ner)|\n",
"|**model** |PipelineModel or path to a saved model or pretrained pipeline/model from hub.\n",
"|**data** |Path to the data that is to be used for evaluation. Can be .csv or .conll file in the CoNLL format\n",
"|**config** |Configuration for the tests to be performed, specified in form of a YAML file.\n",
"|**hub** |model hub to load from the path. Required if model param is passed as path.|\n",
"\n",
"| Parameter | Description |\n",
"| ------------- | ----------- |\n",
"| **task** | Task for which the model is to be evaluated (text-classification or ner) |\n",
"| **model** | Specifies the model(s) to be evaluated. Can be a dictionary or a list of dictionaries; each dictionary must contain 'model' and 'hub' keys, including when 'model' is a path to a saved model. |\n",
"| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys: <ul><li>data_source (mandatory): The source of the data.</li><li>subset (optional): The subset of the data.</li><li>feature_column (optional): The column containing the features.</li><li>target_column (optional): The column containing the target labels.</li><li>split (optional): The data split to be used.</li></ul> |\n",
"| **config** | Configuration for the tests to be performed, specified in the form of a YAML file. |\n",
"\n",
"\n",
"<br/>\n",
"<br/>"
@@ -409,7 +410,7 @@
},
"outputs": [],
"source": [
"harness = Harness(task=\"ner\", model=ner_model, data=\"sample.conll\")"
"harness = Harness(task=\"ner\", model={\"model\": ner_model, \"hub\": \"johnsnowlabs\"}, data={\"data_source\": \"sample.conll\"})"
]
},
{
@@ -109,13 +109,14 @@
"<br/>\n",
"\n",
"\n",
"| Parameter | Description | \n",
"| - | - | \n",
"|**task** |Task for which the model is to be evaluated (text-classification or ner)|\n",
"|**model** |PipelineModel or path to a saved model or pretrained pipeline/model from hub.\n",
"|**data** |Path to the data that is to be used for evaluation. Can be .csv or .conll file in the CoNLL format \n",
"|**config** |Configuration for the tests to be performed, specified in form of a YAML file.\n",
"|**hub** |model hub to load from the path. Required if model param is passed as path.|\n",
"\n",
"| Parameter | Description |\n",
"| ------------- | ----------- |\n",
"| **task** | Task for which the model is to be evaluated (text-classification or ner) |\n",
"| **model** | Specifies the model(s) to be evaluated. Can be a dictionary or a list of dictionaries; each dictionary must contain 'model' and 'hub' keys, including when 'model' is a path to a saved model. |\n",
"| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys: <ul><li>data_source (mandatory): The source of the data.</li><li>subset (optional): The subset of the data.</li><li>feature_column (optional): The column containing the features.</li><li>target_column (optional): The column containing the target labels.</li><li>split (optional): The data split to be used.</li></ul> |\n",
"| **config** | Configuration for the tests to be performed, specified in the form of a YAML file. |\n",
"\n",
"\n",
"<br/>\n",
"<br/>"
@@ -242,7 +243,7 @@
},
"outputs": [],
"source": [
"harness = Harness(task=\"ner\", model=ner_model, data=\"sample.conll\", hub=\"johnsnowlabs\")"
"harness = Harness(task=\"ner\", model={\"model\": ner_model, \"hub\": \"johnsnowlabs\"}, data={\"data_source\": \"sample.conll\"})"
]
},
{
@@ -1284,7 +1285,7 @@
}
],
"source": [
"harness = Harness.load(\"saved_test_configurations\", model=augmented_ner_model, task=\"ner\")"
"harness = Harness.load(\"saved_test_configurations\",model=augmented_ner_model,task=\"ner\")"
]
},
{
@@ -89,13 +89,14 @@
"<br/>\n",
"\n",
"\n",
"| Parameter | Description | \n",
"| - | - | \n",
"|**task** |Task for which the model is to be evaluated (text-classification or ner)|\n",
"|**model** |PipelineModel or path to a saved model or pretrained pipeline/model from hub.\n",
"|**data** |Path to the data that is to be used for evaluation. Can be .csv or .conll file in the CoNLL format \n",
"|**config** |Configuration for the tests to be performed, specified in form of a YAML file.\n",
"|**hub** |model hub to load from the path. Required if model param is passed as path.|\n",
"\n",
"| Parameter | Description |\n",
"| ------------- | ----------- |\n",
"| **task** | Task for which the model is to be evaluated (text-classification or ner) |\n",
"| **model** | Specifies the model(s) to be evaluated. Can be a dictionary or a list of dictionaries; each dictionary must contain 'model' and 'hub' keys, including when 'model' is a path to a saved model. |\n",
"| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys: <ul><li>data_source (mandatory): The source of the data.</li><li>subset (optional): The subset of the data.</li><li>feature_column (optional): The column containing the features.</li><li>target_column (optional): The column containing the target labels.</li><li>split (optional): The data split to be used.</li></ul> |\n",
"| **config** | Configuration for the tests to be performed, specified in the form of a YAML file. |\n",
"\n",
"\n",
"<br/>\n",
"<br/>"
@@ -416,7 +417,7 @@
},
"outputs": [],
"source": [
"h = Harness(task=\"ner\",model=spacy_model, data=\"/content/sample.conll\")"
"h = Harness(task=\"ner\", model={\"model\": spacy_model, \"hub\": \"spacy\"}, data={\"data_source\": \"/content/sample.conll\"})"
]
},
{
@@ -98,10 +98,9 @@
"| Parameter | Description | \n",
"| - | - | \n",
"|**task** |Task for which the model is to be evaluated (question-answering or summarization)|\n",
"|**model** |LLM model name (ex: text-davinci-002, command-xlarge-nightly etc.)|\n",
"|**data** |Benchmark dataset name (ex: BoolQ-test, XSum-test etc.)|\n",
"|**config** |Configuration for the tests to be performed, specified in form of a YAML file.|\n",
"|**hub** | Name of the hub (ex: openai, azure-openai, ai21, cohere etc.)|\n",
"| **model** | Specifies the model(s) to be evaluated. Can be a dictionary or a list of dictionaries; each dictionary must contain 'model' and 'hub' keys, including when 'model' is a path to a saved model. |\n",
"| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys: <ul><li>data_source (mandatory): The source of the data.</li><li>subset (optional): The subset of the data.</li><li>feature_column (optional): The column containing the features.</li><li>target_column (optional): The column containing the target labels.</li><li>split (optional): The data split to be used.</li></ul> |\n",
"| **config** | Configuration for the tests to be performed, specified in the form of a YAML file. |\n",
"\n",
"<br/>\n",
"<br/>"
@@ -173,7 +172,7 @@
},
"outputs": [],
"source": [
"harness = Harness(task=\"question-answering\", hub=\"ai21\", model=\"j2-jumbo-instruct\", data='BoolQ-test-tiny')"
"harness = Harness(task=\"question-answering\", model={\"model\": \"j2-jumbo-instruct\", \"hub\":\"ai21\"}, data={\"data_source\": 'BoolQ-test-tiny'})"
]
},
{
@@ -1146,7 +1145,7 @@
},
"outputs": [],
"source": [
"harness = Harness(task=\"question-answering\", hub=\"ai21\", model=\"j2-jumbo-instruct\", data='NQ-open-test-tiny')"
"harness = Harness(task=\"question-answering\", model={\"model\": \"j2-jumbo-instruct\", \"hub\": \"ai21\"}, data={\"data_source\": 'NQ-open-test-tiny'})"
]
},
{
@@ -1819,7 +1818,7 @@
"metadata": {},
"outputs": [],
"source": [
"harness = Harness(task=\"summarization\", hub=\"ai21\", model=\"j2-jumbo-instruct\", data='XSum-test-tiny')"
"harness = Harness(task=\"summarization\", model={\"model\": \"j2-jumbo-instruct\", \"hub\": \"ai21\"}, data={\"data_source\": 'XSum-test-tiny'})"
]
},
{
@@ -3236,7 +3235,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
"version": "3.9.13"
}
},
"nbformat": 4,
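The LLM notebooks above apply the same migration: the old flat `hub=` and string `data=` kwargs become nested dicts. A hypothetical helper sketching that mapping (not part of langtest; `upgrade_kwargs` is an illustrative name):

```python
# Illustrative mapping from the pre-PR flat kwargs to the new dict-based form.
# 'upgrade_kwargs' is a hypothetical helper, not a langtest API.

def upgrade_kwargs(task, model, hub, data):
    """Convert Harness(task=..., model=..., hub=..., data=...) kwargs
    to the dict shape used after this PR."""
    return {
        "task": task,
        "model": {"model": model, "hub": hub},
        "data": {"data_source": data},
    }

# Old call: Harness(task="question-answering", hub="ai21",
#                   model="j2-jumbo-instruct", data="BoolQ-test-tiny")
new_kwargs = upgrade_kwargs(
    "question-answering", "j2-jumbo-instruct", "ai21", "BoolQ-test-tiny"
)
```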
@@ -92,10 +92,9 @@
"| Parameter | Description | \n",
"| - | - | \n",
"|**task** |Task for which the model is to be evaluated (question-answering or summarization)|\n",
"|**model** |LLM model name (ex: text-davinci-002, command-xlarge-nightly etc.)|\n",
"|**data** |Benchmark dataset name (ex: BoolQ-test, XSum-test etc.)|\n",
"|**config** |Configuration for the tests to be performed, specified in form of a YAML file.|\n",
"|**hub** | Name of the hub (ex: openai, azure-openai, ai21, cohere etc.)|\n",
"| **model** | Specifies the model(s) to be evaluated. Can be a dictionary or a list of dictionaries; each dictionary must contain 'model' and 'hub' keys, including when 'model' is a path to a saved model. |\n",
"| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys: <ul><li>data_source (mandatory): The source of the data.</li><li>subset (optional): The subset of the data.</li><li>feature_column (optional): The column containing the features.</li><li>target_column (optional): The column containing the target labels.</li><li>split (optional): The data split to be used.</li></ul> |\n",
"| **config** | Configuration for the tests to be performed, specified in the form of a YAML file. |\n",
"\n",
"<br/>\n",
"<br/>"
@@ -173,7 +172,7 @@
},
"outputs": [],
"source": [
"harness = Harness(task=\"question-answering\", hub=\"azure-openai\", model=\"text-davinci-003\", data='BoolQ-test-tiny')"
"harness = Harness(task=\"question-answering\", model={\"model\": \"text-davinci-003\", \"hub\": \"azure-openai\"}, data={\"data_source\": 'BoolQ-test-tiny'})"
]
},
{
@@ -1131,7 +1130,8 @@
},
"outputs": [],
"source": [
"harness = Harness(task=\"question-answering\", hub=\"azure-openai\", model=\"text-davinci-003\", data='NQ-open-test-tiny')"
"harness = Harness(task=\"question-answering\", model={\"model\": \"text-davinci-003\", \"hub\": \"azure-openai\"}, data={\"data_source\": \n",
"'NQ-open-test-tiny'})"
]
},
{
@@ -1806,7 +1806,8 @@
"metadata": {},
"outputs": [],
"source": [
"harness = Harness(task='summarization', model={\"model\": 'text-davinci-003', \"hub\": \"azure-openai\"}, data={\"data_source\": \n",
"harness = Harness(task='summarization',model={\"model\": 'text-davinci-003', \"hub\": \"azure-openai\"}, data={\"data_source\": \n",
"'XSum-test-tiny'})"
]
},
{
11 changes: 5 additions & 6 deletions demo/tutorials/llm_notebooks/Clinical_Tests.ipynb
@@ -100,12 +100,11 @@
"\n",
"\n",
"| Parameter | Description | \n",
"| - | - |\n",
"|**task** |Task for which the model is to be evaluated (ex: clinical-tests)|\n",
"|**model** |LLM model name (ex: text-davinci-003)|\n",
"|**data** |dataset name (ex: Medical-files, Gastroenterology-files, Oromaxillofacial-files)|\n",
"|**config** |Configuration for the tests to be performed, specified in form of a YAML file.|\n",
"|**hub** | Name of the hub (ex: openai, azure-openai, ai21, cohere etc.)|\n",
"| - | - | \n",
"|**task** |Task for which the model is to be evaluated (ex: clinical-tests)|\n",
"| **model** | Specifies the model(s) to be evaluated. Can be a dictionary or a list of dictionaries; each dictionary must contain 'model' and 'hub' keys, including when 'model' is a path to a saved model. |\n",
"| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys: <ul><li>data_source (mandatory): The source of the data.</li><li>subset (optional): The subset of the data.</li><li>feature_column (optional): The column containing the features.</li><li>target_column (optional): The column containing the target labels.</li><li>split (optional): The data split to be used.</li></ul> |\n",
"| **config** | Configuration for the tests to be performed, specified in the form of a YAML file. |\n",
"\n",
"<br/>\n",
"<br/>"
@@ -92,10 +92,9 @@
"| Parameter | Description | \n",
"| - | - | \n",
"|**task** |Task for which the model is to be evaluated (question-answering or summarization)|\n",
"|**model** |LLM model name (ex: text-davinci-002, command-xlarge-nightly etc.)|\n",
"|**data** |Benchmark dataset name (ex: BoolQ-test, XSum-test etc.)|\n",
"|**config** |Configuration for the tests to be performed, specified in form of a YAML file.|\n",
"|**hub** | Name of the hub (ex: openai, azure-openai, ai21, cohere etc.)|\n",
"| **model** | Specifies the model(s) to be evaluated. Can be a dictionary or a list of dictionaries; each dictionary must contain 'model' and 'hub' keys, including when 'model' is a path to a saved model. |\n",
"| **data** | The data to be used for evaluation. A dictionary providing flexibility and options for data sources. It should include the following keys: <ul><li>data_source (mandatory): The source of the data.</li><li>subset (optional): The subset of the data.</li><li>feature_column (optional): The column containing the features.</li><li>target_column (optional): The column containing the target labels.</li><li>split (optional): The data split to be used.</li></ul> |\n",
"| **config** | Configuration for the tests to be performed, specified in the form of a YAML file. |\n",
"\n",
"<br/>\n",
"<br/>"
@@ -176,7 +175,7 @@
},
"outputs": [],
"source": [
"harness = Harness(task=\"question-answering\", hub=\"cohere\", model=\"command-xlarge-nightly\", data='BoolQ-test-tiny')"
"harness = Harness(task=\"question-answering\", model={\"model\": \"command-xlarge-nightly\", \"hub\":\"cohere\"}, data={\"data_source\": 'BoolQ-test-tiny'})"
]
},
{
@@ -577,7 +576,8 @@
},
"outputs": [],
"source": [
"harness = Harness(task=\"question-answering\", hub=\"cohere\", model=\"command-xlarge-nightly\", data='NQ-open-test-tiny')"
"harness = Harness(task=\"question-answering\", model={\"model\": \"command-xlarge-nightly\", \"hub\": \"cohere\"}, data={\"data_source\": \n",
"'NQ-open-test-tiny'})"
]
},
{
@@ -716,7 +716,7 @@
"metadata": {},
"outputs": [],
"source": [
"harness = Harness(task='summarization',hub=\"cohere\", model=\"command-xlarge-nightly\", data='XSum-test-tiny')"
"harness = Harness(task='summarization', model={\"model\": \"command-xlarge-nightly\", \"hub\":\"cohere\"}, data={\"data_source\": 'XSum-test-tiny'})"
]
},
{
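The new parameter tables state that `model` "can be a dictionary or a list of dictionaries", which is what the "Update compare model tutorial" commit relies on for multi-model comparison. A sketch of that argument shape only (Harness itself is not invoked here; model names are taken from the diffs above):

```python
# Sketch: the 'model' parameter as a list of dicts, one entry per model to
# compare, plus a data dict using an optional key. Argument shape only.
models = [
    {"model": "j2-jumbo-instruct", "hub": "ai21"},
    {"model": "command-xlarge-nightly", "hub": "cohere"},
]
data = {"data_source": "XSum-test-tiny", "split": "test"}  # 'split' is optional

# Each entry carries its own hub, so models from different providers
# can be compared in one run.
hubs = [m["hub"] for m in models]
```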