12 changes: 12 additions & 0 deletions src/app/docs/chain-of-thought-text-classification/page.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,3 +44,15 @@ from skllm.models.gpt.classification.zero_shot import CoTGPTClassifier
| `prompt_template` | `Optional[str]` | Custom prompt template to use, by default None. |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |
| `org` | `Optional[str]` | Estimator-specific ORG key; if None, retrieved from the global config, by default None. |

### CoTClaudeClassifier
```python
from skllm.models.anthropic.classification.zero_shot import CoTClaudeClassifier
```

| **Parameter** | **Type** | **Description** |
| ------------- | -------- | --------------- |
| `model` | `str` | Model to use, by default "claude-3-haiku-20240307". |
| `default_label` | `str` | Default label for failed predictions; if "Random", selects randomly based on class frequencies, by default "Random". |
| `prompt_template` | `Optional[str]` | Custom prompt template to use, by default None. |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |
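
The `default_label="Random"` behavior described in the table can be pictured as a weighted random draw over the classes seen during training. The block below is a minimal sketch of that idea in plain Python, not Scikit-LLM's actual implementation; the helper name `pick_default_label` and the toy labels are assumptions made for illustration.

```python
import random
from collections import Counter


def pick_default_label(training_labels, rng=None):
    """Return a fallback label drawn in proportion to its training frequency.

    Hypothetical helper for illustration only; not part of Scikit-LLM.
    """
    if rng is None:
        rng = random.Random()
    counts = Counter(training_labels)
    classes = list(counts)
    weights = [counts[c] for c in classes]
    # random.choices performs a weighted draw; k=1 returns a one-element list.
    return rng.choices(classes, weights=weights, k=1)[0]


# Toy training labels: 80% "positive", 20% "negative".
train_labels = ["positive"] * 8 + ["negative"] * 2
rng = random.Random(0)  # seeded so repeated runs give the same draws
draws = Counter(pick_default_label(train_labels, rng) for _ in range(1000))
print(draws)
```

With a fallback of this kind, a frequent class is returned more often than a rare one, so failed predictions roughly follow the training distribution instead of always mapping to a single label.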
20 changes: 19 additions & 1 deletion src/app/docs/dynamic-few-shot-text-classification/page.md
Expand Up @@ -89,4 +89,22 @@ from skllm.models.gpt.classification.few_shot import DynamicFewShotGPTClassifier
| `n_examples` | `int` | Number of closest examples per class to be retrieved, by default 3. |
| `memory_index` | `Optional[IndexConstructor]` | Custom memory index, for details check `skllm.memory` submodule, by default None. |
| `vectorizer` | `Optional[BaseVectorizer]` | Scikit-LLM vectorizer; if None, `GPTVectorizer` is used, by default None. |
| `metric` | `Optional[str]` | Metric used for similarity search by the memory_index, by default "euclidean". |

### DynamicFewShotClaudeClassifier
`DynamicFewShotClaudeClassifier` uses `GPTVectorizer` to compute embeddings unless a different `vectorizer` is supplied, so you also need to set up an OpenAI API key.

```python
from skllm.models.anthropic.classification.few_shot import DynamicFewShotClaudeClassifier
```

| **Parameter** | **Type** | **Description** |
| ------------- | -------- | --------------- |
| `model` | `str` | Model to use, by default "claude-3-haiku-20240307". |
| `default_label` | `str` | Default label for failed predictions; if "Random", selects randomly based on class frequencies. |
| `prompt_template` | `Optional[str]` | Custom prompt template to use, by default None. |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config. |
| `n_examples` | `int` | Number of closest examples per class to be retrieved, by default 3. |
| `memory_index` | `Optional[IndexConstructor]` | Custom memory index; for details, check the `skllm.memory` submodule. |
| `vectorizer` | `Optional[BaseVectorizer]` | Scikit-LLM vectorizer; if None, `GPTVectorizer` is used. |
| `metric` | `Optional[str]` | Metric used for similarity search, by default "euclidean". |
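
The `n_examples` and `metric` parameters describe a nearest-neighbor lookup: for each class, the training samples closest to the input are retrieved and used as few-shot examples. The sketch below illustrates that selection step with a brute-force Euclidean search over toy 2-D vectors. It is an assumption-laden illustration, not the library's code: the function `closest_examples_per_class`, the vectors, and the texts are all made up, and a real run would use embeddings from the vectorizer and a memory index instead.

```python
import math
from collections import defaultdict


def closest_examples_per_class(query_vec, examples, n_examples=3):
    """Return {label: [texts]} with the n_examples nearest texts per label.

    `examples` is a list of (vector, text, label) tuples.
    Hypothetical helper for illustration only; not part of Scikit-LLM.
    """
    by_label = defaultdict(list)
    for vec, text, label in examples:
        # math.dist computes the Euclidean distance between two points.
        by_label[label].append((math.dist(query_vec, vec), text))
    return {
        label: [text for _, text in sorted(scored, key=lambda p: p[0])[:n_examples]]
        for label, scored in by_label.items()
    }


# Toy "embeddings"; real embeddings would have many more dimensions.
training = [
    ([0.0, 0.0], "great film", "positive"),
    ([0.1, 0.0], "loved it", "positive"),
    ([5.0, 5.0], "a masterpiece", "positive"),
    ([1.0, 1.0], "terrible plot", "negative"),
    ([0.9, 1.2], "boring", "negative"),
]
selected = closest_examples_per_class([0.0, 0.1], training, n_examples=2)
print(selected)
```

Selecting per class, as opposed to taking the overall nearest neighbors, keeps every label represented in the prompt even when the input lies far from some classes.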
27 changes: 26 additions & 1 deletion src/app/docs/few-shot-text-classification/page.md
Expand Up @@ -68,4 +68,29 @@ from skllm.models.gpt.classification.few_shot import MultiLabelFewShotGPTClassif
| `max_labels` | `Optional[int]` | Maximum labels per sample, by default 5. |
| `prompt_template` | `Optional[str]` | Custom prompt template to use, by default None. |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |
| `org` | `Optional[str]` | Estimator-specific ORG key; if None, retrieved from the global config, by default None. |

### FewShotClaudeClassifier
```python
from skllm.models.anthropic.classification.few_shot import FewShotClaudeClassifier
```

| **Parameter** | **Type** | **Description** |
| ------------- | -------- | --------------- |
| `model` | `str` | Model to use, by default "claude-3-haiku-20240307". |
| `default_label` | `str` | Default label for failed predictions; if "Random", selects randomly based on class frequencies. |
| `prompt_template` | `Optional[str]` | Custom prompt template to use, by default None. |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config. |

### MultiLabelFewShotClaudeClassifier
```python
from skllm.models.anthropic.classification.few_shot import MultiLabelFewShotClaudeClassifier
```

| **Parameter** | **Type** | **Description** |
| ------------- | -------- | --------------- |
| `model` | `str` | Model to use, by default "claude-3-haiku-20240307". |
| `default_label` | `str` | Default label for failed predictions; if "Random", selects randomly based on class frequencies. |
| `max_labels` | `Optional[int]` | Maximum labels per sample, by default 5. |
| `prompt_template` | `Optional[str]` | Custom prompt template to use, by default None. |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config. |
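
To make the effect of `max_labels` concrete, the sketch below shows one plausible way a cap on labels per sample could be applied to a model's raw output. How Scikit-LLM post-processes multi-label responses internally is not covered by this page, so treat the function `cap_labels` and its filtering rules as assumptions for illustration.

```python
def cap_labels(predicted, known_labels, max_labels=5):
    """Keep known labels in order, drop duplicates, and truncate to max_labels.

    Hypothetical helper for illustration only; not part of Scikit-LLM.
    """
    kept = []
    for label in predicted:
        if label in known_labels and label not in kept:
            kept.append(label)
    return kept[:max_labels]


known = {"quality", "price", "delivery", "support"}
raw_output = ["price", "quality", "unknown", "price", "delivery"]
capped = cap_labels(raw_output, known, max_labels=2)
print(capped)
```

A cap like this bounds the size of each prediction, which is useful when downstream code expects a fixed maximum number of labels per sample.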
15 changes: 15 additions & 0 deletions src/app/docs/introduction-backend-families/page.md
Expand Up @@ -160,6 +160,21 @@ Additionally, for tuning LLMs in Vertex, it is required to have 64 cores
3. Click on “Edit Quotas”, set the limit to 64 and submit the request.
The request should be approved within a few hours, but it might take up to several days.

---

## Anthropic Family

The Anthropic family includes models that use the Anthropic API format, specifically the Claude series of models.

### Claude (default)
The Claude backend is the default backend for the Anthropic family. To use the Claude backend, you need to set your Anthropic API key as follows:
```python
from skllm.config import SKLLMConfig

SKLLMConfig.set_anthropic_key("<YOUR_API_KEY>")
```
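
The estimator tables describe `key` as an estimator-specific API key that falls back to the global config when it is None. The following is a minimal sketch of that fallback pattern using a hypothetical `GlobalConfig` class and `resolve_key` function; neither is part of Scikit-LLM, and the real `SKLLMConfig` may store and look up keys differently.

```python
from typing import Optional


class GlobalConfig:
    """Hypothetical stand-in for a process-wide settings store."""

    _anthropic_key: Optional[str] = None

    @classmethod
    def set_anthropic_key(cls, key: str) -> None:
        cls._anthropic_key = key

    @classmethod
    def get_anthropic_key(cls) -> Optional[str]:
        return cls._anthropic_key


def resolve_key(estimator_key: Optional[str] = None) -> str:
    """Prefer the estimator-specific key; otherwise use the global one."""
    key = estimator_key if estimator_key is not None else GlobalConfig.get_anthropic_key()
    if key is None:
        raise RuntimeError("No Anthropic API key configured.")
    return key


GlobalConfig.set_anthropic_key("global-key")
print(resolve_key())                 # falls back to the global key
print(resolve_key("estimator-key"))  # the estimator-specific key wins
```

Under this pattern, setting the key once globally covers every estimator, while passing `key` to a single estimator overrides it only for that instance.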

---

## Third party integrations

20 changes: 16 additions & 4 deletions src/app/docs/ner/page.md
Expand Up @@ -14,9 +14,6 @@ Named Entity Recognition is an experimental feature and may be subject to instab

Named Entity Recognition (NER) is the process of locating and classifying named entities in a given text.


Currently, Scikit-LLM has a single NER estimator (only works with the GPT family) called `Explainable NER`.

Exemplary usage:

```python
Expand Down Expand Up @@ -86,4 +83,19 @@ from skllm.models.gpt.tagging.ner import GPTExplainableNER
| `model` | `Optional[str]` | A model to use, by default "gpt-4o". |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |
| `org` | `Optional[str]` | Estimator-specific ORG key; if None, retrieved from the global config, by default None. |
| `num_workers` | `Optional[int]` | Number of workers (threads) to use, by default 1. |

### AnthropicExplainableNER

```python
from skllm.models.anthropic.tagging.ner import AnthropicExplainableNER
```

| **Parameter** | **Type** | **Default** | **Description** |
| ------------- | -------- | ----------- | --------------- |
| `entities` | `dict` | - | Dictionary of entities to recognize, with entity names as keys and descriptions as values. |
| `display_predictions` | `bool` | False | Whether to display predictions. |
| `sparse_output` | `bool` | True | Whether to generate a sparse representation of the predictions. |
| `model` | `str` | "claude-3-haiku-20240307" | Model to use. |
| `key` | `Optional[str]` | None | Estimator-specific API key; if None, retrieved from the global config. |
| `num_workers` | `int` | 1 | Number of workers (threads) to use. |
14 changes: 13 additions & 1 deletion src/app/docs/text-summarization/page.md
Expand Up @@ -46,4 +46,16 @@ from skllm.models.gpt.text2text.summarization import GPTSummarizer
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |
| `org` | `Optional[str]` | Estimator-specific ORG key; if None, retrieved from the global config, by default None. |
| `max_words` | `int` | Soft limit of the summary length, by default 15. |
| `focus` | `Optional[str]` | Concept in the text to focus on, by default None. |

### ClaudeSummarizer
```python
from skllm.models.anthropic.text2text.summarization import ClaudeSummarizer
```

| **Parameter** | **Type** | **Description** |
| ------------- | -------- | --------------- |
| `model` | `str` | Model to use, by default "claude-3-haiku-20240307". |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |
| `max_words` | `int` | Soft limit of the summary length, by default 15. |
| `focus` | `Optional[str]` | Concept in the text to focus on, by default None. |
13 changes: 12 additions & 1 deletion src/app/docs/text-translation/page.md
Expand Up @@ -37,4 +37,15 @@ from skllm.models.gpt.text2text.translation import GPTTranslator
| `model` | `str` | Model to use, by default "gpt-3.5-turbo". |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |
| `org` | `Optional[str]` | Estimator-specific ORG key; if None, retrieved from the global config, by default None. |
| `output_language` | `str` | Target language, by default "English". |

### ClaudeTranslator
```python
from skllm.models.anthropic.text2text.translation import ClaudeTranslator
```

| **Parameter** | **Type** | **Description** |
| ------------- | -------- | --------------- |
| `model` | `str` | Model to use, by default "claude-3-haiku-20240307". |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |
| `output_language` | `str` | Target language, by default "English". |
27 changes: 26 additions & 1 deletion src/app/docs/zero-shot-text-classification/page.md
Expand Up @@ -123,4 +123,29 @@ from skllm.models.vertex.classification.zero_shot import MultiLabelZeroShotVerte
| `model` | `str` | Model to use, by default "text-bison@002". |
| `default_label` | `str` | Default label for failed prediction; if "Random" -> selects randomly based on class frequencies, by default "Random". |
| `max_labels` | `Optional[int]` | Maximum labels per sample, by default 5. |
| `prompt_template` | `Optional[str]` | Custom prompt template to use, by default None. |

### ZeroShotClaudeClassifier
```python
from skllm.models.anthropic.classification.zero_shot import ZeroShotClaudeClassifier
```

| **Parameter** | **Type** | **Description** |
| ------------- | -------- | --------------- |
| `model` | `str` | Model to use, by default "claude-3-haiku-20240307". |
| `default_label` | `str` | Default label for failed predictions; if "Random", selects randomly based on class frequencies, by default "Random". |
| `prompt_template` | `Optional[str]` | Custom prompt template to use, by default None. |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |

### MultiLabelZeroShotClaudeClassifier
```python
from skllm.models.anthropic.classification.zero_shot import MultiLabelZeroShotClaudeClassifier
```

| **Parameter** | **Type** | **Description** |
| ------------- | -------- | --------------- |
| `model` | `str` | Model to use, by default "claude-3-haiku-20240307". |
| `default_label` | `str` | Default label for failed predictions; if "Random", selects randomly based on class frequencies, by default "Random". |
| `max_labels` | `Optional[int]` | Maximum number of labels per sample, by default 5. |
| `prompt_template` | `Optional[str]` | Custom prompt template to use, by default None. |
| `key` | `Optional[str]` | Estimator-specific API key; if None, retrieved from the global config, by default None. |