4 changes: 2 additions & 2 deletions docs/source/conf.py
@@ -13,7 +13,7 @@
import citrine
import os
import sys
sys.path.insert(0, os.path.abspath('../../src'))
sys.path.insert(0, os.path.abspath('../../src/citrine'))


# -- Project information -----------------------------------------------------
@@ -44,7 +44,7 @@
# build.
#
# See: https://github.com/sphinx-contrib/apidoc
apidoc_module_dir = '../../src'
apidoc_module_dir = '../../src/citrine'
apidoc_output_dir = 'reference'
apidoc_excluded_paths = ['tests']
apidoc_separate_modules = True
4 changes: 2 additions & 2 deletions docs/source/data_extraction.rst
@@ -8,7 +8,7 @@ A GEM Table is defined on a set of material histories, and the rows in the resul
Columns correspond to data about the material histories, such as the temperature measured in a kiln used at a specific manufacturing step.

Defining rows and columns
------------------------
-------------------------

A Row object describes a mapping from a list of Datasets to rows of a table by selecting a set of Material Histories.
Each Material History corresponds to exactly one row, though the Material Histories may overlap such that the same objects contribute data to multiple rows.
@@ -327,4 +327,4 @@ are compatible with each type of descriptor:
- :class:`~citrine.informatics.descriptors.ChemicalFormulaDescriptor`: values of type :class:`~gemd.entity.EmpiricalFormula`,
or values of type :class:`~gemd.entity.NominalComposition` when **all** quantity keys are valid atomic symbols
- :class:`~citrine.informatics.descriptors.FormulationDescriptor`: all values extracted by ingredient quantity, identifier, and label variables
are used to represent the formulation
are used to represent the formulation
2 changes: 1 addition & 1 deletion docs/source/getting_started/datasets.rst
@@ -74,7 +74,7 @@ Assume you have a "band gaps project" with known id, ``band_gaps_project_id``, a


Dataset Access, Sharing, and Transfer
------------------------------------
-------------------------------------

When a Dataset is created on the Citrine Platform, only members of the project in which it was created can see it and interact with it.
If a Dataset is made public, it (and its entire contents) can be retrieved by any user using any project.
3 changes: 2 additions & 1 deletion docs/source/index.rst
@@ -8,7 +8,7 @@ Welcome to the Citrine Python client documentation!

This site documents the Python SDK for the Citrine Platform.
It provides utilities to upload and manage data and design materials using Sequential Learning.
See the :ref:`getting started <getting-started>` guide for a high-level introduction.
See the :ref:`getting started <ai-engine-getting-started>` guide for a high-level introduction.
The :ref:`workflows <workflows>` section documents how to configure and run artificial intelligence (AI) workflows for materials research and development.

Installation
@@ -41,6 +41,7 @@ Table of Contents
formulations_example
molecular_generation
FAQ/index
API Reference <reference/modules>

Indices and tables
==================
4 changes: 0 additions & 4 deletions docs/source/modules.rst

This file was deleted.

74 changes: 74 additions & 0 deletions docs/source/molecular_generation.rst
@@ -0,0 +1,74 @@
.. _generative_design_execution:

[ALPHA] Generative Design Execution
===================================
The Citrine Platform offers a Generative Design Execution tool that allows the creation of new molecules by applying mutations to a set of given seed molecules.
To use this feature, you need to provide a set of starting molecules and filtering parameters using the :class:`~citrine.informatics.generative_design.GenerativeDesignInput` class.

The class requires you to define the seed molecules for generating mutations, the fingerprint type used to calculate the `fingerprint similarity <https://www.rdkit.org/docs/GettingStartedInPython.html#fingerprinting-and-molecular-similarity>`_, the minimum fingerprint similarity between the seed and mutated molecule, the number of initial mutations attempted per seed, and the minimum substructure counts for each mutated molecule.

Various fingerprint types are available on the Citrine Platform, including Atom Pairs (AP), Path-Length Connectivity (PHCO), Binary Path (BPF), Paths of Atoms of Heteroatoms (PATH), Extended Connectivity Fingerprint with radius 4 (ECFP4) and radius 6 (ECFP6), and Focused Connectivity Fingerprint with radius 4 (FCFP4) and radius 6 (FCFP6).
Each fingerprint type captures different aspects of molecular structure and influences the generated mutations.
You can access these fingerprint types through the :class:`~citrine.informatics.generative_design.FingerprintType` enum, for example ``FingerprintType.ECFP4``.
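
As a rough illustration of what a fingerprint similarity measures, the sketch below computes a Tanimoto-style coefficient between two fingerprints represented as sets of "on" bit indices. The function ``tanimoto_similarity`` and the bit sets are hypothetical and are not part of the Citrine client; the similarity measure used by the platform is assumed here, not confirmed.

```python
from typing import Set


def tanimoto_similarity(a: Set[int], b: Set[int]) -> float:
    """Shared on-bits divided by the total number of distinct on-bits."""
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical on-bit indices for a seed molecule and one of its mutations.
seed_bits = {1, 4, 7, 9}
mutated_bits = {1, 4, 8, 9, 12}

similarity = tanimoto_similarity(seed_bits, mutated_bits)
print(similarity)
```

A value near 1 means the two fingerprints set almost the same bits; a value near 0 means they share few features under the chosen fingerprint type.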

The ``structure_exclusions`` parameter allows you to control the structural features of mutated molecules.
It is a sequence of exclusion types corresponding to the types of structural features or elements to exclude from the list of possible mutation steps during the generative design process.
If a type is present in the sequence, the mutation steps generated by the process will avoid using that feature or element.
The available structure exclusion options can be found in the :class:`~citrine.informatics.generative_design.StructureExclusion` class.

The ``min_substructure_counts`` parameter is a dictionary that constrains which substructures must appear in each mutated molecule. Each key is a substructure represented by a SMILES string, and each value is an integer giving the minimum number of times that substructure must appear for the mutation to be considered valid.
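
The constraint can be pictured as a simple check over match counts. In the sketch below, ``meets_min_counts`` is a hypothetical helper, and the observed counts are supplied by hand; in practice they would come from a cheminformatics toolkit, and the platform's own matching logic may differ.

```python
from typing import Dict


def meets_min_counts(observed: Dict[str, int], required: Dict[str, int]) -> bool:
    """True if every required substructure appears at least the required number of times."""
    return all(
        observed.get(smiles, 0) >= minimum
        for smiles, minimum in required.items()
    )


# Require at least one aromatic six-membered carbon ring.
min_substructure_counts = {"c1ccccc1": 1}

# Hypothetical, hand-written match counts for two candidate molecules.
toluene_counts = {"c1ccccc1": 1}
hexane_counts: Dict[str, int] = {}

print(meets_min_counts(toluene_counts, min_substructure_counts))
print(meets_min_counts(hexane_counts, min_substructure_counts))
```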

After the generative design process is complete, the mutations are filtered based on their similarity to the starting seed molecules.
Mutations that do not meet the similarity threshold or are duplicates will be discarded. The remaining mutations are returned as a subset of the original mutations in the form of a list of :class:`~citrine.informatics.generative_design.GenerativeDesignResult` objects.
These results contain information about the seed molecule, the mutation, the similarity score, and the fingerprint type used during execution.
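
The filtering step described above can be sketched in plain Python. The ``Mutation`` tuple, the ``filter_mutations`` helper, and the sample values are hypothetical stand-ins rather than Citrine classes, and treating two results as duplicates when their mutated SMILES strings match is an assumption made for illustration.

```python
from typing import List, NamedTuple


class Mutation(NamedTuple):
    seed: str
    mutated: str
    similarity: float


def filter_mutations(mutations: List[Mutation], min_similarity: float) -> List[Mutation]:
    """Drop mutations below the similarity threshold and repeated mutated molecules."""
    seen = set()
    kept = []
    for mutation in mutations:
        if mutation.similarity < min_similarity:
            continue  # too dissimilar from its seed
        if mutation.mutated in seen:
            continue  # duplicate of an earlier mutation
        seen.add(mutation.mutated)
        kept.append(mutation)
    return kept


# Made-up values, for illustration only.
candidates = [
    Mutation("CC(O)=O", "CCC(O)=O", 0.6),
    Mutation("CC(O)=O", "CCC(O)=O", 0.6),  # duplicate
    Mutation("CC(O)=O", "c1ccccc1", 0.05),  # below threshold
    Mutation("CCCCCCCCCCCC", "CCCCCCCCCCCCO", 0.7),
]

kept = filter_mutations(candidates, min_similarity=0.1)
print([m.mutated for m in kept])
```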

After triggering the execution and waiting for it to complete, you can retrieve the results and use them in your work.
The following example demonstrates how to run a generative design execution on the Citrine Platform using the Citrine Python client.

.. code-block:: python

    import os
    from citrine import Citrine
    from citrine.jobs.waiting import wait_while_executing
    from citrine.informatics.generative_design import GenerativeDesignInput, FingerprintType, StructureExclusion

    session = Citrine(
        api_key=os.environ.get("API_KEY"),
        scheme="https",
        host=os.environ.get("CITRINE_HOST"),
        port="443",
    )

    team_uid = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    project_uid = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    team = session.teams.get(team_uid)
    project = team.projects.get(project_uid)

    # Trigger a new generative design execution
    generative_design_input = GenerativeDesignInput(
        seeds=["CC(O)=O", "CCCCCCCCCCCC"],
        fingerprint_type=FingerprintType.ECFP4,
        min_fingerprint_similarity=0.1,
        mutation_per_seed=1000,
        structure_exclusions=[
            StructureExclusion.BROMINE,
            StructureExclusion.CHLORINE,
        ],
        min_substructure_counts={"c1ccccc1": 1}
    )
    generative_design_execution = project.generative_design_executions.trigger(
        generative_design_input
    )
    execution = wait_while_executing(
        collection=project.generative_design_executions, execution=generative_design_execution
    )
    generated = execution.results()
    mutations = [(gen.seed, gen.mutated) for gen in generated]

    # Or get a completed execution by ID
    execution_uid = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"
    execution = project.generative_design_executions.get(execution_uid)
    generated = execution.results()
    mutations = [(gen.seed, gen.mutated) for gen in generated]

To execute the code, replace the ``xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`` placeholders with valid UIDs from your Citrine environment. Ensure that the API key, scheme, host, and port are correctly specified in the ``Citrine`` initialization.
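
Because the example reads its credentials from the environment, it can help to check that the variables are set before constructing the client. The helper below is a hypothetical convenience, not part of the Citrine client; the variable names ``API_KEY`` and ``CITRINE_HOST`` are the ones used in the example above.

```python
import os
from typing import List, Mapping, Sequence


def missing_settings(
    env: Mapping[str, str],
    required: Sequence[str] = ("API_KEY", "CITRINE_HOST"),
) -> List[str]:
    """Return the names of required settings that are unset or empty."""
    return [name for name in required if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print("Set these environment variables first: " + ", ".join(missing))
```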
7 changes: 0 additions & 7 deletions docs/source/setup.rst

This file was deleted.

2 changes: 1 addition & 1 deletion docs/source/workflows/getting_started.rst
@@ -1,4 +1,4 @@
.. _getting-started:
.. _ai-engine-getting-started:

Getting Started
===============
1 change: 1 addition & 0 deletions docs/source/workflows/predictor_evaluation_workflows.rst
@@ -15,6 +15,7 @@ Metrics are specified as a set of :class:`PredictorEvaluationMetrics <citrine.in
The evaluator will only compute the subset of metrics valid for each response, so the top-level metrics defined by an evaluator should contain the union of all metrics computed across all responses.

.. _Cross-validation evaluator:

Cross-validation evaluator
^^^^^^^^^^^^^^^^^^^^^^^^^^

3 changes: 2 additions & 1 deletion docs/source/workflows/predictors.rst
@@ -130,7 +130,7 @@ The ML models are independently registered on-platform, but the expression predi

graph_predictor = project.predictors.register(
GraphPredictor(
name = "Big elastic constant predictor,
name = "Big elastic constant predictor",
description = ""
predictors = [
bulk_modulus_predictor.uid,
@@ -144,6 +144,7 @@ The ML models are independently registered on-platform, but the expression predi
For another example of graph predictor usage, see :ref:`AI Engine Code Examples <graph_predictor_example>`.

.. _Expression Predictor:

Expression predictor
--------------------

@@ -13,8 +13,10 @@ class IngredientRatioConstraint(Serializable['IngredientRatioConstraint'], Const
"""A formulation constraint operating on the ratio of quantities of ingredients and a basis.

Example: "6 to 7 parts ingredient A per 100 parts ingredient B" becomes

.. code:: python
IngredientRatioConstraint(min=6, max=7, ingredient=("A", 100), basis_ingredients=["B"])

IngredientRatioConstraint(min=6, max=7, ingredient=("A", 100), basis_ingredients=["B"])

Parameters
----------
@@ -124,11 +124,11 @@ class HierarchicalDesignSpace(EngineResource["HierarchicalDesignSpace"], DesignS
referencing other sub-nodes, allowing for the linkage of complex material history shapes
in the resulting candidates.

Every node also contains a set of :class:`~citrine.informatics.dimensions.Dimension`s
Every node also contains a set of :class:`~citrine.informatics.dimensions.Dimension`\\s
used to define any attributes (i.e., properties, processing parameters)
that will appear on the materials produced by that node.

:class:`~citrine.informatics.data_sources.DataSource`s can be included on the configuration
:class:`~citrine.informatics.data_sources.DataSource`\\s can be included on the configuration
to allow for design over "known" materials. The Citrine Platform will look up
the ingredient names from formulation subspaces on the design space nodes
in order to inject their composition/properties into the material history of the candidates.
30 changes: 9 additions & 21 deletions src/citrine/resources/dataset.py
@@ -53,45 +53,33 @@ class Dataset(Resource['Dataset']):
unique_name: Optional[str]
An optional, globally unique name that can be used to retrieve the dataset.

Attributes
----------
uid: UUID
Unique uuid4 identifier of this dataset.
deleted: bool
Flag indicating whether or not this dataset has been deleted.
created_by: UUID
ID of the user who created the dataset.
updated_by: UUID
ID of the user who last updated the dataset.
deleted_by: UUID
ID of the user who deleted the dataset, if it is deleted.
create_time: int
Time the dataset was created, in seconds since epoch.
update_time: int
Time the dataset was most recently updated, in seconds since epoch.
delete_time: int
Time the dataset was deleted, in seconds since epoch, if it is deleted.
public: bool
Flag indicating whether the dataset is publicly readable.

"""

_response_key = 'dataset'
_resource_type = ResourceTypeEnum.DATASET

uid = properties.Optional(properties.UUID(), 'id')
"""UUID: Unique uuid4 identifier of this dataset."""
name = properties.String('name')
unique_name = properties.Optional(properties.String(), 'unique_name')
summary = properties.String('summary')
description = properties.String('description')
deleted = properties.Optional(properties.Boolean(), 'deleted')
"""bool: Flag indicating whether or not this dataset has been deleted."""
created_by = properties.Optional(properties.UUID(), 'created_by')
"""UUID: ID of the user who created the dataset."""
updated_by = properties.Optional(properties.UUID(), 'updated_by')
"""UUID: ID of the user who last updated the dataset."""
deleted_by = properties.Optional(properties.UUID(), 'deleted_by')
"""UUID: ID of the user who deleted the dataset, if it is deleted."""
create_time = properties.Optional(properties.Datetime(), 'create_time')
"""int: Time the dataset was created, in seconds since epoch."""
update_time = properties.Optional(properties.Datetime(), 'update_time')
"""int: Time the dataset was most recently updated, in seconds since epoch."""
delete_time = properties.Optional(properties.Datetime(), 'delete_time')
"""int: Time the dataset was deleted, in seconds since epoch, if it is deleted."""
public = properties.Optional(properties.Boolean(), 'public')
"""bool: Flag indicating whether the dataset is publicly readable."""
project_id = properties.Optional(properties.UUID(), 'project_id',
serializable=False, deserializable=False)
session = properties.Optional(properties.Object(Session), 'session',
29 changes: 9 additions & 20 deletions src/citrine/resources/file_link.py
@@ -132,26 +132,6 @@ class FileLink(
url: str
URL that can be used to access the file.

Attributes
----------
uid: UUID
Unique uuid4 identifier of this file; consistent across versions.
version: UUID
Unique uuid4 identifier of this version of this file
version_number: Integer
How many times this file has been uploaded;
files are the "same" if the share a filename and dataset
created_time: Datetime
Time the file was created on platform.
created_by: UUID
Unique uuid4 identifier of this User who loaded this file
mime_type: String
Encoded string representing the type of the file (IETF RFC 2045)
size: Integer
Size in bytes of the file
description: String
A human-readable description of the file

"""

# NOTE: skipping the "metadata" field since it appears to be unused
@@ -160,13 +140,22 @@ class FileLink(
filename = properties.String('filename')
url = properties.String('url')
uid = properties.Optional(properties.UUID, 'id', serializable=False)
"""UUID: Unique uuid4 identifier of this file; consistent across versions."""
version = properties.Optional(properties.UUID, 'version', serializable=False)
"""UUID: Unique uuid4 identifier of this version of this file."""
created_time = properties.Optional(properties.Datetime, 'created_time', serializable=False)
"""datetime: Time the file was created on platform."""
created_by = properties.Optional(properties.UUID, 'created_by', serializable=False)
"""UUID: Unique uuid4 identifier of this User who loaded this file."""
mime_type = properties.Optional(properties.String, 'mime_type', serializable=False)
"""str: Encoded string representing the type of the file (IETF RFC 2045)."""
size = properties.Optional(properties.Integer, 'size', serializable=False)
"""int: Size in bytes of the file."""
description = properties.Optional(properties.String, 'description', serializable=False)
"""str: A human-readable description of the file."""
version_number = properties.Optional(properties.Integer, 'version_number', serializable=False)
"""int: How many times this file has been uploaded; files are the "same" if they share a
filename and dataset."""

def __init__(self, filename: str, url: str):
GEMDFileLink.__init__(self, filename, url)
32 changes: 8 additions & 24 deletions src/citrine/resources/ingestion.py
@@ -107,19 +107,13 @@ def __repr__(self):


class IngestionException(CitrineException):
"""
[ALPHA] An exception that contains details of a failed ingestion.

Attributes
----------
uid: Optional[UUID]
errors: List[IngestionErrorTrace]

"""
"""[ALPHA] An exception that contains details of a failed ingestion."""

uid = properties.Optional(properties.UUID(), 'ingestion_id', default=None)
"""Optional[UUID]"""
status = properties.Enumeration(IngestionStatusType, "status")
errors = properties.List(properties.Object(IngestionErrorTrace), "errors")
"""List[IngestionErrorTrace]"""

def __init__(self,
*,
@@ -147,20 +141,14 @@ def from_api_error(cls, source: ApiError) -> "IngestionException":


class IngestionStatus(Resource['IngestionStatus']):
"""
[ALPHA] An object that represents the outcome of an ingestion event.

Attributes
----------
uid: String
status: IngestionStatusType
errors: List[IngestionErrorTrace]

"""
"""[ALPHA] An object that represents the outcome of an ingestion event."""

uid = properties.Optional(properties.UUID(), 'ingestion_id', default=None)
"""UUID"""
status = properties.Enumeration(IngestionStatusType, "status")
"""IngestionStatusType"""
errors = properties.List(properties.Object(IngestionErrorTrace), "errors")
"""List[IngestionErrorTrace]"""

def __init__(self,
*,
@@ -190,14 +178,10 @@ class Ingestion(Resource['Ingestion']):
every object in that dataset. A user with write access to a dataset can create, update,
and delete objects in the dataset.

Attributes
----------
uid: UUID
Unique uuid4 identifier of this ingestion.

"""

uid = properties.UUID('ingestion_id')
"""UUID: Unique uuid4 identifier of this ingestion."""
project_id = properties.UUID('project_id')
dataset_id = properties.UUID('dataset_id')
session = properties.Object(Session, 'session', serializable=False)
6 changes: 0 additions & 6 deletions src/citrine/resources/material_run.py
@@ -50,12 +50,6 @@ class MaterialRun(
file_links: List[FileLink], optional
Links to associated files, with resource paths into the files API.

Attributes
----------
measurements: List[MeasurementRun], optional
Measurements performed on this material. The link is established by creating the
measurement run and settings its `material` field to this material run.

"""

_response_key = GEMDMaterialRun.typ # 'material_run'