SDK Method Reference
GraphGrid Python SDK
This page is a reference for SDK setup, SDK methods, and SDK response types.
For information about the SDK and how to use it, please see the page on SDK Usage.
GraphGrid SDK Set Up
Bootstrap Config
The SdkBootstrapConfig object provides the minimum configuration the SDK needs to work properly.
SdkBootstrapConfig(
access_key=None,
secret_key=None,
url_base='localhost',
is_docker_context=False)
Parameter | Default value | Description |
---|---|---|
access_key | None | OAuth access key |
secret_key | None | OAuth secret key |
url_base (optional) | "localhost" | Base address for SDK requests when executing outside of Docker |
is_docker_context (optional) | False | Whether execution is inside of Docker |
The required parameters access_key and secret_key are OAuth keys used to access Security Module token endpoints.
The other two parameters, url_base and is_docker_context, are optional and are set based on where the SDK code will be executed.
The parameter is_docker_context is a boolean and defaults to False.
If the SDK code runs inside a Docker container (ex. a custom DAG or a stand-alone Python microservice), then is_docker_context must be True for SDK calls to function properly.
The parameter url_base specifies the base address the SDK uses to call GraphGrid module endpoints.
It only applies when the SDK code is executed outside of a Docker context, for example when running an SDK Python script natively against a local CDP deployment.
Though a user can specify both, these parameters function as mutually exclusive:
when is_docker_context is True the SDK ignores url_base, and when is_docker_context is False the url_base is used for all SDK calls.
GraphGridSdk Object
The GraphGridSdk object is the core SDK Python object used to make SDK calls.
It is always constructed with an SdkBootstrapConfig object.
bootstrap_config = SdkBootstrapConfig(
access_key='accessKey',
secret_key='secretKey',
url_base='localhost',
is_docker_context=False)
sdk = GraphGridSdk(bootstrap_config)
GraphGrid SDK Methods
There are currently seven SDK methods available for use:
Method | Description |
---|---|
nmt_train | Kick off training job |
nmt_status | Status and results of a training job |
job_run | Kick off a custom job |
job_status | Status of a custom job |
save_dataset | Save a dataset for training |
promote_model | Promote an NLP model, swapping it in for use |
nmt_train_pipeline | Kick off the NLP model training pipeline |
The nmt_train and nmt_status methods are provided to trigger, monitor, and retrieve results from an nlp_model_training DAG run.
In contrast, the job_run and job_status methods are provided to trigger and monitor custom DAGs.
The nmt_train_pipeline method is specifically for kicking off the NLP model training pipeline: it runs training jobs, monitors them, and can promote the newly trained models.
The following subsections dive into the method parameters, their usage, and some examples.
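Before diving in, here is a rough sketch of how these methods can fit together, assuming the sdk object constructed above and placeholder model and dataset values; the nmt_train_pipeline method wraps a similar flow. Terminal states other than "success" are assumed to follow Airflow's DAG run states.
import time

# Kick off training (placeholder model and dataset values)
train_response = sdk.nmt_train(TrainRequestBody(model="named_entity_recognition",
                                                datasets="sample-dataset.jsonl",
                                                no_cache=False, gpu=False))

# Poll the training DAG run until it reaches a terminal state
status_response = sdk.nmt_status(train_response.dagRunId)
while status_response.state not in ("success", "failed"):
    time.sleep(30)
    status_response = sdk.nmt_status(train_response.dagRunId)

# Promote the newly trained model if training succeeded
if status_response.state == "success":
    sdk.promote_model(status_response.savedModelName)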
NLP Model Training Methods
NMT Train
The nmt_train SDK call is used to kick off model training. It triggers a new run of the nlp_model_training DAG.
The training specifics and configuration come in through the request_body. Please see the following section on Airflow training request bodies.
Parameter | Default value | Description |
---|---|---|
request_body | None | Training request body for nlp_model_training DAG run |
Here is an example of triggering a named_entity_recognition model training job, using the sample-dataset.jsonl dataset.
# Train a new model
training_request_body: TrainRequestBody = TrainRequestBody(model="named_entity_recognition",
datasets="sample-dataset.jsonl",
no_cache=False, gpu=False)
train_response: NMTTrainResponse = sdk.nmt_train(training_request_body)
The response type is NMTTrainResponse.
{
"status_code": 200,
"exception": None,
"dagId": "nlp_model_training",
"dagRunId": "manual__2022-05-05T14:29:54.415545+00:00",
"state": "queued",
"startDate": None,
"endDate": None,
"logicalDate": "2022-05-05T14:29:54.415545+00:00",
"externalTrigger": True,
"conf": {
"no_cache": False,
"model": "named_entity_recognition",
"datasets": "sample-dataset.jsonl",
"gpu": False
}
}
NMT Status
The nmt_status SDK call is used for getting the status and results of model training. It can be used to programmatically monitor training jobs.
Parameter | Default value | Description |
---|---|---|
dagRunId | None | The unique id for the DAG run |
The dagRunId is the id of the specific training job and is required to access the job's status.
You can get the dagRunId from the NMTTrainResponse returned by triggering a new training job, or you can get it directly from the Airflow Webserver UI.
status_response: NMTStatusResponse = sdk.nmt_status(train_response.dagRunId)
The status response NMTStatusResponse contains generic DAG run information and, if the training job has finished, is populated with information about the training results.
Also see example SDK usage for how this can be used to monitor a training job in real time.
Example NMTStatusResponse while running model training:
{
"status_code": 200,
"exception": None,
"dagId": "nlp_model_training",
"dagRunId": "manual__2022-05-05T14:29:54.415545+00:00",
"state": "running",
"startDate": "2022-05-05T14:29:55.617307+00:00",
"endDate": None,
"logicalDate": "2022-05-05T14:29:54.415545+00:00",
"externalTrigger": True,
"conf": {
"no_cache": False,
"model": "named_entity_recognition",
"datasets": "sample-dataset.jsonl",
"gpu": False
},
"savedModelName": None,
"savedModelFilename": None,
"savedModelUrl": None,
"trainingAccuracy": None,
"trainingLoss": None,
"evalAccuracy": None,
"evalLoss": None,
"properties": None
}
Example NMTStatusResponse once training has successfully finished:
{
"status_code": 200,
"exception": None,
"dagId": "nlp_model_training",
"dagRunId": "manual__2022-05-05T14:29:54.415545+00:00",
"state": "success",
"startDate": "2022-05-05T14:29:55.617307+00:00",
"endDate": "2022-05-05T14:38:00.974698+00:00",
"logicalDate": "2022-05-05T14:29:54.415545+00:00",
"externalTrigger": True,
"conf": {
"no_cache": False,
"model": "named_entity_recognition",
"datasets": "sample-dataset.jsonl",
"gpu": False
},
"savedModelName": "20220428T174621-nerModel",
"savedModelFilename": "20220428T174621-nerModel.tar.gz",
"savedModelUrl": "http://minio:9000/com-graphgrid-nlp/2.0.0/20220428T174621-nerModel/20220428T174621-nerModel.tar.gz",
"trainingAccuracy": 0.842,
"trainingLoss": 0.13,
"evalAccuracy": None,
"evalLoss": None,
"properties": {
"languages": [
"en"
]
}
}
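Once the state is "success", the training results can be read directly off the response, for example (a sketch using the status_response from above):
# Inspect the training results after a successful run
print(status_response.savedModelName)    # e.g. "20220428T174621-nerModel"
print(status_response.savedModelUrl)     # location of the trained model artifact
print(status_response.trainingAccuracy, status_response.trainingLoss)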
Custom Job Methods
Job Run
The job_run SDK method provides a way to trigger a custom job (DAG run).
Parameter | Default value | Description |
---|---|---|
dag_id | None | The name or id of the DAG |
request_body | None | Config values to be used in DAG run |
The following example triggers the some-dag DAG with the specified request_body:
run_response: DagRunResponse = sdk.job_run(dag_id="some-dag",
                                           request_body={"conf": {"exampleConfig": "example"}})
The return type is a DagRunResponse.
{
"status_code": 200,
"exception": None,
"dagId": "some-dag",
"dagRunId": "manual__2022-05-05T14:29:54.415545+00:00",
"state": "queued",
"startDate": None,
"endDate": None,
"logicalDate": "2022-05-05T14:29:54.415545+00:00",
"externalTrigger": True,
"conf": {
"exampleConfig": "example"
}
}
Job Status
The job_status SDK call returns a job's (DAG run) status. Both the dag_id and dag_run_id must be provided to access the DAG run status.
Parameter | Default value | Description |
---|---|---|
dag_id | None | The name or id of the DAG |
dag_run_id | None | The unique id for the DAG run |
The following example retrieves the status for DAG run manual__2022-05-05T14:29:54.415545+00:00 under the some-dag DAG.
status_response: DagRunResponse = sdk.job_status(dag_id="some-dag",
dag_run_id="manual__2022-05-05T14:29:54.415545+00:00")
The return type is a DagRunResponse.
{
"status_code": 200,
"exception": None,
"dagId": "some-dag",
"dagRunId": "manual__2022-05-05T14:29:54.415545+00:00",
"state": "running",
"startDate": "2022-05-05T14:29:55.617307+00:00",
"endDate": None,
"logicalDate": "2022-05-05T14:29:54.415545+00:00",
"externalTrigger": True,
"conf": {
"exampleConfig": "example"
}
}
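The same polling pattern used for training jobs also works for custom jobs, for example (a sketch assuming the run_response from the job_run example above):
import time

# Poll the custom DAG run until it reaches a terminal state
status_response = sdk.job_status(dag_id="some-dag",
                                 dag_run_id=run_response.dagRunId)
while status_response.state not in ("success", "failed"):
    time.sleep(30)
    status_response = sdk.job_status(dag_id="some-dag",
                                     dag_run_id=run_response.dagRunId)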
Other SDK Methods
The other SDK methods are utility methods that allow you to perform actions programmatically that you would otherwise have to do manually.
Save Dataset
The save_dataset SDK call is used for saving datasets in cloud storage so they can be used for training.
Reading in and saving out the dataset is done in a streaming manner, so it can handle large datasets.
Parameter | Default value | Description |
---|---|---|
data_generator | None | The generator streaming training sample lines |
dataset_id (optional) | {{timestamp}} | Name/id for the dataset in cloud storage |
overwrite (optional) | False | Whether to overwrite the dataset if it already exists |
The only required parameter is data_generator, which is a Python generator.
The training samples can come from anywhere: a local file, an external database, the ONgDB CDP graph, etc.
See this page for more information on SDK datasets.
By default the dataset name is generated as <timestamp>.jsonl, but a user can provide their own dataset name with the dataset_id parameter.
Datasets in cloud storage are protected from accidental overwrites: a user must set the overwrite parameter to True in order to overwrite an existing dataset.
A 409 Conflict is returned if an attempt is made to overwrite an existing dataset without providing overwrite=True.
Below is an example of reading in a dataset from a file (training-samples.jsonl) with a generator and saving it out under the name sample-dataset using the SDK:
def read_by_line():
    # Stream the dataset file line by line, yielding encoded training samples
    with open("training-samples.jsonl", 'r', encoding='utf8') as infile:
        for line in infile:
            yield line.encode()

dataset_response: SaveDatasetResponse = sdk.save_dataset(
    data_generator=read_by_line(),
    dataset_id="sample-dataset")
The response type is SaveDatasetResponse.
{
"status_code": 200,
"exception": None,
"path": "s3://com-graphgrid-datasets/sample-dataset.jsonl",
"datasetId": "sample-dataset.jsonl"
}
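The returned datasetId can be passed straight into a training request, and an existing dataset can be replaced by re-saving it with overwrite=True (a sketch reusing the objects from the examples above):
# Re-save the dataset under the same name, replacing the existing copy
dataset_response = sdk.save_dataset(
    data_generator=read_by_line(),
    dataset_id="sample-dataset",
    overwrite=True)

# Use the saved dataset for a new training job
training_request_body = TrainRequestBody(model="named_entity_recognition",
                                         datasets=dataset_response.datasetId,
                                         no_cache=False, gpu=False)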
Promote Model
The promote_model SDK call promotes a trained model to the NLP module and swaps it in for use.
Parameter | Default value | Description |
---|---|---|
model_name | None | Name of the model to promote within cloud storage |
environment (optional) | "default" | The spring config environment to update for the model param |
The model_name is the name of the model to promote. The optional environment parameter specifies the Spring config environment to update with the promoted model's parameter; it defaults to "default".
promote_response: PromoteModelResponse = sdk.promote_model(status_response.savedModelName)
The response type is PromoteModelResponse.
{
"status_code": 200,
"exception": None,
"modelName": "20220428T174621-nerModel",
"task": "ner",
"paramKey": "spring.nlp.ner.model"
}
Response Types
The SDK methods described return different response objects. This section covers those response objects and what information they provide.
SdkServiceResponse
This is the most generic type of response. All of the following response objects inherit from this class and have access to these fields.
Field | Type | Description |
---|---|---|
status_code | string | HTTP status code (ex. 200) |
response | string | The response body |
exception | RequestException | Exception raised (if one occurred) |
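Because every response inherits these fields, a basic error check can be applied uniformly to any SDK call, for example (a sketch; how errors are handled is up to the caller):
# Generic error handling applicable to any SDK response
response = sdk.job_status(dag_id="some-dag",
                          dag_run_id="manual__2022-05-05T14:29:54.415545+00:00")
if response.exception is not None:
    raise response.exception
if response.status_code != 200:   # shown as 200 in the examples above
    print("Request failed with status", response.status_code)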
PromoteModelResponse
The PromoteModelResponse contains information about the model promotion.
Field | Type | Description |
---|---|---|
modelName | string | The promoted model's name |
task | string | The associated task |
paramKey | string | The spring configuration value updated |
SaveDatasetResponse
The SaveDatasetResponse contains information about the saved dataset.
Field | Type | Description |
---|---|---|
datasetId | string | The dataset's name |
path | string | The location of the dataset in cloud storage |
DagRunResponse
A DagRunResponse is a generic response for a job (a DAG run). It is returned when triggering a job or when getting a job's status.
It contains the following job information:
Field | Type | Description |
---|---|---|
dagId | string | The DAG id |
dagRunId | string | The job id |
state | string | The current state of the job |
startDate | string | Time job was triggered |
endDate | string | Time job finished |
logicalDate | string | Time covered by the job |
externalTrigger | bool | Whether the job was manually triggered or scheduled |
conf | dict | Job configuration |
NMTStatusResponse
The NMTStatusResponse is a subclass of DagRunResponse, so it contains all of the fields in DagRunResponse.
While training is running, the NMTStatusResponse can be used to monitor the status of the training job.
Once training has completed, the NMTStatusResponse also serves as a way to get results about the trained model.
In addition to the status information provided by DagRunResponse, an NMTStatusResponse object also contains:
Field | Type | Description |
---|---|---|
savedModelName | string | The trained model name (ex. 20210223T221627-nerModel) |
savedModelFilename | string | The trained model filename (ex. 20210223T221627-nerModel.tar.gz) |
savedModelUrl | string | The URL location of the trained model |
trainingAccuracy | float | The model's training accuracy |
trainingLoss | float | The model's training loss |
evalAccuracy | float | The model's evaluation accuracy |
evalLoss | float | The model's evaluation loss |
properties | dict | Properties of the trained model (ex. languages) |
These fields are populated with information about the trained model once the training job has successfully completed. Because of this, these fields will not be available during training.
NMTTrainResponse
The NMTTrainResponse is currently identical to DagRunResponse.