SDK Method Reference
GraphGrid Python SDK
This page is a reference for SDK setup, SDK methods, and SDK response types.
For information about the SDK and how to use it, please see the page on SDK Usage.
GraphGrid SDK Set Up
Bootstrap Config
The SdkBootstrapConfig object provides the minimum configuration the SDK needs to work properly.
SdkBootstrapConfig(
access_key=None,
secret_key=None,
url_base='localhost',
is_docker_context=False)
Parameter | Default value | Description |
---|---|---|
access_key | None | OAuth access key |
secret_key | None | OAuth secret key |
url_base (optional) | "localhost" | Base address for SDK requests when executing outside of Docker |
is_docker_context (optional) | False | Whether execution is inside of Docker |
The required parameters access_key and secret_key are OAuth keys used to access Security Module token endpoints.
The other two parameters, url_base and is_docker_context, are optional and are set based on where the SDK code will be executed.
The parameter is_docker_context is a boolean and defaults to False.
If the SDK code runs inside a Docker container (ex. a custom DAG or a stand-alone Python microservice), then is_docker_context must be True for SDK calls to function properly.
The parameter url_base specifies the base address the SDK uses to call GraphGrid module endpoints.
It only applies when the SDK code is executed outside of a Docker context, for example when running an SDK Python script natively against a local CDP deployment.
Though a user can specify both, these parameters function as mutually exclusive:
when is_docker_context is True the SDK ignores url_base, and when is_docker_context is False the url_base is used for all SDK calls.
GraphGridSdk Object
The GraphGridSdk object is the core SDK Python object used to make SDK calls.
It is always constructed with an SdkBootstrapConfig object.
bootstrap_config = SdkBootstrapConfig(
access_key='accessKey',
secret_key='secretKey',
url_base='localhost',
is_docker_context=False)
sdk = GraphGridSdk(bootstrap_config)
GraphGrid SDK Methods
There are currently seven SDK methods available for use:
Method | Description |
---|---|
nmt_train | Kick off training job |
nmt_status | Status and results of a training job |
job_run | Kick off a custom job |
job_status | Status of a custom job |
save_dataset | Save a dataset for training |
promote_model | Promote an NLP model, swapping it in for use |
nmt_train_pipeline | Kick off the NLP model training pipeline |
The nmt_train and nmt_status methods are provided to trigger, monitor, and retrieve results from an nlp_model_training DAG run.
In contrast, the job_run and job_status methods are provided to trigger and monitor custom DAGs.
The nmt_train_pipeline method is specifically for kicking off the NLP model training pipeline: it runs training jobs, monitors them, and can promote the newly trained models.
The following subsections dive into the method parameters, their usage, and some examples.
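Before diving in, here is a rough sketch of how these methods can fit together, assuming the sdk object constructed above and placeholder model and dataset values; the nmt_train_pipeline method wraps a similar flow. Terminal states other than "success" are assumed to follow Airflow's DAG run states.
import time

# Kick off training (placeholder model and dataset values)
train_response = sdk.nmt_train(TrainRequestBody(model="named_entity_recognition",
                                                datasets="sample-dataset.jsonl",
                                                no_cache=False, gpu=False))

# Poll the training DAG run until it reaches a terminal state
status_response = sdk.nmt_status(train_response.dagRunId)
while status_response.state not in ("success", "failed"):
    time.sleep(30)
    status_response = sdk.nmt_status(train_response.dagRunId)

# Promote the newly trained model if training succeeded
if status_response.state == "success":
    sdk.promote_model(status_response.savedModelName)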
NLP Model Training Methods
NMT Train
The nmt_train SDK call is used to kick off model training. It triggers a new run of the nlp_model_training DAG.
The training specifics and configuration come in through the request_body. Please see the following section on Airflow training request bodies.
Parameter | Default value | Description |
---|---|---|
request_body | None | Training request body for nlp_model_training DAG run |
Here is an example of triggering a named_entity_recognition model training job, using the sample-dataset.jsonl dataset.
# Train a new model
training_request_body: TrainRequestBody = TrainRequestBody(model="named_entity_recognition",
datasets="sample-dataset.jsonl",
no_cache=False, gpu=False)
train_response: NMTTrainResponse = sdk.nmt_train(training_request_body)
The response type is NMTTrainResponse.
{
"status_code": 200,
"exception": None,
"dagId": "nlp_model_training",
"dagRunId": "manual__2022-05-05T14:29:54.415545+00:00",
"state": "queued",
"startDate": None,
"endDate": None,
"logicalDate": "2022-05-05T14:29:54.415545+00:00",
"externalTrigger": True,
"conf": {
"no_cache": False,
"model": "named_entity_recognition",
"datasets": "sample-dataset.jsonl",
"gpu": False
}
}
NMT Status
The nmt_status SDK call is used for getting the status and results of model training. It can be used to programmatically monitor training jobs.
Parameter | Default value | Description |
---|---|---|
dagRunId | None | The unique id for the DAG run |
The dagRunId is the id of the specific training job and is required to access the job's status.
You can get the dagRunId from the NMTTrainResponse returned by triggering a new training job, or you can get it directly from the Airflow Webserver UI.
status_response: NMTStatusResponse = sdk.nmt_status(train_response.dagRunId)
The status response NMTStatusResponse contains generic DAG run information and, if the training job has finished, is populated with information about the training results.
Also see example SDK usage for how this can be used to monitor a training job in real time.
Example NMTStatusResponse while running model training:
{
"status_code": 200,
"exception": None,
"dagId": "nlp_model_training",
"dagRunId": "manual__2022-05-05T14:29:54.415545+00:00",
"state": "running",
"startDate": "2022-05-05T14:29:55.617307+00:00",
"endDate": None,
"logicalDate": "2022-05-05T14:29:54.415545+00:00",
"externalTrigger": True,
"conf": {
"no_cache": False,
"model": "named_entity_recognition",
"datasets": "sample-dataset.jsonl",
"gpu": False
},
"savedModelName": None,
"savedModelFilename": None,
"savedModelUrl": None,
"trainingAccuracy": None,
"trainingLoss": None,
"evalAccuracy": None,
"evalLoss": None,
"properties": None
}
Example NMTStatusResponse once training has successfully finished:
{
"status_code": 200,
"exception": None,
"dagId": "nlp_model_training",
"dagRunId": "manual__2022-05-05T14:29:54.415545+00:00",
"state": "success",
"startDate": "2022-05-05T14:29:55.617307+00:00",
"endDate": "2022-05-05T14:38:00.974698+00:00",
"logicalDate": "2022-05-05T14:29:54.415545+00:00",
"externalTrigger": True,
"conf": {
"no_cache": False,
"model": "named_entity_recognition",
"datasets": "sample-dataset.jsonl",
"gpu": False
},
"savedModelName": "20220428T174621-nerModel",
"savedModelFilename": "20220428T174621-nerModel.tar.gz",
"savedModelUrl": "http://minio:9000/com-graphgrid-nlp/2.0.0/20220428T174621-nerModel/20220428T174621-nerModel.tar.gz",
"trainingAccuracy": 0.842,
"trainingLoss": 0.13,
"evalAccuracy": None,
"evalLoss": None,
"properties": {
"languages": [
"en"
]
}
}
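Once the state is "success", the training results can be read directly off the response, for example (a sketch using the status_response from above):
# Inspect the training results after a successful run
print(status_response.savedModelName)    # e.g. "20220428T174621-nerModel"
print(status_response.savedModelUrl)     # location of the trained model artifact
print(status_response.trainingAccuracy, status_response.trainingLoss)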
Custom Job Methods
Job Run
The job_run SDK method provides a way to trigger a custom job (DAG run).
Parameter | Default value | Description |
---|---|---|
dag_id | None | The name or id of the DAG |
request_body | None | Config values to be used in DAG run |
The following example triggers the some-dag DAG with the specified request_body:
run_response: DagRunResponse = sdk.job_run(dag_id="some-dag",
                                           request_body={"conf": {"exampleConfig": "example"}})
The return type is a DagRunResponse.
{
"status_code": 200,
"exception": None,
"dagId": "some-dag",
"dagRunId": "manual__2022-05-05T14:29:54.415545+00:00",
"state": "queued",
"startDate": None,
"endDate": None,
"logicalDate": "2022-05-05T14:29:54.415545+00:00",
"externalTrigger": True,
"conf": {
"exampleConfig": "example"
}
}
Job Status
The job_status SDK call returns a job's (DAG run) status. Both the dag_id and dag_run_id must be provided to access the DAG run status.
Parameter | Default value | Description |
---|---|---|
dag_id | None | The name or id of the DAG |
dag_run_id | None | The unique id for the DAG run |
The following example retrieves the status for DAG run manual__2022-05-05T14:29:54.415545+00:00 under the some-dag DAG.
status_response: DagRunResponse = sdk.job_status(dag_id="some-dag",
dag_run_id="manual__2022-05-05T14:29:54.415545+00:00")
The return type is a DagRunResponse.
{
"status_code": 200,
"exception": None,
"dagId": "some-dag",
"dagRunId": "manual__2022-05-05T14:29:54.415545+00:00",
"state": "running",
"startDate": "2022-05-05T14:29:55.617307+00:00",
"endDate": None,
"logicalDate": "2022-05-05T14:29:54.415545+00:00",
"externalTrigger": True,
"conf": {
"exampleConfig": "example"
}
}
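The same polling pattern used for training jobs also works for custom jobs, for example (a sketch assuming the run_response from the job_run example above):
import time

# Poll the custom DAG run until it reaches a terminal state
status_response = sdk.job_status(dag_id="some-dag",
                                 dag_run_id=run_response.dagRunId)
while status_response.state not in ("success", "failed"):
    time.sleep(30)
    status_response = sdk.job_status(dag_id="some-dag",
                                     dag_run_id=run_response.dagRunId)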
Other SDK Methods
The other SDK methods are utility methods that allow you to perform actions programmatically that you would otherwise have to do manually.
Save Dataset
The save_dataset SDK call is used for saving datasets in cloud storage so they can be used for training.
Reading in and saving out the dataset is done in a streaming manner, so it can handle large datasets.
Parameter | Default value | Description |
---|---|---|
data_generator | None | The generator streaming training sample lines |
dataset_id (optional) | {{timestamp}} | Name/id for the dataset in cloud storage |
overwrite (optional) | False | Whether to overwrite the dataset if it already exists |
The only required parameter is data_generator, which is a Python generator.
The training samples can come from anywhere: a local file, an external database, the ONgDB CDP graph, etc.
See this page for more information on SDK datasets.
By default the dataset name is generated as <timestamp>.jsonl, but a user can provide their own dataset name with the dataset_id parameter.
Datasets in cloud storage are protected from accidental overwrites: a user must set the overwrite parameter to True in order to overwrite an existing dataset.
A 409 Conflict is returned if an attempt is made to overwrite an existing dataset without providing overwrite=True.
Below is an example of reading in a dataset from a file (training-samples.jsonl) with a generator and saving it out under the name sample-dataset using the SDK:
def read_by_line():
    # Stream the dataset file line by line, yielding encoded training samples
    with open("training-samples.jsonl", 'r', encoding='utf8') as infile:
        for line in infile:
            yield line.encode()

dataset_response: SaveDatasetResponse = sdk.save_dataset(
    data_generator=read_by_line(),
    dataset_id="sample-dataset")
The response type is SaveDatasetResponse.
{
"status_code": 200,
"exception": None,
"path": "s3://com-graphgrid-datasets/sample-dataset.jsonl",
"datasetId": "sample-dataset.jsonl"
}
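The returned datasetId can be passed straight into a training request, and an existing dataset can be replaced by re-saving it with overwrite=True (a sketch reusing the objects from the examples above):
# Re-save the dataset under the same name, replacing the existing copy
dataset_response = sdk.save_dataset(
    data_generator=read_by_line(),
    dataset_id="sample-dataset",
    overwrite=True)

# Use the saved dataset for a new training job
training_request_body = TrainRequestBody(model="named_entity_recognition",
                                         datasets=dataset_response.datasetId,
                                         no_cache=False, gpu=False)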
Promote Model
The promote_model SDK call promotes a trained model to the NLP module and swaps it in for use.
Parameter | Default value | Description |
---|---|---|
model_name | None | Name of the model to promote within cloud storage |
environment (optional) | "default" | The spring config environment to update for the model param |
The model_name is the name of the model to promote. The optional environment parameter specifies the Spring config environment to update with the promoted model's parameter; it defaults to "default".
promote_response: PromoteModelResponse = sdk.promote_model(status_response.savedModelName)
The response type is PromoteModelResponse.
{
"status_code": 200,
"exception": None,
"modelName": "20220428T174621-nerModel",
"task": "ner",
"paramKey": "spring.nlp.ner.model"
}
Response Types
The SDK methods described return different response objects. This section covers those response objects and what information they provide.
SdkServiceResponse
This is the most generic type of response. All of the following response objects inherit from this class and have access to these fields.
Field | Type | Description |
---|---|---|
status_code | string | HTTP status code (ex. 200) |
response | string | The response body |
exception | RequestException | Exception raised (if one occurred) |
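Because every response inherits these fields, a basic error check can be applied uniformly to any SDK call, for example (a sketch; how errors are handled is up to the caller):
# Generic error handling applicable to any SDK response
response = sdk.job_status(dag_id="some-dag",
                          dag_run_id="manual__2022-05-05T14:29:54.415545+00:00")
if response.exception is not None:
    raise response.exception
if response.status_code != 200:   # shown as 200 in the examples above
    print("Request failed with status", response.status_code)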
PromoteModelResponse
The PromoteModelResponse contains information about the model promotion.
Field | Type | Description |
---|---|---|
modelName | string | The promoted model's name |
task | string | The associated task |
paramKey | string | The spring configuration value updated |
SaveDatasetResponse
The SaveDatasetResponse contains information about the saved dataset.
Field | Type | Description |
---|---|---|
datasetId | string | The dataset's name |
path | string | The location of the dataset in cloud storage |
DagRunResponse
A DagRunResponse is a generic response for a job (a DAG run). It is returned when triggering a job or when getting a job's status.
It contains the following job information:
Field | Type | Description |
---|---|---|
dagId | string | The DAG id |
dagRunId | string | The job id |
state | string | The current state of the job |
startDate | string | Time job was triggered |
endDate | string | Time job finished |
logicalDate | string | Time covered by the job |
externalTrigger | bool | Whether the job was manually triggered or scheduled |
conf | dict | Job configuration |
NMTStatusResponse
The NMTStatusResponse is a subclass of DagRunResponse, so it contains all of the fields in DagRunResponse.
While training is running, the NMTStatusResponse can be used to monitor the status of the training job.
Once training has completed, the NMTStatusResponse also serves as a way to get results about the trained model.
In addition to the status information provided by DagRunResponse, an NMTStatusResponse object also contains:
Field | Type | Description |
---|---|---|
savedModelName | string | The trained model name (ex. 20210223T221627-nerModel) |
savedModelFilename | string | The trained model filename (ex. 20210223T221627-nerModel.tar.gz) |
savedModelUrl | string | The URL location of the trained model |
trainingAccuracy | float | The model's training accuracy |
trainingLoss | float | The model's training loss |
evalAccuracy | float | The model's evaluation accuracy |
evalLoss | float | The model's evaluation loss |
properties | dict | Properties of the trained model (ex. languages) |
These fields are populated with information about the trained model once the training job has successfully completed. Because of this, these fields will not be available during training.
NMTTrainResponse
The NMTTrainResponse is currently identical to DagRunResponse.