Updating data
You can update data that was previously streamed to the Openlayer platform. This is usually needed when:
- The ground truths for the streamed data were not available at inference time but became available later.
- You want to attach human feedback to a request, but the feedback was not available at inference time.
This guide shows how to use the Openlayer SDKs to update previously published data.
How to update data
Every row streamed to Openlayer has an inference_id, a unique identifier of the row. You can provide the inference_id during stream time, and if you don’t, Openlayer will assign unique IDs to your rows.
You must use the inference_id to specify the rows you want to update.
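If you provide your own IDs, it helps to generate and store them before streaming so the same rows can be referenced later. The snippet below is a minimal sketch of that idea using only the standard library. The helper `attach_inference_ids` and the example rows are hypothetical, and how the IDs are passed to Openlayer at stream time is not shown here.

```python
import uuid


def attach_inference_ids(rows):
    """Return copies of the rows, each carrying an `inference_id` key.

    Rows that already have an ID keep it; the others get a random UUID.
    (Hypothetical helper, not part of the Openlayer SDK.)
    """
    return [
        {**row, "inference_id": row.get("inference_id") or str(uuid.uuid4())}
        for row in rows
    ]


# Hypothetical rows about to be streamed.
rows = [
    {"age": 42, "prediction": 0},
    {"age": 31, "prediction": 1, "inference_id": "3b0b2521"},
]
rows_with_ids = attach_inference_ids(rows)

# Keep the IDs so the same rows can be referenced in a later update.
known_ids = [row["inference_id"] for row in rows_with_ids]
```

Storing `known_ids` alongside your own records (for example, a request log) is what later lets you match ground truths or feedback back to the right rows.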
Let’s say that you want to add a column called label with ground truths. Suppose your data is in a pandas DataFrame similar to:
Python
>>> df
  inference_id  label
0     d56d2b2c      0
1     3b0b2521      1
2     8c294a3a      0
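A DataFrame like this one can be assembled from wherever the ground truths are collected. The following is a rough sketch that assumes the labels arrive as a mapping from inference ID to label; the `ground_truths` dictionary is hypothetical. The two checks at the end are a suggested precaution, not a documented Openlayer requirement.

```python
import pandas as pd

# Hypothetical mapping from inference ID to ground-truth label,
# e.g. collected from a labeling tool after inference time.
ground_truths = {"d56d2b2c": 0, "3b0b2521": 1, "8c294a3a": 0}

df = pd.DataFrame(
    {
        "inference_id": list(ground_truths.keys()),
        "label": list(ground_truths.values()),
    }
)

# Precautionary checks: each row to update is identified exactly once,
# and every row actually has a label to write.
assert df["inference_id"].is_unique
assert df["label"].notna().all()
```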
First, you need to retrieve the inference pipeline object with:
Python
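import openlayer

# Authenticate with your API key
client = openlayer.OpenlayerClient("YOUR_API_KEY_HERE")

# Load the project, then the inference pipeline that received the data
project = client.load_project(name="Churn prediction")
inference_pipeline = project.load_inference_pipeline(
    name="production",
)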
Then, you can update the data specified by the inference IDs with:
Python
inference_pipeline.update_data(
    df=df,
    inference_id_column_name='inference_id',
    ground_truth_column_name='label',
)