Migration guide between kedro-mlflow versions#
This page explains how to migrate an existing kedro project to a more recent kedro-mlflow version that introduces breaking changes.
Migration from 0.13.x to 0.14.x#
Upgrade mlflow to mlflow>=2.7.0.
Migration from 0.12.x to 0.13.x#
Upgrade mlflow to mlflow>=1.30.
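The mlflow version floors above (mlflow>=2.7.0 for 0.14.x, mlflow>=1.30 for 0.13.x) can be checked before upgrading. The following is a minimal sketch, not part of kedro-mlflow or mlflow: the helper names release_tuple and meets_minimum are hypothetical, and the comparison only looks at the numeric release components, so pre-release suffixes such as rc1 are ignored.

```python
import re


def release_tuple(version: str) -> tuple:
    """Extract the leading numeric components, e.g. '2.7.0' -> (2, 7, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    a, b = release_tuple(installed), release_tuple(minimum)
    # Pad the shorter tuple with zeros so that '1.30' and '1.30.0' compare equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        # Check the environment against the floor required by kedro-mlflow 0.14.x.
        print(meets_minimum(version("mlflow"), "2.7.0"))
    except PackageNotFoundError:
        print("mlflow is not installed")
```

For anything beyond a quick sanity check, a dedicated version-parsing library is a safer choice than this simplified comparison.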
Migration from 0.11.x to 0.12.x#
Upgrade your kedro project to kedro>=0.19,<0.20.
Rename the following DataSets with the Dataset suffix (without the final capitalized S) in your catalog.yml. Some names have also changed to be more explicit:
| Name in kedro_mlflow<=0.11 | Name in kedro_mlflow>=0.12 |
|---|---|
| MlflowArtifactDataSet | MlflowArtifactDataset |
| MlflowAbstractModelDataSet | MlflowAbstractModelDataset |
| MlflowModelRegistryDataSet | MlflowModelRegistryDataset |
| MlflowMetricDataSet | MlflowMetricDataset |
| MlflowMetricHistoryDataSet | MlflowMetricHistoryDataset |
| MlflowModelLoggerDataSet | MlflowModelTrackingDataset |
| MlflowModelSaverDataSet | MlflowModelLocalFileSystemDataset |
| MlflowMetricsDataSet | MlflowMetricsHistoryDataset |
Update your MlflowArtifactDataset catalog entry to rename the data_set key to dataset:
my_dataset:
  type: MlflowArtifactDataset
  dataset:
    type: ...
If you use KedroPipelineModel or pipeline_ml_factory, the default copy_mode is now assign because this is the most efficient setup (and usually the desired one) when serving a Kedro Pipeline as a Mlflow model. To get back to the previous deepcopy mode, change the entry to:
pipeline_ml_factory(
    training=training_pipeline,
    inference=inference_pipeline,
    kpm_kwargs=dict(copy_mode="deepcopy"),
)
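The catalog renames in this section (the class names from the table and the data_set key) are mechanical, so they can be applied with a small script. The sketch below is a hypothetical helper, not something shipped with kedro-mlflow: it works on the raw text of catalog.yml, and it renames every data_set key it finds, not only those under MlflowArtifactDataset entries, so review the resulting diff before committing it.

```python
import re

# Old -> new dataset class names, copied from the renaming table in this section.
RENAMED_DATASETS = {
    "MlflowArtifactDataSet": "MlflowArtifactDataset",
    "MlflowAbstractModelDataSet": "MlflowAbstractModelDataset",
    "MlflowModelRegistryDataSet": "MlflowModelRegistryDataset",
    "MlflowMetricDataSet": "MlflowMetricDataset",
    "MlflowMetricHistoryDataSet": "MlflowMetricHistoryDataset",
    "MlflowModelLoggerDataSet": "MlflowModelTrackingDataset",
    "MlflowModelSaverDataSet": "MlflowModelLocalFileSystemDataset",
    "MlflowMetricsDataSet": "MlflowMetricsHistoryDataset",
}

# Match whole identifiers only, so a name is never rewritten as part of a longer one.
_NAME_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, RENAMED_DATASETS)) + r")\b"
)


def migrate_catalog_text(text: str) -> str:
    """Apply the 0.11 -> 0.12 renames to the text of a catalog.yml file."""
    text = _NAME_PATTERN.sub(lambda m: RENAMED_DATASETS[m.group(1)], text)
    # Rename the `data_set:` key (at any indentation level) to `dataset:`.
    return re.sub(r"^([ \t]*)data_set:", r"\1dataset:", text, flags=re.MULTILINE)
```

A typical use would be to read conf/base/catalog.yml, pass its content through migrate_catalog_text, and write the result back; the path is an assumption about a default kedro project layout.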
Migration from 0.10.x to 0.11.x#
If you are registering your kedro_mlflow hooks manually (instead of using automatic registering from the plugin, which is the default), change your settings.py from this:
# <your_project>/src/<your_project>/settings.py
from kedro_mlflow.framework.hooks import MlflowPipelineHook, MlflowNodeHook

HOOKS = (MlflowPipelineHook(), MlflowNodeHook())
to this:
# <your_project>/src/<your_project>/settings.py
from kedro_mlflow.framework.hooks import MlflowHook
HOOKS = (MlflowHook(),)
The get_mlflow_config public method has been removed and the mlflow configuration is now automatically stored in the mlflow attribute of KedroContext. If you need to access the mlflow configuration, you can use:
from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

bootstrap_project(project_path)
with KedroSession.create(
    project_path=project_path,
) as session:
    context = session.load_context()
    print(context.mlflow)  # this is where mlflow configuration is stored
Remove the server.stores_environment_variables key from mlflow.yml. This is a dead key which was unused. It will now throw an error if it is still written in mlflow.yml.
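If mlflow.yml is processed programmatically, the dead key can be stripped from the parsed configuration. This is a minimal sketch assuming the file has already been loaded into a plain dict with a YAML library of your choice; the function name is hypothetical and the other keys in the usage example are purely illustrative.

```python
import copy


def drop_stores_environment_variables(config: dict) -> dict:
    """Return a copy of a parsed mlflow.yml mapping without the dead key."""
    cleaned = copy.deepcopy(config)  # leave the caller's mapping untouched
    server = cleaned.get("server")
    if isinstance(server, dict):
        # Remove server.stores_environment_variables if present; no error otherwise.
        server.pop("stores_environment_variables", None)
    return cleaned


if __name__ == "__main__":
    # Illustrative content only; a real mlflow.yml has more keys.
    parsed = {"server": {"mlflow_tracking_uri": None, "stores_environment_variables": {}}}
    print(drop_stores_environment_variables(parsed))
```

For a single project, deleting the line by hand in mlflow.yml is simpler; the helper is only worthwhile when several configuration files need the same cleanup.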
Migration from 0.9.x to 0.10.x#
You must upgrade your kedro version to kedro>=0.18.1 to use kedro_mlflow>=0.10.
Migration from 0.8.x to 0.9.x#
There are no breaking changes in this release unless you retrieve the mlflow configuration manually (e.g. in a script or a jupyter notebook). In that case, the setup() method must now be called with the context instead of the session:
from kedro.framework.context import load_context
from kedro_mlflow.config import get_mlflow_config
context = load_context(".")
# the new best practice is just to remove these lines
mlflow_config = get_mlflow_config(context) # pass context instead of session
mlflow_config.setup(context) # pass context instead of session
These calls are not necessary: the mlflow config is automatically set up when the context is loaded, so unless you need to access the config manually, you can remove these 2 lines.
Migration from 0.7.x to 0.8.x#
Update the mlflow.yml configuration file with the kedro mlflow init --force command.
The first argument of pipeline_ml_factory(pipeline_ml=<your-pipeline-ml>, ...) (resp. KedroPipelineModel(pipeline_ml=<your-pipeline-ml>, ...)) is renamed pipeline. Change the call to pipeline_ml_factory(pipeline=<your-pipeline-ml>) (resp. KedroPipelineModel(pipeline=<your-pipeline-ml>, ...)).
Change the call from pipeline_ml_factory(..., model_signature=<model-signature>, conda_env=<conda-env>, model_name=<model_name>) to pipeline_ml_factory(..., log_model_kwargs=dict(signature=<model-signature>, conda_env=<conda-env>, artifact_path=<model_name>)). Notice that the arguments are renamed to match mlflow's and that they are passed as a dict in log_model_kwargs.
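The argument renames described above can be summarised as a mapping from old keyword arguments to new ones. The sketch below is an illustration only: migrate_factory_kwargs is a hypothetical helper that operates on a plain dict of keyword arguments and does not call kedro-mlflow itself.

```python
def migrate_factory_kwargs(old_kwargs: dict) -> dict:
    """Translate 0.7.x-style keyword arguments into their 0.8.x equivalents."""
    new_kwargs = dict(old_kwargs)  # shallow copy; the input is left untouched

    # `pipeline_ml` is renamed `pipeline`.
    if "pipeline_ml" in new_kwargs:
        new_kwargs["pipeline"] = new_kwargs.pop("pipeline_ml")

    # These arguments move into `log_model_kwargs` under mlflow's own names.
    renamed = {
        "model_signature": "signature",
        "conda_env": "conda_env",
        "model_name": "artifact_path",
    }
    log_model_kwargs = dict(new_kwargs.pop("log_model_kwargs", {}))
    for old_name, new_name in renamed.items():
        if old_name in new_kwargs:
            log_model_kwargs[new_name] = new_kwargs.pop(old_name)
    if log_model_kwargs:
        new_kwargs["log_model_kwargs"] = log_model_kwargs
    return new_kwargs
```

For example, a dict holding model_name="model" comes back with log_model_kwargs={"artifact_path": "model"}, and any key not affected by the migration is passed through unchanged.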
Migration from 0.6.x to 0.7.x#
If you are working with kedro==0.17.0, update your template to kedro>=0.17.1.
Migration from 0.5.x to 0.6.x#
kedro==0.16.x is no longer supported. You need to update your project template to the kedro==0.17.0 template.
Migration from 0.4.x to 0.5.x#
The only breaking change with the previous release is the format of KedroPipelineMLModel class. Hence, if you saved a pipeline as a Mlflow Model with pipeline_ml_factory in kedro-mlflow==0.4.x, loading it (either with MlflowModelTrackingDataset or mlflow.pyfunc.load_model) with kedro-mlflow==0.5.0 installed will raise an error. You will need either to retrain the model or to load it with kedro-mlflow==0.4.x.
Migration from 0.4.0 to 0.4.1#
There are no breaking changes in this patch release unless you retrieve the mlflow configuration manually (e.g. in a script or a jupyter notebook). In that case, you must add an extra call to the setup() method:
from kedro.framework.context import load_context
from kedro_mlflow.config import get_mlflow_config
context = load_context(".")
mlflow_config = get_mlflow_config(context)
mlflow_config.setup()  # <-- add this line, which did not exist in 0.4.0
Migration from 0.3.x to 0.4.x#
Catalog entries#
Replace the following entries:
| old | new |
|---|---|
| kedro_mlflow.io.MlflowArtifactDataset | kedro_mlflow.io.artifacts.MlflowArtifactDataset |
| kedro_mlflow.io.MlflowMetricsHistoryDataset | kedro_mlflow.io.metrics.MlflowMetricsHistoryDataset |
Hooks#
Hooks are now auto-registered if you use kedro>=0.16.4. You can remove the following entry from your run.py:
hooks = (MlflowPipelineHook(), MlflowNodeHook())
KedroPipelineModel#
Be aware that if you have saved a pipeline as a mlflow model with pipeline_ml_factory, retraining this pipeline with kedro-mlflow==0.4.0 will lead to a new behaviour. Assuming the name of your output in the DataCatalog was predictions, the output of a registered model will change from:
{
    "predictions":
        {
            "<your model-predictions>"
        }
}
to:
{
    "<your model-predictions>"
}
Thus, parsing the predictions of this model must be updated accordingly.
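Client code that must handle models trained both before and after this change can normalise the output first. The following is a minimal sketch; unwrap_predictions is a hypothetical helper, and it assumes that a dict with exactly one key equal to the DataCatalog output name is the old wrapped format, which would be ambiguous if your model itself returns such a dict.

```python
def unwrap_predictions(output, output_name="predictions"):
    """Return the bare predictions, whether or not they are wrapped in a dict."""
    if isinstance(output, dict) and set(output) == {output_name}:
        # Old behaviour: {"predictions": <your model-predictions>}
        return output[output_name]
    # New behaviour (kedro-mlflow>=0.4.0): predictions are returned directly.
    return output
```

Pass the name of your own DataCatalog output as output_name if it is not predictions.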