Set up your Kedro project

Check the installation

Run kedro info in a terminal to check that Kedro discovers the plugin. If the installation succeeded, you should see the following ASCII art, followed by the list of installed plugins:

 _            _
| | _____  __| |_ __ ___
| |/ / _ \/ _` | '__/ _ \
|   <  __/ (_| | | | (_) |
|_|\_\___|\__,_|_|  \___/
v<kedro-version>

kedro allows teams to create analytics
projects. It is developed as part of
the Kedro initiative at QuantumBlack.

Installed plugins:
kedro_mlflow: <kedro-mlflow-version> (hooks:global,project)

Version <kedro-mlflow-version> of the plugin is installed and provides both global and project commands.
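The same check can be sketched from Python using only the standard library. This is a minimal sketch, not part of the plugin: the helper name plugin_status is hypothetical, the distribution name "kedro-mlflow" is assumed to match what pip installed, and importlib.metadata requires Python 3.8 or later.

```python
from importlib import metadata, util


def plugin_status(module_name="kedro_mlflow", dist_name="kedro-mlflow"):
    """Return (importable, version) for a plugin.

    `importable` is True when the top-level module can be located;
    `version` is the installed distribution version, or None if the
    distribution is not found under `dist_name` (an assumed name).
    """
    importable = util.find_spec(module_name) is not None
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        version = None
    return importable, version


if __name__ == "__main__":
    # Prints a tuple such as (True, "<kedro-mlflow-version>") when installed.
    print(plugin_status())
```

This only confirms that the package is installed in the current environment; kedro info remains the way to confirm that Kedro itself discovers the plugin.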

That’s it! You are now ready to go!

Create a kedro project

This plugin must be used in an existing Kedro project. If you do not have a Kedro project yet, you can create one with the kedro new command. See the Kedro docs for a tutorial.

For this tutorial, if you do not have a real-world project, I strongly suggest that you accept the proposed example when prompted, so that you can try this plugin out of the box.

Update the template of your kedro project

In order to use the kedro-mlflow plugin, you need to perform 2 actions:

  1. Create a dedicated mlflow.yml file that holds the mlflow configuration.

  2. Update src/PYTHON_PACKAGE/run.py to add the necessary hooks to the project context. The MlflowPipelineHook manages the configuration and registers the PipelineML, while the MlflowNodeHook autologs the parameters.

Manual update

The MlflowPipelineHook and MlflowNodeHook hooks need to be registered in the run.py file. The Kedro documentation explains in detail how to register a hook.

Your run.py should look like the following code snippet:

from kedro.framework.context import KedroContext
from kedro_mlflow.framework.hooks import MlflowNodeHook, MlflowPipelineHook
from <python_package>.pipeline import create_pipelines

class ProjectContext(KedroContext):
    """Users can override the remaining methods from the parent class here,
    or create new ones (e.g. as required by plugins)
    """

    project_name = "<project-name>"
    project_version = "0.16.X" # must be >=0.16.0
    hooks = (
        MlflowNodeHook(flatten_dict_params=False),
        MlflowPipelineHook(model_name="<python_package>",
                           conda_env="src/requirements.txt")
    )  # <-- the new lines to add

Pay attention to the following elements:

  • if you have other hooks (custom, from other plugins…), you can just add them to the hooks tuple

  • you must register both hooks for the plugin to work

  • the hooks are highly parametrizable; a detailed description of their parameters is available here.
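Since both hooks must be registered, a small sanity check on the hooks tuple can catch an omission early. The following is a minimal sketch under stated assumptions: missing_hooks and REQUIRED_HOOKS are hypothetical names that are not part of kedro-mlflow, and the check compares class names only, so it does not validate hook parameters.

```python
# Hypothetical helper, not part of kedro-mlflow: it only compares class names.
REQUIRED_HOOKS = ("MlflowNodeHook", "MlflowPipelineHook")


def missing_hooks(hooks, required=REQUIRED_HOOKS):
    """Return the required hook class names that are absent from `hooks`."""
    registered = {type(hook).__name__ for hook in hooks}
    return [name for name in required if name not in registered]
```

In a project, calling missing_hooks(ProjectContext.hooks) should return an empty list when both hooks are present, regardless of how many other hooks the tuple contains.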