Hooks

Node Hook

class kedro_mlflow.framework.hooks.node_hook.MlflowNodeHook

Bases: object

before_node_run(node: kedro.pipeline.node.Node, catalog: kedro.io.data_catalog.DataCatalog, inputs: Dict[str, Any], is_async: bool) None

Hook to be invoked before a node runs. This hook logs all the parameters of the node in mlflow.

Parameters
  • node – The Node to run.

  • catalog – A DataCatalog containing the node’s inputs and outputs.

  • inputs – The dictionary of input datasets.

  • is_async – Whether the node was run in async mode.
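The parameter selection this hook performs can be sketched as follows. This is a minimal illustration, not the hook's actual implementation: it relies on the Kedro convention that parameters appear among a node's inputs under the name "parameters" or with a "params:" prefix, and the helper name extract_params is hypothetical.

```python
from typing import Any, Dict


def extract_params(inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the node inputs that are Kedro parameters.

    Hypothetical helper: assumes parameters are named "parameters"
    or carry the "params:" prefix, which is stripped from the key.
    """
    prefix = "params:"
    params: Dict[str, Any] = {}
    for key, value in inputs.items():
        if key.startswith(prefix):
            params[key[len(prefix):]] = value
        elif key == "parameters":
            params[key] = value
    return params


node_inputs = {
    "params:learning_rate": 0.01,
    "parameters": {"seed": 1},
    "raw_data": [1, 2, 3],  # a regular dataset, not a parameter
}
selected = extract_params(node_inputs)
```

Each entry of the resulting dictionary would then be a candidate for logging in mlflow; regular datasets such as raw_data are left out.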

before_pipeline_run(run_params: Dict[str, Any], pipeline: kedro.pipeline.pipeline.Pipeline, catalog: kedro.io.data_catalog.DataCatalog) None

Hook to be invoked before a pipeline runs.

Parameters
  • run_params

    The params needed for the given run. Should be identical to the data logged by Journal with the following schema:

    {
      "project_path": str,
      "env": str,
      "kedro_version": str,
      "tags": Optional[List[str]],
      "from_nodes": Optional[List[str]],
      "to_nodes": Optional[List[str]],
      "node_names": Optional[List[str]],
      "from_inputs": Optional[List[str]],
      "load_versions": Optional[List[str]],
      "pipeline_name": str,
      "extra_params": Optional[Dict[str, Any]]
    }

  • pipeline – The Pipeline that will be run.

  • catalog – The DataCatalog to be used during the run.
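The run_params schema is only described in prose. One way to model it explicitly in code is a TypedDict, sketched below. The names RunParams and missing_keys are hypothetical and are not part of kedro or kedro-mlflow; the field names and types are copied from the schema documented for run_params.

```python
from typing import Any, Dict, List, Optional, TypedDict


class RunParams(TypedDict):
    """Hypothetical typed model of the documented run_params schema."""

    project_path: str
    env: str
    kedro_version: str
    tags: Optional[List[str]]
    from_nodes: Optional[List[str]]
    to_nodes: Optional[List[str]]
    node_names: Optional[List[str]]
    from_inputs: Optional[List[str]]
    load_versions: Optional[List[str]]
    pipeline_name: str
    extra_params: Optional[Dict[str, Any]]


def missing_keys(run_params: Dict[str, Any]) -> List[str]:
    """Return the schema keys absent from run_params, sorted by name."""
    return sorted(set(RunParams.__annotations__) - set(run_params))


# Illustrative values only.
example: RunParams = {
    "project_path": "/tmp/project",
    "env": "local",
    "kedro_version": "0.0.0",
    "tags": None,
    "from_nodes": None,
    "to_nodes": None,
    "node_names": None,
    "from_inputs": None,
    "load_versions": None,
    "pipeline_name": "__default__",
    "extra_params": {"seed": 1},
}
```

A static type checker can then flag misspelled or missing keys, while missing_keys gives a lightweight runtime check.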

log_param(name: str, value: Union[Dict, int, bool, str]) None
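The signature accepts either a scalar or a dictionary as value. The sketch below shows one plausible way a dictionary could be expanded into flat key/value pairs before logging. The dotted naming scheme, the recursion into nested dictionaries, and the helper name expand_param are all assumptions made for illustration; the documentation above does not describe how log_param treats dictionaries.

```python
from typing import Any, Dict, Union


def expand_param(name: str, value: Union[Dict, int, bool, str]) -> Dict[str, Any]:
    """Expand a possibly nested dict into flat {"name.key": value} pairs.

    Hypothetical helper: scalars are returned unchanged under `name`;
    an empty dict expands to no pairs at all.
    """
    if isinstance(value, dict):
        flat: Dict[str, Any] = {}
        for key, sub_value in value.items():
            flat.update(expand_param(f"{name}.{key}", sub_value))
        return flat
    return {name: value}


flat_params = expand_param("model", {"lr": 0.1, "optimizer": {"name": "adam"}})
```

Each resulting pair could then be passed to a single-value logging call.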

Pipeline Hook

class kedro_mlflow.framework.hooks.pipeline_hook.MlflowPipelineHook

Bases: object

after_catalog_created(catalog: kedro.io.data_catalog.DataCatalog, conf_catalog: Dict[str, Any], conf_creds: Dict[str, Any], feed_dict: Dict[str, Any], save_version: str, load_versions: str)

after_pipeline_run(run_params: Dict[str, Any], pipeline: kedro.pipeline.pipeline.Pipeline, catalog: kedro.io.data_catalog.DataCatalog) None

Hook to be invoked after a pipeline runs.

Parameters
  • run_params

    The params needed for the given run. Should be identical to the data logged by Journal with the following schema:

    {
      "project_path": str,
      "env": str,
      "kedro_version": str,
      "tags": Optional[List[str]],
      "from_nodes": Optional[List[str]],
      "to_nodes": Optional[List[str]],
      "node_names": Optional[List[str]],
      "from_inputs": Optional[List[str]],
      "load_versions": Optional[List[str]],
      "pipeline_name": str,
      "extra_params": Optional[Dict[str, Any]]
    }

  • pipeline – The Pipeline that was run.

  • catalog – The DataCatalog used during the run.

before_pipeline_run(run_params: Dict[str, Any], pipeline: kedro.pipeline.pipeline.Pipeline, catalog: kedro.io.data_catalog.DataCatalog) None

Hook to be invoked before a pipeline runs.

Parameters
  • run_params

    The params needed for the given run. Should be identical to the data logged by Journal with the following schema:

    {
      "project_path": str,
      "env": str,
      "kedro_version": str,
      "tags": Optional[List[str]],
      "from_nodes": Optional[List[str]],
      "to_nodes": Optional[List[str]],
      "node_names": Optional[List[str]],
      "from_inputs": Optional[List[str]],
      "load_versions": Optional[List[str]],
      "pipeline_name": str,
      "extra_params": Optional[Dict[str, Any]]
    }

  • pipeline – The Pipeline that will be run.

  • catalog – The DataCatalog to be used during the run.

on_pipeline_error(error: Exception, run_params: Dict[str, Any], pipeline: kedro.pipeline.pipeline.Pipeline, catalog: kedro.io.data_catalog.DataCatalog)

Hook invoked when the pipeline execution fails.

All the mlflow runs must be closed to avoid interference with further execution.

Parameters
  • error – (Not used) The uncaught exception thrown during the pipeline run.

  • run_params

    (Not used) The params used to run the pipeline. Should be identical to the data logged by Journal with the following schema:

    {
      "project_path": str,
      "env": str,
      "kedro_version": str,
      "tags": Optional[List[str]],
      "from_nodes": Optional[List[str]],
      "to_nodes": Optional[List[str]],
      "node_names": Optional[List[str]],
      "from_inputs": Optional[List[str]],
      "load_versions": Optional[List[str]],
      "pipeline_name": str,
      "extra_params": Optional[Dict[str, Any]]
    }
    

  • pipeline – (Not used) The Pipeline that was run.

  • catalog – (Not used) The DataCatalog used during the run.
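The requirement that all mlflow runs be closed after a failure can be sketched as a loop that ends runs until none is active. This is a minimal sketch rather than the hook's actual code: close_all_runs and FakeTracker are hypothetical names, and the tracker is a stand-in so the example runs without mlflow installed.

```python
from typing import Callable, List, Optional


def close_all_runs(
    active_run: Callable[[], Optional[object]],
    end_run: Callable[[], None],
) -> int:
    """End runs until no run is active; return how many were closed."""
    closed = 0
    while active_run() is not None:
        end_run()
        closed += 1
    return closed


class FakeTracker:
    """Hypothetical stand-in for a tracking client with nested runs."""

    def __init__(self, run_names: List[str]) -> None:
        self._stack = list(run_names)

    def active_run(self) -> Optional[str]:
        # The innermost (most recently started) run is the active one.
        return self._stack[-1] if self._stack else None

    def end_run(self) -> None:
        self._stack.pop()


tracker = FakeTracker(["parent_run", "nested_run"])
closed = close_all_runs(tracker.active_run, tracker.end_run)
```

With mlflow available, mlflow.active_run and mlflow.end_run have the same call shape and could be passed in directly. The loop matters because nested runs leave a parent run active after the innermost one is ended.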