Hooks

Node Hook

class kedro_mlflow.framework.hooks.node_hook.MlflowNodeHook

Bases: object

before_node_run(node: kedro.pipeline.node.Node, catalog: kedro.io.data_catalog.DataCatalog, inputs: Dict[str, Any], is_async: bool) → None

Hook to be invoked before a node runs. This hook logs all the parameters of the node in mlflow.

Parameters

node – The Node to run.
catalog – A DataCatalog containing the node’s inputs and outputs.
inputs – The dictionary of input datasets.
is_async – Whether the node was run in async mode.
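As a rough illustration of the parameter logging described above, the sketch below picks the parameter entries out of a node's inputs dictionary. It assumes Kedro's convention of naming parameter inputs `parameters` or `params:<name>`. The helper name `extract_params` is hypothetical and is not part of kedro_mlflow.

```python
from typing import Any, Dict


def extract_params(inputs: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the entries of ``inputs`` that look like Kedro parameters."""
    prefix = "params:"
    params: Dict[str, Any] = {}
    for name, value in inputs.items():
        if name.startswith(prefix):
            # "params:learning_rate" is reported as "learning_rate"
            params[name[len(prefix):]] = value
        elif name == "parameters":
            # the node asked for the whole parameters dictionary
            params[name] = value
    return params


# Illustrative inputs: one parameter and one ordinary dataset
node_inputs = {"params:learning_rate": 0.01, "training_data": [1, 2, 3]}
node_params = extract_params(node_inputs)
```

Ordinary datasets such as `training_data` are ignored, so only values that came from the project parameters would be candidates for logging.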

before_pipeline_run(run_params: Dict[str, Any], pipeline: kedro.pipeline.pipeline.Pipeline, catalog: kedro.io.data_catalog.DataCatalog) → None

Hook to be invoked before a pipeline runs.

Parameters

run_params –

The params needed for the given run. Should be identical to the data logged by Journal with the following schema:

{
  "project_path": str,
  "env": str,
  "kedro_version": str,
  "tags": Optional[List[str]],
  "from_nodes": Optional[List[str]],
  "to_nodes": Optional[List[str]],
  "node_names": Optional[List[str]],
  "from_inputs": Optional[List[str]],
  "load_versions": Optional[List[str]],
  "pipeline_name": str,
  "extra_params": Optional[Dict[str, Any]]
}

pipeline – The Pipeline that will be run.
catalog – The DataCatalog to be used during the run.
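The run_params schema can also be written down as a type, so that a type checker can verify code that reads from it. The following is only a sketch: the class name `RunParams` is hypothetical, the field names and types are copied from the schema above, and the example values are placeholders.

```python
from typing import Any, Dict, List, Optional, TypedDict


class RunParams(TypedDict):
    """Typed view of the run_params dictionary passed to pipeline hooks."""

    project_path: str
    env: str
    kedro_version: str
    tags: Optional[List[str]]
    from_nodes: Optional[List[str]]
    to_nodes: Optional[List[str]]
    node_names: Optional[List[str]]
    from_inputs: Optional[List[str]]
    load_versions: Optional[List[str]]
    pipeline_name: str
    extra_params: Optional[Dict[str, Any]]


# Placeholder values, for illustration only
example_run_params: RunParams = {
    "project_path": "/path/to/project",
    "env": "local",
    "kedro_version": "x.y.z",
    "tags": None,
    "from_nodes": None,
    "to_nodes": None,
    "node_names": None,
    "from_inputs": None,
    "load_versions": None,
    "pipeline_name": "example_pipeline",
    "extra_params": {"seed": 42},
}
```

At runtime a TypedDict is an ordinary dict, so a value annotated this way can be handled exactly like the plain dictionary described in the schema.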

log_param(name: str, value: Union[Dict, int, bool, str]) → None
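log_param accepts a dictionary as well as scalar values. One way a dictionary-valued parameter could be turned into individual name/value pairs is to flatten it into dot-separated names. The helper below is a hypothetical sketch of that idea and makes no claim about how log_param itself treats dictionaries.

```python
from typing import Any, Dict


def flatten_dict(value: Dict[str, Any], prefix: str = "", sep: str = ".") -> Dict[str, Any]:
    """Flatten nested dictionaries into a single level with joined key names."""
    flat: Dict[str, Any] = {}
    for key, item in value.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(item, dict):
            # recurse, carrying the accumulated name as the new prefix
            flat.update(flatten_dict(item, prefix=name, sep=sep))
        else:
            flat[name] = item
    return flat


flat_params = flatten_dict({"model": {"depth": 3, "optimizer": {"lr": 0.1}}, "seed": 42})
```

Each entry of the flattened result is a scalar, so it could be logged as a separate parameter rather than as one serialized dictionary.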

Pipeline Hook

class kedro_mlflow.framework.hooks.pipeline_hook.MlflowPipelineHook

Bases: object

after_catalog_created(catalog: kedro.io.data_catalog.DataCatalog, conf_catalog: Dict[str, Any], conf_creds: Dict[str, Any], feed_dict: Dict[str, Any], save_version: str, load_versions: str)

after_pipeline_run(run_params: Dict[str, Any], pipeline: kedro.pipeline.pipeline.Pipeline, catalog: kedro.io.data_catalog.DataCatalog) → None

Hook to be invoked after a pipeline runs.

Parameters

run_params –

The params needed for the given run. Should be identical to the data logged by Journal with the following schema:

{
  "project_path": str,
  "env": str,
  "kedro_version": str,
  "tags": Optional[List[str]],
  "from_nodes": Optional[List[str]],
  "to_nodes": Optional[List[str]],
  "node_names": Optional[List[str]],
  "from_inputs": Optional[List[str]],
  "load_versions": Optional[List[str]],
  "pipeline_name": str,
  "extra_params": Optional[Dict[str, Any]]
}

pipeline – The Pipeline that was run.
catalog – The DataCatalog used during the run.

before_pipeline_run(run_params: Dict[str, Any], pipeline: kedro.pipeline.pipeline.Pipeline, catalog: kedro.io.data_catalog.DataCatalog) → None

Hook to be invoked before a pipeline runs.

Parameters

run_params –

The params needed for the given run. Should be identical to the data logged by Journal with the following schema:

{
  "project_path": str,
  "env": str,
  "kedro_version": str,
  "tags": Optional[List[str]],
  "from_nodes": Optional[List[str]],
  "to_nodes": Optional[List[str]],
  "node_names": Optional[List[str]],
  "from_inputs": Optional[List[str]],
  "load_versions": Optional[List[str]],
  "pipeline_name": str,
  "extra_params": Optional[Dict[str, Any]]
}

pipeline – The Pipeline that will be run.
catalog – The DataCatalog to be used during the run.

on_pipeline_error(error: Exception, run_params: Dict[str, Any], pipeline: kedro.pipeline.pipeline.Pipeline, catalog: kedro.io.data_catalog.DataCatalog)

Hook invoked when the pipeline execution fails. All the mlflow runs must be closed to avoid interference with further execution.

Parameters

error – (Not used) The uncaught exception thrown during the pipeline run.

run_params –

(Not used) The params used to run the pipeline. Should be identical to the data logged by Journal with the following schema:

{
  "project_path": str,
  "env": str,
  "kedro_version": str,
  "tags": Optional[List[str]],
  "from_nodes": Optional[List[str]],
  "to_nodes": Optional[List[str]],
  "node_names": Optional[List[str]],
  "from_inputs": Optional[List[str]],
  "load_versions": Optional[List[str]],
  "pipeline_name": str,
  "extra_params": Optional[Dict[str, Any]]
}

pipeline – (Not used) The Pipeline that was run.
catalog – (Not used) The DataCatalog used during the run.
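The requirement that all mlflow runs be closed when a pipeline fails can be sketched as a loop that ends runs until none is active. The function `close_all_runs` and the `FakeTracker` class are hypothetical and exist only to keep the example self-contained. The sketch assumes a tracker object exposing `active_run()` and `end_run(status=...)`, which are the names of the corresponding functions in the mlflow module; passing the real mlflow module is not exercised here.

```python
from typing import Any, List, Optional


def close_all_runs(tracker: Any, status: str = "FAILED") -> int:
    """End runs until ``tracker`` reports no active run; return how many were ended."""
    closed = 0
    while tracker.active_run() is not None:
        tracker.end_run(status=status)
        closed += 1
    return closed


class FakeTracker:
    """In-memory stand-in exposing the two calls used above (hypothetical)."""

    def __init__(self, run_ids: List[str]) -> None:
        # the last element plays the role of the innermost (nested) run
        self._stack = list(run_ids)

    def active_run(self) -> Optional[str]:
        return self._stack[-1] if self._stack else None

    def end_run(self, status: str = "FINISHED") -> None:
        self._stack.pop()


tracker = FakeTracker(["parent", "nested"])
closed_count = close_all_runs(tracker)
```

Looping rather than ending a single run matters when runs are nested: ending only the innermost run would leave the parent open and could interfere with the next execution.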