Hooks#

Node Hook#

class kedro_mlflow.framework.hooks.mlflow_hook.MlflowHook#

Bases: object
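
kedro-mlflow normally registers this hook automatically through its plugin entry point, so no manual setup is required. As a sketch, if the plugin's hooks were disabled (e.g. via DISABLE_HOOKS_FOR_PLUGINS in settings.py), the hook could be registered explicitly:

    # settings.py: illustrative manual registration; only needed if the
    # plugin's automatic hook registration has been disabled.
    from kedro_mlflow.framework.hooks.mlflow_hook import MlflowHook

    HOOKS = (MlflowHook(),)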

after_catalog_created(catalog: DataCatalog, conf_catalog: dict[str, Any], conf_creds: dict[str, Any], feed_dict: dict[str, Any], save_version: str, load_versions: str)#
after_context_created(context: KedroContext) → None#

Hook to be invoked after a KedroContext is created. This is the earliest hook triggered within a Kedro run. The KedroContext stores useful information such as credentials, config_loader and env.

Parameters:
  • context – The KedroContext that was created.
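
For readers implementing their own hooks against the same spec, a minimal sketch (the hook body below is purely illustrative, not what MlflowHook does internally):

    from kedro.framework.context import KedroContext
    from kedro.framework.hooks import hook_impl

    # Minimal custom hook implementing the same spec; the body is
    # illustrative only.
    class MyContextHook:
        @hook_impl
        def after_context_created(self, context: KedroContext) -> None:
            print(f"Running in env: {context.env}")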

after_pipeline_run(run_params: dict[str, Any], pipeline: Pipeline, catalog: DataCatalog) → None#

Hook to be invoked after a pipeline runs.

Parameters:
  • run_params

    The params needed for the given run. Should be identical to the data logged by Journal with the following schema:

    {
      "project_path": str,
      "env": str,
      "kedro_version": str,
      "tags": Optional[list[str]],
      "from_nodes": Optional[list[str]],
      "to_nodes": Optional[list[str]],
      "node_names": Optional[list[str]],
      "from_inputs": Optional[list[str]],
      "load_versions": Optional[list[str]],
      "pipeline_name": str,
      "extra_params": Optional[dict[str, Any]]
    }

  • pipeline – The Pipeline that was run.

  • catalog – The DataCatalog used during the run.
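
For orientation, a run_params dictionary matching the schema above might look like the sketch below (all values are made up for the example):

    # Illustrative run_params payload matching the documented schema;
    # every value here is invented for the example.
    run_params = {
        "project_path": "/home/user/my-kedro-project",
        "env": "local",
        "kedro_version": "0.19.0",
        "tags": None,
        "from_nodes": None,
        "to_nodes": None,
        "node_names": None,
        "from_inputs": None,
        "load_versions": None,
        "pipeline_name": "__default__",
        "extra_params": {"learning_rate": 0.01},
    }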

before_node_run(node: Node, catalog: DataCatalog, inputs: dict[str, Any], is_async: bool) → None#

Hook to be invoked before a node runs. This hook logs all the parameters of the node to mlflow.

Parameters:
  • node – The Node to run.

  • catalog – A DataCatalog containing the node's inputs and outputs.

  • inputs – The dictionary of the node's input datasets.

  • is_async – Whether the node was run in async mode.
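
As a rough sketch of what logging the node's parameters amounts to (the actual implementation handles flattening, long values and name sanitization, all omitted here), a hook can iterate over the inputs Kedro exposes under the params: prefix and log them to mlflow:

    import mlflow

    # Simplified sketch of parameter logging in a before_node_run hook;
    # the real MlflowHook is more involved than this.
    def log_node_params(inputs: dict) -> None:
        for name, value in inputs.items():
            # Kedro exposes parameters as inputs named "params:<key>",
            # plus a catch-all "parameters" entry.
            if name.startswith("params:"):
                mlflow.log_param(name.removeprefix("params:"), value)
            elif name == "parameters":
                for key, val in value.items():
                    mlflow.log_param(key, val)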

before_pipeline_run(run_params: dict[str, Any], pipeline: Pipeline, catalog: DataCatalog) → None#

Hook to be invoked before a pipeline runs.

Parameters:
  • run_params

    The params needed for the given run. Should be identical to the data logged by Journal with the following schema:

    {
      "project_path": str,
      "env": str,
      "kedro_version": str,
      "tags": Optional[list[str]],
      "from_nodes": Optional[list[str]],
      "to_nodes": Optional[list[str]],
      "node_names": Optional[list[str]],
      "from_inputs": Optional[list[str]],
      "load_versions": Optional[list[str]],
      "pipeline_name": str,
      "extra_params": Optional[dict[str, Any]]
    }

  • pipeline – The Pipeline that will be run.

  • catalog – The DataCatalog to be used during the run.
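
In broad strokes, this is the point where an mlflow run is opened so that everything logged during the pipeline lands in a single run. A minimal sketch of that pattern (the real hook derives the experiment and run names from the project's mlflow.yml configuration, which is omitted here):

    import mlflow

    # Minimal sketch of opening a tracking run before a pipeline executes;
    # the experiment name below is a placeholder, not kedro-mlflow's logic.
    def open_tracking_run(run_params: dict) -> None:
        mlflow.set_experiment("my_experiment")
        mlflow.start_run(run_name=run_params.get("pipeline_name") or "__default__")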

on_pipeline_error(error: Exception, run_params: dict[str, Any], pipeline: Pipeline, catalog: DataCatalog)#

Hook invoked when the pipeline execution fails. All open mlflow runs must be closed to avoid interfering with subsequent executions.

Parameters:
  • error – (Not used) The uncaught exception thrown during the pipeline run.

  • run_params

    (Not used) The params used to run the pipeline. Should be identical to the data logged by Journal with the following schema:

    {
      "project_path": str,
      "env": str,
      "kedro_version": str,
      "tags": Optional[list[str]],
      "from_nodes": Optional[list[str]],
      "to_nodes": Optional[list[str]],
      "node_names": Optional[list[str]],
      "from_inputs": Optional[list[str]],
      "load_versions": Optional[list[str]],
      "pipeline_name": str,
      "extra_params": Optional[dict[str, Any]]
    }
    

  • pipeline – (Not used) The Pipeline that was run.

  • catalog – (Not used) The DataCatalog used during the run.
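
The cleanup described above can be sketched as follows: mlflow keeps a stack of active (possibly nested) runs, and each one is ended in turn.

    import mlflow

    # Sketch of the cleanup: end every active run (nested runs included)
    # so the next execution starts from a clean state.
    def close_all_runs() -> None:
        while mlflow.active_run() is not None:
            mlflow.end_run()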

sanitize_param_name(name: str) → str#
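
No docstring is exposed for this helper. Given mlflow's restrictions on parameter key characters, a plausible reading is a substitution along these lines; the exact character set kedro-mlflow uses is an assumption:

    import re

    # Hypothetical sanitizer: mlflow param keys accept alphanumerics and a
    # few punctuation characters, so everything else is replaced. The exact
    # rule kedro-mlflow applies may differ.
    def sanitize_param_name(name: str) -> str:
        return re.sub(r"[^a-zA-Z0-9_\-\. /]", "_", name)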
kedro_mlflow.framework.hooks.mlflow_hook.is_windows()#
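
is_windows is a small platform guard; an assumed equivalent (not a verified copy of the implementation):

    import sys

    # Assumed equivalent of the helper: True when running on Windows.
    def is_windows() -> bool:
        return sys.platform.startswith("win")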