Datasets

Artifact DataSet

class kedro_mlflow.io.artifacts.mlflow_artifact_dataset.MlflowArtifactDataSet(data_set: Union[str, Dict], run_id: Optional[str] = None, artifact_path: Optional[str] = None, credentials: Optional[Dict[str, Any]] = None)

Bases: kedro.io.core.AbstractVersionedDataSet

This class wraps any Kedro AbstractDataSet. It decorates the wrapped dataset's save method so that the dataset is also logged in MLflow whenever save is called.
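The wrapping behaviour described above can be pictured with a small, library-free sketch. The names LocalTextDataSet, with_artifact_logging and logged are hypothetical stand-ins and not part of kedro-mlflow; a real setup would hand the saved file to MLflow rather than append its path to a list.

```python
import tempfile
from pathlib import Path


class LocalTextDataSet:
    """Hypothetical stand-in for a Kedro dataset that writes text to disk."""

    def __init__(self, filepath: str):
        self._filepath = Path(filepath)

    def save(self, data: str) -> None:
        self._filepath.write_text(data)


def with_artifact_logging(dataset_cls, log_artifact):
    """Return a subclass whose save() also reports the saved file.

    log_artifact is any callable taking a local path. It is an assumption
    of this sketch, standing in for a call to the MLflow tracking API.
    """

    class LoggingDataSet(dataset_cls):
        def save(self, data) -> None:
            super().save(data)  # persist locally, exactly as the wrapped class does
            log_artifact(str(self._filepath))  # then report the resulting file

    return LoggingDataSet


logged = []  # paths that would have been sent to the tracking server
LoggedText = with_artifact_logging(LocalTextDataSet, logged.append)

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "report.txt"
    LoggedText(str(target)).save("hello")
    saved_text = target.read_text()
```

The point of the pattern is that the wrapped dataset keeps its own save logic untouched; logging is layered on top as a second step.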

Metrics DataSet

class kedro_mlflow.io.metrics.mlflow_metric_dataset.MlflowMetricDataSet(key: Optional[str] = None, run_id: Optional[str] = None, load_args: Optional[Dict[str, Any]] = None, save_args: Optional[Dict[str, Any]] = None)

Bases: kedro_mlflow.io.metrics.mlflow_abstract_metric_dataset.MlflowAbstractMetricDataSet

DEFAULT_SAVE_MODE = 'overwrite'
SUPPORTED_SAVE_MODES = {'append', 'overwrite'}
__init__(key: Optional[str] = None, run_id: Optional[str] = None, load_args: Optional[Dict[str, Any]] = None, save_args: Optional[Dict[str, Any]] = None)

Initialise MlflowMetricDataSet.

Parameters
  • run_id (str) – The ID of the MLflow run where the metric should be logged.
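The two class attributes DEFAULT_SAVE_MODE and SUPPORTED_SAVE_MODES suggest that the save mode is chosen through save_args and validated against the supported set. A minimal sketch of that check follows; the helper resolve_save_mode and the "mode" key are assumptions made for illustration, not confirmed parts of the library.

```python
from typing import Any, Dict, Optional

# Values copied from the class attributes documented above.
DEFAULT_SAVE_MODE = "overwrite"
SUPPORTED_SAVE_MODES = {"append", "overwrite"}


def resolve_save_mode(save_args: Optional[Dict[str, Any]] = None) -> str:
    """Pick the save mode from save_args, falling back to the default.

    The "mode" key is an assumption of this sketch.
    """
    mode = (save_args or {}).get("mode", DEFAULT_SAVE_MODE)
    if mode not in SUPPORTED_SAVE_MODES:
        raise ValueError(
            f"mode must be one of {sorted(SUPPORTED_SAVE_MODES)}, got {mode!r}"
        )
    return mode
```

Calling resolve_save_mode() with no arguments yields the default, while an unsupported value fails early instead of producing a confusing error at save time.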

class kedro_mlflow.io.metrics.mlflow_metric_history_dataset.MlflowMetricHistoryDataSet(key: Optional[str] = None, run_id: Optional[str] = None, load_args: Optional[Dict[str, Any]] = None, save_args: Optional[Dict[str, Any]] = None)

Bases: kedro_mlflow.io.metrics.mlflow_abstract_metric_dataset.MlflowAbstractMetricDataSet

__init__(key: Optional[str] = None, run_id: Optional[str] = None, load_args: Optional[Dict[str, Any]] = None, save_args: Optional[Dict[str, Any]] = None)

Initialise MlflowMetricHistoryDataSet.

Parameters
  • run_id (str) – The ID of the MLflow run where the metric should be logged.

class kedro_mlflow.io.metrics.mlflow_metrics_dataset.MlflowMetricsDataSet(run_id: str = None, prefix: Optional[str] = None)

Bases: kedro.io.core.AbstractDataSet

This class represents an MLflow metrics dataset.

__init__(run_id: str = None, prefix: Optional[str] = None)

Initialise MlflowMetricsDataSet.

Parameters
  • prefix (Optional[str]) – Prefix for metrics logged in MLflow.

  • run_id (str) – ID of MLflow run.

Deprecated since version 0.7.3: This will be removed in 0.8.0. Use ‘MlflowMetricDataSet’ (for a single metric) or ‘MlflowMetricHistoryDataSet’ (for the metric evolution over time) instead.

property run_id

Get run id.

If no active run is found, tries to find the last experiment.

Raises a DataSetError exception if the run ID can’t be found.

Returns

String containing the run ID.

Return type

str
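The lookup order described for run_id can be sketched as a simple fallback chain. The function resolve_run_id, its three arguments and the local DataSetError class are hypothetical; in particular, treating an explicitly configured run ID as the first candidate is an assumption of this sketch.

```python
from typing import Optional


class DataSetError(Exception):
    """Local stand-in for kedro.io.core.DataSetError, so the sketch has no dependencies."""


def resolve_run_id(
    explicit: Optional[str],
    active: Optional[str],
    last_run: Optional[str],
) -> str:
    """Return the first available run ID, or raise if none can be found."""
    for candidate in (explicit, active, last_run):
        if candidate is not None:
            return candidate
    raise DataSetError("Cannot find run id.")
```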

Models DataSet

class kedro_mlflow.io.models.mlflow_abstract_model_dataset.MlflowAbstractModelDataSet(filepath: str, flavor: str, pyfunc_workflow: Optional[str] = None, load_args: Optional[Dict[str, Any]] = None, save_args: Optional[Dict[str, Any]] = None, version: Optional[kedro.io.core.Version] = None)

Bases: kedro.io.core.AbstractVersionedDataSet

Abstract base class for model datasets.

__init__(filepath: str, flavor: str, pyfunc_workflow: Optional[str] = None, load_args: Optional[Dict[str, Any]] = None, save_args: Optional[Dict[str, Any]] = None, version: Optional[kedro.io.core.Version] = None) → None

Initialize the Kedro MlflowModelDataSet.

Parameters are passed from the Data Catalog.

During save, the model is first logged to MLflow. During load, the model is pulled from MLflow run with run_id.

Parameters
  • filepath (str) – Path to store the dataset locally.

  • flavor (str) – Built-in or custom MLflow model flavor module. Must be Python-importable.

  • pyfunc_workflow (str, optional) – Either python_model or loader_module. See https://www.mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#workflows.

  • load_args (Dict[str, Any], optional) – Arguments to load_model function from specified flavor. Defaults to {}.

  • save_args (Dict[str, Any], optional) – Arguments to log_model function from specified flavor. Defaults to {}.

  • version (Version, optional) – Specific version to load.

Raises

DataSetError – When passed flavor does not exist.
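Since flavor must be a Python-importable module and a missing flavor raises DataSetError, the check can be pictured as an import attempt that converts an import failure into a dataset error. This is a sketch under that assumption: import_flavor and the local DataSetError class are hypothetical, and the standard-library json module is used only as an importable placeholder where a real flavor such as mlflow.sklearn would go.

```python
import importlib
from types import ModuleType


class DataSetError(Exception):
    """Local stand-in for kedro.io.core.DataSetError, so the sketch has no dependencies."""


def import_flavor(flavor: str) -> ModuleType:
    """Import the module named by flavor, or raise DataSetError if it is missing."""
    try:
        return importlib.import_module(flavor)
    except ImportError as err:
        # ModuleNotFoundError is a subclass of ImportError, so it is caught here too.
        raise DataSetError(f"'{flavor}' is not an importable flavor module.") from err


flavor_module = import_flavor("json")  # placeholder module, not an MLflow flavor
```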

class kedro_mlflow.io.models.mlflow_model_logger_dataset.MlflowModelLoggerDataSet(flavor: str, run_id: Optional[str] = None, artifact_path: Optional[str] = 'model', pyfunc_workflow: Optional[str] = None, load_args: Optional[Dict[str, Any]] = None, save_args: Optional[Dict[str, Any]] = None)

Bases: kedro_mlflow.io.models.mlflow_abstract_model_dataset.MlflowAbstractModelDataSet

Wrapper for saving, logging and loading models of any MLflow model flavor.

__init__(flavor: str, run_id: Optional[str] = None, artifact_path: Optional[str] = 'model', pyfunc_workflow: Optional[str] = None, load_args: Optional[Dict[str, Any]] = None, save_args: Optional[Dict[str, Any]] = None) → None

Initialize the Kedro MlflowModelDataSet.

Parameters are passed from the Data Catalog.

During save, the model is first logged to MLflow. During load, the model is pulled from MLflow run with run_id.

Parameters
  • flavor (str) – Built-in or custom MLflow model flavor module. Must be Python-importable.

  • run_id (Optional[str], optional) – MLflow run ID to use to load the model from or save the model to. Defaults to None.

  • artifact_path (str, optional) – Run-relative path to the model. Defaults to 'model'.

  • pyfunc_workflow (str, optional) – Either python_model or loader_module. See https://www.mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#workflows.

  • load_args (Dict[str, Any], optional) – Arguments to load_model function from specified flavor. Defaults to None.

  • save_args (Dict[str, Any], optional) – Arguments to log_model function from specified flavor. Defaults to None.

Raises

DataSetError – When passed flavor does not exist.
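Because the model is pulled from a run identified by run_id at the run-relative artifact_path, the two values together locate the model. MLflow addresses run artifacts with URIs of the form runs:/<run_id>/<artifact_path>; the helper below is a hypothetical illustration of that scheme, and it is an assumption that the dataset builds its URI in exactly this way.

```python
def build_model_uri(run_id: str, artifact_path: str = "model") -> str:
    """Compose an MLflow run-artifact URI from a run ID and a run-relative path."""
    return f"runs:/{run_id}/{artifact_path}"


# "abc123" is a made-up run ID used only for illustration.
example_uri = build_model_uri("abc123")
```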

property model_uri

class kedro_mlflow.io.models.mlflow_model_saver_dataset.MlflowModelSaverDataSet(filepath: str, flavor: str, pyfunc_workflow: Optional[str] = None, load_args: Optional[Dict[str, Any]] = None, save_args: Optional[Dict[str, Any]] = None, log_args: Optional[Dict[str, Any]] = None, version: Optional[kedro.io.core.Version] = None)

Bases: kedro_mlflow.io.models.mlflow_abstract_model_dataset.MlflowAbstractModelDataSet

Wrapper for saving, logging and loading models of any MLflow model flavor.

__init__(filepath: str, flavor: str, pyfunc_workflow: Optional[str] = None, load_args: Optional[Dict[str, Any]] = None, save_args: Optional[Dict[str, Any]] = None, log_args: Optional[Dict[str, Any]] = None, version: Optional[kedro.io.core.Version] = None) → None

Initialize the Kedro MlflowModelDataSet.

Parameters are passed from the Data Catalog.

During save, the model is saved locally at filepath. During load, the model is loaded from the local filepath.

Parameters
  • flavor (str) – Built-in or custom MLflow model flavor module. Must be Python-importable.

  • filepath (str) – Path to store the dataset locally.

  • pyfunc_workflow (str, optional) – Either python_model or loader_module. See https://www.mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#workflows.

  • load_args (Dict[str, Any], optional) – Arguments to load_model function from specified flavor. Defaults to None.

  • save_args (Dict[str, Any], optional) – Arguments to save_model function from specified flavor. Defaults to None.

  • version (Version, optional) – Kedro version to use. Defaults to None.

Raises

DataSetError – When passed flavor does not exist.
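The pyfunc_workflow parameter accepts only python_model or loader_module (or is left unset). A minimal sketch of validating that argument follows; check_pyfunc_workflow is a hypothetical helper, and raising ValueError is a choice of this sketch rather than documented library behaviour.

```python
from typing import Optional

# The two workflow names listed in the parameter description above.
PYFUNC_WORKFLOWS = ("python_model", "loader_module")


def check_pyfunc_workflow(pyfunc_workflow: Optional[str]) -> Optional[str]:
    """Return the value unchanged if it is None or a known workflow name."""
    if pyfunc_workflow is not None and pyfunc_workflow not in PYFUNC_WORKFLOWS:
        raise ValueError(
            f"pyfunc_workflow must be one of {PYFUNC_WORKFLOWS}, got {pyfunc_workflow!r}"
        )
    return pyfunc_workflow
```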