kedro-mlflow

Getting started

  • Introduction
    • Kedro vs Mlflow
      • What is Kedro?
      • What is Mlflow?
      • A brief comparison between Kedro and Mlflow
        • Configuration and prototyping: Kedro 1 - 0 Mlflow
        • Versioning: Kedro 1 - 1 Mlflow
        • Model packaging and service: Kedro 1 - 2 Mlflow
        • Conclusion: Use Kedro and add Mlflow for machine learning projects
    • Motivation behind the plugin
      • When should I use kedro-mlflow?
      • Why should I use kedro-mlflow?
        • Benchmark of existing solutions
        • Enforcing Kedro principles
  • Installation
    • Install the plugin
      • Pre-requisites
        • Create a virtual environment
        • Check your kedro version
      • Install the plugin
        • Install from PyPI
        • Install from sources
      • Check the installation
      • Available commands
    • Set up your kedro project
      • Create a kedro project
      • Activate kedro-mlflow in your kedro project
        • Setting up the kedro-mlflow configuration file
        • Declaring kedro-mlflow hooks
    • Migration guide between versions
      • Migration from 0.10.x to 0.11.x
      • Migration from 0.9.x to 0.10.x
      • Migration from 0.8.x to 0.9.x
      • Migration from 0.7.x to 0.8.x
      • Migration from 0.6.x to 0.7.x
      • Migration from 0.5.x to 0.6.x
      • Migration from 0.4.x to 0.5.x
      • Migration from 0.4.0 to 0.4.1
      • Migration from 0.3.x to 0.4.x
        • Catalog entries
        • Hooks
        • KedroPipelineModel
  • Quickstart in 1 min
    • Goal of the tutorial
    • Create an example project
      • Install the plugin in a virtual environment
      • Install the toy project
        • Installation with kedro>=0.19.0
        • Installation with kedro>=0.16.3
        • Installation with kedro>=0.16.0, <=0.16.2
      • Install dependencies
    • First steps with kedro-mlflow
      • Initialize kedro-mlflow
      • Run the pipeline
      • Open the UI
        • Parameters versioning
        • Artifacts
      • Going further

Experiment tracking

  • In a kedro project
    • Configure mlflow
      • Context: mlflow tracking under the hood
      • The mlflow.yml file
        • Configure the tracking server
        • Deactivate tracking under conditions
        • Configure mlflow experiment
        • Configure the run
        • Extra tracking configuration
        • Configure the user interface
      • Overwrite configuration at runtime
    • Version parameters
      • Automatic parameters versioning
      • How does MlflowHook operate under the hood?
      • Frequently asked questions
        • Will parameters be recorded if the pipeline fails during execution?
        • How are parameters detected by the plugin?
    • Version datasets
      • What is artifact tracking?
      • How to version data in a kedro project?
      • Frequently asked questions
        • Can I pass extra parameters to the MlflowArtifactDataset for finer control?
        • Can I use the MlflowArtifactDataset in interactive mode?
        • How do I upload an artifact to a non-local destination (e.g. S3 or blob storage)?
        • Can I log an artifact in a specific run?
        • Can I reload an artifact from an existing run to use it in another run?
        • Can I create a remote folder/subfolders architecture to organize the artifacts?
    • Version models
      • What is model tracking?
      • How to track models using MLflow in a Kedro project?
      • Frequently asked questions
        • How is it working under the hood?
        • How can I track a custom MLflow model flavor?
        • How can I save a model locally and log it in MLflow in one step?
    • Version metrics
      • What is metric tracking?
      • How to version metrics in a kedro project?
        • Saving a single float as a metric with MlflowMetricDataset
        • Saving the evolution of a metric during training with MlflowMetricHistoryDataset
        • Saving several metrics with their entire history with MlflowMetricsHistoryDataset
      • How to return metrics from a node?
    • Open the User Interface
      • The mlflow user interface
      • The kedro-mlflow helper
  • In a notebook
    • How to use in a notebook
      • Reminder on mlflow’s limitations with interactive use
      • Setup mlflow configuration in your notebook
      • Difference with running through the CLI
      • Guidelines and best practices suggestions

Pipeline serving

  • Custom mlflow model for kedro pipelines
    • Reminder on Mlflow Models
      • Introduction to Mlflow Models
      • Pre-requisite for serving a pipeline
    • Scikit-learn-like kedro pipelines with KedroPipelineModel
      • Getting started with pipeline_ml_factory
      • Advanced configuration for pipeline_ml_factory
        • Register the model as a new version in the mlflow registry
      • Complete step by step demo project with code
      • Motivation
    • Deployment patterns for KedroPipelineModel models
      • Deploying a KedroPipelineModel
        • Reuse from a python script
        • Reuse in a kedro pipeline
        • Serve the model with mlflow
      • Pass parameters at runtime to a Kedro PipelineModel
        • Pipeline parameters
        • Configuring the runner
    • Advanced logging for KedroPipelineModel
      • Log a pipeline to mlflow programmatically with the KedroPipelineModel custom mlflow model
      • Log a pipeline to mlflow with the CLI
  • An mlops framework for continuous model serving
    • Why we need an mlops framework for the development lifecycle
      • Machine learning deployment is hard because it comes with a lot of constraints and no adequate tooling
        • Identifying the challenges to address when deploying machine learning
        • A comparison between traditional software development and machine learning projects
      • Deployment issues addressed by kedro-mlflow and their solutions
        • Out of scope
        • Issue 1: The training process is poorly reproducible
        • Issue 2: The data scientist and stakeholders focus on training
        • Issue 3: Inference and training are entirely decoupled
        • Issue 4: Data scientists do not handle business objects
        • Overcoming these problems: support an organisational solution with an efficient tool
    • The architecture of a machine learning project
      • Definition: apps of a machine learning project
      • Difference between an app and a Kedro pipeline
      • Apps development lifecycle in a machine learning project
        • The data scientist creates at least part of the 3 apps
        • The etl_app
        • The ml_app
        • The user_app
    • A framework for training / inference synchronization
      • Reminder
      • Enforcing these principles with a dedicated tool
        • Synchronizing training and inference pipeline
        • Packaging and serving a Kedro Pipeline
        • kedro-mlflow’s magic: inference autologging
        • Reuse the model in kedro

Technical documentation

  • Python objects
    • DataSets
      • MlflowArtifactDataset
      • Metrics DataSets
        • MlflowMetricDataset
        • MlflowMetricHistoryDataset
      • Models DataSets
        • MlflowModelTrackingDataset
        • MlflowModelLocalFileSystemDataset
        • MlflowModelRegistryDataset
    • Hooks
      • MlflowHook
    • Pipelines
      • PipelineML and pipeline_ml_factory
    • CLI
      • init
      • ui
      • modelify
    • Configuration
  • API documentation
    • Datasets
      • Artifact DataSet
        • MlflowArtifactDataset
      • Metrics DataSet
        • MlflowMetricDataset
        • MlflowMetricHistoryDataset
        • MlflowMetricsHistoryDataset
      • Models DataSet
        • MlflowAbstractModelDataSet
        • MlflowModelTrackingDataset
        • MlflowModelLocalFileSystemDataset
        • MlflowModelRegistryDataset
    • CLI
      • init
      • ui
      • modelify
    • Pipelines
      • KedroMlflowPipelineMLError
      • PipelineML
        • PipelineML.KPM_KWARGS_DEFAULT
        • PipelineML.LOG_MODEL_KWARGS_DEFAULT
        • PipelineML.__init__()
        • PipelineML.filter()
        • PipelineML.from_inputs()
        • PipelineML.from_nodes()
        • PipelineML.inference
        • PipelineML.input_name
        • PipelineML.only_nodes()
        • PipelineML.only_nodes_with_inputs()
        • PipelineML.only_nodes_with_namespace()
        • PipelineML.only_nodes_with_outputs()
        • PipelineML.only_nodes_with_tags()
        • PipelineML.tag()
        • PipelineML.to_nodes()
        • PipelineML.to_outputs()
        • PipelineML.training
      • pipeline_ml_factory()
    • Custom Mlflow Models
      • KedroPipelineModel
        • KedroPipelineModel.__init__()
        • KedroPipelineModel.copy_mode
        • KedroPipelineModel.extract_pipeline_artifacts()
        • KedroPipelineModel.load_context()
        • KedroPipelineModel.predict()
      • KedroPipelineModelError
    • Configuration
      • DictParamsOptions
        • DictParamsOptions.Config
        • DictParamsOptions.flatten
        • DictParamsOptions.model_config
        • DictParamsOptions.recursive
        • DictParamsOptions.sep
      • DisableTrackingOptions
        • DisableTrackingOptions.Config
        • DisableTrackingOptions.model_config
        • DisableTrackingOptions.pipelines
      • ExperimentOptions
        • ExperimentOptions.Config
        • ExperimentOptions.model_config
        • ExperimentOptions.model_post_init()
        • ExperimentOptions.name
        • ExperimentOptions.restore_if_deleted
      • KedroMlflowConfig
        • KedroMlflowConfig.Config
        • KedroMlflowConfig.model_config
        • KedroMlflowConfig.server
        • KedroMlflowConfig.setup()
        • KedroMlflowConfig.tracking
        • KedroMlflowConfig.ui
      • MlflowParamsOptions
        • MlflowParamsOptions.Config
        • MlflowParamsOptions.dict_params
        • MlflowParamsOptions.long_params_strategy
        • MlflowParamsOptions.model_config
      • MlflowServerOptions
        • MlflowServerOptions.Config
        • MlflowServerOptions.credentials
        • MlflowServerOptions.mlflow_registry_uri
        • MlflowServerOptions.mlflow_tracking_uri
        • MlflowServerOptions.model_config
        • MlflowServerOptions.model_post_init()
        • MlflowServerOptions.request_header_provider
      • MlflowTrackingOptions
        • MlflowTrackingOptions.Config
        • MlflowTrackingOptions.disable_tracking
        • MlflowTrackingOptions.experiment
        • MlflowTrackingOptions.model_config
        • MlflowTrackingOptions.params
        • MlflowTrackingOptions.run
      • RequestHeaderProviderOptions
        • RequestHeaderProviderOptions.Config
        • RequestHeaderProviderOptions.init_kwargs
        • RequestHeaderProviderOptions.model_config
        • RequestHeaderProviderOptions.pass_context
        • RequestHeaderProviderOptions.type
      • RunOptions
        • RunOptions.Config
        • RunOptions.id
        • RunOptions.model_config
        • RunOptions.name
        • RunOptions.nested
      • UiOptions
        • UiOptions.Config
        • UiOptions.host
        • UiOptions.model_config
        • UiOptions.port
    • Hooks
      • Node Hook
        • MlflowHook
        • is_windows()

Introduction

  • Goal of the tutorial
  • Create an example project
    • Install the plugin in a virtual environment
    • Install the toy project
      • Installation with kedro>=0.19.0
      • Installation with kedro>=0.16.3
      • Installation with kedro>=0.16.0, <=0.16.2
    • Install dependencies
  • First steps with kedro-mlflow
    • Initialize kedro-mlflow
    • Run the pipeline
    • Open the UI
      • Parameters versioning
      • Artifacts
    • Going further

© Copyright 2020, Yolan Honoré-Rougé.
