# Storage


The Prefect Storage interface encapsulates logic for storing, serializing and even running Flows. Each storage unit is able to store multiple flows (possibly with the constraint of name uniqueness within a given unit), and exposes the following methods and attributes:

  • a name attribute
  • a flows attribute that is a dictionary of Flows -> location
  • an add_flow(flow: Flow) -> str method for adding flows to Storage, and that will return the location of the given flow in the Storage unit
  • the __contains__(self, obj) -> bool special method for determining whether the Storage contains a given Flow
  • one of get_flow(flow_location: str) or get_env_runner(flow_location: str) for retrieving a way of interfacing with either flow.run or a FlowRunner for the flow; get_env_runner is intended for situations where flow execution can only be interacted with via environment variables
  • a build() -> Storage method for "building" the storage
  • a serialize() -> dict method for serializing the relevant information about this Storage for later re-use.
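
The interface above can be sketched with a small in-memory class. This is only an illustration under stated assumptions, not Prefect's implementation: `InMemoryStorage`, `FakeFlow`, and the `memory://` location scheme are hypothetical names, and the `flows` dictionary is assumed to be keyed by flow name.

```python
from typing import Any, Dict


class InMemoryStorage:
    """Hypothetical storage unit that keeps flows in a dict (illustration only)."""

    def __init__(self, name: str = "in-memory") -> None:
        self.name = name
        self.flows: Dict[str, str] = {}      # flow name -> location (assumed keying)
        self._objects: Dict[str, Any] = {}   # location -> stored flow object

    def add_flow(self, flow: Any) -> str:
        # Enforce name uniqueness within this storage unit.
        if flow.name in self.flows:
            raise ValueError(f"A flow named {flow.name!r} is already stored.")
        location = f"memory://{self.name}/{flow.name}"
        self.flows[flow.name] = location
        self._objects[location] = flow
        return location

    def __contains__(self, obj: Any) -> bool:
        return getattr(obj, "name", None) in self.flows

    def get_flow(self, flow_location: str) -> Any:
        if flow_location not in self._objects:
            raise ValueError("Flow is not contained in this storage.")
        return self._objects[flow_location]

    def build(self) -> "InMemoryStorage":
        # Nothing needs building for an in-memory store, so return self.
        return self

    def serialize(self) -> dict:
        return {"type": type(self).__name__, "name": self.name, "flows": dict(self.flows)}


class FakeFlow:
    """Stand-in for a Prefect Flow; only the name attribute matters here."""

    def __init__(self, name: str) -> None:
        self.name = name

    def run(self) -> str:
        return f"ran {self.name}"


storage = InMemoryStorage()
etl = FakeFlow("etl")
location = storage.add_flow(etl)
```

With this sketch, `etl in storage` is true, and `storage.get_flow(location)` hands back an object whose `run` method can be called.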

# Docker

class

prefect.environments.storage.docker.Docker

(registry_url=None, base_image=None, dockerfile=None, python_dependencies=None, image_name=None, image_tag=None, env_vars=None, files=None, prefect_version=None, local_image=False, ignore_healthchecks=False, secrets=None, base_url=None, tls_config=False, build_kwargs=None)[source]

Docker storage provides a mechanism for storing Prefect flows in Docker images and optionally pushing them to a registry.

A user specifies a registry_url, base_image and other optional dependencies (e.g., python_dependencies) and build() will create a temporary Dockerfile that is used to build the image.

Note that the base_image must be capable of pip installing. Also note that registry behavior with respect to image names can differ between providers: for example, Google's GCR registry allows for registry URLs of the form gcr.io/my-registry/subdir/my-image-name, whereas DockerHub requires the registry URL to be separate from the image name.
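
To make the naming difference concrete, the following sketch joins a registry URL, image name, and tag into a single image reference. `full_image_name` is a hypothetical helper, and the DockerHub-style value `my-dockerhub-user` is an assumed placeholder; Prefect's own composition logic may differ.

```python
from typing import Optional


def full_image_name(registry_url: Optional[str], image_name: str, image_tag: str) -> str:
    """Join registry URL, image name and tag into one reference (hypothetical helper)."""
    if registry_url:
        repository = f"{registry_url.rstrip('/')}/{image_name}"
    else:
        # Without a registry the image only exists locally.
        repository = image_name
    return f"{repository}:{image_tag}"


# GCR-style: the registry URL may itself contain sub-paths.
gcr_ref = full_image_name("gcr.io/my-registry/subdir", "my-image-name", "v1")
# DockerHub-style: the registry part is kept separate from the image name.
hub_ref = full_image_name("my-dockerhub-user", "my-image-name", "v1")
# No registry: nothing to push to.
local_ref = full_image_name(None, "my-image-name", "v1")
```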

Args:

  • registry_url (str, optional): URL of a registry to push the image to; image will not be pushed if not provided
  • base_image (str, optional): the base image for this environment (e.g. python:3.6), defaults to the prefecthq/prefect image matching your python version and prefect core library version used at runtime.
  • dockerfile (str, optional): a path to a Dockerfile to use in building this storage; note that, if provided, your present working directory will be used as the build context
  • python_dependencies (List[str], optional): list of pip installable dependencies for the image
  • image_name (str, optional): name of the image to use when building, populated with a UUID after build
  • image_tag (str, optional): tag of the image to use when building, populated with a UUID after build
  • env_vars (dict, optional): a dictionary of environment variables to use when building
  • files (dict, optional): a dictionary of files to copy into the image when building
  • prefect_version (str, optional): an optional branch, tag, or commit specifying the version of prefect you want installed into the container; defaults to the version you are currently using or "master" if your version is ahead of the latest tag
  • local_image (bool, optional): whether to use a local Docker image; if True, a pull will not be attempted
  • ignore_healthchecks (bool, optional): if True, the Docker healthchecks are not added to the Dockerfile. If False (default), healthchecks are included.
  • secrets (List[str], optional): a list of Prefect Secrets which will be used to populate prefect.context for each flow run. Used primarily for providing authentication credentials.
  • base_url (str, optional): a URL of a Docker daemon to use for Docker-related functionality. Defaults to the DOCKER_HOST env var if not set
  • tls_config (Union[bool, docker.tls.TLSConfig], optional): a TLS configuration to pass to the Docker client (see the Docker SDK for Python documentation)
  • build_kwargs (dict, optional): additional keyword arguments to pass to Docker's build step (see the Docker SDK for Python documentation)
Raises:
  • ValueError: if both base_image and dockerfile are provided

methods:                                                                                                                                                       

prefect.environments.storage.docker.Docker.add_flow

(flow)[source]

Method for adding a new flow to this Storage object.

Args:

  • flow (Flow): a Prefect Flow to add
Returns:
  • str: the location of the newly added flow in this Storage object

prefect.environments.storage.docker.Docker.build

(push=True)[source]

Build the Docker storage object. If image name and tag are not set, they will be autogenerated.

Args:

  • push (bool, optional): whether or not to push the built Docker image; pushing requires registry_url to be set
Returns:
  • Docker: a new Docker storage object that contains information about how and where the flow is stored. Image name and tag are generated during the build process.
Raises:
  • InterruptedError: if either pushing or pulling the image fails

prefect.environments.storage.docker.Docker.create_dockerfile_object

(directory)[source]

Writes a dockerfile to the provided directory using the specified arguments on this Docker storage object.

In order for the docker python library to build a container it needs a Dockerfile that it can use to define the container. This function takes the specified arguments then writes them to a temporary file called Dockerfile.

Note: if files are added to this container, they will be copied to this directory as well.

Args:

  • directory (str, optional): a directory where the Dockerfile will be created; if no directory is specified, it will be created in the current working directory
Returns:
  • str: the absolute file path to the Dockerfile
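
As a rough illustration of the idea, the sketch below writes a small Dockerfile into a directory and returns its absolute path. The helper `write_dockerfile` and the exact instructions it emits are assumptions for demonstration; the Dockerfile Prefect generates contains more than this (for example, flow files and healthchecks).

```python
import os
import tempfile
from typing import Dict, List


def write_dockerfile(directory: str, base_image: str,
                     python_dependencies: List[str], env_vars: Dict[str, str]) -> str:
    """Write a simple Dockerfile into `directory` and return its absolute path."""
    lines = [f"FROM {base_image}"]
    for key, value in env_vars.items():
        lines.append(f'ENV {key}="{value}"')
    for dependency in python_dependencies:
        # The base image must be able to pip install for this step to work.
        lines.append(f"RUN pip install {dependency}")
    path = os.path.abspath(os.path.join(directory, "Dockerfile"))
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return path


with tempfile.TemporaryDirectory() as tmp:
    dockerfile_path = write_dockerfile(tmp, "python:3.6", ["requests"], {"MY_VAR": "1"})
    with open(dockerfile_path) as handle:
        dockerfile_text = handle.read()
```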

prefect.environments.storage.docker.Docker.get_env_runner

(flow_location)[source]

Given a flow_location within this Storage object, returns something with a run() method which accepts the standard runner kwargs and can run the flow.

Args:

  • flow_location (str): the location of a flow within this Storage
Returns:
  • a runner interface (something with a run() method for running the flow)
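
The "something with a run() method" contract can be expressed as a structural type. In this sketch, `Runner`, `EchoRunner`, the free function `get_env_runner`, and the location `/flows/etl.flow` are all hypothetical; the point is only that callers depend on `run(**kwargs)` and nothing else.

```python
from typing import Any, Dict, Protocol


class Runner(Protocol):
    """Anything exposing run(**kwargs) satisfies this interface."""

    def run(self, **kwargs: Any) -> Any:
        ...


class EchoRunner:
    """Hypothetical runner that only reports what it was asked to run."""

    def __init__(self, flow_location: str) -> None:
        self.flow_location = flow_location

    def run(self, **kwargs: Any) -> Dict[str, Any]:
        return {"flow_location": self.flow_location, "kwargs": kwargs}


def get_env_runner(flow_location: str) -> Runner:
    # A real storage class would return something that starts the flow;
    # this stand-in just wraps the location.
    return EchoRunner(flow_location)


result = get_env_runner("/flows/etl.flow").run(parameters={"x": 1})
```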

prefect.environments.storage.docker.Docker.get_flow

(flow_location)[source]

Given a file path within this Docker container, returns the underlying Flow. Note that this method should only be run within the container itself.

Args:

  • flow_location (str): the file path of a flow within this container
Returns:
  • Flow: the requested flow

prefect.environments.storage.docker.Docker.pull_image

()[source]

Pull the image specified so it can be built.

In order for the docker python library to use a base image it must be pulled from either the main docker registry or a separate registry that must be set as registry_url on this class.

Raises:

  • InterruptedError: if pulling the image fails

prefect.environments.storage.docker.Docker.push_image

(image_name, image_tag)[source]

Push this environment to a registry.

Args:

  • image_name (str): Name for the image
  • image_tag (str): Tag for the image
Raises:
  • InterruptedError: if pushing the image fails



# Local

class

prefect.environments.storage.local.Local

(directory=None, validate=True, secrets=None)[source]

Local storage class. This class represents the Storage interface for Flows stored as bytes in the local filesystem.

Note that if you register a Flow with Prefect Cloud using this storage, your flow's environment will automatically be labeled with your machine's hostname. This ensures that only agents that are known to be running on the same filesystem can run your flow.

Args:

  • directory (str, optional): the directory the flows will be stored in; defaults to ~/.prefect/flows. If it doesn't already exist, it will be created for you.
  • validate (bool, optional): a boolean specifying whether to validate the provided directory path; if True, the directory will be converted to an absolute path and created. Defaults to True
  • secrets (List[str], optional): a list of Prefect Secrets which will be used to populate prefect.context for each flow run. Used primarily for providing authentication credentials.
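
The effect of validate can be sketched as follows. `prepare_directory` is a hypothetical helper that mirrors the behavior described above (convert to an absolute path and create the directory); it is not the library's code.

```python
import os
import tempfile


def prepare_directory(directory: str, validate: bool = True) -> str:
    """Return the directory to use, normalizing and creating it when validate is True."""
    if validate:
        directory = os.path.abspath(directory)
        os.makedirs(directory, exist_ok=True)
    return directory


with tempfile.TemporaryDirectory() as tmp:
    flows_dir = prepare_directory(os.path.join(tmp, "flows"))
    created = os.path.isdir(flows_dir)

# With validate=False the path is passed through untouched and nothing is created.
untouched = prepare_directory("relative/flows", validate=False)
```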

methods:                                                                                                                                                       

prefect.environments.storage.local.Local.add_flow

(flow)[source]

Method for storing a new flow as bytes in the local filesystem.

Args:

  • flow (Flow): a Prefect Flow to add
Returns:
  • str: the location of the newly added flow in this Storage object
Raises:
  • ValueError: if a flow with the same name is already contained in this storage

prefect.environments.storage.local.Local.build

()[source]

Build the Storage object.

Returns:

  • Storage: a Storage object that contains information about how and where each flow is stored

prefect.environments.storage.local.Local.get_flow

(flow_location)[source]

Given a flow_location within this Storage object, returns the underlying Flow (if possible).

Args:

  • flow_location (str): the location of a flow within this Storage; in this case, a file path where a Flow has been serialized to
Returns:
  • Flow: the requested flow
Raises:
  • ValueError: if the flow is not contained in this storage
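
A minimal sketch of the add_flow / get_flow round trip, assuming flows are serialized with the standard library's pickle and represented here by plain dictionaries. `FileStorage` and the `.pickle` file naming are hypothetical, and the serializer Prefect actually uses may differ.

```python
import os
import pickle
import tempfile
from typing import Any, Dict


class FileStorage:
    """Hypothetical file-backed storage that writes each flow to its own file."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        self.flows: Dict[str, str] = {}  # flow name -> file path

    def add_flow(self, flow: Dict[str, Any]) -> str:
        if flow["name"] in self.flows:
            raise ValueError(f"Name conflict: {flow['name']!r} is already stored.")
        location = os.path.join(self.directory, f"{flow['name']}.pickle")
        with open(location, "wb") as handle:
            pickle.dump(flow, handle)
        self.flows[flow["name"]] = location
        return location

    def get_flow(self, flow_location: str) -> Dict[str, Any]:
        if flow_location not in self.flows.values():
            raise ValueError("Flow is not contained in this storage.")
        with open(flow_location, "rb") as handle:
            return pickle.load(handle)


with tempfile.TemporaryDirectory() as tmp:
    file_storage = FileStorage(tmp)
    flow_path = file_storage.add_flow({"name": "etl", "tasks": ["extract", "load"]})
    restored = file_storage.get_flow(flow_path)
    try:
        file_storage.add_flow({"name": "etl", "tasks": []})
        duplicate_rejected = False
    except ValueError:
        duplicate_rejected = True
```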



# S3

class

prefect.environments.storage.s3.S3

(bucket, client_options=None, key=None, secrets=None)[source]

S3 storage class. This class represents the Storage interface for Flows stored as bytes in an S3 bucket.

This storage class optionally takes a key which will be the name of the Flow object when stored in S3. If this key is not provided the Flow upload name will take the form slugified-flow-name/slugified-current-timestamp.

Note: Flows registered with this Storage option will automatically be labeled with s3-flow-storage.
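
The default upload name can be approximated as shown below. The `slugify` function here is a simplified, hypothetical stand-in written with the standard library, so the exact characters produced by Prefect's own slugification and timestamp format may differ.

```python
import re
from datetime import datetime, timezone


def slugify(text: str) -> str:
    """Lowercase the text and collapse runs of non-alphanumerics into single dashes."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def default_key(flow_name: str, now: datetime) -> str:
    """Build a key of the form slugified-flow-name/slugified-current-timestamp."""
    return f"{slugify(flow_name)}/{slugify(now.isoformat())}"


example_key = default_key(
    "My ETL Flow", datetime(2020, 6, 17, 17, 27, tzinfo=timezone.utc)
)
# example_key == "my-etl-flow/2020-06-17t17-27-00-00-00"
```

Because the timestamp is part of the key, each build produces a new object instead of overwriting an earlier upload; passing an explicit key trades that for a stable, predictable name.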

Args:

  • bucket (str): the name of the S3 Bucket to store Flows
  • key (str, optional): a unique key to use for uploading a Flow to S3. This is only useful when storing a single Flow using this storage object.
  • client_options (dict, optional): Additional options for the boto3 client.
  • secrets (List[str], optional): a list of Prefect Secrets which will be used to populate prefect.context for each flow run. Used primarily for providing authentication credentials.

methods:                                                                                                                                                       

prefect.environments.storage.s3.S3.add_flow

(flow)[source]

Method for storing a new flow as bytes in an S3 bucket.

Args:

  • flow (Flow): a Prefect Flow to add
Returns:
  • str: the location of the newly added flow in this Storage object
Raises:
  • ValueError: if a flow with the same name is already contained in this storage

prefect.environments.storage.s3.S3.build

()[source]

Build the S3 storage object by uploading Flows to an S3 bucket. This will upload all of the flows found in storage.flows. If there is an issue uploading to the S3 bucket an error will be logged.

Returns:

  • Storage: an S3 object that contains information about how and where each flow is stored
Raises:
  • botocore.ClientError: if there is an issue uploading a Flow to S3

prefect.environments.storage.s3.S3.get_flow

(flow_location)[source]

Given a flow_location within this Storage object, returns the underlying Flow (if possible). If the Flow is not found an error will be logged and None will be returned.

Args:

  • flow_location (str): the location of a flow within this Storage; in this case, a file path where a Flow has been serialized to
Returns:
  • Flow: the requested Flow
Raises:
  • ValueError: if the Flow is not contained in this storage
  • botocore.ClientError: if there is an issue downloading the Flow from S3



# GCS

class

prefect.environments.storage.gcs.GCS

(bucket, key=None, project=None, secrets=None)[source]

Google Cloud Storage (GCS) storage class. This class represents the Storage interface for Flows stored as bytes in a GCS bucket. To authenticate with Google Cloud, you need to ensure that your Prefect Agent has the proper credentials available (see https://cloud.google.com/docs/authentication/production for all the authentication options).

This storage class optionally takes a key which will be the name of the Flow object when stored in GCS. If this key is not provided the Flow upload name will take the form slugified-flow-name/slugified-current-timestamp.

Note: Flows registered with this Storage option will automatically be labeled with gcs-flow-storage.

Args:

  • bucket (str): the name of the GCS Bucket to store the Flow
  • key (str, optional): a unique key to use for uploading this Flow to GCS. This is only useful when storing a single Flow using this storage object.
  • project (str, optional): the google project where any GCS API requests are billed to; if not provided, the project will be inferred from your Google Cloud credentials.
  • secrets (List[str], optional): a list of Prefect Secrets which will be used to populate prefect.context for each flow run. Used primarily for providing authentication credentials.

methods:                                                                                                                                                       

prefect.environments.storage.gcs.GCS.add_flow

(flow)[source]

Method for storing a new flow as bytes in a GCS bucket.

Args:

  • flow (Flow): a Prefect Flow to add
Returns:
  • str: the key of the newly added flow in the GCS bucket
Raises:
  • ValueError: if a flow with the same name is already contained in this storage

prefect.environments.storage.gcs.GCS.build

()[source]

Build the GCS storage object by uploading Flows to a GCS bucket. This will upload all of the flows found in storage.flows.

Returns:

  • Storage: a GCS object that contains information about how and where each flow is stored

prefect.environments.storage.gcs.GCS.get_flow

(flow_location)[source]

Given a flow_location within this Storage object, returns the underlying Flow (if possible).

Args:

  • flow_location (str): the location of a flow within this Storage; in this case, a file path where a Flow has been serialized to
Returns:
  • Flow: the requested flow
Raises:
  • ValueError: if the flow is not contained in this storage



# Azure

class

prefect.environments.storage.azure.Azure

(container, connection_string=None, blob_name=None, secrets=None)[source]

Azure Blob storage class. This class represents the Storage interface for Flows stored as bytes in an Azure container.

This storage class optionally takes a blob_name which will be the name of the Flow object when stored in Azure. If blob_name is not provided, the Flow upload name will take the form slugified-flow-name/slugified-current-timestamp.

Note: Flows registered with this Storage option will automatically be labeled with azure-flow-storage.

Args:

  • container (str): the name of the Azure Blob Container to store the Flow
  • connection_string (str, optional): an Azure connection string for communicating with Blob storage. If not provided the value set in the environment as AZURE_STORAGE_CONNECTION_STRING will be used
  • blob_name (str, optional): a unique key to use for uploading this Flow to Azure. This is only useful when storing a single Flow using this storage object.
  • secrets (List[str], optional): a list of Prefect Secrets which will be used to populate prefect.context for each flow run. Used primarily for providing authentication credentials.
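
The connection_string fallback described above can be sketched like this. `resolve_connection_string` is a hypothetical helper, and raising ValueError when neither source is set is an assumption made for the example, not documented behavior.

```python
import os
from typing import Mapping, Optional


def resolve_connection_string(connection_string: Optional[str] = None,
                              environ: Mapping[str, str] = os.environ) -> str:
    """Prefer the explicit argument, then fall back to the environment variable."""
    value = connection_string or environ.get("AZURE_STORAGE_CONNECTION_STRING")
    if not value:
        # Assumed behavior for this sketch only.
        raise ValueError("No Azure connection string was provided or found in the environment.")
    return value


# An explicit argument wins; the environment is only consulted when it is missing.
explicit = resolve_connection_string("UseDevelopmentStorage=true", environ={})
from_env = resolve_connection_string(
    environ={"AZURE_STORAGE_CONNECTION_STRING": "from-env"}
)
```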

methods:                                                                                                                                                       

prefect.environments.storage.azure.Azure.add_flow

(flow)[source]

Method for storing a new flow as bytes in an Azure Blob container.

Args:

  • flow (Flow): a Prefect Flow to add
Returns:
  • str: the key of the newly added Flow in the container
Raises:
  • ValueError: if a flow with the same name is already contained in this storage

prefect.environments.storage.azure.Azure.build

()[source]

Build the Azure storage object by uploading Flows to an Azure Blob container. This will upload all of the flows found in storage.flows.

Returns:

  • Storage: an Azure object that contains information about how and where each flow is stored

prefect.environments.storage.azure.Azure.get_flow

(flow_location)[source]

Given a flow_location within this Storage object, returns the underlying Flow (if possible).

Args:

  • flow_location (str): the location of a flow within this Storage; in this case, a file path where a Flow has been serialized to
Returns:
  • Flow: the requested flow
Raises:
  • ValueError: if the flow is not contained in this storage



This documentation was auto-generated from commit n/a
on June 17, 2020 at 17:27 UTC