# Google Cloud Tasks
Tasks that interface with various components of Google Cloud Platform.
Note that these tasks allow for a wide range of custom usage patterns, such as:
- Initialize a task with all settings for one-time use
- Initialize a "template" task with default settings and override as needed
- Create a custom Task that inherits from a Prefect Task and utilizes the Prefect boilerplate
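
The first two patterns can be sketched without any cloud dependency. The `TemplateTask` class below is a hypothetical stand-in, not a Prefect class; it only illustrates the assumption that a value passed at run time takes precedence over the default given at initialization.

```python
from typing import Optional


class TemplateTask:
    """Hypothetical stand-in for a task whose defaults are set at initialization."""

    def __init__(self, bucket: str, blob: Optional[str] = None):
        self.bucket = bucket
        self.blob = blob

    def run(self, bucket: Optional[str] = None, blob: Optional[str] = None) -> dict:
        # A runtime argument wins; otherwise fall back to the initialization default.
        return {
            "bucket": bucket if bucket is not None else self.bucket,
            "blob": blob if blob is not None else self.blob,
        }


# Pattern 1: all settings fixed at initialization, for one-time use.
one_off = TemplateTask(bucket="reports", blob="2020/05/summary.csv")
one_off_settings = one_off.run()

# Pattern 2: a "template" with defaults, overridden per call as needed.
template = TemplateTask(bucket="reports")
overridden_settings = template.run(blob="2020/05/detail.csv")
```

The bucket and blob names are placeholders chosen for illustration.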
# GCSDownload
class prefect.tasks.gcp.storage.GCSDownload(bucket, blob=None, project=None, credentials_secret=None, encryption_key_secret=None, **kwargs)

Task template for downloading data from Google Cloud Storage as a string.

Args:
- `bucket (str)`: default bucket name to download from
- `blob (str, optional)`: default blob name to download
- `project (str, optional)`: default Google Cloud project to work within. If not provided, will be inferred from your Google Cloud credentials
- `credentials_secret (str, optional, DEPRECATED)`: the name of the Prefect Secret which stores a JSON representation of your Google Cloud credentials
- `encryption_key_secret (str, optional, DEPRECATED)`: the name of the Prefect Secret storing an optional `encryption_key` to be used when downloading the Blob
- `**kwargs (dict, optional)`: additional keyword arguments to pass to the Task constructor

| methods: |
|---|
| prefect.tasks.gcp.storage.GCSDownload.run(bucket=None, blob=None, project=None, credentials=None, encryption_key=None, credentials_secret=None, encryption_key_secret=None) |
| Run method for this Task. Invoked by calling this Task after initialization within a Flow context. |

# GCSUpload
class prefect.tasks.gcp.storage.GCSUpload(bucket, blob=None, project=None, credentials_secret=None, create_bucket=False, encryption_key_secret=None, **kwargs)

Task template for uploading data to Google Cloud Storage. Requires that the data already be a string.

Args:
- `bucket (str)`: default bucket name to upload to
- `blob (str, optional)`: default blob name to upload to; otherwise a random string beginning with `prefect-` and containing the Task Run ID will be used
- `project (str, optional)`: default Google Cloud project to work within. If not provided, will be inferred from your Google Cloud credentials
- `credentials_secret (str, optional, DEPRECATED)`: the name of the Prefect Secret which stores a JSON representation of your Google Cloud credentials
- `create_bucket (bool, optional)`: boolean specifying whether to create the bucket if it does not exist; otherwise an Exception is raised. Defaults to `False`
- `encryption_key_secret (str, optional, DEPRECATED)`: the name of the Prefect Secret storing an optional `encryption_key` to be used when uploading the Blob
- `**kwargs (dict, optional)`: additional keyword arguments to pass to the Task constructor

| methods: |
|---|
| prefect.tasks.gcp.storage.GCSUpload.run(data, bucket=None, blob=None, project=None, credentials=None, encryption_key=None, credentials_secret=None, create_bucket=False, encryption_key_secret=None) |
| Run method for this Task. Invoked by calling this Task after initialization within a Flow context. |

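
When no blob name is supplied, the upload falls back to a generated name that begins with `prefect-` and contains the Task Run ID. The following is a minimal sketch of how such a name might be built. The helper `default_blob_name` is hypothetical, and the `no-id-` fallback for a missing run ID is an assumption made for illustration, not documented behavior.

```python
import uuid
from typing import Optional


def default_blob_name(task_run_id: Optional[str] = None) -> str:
    """Hypothetical helper: build a fallback blob name from the task run ID."""
    if task_run_id is None:
        # Assumption: outside a task run there is no ID, so substitute a random one.
        task_run_id = "no-id-" + str(uuid.uuid4())
    return "prefect-" + task_run_id


generated = default_blob_name("3f2a9c")
```

Passing an explicit `blob` at initialization or at run time avoids relying on generated names when the object has to be located later.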
# GCSCopy
class prefect.tasks.gcp.storage.GCSCopy(source_bucket=None, source_blob=None, dest_bucket=None, dest_blob=None, project=None, credentials_secret=None, **kwargs)

Task template for copying data from one Google Cloud Storage bucket to another, without downloading it locally.

Note that some arguments are required for the task to run, and must be provided either at initialization or as arguments.

Args:
- `source_bucket (str, optional)`: default source bucket name
- `source_blob (str, optional)`: default source blob name
- `dest_bucket (str, optional)`: default destination bucket name
- `dest_blob (str, optional)`: default destination blob name
- `project (str, optional)`: default Google Cloud project to work within. If not provided, will be inferred from your Google Cloud credentials
- `credentials_secret (str, optional, DEPRECATED)`: the name of the Prefect Secret which stores a JSON representation of your Google Cloud credentials
- `**kwargs (dict, optional)`: additional keyword arguments to pass to the Task constructor

| methods: |
|---|
| prefect.tasks.gcp.storage.GCSCopy.run(source_bucket=None, source_blob=None, dest_bucket=None, dest_blob=None, project=None, credentials=None, credentials_secret=None) |
| Run method for this Task. Invoked by calling this Task after initialization within a Flow context. |

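
The note about required arguments can be illustrated with a small validation sketch. It assumes the four source and destination names are the required ones, which the text above does not spell out. `resolve_copy_settings` is a hypothetical helper, not part of Prefect.

```python
from typing import Dict, Optional

# Assumption: these four names are the arguments the copy cannot run without.
REQUIRED = ("source_bucket", "source_blob", "dest_bucket", "dest_blob")


def resolve_copy_settings(
    init_defaults: Dict[str, Optional[str]], **runtime: Optional[str]
) -> Dict[str, str]:
    """Merge runtime arguments over initialization defaults and check completeness."""
    resolved = {}
    for name in REQUIRED:
        value = runtime.get(name)
        if value is None:
            value = init_defaults.get(name)
        resolved[name] = value
    missing = [name for name, value in resolved.items() if value is None]
    if missing:
        raise ValueError("Missing required arguments: " + ", ".join(missing))
    return resolved


# Buckets fixed at initialization, blobs supplied at run time.
defaults = {"source_bucket": "raw", "dest_bucket": "archive"}
settings = resolve_copy_settings(defaults, source_blob="a.csv", dest_blob="a.csv")
```

Splitting the settings this way lets one initialized task be reused for many blob pairs between the same two buckets.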
# BigQueryTask
class prefect.tasks.gcp.bigquery.BigQueryTask(query=None, query_params=None, project=None, location="US", dry_run_max_bytes=None, credentials_secret=None, dataset_dest=None, table_dest=None, job_config=None, **kwargs)

Task for executing queries against a Google BigQuery table and (optionally) returning the results. Note that all initialization settings can be provided / overwritten at runtime.

Args:
- `query (str, optional)`: a string of the query to execute
- `query_params (list[tuple], optional)`: a list of 3-tuples specifying BigQuery query parameters; currently only scalar query parameters are supported. See the Google documentation for more details on how both the query and the query parameters should be formatted
- `project (str, optional)`: the project to initialize the BigQuery Client with; if not provided, will default to the one inferred from your credentials
- `location (str, optional)`: location of the dataset that will be queried; defaults to "US"
- `dry_run_max_bytes (int, optional)`: if provided, the maximum number of bytes the query is allowed to process; this will be determined by executing a dry run and raising a `ValueError` if the maximum is exceeded
- `credentials_secret (str, optional, DEPRECATED)`: the name of the Prefect Secret containing a JSON representation of your Google Application credentials
- `dataset_dest (str, optional)`: the optional name of a destination dataset to write the query results to, if you don't want them returned; if provided, `table_dest` must also be provided
- `table_dest (str, optional)`: the optional name of a destination table to write the query results to, if you don't want them returned; if provided, `dataset_dest` must also be provided
- `job_config (dict, optional)`: an optional dictionary of job configuration parameters; note that the parameters provided here must be pickleable (e.g., dataset references will be rejected)
- `**kwargs (optional)`: additional kwargs to pass to the `Task` constructor

| methods: |
|---|
| prefect.tasks.gcp.bigquery.BigQueryTask.run(query=None, query_params=None, project=None, location="US", dry_run_max_bytes=None, credentials=None, credentials_secret=None, dataset_dest=None, table_dest=None, job_config=None) |
| Run method for this Task. Invoked by calling this Task within a Flow context, after initialization. |

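
Several of the arguments above interact, so a rough sketch of the constraints may help. It assumes each 3-tuple in `query_params` is ordered as (name, type, value), matching the positional order of BigQuery's scalar query parameters, and that parameters appear in the query as `@name` placeholders. The helper functions and the table name are hypothetical and exist only for illustration.

```python
import re
from typing import List, Optional, Tuple

# Assumption: each 3-tuple is (name, BigQuery type, value).
QueryParam = Tuple[str, str, object]


def check_query_settings(
    query: str,
    query_params: List[QueryParam],
    dataset_dest: Optional[str] = None,
    table_dest: Optional[str] = None,
) -> None:
    """Hypothetical pre-flight check mirroring the documented constraints."""
    # Naive scan for @name placeholders; it does not skip string literals.
    placeholders = set(re.findall(r"@(\w+)", query))
    names = {name for name, _type, _value in query_params}
    if placeholders != names:
        raise ValueError("query placeholders and query_params do not match")
    # dataset_dest and table_dest must be given together or not at all.
    if (dataset_dest is None) != (table_dest is None):
        raise ValueError("dataset_dest and table_dest must be provided together")


def check_dry_run(bytes_to_process: int, dry_run_max_bytes: Optional[int]) -> None:
    """Raise ValueError if a dry-run estimate exceeds the configured maximum."""
    if dry_run_max_bytes is not None and bytes_to_process > dry_run_max_bytes:
        raise ValueError("query would process more bytes than dry_run_max_bytes")


query = "SELECT name FROM my_dataset.users WHERE age >= @min_age AND country = @country"
query_params = [("min_age", "INT64", 21), ("country", "STRING", "US")]
check_query_settings(query, query_params, dataset_dest="reports", table_dest="adults")
check_dry_run(bytes_to_process=10_000, dry_run_max_bytes=1_000_000)
```

In a real run the byte estimate would come from the dry run itself; here it is a made-up number passed in directly.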
# BigQueryLoadGoogleCloudStorage
class prefect.tasks.gcp.bigquery.BigQueryLoadGoogleCloudStorage(uri=None, dataset_id=None, table=None, project=None, schema=None, location="US", credentials_secret=None, **kwargs)

Task for inserting records into a Google BigQuery table via a load job. Note that all of these settings can optionally be provided or overwritten at runtime.

Args:
- `uri (str, optional)`: GCS path to load data from
- `dataset_id (str, optional)`: the id of a destination dataset to write the records to
- `table (str, optional)`: the name of a destination table to write the records to
- `project (str, optional)`: the project to initialize the BigQuery Client with; if not provided, will default to the one inferred from your credentials
- `schema (List[bigquery.SchemaField], optional)`: the schema to use when creating the table
- `location (str, optional)`: location of the dataset that will be written to; defaults to "US"
- `credentials_secret (str, optional, DEPRECATED)`: the name of the Prefect Secret containing a JSON representation of your Google Application credentials
- `**kwargs (optional)`: additional kwargs to pass to the `Task` constructor

| methods: |
|---|
| prefect.tasks.gcp.bigquery.BigQueryLoadGoogleCloudStorage.run(uri=None, dataset_id=None, table=None, project=None, schema=None, location="US", credentials=None, credentials_secret=None, **kwargs) |
| Run method for this Task. Invoked by calling this Task within a Flow context, after initialization. |

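
The `uri` argument is a Cloud Storage path, which conventionally takes the form `gs://<bucket>/<object>`. A small sketch of assembling one from its parts follows. The helper `gcs_uri` and the bucket and object names are hypothetical.

```python
def gcs_uri(bucket: str, blob: str) -> str:
    """Hypothetical helper: join a bucket and object name into a gs:// URI."""
    # Strip a leading slash so the path does not contain an empty segment.
    return "gs://" + bucket + "/" + blob.lstrip("/")


uri = gcs_uri("raw-events", "2020/05/events.csv")
```

The resulting string is what would be passed as `uri`, either at initialization or at run time.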
# BigQueryStreamingInsert
class prefect.tasks.gcp.bigquery.BigQueryStreamingInsert(dataset_id=None, table=None, project=None, location="US", credentials_secret=None, **kwargs)

Task for inserting records into a Google BigQuery table via the streaming API. Note that all of these settings can optionally be provided or overwritten at runtime.

Args:
- `dataset_id (str, optional)`: the id of a destination dataset to write the records to
- `table (str, optional)`: the name of a destination table to write the records to
- `project (str, optional)`: the project to initialize the BigQuery Client with; if not provided, will default to the one inferred from your credentials
- `location (str, optional)`: location of the dataset that will be written to; defaults to "US"
- `credentials_secret (str, optional, DEPRECATED)`: the name of the Prefect Secret containing a JSON representation of your Google Application credentials
- `**kwargs (optional)`: additional kwargs to pass to the `Task` constructor

| methods: |
|---|
| prefect.tasks.gcp.bigquery.BigQueryStreamingInsert.run(records, dataset_id=None, table=None, project=None, location="US", credentials=None, credentials_secret=None, **kwargs) |
| Run method for this Task. Invoked by calling this Task within a Flow context, after initialization. |

# CreateBigQueryTable
class prefect.tasks.gcp.bigquery.CreateBigQueryTable(project=None, credentials_secret=None, dataset=None, table=None, schema=None, clustering_fields=None, time_partitioning=None, **kwargs)

Ensures a BigQuery table exists; creates it otherwise. Note that most initialization keywords can optionally be provided at runtime.

Args:
- `project (str, optional)`: the project to initialize the BigQuery Client with; if not provided, will default to the one inferred from your credentials
- `credentials_secret (str, optional, DEPRECATED)`: the name of the Prefect Secret containing a JSON representation of your Google Application credentials
- `dataset (str, optional)`: the name of the dataset in which the table will be created
- `table (str, optional)`: the name of a table to create
- `schema (List[bigquery.SchemaField], optional)`: the schema to use when creating the table
- `clustering_fields (List[str], optional)`: a list of fields to cluster the table by
- `time_partitioning (bigquery.TimePartitioning, optional)`: a `bigquery.TimePartitioning` object specifying the partitioning of the newly created table
- `**kwargs (optional)`: additional kwargs to pass to the `Task` constructor

| methods: |
|---|
| prefect.tasks.gcp.bigquery.CreateBigQueryTable.run(project=None, credentials=None, credentials_secret=None, dataset=None, table=None, schema=None) |
| Run method for this Task. Invoked by calling this Task within a Flow context, after initialization. |

This documentation was auto-generated from commit n/a on May 14, 2020 at 21:12 UTC