Ensure GCP Dataflow jobs are private
Error: GCP Dataflow jobs are not private
Bridgecrew Policy ID: BC_GCP_GENERAL_33
Checkov Check ID: CKV_GCP_94
Severity: HIGH
Description
Cloud Dataflow is a GCP service for streaming and batch data processing. A Dataflow job consists of at least one management node and one compute node, both of which are GCE VMs. By default, these nodes are assigned public IPs so that they can communicate with the public internet, but this also makes them publicly reachable and increases your potential attack surface.
We recommend that you remove the public IPs from your Dataflow jobs. See the official Google documentation for the currently supported internet access configuration options.
Fix - Runtime
GCP Console
Making Dataflow jobs private via the console is not currently supported.
CLI Command
Making a running Dataflow job private via the gcloud CLI is not currently supported. Instead, drain or cancel the job and then re-create it with the correct flag configured.
# To cancel a Dataflow job
gcloud dataflow jobs cancel JOB_ID
Replace JOB_ID with your Dataflow job ID.
# To drain a Dataflow job
gcloud dataflow jobs drain JOB_ID
Replace JOB_ID with your Dataflow job ID.
# To create a new Dataflow job without public IPs
gcloud dataflow jobs run JOB_NAME \
  --disable-public-ips \
  --gcs-location=GCS_LOCATION
Replace JOB_NAME with the name for the new Dataflow job. Replace GCS_LOCATION with the Cloud Storage location of your job template; it must be a URL beginning with gs://.
Google also provides documentation on how to Turn off external IP address for your Dataflow jobs. This documentation has examples for Java and Python.
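If you re-create jobs from a script rather than by hand, the gcloud command above can be assembled programmatically. The following is a minimal Python sketch, not an official tool: the helper name build_private_run_command and the sample job name and bucket are hypothetical, and the optional --region argument is an assumption about your setup. It only builds the argument list, so nothing is launched until you pass the result to something like subprocess.run.

```python
import shlex


def build_private_run_command(job_name, gcs_location, region=None):
    """Return the gcloud argv that launches a templated job without public IPs."""
    # The template location must be a Cloud Storage URL.
    if not gcs_location.startswith("gs://"):
        raise ValueError("gcs_location must be a URL beginning with gs://")
    cmd = [
        "gcloud", "dataflow", "jobs", "run", job_name,
        "--disable-public-ips",
        f"--gcs-location={gcs_location}",
    ]
    # Region is optional here; add it only when the caller supplies one.
    if region:
        cmd.append(f"--region={region}")
    return cmd


if __name__ == "__main__":
    # Hypothetical job name and template path, for illustration only.
    cmd = build_private_run_command(
        "my-job", "gs://my-bucket/templates/template_file"
    )
    # Print the command for review instead of executing it.
    print(shlex.join(cmd))
```

Rejecting a location that does not start with gs:// up front mirrors the requirement stated above and avoids launching a job that would fail on an invalid template path.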
Fix - Buildtime
Terraform
- Resource: google_dataflow_job
- Field: ip_configuration
  resource "google_dataflow_job" "big_data_job" {
    name              = "dataflow-job"
    template_gcs_path = "gs://my-bucket/templates/template_file"
    temp_gcs_location = "gs://my-bucket/tmp_dir"
    parameters = {
      foo = "bar"
      baz = "qux"
    }
-   ip_configuration = "WORKER_IP_PUBLIC"
+   ip_configuration = "WORKER_IP_PRIVATE"
  }
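To illustrate what this check looks for, here is a rough Python sketch that scans the JSON form of a Terraform plan (as produced by terraform show -json) for google_dataflow_job resources whose ip_configuration is not WORKER_IP_PRIVATE. This is not Checkov's implementation: the function names are hypothetical, and treating an unset ip_configuration as public is an assumption based on the default behavior described above.

```python
def _walk_resources(module):
    """Yield every resource dict in a plan module and its child modules."""
    yield from module.get("resources", [])
    for child in module.get("child_modules", []):
        yield from _walk_resources(child)


def find_public_dataflow_jobs(plan):
    """Return addresses of google_dataflow_job resources not set to WORKER_IP_PRIVATE."""
    root = plan.get("planned_values", {}).get("root_module", {})
    flagged = []
    for res in _walk_resources(root):
        if res.get("type") != "google_dataflow_job":
            continue
        values = res.get("values") or {}
        # Assumption: a missing ip_configuration is treated as public.
        if values.get("ip_configuration") != "WORKER_IP_PRIVATE":
            flagged.append(res.get("address", "<unknown>"))
    return flagged


if __name__ == "__main__":
    # Hand-written stand-in for the output of `terraform show -json`.
    sample_plan = {
        "planned_values": {
            "root_module": {
                "resources": [
                    {
                        "address": "google_dataflow_job.big_data_job",
                        "type": "google_dataflow_job",
                        "values": {"ip_configuration": "WORKER_IP_PUBLIC"},
                    }
                ]
            }
        }
    }
    for address in find_public_dataflow_jobs(sample_plan):
        print(f"{address}: ip_configuration is not WORKER_IP_PRIVATE")
```

In practice you would load the plan with json.load from a file and fail the build when the returned list is non-empty.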