Ensure GCP Dataproc Clusters do not have public IPs
Error: GCP Dataproc Clusters have public IPs
Bridgecrew Policy ID: BC_GCP_GENERAL_37
Checkov Check ID: CKV_GCP_103
Severity: HIGH
Description
Dataproc is commonly used for data lake modernization, ETL, and data science workloads. A Dataproc cluster contains at least one "management" VM and one "compute" VM, which are deployed into a VPC network. A common misconfiguration is creating a Dataproc cluster with public IPs. This misconfiguration can expose your data, because a public IP combined with an open firewall rule allows potentially unauthorized access to the underlying Dataproc VMs.
We recommend you only assign private IPs to your Dataproc clusters.
Fix - Runtime
GCP Console
It is not currently possible to edit a running Dataproc cluster to remove its public IPs.
To create a Dataproc cluster with only private IPs:
- Log in to the GCP Console.
- Navigate to Dataproc.
- Select Customize Cluster to view Network Configuration settings.
- Locate the Internal IP Only section and select the checkbox next to Configure all instances to have only internal IP addresses.
CLI Command
It is not currently possible to edit a running Dataproc cluster to remove its public IPs.
To create a Dataproc cluster with only private IPs, specify the --no-address flag. For example:
gcloud beta dataproc clusters create my-cluster \
    --region=us-central1 \
    --no-address
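To confirm the setting on a cluster after creation, one option is to inspect the JSON produced by `gcloud dataproc clusters describe my-cluster --region=us-central1 --format=json`. The sketch below assumes that output exposes the Dataproc API field `config.gceClusterConfig.internalIpOnly`; the function name and the sample payload are hypothetical and only illustrate the check.

```python
import json


def has_only_internal_ips(describe_json: str) -> bool:
    """Return True if a cluster description reports internal IPs only.

    `describe_json` is assumed to be the JSON text printed by
    `gcloud dataproc clusters describe ... --format=json`.
    """
    cluster = json.loads(describe_json)
    gce_config = cluster.get("config", {}).get("gceClusterConfig", {})
    # Assumption: a missing field means the cluster may have public IPs.
    return gce_config.get("internalIpOnly") is True


# Hypothetical, trimmed-down sample of the describe output.
sample = (
    '{"clusterName": "my-cluster", '
    '"config": {"gceClusterConfig": {"internalIpOnly": true}}}'
)

if __name__ == "__main__":
    print(has_only_internal_ips(sample))
```

Treating a missing field as non-compliant is a conservative choice for this sketch; adjust it if your environment reports the field differently.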
Fix - Buildtime
Terraform
- Resource: google_dataproc_cluster
- Field: internal_ip_only
resource "google_dataproc_cluster" "accelerated_cluster" {
  name   = "my-cluster-with-gpu"
  region = "us-central1"
  cluster_config {
    gce_cluster_config {
      zone = "us-central1-a"
-     internal_ip_only = false
+     internal_ip_only = true
    }
    master_config {
      accelerators {
        accelerator_type  = "nvidia-tesla-k80"
        accelerator_count = "1"
      }
    }
  }
}
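As a rough illustration of what a build-time check for this field could look like, the following sketch scans a Terraform plan exported with `terraform show -json` and lists `google_dataproc_cluster` resources whose `internal_ip_only` is not `true`. It is not Checkov's implementation. It assumes the plan JSON exposes resources under `planned_values.root_module` and that nested blocks appear as one-element lists; the function names and the sample plan are hypothetical.

```python
from typing import Any, Iterator, List


def _first(block: Any) -> dict:
    """Normalise a nested block (one-element list or dict) to a dict."""
    if isinstance(block, list):
        block = block[0] if block else {}
    return block if isinstance(block, dict) else {}


def _walk(module: dict) -> Iterator[dict]:
    """Yield resources from a module and, recursively, its child modules."""
    yield from module.get("resources", [])
    for child in module.get("child_modules", []):
        yield from _walk(child)


def clusters_with_public_ips(plan: dict) -> List[str]:
    """Return addresses of Dataproc clusters not set to internal IPs only."""
    root = plan.get("planned_values", {}).get("root_module", {})
    failing = []
    for res in _walk(root):
        if res.get("type") != "google_dataproc_cluster":
            continue
        values = res.get("values") or {}
        cluster_config = _first(values.get("cluster_config"))
        gce_config = _first(cluster_config.get("gce_cluster_config"))
        # Assumption: an unset or unknown value counts as non-compliant.
        if gce_config.get("internal_ip_only") is not True:
            failing.append(res.get("address", "<unknown>"))
    return failing


# Hypothetical, heavily trimmed plan for illustration only.
sample_plan = {
    "planned_values": {
        "root_module": {
            "resources": [
                {
                    "address": "google_dataproc_cluster.accelerated_cluster",
                    "type": "google_dataproc_cluster",
                    "values": {
                        "cluster_config": [
                            {"gce_cluster_config": [{"internal_ip_only": False}]}
                        ]
                    },
                }
            ]
        }
    }
}

if __name__ == "__main__":
    print(clusters_with_public_ips(sample_plan))
```

To try it against a real plan, load the file written by `terraform show -json` with `json.load` and pass the resulting dictionary to `clusters_with_public_ips`.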