Network Security

Protected Gateway

Access to cluster resources goes through a central protected gateway, which exposes them from a single network endpoint.

The following diagram illustrates how the protected gateway and security groups secure access to your cluster and cluster resources:

NN - HDFS NameNode, RM - YARN ResourceManager,
JHS - MapReduce JobHistoryServer, SHS - Spark History Server

Cluster Resource URLs

The following cluster resources are available for access via the gateway endpoint.

Resource                   URL
AMBARI WEB                 https://{cloud-controller-host}/{cluster-name}/services/ambari/
ZEPPELIN UI [1]            https://{cloud-controller-host}/{cluster-name}/services/zeppelin/
NAME NODE                  https://{cloud-controller-host}/{cluster-name}/services/hdfs/
JOB HISTORY SERVER         https://{cloud-controller-host}/{cluster-name}/services/jobhistory/
RESOURCE MANAGER           https://{cloud-controller-host}/{cluster-name}/services/yarn/
SPARK HISTORY SERVER [2]   https://{cloud-controller-host}/{cluster-name}/services/sparkhistory/
HIVE JDBC [3]              https://{cloud-controller-host}/{cluster-name}/services/jdbc/

[1] The Zeppelin UI link is only available for Data Science and EDW-Analytics clusters.
[2] The Spark History Server link is only available for EDW-ETL clusters.
[3] See Using Apache Hive for more information on accessing Hive via JDBC.
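
As a rough illustration of the URL pattern in the table above, the following shell sketch assembles a gateway URL from its three parts. The function name gateway_url and the host and cluster names in the example are hypothetical placeholders. The -k option in the commented curl call is only relevant while the gateway still presents the default self-signed certificate, and the gateway may additionally ask for credentials.

```shell
# Hypothetical helper: build the gateway URL for one cluster resource.
# Arguments: controller host, cluster name, service path segment
# (ambari, zeppelin, hdfs, jobhistory, yarn, sparkhistory, or jdbc).
gateway_url() {
    printf 'https://%s/%s/services/%s/\n' "$1" "$2" "$3"
}

# Example with placeholder values (not executed here):
#   curl -k -I "$(gateway_url hdcloud.example.com mycluster ambari)"
```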

Using Your Own SSL Certificate

By default, the protected gateway on the controller node has been configured with a self-signed certificate for access via HTTPS. This is sufficient for many deployments such as trials, development, testing, or staging. However, for production deployments, a trusted certificate is preferred and can be configured in the controller. Follow these steps to configure the cloud controller to use your own trusted certificate.

Prerequisites

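To use your own certificate, you must have a fully qualified domain name (FQDN) that resolves to the cloud controller host, and a private key and certificate for that FQDN.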

Steps

  1. SSH to the cloud controller host instance. For example:

    ssh -i mykeypair.pem cloudbreak@[CONTROLLER-IP-ADDRESS]

  2. Make sure that the target fully qualified domain name (FQDN) which you plan to use for the cloud controller is resolvable:

    nslookup [TARGET-CONTROLLER-FQDN]

    For example:

    nslookup hdcloud.example.com

  3. Browse to the controller deployment directory and edit the Profile file:

    vi /var/lib/cloudbreak-deployment/Profile

  4. Replace the value of the PUBLIC_IP variable with the TARGET-CONTROLLER-FQDN value:

    PUBLIC_IP=[TARGET-CONTROLLER-FQDN]

  5. Copy your private key and certificate files for the FQDN onto the controller host. These files must be placed under the /var/lib/cloudbreak-deployment/certs/traefik/ directory.

    File permissions for the private key and certificate files can be set to 600.

    File               Example
    PRIV-KEY-LOCATION  /var/lib/cloudbreak-deployment/certs/traefik/hdcloud.example.com.key
    CERT-LOCATION      /var/lib/cloudbreak-deployment/certs/traefik/hdcloud.example.com.crt

  6. Configure TLS details in your Profile by adding the following line at the end of the file.

    Note that CERT-LOCATION and PRIV-KEY-LOCATION are the file locations from Step 5, written as paths that start at /certs/... rather than at /var/lib/cloudbreak-deployment/certs/...

    export CBD_TRAEFIK_TLS="[CERT-LOCATION],[PRIV-KEY-LOCATION]"

    For example:

    export CBD_TRAEFIK_TLS="/certs/traefik/hdcloud.example.com.crt,/certs/traefik/hdcloud.example.com.key"

  7. Restart the cloud controller:

    cbd restart

  8. Using your web browser, access the cloud controller web UI at the new fully qualified domain name.

  9. Confirm that the connection is SSL-protected and that the certificate used is the certificate that you provided to the controller.
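
The path handling in steps 5 and 6 and the check in step 9 can be sketched in shell as follows. This is a minimal illustration, not part of the product: the function names traefik_tls_line and show_served_cert are hypothetical, and the sketch assumes that openssl is installed on the machine where you run the check.

```shell
# Deployment directory used in the steps above.
CBD_DIR=/var/lib/cloudbreak-deployment

# Hypothetical helper: print the Profile line for step 6, given the host
# paths of the certificate and the private key from step 5. The deployment
# directory prefix is stripped so that both paths start at /certs/.
traefik_tls_line() {
    cert=${1#"$CBD_DIR"}
    key=${2#"$CBD_DIR"}
    printf 'export CBD_TRAEFIK_TLS="%s,%s"\n' "$cert" "$key"
}

# Hypothetical helper for step 9: print the subject, issuer, and validity
# dates of the certificate that the given host serves on port 443.
show_served_cert() {
    openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null |
        openssl x509 -noout -subject -issuer -dates
}
```

For example, calling traefik_tls_line with the two example paths from step 5 prints the same line as the example in step 6, and show_served_cert hdcloud.example.com lets you compare the served certificate with the one you installed.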

Security Groups

Security groups control network traffic to the EC2 instances in the system. By default, the system restricts inbound network traffic to a minimal set of ports. You can add rules to each security group, or modify existing ones, to allow traffic to or from its associated instances. This section describes the default security group configuration for the various components (and instances) in the system.

The inbound and outbound rules (protocols, ports, and IP ranges) for the security groups can be modified later using the AWS VPC Dashboard.

The naming convention for the security groups that are automatically created is:

Related Component      Security Group Name              Naming Convention
cloud controller       CloudbreakSecurityGroup          {CFNStackName}-CloudbreakSecurityGroup-{uniqueID}
cluster master node    ClusterNodeSecurityGroupmaster   {ClusterName}-ClusterNodeSecurityGroupmaster-{uniqueID}
cluster worker nodes   ClusterNodeSecurityGroupworker   {ClusterName}-ClusterNodeSecurityGroupworker-{uniqueID}
cluster compute nodes  ClusterNodeSecurityGroupcompute  {ClusterName}-ClusterNodeSecurityGroupcompute-{uniqueID}
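
If you prefer the command line to the dashboard, the naming convention above can be used to look up a security group. The sketch below is only an illustration: the function name sg_name_filter and the cluster name in the example are hypothetical, and it assumes that the AWS CLI is installed and configured with credentials for the account that hosts the cluster.

```shell
# Hypothetical helper: build an EC2 filter expression that matches a
# security group by the naming convention above. Arguments: the stack or
# cluster name, and the security group name from the table. The trailing
# wildcard stands in for the {uniqueID} part.
sg_name_filter() {
    printf 'Name=group-name,Values=%s-%s-*\n' "$1" "$2"
}

# Example with placeholder values (needs the AWS CLI; not executed here):
#   aws ec2 describe-security-groups \
#       --filters "$(sg_name_filter mycluster ClusterNodeSecurityGroupmaster)" \
#       --query 'SecurityGroups[].[GroupId,GroupName]' --output text
```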

The following security groups are created automatically:

Controller Security Group

The CloudbreakSecurityGroup security group is created when launching your cloud controller and is associated with your cloud controller instance. The following table lists the security group port configuration for the cloud controller instance. The security group Source for these ports is set to the Remote Access CIDR IP specified when launching the cloud controller.

Inbound Port Description
22 SSH access to the Cloud Controller instance.
80 HTTP access to the Cloud Controller UI. This is automatically redirected to the HTTPS (443) port.
443 HTTPS access to the Cloud Controller UI.

Cluster Security Groups

Multiple security groups are created when you create a cluster: one for the master node (ClusterNodeSecurityGroupmaster), one for all worker nodes (ClusterNodeSecurityGroupworker), and one for all compute nodes (ClusterNodeSecurityGroupcompute). The security group Source for these ports is set to the Remote Access CIDR IP specified when creating the cluster.

The following table lists the master node security group (ClusterNodeSecurityGroupmaster) port configuration.

Inbound Port Description
22 SSH access to the node instance.
9443 Internal management port, used by the cloud controller to communicate with the cluster master node.
8443 [1] HTTPS access to Ambari, Zeppelin, Hive JDBC, and other cluster components. Used by the protected gateway on the cloud controller.

[1] Port 8443 is only opened on the master node if, when you create the cluster, you select at least one of the checkboxes under NETWORK & SECURITY > Protected Gateway Access (to Ambari and Zeppelin Web UIs, Hive JDBC and/or Cluster Components UIs). Although you can use this port to access cluster resources directly, it is strongly recommended that you access cluster resources via the protected gateway on the cloud controller.

The following table lists the worker node security group (ClusterNodeSecurityGroupworker) port configuration.

Inbound Port Description
22 SSH access to the node instance.

The following table lists the compute node security group (ClusterNodeSecurityGroupcompute) port configuration.

Inbound Port Description
22 SSH access to the node instance.
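
To confirm that the default rules behave as described, you can probe the listed ports from a machine inside the Remote Access CIDR range. The following sketch assumes a netcat (nc) build that supports the -z and -w options; the function names inbound_ports and check_ports are hypothetical.

```shell
# Default inbound ports from the tables above, by instance role.
# Port 8443 on the master node is only open when Protected Gateway
# Access was selected at cluster creation.
inbound_ports() {
    case "$1" in
        controller)     echo "22 80 443" ;;
        master)         echo "22 9443 8443" ;;
        worker|compute) echo "22" ;;
        *)              return 1 ;;
    esac
}

# Probe each default port of a role on the given host. -z only tests
# whether a TCP connection can be opened; -w sets a timeout in seconds.
check_ports() {
    for port in $(inbound_ports "$1"); do
        if nc -z -w 5 "$2" "$port"; then
            echo "$port open"
        else
            echo "$port closed or filtered"
        fi
    done
}
```

For example, check_ports controller hdcloud.example.com probes ports 22, 80, and 443 on the controller host. A port reported as closed or filtered from outside the Remote Access CIDR range is the expected result.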

Learn More

Refer to the Amazon EC2 Security Groups documentation for more information about viewing and modifying security group rules for EC2 instances.