Managing Data Lakes

You can manage existing data lakes from the DATA LAKE SERVICES page available from the navigation menu in the cloud controller web UI.

Viewing Data Lake Details

To view details of an existing data lake, follow these steps.

  1. From the navigation menu, select DATA LAKE SERVICES.
  2. Select a specific data lake.

The data lake details page includes information about your data lake:

Basic information about your data lake is listed at the top of the page. This includes the name, usage, size, the number of nodes, uptime (HH:MM), attached clusters, and status.

In addition, the following information is available:

Attached Clusters

The ATTACHED CLUSTERS section lists all the clusters attached to your data lake. For each cluster, the name, number of nodes, cluster type, and master node public IP are listed.

Data Lake Details

The DATA LAKE DETAILS section includes the following tabs:

Tab        Description
HARDWARE   Includes information about the data lake instances: instance names, instance IDs (with links to the EC2 console), and public IPs.
TAGS       Lists user-defined tags, in the same order as you added them.
NETWORK    Includes VPC and subnet IDs (with links to the VPC dashboard), remote access CIDR IP, and protected gateway access settings.
ADVANCED   Lists Hive metastore, Ranger database, and Ambari database information, as well as your S3 path.

Event History

The EVENT HISTORY tab shows you events logged for the data lake, with the most recent event at the top.
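The ordering described above (most recent event first) can be illustrated with a minimal Python sketch. The `Event` class, its field names, and the sample messages are hypothetical and made up for illustration; they do not reflect the cloud controller's actual data model or log text.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; field names are illustrative only."""
    timestamp: datetime
    message: str


def newest_first(events):
    """Return a new list of events with the most recent event first."""
    return sorted(events, key=lambda e: e.timestamp, reverse=True)


# Made-up sample events, deliberately out of order.
events = [
    Event(datetime(2017, 5, 1, 9, 0), "Data lake creation started"),
    Event(datetime(2017, 5, 1, 9, 25), "Data lake is running"),
    Event(datetime(2017, 5, 1, 9, 10), "Infrastructure created"),
]

for event in newest_first(events):
    print(event.timestamp.strftime("%Y-%m-%d %H:%M"), event.message)
```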

Tips

  • Access the actions available for this data lake by clicking DATA LAKE ACTIONS.
  • Use the Create New Attached Cluster option under DATA LAKE ACTIONS to navigate to the create cluster form and create a cluster attached to the data lake.
  • Access the Ambari web UI and the Ranger admin web UI using the links.
  • Under Attached Clusters, click a cluster name to navigate to that cluster's details page, which shows the data lake to which the cluster is attached.

Accessing Ranger

The data lake includes an instance of Apache Ranger that you use to manage security policies for Hive. Access the Ranger admin web UI and log in using the data lake administrator credentials that you specified when creating the data lake.
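If you prefer to inspect policies from a script rather than the web UI, Apache Ranger also exposes a public REST API. The following is a minimal Python sketch, not a documented procedure for this product: it assumes that the Ranger policy endpoint (`/service/public/v2/api/policy`) is reachable from your machine and accepts basic authentication with the data lake administrator credentials. The base URL, user name, and password shown are placeholders.

```python
import base64
import json
import urllib.request


def build_policy_list_request(ranger_base_url, username, password):
    """Build a GET request for Ranger's public v2 policy listing endpoint."""
    url = ranger_base_url.rstrip("/") + "/service/public/v2/api/policy"
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    headers = {"Authorization": f"Basic {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


def list_policy_names(ranger_base_url, username, password):
    """Fetch all policies and return their names (performs a network call)."""
    request = build_policy_list_request(ranger_base_url, username, password)
    with urllib.request.urlopen(request) as response:
        policies = json.load(response)
    return [policy["name"] for policy in policies]


# Placeholder values (assumptions); replace them with your own Ranger URL
# and the administrator credentials chosen when creating the data lake.
request = build_policy_list_request("https://ranger.example.com", "admin", "secret")
print(request.full_url)
```

Only the request is built here; call `list_policy_names` with real values to contact Ranger. Whether the endpoint is exposed through the data lake's gateway depends on your network and gateway settings.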

Terminating a Data Lake

  1. From the navigation menu, select DATA LAKE SERVICES.
  2. Select a specific data lake to access its details.
  3. From DATA LAKE ACTIONS, select Terminate.
  4. Confirm that you want to terminate the data lake.

You cannot terminate a data lake that has attached clusters. Terminate all attached clusters before terminating the data lake.
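The required order of operations can be modeled with a short Python sketch. The `DataLake` and `Cluster` classes and both functions are hypothetical, written only to illustrate the rule above; they are not part of the cloud controller or any real API.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for a cluster attached to a data lake."""
    name: str
    terminated: bool = False


@dataclass
class DataLake:
    """Hypothetical stand-in for a data lake and its attached clusters."""
    name: str
    attached_clusters: list = field(default_factory=list)
    terminated: bool = False


def terminate_cluster(lake, cluster):
    """Terminate a cluster and detach it from the data lake."""
    cluster.terminated = True
    lake.attached_clusters.remove(cluster)


def terminate_data_lake(lake):
    """Terminate the data lake, refusing while clusters are still attached."""
    if lake.attached_clusters:
        names = ", ".join(c.name for c in lake.attached_clusters)
        raise RuntimeError(
            f"Cannot terminate {lake.name}: attached clusters remain ({names})"
        )
    lake.terminated = True


lake = DataLake("my-lake", [Cluster("etl"), Cluster("analytics")])

# Terminating first fails because two clusters are still attached.
try:
    terminate_data_lake(lake)
except RuntimeError as err:
    print(err)

# Terminate every attached cluster, then the data lake itself.
for cluster in list(lake.attached_clusters):
    terminate_cluster(lake, cluster)
terminate_data_lake(lake)
print(lake.terminated)
```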