Release Notes

These Release Notes summarize release-specific changes:


New Features

TP #2.0 (June 14, 2017)

Shared Data Lake Services

Introduces Shared Data Lake Services, which provide authentication (LDAP/AD), authorization, and audit (Apache Ranger) for workload clusters. You can launch a long-running instance of Shared Data Lake Services and then attach ephemeral workload clusters to it as you create them. For more information, refer to Data Lake Services.

Central Protected Gateway

The protected gateway (which was previously per-cluster) is now available centrally on the controller instance, providing a single access point for all clusters and cluster resources. For more information, refer to Protected Gateway.

SSL Certificates

Introduces the ability to use your own SSL certificate in place of the self-signed certificate installed by default on the protected gateway. For more information, refer to Using Your Own SSL Certificate.
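
For reference, a minimal sketch of preparing a certificate with OpenSSL is shown below; the host name and file names are placeholders, and the steps for supplying the resulting certificate to the gateway are described in Using Your Own SSL Certificate.

    # Generate a private key and a certificate signing request (CSR) to submit to your CA.
    # The CN and file names are placeholders.
    openssl req -newkey rsa:2048 -nodes \
      -keyout gateway.key -out gateway.csr \
      -subj "/CN=controller.example.com"

    # Once the CA returns the signed certificate (gateway.crt), confirm that it matches the key:
    openssl x509 -noout -modulus -in gateway.crt | openssl md5
    openssl rsa -noout -modulus -in gateway.key | openssl md5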

Encryption for Master, Worker, and Compute Node EBS Volumes

When creating a cluster, you can optionally select to have the cloud controller configure encryption for the EBS volumes attached to the master, worker, and compute nodes. If you choose to use this option, the following types of data are encrypted:

- Data at rest inside the volumes
- All data moving between the volumes and the instances
- All snapshots created from the volumes
- All volumes created from those snapshots

To learn more about EBS encryption, refer to AWS documentation. For instructions, refer to updated Creating a Cluster documentation.
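
As a quick sanity check after the cluster comes up, you can confirm with the AWS CLI that the volumes were created encrypted; the instance ID below is a placeholder.

    # List the EBS volumes attached to a cluster node and show their encryption status.
    aws ec2 describe-volumes \
      --filters Name=attachment.instance-id,Values=i-0123456789abcdef0 \
      --query 'Volumes[].{Id:VolumeId,Encrypted:Encrypted,KmsKeyId:KmsKeyId}'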

SSE-KMS Encryption

Introduces the ability to work with S3 data that has been encrypted via SSE-KMS. For more information, refer to Working with Encrypted Amazon S3 Data.
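
As an illustration, the following writes an object to S3 with SSE-KMS using the AWS CLI and reads it back through the S3A connector. The bucket, paths, and KMS key ARN are placeholders, and the fs.s3a.* property names are the standard Hadoop S3A settings for SSE-KMS rather than anything specific to this release; confirm them against Working with Encrypted Amazon S3 Data.

    # Upload an object encrypted with a customer-managed KMS key.
    aws s3 cp data.csv s3://my-bucket/landing/data.csv \
      --sse aws:kms \
      --sse-kms-key-id arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID

    # Read it back from a cluster node via S3A. Reads only require permission to use the
    # KMS key; the properties below matter when writing new SSE-KMS data through S3A.
    hadoop fs \
      -Dfs.s3a.server-side-encryption-algorithm=SSE-KMS \
      -Dfs.s3a.server-side-encryption.key=arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID \
      -cat s3a://my-bucket/landing/data.csv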

S3Guard (Technical Preview)

Introduces S3Guard (Technical Preview), which mitigates issues related to the eventual consistency model of Amazon S3. It guarantees a consistent view of data stored in Amazon S3 by using a DynamoDB table as a consistent metadata store.

You can configure S3Guard by adding a set of custom properties when creating a cluster and by adding a required access policy in the IAM console on AWS. For more information, refer to S3Guard.
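
As a rough sketch of what those custom properties can look like (in the same format as the core-site example shown for the CLI later in these notes), the fragment below uses the standard Hadoop S3A S3Guard property names; the table name and region are placeholders, and the authoritative property list and IAM policy are in the S3Guard documentation.

    {
      "core-site": {
        "fs.s3a.metadatastore.impl": "org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStore",
        "fs.s3a.s3guard.ddb.table": "my-s3guard-table",
        "fs.s3a.s3guard.ddb.table.create": "true",
        "fs.s3a.s3guard.ddb.region": "us-east-1"
      }
    }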

TP #1.14 (Mar 21, 2017)

Auto Scaling

Auto Scaling provides the ability to increase or decrease the number of nodes in a cluster according to the auto scaling policies that you define. After you create an auto scaling policy, the cloud controller executes the policy when the conditions that you specified are met.

You can create an auto scaling policy when creating a cluster; once the cluster is running, you can manage its auto scaling settings and policies.

For more information, refer to Auto Scaling, Create Cluster, and Managing Clusters.

Shared Druid Metastore (Technical Preview)

When creating an HDP 2.6 cluster based on the BI configuration, you can either have a Druid metastore database created with the cluster or use an external Druid metastore backed by Amazon RDS. Using an external Amazon RDS database for the Druid metastore allows you to preserve the Druid metastore metadata and reuse it between clusters.

For more information, refer to Managing Shared Metastores and updated step 8 in Create Cluster.
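
Before pointing a cluster at an external metastore, it can help to verify that the RDS database is reachable. The sketch below assumes a PostgreSQL-backed RDS instance (as in the Hive metastore connection example later in these notes); the endpoint, database name, and user are placeholders.

    # Check connectivity and credentials from a host that can reach the RDS instance.
    psql "host=my-druid-db.abcdefgh1234.us-east-1.rds.amazonaws.com port=5432 dbname=druiddb user=druid" -c '\conninfo'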

Major UI Changes

TP #1.14 introduces a few major user interface changes:

TP #1.13 (Feb 15, 2017)

Resource Tagging

When creating a cluster, you can optionally add custom tags that will be displayed on the CloudFormation stack and on EC2 instances, allowing you to keep track of the resources that the cloud controller creates on your behalf. For more information, refer to new Resource Tagging and updated Creating a Cluster documentation.

Node Auto Repair

The cloud controller monitors clusters by checking for Ambari Agent heartbeat on all cluster nodes. If the Ambari Agent heartbeat is lost on a node, a failure is reported for that node. Once the failure is reported, it is fixed automatically (if auto repair is enabled), or options are available for you to fix the failure manually (if auto repair is disabled).

You can configure auto repair settings for each cluster when you create it. For more information, refer to new Node Auto Repair and updated Creating a Cluster documentation.

New Data Science Template

This technical preview introduces a new HDP 2.6 template, Data Science: Apache Spark 2.1, Apache Zeppelin 0.6.2. For an updated list of available HDP 2.5 and HDP 2.6 configurations, refer to Cluster Configurations and Cluster Services.

Protected Gateway Access to Cluster Components and UIs

The following ports are no longer open on the master node security group: 443 (web UIs), 8080 (Ambari Web UI), and 9995 (Zeppelin UI). In addition, port 9443 is no longer open on the worker and compute node security groups. Instead, port 8443 (Protected Gateway) is now open on the master node security group, providing protected access to the Ambari Web UI, the Zeppelin UI, and other cluster components.

For an updated list of ports open on cluster node security groups and for more information on the protected gateway, refer to Security > Network documentation.

In order to use Hive JDBC and Beeline to access your cluster through the protected gateway, you need to download an SSL certificate from the gateway and add it to your truststore. For instructions on how to do this, refer to Using Apache Hive.
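
A hedged sketch of that flow is shown below: it fetches the gateway's certificate with OpenSSL, imports it into a Java truststore, and connects with Beeline over HTTP transport. The master host, truststore password, and gateway httpPath are placeholders; the exact JDBC URL to use is given in Using Apache Hive.

    # 1. Download the gateway's SSL certificate.
    openssl s_client -connect MASTER_PUBLIC_DNS:8443 </dev/null 2>/dev/null | \
      openssl x509 -out gateway.pem

    # 2. Import it into a local truststore.
    keytool -importcert -noprompt -alias hdc-gateway \
      -file gateway.pem -keystore gateway.jks -storepass changeit

    # 3. Connect with Beeline through the gateway (take the httpPath value from the docs).
    beeline -u "jdbc:hive2://MASTER_PUBLIC_DNS:8443/;ssl=true;sslTrustStore=gateway.jks;trustStorePassword=changeit;transportMode=http;httpPath=HTTP_PATH_FROM_DOCS"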

TP #1.12 (Jan 19, 2017)

Technical Preview of HDP 2.6 Including Druid and Spark 2.1

HDCloud for AWS TP #1.12 allows you to access the technical preview of HDP 2.6 and Ambari 2.5, including the technical preview of Druid and Spark 2.1. The following cluster configurations are available:

| Cluster Type | Services | Description |
|---|---|---|
| Data Science | Spark 1.6, Zeppelin 0.6.0 | This cluster configuration includes stable versions of Spark and Zeppelin. |
| EDW - ETL | Hive 1.2.1, Spark 1.6 | This cluster configuration includes stable versions of Hive and Spark. |
| EDW - Analytics | Hive 2 LLAP, Zeppelin 0.6.0 | This cluster configuration includes a Technical Preview of Hive 2 LLAP. |
| EDW - ETL | Hive 1.2.1, Spark 2.1 | This cluster configuration includes a Technical Preview of Spark 2.1. |
| BI | Druid 0.9.2 | This cluster configuration includes a Technical Preview of Druid. |

For a list of all available HDP 2.5 and HDP 2.6 configurations, refer to Cluster Configurations.

To get started with Spark 2.1, see the TRY APACHE SPARK 2.1 AND ZEPPELIN IN HORTONWORKS DATA CLOUD blog post.


TP #1.10 (Dec 13, 2016)

Compute Nodes

In addition to master and worker nodes, you can optionally add compute nodes to your cluster for running data processing tasks. Compute nodes can run on standard on-demand instances or on spot instances. For more information, refer to updated Architecture and Creating a Cluster documentation.

Spot Pricing

When creating a cluster you can optionally use spot instances as compute nodes. For more information, refer to updated Creating a Cluster and new Using Spot Instances documentation.
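
When deciding on a bid price, a quick look at recent spot prices can help; the instance type, product description, and start time below are placeholders.

    # Show recent spot prices for a candidate compute-node instance type.
    aws ec2 describe-spot-price-history \
      --instance-types m4.4xlarge \
      --product-descriptions "Linux/UNIX" \
      --start-time 2016-12-13T00:00:00 \
      --max-items 5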

Node Recipes

When creating a cluster, you can optionally upload one or more node recipes - scripts which will be executed on specific node(s) before or after the Ambari cluster installation. You can use recipes for tasks such as installing additional software or performing advanced cluster configuration. For more information, refer to updated Creating a Cluster and Using the CLI documentation.
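
For illustration, a minimal recipe might look like the following; it assumes a yum-based image and simply installs an extra OS package on the nodes to which the recipe is assigned.

    #!/bin/bash
    # Example recipe: install an additional package on the node.
    # Exit on the first error so a failed step is easy to spot.
    set -e
    yum install -y jq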


TP #1.7 (Oct 5, 2016)

CLI: Cluster Create with Custom Properties

The HDC CLI allows you to specify custom cluster configurations. For example:

"Configurations": [
    {
      "core-site": {
        "fs.trash.interval": "5000"
      }
    }
  ]

Run hdc create-cluster generate-cli-skeleton --help to see the updated JSON template. This functionality is similar to Custom Properties already available when creating a cluster through the UI.
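
One possible end-to-end flow is sketched below. The --cli-input-json flag mirrors the AWS CLI convention and is an assumption here; check hdc create-cluster help for the exact way to pass the edited skeleton.

    # Generate the JSON skeleton, add the "Configurations" block shown above, then create the cluster.
    hdc create-cluster generate-cli-skeleton > create-cluster.json
    # ...edit create-cluster.json...
    hdc create-cluster --cli-input-json file://create-cluster.json   # flag name is an assumption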

Cloud Controller with Amazon RDS

Added support for configuring the cloud controller with an existing Amazon RDS instance. See Advanced Launch Options for more information.


TP #1.6 (Sep 21, 2016)

Command Line Interface (CLI)

Provide a Command Line Interface (CLI) for interacting with the cloud controller to manage clusters. See Installing the CLI for more information.
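
As a minimal first session once the CLI is installed, something like the following applies; hdc configure appears elsewhere in these notes, while the cluster listing command name is an assumption, so use hdc --help for the real command list.

    # Store the cloud controller URL and your credentials in the CLI configuration file (~/.hdc/config).
    hdc configure
    # List existing clusters (command name is an assumption -- see `hdc --help`).
    hdc list-clusters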


TP #1.5 (Sep 7, 2016)

No new features.


TP #3-1.4 (Aug 19, 2016)

Existing VPC Support

Provide a new CloudFormation template for launching the Cloud Controller that allows you to select an existing VPC and subnet for installing the Controller instance. Also, when creating a cluster, you can optionally choose to install into an existing VPC and subnet; the default is to install into the same VPC as the Controller instance. See Creating a Cluster for more information.

Hive Metastore Management

Provide a way to centrally register Hive Metastores (backed by Amazon RDS) that can be re-used between clusters. See Managing Hive Metastores for more information on registering Hive Metastores.

Cluster Create Notification

When creating a cluster, you can optionally specify that you want to receive an email notification when the cluster creation is complete.


TP #2 (Jul 2016)

Cluster Template Management

After creating a cluster, you can optionally save it as a Cluster Template to re-use when creating future clusters. From the Main Navigation Menu, you can view the details of cluster templates or delete them.

Cluster Create with Custom Properties

When creating a cluster, you can optionally include custom cluster configuration properties. This allows you to set configurations such as core-site automatically. See Custom Properties for more information.

[
  {
    "configuration-type" : {
      "property-name" : "property-value",
      "property-name2" : "property-value"
    }
  },
  {
    "configuration-type2" : {
      "property-name" : "property-value"
    }
  }
]

From the menu you can get to the Cluster home page, manage Cluster Templates and view Cluster History.

Cluster Details Page Improvements

On the Cluster Details page, you can see more information about the Master and Worker nodes, including IP address and SSH Connection details.

Platform Improvements

Known Issues

Note

If the issue that you are experiencing is not listed below, check the Troubleshooting documentation.

General

Problem (BUG-77857): Clock on the Cluster Details Page Shows 0:00

In some circumstances, the clock on the cluster details page may show 0:00 as the running time, even if more time has elapsed.

Workaround: Go back to the cluster dashboard. This will refresh the clock.

Problem (BUG-81014): Cluster Template Uses Default Storage Parameters

When creating a cluster from a template, if the template includes non-default storage settings under HARDWARE AND STORAGE > SHOW ADVANCED OPTIONS, the create cluster form may pre-populate default storage settings such as storage type, storage size, storage count, and encryption.

Workaround: Review the create cluster form and update the storage settings before submitting a request to create a cluster.

Problem (BUG-73946): Cluster History Doesn't Show HDP Version for Some Clusters

The History page may not show HDP Version for some clusters.

Workaround: There is no workaround.


CLI

Problem (BUG-67740): CLI Option 'generate-cli-shared-skeleton' Doesn't Work

The hdc create-cluster generate-cli-shared-skeleton option doesn't work.

Workaround: Ignore this option. Use generate-cli-skeleton instead.

Problem (BUG-69444): Connection Error After Password Reset

After resetting your password, you will get the following error when running hdc configure to update the password in the CLI configuration file:

Error while connecting to https://ec2-35-156-72-90.eu-central-1.compute.amazonaws.com/identity/oauth/authorize as user: test@hortonworks.com, please check your username and password. (406 Not Acceptable)

Workaround:

  1. Edit the CLI configuration file directly with sudo. For example:
    sudo vi /home/cloudbreak/.hdc/config
  2. Change the owner of the .hdc directory manually. For example:
    sudo chown -R cloudbreak:cloudbreak /home/cloudbreak/.hdc/

After performing these steps, hdc configure and other CLI commands will start working.

Zeppelin

Problem (BUG-64108): The Ambari Zeppelin View Doesn't Load

Workaround: You can access the Zeppelin UI using the links in the cloud controller UI. Alternatively, you can access the Zeppelin UI using the following URL: ZEPPELIN_HOST_IP:9995, where ZEPPELIN_HOST_IP is the IP address of the EC2 instance running Zeppelin.


Behavioral Changes

| Fix version | JIRA | Description |
|---|---|---|
| TP #1.13 | BUG-48261 | Ports 443, 8080, and 9995 are no longer opened on the security groups. See [Security > Network](security-network.md) for the current security group port settings. |
| GA #1.11 | BUG-70460 | Ports 18080 and 18081 (Spark History Server UI) are no longer opened when a cluster is created. Port 10000 (HiveServer2 JDBC endpoint) is no longer opened when a cluster is created via the CLI. |
| GA #1.11 | BUG-70460 | The Spark History Server UI link is no longer available in the cloud controller. |
| TP #1.10 | BUG-69167 | The m4.4xlarge instance type (16 vCPU, 64.0 GB memory) is now the default for the master node. |
| TP #1.10 | BUG-67305 | Updated the JDK for the HDP service from openjdk-1.8.0_77 to openjdk-1.8.0_111. |
| TP #1.10 | BUG-67379 | The History page generates a report that includes the current day's cluster activity. In previous versions, the current day's cluster activity was not included. |
| TP #1.7 | BUG-66493 | In the controller UI navigation menu, the HIVE METASTORES item has been renamed to SHARED SERVICES. |
| TP #1.5 | RMP-7104 | If you launch the hortonworks-data-cloud-hdp AMI without using the Hortonworks Data Cloud Controller Service CFN template, you will see the following message after you SSH to the EC2 instance: "This EC2 instance has not been launched via the 'Hortonworks Data Cloud Controller Service' and hence is not configured for use. Terminate this EC2 instance and use one of the 'Hortonworks Data Cloud Controller Service' options to launch the controller which will use this AMI to create clusters with HDP Services." Terminate the instance and launch the Hortonworks Data Cloud Controller Service using the CFN template. See Launch Cloud Controller. |
| TP #1.5 | BUG-64737 | Introduces a new RDS connection format. When registering a new Hive Metastore and entering a new RDS connection URL, you need to prefix the URL with //. For example: //rv-hive.cvumy8guwdk5.us-east-1.rds.amazonaws.com:5432/hivedb |
| TP #1.5 | BUG-64737 | The Cloudbreak log shows more information on the JDBC connection to aid in debugging. |
| TP #1.5 | BUG-64482 | The Get Help link was moved from the page footer to the main navigation menu. |
| TP #1.5 | BUG-65297 | Upgraded clusters to use Ambari 2.4.0.1. |

Fixed Issues

| Fix version | JIRA | Description |
|---|---|---|
| TP 1.15 | BUG-77211 | Fixed the cluster cloning issue where some parameters, such as Cluster Type and parameters related to autoscaling, were not pre-populated correctly. |
| GA 1.14.4 | BUG-76991 | Fixed issues with Ambari Hive View 2.0. |
| TP 1.14 | BUG-74985 | Fixed the CLI issue where the list and describe clusters commands threw an exception if an auto node repair was in progress. |
| TP 1.13.1 | n/a | Upgraded a dependent component in order to address a network configuration issue. |
| TP 1.13 | BUG-71907 | Stack creation no longer fails if the Admin Password includes `:`. |
| TP 1.13 | BUG-71042 | Fixed the issue where links on the cluster tile sometimes erroneously included a private IP address and, as a result, didn't work. |
| TP 1.13 | BUG-72780 | You can now register a Hive Metastore with HDP 2.6. |
| TP 1.13 | BUG-72225 | You can now access the Files and Tez Views from the Ambari Web UI in HDP 2.6 cluster configurations. |
| TP 1.13 | BUG-70561 | Fixed the Livy Interpreter 400 HttpClientError. |