Welcome to the Cloudbreak on the Azure Marketplace Technical Preview documentation!
Cloudbreak on the Azure Marketplace allows you to provision HDP and HDF clusters on Azure using the Microsoft Azure infrastructure.
Cloudbreak is a tool that simplifies the provisioning, management, and monitoring of on-demand HDP clusters in virtual and cloud environments. It leverages cloud infrastructure to create host instances, and uses Apache Ambari via Ambari blueprints to provision and manage Hortonworks clusters.
Primary Use Cases
Cloudbreak allows you to create, manage, and monitor your clusters on your chosen cloud platform:
- Dynamically deploy, configure, and manage clusters on public and private clouds (AWS, Azure, Google Cloud, OpenStack).
- Use automated scaling to seamlessly manage elasticity requirements as cluster workloads change.
- Secure your cluster by enabling Kerberos.
The following graphic illustrates the high-level architecture of Cloudbreak on the Azure Marketplace:
When you launch Cloudbreak, a new resource group is created and the following Azure resources are provisioned within it:
- Virtual network (VNet) securely connects Azure resources to each other. If you chose to use an existing virtual network, it is not added to the resource group.
- Network security group (NSG) defines inbound and outbound security rules, which control network traffic flow.
- Virtual machine runs Cloudbreak.
- Public IP address is assigned to your VM so that it can communicate with other Azure resources.
- Network interface (NIC) attached to the VM provides the interconnection between the VM and the underlying software network.
- Blob storage container stores the data of the Cloudbreak Deployer OS disk.
For each created cluster, a new resource group is created and the following Azure resources are provisioned within it:
- A new virtual network (unless you've chosen to use an existing network)
- For each host group: a network security group
- For each node: a VM, an IP address and a network interface
- A new Blob storage container for the OS data of the host groups
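The per-cluster resource list above can be turned into a quick tally, which may help when estimating what a cluster adds to your subscription. The following is a minimal sketch only: `ClusterShape`, `count_cluster_resources`, and the host group names are hypothetical and are not part of Cloudbreak or any Azure SDK. It assumes exactly one resource group and one Blob storage container per cluster, as described above.

```python
from dataclasses import dataclass


@dataclass
class ClusterShape:
    """Hypothetical cluster description: host group name -> number of nodes."""
    host_groups: dict[str, int]
    use_existing_vnet: bool = False


def count_cluster_resources(shape: ClusterShape) -> dict[str, int]:
    """Tally the Azure resources described in the list above for one cluster."""
    nodes = sum(shape.host_groups.values())
    return {
        "resource_groups": 1,
        # No new virtual network is created when an existing one is reused.
        "virtual_networks": 0 if shape.use_existing_vnet else 1,
        # One network security group per host group.
        "network_security_groups": len(shape.host_groups),
        # One VM, one IP address, and one network interface per node.
        "virtual_machines": nodes,
        "ip_addresses": nodes,
        "network_interfaces": nodes,
        "blob_storage_containers": 1,
    }


if __name__ == "__main__":
    # Illustrative host groups; the names and sizes are assumptions.
    shape = ClusterShape(host_groups={"master": 1, "worker": 3, "compute": 2})
    for resource, count in count_cluster_resources(shape).items():
        print(f"{resource}: {count}")
```

Passing `use_existing_vnet=True` drops the virtual network from the tally, mirroring the exception noted in the list.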
You can launch Cloudbreak and provision your clusters in all regions supported by Microsoft Azure.
Default Cluster Configurations
Cloudbreak includes default cluster configurations (in the form of blueprints) and supports using your own custom cluster configurations (in the form of custom blueprints).
The following default cluster configurations are available:
Platform Version: HDP 2.6
|Cluster Type||Main Services||Description||List of All Services Included|
|Data Science||Spark 2,||Useful for data science with Spark 2 and Zeppelin.||HDFS, YARN, MapReduce2, Tez, Hive, Pig, Sqoop, ZooKeeper, Ambari Metrics, Spark 2, Zeppelin|
|EDW - Analytics||Hive 2 LLAP,||Useful for EDW analytics using Hive LLAP.||HDFS, YARN, MapReduce2, Tez, Hive 2 LLAP, Druid, Pig, ZooKeeper, Ambari Metrics, Spark 2|
|EDW - ETL||Hive,||Useful for ETL data processing with Hive and Spark 2.||HDFS, YARN, MapReduce2, Tez, Hive, Pig, ZooKeeper, Ambari Metrics, Spark 2|
Platform Version: HDF 3.1
|Cluster Type||Main Services||Description||List of All Services Included|
|Flow Management||NiFi||Useful for flow management with NiFi.||NiFi, NiFi Registry, ZooKeeper, Ambari Metrics|
|Messaging Management||Kafka||Useful for messaging management with Kafka.||Kafka, ZooKeeper, Ambari Metrics|
The following configuration classification applies:
- Stable configurations are the best choice if you want to avoid problems with launching and using clusters.
- Configurations with Technical Preview components are the ones to use if you want a Technical Preview version of a component in a release of HDP/HDF.
- The most cutting-edge configurations include Technical Preview components in a Technical Preview HDP/HDF release.
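Because cluster configurations are expressed as Ambari blueprints, it can help to see the general shape of one. The sketch below builds a minimal blueprint-shaped dictionary and prints it as JSON. The top-level `Blueprints` and `host_groups` keys follow the general Ambari blueprint layout, but the helper `make_blueprint`, the blueprint name, the host group names, and the component placement are illustrative assumptions. This is not one of the default configurations listed above, and a usable custom blueprint would need a complete set of components.

```python
import json


def make_blueprint(name: str, stack_name: str, stack_version: str,
                   host_groups: dict[str, list[str]]) -> dict:
    """Build a minimal Ambari-blueprint-shaped dict.

    host_groups maps a host group name to the component names placed on it.
    """
    return {
        "Blueprints": {
            "blueprint_name": name,
            "stack_name": stack_name,
            "stack_version": stack_version,
        },
        "host_groups": [
            {
                "name": group,
                "components": [{"name": component} for component in components],
            }
            for group, components in host_groups.items()
        ],
    }


if __name__ == "__main__":
    # Hypothetical, incomplete layout used only to show the structure.
    blueprint = make_blueprint(
        name="example-custom-blueprint",
        stack_name="HDP",
        stack_version="2.6",
        host_groups={
            "master": ["NAMENODE", "RESOURCEMANAGER", "ZOOKEEPER_SERVER"],
            "worker": ["DATANODE", "NODEMANAGER"],
        },
    )
    print(json.dumps(blueprint, indent=2))
```

Each host group in the blueprint corresponds to a group of nodes that run the same set of components.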
To quickly get started with Cloudbreak on the Azure Marketplace:
- Meet the prerequisites
- Launch a Cloudbreak instance, log in to the Cloudbreak UI, and create a Cloudbreak credential
- Create a cluster
The Cloudbreak software runs in your Azure environment. You are responsible for the Azure charges incurred by running Cloudbreak and the clusters that it manages. To learn more about Azure pricing, refer to Azure documentation.