Hortonworks Data Cloud (HDCloud) uses AWS security resources such as VPCs, security groups, and IAM roles to secure your clusters. In addition, HDCloud imposes network restrictions through a protected gateway and provides authenticated endpoints for cluster services and UIs.
The following diagram illustrates general HDCloud security architecture:
As the diagram illustrates, HDCloud security architecture ensures:
- Network isolation via user-configured VPCs and subnets. Read more about Virtual Private Cloud.
- Network security via out-of-the-box security group settings and a protected gateway through which all traffic is routed. This avoids the need to open multiple ports and protocols for each individual service. Read more about Network Security.
- Authenticated endpoints for all supported services and UIs. Read more about Authentication.
- Controlled use of AWS resources using IAM roles. Read more about IAM Roles.
Security Best Practices and Checklist
Security Best Practices
Follow these best practices to ensure security of your AWS environment:
- Get familiar with Amazon's Best Practices for Managing AWS Access Keys.
- Use "IAM Instance Data" to authenticate applications running on EC2 VMs.
- Keep separate accounts and credentials for different users and applications.
- Install git-secrets and use it to scan your git repositories and history for AWS keys. If found, even in the history, the keys must be considered compromised and revoked.
- Use Hadoop credential files to keep the Amazon S3 credentials outside of configuration files.
- When issuing AWS secrets to applications and users, create session tokens with limited lifetimes.
- Users, especially those with administrative accounts, should use multi-factor authentication.
- Have a process for updating keys and secrets.
- Have a process for revoking keys. The AWS Command Line Interface tools can be used for this.
- If your AWS usage is known, set up billing alerts to raise notifications when the usage goes significantly above the normal range.
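As a rough illustration of the kind of scan that git-secrets performs, the following Python sketch searches text for strings that look like AWS access key IDs. The pattern is a simplified assumption (only the `AKIA` and `ASIA` prefixes followed by 16 uppercase letters or digits), and the function name `find_access_key_ids` is hypothetical. It is not a replacement for git-secrets, which also inspects repository history.

```python
import re

# Simplified, assumed pattern: an AKIA or ASIA prefix followed by
# 16 uppercase letters or digits. Real scanners cover more prefixes
# and also look for secret keys.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_access_key_ids(text):
    """Return (line_number, matched_string) pairs for suspected key IDs."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Example input using the placeholder key ID from AWS documentation.
sample = "\n".join([
    "fs.s3a.access.key=AKIAIOSFODNN7EXAMPLE",
    "fs.s3a.endpoint=s3.amazonaws.com",
])

for lineno, key_id in find_access_key_ids(sample):
    print(f"line {lineno}: possible access key ID {key_id}")
```

Any match, including one found only in old commits, should be treated as described above: consider the key compromised and revoke it.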
Checklist for Vulnerabilities
The following checklist will help you optimize your AWS environment for security:
[ ] Application does not have hard-coded credentials in source.
[ ] Credentials used are not "root" credentials, but those of a subsidiary account with fewer rights.
[ ] Users have git secrets installed and configured for pre-commit scanning and validation of source files.
[ ] Credentials are not in a configuration file which is checked in to any SCM repository.
[ ] Documentation/example-code does not include the credentials.
[ ] Documentation/example-code does not include the name of a bucket used in production systems.
[ ] URIs to Amazon S3 filesystems do not contain the credentials in the URI itself.
[ ] Credentials are not logged. This can be done inadvertently if the credentials are kept in a configuration file and the file or a Hadoop configuration derived from it is dumped to a log.
[ ] Credentials are not included in bug reports, especially those to public repositories.
[ ] Different Amazon S3 buckets are used to restrict access by account.
[ ] After manipulating access to an S3 bucket, attempts are made to log in as unauthenticated/unauthorized users so as to verify that access really is restricted.
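The checklist item about credentials in Amazon S3 URIs can be checked mechanically. The sketch below is one possible first-pass check in Python, assuming URIs of the form `s3a://KEY:SECRET@bucket/path`. The function name `uri_embeds_credentials`, the bucket name, and the key values are hypothetical placeholders.

```python
from urllib.parse import urlsplit


def uri_embeds_credentials(uri):
    """Return True if the URI has a user or password part before the host."""
    parts = urlsplit(uri)
    return parts.username is not None or parts.password is not None


# Hypothetical URIs: the first is clean, the second embeds a key pair.
uris = [
    "s3a://my-bucket/data/file.csv",
    "s3a://ACCESSKEY:SECRETKEY@my-bucket/data/file.csv",
]

flagged = [u for u in uris if uri_embeds_credentials(u)]
for uri in flagged:
    print("credentials embedded in URI:", uri)
```

A secret that contains a `/` character would not parse cleanly as a URI, so treat this as a quick screen of configuration files and job arguments rather than a complete audit.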