Releases: aws/aws-parallelcluster-cookbook
AWS ParallelCluster v2.4.0
We're excited to announce the release of AWS ParallelCluster Cookbook 2.4.0.
This is associated with AWS ParallelCluster v2.4.0.
Enhancements
- Add support for EFA on CentOS 7, Amazon Linux and Ubuntu 16.04
- Add support for Ubuntu in China region cn-northwest-1
Changes
- SGE: changed following parameters in global configuration
  - max_unheard 00:03:00: allows a faster reaction in case of faulty nodes
  - reschedule_unknown 00:00:30: enables rescheduling of jobs running on failing nodes
  - qmaster_params ENABLE_FORCED_QDEL_IF_UNKNOWN: forces job deletion on unresponsive nodes
  - qmaster_params ENABLE_RESCHEDULE_KILL: forces rescheduling or killing of jobs running on failing nodes
- Slurm: decrease SlurmdTimeout to 120 seconds to speed up replacement of faulty nodes
- Always use full master FQDN when mounting NFS on compute nodes. This solves some issues occurring with some networking setups and custom DNS configurations
- Set soft and hard ulimit on open files to 10000 for all supported OSs
- Pin python supervisor version to 3.4.0
- Remove unused compute_instance_type from jobwatcher.cfg
- Removed unused max_queue_size from sqswatcher.cfg
- Remove double quoting of the post_install args
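The open-files change above can be checked from a running process. The following is a minimal sketch using Python's standard `resource` module (Unix only); the helper names `open_file_limits` and `meets_minimum` are illustrative and not part of the cookbook.

```python
import resource

MIN_OPEN_FILES = 10000  # value quoted in the release note


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def meets_minimum(limits, minimum=MIN_OPEN_FILES):
    """True if every limit is unlimited or at least `minimum`."""
    return all(
        value == resource.RLIM_INFINITY or value >= minimum
        for value in limits
    )


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"soft={soft} hard={hard} ok={meets_minimum((soft, hard))}")
```

Running this on a compute node would show whether the limits seen by that process match the value in the note; limits can differ between login shells and daemons, so the result only describes the process that runs the check.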
Bug Fixes
- Fix issue that was preventing Torque from being used on CentOS 7
- Start node daemons at the end of instance initialization. The time spent for post-install script and node initialization is not counted as part of node idletime anymore.
- Fix issue which was causing an additional and invalid EBS mount point to be added in case of multiple EBS
- Install Slurm libpmpi/libpmpi2 that is distributed in a separate package since Slurm 17
Support
Need help / have a feature request?
AWS Support: https://console.aws.amazon.com/support/home
ParallelCluster Issues tracker on GitHub: https://github.com/aws/aws-parallelcluster
The HPC Forum on the AWS Forums page: https://forums.aws.amazon.com/forum.jspa?forumID=192
AWS ParallelCluster 2.3.1
We're excited to announce the release of AWS ParallelCluster Cookbook 2.3.1.
This is associated with AWS ParallelCluster v2.3.1.
Enhancements
- FSx Lustre - add support in Amazon Linux
Changes
- Slurm - upgrade to version 18.08.6.2
- Slurm - declare nodes in separate config file and use FUTURE for dummy nodes
- Slurm - set ReturnToService=1 in scheduler config in order to recover instances that were initially marked as down due to a transient issue.
- NVIDIA - update drivers to version 418.56
- CUDA - update toolkit to version 10.0
- Increase default EBS volume size from 15GB to 17GB
- Add LocalHostname to COMPUTE_READY events
- Pin future, retrying and six packages in Ubuntu 14.04
- Add stackname and max_queue_size to sqswatcher configuration
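To confirm that a setting such as ReturnToService=1 is present in a scheduler config, a small parser is enough. The sketch below assumes one Key=Value pair per line; the function name `parse_scheduler_config` and the sample text are hypothetical, and lines that carry several pairs (as node definitions do) are not handled.

```python
def parse_scheduler_config(text):
    """Parse Key=Value lines, skipping blank lines and '#' comments."""
    settings = {}
    for raw_line in text.splitlines():
        # Drop anything after a '#' and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical fragment; only ReturnToService=1 comes from the release note.
SAMPLE = """
# scheduler settings
ReturnToService=1
"""

if __name__ == "__main__":
    settings = parse_scheduler_config(SAMPLE)
    print(settings.get("ReturnToService"))
```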
AWS ParallelCluster 2.2.1
We're excited to announce the release of AWS ParallelCluster Cookbook 2.2.1.
This is associated with AWS ParallelCluster v2.2.1.
Features
- Support for FSx Lustre with CentOS 7
- Check AWS EC2 account limits before starting cluster creation
- Allow users to force job deletion with SGE scheduler
Changes
- Set default value to compute for placement_group option
- pcluster ssh: use private IP when the public one is not available
- pcluster ssh: now works also when stack is not completed as long as the master IP is available
Bugfixes
- awsbsub: fix file upload with absolute path
- pcluster ssh: fix issue that was preventing the command from working correctly when stack status is UPDATE_ROLLBACK_COMPLETE
- Fix block device conversion to correctly attach EBS nvme volumes
- Wait for Torque scheduler initialization before completing master node setup
- pcluster version: now works also when no ParallelCluster config is present
- Improve nodewatcher daemon logic to detect if a SGE compute node has running jobs
AWS ParallelCluster 2.1.1
We're excited to announce the release of AWS ParallelCluster Cookbook 2.1.1.
This is associated with AWS ParallelCluster v2.1.1.
Features
- Support for AWS Beijing Region (cn-north-1) and Ningxia Region (cn-northwest-1)
Bugfixes
- No longer schedule jobs on compute nodes that are terminating
AWS ParallelCluster v2.1.0
We're excited to announce the release of AWS ParallelCluster Cookbook 2.1.0.
This is associated with AWS ParallelCluster v2.1.0.
Features
- Support for Elastic File System (EFS)
- AWS Batch Multinode Parallel support
- Support for RAID 0 and 1 EBS Volumes
- Support for AWS Stockholm Region (eu-north-1)
Bugfixes
- No longer schedule jobs on compute nodes that are terminating
AWS ParallelCluster v2.0.2
We're excited to announce the release of AWS ParallelCluster Cookbook 2.0.2.
This is associated with AWS ParallelCluster v2.0.2.
Features
- Support for new GovCloud region us-gov-east-1
Bugfixes
- Fix regression with shared_dir parameter in the cluster configuration section.
- Fixed issue with jq that prevented customers from using extra_json
- Fixed issue with awscli version on ubuntu1404
AWS ParallelCluster v2.0.0
We're excited to announce the release of AWS ParallelCluster Cookbook 2.0.0!
This is associated with AWS ParallelCluster v2.0.0.
Features
- AWS Batch integration
- Multiple EBS Volumes
- Support for custom AMIs
Support
Need help / have a feature request?
AWS Support: https://console.aws.amazon.com/support/home
ParallelCluster Issues tracker on GitHub: https://github.com/aws/aws-parallelcluster (note: cookbook issues have moved to the main package; please create new issues there)
The HPC Forum on the AWS Forums page: https://forums.aws.amazon.com/forum.jspa?forumID=192
CfnCluster v1.6.0
This is a release of the cfncluster-cookbook v1.6.0, associated with CfnCluster v1.6.0.
Features:
- Refactor scaling up to take into account the number of pending/requested jobs/slots and instance slots.
- Refactor scaling down to scale down faster and take advantage of per-second billing.
- Add scaledown_idletime parameter as part of scale-down refactoring
- Lock hosts before termination to ensure removal of dead compute nodes from host list
- Fix HTTP proxy support
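The scale-up item above relates pending slots to the slots each instance provides. As an illustration only, and not the cookbook's actual logic, the arithmetic can be sketched as a ceiling division; the function name `instances_to_request` and its parameters are assumptions made for this example.

```python
import math


def instances_to_request(pending_slots, slots_per_instance):
    """Smallest number of instances whose slots cover the pending slots.

    Illustrative arithmetic only: a real scaler would also consider
    instances already running or requested and any cluster size cap.
    """
    if slots_per_instance <= 0:
        raise ValueError("slots_per_instance must be positive")
    return math.ceil(max(pending_slots, 0) / slots_per_instance)


if __name__ == "__main__":
    # 10 pending slots on 4-slot instances need 3 instances.
    print(instances_to_request(10, 4))
```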
CfnCluster v1.5.4
This is a release of the cfncluster-cookbook v1.5.4, associated with CfnCluster v1.5.4.
Features:
- Set SGE Accounting summary to be true; this reports a single accounting record for an MPI job
- Add option to disable ganglia
  extra_json = { "cfncluster" : { "ganglia_enabled" : "no" } }
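Since the extra_json value must be valid JSON, it can be safer to generate the line than to type it by hand. This is a minimal sketch using Python's standard `json` module; the nested structure is copied from the note above, while the helper name `extra_json_line` is hypothetical.

```python
import json

# Structure copied from the release note above.
DISABLE_GANGLIA = {"cfncluster": {"ganglia_enabled": "no"}}


def extra_json_line(settings):
    """Render a dict as an `extra_json = ...` config line."""
    return "extra_json = " + json.dumps(settings)


if __name__ == "__main__":
    print(extra_json_line(DISABLE_GANGLIA))
```

Generating the value this way guards against unbalanced braces or missing quotes; where the line belongs in the cluster configuration is not covered by this sketch.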
CfnCluster v1.5.2
This is a release of the cfncluster-cookbook v1.5.2, associated with CfnCluster v1.5.2.
Bug fixes:
- Fix bug that prevented c5d/m5d instances from working
- Set CPU as a consumable resource in Slurm config