Comparison of cluster software

The following tables compare general and technical information for notable computer cluster software. This software can be roughly divided into four categories: job scheduler, node management, node installation, and integrated stack (all of the above).

General information


Software Maintainer Category Development status Architecture High-Performance/ High-Throughput Computing License Platforms supported Cost Paid support available

| Alchemi

| Amoeba

MIT

| Base One Foundation Component Library

Proprietary

| Stacki

StackIQ All in one actively developed Master deploys to members High-Performance Various licenses RHEL, CentOS, Oracle, Scientific Linux Varies Yes

| HTCondor ://research.cs.wisc.edu/htcondor/

University of Wisconsin CS department Job/Data Scheduler actively developed Distributed master/execution/submit nodes HTC Apache license v2.0 Unix-like, Windows, Mac OS X Free Yes

| DIET

INRIA, SysFera, Open Source All in one GridRPC, SPMD, Hierarchical and distributed architecture, CORBA HTC/HPC CeCILL Unix-like, Mac OS X, AIX Free

| Ganglia ://ganglia.info/

Monitoring actively developed BSD Unix, Linux, Windows NT/XP/2000/2003/2008, FreeBSD, NetBSD, OpenBSD, DragonflyBSD, Mac OS X, Solaris, AIX, IRIX, Tru64, HPUX. Free

| GreenTea Software

| Gridbus Toolkit

| Globus Toolkit

Globus Alliance, Argonne National Laboratory Job/Data Scheduler actively developed SOA Grid Linux Free

| Grid MP ://www.univaud.com/

Univa (formerly United Devices) Job Scheduler no active development Distributed master/worker HTC/HPC Proprietary Windows, Linux, Mac OS X, Solaris Cost

| grun

Erik Aronesty (Expression Analysis) actively developed master/worker HPC/HTC GPL Linux, Mac OS X, BSD Free

| JPPF

Laurent Cohen (founder) actively developed distributed master/worker and P2P HPC/HTC Apache license v2.0 Windows, Linux, Mac OS X, Solaris, Android Free

| Kubernetes

Google actively developed Apache license v2.0 Linux Free

| LanderCluster

Lander Software Technology Co. Ltd Job Scheduler/Monitoring actively developed Proprietary Windows, Linux, & UNIX platforms Cost

| JSTM

| Apache Mesos ://mesos.apache.org/

Apache actively developed Apache license v2.0 Linux Free Yes (Mesosphere)

| Moab Cluster Suite

Cluster Resources, Inc. Job Scheduler/Monitoring actively developed HPC Proprietary Linux, Mac OS X, Windows, AIX, OSF/Tru-64, Solaris, HP-UX, IRIX, FreeBSD & other UNIX platforms Cost

| Moab Cluster Suite ://www.clusterresources.com/pages/products/moab-cluster-suite/workload-manager.php

Cluster Resources, Inc. Job Scheduler actively developed HPC Proprietary Linux, Mac OS X, Windows, AIX, OSF/Tru-64, Solaris, HP-UX, IRIX, FreeBSD & other UNIX platforms Cost

| Maui Cluster Scheduler Molokini Edition

Job Scheduler HTC/HPC Proprietary Unix-like, Free

| NetworkComputer RTDA

Runtime Design Automation actively developed HTC/HPC Proprietary Unix-like, Windows Cost

| OAR

INRIA and LIG Job Scheduler actively developed HPC/HTC GPL Linux/*nix Free

| OpenLava

Teraproc Job Scheduler actively developed Master/Worker, multiple admin/submit nodes HTC/HPC GPL Linux Free Yes

| PBS Professional

PBS Works (A division of Altair) Job Scheduler/Monitoring actively developed HPC Proprietary Unix, Linux, Windows Cost

| Platform LSF ://www.platform.com

IBM Platform Job Scheduler actively developed HPC/HTC Proprietary Unix, Linux, Windows Cost

| Platform Cluster Manager

IBM Platform All in one actively developed HTC/HPC OpenSource Linux Free

| Rocks Cluster Distribution

Open Source/NSF grant All in one actively developed HTC/HPC OpenSource CentOS Free

| Popular Power

| ProActive

INRIA, ActiveEon, Open Source All in one actively developed Master/Worker, SPMD, Distributed Component Model, Skeletons HTC/HPC GPL Unix-like, Windows, Mac OS X Free

| PRUN

Andrey Budnik Job Scheduler actively developed Master node/exec clients, multiple admin/submit nodes HTC Apache license v2.0 Linux/*nix Free

| RPyC

Tomer Filiba actively developed MIT License *nix/Windows Free

| SLURM

SchedMD Job Scheduler actively developed HPC/HTC GPL Linux/*nix Free Yes

| Oracle Grid Engine ://www.univa.com/oracle.php

Univa Job Scheduler active Development moved to Univa Grid Engine Master node/exec clients, multiple admin/submit nodes HPC/HTC Proprietary *nix/Windows Cost

| Son of Grid Engine ://arc.liv.ac.uk/SGE/

Dave Love Job Scheduler actively developed Master node/exec clients, multiple admin/submit nodes HPC/HTC SISSL *nix/Windows Free

| SynfiniWay

Fujitsu actively developed HPC/HTC ? Unix, Linux, Windows Cost

| TORQUE Resource Manager Torque

Cluster Resources, Inc. Job Scheduler actively developed custom Linux, *nix Free

| UniCloud

Univa All in One (dynamic cluster creation/re-sizing, cloud bursting, etc.) actively developed Proprietary Oracle Unbreakable Linux, RHEL, and CentOS Cost

| UniCluster ://www.univaud.com/hpc/products/unicluster/

Univa All in One Functionality and development moved to UniCloud (see above) Free Yes

| UNICORE

| Univa Grid Engine ://www.univa.com/products/grid-engine.php

Univa Job Scheduler actively developed Master node/exec clients, multiple admin/submit nodes HPC/HTC Proprietary *nix/Windows Cost

| Vaakya ://vaakya.com

Vaakya Technologies Pvt Ltd R&D Technology provider actively developed Cross-Platform, Distributed Computing Architecture Proprietary Windows/Linux Cost

| XGE

| Xgrid

Apple Computer

Table explanation

  • Software: The name of the application that is described

Technical information

Software Implementation Language Authentication Encryption Integrity Global File System Global File System + Kerberos Heterogeneous/ Homogeneous exec node Jobs priority Group priority Queue type SMP aware Max exec node Max job submitted CPU scavenging Parallel job Job checkpointing

| Torque

C SSH, munge None, any Heterogeneous Yes Yes Programmable Yes tested tested Yes Yes Yes (blcr)

| OAR

Perl, OCaml, Ruby SSH None, NFS Heterogeneous Yes Yes Programmable Yes tested 80k tested >20k Yes Yes Yes (blcr)

| OpenLava ://www.teraproc.com/openlava

C/C++ OS authentication None NFS Heterogeneous Linux Yes Yes Configurable Yes Yes, supports preemption based on priority Yes Yes

| Platform LSF ://www.platform.com

Yes Yes, to start jobs. Does it suspend the job when the person comes back? Yes

| HTCondor ://research.cs.wisc.edu/htcondor

C++ GSI, SSL, Kerberos, Password, File System, Remote File System, Windows, Claim To Be, Anonymous None, Triple DES, BLOWFISH None, MD5 None, NFS, AFS Not official, hack with ACL and NFS4 Heterogeneous Yes Yes Fair-share with some programmability basic (hard separation into different nodes) tested ~10000? tested ~100000? Yes MPI, OpenMP, PVM Yes

| Slurm ://slurm.schedmd.com

C Munge, None, Kerberos Heterogeneous Yes Yes Multifactor Fair-share yes tested 120k tested 100k No Yes Yes (blcr)

| Univa Grid Engine ://www.univa.com/products/grid-engine.php

C OS Authentication/Kerberos/OAuth2 Certificate Based Integrity Arbitrary, e.g. NFS, Lustre, HDFS, AFS AFS Fully heterogeneous Yes; automatically policy controlled (e.g. fair-share, deadline, resource dependent) or manual Yes; can be dependent on user groups as well as projects and is governed by policies Batch, interactive, checkpointing, parallel and combinations Yes, with core binding, GPU and Intel Xeon Phi support commercial deployments with many tens of thousands of hosts >300K tested in commercial deployments Yes; can suspend job on interactive usage Yes, with support of arbitrary parallel environments such as OpenMPI, MPICH 1/2, MVAPICH 1/2, LAM, etc. Yes, with support for user, kernel or library level checkpointing environments

Table explanation

  • Software: The name of the application that is described
  • SMP aware:
    • basic: hard split into multiple virtual hosts
    • basic+: hard split into multiple virtual hosts, with minimal or incomplete communication between virtual hosts on the same computer
    • dynamic: split the computer's resources (CPU/RAM) on demand
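
The difference between the "basic" and "dynamic" SMP-aware modes can be illustrated with a small Python sketch. It is not taken from any of the schedulers listed here: the names `Node`, `basic_split`, `fits_basic` and `DynamicNode`, and the 8-CPU / 32 GB example node, are hypothetical and chosen only for illustration. The sketch assumes that in "basic" mode a job must fit entirely inside one fixed virtual host, while in "dynamic" mode it draws from a single shared pool.

```python
from dataclasses import dataclass


@dataclass
class Node:
    cpus: int
    ram_gb: int


def basic_split(node: Node, parts: int) -> list[Node]:
    """Basic mode: carve the node into fixed, equally sized virtual hosts."""
    return [Node(node.cpus // parts, node.ram_gb // parts) for _ in range(parts)]


def fits_basic(virtual_hosts: list[Node], cpus: int, ram_gb: int) -> bool:
    """In basic mode a job must fit entirely inside one virtual host."""
    return any(h.cpus >= cpus and h.ram_gb >= ram_gb for h in virtual_hosts)


class DynamicNode:
    """Dynamic mode: CPU and RAM are handed out on demand from one shared pool."""

    def __init__(self, node: Node) -> None:
        self.free_cpus = node.cpus
        self.free_ram_gb = node.ram_gb

    def allocate(self, cpus: int, ram_gb: int) -> bool:
        # Grant the request only if enough of both resources is still free.
        if cpus <= self.free_cpus and ram_gb <= self.free_ram_gb:
            self.free_cpus -= cpus
            self.free_ram_gb -= ram_gb
            return True
        return False


# Hypothetical execution node: 8 CPUs, 32 GB of RAM.
node = Node(cpus=8, ram_gb=32)
hosts = basic_split(node, parts=4)  # four virtual hosts of 2 CPUs / 8 GB each
pool = DynamicNode(node)

basic_ok = fits_basic(hosts, cpus=6, ram_gb=16)   # too big for any 2-CPU slice
dynamic_ok = pool.allocate(cpus=6, ram_gb=16)     # fits in the shared pool
second_ok = pool.allocate(cpus=4, ram_gb=8)       # only 2 CPUs remain
```

Under these assumptions, a 6-CPU job is rejected by the hard split even though the physical machine has enough cores, whereas the dynamic pool accepts it and then refuses a second 4-CPU request because only 2 CPUs remain free.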

History and adoption

See also

Notes


comparison_of_cluster_software.txt · Last modified: 2016/12/09 22:53 by Mike J. Kreuzer PhD MCSE MCT Microsoft Cloud Ecosystem