Cluster Computers

The CS Department maintains a small cluster of 91 machines (2336 cores in total) for general computing. The machines use Slurm for batch job submission. You will need your NetID and NetID password to access the login host (login.cs.duke.edu). The compute cluster machines are on a 3-4 year replacement cycle, so any machine in the cluster should be no more than four years old. You can use the sinfo program to monitor cluster usage. For more information, see the Slurm documentation.
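As a rough illustration of batch submission, the sketch below writes a minimal job script and submits it only when sbatch is present. The file name hello.sbatch, the job name, and every resource value are placeholders chosen for the example, not site defaults; no partition is named because the partitions on this cluster are not listed here.

```shell
#!/bin/bash
# Write a minimal Slurm job script. All values are illustrative assumptions.
cat > hello.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
# %j in the output file name expands to the job ID
#SBATCH --output=hello-%j.out
#SBATCH --cpus-per-task=1
#SBATCH --mem=1G
#SBATCH --time=00:05:00
hostname
EOF

# Submit and inspect the queue only where Slurm is installed (e.g. the login host).
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello.sbatch      # queue the job
    squeue -u "$USER"        # show your pending and running jobs
    sinfo                    # show partition and node state
else
    echo "sbatch not found; run this on login.cs.duke.edu"
fi
```

Once the job finishes, its output should appear in a hello-JOBID.out file in the directory you submitted from.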

N.B. Due to the number of users running computationally intensive
tasks on the login machines, we have put resource limits on user
sessions there. Each user is limited to a single CPU and 4GB of RAM.
Please use the cluster for CPU-intensive jobs.
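One way to move interactive work off the login machines is to ask Slurm for a shell on a compute node. The helper below is a hypothetical sketch: the function name cluster_shell and its default CPU, memory, and time values are assumptions for illustration, not site policy.

```shell
#!/bin/bash
# Hypothetical helper: open an interactive shell on a compute node via Slurm.
# Default resource values are illustrative assumptions.
cluster_shell() {
    local cpus="${1:-4}" mem="${2:-8G}" limit="${3:-01:00:00}"
    if ! command -v srun >/dev/null 2>&1; then
        echo "srun not found; run this on login.cs.duke.edu" >&2
        return 1
    fi
    # --pty attaches your terminal to the shell started on the allocated node.
    srun --cpus-per-task="$cpus" --mem="$mem" --time="$limit" --pty bash -i
}

# Usage: cluster_shell 8 16G 02:00:00
```

Exiting the shell ends the allocation and returns you to the login host.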

Logging in for a terminal session

The terminal session looks like this:

macbook-pro $ ssh netid@login.cs.duke.edu
netid@login.cs.duke.edu's password:
Duo two-factor login for netid
Enter a passcode or select one of the following options:
1. Duo Push to XXX-XXX-1234
2. Phone call to XXX-XXX-1234
3. SMS passcodes to XXX-XXX-1234 (next code starts with: 1)
Passcode or option (1-3): 1
Success. Logging you in...
Last login: Wed May 20 12:21:38 2020 from 174.247.16.115
netid@login~

The rest of your session will proceed as normal. Transferring data with scp prompts for the same password and Duo two-factor steps as logging in via ssh.
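A small sketch of what such transfers might look like follows. The function names, the NETID default, and the paths in the usage lines are all placeholders introduced for this example; only the login host name comes from this page.

```shell
#!/bin/bash
# NETID and all paths are placeholders (assumptions); substitute your own.
NETID="${NETID:-netid}"
LOGIN_HOST="login.cs.duke.edu"

# Copy a local file or directory to the cluster (prompts for password and Duo).
push_to_cluster() {
    scp -r "$1" "${NETID}@${LOGIN_HOST}:$2"
}

# Copy a remote file or directory back to the local machine.
pull_from_cluster() {
    scp -r "${NETID}@${LOGIN_HOST}:$1" "$2"
}

# Usage (remote paths are relative to your home directory):
#   push_to_cluster ./dataset project/
#   pull_from_cluster project/results ./results
```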

Environment modules

Environment modules let you change your shell environment (for example, PATH) on demand. Modules can be loaded and unloaded dynamically and atomically, in a clean fashion. All popular shells are supported, including bash, ksh, zsh, sh, csh, tcsh, fish, and cmd, as well as some scripting languages such as Tcl, Perl, Python, Ruby, CMake, and R. See the environment modules page for details.
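A typical module workflow might look like the sketch below. The name example-module is a placeholder, since the modules installed on this cluster are not listed here; pick a real name from the output of module avail.

```shell
#!/bin/bash
# "example-module" is a placeholder (assumption); choose a real name from `module avail`.
MODULE_NAME="${MODULE_NAME:-example-module}"

if type module >/dev/null 2>&1; then
    module avail                   # list the modules installed on this host
    module load "$MODULE_NAME"     # add the module to the current shell environment
    module list                    # show what is currently loaded
    module unload "$MODULE_NAME"   # undo the load
else
    echo "module command not found; try again from a login shell on the cluster"
fi
```

The module command is usually defined as a shell function at login, so it may be missing from non-interactive scripts unless the shell initialization has been sourced first.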

Globus

Globus is a data management service frequently used in the research community to transfer or share large-scale research data. It is a non-profit service run by the University of Chicago and is available to all Duke users under a standard Globus subscription. Globus is the recommended method for transferring data to and from the Duke Compsci department. Duke Compsci provides the following Globus collections (endpoints):

Compsci Data Transfer Node – This endpoint can support xtmp storage.

For more information, see the CS Globus page.

Singularity Containers

You can use Singularity containers on the CS cluster. See the Singularity page for details.
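As a hedged example of basic container use, the sketch below pulls a public image and runs one command in it. The image (ubuntu:22.04 from Docker Hub) and the output file name are assumptions chosen for illustration; nothing here reflects a site-provided image.

```shell
#!/bin/bash
# Image name and source are illustrative assumptions.
IMAGE="ubuntu_22.04.sif"

if command -v singularity >/dev/null 2>&1; then
    # Build a local SIF image from a Docker Hub image.
    singularity pull "$IMAGE" docker://ubuntu:22.04
    # Run a single command inside the container.
    singularity exec "$IMAGE" cat /etc/os-release
else
    echo "singularity not found; run this on a cluster machine"
fi
```

On a GPU node, adding the --nv flag to singularity exec makes the host's Nvidia driver and devices visible inside the container.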

The cluster consists of the following machine configurations:

GPU Resources:

GPU	CUDA Cores	Tensor Cores	VRAM	Hosts
16 A6000s	10752	336	48GB	compsci-cluster-fitz
24 A5000s	8192	256	24GB	linux[41-44]
10 V100s	5120	640	32GB	gpu-compute[5-7]
26 P100s	3584		12GB	linux[41-50] gpu-compute[4-5]
24 K80s	4992		12GB	gpu-compute[1-3]
30 RTX 2080 Tis	4352		11GB	linux[41-60]
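To run on one of these GPUs, a job normally has to request it from Slurm. The sketch below assumes the GPUs are exposed as a generic resource named gpu, which is the common convention but is not confirmed on this page; the file and job names are placeholders.

```shell
#!/bin/bash
# Write a minimal GPU job script. Assumes the GRES name "gpu" (unconfirmed here).
cat > gpu-test.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=gpu-test
#SBATCH --output=gpu-test-%j.out
#SBATCH --gres=gpu:1
#SBATCH --time=00:05:00
# Report the GPU(s) visible to this job.
nvidia-smi
EOF

if command -v sbatch >/dev/null 2>&1; then
    # List each node with its generic resources to see how GPUs are named.
    sinfo -N -o "%N %G"
    sbatch gpu-test.sbatch
else
    echo "sbatch not found; run this on login.cs.duke.edu"
fi
```

If the sinfo output shows typed resources, a specific model can usually be requested as --gres=gpu:TYPE:1, where TYPE is whatever name the site has configured.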

10x TensorEX TS2-673917-DPN Intel Xeon Gold 6226 Processor, 2.7GHz (768GB RAM - 48 cores). Each of the machines has 2 Nvidia GeForce RTX 2080 Tis.

linux51
linux52
linux53
linux54
linux55
linux56
linux57
linux58
linux59
linux60

10x Tensor TXR231-1000R D126 Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz (512GB RAM - 40 cores). Each of the machines has 2 Nvidia Tesla P100s and 1 Nvidia GeForce RTX 2080 Ti. linux41-44 each also have one A5000.

linux41
linux42
linux43
linux44
linux45
linux46
linux47
linux48
linux49
linux50

N.B. If you need double-precision floating point, use the K80, P100, or V100 GPUs.

3x Quantum TXR430-0512R Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz (256GB RAM - 32 cores) with 10GB interconnects. Each of the machines has 4 Nvidia Tesla K80s.

gpu-compute1
gpu-compute2
gpu-compute3

4x Quantum TXR113-1000R Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz (256GB RAM - 40 cores) with 10GB interconnects. Each of the machines has 4 Nvidia Tesla GPUs (P100s, V100s, or a mix), as listed below.

gpu-compute4 (4x P100s)
gpu-compute5 (2x P100s, 2x V100s)
gpu-compute6 (4x V100s)
gpu-compute7 (4x V100s)

2x Dell R610 with 2x E5540 Xeon Processor, 2.53GHz 8M Cache (96GB RAM - 16 cores)

linux29.cs.duke.edu
linux30.cs.duke.edu

10x Dell R730 with 2 Intel Xeon E5-2640 v4 2.4GHz, 25M Cache (256GB RAM - 40 cores)

linux31.cs.duke.edu
linux32.cs.duke.edu
linux33.cs.duke.edu
linux34.cs.duke.edu
linux35.cs.duke.edu
linux36.cs.duke.edu
linux37.cs.duke.edu
linux38.cs.duke.edu
linux39.cs.duke.edu
linux40.cs.duke.edu

10x Dell R610 with 2 E5640 Xeon Processor, 2.66GHz 12M Cache (64GB RAM - 16 cores)

linux1.cs.duke.edu
linux2.cs.duke.edu
linux3.cs.duke.edu
linux4.cs.duke.edu
linux5.cs.duke.edu
linux6.cs.duke.edu
linux7.cs.duke.edu
linux8.cs.duke.edu
linux9.cs.duke.edu
linux10.cs.duke.edu

10x Dell R620 with 2 Xeon(R) CPU E5-2695 v2 @ 2.40GHz 30M Cache (256GB RAM - 48 hyperthreaded cores)

linux11.cs.duke.edu
linux12.cs.duke.edu
linux13.cs.duke.edu
linux14.cs.duke.edu
linux15.cs.duke.edu
linux16.cs.duke.edu
linux17.cs.duke.edu
linux18.cs.duke.edu
linux19.cs.duke.edu
linux20.cs.duke.edu

8x Dell R610 with 2 E5540 Xeon Processor, 2.53GHz 8M Cache (48GB RAM - 16 cores)

linux21.cs.duke.edu
linux22.cs.duke.edu
linux23.cs.duke.edu
linux24.cs.duke.edu
linux25.cs.duke.edu
linux26.cs.duke.edu
linux27.cs.duke.edu
linux28.cs.duke.edu

20x TensorEX TS2-197278655 with 2 Intel Xeon Ice Lake Gold 5317 Processors, 3.0GHz 18MB Cache (64GB RAM - 12 cores). Each of the machines has 4 Nvidia RTX A5000s.

compsci-cluster-fitz-01.cs.duke.edu
compsci-cluster-fitz-02.cs.duke.edu
compsci-cluster-fitz-03.cs.duke.edu
compsci-cluster-fitz-04.cs.duke.edu
compsci-cluster-fitz-06.cs.duke.edu
compsci-cluster-fitz-07.cs.duke.edu
compsci-cluster-fitz-08.cs.duke.edu
compsci-cluster-fitz-09.cs.duke.edu
compsci-cluster-fitz-10.cs.duke.edu
compsci-cluster-fitz-11.cs.duke.edu
compsci-cluster-fitz-12.cs.duke.edu
compsci-cluster-fitz-13.cs.duke.edu
compsci-cluster-fitz-14.cs.duke.edu
compsci-cluster-fitz-15.cs.duke.edu
compsci-cluster-fitz-16.cs.duke.edu
compsci-cluster-fitz-17.cs.duke.edu
compsci-cluster-fitz-18.cs.duke.edu
compsci-cluster-fitz-19.cs.duke.edu
compsci-cluster-fitz-20.cs.duke.edu
compsci-cluster-fitz-21.cs.duke.edu

4x TensorEX TS2-197278655 with 2 Intel Xeon Ice Lake Gold 5317 Processors, 3.0GHz 18MB Cache (64GB RAM - 12 cores). Each of the machines has 4 Nvidia RTX A6000s.

compsci-cluster-fitz-05.cs.duke.edu
compsci-cluster-fitz-22.cs.duke.edu
compsci-cluster-fitz-23.cs.duke.edu
compsci-cluster-fitz-24.cs.duke.edu

Please be aware that compute cluster machines are not backed up. Copy any important data to filesystems that are backed up to avoid losing it. In addition, keep in mind that this is a shared resource. Please minimize network traffic to shared resources such as disk space. If you need to read and write a lot of data, copy it to local disk, compute the results there, and then store the results on longer-term storage.
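The copy, compute, and store pattern described above can be sketched as follows. The paths are assumptions: SHARED_DIR stands in for network storage and LOCAL_BASE for a node-local disk, and the sort command is only a placeholder for the real computation. The demo input is created so the sketch runs anywhere.

```shell
#!/bin/bash
# Sketch: stage input on local disk, compute there, copy only the results back.
# SHARED_DIR and LOCAL_BASE are assumed locations; adjust both for the real cluster.
set -euo pipefail

SHARED_DIR="${SHARED_DIR:-$PWD/shared-demo}"
LOCAL_BASE="${LOCAL_BASE:-${TMPDIR:-/tmp}}"

# Demo input so the sketch is runnable; a real job would already have its data.
mkdir -p "$SHARED_DIR"
printf '3\n1\n2\n' > "$SHARED_DIR/input.txt"

# Private scratch directory on local disk, removed when the script exits.
SCRATCH="$(mktemp -d "$LOCAL_BASE/job.XXXXXX")"
trap 'rm -rf "$SCRATCH"' EXIT

# One read from shared storage.
cp "$SHARED_DIR/input.txt" "$SCRATCH/"
# Placeholder computation; all intermediate I/O stays on local disk.
sort -n "$SCRATCH/input.txt" > "$SCRATCH/output.txt"
# One write back to shared storage.
cp "$SCRATCH/output.txt" "$SHARED_DIR/output.txt"
```

The trap cleans up the scratch directory even if the computation fails, so local disk does not fill up with abandoned job data.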