Lemur Project Cluster


Physical Configuration

The Lemur Project Cluster can be located on the network at boston.lti.cs.cmu.edu.

Machine Configuration

The Lemur Project Cluster is currently housed in two full-size racks and consists of two head (front-end) nodes, 24 back-end (computing) nodes, two high-speed switches, three JetStor RAID arrays, and one Raid, Inc. high-speed Fibre Channel array.

The front-end nodes are two Dell 2850 servers, each with two 2.8 GHz dual-core Intel Xeon processors and 146 GB of on-board hard drive space (in a mirrored, RAID-1 configuration). Boston-A has 8 GB of RAM and Boston-B has 16 GB.

Each back-end node is a Sun x2100 server with one AMD Opteron 180 dual-core CPU (~2.4 GHz), 4 GB of RAM, and between 80 GB and 500 GB of on-board hard drive space.

The front-end servers are the only machines visible from the outside world. All of the back-end nodes are on a private sub-network, connected through the front-end nodes via Cisco Catalyst 2960 10/100/1000 managed switches.

Disk Configuration

To make our servers easier to use, we keep a consistent format for naming and accessing disk partitions. From any of our servers, a shared volume is accessed as /machine/share. The machines that currently share volumes are:
  - orleans.lti.cs.cmu.edu (/orl)
  - boston.lti.cs.cmu.edu (/bos)

(Note: for historical reasons, the user shares from the old hartford.lti.cs.cmu.edu are available under /oldhar.)

Keeping the share prefixes consistent allows you to write shell scripts that work on several different systems without any changes.
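
As a minimal sketch of the /machine/share convention described above, the following Python helper builds a share path from a machine, a share kind, and a partition index. The function name `share_path`, its parameters, and the lookup table are hypothetical and chosen for illustration; only the prefixes (/orl, /bos) and the share kinds (usr, data, tmp) come from this page.

```python
from pathlib import PurePosixPath

# Machine prefixes listed on this page; the dictionary itself is illustrative.
MACHINE_PREFIXES = {
    "orleans.lti.cs.cmu.edu": "orl",
    "boston.lti.cs.cmu.edu": "bos",
}

# Share kinds described on this page.
SHARE_KINDS = ("usr", "data", "tmp")


def share_path(machine: str, kind: str, index: int) -> PurePosixPath:
    """Build a path of the form /<prefix>/<kind><index> (hypothetical helper)."""
    # Accept either a full hostname or an already-short prefix such as "bos".
    prefix = MACHINE_PREFIXES.get(machine, machine)
    if prefix not in MACHINE_PREFIXES.values():
        raise ValueError(f"unknown machine: {machine!r}")
    if kind not in SHARE_KINDS:
        raise ValueError(f"unknown share kind: {kind!r}")
    if index < 0:
        raise ValueError("partition index must be non-negative")
    return PurePosixPath("/") / prefix / f"{kind}{index}"


if __name__ == "__main__":
    print(share_path("boston.lti.cs.cmu.edu", "usr", 0))  # /bos/usr0
    print(share_path("bos", "tmp", 2))                    # /bos/tmp2
```

Because the path is derived from the prefix rather than hard-coded, the same script can target a different machine by changing a single argument.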

On each machine you may find one or more of the following share types:

  • User Shares: User shares are partitions for user-specific files. These shares are backed up nightly by SCS Facilities. If you have any important data, such as source code or working papers, you should keep it here. User shares are designated by /usr*/ in their name. For example, the first user partition on boston is accessible via /bos/usr0.
     
    One caveat: although we do not enforce user quotas on disk space, the user shares are limited and must be shared among all users. Please do not overload the shared space with data or personal items that you should be storing (and backing up) elsewhere. Be courteous!
     
  • Data Shares: The data shares (/data*/) are set aside for important data that does not often change, such as original datasets (e.g., TREC, TDT). These partitions are not backed up automatically, but can be backed up on request. Use of these partitions is relatively restricted; if you need to store data here, please ask!
     
  • Temporary Shares: The temporary storage space (/tmp*/) makes up the majority of the allocated disk space on our machines. You should keep indexes, large temporary files, or large sets of working files on these partitions. These shares are not backed up, except on request. The assumption is that anything on a /tmp*/ partition can be recreated using programs on a /usr*/ partition and data on a /data*/ partition.
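
The share types above can be summarized in a short sketch that maps a path to its share type and backup policy. The function name `classify` and the regular expression are assumptions made for illustration, not part of any cluster tooling; the policy strings paraphrase the descriptions in the list above.

```python
import re

# Backup policy per share type, paraphrased from the list above.
BACKUP_POLICY = {
    "usr": "backed up nightly",
    "data": "backed up on request",
    "tmp": "not backed up, except on request",
}

# Matches paths such as /bos/usr0 or /orl/tmp3/some/file.
# The exact pattern is an assumption based on the /machine/share convention.
_SHARE_RE = re.compile(
    r"^/(?P<machine>[a-z]+)/(?P<kind>usr|data|tmp)(?P<index>\d+)(?:/|$)"
)


def classify(path: str):
    """Return (machine, kind, index, policy), or None if the path is not a known share."""
    match = _SHARE_RE.match(path)
    if match is None:
        return None
    kind = match.group("kind")
    return (
        match.group("machine"),
        kind,
        int(match.group("index")),
        BACKUP_POLICY[kind],
    )


if __name__ == "__main__":
    print(classify("/bos/usr0/alice/paper.tex"))
    print(classify("/bos/tmp2/index"))
    print(classify("/bos/archive0"))  # not a usr/data/tmp share, so None
```

A check like this could be used before writing output, for example to warn when results that cannot be recreated are about to be placed on a temporary share.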
     

On the Lemur Project Cluster, the disks in the primary RAID array are configured as follows:

/bos/usr0 (92 GB)
/bos/usr1 (92 GB)
/bos/usr2 (92 GB)
/bos/usr3 (92 GB)
/bos/usr4 (92 GB)
/bos/usr5 (92 GB)
/bos/usr6 (92 GB)
/bos/usr7 (92 GB)
/bos/usr8 (92 GB)
/bos/usr9 (92 GB)
/bos/data0 (1.4 TB)
/bos/data1 (1.4 TB)
/bos/data2 (1.0 TB)
/bos/tmp0 (2.7 TB)
/bos/tmp1 (1.1 TB)
/bos/tmp2 (3.8 TB)
/bos/tmp3 (2.7 TB)
/bos/tmp4 (4.3 TB)
/bos/tmp5 (4.3 TB)

Combined total: 22.8 TB

The archive disk in the secondary RAID array is at /bos/archive0 (1.3 TB).
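
For scripts that need to work with partition listings in the format used above, the following sketch parses an entry such as "/bos/usr0 (92 GB)" into a path and a size in gigabytes. The function name `parse_entry` is hypothetical, and the conversion assumes decimal units (1 TB = 1000 GB), which this page does not specify.

```python
import re
from decimal import Decimal

# One listing entry: a path, then a size and unit in parentheses.
_ENTRY_RE = re.compile(
    r"^(?P<path>/\S+)\s+\((?P<size>\d+(?:\.\d+)?)\s*(?P<unit>GB|TB)\)$"
)

# Assumption: decimal units, so 1 TB = 1000 GB.
GB_PER_UNIT = {"GB": 1, "TB": 1000}


def parse_entry(line: str):
    """Parse '/bos/usr0 (92 GB)' into ('/bos/usr0', 92), with the size in GB."""
    match = _ENTRY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized entry: {line!r}")
    # Decimal avoids floating-point rounding for sizes such as 1.1 TB.
    size_gb = int(Decimal(match.group("size")) * GB_PER_UNIT[match.group("unit")])
    return match.group("path"), size_gb


if __name__ == "__main__":
    for entry in ("/bos/usr0 (92 GB)", "/bos/data0 (1.4 TB)", "/bos/archive0 (1.3 TB)"):
        print(parse_entry(entry))
```

The sizes in the listing are rounded, so values parsed this way are approximate and best suited to rough capacity comparisons.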


The Lemur Project Cluster Machine
 

Rear view
