Physical Configuration
The Lemur Project Cluster is reachable on the network at boston.lti.cs.cmu.edu.
Machine Configuration
The Lemur Project Cluster is currently housed in two full-size racks and comprises two head nodes, 24 back-end (computing) nodes, two high-speed
switches, three JetStor RAID arrays, and one Raid, Inc. high-speed fiber-channel array.
The front-end nodes consist of two Dell 2850 servers with 2x 2.8 GHz dual-core Intel Xeon processors, 8 GB of RAM in Boston-A and 16 GB in Boston-B, and 146 GB
of on-board hard drive space (in a mirrored RAID-1 configuration).
Each back-end node is a Sun x2100 server with one AMD Opteron 180 dual-core CPU (~2.4 GHz), 4 GB of RAM,
and between 80 GB and 500 GB of on-board hard drive space.
The front-end servers are the only machines that are visible from the outside world. All of the back-end nodes are on a
private sub-network connected through the front-end nodes via Cisco Catalyst 2960 10/100/1GB managed switches.
Disk Configuration
For ease of use from any of our servers, we try to keep a consistent format for naming and accessing our various disk
partitions. A shared volume is accessed from any of our servers as /machine/share.
The machines that currently share volumes are:
- orleans.lti.cs.cmu.edu (/orl)
- boston.lti.cs.cmu.edu (/bos)
(Note: for historical reasons, the user shares from the old hartford.lti.cs.cmu.edu are available from /oldhar)
Keeping the share prefixes consistent allows you to write shell scripts that work on several different
systems without any changes.
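As a rough illustration of the /machine/share convention, the sketch below builds share paths from a machine name. The helper `share_path`, the `SHARE_PREFIXES` dictionary, and the `indexes` subdirectory are hypothetical names chosen for this example; only the two prefixes (/orl and /bos) come from the list above.

```python
from pathlib import PurePosixPath

# Mount prefixes taken from the list above; the dictionary itself is illustrative.
SHARE_PREFIXES = {
    "orleans.lti.cs.cmu.edu": "orl",
    "boston.lti.cs.cmu.edu": "bos",
}


def share_path(machine: str, share: str, *parts: str) -> PurePosixPath:
    """Build /machine/share[/parts...] from a full host name or a short prefix."""
    # Fall back to the given string so a short prefix such as "bos" also works.
    prefix = SHARE_PREFIXES.get(machine, machine)
    return PurePosixPath("/", prefix, share, *parts)


print(share_path("boston.lti.cs.cmu.edu", "usr0"))
print(share_path("bos", "tmp0", "indexes"))  # "indexes" is a made-up subdirectory
```

Because the path depends only on the prefix and the share name, the same script can run on any server that mounts the shares this way.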
On each machine you may find one or more of the following share types:
- User Shares: User shares are partitions for user-specific files. These shares are backed up nightly
  by SCS Facilities. If you have any important data such as source code or working papers, you should
  keep it here. The user shares are designated by /usr*/ in their name. For example, the first user
  partition on boston is accessible via /bos/usr0.
  One caveat: although we do not enforce user quotas on disk space, please be aware that the
  user shares are limited and must be shared with all users, so please do not overload the shared space with
  data or personal items that you should be storing (and backing up) elsewhere. Be courteous!
- Data Shares: The data shares (/data*/) are specifically set aside for important data that does not
  often change, such as original datasets (e.g., TREC, TDT). These partitions are not backed up automatically,
  but can be backed up on request. Use of these partitions is relatively restricted; if you need to store data
  here, please ask!
- Temporary Shares: The temporary storage space (/tmp*/) makes up the majority of the allocated disk
  space on our machines. You should keep indexes, large temporary files, or large sets of working files on
  these partitions. The shares are not backed up, except on request. The assumption is that anything on a
  /tmp*/ partition can be recreated using programs on a /usr*/ partition and data on a /data*/ partition.
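The backup rules for the three share types can be summarized in a small lookup. The following is a minimal sketch, assuming share directories are always named usr, data, or tmp followed by a number; the function `backup_policy` and the `run1` subdirectory are hypothetical and not part of any cluster tooling.

```python
import re

# Backup policy per share type, as described in the list above.
BACKUP_POLICY = {
    "usr": "backed up nightly",
    "data": "backed up on request only",
    "tmp": "backed up on request only",
}

# Matches /<machine>/<type><digits>, optionally followed by a sub-path,
# e.g. /bos/usr0 or /bos/tmp3/run1.
_SHARE_RE = re.compile(r"^/[^/]+/(usr|data|tmp)\d+(?:/|$)")


def backup_policy(path: str) -> str:
    """Return the backup policy implied by the share type in the path."""
    match = _SHARE_RE.match(path)
    if match is None:
        return "unknown share type"
    return BACKUP_POLICY[match.group(1)]


print(backup_policy("/bos/usr0"))
print(backup_policy("/bos/tmp3/run1"))  # "run1" is a made-up subdirectory
```

A check like this could help a script refuse to write irreplaceable output to a share that is not backed up automatically.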
On the Lemur Project Cluster, the disks in the primary RAID array are configured as follows:
| Partition | Size |
| --- | --- |
| /bos/usr0 | 92 GB |
| /bos/usr1 | 92 GB |
| /bos/usr2 | 92 GB |
| /bos/usr3 | 92 GB |
| /bos/usr4 | 92 GB |
| /bos/usr5 | 92 GB |
| /bos/usr6 | 92 GB |
| /bos/usr7 | 92 GB |
| /bos/usr8 | 92 GB |
| /bos/usr9 | 92 GB |
| /bos/data0 | 1.4 TB |
| /bos/data1 | 1.4 TB |
| /bos/data2 | 1.0 TB |
| /bos/tmp0 | 2.7 TB |
| /bos/tmp1 | 1.1 TB |
| /bos/tmp2 | 3.8 TB |
| /bos/tmp3 | 2.7 TB |
| /bos/tmp4 | 4.3 TB |
| /bos/tmp5 | 4.3 TB |
| Combined total | 22.8 TB |
The archive disk in the secondary RAID array is at /bos/archive0 (1.3 TB).
Figure: The Lemur Project Cluster Machine
Figure: Rear view