
Performance

 


  
Figure 4: Single-client sequential disk read and write bandwidths with Slice/Trapeze. (Left panel: read bandwidth; right panel: write bandwidth.)

Our goal with Trapeze and Slice is to push the performance bounds for network storage systems built on Myrinet and similar networks. Although our work on Slice is preliminary, the initial prototype demonstrates the performance achievable with today's technology and the right network support.

Figure 4 shows read and write bandwidths from disk for high-volume sequential file access through the FreeBSD read and write system call interface using the current Slice prototype. For these tests, the client was a DEC Miata (Personal Workstation 500au) with a 500 MHz Alpha 21164 CPU and a 32-bit 33 MHz PCI bus using the Digital 21174 ``Pyxis'' chipset. Each block I/O server has a 450 MHz Pentium III on an Asus P2B motherboard with an Intel 440BX chipset, and one of two disk configurations: (1) four Seagate Medalist disks on two separate Ultra-Wide SCSI channels, or (2) four IBM DeskStar 22GXP drives on separate Promise Ultra/33 IDE channels. All machines are equipped with Myricom LANai 4.1 SAN adapters and run kernels built from the same FreeBSD 4.0 source pool. The Slice client uses a simple round-robin striping policy with a stripe grain of 32KB. The test program (dd) reads or writes 1.25 GB in 64KB chunks, but it does not touch the data. Client and server I/O caches are flushed before each run.
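To make the striping policy concrete, the following minimal sketch maps a file offset to a block I/O server and a byte offset within that server's share, assuming round-robin placement with a 32KB stripe grain. The names here are hypothetical illustrations, not the Slice source.

\begin{verbatim}
/*
 * Sketch of round-robin striping with a 32KB stripe grain.
 * Names are hypothetical; the actual Slice code differs.
 */
#include <stdio.h>
#include <stdint.h>

#define STRIPE_GRAIN (32 * 1024)        /* 32KB stripe unit */

/*
 * Map a byte offset in the file to a block I/O server and a
 * byte offset within that server's portion of the file.
 */
static void
stripe_map(uint64_t file_offset, int nservers,
    int *server, uint64_t *server_offset)
{
        uint64_t unit = file_offset / STRIPE_GRAIN; /* stripe unit index */

        *server = (int)(unit % nservers);           /* round-robin pick */
        *server_offset = (unit / nservers) * STRIPE_GRAIN +
            file_offset % STRIPE_GRAIN;
}

int
main(void)
{
        int server;
        uint64_t off;

        /* 100KB into a file striped over 4 servers: unit 3, server 3. */
        stripe_map(100 * 1024, 4, &server, &off);
        printf("server %d, offset %ju\n", server, (uintmax_t)off);
        return (0);
}
\end{verbatim}

Under this policy a long sequential transfer touches all servers evenly, which is why the delivered bandwidth in Figure 4 grows as servers and disks are added, until the client itself becomes the bottleneck.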

Each line in Figure 4 shows the measured I/O bandwidth delivered to a single client using a fixed number of block I/O servers. The four points on each line represent the number of disks on each server; the x-axis gives the total number of disks used for each point. The peak write bandwidth is 66 MB/s; at this point the client CPU saturates due to copying at the system call layer. The peak read bandwidth is 97 MB/s. While reading at 97 MB/s, the client CPU utilization is only 28%, since FreeBSD 4.0 avoids most copying for large reads by page remapping at the system call layer. In this configuration, an unpredicted 8KB page fault serviced from I/O server memory completes in under 150 $\mu$s.
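As a rough consistency check (our arithmetic from the numbers above, not a separately measured figure), the 150 $\mu$s fault time bounds the bandwidth of strictly serial 8KB fetches at

\begin{displaymath}
\frac{8192\ \mathrm{bytes}}{150 \times 10^{-6}\ \mathrm{s}} \approx 55\ \mathrm{MB/s},
\end{displaymath}

so sequential reads can reach 97 MB/s only by keeping several block fetches in flight, e.g., through read-ahead across the striped servers.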

We have also experimented with TCP/IP communication using Trapeze. Recent point-to-point TCP bandwidth tests (netperf) yielded a peak bandwidth of 956 Mb/s through the socket interface on a pair of Compaq XP 1000 workstations using Myricom's LANai-5 NICs. These machines have a 500 MHz Alpha 21264 CPU and a 64-bit 33 MHz PCI bus with a Digital 21272 ``Tsunami'' chipset, and were running FreeBSD 4.0 augmented with page remapping for sockets [7].

