
Low-Overhead Data Movement

This section describes the optimizations used above and below the TCP/IP protocol stack to reduce the data movement overhead of copying and checksumming. These overheads grow with the volume of data moved per unit of time; at gigabit-per-second bandwidths, data movement can consume a large share of CPU cycles. Unfortunately, faster CPUs do not help appreciably, because copying is memory-intensive: its cost is governed by memory bandwidth rather than by processor speed.
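To make the per-byte nature of checksumming concrete, the following is a minimal sketch of the 16-bit one's-complement Internet checksum used by TCP/IP, written as ordinary user-space C. It is not the FreeBSD kernel routine; the function name inet_checksum is an assumption made for illustration. The point is only that the loop must read every byte of the payload, so its cost scales with the volume of data moved.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sketch (not the kernel's code): 16-bit one's-complement
 * Internet checksum over a byte buffer, treating bytes as big-endian
 * 16-bit words. Every payload byte is read once.
 */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;   /* wide accumulator so large buffers cannot overflow */
    size_t i;

    /* Add the buffer as a sequence of 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is padded with a zero low-order byte. */
    if (i < len)
        sum += (uint32_t)data[i] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    /* The checksum is the one's complement of the folded sum. */
    return (uint16_t)~sum;
}
```

A copy routine has the same shape: one pass over the data whose speed is set by how fast memory can supply the bytes.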

We describe the data movement optimizations as extensions to the conventional FreeBSD send/receive path, which is based on variable-sized kernel network buffers called mbufs [8]. Standard mbufs contain their own buffer space, while external mbufs hold pointers to other kernel buffers, e.g., file buffers or the virtual memory page frames used as Trapeze payload buffers. Packet data is stored in linked chains of mbufs that are passed between levels of the system; the TCP/IP protocol stack adds and removes headers and checksums by manipulating the mbufs in the chain. On a normal transmission, the socket layer copies the message data from a user memory buffer into an mbuf chain, which is passed through the TCP/IP stack to the network driver. On the receiving side, the driver constructs a chain containing each incoming packet's header and payload, and passes the chain up through the TCP/IP stack to the socket layer. When the receiving process accepts the data, e.g., with a read system call, a socket-layer routine (soreceive) copies the payload into user memory and frees the kernel mbuf chain.
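The buffer organization described above can be sketched in user-space C. This is a deliberately simplified model and not FreeBSD's actual struct mbuf: the type mbuf_sketch, the constant MBUF_INLINE, and all mb_* function names are hypothetical, chosen only to show the difference between standard and external buffers, how a header is added to or removed from a chain, and where the receive-side copy happens.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MBUF_INLINE 128             /* hypothetical inline capacity */

struct mbuf_sketch {
    struct mbuf_sketch *next;       /* next buffer in the chain */
    unsigned char *data;            /* start of the valid bytes */
    size_t len;                     /* number of valid bytes */
    int external;                   /* nonzero: data lives outside this struct */
    unsigned char inline_buf[MBUF_INLINE];
};

/* "Standard" buffer: the bytes are copied into the buffer's own storage. */
struct mbuf_sketch *mb_standard(const void *src, size_t len)
{
    struct mbuf_sketch *m;
    if (len > MBUF_INLINE || (m = calloc(1, sizeof *m)) == NULL)
        return NULL;
    memcpy(m->inline_buf, src, len);
    m->data = m->inline_buf;
    m->len = len;
    return m;
}

/* "External" buffer: only a pointer to existing storage is recorded. */
struct mbuf_sketch *mb_external(unsigned char *buf, size_t len)
{
    struct mbuf_sketch *m = calloc(1, sizeof *m);
    if (m == NULL)
        return NULL;
    m->data = buf;                  /* no copy of the payload */
    m->len = len;
    m->external = 1;
    return m;
}

/* Adding a header: link a header buffer in front of the chain. */
struct mbuf_sketch *mb_prepend(struct mbuf_sketch *hdr, struct mbuf_sketch *chain)
{
    hdr->next = chain;
    return hdr;
}

/* Removing a header: unlink and free the first buffer of the chain. */
struct mbuf_sketch *mb_strip_header(struct mbuf_sketch *chain)
{
    struct mbuf_sketch *rest = chain->next;
    free(chain);
    return rest;
}

/* Total number of valid bytes in the chain. */
size_t mb_chain_len(const struct mbuf_sketch *chain)
{
    size_t total = 0;
    for (; chain != NULL; chain = chain->next)
        total += chain->len;
    return total;
}

/*
 * Receive-side copy, loosely analogous to what the text attributes to
 * soreceive: copy the chain's bytes into the caller's buffer (up to cap
 * bytes), then free the chain. External storage is not owned by the
 * chain in this model and is left untouched.
 */
size_t mb_copyout_free(struct mbuf_sketch *chain, unsigned char *dst, size_t cap)
{
    size_t off = 0;
    while (chain != NULL) {
        struct mbuf_sketch *next = chain->next;
        size_t n = chain->len;
        if (n > cap - off)
            n = cap - off;
        memcpy(dst + off, chain->data, n);
        off += n;
        free(chain);
        chain = next;
    }
    return off;
}
```

In this model, a receive path would build a chain from a small standard buffer holding the header and an external buffer pointing at a payload page, strip the header with mb_strip_header, and then call mb_copyout_free to deliver the payload. That final memcpy is the per-byte copy that the conventional path pays on every receive.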



 
Jeff Chase
8/4/1999