Proxycizer source code walk-through
The Proxycizer source is fairly complex but highly modular. It is written in C++, and some of the code depends on language constructs, such as nested classes and member pointers, that most compilers implement but some may not.
The purpose of this document is to give an overview of the code layout and programming style of the package, and to make it easier for anyone (including myself) to extend it.
The code is split between two directories:
- libsrc, which contains the code for all classes used in the package, and
- progsrc, which contains the stand-alone programs' top-level procedures (such as main).
libsrc/
Proxy classes (so to speak)
Not referring to the Proxy Class pattern, but to a set of classes that simulate web proxy-ish behavior.
- libsrc/Proxy.h
- All classes below inherit from the abstract base class Proxy, which mainly exports the Proxy_request and Proxy_response types and a Proxy::Request() method to accept requests. The Proxy class itself does nothing. It merely provides a common interface; derived classes must do all the work. A ProxyStats object is included in this class; most derived classes add their own stats to it, so a single call to ProxyStats::Print() on it prints stats for all superclasses of the object.
- libsrc/ProxyCache.{h,cc}
- Sticks a basic cache with LRU replacement onto the proxy.
- libsrc/ProxyCacheQueryable.{h,cc}
- Derived from ProxyCache, this class adds a querying mechanism (using the Query method), so that other caches may find out if this cache is holding a specific object.
- libsrc/ProxyDB.h
- libsrc/ProxyCRISP.{h,cc}
- libsrc/ProxyHarvest.{h,cc}
- libsrc/ProxyRD.{h,cc}
- libsrc/ProxyRPDSD.{h,cc}
- libsrc/ProxyRPDSDCS.{h,cc}
- libsrc/ProxyRPDSDMS.{h,cc}
- The "bottom-level" (final, to use a Java term) classes that complete the implementations of specific proxy simulators. ProxyDB is a special class that uses a db file to return the correctly sized object.
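The layering described above (an abstract interface, then a cache with LRU replacement, then a query method) can be sketched roughly as follows. This is only an illustration, not the package's code: the names ProxySketch, ProxyCacheSketch, ProxyRequest and ProxyResponse are hypothetical stand-ins, the capacity is counted in objects rather than bytes, the cache and query layers are folded into one class for brevity, and modern C++ library containers are assumed in place of the package's own utility classes.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for the Proxy_request / Proxy_response types.
struct ProxyRequest  { std::string url; };
struct ProxyResponse { bool hit; };

// Abstract base: it only defines the interface; derived classes do the work.
class ProxySketch {
public:
    virtual ~ProxySketch() = default;
    virtual ProxyResponse Request(const ProxyRequest &req) = 0;
};

// Derived class that adds a cache with LRU replacement (assumption:
// capacity is a number of objects, not a number of bytes).
class ProxyCacheSketch : public ProxySketch {
public:
    explicit ProxyCacheSketch(std::size_t capacity) : capacity_(capacity) {}

    ProxyResponse Request(const ProxyRequest &req) override {
        auto it = index_.find(req.url);
        if (it != index_.end()) {
            // Hit: move the entry to the front (most recently used).
            lru_.splice(lru_.begin(), lru_, it->second);
            ++hits_;
            return ProxyResponse{true};
        }
        ++misses_;
        if (capacity_ == 0) return ProxyResponse{false};
        if (lru_.size() == capacity_) {
            // Cache is full: evict the least recently used entry.
            index_.erase(lru_.back());
            lru_.pop_back();
        }
        lru_.push_front(req.url);
        index_[req.url] = lru_.begin();
        return ProxyResponse{false};
    }

    // In the spirit of the Query method: lets another cache ask whether
    // this one holds an object, without changing the LRU order.
    bool Query(const std::string &url) const { return index_.count(url) != 0; }

    std::size_t hits() const { return hits_; }
    std::size_t misses() const { return misses_; }

private:
    std::size_t capacity_;
    std::size_t hits_ = 0;
    std::size_t misses_ = 0;
    std::list<std::string> lru_;  // front = most recently used
    std::unordered_map<std::string, std::list<std::string>::iterator> index_;
};
```

A simulator would feed each trace entry to Request() and read the hit and miss counters afterwards; in the real package those counters live in the shared ProxyStats object instead.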
Trace readers
These classes are used to iterate over traces.
- libsrc/TraceReader.{h,cc}
- This base class does most of the work, taking care of reading files/pipes and providing iterator methods. It should be templated on a type (typically derived from LogEntry) that provides the method GetOps(), which should return a pointer to a new object of type LogEntryOps. If constructed using a filename, it will also accept compressed files (assuming the file has a .gz extension).
- libsrc/TraceReaderSequence.{h,cc}
- Use this class to treat several proxy trace logs as one large (concatenated) log. The TraceReaderSequence::Open() method can be called more than once to set up a series of log files. TraceReaderSequence is a TraceReader, and acts just like one. Even First() works correctly.
- libsrc/UberTraceReader.h
- This Reader (which skirts the naming order of the package because it's so cool) attempts to automatically determine what type of trace it is looking at. If it can't determine the type from the first 8K of the file, it bails with an error message. This class can usually detect Squid (Crispy or not) access logs, Harvest (Crispy or not) access/hierarchy logs, simclient logs, and defaults to DEC binary logs when all else fails. Uses the UberLogEntry (which has a similar superiority complex). Using this class is exactly equivalent to using TraceReader<UberLogEntry>.
- libsrc/BufferedTraceReader.{h,cc}
- BufferedTraceReader keeps a buffer cache of entries while walking through the log. The "B-" versions of the iterator methods add elements to the buffer cache, and allow the user to traverse the buffer cache, reaccessing previously seen elements. Elements are deleted with BufferedTraceReader::BDelete; a deleted element does not have to be the first element in the buffer cache.
Proxy trace writers
- libsrc/TraceWriter.{h,cc}
- This is fairly self-explanatory. Exports a Write() method to write an entry (of templated type, usually derived from LogEntry) to a file.
Log entry classes
The actual reading of files is done by the TraceReader classes described previously, but the parsing of a log line is delegated to these classes and their corresponding Ops classes. The data classes are all derived from LogEntry. They represent the data contained in one log entry. The operation classes are all derived from LogEntryOps. These define a mechanism for reading/writing their corresponding data class from/to a file/pipe. A LogEntryOps class exports the ReadFirstFromFile(FILE *), ReadFromFile(FILE *), WriteFirstToFile(FILE *), and WriteToFile(FILE *) methods. Each TraceReader will only use one LogEntryOps object, which may construct and allocate many LogEntrys. This enables the "long-lived" LogEntryOps object to keep state through iteration over the many LogEntrys in a file. See UberLogEntry for an example.
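The division of labor between a reader, a data class, and its long-lived Ops object can be sketched as follows. Everything here is an assumption made for illustration: LineEntry, LineEntryOps and TraceReaderSketch are hypothetical names, GetOps() is assumed to be a static member, ReadFromFile() is assumed to return a newly allocated entry (or a null pointer at end of file), and the "log format" is just one integer per line.

```cpp
#include <cstdio>
#include <memory>
#include <type_traits>

struct LineEntryOps;  // defined below

// Hypothetical data class: one log entry holding a single integer.
struct LineEntry {
    long value = 0;
    static LineEntryOps *GetOps();  // returns a pointer to a new Ops object
};

// Hypothetical operations class: parses entries and keeps state
// across the many entries it constructs.
struct LineEntryOps {
    long entries_read = 0;  // state that outlives any single entry

    // Assumed contract: a newly allocated entry, or nullptr at end of file.
    LineEntry *ReadFromFile(std::FILE *fp) {
        long v = 0;
        if (std::fscanf(fp, "%ld", &v) != 1) return nullptr;
        ++entries_read;
        LineEntry *entry = new LineEntry;
        entry->value = v;
        return entry;
    }
};

inline LineEntryOps *LineEntry::GetOps() { return new LineEntryOps; }

// Reader templated on the entry type. It asks the type for its Ops object
// once and then reuses that single object for every entry in the file.
template <class Entry>
class TraceReaderSketch {
    using Ops = typename std::remove_pointer<decltype(Entry::GetOps())>::type;

public:
    explicit TraceReaderSketch(std::FILE *fp) : fp_(fp), ops_(Entry::GetOps()) {}

    // Returns the next entry, or an empty pointer at end of file.
    std::unique_ptr<Entry> Next() {
        return std::unique_ptr<Entry>(ops_->ReadFromFile(fp_));
    }

    long EntriesRead() const { return ops_->entries_read; }

private:
    std::FILE *fp_;             // not owned; the caller closes it
    std::unique_ptr<Ops> ops_;  // the one long-lived Ops object
};
```

Because the reader only needs GetOps() and the read method, supporting a new log format means writing a new data/Ops pair rather than touching the reader.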
- libsrc/Mergeable.h
- libsrc/LogEntry.h
- libsrc/LogEntryText.{h,cc}
- These are the abstract LogEntry classes. LogEntry inherits a comparison operator, operator<(const Mergeable &), from Mergeable, which uses the _orderval member to enable any set of objects in any of these classes to be ordered (by time, by default). LogEntryText and LogEntryTextOps provide some common infrastructure for reading text-based log files (meaning all but DEC binary log files).
- libsrc/LogEntryATT.{h,cc}
- libsrc/LogEntryDEC.{h,cc}
- libsrc/LogEntrySimclient.{h,cc}
- libsrc/LogEntrySquidAcc.{h,cc}
- libsrc/LogEntryHarvestAcc.{h,cc}
- libsrc/LogEntryHarvestHier.{h,cc}
- These files contain the data and operation classes for each supported log type.
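The ordering that LogEntry inherits from Mergeable can be illustrated with a small sketch. The names MergeableSketch, SquidEntrySketch, DecEntrySketch and MergeByOrder are hypothetical, and the type of the ordering key (a double timestamp here) is an assumption; only the idea of a base-class operator< driven by an _orderval member comes from the description above.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Base class: orders objects by a single key, in the spirit of Mergeable.
class MergeableSketch {
public:
    explicit MergeableSketch(double orderval) : _orderval(orderval) {}
    virtual ~MergeableSketch() = default;

    // Works across any mix of derived classes, since only the key is compared.
    bool operator<(const MergeableSketch &other) const {
        return _orderval < other._orderval;
    }

protected:
    double _orderval;  // assumption: a timestamp in seconds
};

// Two unrelated entry types that share nothing but the base class.
struct SquidEntrySketch : MergeableSketch {
    explicit SquidEntrySketch(double t) : MergeableSketch(t) {}
};
struct DecEntrySketch : MergeableSketch {
    explicit DecEntrySketch(double t) : MergeableSketch(t) {}
};

// Merge two time-sorted sequences of (possibly different) entry types.
inline std::vector<const MergeableSketch *> MergeByOrder(
    const std::vector<const MergeableSketch *> &a,
    const std::vector<const MergeableSketch *> &b) {
    std::vector<const MergeableSketch *> out;
    std::merge(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out),
               [](const MergeableSketch *x, const MergeableSketch *y) {
                   return *x < *y;
               });
    return out;
}
```

Keeping the comparison in the base class is what lets entries from different log formats be interleaved by time without either format knowing about the other.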
Statistics
- libsrc/Stats.h
- Based on Jason Kastner's Perl module Statistics::Descriptive.pm, this class accepts data and will spit out various stats on demand. Stats throws away data after it has calculated and saved the info that it needs. FullStats, however, saves the given data, and enables other statistics, such as Median, Mode, and histograms.
- libsrc/ProxyStats.h
- This class is used by the Proxy class and its subclasses. It provides a way to store (groups of) counters, and a Print() function to print them out nicely. Counters in the same "bracket" (group) are printed separately from other brackets, and counters are also labeled with their fraction (percentage) of the sum of all the other counters in the bracket.
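The difference between a statistics class that discards its data and one that keeps it can be sketched as follows. This is not the code in Stats.h: StatsSketch and FullStatsSketch are hypothetical names, and the running variance uses Welford's update, which is a technique chosen for this sketch rather than something the package is known to use.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Streaming statistics: each value updates a few accumulators and is then
// discarded, so memory use is constant.
class StatsSketch {
public:
    virtual ~StatsSketch() = default;

    virtual void Add(double x) {
        ++n_;
        double delta = x - mean_;
        mean_ += delta / static_cast<double>(n_);
        m2_ += delta * (x - mean_);  // Welford's running sum of squares
    }

    std::size_t Count() const { return n_; }
    double Mean() const { return mean_; }

    // Sample variance; defined as 0 when fewer than two values were added.
    double Variance() const {
        return n_ > 1 ? m2_ / static_cast<double>(n_ - 1) : 0.0;
    }

private:
    std::size_t n_ = 0;
    double mean_ = 0.0;
    double m2_ = 0.0;
};

// Keeps every value, which makes order statistics such as the median possible.
class FullStatsSketch : public StatsSketch {
public:
    void Add(double x) override {
        StatsSketch::Add(x);
        data_.push_back(x);
    }

    // Median of the stored values; returns 0 when no data has been added.
    double Median() const {
        if (data_.empty()) return 0.0;
        std::vector<double> sorted(data_);
        std::sort(sorted.begin(), sorted.end());
        std::size_t mid = sorted.size() / 2;
        return sorted.size() % 2 ? sorted[mid]
                                 : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

private:
    std::vector<double> data_;
};
```

The trade-off is memory: the streaming class is suitable for very long traces, while the full version is needed whenever a statistic depends on the whole distribution.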
Miscellaneous utilities
Many of these classes are provided courtesy of Owen Astrachan. No doubt, a few of these have gone through enough modification to be totally unrecognizable to him.
- libsrc/Pair.h
- libsrc/LList.{h,cc}
- libsrc/DLList.{h,cc}
- libsrc/Table.h
- libsrc/HTable.{h,cc}
- libsrc/Iterator.h
- libsrc/HIterator.{h,cc}
- libsrc/Vector.h
Miscellaneous utilities, 2ème Partie
- libsrc/proxytrace2txt_v2.h
- Must be downloaded from DEC. It gives the definition of the tentry_v2 struct used in DEC binary trace files.
- libsrc/tentry_v2.{h,cc}
- A wrapper for libsrc/proxytrace2txt_v2.h
- libsrc/exiterr.h
- An exit() that returns the line number as the error code.
- libsrc/vtoh.h
- Little-endian (VAX) to host byte-order functions.
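A little-endian to host conversion of the kind vtoh.h provides can be written without knowing the host's byte order, by assembling the value from individual bytes. The function names vtoh16 and vtoh32 below are hypothetical and are not taken from the header; this is a minimal sketch of the technique only.

```cpp
#include <cstdint>

// Assemble a 16-bit value from two bytes stored least-significant first.
inline std::uint16_t vtoh16(const unsigned char *p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

// Assemble a 32-bit value from four bytes stored least-significant first.
// Shifting bytes into place gives the same result on any host byte order.
inline std::uint32_t vtoh32(const unsigned char *p) {
    return static_cast<std::uint32_t>(p[0]) |
           (static_cast<std::uint32_t>(p[1]) << 8) |
           (static_cast<std::uint32_t>(p[2]) << 16) |
           (static_cast<std::uint32_t>(p[3]) << 24);
}
```

Reading each field of a binary trace record through functions like these keeps the parser portable between little-endian and big-endian machines.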
Currently unused classes
- libsrc/unused/Merger.cc
- libsrc/unused/Merger.h
- libsrc/unused/MergeableStream.h
- libsrc/unused/Stream.h
Syam Gadde Last modified: Thu May 21 15:35:21 EDT 1998