RFC1944 - Benchmarking Methodology for Network Interconnect Devices

Network Working Group S. Bradner

Request for Comments: 1944 Harvard University

Category: Informational J. McQuaid

Bay Networks

May 1996

Benchmarking Methodology for Network Interconnect Devices

Status of This Memo

This memo provides information for the Internet community. This memo

does not specify an Internet standard of any kind. Distribution of

this memo is unlimited.

Abstract

This document discusses and defines a number of tests that may be

used to describe the performance characteristics of a network

interconnecting device. In addition to defining the tests this

document also describes specific formats for reporting the results of

the tests. Appendix A lists the tests and conditions that we believe

should be included for specific cases and gives additional

information about testing practices. Appendix B is a reference

listing of maximum frame rates to be used with specific frame sizes

on various media and Appendix C gives some examples of frame formats

to be used in testing.

1. IntrodUCtion

Vendors often engage in "specsmanship" in an attempt to give their

products a better position in the marketplace. This often involves

"smoke & mirrors" to confuse the potential users of the products.

This document defines a specific set of tests that vendors can use to

measure and report the performance characteristics of network

devices. The results of these tests will provide the user comparable

data from different vendors with which to evaluate these devices.

A previous document, "Benchmarking Terminology for Network

Interconnect Devices" (RFC1242), defined many of the terms that are

used in this document. The terminology document should be consulted

before attempting to make use of this document.

2. Real world

In producing this document the authors attempted to keep in mind the

requirement that apparatus to perform the described tests must

actually be built. We do not know of "off the shelf" equipment

available to implement all of the tests but it is our opinion that

such equipment can be constructed.

3. Tests to be run

There are a number of tests described in this document. Not all of

the tests apply to all types of devices under test (DUTs). Vendors

should perform all of the tests that can be supported by a specific

type of product. The authors understand that it will take a

considerable period of time to perform all of the recommended tests

nder all of the recommended conditions. We believe that the results

are worth the effort. Appendix A lists some of the tests and

conditions that we believe should be included for specific cases.

4. Evaluating the results

Performing all of the recommended tests will result in a great deal

of data. Much of this data will not apply to the evaluation of the

devices under each circumstance. For example, the rate at which a

router forwards IPX frames will be of little use in selecting a

router for an environment that does not (and will not) support that

protocol. Evaluating even that data which is relevant to a

particular network installation will require eXPerience which may not

be readily available. Furthermore, selection of the tests to be run

and evaluation of the test data must be done with an understanding of

generally accepted testing practices regarding repeatability,

variance and statistical significance of small numbers of trials.

5. Requirements

In this document, the Words that are used to define the significance

of each particular requirement are capitalized. These words are:

* "MUST" This word, or the words "REQUIRED" and "SHALL" mean that

the item is an absolute requirement of the specification.

* "SHOULD" This word or the adjective "RECOMMENDED" means that there

may exist valid reasons in particular circumstances to ignore this

item, but the full implications should be understood and the case

carefully weighed before choosing a different course.

* "MAY" This word or the adjective "OPTIONAL" means that this item

is truly optional. One vendor may choose to include the item because

a particular marketplace requires it or because it enhances the

product, for example; another vendor may omit the same item.

An implementation is not compliant if it fails to satisfy one or more

of the MUST requirements for the protocols it implements. An

implementation that satisfies all the MUST and all the SHOULD

requirements for its protocols is said to be "unconditionally

compliant"; one that satisfies all the MUST requirements but not all

the SHOULD requirements for its protocols is said to be

"conditionally compliant".

6. Test set up

The ideal way to implement this series of tests is to use a tester

with both transmitting and receiving ports. Connections are made

from the sending ports of the tester to the receiving ports of the

DUT and from the sending ports of the DUT back to the tester. (see

Figure 1) Since the tester both sends the test traffic and receives

it back, after the traffic has been forwarded but the DUT, the tester

can easily determine if all of the transmitted packets were received

and verify that the correct packets were received. The same

functionality can be oBTained with separate transmitting and

receiving devices (see Figure 2) but unless they are remotely

controlled by some computer in a way that simulates the single

tester, the labor required to accurately perform some of the tests

(particularly the throughput test) can be prohibitive.

+------------+

+------------ tester <-------------+

+------------+

+-----------> DUT --------------+

+------------+

Figure 1

+--------+ +------------+ +----------+

sender --------> DUT ---------> receiver

+--------+ +------------+ +----------+

Figure 2

6.1 Test set up for multiple media types

Two different setups could be used to test a DUT which is used in

real-world networks to connect networks of differing media type,

local Ethernet to a backbone FDDI ring for example. The tester could

support both media types in which case the set up shown in Figure 1

would be used.

Two identical DUTs are used in the other test set up. (see Figure 3)

In many cases this set up may more accurately simulate the real

world. For example, connecting two LANs together with a WAN link or

high speed backbone. This set up would not be as good at simulating

a system where clients on a Ethernet LAN were interacting with a

server on an FDDI backbone.

+-----------+

+--------------------- tester <---------------------+

+-----------+

+----------+ +----------+

+-------> DUT 1 --------------> DUT 2 ---------+

+----------+ +----------+

Figure 3

7. DUT set up

Before starting to perform the tests, the DUT to be tested MUST be

configured following the instructions provided to the user.

Specifically, it is expected that all of the supported protocols will

be configured and enabled during this set up (See Appendix A). It is

expected that all of the tests will be run without changing the

configuration or setup of the DUT in any way other than that required

to do the specific test. For example, it is not acceptable to change

the size of frame handling buffers between tests of frame handling

rates or to disable all but one transport protocol when testing the

throughput of that protocol. It is necessary to modify the

configuration when starting a test to determine the effect of filters

on throughput, but the only change MUST be to enable the specific

filter. The DUT set up SHOULD include the normally recommended

routing update intervals and keep alive frequency. The specific

version of the software and the exact DUT configuration, including

what functions are disabled, used during the tests MUST be included

as part of the report of the results.

8. Frame formats

The formats of the test frames to use for TCP/IP over Ethernet are

shown in Appendix C: Test Frame Formats. These exact frame formats

SHOULD be used in the tests described in this document for this

protocol/media combination and that these frames will be used as a

template for testing other protocol/media combinations. The specific

formats that are used to define the test frames for a particular test

series MUST be included in the report of the results.

9. Frame sizes

All of the described tests SHOULD be performed at a number of frame

sizes. Specifically, the sizes SHOULD include the maximum and minimum

legitimate sizes for the protocol under test on the media under test

and enough sizes in between to be able to get a full characterization

of the DUT performance. Except where noted, at least five frame

sizes SHOULD be tested for each test condition.

Theoretically the minimum size UDP Echo request frame would consist

of an IP header (minimum length 20 octets), a UDP header (8 octets)

and whatever MAC level header is required by the media in use. The

theoretical maximum frame size is determined by the size of the

length field in the IP header. In almost all cases the actual

maximum and minimum sizes are determined by the limitations of the

media.

In theory it would be ideal to distribute the frame sizes in a way

that would evenly distribute the theoretical frame rates. These

recommendations incorporate this theory but specify frame sizes which

are easy to understand and remember. In addition, many of the same

frame sizes are specified on each of the media types to allow for

easy performance comparisons.

Note: The inclusion of an unrealistically small frame size on some of

the media types (i.e. with little or no space for data) is to help

characterize the per-frame processing overhead of the DUT.

9.1 Frame sizes to be used on Ethernet

64, 128, 256, 512, 1024, 1280, 1518

These sizes include the maximum and minimum frame sizes permitted

by the Ethernet standard and a selection of sizes between these

extremes with a finer granularity for the smaller frame sizes and

higher frame rates.

9.2 Frame sizes to be used on 4Mb and 16Mb token ring

54, 64, 128, 256, 1024, 1518, 2048, 4472

The frame size recommendations for token ring assume that there is

no RIF field in the frames of routed protocols. A RIF field would

be present in any direct source route bridge performance test.

The minimum size frame for UDP on token ring is 54 octets. The

maximum size of 4472 octets is recommended for 16Mb token ring

instead of the theoretical size of 17.9Kb because of the size

limitations imposed by many token ring interfaces. The reminder

of the sizes are selected to permit direct comparisons with other

types of media. An IP (i.e. not UDP) frame may be used in

addition if a higher data rate is desired, in which case the

minimum frame size is 46 octets.

9.3 Frame sizes to be used on FDDI

54, 64, 128, 256, 1024, 1518, 2048, 4472

The minimum size frame for UDP on FDDI is 53 octets, the minimum

size of 54 is recommended to allow direct comparison to token ring

performance. The maximum size of 4472 is recommended instead of

the theoretical maximum size of 4500 octets to permit the same

type of comparison. An IP (i.e. not UDP) frame may be used in

addition if a higher data rate is desired, in which case the

minimum frame size is 45 octets.

9.4 Frame sizes in the presence of disparate MTUs

When the interconnect DUT supports connecting links with disparate

MTUs, the frame sizes for the link with the *larger* MTU SHOULD be

used, up to the limit of the protocol being tested. If the

interconnect DUT does not support the fragmenting of frames in the

presence of MTU mismatch, the forwarding rate for that frame size

shall be reported as zero.

For example, the test of IP forwarding with a bridge or router

that joins FDDI and Ethernet should use the frame sizes of FDDI

when going from the FDDI to the Ethernet link. If the bridge does

not support IP fragmentation, the forwarding rate for those frames

too large for Ethernet should be reported as zero.

10. Verifying received frames

The test equipment SHOULD discard any frames received during a test

run that are not actual forwarded test frames. For example, keep-

alive and routing update frames SHOULD NOT be included in the count

of received frames. In any case, the test equipment SHOULD verify

the length of the received frames and check that they match the

expected length.

Preferably, the test equipment SHOULD include sequence numbers in the

transmitted frames and check for these numbers on the received

frames. If this is done, the reported results SHOULD include in

addition to the number of frames dropped, the number of frames that

were received out of order, the number of duplicate frames received

and the number of gaps in the received frame numbering sequence.

This functionality is required for some of the described tests.

11. Modifiers

It might be useful to know the DUT performance under a number of

conditions; some of these conditions are noted below. The reported

results SHOULD include as many of these conditions as the test

equipment is able to generate. The suite of tests SHOULD be first

run without any modifying conditions and then repeated under each of

the conditions separately. To preserve the ability to compare the

results of these tests any frames that are required to generate the

modifying conditions (management queries for example) will be

included in the same data stream as the normal test frames in place

of one of the test frames and not be supplied to the DUT on a

separate network port.

11.1 Broadcast frames

In most router designs special processing is required when frames

addressed to the hardware broadcast address are received. In

bridges (or in bridge mode on routers) these broadcast frames must

be flooded to a number of ports. The stream of test frames SHOULD

be augmented with 1% frames addressed to the hardware broadcast

address. The frames sent to the broadcast address should be of a

type that the router will not need to process. The aim of this

test is to determine if there is any effect on the forwarding rate

of the other data in the stream. The specific frames that should

be used are included in the test frame format document. The

broadcast frames SHOULD be evenly distributed throughout the data

stream, for example, every 100th frame.

The same test SHOULD be performed on bridge-like DUTs but in this

case the broadcast packets will be processed and flooded to all

outputs.

It is understood that a level of broadcast frames of 1% is much

higher than many networks experience but, as in drug toxicity

evaluations, the higher level is required to be able to gage the

effect which would otherwise often fall within the normal

variability of the system performance. Due to design factors some

test equipment will not be able to generate a level of alternate

frames this low. In these cases the percentage SHOULD be as small

as the equipment can provide and that the actual level be

described in the report of the test results.

11.2 Management frames

Most data networks now make use of management protocols such as

SNMP. In many environments there can be a number of management

stations sending queries to the same DUT at the same time.

The stream of test frames SHOULD be augmented with one management

query as the first frame sent each second during the duration of

the trial. The result of the query must fit into one response

frame. The response frame SHOULD be verified by the test

equipment. One example of the specific query frame that should be

used is shown in Appendix C.

11.3 Routing update frames

The processing of dynamic routing protocol updates could have a

significant impact on the ability of a router to forward data

frames. The stream of test frames SHOULD be augmented with one

routing update frame transmitted as the first frame transmitted

during the trial. Routing update frames SHOULD be sent at the

rate specified in Appendix C for the specific routing protocol

being used in the test. Two routing update frames are defined in

Appendix C for the TCP/IP over Ethernet example. The routing

frames are designed to change the routing to a number of networks

that are not involved in the forwarding of the test data. The

first frame sets the routing table state to "A", the second one

changes the state to "B". The frames MUST be alternated during

the trial.

The test SHOULD verify that the routing update was processed by

the DUT.

11.4 Filters

Filters are added to routers and bridges to selectively inhibit

the forwarding of frames that would normally be forwarded. This

is usually done to implement security controls on the data that is

accepted between one area and another. Different products have

different capabilities to implement filters.

The DUT SHOULD be first configured to add one filter condition and

the tests performed. This filter SHOULD permit the forwarding of

the test data stream. In routers this filter SHOULD be of the

form:

forward input_protocol_address to output_protocol_address

In bridges the filter SHOULD be of the form:

forward destination_hardware_address

The DUT SHOULD be then reconfigured to implement a total of 25

filters. The first 24 of these filters SHOULD be of the form:

block input_protocol_address to output_protocol_address

The 24 input and output protocol addresses SHOULD not be any that

are represented in the test data stream. The last filter SHOULD

permit the forwarding of the test data stream. By "first" and

"last" we mean to ensure that in the second case, 25 conditions

must be checked before the data frames will match the conditions

that permit the forwarding of the frame. Of course, if the DUT

reorders the filters or does not use a linear scan of the filter

rules the effect of the sequence in which the filters are input is

properly lost.

The exact filters configuration command lines used SHOULD be

included with the report of the results.

11.4.1 Filter Addresses

Two sets of filter addresses are required, one for the single

filter case and one for the 25 filter case.

The single filter case should permit traffic from IP address

198.18.1.2 to IP address 198.19.65.2 and deny all other

traffic.

The 25 filter case should follow the following sequence.

deny aa.ba.1.1 to aa.ba.100.1

deny aa.ba.2.2 to aa.ba.101.2

deny aa.ba.3.3 to aa.ba.103.3

...

deny aa.ba.12.12 to aa.ba.112.12

allow aa.bc.1.2 to aa.bc.65.1

deny aa.ba.13.13 to aa.ba.113.13

deny aa.ba.14.14 to aa.ba.114.14

...

deny aa.ba.24.24 to aa.ba.124.24

deny all else

All previous filter conditions should be cleared from the

router before this sequence is entered. The sequence is

selected to test to see if the router sorts the filter

conditions or accepts them in the order that they were entered.

Both of these procedures will result in a greater impact on

performance than will some form of hash coding.

12. Protocol addresses

It is easier to implement these tests using a single logical stream

of data, with one source protocol address and one destination

protocol address, and for some conditions like the filters described

above, a practical requirement. Networks in the real world are not

limited to single streams of data. The test suite SHOULD be first run

with a single protocol (or hardware for bridge tests) source and

destination address pair. The tests SHOULD then be repeated with

using a random destination address. While testing routers the

addresses SHOULD be random and uniformly distributed over a range of

256 networks and random and uniformly distributed over the full MAC

range for bridges. The specific address ranges to use for IP are

shown in Appendix C.

13. Route Set Up

It is not reasonable that all of the routing information necessary to

forward the test stream, especially in the multiple address case,

will be manually set up. At the start of each trial a routing update

MUST be sent to the DUT. This routing update MUST include all of the

network addresses that will be required for the trial. All of the

addresses SHOULD resolve to the same "next-hop". Normally this will

be the address of the receiving side of the test equipment. This

routing update will have to be repeated at the interval required by

the routing protocol being used. An example of the format and

repetition interval of the update frames is given in Appendix C.

14. Bidirectional traffic

Normal network activity is not all in a single direction. To test

the bidirectional performance of a DUT, the test series SHOULD be run

with the same data rate being offered from each direction. The sum of

the data rates should not exceed the theoretical limit for the media.

15. Single stream path

The full suite of tests SHOULD be run along with whatever modifier

conditions that are relevant using a single input and output network

port on the DUT. If the internal design of the DUT has multiple

distinct pathways, for example, multiple interface cards each with

multiple network ports, then all possible types of pathways SHOULD be

tested separately.

16. Multi-port

Many current router and bridge products provide many network ports in

the same module. In performing these tests first half of the ports

are designated as "input ports" and half are designated as "output

ports". These ports SHOULD be evenly distributed across the DUT

architecture. For example if a DUT has two interface cards each of

which has four ports, two ports on each interface card are designated

as input and two are designated as output. The specified tests are

run using the same data rate being offered to each of the input

ports. The addresses in the input data streams SHOULD be set so that

a frame will be directed to each of the output ports in sequence so

that all "output" ports will get an even distribution of packets from

this input. The same configuration MAY be used to perform a

bidirectional multi-stream test. In this case all of the ports are

considered both input and output ports and each data stream MUST

consist of frames addressed to all of the other ports.

Consider the following 6 port DUT:

--------------

--------- in A out X--------

--------- in B out Y--------

--------- in C out Z--------

--------------

The addressing of the data streams for each of the inputs SHOULD be:

stream sent to input A:

packet to out X, packet to out Y, packet to out Z

stream sent to input B:

packet to out X, packet to out Y, packet to out Z

stream sent to input C

packet to out X, packet to out Y, packet to out Z

Note that these streams each follow the same sequence so that 3

packets will arrive at output X at the same time, then 3 packets at

Y, then 3 packets at Z. This procedure ensures that, as in the real

world, the DUT will have to deal with multiple packets addressed to

the same output at the same time.

17. Multiple protocols

This document does not address the issue of testing the effects of a

mixed protocol environment other than to suggest that if such tests

are wanted then frames SHOULD be distributed between all of the test

protocols. The distribution MAY approximate the conditions on the

network in which the DUT would be used.

18. Multiple frame sizes

This document does not address the issue of testing the effects of a

mixed frame size environment other than to suggest that if such tests

are wanted then frames SHOULD be distributed between all of the

listed sizes for the protocol under test. The distribution MAY

approximate the conditions on the network in which the DUT would be

used. The authors do not have any idea how the results of such a test

would be interpreted other than to directly compare multiple DUTs in

some very specific simulated network.

19. Testing performance beyond a single DUT.

In the performance testing of a single DUT, the paradigm can be

described as applying some input to a DUT and monitoring the output.

The results of which can be used to form a basis of characterization

of that device under those test conditions.

This model is useful when the test input and output are homogenous

(e.g., 64-byte IP, 802.3 frames into the DUT; 64 byte IP, 802.3

frames out), or the method of test can distinguish between dissimilar

input/output. (E.g., 1518 byte IP, 802.3 frames in; 576 byte,

fragmented IP, X.25 frames out.)

By extending the single DUT test model, reasonable benchmarks

regarding multiple DUTs or heterogeneous environments may be

collected. In this extension, the single DUT is replaced by a system

of interconnected network DUTs. This test methodology would support

the benchmarking of a variety of device/media/service/protocol

combinations. For example, a configuration for a LAN-to-WAN-to-LAN

test might be:

(1) 802.3-> DUT 1 -> X.25 @ 64kbps -> DUT 2 -> 802.3

Or a mixed LAN configuration might be:

(2) 802.3 -> DUT 1 -> FDDI -> DUT 2 -> FDDI -> DUT 3 -> 802.3

In both examples 1 and 2, end-to-end benchmarks of each system could

be empirically ascertained. Other behavior may be characterized

through the use of intermediate devices. In example 2, the

configuration may be used to give an indication of the FDDI to FDDI

capability exhibited by DUT 2.

Because multiple DUTs are treated as a single system, there are

limitations to this methodology. For instance, this methodology may

yield an aggregate benchmark for a tested system. That benchmark

alone, however, may not necessarily reflect asymmetries in behavior

between the DUTs, latencies introduce by other apparatus (e.g.,

CSUs/DSUs, switches), etc.

Further, care must be used when comparing benchmarks of different

systems by ensuring that the DUTs' features/configuration of the

tested systems have the appropriate common denominators to allow

comparison.

20. Maximum frame rate

The maximum frame rates that should be used when testing LAN

connections SHOULD be the listed theoretical maximum rate for the

frame size on the media.

The maximum frame rate that should be used when testing WAN

connections SHOULD be greater than the listed theoretical maximum

rate for the frame size on that speed connection. The higher rate

for WAN tests is to compensate for the fact that some vendors employ

various forms of header compression.

A list of maximum frame rates for LAN connections is included in

Appendix B.

21. Bursty traffic

It is convenient to measure the DUT performance under steady state

load but this is an unrealistic way to gauge the functioning of a DUT

since actual network traffic normally consists of bursts of frames.

Some of the tests described below SHOULD be performed with both

steady state traffic and with traffic consisting of repeated bursts

of frames. The frames within a burst are transmitted with the

minimum legitimate inter-frame gap.

The objective of the test is to determine the minimum interval

between bursts which the DUT can process with no frame loss. During

each test the number of frames in each burst is held constant and the

inter-burst interval varied. Tests SHOULD be run with burst sizes of

16, 64, 256 and 1024 frames.

22. Frames per token

Although it is possible to configure some token ring and FDDI

interfaces to transmit more than one frame each time that the token

is received, most of the network devices currently available transmit

only one frame per token. These tests SHOULD first be performed

while transmitting only one frame per token.

Some current high-performance workstation servers do transmit more

than one frame per token on FDDI to maximize throughput. Since this

may be a common feature in future workstations and servers,

interconnect devices with FDDI interfaces SHOULD be tested with 1, 4,

8, and 16 frames per token. The reported frame rate SHOULD be the

average rate of frame transmission over the total trial period.

23. Trial description

A particular test consists of multiple trials. Each trial returns

one piece of information, for example the loss rate at a particular

input frame rate. Each trial consists of a number of phases:

a) If the DUT is a router, send the routing update to the "input"

port and pause two seconds to be sure that the routing has settled.

b) Send the "learning frames" to the "output" port and wait 2

seconds to be sure that the learning has settled. Bridge learning

frames are frames with source addresses that are the same as the

destination addresses used by the test frames. Learning frames for

other protocols are used to prime the address resolution tables in

the DUT. The formats of the learning frame that should be used are

shown in the Test Frame Formats document.

c) Run the test trial.

d) Wait for two seconds for any residual frames to be received.

e) Wait for at least five seconds for the DUT to restabilize.

24. Trial duration

The aim of these tests is to determine the rate continuously

supportable by the DUT. The actual duration of the test trials must

be a compromise between this aim and the duration of the benchmarking

test suite. The duration of the test portion of each trial SHOULD be

at least 60 seconds. The tests that involve some form of "binary

search", for example the throughput test, to determine the exact

result MAY use a shorter trial duration to minimize the length of the

search procedure, but the final determination SHOULD be made with

full length trials.

25. Address resolution

The DUT SHOULD be able to respond to address resolution requests sent

by the DUT wherever the protocol requires such a process.

26. Benchmarking tests:

Note: The notation "type of data stream" refers to the above

modifications to a frame stream with a constant inter-frame gap, for

example, the addition of traffic filters to the configuration of the

DUT.

26.1 Throughput

Objective:

To determine the DUT throughput as defined in RFC1242.

Procedure:

Send a specific number of frames at a specific rate through the

DUT and then count the frames that are transmitted by the DUT. If

the count of offered frames is equal to the count of received

frames, the rate of the offered stream is raised and the test

rerun. If fewer frames are received than were transmitted, the

rate of the offered stream is reduced and the test is rerun.

The throughput is the fastest rate at which the count of test

frames transmitted by the DUT is equal to the number of test

frames sent to it by the test equipment.

Reporting format:

The results of the throughput test SHOULD be reported in the form

of a graph. If it is, the x coordinate SHOULD be the frame size,

the y coordinate SHOULD be the frame rate. There SHOULD be at

least two lines on the graph. There SHOULD be one line showing

the theoretical frame rate for the media at the various frame

sizes. The second line SHOULD be the plot of the test results.

Additional lines MAY be used on the graph to report the results

for each type of data stream tested. Text accompanying the graph

SHOULD indicate the protocol, data stream format, and type of

media used in the tests.

We assume that if a single value is desired for advertising

purposes the vendor will select the rate for the minimum frame

size for the media. If this is done then the figure MUST be

expressed in frames per second. The rate MAY also be expressed in

bits (or bytes) per second if the vendor so desires. The

statement of performance MUST include a/ the measured maximum

frame rate, b/ the size of the frame used, c/ the theoretical

limit of the media for that frame size, and d/ the type of

protocol used in the test. Even if a single value is used as part

of the advertising copy, the full table of results SHOULD be

included in the product data sheet.

26.2 Latency

Objective:

To determine the latency as defined in RFC1242.

Procedure:

First determine the throughput for DUT at each of the listed frame

sizes. Send a stream of frames at a particular frame size through

the DUT at the determined throughput rate to a specific

destination. The stream SHOULD be at least 120 seconds in

duration. An identifying tag SHOULD be included in one frame

after 60 seconds with the type of tag being implementation

dependent. The time at which this frame is fully transmitted is

recorded (timestamp A). The receiver logic in the test equipment

MUST recognize the tag information in the frame stream and record

the time at which the tagged frame was received (timestamp B).

The latency is timestamp B minus timestamp A as per the relevant

definition frm RFC1242, namely latency as defined for store and

forward devices or latency as defined for bit forwarding devices.

The test MUST be repeated at least 20 times with the reported

value being the average of the recorded values.

This test SHOULD be performed with the test frame addressed to the

same destination as the rest of the data stream and also with each

of the test frames addressed to a new destination network.

Reporting format:

The report MUST state which definition of latency (from RFC1242)

was used for this test. The latency results SHOULD be reported

in the format of a table with a row for each of the tested frame

sizes. There SHOULD be columns for the frame size, the rate at

which the latency test was run for that frame size, for the media

types tested, and for the resultant latency values for each

type of data stream tested.

26.3 Frame loss rate

Objective:

To determine the frame loss rate, as defined in RFC1242, of a DUT

throughout the entire range of input data rates and frame sizes.

Procedure:

Send a specific number of frames at a specific rate through the

DUT to be tested and count the frames that are transmitted by the

DUT. The frame loss rate at each point is calculated using the

following equation:

( ( input_count - output_count ) * 100 ) / input_count

The first trial SHOULD be run for the frame rate that corresponds

to 100% of the maximum rate for the frame size on the input media.

Repeat the procedure for the rate that corresponds to 90% of the

maximum rate used and then for 80% of this rate. This sequence

SHOULD be continued (at reducing 10% intervals) until there are

two successive trials in which no frames are lost. The maximum

granularity of the trials MUST be 10% of the maximum rate, a finer

granularity is encouraged.

Reporting format:

The results of the frame loss rate test SHOULD be plotted as a

graph. If this is done then the X axis MUST be the input frame

rate as a percent of the theoretical rate for the media at the

specific frame size. The Y axis MUST be the percent loss at the

particular input rate. The left end of the X axis and the bottom

of the Y axis MUST be 0 percent; the right end of the X axis and

the top of the Y axis MUST be 100 percent. Multiple lines on the

graph MAY used to report the frame loss rate for different frame

sizes, protocols, and types of data streams.

Note: See section 18 for the maximum frame rates that SHOULD be

used.

26.4 Back-to-back frames

Objective:

To characterize the ability of a DUT to process back-to-back

frames as defined in RFC1242.

Procedure:

Send a burst of frames with minimum inter-frame gaps to the DUT

and count the number of frames forwarded by the DUT. If the count

of transmitted frames is equal to the number of frames forwarded

the length of the burst is increased and the test is rerun. If

the number of forwarded frames is less than the number

transmitted, the length of the burst is reduced and the test is

rerun.

The back-to-back value is the number of frames in the longest

burst that the DUT will handle without the loss of any frames.

The trial length MUST be at least 2 seconds and SHOULD be

repeated at least 50 times with the average of the recorded values

being reported.

Reporting format:

The back-to-back results SHOULD be reported in the format of a

table with a row for each of the tested frame sizes. There SHOULD

be columns for the frame size and for the resultant average frame

count for each type of data stream tested. The standard deviation

for each measurement MAY also be reported.

26.5 System recovery

Objective:

To characterize the speed at which a DUT recovers from an overload

condition.

Procedure:

First determine the throughput for a DUT at each of the listed

frame sizes.

Send a stream of frames at a rate 110% of the recorded throughput

rate or the maximum rate for the media, whichever is lower, for at

least 60 seconds. At Timestamp A reduce the frame rate to 50% of

the above rate and record the time of the last frame lost

(Timestamp B). The system recovery time is determined by

subtracting Timestamp B from Timestamp A. The test SHOULD be

repeated a number of times and the average of the recorded values

being reported.

Reporting format:

The system recovery results SHOULD be reported in the format of a

table with a row for each of the tested frame sizes. There SHOULD

be columns for the frame size, the frame rate used as the

throughput rate for each type of data stream tested, and for the

measured recovery time for each type of data stream tested.

26.6 Reset

Objective:

To characterize the speed at which a DUT recovers from a device or

software reset.

Procedure:

First determine the throughput for the DUT for the minimum frame

size on the media used in the testing.

Send a continuous stream of frames at the determined throughput

rate for the minimum sized frames. Cause a reset in the DUT.

Monitor the output until frames begin to be forwarded and record

the time that the last frame (Timestamp A) of the initial stream

and the first frame of the new stream (Timestamp B) are received.

A power interruption reset test is performed as above except that

the power to the DUT should be interrupted for 10 seconds in place

of causing a reset.

This test SHOULD only be run using frames addressed to networks

directly connected to the DUT so that there is no requirement to

delay until a routing update is received.

The reset value is obtained by subtracting Timestamp A from

Timestamp B.

Hardware and software resets, as well as a power interruption

SHOULD be tested.

Reporting format:

The reset value SHOULD be reported in a simple set of statements,

one for each reset type.

27. Security Considerations

Security issues are not addressed in this document.

28. Editors' Addresses

Scott Bradner

Harvard University

1350 Mass. Ave, room 813

Cambridge, MA 02138

Phone +1 617 495-3864

Fax +1 617 496-8500

EMail: sob@harvard.edu

Jim McQuaid

Bay Networks

3 Federal Street

Billerica, MA 01821

Phone +1 508 436-3915

Fax: +1 508 670-8145

EMail: jmcquaid@baynetworks.com

Appendix A: Testing Considerations

A.1 Scope Of This Appendix

This appendix discusses certain issues in the benchmarking

methodology where experience or judgment may play a role in the tests

selected to be run or in the approach to constructing the test with a

particular DUT. As such, this appendix MUST not be read as an

amendment to the methodology described in the body of this document

but as a guide to testing practice.

1. Typical testing practice has been to enable all protocols to be

tested and conduct all testing with no further configuration of

protocols, even though a given set of trials may exercise only one

protocol at a time. This minimizes the opportunities to "tune" a

DUT for a single protocol.

2. The least common denominator of the available filter functions

should be used to ensure that there is a basis for comparison

between vendors. Because of product differences, those conducting

and evaluating tests must make a judgment about this issue.

3. Architectural considerations may need to be considered. For

example, first perform the tests with the stream going between

ports on the same interface card and the repeat the tests with the

stream going into a port on one interface card and out of a port

on a second interface card. There will almost always be a best

case and worst case configuration for a given DUT architecture.

4. Testing done using traffic streams consisting of mixed protocols

has not shown much difference between testing with individual

protocols. That is, if protocol A testing and protocol B testing

give two different performance results, mixed protocol testing

appears to give a result which is the average of the two.

5. Wide Area Network (WAN) performance may be tested by setting up

two identical devices connected by the appropriate short- haul

versions of the WAN modems. Performance is then measured between

a LAN interface on one DUT to a LAN interface on the other DUT.

The maximum frame rate to be used for LAN-WAN-LAN configurations is a

judgment that can be based on known characteristics of the overall

system including compression effects, fragmentation, and gross link

speeds. Practice suggests that the rate should be at least 110% of

the slowest link speed. Substantive issues of testing compression

itself are beyond the scope of this document.

Appendix B: Maximum frame rates reference

(Provided by Roger Beeman, Cisco Systems)

Size Ethernet 16Mb Token Ring FDDI

(bytes) (pps) (pps) (pps)

64 14880 24691 152439

128 8445 13793 85616

256 4528 7326 45620

512 2349 3780 23585

768 1586 2547 15903

1024 1197 1921 11996

1280 961 1542 9630

1518 812 1302 8138

Ethernet size

Preamble 64 bits

Frame 8 x N bits

Gap 96 bits

16Mb Token Ring size

SD 8 bits

AC 8 bits

FC 8 bits

DA 48 bits

SA 48 bits

RI 48 bits ( 06 30 00 12 00 30 )

SNAP

DSAP 8 bits

SSAP 8 bits

Control 8 bits

Vendor 24 bits

Type 16 bits

Data 8 x ( N - 18) bits

FCS 32 bits

ED 8 bits

FS 8 bits

Tokens or idles between packets are not included

FDDI size

Preamble 64 bits

SD 8 bits

FC 8 bits

DA 48 bits

SA 48 bits

SNAP

DSAP 8 bits

SSAP 8 bits

Control 8 bits

Vendor 24 bits

Type 16 bits

Data 8 x ( N - 18) bits

FCS 32 bits

ED 4 bits

FS 12 bits

Appendix C: Test Frame Formats

This appendix defines the frame formats that may be used with these

tests. It also includes protocol specific parameters for TCP/IP over

Ethernet to be used with the tests as an example.

C.1. Introduction

The general logic used in the selection of the parameters and the

design of the frame formats is explained for each case within the

TCP/IP section. The same logic has been used in the other sections.

Comments are used in these sections only if there is a protocol

specific feature to be explained. Parameters and frame formats for

additional protocols can be defined by the reader by using the same

logic.

C.2. TCP/IP Information

The following section deals with the TCP/IP protocol suite.

C.2.1 Frame Type.

An application level datagram echo request is used for the test

data frame in the protocols that support such a function. A

datagram protocol is used to minimize the chance that a router

might expect a specific session initialization sequence, as might

be the case for a reliable stream protocol. A specific defined

protocol is used because some routers verify the protocol field

and refuse to forward unknown protocols.

For TCP/IP a UDP Echo Request is used.

C.2.2 Protocol Addresses

Two sets of addresses must be defined: first the addresses

assigned to the router ports, and second the address that are to

be used in the frames themselves and in the routing updates.

The network addresses 192.18.0.0 through 192.19.255.255 are have

been assigned to the BMWG by the IANA for this purpose. This

assignment was made to minimize the chance of conflict in case a

testing device were to be accidentally connected to part of the

Internet. The specific use of the addresses is detailed below.

C.2.2.1 Router port protocol addresses

Half of the ports on a multi-port router are referred to as

"input" ports and the other half as "output" ports even though

some of the tests use all ports both as input and output. A

contiguous series of IP Class C network addresses from

198.18.1.0 to 198.18.64.0 have been assigned for use on the

"input" ports. A second series from 198.19.1.0 to 198.19.64.0

have been assigned for use on the "output" ports. In all cases

the router port is node 1 on the appropriate network. For

example, a two port DUT would have an IP address of 198.18.1.1

on one port and 198.19.1.1 on the other port.

Some of the tests described in the methodology memo make use of

an SNMP management connection to the DUT. The management

Access address for the DUT is assumed to be the first of the

"input" ports (198.18.1.1).

C.2.2.2 Frame addresses

Some of the described tests assume adjacent network routing

(the reboot time test for example). The IP address used in the

test frame is that of node 2 on the appropriate Class C

network. (198.19.1.2 for example)

If the test involves non-adjacent network routing the phantom

routers are located at node 10 of each of the appropriate Class

C networks. A series of Class C network addresses from

198.18.65.0 to 198.18.254.0 has been assigned for use as the

networks accessible through the phantom routers on the "input"

side of DUT. The series of Class C networks from 198.19.65.0

to 198.19.254.0 have been assigned to be used as the networks

visible through the phantom routers on the "output" side of the

DUT.

C.2.3 Routing Update Frequency

The update interval for each routing protocol is may have to be

determined by the specifications of the individual protocol. For

IP RIP, Cisco IGRP and for OSPF a routing update frame or frames

should precede each stream of test frames by 5 seconds. This

frequency is sufficient for trial durations of up to 60 seconds.

Routing updates must be mixed with the stream of test frames if

longer trial periods are selected. The frequency of updates

should be taken from the following table.

IP-RIP 30 sec

IGRP 90 sec

OSPF 90 sec

C.2.4 Frame Formats - detailed discussion

C.2.4.1 Learning Frame

In most protocols a procedure is used to determine the mapping

between the protocol node address and the MAC address. The

Address Resolution Protocol (ARP) is used to perform this

function in TCP/IP. No such procedure is required in XNS or

IPX because the MAC address is used as the protocol node

address.

In the ideal case the tester would be able to respond to ARP

requests from the DUT. In cases where this is not possible an

ARP request should be sent to the router's "output" port. This

request should be seen as coming from the immediate destination

of the test frame stream. (i.e. the phantom router (Figure 2)

or the end node if adjacent network routing is being used.) It

is assumed that the router will cache the MAC address of the

requesting device. The ARP request should be sent 5 seconds

before the test frame stream starts in each trial. Trial

lengths of longer than 50 seconds may require that the router

be configured for an extended ARP timeout.

+--------+ +------------+

phantom ------ P LAN

IN A------ DUT ------------ ------ P LAN

OUT A router ------ P LAN

+--------+ +------------+

Figure 2

In the case where full routing is being used

C.2.4.2 Routing Update Frame

If the test does not involve adjacent net routing the tester

must supply proper routing information using a routing update.

A single routing update is used before each trial on each

"destination" port (see section C.24). This update includes

the network addresses that are reachable through a phantom

router on the network attached to the port. For a full mesh

test, one destination network address is present in the routing

update for each of the "input" ports. The test stream on each

"input" port consists of a repeating sequence of frames, one to

each of the "output" ports.

C.2.4.3 Management Query Frame

The management overhead test uses SNMP to query a set of

variables that should be present in all DUTs that support SNMP.

The variables for a single interface only are read by an NMS

at the appropriate intervals. The list of variables to

retrieve follow:

sysUpTime

ifInOctets

ifOutOctets

ifInUcastPkts

ifOutUcastPkts

C.2.4.4 Test Frames

The test frame is an UDP Echo Request with enough data to fill

out the required frame size. The data should not be all bits

off or all bits on since these patters can cause a "bit

stuffing" process to be used to maintain clock synchronization

on WAN links. This process will result in a longer frame than

was intended.

C.2.4.5 Frame Formats - TCP/IP on Ethernet

Each of the frames below are described for the 1st pair of DUT

ports, i.e. "input" port #1 and "output" port #1. Addresses

must be changed if the frame is to be used for other ports.

C.2.6.1 Learning Frame

ARP Request on Ethernet

-- DATAGRAM HEADER

offset data (hex) description

00 FF FF FF FF FF FF dest MAC address send to

broadcast address

06 xx xx xx xx xx xx set to source MAC address

12 08 06 ARP type

14 00 01 hardware type Ethernet = 1

16 08 00 protocol type IP = 800

18 06 hardware address length 48 bits

on Ethernet

19 04 protocol address length 4 octets

for IP

20 00 01 opcode request = 1

22 xx xx xx xx xx xx source MAC address

28 xx xx xx xx source IP address

32 FF FF FF FF FF FF requesting DUT's MAC address

38 xx xx xx xx DUT's IP address

C.2.6.2 Routing Update Frame

-- DATAGRAM HEADER

offset data (hex) description

00 FF FF FF FF FF FF dest MAC address is broadcast

06 xx xx xx xx xx xx source hardware address

12 08 00 type

-- IP HEADER

14 45 IP version - 4, header length (4

byte units) - 5

15 00 service field

16 00 EE total length

18 00 00 ID

20 40 00 flags (3 bits) 4 (do not

fragment),

fragment offset-0

22 0A TTL

23 11 protocol - 17 (UDP)

24 C4 8D header checksum

26 xx xx xx xx source IP address

30 xx xx xx destination IP address

33 FF host part = FF for broadcast

-- UDP HEADER

34 02 08 source port 208 = RIP

36 02 08 destination port 208 = RIP

38 00 DA UDP message length

40 00 00 UDP checksum

-- RIP packet

42 02 command = response

43 01 version = 1

44 00 00 0

-- net 1

46 00 02 family = IP

48 00 00 0

50 xx xx xx net 1 IP address

53 00 net not node

54 00 00 00 00 0

58 00 00 00 00 0

62 00 00 00 07 metric 7

-- net 2

66 00 02 family = IP

68 00 00 0

70 xx xx xx net 2 IP address

73 00 net not node

74 00 00 00 00 0

78 00 00 00 00 0

82 00 00 00 07 metric 7

-- net 3

86 00 02 family = IP

88 00 00 0

90 xx xx xx net 3 IP address

93 00 net not node

94 00 00 00 00 0

98 00 00 00 00 0

102 00 00 00 07 metric 7

-- net 4

106 00 02 family = IP

108 00 00 0

110 xx xx xx net 4 IP address

113 00 net not node

114 00 00 00 00 0

118 00 00 00 00 0

122 00 00 00 07 metric 7

-- net 5

126 00 02 family = IP

128 00 00 0

130 00 net 5 IP address

133 00 net not node

134 00 00 00 00 0

138 00 00 00 00 0

142 00 00 00 07 metric 7

-- net 6

146 00 02 family = IP

148 00 00 0

150 xx xx xx net 6 IP address

153 00 net not node

154 00 00 00 00 0

158 00 00 00 00 0

162 00 00 00 07 metric 7

C.2.4.6 Management Query Frame

To be defined.

C.2.6.4 Test Frames

UDP echo request on Ethernet

-- DATAGRAM HEADER

offset data (hex) description

00 xx xx xx xx xx xx set to dest MAC address

06 xx xx xx xx xx xx set to source MAC address

12 08 00 type

-- IP HEADER

14 45 IP version - 4 header length 5 4

byte units

15 00 TOS

16 00 2E total length*

18 00 00 ID

20 00 00 flags (3 bits) - 0 fragment

offset-0

22 0A TTL

23 11 protocol - 17 (UDP)

24 C4 8D header checksum*

26 xx xx xx xx set to source IP address**

30 xx xx xx xx set to destination IP address**

-- UDP HEADER

34 C0 20 source port

36 00 07 destination port 07 = Echo

38 00 1A UDP message length*

40 00 00 UDP checksum

-- UDP DATA

42 00 01 02 03 04 05 06 07 some data***

50 08 09 0A 0B 0C 0D 0E 0F

* - change for different length frames

** - change for different logical streams

*** - fill remainder of frame with incrementing octets,

repeated if required by frame length

Values to be used in Total Length and UDP message length fields:

frame size total length UDP message length

64 00 2E 00 1A

128 00 6E 00 5A

256 00 EE 00 9A

512 01 EE 01 9A

768 02 EE 02 9A

1024 03 EE 03 9A

1280 04 EE 04 9A

1518 05 DC 05 C8