Boundary value analysis and integer sequences for fuzzing

Equivalence partition testing is a test case design technique in which test cases are designed using representatives from equivalence classes. An equivalence class (or partition) is a portion of the component input or output domains for which the component behavior is assumed to be the same from the component specification. In equivalence partition testing, test cases are designed to cover each class at least once, thereby reducing the total number of test cases that must be developed and executed.

Boundary value analysis is a test case design technique in which test cases are designed using representatives of boundary values. A boundary value is an input value or output value that is on the boundary between equivalence classes, or an incremental distance either side of the boundary.

At the most primitive level, access to memory in most modern processor architectures is made in 8-bit boundaries, or aligned to 2-byte, 4-byte, and 8-byte boundaries for 16-bit, 32-bit, and 64-bit values respectively. Therefore, compilers attempt to allocate memory in a way that prevents data misalignment, and pads structures in a way that aligns each member of the structure within these boundaries—since accessing unaligned memory may be expensive for processors that supports this type of operation. The same occurs when allocating memory buffers, whereas their sizes are allocated as multiples of these boundaries. More importantly, these boundaries also define the most primitive equivalence classes of any component that relies upon a processor architecture.

These sizes, when represented in a network protocol or file format as a length parameter, uses the two’s complement representation, which is the most common method of representing signed integers on present computers. A -1 two’s complement cast to an unsigned data type is equal to the largest binary repunit1 (a Mersenne number2) in the word length, namely 255, 65535, 4294967295, 18446744073709551615 in bytes, words, double words and quadruple words (Sloane’s A051179), and defines the maximum value of each of these equivalence classes.

Conversely, the minimum value of each of these equivalence classes are given by the binomial number of the form (2^n)+1 (a Fermat number3), namely 257, 65537, 4294967297, 18446744073709551617 (Sloane’s A000215).

Therefore, whether if the component specification is unknown, or if testing the implementation and enforcement of the component specification, integer sequences, such as Mersenne numbers (Sloane’s A000225) and a less common form of Fermat numbers (Sloane’s A000051), when used as representatives of boundary values either side of these boundaries, are heuristically optimal for identifying weaknesses such as “Integer Overflow or Wraparound (CWE-190)” and “Incorrect Calculation of Buffer Size (CWE-131).”

In fuzzing (or fuzz testing, or robustness testing), these equivalence classes, boundaries, and integer sequences, may be used for effectively reducing the number of trials and significantly increasing the number of outcomes in a shorter period of time.

1. “A repunit is a number consisting of copies of the single digit 1. The term “repunit” was coined by Beiler (1966), who also gave the first tabulation of known factors.” Weisstein, Eric W. “Repunit.” From MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/Repunit.html

2. “A Mersenne number is a number of the form (2^n) -1 (Sloane’s A000225), where n is an integer. The Mersenne numbers consist of all 1s in base-2, and are therefore binary repunits.” Weisstein, Eric W. “Mersenne Number.” From MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/MersenneNumber.html

3. “There are two definitions of the Fermat number. The less common is a number of the form (2^n) + 1 (Sloane’s A000051), obtained by setting x = 1 in a Fermat polynomial. The other much more commonly encountered Fermat numbers are a special case, given by the binomial number of the form (2^(2^n)) + 1 (Sloane’s A000215).” Weisstein, Eric W. “Fermat Number.” From MathWorld—A Wolfram Web Resource. http://mathworld.wolfram.com/FermatNumber.html

In Security.

CWE coverage for Red Hat Customer Portal

CWE has different views for different audiences and purposes. In the early stages of development, CWE only had one hierarchical representation, which originated the current Development Concepts View (or Development View). CWE is currently organized in two main views: Development Concepts (CWE-699), and Research Concepts (CWE-1000).

The Development View organizes weaknesses based on concepts frequently used in software development, and most of its categories and groups build upon well-known past taxonomies. However, the lack of mutually exclusiveness and the large number of categories and groups led to difficult maintenance and several inconsistencies being accidentally introduced during its evolution.

The Research View was developed to eliminate inconsistencies, identify research gaps, and easier maintenance. It is based only on weaknesses themselves and their abstractions, it has no categories, and explicitly models the relationships between weaknesses.

For the elements in the CWE coverage for Red Hat Customer Portal, we carefully selected abstractions with enough relevant information for developers to detect and mitigate all its related weaknesses.

While the preferred type of weakness for CWE assignments are weakness bases, weakness classes are used in cases that there are “research gaps,” and it turns out to be difficult to make consistent CWE assignments using related weakness bases.

One notable example is the “Improper Restriction of Operations within the Bounds of a Memory Buffer (CWE-119),” where its children clearly overlap and contain elements that are admittedly under-studied and under-reported, such as the “Out-of-bounds Read (CWE-125)” and its variants.

Sometimes used to identify signedness errors, another under-studied entry is “Integer Underflow (Wrap or Wraparound) (CWE-191),” which is neither used to identify signedness errors in this coverage, nor to identify incorrect calculations leading to lesser integer values wrapping into higher integer values (or “underflows”), as we do not consider the correctness of the term “underflow.” Such errors are identified by “Incorrect Calculation (CWE-682),” and “Integer Overflow or Wraparound (CWE-190)”—regardless if the wrapping occurs “downwards” or “upwards.”

CWE identifiers are assigned to Red Hat vulnerabilities using the present CWE coverage at the time of the vulnerability assessment. Thus, references to vulnerabilities are divided into “time slices” based on the date the vulnerability was assessed and the present CWE coverage at that time.

The present CWE coverage for Red Hat Customer Portal uses the CWE list version 2.2, is available in CSV format, and also in HTML format emphasized in both Development and Research views using the YUI TreeView Control for easier navigation and reference.

We strongly encourage other Linux vendors to use this coverage and also engage in the CWE Compatibility and Effectiveness Program.

In Security.

CWE compatibility for Red Hat Customer Portal

We are currently engaged in the CWE Compatibility and Effectiveness Program and working towards fulfilling its requirements for using CWE in our own outside-in methodology for identifying and eliminating the most dangerous software errors and weaknesses in our products. The CWE Compatibility and Effectiveness Program is a formal review and evaluation process for declaring products and services as “CWE-Compatible” and “CWE-Effective.”

To understand how CWE identifiers are assigned to Red Hat vulnerabilities, you also need to understand some elements of CWE terminology. CWE identifiers—also known as CWE IDs or CWEs—are organized into four main types: Category, Compound Element, View, and Weakness.

Weaknesses are also organized into three main types: Class, Base, and Variant.

Due to the diverse nature of our products, the preferred type of weakness for CWE assignments are weakness bases. Weakness classes are used in cases that there are “research gaps”, and it turns out to be difficult to make consistent CWE assignments using related weakness bases.

Weaknesses IDs are assigned to Red Hat vulnerabilities in chains. A chain is a sequence of two or more weaknesses closely linked together. Where one weakness X, can directly create the conditions necessary to cause another weakness Y, resulting in a vulnerability. In such cases, CWE refers to X as “primary” to Y, and Y as “resultant” from X.

A named chain is a noted commonly recurring chain that have received its own CWE identifier. A composite is an inclusive combination of two or more weaknesses or chains that forms another named weakness or vulnerability.

Differently from what security assessment tools do—with a few exceptions, such as integer overflows and variants—vulnerabilities are commonly named and reported as its resultant weaknesses, with little or no additional research from the reporting party. Thus, it is up to Red Hat to construct the chain backwards for properly identifying the cause of the vulnerability.

In the last post, I will present and go through the reasons behind the decisions for the elements in the CWE coverage for Red Hat Customer Portal.

In Security.