Citation :
68 6.7.3.1 Formal definition of restrict A new feature of C9X: The restrict type qualifier allows programs to be written so that translators can produce significantly faster executables. Anyone for whom this is not a concern can safely ignore this feature of the language. 20 The problem that the restrict qualifier addresses is that potential aliasing can inhibit optimizations. Specifically, if a translator cannot determine that two different pointers are being used to reference different objects, then it cannot apply optimizations such as maintaining the values of the objects in registers rather than in memory, or reordering loads and stores of these values. This problem can have a significant effect on a program that, for example, performs 25 arithmetic calculations on large arrays of numbers. The effect can be measured by comparing a program that uses pointers with a similar program that uses file scope arrays (or with a similar Fortran program). The array version can run faster by a factor of ten or more on a system with vector processors. Where such large performance gains are possible, implementations have of course offered their own solutions, usually in the form of compiler directives that specify particular 30 optimizations. Differences in the spelling, scope, and precise meaning of these directives have made them troublesome to use in a program that must run on many different systems. This was the motivation for a standard solution. The restrict qualifier was designed to express and extend two types of aliasing information 35 already specified in the language. First, if a single pointer is directly assigned the return value from an invocation of malloc, then that pointer is the sole initial means of access to the allocated object (that is, another pointer can gain access to that object only by being assigned a value that is based on the value of the first 40 pointer). Declaring the pointer to be restrict-qualified expresses this information to a translator. Furthermore, the qualifier can be used to extend a translator s special treatment of such a pointer to more general situations. For example, an invocation of malloc might be hidden from the translator in another function, or a single invocation of malloc might be used to allocate several objects, each referenced through its own pointer. 45 C9X RATIONALE WG14/N897 J11/99-032 69 Second, the library specifies two versions of an object copying function, because on many systems a faster copy is possible if it is known that the source and target arrays do not overlap. The restrict qualifier can be used to express the restriction on overlap in a new prototype that is compatible with the original version: 5 void *memcpy(void * restrict s1, const void * restrict s2, size_t n); void *memmove(void * s1, const void * s2, size_t n); With the restriction visible to a translator, a straightforward implementation of memcpy in C can 10 now give a level of performance that previously required assembly language or other non-standard means. Thus the restrict qualifier provides a standard means with which to make, in the definition of any function, an aliasing assertion of a type that could previously be made only for library functions. 15 The complexity of the specification of the restrict qualifier reflects the fact that C has a rich set of types and a dynamic notion of the type of an object. Recall, for example, that an object does not have a fixed type, but acquires a type when referenced. Similarly, in some of the library functions, the extent of an array object referenced through a pointer parameter is dynamically determined, either by another parameter or by the contents of the array. 20 The full specification is necessary to determine the precise meaning of a qualifier in any context, and so must be understood by compiler implementors. Fortunately, most others will need to understand only a few simple patterns of usage explained in the following examples. 25 A translator can assume that a file scope restrict-qualified pointer is the sole initial means of access to an object, much as if it were the declared name of an array. This is useful for a dynamically allocated array whose size is not known until run time. Note in the example how a single block of storage is effectively subdivided into two disjoint objects. 30 float * restrict a1, * restrict a2; void init(int n) { float * t = malloc(2 * n * sizeof(float)); 35 a1 = t; // a1 refers to 1st half a2 = t + n; // a2 refers to 2nd half } A translator can assume that a restrict-qualified pointer that is a function parameter is, at the 40 beginning of each execution of the function, the sole means of access to an object. Note that this assumption expires with the end of each execution. In the following example, parameters a1 and a2 can be assumed to refer to disjoint array objects because both are restrict-qualified. This implies that each iteration of the loop is independent of the others, and so the loop can be aggressively optimized. 45 WG14/N897 J11/99-032 C9X RATIONALE 70 void f1(int n, float * restrict a1, const float * restrict a2) { int i; for ( i = 0; i < n; i++ ) 5 a1[i] += a2[i]; } A translator can assume that a restrict-qualified pointer declared with block scope is, during each execution of the block, the sole initial means of access to an object. An invocation of the 10 macro shown in the following example is equivalent to an inline version of a call to the function f1 above. # define f2(N,A1,A2) \ { int n = (N); \ 15 float * restrict a1 = (A1); \ float * restrict a2 = (A2); \ int i; \ for ( i = 0; i < n; i++ ) \ a1[i] += a2[i]; \ 20 } The restrict qualifier can be used in the declaration of a structure member. A translator can assume, when an identifier is declared that provides a means of access to an object of that structure type, that the member provides the sole initial means of access to an object of the type specified in 25 the member declaration. The duration of the assumption depends on the scope of the identifier, not on the scope of the declaration of the structure. Thus a translator can assume that s1.a1 and s1.a2 below are used to refer to disjoint objects for the duration of the whole program, but that s2.a1 and s2.a2 are used to refer to disjoint objects only for the duration of each invocation of the f3 function. 30 struct t { int n; float * restrict a1, * restrict a2; }; 35 struct t s1; void f3(struct t s2) { /* ... */ } 40 The meaning of the restrict qualifier for a union member or in a type definition is analogous. Just as an object with a declared name can be aliased by an unqualified pointer, so can the object associated with a restrict-qualified pointer. The restrict qualifier is therefore unlike the register storage class, which precludes such aliasing. 45 This allows the restrict qualifier to be introduced more easily into existing programs, and also allows restrict to be used in new programs that call functions from libraries that do not use C9X RATIONALE WG14/N897 J11/99-032 the qualifier. In particular, a restrict-qualified pointer can be the actual argument for a function parameter that is unqualified. On the other hand, it is easier for a translator to find opportunities for optimization if as many as possible of the pointers in a program are restrictqualified.68 6.7.3.1 Formal definition of restrict A new feature of C9X: The restrict type qualifier allows programs to be written so that translators can produce significantly faster executables. Anyone for whom this is not a concern can safely ignore this feature of the language. 20 The problem that the restrict qualifier addresses is that potential aliasing can inhibit optimizations. Specifically, if a translator cannot determine that two different pointers are being used to reference different objects, then it cannot apply optimizations such as maintaining the values of the objects in registers rather than in memory, or reordering loads and stores of these values. This problem can have a significant effect on a program that, for example, performs 25 arithmetic calculations on large arrays of numbers. The effect can be measured by comparing a program that uses pointers with a similar program that uses file scope arrays (or with a similar Fortran program). The array version can run faster by a factor of ten or more on a system with vector processors. Where such large performance gains are possible, implementations have of course offered their own solutions, usually in the form of compiler directives that specify particular 30 optimizations. Differences in the spelling, scope, and precise meaning of these directives have made them troublesome to use in a program that must run on many different systems. This was the motivation for a standard solution. The restrict qualifier was designed to express and extend two types of aliasing information 35 already specified in the language. First, if a single pointer is directly assigned the return value from an invocation of malloc, then that pointer is the sole initial means of access to the allocated object (that is, another pointer can gain access to that object only by being assigned a value that is based on the value of the first 40 pointer). Declaring the pointer to be restrict-qualified expresses this information to a translator. Furthermore, the qualifier can be used to extend a translator s special treatment of such a pointer to more general situations. For example, an invocation of malloc might be hidden from the translator in another function, or a single invocation of malloc might be used to allocate several objects, each referenced through its own pointer. 45 C9X RATIONALE WG14/N897 J11/99-032 69 Second, the library specifies two versions of an object copying function, because on many systems a faster copy is possible if it is known that the source and target arrays do not overlap. The restrict qualifier can be used to express the restriction on overlap in a new prototype that is compatible with the original version: 5 void *memcpy(void * restrict s1, const void * restrict s2, size_t n); void *memmove(void * s1, const void * s2, size_t n); With the restriction visible to a translator, a straightforward implementation of memcpy in C can 10 now give a level of performance that previously required assembly language or other non-standard means. Thus the restrict qualifier provides a standard means with which to make, in the definition of any function, an aliasing assertion of a type that could previously be made only for library functions. 15 The complexity of the specification of the restrict qualifier reflects the fact that C has a rich set of types and a dynamic notion of the type of an object. Recall, for example, that an object does not have a fixed type, but acquires a type when referenced. Similarly, in some of the library functions, the extent of an array object referenced through a pointer parameter is dynamically determined, either by another parameter or by the contents of the array. 20 The full specification is necessary to determine the precise meaning of a qualifier in any context, and so must be understood by compiler implementors. Fortunately, most others will need to understand only a few simple patterns of usage explained in the following examples. 25 A translator can assume that a file scope restrict-qualified pointer is the sole initial means of access to an object, much as if it were the declared name of an array. This is useful for a dynamically allocated array whose size is not known until run time. Note in the example how a single block of storage is effectively subdivided into two disjoint objects. 30 float * restrict a1, * restrict a2; void init(int n) { float * t = malloc(2 * n * sizeof(float)); 35 a1 = t; // a1 refers to 1st half a2 = t + n; // a2 refers to 2nd half } A translator can assume that a restrict-qualified pointer that is a function parameter is, at the 40 beginning of each execution of the function, the sole means of access to an object. Note that this assumption expires with the end of each execution. In the following example, parameters a1 and a2 can be assumed to refer to disjoint array objects because both are restrict-qualified. This implies that each iteration of the loop is independent of the others, and so the loop can be aggressively optimized. 45 WG14/N897 J11/99-032 C9X RATIONALE 70 void f1(int n, float * restrict a1, const float * restrict a2) { int i; for ( i = 0; i < n; i++ ) 5 a1[i] += a2[i]; } A translator can assume that a restrict-qualified pointer declared with block scope is, during each execution of the block, the sole initial means of access to an object. An invocation of the 10 macro shown in the following example is equivalent to an inline version of a call to the function f1 above. # define f2(N,A1,A2) \ { int n = (N); \ 15 float * restrict a1 = (A1); \ float * restrict a2 = (A2); \ int i; \ for ( i = 0; i < n; i++ ) \ a1[i] += a2[i]; \ 20 } The restrict qualifier can be used in the declaration of a structure member. A translator can assume, when an identifier is declared that provides a means of access to an object of that structure type, that the member provides the sole initial means of access to an object of the type specified in 25 the member declaration. The duration of the assumption depends on the scope of the identifier, not on the scope of the declaration of the structure. Thus a translator can assume that s1.a1 and s1.a2 below are used to refer to disjoint objects for the duration of the whole program, but that s2.a1 and s2.a2 are used to refer to disjoint objects only for the duration of each invocation of the f3 function. 30 struct t { int n; float * restrict a1, * restrict a2; }; 35 struct t s1; void f3(struct t s2) { /* ... */ } 40 The meaning of the restrict qualifier for a union member or in a type definition is analogous. Just as an object with a declared name can be aliased by an unqualified pointer, so can the object associated with a restrict-qualified pointer. The restrict qualifier is therefore unlike the register storage class, which precludes such aliasing. 45 This allows the restrict qualifier to be introduced more easily into existing programs, and also allows restrict to be used in new programs that call functions from libraries that do not use C9X RATIONALE WG14/N897 J11/99-032 the qualifier. In particular, a restrict-qualified pointer can be the actual argument for a function parameter that is unqualified. On the other hand, it is easier for a translator to find opportunities for optimization if as many as possible of the pointers in a program are restrictqualified.
|