XPIDL

XPIDL is an Interface Description Language used to specify XPCOM interface classes.

Interface Description Languages (IDL) are used to describe interfaces in a language- and machine-independent way. IDLs make it possible to define interfaces which can then be processed by tools to autogenerate language-dependent interface specifications. XPIDL is expected to converge towards WebIDL in the future.

Note: Starting in Gecko 9.0, the older xpidl utility, which was previously used to generate C++ header files, typelib information, and so forth, has been replaced with pyxpidl in the Gecko SDK. pyxpidl had already been in use for some time; the older tool has now been fully retired.

Writing XPIDL interface files

XPIDL closely resembles OMG IDL, with extended syntax to handle IIDs and additional types. Some examples are in the xpcom/base and xpcom/ds directories of the Mozilla tree.

Explanation of IDL semantics

A full guide to the syntax can be found at XPIDL:Syntax, which is written in an ABNF form.

An xpidl file is essentially just a series of declarations. At the top level, we can define typedefs, native types, or interfaces. Interfaces may furthermore contain typedefs, natives, methods, constants, or attributes. Most declarations can have properties applied to them.

Types

There are three ways to make types: a typedef, a native, or an interface. In addition, there are a few built-in native types, namely those listed under the type_spec production of the syntax guide. The following table gives their correspondence to C++ and JavaScript types:

Table 1: Standard IDL types
| IDL | C++ in parameter | C++ out parameter | JS type | Notes |
| boolean | bool | bool * | boolean | |
| char | char | char * | string | Only chars in range \u0000-\u00ff permitted |
| double | double | double * | number | |
| float | float | float * | number | |
| long | int32_t | int32_t * | number | |
| long long | int64_t | int64_t * | number | |
| octet | uint8_t | uint8_t * | number | |
| short | int16_t | int16_t * | number | |
| string | const char * | char ** | string | Only chars in range \u0000-\u00ff permitted |
| unsigned long | uint32_t | uint32_t * | number | |
| unsigned long long | uint64_t | uint64_t * | number | |
| unsigned short | uint16_t | uint16_t * | number | |
| wchar | PRUnichar | PRUnichar * | string | Full Unicode set permitted |
| wstring | const PRUnichar * | PRUnichar ** | string | Full Unicode set permitted |

In addition to this list, nearly every IDL file includes nsrootidl.idl in some fashion, which also defines the following types:

Table 2: Types provided by nsrootidl.idl
| IDL typedef | C++ in parameter | C++ out parameter | JS type | Notes |
| PRTime | (XPIDL unsigned long long typedef, 64 bits) | | number | PRTime is in microseconds, while JS date assumes time in milliseconds |
| nsresult | (XPIDL unsigned long typedef, 32 bits) | | number | |
| nsrefcnt | (XPIDL unsigned long typedef, 32 bits) | | number | |
| size_t | (XPIDL unsigned long typedef, 32 bits) | | number | |
| voidPtr | void * | void * | not allowed | |
| charPtr | char * | char ** | not allowed | |
| unicharPtr | PRUnichar * | PRUnichar ** | not allowed | |
| nsIDRef | const nsID & | nsID * | ? | |
| nsIIDRef | const nsIID & | nsIID * | ? | |
| nsCIDRef | const nsCID & | nsCID * | ? | |
| nsIDPtr | const nsID * | nsID ** | ? | |
| nsIIDPtr | const nsIID * | nsIID ** | ? | |
| nsCIDPtr | const nsCID * | nsCID ** | ? | |
| nsIID | const nsIID | nsIID * | ? | |
| nsID | const nsID | nsID * | ? | |
| nsCID | const nsCID | nsCID * | ? | |
| nsQIResult | void * | void ** | object | Should only be used with methods that act like QueryInterface |
| DOMString | const nsAString & | nsAString & | string | Full Unicode set permitted |
| AUTF8String | const nsACString & | nsACString & | string | Full Unicode set permitted (translated to UTF-8) |
| ACString | const nsACString & | nsACString & | string | Only chars in range \u0000-\u00ff permitted |
| AString | const nsAString & | nsAString & | string | Full Unicode set permitted |
| jsval | const jsval & | jsval * | anything | |
| jsid | jsid | jsid * | not allowed | |
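
The unit mismatch noted for PRTime is easy to trip over when a value crosses between native code and script. The following is a minimal sketch of the conversion, assuming PRTime is the 64-bit unsigned microsecond count listed in the table; the alias and both helper names are hypothetical and are not part of XPCOM.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the nsrootidl.idl typedef, following the table above
// (64-bit unsigned count of microseconds). Not the real definition.
using ExamplePRTime = uint64_t;

// Hypothetical helper: PRTime microseconds -> JS Date milliseconds.
constexpr double ExamplePRTimeToJSMillis(ExamplePRTime aTime) {
  return static_cast<double>(aTime) / 1000.0;
}

// Hypothetical helper: JS Date milliseconds -> PRTime microseconds.
// Fractional microseconds are truncated.
constexpr ExamplePRTime ExampleJSMillisToPRTime(double aMillis) {
  return static_cast<ExamplePRTime>(aMillis * 1000.0);
}
```

A script-side value such as Date.now() would go through the second helper before being passed as a PRTime, and a PRTime read from native code would go through the first before being handed to new Date().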

Typedefs in IDL are basically as they are in C or C++: you define first the type that you want to refer to and then the name of the type. Types can of course be one of the fundamental types, or any other type declared via a typedef, interface, or a native type.

Native types are types which correspond to a given C++ type. Most native types are not scriptable: a native type that is not present in the list above is certainly not scriptable, and even some of those listed (notably jsid) are not scriptable.

The contents of the parentheses of a native type declaration is a string equivalent to the C++ type. (Native declarations without parentheses are parsable, but they may not be handled properly by the xpidl tools.) XPIDL itself does not interpret this string; it just pastes it literally anywhere the native type is used. The interpretation of the type can be modified by placing properties on the native declaration:

Table 3: Native type definitions
| Property | Interpretation |
| astring | This is an nsAString declaration. Overrides native string. |
| cstring | This is an nsACString declaration. Overrides native string. |
| domstring | This is an nsAString declaration. Overrides native string. |
| jsval | This type gets const when an in type. Special in typelib. |
| nsid | This type gets const when an in type. Special in typelib. |
| ptr | The type is really (native str)* |
| ref | The type is really (native str)& |
| utf8string | This is an nsACString declaration whose text is UTF-8. |

These properties appear to apply to typedefs as well, although this has not been verified.

Constants

Constants are technically legal at the top level, but xpidl forbids them from being placed there; instead, they must be in an interface. The only constants supported are those which become integer types when compiled to source code; string constants and floating point constants, though parseable, cannot be made into a header or xpt file.

Constants are emitted in header files using anonymous enums, although there is an outstanding patch that combines adjacent constants into a single anonymous enum to quiet enum mismatch warnings.
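
To make the anonymous-enum form concrete, here is a simplified sketch of what the generated C++ might look like. The interface name nsIExample and its two constants are hypothetical, and the class is a stand-in that omits everything else a generated header contains.

```cpp
#include <cassert>
#include <cstdint>

// IDL (hypothetical interface, shown for illustration only):
//   interface nsIExample : nsISupports {
//     const long STATE_IDLE = 0;
//     const long STATE_BUSY = 1;
//   };

// Simplified stand-in for the generated class.
class nsIExample {
 public:
  // Each constant lands in its own anonymous enum, so it is a
  // compile-time integer constant scoped to the interface.
  enum { STATE_IDLE = 0 };
  enum { STATE_BUSY = 1 };
};

// The constants can be used wherever an integral constant is needed.
static_assert(nsIExample::STATE_BUSY == 1, "constant is usable at compile time");
```

Because the values are enumerators rather than static data members, they need no out-of-line definition, which is one reason only integer constants can be emitted this way.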

Interfaces

Specifying interfaces is the primary purpose of using xpidl. Interfaces are basically a collection of constants, methods, and attributes; in Mozilla, these are the primary ways in which JavaScript code can interact with native C++ code. Furthermore, interfaces can also inherit from another interface. Every interface should inherit nsISupports in some fashion. However, it is generally not recommended to have a chain of interfaces inheriting from each other if you intend to have a chain of implementations for each interface, as it can cause problems in C++ code.

Table 4: Basic interface attributes
| Attribute | Interpretation |
| uuid(12345678-fedc-ba98-7654-0123456789ab) | This is the internal way this interface is accessed; it must be unique, and the uuid must be changed anytime any part of the interface or its ancestors are changed. For instructions on how to generate a UUID, see Generating GUIDs. |
| builtinclass | JavaScript classes are forbidden from implementing this interface. All children must also be marked with this property. |
| function | The JavaScript implementation of this interface may be a function that is invoked on property calls instead of an object with the given property. |
| scriptable | This interface is usable by JavaScript classes. Must inherit from a scriptable interface. |
| deprecated | This interface should no longer be used. The compiler will emit warnings if you attempt to use this. |

Methods and attributes

Interfaces declare a series of attributes and methods. Attributes in IDL are akin to JavaScript properties, in that they are a getter and (optionally) a setter pair. In JavaScript contexts, attributes are exposed as a regular property access, while native code sees attributes as a Get and possibly a Set method.

Attributes can be declared read-only with the "readonly" keyword, in which case setting them causes an error to be thrown in script contexts, and native contexts lack the Set method.

To native code, an attribute declared 'attribute type foo;' is syntactic sugar for the declaration of two methods, 'type getFoo();' and 'void setFoo(in type foo);'. If foo were declared readonly, the latter method would not be present. Attributes support all of the properties of methods with the exception of optional_argc, as this does not make sense for attributes.
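
As an illustration of the getter and setter pair, the following sketch shows how a native implementation of two attributes might look. The class, the attribute names, and the nsresult definitions are stand-ins invented for this example so that it compiles on its own; real code would use the XPCOM headers and the generated interface class.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the real XPCOM definitions (assumptions for this sketch).
using nsresult = uint32_t;
constexpr nsresult NS_OK = 0;
constexpr nsresult NS_ERROR_NULL_POINTER = 0x80004003;

// IDL (hypothetical):
//   attribute long count;
//   readonly attribute boolean empty;
class ExampleImpl {
 public:
  // Getter: the attribute value comes back through an out parameter.
  nsresult GetCount(int32_t* aCount) {
    if (!aCount) return NS_ERROR_NULL_POINTER;
    *aCount = mCount;
    return NS_OK;
  }

  // Setter: present only because the attribute is not readonly.
  nsresult SetCount(int32_t aCount) {
    mCount = aCount;
    return NS_OK;
  }

  // readonly attribute: a getter with no matching SetEmpty.
  nsresult GetEmpty(bool* aEmpty) {
    if (!aEmpty) return NS_ERROR_NULL_POINTER;
    *aEmpty = (mCount == 0);
    return NS_OK;
  }

 private:
  int32_t mCount = 0;
};
```

From script, the same two attributes would simply read as obj.count and obj.empty, and an assignment to obj.empty would throw.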

There are some special rules for attribute naming. As a result of vtable munging by the MSVC++ compiler, an attribute with the name 'IID' is forbidden. In addition, any attribute whose name matches the regex /^[a-z]{2,3}I[A-Z][a-z]/ is emitted with a warning, as its name looks like an nsIInterface or a mozIInterface declaration. As with methods, if the first character of an attribute name is lowercase in IDL, it is made uppercase in native code only.
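
The two naming checks can be tried out with a small sketch. It applies the regex quoted above through std::regex; the function names are hypothetical, and this is not the code the xpidl tools actually run.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true if an attribute name would trigger the "looks like an
// interface name" warning, using the pattern quoted in the text.
inline bool ExampleLooksLikeInterfaceName(const std::string& aName) {
  static const std::regex kPattern("^[a-z]{2,3}I[A-Z][a-z]");
  return std::regex_search(aName, kPattern);
}

// Returns true if the attribute name is the one that is forbidden outright.
inline bool ExampleIsForbiddenAttributeName(const std::string& aName) {
  return aName == "IID";
}
```

With these helpers, names such as nsIFoo and mozIStorage are flagged by the warning check, while an ordinary name such as count is not.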

Methods define a return type and a series of in and out parameters. When called from a JavaScript context, the invocation looks as it is declared for the most part; some parameter properties can adjust what the code looks like. The calls are more mangled in native contexts.

An important attribute for methods and attributes is scriptability. A method or attribute is scriptable if it is declared in a scriptable interface and it lacks a noscript or notxpcom property. Any method that is not scriptable can only be accessed by native code. However, scriptable methods must contain parameters and a return type that can be translated to script: any native type, save those declared with an nsid, domstring, utf8string, cstring, astring, or jsval property, may not be used in a scriptable method or attribute. An exception to the above rule is if the parameter has the iid_is property (a special case for some QueryInterface-like operations). In general, this means that the only usable native types are those declared in nsrootidl.idl (see above).

Methods and attributes are mangled on conversion to native code. If a method is declared notxpcom, the mangling of the return type is prevented, so it is called mostly as it looks. Otherwise, the return type of the native method is nsresult, and the declared return type, if it is not void, becomes a final out parameter. The name is translated so that the first character is unconditionally uppercase; subsequent characters are unaffected. However, the binaryname property allows the user to select another name to use in native code (to avoid conflicts with other functions). For example, the method '[binaryname(foo)] void bar();' becomes 'nsresult Foo()' in native code (note that capitalization is still applied). The capitalization is not applied, however, when binaryname is used with attributes; i.e., '[binaryname(foo)] readonly attribute Quux bar;' becomes 'Getfoo(Quux**)' in native code. Attributes named 'IID' and methods named 'GetIID' are forbidden; note that this is checked before binaryname conversion.
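
The naming rules in this paragraph can be summarized in a short sketch. The helper names are hypothetical and the functions only model the rules as described here; they are not taken from the xpidl tools.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Uppercases the first character only; later characters are untouched.
inline std::string ExampleCapitalizeFirst(std::string aName) {
  if (!aName.empty()) {
    aName[0] = static_cast<char>(
        std::toupper(static_cast<unsigned char>(aName[0])));
  }
  return aName;
}

// Native name of a method. An empty aBinaryName means the binaryname
// property is absent. Capitalization applies in both cases.
inline std::string ExampleNativeMethodName(const std::string& aIdlName,
                                           const std::string& aBinaryName = "") {
  if (aBinaryName.empty()) {
    return ExampleCapitalizeFirst(aIdlName);
  }
  return ExampleCapitalizeFirst(aBinaryName);
}

// Native name of an attribute getter. With binaryname, the given name
// is appended to "Get" as-is, without capitalization.
inline std::string ExampleNativeGetterName(const std::string& aIdlName,
                                           const std::string& aBinaryName = "") {
  if (aBinaryName.empty()) {
    return "Get" + ExampleCapitalizeFirst(aIdlName);
  }
  return "Get" + aBinaryName;
}
```

Running the two examples from the text through these helpers gives Foo for the method and Getfoo for the attribute getter.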

The implicit_jscontext and optional_argc properties help native code implementations determine how the call was made from script. If implicit_jscontext is present on a method, then an additional JSContext *cx parameter is added just after the regular list which receives the context of the caller. If optional_argc is present, then an additional uint8_t _argc parameter is added at the end which receives the number of optional arguments that were actually used (obviously, you need to have an optional argument in the first place). Note that if both properties are set, the JSContext *cx is added first, followed by the uint8_t _argc, and then ending with the return value parameter. Finally, as an exception to everything already mentioned, for attribute getters and setters the JSContext *cx comes before any other arguments.
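
The parameter ordering is easiest to see in a concrete signature. The sketch below assumes a hypothetical method carrying both properties; JSContext and nsresult are stand-ins defined locally so that the example compiles on its own, and the default value used for the omitted argument is an arbitrary choice for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the real definitions (assumptions for this sketch).
using nsresult = uint32_t;
constexpr nsresult NS_OK = 0;
struct JSContext {};  // opaque in real code

// IDL (hypothetical):
//   [implicit_jscontext, optional_argc]
//   long add(in long a, [optional] in long b);
class ExampleAdder {
 public:
  // Order: declared parameters, then JSContext* cx, then uint8_t _argc,
  // then the return value as the final out parameter.
  nsresult Add(int32_t a, int32_t b, JSContext* cx, uint8_t _argc,
               int32_t* _retval) {
    (void)cx;  // unused in this sketch
    // _argc counts the optional arguments the caller actually supplied.
    const int32_t second = (_argc >= 1) ? b : 1;  // hypothetical default of 1
    *_retval = a + second;
    return NS_OK;
  }
};
```

Without optional_argc, the implementation could not distinguish a script call of add(2) from add(2, 0), since an omitted numeric argument would arrive as zero either way.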

In addition, methods and attributes can both be marked as deprecated with the deprecated property, which causes compilers to emit deprecation usage warnings. Note that this is only verified in native code and not script code.

The final native-only property is nostdcall. Normally, declarations are made in the stdcall ABI on Windows to be ABI-compatible with COM interfaces. Any non-scriptable method or attribute with nostdcall instead uses the thiscall ABI convention. Methods without this property generally use NS_IMETHOD in their declarations and NS_IMETHODIMP in their definitions to automatically add in the stdcall declaration specifier on requisite compilers; methods with this property may use a plain 'nsresult' instead.

Source and Binary Compatibility

Some consumers of IDL interfaces create binary plugins that expect the interfaces to be stored in a specific way in memory. In other words, some changes made to IDL interfaces require the author to modify the unique identifier (IID) in order to make it clear to plugins that utilize these interfaces that they have changed, and thus their plugin must be recompiled.

Common changes to an interface, such as changes to a method signature, number of arguments, and number or type of attributes, automatically require an IID change. In addition, some changes to interface attributes require that an IID be changed, as well. When a change to an interface made by an XPIDL developer requires that third-party binary addons be recompiled, we say that it affects binary compatibility. When a change to an interface made by an XPIDL developer requires that third-party binary addons change their source code, we say that it affects source compatibility. In table 5, the columns on the far right indicate whether changes to a specific attribute affect source compatibility, binary compatibility, or both.

Table 5: Optional interface attributes
| Attribute | Valid for methods | Valid for attributes | Effect | Changes Source Compatibility? | Changes Binary Compatibility? |
| binaryname(foo) | Y | Y | Results in the C++ method being called "Foo" | Y | N |
| deprecated | Y | Y | Emits a compiler warning if used in C++ code | N | N |
| implicit_jscontext | Y | Y | Adds an additional JSContext *cx parameter to the C++ implementation | Y | Y |
| noscript | Y | Y | Prohibits the method/attribute from being accessible in JS code | N | N |
| nostdcall | Y | Y | The C++ implementation uses virtual nsresult instead of NS_IMETHOD/NS_IMETHODIMP | Y | Y |
| notxpcom | Y | Y | The C++ implementation does not return nsresult (implies noscript) | Y | Y |
| optional_argc | Y | N | Adds an additional uint8_t _argc parameter to the C++ implementation | Y | Y |

Method parameters

Each method parameter can be specified in one of three modes: in, out, or inout. An out parameter is essentially an auxiliary return value, although these are moderately cumbersome to use from script contexts and should therefore be avoided if reasonable. An inout parameter is an in parameter whose value may be changed as a result of the method; these parameters are rather annoying to use and should generally be avoided if at all possible.

Out and inout parameters are reflected as objects having a .value property which contains the real value of the parameter; it is not initialized in the case of out parameters and is initialized to the passed-in value for inout parameters. The script code would need to set this property to assign a value to the parameter. Regular in parameters are reflected more or less normally, with numeric types all representing numbers, booleans as true or false, the various strings (including AString et al) as a JavaScript string, and nsid types as a Components.ID instance. In addition, the jsval type is translated as the appropriate JavaScript value (since a jsval is the internal representation of all JavaScript values), and objects that are marked nsIVariant have their types automatically boxed and unboxed as appropriate.

The equivalent representations of all IDL types in native code are given in the earlier tables; parameters of type inout follow their out form. Native code should take particular care not to pass null values for out parameters (although some parts of the codebase are known to violate this, it is strictly enforced at the JS<->native barrier), and should also ensure that boolean types only receive values of 0 (false) or 1 (true).

Representations of types additionally depend on some of the many types of properties they may have. The array property turns the parameter into an array; the parameter must also have a corresponding size_is property whose argument is the parameter that has the size of the array. In native code, the type gains another pointer indirection, and JavaScript arrays are used in script code. Script code callers can ignore the value of array parameter, but implementors must still set the values appropriately.
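
As a sketch of how an array parameter and its size_is companion might appear to native code, consider a hypothetical method that sums an in array. The IDL, the function name, and the nsresult stand-ins are all invented for this example.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for the real XPCOM definitions (assumptions for this sketch).
using nsresult = uint32_t;
constexpr nsresult NS_OK = 0;

// IDL (hypothetical):
//   long sum(in unsigned long count,
//            [array, size_is(count)] in long values);
//
// In native code the array parameter gains one pointer indirection
// (long -> int32_t*), and its length arrives in the size_is parameter.
inline nsresult ExampleSum(uint32_t count, int32_t* values, int32_t* _retval) {
  int32_t total = 0;
  for (uint32_t i = 0; i < count; ++i) {
    total += values[i];
  }
  *_retval = total;
  return NS_OK;
}
```

The native implementation has no other way to learn the array length, which is why it must rely on the size_is parameter.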

The const and shared properties are special to native code. As its name implies, the const property makes its corresponding argument const. The shared property is only meaningful for out or inout parameters and it means that the pointer value should not be freed by the caller. Only the string, wstring, and native types having the nsid, domstring, utf8string, cstring, astring, or jsval properties may be declared shared, and, even then, only if the parameter is not an array parameter. The shared property also makes its corresponding argument const.

The retval property indicates that the parameter is actually acting as the return value, and it is only the need to assign properties to the parameter that is causing it to be specified as a parameter. It has no effect on native code, but script code uses it like a regular return value. Naturally, a method which contains a retval parameter must be declared void, and the parameter itself must be an out parameter and the last parameter.

Other properties are optional and iid_is. The optional property indicates that script code may omit the parameter without problems; all subsequent parameters must either be optional themselves or be the retval parameter. Note that optional out parameters still pass in a variable for the parameter, but its value will be ignored. The iid_is property indicates that the real IID of an nsQIResult parameter may be found in the corresponding parameter, to allow script code to automatically unbox the type.

Not all type combinations are possible. Native types with the various string properties are all forbidden from being used as an inout parameter or as an array parameter. In addition, native types with the nsid property but lacking either a ptr or ref property are forbidden unless the method is notxpcom and the type is used as an in parameter.

For types that reference heap-allocated data (strings, arrays, interface pointers, etc), you must follow the XPIDL data ownership conventions in order to avoid memory corruption and security vulnerabilities:

  • For in parameters, the caller allocates and deallocates all data. If the callee needs to use the data after the call completes, it must make a private copy of the data, or, in the case of interface pointers, AddRef it.
  • For out parameters, the callee creates the data, and transfers ownership to the caller. For buffers, the callee allocates the buffer with NS_Alloc, and the caller frees the buffer with NS_Free. For interface pointers, the callee does the AddRef on behalf of the caller, and the caller must call Release.
  • For inout parameters, the callee must clean up the old data if it chooses to replace it. Buffers must be deallocated with NS_Free, and interface pointers must be Release'd. Afterwards, the above rules for out apply.
  • Shared out-parameters should not be freed, as they are intended to refer to constant string literals.
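
The out-parameter rules in the list above can be sketched as follows. To keep the example self-contained, ExampleAlloc and ExampleFree stand in for NS_Alloc and NS_Free (they simply wrap malloc and free), ExampleThing stands in for a reference-counted XPCOM object, and the nsresult values are local definitions; none of these names come from XPCOM.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Stand-ins for the real XPCOM definitions (assumptions for this sketch).
using nsresult = uint32_t;
constexpr nsresult NS_OK = 0;
constexpr nsresult NS_ERROR_OUT_OF_MEMORY = 0x8007000e;

// Stand-ins for NS_Alloc / NS_Free.
inline void* ExampleAlloc(std::size_t aSize) { return std::malloc(aSize); }
inline void ExampleFree(void* aPtr) { std::free(aPtr); }

// Minimal reference-counted object standing in for an interface pointer.
class ExampleThing {
 public:
  uint32_t AddRef() { return ++mRefCnt; }
  uint32_t Release() {
    const uint32_t count = --mRefCnt;
    if (count == 0) delete this;
    return count;
  }
  uint32_t RefCount() const { return mRefCnt; }

 private:
  ~ExampleThing() = default;  // destroyed only through Release()
  uint32_t mRefCnt = 0;
};

// out string: the callee allocates the buffer; the caller must free it.
inline nsresult ExampleGetName(char** aName) {
  static const char kName[] = "example";
  char* buffer = static_cast<char*>(ExampleAlloc(sizeof(kName)));
  if (!buffer) return NS_ERROR_OUT_OF_MEMORY;
  std::memcpy(buffer, kName, sizeof(kName));
  *aName = buffer;
  return NS_OK;
}

// out interface pointer: the callee AddRefs on behalf of the caller,
// and the caller must Release when done.
inline nsresult ExampleGetThing(ExampleThing* aOwned, ExampleThing** aResult) {
  aOwned->AddRef();
  *aResult = aOwned;
  return NS_OK;
}
```

A caller of ExampleGetName owns the returned buffer and passes it to ExampleFree exactly once; a caller of ExampleGetThing owns one reference and balances it with a single Release.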

Last updated by: fscholz