This is preliminary documentation of the changes introduced to Mozilla as part of the BiDi support contributed by IBM (a.k.a. IBMBIDI), written by Simon Montagu and posted to the mozilla-layout mailing list. Although it was published in 2001 and may no longer be fully accurate, it helps in understanding the internals of the BiDi code.
Overview of BiDi processing
Bidi text is reordered according to the Unicode Bidi Algorithm (UBA). The implementation is based on IBM's International Components for Unicode (ICU), which was chosen after comparing and testing the available open-source implementations. As far as we could discover, ICU is the only one that is 100% compatible with the UBA, including support for explicit directional controls (LRO, RLO, etc., and their HTML equivalents).
Bidi processing for a given HTML document will only take place if one of the following is true:
- The page includes a Hebrew or Arabic character or a Hindi digit. This is determined in
- The page includes an element with the attribute dir=rtl, either explicitly (nsGenericHTMLElement::MapCommonAttributesInto), or as a consequence of a style rule (
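As a rough illustration of the first condition, the check amounts to scanning the text for characters in the Hebrew or Arabic Unicode blocks. The sketch below is not the Mozilla code: the names IsHebrewChar, IsArabicChar and NeedsBidi are hypothetical, and it assumes that only the basic Hebrew (U+0590 to U+05FF) and Arabic (U+0600 to U+06FF) blocks matter. The Arabic block also contains the Arabic-Indic ("Hindi") digits U+0660 to U+0669, so they need no separate test here. Presentation-form blocks are deliberately ignored.

```cpp
#include <string>

// Hypothetical helpers, named for illustration only.
// Basic Hebrew block: U+0590..U+05FF.
inline bool IsHebrewChar(char16_t c) { return c >= 0x0590 && c <= 0x05FF; }

// Basic Arabic block: U+0600..U+06FF. This range includes the
// Arabic-Indic ("Hindi") digits U+0660..U+0669.
inline bool IsArabicChar(char16_t c) { return c >= 0x0600 && c <= 0x06FF; }

// Returns true as soon as one character that would call for Bidi
// processing is found; otherwise returns false.
bool NeedsBidi(const std::u16string& text) {
  for (char16_t c : text) {
    if (IsHebrewChar(c) || IsArabicChar(c)) {
      return true;
    }
  }
  return false;
}
```

Under these assumptions, a caller would run NeedsBidi over each piece of incoming text and enable Bidi for the document the first time it returns true.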
All these cases use nsDocument::EnableBidi to set mBidiEnabled. In a Bidi-enabled document, the following things happen:
- During a reflow, nsBidiPresUtils::Resolve is called. This method uses the UBA to determine the directional properties of the text and to reorder frames if necessary. Where needed, text frames are split so that all the text within any one frame has the same directionality.
FrameManager::SetFrameProperty is used to set the following flags and pointers (for terminology see the specification of the UBA):
- embeddingLevel: the embedding level of the frame
- textClass: the text class of the frame.
- baseLevel: the base level (direction) of the paragraph.
- nextBidi: when a frame has been split, this points to the next frame (in logical order). It is an
- "Reordering" of frames is accomplished by setting the appropriate frame coordinates. The order of the frames in the content model is not affected, so frames that are adjacent in the content model can be far apart visually. A new frame iterator, nsVisualIterator (in nsFrameTraversal.cpp), provides visual frame navigation capability.
- Details of rendering are dependent on user preferences and system capabilities. Where the system is capable of tasks such as reversing and shaping text, symmetric swapping, numeric translation, etc., no special text rendering is needed, though there may be a call to a native API to set the base text direction (for example SetTextAlign on Windows). For systems without Bidi capabilities, the methods in nsIUBidiUtils are used.
Note that we are not affected by buggy Bidi implementations on specific platforms, since the platform never sees a text fragment with mixed directionality, and is not expected to do anything more complicated than displaying left-to-right text from left to right or right-to-left text from right to left.
- In some circumstances, even on a platform with Bidi capability, the layout code has to reverse text fragments or to allow for the fact that they are displayed in reverse. In general, this happens whenever we are dealing with less than a whole frame. Examples of this are in nsTextFrame::PaintUnicodeText when a selection is displayed;
Text in Visual mode must also be reversed before display on a Bidi platform.
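The reordering step described above can be sketched with rule L2 of the UBA: from the highest embedding level down to the lowest odd level, reverse every contiguous sequence of items whose level is at least the current one. The function below is a minimal illustration of that rule, not the code in nsBidiPresUtils or ICU. The name VisualOrder is hypothetical, and it assumes that each frame on a line has already been assigned a single embedding level.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: levels[i] is the embedding level of the i-th item
// (for example a frame) in logical order. The result maps visual position
// to logical index, so result[0] is the item displayed leftmost.
std::vector<std::size_t> VisualOrder(const std::vector<std::uint8_t>& levels) {
  std::vector<std::size_t> order(levels.size());
  for (std::size_t i = 0; i < order.size(); ++i) {
    order[i] = i;  // start from logical order
  }

  int highest = 0;
  int lowestOdd = -1;  // -1 means that no odd (right-to-left) level exists
  for (std::uint8_t level : levels) {
    highest = std::max<int>(highest, level);
    if ((level & 1) && (lowestOdd < 0 || level < lowestOdd)) {
      lowestOdd = level;
    }
  }
  if (lowestOdd < 0) {
    return order;  // purely left-to-right: nothing to reverse
  }

  // UBA rule L2: reverse runs at each level, from the highest level
  // down to the lowest odd level.
  for (int level = highest; level >= lowestOdd; --level) {
    std::size_t i = 0;
    while (i < order.size()) {
      if (levels[order[i]] < level) {
        ++i;
        continue;
      }
      std::size_t j = i;
      while (j < order.size() && levels[order[j]] >= level) {
        ++j;
      }
      std::reverse(order.begin() + i, order.begin() + j);
      i = j;
    }
  }
  return order;
}
```

In a layout engine the resulting order would then be used to assign x coordinates to the frames from left to right, which leaves the content model order untouched, as the text above describes.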
Text fields and Composer
The specification of the Bidi changes to composer was posted in the editor and i18n newsgroups, and responses there were taken into account. The implementation is mostly in layout code, especially in nsSelection.cpp and nsCaret.cpp.
Other BiDi functionality
- Clipboard: based on Bidi Options in Preferences, the Text Mode of the clipboard may be "Logical", "Visual" or "As Source".
- In "As Source" mode, the text copied into the clipboard is exactly the same (from a Bidi point of view) as the original source. The text pasted from the clipboard (to the composer or to an edit field) is pasted as is.
- In "Visual" mode, the text copied into the clipboard is exactly the displayed text. The text pasted from the clipboard is converted (if needed) so that Mozilla displays it (from a Bidi point of view) as it would be displayed by a visual clipboard viewer.
- In "Logical" mode, the text copied into the clipboard is converted (if needed) so that a Logical application will display it (from a Bidi point of view) as it is displayed by Mozilla. Text pasted from the clipboard is treated exactly as if it came from a Logical source.
- Form controls: based on Bidi Options in Preferences, the text mode of form controls may be "Logical", "Visual" or "Like Containing Document". We have also tested the behaviour of all controls with dir=rtl and added support where necessary.
- Some support added for alignment
in tables and lists, and fixes for problems with different combinations of
to lists with Hebrew and Arabic
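One way to read the three clipboard text modes described above is as a small decision table: given the text mode of the source and the clipboard mode chosen in Preferences, decide which conversion, if any, is applied when copying. The sketch below encodes that reading only. All the names (TextMode, ClipboardMode, Conversion, ConversionOnCopy) are hypothetical, and the actual implementation may decide differently.

```cpp
// Hypothetical types, for illustration only.
enum class TextMode { Logical, Visual };              // how the source stores its text
enum class ClipboardMode { Logical, Visual, AsSource };
enum class Conversion { None, LogicalToVisual, VisualToLogical };

// Decide which conversion (if any) would apply when copying to the clipboard.
Conversion ConversionOnCopy(TextMode source, ClipboardMode mode) {
  switch (mode) {
    case ClipboardMode::AsSource:
      // The clipboard receives exactly what the source contains.
      return Conversion::None;
    case ClipboardMode::Visual:
      // The clipboard receives the text as displayed, so a logical
      // source has to be put into visual order first.
      return source == TextMode::Logical ? Conversion::LogicalToVisual
                                         : Conversion::None;
    case ClipboardMode::Logical:
      // The clipboard receives text for a logical application, so a
      // visual source has to be put back into logical order.
      return source == TextMode::Visual ? Conversion::VisualToLogical
                                        : Conversion::None;
  }
  return Conversion::None;  // not reached for valid enum values
}
```

Pasting would follow the mirror-image table under the same assumptions.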
Summary of New Classes
|Class Name||XPCOM interface (if applicable)||Implementation||Comments|
|nsIBidi||intl\unicharutil\public\nsIBidi.h||intl\unicharutil\src\nsBidiImp.cpp||Implementation of the Unicode Bidi algorithm|
|nsIUBidiUtils||intl\unicharutil\public\nsIUBidiUtils.h||intl\unicharutil\src\nsBidiUtilsImp.cpp||Utilities for Bidi processing, including:
||Utilities for the layout engine including:
||subclass of nsFrame with additional method
||subclass of nsFrame
This is a special frame which represents a Bidi control. It is created when resolving text containing a Unicode Bidi control character, a
||widget/src/%platform%/nsBidiKeyboard.cpp||Sets and queries the directionality of the current keyboard language.|