    nsIParserUtils

    Provides non-Web HTML parsing functionality to Firefox extensions and XULRunner applications.
    Introduced in Gecko 13.0
    Inherits from: nsISupports
    Last changed in Gecko 14.0 (Firefox 14.0 / Thunderbird 14.0 / SeaMonkey 2.11)

    Warning: Do not use this from within Gecko. Use nsContentUtils, nsTreeSanitizer, and so on directly instead.

    Implemented by: @mozilla.org/parserutils;1 as a service:

    var parserUtils = Components.classes["@mozilla.org/parserutils;1"]
                      .getService(Components.interfaces.nsIParserUtils);
    

    Method overview

    AString convertToPlainText(in AString src, in unsigned long flags, in unsigned long wrapCol);
    nsIDOMDocumentFragment parseFragment(in AString fragment, in unsigned long flags, in boolean isXML, in nsIURI baseURI, in nsIDOMElement element);
    AString sanitize(in AString src, in unsigned long flags);

    Constants

    Constant Value Description
    SanitizerAllowComments (1 << 0) Flag for sanitizer: Allow comment nodes.
    SanitizerAllowStyle (1 << 1) Flag for sanitizer: Allow <style> elements and style attributes (with contents sanitized in case of -moz-binding). Note: If -moz-binding is absent, properties that might be XSS risks in other Web engines are preserved!
    SanitizerCidEmbedsOnly (1 << 2) Flag for sanitizer: Only allow cid: URLs for embedded content. At present, sanitizing CSS backgrounds and so on is not supported, so setting this together with SanitizerAllowStyle doesn't make sense. At present, sanitizing CSS syntax in SVG presentational attributes is not supported, so this option flattens out SVG.
    SanitizerDropNonCSSPresentation (1 << 3) Flag for sanitizer: Drops non-CSS presentational HTML elements and attributes, such as <font>, <center>, and the bgcolor attribute.
    SanitizerDropForms (1 << 4) Flag for sanitizer: Drops forms and form controls (excluding <fieldset> and <legend>).
    SanitizerDropMedia (1 << 5) Flag for sanitizer: Drops <img>, <video>, <audio>, and <source>, and flattens out SVG.
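
    Because each constant occupies its own bit, several flags can be combined into one flags argument with bitwise OR. The following is a minimal sketch in plain JavaScript. The SanitizerFlags object and the hasFlag helper are hypothetical names that only mirror the values in the table above; privileged code would normally read the constants from Components.interfaces.nsIParserUtils rather than redefine them.

```javascript
// Illustrative local mirror of the constant values listed in the table above.
var SanitizerFlags = Object.freeze({
  SanitizerAllowComments: 1 << 0,
  SanitizerAllowStyle: 1 << 1,
  SanitizerCidEmbedsOnly: 1 << 2,
  SanitizerDropNonCSSPresentation: 1 << 3,
  SanitizerDropForms: 1 << 4,
  SanitizerDropMedia: 1 << 5
});

// Combine two options into a single bitmask: drop forms and drop media.
var flags = SanitizerFlags.SanitizerDropForms | SanitizerFlags.SanitizerDropMedia;

// Hypothetical helper: reports whether a given flag bit is set in a bitmask.
function hasFlag(mask, flag) {
  return (mask & flag) !== 0;
}
```

    A value of 0 requests the default behavior, with none of the optional flags set.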

    Methods

    convertToPlainText()

    Converts HTML to plain text.

    AString convertToPlainText(
      in AString src,
      in unsigned long flags,
      in unsigned long wrapCol
    );
    
    Parameters
    src
    The HTML source to parse (C++ callers are allowed but not required to use the same string for the return value).
    flags
    Conversion option flags defined in nsIDocumentEncoder.
    wrapCol
    Number of characters per line; 0 for no auto-wrapping.
    Return value

    The plain text conversion of the HTML specified in src.
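
    As an illustration, a call to convertToPlainText() might be wrapped as follows. This is a sketch rather than a canonical pattern: htmlToPlainText is a hypothetical helper name, the service is passed in as an argument so the helper stays independent of how it was obtained, and the choice of OutputFormatted from nsIDocumentEncoder and a 72-column wrap is only an example.

```javascript
// Hypothetical helper: converts an HTML string to plain text.
// parserUtils  - the nsIParserUtils service
// html         - the HTML source to convert
// encoderFlags - bitmask of nsIDocumentEncoder flags (defaults to 0)
// wrapCol      - characters per line; 0 disables auto-wrapping (default)
function htmlToPlainText(parserUtils, html, encoderFlags, wrapCol) {
  return parserUtils.convertToPlainText(html, encoderFlags || 0, wrapCol || 0);
}

// Example use from privileged code; skipped where Components is unavailable.
if (typeof Components !== "undefined") {
  var parserUtils = Components.classes["@mozilla.org/parserutils;1"]
                    .getService(Components.interfaces.nsIParserUtils);
  var plainText = htmlToPlainText(
    parserUtils,
    "<p>Hello <b>world</b></p>",
    Components.interfaces.nsIDocumentEncoder.OutputFormatted,
    72
  );
}
```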

    Requires Gecko 14.0 (Firefox 14.0 / Thunderbird 14.0 / SeaMonkey 2.11)

    parseFragment()

    Parses markup into a sanitized document fragment.

    nsIDOMDocumentFragment parseFragment(
      in AString fragment,
      in unsigned long flags,
      in boolean isXML,
      in nsIURI baseURI,
      in nsIDOMElement element
    );
    
    Parameters
    fragment
    The input markup.
    flags
    Sanitization option flags defined above.
    isXML
    true if fragment is XML and false if it is HTML.
    baseURI
    The base URL for this fragment.
    element
    The context node for the fragment parsing algorithm.
    Return value

    An nsIDOMDocumentFragment object for the resulting sanitized document fragment.
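
    A typical use is to parse untrusted markup and append the sanitized result to an existing element. The sketch below assumes that use case. appendSanitizedHtml is a hypothetical helper name, the service and all other inputs are supplied by the caller, and the target element doubles as the context node.

```javascript
// Hypothetical helper: parses untrusted HTML into a sanitized fragment and
// appends it to targetElement, which also serves as the parsing context node.
// parserUtils - the nsIParserUtils service
// flags       - bitmask of the Sanitizer* constants (0 for defaults)
// baseURI     - nsIURI used as the base URL for the fragment
function appendSanitizedHtml(parserUtils, targetElement, markup, flags, baseURI) {
  var fragment = parserUtils.parseFragment(
    markup,        // the input markup
    flags || 0,    // sanitization option flags
    false,         // isXML: the markup is treated as HTML
    baseURI,       // base URL for this fragment
    targetElement  // context node for the fragment parsing algorithm
  );
  targetElement.appendChild(fragment);
  return fragment;
}
```

    Passing the insertion target as the context node means the markup is parsed as it would be inside that element.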

    Requires Gecko 14.0 (Firefox 14.0 / Thunderbird 14.0 / SeaMonkey 2.11)

    sanitize()

    Parses a string into an HTML document, sanitizes the document, and returns the result serialized to a string.

    The sanitizer is designed to protect against XSS when sanitized content is inserted into a different-origin context without an iframe-equivalent sandboxing mechanism.

    By default, the sanitizer doesn't try to prevent leaking to third parties the information that the content was viewed. For example, an <img> whose source points to an HTTP server potentially controlled by a third party is not removed by default. To avoid ambient information leakage upon loading the sanitized content, use the SanitizerCidEmbedsOnly flag. Even in that case, <a> links (and similar) to other content are preserved, so an explicit user action (following a link) after the content has been loaded can still leak information.

    By default, non-dangerous non-CSS presentational HTML elements and attributes, as well as forms, are not removed. To remove these, use SanitizerDropNonCSSPresentation and/or SanitizerDropForms.

    By default, comments and CSS are removed. To preserve comments, use SanitizerAllowComments. To preserve <style> elements and style attributes on other elements, use SanitizerAllowStyle. -moz-binding is removed from <style> elements and style attributes if present. In this case, properties that Gecko doesn't recognize can get removed as a side effect.

    Note: If SanitizerAllowStyle is specified and -moz-binding is not present in <style> elements or style attributes, the sanitized content may still be XSS dangerous if loaded into a non-Gecko Web engine!
    AString sanitize(
      in AString src,
      in unsigned long flags
    );
    
    Parameters
    src
    The HTML source to parse (C++ callers are allowed but not required to use the same string for the return value).
    flags
    Sanitization option flags defined above.
    Return value

    The resulting text.
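
    The flag choices described above can be combined in a single call. The sketch below assumes a mail-like scenario in which untrusted HTML should load no remote embeds and carry no forms. sanitizeForDisplay is a hypothetical helper name, and the particular flag combination is only one possible choice.

```javascript
// Hypothetical helper: sanitizes untrusted HTML for display.
// parserUtils - the nsIParserUtils service; its Sanitizer* constants
//               are read from the service object itself.
function sanitizeForDisplay(parserUtils, untrustedHtml) {
  var flags = parserUtils.SanitizerCidEmbedsOnly |          // only cid: embeds
              parserUtils.SanitizerDropForms |              // no forms or form controls
              parserUtils.SanitizerDropNonCSSPresentation;  // no <font>, bgcolor, and so on
  return parserUtils.sanitize(untrustedHtml, flags);
}
```

    SanitizerAllowStyle is deliberately left out here, since combining it with SanitizerCidEmbedsOnly doesn't make sense according to the constants table.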

    Document Tags and Contributors

    Contributors to this page: Sheppy, Archaeopteryx
    Last updated by: Sheppy,