This spec is incomplete and it is not expected that it will advance beyond draft status. Authors should not use most of these features directly, but instead use JavaScript editing libraries. The features described in this document are not implemented consistently or fully by user agents, and it is not expected that this will change in the foreseeable future. There is currently no alternative to some execCommand actions related to clipboard content, and contentEditable=true is often used to draw the caret and move the caret in the block direction, as well as for a few minor purposes. This spec is meant to help implementations standardize these existing features. It is predicted that in the future both specs will be replaced by Content Editable and Input Events.

This document defines the behavior of the editing commands that can be executed with execCommand.

Introduction

The APIs specified here were originally introduced in Microsoft's Internet Explorer, but have subsequently been copied by other browsers in a haphazard and imprecise fashion. Although the behavior specified here does not exactly match any browser at the time of writing, it can serve as a target to converge to in the future.

Where the reasoning behind the specification is of interest, such as when major preexisting rendering engines are known not to match it, the reasoning is available by clicking the "comments" button on the right (requires JavaScript). If you have questions about why the specification says something, check the comments first. They're sometimes longer than the specification text itself, and commonly record what the major browsers do and other essential information.

The principles used for writing this reference list are:

Tests

Tests can be found here.

Issues

This specification is mostly feature-complete. It should be considered mostly stable and awaiting implementer review and feedback.

Significant known issues that I need feedback on, or otherwise am not planning to fix just yet:

A variety of other issues are also noted in the text, formatted like this. Feedback would be appreciated on all of them.

TODO:

Also TODO: Things that are only implemented by a couple of browsers and may or may not be useful to spec:

Things I haven't looked at that multiple browsers implement:

Things that would be useful to address for the future but aren't important to fix right now are in comments prefixed with "TODO".

Commands

Properties of commands

This specification defines a number of commands, identified by ASCII case-insensitive strings. Each command can have several pieces of data associated with it:

Supported commands

If you try doing anything with an unrecognized command (except queryCommandSupported), IE10 Developer Preview throws an "Invalid argument" exception. Firefox 15.0a1 throws NS_ERROR_NOT_IMPLEMENTED on querying indeterm/state/value, and returns false from execCommand/queryCommandEnabled. Chrome 19 dev returns false from everything. Opera Next 12.00 alpha throws NOT_SUPPORTED_ERR for execCommand and returns false for enabled/state/value. Originally I went with IE, although of course with a standard exception type. But after discussion (WebKit bug, Mozilla bug), I changed to match WebKit (except that I return "" for value instead of false). The issue is that there are a whole bunch of IE commands that no one else supports or wants to support, and throwing on execCommand() would make lots of pages break. WebKit was unwilling to take the compat risk, so we took the safer option.

Some commands will be supported in a given user agent, and some will not. All commands defined in this specification must be supported, except optionally the copy command, the cut command, and/or the paste command. Additional vendor-specific commands can also be supported, but implementers must prefix any vendor-specific command names with a vendor-specific string (e.g., "ms", "moz", "webkit", "opera").

I.e., no trying to look good on lazy conformance tests by just sticking in a stub implementation that does nothing.

A command that does absolutely nothing in a particular user agent, such that execCommand() never has any effect and queryCommandEnabled() and queryCommandIndeterm() and queryCommandState() and queryCommandValue() each return the same value all the time, must not be supported.

In a particular user agent, every command must be consistently either supported or not. Specifically, a user agent must not permit one page to see the same command sometimes supported and sometimes not over the course of the same browsing session, unless the user agent has been upgraded or reconfigured in the middle of a session. However, user agents may treat the same command as supported for some pages and not others, e.g., if the command is only supported for certain origins for security reasons.

Authors can tell whether a command is supported using queryCommandSupported().
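
As a hedged illustration of the sentence above, the sketch below wraps queryCommandSupported() in a defensive helper. The name supportsCommand and the doc parameter are assumptions made for this example and are not defined by this specification; doc stands for any object that exposes queryCommandSupported(), such as a page's document. The try/catch is included only because the comments in this document record that some older engines threw from this method.

```javascript
// Hypothetical helper, not defined by this specification.
// `doc` is assumed to expose queryCommandSupported(), like a browser `document`.
function supportsCommand(doc, command) {
  if (!doc || typeof doc.queryCommandSupported !== "function") {
    return false;
  }
  try {
    // Compare strictly so that only a real boolean true counts as support.
    return doc.queryCommandSupported(command) === true;
  } catch (err) {
    // Treat an engine that throws the same as "not supported".
    return false;
  }
}
```

In a page, supportsCommand(document, "bold") could be checked before calling execCommand(). Note that a supported command may still be disabled at a given moment.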

Enabled commands

At any given time, a supported command can be either enabled or not. Authors can tell whether a command is currently enabled using queryCommandEnabled(). Commands that are not enabled do nothing, as described in the definitions of the various methods that invoke commands.

Testing with bold:

IE10PP2 seems to return true if the active range's start node is editable, false otherwise.

Firefox 6.0a2 seems to always return true if there's anything editable on the page, and throw otherwise. (This is bug 676401.)

Chrome 14 dev seems to behave the same as IE10PP2.

Opera 11.11 seems to always return true if there's anything editable on the page, and false otherwise.

Firefox and Opera behave more or less uselessly. IE doesn't make much sense, in that whether a command is enabled seems meaningless: it will execute it on all nodes in the selection, editable or not. Chrome's definition makes sense in that it will only run the command if it's enabled, but it doesn't make much sense to only have the command run if the start is editable.

It's not clear to me what the point of this method is. There's no way we're going to always return true if the command will do something and false if it won't. I originally just stuck with a really conservative definition that happens to be convenient: if there's nothing selected, obviously nothing will work, and we want to bail out early in that case anyway because all the algorithms will talk about the active range. If there are use-cases for it to be more precise, I could make it so.

Bug 16094 illustrated that we don't really want to be able to modify multiple editing hosts at once, nor do we want to do anything if the start and end aren't both editable, so I co-opted this definition to fit my ends.

Among commands defined in this specification, those listed in Miscellaneous commands are always enabled, except for the cut command and the paste command. The other commands defined here are enabled if the active range is not null, its [=range/start node=] is either editable or an [=editing host=], the editing host of its [=range/start node=] is not an EditContext editing host, its [=range/end node=] is either editable or an [=editing host=], the editing host of its [=range/end node=] is not an EditContext editing host, and there is some [=editing host=] that is an [=tree/inclusive ancestor=] of both its [=range/start node=] and its [=range/end node=].
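
The enabled condition for non-miscellaneous commands can be sketched as a predicate over a simplified node model. Everything in the model is an assumption for illustration: nodes are plain objects with parent, isEditingHost, isEditable, and usesEditContext fields, and a range is either null or an object with startNode and endNode. A real user agent would derive these from the DOM, so this is a reading aid and not an implementation.

```javascript
// Simplified node model, assumed for illustration only:
//   { parent, isEditingHost, isEditable, usesEditContext }

// The editing host of a node: the node itself if it is an editing host,
// its nearest editing-host ancestor if it is editable, otherwise null.
function editingHostOf(node) {
  if (node.isEditingHost) return node;
  if (!node.isEditable) return null;
  for (let a = node.parent; a; a = a.parent) {
    if (a.isEditingHost) return a;
  }
  return null;
}

// The node followed by its parent, grandparent, and so on.
function inclusiveAncestors(node) {
  const result = [];
  for (let n = node; n; n = n.parent) result.push(n);
  return result;
}

// `range` is null or { startNode, endNode }.
function isEnabledForRange(range) {
  if (range === null) return false;
  for (const node of [range.startNode, range.endNode]) {
    // Each boundary node must be editable or an editing host.
    if (!node.isEditable && !node.isEditingHost) return false;
    // Its editing host must not be an EditContext editing host.
    const host = editingHostOf(node);
    if (host !== null && host.usesEditContext) return false;
  }
  // Some editing host must be an inclusive ancestor of both boundary nodes.
  const startChain = new Set(inclusiveAncestors(range.startNode));
  return inclusiveAncestors(range.endNode).some(
    (n) => n.isEditingHost && startChain.has(n)
  );
}
```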

Methods to query and execute commands

        partial interface Document {
          [CEReactions] boolean execCommand(DOMString commandId, optional boolean showUI = false, optional (TrustedHTML or DOMString) value = "");
        };
      

TODO: Add IDL for queryCommand* functions.

TODO: Define behavior for show UI.

When the execCommand(command, show UI, value) method on the {{Document}} interface is invoked, the user agent must run the following steps:

  1. If only one argument was provided, let show UI be false.
  2. If only one or two arguments were provided, let value be the empty string.
  3. For supported: see comment before Supported commands.

    For enabled: I didn't research this closely, but at a first glance, this is possibly how Chrome 14 dev and Opera 11.11 behave. Maybe also Firefox 6.0a2, except it throws if the command isn't enabled, I think. IE9 returns true in at least some cases even if the command is disabled. TODO: Is this right? Maybe we should be returning false in other cases too?

    If command is not supported or not enabled, return false.

  4. If command is not in the Miscellaneous commands section:

    We don't fire events for copy/cut/paste/undo/redo/selectAll because they should all have their own events. We don't fire events for styleWithCSS/useCSS because it's not obvious where to fire them, or why anyone would want them. We don't fire events for unsupported commands, because then if they became supported and were classified with the miscellaneous commands, we'd have to stop firing events for consistency's sake.

    1. Let affected editing host be the [=editing host=] that is an [=tree/inclusive ancestor=] of the active range's [=range/start node=] and [=range/end node=], and is not the [=tree/ancestor=] of any [=editing host=] that is an [=tree/inclusive ancestor=] of the active range's [=range/start node=] and [=range/end node=].

      Such an editing host must exist, because otherwise the command would not be enabled.

    2. [=Fire an event=] named "beforeinput" at affected editing host using {{InputEvent}}, with its {{Event/bubbles}} and {{Event/cancelable}} attributes initialized to true, and its {{InputEvent/data}} attribute initialized to null.
    3. If the value returned by the previous step is false, return false.
    4. If command is not enabled, return false.

      We have to check again whether the command is enabled, because the beforeinput handler might have done something annoying like getSelection().removeAllRanges().

    5. Let affected editing host be the [=editing host=] that is an [=tree/inclusive ancestor=] of the active range's [=range/start node=] and [=range/end node=], and is not the [=tree/ancestor=] of any [=editing host=] that is an [=tree/inclusive ancestor=] of the active range's [=range/start node=] and [=range/end node=].

      This new affected editing host is what we'll fire the input event at in a couple of lines. We want to compute it beforehand just to be safe: bugs in the command action might remove the selection or something bad like that, and we don't want to have to handle it later. We recompute it after the beforeinput event is handled so that if the handler moves the selection to some other editing host, the input event will be fired at the editing host that was actually affected.

  5. Take the action for command, passing value to the instructions as an argument.
  6. If the previous step returned false, return false.
  7. If the action modified the DOM tree, then [=fire an event=] named "input" at affected editing host using {{InputEvent}}, with its {{Event/isTrusted}} and {{Event/bubbles}} attributes initialized to true, its {{InputEvent/inputType}} attribute initialized to the [=map an edit command to input type value|mapped value=] of command, and its {{InputEvent/data}} attribute initialized to null.
  8. Return true.
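
The control flow of the steps above can be sketched as follows. This is a minimal sketch under stated assumptions, not how any user agent implements execCommand(): the env object and all of its hooks (isSupported, isEnabled, isMiscellaneous, affectedEditingHost, fire, takeAction, inputTypeFor) are hypothetical stand-ins for user-agent internals. The sketch also assumes that the input event is skipped when no affected editing host was computed, a case the steps above do not address.

```javascript
// `env` is a hypothetical bundle of hooks standing in for user-agent internals:
//   isSupported(cmd), isEnabled(cmd), isMiscellaneous(cmd),
//   affectedEditingHost(), fire(target, type, init) -> false if canceled,
//   takeAction(cmd, value) -> false, or { modifiedDom: boolean },
//   inputTypeFor(cmd).
function execCommandSketch(env, command, showUI = false, value = "") {
  // Steps 1-2 are covered by the default parameter values.
  // showUI is accepted but unused: its behavior is not defined yet.

  // Step 3: unsupported or disabled commands do nothing.
  if (!env.isSupported(command) || !env.isEnabled(command)) return false;

  let affectedHost = null;
  // Step 4: only non-miscellaneous commands fire beforeinput.
  if (!env.isMiscellaneous(command)) {
    affectedHost = env.affectedEditingHost();
    const proceed = env.fire(affectedHost, "beforeinput", {
      bubbles: true,
      cancelable: true,
      data: null,
    });
    if (proceed === false) return false;
    // The handler may have changed the selection, so check again.
    if (!env.isEnabled(command)) return false;
    // Recompute the host in case the selection moved elsewhere.
    affectedHost = env.affectedEditingHost();
  }

  // Steps 5-6: run the command's action.
  const outcome = env.takeAction(command, value);
  if (outcome === false) return false;

  // Step 7. Assumption: skip the event when no host was computed.
  if (outcome.modifiedDom && affectedHost !== null) {
    env.fire(affectedHost, "input", {
      isTrusted: true,
      bubbles: true,
      inputType: env.inputTypeFor(command),
      data: null,
    });
  }

  // Step 8.
  return true;
}
```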

To map an edit command to input type value, follow this table:

edit command            inputType
backColor               formatBackColor
bold                    formatBold
createLink              insertLink
fontName                formatFontName
foreColor               formatFontColor
strikethrough           formatStrikeThrough
superscript             formatSuperscript
delete                  deleteContentBackward
forwardDelete           deleteContentForward
indent                  formatIndent
insertHorizontalRule    insertHorizontalRule
insertLineBreak         insertLineBreak
insertOrderedList       insertOrderedList
insertParagraph         insertParagraph
insertText              insertText
insertUnorderedList     insertUnorderedList
justifyCenter           formatJustifyCenter
justifyFull             formatJustifyFull
justifyLeft             formatJustifyLeft
justifyRight            formatJustifyRight
outdent                 formatOutdent
cut                     deleteByCut
paste                   insertFromPaste
redo                    historyRedo
undo                    historyUndo
If no mapping exists, return an empty string.
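
The mapping can be expressed as a simple lookup. The sketch below is one possible encoding, assuming the identifiers INPUT_TYPE_BY_COMMAND, asciiLowercase, and mapEditCommandToInputType, none of which are defined by this specification. Keys are stored ASCII-lowercased because command names are matched ASCII case-insensitively.

```javascript
// Lowercase only A-Z, leaving every non-ASCII character untouched.
const asciiLowercase = (s) => s.replace(/[A-Z]/g, (c) => c.toLowerCase());

// Hypothetical lookup table mirroring the table above.
const INPUT_TYPE_BY_COMMAND = new Map(
  [
    ["backColor", "formatBackColor"],
    ["bold", "formatBold"],
    ["createLink", "insertLink"],
    ["fontName", "formatFontName"],
    ["foreColor", "formatFontColor"],
    ["strikethrough", "formatStrikeThrough"],
    ["superscript", "formatSuperscript"],
    ["delete", "deleteContentBackward"],
    ["forwardDelete", "deleteContentForward"],
    ["indent", "formatIndent"],
    ["insertHorizontalRule", "insertHorizontalRule"],
    ["insertLineBreak", "insertLineBreak"],
    ["insertOrderedList", "insertOrderedList"],
    ["insertParagraph", "insertParagraph"],
    ["insertText", "insertText"],
    ["insertUnorderedList", "insertUnorderedList"],
    ["justifyCenter", "formatJustifyCenter"],
    ["justifyFull", "formatJustifyFull"],
    ["justifyLeft", "formatJustifyLeft"],
    ["justifyRight", "formatJustifyRight"],
    ["outdent", "formatOutdent"],
    ["cut", "deleteByCut"],
    ["paste", "insertFromPaste"],
    ["redo", "historyRedo"],
    ["undo", "historyUndo"],
  ].map(([command, inputType]) => [asciiLowercase(command), inputType])
);

// Returns the mapped inputType, or "" when the command has no mapping.
function mapEditCommandToInputType(command) {
  const key = asciiLowercase(command);
  return INPUT_TYPE_BY_COMMAND.has(key) ? INPUT_TYPE_BY_COMMAND.get(key) : "";
}
```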

When the queryCommandEnabled(command) method on the {{Document}} interface is invoked, the user agent must run the following steps:

  1. See comment before Supported commands.

  2. Return true if command is both supported and enabled, false otherwise.

When the queryCommandIndeterm(command) method on the {{Document}} interface is invoked, the user agent must run the following steps:

  1. For supported: see comment before Supported commands.

    What happens if you call queryCommand(Indeterm|State|Value)() on a command where it makes no sense?

    IE9 consistently returns false for all three. However, any command that has a state defined also has a value defined, which is equal to the state: it returns boolean true or false.

    Firefox 6.0a2 consistently throws NS_ERROR_FAILURE for indeterm/state if not supported, and returns an empty string for value. Exceptions include unlink (seems to always return indeterm/state false), and styleWithCss/useCss (throw NS_ERROR_FAILURE even for value).

    Chrome 14 dev returns false for all three, and even does this for unrecognized commands. It also always defines value if state is defined: it returns the state cast to a string, either "true" or "false".

    Opera 11.11 returns false for state and "" for value (it doesn't support indeterm). Like Chrome, this is even for unrecognized commands.

    Gecko's behavior is the most useful. If the author tries querying some aspect of a command that makes no sense, they shouldn't receive a value that looks like it might make sense but is actually just a constant. Originally, I went even further than Gecko: I required exceptions even for value, since doing otherwise makes no sense. But throwing more exceptions is less compatible on the whole than throwing fewer, so based on discussion, I switched to a behavior more like Opera, which is more or less IE/WebKit behavior but made slightly more sane.

    If command is not supported or has no indeterminacy, return false.

  2. Return true if command is indeterminate, otherwise false.

When the queryCommandState(command) method on the {{Document}} interface is invoked, the user agent must run the following steps:

  1. See comment on the comparable line for queryCommandIndeterm().

    If command is not supported or has no state, return false.

  2. If the state override for command is set, return it.
  3. Return true if command's state is true, otherwise false.

Firefox 6.0a2 always throws an exception when this is called. Opera 11.11 seems to return false if there's nothing editable on the page, which is unhelpful. The spec follows IE9 and Chrome 14 dev. The reason this is useful, compared to just running one of the other methods and seeing if you get a NOT_SUPPORTED_ERR, is that other methods might throw different exceptions for other reasons. It's easier to check a boolean than to check exception types, especially since as of June 2011 UAs aren't remotely consistent on what they do with unsupported commands.

Actually, correction: Firefox < 15ish throws an exception if nothing editable is on the page. Otherwise it behaves just like IE/Chrome. See Mozilla bug 742240.

When the queryCommandSupported(command) method on the {{Document}} interface is invoked, the user agent must return true if command is supported and available within the current script on the current site, and false otherwise.

When the queryCommandValue(command) method on the {{Document}} interface is invoked, the user agent must run the following steps:

  1. This is what Firefox 6.0a2 and Opera 11.11 seem to do when the command isn't enabled. Chrome 14 dev seems to return the string "false", and IE9 seems to return boolean false. For the case where there's no value, or the command isn't supported, see the comment on the comparable line for queryCommandIndeterm().

    If command is not supported or has no value, return the empty string.

  2. Yuck. This is incredibly messy, as are lots of other fontSize-related things, but I don't want to define a whole second notion of value for the sake of a single command . . .

    If command is "fontSize" and its value override is set, convert the value override to an integer number of pixels and return the legacy font size for the result.

  3. If the value override for command is set, return it.
  4. Return command's value.
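
The four steps above can be read as the following control flow. This is only a sketch: the env object and its hooks (isSupported, hasValue, valueOverride, currentValue, toPixels, legacyFontSize) are hypothetical names standing in for state and algorithms defined elsewhere, and the conversion to pixels and to a legacy font size is delegated to those hooks and not implemented here.

```javascript
// `env` is a hypothetical set of hooks; none of these names come from the spec:
//   isSupported(cmd), hasValue(cmd),
//   valueOverride(cmd) -> string, or undefined when no override is set,
//   currentValue(cmd), toPixels(str) -> integer, legacyFontSize(px) -> string.
function queryCommandValueSketch(env, command) {
  // Commands are matched ASCII case-insensitively.
  const name = command.replace(/[A-Z]/g, (c) => c.toLowerCase());

  // Step 1: unsupported commands, or commands with no value, yield "".
  if (!env.isSupported(name) || !env.hasValue(name)) return "";

  const override = env.valueOverride(name);

  // Step 2: fontSize reports its override as a legacy font size.
  if (name === "fontsize" && override !== undefined) {
    return env.legacyFontSize(env.toPixels(override));
  }

  // Step 3: any other value override is returned unchanged.
  if (override !== undefined) return override;

  // Step 4: fall back to the command's own value.
  return env.currentValue(name);
}
```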

All of these methods must treat their command argument ASCII case-insensitively.

The methods in this section have mostly been designed so that the following invariants hold after execCommand() is called, assuming it didn't throw an exception:

The first two points do not always hold for strikethrough or underline, because it can be impossible to unset text-decoration in CSS. Also, by design, the state of insertOrderedList and insertUnorderedList might be true both before and after calling, because they only remove one level of indentation. unlink should set the value to null. And finally, the state of the various justify commands should always be true after calling, and the value should always be the appropriate string ("center", "justify", "left", or "right"). Any other deviations from these invariants are bugs in the specification.

Common definitions

An HTML element is an {{Element}} whose [=Element/namespace=] is the HTML namespace.

A prohibited paragraph child name is "address", "article", "aside", "blockquote", "caption", "center", "col", "colgroup", "dd", "details", "dir", "div", "dl", "dt", "fieldset", "figcaption", "figure", "footer", "form", "h1", "h2", "h3", "h4", "h5", "h6", "header", "hgroup", "hr", "li", "listing", "menu", "nav", "ol", "p", "plaintext", "pre", "section", "summary", "table", "tbody", "td", "tfoot", "th", "thead", "tr", "ul", or "xmp".

These are all the things that will close a <p> if found as a descendant. I think. Plus table stuff, since that can't be a descendant of a p either, although it won't auto-close it.

A prohibited paragraph child is an HTML element whose [=Element/local name=] is a prohibited paragraph child name.
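
The two definitions above amount to a namespace check plus a set lookup. The sketch below assumes a node shaped like a DOM Element, with nodeType, namespaceURI, and localName properties; the identifiers PROHIBITED_PARAGRAPH_CHILD_NAMES and isProhibitedParagraphChild are illustrative and not part of this specification.

```javascript
const HTML_NAMESPACE = "http://www.w3.org/1999/xhtml";
const ELEMENT_NODE_TYPE = 1;

// The prohibited paragraph child names listed above.
const PROHIBITED_PARAGRAPH_CHILD_NAMES = new Set([
  "address", "article", "aside", "blockquote", "caption", "center", "col",
  "colgroup", "dd", "details", "dir", "div", "dl", "dt", "fieldset",
  "figcaption", "figure", "footer", "form", "h1", "h2", "h3", "h4", "h5",
  "h6", "header", "hgroup", "hr", "li", "listing", "menu", "nav", "ol", "p",
  "plaintext", "pre", "section", "summary", "table", "tbody", "td", "tfoot",
  "th", "thead", "tr", "ul", "xmp",
]);

// True if `node` is an HTML element whose local name is in the set.
function isProhibitedParagraphChild(node) {
  return (
    node !== null &&
    typeof node === "object" &&
    node.nodeType === ELEMENT_NODE_TYPE &&
    node.namespaceURI === HTML_NAMESPACE &&
    PROHIBITED_PARAGRAPH_CHILD_NAMES.has(node.localName)
  );
}
```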

The block/inline node definitions are CSS-based. "Prohibited paragraph child" is conceptually similar to "block node", but based on the element name. Generally we want to use block/inline node when we're interested in the visual effect, and prohibited paragraph children when we're concerned about parsing or semantics. TODO: Audit all "block node" usages to see if they need to become "visible block node", now that block nodes can be invisible (if they descend from display: none).

A block node is either an {{Element}} whose "display" property does not have resolved value "inline" or "inline-block" or "inline-table" or "none", or a [=document=], or a {{DocumentFragment}}.

An inline node is a node that is not a block node.
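
As a rough sketch of the block node and inline node definitions, the predicate below takes a node and a callback that reports the resolved value of "display" for an element. The callback name resolvedDisplay is an assumption; in a browser it could plausibly be backed by getComputedStyle(element).display, but that wiring is not shown or tested here.

```javascript
const ELEMENT_NODE = 1;
const DOCUMENT_NODE = 9;
const DOCUMENT_FRAGMENT_NODE = 11;

// Resolved "display" values that keep an Element from being a block node.
const NON_BLOCK_DISPLAYS = new Set([
  "inline",
  "inline-block",
  "inline-table",
  "none",
]);

// `resolvedDisplay` is a hypothetical callback: element -> string.
function isBlockNode(node, resolvedDisplay) {
  if (
    node.nodeType === DOCUMENT_NODE ||
    node.nodeType === DOCUMENT_FRAGMENT_NODE
  ) {
    return true;
  }
  if (node.nodeType !== ELEMENT_NODE) return false;
  return !NON_BLOCK_DISPLAYS.has(resolvedDisplay(node));
}

// An inline node is any node that is not a block node.
function isInlineNode(node, resolvedDisplay) {
  return !isBlockNode(node, resolvedDisplay);
}
```

Under this definition a {{Text}} node is always an inline node, and an element with display: none is an inline node too.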

Something is editable if it is a node; it is not an [=editing host=]; it does not have a contenteditable attribute set to the false state; its [=tree/parent=] is an [=editing host=] or editable; and either it is an HTML element, or it is an svg or math element, or it is not an {{Element}} and its [=tree/parent=] is an HTML element.

An editable node cannot be a [=document=] or {{DocumentFragment}}, its [=tree/parent=] cannot be null, and it must descend from either an {{Element}} or a [=document=].

The editing host of node is null if node is neither editable nor an [=editing host=]; node itself, if node is an [=editing host=]; or the nearest [=tree/ancestor=] of node that is an [=editing host=], if node is editable.

Two nodes are in the same editing host if the editing host of the first is non-null and the same as the editing host of the second.

Barring bugs, the algorithms here will not alter the attributes of a non-editable element; will not remove a non-editable node from its parent (except to immediately give it a new parent in the same editing host); and will not add, remove, or reorder children of a node unless it is either editable or an editing host. An editing host is never editable, so authors are assured that editing commands will only modify the editing host's contents and not the editing host itself.

A collapsed line break is a [^br^] that begins a line box which has nothing else in it, and therefore has zero height.

Is this a good definition at all? I mean things like <p>foo<br></p>, or the second one in <p>foo<br><br></p>. The way I test it is by adding a text node after it containing a zwsp; if that changes the offsetHeight of its nearest non-inline ancestor, I deem it collapsed. But what if it happens to be display: none right now, for instance? Or its ancestor has a fixed height? Would it be better to use some DOM-based definition?

TODO: The thing about li is a not very nice hack. The issue is that an li won't collapse even if it has no children at all, but that's not true in all browsers (at least not in Opera 11.11), and also it breaks assumptions elsewhere. E.g., if it gets turned into a p.

An extraneous line break is a [^br^] that has no visual effect, in that removing it from the DOM would not change layout, except that a [^br^] that is the sole child of an [^li^] is not extraneous.

Also possibly a bad definition. Again, I test by just removing it and seeing what happens. (Actually, setting display: none, so that it doesn't mess up ranges.)

A whitespace node is either a {{Text}} node whose {{CharacterData/data}} is the empty string; or a {{Text}} node whose {{CharacterData/data}} consists only of one or more tabs (0x0009), line feeds (0x000A), carriage returns (0x000D), and/or spaces (0x0020), and whose [=tree/parent=] is an {{Element}} whose resolved value for "white-space" is "normal" or "nowrap"; or a {{Text}} node whose {{CharacterData/data}} consists only of one or more tabs (0x0009), carriage returns (0x000D), and/or spaces (0x0020), and whose [=tree/parent=] is an {{Element}} whose resolved value for "white-space" is "pre-line".
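
The three branches of the whitespace node definition can be sketched as below. The input shape is an assumption for illustration: a text node is modeled as { data, parent }, where parent is null or an object { isElement, whiteSpace } carrying the resolved value of "white-space". The function name isWhitespaceNode is likewise illustrative.

```javascript
// `text` models a Text node as { data, parent }, where parent is null or
// { isElement: boolean, whiteSpace: string } (both shapes are assumptions).
function isWhitespaceNode(text) {
  const { data, parent } = text;

  // An empty Text node is a whitespace node regardless of its parent.
  if (data === "") return true;

  // The remaining branches require an Element parent.
  if (!parent || !parent.isElement) return false;

  if (parent.whiteSpace === "normal" || parent.whiteSpace === "nowrap") {
    // Tabs, line feeds, carriage returns, and spaces only.
    return /^[\t\n\r ]+$/.test(data);
  }
  if (parent.whiteSpace === "pre-line") {
    // Line feeds are preserved under pre-line, so they are excluded here.
    return /^[\t\r ]+$/.test(data);
  }
  return false;
}
```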

node is a collapsed whitespace node if the following algorithm returns true:

This definition is also bad. It's a crude attempt to emulate CSS2.1 16.6.1, but leaving out a ton of the subtleties. I actually don't want the exact CSS definitions, because those depend on things like where lines are broken, but I'm not sure this definition is right anyway. E.g., what about a pre-line text node consisting of a single line break that's at the end of a block? That collapses, same idea as an extraneous line break. We could also worry about nodes containing only zwsp or such if we wanted, or display: none, or . . .

  1. If node is not a whitespace node, return false.
  2. If node's {{CharacterData/data}} is the empty string, return true.
  3. Let ancestor be node's [=tree/parent=].
  4. If ancestor is null, return true.
  5. If the "display" property of some [=tree/ancestor=] of node has resolved value "none", return true.
  6. While ancestor is not a block node and its [=tree/parent=] is not null, set ancestor to its [=tree/parent=].
  7. At this point we know node consists of some whitespace, of a sort that will collapse if it's at the start or end of a line. We go backwards until we find the first block boundary, and if everything until there is invisible or whitespace, we conclude that node is collapsed. We assume a block boundary is either when we hit a line break or block node, or we hit the end of ancestor (which is the nearest ancestor block node). All this is very imprecise, of course, but it's fairly simple and will work in common cases.

    We have to avoid invoking the definition of "visible" here to avoid infinite recursion: that depends on the concept of collapsed whitespace nodes. Instead, we repeat the parts we need, which turns out to be "not much of it".

    Let reference be node.

  8. While reference is a [=tree/descendant=] of ancestor:
    1. Let reference be the node before it in [=tree order=].
    2. If reference is a block node or a [^br^], return true.
    3. If reference is a {{Text}} node that is not a whitespace node, or is an [^img^], break from this loop.
  9. We found something before our text node on (probably) the same line, so presumably it's not at the line's start. Now we need to look forward and see if we're at the line's end. If we aren't there either, then we assume we're not collapsed, so return false.

    Let reference be node.

  10. While reference is a [=tree/descendant=] of ancestor:
    1. Let reference be the node after it in [=tree order=], or null if there is no such node.
    2. If reference is a block node or a [^br^], return true.
    3. If reference is a {{Text}} node that is not a whitespace node, or is an [^img^], break from this loop.
  11. Return false.

TODO: Consider whether we really want to depend on img specifically here. It seems more likely that we want something like "any replaced content that has nonzero height and width" or such. When fixing this, make sure to audit for other occurrences of this assumption.

Something is visible if it is a node that either is a block node, or a {{Text}} node that is not a collapsed whitespace node, or an [^img^], or a [^br^] that is not an extraneous line break, or any node with a visible [=tree/descendant=]; excluding any node with an [=tree/inclusive ancestor=] {{Element}} whose "display" property has resolved value "none".

Something is invisible if it is a node that is not visible.

TODO: Reconsider whether we want to lump invisible nodes in here. If we don't and change the definition, make sure to audit all callers, since then a block could have collapsed block prop descendants that aren't children.

A collapsed block prop is either a collapsed line break that is not an extraneous line break, or an {{Element}} that is an inline node and whose [=tree/children=] are all either invisible or collapsed block props and that has at least one [=tree/child=] that is a collapsed block prop.

A collapsed block prop is something like the <br> in <p><br></p>, or the <br> and <span> in <p><span><br></span></p>. These are necessary to stop the block from having zero height when it has no other contents, but serve no purpose and should be removed once the block has other contents that stop it from collapsing.

TODO: I say "first range" because I think that's what Gecko actually does, and Gecko is the only one that allows multiple ranges in a selection. This is keeping in mind that it stores ranges sorted by start, not by the order the user added them, and silently removes or shortens existing ranges to avoid overlap. It probably makes the most sense in the long term to have the command affect all ranges. But I'll leave this for later.

The active range is the [=range=] of the selection given by calling getSelection() on the context object. (Thus the active range may be null.)

Each {{Document}} has a boolean CSS styling flag associated with it, which must initially be false. (The styleWithCSS command can be used to modify or query it, by means of the execCommand() and queryCommandState() methods.)

Each {{Document}} is associated with a string known as the default single-line container name, which must initially be "div". (