Introduction

We did the programming part with JS, Now let’s implement our knowledge into action, Let’s create JS which can interact with webpages, This would be a long post so hold tight and here we go.

Browser environment, specs

The JavaScript language was initially created for web browsers. Since then it has evolved and become a language with many uses and platforms.

A platform may be a browser, or a web-server or another host, even a “smart” coffee machine, if it can run JavaScript. Each of them provides platform-specific functionality. The JavaScript specification calls that a host environment.

A host environment provides own objects and functions additional to the language core. Web browsers give a means to control web pages. Node.js provides server-side features, and so on.

Here’s a bird’s-eye view of what we have when JavaScript runs in a web browser:

There’s a “root” object called window. It has two roles:

  1. First, it is a global object for JavaScript code, as described in the chapter info:global-object.
  2. Second, it represents the “browser window” and provides methods to control it.

For instance, here we use it as a global object:

function sayHi() {
    alert("Hello");
}

// global functions are methods of the global object:
window.sayHi();

And here we use it as a browser window, to see the window height:

alert(window.innerHeight); // inner window height

There are more window-specific methods and properties, we’ll cover them later.

DOM (Document Object Model)

Document Object Model, or DOM for short, represents all page content as objects that can be modified.

The document object is the main “entry point” to the page. We can change or create anything on the page using it.

For instance:

// change the background color to red
document.body.style.background = "red";

// change it back after 1 second
setTimeout(() => (document.body.style.background = ""), 1000);

Here we used document.body.style, but there’s much, much more. Properties and methods are described in the specification: DOM Living Standard.

The DOM specification explains the structure of a document and provides objects to manipulate it. There are non-browser instruments that use DOM too.

For instance, server-side scripts that download HTML pages and process them can also use DOM. They may support only a part of the specification though.

There’s also a separate specification, CSS Object Model (CSSOM) for CSS rules and stylesheets, that explains how they are represented as objects, and how to read and write them.

CSSOM is used together with DOM when we modify style rules for the document. In practice though, CSSOM is rarely required, because we rarely need to modify CSS rules from JavaScript (usually we just add/remove CSS classes, not modify their CSS rules), but that’s also possible.

BOM (Browser Object Model)

The Browser Object Model (BOM) represents additional objects provided by the browser (host environment) for working with everything except the document.

For instance:

  • The navigator object provides background information about the browser and the operating system. There are many properties, but the two most widely known are: navigator.userAgent – about the current browser, and navigator.platform – about the platform (can help to differ between Windows/Linux/Mac etc).
  • The location object allows us to read the current URL and can redirect the browser to a new one.

Here’s how we can use the location object:

alert(location.href); // shows current URL
if (confirm("Go to Wikipedia?")) {
    location.href = "https://wikipedia.org"; // redirect the browser to another URL
}

Functions alert/confirm/prompt are also a part of BOM: they are directly not related to the document, but represent pure browser methods of communicating with the user.

BOM is the part of the general HTML specification.

Yes, you heard that right. The HTML spec at https://html.spec.whatwg.org is not only about the “HTML language” (tags, attributes), but also covers a bunch of objects, methods and browser-specific DOM extensions. That’s “HTML in broad terms”. Also, some parts have additional specs listed at https://spec.whatwg.org.

Summary

Talking about standards, we have:

DOM specification
Describes the document structure, manipulations and events, see https://dom.spec.whatwg.org.
CSSOM specification
Describes stylesheets and style rules, manipulations with them and their binding to documents, see https://www.w3.org/TR/cssom-1/.
HTML specification
Describes the HTML language (e.g. tags) and also the BOM (browser object model) – various browser functions: setTimeout, alert, location and so on, see https://html.spec.whatwg.org. It takes the DOM specification and extends it with many additional properties and methods.

Additionally, some classes are described separately at https://spec.whatwg.org/.

Please note these links, as there’s so much stuff to learn it’s impossible to cover and remember everything.

When you’d like to read about a property or a method, the Mozilla manual at https://developer.mozilla.org/en-US/search is also a nice resource, but the corresponding spec may be better: it’s more complex and longer to read, but will make your fundamental knowledge sound and complete.

To find something, it’s often convenient to use an internet search “WHATWG [term]” or “MDN [term]”, e.g https://google.com?q=whatwg+localstorage, https://google.com?q=mdn+localstorage.

Now we’ll get down to learning DOM, because the document plays the central role in the UI.

libs:

  • d3
  • domtree

DOM tree

The backbone of an HTML document is tags.

According to the Document Object Model (DOM), every HTML tag is an object. Nested tags are “children” of the enclosing one. The text inside a tag is an object as well.

All these objects are accessible using JavaScript, and we can use them to modify the page.

For example, document.body is the object representing the <body> tag.

Running this code will make the <body> red for 3 seconds:

document.body.style.background = "red"; // make the background red

setTimeout(() => (document.body.style.background = ""), 3000); // return back

Here we used style.background to change the background color of document.body, but there are many other properties, such as:

  • innerHTML – HTML contents of the node.
  • offsetWidth – the node width (in pixels)
  • …and so on.

Soon we’ll learn more ways to manipulate the DOM, but first we need to know about its structure.

An example of the DOM

Let’s start with the following simple document:

<!DOCTYPE html>
<html>
    <head>
        <title>About elk</title>
    </head>
    <body>
        The truth about elk.
    </body>
</html>

The DOM represents HTML as a tree structure of tags. Here’s how it looks:

On the picture above, you can click on element nodes and their children will
open/collapse.

Every tree node is an object.

Tags are element nodes (or just elements) and form the tree structure: <html> is at the root, then <head> and <body> are its children, etc.

The text inside elements forms text nodes, labelled as #text. A text node contains only a string. It may not have children and is always a leaf of the tree.

For instance, the <title> tag has the text "About elk".

Please note the special characters in text nodes:

  • a newline: (in JavaScript known as \n)
  • a space:

Spaces and newlines are totally valid characters, like letters and digits. They form text nodes and become a part of the DOM. So, for instance, in the example above the <head> tag contains some spaces before <title>, and that text becomes a #text node (it contains a newline and some spaces only).

There are only two top-level exclusions:

  1. Spaces and newlines before <head> are ignored for historical reasons.
  2. If we put something after </body>, then that is automatically moved inside the body, at the end, as the HTML spec requires that all content must be inside <body>. So there can’t be any spaces after </body>.

In other cases everything’s straightforward – if there are spaces (just like any character) in the document, then they become text nodes in the DOM, and if we remove them, then there won’t be any.

Here are no space-only text nodes:

<!DOCTYPE html>
<html>
    <head>
        <title>About elk</title>
    </head>
    <body>
        The truth about elk.
    </body>
</html>

Browser tools (to be covered soon) that work with DOM usually do not show spaces at the start/end of the text and empty text nodes (line-breaks) between tags.

Developer tools save screen space this way.

On further DOM pictures we’ll sometimes omit them when they are irrelevant. Such spaces usually do not affect how the document is displayed.

Autocorrection

If the browser encounters malformed HTML, it automatically corrects it when making the DOM.

For instance, the top tag is always <html>. Even if it doesn’t exist in the document, it will exist in the DOM, because the browser will create it. The same goes for <body>.

As an example, if the HTML file is the single word "Hello", the browser will wrap it into <html> and <body>, and add the required <head>, and the DOM will be:

While generating the DOM, browsers automatically process errors in the document, close tags and so on.

A document with unclosed tags:

<p>
    Hello
    <li>Mom</li>
    <li>and</li>
    <li>Dad</li>
</p>

…will become a normal DOM as the browser reads tags and restores the missing parts:

An interesting “special case” is tables. By the DOM specification they must have <tbody>, but HTML text may (officially) omit it. Then the browser creates <tbody> in the DOM automatically.

For the HTML:

<table id="table">
    <tr>
        <td>1</td>
    </tr>
</table>

DOM-structure will be:

You see? The <tbody> appeared out of nowhere. You should keep this in mind while working with tables to avoid surprises.

Other node types

There are some other node types besides elements and text nodes.

For example, comments:

<!DOCTYPE html>
<html>
    <body>
        The truth about elk.
        <ol>
            <li>An elk is a smart</li>

            <!-- comment -->

            <li>...and cunning animal!</li>
        </ol>
    </body>
</html>

We can see here a new tree node type – comment node, labeled as #comment, between two text nodes.

We may think – why is a comment added to the DOM? It doesn’t affect the visual representation in any way. But there’s a rule – if something’s in HTML, then it also must be in the DOM tree.

Everything in HTML, even comments, becomes a part of the DOM.

Even the <!DOCTYPE...> directive at the very beginning of HTML is also a DOM node. It’s in the DOM tree right before <html>. We are not going to touch that node, we even don’t draw it on diagrams for that reason, but it’s there.

The document object that represents the whole document is, formally, a DOM node as well.

There are 12 node types. In practice we usually work with 4 of them:

  1. document – the “entry point” into DOM.
  2. element nodes – HTML-tags, the tree building blocks.
  3. text nodes – contain text.
  4. comments – sometimes we can put information there, it won’t be shown, but JS can read it from the DOM.

See it for yourself

To see the DOM structure in real-time, try Live DOM Viewer. Just type in the document, and it will show up as a DOM at an instant.

Another way to explore the DOM is to use the browser developer tools. Actually, that’s what we use when developing.

To do so, open the web page elk.html, turn on the browser developer tools and switch to the Elements tab.

It should look like this:

You can see the DOM, click on elements, see their details and so on.

Please note that the DOM structure in developer tools is simplified. Text nodes are shown just as text. And there are no “blank” (space only) text nodes at all. That’s fine, because most of the time we are interested in element nodes.

Clicking the button in the left-upper corner allows us to choose a node from the webpage using a mouse (or other pointer devices) and “inspect” it (scroll to it in the Elements tab). This works great when we have a huge HTML page (and corresponding huge DOM) and would like to see the place of a particular element in it.

Another way to do it would be just right-clicking on a webpage and selecting “Inspect” in the context menu.

At the right part of the tools there are the following subtabs:

  • Styles – we can see CSS applied to the current element rule by rule, including built-in rules (gray). Almost everything can be edited in-place, including the dimensions/margins/paddings of the box below.
  • Computed – to see CSS applied to the element by property: for each property we can see a rule that gives it (including CSS inheritance and such).
  • Event Listeners – to see event listeners attached to DOM elements (we’ll cover them in the next part of the tutorial).
  • …and so on.

The best way to study them is to click around. Most values are editable in-place.

Interaction with console

As we work the DOM, we also may want to apply JavaScript to it. Like: get a node and run some code to modify it, to see the result. Here are few tips to travel between the Elements tab and the console.

For the start:

  1. Select the first <li> in the Elements tab.
  2. Press key:Esc – it will open console right below the Elements tab.

Now the last selected element is available as $0, the previously selected is $1 etc.

We can run commands on them. For instance, $0.style.background = 'red' makes the selected list item red, like this:

That’s how to get a node from Elements in Console.

There’s also a road back. If there’s a variable referencing a DOM node, then we can use the command inspect(node) in Console to see it in the Elements pane.

Or we can just output the DOM node in the console and explore “in-place”, like document.body below:

That’s for debugging purposes of course. From the next chapter on we’ll access and modify DOM using JavaScript.

The browser developer tools are a great help in development: we can explore the DOM, try things and see what goes wrong.

Summary

An HTML/XML document is represented inside the browser as the DOM tree.

  • Tags become element nodes and form the structure.
  • Text becomes text nodes.
  • …etc, everything in HTML has its place in DOM, even comments.

We can use developer tools to inspect DOM and modify it manually.

Here we covered the basics, the most used and important actions to start with. There’s an extensive documentation about Chrome Developer Tools at https://developers.google.com/web/tools/chrome-devtools. The best way to learn the tools is to click here and there, read menus: most options are obvious. Later, when you know them in general, read the docs and pick up the rest.

DOM nodes have properties and methods that allow us to travel between them, modify them, move around the page, and more. We’ll get down to them in the next chapters.

libs:

  • d3
  • domtree

Walking the DOM

The DOM allows us to do anything with elements and their contents, but first we need to reach the corresponding DOM object.

All operations on the DOM start with the document object. That’s the main “entry point” to DOM. From it we can access any node.

Here’s a picture of links that allow for travel between DOM nodes:

Let’s discuss them in more detail.

On top: documentElement and body

The topmost tree nodes are available directly as document properties:

<html> = document.documentElement
The topmost document node is document.documentElement. That’s the DOM node of the <html> tag.
<body> = document.body
Another widely used DOM node is the <body> element – document.body.
<head> = document.head
The <head> tag is available as document.head.

A script cannot access an element that doesn’t exist at the moment of running.

In particular, if a script is inside <head>, then document.body is unavailable, because the browser did not read it yet.

So, in the example below the first alert shows null:

<html>
    <head>
        <script>
            alert("From HEAD: " + document.body); // null, there's no <body> yet
        </script>
    </head>

    <body>
        <script>
            alert("From BODY: " + document.body); // HTMLBodyElement, now it exists
        </script>
    </body>
</html>

In the DOM, the null value means “doesn’t exist” or “no such node”.

Children: childNodes, firstChild, lastChild

There are two terms that we’ll use from now on:

  • Child nodes (or children) – elements that are direct children. In other words, they are nested exactly in the given one. For instance, <head> and <body> are children of <html> element.
  • Descendants – all elements that are nested in the given one, including children, their children and so on.

For instance, here <body> has children <div> and <ul> (and few blank text nodes):

<html>
    <body>
        <div>Begin</div>

        <ul>
            <li>
                <b>Information</b>
            </li>
        </ul>
    </body>
</html>

…And descendants of <body> are not only direct children <div>, <ul> but also more deeply nested elements, such as <li> (a child of <ul>) and <b> (a child of <li>) – the entire subtree.

The childNodes collection lists all child nodes, including text nodes.

The example below shows children of document.body:

<html>
    <body>
        <div>Begin</div>

        <ul>
            <li>Information</li>
        </ul>

        <div>End</div>

        <script>
            for (let i = 0; i < document.body.childNodes.length; i++) {
                alert(document.body.childNodes[i]); // Text, DIV, Text, UL, ..., SCRIPT
            }
        </script>
        ...more stuff...
    </body>
</html>

Please note an interesting detail here. If we run the example above, the last element shown is <script>. In fact, the document has more stuff below, but at the moment of the script execution the browser did not read it yet, so the script doesn’t see it.

Properties firstChild and lastChild give fast access to the first and last children.

They are just shorthands. If there exist child nodes, then the following is always true:

elem.childNodes[0] === elem.firstChild;
elem.childNodes[elem.childNodes.length - 1] === elem.lastChild;

There’s also a special function elem.hasChildNodes() to check whether there are any child nodes.

DOM collections

As we can see, childNodes looks like an array. But actually it’s not an array, but rather a collection – a special array-like iterable object.

There are two important consequences:

  1. We can use for..of to iterate over it:
for (let node of document.body.childNodes) {
    alert(node); // shows all nodes from the collection
}

That’s because it’s iterable (provides the Symbol.iterator property, as required).

  1. Array methods won’t work, because it’s not an array:
alert(document.body.childNodes.filter); // undefined (there's no filter method!)

The first thing is nice. The second is tolerable, because we can use Array.from to create a “real” array from the collection, if we want array methods:

alert(Array.from(document.body.childNodes).filter); // function

DOM collections, and even more – all navigation properties listed in this chapter are read-only.

We can’t replace a child by something else by assigning childNodes[i] = ....

Changing DOM needs other methods. We will see them in the next chapter.

Almost all DOM collections with minor exceptions are live. In other words, they reflect the current state of DOM.

If we keep a reference to elem.childNodes, and add/remove nodes into DOM, then they appear in the collection automatically.

Collections are iterable using for..of. Sometimes people try to use for..in for that.

Please, don’t. The for..in loop iterates over all enumerable properties. And collections have some “extra” rarely used properties that we usually do not want to get:

<body>
    <script>
        // shows 0, 1, length, item, values and more.
        for (let prop in document.body.childNodes) alert(prop);
    </script>
</body>

Siblings and the parent

Siblings are nodes that are children of the same parent.

For instance, here <head> and <body> are siblings:

<html>
    <head>
        ...
    </head>
    <body>
        ...
    </body>
</html>
  • <body> is said to be the “next” or “right” sibling of <head>,
  • <head> is said to be the “previous” or “left” sibling of <body>.

The next sibling is in nextSibling property, and the previous one - in previousSibling.

The parent is available as parentNode.

For example:

// parent of <body> is <html>
alert(document.body.parentNode === document.documentElement); // true

// after <head> goes <body>
alert(document.head.nextSibling); // HTMLBodyElement

// before <body> goes <head>
alert(document.body.previousSibling); // HTMLHeadElement

Element-only navigation

Navigation properties listed above refer to all nodes. For instance, in childNodes we can see both text nodes, element nodes, and even comment nodes if there exist.

But for many tasks we don’t want text or comment nodes. We want to manipulate element nodes that represent tags and form the structure of the page.

So let’s see more navigation links that only take element nodes into account:

The links are similar to those given above, just with Element word inside:

  • children – only those children that are element nodes.
  • firstElementChild, lastElementChild – first and last element children.
  • previousElementSibling, nextElementSibling – neighbor elements.
  • parentElement – parent element.

The parentElement property returns the “element” parent, while parentNode returns “any node” parent. These properties are usually the same: they both get the parent.

With the one exception of document.documentElement:

alert(document.documentElement.parentNode); // document
alert(document.documentElement.parentElement); // null

The reason is that the root node document.documentElement (<html>) has document as its parent. But document is not an element node, so parentNode returns it and parentElement does not.

This detail may be useful when we want to travel up from an arbitrary element elem to <html>, but not to the document:

while ((elem = elem.parentElement)) {
    // go up till <html>
    alert(elem);
}

Let’s modify one of the examples above: replace childNodes with children. Now it shows only elements:

<html>
    <body>
        <div>Begin</div>

        <ul>
            <li>Information</li>
        </ul>

        <div>End</div>

        <script>
            for (let elem of document.body.children) {
                alert(elem); // DIV, UL, DIV, SCRIPT
            }
        </script>
        ...
    </body>
</html>

Till now we described the basic navigation properties.

Certain types of DOM elements may provide additional properties, specific to their type, for convenience.

Tables are a great example of that, and represent a particularly important case:

The <table> element supports (in addition to the given above) these properties:

  • table.rows – the collection of <tr> elements of the table.
  • table.caption/tHead/tFoot – references to elements <caption>, <thead>, <tfoot>.
  • table.tBodies – the collection of <tbody> elements (can be many according to the standard, but there will always be at least one – even if it is not in the source HTML, the browser will put it in the DOM).

<thead>, <tfoot>, <tbody> elements provide the rows property:

  • tbody.rows – the collection of <tr> inside.

<tr>:

  • tr.cells – the collection of <td> and <th> cells inside the given <tr>.
  • tr.sectionRowIndex – the position (index) of the given <tr> inside the enclosing <thead>/<tbody>/<tfoot>.
  • tr.rowIndex – the number of the <tr> in the table as a whole (including all table rows).

<td> and <th>:

  • td.cellIndex – the number of the cell inside the enclosing <tr>.

An example of usage:

<table id="table">
    <tr>
        <td>one</td>
        <td>two</td>
    </tr>
    <tr>
        <td>three</td>
        <td>four</td>
    </tr>
</table>

<script>
    // get td with "two" (first row, second column)
    let td = table.rows[0].cells[1];
    td.style.backgroundColor = "red"; // highlight it
</script>

The specification: tabular data.

There are also additional navigation properties for HTML forms. We’ll look at them later when we start working with forms.

Summary

Given a DOM node, we can go to its immediate neighbors using navigation properties.

There are two main sets of them:

  • For all nodes: parentNode, childNodes, firstChild, lastChild, previousSibling, nextSibling.
  • For element nodes only: parentElement, children, firstElementChild, lastElementChild, previousElementSibling, nextElementSibling.

Some types of DOM elements, e.g. tables, provide additional properties and collections to access their content.

Searching: getElement*, querySelector*

DOM navigation properties are great when elements are close to each other. What if they are not? How to get an arbitrary element of the page?

There are additional searching methods for that.

document.getElementById or just id

If an element has the id attribute, we can get the element using the method document.getElementById(id), no matter where it is.

For instance:

<div id="elem">
    <div id="elem-content">Element</div>
</div>

<script>
    // get the element

    let elem = document.getElementById("elem");

    // make its background red
    elem.style.background = "red";
</script>

Also, there’s a global variable named by id that references the element:

<div id="elem">
    <div id="elem-content">Element</div>
</div>

<script>
    // elem is a reference to DOM-element with id="elem"
    elem.style.background = "red";

    // id="elem-content" has a hyphen inside, so it can't be a variable name
    // ...but we can access it using square brackets: window['elem-content']
</script>

…That’s unless we declare a JavaScript variable with the same name, then it takes precedence:

<div id="elem"></div>

<script>
    let elem = 5; // now elem is 5, not a reference to <div id="elem">

    alert(elem); // 5
</script>

This behavior is described in the specification, so it’s kind of standard. But it is supported mainly for compatibility.

The browser tries to help us by mixing namespaces of JS and DOM. That’s fine for simple scripts, inlined into HTML, but generally isn’t a good thing. There may be naming conflicts. Also, when one reads JS code and doesn’t have HTML in view, it’s not obvious where the variable comes from.

Here in the tutorial we use id to directly reference an element for brevity, when it’s obvious where the element comes from.

In real life document.getElementById is the preferred method.

The id must be unique. There can be only one element in the document with the given id.

If there are multiple elements with the same id, then the behavior of methods that use it is unpredictable, e.g. document.getElementById may return any of such elements at random. So please stick to the rule and keep id unique.

The method getElementById that can be called only on document object. It looks for the given id in the whole document.

querySelectorAll [#querySelectorAll]

By far, the most versatile method, elem.querySelectorAll(css) returns all elements inside elem matching the given CSS selector.

Here we look for all <li> elements that are last children:

<ul>
    <li>The</li>
    <li>test</li>
</ul>
<ul>
    <li>has</li>
    <li>passed</li>
</ul>
<script>
    let elements = document.querySelectorAll("ul > li:last-child");

    for (let elem of elements) {
        alert(elem.innerHTML); // "test", "passed"
    }
</script>

This method is indeed powerful, because any CSS selector can be used.

Pseudo-classes in the CSS selector like :hover and :active are also supported. For instance, document.querySelectorAll(':hover') will return the collection with elements that the pointer is over now (in nesting order: from the outermost <html> to the most nested one).

querySelector [#querySelector]

The call to elem.querySelector(css) returns the first element for the given CSS selector.

In other words, the result is the same as elem.querySelectorAll(css)[0], but the latter is looking for all elements and picking one, while elem.querySelector just looks for one. So it’s faster and also shorter to write.

matches

Previous methods were searching the DOM.

The elem.matches(css) does not look for anything, it merely checks if elem matches the given CSS-selector. It returns true or false.

The method comes in handy when we are iterating over elements (like in an array or something) and trying to filter out those that interest us.

For instance:

<a href="http://example.com/file.zip">...</a>
<a href="http://ya.ru">...</a>

<script>
    // can be any collection instead of document.body.children
    for (let elem of document.body.children) {
        if (elem.matches('a[href$="zip"]')) {
            alert("The archive reference: " + elem.href);
        }
    }
</script>

closest

Ancestors of an element are: parent, the parent of parent, its parent and so on. The ancestors together form the chain of parents from the element to the top.

The method elem.closest(css) looks the nearest ancestor that matches the CSS-selector. The elem itself is also included in the search.

In other words, the method closest goes up from the element and checks each of parents. If it matches the selector, then the search stops, and the ancestor is returned.

For instance:

<h1>Contents</h1>

<div class="contents">
    <ul class="book">
        <li class="chapter">Chapter 1</li>
        <li class="chapter">Chapter 1</li>
    </ul>
</div>

<script>
    let chapter = document.querySelector(".chapter"); // LI

    alert(chapter.closest(".book")); // UL
    alert(chapter.closest(".contents")); // DIV

    alert(chapter.closest("h1")); // null (because h1 is not an ancestor)
</script>

getElementsBy*

There are also other methods to look for nodes by a tag, class, etc.

Today, they are mostly history, as querySelector is more powerful and shorter to write.

So here we cover them mainly for completeness, while you can still find them in the old scripts.

  • elem.getElementsByTagName(tag) looks for elements with the given tag and returns the collection of them. The tag parameter can also be a star "*" for “any tags”.
  • elem.getElementsByClassName(className) returns elements that have the given CSS class.
  • document.getElementsByName(name) returns elements with the given name attribute, document-wide. Very rarely used.

For instance:

// get all divs in the document
let divs = document.getElementsByTagName("div");

Let’s find all input tags inside the table:

<table id="table">
    <tr>
        <td>Your age:</td>

        <td>
            <label>
                <input type="radio" name="age" value="young" checked /> less
                than 18
            </label>
            <label>
                <input type="radio" name="age" value="mature" /> from 18 to 50
            </label>
            <label>
                <input type="radio" name="age" value="senior" /> more than 60
            </label>
        </td>
    </tr>
</table>

<script>
    let inputs = table.getElementsByTagName("input");

    for (let input of inputs) {
        alert(input.value + ": " + input.checked);
    }
</script>

Novice developers sometimes forget the letter "s". That is, they try to call getElementByTagName instead of getElementsByTagName.

The "s" letter is absent in getElementById, because it returns a single element. But getElementsByTagName returns a collection of elements, so there’s "s" inside.

Another widespread novice mistake is to write:

// doesn't work
document.getElementsByTagName("input").value = 5;

That won’t work, because it takes a collection of inputs and assigns the value to it rather than to elements inside it.

We should either iterate over the collection or get an element by its index, and then assign, like this:

// should work (if there's an input)
document.getElementsByTagName("input")[0].value = 5;

Looking for .article elements:

<form name="my-form">
    <div class="article">Article</div>
    <div class="long article">Long article</div>
</form>

<script>
    // find by name attribute
    let form = document.getElementsByName("my-form")[0];

    // find by class inside the form
    let articles = form.getElementsByClassName("article");
    alert(articles.length); // 2, found two elements with class "article"
</script>

Live collections

All methods "getElementsBy*" return a live collection. Such collections always reflect the current state of the document and “auto-update” when it changes.

In the example below, there are two scripts.

  1. The first one creates a reference to the collection of <div>. As of now, its length is 1.
  2. The second scripts runs after the browser meets one more <div>, so its length is 2.
<div>First div</div>

<script>
    let divs = document.getElementsByTagName("div");
    alert(divs.length); // 1
</script>

<div>Second div</div>

<script>
    alert(divs.length); // 2
</script>

In contrast, querySelectorAll returns a static collection. It’s like a fixed array of elements.

If we use it instead, then both scripts output 1:

<div>First div</div>

<script>
    let divs = document.querySelectorAll("div");
    alert(divs.length); // 1
</script>

<div>Second div</div>

<script>
    alert(divs.length); // 1
</script>

Now we can easily see the difference. The static collection did not increase after the appearance of a new div in the document.

Summary

There are 6 main methods to search for nodes in DOM:

Method Searches by... Can call on an element? Live?
querySelector CSS-selector -
querySelectorAll CSS-selector -
getElementById id - -
getElementsByName name -
getElementsByTagName tag or '*'
getElementsByClassName class

By far the most used are querySelector and querySelectorAll, but getElementBy* can be sporadically helpful or found in the old scripts.

Besides that:

  • There is elem.matches(css) to check if elem matches the given CSS selector.
  • There is elem.closest(css) to look for the nearest ancestor that matches the given CSS-selector. The elem itself is also checked.

And let’s mention one more method here to check for the child-parent relationship, as it’s sometimes useful:

  • elemA.contains(elemB) returns true if elemB is inside elemA (a descendant of elemA) or when elemA==elemB.

Node properties: type, tag and contents

Let’s get a more in-depth look at DOM nodes.

In this chapter we’ll see more into what they are and learn their most used properties.

DOM node classes

Different DOM nodes may have different properties. For instance, an element node corresponding to tag <a> has link-related properties, and the one corresponding to <input> has input-related properties and so on. Text nodes are not the same as element nodes. But there are also common properties and methods between all of them, because all classes of DOM nodes form a single hierarchy.

Each DOM node belongs to the corresponding built-in class.

The root of the hierarchy is EventTarget, that is inherited by Node, and other DOM nodes inherit from it.

Here’s the picture, explanations to follow:

The classes are:

  • EventTarget – is the root “abstract” class. Objects of that class are never created. It serves as a base, so that all DOM nodes support so-called “events”, we’ll study them later.
  • Node – is also an “abstract” class, serving as a base for DOM nodes. It provides the core tree functionality: parentNode, nextSibling, childNodes and so on (they are getters). Objects of Node class are never created. But there are concrete node classes that inherit from it, namely: Text for text nodes, Element for element nodes and more exotic ones like Comment for comment nodes.
  • Element – is a base class for DOM elements. It provides element-level navigation like nextElementSibling, children and searching methods like getElementsByTagName, querySelector. A browser supports not only HTML, but also XML and SVG. The Element class serves as a base for more specific classes: SVGElement, XMLElement and HTMLElement.
  • HTMLElement – is finally the basic class for all HTML elements. It is inherited by concrete HTML elements:
    • HTMLInputElement – the class for <input> elements,
    • HTMLBodyElement – the class for <body> elements,
    • HTMLAnchorElement – the class for <a> elements,
    • …and so on, each tag has its own class that may provide specific properties and methods.

So, the full set of properties and methods of a given node comes as the result of the inheritance.

For example, let’s consider the DOM object for an <input> element. It belongs to HTMLInputElement class.

It gets properties and methods as a superposition of (listed in inheritance order):

  • HTMLInputElement – this class provides input-specific properties,
  • HTMLElement – it provides common HTML element methods (and getters/setters),
  • Element – provides generic element methods,
  • Node – provides common DOM node properties,
  • EventTarget – gives the support for events (to be covered),
  • …and finally it inherits from Object, so “plain object” methods like hasOwnProperty are also available.

To see the DOM node class name, we can recall that an object usually has the constructor property. It references the class constructor, and constructor.name is its name:

alert(document.body.constructor.name); // HTMLBodyElement

…Or we can just toString it:

alert(document.body); // [object HTMLBodyElement]

We also can use instanceof to check the inheritance:

alert(document.body instanceof HTMLBodyElement); // true
alert(document.body instanceof HTMLElement); // true
alert(document.body instanceof Element); // true
alert(document.body instanceof Node); // true
alert(document.body instanceof EventTarget); // true

As we can see, DOM nodes are regular JavaScript objects. They use prototype-based classes for inheritance.

That’s also easy to see by outputting an element with console.dir(elem) in a browser. There in the console you can see HTMLElement.prototype, Element.prototype and so on.

Most browsers support two commands in their developer tools: console.log and console.dir. They output their arguments to the console. For JavaScript objects these commands usually do the same.

But for DOM elements they are different:

  • console.log(elem) shows the element DOM tree.
  • console.dir(elem) shows the element as a DOM object, good to explore its properties.

Try it on document.body.

In the specification, DOM classes aren’t described by using JavaScript, but a special Interface description language (IDL), that is usually easy to understand.

In IDL all properties are prepended with their types. For instance, DOMString, boolean and so on.

Here’s an excerpt from it, with comments:

// Define HTMLInputElement

// The colon ":" means that HTMLInputElement inherits from HTMLElement

interface HTMLInputElement: HTMLElement {
  // here go properties and methods of <input> elements


  // "DOMString" means that the value of a property is a string

  attribute DOMString accept;
  attribute DOMString alt;
  attribute DOMString autocomplete;
  attribute DOMString value;


  // boolean value property (true/false)
  attribute boolean autofocus;

  ...

  // now the method: "void" means that the method returns no value

  void select();
  ...
}

The “nodeType” property

The nodeType property provides one more, “old-fashioned” way to get the “type” of a DOM node.

It has a numeric value:

  • elem.nodeType == 1 for element nodes,
  • elem.nodeType == 3 for text nodes,
  • elem.nodeType == 9 for the document object,
  • there are few other values in the specification.

For instance:

<body>
    <script>
        let elem = document.body;

        // let's examine what it is?
        alert(elem.nodeType); // 1 => element

        // and the first child is...
        alert(elem.firstChild.nodeType); // 3 => text

        // for the document object, the type is 9
        alert(document.nodeType); // 9
    </script>
</body>

In modern scripts, we can use instanceof and other class-based tests to see the node type, but sometimes nodeType may be simpler. We can only read nodeType, not change it.

Tag: nodeName and tagName

Given a DOM node, we can read its tag name from nodeName or tagName properties:

For instance:

alert(document.body.nodeName); // BODY
alert(document.body.tagName); // BODY

Is there any difference between tagName and nodeName?

Sure, the difference is reflected in their names, but is indeed a bit subtle.

  • The tagName property exists only for Element nodes.
  • The nodeName is defined for any Node:
    • for elements it means the same as tagName.
    • for other node types (text, comment, etc.) it has a string with the node type.

In other words, tagName is only supported by element nodes (as it originates from Element class), while nodeName can say something about other node types.

For instance, let’s compare tagName and nodeName for the document and a comment node:

<body>
    <!-- comment -->

    <script>
        // for comment
        alert(document.body.firstChild.tagName); // undefined (not an element)
        alert(document.body.firstChild.nodeName); // #comment

        // for document
        alert(document.tagName); // undefined (not an element)
        alert(document.nodeName); // #document
    </script>
</body>

If we only deal with elements, then we can use both tagName and nodeName - there’s no difference.

The browser has two modes of processing documents: HTML and XML. Usually the HTML-mode is used for webpages. XML-mode is enabled when the browser receives an XML-document with the header: Content-Type: application/xml+xhtml.

In HTML mode tagName/nodeName is always uppercased: it’s BODY either for <body> or <BoDy>.

In XML mode the case is kept “as is”. Nowadays XML mode is rarely used.

innerHTML: the contents

The innerHTML property allows to get the HTML inside the element as a string.

We can also modify it. So it’s one of the most powerful ways to change the page.

The example shows the contents of document.body and then replaces it completely:

<body>
    <p>A paragraph</p>
    <div>A div</div>

    <script>
        alert(document.body.innerHTML); // read the current contents
        document.body.innerHTML = "The new BODY!"; // replace it
    </script>
</body>

We can try to insert invalid HTML, the browser will fix our errors:

<body>
    <script>
        document.body.innerHTML = "<b>test"; // forgot to close the tag
        alert(document.body.innerHTML); // <b>test</b> (fixed)
    </script>
</body>

If innerHTML inserts a <script> tag into the document – it becomes a part of HTML, but doesn’t execute.

Beware: “innerHTML+=” does a full overwrite

We can append HTML to an element by using elem.innerHTML+="more html".

Like this:

chatDiv.innerHTML += "<div>Hello<img src='smile.gif'/> !</div>";
chatDiv.innerHTML += "How goes?";

But we should be very careful about doing it, because what’s going on is not an addition, but a full overwrite.

Technically, these two lines do the same:

elem.innerHTML += "...";
// is a shorter way to write:

elem.innerHTML = elem.innerHTML + "...";

In other words, innerHTML+= does this:

  1. The old contents is removed.
  2. The new innerHTML is written instead (a concatenation of the old and the new one).

As the content is “zeroed-out” and rewritten from the scratch, all images and other resources will be reloaded.

In the chatDiv example above the line chatDiv.innerHTML+="How goes?" re-creates the HTML content and reloads smile.gif (hope it’s cached). If chatDiv has a lot of other text and images, then the reload becomes clearly visible.

There are other side-effects as well. For instance, if the existing text was selected with the mouse, then most browsers will remove the selection upon rewriting innerHTML. And if there was an <input> with a text entered by the visitor, then the text will be removed. And so on.

Luckily, there are other ways to add HTML besides innerHTML, and we’ll study them soon.

outerHTML: full HTML of the element

The outerHTML property contains the full HTML of the element. That’s like innerHTML plus the element itself.

Here’s an example:

<div id="elem">Hello <b>World</b></div>

<script>
    alert(elem.outerHTML); // <div id="elem">Hello <b>World</b></div>
</script>

Beware: unlike innerHTML, writing to outerHTML does not change the element. Instead, it replaces it in the DOM.

Yeah, sounds strange, and strange it is, that’s why we make a separate note about it here. Take a look.

Consider the example:

<div>Hello, world!</div>

<script>
    let div = document.querySelector("div");

    // replace div.outerHTML with <p>...</p>

    div.outerHTML = "<p>A new element</p>"; // (*)

    // Wow! 'div' is still the same!

    alert(div.outerHTML); // <div>Hello, world!</div> (**)
</script>

Looks really odd, right?

In the line (*) we replaced div with <p>A new element</p>. In the outer document (the DOM) we can see the new content instead of the <div>. But, as we can see in line (**), the value of the old div variable hasn’t changed!

The outerHTML assignment does not modify the DOM element (the object referenced by, in this case, the variable ‘div’), but removes it from the DOM and inserts the new HTML in its place.

So what happened in div.outerHTML=... is:

  • div was removed from the document.
  • Another piece of HTML <p>A new element</p> was inserted in its place.
  • div still has its old value. The new HTML wasn’t saved to any variable.

It’s so easy to make an error here: modify div.outerHTML and then continue to work with div as if it had the new content in it. But it doesn’t. Such thing is correct for innerHTML, but not for outerHTML.

We can write to elem.outerHTML, but should keep in mind that it doesn’t change the element we’re writing to (’elem’). It puts the new HTML in its place instead. We can get references to the new elements by querying the DOM.

nodeValue/data: text node content

The innerHTML property is only valid for element nodes.

Other node types, such as text nodes, have their counterpart: nodeValue and data properties. These two are almost the same for practical use, there are only minor specification differences. So we’ll use data, because it’s shorter.

An example of reading the content of a text node and a comment:

<body>
    Hello
    <!-- Comment -->
    <script>
        let text = document.body.firstChild;

        alert(text.data); // Hello

        let comment = text.nextSibling;

        alert(comment.data); // Comment
    </script>
</body>

For text nodes we can imagine a reason to read or modify them, but why comments?

Sometimes developers embed information or template instructions into HTML in them, like this:

<!-- if isAdmin -->
<div>Welcome, Admin!</div>
<!-- /if -->

…Then JavaScript can read it from data property and process embedded instructions.

textContent: pure text

The textContent provides access to the text inside the element: only text, minus all <tags>.

For instance:

<div id="news">
    <h1>Headline!</h1>
    <p>Martians attack people!</p>
</div>

<script>
    // Headline! Martians attack people!
    alert(news.textContent);
</script>

As we can see, only text is returned, as if all <tags> were cut out, but the text in them remained.

In practice, reading such text is rarely needed.

Writing to textContent is much more useful, because it allows to write text the “safe way”.

Let’s say we have an arbitrary string, for instance entered by a user, and want to show it.

  • With innerHTML we’ll have it inserted “as HTML”, with all HTML tags.
  • With textContent we’ll have it inserted “as text”, all symbols are treated literally.

Compare the two:

<div id="elem1"></div>
<div id="elem2"></div>

<script>
    let name = prompt("What's your name?", "<b>Winnie-the-pooh!</b>");

    elem1.innerHTML = name;
    elem2.textContent = name;
</script>
  1. The first <div> gets the name “as HTML”: all tags become tags, so we see the bold name.
  2. The second <div> gets the name “as text”, so we literally see <b>Winnie-the-pooh!</b>.

In most cases, we expect the text from a user, and want to treat it as text. We don’t want unexpected HTML in our site. An assignment to textContent does exactly that.

The “hidden” property

The “hidden” attribute and the DOM property specifies whether the element is visible or not.

We can use it in HTML or assign using JavaScript, like this:

<div>Both divs below are hidden</div>

<div hidden>With the attribute "hidden"</div>

<div id="elem">JavaScript assigned the property "hidden"</div>

<script>
    elem.hidden = true;
</script>

Technically, hidden works the same as style="display:none". But it’s shorter to write.

Here’s a blinking element:

<div id="elem">A blinking element</div>

<script>
    setInterval(() => (elem.hidden = !elem.hidden), 1000);
</script>

More properties

DOM elements also have additional properties, in particular those that depend on the class:

  • value – the value for <input>, <select> and <textarea> (HTMLInputElement, HTMLSelectElement…).
  • href – the “href” for <a href="..."> (HTMLAnchorElement).
  • id – the value of “id” attribute, for all elements (HTMLElement).
  • …and much more…

For instance:

<input type="text" id="elem" value="value" />

<script>
    alert(elem.type); // "text"
    alert(elem.id); // "elem"
    alert(elem.value); // value
</script>

Most standard HTML attributes have the corresponding DOM property, and we can access it like that.

If we want to know the full list of supported properties for a given class, we can find them in the specification. For instance, HTMLInputElement is documented at https://html.spec.whatwg.org/#htmlinputelement.

Or if we’d like to get them fast or are interested in a concrete browser specification – we can always output the element using console.dir(elem) and read the properties. Or explore “DOM properties” in the Elements tab of the browser developer tools.

Summary

Each DOM node belongs to a certain class. The classes form a hierarchy. The full set of properties and methods come as the result of inheritance.

Main DOM node properties are:

nodeType
We can use it to see if a node is a text or an element node. It has a numeric value: 1 for elements,3 for text nodes, and a few others for other node types. Read-only.
nodeName/tagName
For elements, tag name (uppercased unless XML-mode). For non-element nodes nodeName describes what it is. Read-only.
innerHTML
The HTML content of the element. Can be modified.
outerHTML
The full HTML of the element. A write operation into elem.outerHTML does not touch elem itself. Instead it gets replaced with the new HTML in the outer context.
nodeValue/data
The content of a non-element node (text, comment). These two are almost the same, usually we use data. Can be modified.
textContent
The text inside the element: HTML minus all <tags>. Writing into it puts the text inside the element, with all special characters and tags treated exactly as text. Can safely insert user-generated text and protect from unwanted HTML insertions.
hidden
When set to true, does the same as CSS display:none.

DOM nodes also have other properties depending on their class. For instance, <input> elements (HTMLInputElement) support value, type, while <a> elements (HTMLAnchorElement) support href etc. Most standard HTML attributes have a corresponding DOM property.

However, HTML attributes and DOM properties are not always the same, as we’ll see in the next chapter.

Attributes and properties

When the browser loads the page, it “reads” (another word: “parses”) the HTML and generates DOM objects from it. For element nodes, most standard HTML attributes automatically become properties of DOM objects.

For instance, if the tag is <body id="page">, then the DOM object has body.id="page".

But the attribute-property mapping is not one-to-one! In this chapter we’ll pay attention to separate these two notions, to see how to work with them, when they are the same, and when they are different.

DOM properties

We’ve already seen built-in DOM properties. There are a lot. But technically no one limits us, and if there aren’t enough, we can add our own.

DOM nodes are regular JavaScript objects. We can alter them.

For instance, let’s create a new property in document.body:

document.body.myData = {
    name: "Caesar",
    title: "Imperator",
};

alert(document.body.myData.title); // Imperator

We can add a method as well:

document.body.sayTagName = function () {
    alert(this.tagName);
};

document.body.sayTagName(); // BODY (the value of "this" in the method is document.body)

We can also modify built-in prototypes like Element.prototype and add new methods to all elements:

Element.prototype.sayHi = function () {
    alert(`Hello, I'm ${this.tagName}`);
};

document.documentElement.sayHi(); // Hello, I'm HTML
document.body.sayHi(); // Hello, I'm BODY

So, DOM properties and methods behave just like those of regular JavaScript objects:

  • They can have any value.
  • They are case-sensitive (write elem.nodeType, not elem.NoDeTyPe).

HTML attributes

In HTML, tags may have attributes. When the browser parses the HTML to create DOM objects for tags, it recognizes standard attributes and creates DOM properties from them.

So when an element has id or another standard attribute, the corresponding property gets created. But that doesn’t happen if the attribute is non-standard.

For instance:

<body id="test" something="non-standard">
    <script>
        alert(document.body.id); // test

        // non-standard attribute does not yield a property
        alert(document.body.something); // undefined
    </script>
</body>

Please note that a standard attribute for one element can be unknown for another one. For instance, "type" is standard for <input> (HTMLInputElement), but not for <body> (HTMLBodyElement). Standard attributes are described in the specification for the corresponding element class.

Here we can see it:

<body id="body" type="...">
    <input id="input" type="text" />
    <script>
        alert(input.type); // text

        alert(body.type); // undefined: DOM property not created, because it's non-standard
    </script>
</body>

So, if an attribute is non-standard, there won’t be a DOM-property for it. Is there a way to access such attributes?

Sure. All attributes are accessible by using the following methods:

  • elem.hasAttribute(name) – checks for existence.
  • elem.getAttribute(name) – gets the value.
  • elem.setAttribute(name, value) – sets the value.
  • elem.removeAttribute(name) – removes the attribute.

These methods operate exactly with what’s written in HTML.

Also one can read all attributes using elem.attributes: a collection of objects that belong to a built-in Attr class, with name and value properties.

Here’s a demo of reading a non-standard property:

<body something="non-standard">
    <script>
        alert(document.body.getAttribute("something")); // non-standard
    </script>
</body>

HTML attributes have the following features:

  • Their name is case-insensitive (id is same as ID).
  • Their values are always strings.

Here’s an extended demo of working with attributes:

<body>
    <div id="elem" about="Elephant"></div>

    <script>
        alert(elem.getAttribute("About")); // (1) 'Elephant', reading

        elem.setAttribute("Test", 123); // (2), writing

        alert(elem.outerHTML); // (3), see if the attribute is in HTML (yes)

        for (let attr of elem.attributes) {
            // (4) list all
            alert(`${attr.name} = ${attr.value}`);
        }
    </script>
</body>

Please note:

  1. getAttribute('About') – the first letter is uppercase here, and in HTML it’s all lowercase. But that doesn’t matter: attribute names are case-insensitive.
  2. We can assign anything to an attribute, but it becomes a string. So here we have "123" as the value.
  3. All attributes including ones that we set are visible in outerHTML.
  4. The attributes collection is iterable and has all the attributes of the element (standard and non-standard) as objects with name and value properties.

Property-attribute synchronization

When a standard attribute changes, the corresponding property is auto-updated, and (with some exceptions) vice versa.

In the example below id is modified as an attribute, and we can see the property changed too. And then the same backwards:

<input />

<script>
    let input = document.querySelector("input");

    // attribute => property
    input.setAttribute("id", "id");
    alert(input.id); // id (updated)

    // property => attribute
    input.id = "newId";
    alert(input.getAttribute("id")); // newId (updated)
</script>

But there are exclusions, for instance input.value synchronizes only from attribute -> to property, but not back:

<input />

<script>
    let input = document.querySelector("input");

    // attribute => property
    input.setAttribute("value", "text");
    alert(input.value); // text

    // NOT property => attribute
    input.value = "newValue";
    alert(input.getAttribute("value")); // text (not updated!)
</script>

In the example above:

  • Changing the attribute value updates the property.
  • But the property change does not affect the attribute.

That “feature” may actually come in handy, because the user actions may lead to value changes, and then after them, if we want to recover the “original” value from HTML, it’s in the attribute.

DOM properties are typed

DOM properties are not always strings. For instance, the input.checked property (for checkboxes) is a boolean:

<input id="input" type="checkbox" checked /> checkbox

<script>
    alert(input.getAttribute("checked")); // the attribute value is: empty string
    alert(input.checked); // the property value is: true
</script>

There are other examples. The style attribute is a string, but the style property is an object:

<div id="div" style="color:red;font-size:120%">Hello</div>

<script>
    // string
    alert(div.getAttribute("style")); // color:red;font-size:120%

    // object
    alert(div.style); // [object CSSStyleDeclaration]
    alert(div.style.color); // red
</script>

Most properties are strings though.

Quite rarely, even if a DOM property type is a string, it may differ from the attribute. For instance, the href DOM property is always a full URL, even if the attribute contains a relative URL or just a #hash.

Here’s an example:

<a id="a" href="#hello">link</a>
<script>
    // attribute
    alert(a.getAttribute("href")); // #hello

    // property
    alert(a.href); // full URL in the form http://site.com/page#hello
</script>

If we need the value of href or any other attribute exactly as written in the HTML, we can use getAttribute.

Non-standard attributes, dataset

When writing HTML, we use a lot of standard attributes. But what about non-standard, custom ones? First, let’s see whether they are useful or not? What for?

Sometimes non-standard attributes are used to pass custom data from HTML to JavaScript, or to “mark” HTML-elements for JavaScript.

Like this:

<!-- mark the div to show "name" here -->
<div show-info="name"></div>
<!-- and age here -->
<div show-info="age"></div>

<script>
    // the code finds an element with the mark and shows what's requested
    let user = {
        name: "Pete",
        age: 25,
    };

    for (let div of document.querySelectorAll("[show-info]")) {
        // insert the corresponding info into the field
        let field = div.getAttribute("show-info");
        div.innerHTML = user[field]; // first Pete into "name", then 25 into "age"
    }
</script>

Also they can be used to style an element.

For instance, here for the order state the attribute order-state is used:

<style>
    /* styles rely on the custom attribute "order-state" */
    .order[order-state="new"] {
        color: green;
    }

    .order[order-state="pending"] {
        color: blue;
    }

    .order[order-state="canceled"] {
        color: red;
    }
</style>

<div class="order" order-state="new">A new order.</div>

<div class="order" order-state="pending">A pending order.</div>

<div class="order" order-state="canceled">A canceled order.</div>

Why would using an attribute be preferable to having classes like .order-state-new, .order-state-pending, order-state-canceled?

Because an attribute is more convenient to manage. The state can be changed as easy as:

// a bit simpler than removing old/adding a new class
div.setAttribute("order-state", "canceled");

But there may be a possible problem with custom attributes. What if we use a non-standard attribute for our purposes and later the standard introduces it and makes it do something? The HTML language is alive, it grows, and more attributes appear to suit the needs of developers. There may be unexpected effects in such case.

To avoid conflicts, there exist data-* attributes.

All attributes starting with “data-” are reserved for programmers’ use. They are available in the dataset property.

For instance, if an elem has an attribute named "data-about", it’s available as elem.dataset.about.

Like this:

<body data-about="Elephants">
    <script>
        alert(document.body.dataset.about); // Elephants
    </script>
</body>

Multiword attributes like data-order-state become camel-cased: dataset.orderState.

Here’s a rewritten “order state” example:

<style>
    .order[data-order-state="new"] {
        color: green;
    }

    .order[data-order-state="pending"] {
        color: blue;
    }

    .order[data-order-state="canceled"] {
        color: red;
    }
</style>

<div id="order" class="order" data-order-state="new">A new order.</div>

<script>
    // read
    alert(order.dataset.orderState); // new

    // modify
    order.dataset.orderState = "pending"; // (*)
</script>

Using data-* attributes is a valid, safe way to pass custom data.

Please note that we can not only read, but also modify data-attributes. Then CSS updates the view accordingly: in the example above the last line (*) changes the color to blue.

Summary

  • Attributes – is what’s written in HTML.
  • Properties – is what’s in DOM objects.

A small comparison:

Properties Attributes
Type Any value, standard properties have types described in the spec A string
Name Name is case-sensitive Name is not case-sensitive

Methods to work with attributes are:

  • elem.hasAttribute(name) – to check for existence.
  • elem.getAttribute(name) – to get the value.
  • elem.setAttribute(name, value) – to set the value.
  • elem.removeAttribute(name) – to remove the attribute.
  • elem.attributes is a collection of all attributes.

For most situations using DOM properties is preferable. We should refer to attributes only when DOM properties do not suit us, when we need exactly attributes, for instance:

  • We need a non-standard attribute. But if it starts with data-, then we should use dataset.
  • We want to read the value “as written” in HTML. The value of the DOM property may be different, for instance the href property is always a full URL, and we may want to get the “original” value.

Modifying the document

DOM modification is the key to creating “live” pages.

Here we’ll see how to create new elements “on the fly” and modify the existing page content.

Example: show a message

Let’s demonstrate using an example. We’ll add a message on the page that looks nicer than alert.

Here’s how it will look:

<style>
    .alert {
        padding: 15px;
        border: 1px solid #d6e9c6;
        border-radius: 4px;
        color: #3c763d;
        background-color: #dff0d8;
    }
</style>

<div class="alert">
    <strong>Hi there!</strong> You've read an important message.
</div>

That was the HTML example. Now let’s create the same div with JavaScript (assuming that the styles are in the HTML/CSS already).

Creating an element

To create DOM nodes, there are two methods:

document.createElement(tag)
Creates a new element node with the given tag:
let div = document.createElement('div');
document.createTextNode(text)
Creates a new text node with the given text:
let textNode = document.createTextNode('Here I am');

Most of the time we need to create element nodes, such as the div for the message.

Creating the message

Creating the message div takes 3 steps:

// 1. Create <div> element
let div = document.createElement("div");

// 2. Set its class to "alert"
div.className = "alert";

// 3. Fill it with the content
div.innerHTML = "<strong>Hi there!</strong> You've read an important message.";

We’ve created the element. But as of now it’s only in a variable named div, not in the page yet. So we can’t see it.

Insertion methods

To make the div show up, we need to insert it somewhere into document. For instance, into <body> element, referenced by document.body.

There’s a special method append for that: document.body.append(div).

Here’s the full code:

<style>
    .alert {
        padding: 15px;
        border: 1px solid #d6e9c6;
        border-radius: 4px;
        color: #3c763d;
        background-color: #dff0d8;
    }
</style>

<script>
    let div = document.createElement("div");
    div.className = "alert";
    div.innerHTML =
        "<strong>Hi there!</strong> You've read an important message.";

    document.body.append(div);
</script>

Here we called append on document.body, but we can call append method on any other element, to put another element into it. For instance, we can append something to <div> by calling div.append(anotherElement).

Here are more insertion methods, they specify different places where to insert:

  • node.append(...nodes or strings) – append nodes or strings at the end of node,
  • node.prepend(...nodes or strings) – insert nodes or strings at the beginning of node,
  • node.before(...nodes or strings) –- insert nodes or strings before node,
  • node.after(...nodes or strings) –- insert nodes or strings after node,
  • node.replaceWith(...nodes or strings) –- replaces node with the given nodes or strings.

Arguments of these methods are an arbitrary list of DOM nodes to insert, or text strings (that become text nodes automatically).

Let’s see them in action.

Here’s an example of using these methods to add items to a list and the text before/after it:

<ol id="ol">
    <li>0</li>
    <li>1</li>
    <li>2</li>
</ol>

<script>
    ol.before("before"); // insert string "before" before <ol>
    ol.after("after"); // insert string "after" after <ol>

    let liFirst = document.createElement("li");
    liFirst.innerHTML = "prepend";
    ol.prepend(liFirst); // insert liFirst at the beginning of <ol>

    let liLast = document.createElement("li");
    liLast.innerHTML = "append";
    ol.append(liLast); // insert liLast at the end of <ol>
</script>

Here’s a visual picture of what the methods do:

So the final list will be:

before
<ol id="ol">
    <li>prepend</li>
    <li>0</li>
    <li>1</li>
    <li>2</li>
    <li>append</li>
</ol>
after

As said, these methods can insert multiple nodes and text pieces in a single call.

For instance, here a string and an element are inserted:

<div id="div"></div>
<script>
    div.before("<p>Hello</p>", document.createElement("hr"));
</script>

Please note: the text is inserted “as text”, not “as HTML”, with proper escaping of characters such as <, >.

So the final HTML is:

&lt;p&gt;Hello&lt;/p&gt;

<hr />
<div id="div"></div>

In other words, strings are inserted in a safe way, like elem.textContent does it.

So, these methods can only be used to insert DOM nodes or text pieces.

But what if we’d like to insert an HTML string “as html”, with all tags and stuff working, in the same manner as elem.innerHTML does it?

insertAdjacentHTML/Text/Element

For that we can use another, pretty versatile method: elem.insertAdjacentHTML(where, html).

The first parameter is a code word, specifying where to insert relative to elem. Must be one of the following:

  • "beforebegin" – insert html immediately before elem,
  • "afterbegin" – insert html into elem, at the beginning,
  • "beforeend" – insert html into elem, at the end,
  • "afterend" – insert html immediately after elem.

The second parameter is an HTML string, that is inserted “as HTML”.

For instance:

<div id="div"></div>
<script>
    div.insertAdjacentHTML("beforebegin", "<p>Hello</p>");
    div.insertAdjacentHTML("afterend", "<p>Bye</p>");
</script>

…Would lead to:

<p>Hello</p>
<div id="div"></div>
<p>Bye</p>

That’s how we can append arbitrary HTML to the page.

Here’s the picture of insertion variants:

We can easily notice similarities between this and the previous picture. The insertion points are actually the same, but this method inserts HTML.

The method has two brothers:

  • elem.insertAdjacentText(where, text) – the same syntax, but a string of text is inserted “as text” instead of HTML,
  • elem.insertAdjacentElement(where, elem) – the same syntax, but inserts an element.

They exist mainly to make the syntax “uniform”. In practice, only insertAdjacentHTML is used most of the time. Because for elements and text, we have methods append/prepend/before/after – they are shorter to write and can insert nodes/text pieces.

So here’s an alternative variant of showing a message:

<style>
    .alert {
        padding: 15px;
        border: 1px solid #d6e9c6;
        border-radius: 4px;
        color: #3c763d;
        background-color: #dff0d8;
    }
</style>

<script>
    document.body.insertAdjacentHTML(
        "afterbegin",
        `<div class="alert">
    <strong>Hi there!</strong> You've read an important message.
  </div>`
    );
</script>

Node removal

To remove a node, there’s a method node.remove().

Let’s make our message disappear after a second:

<style>
    .alert {
        padding: 15px;
        border: 1px solid #d6e9c6;
        border-radius: 4px;
        color: #3c763d;
        background-color: #dff0d8;
    }
</style>

<script>
    let div = document.createElement("div");
    div.className = "alert";
    div.innerHTML =
        "<strong>Hi there!</strong> You've read an important message.";

    document.body.append(div);

    setTimeout(() => div.remove(), 1000);
</script>

Please note: if we want to move an element to another place – there’s no need to remove it from the old one.

All insertion methods automatically remove the node from the old place.

For instance, let’s swap elements:

<div id="first">First</div>
<div id="second">Second</div>
<script>
    // no need to call remove
    second.after(first); // take #second and after it insert #first
</script>

Cloning nodes: cloneNode

How to insert one more similar message?

We could make a function and put the code there. But the alternative way would be to clone the existing div and modify the text inside it (if needed).

Sometimes when we have a big element, that may be faster and simpler.

  • The call elem.cloneNode(true) creates a “deep” clone of the element – with all attributes and subelements. If we call elem.cloneNode(false), then the clone is made without child elements.

An example of copying the message:

<style>
    .alert {
        padding: 15px;
        border: 1px solid #d6e9c6;
        border-radius: 4px;
        color: #3c763d;
        background-color: #dff0d8;
    }
</style>

<div class="alert" id="div">
    <strong>Hi there!</strong> You've read an important message.
</div>

<script>
    let div2 = div.cloneNode(true); // clone the message
    div2.querySelector("strong").innerHTML = "Bye there!"; // change the clone

    div.after(div2); // show the clone after the existing div
</script>

DocumentFragment [#document-fragment]

DocumentFragment is a special DOM node that serves as a wrapper to pass around lists of nodes.

We can append other nodes to it, but when we insert it somewhere, then its content is inserted instead.

For example, getListContent below generates a fragment with <li> items, that are later inserted into <ul>:

<ul id="ul"></ul>

<script>
    function getListContent() {
        let fragment = new DocumentFragment();

        for (let i = 1; i <= 3; i++) {
            let li = document.createElement("li");
            li.append(i);
            fragment.append(li);
        }

        return fragment;
    }

    ul.append(getListContent()); // (*)
</script>

Please note, at the last line (*) we append DocumentFragment, but it “blends in”, so the resulting structure will be:

<ul>
    <li>1</li>
    <li>2</li>
    <li>3</li>
</ul>

DocumentFragment is rarely used explicitly. Why append to a special kind of node, if we can return an array of nodes instead? Rewritten example:

<ul id="ul"></ul>

<script>
    function getListContent() {
        let result = [];

        for (let i = 1; i <= 3; i++) {
            let li = document.createElement("li");
            li.append(i);
            result.push(li);
        }

        return result;
    }

    ul.append(...getListContent()); // append + "..." operator = friends!
</script>

We mention DocumentFragment mainly because there are some concepts on top of it, like template element, that we’ll cover much later.

Old-school insert/remove methods

[old]

There are also “old school” DOM manipulation methods, existing for historical reasons.

These methods come from really ancient times. Nowadays, there’s no reason to use them, as modern methods, such as append, prepend, before, after, remove, replaceWith, are more flexible.

The only reason we list these methods here is that you can find them in many old scripts:

parentElem.appendChild(node)
Appends node as the last child of parentElem.

The following example adds a new <li> to the end of <ol>:

<ol id="list">
  <li>0</li>
  <li>1</li>
  <li>2</li>
</ol>

<script>
  let newLi = document.createElement('li');
  newLi.innerHTML = 'Hello, world!';

  list.appendChild(newLi);
</script>
parentElem.insertBefore(node, nextSibling)
Inserts node before nextSibling into parentElem.

The following code inserts a new list item before the second <li>:

<ol id="list">
  <li>0</li>
  <li>1</li>
  <li>2</li>
</ol>
<script>
  let newLi = document.createElement('li');
  newLi.innerHTML = 'Hello, world!';


  list.insertBefore(newLi, list.children[1]);

</script>

To insert newLi as the first element, we can do it like this:

list.insertBefore(newLi, list.firstChild);
parentElem.replaceChild(node, oldChild)
Replaces oldChild with node among children of parentElem.
parentElem.removeChild(node)
Removes node from parentElem (assuming node is its child).

The following example removes first <li> from <ol>:

<ol id="list">
  <li>0</li>
  <li>1</li>
  <li>2</li>
</ol>

<script>
  let li = list.firstElementChild;
  list.removeChild(li);
</script>

All these methods return the inserted/removed node. In other words, parentElem.appendChild(node) returns node. But usually the returned value is not used, we just run the method.

A word about “document.write”

There’s one more, very ancient method of adding something to a web-page: document.write.

The syntax:

<p>Somewhere in the page...</p>

<script>
    document.write("<b>Hello from JS</b>");
</script>

<p>The end</p>

The call to document.write(html) writes the html into page “right here and now”. The html string can be dynamically generated, so it’s kind of flexible. We can use JavaScript to create a full-fledged webpage and write it.

The method comes from times when there was no DOM, no standards… Really old times. It still lives, because there are scripts using it.

In modern scripts we can rarely see it, because of the following important limitation:

The call to document.write only works while the page is loading.

If we call it afterwards, the existing document content is erased.

For instance:

<p>After one second the contents of this page will be replaced...</p>

<script>
    // document.write after 1 second
    // that's after the page loaded, so it erases the existing content
    setTimeout(() => document.write("<b>...By this.</b>"), 1000);
</script>

So it’s kind of unusable at “after loaded” stage, unlike other DOM methods we covered above.

That’s the downside.

There’s an upside also. Technically, when document.write is called while the browser is reading (“parsing”) incoming HTML, and it writes something, the browser consumes it just as if it were initially there, in the HTML text.

So it works blazingly fast, because there’s no DOM modification involved. It writes directly into the page text, while the DOM is not yet built.

So if we need to add a lot of text into HTML dynamically, and we’re at page loading phase, and the speed matters, it may help. But in practice these requirements rarely come together. And usually we can see this method in scripts just because they are old.

Summary

  • Methods to create new nodes:

    • document.createElement(tag) – creates an element with the given tag,
    • document.createTextNode(value) – creates a text node (rarely used),
    • elem.cloneNode(deep) – clones the element, if deep==true then with all descendants.
  • Insertion and removal:

    • node.append(...nodes or strings) – insert into node, at the end,
    • node.prepend(...nodes or strings) – insert into node, at the beginning,
    • node.before(...nodes or strings) –- insert right before node,
    • node.after(...nodes or strings) –- insert right after node,
    • node.replaceWith(...nodes or strings) –- replace node.
    • node.remove() –- remove the node.

    Text strings are inserted “as text”.

  • There are also “old school” methods:

    • parent.appendChild(node)
    • parent.insertBefore(node, nextSibling)
    • parent.removeChild(node)
    • parent.replaceChild(newElem, node)

    All these methods return node.

  • Given some HTML in html, elem.insertAdjacentHTML(where, html) inserts it depending on the value of where:

    • "beforebegin" – insert html right before elem,
    • "afterbegin" – insert html into elem, at the beginning,
    • "beforeend" – insert html into elem, at the end,
    • "afterend" – insert html right after elem.

    Also there are similar methods, elem.insertAdjacentText and elem.insertAdjacentElement, that insert text strings and elements, but they are rarely used.

  • To append HTML to the page before it has finished loading:

    • document.write(html)

    After the page is loaded such a call erases the document. Mostly seen in old scripts.

Styles and classes

Before we get into JavaScript’s ways of dealing with styles and classes – here’s an important rule. Hopefully it’s obvious enough, but we still have to mention it.

There are generally two ways to style an element:

  1. Create a class in CSS and add it: <div class="...">
  2. Write properties directly into style: <div style="...">.

JavaScript can modify both classes and style properties.

We should always prefer CSS classes to style. The latter should only be used if classes “can’t handle it”.

For example, style is acceptable if we calculate coordinates of an element dynamically and want to set them from JavaScript, like this:

let top = /* complex calculations */;
let left = /* complex calculations */;

elem.style.left = left; // e.g '123px', calculated at run-time
elem.style.top = top; // e.g '456px'

For other cases, like making the text red, adding a background icon – describe that in CSS and then add the class (JavaScript can do that). That’s more flexible and easier to support.

className and classList

Changing a class is one of the most often used actions in scripts.

In the ancient time, there was a limitation in JavaScript: a reserved word like "class" could not be an object property. That limitation does not exist now, but at that time it was impossible to have a "class" property, like elem.class.

So for classes the similar-looking property "className" was introduced: the elem.className corresponds to the "class" attribute.

For instance:

<body class="main page">
    <script>
        alert(document.body.className); // main page
    </script>
</body>

If we assign something to elem.className, it replaces the whole string of classes. Sometimes that’s what we need, but often we want to add/remove a single class.

There’s another property for that: elem.classList.

The elem.classList is a special object with methods to add/remove/toggle a single class.

For instance:

<body class="main page">
    <script>
        // add a class
        document.body.classList.add("article");

        alert(document.body.className); // main page article
    </script>
</body>

So we can operate both on the full class string using className or on individual classes using classList. What we choose depends on our needs.

Methods of classList:

  • elem.classList.add/remove("class") – adds/removes the class.
  • elem.classList.toggle("class") – adds the class if it doesn’t exist, otherwise removes it.
  • elem.classList.contains("class") – checks for the given class, returns true/false.

Besides, classList is iterable, so we can list all classes with for..of, like this:

<body class="main page">
    <script>
        for (let name of document.body.classList) {
            alert(name); // main, and then page
        }
    </script>
</body>

Element style

The property elem.style is an object that corresponds to what’s written in the "style" attribute. Setting elem.style.width="100px" works the same as if we had in the attribute style a string width:100px.

For multi-word property the camelCase is used:

background-color  => elem.style.backgroundColor
z-index           => elem.style.zIndex
border-left-width => elem.style.borderLeftWidth

For instance:

document.body.style.backgroundColor = prompt("background color?", "green");

Browser-prefixed properties like -moz-border-radius, -webkit-border-radius also follow the same rule: a dash means upper case.

For instance:

button.style.MozBorderRadius = "5px";
button.style.WebkitBorderRadius = "5px";

Resetting the style property

Sometimes we want to assign a style property, and later remove it.

For instance, to hide an element, we can set elem.style.display = "none".

Then later we may want to remove the style.display as if it were not set. Instead of delete elem.style.display we should assign an empty string to it: elem.style.display = "".

// if we run this code, the <body> will blink
document.body.style.display = "none"; // hide

setTimeout(() => (document.body.style.display = ""), 1000); // back to normal

If we set style.display to an empty string, then the browser applies CSS classes and its built-in styles normally, as if there were no such style.display property at all.

Normally, we use style.* to assign individual style properties. We can’t set the full style like div.style="color: red; width: 100px", because div.style is an object, and it’s read-only.

To set the full style as a string, there’s a special property style.cssText:

<div id="div">Button</div>

<script>
    // we can set special style flags like "important" here
    div.style.cssText = `color: red !important;
    background-color: yellow;
    width: 100px;
    text-align: center;
  `;

    alert(div.style.cssText);
</script>

This property is rarely used, because such assignment removes all existing styles: it does not add, but replaces them. May occasionally delete something needed. But we can safely use it for new elements, when we know we won’t delete an existing style.

The same can be accomplished by setting an attribute: div.setAttribute('style', 'color: red...').

Mind the units

Don’t forget to add CSS units to values.

For instance, we should not set elem.style.top to 10, but rather to 10px. Otherwise it wouldn’t work:

<body>
    <script>
        // doesn't work!
        document.body.style.margin = 20;
        alert(document.body.style.margin); // '' (empty string, the assignment is ignored)

        // now add the CSS unit (px) - and it works
        document.body.style.margin = "20px";
        alert(document.body.style.margin); // 20px

        alert(document.body.style.marginTop); // 20px
        alert(document.body.style.marginLeft); // 20px
    </script>
</body>

Please note: the browser “unpacks” the property style.margin in the last lines and infers style.marginLeft and style.marginTop from it.

Computed styles: getComputedStyle

So, modifying a style is easy. But how to read it?

For instance, we want to know the size, margins, the color of an element. How to do it?

The style property operates only on the value of the "style" attribute, without any CSS cascade.

So we can’t read anything that comes from CSS classes using elem.style.

For instance, here style doesn’t see the margin:

<head>
    <style>
        body {
            color: red;
            margin: 5px;
        }
    </style>
</head>
<body>
    The red text
    <script>
        alert(document.body.style.color); // empty
        alert(document.body.style.marginTop); // empty
    </script>
</body>

…But what if we need, say, to increase the margin by 20px? We would want the current value of it.

There’s another method for that: getComputedStyle.

The syntax is:

getComputedStyle(element, [pseudo]);
element
Element to read the value for.
pseudo
A pseudo-element if required, for instance ::before. An empty string or no argument means the element itself.

The result is an object with styles, like elem.style, but now with respect to all CSS classes.

For instance:

<head>
    <style>
        body {
            color: red;
            margin: 5px;
        }
    </style>
</head>
<body>
    <script>
        let computedStyle = getComputedStyle(document.body);

        // now we can read the margin and the color from it

        alert(computedStyle.marginTop); // 5px
        alert(computedStyle.color); // rgb(255, 0, 0)
    </script>
</body>

There are two concepts in CSS:

  1. A computed style value is the value after all CSS rules and CSS inheritance is applied, as the result of the CSS cascade. It can look like height:1em or font-size:125%.
  2. A resolved style value is the one finally applied to the element. Values like 1em or 125% are relative. The browser takes the computed value and makes all units fixed and absolute, for instance: height:20px or font-size:16px. For geometry properties resolved values may have a floating point, like width:50.5px.

A long time ago getComputedStyle was created to get computed values, but it turned out that resolved values are much more convenient, and the standard changed.

So nowadays getComputedStyle actually returns the resolved value of the property, usually in px for geometry.

We should always ask for the exact property that we want, like paddingLeft or marginTop or borderTopWidth. Otherwise the correct result is not guaranteed.

For instance, if there are properties paddingLeft/paddingTop, then what should we get for getComputedStyle(elem).padding? Nothing, or maybe a “generated” value from known paddings? There’s no standard rule here.

There are other inconsistencies. As an example, some browsers (Chrome) show 10px in the document below, and some of them (Firefox) – do not:

<style>
    body {
        margin: 10px;
    }
</style>
<script>
    let style = getComputedStyle(document.body);
    alert(style.margin); // empty string in Firefox
</script>

Visited links may be colored using :visited CSS pseudoclass.

But getComputedStyle does not give access to that color, because otherwise an arbitrary page could find out whether the user visited a link by creating it on the page and checking the styles.

JavaScript may not see the styles applied by :visited. And also, there’s a limitation in CSS that forbids applying geometry-changing styles in :visited. That’s to guarantee that there’s no side way for an evil page to test if a link was visited and hence to break the privacy.

Summary

To manage classes, there are two DOM properties:

  • className – the string value, good to manage the whole set of classes.
  • classList – the object with methods add/remove/toggle/contains, good for individual classes.

To change the styles:

  • The style property is an object with camelCased styles. Reading and writing to it has the same meaning as modifying individual properties in the "style" attribute. To see how to apply important and other rare stuff – there’s a list of methods at MDN.

  • The style.cssText property corresponds to the whole "style" attribute, the full string of styles.

To read the resolved styles (with respect to all classes, after all CSS is applied and final values are calculated):

  • The getComputedStyle(elem, [pseudo]) returns the style-like object with them. Read-only.

Element size and scrolling

There are many JavaScript properties that allow us to read information about element width, height and other geometry features.

We often need them when moving or positioning elements in JavaScript.

Sample element

As a sample element to demonstrate properties we’ll use the one given below:

<div id="example">...Text...</div>
<style>
    #example {
        width: 300px;
        height: 200px;
        border: 25px solid #e8c48f;
        padding: 20px;
        overflow: auto;
    }
</style>

It has the border, padding and scrolling. The full set of features. There are no margins, as they are not the part of the element itself, and there are no special properties for them.

The element looks like this:

You can open the document in the sandbox.

The picture above demonstrates the most complex case when the element has a scrollbar. Some browsers (not all) reserve the space for it by taking it from the content (labeled as “content width” above).

So, without scrollbar the content width would be 300px, but if the scrollbar is 16px wide (the width may vary between devices and browsers) then only 300 - 16 = 284px remains, and we should take it into account. That’s why examples from this chapter assume that there’s a scrollbar. Without it, some calculations are simpler.

Usually paddings are shown empty on our illustrations, but if there’s a lot of text in the element and it overflows, then browsers show the “overflowing” text at padding-bottom, that’s normal.

Geometry

Here’s the overall picture with geometry properties:

Values of these properties are technically numbers, but these numbers are “of pixels”, so these are pixel measurements.

Let’s start exploring the properties starting from the outside of the element.

offsetParent, offsetLeft/Top

These properties are rarely needed, but still they are the “most outer” geometry properties, so we’ll start with them.

The offsetParent is the nearest ancestor that the browser uses for calculating coordinates during rendering.

That’s the nearest ancestor that is one of the following:

  1. CSS-positioned (position is absolute, relative, fixed or sticky), or
  2. <td>, <th>, or <table>, or
  3. <body>.

Properties offsetLeft/offsetTop provide x/y coordinates relative to offsetParent upper-left corner.

In the example below the inner <div> has <main> as offsetParent and offsetLeft/offsetTop shifts from its upper-left corner (180):

<main style="position: relative" id="main">
    <article>
        <div id="example" style="position: absolute; left: 180px; top: 180px">
            ...
        </div>
    </article>
</main>
<script>
    alert(example.offsetParent.id); // main
    alert(example.offsetLeft); // 180 (note: a number, not a string "180px")
    alert(example.offsetTop); // 180
</script>

There are several occasions when offsetParent is null:

  1. For not shown elements (display:none or not in the document).
  2. For <body> and <html>.
  3. For elements with position:fixed.

offsetWidth/Height

Now let’s move on to the element itself.

These two properties are the simplest ones. They provide the “outer” width/height of the element. Or, in other words, its full size including borders.

For our sample element:

  • offsetWidth = 390 – the outer width, can be calculated as inner CSS-width (300px) plus paddings (2 * 20px) and borders (2 * 25px).
  • offsetHeight = 290 – the outer height.

Geometry properties are calculated only for displayed elements.

If an element (or any of its ancestors) has display:none or is not in the document, then all geometry properties are zero (or null for offsetParent).

For example, offsetParent is null, and offsetWidth, offsetHeight are 0 when we created an element, but haven’t inserted it into the document yet, or it (or it’s ancestor) has display:none.

We can use this to check if an element is hidden, like this:

function isHidden(elem) {
    return !elem.offsetWidth && !elem.offsetHeight;
}

Please note that such isHidden returns true for elements that are on-screen, but have zero sizes (like an empty <div>).

clientTop/Left

Inside the element we have the borders.

To measure them, there are properties clientTop and clientLeft.

In our example:

  • clientLeft = 25 – left border width
  • clientTop = 25 – top border width

…But to be precise – these properties are not border width/height, but rather relative coordinates of the inner side from the outer side.

What’s the difference?

It becomes visible when the document is right-to-left (the operating system is in Arabic or Hebrew languages). The scrollbar is then not on the right, but on the left, and then clientLeft also includes the scrollbar width.

In that case, clientLeft would be not 25, but with the scrollbar width 25 + 16 = 41.

Here’s the example in hebrew:

clientWidth/Height

These properties provide the size of the area inside the element borders.

They include the content width together with paddings, but without the scrollbar:

On the picture above let’s first consider clientHeight.

There’s no horizontal scrollbar, so it’s exactly the sum of what’s inside the borders: CSS-height 200px plus top and bottom paddings (2 * 20px) total 240px.

Now clientWidth – here the content width is not 300px, but 284px, because 16px are occupied by the scrollbar. So the sum is 284px plus left and right paddings, total 324px.

If there are no paddings, then clientWidth/Height is exactly the content area, inside the borders and the scrollbar (if any).

So when there’s no padding we can use clientWidth/clientHeight to get the content area size.

scrollWidth/Height

These properties are like clientWidth/clientHeight, but they also include the scrolled out (hidden) parts:

On the picture above:

  • scrollHeight = 723 – is the full inner height of the content area including the scrolled out parts.
  • scrollWidth = 324 – is the full inner width, here we have no horizontal scroll, so it equals clientWidth.

We can use these properties to expand the element wide to its full width/height.

Like this:

// expand the element to the full content height
element.style.height = `${element.scrollHeight}px`;
Click the button to expand the element:

<div
    id="element"
    style="width:300px;height:200px; padding: 0;overflow: auto; border:1px solid black;"
>
    text text text text text text text text text text text text text text text
    text text text text text text text text text text text text text text text
    text text text text text text text text text text text text text text text
    text text text text text text text text text text text text text text text
    text text text text text text text text text text text text text text text
    text text text text text text text text text text text text text text text
    text text text text text text text text text text text text text text text
    text text text text text text text text text text text text text text text
    text text text text text text text text text text text text text text text
    text text text text text
</div>

<button
    style="padding:0"
    onclick="element.style.height = `${element.scrollHeight}px`"
>
    element.style.height = `${element.scrollHeight}px`
</button>

scrollLeft/scrollTop

Properties scrollLeft/scrollTop are the width/height of the hidden, scrolled out part of the element.

On the picture below we can see scrollHeight and scrollTop for a block with a vertical scroll.

In other words, scrollTop is “how much is scrolled up”.

Most of the geometry properties here are read-only, but scrollLeft/scrollTop can be changed, and the browser will scroll the element.

If you click the element below, the code `elem.scrollTop += 10` executes. That
makes the element content scroll `10px` down.

<div
    onclick="this.scrollTop+=10"
    style="cursor:pointer;border:1px solid black;width:100px;height:80px;overflow:auto"
>
    Click<br />Me<br />1<br />2<br />3<br />4<br />5<br />6<br />7<br />8<br />9
</div>

Setting scrollTop to 0 or a big value, such as 1e9 will make the element scroll to the very top/bottom respectively.

Don’t take width/height from CSS

We’ve just covered geometry properties of DOM elements, that can be used to get widths, heights and calculate distances.

But as we know from the chapter info:styles-and-classes, we can read CSS-height and width using getComputedStyle.

So why not to read the width of an element with getComputedStyle, like this?

let elem = document.body;

alert(getComputedStyle(elem).width); // show CSS width for elem

Why should we use geometry properties instead? There are two reasons:

  1. First, CSS width/height depend on another property: box-sizing that defines “what is” CSS width and height. A change in box-sizing for CSS purposes may break such JavaScript.

  2. Second, CSS width/height may be auto, for instance for an inline element:

    <span id="elem">Hello!</span>
    
    <script>
        alert(getComputedStyle(elem).width); // auto
    </script>
    

    From the CSS standpoint, width:auto is perfectly normal, but in JavaScript we need an exact size in px that we can use in calculations. So here CSS width is useless.

And there’s one more reason: a scrollbar. Sometimes the code that works fine without a scrollbar becomes buggy with it, because a scrollbar takes the space from the content in some browsers. So the real width available for the content is less than CSS width. And clientWidth/clientHeight take that into account.

…But with getComputedStyle(elem).width the situation is different. Some browsers (e.g. Chrome) return the real inner width, minus the scrollbar, and some of them (e.g. Firefox) – CSS width (ignore the scrollbar). Such cross-browser differences is the reason not to use getComputedStyle, but rather rely on geometry properties.

If your browser reserves the space for a scrollbar (most browsers for Windows
do), then you can test it below. [iframe src="cssWidthScroll" link border=1] The
element with text has CSS `width:300px`. On a Desktop Windows OS, Firefox,
Chrome, Edge all reserve the space for the scrollbar. But Firefox shows `300px`,
while Chrome and Edge show less. That's because Firefox returns the CSS width
and other browsers return the "real" width.

Please note that the described difference is only about reading getComputedStyle(...).width from JavaScript, visually everything is correct.

Summary

Elements have the following geometry properties:

  • offsetParent – is the nearest positioned ancestor or td, th, table, body.
  • offsetLeft/offsetTop – coordinates relative to the upper-left edge of offsetParent.
  • offsetWidth/offsetHeight – “outer” width/height of an element including borders.
  • clientLeft/clientTop – the distances from the upper-left outer corner to the upper-left inner (content + padding) corner. For left-to-right OS they are always the widths of left/top borders. For right-to-left OS the vertical scrollbar is on the left so clientLeft includes its width too.
  • clientWidth/clientHeight – the width/height of the content including paddings, but without the scrollbar.
  • scrollWidth/scrollHeight – the width/height of the content, just like clientWidth/clientHeight, but also include scrolled-out, invisible part of the element.
  • scrollLeft/scrollTop – width/height of the scrolled out upper part of the element, starting from its upper-left corner.

All properties are read-only except scrollLeft/scrollTop that make the browser scroll the element if changed.

Window sizes and scrolling

How do we find the width and height of the browser window? How do we get the full width and height of the document, including the scrolled out part? How do we scroll the page using JavaScript?

For most such requests, we can use the root document element document.documentElement, that corresponds to the <html> tag. But there are additional methods and peculiarities important enough to consider.

Width/height of the window

To get window width and height we can use clientWidth/clientHeight of document.documentElement:

For instance, this button shows the height of your window:

<button onclick="alert(document.documentElement.clientHeight)">
    alert(document.documentElement.clientHeight)
</button>

Browsers also support properties window.innerWidth/innerHeight. They look like what we want. So why not to use them instead?

If there exists a scrollbar, and it occupies some space, clientWidth/clientHeight provide the width/height without it (subtract it). In other words, they return width/height of the visible part of the document, available for the content.

…And window.innerWidth/innerHeight include the scrollbar.

If there’s a scrollbar, and it occupies some space, then these two lines show different values:

alert(window.innerWidth); // full window width
alert(document.documentElement.clientWidth); // window width minus the scrollbar

In most cases we need the available window width: to draw or position something. That is: inside scrollbars if there are any. So we should use documentElement.clientHeight/Width.

Please note: top-level geometry properties may work a little bit differently when there’s no <!DOCTYPE HTML> in HTML. Odd things are possible.

In modern HTML we should always write DOCTYPE.

Width/height of the document

Theoretically, as the root document element is document.documentElement, and it encloses all the content, we could measure document full size as document.documentElement.scrollWidth/scrollHeight.

But on that element, for the whole page, these properties do not work as intended. In Chrome/Safari/Opera if there’s no scroll, then documentElement.scrollHeight may be even less than documentElement.clientHeight! Sounds like a nonsense, weird, right?

To reliably obtain the full document height, we should take the maximum of these properties:

let scrollHeight = Math.max(
    document.body.scrollHeight,
    document.documentElement.scrollHeight,
    document.body.offsetHeight,
    document.documentElement.offsetHeight,
    document.body.clientHeight,
    document.documentElement.clientHeight
);

alert("Full document height, with scrolled out part: " + scrollHeight);

Why so? Better don’t ask. These inconsistencies come from ancient times, not a “smart” logic.

Get the current scroll [#page-scroll]

DOM elements have their current scroll state in elem.scrollLeft/scrollTop.

For document scroll document.documentElement.scrollLeft/Top works in most browsers, except older WebKit-based ones, like Safari (bug 5991), where we should use document.body instead of document.documentElement.

Luckily, we don’t have to remember these peculiarities at all, because the scroll is available in the special properties window.pageXOffset/pageYOffset:

alert("Current scroll from the top: " + window.pageYOffset);
alert("Current scroll from the left: " + window.pageXOffset);

These properties are read-only.

Scrolling: scrollTo, scrollBy, scrollIntoView [#window-scroll]

To scroll the page from JavaScript, its DOM must be fully built.

For instance, if we try to scroll the page from the script in <head>, it won’t work.

Regular elements can be scrolled by changing scrollTop/scrollLeft.

We can do the same for the page using document.documentElement.scrollTop/Left (except Safari, where document.body.scrollTop/Left should be used instead).

Alternatively, there’s a simpler, universal solution: special methods window.scrollBy(x,y) and window.scrollTo(pageX,pageY).

  • The method scrollBy(x,y) scrolls the page relative to its current position. For instance, scrollBy(0,10) scrolls the page 10px down.

    The button below demonstrates this:
    
    <button onclick="window.scrollBy(0,10)">window.scrollBy(0,10)</button>
    
  • The method scrollTo(pageX,pageY) scrolls the page to absolute coordinates, so that the top-left corner of the visible part has coordinates (pageX, pageY) relative to the document’s top-left corner. It’s like setting scrollLeft/scrollTop.

    To scroll to the very beginning, we can use scrollTo(0,0).

    <button onclick="window.scrollTo(0,0)">window.scrollTo(0,0)</button>
    

These methods work for all browsers the same way.

scrollIntoView

For completeness, let’s cover one more method: elem.scrollIntoView(top).

The call to elem.scrollIntoView(top) scrolls the page to make elem visible. It has one argument:

  • if top=true (that’s the default), then the page will be scrolled to make elem appear on the top of the window. The upper edge of the element is aligned with the window top.
  • if top=false, then the page scrolls to make elem appear at the bottom. The bottom edge of the element is aligned with the window bottom.
The button below scrolls the page to make itself show at the window top:

<button onclick="this.scrollIntoView()">this.scrollIntoView()</button>

And this button scrolls the page to show it at the bottom:

<button onclick="this.scrollIntoView(false)">this.scrollIntoView(false)</button>

Forbid the scrolling

Sometimes we need to make the document “unscrollable”. For instance, when we need to cover it with a large message requiring immediate attention, and we want the visitor to interact with that message, not with the document.

To make the document unscrollable, it’s enough to set document.body.style.overflow = "hidden". The page will freeze on its current scroll.

Try it:

<button onclick="document.body.style.overflow = 'hidden'">
    document.body.style.overflow = 'hidden'
</button>

<button onclick="document.body.style.overflow = ''">
    document.body.style.overflow = ''
</button>

The first button freezes the scroll, the second one resumes it.

We can use the same technique to “freeze” the scroll for other elements, not just for document.body.

The drawback of the method is that the scrollbar disappears. If it occupied some space, then that space is now free, and the content “jumps” to fill it.

That looks a bit odd, but can be worked around if we compare clientWidth before and after the freeze, and if it increased (the scrollbar disappeared) then add padding to document.body in place of the scrollbar, to keep the content width the same.

Summary

Geometry:

  • Width/height of the visible part of the document (content area width/height): document.documentElement.clientWidth/Height

  • Width/height of the whole document, with the scrolled out part:

    let scrollHeight = Math.max(
        document.body.scrollHeight,
        document.documentElement.scrollHeight,
        document.body.offsetHeight,
        document.documentElement.offsetHeight,
        document.body.clientHeight,
        document.documentElement.clientHeight
    );
    

Scrolling:

  • Read the current scroll: window.pageYOffset/pageXOffset.

  • Change the current scroll:

    • window.scrollTo(pageX,pageY) – absolute coordinates,
    • window.scrollBy(x,y) – scroll relative the current place,
    • elem.scrollIntoView(top) – scroll to make elem visible (align with the top/bottom of the window).

Coordinates

To move elements around we should be familiar with coordinates.

Most JavaScript methods deal with one of two coordinate systems:

  1. Relative to the window - similar to position:fixed, calculated from the window top/left edge.
    • we’ll denote these coordinates as clientX/clientY, the reasoning for such name will become clear later, when we study event properties.
  2. Relative to the document - similar to position:absolute in the document root, calculated from the document top/left edge.
    • we’ll denote them pageX/pageY.

When the page is scrolled to the very beginning, so that the top/left corner of the window is exactly the document top/left corner, these coordinates equal each other. But after the document shifts, window-relative coordinates of elements change, as elements move across the window, while document-relative coordinates remain the same.

On this picture we take a point in the document and demonstrate its coordinates before the scroll (left) and after it (right):

When the document scrolled:

  • pageY - document-relative coordinate stayed the same, it’s counted from the document top (now scrolled out).
  • clientY - window-relative coordinate did change (the arrow became shorter), as the same point became closer to window top.

Element coordinates: getBoundingClientRect

The method elem.getBoundingClientRect() returns window coordinates for a minimal rectangle that encloses elem as an object of built-in DOMRect class.

Main DOMRect properties:

  • x/y – X/Y-coordinates of the rectangle origin relative to window,
  • width/height – width/height of the rectangle (can be negative).

Additionally, there are derived properties:

  • top/bottom – Y-coordinate for the top/bottom rectangle edge,
  • left/right – X-coordinate for the left/right rectangle edge.
For instance click this button to see its window coordinates:

<p>
    <input
        id="brTest"
        type="button"
        value="Get coordinates using button.getBoundingClientRect() for this button"
        onclick="showRect(this)"
    />
</p>

<script>
    function showRect(elem) {
        let r = elem.getBoundingClientRect();
        alert(`x:${r.x}
y:${r.y}
width:${r.width}
height:${r.height}
top:${r.top}
bottom:${r.bottom}
left:${r.left}
right:${r.right}
`);
    }
</script>

If you scroll the page and repeat, you'll notice that as window-relative button
position changes, its window coordinates (`y/top/bottom` if you scroll
vertically) change as well.

Here’s the picture of elem.getBoundingClientRect() output:

As you can see, x/y and width/height fully describe the rectangle. Derived properties can be easily calculated from them:

  • left = x
  • top = y
  • right = x + width
  • bottom = y + height

Please note:

  • Coordinates may be decimal fractions, such as 10.5. That’s normal, internally browser uses fractions in calculations. We don’t have to round them when setting to style.left/top.
  • Coordinates may be negative. For instance, if the page is scrolled so that elem is now above the window, then elem.getBoundingClientRect().top is negative.

Mathematically, a rectangle is uniquely defined with its starting point (x,y) and the direction vector (width,height). So the additional derived properties are for convenience.

Technically it’s possible for width/height to be negative, that allows for “directed” rectangle, e.g. to represent mouse selection with properly marked start and end.

Negative width/height values mean that the rectangle starts at its bottom-right corner and then “grows” left-upwards.

Here’s a rectangle with negative width and height (e.g. width=-200, height=-100):

As you can see, left/top do not equal x/y in such case.

In practice though, elem.getBoundingClientRect() always returns positive width/height, here we mention negative width/height only for you to understand why these seemingly duplicate properties are not actually duplicates.

Internet Explorer and Edge don’t support x/y properties for historical reasons.

So we can either make a polyfill (add getters in DomRect.prototype) or just use top/left, as they are always the same as x/y for positive width/height, in particular in the result of elem.getBoundingClientRect().

There are obvious similarities between window-relative coordinates and CSS position:fixed.

But in CSS positioning, right property means the distance from the right edge, and bottom property means the distance from the bottom edge.

If we just look at the picture above, we can see that in JavaScript it is not so. All window coordinates are counted from the top-left corner, including these ones.

elementFromPoint(x, y) [#elementFromPoint]

The call to document.elementFromPoint(x, y) returns the most nested element at window coordinates (x, y).

The syntax is:

let elem = document.elementFromPoint(x, y);

For instance, the code below highlights and outputs the tag of the element that is now in the middle of the window:

let centerX = document.documentElement.clientWidth / 2;
let centerY = document.documentElement.clientHeight / 2;

let elem = document.elementFromPoint(centerX, centerY);

elem.style.background = "red";
alert(elem.tagName);

As it uses window coordinates, the element may be different depending on the current scroll position.

The method document.elementFromPoint(x,y) only works if (x,y) are inside the visible area.

If any of the coordinates is negative or exceeds the window width/height, then it returns null.

Here’s a typical error that may occur if we don’t check for it:

let elem = document.elementFromPoint(x, y);
// if the coordinates happen to be out of the window, then elem = null

elem.style.background = ""; // Error!

Using for “fixed” positioning

Most of time we need coordinates in order to position something.

To show something near an element, we can use getBoundingClientRect to get its coordinates, and then CSS position together with left/top (or right/bottom).

For instance, the function createMessageUnder(elem, html) below shows the message under elem:

let elem = document.getElementById("coords-show-mark");

function createMessageUnder(elem, html) {
    // create message element
    let message = document.createElement("div");
    // better to use a css class for the style here
    message.style.cssText = "position:fixed; color: red";

    // assign coordinates, don't forget "px"!
    let coords = elem.getBoundingClientRect();

    message.style.left = coords.left + "px";
    message.style.top = coords.bottom + "px";

    message.innerHTML = html;

    return message;
}

// Usage:
// add it for 5 seconds in the document
let message = createMessageUnder(elem, "Hello, world!");
document.body.append(message);
setTimeout(() => message.remove(), 5000);
Click the button to run it:

<button id="coords-show-mark">
    Button with id="coords-show-mark", the message will appear under it
</button>

The code can be modified to show the message at the left, right, below, apply CSS animations to “fade it in” and so on. That’s easy, as we have all the coordinates and sizes of the element.

But note the important detail: when the page is scrolled, the message flows away from the button.

The reason is obvious: the message element relies on position:fixed, so it remains at the same place of the window while the page scrolls away.

To change that, we need to use document-based coordinates and position:absolute.

Document coordinates [#getCoords]

Document-relative coordinates start from the upper-left corner of the document, not the window.

In CSS, window coordinates correspond to position:fixed, while document coordinates are similar to position:absolute on top.

We can use position:absolute and top/left to put something at a certain place of the document, so that it remains there during a page scroll. But we need the right coordinates first.

There’s no standard method to get the document coordinates of an element. But it’s easy to write it.

The two coordinate systems are connected by the formula:

  • pageY = clientY + height of the scrolled-out vertical part of the document.
  • pageX = clientX + width of the scrolled-out horizontal part of the document.

The function getCoords(elem) will take window coordinates from elem.getBoundingClientRect() and add the current scroll to them:

// get document coordinates of the element
function getCoords(elem) {
    let box = elem.getBoundingClientRect();

    return {
        top: box.top + window.pageYOffset,
        right: box.right + window.pageXOffset,
        bottom: box.bottom + window.pageYOffset,
        left: box.left + window.pageXOffset,
    };
}

If in the example above we used it with position:absolute, then the message would stay near the element on scroll.

The modified createMessageUnder function:

function createMessageUnder(elem, html) {
    let message = document.createElement("div");
    message.style.cssText = "position:absolute; color: red";

    let coords = getCoords(elem);

    message.style.left = coords.left + "px";
    message.style.top = coords.bottom + "px";

    message.innerHTML = html;

    return message;
}

Summary

Any point on the page has coordinates:

  1. Relative to the window – elem.getBoundingClientRect().
  2. Relative to the document – elem.getBoundingClientRect() plus the current page scroll.

Window coordinates are great to use with position:fixed, and document coordinates do well with position:absolute.

Both coordinate systems have their pros and cons; there are times we need one or the other one, just like CSS position absolute and fixed.

Introduction to browser events

An event is a signal that something has happened. All DOM nodes generate such signals (but events are not limited to DOM).

Here’s a list of the most useful DOM events, just to take a look at:

Mouse events:

  • click – when the mouse clicks on an element (touchscreen devices generate it on a tap).
  • contextmenu – when the mouse right-clicks on an element.
  • mouseover / mouseout – when the mouse cursor comes over / leaves an element.
  • mousedown / mouseup – when the mouse button is pressed / released over an element.
  • mousemove – when the mouse is moved.

Keyboard events:

  • keydown and keyup – when a keyboard key is pressed and released.

Form element events:

  • submit – when the visitor submits a <form>.
  • focus – when the visitor focuses on an element, e.g. on an <input>.

Document events:

  • DOMContentLoaded – when the HTML is loaded and processed, DOM is fully built.

CSS events:

  • transitionend – when a CSS-animation finishes.

There are many other events. We’ll get into more details of particular events in next chapters.

Event handlers

To react on events we can assign a handler – a function that runs in case of an event.

Handlers are a way to run JavaScript code in case of user actions.

There are several ways to assign a handler. Let’s see them, starting from the simplest one.

HTML-attribute

A handler can be set in HTML with an attribute named on<event>.

For instance, to assign a click handler for an input, we can use onclick, like here:

<input value="Click me" onclick="alert('Click!')" type="button" />

On mouse click, the code inside onclick runs.

Please note that inside onclick we use single quotes, because the attribute itself is in double quotes. If we forget that the code is inside the attribute and use double quotes inside, like this: onclick="alert("Click!")", then it won’t work right.

An HTML-attribute is not a convenient place to write a lot of code, so we’d better create a JavaScript function and call it there.

Here a click runs the function countRabbits():

<script>
    function countRabbits() {
        for (let i = 1; i <= 3; i++) {
            alert("Rabbit number " + i);
        }
    }
</script>

<input type="button" onclick="countRabbits()" value="Count rabbits!" />

As we know, HTML attribute names are not case-sensitive, so ONCLICK works as well as onClick and onCLICK… But usually attributes are lowercased: onclick.

DOM property

We can assign a handler using a DOM property on<event>.

For instance, elem.onclick:

<input id="elem" type="button" value="Click me" />
<script>
    elem.onclick = function () {
        alert("Thank you");
    };
</script>

If the handler is assigned using an HTML-attribute then the browser reads it, creates a new function from the attribute content and writes it to the DOM property.

So this way is actually the same as the previous one.

These two code pieces work the same:

  1. Only HTML:

    <input type="button" onclick="alert('Click!')" value="Button" />
    
  2. HTML + JS:

    <input type="button" id="button" value="Button" />
    <script>
        button.onclick = function () {
            alert("Click!");
        };
    </script>
    

In the first example, the HTML attribute is used to initialize the button.onclick, while in the second example – the script, that’s all the difference.

As there’s only one onclick property, we can’t assign more than one event handler.

In the example below adding a handler with JavaScript overwrites the existing handler:

<input type="button" id="elem" onclick="alert('Before')" value="Click me" />
<script>
    elem.onclick = function () {
        // overwrites the existing handler
        alert("After"); // only this will be shown
    };
</script>

To remove a handler – assign elem.onclick = null.

Accessing the element: this

The value of this inside a handler is the element. The one which has the handler on it.

In the code below button shows its contents using this.innerHTML:

<button onclick="alert(this.innerHTML)">Click me</button>

Possible mistakes

If you’re starting to work with events – please note some subtleties.

We can set an existing function as a handler:

function sayThanks() {
    alert("Thanks!");
}

elem.onclick = sayThanks;

But be careful: the function should be assigned as sayThanks, not sayThanks().

// right
button.onclick = sayThanks;

// wrong
button.onclick = sayThanks();

If we add parentheses, then sayThanks() becomes is a function call. So the last line actually takes the result of the function execution, that is undefined (as the function returns nothing), and assigns it to onclick. That doesn’t work.

…On the other hand, in the markup we do need the parentheses:

<input type="button" id="button" onclick="sayThanks()" />

The difference is easy to explain. When the browser reads the attribute, it creates a handler function with body from the attribute content.

So the markup generates this property:

button.onclick = function () {
    sayThanks(); // <-- the attribute content goes here
};

Don’t use setAttribute for handlers.

Such a call won’t work:

// a click on <body> will generate errors,
// because attributes are always strings, function becomes a string
document.body.setAttribute("onclick", function () {
    alert(1);
});

DOM-property case matters.

Assign a handler to elem.onclick, not elem.ONCLICK, because DOM properties are case-sensitive.

addEventListener

The fundamental problem of the aforementioned ways to assign handlers – we can’t assign multiple handlers to one event.

Let’s say, one part of our code wants to highlight a button on click, and another one wants to show a message on the same click.

We’d like to assign two event handlers for that. But a new DOM property will overwrite the existing one:

input.onclick = function () {
    alert(1);
};
// ...
input.onclick = function () {
    alert(2);
}; // replaces the previous handler

Developers of web standards understood that long ago and suggested an alternative way of managing handlers using special methods addEventListener and removeEventListener. They are free of such a problem.

The syntax to add a handler:

element.addEventListener(event, handler, [options]);
event
Event name, e.g. "click".
handler
The handler function.
options
An additional optional object with properties: - once: if true, then the listener is automatically removed after it triggers. - capture: the phase where to handle the event, to be covered later in the chapter info:bubbling-and-capturing. For historical reasons, options can also be false/true, that’s the same as {capture: false/true}. - passive: if true, then the handler will not call preventDefault(), we’ll explain that later in info:default-browser-action.

To remove the handler, use removeEventListener:

element.removeEventListener(event, handler, [options]);

To remove a handler we should pass exactly the same function as was assigned.

That doesn’t work:

elem.addEventListener("click", () => alert("Thanks!"));
// ....
elem.removeEventListener("click", () => alert("Thanks!"));

The handler won’t be removed, because removeEventListener gets another function – with the same code, but that doesn’t matter, as it’s a different function object.

Here’s the right way:

function handler() {
    alert("Thanks!");
}

input.addEventListener("click", handler);
// ....
input.removeEventListener("click", handler);

Please note – if we don’t store the function in a variable, then we can’t remove it. There’s no way to “read back” handlers assigned by addEventListener.

Multiple calls to addEventListener allow to add multiple handlers, like this:

<input id="elem" type="button" value="Click me" />

<script>
    function handler1() {
        alert("Thanks!");
    }

    function handler2() {
        alert("Thanks again!");
    }

    elem.onclick = () => alert("Hello");
    elem.addEventListener("click", handler1); // Thanks!
    elem.addEventListener("click", handler2); // Thanks again!
</script>

As we can see in the example above, we can set handlers both using a DOM-property and addEventListener. But generally we use only one of these ways.

There exist events that can’t be assigned via a DOM-property. Only with addEventListener.

For instance, the DOMContentLoaded event, that triggers when the document is loaded and DOM is built.

// will never run
document.onDOMContentLoaded = function () {
    alert("DOM built");
};
// this way it works
document.addEventListener("DOMContentLoaded", function () {
    alert("DOM built");
});

So addEventListener is more universal. Although, such events are an exception rather than the rule.

Event object

To properly handle an event we’d want to know more about what’s happened. Not just a “click” or a “keydown”, but what were the pointer coordinates? Which key was pressed? And so on.

When an event happens, the browser creates an event object, puts details into it and passes it as an argument to the handler.

Here’s an example of getting pointer coordinates from the event object:

<input type="button" value="Click me" id="elem" />

<script>
    elem.onclick = function (event) {
        // show event type, element and coordinates of the click
        alert(event.type + " at " + event.currentTarget);
        alert("Coordinates: " + event.clientX + ":" + event.clientY);
    };
</script>

Some properties of event object:

event.type
Event type, here it’s "click".
event.currentTarget
Element that handled the event. That’s exactly the same as this, unless the handler is an arrow function, or its this is bound to something else, then we can get the element from event.currentTarget.
event.clientX / event.clientY
Window-relative coordinates of the cursor, for pointer events.

There are more properties. Many of them depend on the event type: keyboard events have one set of properties, pointer events - another one, we’ll study them later when we come to different events in details.

If we assign a handler in HTML, we can also use the event object, like this:

<input type="button" onclick="alert(event.type)" value="Event type" />

That’s possible because when the browser reads the attribute, it creates a handler like this: function(event) { alert(event.type) }. That is: its first argument is called "event", and the body is taken from the attribute.

Object handlers: handleEvent

We can assign not just a function, but an object as an event handler using addEventListener. When an event occurs, its handleEvent method is called.

For instance:

<button id="elem">Click me</button>

<script>
    let obj = {
        handleEvent(event) {
            alert(event.type + " at " + event.currentTarget);
        },
    };

    elem.addEventListener("click", obj);
</script>

As we can see, when addEventListener receives an object as the handler, it calls obj.handleEvent(event) in case of an event.

We could also use a class for that:

<button id="elem">Click me</button>

<script>
    class Menu {
        handleEvent(event) {
            switch (event.type) {
                case "mousedown":
                    elem.innerHTML = "Mouse button pressed";
                    break;
                case "mouseup":
                    elem.innerHTML += "...and released.";
                    break;
            }
        }
    }

    let menu = new Menu();
    elem.addEventListener("mousedown", menu);
    elem.addEventListener("mouseup", menu);
</script>

Here the same object handles both events. Please note that we need to explicitly setup the events to listen using addEventListener. The menu object only gets mousedown and mouseup here, not any other types of events.

The method handleEvent does not have to do all the job by itself. It can call other event-specific methods instead, like this:

<button id="elem">Click me</button>

<script>
    class Menu {
        handleEvent(event) {
            // mousedown -> onMousedown
            let method =
                "on" + event.type[0].toUpperCase() + event.type.slice(1);
            this[method](event);
        }

        onMousedown() {
            elem.innerHTML = "Mouse button pressed";
        }

        onMouseup() {
            elem.innerHTML += "...and released.";
        }
    }

    let menu = new Menu();
    elem.addEventListener("mousedown", menu);
    elem.addEventListener("mouseup", menu);
</script>

Now event handlers are clearly separated, that may be easier to support.

Summary

There are 3 ways to assign event handlers:

  1. HTML attribute: onclick="...".
  2. DOM property: elem.onclick = function.
  3. Methods: elem.addEventListener(event, handler[, phase]) to add, removeEventListener to remove.

HTML attributes are used sparingly, because JavaScript in the middle of an HTML tag looks a little bit odd and alien. Also can’t write lots of code in there.

DOM properties are ok to use, but we can’t assign more than one handler of the particular event. In many cases that limitation is not pressing.

The last way is the most flexible, but it is also the longest to write. There are few events that only work with it, for instance transitionend and DOMContentLoaded (to be covered). Also addEventListener supports objects as event handlers. In that case the method handleEvent is called in case of the event.

No matter how you assign the handler – it gets an event object as the first argument. That object contains the details about what’s happened.

We’ll learn more about events in general and about different types of events in the next chapters.

Bubbling and capturing

Let’s start with an example.

This handler is assigned to <div>, but also runs if you click any nested tag like <em> or <code>:

<div onclick="alert('The handler!')">
    <em
        >If you click on <code>EM</code>, the handler on
        <code>DIV</code> runs.</em
    >
</div>

Isn’t it a bit strange? Why does the handler on <div> run if the actual click was on <em>?

Bubbling

The bubbling principle is simple.

When an event happens on an element, it first runs the handlers on it, then on its parent, then all the way up on other ancestors.

Let’s say we have 3 nested elements FORM > DIV > P with a handler on each of them:

<style>
    body * {
        margin: 10px;
        border: 1px solid blue;
    }
</style>

<form onclick="alert('form')">
    FORM
    <div onclick="alert('div')">
        DIV
        <p onclick="alert('p')">P</p>
    </div>
</form>

A click on the inner <p> first runs onclick:

  1. On that <p>.
  2. Then on the outer <div>.
  3. Then on the outer <form>.
  4. And so on upwards till the document object.

So if we click on <p>, then we’ll see 3 alerts: p -> div -> form.

The process is called “bubbling”, because events “bubble” from the inner element up through parents like a bubble in the water.

The key word in this phrase is “almost”.

For instance, a focus event does not bubble. There are other examples too, we’ll meet them. But still it’s an exception, rather than a rule, most events do bubble.

event.target

A handler on a parent element can always get the details about where it actually happened.

The most deeply nested element that caused the event is called a target element, accessible as event.target.

Note the differences from this (=event.currentTarget):

  • event.target – is the “target” element that initiated the event, it doesn’t change through the bubbling process.
  • this – is the “current” element, the one that has a currently running handler on it.

For instance, if we have a single handler form.onclick, then it can “catch” all clicks inside the form. No matter where the click happened, it bubbles up to <form> and runs the handler.

In form.onclick handler:

  • this (=event.currentTarget) is the <form> element, because the handler runs on it.
  • event.target is the actual element inside the form that was clicked.

Check it out:

[codetabs height=220 src=“bubble-target”]

It’s possible that event.target could equal this – it happens when the click is made directly on the <form> element.

Stopping bubbling

A bubbling event goes from the target element straight up. Normally it goes upwards till <html>, and then to document object, and some events even reach window, calling all handlers on the path.

But any handler may decide that the event has been fully processed and stop the bubbling.

The method for it is event.stopPropagation().

For instance, here body.onclick doesn’t work if you click on <button>:

<body onclick="alert(`the bubbling doesn't reach here`)">
    <button onclick="event.stopPropagation()">Click me</button>
</body>

If an element has multiple event handlers on a single event, then even if one of them stops the bubbling, the other ones still execute.

In other words, event.stopPropagation() stops the move upwards, but on the current element all other handlers will run.

To stop the bubbling and prevent handlers on the current element from running, there’s a method event.stopImmediatePropagation(). After it no other handlers execute.

Bubbling is convenient. Don’t stop it without a real need: obvious and architecturally well thought out.

Sometimes event.stopPropagation() creates hidden pitfalls that later may become problems.

For instance:

  1. We create a nested menu. Each submenu handles clicks on its elements and calls stopPropagation so that the outer menu won’t trigger.
  2. Later we decide to catch clicks on the whole window, to track users’ behavior (where people click). Some analytic systems do that. Usually the code uses document.addEventListener('click'…) to catch all clicks.
  3. Our analytic won’t work over the area where clicks are stopped by stopPropagation. Sadly, we’ve got a “dead zone”.

There’s usually no real need to prevent the bubbling. A task that seemingly requires that may be solved by other means. One of them is to use custom events, we’ll cover them later. Also we can write our data into the event object in one handler and read it in another one, so we can pass to handlers on parents information about the processing below.

Capturing

There’s another phase of event processing called “capturing”. It is rarely used in real code, but sometimes can be useful.

The standard DOM Events describes 3 phases of event propagation:

  1. Capturing phase – the event goes down to the element.
  2. Target phase – the event reached the target element.
  3. Bubbling phase – the event bubbles up from the element.

Here’s the picture of a click on <td> inside a table, taken from the specification:

That is: for a click on <td> the event first goes through the ancestors chain down to the element (capturing phase), then it reaches the target and triggers there (target phase), and then it goes up (bubbling phase), calling handlers on its way.

Before we only talked about bubbling, because the capturing phase is rarely used. Normally it is invisible to us.

Handlers added using on<event>-property or using HTML attributes or using two-argument addEventListener(event, handler) don’t know anything about capturing, they only run on the 2nd and 3rd phases.

To catch an event on the capturing phase, we need to set the handler capture option to true:

elem.addEventListener(..., {capture: true})
// or, just "true" is an alias to {capture: true}
elem.addEventListener(..., true)

There are two possible values of the capture option:

  • If it’s false (default), then the handler is set on the bubbling phase.
  • If it’s true, then the handler is set on the capturing phase.

Note that while formally there are 3 phases, the 2nd phase (“target phase”: the event reached the element) is not handled separately: handlers on both capturing and bubbling phases trigger at that phase.

Let’s see both capturing and bubbling in action:

<style>
    body * {
        margin: 10px;
        border: 1px solid blue;
    }
</style>

<form>
    FORM
    <div>
        DIV
        <p>P</p>
    </div>
</form>

<script>
    for (let elem of document.querySelectorAll("*")) {
        elem.addEventListener(
            "click",
            (e) => alert(`Capturing: ${elem.tagName}`),
            true
        );
        elem.addEventListener("click", (e) =>
            alert(`Bubbling: ${elem.tagName}`)
        );
    }
</script>

The code sets click handlers on every element in the document to see which ones are working.

If you click on <p>, then the sequence is:

  1. HTML -> BODY -> FORM -> DIV (capturing phase, the first listener):
  2. P (target phase, triggers two times, as we’ve set two listeners: capturing and bubbling)
  3. DIV -> FORM -> BODY -> HTML (bubbling phase, the second listener).

There’s a property event.eventPhase that tells us the number of the phase on which the event was caught. But it’s rarely used, because we usually know it in the handler.

If we addEventListener(..., true), then we should mention the same phase in removeEventListener(..., true) to correctly remove the handler.

If we have multiple event handlers on the same phase, assigned to the same element with addEventListener, they run in the same order as they are created:

elem.addEventListener("click", (e) => alert(1)); // guaranteed to trigger first
elem.addEventListener("click", (e) => alert(2));

Summary

When an event happens – the most nested element where it happens gets labeled as the “target element” (event.target).

  • Then the event moves down from the document root to event.target, calling handlers assigned with addEventListener(..., true) on the way (true is a shorthand for {capture: true}).
  • Then handlers are called on the target element itself.
  • Then the event bubbles up from event.target up to the root, calling handlers assigned using on<event> and addEventListener without the 3rd argument or with the 3rd argument false/{capture:false}.

Each handler can access event object properties:

  • event.target – the deepest element that originated the event.
  • event.currentTarget (=this) – the current element that handles the event (the one that has the handler on it)
  • event.eventPhase – the current phase (capturing=1, target=2, bubbling=3).

Any event handler can stop the event by calling event.stopPropagation(), but that’s not recommended, because we can’t really be sure we won’t need it above, maybe for completely different things.

The capturing phase is used very rarely, usually we handle events on bubbling. And there’s a logic behind that.

In real world, when an accident happens, local authorities react first. They know best the area where it happened. Then higher-level authorities if needed.

The same for event handlers. The code that set the handler on a particular element knows maximum details about the element and what it does. A handler on a particular <td> may be suited for that exactly <td>, it knows everything about it, so it should get the chance first. Then its immediate parent also knows about the context, but a little bit less, and so on till the very top element that handles general concepts and runs the last.

Bubbling and capturing lay the foundation for “event delegation” – an extremely powerful event handling pattern that we study in the next chapter.

Event delegation

Capturing and bubbling allow us to implement one of most powerful event handling patterns called event delegation.

The idea is that if we have a lot of elements handled in a similar way, then instead of assigning a handler to each of them – we put a single handler on their common ancestor.

In the handler we get event.target, see where the event actually happened and handle it.

Let’s see an example – the Ba-Gua diagram reflecting the ancient Chinese philosophy.

Here it is:

[iframe height=350 src=“bagua” edit link]

The HTML is like this:

<table>
    <tr>
        <th colspan="3">
            <em>Bagua</em> Chart: Direction, Element, Color, Meaning
        </th>
    </tr>
    <tr>
        <td class="nw">
            <strong>Northwest</strong><br />Metal<br />Silver<br />Elders
        </td>
        <td class="n">...</td>
        <td class="ne">...</td>
    </tr>
    <tr>
        ...2 more lines of this kind...
    </tr>
    <tr>
        ...2 more lines of this kind...
    </tr>
</table>

The table has 9 cells, but there could be 99 or 9999, doesn’t matter.

Our task is to highlight a cell <td> on click.

Instead of assign an onclick handler to each <td> (can be many) – we’ll setup the “catch-all” handler on <table> element.

It will use event.target to get the clicked element and highlight it.

The code:

let selectedTd;

table.onclick = function (event) {
    let target = event.target; // where was the click?

    if (target.tagName != "TD") return; // not on TD? Then we're not interested

    highlight(target); // highlight it
};

function highlight(td) {
    if (selectedTd) {
        // remove the existing highlight if any
        selectedTd.classList.remove("highlight");
    }
    selectedTd = td;
    selectedTd.classList.add("highlight"); // highlight the new td
}

Such a code doesn’t care how many cells there are in the table. We can add/remove <td> dynamically at any time and the highlighting will still work.

Still, there’s a drawback.

The click may occur not on the <td>, but inside it.

In our case if we take a look inside the HTML, we can see nested tags inside <td>, like <strong>:

<td>
    <strong>Northwest</strong>

    ...
</td>

Naturally, if a click happens on that <strong> then it becomes the value of event.target.

In the handler table.onclick we should take such event.target and find out whether the click was inside <td> or not.

Here’s the improved code:

table.onclick = function (event) {
    let td = event.target.closest("td"); // (1)

    if (!td) return; // (2)

    if (!table.contains(td)) return; // (3)

    highlight(td); // (4)
};

Explanations:

  1. The method elem.closest(selector) returns the nearest ancestor that matches the selector. In our case we look for <td> on the way up from the source element.
  2. If event.target is not inside any <td>, then the call returns immediately, as there’s nothing to do.
  3. In case of nested tables, event.target may be a <td>, but lying outside of the current table. So we check if that’s actually our table’s <td>.
  4. And, if it’s so, then highlight it.

As the result, we have a fast, efficient highlighting code, that doesn’t care about the total number of <td> in the table.

Delegation example: actions in markup

There are other uses for event delegation.

Let’s say, we want to make a menu with buttons “Save”, “Load”, “Search” and so on. And there’s an object with methods save, load, search… How to match them?

The first idea may be to assign a separate handler to each button. But there’s a more elegant solution. We can add a handler for the whole menu and data-action attributes for buttons that has the method to call:

<button data-action="save">Click to Save</button>

The handler reads the attribute and executes the method. Take a look at the working example:

<div id="menu">
    <button data-action="save">Save</button>
    <button data-action="load">Load</button>
    <button data-action="search">Search</button>
</div>

<script>
    class Menu {
        constructor(elem) {
            this._elem = elem;
            elem.onclick = this.onClick.bind(this); // (*)
        }

        save() {
            alert("saving");
        }

        load() {
            alert("loading");
        }

        search() {
            alert("searching");
        }

        onClick(event) {
            let action = event.target.dataset.action;
            if (action) {
                this[action]();
            }
        }
    }

    new Menu(menu);
</script>

Please note that this.onClick is bound to this in (*). That’s important, because otherwise this inside it would reference the DOM element (elem), not the Menu object, and this[action] would not be what we need.

So, what advantages does delegation give us here?

  • We don’t need to write the code to assign a handler to each button. Just make a method and put it in the markup.
  • The HTML structure is flexible, we can add/remove buttons at any time.

We could also use classes .action-save, .action-load, but an attribute data-action is better semantically. And we can use it in CSS rules too.

The “behavior” pattern

We can also use event delegation to add “behaviors” to elements declaratively, with special attributes and classes.

The pattern has two parts:

  1. We add a custom attribute to an element that describes its behavior.
  2. A document-wide handler tracks events, and if an event happens on an attributed element – performs the action.

Behavior: Counter

For instance, here the attribute data-counter adds a behavior: “increase value on click” to buttons:

Counter: <input type="button" value="1" data-counter /> One more counter:
<input type="button" value="2" data-counter />

<script>
    document.addEventListener("click", function (event) {
        if (event.target.dataset.counter != undefined) {
            // if the attribute exists...
            event.target.value++;
        }
    });
</script>

If we click a button – its value is increased. Not buttons, but the general approach is important here.

There can be as many attributes with data-counter as we want. We can add new ones to HTML at any moment. Using the event delegation we “extended” HTML, added an attribute that describes a new behavior.

When we assign an event handler to the document object, we should always use addEventListener, not document.on<event>, because the latter will cause conflicts: new handlers overwrite old ones.

For real projects it’s normal that there are many handlers on document set by different parts of the code.

Behavior: Toggler

One more example of behavior. A click on an element with the attribute data-toggle-id will show/hide the element with the given id:

<button data-toggle-id="subscribe-mail">Show the subscription form</button>

<form id="subscribe-mail" hidden>Your mail: <input type="email" /></form>

<script>
    document.addEventListener("click", function (event) {
        let id = event.target.dataset.toggleId;
        if (!id) return;

        let elem = document.getElementById(id);

        elem.hidden = !elem.hidden;
    });
</script>

Let’s note once again what we did. Now, to add toggling functionality to an element – there’s no need to know JavaScript, just use the attribute data-toggle-id.

That may become really convenient – no need to write JavaScript for every such element. Just use the behavior. The document-level handler makes it work for any element of the page.

We can combine multiple behaviors on a single element as well.

The “behavior” pattern can be an alternative to mini-fragments of JavaScript.

Summary

Event delegation is really cool! It’s one of the most helpful patterns for DOM events.

It’s often used to add the same handling for many similar elements, but not only for that.

The algorithm:

  1. Put a single handler on the container.
  2. In the handler – check the source element event.target.
  3. If the event happened inside an element that interests us, then handle the event.

Benefits:

  • Simplifies initialization and saves memory: no need to add many handlers.
  • Less code: when adding or removing elements, no need to add/remove handlers.
  • DOM modifications: we can mass add/remove elements with innerHTML and the like.

The delegation has its limitations of course:

  • First, the event must be bubbling. Some events do not bubble. Also, low-level handlers should not use event.stopPropagation().
  • Second, the delegation may add CPU load, because the container-level handler reacts on events in any place of the container, no matter whether they interest us or not. But usually the load is negligible, so we don’t take it into account.

Browser default actions

Many events automatically lead to certain actions performed by the browser.

For instance:

  • A click on a link - initiates navigation to its URL.
  • A click on a form submit button - initiates its submission to the server.
  • Pressing a mouse button over a text and moving it - selects the text.

If we handle an event in JavaScript, we may not want the corresponding browser action to happen, and want to implement another behavior instead.

Preventing browser actions

There are two ways to tell the browser we don’t want it to act:

  • The main way is to use the event object. There’s a method event.preventDefault().
  • If the handler is assigned using on<event> (not by addEventListener), then returning false also works the same.

In this HTML a click on a link doesn’t lead to navigation, browser doesn’t do anything:

<a href="/" onclick="return false">Click here</a>
or
<a href="/" onclick="event.preventDefault()">here</a>

In the next example we’ll use this technique to create a JavaScript-powered menu.

The value returned by an event handler is usually ignored.

The only exception is return false from a handler assigned using on<event>.

In all other cases, return value is ignored. In particular, there’s no sense in returning true.

Example: the menu

Consider a site menu, like this:

<ul id="menu" class="menu">
    <li><a href="/html">HTML</a></li>
    <li><a href="/javascript">JavaScript</a></li>
    <li><a href="/css">CSS</a></li>
</ul>

Here’s how it looks with some CSS:

[iframe height=70 src=“menu” link edit]

Menu items are implemented as HTML-links <a>, not buttons <button>. There are several reasons to do so, for instance:

  • Many people like to use “right click” – “open in a new window”. If we use <button> or <span>, that doesn’t work.
  • Search engines follow <a href="..."> links while indexing.

So we use <a> in the markup. But normally we intend to handle clicks in JavaScript. So we should prevent the default browser action.

Like here:

menu.onclick = function (event) {
    if (event.target.nodeName != "A") return;

    let href = event.target.getAttribute("href");
    alert(href); // ...can be loading from the server, UI generation etc

    return false; // prevent browser action (don't go to the URL)
};

If we omit return false, then after our code executes the browser will do its “default action” – navigating to the URL in href. And we don’t need that here, as we’re handling the click by ourselves.

By the way, using event delegation here makes our menu very flexible. We can add nested lists and style them using CSS to “slide down”.

Certain events flow one into another. If we prevent the first event, there will be no second.

For instance, mousedown on an <input> field leads to focusing in it, and the focus event. If we prevent the mousedown event, there’s no focus.

Try to click on the first <input> below – the focus event happens. But if you click the second one, there’s no focus.

<input value="Focus works" onfocus="this.value=''" />
<input onmousedown="return false" onfocus="this.value=''" value="Click me" />

That’s because the browser action is canceled on mousedown. The focusing is still possible if we use another way to enter the input. For instance, the key:Tab key to switch from the 1st input into the 2nd. But not with the mouse click any more.

The “passive” handler option

The optional passive: true option of addEventListener signals the browser that the handler is not going to call preventDefault().

Why that may be needed?

There are some events like touchmove on mobile devices (when the user moves their finger across the screen), that cause scrolling by default, but that scrolling can be prevented using preventDefault() in the handler.

So when the browser detects such event, it has first to process all handlers, and then if preventDefault is not called anywhere, it can proceed with scrolling. That may cause unnecessary delays and “jitters” in the UI.

The passive: true options tells the browser that the handler is not going to cancel scrolling. Then browser scrolls immediately providing a maximally fluent experience, and the event is handled by the way.

For some browsers (Firefox, Chrome), passive is true by default for touchstart and touchmove events.

event.defaultPrevented

The property event.defaultPrevented is true if the default action was prevented, and false otherwise.

There’s an interesting use case for it.

You remember in the chapter info:bubbling-and-capturing we talked about event.stopPropagation() and why stopping bubbling is bad?

Sometimes we can use event.defaultPrevented instead, to signal other event handlers that the event was handled.

Let’s see a practical example.

By default the browser on contextmenu event (right mouse click) shows a context menu with standard options. We can prevent it and show our own, like this:

<button>Right-click shows browser context menu</button>

<button oncontextmenu="alert('Draw our menu'); return false">
    Right-click shows our context menu
</button>

Now, in addition to that context menu we’d like to implement document-wide context menu.

Upon right click, the closest context menu should show up.

<p>Right-click here for the document context menu</p>
<button id="elem">Right-click here for the button context menu</button>

<script>
    elem.oncontextmenu = function (event) {
        event.preventDefault();
        alert("Button context menu");
    };

    document.oncontextmenu = function (event) {
        event.preventDefault();
        alert("Document context menu");
    };
</script>

The problem is that when we click on elem, we get two menus: the button-level and (the event bubbles up) the document-level menu.

How to fix it? One of solutions is to think like: “When we handle right-click in the button handler, let’s stop its bubbling” and use event.stopPropagation():

<p>Right-click for the document menu</p>
<button id="elem">
    Right-click for the button menu (fixed with event.stopPropagation)
</button>

<script>
    elem.oncontextmenu = function (event) {
        event.preventDefault();

        event.stopPropagation();

        alert("Button context menu");
    };

    document.oncontextmenu = function (event) {
        event.preventDefault();
        alert("Document context menu");
    };
</script>

Now the button-level menu works as intended. But the price is high. We forever deny access to information about right-clicks for any outer code, including counters that gather statistics and so on. That’s quite unwise.

An alternative solution would be to check in the document handler if the default action was prevented? If it is so, then the event was handled, and we don’t need to react on it.

<p>
    Right-click for the document menu (added a check for event.defaultPrevented)
</p>
<button id="elem">Right-click for the button menu</button>

<script>
    elem.oncontextmenu = function (event) {
        event.preventDefault();
        alert("Button context menu");
    };

    document.oncontextmenu = function (event) {
        if (event.defaultPrevented) return;

        event.preventDefault();
        alert("Document context menu");
    };
</script>

Now everything also works correctly. If we have nested elements, and each of them has a context menu of its own, that would also work. Just make sure to check for event.defaultPrevented in each contextmenu handler.

As we can clearly see, event.stopPropagation() and event.preventDefault() (also known as return false) are two different things. They are not related to each other.

There are also alternative ways to implement nested context menus. One of them is to have a single global object with a handler for document.oncontextmenu, and also methods that allow us to store other handlers in it.

The object will catch any right-click, look through stored handlers and run the appropriate one.

But then each piece of code that wants a context menu should know about that object and use its help instead of the own contextmenu handler.

Summary

There are many default browser actions:

  • mousedown – starts the selection (move the mouse to select).
  • click on <input type="checkbox"> – checks/unchecks the input.
  • submit – clicking an <input type="submit"> or hitting key:Enter inside a form field causes this event to happen, and the browser submits the form after it.
  • keydown – pressing a key may lead to adding a character into a field, or other actions.
  • contextmenu – the event happens on a right-click, the action is to show the browser context menu.
  • …there are more…

All the default actions can be prevented if we want to handle the event exclusively by JavaScript.

To prevent a default action – use either event.preventDefault() or return false. The second method works only for handlers assigned with on<event>.

The passive: true option of addEventListener tells the browser that the action is not going to be prevented. That’s useful for some mobile events, like touchstart and touchmove, to tell the browser that it should not wait for all handlers to finish before scrolling.

If the default action was prevented, the value of event.defaultPrevented becomes true, otherwise it’s false.

Technically, by preventing default actions and adding JavaScript we can customize the behavior of any elements. For instance, we can make a link <a> work like a button, and a button <button> behave as a link (redirect to another URL or so).

But we should generally keep the semantic meaning of HTML elements. For instance, <a> should perform navigation, not a button.

Besides being “just a good thing”, that makes your HTML better in terms of accessibility.

Also if we consider the example with <a>, then please note: a browser allows us to open such links in a new window (by right-clicking them and other means). And people like that. But if we make a button behave as a link using JavaScript and even look like a link using CSS, then <a>-specific browser features still won’t work for it.

Dispatching custom events

We can not only assign handlers, but also generate events from JavaScript.

Custom events can be used to create “graphical components”. For instance, a root element of our own JS-based menu may trigger events telling what happens with the menu: open (menu open), select (an item is selected) and so on. Another code may listen for the events and observe what’s happening with the menu.

We can generate not only completely new events, that we invent for our own purposes, but also built-in ones, such as click, mousedown etc. That may be helpful for automated testing.

Event constructor

Build-in event classes form a hierarchy, similar to DOM element classes. The root is the built-in Event class.

We can create Event objects like this:

let event = new Event(type[, options]);

Arguments:

  • type – event type, a string like "click" or our own like "my-event".

  • options – the object with two optional properties:

    • bubbles: true/false – if true, then the event bubbles.
    • cancelable: true/false – if true, then the “default action” may be prevented. Later we’ll see what it means for custom events.

    By default both are false: {bubbles: false, cancelable: false}.

dispatchEvent

After an event object is created, we should “run” it on an element using the call elem.dispatchEvent(event).

Then handlers react on it as if it were a regular browser event. If the event was created with the bubbles flag, then it bubbles.

In the example below the click event is initiated in JavaScript. The handler works same way as if the button was clicked:

<button id="elem" onclick="alert('Click!');">Autoclick</button>

<script>
    let event = new Event("click");
    elem.dispatchEvent(event);
</script>

There is a way to tell a “real” user event from a script-generated one.

The property event.isTrusted is true for events that come from real user actions and false for script-generated events.

Bubbling example

We can create a bubbling event with the name "hello" and catch it on document.

All we need is to set bubbles to true:

<h1 id="elem">Hello from the script!</h1>

<script>
    // catch on document...
    document.addEventListener("hello", function (event) {
        // (1)
        alert("Hello from " + event.target.tagName); // Hello from H1
    });

    // ...dispatch on elem!
    let event = new Event("hello", { bubbles: true }); // (2)
    elem.dispatchEvent(event);

    // the handler on document will activate and display the message.
</script>

Notes:

  1. We should use addEventListener for our custom events, because on<event> only exists for built-in events, document.onhello doesn’t work.
  2. Must set bubbles:true, otherwise the event won’t bubble up.

The bubbling mechanics is the same for built-in (click) and custom (hello) events. There are also capturing and bubbling stages.

MouseEvent, KeyboardEvent and others

Here’s a short list of classes for UI Events from the UI Event specification:

  • UIEvent
  • FocusEvent
  • MouseEvent
  • WheelEvent
  • KeyboardEvent

We should use them instead of new Event if we want to create such events. For instance, new MouseEvent("click").

The right constructor allows to specify standard properties for that type of event.

Like clientX/clientY for a mouse event:

let event = new MouseEvent("click", {
    bubbles: true,
    cancelable: true,
    clientX: 100,
    clientY: 100,
});

alert(event.clientX); // 100

Please note: the generic Event constructor does not allow that.

Let’s try:

let event = new Event("click", {
    bubbles: true, // only bubbles and cancelable
    cancelable: true, // work in the Event constructor
    clientX: 100,
    clientY: 100,
});

alert(event.clientX); // undefined, the unknown property is ignored!

Technically, we can work around that by assigning directly event.clientX=100 after creation. So that’s a matter of convenience and following the rules. Browser-generated events always have the right type.

The full list of properties for different UI events is in the specification, for instance, MouseEvent.

Custom events

For our own, completely new events types like "hello" we should use new CustomEvent. Technically CustomEvent is the same as Event, with one exception.

In the second argument (object) we can add an additional property detail for any custom information that we want to pass with the event.

For instance:

<h1 id="elem">Hello for John!</h1>

<script>
    // additional details come with the event to the handler
    elem.addEventListener("hello", function (event) {
        alert(event.detail.name);
    });

    elem.dispatchEvent(
        new CustomEvent("hello", {
            detail: { name: "John" },
        })
    );
</script>

The detail property can have any data. Technically we could live without, because we can assign any properties into a regular new Event object after its creation. But CustomEvent provides the special detail field for it to evade conflicts with other event properties.

Besides, the event class describes “what kind of event” it is, and if the event is custom, then we should use CustomEvent just to be clear about what it is.

event.preventDefault()

Many browser events have a “default action”, such as navigating to a link, starting a selection, and so on.

For new, custom events, there are definitely no default browser actions, but a code that dispatches such event may have its own plans what to do after triggering the event.

By calling event.preventDefault(), an event handler may send a signal that those actions should be canceled.

In that case the call to elem.dispatchEvent(event) returns false. And the code that dispatched it knows that it shouldn’t continue.

Let’s see a practical example - a hiding rabbit (could be a closing menu or something else).

Below you can see a #rabbit and hide() function that dispatches "hide" event on it, to let all interested parties know that the rabbit is going to hide.

Any handler can listen for that event with rabbit.addEventListener('hide',...) and, if needed, cancel the action using event.preventDefault(). Then the rabbit won’t disappear:

<pre id="rabbit">
  |\   /|
   \|_|/
   /. .\
  =\_Y_/=
   {>o<}
</pre>
<button onclick="hide()">Hide()</button>

<script>
    // hide() will be called automatically in 2 seconds
    function hide() {
        let event = new CustomEvent("hide", {
            cancelable: true, // without that flag preventDefault doesn't work
        });
        if (!rabbit.dispatchEvent(event)) {
            alert("The action was prevented by a handler");
        } else {
            rabbit.hidden = true;
        }
    }

    rabbit.addEventListener("hide", function (event) {
        if (confirm("Call preventDefault?")) {
            event.preventDefault();
        }
    });
</script>

Please note: the event must have the flag cancelable: true, otherwise the call event.preventDefault() is ignored.

Events-in-events are synchronous

Usually events are processed in a queue. That is: if the browser is processing onclick and a new event occurs, e.g. mouse moved, then it’s handing is queued up, corresponding mousemove handlers will be called after onclick processing is finished.

The notable exception is when one event is initiated from within another one, e.g. using dispatchEvent. Such events are processed immediately: the new event handlers are called, and then the current event handling is resumed.

For instance, in the code below the menu-open event is triggered during the onclick.

It’s processed immediately, without waiting for onlick handler to end:

<button id="menu">Menu (click me)</button>

<script>
    menu.onclick = function () {
        alert(1);

        menu.dispatchEvent(
            new CustomEvent("menu-open", {
                bubbles: true,
            })
        );

        alert(2);
    };

    // triggers between 1 and 2
    document.addEventListener("menu-open", () => alert("nested"));
</script>

The output order is: 1 -> nested -> 2.

Please note that the nested event menu-open is caught on the document. The propagation and handling of the nested event is finished before the processing gets back to the outer code (onclick).

That’s not only about dispatchEvent, there are other cases. If an event handler calls methods that trigger to other events – they are too processed synchronously, in a nested fashion.

Let’s say we don’t like it. We’d want onclick to be fully processed first, independently from menu-open or any other nested events.

Then we can either put the dispatchEvent (or another event-triggering call) at the end of onclick or, maybe better, wrap it in the zero-delay setTimeout:

<button id="menu">Menu (click me)</button>

<script>
    menu.onclick = function () {
        alert(1);

        setTimeout(() =>
            menu.dispatchEvent(
                new CustomEvent("menu-open", {
                    bubbles: true,
                })
            )
        );

        alert(2);
    };

    document.addEventListener("menu-open", () => alert("nested"));
</script>

Now dispatchEvent runs asynchronously after the current code execution is finished, including mouse.onclick, so event handlers are totally separate.

The output order becomes: 1 -> 2 -> nested.

Summary

To generate an event from code, we first need to create an event object.

The generic Event(name, options) constructor accepts an arbitrary event name and the options object with two properties:

  • bubbles: true if the event should bubble.
  • cancelable: true if the event.preventDefault() should work.

Other constructors of native events like MouseEvent, KeyboardEvent and so on accept properties specific to that event type. For instance, clientX for mouse events.

For custom events we should use CustomEvent constructor. It has an additional option named detail, we should assign the event-specific data to it. Then all handlers can access it as event.detail.

Despite the technical possibility to generate browser events like click or keydown, we should use with the great care.

We shouldn’t generate browser events as it’s a hacky way to run handlers. That’s a bad architecture most of the time.

Native events might be generated:

  • As a dirty hack to make 3rd-party libraries work the needed way, if they don’t provide other means of interaction.
  • For automated testing, to “click the button” in the script and see if the interface reacts correctly.

Custom events with our own names are often generated for architectural purposes, to signal what happens inside our menus, sliders, carousels etc.

Mouse events

In this chapter we’ll get into more details about mouse events and their properties.

Please note: such events may come not only from “mouse devices”, but are also from other devices, such as phones and tablets, where they are emulated for compatibility.

Mouse event types

We’ve already seen some of these events:

mousedown/mouseup
Mouse button is clicked/released over an element.
mouseover/mouseout
Mouse pointer comes over/out from an element.
mousemove
Every mouse move over an element triggers that event.
click
Triggers after mousedown and then mouseup over the same element if the left mouse button was used.
dblclick
Triggers after two clicks on the same element within a short timeframe. Rarely used nowadays.
contextmenu
Triggers when the right mouse button is pressed. There are other ways to open a context menu, e.g. using a special keyboard key, it triggers in that case also, so it’s not exactly the mouse event.

…There are several other events too, we’ll cover them later.

Events order

As you can see from the list above, a user action may trigger multiple events.

For instance, a left-button click first triggers mousedown, when the button is pressed, then mouseup and click when it’s released.

In cases when a single action initiates multiple events, their order is fixed. That is, the handlers are called in the order mousedown -> mouseup -> click.

Click the button below and you'll see the events. Try double-click too. On the
teststand below all mouse events are logged, and if there is more than a 1
second delay between them they are separated by a horizontal ruler. Also we can
see the `button` property that allows to detect the mouse button, it's explained
below.

<input
    onmousedown="return logMouse(event)"
    onmouseup="return logMouse(event)"
    onclick="return logMouse(event)"
    oncontextmenu="return logMouse(event)"
    ondblclick="return logMouse(event)"
    value="Click me with the right or the left mouse button"
    type="button"
/>
<input onclick="logClear('test')" value="Clear" type="button" />
<form id="testform" name="testform">
    <textarea style="font-size:12px;height:150px;width:360px;"></textarea>
</form>

Mouse button

Click-related events always have the button property, which allows to get the exact mouse button.

We usually don’t use it for click and contextmenu events, because the former happens only on left-click, and the latter – only on right-click.

From the other hand, mousedown and mouseup handlers we may need event.button, because these events trigger on any button, so button allows to distinguish between “right-mousedown” and “left-mousedown”.

The possible values of event.button are:

Button state event.button
Left button (primary) 0
Middle button (auxillary) 1
Right button (secondary) 2
X1 button (back) 3
X2 button (forward) 4

Most mouse devices only have the left and right buttons, so possible values are 0 or 2. Touch devices also generate similar events when one taps on them.

Also there’s event.buttons property that has all currently pressed buttons as an integer, one bit per button. In practice this property is very rarely used, you can find details at MDN if you ever need it.

Old code may use event.which property that’s an old non-standard way of getting a button, with possible values:

  • event.which == 1 – left button,
  • event.which == 2 – middle button,
  • event.which == 3 – right button.

As of now, event.which is deprecated, we shouldn’t use it.

Modifiers: shift, alt, ctrl and meta

All mouse events include the information about pressed modifier keys.

Event properties:

  • shiftKey: key:Shift
  • altKey: key:Alt (or key:Opt for Mac)
  • ctrlKey: key:Ctrl
  • metaKey: key:Cmd for Mac

They are true if the corresponding key was pressed during the event.

For instance, the button below only works on key:Alt+Shift+click:

<button id="button">Alt+Shift+Click on me!</button>

<script>
    button.onclick = function (event) {
        if (event.altKey && event.shiftKey) {
            alert("Hooray!");
        }
    };
</script>

On Windows and Linux there are modifier keys key:Alt, key:Shift and key:Ctrl. On Mac there’s one more: key:Cmd, corresponding to the property metaKey.

In most applications, when Windows/Linux uses key:Ctrl, on Mac key:Cmd is used.

That is: where a Windows user presses key:Ctrl+Enter or key:Ctrl+A, a Mac user would press key:Cmd+Enter or key:Cmd+A, and so on.

So if we want to support combinations like key:Ctrl+click, then for Mac it makes sense to use key:Cmd+click. That’s more comfortable for Mac users.

Even if we’d like to force Mac users to key:Ctrl+click – that’s kind of difficult. The problem is: a left-click with key:Ctrl is interpreted as a right-click on MacOS, and it generates the contextmenu event, not click like Windows/Linux.

So if we want users of all operating systems to feel comfortable, then together with ctrlKey we should check metaKey.

For JS-code it means that we should check if (event.ctrlKey || event.metaKey).

Keyboard combinations are good as an addition to the workflow. So that if the visitor uses a keyboard – they work.

But if their device doesn’t have it – then there should be a way to live without modifier keys.

Coordinates: clientX/Y, pageX/Y

All mouse events provide coordinates in two flavours:

  1. Window-relative: clientX and clientY.
  2. Document-relative: pageX and pageY.

We already covered the difference between them in the chapter info:coordinates.

In short, document-relative coordinates pageX/Y are counted from the left-upper corner of the document, and do not change when the page is scrolled, while clientX/Y are counted from the current window left-upper corner. When the page is scrolled, they change.

For instance, if we have a window of the size 500x500, and the mouse is in the left-upper corner, then clientX and clientY are 0, no matter how the page is scrolled.

And if the mouse is in the center, then clientX and clientY are 250, no matter what place in the document it is. They are similar to position:fixed in that aspect.

Move the mouse over the input field to see clientX/clientY (the example is in the iframe, so coordinates are relative to that iframe):

<input
    onmousemove="this.value=event.clientX+':'+event.clientY"
    value="Mouse over me"
/>

Preventing selection on mousedown

Double mouse click has a side-effect that may be disturbing in some interfaces: it selects text.

For instance, a double-click on the text below selects it in addition to our handler:

<span ondblclick="alert('dblclick')">Double-click me</span>

If one presses the left mouse button and, without releasing it, moves the mouse, that also makes the selection, often unwanted.

There are multiple ways to prevent the selection, that you can read in the chapter info:selection-range.

In this particular case the most reasonable way is to prevent the browser action on mousedown. It prevents both these selections:

Before...
<b ondblclick="alert('Click!')" onmousedown="return false"> Double-click me </b>
...After

Now the bold element is not selected on double clicks, and pressing the left button on it won’t start the selection.

Please note: the text inside it is still selectable. However, the selection should start not on the text itself, but before or after it. Usually that’s fine for users.

If we want to disable selection to protect our page content from copy-pasting, then we can use another event: oncopy.

<div oncopy="alert('Copying forbidden!');return false">
    Dear user, The copying is forbidden for you. If you know JS or HTML, then
    you can get everything from the page source though.
</div>

If you try to copy a piece of text in the <div>, that won’t work, because the default action oncopy is prevented.

Surely the user has access to HTML-source of the page, and can take the content from there, but not everyone knows how to do it.

Summary

Mouse events have the following properties:

  • Button: button.

  • Modifier keys (true if pressed): altKey, ctrlKey, shiftKey and metaKey (Mac).

    • If you want to handle key:Ctrl, then don’t forget Mac users, they usually use key:Cmd, so it’s better to check if (e.metaKey || e.ctrlKey).
  • Window-relative coordinates: clientX/clientY.

  • Document-relative coordinates: pageX/pageY.

The default browser action of mousedown is text selection, if it’s not good for the interface, then it should be prevented.

In the next chapter we’ll see more details about events that follow pointer movement and how to track element changes under it.

Moving the mouse: mouseover/out, mouseenter/leave

Let’s dive into more details about events that happen when the mouse moves between elements.

Events mouseover/mouseout, relatedTarget

The mouseover event occurs when a mouse pointer comes over an element, and mouseout – when it leaves.

These events are special, because they have property relatedTarget. This property complements target. When a mouse leaves one element for another, one of them becomes target, and the other one - relatedTarget.

For mouseover:

  • event.target – is the element where the mouse came over.
  • event.relatedTarget – is the element from which the mouse came (relatedTarget -> target).

For mouseout the reverse:

  • event.target – is the element that the mouse left.
  • event.relatedTarget – is the new under-the-pointer element, that mouse left for (target -> relatedTarget).

In the example below each face and its features are separate elements. When you move the mouse, you can see mouse events in the text area.

Each event has the information about both target and relatedTarget:

[codetabs src=“mouseoverout” height=280]

The relatedTarget property can be null.

That’s normal and just means that the mouse came not from another element, but from out of the window. Or that it left the window.

We should keep that possibility in mind when using event.relatedTarget in our code. If we access event.relatedTarget.tagName, then there will be an error.

Skipping elements

The mousemove event triggers when the mouse moves. But that doesn’t mean that every pixel leads to an event.

The browser checks the mouse position from time to time. And if it notices changes then triggers the events.

That means that if the visitor is moving the mouse very fast then some DOM-elements may be skipped:

If the mouse moves very fast from #FROM to #TO elements as painted above, then intermediate <div> elements (or some of them) may be skipped. The mouseout event may trigger on #FROM and then immediately mouseover on #TO.

That’s good for performance, because there may be many intermediate elements. We don’t really want to process in and out of each one.

On the other hand, we should keep in mind that the mouse pointer doesn’t “visit” all elements along the way. It can “jump”.

In particular, it’s possible that the pointer jumps right inside the middle of the page from out of the window. In that case relatedTarget is null, because it came from “nowhere”:

You can check it out "live" on a teststand below. Its HTML has two nested
elements: the `
<div id="child">
    ` is inside the `
    <div id="parent">
        `. If you move the mouse fast over them, then maybe only the child div
        triggers events, or maybe the parent one, or maybe there will be no
        events at all. Also move the pointer into the child `div`, and then move
        it out quickly down through the parent one. If the movement is fast
        enough, then the parent element is ignored. The mouse will cross the
        parent element without noticing it. [codetabs height=360
        src="mouseoverout-fast"]
    </div>
</div>
In case of fast mouse movements, intermediate elements may be ignored, but one thing we know for sure: if the pointer "officially" entered an element (`mouseover` event generated), then upon leaving it we always get `mouseout`.

Mouseout when leaving for a child

An important feature of mouseout – it triggers, when the pointer moves from an element to its descendant, e.g. from #parent to #child in this HTML:

<div id="parent">
    <div id="child">...</div>
</div>

If we’re on #parent and then move the pointer deeper into #child, but we get mouseout on #parent!

That may seem strange, but can be easily explained.

According to the browser logic, the mouse cursor may be only over a single element at any time – the most nested one and top by z-index.

So if it goes to another element (even a descendant), then it leaves the previous one.

Please note another important detail of event processing.

The mouseover event on a descendant bubbles up. So, if #parent has mouseover handler, it triggers:

You can see that very well in the example below: `
<div id="child">
    ` is inside the `
    <div id="parent">
        `. There are `mouseover/out` handlers on `#parent` element that output
        event details. If you move the mouse from `#parent` to `#child`, you see
        two events on `#parent`: 1. `mouseout [target: parent]` (left the
        parent), then 2. `mouseover [target: child]` (came to the child,
        bubbled). [codetabs height=360 src="mouseoverout-child"]
    </div>
</div>

As shown, when the pointer moves from #parent element to #child, two handlers trigger on the parent element: mouseout and mouseover:

parent.onmouseout = function (event) {
    /* event.target: parent element */
};
parent.onmouseover = function (event) {
    /* event.target: child element (bubbled) */
};

If we don’t examine event.target inside the handlers, then it may seem that the mouse pointer left #parent element, and then immediately came back over it.

But that’s not the case! The pointer is still over the parent, it just moved deeper into the child element.

If there are some actions upon leaving the parent element, e.g. an animation runs in parent.onmouseout, we usually don’t want it when the pointer just goes deeper into #parent.

To avoid it, we can check relatedTarget in the handler and, if the mouse is still inside the element, then ignore such event.

Alternatively we can use other events: mouseenter and mouseleave, that we’ll be covering now, as they don’t have such problems.

Events mouseenter and mouseleave

Events mouseenter/mouseleave are like mouseover/mouseout. They trigger when the mouse pointer enters/leaves the element.

But there are two important differences:

  1. Transitions inside the element, to/from descendants, are not counted.
  2. Events mouseenter/mouseleave do not bubble.

These events are extremely simple.

When the pointer enters an element – mouseenter triggers. The exact location of the pointer inside the element or its descendants doesn’t matter.

When the pointer leaves an element – mouseleave triggers.

This example is similar to the one above, but now the top element has
`mouseenter/mouseleave` instead of `mouseover/mouseout`. As you can see, the
only generated events are the ones related to moving the pointer in and out of
the top element. Nothing happens when the pointer goes to the child and back.
Transitions between descendants are ignored [codetabs height=340
src="mouseleave"]

Event delegation

Events mouseenter/leave are very simple and easy to use. But they do not bubble. So we can’t use event delegation with them.

Imagine we want to handle mouse enter/leave for table cells. And there are hundreds of cells.

The natural solution would be – to set the handler on <table> and process events there. But mouseenter/leave don’t bubble. So if such event happens on <td>, then only a handler on that <td> is able to catch it.

Handlers for mouseenter/leave on <table> only trigger when the pointer enters/leaves the table as a whole. It’s impossible to get any information about transitions inside it.

So, let’s use mouseover/mouseout.

Let’s start with simple handlers that highlight the element under mouse:

// let's highlight an element under the pointer
table.onmouseover = function (event) {
    let target = event.target;
    target.style.background = "pink";
};

table.onmouseout = function (event) {
    let target = event.target;
    target.style.background = "";
};
Here they are in action. As the mouse travels across the elements of this table,
the current one is highlighted: [codetabs height=480
src="mouseenter-mouseleave-delegation"]

In our case we’d like to handle transitions between table cells <td>: entering a cell and leaving it. Other transitions, such as inside the cell or outside of any cells, don’t interest us. Let’s filter them out.

Here’s what we can do:

  • Remember the currently highlighted <td> in a variable, let’s call it currentElem.
  • On mouseover – ignore the event if we’re still inside the current <td>.
  • On mouseout – ignore if we didn’t leave the current <td>.

Here’s an example of code that accounts for all possible situations:

[js src=“mouseenter-mouseleave-delegation-2/script.js”]

Once again, the important features are:

  1. It uses event delegation to handle entering/leaving of any <td> inside the table. So it relies on mouseover/out instead of mouseenter/leave that don’t bubble and hence allow no delegation.
  2. Extra events, such as moving between descendants of <td> are filtered out, so that onEnter/Leave runs only if the pointer leaves or enters <td> as a whole.
Here's the full example with all details: [codetabs height=460
src="mouseenter-mouseleave-delegation-2"] Try to move the cursor in and out of
table cells and inside them. Fast or slow -- doesn't matter. Only `
<td>` as a whole is highlighted, unlike the example before.</td>

Summary

We covered events mouseover, mouseout, mousemove, mouseenter and mouseleave.

These things are good to note:

  • A fast mouse move may skip intermediate elements.
  • Events mouseover/out and mouseenter/leave have an additional property: relatedTarget. That’s the element that we are coming from/to, complementary to target.

Events mouseover/out trigger even when we go from the parent element to a child element. The browser assumes that the mouse can be only over one element at one time – the deepest one.

Events mouseenter/leave are different in that aspect: they only trigger when the mouse comes in and out the element as a whole. Also they do not bubble.

Drag’n’Drop with mouse events

Drag’n’Drop is a great interface solution. Taking something and dragging and dropping it is a clear and simple way to do many things, from copying and moving documents (as in file managers) to ordering (dropping items into a cart).

In the modern HTML standard there’s a section about Drag and Drop with special events such as dragstart, dragend, and so on.

These events allow us to support special kinds of drag’n’drop, such as handling dragging a file from OS file-manager and dropping it into the browser window. Then JavaScript can access the contents of such files.

But native Drag Events also have limitations. For instance, we can’t prevent dragging from a certain area. Also we can’t make the dragging “horizontal” or “vertical” only. And there are many other drag’n’drop tasks that can’t be done using them. Also, mobile device support for such events is very weak.

So here we’ll see how to implement Drag’n’Drop using mouse events.

Drag’n’Drop algorithm

The basic Drag’n’Drop algorithm looks like this:

  1. On mousedown - prepare the element for moving, if needed (maybe create a clone of it, add a class to it or whatever).
  2. Then on mousemove move it by changing left/top with position:absolute.
  3. On mouseup - perform all actions related to finishing the drag’n’drop.

These are the basics. Later we’ll see how to other features, such as highlighting current underlying elements while we drag over them.

Here’s the implementation of dragging a ball:

ball.onmousedown = function (event) {
    // (1) prepare to moving: make absolute and on top by z-index
    ball.style.position = "absolute";
    ball.style.zIndex = 1000;

    // move it out of any current parents directly into body
    // to make it positioned relative to the body
    document.body.append(ball);

    // centers the ball at (pageX, pageY) coordinates
    function moveAt(pageX, pageY) {
        ball.style.left = pageX - ball.offsetWidth / 2 + "px";
        ball.style.top = pageY - ball.offsetHeight / 2 + "px";
    }

    // move our absolutely positioned ball under the pointer
    moveAt(event.pageX, event.pageY);

    function onMouseMove(event) {
        moveAt(event.pageX, event.pageY);
    }

    // (2) move the ball on mousemove
    document.addEventListener("mousemove", onMouseMove);

    // (3) drop the ball, remove unneeded handlers
    ball.onmouseup = function () {
        document.removeEventListener("mousemove", onMouseMove);
        ball.onmouseup = null;
    };
};

If we run the code, we can notice something strange. On the beginning of the drag’n’drop, the ball “forks”: we start dragging its “clone”.

Here's an example in action: [iframe src="ball" height=230] Try to drag'n'drop
with the mouse and you'll see such behavior.

That’s because the browser has its own drag’n’drop support for images and some other elements. It runs automatically and conflicts with ours.

To disable it:

ball.ondragstart = function () {
    return false;
};

Now everything will be all right.

In action: [iframe src="ball2" height=230]

Another important aspect – we track mousemove on document, not on ball. From the first sight it may seem that the mouse is always over the ball, and we can put mousemove on it.

But as we remember, mousemove triggers often, but not for every pixel. So after swift move the pointer can jump from the ball somewhere in the middle of document (or even outside of the window).

So we should listen on document to catch it.

Correct positioning

In the examples above the ball is always moved so, that it’s center is under the pointer:

ball.style.left = pageX - ball.offsetWidth / 2 + "px";
ball.style.top = pageY - ball.offsetHeight / 2 + "px";

Not bad, but there’s a side-effect. To initiate the drag’n’drop, we can mousedown anywhere on the ball. But if “take” it from its edge, then the ball suddenly “jumps” to become centered under the mouse pointer.

It would be better if we keep the initial shift of the element relative to the pointer.

For instance, if we start dragging by the edge of the ball, then the pointer should remain over the edge while dragging.

Let’s update our algorithm:

  1. When a visitor presses the button (mousedown) - remember the distance from the pointer to the left-upper corner of the ball in variables shiftX/shiftY. We’ll keep that distance while dragging.

    To get these shifts we can substract the coordinates:

    // onmousedown
    let shiftX = event.clientX - ball.getBoundingClientRect().left;
    let shiftY = event.clientY - ball.getBoundingClientRect().top;
    
  2. Then while dragging we position the ball on the same shift relative to the pointer, like this:

    // onmousemove
    // ball has position:absoute
    ball.style.left = event.pageX - shiftX + "px";
    ball.style.top = event.pageY - shiftY + "px";
    

The final code with better positioning:

ball.onmousedown = function (event) {
    let shiftX = event.clientX - ball.getBoundingClientRect().left;
    let shiftY = event.clientY - ball.getBoundingClientRect().top;

    ball.style.position = "absolute";
    ball.style.zIndex = 1000;
    document.body.append(ball);

    moveAt(event.pageX, event.pageY);

    // moves the ball at (pageX, pageY) coordinates
    // taking initial shifts into account
    function moveAt(pageX, pageY) {
        ball.style.left = pageX - shiftX + "px";
        ball.style.top = pageY - shiftY + "px";
    }

    function onMouseMove(event) {
        moveAt(event.pageX, event.pageY);
    }

    // move the ball on mousemove
    document.addEventListener("mousemove", onMouseMove);

    // drop the ball, remove unneeded handlers
    ball.onmouseup = function () {
        document.removeEventListener("mousemove", onMouseMove);
        ball.onmouseup = null;
    };
};

ball.ondragstart = function () {
    return false;
};
In action (inside `<iframe>`): [iframe src="ball3" height=230]</iframe>

The difference is especially noticeable if we drag the ball by its right-bottom corner. In the previous example the ball “jumps” under the pointer. Now it fluently follows the pointer from the current position.

Potential drop targets (droppables)

In previous examples the ball could be dropped just “anywhere” to stay. In real-life we usually take one element and drop it onto another. For instance, a “file” into a “folder” or something else.

Speaking abstract, we take a “draggable” element and drop it onto “droppable” element.

We need to know:

  • where the element was dropped at the end of Drag’n’Drop – to do the corresponding action,
  • and, preferably, know the droppable we’re dragging over, to highlight it.

The solution is kind-of interesting and just a little bit tricky, so let’s cover it here.

What may be the first idea? Probably to set mouseover/mouseup handlers on potential droppables?

But that doesn’t work.

The problem is that, while we’re dragging, the draggable element is always above other elements. And mouse events only happen on the top element, not on those below it.

For instance, below are two <div> elements, red one on top of the blue one (fully covers). There’s no way to catch an event on the blue one, because the red is on top:

<style>
    div {
        width: 50px;
        height: 50px;
        position: absolute;
        top: 0;
    }
</style>
<div style="background:blue" onmouseover="alert('never works')"></div>
<div style="background:red" onmouseover="alert('over red!')"></div>

The same with a draggable element. The ball is always on top over other elements, so events happen on it. Whatever handlers we set on lower elements, they won’t work.

That’s why the initial idea to put handlers on potential droppables doesn’t work in practice. They won’t run.

So, what to do?

There’s a method called document.elementFromPoint(clientX, clientY). It returns the most nested element on given window-relative coordinates (or null if given coordinates are out of the window).

We can use it in any of our mouse event handlers to detect the potential droppable under the pointer, like this:

// in a mouse event handler
ball.hidden = true; // (*) hide the element that we drag

let elemBelow = document.elementFromPoint(event.clientX, event.clientY);
// elemBelow is the element below the ball, may be droppable

ball.hidden = false;

Please note: we need to hide the ball before the call (*). Otherwise we’ll usually have a ball on these coordinates, as it’s the top element under the pointer: elemBelow=ball. So we hide it and immediately show again.

We can use that code to check what element we’re “flying over” at any time. And handle the drop when it happens.

An extended code of onMouseMove to find “droppable” elements:

// potential droppable that we're flying over right now
let currentDroppable = null;

function onMouseMove(event) {
    moveAt(event.pageX, event.pageY);

    ball.hidden = true;
    let elemBelow = document.elementFromPoint(event.clientX, event.clientY);
    ball.hidden = false;

    // mousemove events may trigger out of the window (when the ball is dragged off-screen)
    // if clientX/clientY are out of the window, then elementFromPoint returns null
    if (!elemBelow) return;

    // potential droppables are labeled with the class "droppable" (can be other logic)
    let droppableBelow = elemBelow.closest(".droppable");

    if (currentDroppable != droppableBelow) {
        // we're flying in or out...
        // note: both values can be null
        //   currentDroppable=null if we were not over a droppable before this event (e.g over an empty space)
        //   droppableBelow=null if we're not over a droppable now, during this event

        if (currentDroppable) {
            // the logic to process "flying out" of the droppable (remove highlight)
            leaveDroppable(currentDroppable);
        }
        currentDroppable = droppableBelow;
        if (currentDroppable) {
            // the logic to process "flying in" of the droppable
            enterDroppable(currentDroppable);
        }
    }
}

In the example below when the ball is dragged over the soccer gate, the gate is highlighted.

[codetabs height=250 src=“ball4”]

Now we have the current “drop target”, that we’re flying over, in the variable currentDroppable during the whole process and can use it to highlight or any other stuff.

Summary

We considered a basic Drag’n’Drop algorithm.

The key components:

  1. Events flow: ball.mousedown -> document.mousemove -> ball.mouseup (don’t forget to cancel native ondragstart).
  2. At the drag start – remember the initial shift of the pointer relative to the element: shiftX/shiftY and keep it during the dragging.
  3. Detect droppable elements under the pointer using document.elementFromPoint.

We can lay a lot on this foundation.

  • On mouseup we can intellectually finalize the drop: change data, move elements around.
  • We can highlight the elements we’re flying over.
  • We can limit dragging by a certain area or direction.
  • We can use event delegation for mousedown/up. A large-area event handler that checks event.target can manage Drag’n’Drop for hundreds of elements.
  • And so on.

There are frameworks that build architecture over it: DragZone, Droppable, Draggable and other classes. Most of them do the similar stuff to what’s described above, so it should be easy to understand them now. Or roll your own, as you can see that that’s easy enough to do, sometimes easier than adapting a third-part solution.

Pointer events

Pointer events is a modern way to handle input from a variety of pointing devices, such as a mouse, a pen/stylus, a touchscreen and so on.

The brief history

Let’s make a small overview, so that you understand the general picture and the place of Pointer Events among other event types.

  • Long ago, in the past, there existed only mouse events.

    Then touch devices appeared. For the old code to work, they also generate mouse events. For instance, tapping generates mousedown. But mouse events were not good enough, as touch devices are more powerful in many aspects. For example, it’s possible to touch multiple points at once, and mouse events don’t have any properties for that.

  • So touch events were introduced, such as touchstart, touchend, touchmove, that have touch-specific properties (we don’t cover them in detail here, because pointer events are even better).

    Still, it wasn’t enough, as there are many other devices, such as pens, that have their own features. Also, writing a code that listens both touch and mouse events was cumbersome.

  • To solve these issues, the new standard Pointer Events was introduced. It provides a single set of events for all kinds of pointing devices.

As of now, Pointer Events Level 2 specification is supported in all major browsers, while the Pointer Events Level 3 is in the works. Unless you code for Internet Explorer 10 or Safari 12 and below, there’s no point in using mouse or touch events any more. We can switch to pointer events.

That said, there are important peculiarities, one should know them to use them correctly and avoid extra surprises. We’ll pay attention to them in this article.

Pointer event types

Pointer events are named similar to mouse events:

Pointer Event Mouse event
pointerdown mousedown
pointerup mouseup
pointermove mousemove
pointerover mouseover
pointerout mouseout
pointerenter mouseenter
pointerleave mouseleave
pointercancel -
gotpointercapture -
lostpointercapture -

As we can see, for every mouse<event>, there’s a pointer<event> that plays a similar role. Also there are 3 additional pointer events that don’t have a corresponding mouse... counterpart, we’ll soon explain about them.

We can replace mouse<event> events with pointer<event> in our code and expect things to continue working fine with mouse.

The support for touch devices will also “magically” improve, but we’ll probably need to add touch-action: none rule in CSS. See the details below in the section about pointercancel.

Pointer event properties

Pointer events have the same properties as mouse events, such as clientX/Y, target etc, plus some extra:

  • pointerId - the unique identifier of the pointer causing the event. Allows to handle multiple pointers, such as a touchscreen with stylus and multi-touch (explained below).

  • pointerType - the pointing device type, must be a string, one of: “mouse”, “pen” or “touch”.

    We can use this property to react differently on various pointer types.

  • isPrimary - true for the primary pointer (the first finger in multi-touch).

For pointers that measure a contact area and pressure, e.g. a finger on the touchscreen, the additional properties can be useful:

  • width - the width of of the area where the pointer touches the device. Where unsupported, e.g. for mouse it’s always 1.
  • height - the height of of the area where the pointer touches the device. Where unsupported, always 1.
  • pressure - the pressure of the pointer tip, in range from 0 to 1. For devices that don’t support pressure must be either 0.5 (pressed) or 0.
  • tangentialPressure - the normalized tangential pressure.
  • tiltX, tiltY, twist - pen-specific properties that describe how the pen is positioned relative the surface.

These properties aren’t very well supported across devices, so they are rarely used. You can find the details in the specification if needed.

Multi-touch

One of the things that mouse events totally don’t support is multi-touch: a user can touch them in several places at once at their phone or tablet, perform special gestures.

Pointer Events allow to handle multi-touch with the help of pointerId and isPrimary properties.

Here’s what happens when a user touches a screen at one place, and then puts another finger somewhere else on it:

  1. At the first touch:
    • pointerdown with isPrimary=true and some pointerId.
  2. For the second finger and further touches:
    • pointerdown with isPrimary=false and a different pointerId for every finger.

Please note: there pointerId is assigned not to the whole device, but for each touching finger. If we use 5 fingers to simultaneously touch the screen, we have 5 pointerdown events with respective coordinates and different pointerId.

The events associated with the first finger always have isPrimary=true.

We can track multiple touching fingers using their pointerId. When the user moves move and then detouches a finger, we get pointermove and pointerup events with the same pointerId as we had in pointerdown.

Here's the demo that logs `pointerdown` and `pointerup` events: [iframe
src="multitouch" edit height=200] Please note: you must be using a touchscreen
device, such as a phone or a tablet to actually see the difference. For
single-touch devices, such as a mouse, there'll be always same `pointerId` with
`isPrimary=true`, for all pointer events.

Event: pointercancel

We’ve mentioned the importance of touch-action: none before. Now let’s explain why, as skipping this may cause our interfaces to malfunction.

The pointercancel event fires when there’s an ongoing pointer interaction, and then something happens that causes it to be aborted, so that no more pointer events are generated.

Such causes are:

  • The pointer device hardware was disabled.
  • The device orientation changed (tablet rotated).
  • The browser decided to handle the interaction on its own, considering it a mouse gesture or zoom-and-pan action or something else.

We’ll demonstrate pointercancel on a practical example to see how it affects us.

Let’s say we’re impelementing drag’n’drop for a ball, just as in the beginning of the article info:mouse-drag-and-drop.

Here are the flow of user actions and corresponding events:

  1. The user presses the mouse button on an image, to start dragging
    • pointerdown event fires
  2. Then they start dragging the image
    • pointermove fires, maybe several times
  3. Surprise! The browser has native drag’n’drop support for images, that kicks in and takes over the drag’n’drop process, thus generating pointercancel event.
    • The browser now handles drag’n’drop of the image on its own. The user may even drag the ball image out of the browser, into their Mail program or a File Manager.
    • No more pointermove events for us.

So the issue is that the browser “hijacks” the interaction: pointercancel fires and no more pointermove events are generated.

Here's the demo with pointer events (only `up/down`, `move` and `cancel`) logged
in the textarea: [iframe src="ball" height=240 edit]

We’d like to implement our own drag’n’drop, so let’s tell the browser not to take it over.

Prevent default browser actions to avoid pointercancel.

We need to do two things:

  1. Prevent native drag’n’drop from happening:
    • Can do it by setting ball.ondragstart = () => false, just as described in the article info:mouse-drag-and-drop.
    • That works well for mouse events.
  2. For touch devices, there are also touch-related browser actions. We’ll have problems with them too.
    • We can prevent them by setting #ball { touch-action: none } in CSS.
    • Then our code will start working on touch devices.

After we do that, the events will work as intended, the browser won’t hijack the process and emit no pointercancel.

This demo adds these lines: [iframe src="ball-2" height=240 edit] As you can
see, there's no `pointercancel` any more.

Now we can add the code to actually move the ball, and our drag’n’drop will work for mouse devices and touch devices.

Pointer capturing

Pointer capturing is a special feature of pointer events.

The idea is that we can “bind” all events with a particular pointerId to a given element. Then all subsequent events with the same pointerId will be retargeted to the same element. That is: the browser sets that element as the target and trigger associated handlers, no matter where it actually happened.

The related methods are:

  • elem.setPointerCapture(pointerId) - binds the given pointerId to elem.
  • elem.releasePointerCapture(pointerId) - unbinds the given pointerId from elem.

Such binding doesn’t hold long. It’s automatically removed after pointerup or pointercancel events, or when the target elem is removed from the document.

Now when do we need this?

Pointer capturing is used to simplify drag’n’drop kind of interactions.

Let’s recall the problem we met while making a custom slider in the article info:mouse-drag-and-drop.

  1. First, the user presses pointerdown on the slider thumb to start dragging it.
  2. …But then, as they move the pointer, it may leave the slider: go below or over it.

But we continue tracking track pointermove events and move the thumb until pointerup, even though the pointer is not on the slider any more.

Previously, to handle pointermove events that happen outside of the slider, we listened for pointermove events on the whole document.

Pointer capturing provides an alternative solution: we can call thumb.setPointerCapture(event.pointerId) in pointerdown handler, and then all future pointer events until pointerup will be retargeted to thumb.

That is: events handlers on thumb will be called, and event.target will always be thumb, even if the user moves their pointer around the whole document. So we can listen at thumb for pointermove, no matter where it happens.

Here’s the essential code:

thumb.onpointerdown = function (event) {
    // retarget all pointer events (until pointerup) to me
    thumb.setPointerCapture(event.pointerId);
};

thumb.onpointermove = function (event) {
    // move the slider: listen at thumb, as all events are retargeted to it
    let newLeft = event.clientX - slider.getBoundingClientRect().left;
    thumb.style.left = newLeft + "px";
};

// note: no need to call thumb.releasePointerCapture,
// it happens on pointerup automatically
The full demo: [iframe src="slider" height=100 edit]

As a summary: the code becomes cleaner as we don’t need to add/remove handlers on the whole document any more. That’s what pointer capturing does.

There are two associated pointer events:

  • gotpointercapture fires when an element uses setPointerCapture to enable capturing.
  • lostpointercapture fires when the capture is released: either explicitly with releasePointerCapture call, or automatically on pointerup/pointercancel.

Summary

Pointer events allow to handle mouse, touch and pen events simultaneously.

Pointer events extend mouse events. We can replace mouse with pointer in event names and expect our code to continue working for mouse, with better support for other device types.

Remember to set touch-events: none in CSS for elements that we engage, otherwise the browser hijacks many types of touch interactions and pointer events won’t be generated.

Additional abilities of Pointer events are:

  • Multi-touch support using pointerId and isPrimary.
  • Device-specific properties, such as pressure, width/height and others.
  • Pointer capturing: we can retarget all pointer events to a specific element until pointerup/pointercancel.

As of now, pointer events are supported in all major browsers, so we can safely switch to them, if IE10- and Safari 12- are not needed. And even with those browsers, there are polyfills that enable the support of pointer events.

Keyboard: keydown and keyup

Before we get to keyboard, please note that on modern devices there are other ways to “input something”. For instance, people use speech recognition (especially on mobile devices) or copy/paste with the mouse.

So if we want to track any input into an <input> field, then keyboard events are not enough. There’s another event named input to track changes of an <input> field, by any means. And it may be a better choice for such task. We’ll cover it later in the chapter info:events-change-input.

Keyboard events should be used when we want to handle keyboard actions (virtual keyboard also counts). For instance, to react on arrow keys key:Up and key:Down or hotkeys (including combinations of keys).

Teststand [#keyboard-test-stand]

To better understand keyboard events, you can use the teststand.

To better understand keyboard events, you can use the teststand below. Try
different key combinations in the text field. [codetabs src="keyboard-dump"
height=480]

Keydown and keyup

The keydown events happens when a key is pressed down, and then keyup – when it’s released.

event.code and event.key

The key property of the event object allows to get the character, while the code property of the event object allows to get the “physical key code”.

For instance, the same key key:Z can be pressed with or without key:Shift. That gives us two different characters: lowercase z and uppercase Z.

The event.key is exactly the character, and it will be different. But event.code is the same:

Key event.key event.code
key:Z z (lowercase) KeyZ
key:Shift+Z Z (uppercase) KeyZ

If a user works with different languages, then switching to another language would make a totally different character instead of "Z". That will become the value of event.key, while event.code is always the same: "KeyZ".

Every key has the code that depends on its location on the keyboard. Key codes described in the UI Events code specification.

For instance:

  • Letter keys have codes "Key<letter>": "KeyA", "KeyB" etc.
  • Digit keys have codes: "Digit<number>": "Digit0", "Digit1" etc.
  • Special keys are coded by their names: "Enter", "Backspace", "Tab" etc.

There are several widespread keyboard layouts, and the specification gives key codes for each of them.

Read the alphanumeric section of the spec for more codes, or just press a key in the teststand above.

Seems obvious, but people still make mistakes.

Please evade mistypes: it’s KeyZ, not keyZ. The check like event.code=="keyZ" won’t work: the first letter of "Key" must be uppercase.

What if a key does not give any character? For instance, key:Shift or key:F1 or others. For those keys, event.key is approximately the same as event.code:

Key event.key event.code
key:F1 F1 F1
key:Backspace Backspace Backspace
key:Shift Shift ShiftRight or ShiftLeft

Please note that event.code specifies exactly which key is pressed. For instance, most keyboards have two key:Shift keys: on the left and on the right side. The event.code tells us exactly which one was pressed, and event.key is responsible for the “meaning” of the key: what it is (a “Shift”).

Let’s say, we want to handle a hotkey: key:Ctrl+Z (or key:Cmd+Z for Mac). Most text editors hook the “Undo” action on it. We can set a listener on keydown and check which key is pressed.

There’s a dilemma here: in such a listener, should we check the value of event.key or event.code?

On one hand, the value of event.key is a character, it changes depending on the language. If the visitor has several languages in OS and switches between them, the same key gives different characters. So it makes sense to check event.code, it’s always the same.

Like this:

document.addEventListener("keydown", function (event) {
    if (event.code == "KeyZ" && (event.ctrlKey || event.metaKey)) {
        alert("Undo!");
    }
});

On the other hand, there’s a problem with event.code. For different keyboard layouts, the same key may have different characters.

For example, here are US layout (“QWERTY”) and German layout (“QWERTZ”) under it (from Wikipedia):

For the same key, US layout has “Z”, while German layout has “Y” (letters are swapped).

Literally, event.code will equal KeyZ for people with German layout when they press key:Y.

If we check event.code == 'KeyZ' in our code, then for people with German layout such test will pass when they press key:Y.

That sounds really odd, but so it is. The specification explicitly mentions such behavior.

So, event.code may match a wrong character for unexpected layout. Same letters in different layouts may map to different physical keys, leading to different codes. Luckily, that happens only with several codes, e.g. keyA, keyQ, keyZ (as we’ve seen), and doesn’t happen with special keys such as Shift. You can find the list in the specification.

To reliably track layout-dependent characters, event.key may be a better way.

On the other hand, event.code has the benefit of staying always the same, bound to the physical key location, even if the visitor changes languages. So hotkeys that rely on it work well even in case of a language switch.

Do we want to handle layout-dependant keys? Then event.key is the way to go.

Or we want a hotkey to work even after a language switch? Then event.code may be better.

Auto-repeat

If a key is being pressed for a long enough time, it starts to “auto-repeat”: the keydown triggers again and again, and then when it’s released we finally get keyup. So it’s kind of normal to have many keydown and a single keyup.

For events triggered by auto-repeat, the event object has event.repeat property set to true.

Default actions

Default actions vary, as there are many possible things that may be initiated by the keyboard.

For instance:

  • A character appears on the screen (the most obvious outcome).
  • A character is deleted (key:Delete key).
  • The page is scrolled (key:PageDown key).
  • The browser opens the “Save Page” dialog (key:Ctrl+S)
  • …and so on.

Preventing the default action on keydown can cancel most of them, with the exception of OS-based special keys. For instance, on Windows key:Alt+F4 closes the current browser window. And there’s no way to stop it by preventing the default action in JavaScript.

For instance, the <input> below expects a phone number, so it does not accept keys except digits, +, () or -:

<script>
    function checkPhoneKey(key) {
        return (
            (key >= "0" && key <= "9") ||
            key == "+" ||
            key == "(" ||
            key == ")" ||
            key == "-"
        );
    }
</script>
<input
    onkeydown="return checkPhoneKey(event.key)"
    placeholder="Phone, please"
    type="tel"
/>

Please note that special keys, such as key:Backspace, key:Left, key:Right, key:Ctrl+V, do not work in the input. That’s a side-effect of the strict filter checkPhoneKey.

Let’s relax it a little bit:

<script>
    function checkPhoneKey(key) {
        return (
            (key >= "0" && key <= "9") ||
            key == "+" ||
            key == "(" ||
            key == ")" ||
            key == "-" ||
            key == "ArrowLeft" ||
            key == "ArrowRight" ||
            key == "Delete" ||
            key == "Backspace"
        );
    }
</script>
<input
    onkeydown="return checkPhoneKey(event.key)"
    placeholder="Phone, please"
    type="tel"
/>

Now arrows and deletion works well.

…But we still can enter anything by using a mouse and right-click + Paste. So the filter is not 100% reliable. We can just let it be like that, because most of time it works. Or an alternative approach would be to track the input event – it triggers after any modification. There we can check the new value and highlight/modify it when it’s invalid.

Legacy

In the past, there was a keypress event, and also keyCode, charCode, which properties of the event object.

There were so many browser incompatibilities while working with them, that developers of the specification had no way, other than deprecating all of them and creating new, modern events (described above in this chapter). The old code still works, as browsers keep supporting them, but there’s totally no need to use those any more.

Summary

Pressing a key always generates a keyboard event, be it symbol keys or special keys like key:Shift or key:Ctrl and so on. The only exception is key:Fn key that sometimes presents on a laptop keyboard. There’s no keyboard event for it, because it’s often implemented on lower level than OS.

Keyboard events:

  • keydown – on pressing the key (auto-repeats if the key is pressed for long),
  • keyup – on releasing the key.

Main keyboard event properties:

  • code – the “key code” ("KeyA", "ArrowLeft" and so on), specific to the physical location of the key on keyboard.
  • key – the character ("A", "a" and so on), for non-character keys, such as key:Esc, usually has the same value as code.

In the past, keyboard events were sometimes used to track user input in form fields. That’s not reliable, because the input can come from various sources. We have input and change events to handle any input (covered later in the chapter info:events-change-input). They trigger after any kind of input, including copy-pasting or speech recognition.

We should use keyboard events when we really want keyboard. For example, to react on hotkeys or special keys.

Scrolling

The scroll event allows to react on a page or element scrolling. There are quite a few good things we can do here.

For instance:

  • Show/hide additional controls or information depending on where in the document the user is.
  • Load more data when the user scrolls down till the end of the page.

Here’s a small function to show the current scroll:

window.addEventListener("scroll", function () {
    document.getElementById("showScroll").innerHTML = window.pageYOffset + "px";
});
In action: Current scroll = <b id="showScroll">scroll the window</b>

The scroll event works both on the window and on scrollable elements.

Prevent scrolling

How do we make something unscrollable?

We can’t prevent scrolling by using event.preventDefault() in onscroll listener, because it triggers after the scroll has already happened.

But we can prevent scrolling by event.preventDefault() on an event that causes the scroll, for instance keydown event for key:pageUp and key:pageDown.

If we add an event handler to these events and event.preventDefault() in it, then the scroll won’t start.

There are many ways to initiate a scroll, so it’s more reliable to use CSS, overflow property.

Here are few tasks that you can solve or look through to see the applications on onscroll.

Form properties and methods

Forms and control elements, such as <input> have a lot of special properties and events.

Working with forms will be much more convenient when we learn them.

Document forms are members of the special collection document.forms.

That’s a so-called “named collection”: it’s both named and ordered. We can use both the name or the number in the document to get the form.

document.forms.my - the form with name="my"
document.forms[0] - the first form in the document

When we have a form, then any element is available in the named collection form.elements.

For instance:

<form name="my">
    <input name="one" value="1" />
    <input name="two" value="2" />
</form>

<script>
    // get the form
    let form = document.forms.my; // <form name="my"> element

    // get the element
    let elem = form.elements.one; // <input name="one"> element

    alert(elem.value); // 1
</script>

There may be multiple elements with the same name, that’s often the case with radio buttons.

In that case form.elements[name] is a collection, for instance:

<form>
    <input type="radio" name="age" value="10" />
    <input type="radio" name="age" value="20" />
</form>

<script>
    let form = document.forms[0];

    let ageElems = form.elements.age;

    alert(ageElems[0]); // [object HTMLInputElement]
</script>

These navigation properties do not depend on the tag structure. All control elements, no matter how deep they are in the form, are available in form.elements.

A form may have one or many <fieldset> elements inside it. They also have elements property that lists form controls inside them.

For instance:

<body>
    <form id="form">
        <fieldset name="userFields">
            <legend>info</legend>
            <input name="login" type="text" />
        </fieldset>
    </form>

    <script>
        alert(form.elements.login); // <input name="login">

        let fieldset = form.elements.userFields;
        alert(fieldset); // HTMLFieldSetElement

        // we can get the input by name both from the form and from the fieldset
        alert(fieldset.elements.login == form.elements.login); // true
    </script>
</body>

There’s a shorter notation: we can access the element as form[index/name].

In other words, instead of form.elements.login we can write form.login.

That also works, but there’s a minor issue: if we access an element, and then change its name, then it is still available under the old name (as well as under the new one).

That’s easy to see in an example:

<form id="form">
    <input name="login" />
</form>

<script>
    alert(form.elements.login == form.login); // true, the same <input>

    form.login.name = "username"; // change the name of the input

    // form.elements updated the name:
    alert(form.elements.login); // undefined
    alert(form.elements.username); // input

    // form allows both names: the new one and the old one
    alert(form.username == form.login); // true
</script>

That’s usually not a problem, because we rarely change names of form elements.

Backreference: element.form

For any element, the form is available as element.form. So a form references all elements, and elements reference the form.

Here’s the picture:

For instance:

<form id="form">
    <input type="text" name="login" />
</form>

<script>
    // form -> element
    let login = form.login;

    // element -> form
    alert(login.form); // HTMLFormElement
</script>

Form elements

Let’s talk about form controls.

input and textarea

We can access their value as input.value (string) or input.checked (boolean) for checkboxes.

Like this:

input.value = "New value";
textarea.value = "New text";

input.checked = true; // for a checkbox or radio button

Please note that even though <textarea>...</textarea> holds its value as nested HTML, we should never use textarea.innerHTML to access it.

It stores only the HTML that was initially on the page, not the current value.

select and option

A <select> element has 3 important properties:

  1. select.options – the collection of <option> subelements,
  2. select.value – the value of the currently selected <option>,
  3. select.selectedIndex – the number of the currently selected <option>.

They provide three different ways of setting a value for a <select>:

  1. Find the corresponding <option> element and set option.selected to true.
  2. Set select.value to the value.
  3. Set select.selectedIndex to the number of the option.

The first way is the most obvious, but (2) and (3) are usually more convenient.

Here is an example:

<select id="select">
    <option value="apple">Apple</option>
    <option value="pear">Pear</option>
    <option value="banana">Banana</option>
</select>

<script>
    // all three lines do the same thing
    select.options[2].selected = true;
    select.selectedIndex = 2;
    select.value = "banana";
</script>

Unlike most other controls, <select> allows to select multiple options at once if it has multiple attribute. That’s feature is rarely used. In that case we need to use the first way: add/remove the selected property from <option> subelements.

We can get their collection as select.options, for instance:

<select id="select" multiple>
    <option value="blues" selected>Blues</option>
    <option value="rock" selected>Rock</option>
    <option value="classic">Classic</option>
</select>

<script>
    // get all selected values from multi-select
    let selected = Array.from(select.options)
        .filter((option) => option.selected)
        .map((option) => option.value);

    alert(selected); // blues,rock
</script>

The full specification of the <select> element is available in the specification https://html.spec.whatwg.org/multipage/forms.html#the-select-element.

new Option

This is rarely used on its own. But there’s still an interesting thing.

In the specification there’s a nice short syntax to create <option> elements:

option = new Option(text, value, defaultSelected, selected);

Parameters:

  • text – the text inside the option,
  • value – the option value,
  • defaultSelected – if true, then selected HTML-attribute is created,
  • selected – if true, then the option is selected.

There may be a small confusion about defaultSelected and selected. That’s simple: defaultSelected sets HTML-attribute, that we can get using option.getAttribute('selected'). And selected - whether the option is selected or not, that’s more important. Usually both values are either set to true or not set (same as false).

For instance:

let option = new Option("Text", "value");
// creates <option value="value">Text</option>

The same element selected:

let option = new Option("Text", "value", true, true);

Option elements have properties:

option.selected
Is the option selected.
option.index
The number of the option among the others in its <select>.
option.text
Text content of the option (seen by the visitor).

References

Summary

Form navigation:

document.forms
A form is available as document.forms[name/index].
form.elements
Form elements are available as form.elements[name/index], or can use just form[name/index]. The elements property also works for <fieldset>.
element.form
Elements reference their form in the form property.

Value is available as input.value, textarea.value, select.value etc, or input.checked for checkboxes and radio buttons.

For <select> we can also get the value by the index select.selectedIndex or through the options collection select.options.

These are the basics to start working with forms. We’ll meet many examples further in the tutorial.

In the next chapter we’ll cover focus and blur events that may occur on any element, but are mostly handled on forms.

Focusing: focus/blur

An element receives a focus when the user either clicks on it or uses the key:Tab key on the keyboard. There’s also an autofocus HTML attribute that puts the focus into an element by default when a page loads and other means of getting a focus.

Focusing on an element generally means: “prepare to accept the data here”, so that’s the moment when we can run the code to initialize the required functionality.

The moment of losing the focus (“blur”) can be even more important. That’s when a user clicks somewhere else or presses key:Tab to go to the next form field, or there are other means as well.

Losing the focus generally means: “the data has been entered”, so we can run the code to check it or even to save it to the server and so on.

There are important peculiarities when working with focus events. We’ll do the best to cover them further on.

Events focus/blur

The focus event is called on focusing, and blur – when the element loses the focus.

Let’s use them for validation of an input field.

In the example below:

  • The blur handler checks if the field the email is entered, and if not – shows an error.
  • The focus handler hides the error message (on blur it will be checked again):
<style>
    .invalid {
        border-color: red;
    }
    #error {
        color: red;
    }
</style>

Your email please: <input type="email" id="input" />

<div id="error"></div>

<script>
    input.onblur = function () {
        if (!input.value.includes("@")) {
            // not email
            input.classList.add("invalid");
            error.innerHTML = "Please enter a correct email.";
        }
    };

    input.onfocus = function () {
        if (this.classList.contains("invalid")) {
            // remove the "error" indication, because the user wants to re-enter something
            this.classList.remove("invalid");
            error.innerHTML = "";
        }
    };
</script>

Modern HTML allows us to do many validations using input attributes: required, pattern and so on. And sometimes they are just what we need. JavaScript can be used when we want more flexibility. Also we could automatically send the changed value to the server if it’s correct.

Methods focus/blur

Methods elem.focus() and elem.blur() set/unset the focus on the element.

For instance, let’s make the visitor unable to leave the input if the value is invalid:

<style>
    .error {
        background: red;
    }
</style>

Your email please: <input type="email" id="input" />
<input
    type="text"
    style="width:220px"
    placeholder="make email invalid and try to focus here"
/>

<script>
    input.onblur = function () {
        if (!this.value.includes("@")) {
            // not email
            // show the error
            this.classList.add("error");

            // ...and put the focus back
            input.focus();
        } else {
            this.classList.remove("error");
        }
    };
</script>

It works in all browsers except Firefox (bug).

If we enter something into the input and then try to use key:Tab or click away from the <input>, then onblur returns the focus back.

Please note that we can’t “prevent losing focus” by calling event.preventDefault() in onblur, because onblur works after the element lost the focus.

A focus loss can occur for many reasons.

One of them is when the visitor clicks somewhere else. But also JavaScript itself may cause it, for instance:

  • An alert moves focus to itself, so it causes the focus loss at the element (blur event), and when the alert is dismissed, the focus comes back (focus event).
  • If an element is removed from DOM, then it also causes the focus loss. If it is reinserted later, then the focus doesn’t return.

These features sometimes cause focus/blur handlers to misbehave – to trigger when they are not needed.

The best recipe is to be careful when using these events. If we want to track user-initiated focus-loss, then we should avoid causing it ourselves.

Allow focusing on any element: tabindex

By default many elements do not support focusing.

The list varies a bit between browsers, but one thing is always correct: focus/blur support is guaranteed for elements that a visitor can interact with: <button>, <input>, <select>, <a> and so on.

From the other hand, elements that exist to format something, such as <div>, <span>, <table> – are unfocusable by default. The method elem.focus() doesn’t work on them, and focus/blur events are never triggered.

This can be changed using HTML-attribute tabindex.

Any element becomes focusable if it has tabindex. The value of the attribute is the order number of the element when key:Tab (or something like that) is used to switch between them.

That is: if we have two elements, the first has tabindex="1", and the second has tabindex="2", then pressing key:Tab while in the first element – moves the focus into the second one.

The switch order is: elements with tabindex from 1 and above go first (in the tabindex order), and then elements without tabindex (e.g. a regular <input>).

Elements with matching tabindex are switched in the document source order (the default order).

There are two special values:

  • tabindex="0" puts an element among those without tabindex. That is, when we switch elements, elements with tabindex=0 go after elements with tabindex ≥ 1.

    Usually it’s used to make an element focusable, but keep the default switching order. To make an element a part of the form on par with <input>.

  • tabindex="-1" allows only programmatic focusing on an element. The key:Tab key ignores such elements, but method elem.focus() works.

For instance, here’s a list. Click the first item and press key:Tab:

Click the first item and press Tab. Keep track of the order. Please note that
many subsequent Tabs can move the focus out of the iframe with the example.
<ul>
    <li tabindex="1">One</li>
    <li tabindex="0">Zero</li>
    <li tabindex="2">Two</li>
    <li tabindex="-1">Minus one</li>
</ul>

<style>
    li {
        cursor: pointer;
    }
    :focus {
        outline: 1px dashed green;
    }
</style>

The order is like this: 1 - 2 - 0. Normally, <li> does not support focusing, but tabindex full enables it, along with events and styling with :focus.

We can add tabindex from JavaScript by using the elem.tabIndex property. That has the same effect.

Delegation: focusin/focusout

Events focus and blur do not bubble.

For instance, we can’t put onfocus on the <form> to highlight it, like this:

<!-- on focusing in the form -- add the class -->
<form onfocus="this.className='focused'">
    <input type="text" name="name" value="Name" />
    <input type="text" name="surname" value="Surname" />
</form>

<style>
    .focused {
        outline: 1px solid red;
    }
</style>

The example above doesn’t work, because when user focuses on an <input>, the focus event triggers on that input only. It doesn’t bubble up. So form.onfocus never triggers.

There are two solutions.

First, there’s a funny historical feature: focus/blur do not bubble up, but propagate down on the capturing phase.

This will work:

<form id="form">
    <input type="text" name="name" value="Name" />
    <input type="text" name="surname" value="Surname" />
</form>

<style>
    .focused {
        outline: 1px solid red;
    }
</style>

<script>
    // put the handler on capturing phase (last argument true)
    form.addEventListener("focus", () => form.classList.add("focused"), true);
    form.addEventListener("blur", () => form.classList.remove("focused"), true);
</script>

Second, there are focusin and focusout events – exactly the same as focus/blur, but they bubble.

Note that they must be assigned using elem.addEventListener, not on<event>.

So here’s another working variant:

<form id="form">
    <input type="text" name="name" value="Name" />
    <input type="text" name="surname" value="Surname" />
</form>

<style>
    .focused {
        outline: 1px solid red;
    }
</style>

<script>
    form.addEventListener("focusin", () => form.classList.add("focused"));
    form.addEventListener("focusout", () => form.classList.remove("focused"));
</script>

Summary

Events focus and blur trigger on focusing/losing focus on the element.

Their specials are:

  • They do not bubble. Can use capturing state instead or focusin/focusout.
  • Most elements do not support focus by default. Use tabindex to make anything focusable.

The current focused element is available as document.activeElement.

Events: change, input, cut, copy, paste

Let’s cover various events that accompany data updates.

Event: change

The change event triggers when the element has finished changing.

For text inputs that means that the event occurs when it loses focus.

For instance, while we are typing in the text field below – there’s no event. But when we move the focus somewhere else, for instance, click on a button – there will be a change event:

<input type="text" onchange="alert(this.value)" />
<input type="button" value="Button" />

For other elements: select, input type=checkbox/radio it triggers right after the selection changes:

<select onchange="alert(this.value)">
    <option value="">Select something</option>
    <option value="1">Option 1</option>
    <option value="2">Option 2</option>
    <option value="3">Option 3</option>
</select>

Event: input

The input event triggers every time after a value is modified by the user.

Unlike keyboard events, it triggers on any value change, even those that does not involve keyboard actions: pasting with a mouse or using speech recognition to dictate the text.

For instance:

<input type="text" id="input" /> oninput: <span id="result"></span>
<script>
    input.oninput = function () {
        result.innerHTML = input.value;
    };
</script>

If we want to handle every modification of an <input> then this event is the best choice.

On the other hand, input event doesn’t trigger on keyboard input and other actions that do not involve value change, e.g. pressing arrow keys key:⇦ key:⇨ while in the input.

The input event occurs after the value is modified.

So we can’t use event.preventDefault() there – it’s just too late, there would be no effect.

Events: cut, copy, paste

These events occur on cutting/copying/pasting a value.

They belong to ClipboardEvent class and provide access to the data that is copied/pasted.

We also can use event.preventDefault() to abort the action, then nothing gets copied/pasted.

For instance, the code below prevents all such events and shows what we are trying to cut/copy/paste:

<input type="text" id="input" />
<script>
    input.oncut =
        input.oncopy =
        input.onpaste =
            function (event) {
                alert(
                    event.type +
                        " - " +
                        event.clipboardData.getData("text/plain")
                );
                return false;
            };
</script>

Please note, that it’s possible to copy/paste not just text, but everything. For instance, we can copy a file in the OS file manager, and paste it.

There’s a list of methods in the specification that can work with different data types including files, read/write to the clipboard.

But please note that clipboard is a “global” OS-level thing. Most browsers allow read/write access to the clipboard only in the scope of certain user actions for the safety, e.g. in onclick event handlers.

Also it’s forbidden to generate “custom” clipboard events with dispatchEvent in all browsers except Firefox.

Summary

Data change events:

Event Description Specials
change A value was changed. For text inputs triggers on focus loss.
input For text inputs on every change. Triggers immediately unlike change.
cut/copy/paste Cut/copy/paste actions. The action can be prevented. The event.clipboardData property gives read/write access to the clipboard.

Forms: event and method submit

The submit event triggers when the form is submitted, it is usually used to validate the form before sending it to the server or to abort the submission and process it in JavaScript.

The method form.submit() allows to initiate form sending from JavaScript. We can use it to dynamically create and send our own forms to server.

Let’s see more details of them.

Event: submit

There are two main ways to submit a form:

  1. The first – to click <input type="submit"> or <input type="image">.
  2. The second – press key:Enter on an input field.

Both actions lead to submit event on the form. The handler can check the data, and if there are errors, show them and call event.preventDefault(), then the form won’t be sent to the server.

In the form below:

  1. Go into the text field and press key:Enter.
  2. Click <input type="submit">.

Both actions show alert and the form is not sent anywhere due to return false:

<form onsubmit="alert('submit!');return false">
    First: Enter in the input field <input type="text" value="text" /><br />
    Second: Click "submit": <input type="submit" value="Submit" />
</form>

When a form is sent using key:Enter on an input field, a click event triggers on the <input type="submit">.

That’s rather funny, because there was no click at all.

Here’s the demo:

<form onsubmit="return false">
    <input type="text" size="30" value="Focus here and press enter" />
    <input type="submit" value="Submit" onclick="alert('click')" />
</form>

Method: submit

To submit a form to the server manually, we can call form.submit().

Then the submit event is not generated. It is assumed that if the programmer calls form.submit(), then the script already did all related processing.

Sometimes that’s used to manually create and send a form, like this:

let form = document.createElement("form");
form.action = "https://google.com/search";
form.method = "GET";

form.innerHTML = '<input name="q" value="test">';

// the form must be in the document to submit it
document.body.append(form);

form.submit();

Page: DOMContentLoaded, load, beforeunload, unload

The lifecycle of an HTML page has three important events:

  • DOMContentLoaded – the browser fully loaded HTML, and the DOM tree is built, but external resources like pictures <img> and stylesheets may be not yet loaded.
  • load – not only HTML is loaded, but also all the external resources: images, styles etc.
  • beforeunload/unload – the user is leaving the page.

Each event may be useful:

  • DOMContentLoaded event – DOM is ready, so the handler can lookup DOM nodes, initialize the interface.
  • load event – external resources are loaded, so styles are applied, image sizes are known etc.
  • beforeunload event – the user is leaving: we can check if the user saved the changes and ask them whether they really want to leave.
  • unload – the user almost left, but we still can initiate some operations, such as sending out statistics.

Let’s explore the details of these events.

DOMContentLoaded

The DOMContentLoaded event happens on the document object.

We must use addEventListener to catch it:

document.addEventListener("DOMContentLoaded", ready);
// not "document.onDOMContentLoaded = ..."

For instance:

<script>
    function ready() {
        alert("DOM is ready");

        // image is not yet loaded (unless was cached), so the size is 0x0
        alert(`Image size: ${img.offsetWidth}x${img.offsetHeight}`);
    }

    document.addEventListener("DOMContentLoaded", ready);
</script>

<img id="img" src="https://en.js.cx/clipart/train.gif?speed=1&cache=0" />

In the example the DOMContentLoaded handler runs when the document is loaded, so it can see all the elements, including <img> below.

But it doesn’t wait for the image to load. So alert shows zero sizes.

At first sight, the DOMContentLoaded event is very simple. The DOM tree is ready – here’s the event. There are few peculiarities though.

DOMContentLoaded and scripts

When the browser processes an HTML-document and comes across a <script> tag, it needs to execute before continuing building the DOM. That’s a precaution, as scripts may want to modify DOM, and even document.write into it, so DOMContentLoaded has to wait.

So DOMContentLoaded definitely happens after such scripts:

<script>
    document.addEventListener("DOMContentLoaded", () => {
        alert("DOM ready!");
    });
</script>

<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.3.0/lodash.js"></script>

<script>
    alert("Library loaded, inline script executed");
</script>

In the example above, we first see “Library loaded…”, and then “DOM ready!” (all scripts are executed).

There are two exceptions from this rule:

  1. Scripts with the async attribute, that we’ll cover a bit later, don’t block DOMContentLoaded.
  2. Scripts that are generated dynamically with document.createElement('script') and then added to the webpage also don’t block this event.

DOMContentLoaded and styles

External style sheets don’t affect DOM, so DOMContentLoaded does not wait for them.

But there’s a pitfall. If we have a script after the style, then that script must wait until the stylesheet loads:

<link type="text/css" rel="stylesheet" href="style.css" />
<script>
    // the script doesn't not execute until the stylesheet is loaded
    alert(getComputedStyle(document.body).marginTop);
</script>

The reason for this is that the script may want to get coordinates and other style-dependent properties of elements, like in the example above. Naturally, it has to wait for styles to load.

As DOMContentLoaded waits for scripts, it now waits for styles before them as well.

Built-in browser autofill

Firefox, Chrome and Opera autofill forms on DOMContentLoaded.

For instance, if the page has a form with login and password, and the browser remembered the values, then on DOMContentLoaded it may try to autofill them (if approved by the user).

So if DOMContentLoaded is postponed by long-loading scripts, then autofill also awaits. You probably saw that on some sites (if you use browser autofill) – the login/password fields don’t get autofilled immediately, but there’s a delay till the page fully loads. That’s actually the delay until the DOMContentLoaded event.

window.onload [#window-onload]

The load event on the window object triggers when the whole page is loaded including styles, images and other resources. This event is available via the onload property.

The example below correctly shows image sizes, because window.onload waits for all images:

<script>
    window.onload = function () {
        // same as window.addEventListener('load', (event) => {
        alert("Page loaded");

        // image is loaded at this time
        alert(`Image size: ${img.offsetWidth}x${img.offsetHeight}`);
    };
</script>

<img id="img" src="https://en.js.cx/clipart/train.gif?speed=1&cache=0" />

window.onunload

When a visitor leaves the page, the unload event triggers on window. We can do something there that doesn’t involve a delay, like closing related popup windows.

The notable exception is sending analytics.

Let’s say we gather data about how the page is used: mouse clicks, scrolls, viewed page areas, and so on.

Naturally, unload event is when the user leaves us, and we’d like to save the data on our server.

There exists a special navigator.sendBeacon(url, data) method for such needs, described in the specification https://w3c.github.io/beacon/.

It sends the data in background. The transition to another page is not delayed: the browser leaves the page, but still performs sendBeacon.

Here’s how to use it:

let analyticsData = { /* object with gathered data */ };

window.addEventListener("unload", function() {
  navigator.sendBeacon("/analytics", JSON.stringify(analyticsData));
};
  • The request is sent as POST.
  • We can send not only a string, but also forms and other formats, as described in the chapter info:fetch, but usually it’s a stringified object.
  • The data is limited by 64kb.

When the sendBeacon request is finished, the browser probably has already left the document, so there’s no way to get server response (which is usually empty for analytics).

There’s also a keepalive flag for doing such “after-page-left” requests in fetch method for generic network requests. You can find more information in the chapter info:fetch-api.

If we want to cancel the transition to another page, we can’t do it here. But we can use another event – onbeforeunload.

window.onbeforeunload [#window.onbeforeunload]

If a visitor initiated navigation away from the page or tries to close the window, the beforeunload handler asks for additional confirmation.

If we cancel the event, the browser may ask the visitor if they are sure.

You can try it by running this code and then reloading the page:

window.onbeforeunload = function () {
    return false;
};

For historical reasons, returning a non-empty string also counts as canceling the event. Some time ago browsers used to show it as a message, but as the modern specification says, they shouldn’t.

Here’s an example:

window.onbeforeunload = function () {
    return "There are unsaved changes. Leave now?";
};

The behavior was changed, because some webmasters abused this event handler by showing misleading and annoying messages. So right now old browsers still may show it as a message, but aside of that – there’s no way to customize the message shown to the user.

readyState

What happens if we set the DOMContentLoaded handler after the document is loaded?

Naturally, it never runs.

There are cases when we are not sure whether the document is ready or not. We’d like our function to execute when the DOM is loaded, be it now or later.

The document.readyState property tells us about the current loading state.

There are 3 possible values:

  • "loading" – the document is loading.
  • "interactive" – the document was fully read.
  • "complete" – the document was fully read and all resources (like images) are loaded too.

So we can check document.readyState and setup a handler or execute the code immediately if it’s ready.

Like this:

function work() {
    /*...*/
}

if (document.readyState == "loading") {
    // loading yet, wait for the event
    document.addEventListener("DOMContentLoaded", work);
} else {
    // DOM is ready!
    work();
}

There’s also the readystatechange event that triggers when the state changes, so we can print all these states like this:

// current state
console.log(document.readyState);

// print state changes
document.addEventListener("readystatechange", () =>
    console.log(document.readyState)
);

The readystatechange event is an alternative mechanics of tracking the document loading state, it appeared long ago. Nowadays, it is rarely used.

Let’s see the full events flow for the completeness.

Here’s a document with <iframe>, <img> and handlers that log events:

<script>
    log("initial readyState:" + document.readyState);

    document.addEventListener("readystatechange", () =>
        log("readyState:" + document.readyState)
    );
    document.addEventListener("DOMContentLoaded", () =>
        log("DOMContentLoaded")
    );

    window.onload = () => log("window onload");
</script>

<iframe src="iframe.html" onload="log('iframe onload')"></iframe>

<img src="http://en.js.cx/clipart/train.gif" id="img" />
<script>
    img.onload = () => log("img onload");
</script>

The working example is in the sandbox.

The typical output:

  1. [1] initial readyState:loading
  2. [2] readyState:interactive
  3. [2] DOMContentLoaded
  4. [3] iframe onload
  5. [4] img onload
  6. [4] readyState:complete
  7. [4] window onload

The numbers in square brackets denote the approximate time of when it happens. Events labeled with the same digit happen approximately at the same time (+- a few ms).

  • document.readyState becomes interactive right before DOMContentLoaded. These two things actually mean the same.
  • document.readyState becomes complete when all resources (iframe and img) are loaded. Here we can see that it happens in about the same time as img.onload (img is the last resource) and window.onload. Switching to complete state means the same as window.onload. The difference is that window.onload always works after all other load handlers.

Summary

Page load events:

  • The DOMContentLoaded event triggers on document when the DOM is ready. We can apply JavaScript to elements at this stage.
    • Script such as <script>...</script> or <script src="..."></script> block DOMContentLoaded, the browser waits for them to execute.
    • Images and other resources may also still continue loading.
  • The load event on window triggers when the page and all resources are loaded. We rarely use it, because there’s usually no need to wait for so long.
  • The beforeunload event on window triggers when the user wants to leave the page. If we cancel the event, browser asks whether the user really wants to leave (e.g we have unsaved changes).
  • The unload event on window triggers when the user is finally leaving, in the handler we can only do simple things that do not involve delays or asking a user. Because of that limitation, it’s rarely used. We can send out a network request with navigator.sendBeacon.
  • document.readyState is the current state of the document, changes can be tracked in the readystatechange event:
    • loading – the document is loading.
    • interactive – the document is parsed, happens at about the same time as DOMContentLoaded, but before it.
    • complete – the document and resources are loaded, happens at about the same time as window.onload, but before it.

Scripts: async, defer

In modern websites, scripts are often “heavier” than HTML: their download size is larger, and processing time is also longer.

When the browser loads HTML and comes across a <script>...</script> tag, it can’t continue building the DOM. It must execute the script right now. The same happens for external scripts <script src="..."></script>: the browser must wait until the script downloads, execute it, and only after process the rest of the page.

That leads to two important issues:

  1. Scripts can’t see DOM elements below them, so they can’t add handlers etc.
  2. If there’s a bulky script at the top of the page, it “blocks the page”. Users can’t see the page content till it downloads and runs:
<p>...content before script...</p>

<script src="https://javascript.info/article/script-async-defer/long.js?speed=1"></script>

<!-- This isn't visible until the script loads -->
<p>...content after script...</p>

There are some workarounds to that. For instance, we can put a script at the bottom of the page. Then it can see elements above it, and it doesn’t block the page content from showing:

<body>
    ...all content is above the script...

    <script src="https://javascript.info/article/script-async-defer/long.js?speed=1"></script>
</body>

But this solution is far from perfect. For example, the browser notices the script (and can start downloading it) only after it downloaded the full HTML document. For long HTML documents, that may be a noticeable delay.

Such things are invisible for people using very fast connections, but many people in the world still have slow internet speeds and use a far-from-perfect mobile internet connection.

Luckily, there are two <script> attributes that solve the problem for us: defer and async.

defer

The defer attribute tells the browser that it should go on working with the page, and load the script “in background”, then run the script when it loads.

Here’s the same example as above, but with defer:

<p>...content before script...</p>

<script
    defer
    src="https://javascript.info/article/script-async-defer/long.js?speed=1"
></script>

<!-- visible immediately -->
<p>...content after script...</p>
  • Scripts with defer never block the page.
  • Scripts with defer always execute when the DOM is ready, but before DOMContentLoaded event.

The following example demonstrates that:

<p>...content before scripts...</p>

<script>
    document.addEventListener("DOMContentLoaded", () =>
        alert("DOM ready after defer!")
    ); // (2)
</script>

<script
    defer
    src="https://javascript.info/article/script-async-defer/long.js?speed=1"
></script>

<p>...content after scripts...</p>
  1. The page content shows up immediately.
  2. DOMContentLoaded waits for the deferred script. It only triggers when the script (2) is downloaded and executed.

Deferred scripts keep their relative order, just like regular scripts.

So, if we have a long script first, and then a smaller one, then the latter one waits.

<script
    defer
    src="https://javascript.info/article/script-async-defer/long.js"
></script>
<script
    defer
    src="https://javascript.info/article/script-async-defer/small.js"
></script>

Browsers scan the page for scripts and download them in parallel, to improve performance. So in the example above both scripts download in parallel. The small.js probably makes it first.

But the specification requires scripts to execute in the document order, so it waits for long.js to execute.

The defer attribute is ignored if the <script> tag has no src.

async

The async attribute means that a script is completely independent:

  • The page doesn’t wait for async scripts, the contents are processed and displayed.
  • DOMContentLoaded and async scripts don’t wait for each other:
    • DOMContentLoaded may happen both before an async script (if an async script finishes loading after the page is complete)
    • …or after an async script (if an async script is short or was in HTTP-cache)
  • Other scripts don’t wait for async scripts, and async scripts don’t wait for them.

So, if we have several async scripts, they may execute in any order. Whatever loads first – runs first:

<p>...content before scripts...</p>

<script>
    document.addEventListener("DOMContentLoaded", () => alert("DOM ready!"));
</script>

<script
    async
    src="https://javascript.info/article/script-async-defer/long.js"
></script>
<script
    async
    src="https://javascript.info/article/script-async-defer/small.js"
></script>

<p>...content after scripts...</p>
  1. The page content shows up immediately: async doesn’t block it.
  2. DOMContentLoaded may happen both before and after async, no guarantees here.
  3. Async scripts don’t wait for each other. A smaller script small.js goes second, but probably loads before long.js, so runs first. That’s called a “load-first” order.

Async scripts are great when we integrate an independent third-party script into the page: counters, ads and so on, as they don’t depend on our scripts, and our scripts shouldn’t wait for them:

<!-- Google Analytics is usually added like this -->
<script async src="https://google-analytics.com/analytics.js"></script>

Dynamic scripts

We can also add a script dynamically using JavaScript:

let script = document.createElement("script");
script.src = "/article/script-async-defer/long.js";
document.body.append(script); // (*)

The script starts loading as soon as it’s appended to the document (*).

Dynamic scripts behave as “async” by default.

That is:

  • They don’t wait for anything, nothing waits for them.
  • The script that loads first – runs first (“load-first” order).
let script = document.createElement("script");
script.src = "/article/script-async-defer/long.js";

script.async = false;

document.body.append(script);

For example, here we add two scripts. Without script.async=false they would execute in load-first order (the small.js probably first). But with that flag the order is “as in the document”:

function loadScript(src) {
    let script = document.createElement("script");
    script.src = src;
    script.async = false;
    document.body.append(script);
}

// long.js runs first because of async=false
loadScript("/article/script-async-defer/long.js");
loadScript("/article/script-async-defer/small.js");

Summary

Both async and defer have one common thing: downloading of such scripts doesn’t block page rendering. So the user can read page content and get acquainted with the page immediately.

But there are also essential differences between them:

Order DOMContentLoaded
async Load-first order. Their document order doesn’t matter – which loads first Irrelevant. May load and execute while the document has not yet been fully downloaded. That happens if scripts are small or cached, and the document is long enough.
defer Document order (as they go in the document). Execute after the document is loaded and parsed (they wait if needed), right before DOMContentLoaded.

Please note that if you’re using defer, then the page is visible before the script loads.

So the user may read the page, but some graphical components are probably not ready yet.

There should be “loading” indications in the proper places, and disabled buttons should show as such, so the user can clearly see what’s ready and what’s not.

In practice, defer is used for scripts that need the whole DOM and/or their relative execution order is important. And async is used for independent scripts, like counters or ads. And their relative execution order does not matter.

Resource loading: onload and onerror

The browser allows us to track the loading of external resources – scripts, iframes, pictures and so on.

There are two events for it:

  • onload – successful load,
  • onerror – an error occurred.

Loading a script

Let’s say we need to load a third-party script and call a function that resides there.

We can load it dynamically, like this:

let script = document.createElement("script");
script.src = "my.js";

document.head.append(script);

…But how to run the function that is declared inside that script? We need to wait until the script loads, and only then we can call it.

For our own scripts we could use JavaScript modules here, but they are not widely adopted by third-party libraries.

script.onload

The main helper is the load event. It triggers after the script was loaded and executed.

For instance:

let script = document.createElement("script");

// can load any script, from any domain
script.src = "https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.3.0/lodash.js";
document.head.append(script);

script.onload = function () {
    // the script creates a helper function "_"
    alert(_); // the function is available
};

So in onload we can use script variables, run functions etc.

…And what if the loading failed? For instance, there’s no such script (error 404) or the server is down (unavailable).

script.onerror

Errors that occur during the loading of the script can be tracked in an error event.

For instance, let’s request a script that doesn’t exist:

let script = document.createElement("script");
script.src = "https://example.com/404.js"; // no such script
document.head.append(script);

script.onerror = function () {
    alert("Error loading " + this.src); // Error loading https://example.com/404.js
};

Please note that we can’t get HTTP error details here. We don’t know if it was an error 404 or 500 or something else. Just that the loading failed.

Events onload/onerror track only the loading itself.

Errors that may occur during script processing and execution are out of scope for these events. That is: if a script loaded successfully, then onload triggers, even if it has programming errors in it. To track script errors, one can use window.onerror global handler.

Other resources

The load and error events also work for other resources, basically for any resource that has an external src.

For example:

let img = document.createElement("img");
img.src = "https://js.cx/clipart/train.gif"; // (*)

img.onload = function () {
    alert(`Image loaded, size ${img.width}x${img.height}`);
};

img.onerror = function () {
    alert("Error occurred while loading image");
};

There are some notes though:

  • Most resources start loading when they are added to the document. But <img> is an exception. It starts loading when it gets a src (*).
  • For <iframe>, the iframe.onload event triggers when the iframe loading finished, both for successful load and in case of an error.

That’s for historical reasons.

Crossorigin policy

There’s a rule: scripts from one site can’t access contents of the other site. So, e.g. a script at https://facebook.com can’t read the user’s mailbox at https://gmail.com.

Or, to be more precise, one origin (domain/port/protocol triplet) can’t access the content from another one. So even if we have a subdomain, or just another port, these are different origins with no access to each other.

This rule also affects resources from other domains.

If we’re using a script from another domain, and there’s an error in it, we can’t get error details.

For example, let’s take a script error.js that consists of a single (bad) function call:

// 📁 error.js
noSuchFunction();

Now load it from the same site where it’s located:

<script>
    window.onerror = function (message, url, line, col, errorObj) {
        alert(`${message}\n${url}, ${line}:${col}`);
    };
</script>
<script src="/article/onload-onerror/crossorigin/error.js"></script>

We can see a good error report, like this:

Uncaught ReferenceError: noSuchFunction is not defined
https://javascript.info/article/onload-onerror/crossorigin/error.js, 1:1

Now let’s load the same script from another domain:

<script>
    window.onerror = function (message, url, line, col, errorObj) {
        alert(`${message}\n${url}, ${line}:${col}`);
    };
</script>
<script src="https://cors.javascript.info/article/onload-onerror/crossorigin/error.js"></script>

The report is different, like this:

Script error.
, 0:0

Details may vary depending on the browser, but the idea is the same: any information about the internals of a script, including error stack traces, is hidden. Exactly because it’s from another domain.

Why do we need error details?

There are many services (and we can build our own) that listen for global errors using window.onerror, save errors and provide an interface to access and analyze them. That’s great, as we can see real errors, triggered by our users. But if a script comes from another origin, then there’s not much information about errors in it, as we’ve just seen.

Similar cross-origin policy (CORS) is enforced for other types of resources as well.

To allow cross-origin access, the <script> tag needs to have the crossorigin attribute, plus the remote server must provide special headers.

There are three levels of cross-origin access:

  1. No crossorigin attribute – access prohibited.
  2. crossorigin="anonymous" – access allowed if the server responds with the header Access-Control-Allow-Origin with * or our origin. Browser does not send authorization information and cookies to remote server.
  3. crossorigin="use-credentials" – access allowed if the server sends back the header Access-Control-Allow-Origin with our origin and Access-Control-Allow-Credentials: true. Browser sends authorization information and cookies to remote server.

You can read more about cross-origin access in the chapter info:fetch-crossorigin. It describes the fetch method for network requests, but the policy is exactly the same.

Such thing as “cookies” is out of our current scope, but you can read about them in the chapter info:cookie.

In our case, we didn’t have any crossorigin attribute. So the cross-origin access was prohibited. Let’s add it.

We can choose between "anonymous" (no cookies sent, one server-side header needed) and "use-credentials" (sends cookies too, two server-side headers needed).

If we don’t care about cookies, then "anonymous" is the way to go:

<script>
    window.onerror = function (message, url, line, col, errorObj) {
        alert(`${message}\n${url}, ${line}:${col}`);
    };
</script>
<script
    crossorigin="anonymous"
    src="https://cors.javascript.info/article/onload-onerror/crossorigin/error.js"
></script>

Now, assuming that the server provides an Access-Control-Allow-Origin header, everything’s fine. We have the full error report.

Summary

Images <img>, external styles, scripts and other resources provide load and error events to track their loading:

  • load triggers on a successful load,
  • error triggers on a failed load.

The only exception is <iframe>: for historical reasons it always triggers load, for any load completion, even if the page is not found.

The readystatechange event also works for resources, but is rarely used, because load/error events are simpler.

Mutation observer

MutationObserver is a built-in object that observes a DOM element and fires a callback in case of changes.

We’ll first take a look at the syntax, and then explore a real-world use case, to see where such thing may be useful.

Syntax

MutationObserver is easy to use.

First, we create an observer with a callback-function:

let observer = new MutationObserver(callback);

And then attach it to a DOM node:

observer.observe(node, config);

config is an object with boolean options “what kind of changes to react on”:

  • childList – changes in the direct children of node,
  • subtree – in all descendants of node,
  • attributes – attributes of node,
  • attributeFilter – an array of attribute names, to observe only selected ones.
  • characterData – whether to observe node.data (text content),

Few other options:

  • attributeOldValue – if true, pass both the old and the new value of attribute to callback (see below), otherwise only the new one (needs attributes option),
  • characterDataOldValue – if true, pass both the old and the new value of node.data to callback (see below), otherwise only the new one (needs characterData option).

Then after any changes, the callback is executed: changes are passed in the first argument as a list of MutationRecord objects, and the observer itself as the second argument.

MutationRecord objects have properties:

  • type – mutation type, one of
    • "attributes": attribute modified
    • "characterData": data modified, used for text nodes,
    • "childList": child elements added/removed,
  • target – where the change occurred: an element for "attributes", or text node for "characterData", or an element for a "childList" mutation,
  • addedNodes/removedNodes – nodes that were added/removed,
  • previousSibling/nextSibling – the previous and next sibling to added/removed nodes,
  • attributeName/attributeNamespace – the name/namespace (for XML) of the changed attribute,
  • oldValue – the previous value, only for attribute or text changes, if the corresponding option is set attributeOldValue/characterDataOldValue.

For example, here’s a <div> with a contentEditable attribute. That attribute allows us to focus on it and edit.

<div contenteditable id="elem">Click and <b>edit</b>, please</div>

<script>
    let observer = new MutationObserver((mutationRecords) => {
        console.log(mutationRecords); // console.log(the changes)
    });

    // observe everything except attributes
    observer.observe(elem, {
        childList: true, // observe direct children
        subtree: true, // and lower descendants too
        characterDataOldValue: true, // pass old data to callback
    });
</script>

If we run this code in the browser, then focus on the given <div> and change the text inside <b>edit</b>, console.log will show one mutation:

mutationRecords = [{
  type: "characterData",
  oldValue: "edit",
  target: <text node>,
  // other properties empty
}];

If we make more complex editing operations, e.g. remove the <b>edit</b>, the mutation event may contain multiple mutation records:

mutationRecords = [{
  type: "childList",
  target: <div#elem>,
  removedNodes: [<b>],
  nextSibling: <text node>,
  previousSibling: <text node>
  // other properties empty
}, {
  type: "characterData"
  target: <text node>
  // ...mutation details depend on how the browser handles such removal
  // it may coalesce two adjacent text nodes "edit " and ", please" into one node
  // or it may leave them separate text nodes
}];

So, MutationObserver allows to react on any changes within DOM subtree.

Usage for integration

When such thing may be useful?

Imagine the situation when you need to add a third-party script that contains useful functionality, but also does something unwanted, e.g. shows ads <div class="ads">Unwanted ads</div>.

Naturally, the third-party script provides no mechanisms to remove it.

Using MutationObserver, we can detect when the unwanted element appears in our DOM and remove it.

There are other situations when a third-party script adds something into our document, and we’d like to detect, when it happens, to adapt our page, dynamically resize something etc.

MutationObserver allows to implement this.

Usage for architecture

There are also situations when MutationObserver is good from architectural standpoint.

Let’s say we’re making a website about programming. Naturally, articles and other materials may contain source code snippets.

Such snippet in an HTML markup looks like this:

...
<pre class="language-javascript"><code>
  // here's the code
  let hello = "world";
</code></pre>
...

Also we’ll use a JavaScript highlighting library on our site, e.g. Prism.js. A call to Prism.highlightElem(pre) examines the contents of such pre elements and adds into them special tags and styles for colored syntax highlighting, similar to what you see in examples here, at this page.

When exactly to run that highlighting method? We can do it on DOMContentLoaded event, or at the bottom of the page. At that moment we have our DOM ready, can search for elements pre[class*="language"] and call Prism.highlightElem on them:

// highlight all code snippets on the page
document
    .querySelectorAll('pre[class*="language"]')
    .forEach(Prism.highlightElem);

Everything’s simple so far, right? There are <pre> code snippets in HTML, we highlight them.

Now let’s go on. Let’s say we’re going to dynamically fetch materials from a server. We’ll study methods for that later in the tutorial. For now it only matters that we fetch an HTML article from a webserver and display it on demand:

let article =
    /* fetch new content from server */
    (articleElem.innerHTML = article);

The new article HTML may contain code snippets. We need to call Prism.highlightElem on them, otherwise they won’t get highlighted.

Where and when to call Prism.highlightElem for a dynamically loaded article?

We could append that call to the code that loads an article, like this:

let article =
    /* fetch new content from server */
    (articleElem.innerHTML = article);

let snippets = articleElem.querySelectorAll('pre[class*="language-"]');
snippets.forEach(Prism.highlightElem);

…But imagine, we have many places in the code where we load contents: articles, quizzes, forum posts. Do we need to put the highlighting call everywhere? That’s not very convenient, and also easy to forget.

And what if the content is loaded by a third-party module? E.g. we have a forum written by someone else, that loads contents dynamically, and we’d like to add syntax highlighting to it. No one likes to patch third-party scripts.

Luckily, there’s another option.

We can use MutationObserver to automatically detect when code snippets are inserted in the page and highlight them.

So we’ll handle the highlighting functionality in one place, relieving us from the need to integrate it.

Dynamic highlight demo

Here’s the working example.

If you run this code, it starts observing the element below and highlighting any code snippets that appear there:

let observer = new MutationObserver((mutations) => {
    for (let mutation of mutations) {
        // examine new nodes, is there anything to highlight?

        for (let node of mutation.addedNodes) {
            // we track only elements, skip other nodes (e.g. text nodes)
            if (!(node instanceof HTMLElement)) continue;

            // check the inserted element for being a code snippet
            if (node.matches('pre[class*="language-"]')) {
                Prism.highlightElement(node);
            }

            // or maybe there's a code snippet somewhere in its subtree?
            for (let elem of node.querySelectorAll('pre[class*="language-"]')) {
                Prism.highlightElement(elem);
            }
        }
    }
});

let demoElem = document.getElementById("highlight-demo");

observer.observe(demoElem, { childList: true, subtree: true });

Here, below, there’s an HTML-element and JavaScript that dynamically fills it using innerHTML.

Please run the previous code (above, observes that element), and then the code below. You’ll see how MutationObserver detects and highlights the snippet.

A demo-element with id="highlight-demo", run the code above to observe it.

The following code populates its innerHTML, that causes the MutationObserver to react and highlight its contents:

let demoElem = document.getElementById("highlight-demo");

// dynamically insert content with code snippets
demoElem.innerHTML = `A code snippet is below:
  <pre class="language-javascript"><code> let hello = "world!"; </code></pre>
  <div>Another one:</div>
  <div>
    <pre class="language-css"><code>.class { margin: 5px; } </code></pre>
  </div>
`;

Now we have MutationObserver that can track all highlighting in observed elements or the whole document. We can add/remove code snippets in HTML without thinking about it.

Additional methods

There’s a method to stop observing the node:

  • observer.disconnect() – stops the observation.

When we stop the observing, it might be possible that some changes were not processed by the observer yet.

  • observer.takeRecords() – gets a list of unprocessed mutation records, those that happened, but the callback did not handle them.

These methods can be used together, like this:

// we'd like to stop tracking changes
observer.disconnect();

// handle unprocessed some mutations
let mutationRecords = observer.takeRecords();
...

Observers use weak references to nodes internally. That is: if a node is removed from DOM, and becomes unreachable, then it becomes garbage collected.

The mere fact that a DOM node is observed doesn’t prevent the garbage collection.

Summary

MutationObserver can react on changes in DOM: attributes, added/removed elements, text content.

We can use it to track changes introduced by other parts of our code, as well as to integrate with third-party scripts.

MutationObserver can track any changes. The config “what to observe” options are used for optimizations, not to spend resources on unneeded callback invocations.

libs:

  • d3
  • domtree

Selection and Range

In this chapter we’ll cover selection in the document, as well as selection in form fields, such as <input>.

JavaScript can get the existing selection, select/deselect both as a whole or partially, remove the selected part from the document, wrap it into a tag, and so on.

You can get ready to use recipes at the end, in “Summary” section. But you’ll get much more if you read the whole chapter. The underlying Range and Selection objects are easy to grasp, and then you’ll need no recipes to make them do what you want.

Range

The basic concept of selection is Range: basically, a pair of “boundary points”: range start and range end.

Each point represented as a parent DOM node with the relative offset from its start. If the parent node is an element node, then the offset is a child number, for a text node it’s the position in the text. Examples to follow.

Let’s select something.

First, we can create a range (the constructor has no parameters):

let range = new Range();

Then we can set the selection boundaries using range.setStart(node, offset) and range.setEnd(node, offset).

For example, consider this fragment of HTML:

<p id="p">Example: <i>italic</i> and <b>bold</b></p>

Here’s its DOM structure, note that here text nodes are important for us:

Let’s select "Example: <i>italic</i>". That’s two first children of <p> (counting text nodes):

<p id="p">Example: <i>italic</i> and <b>bold</b></p>

<script>
    let range = new Range();

    range.setStart(p, 0);
    range.setEnd(p, 2);

    // toString of a range returns its content as text (without tags)
    alert(range); // Example: italic

    // apply this range for document selection (explained later)
    document.getSelection().addRange(range);
</script>
  • range.setStart(p, 0) – sets the start at the 0th child of <p> (that’s the text node "Example: ").
  • range.setEnd(p, 2) – spans the range up to (but not including) 2nd child of <p> (that’s the text node " and ", but as the end is not included, so the last selected node is <i>).

Here’s a more flexible test stand where you try more variants:

<p id="p">Example: <i>italic</i> and <b>bold</b></p>

From <input id="start" type="number" value="1" /> – To
<input id="end" type="number" value="4" />
<button id="button">Click to select</button>
<script>
    button.onclick = () => {
        let range = new Range();

        range.setStart(p, start.value);
        range.setEnd(p, end.value);

        // apply the selection, explained later
        document.getSelection().removeAllRanges();
        document.getSelection().addRange(range);
    };
</script>

E.g. selecting from 1 to 4 gives range <i>italic</i> and <b>bold</b>.

We don’t have to use the same node in setStart and setEnd. A range may span across many unrelated nodes. It’s only important that the end is after the start.

Selecting parts of text nodes

Let’s select the text partially, like this:

That’s also possible, we just need to set the start and the end as a relative offset in text nodes.

We need to create a range, that:

  • starts from position 2 in <p> first child (taking all but two first letters of “Example: “)
  • ends at the position 3 in <b> first child (taking first three letters of “bold”, but no more):
<p id="p">Example: <i>italic</i> and <b>bold</b></p>

<script>
    let range = new Range();

    range.setStart(p.firstChild, 2);
    range.setEnd(p.querySelector("b").firstChild, 3);

    alert(range); // ample: italic and bol

    // use this range for selection (explained later)
    window.getSelection().addRange(range);
</script>

The range object has following properties:

  • startContainer, startOffset – node and offset of the start,
    • in the example above: first text node inside <p> and 2.
  • endContainer, endOffset – node and offset of the end,
    • in the example above: first text node inside <b> and 3.
  • collapsed – boolean, true if the range starts and ends on the same point (so there’s no content inside the range),
    • in the example above: false
  • commonAncestorContainer – the nearest common ancestor of all nodes within the range,
    • in the example above: <p>

Range methods

There are many convenience methods to manipulate ranges.

Set range start:

  • setStart(node, offset) set start at: position offset in node
  • setStartBefore(node) set start at: right before node
  • setStartAfter(node) set start at: right after node

Set range end (similar methods):

  • setEnd(node, offset) set end at: position offset in node
  • setEndBefore(node) set end at: right before node
  • setEndAfter(node) set end at: right after node

As it was demonstrated, node can be both a text or element node: for text nodes offset skips that many of characters, while for element nodes that many child nodes.

Others:

  • selectNode(node) set range to select the whole node
  • selectNodeContents(node) set range to select the whole node contents
  • collapse(toStart) if toStart=true set end=start, otherwise set start=end, thus collapsing the range
  • cloneRange() creates a new range with the same start/end

To manipulate the content within the range:

  • deleteContents() – remove range content from the document
  • extractContents() – remove range content from the document and return as DocumentFragment
  • cloneContents() – clone range content and return as DocumentFragment
  • insertNode(node) – insert node into the document at the beginning of the range
  • surroundContents(node) – wrap node around range content. For this to work, the range must contain both opening and closing tags for all elements inside it: no partial ranges like <i>abc.

With these methods we can do basically anything with selected nodes.

Here’s the test stand to see them in action:

Click buttons to run methods on the selection, "resetExample" to reset it.

<p id="p">Example: <i>italic</i> and <b>bold</b></p>

<p id="result"></p>
<script>
    let range = new Range();

    // Each demonstrated method is represented here:
    let methods = {
        deleteContents() {
            range.deleteContents();
        },
        extractContents() {
            let content = range.extractContents();
            result.innerHTML = "";
            result.append("extracted: ", content);
        },
        cloneContents() {
            let content = range.cloneContents();
            result.innerHTML = "";
            result.append("cloned: ", content);
        },
        insertNode() {
            let newNode = document.createElement("u");
            newNode.innerHTML = "NEW NODE";
            range.insertNode(newNode);
        },
        surroundContents() {
            let newNode = document.createElement("u");
            try {
                range.surroundContents(newNode);
            } catch (e) {
                alert(e);
            }
        },
        resetExample() {
            p.innerHTML = `Example: <i>italic</i> and <b>bold</b>`;
            result.innerHTML = "";

            range.setStart(p.firstChild, 2);
            range.setEnd(p.querySelector("b").firstChild, 3);

            window.getSelection().removeAllRanges();
            window.getSelection().addRange(range);
        },
    };

    for (let method in methods) {
        document.write(
            `<div><button onclick="methods.${method}()">${method}</button></div>`
        );
    }

    methods.resetExample();
</script>

There also exist methods to compare ranges, but these are rarely used. When you need them, please refer to the spec or MDN manual.

Selection

Range is a generic object for managing selection ranges. We may create such objects, pass them around – they do not visually select anything on their own.

The document selection is represented by Selection object, that can be obtained as window.getSelection() or document.getSelection().

A selection may include zero or more ranges. At least, the Selection API specification says so. In practice though, only Firefox allows to select multiple ranges in the document by using key:Ctrl+click (key:Cmd+click for Mac).

Here’s a screenshot of a selection with 3 ranges, made in Firefox:

Other browsers support at maximum 1 range. As we’ll see, some of Selection methods imply that there may be many ranges, but again, in all browsers except Firefox, there’s at maximum 1.

Selection properties

Similar to a range, a selection has a start, called “anchor”, and the end, called “focus”.

The main selection properties are:

  • anchorNode – the node where the selection starts,
  • anchorOffset – the offset in anchorNode where the selection starts,
  • focusNode – the node where the selection ends,
  • focusOffset – the offset in focusNode where the selection ends,
  • isCollapsedtrue if selection selects nothing (empty range), or doesn’t exist.
  • rangeCount – count of ranges in the selection, maximum 1 in all browsers except Firefox.

There are many ways to select the content, depending on the user agent: mouse, hotkeys, taps on a mobile etc.

Some of them, such as a mouse, allow the same selection can be created in two directions: “left-to-right” and “right-to-left”.

If the start (anchor) of the selection goes in the document before the end (focus), this selection is said to have “forward” direction.

E.g. if the user starts selecting with mouse and goes from “Example” to “italic”:

Otherwise, if they go from the end of “italic” to “Example”, the selection is directed “backward”, its focus will be before the anchor:

That’s different from Range objects that are always directed forward: the range start can’t be after its end.

Selection events

There are events on to keep track of selection:

  • elem.onselectstart – when a selection starts on elem, e.g. the user starts moving mouse with pressed button.
    • Preventing the default action makes the selection not start.
  • document.onselectionchange – whenever a selection changes.
    • Please note: this handler can be set only on document.

Selection tracking demo

Here’s a small demo that shows selection boundaries dynamically as it changes:

<p id="p">Select me: <i>italic</i> and <b>bold</b></p>

From <input id="from" disabled /> – To <input id="to" disabled />
<script>
    document.onselectionchange = function () {
        let { anchorNode, anchorOffset, focusNode, focusOffset } =
            document.getSelection();

        from.value = `${anchorNode && anchorNode.data}:${anchorOffset}`;
        to.value = `${focusNode && focusNode.data}:${focusOffset}`;
    };
</script>

Selection getting demo

To get the whole selection:

  • As text: just call document.getSelection().toString().
  • As DOM nodes: get the underlying ranges and call their cloneContents() method (only first range if we don’t support Firefox multiselection).

And here’s the demo of getting the selection both as text and as DOM nodes:

<p id="p">Select me: <i>italic</i> and <b>bold</b></p>

Cloned: <span id="cloned"></span>
<br />
As text: <span id="astext"></span>

<script>
    document.onselectionchange = function () {
        let selection = document.getSelection();

        cloned.innerHTML = astext.innerHTML = "";

        // Clone DOM nodes from ranges (we support multiselect here)
        for (let i = 0; i < selection.rangeCount; i++) {
            cloned.append(selection.getRangeAt(i).cloneContents());
        }

        // Get as text
        astext.innerHTML += selection;
    };
</script>

Selection methods

Selection methods to add/remove ranges:

  • getRangeAt(i) – get i-th range, starting from 0. In all browsers except firefox, only 0 is used.
  • addRange(range) – add range to selection. All browsers except Firefox ignore the call, if the selection already has an associated range.
  • removeRange(range) – remove range from the selection.
  • removeAllRanges() – remove all ranges.
  • empty() – alias to removeAllRanges.

Also, there are convenience methods to manipulate the selection range directly, without Range:

  • collapse(node, offset) – replace selected range with a new one that starts and ends at the given node, at position offset.
  • setPosition(node, offset) – alias to collapse.
  • collapseToStart() - collapse (replace with an empty range) to selection start,
  • collapseToEnd() - collapse to selection end,
  • extend(node, offset) - move focus of the selection to the given node, position offset,
  • setBaseAndExtent(anchorNode, anchorOffset, focusNode, focusOffset) - replace selection range with the given start anchorNode/anchorOffset and end focusNode/focusOffset. All content in-between them is selected.
  • selectAllChildren(node) – select all children of the node.
  • deleteFromDocument() – remove selected content from the document.
  • containsNode(node, allowPartialContainment = false) – checks whether the selection contains node (partially if the second argument is true)

So, for many tasks we can call Selection methods, no need to access the underlying Range object.

For example, selecting the whole contents of the paragraph <p>:

<p id="p">Select me: <i>italic</i> and <b>bold</b></p>

<script>
    // select from 0th child of <p> to the last child
    document.getSelection().setBaseAndExtent(p, 0, p, p.childNodes.length);
</script>

The same thing using ranges:

<p id="p">Select me: <i>italic</i> and <b>bold</b></p>

<script>
    let range = new Range();
    range.selectNodeContents(p); // or selectNode(p) to select the <p> tag too

    document.getSelection().removeAllRanges(); // clear existing selection if any
    document.getSelection().addRange(range);
</script>

If the selection already exists, empty it first with removeAllRanges(). And then add ranges. Otherwise, all browsers except Firefox ignore new ranges.

The exception is some selection methods, that replace the existing selection, like setBaseAndExtent.

Selection in form controls

Form elements, such as input and textarea provide special API for selection, without Selection or Range objects. As an input value is a pure text, not HTML, there’s no need for such objects, everything’s much simpler.

Properties:

  • input.selectionStart – position of selection start (writeable),
  • input.selectionEnd – position of selection end (writeable),
  • input.selectionDirection – selection direction, one of: “forward”, “backward” or “none” (if e.g. selected with a double mouse click),

Events:

  • input.onselect – triggers when something is selected.

Methods:

  • input.select() – selects everything in the text control (can be textarea instead of input),

  • input.setSelectionRange(start, end, [direction]) – change the selection to span from position start till end, in the given direction (optional).

  • input.setRangeText(replacement, [start], [end], [selectionMode]) – replace a range of text with the new text.

    Optional arguments start and end, if provided, set the range start and end, otherwise user selection is used.

    The last argument, selectionMode, determines how the selection will be set after the text has been replaced. The possible values are:

    • "select" – the newly inserted text will be selected.
    • "start" – the selection range collapses just before the inserted text (the cursor will be immediately before it).
    • "end" – the selection range collapses just after the inserted text (the cursor will be right after it).
    • "preserve" – attempts to preserve the selection. This is the default.

Now let’s see these methods in action.

Example: tracking selection

For example, this code uses onselect event to track selection:

<textarea id="area" style="width:80%;height:60px">
Selecting in this text updates values below.
</textarea>
<br />
From <input id="from" disabled /> – To <input id="to" disabled />

<script>
    area.onselect = function () {
        from.value = area.selectionStart;
        to.value = area.selectionEnd;
    };
</script>

Please note:

  • onselect triggers when something is selected, but not when the selection is removed.
  • document.onselectionchange event should not trigger for selections inside a form control, according to the spec, as it’s not related to document selection and ranges. Some browsers generate it, but we shouldn’t rely on it.

Example: moving cursor

We can change selectionStart and selectionEnd, that sets the selection.

An important edge case is when selectionStart and selectionEnd equal each other. Then it’s exactly the cursor position. Or, to rephrase, when nothing is selected, the selection is collapsed at the cursor position.

So, by setting selectionStart and selectionEnd to the same value, we move the cursor.

For example:

<textarea id="area" style="width:80%;height:60px">
Focus on me, the cursor will be at position 10.
</textarea>

<script>
    area.onfocus = () => {
        // zero delay setTimeout to run after browser "focus" action finishes
        setTimeout(() => {
            // we can set any selection
            // if start=end, the cursor it exactly at that place
            area.selectionStart = area.selectionEnd = 10;
        });
    };
</script>

Example: modifying selection

To modify the content of the selection, we can use input.setRangeText() method. Of course, we can read selectionStart/End and, with the knowledge of the selection, change the corresponding substring of value, but setRangeText is more powerful and often more convenient.

That’s a somewhat complex method. In its simplest one-argument form it replaces the user selected range and removes the selection.

For example, here the user selection will be wrapped by *...*:

<input
    id="input"
    style="width:200px"
    value="Select here and click the button"
/>
<button id="button">Wrap selection in stars *...*</button>

<script>
    button.onclick = () => {
        if (input.selectionStart == input.selectionEnd) {
            return; // nothing is selected
        }

        let selected = input.value.slice(
            input.selectionStart,
            input.selectionEnd
        );
        input.setRangeText(`*${selected}*`);
    };
</script>

With more arguments, we can set range start and end.

In this example we find "THIS" in the input text, replace it and keep the replacement selected:

<input id="input" style="width:200px" value="Replace THIS in text" />
<button id="button">Replace THIS</button>

<script>
    button.onclick = () => {
        let pos = input.value.indexOf("THIS");
        if (pos >= 0) {
            input.setRangeText("*THIS*", pos, pos + 4, "select");
            input.focus(); // focus to make selection visible
        }
    };
</script>

Example: insert at cursor

If nothing is selected, or we use equal start and end in setRangeText, then the new text is just inserted, nothing is removed.

We can also insert something “at the cursor” using setRangeText.

Here’s a button that inserts "HELLO" at the cursor position and puts the cursor immediately after it. If the selection is not empty, then it gets replaced (we can detect it by comparing selectionStart!=selectionEnd and do something else instead):

<input id="input" style="width:200px" value="Text Text Text Text Text" />
<button id="button">Insert "HELLO" at cursor</button>

<script>
    button.onclick = () => {
        input.setRangeText(
            "HELLO",
            input.selectionStart,
            input.selectionEnd,
            "end"
        );
        input.focus();
    };
</script>

Making unselectable

To make something unselectable, there are three ways:

  1. Use CSS property user-select: none.

    <style>
        #elem {
            user-select: none;
        }
    </style>
    <div>
        Selectable
        <div id="elem">Unselectable</div>
        Selectable
    </div>
    

    This doesn’t allow the selection to start at elem. But the user may start the selection elsewhere and include elem into it.

    Then elem will become a part of document.getSelection(), so the selection actually happens, but its content is usually ignored in copy-paste.

  2. Prevent default action in onselectstart or mousedown events.

    <div>
        Selectable
        <div id="elem">Unselectable</div>
        Selectable
    </div>
    
    <script>
        elem.onselectstart = () => false;
    </script>
    

    This prevents starting the selection on elem, but the visitor may start it at another element, then extend to elem.

    That’s convenient when there’s another event handler on the same action that triggers the select (e.g. mousedown). So we disable the selection to avoid conflict, still allowing elem contents to be copied.

  3. We can also clear the selection post-factum after it happens with document.getSelection().empty(). That’s rarely used, as this causes unwanted blinking as the selection appears-disappears.

References

Summary

We covered two different APIs for selections:

  1. For document: Selection and Range objects.
  2. For input, textarea: additional methods and properties.

The second API is very simple, as it works with text.

The most used recipes are probably:

  1. Getting the selection:

    let selection = document.getSelection();
    
    let cloned = /* element to clone the selected nodes to */;
    
    // then apply Range methods to selection.getRangeAt(0)
    // or, like here, to all ranges to support multi-select
    for (let i = 0; i < selection.rangeCount; i++) {
      cloned.append(selection.getRangeAt(i).cloneContents());
    }
    
  2. Setting the selection:

    let selection = document.getSelection();
    
    // directly:
    selection.setBaseAndExtent(...from...to...);
    
    // or we can create a range and:
    selection.removeAllRanges();
    selection.addRange(range);
    

And finally, about the cursor. The cursor position in editable elements, like <textarea> is always at the start or the end of the selection. We can use it to get cursor position or to move the cursor by setting elem.selectionStart and elem.selectionEnd.

Event loop: microtasks and macrotasks

Browser JavaScript execution flow, as well as in Node.js, is based on an event loop.

Understanding how event loop works is important for optimizations, and sometimes for the right architecture.

In this chapter we first cover theoretical details about how things work, and then see practical applications of that knowledge.

Event Loop

The concept of event loop is very simple. There’s an endless loop, when JavaScript engine waits for tasks, executes them and then sleeps waiting for more tasks.

The general algorithm of the engine:

  1. While there are tasks:
    • execute them, starting with the oldest task.
  2. Sleep until a task appears, then go to 1.

That’s a formalization for what we see when browsing a page. JavaScript engine does nothing most of the time, only runs if a script/handler/event activates.

Examples of tasks:

  • When an external script <script src="..."> loads, the task is to execute it.
  • When a user moves their mouse, the task is to dispatch mousemove event and execute handlers.
  • When the time is due for a scheduled setTimeout, the task is to run its callback.
  • …and so on.

Tasks are set – the engine handles them – then waits for more tasks (while sleeping and consuming close to zero CPU).

It may happen that a task comes while the engine is busy, then it’s enqueued.

The tasks form a queue, so-called “macrotask queue” (v8 term):

For instance, while the engine is busy executing a script, a user may move their mouse causing mousemove, and setTimeout may be due and so on, these tasks form a queue, as illustrated on the picture above.

Tasks from the queue are processed on “first come – first served” basis. When the engine browser is done with the script, it handles mousemove event, then setTimeout handler, and so on.

So far, quite simple, right?

Two more details:

  1. Rendering never happens while the engine executes a task. Doesn’t matter if the task takes a long time. Changes to DOM are painted only after the task is complete.
  2. If a task takes too long, the browser can’t do other tasks, process user events, so after a time it raises an alert like “Page Unresponsive” suggesting to kill the task with the whole page. That happens when there are a lot of complex calculations or a programming error leading to infinite loop.

That was a theory. Now let’s see how we can apply that knowledge.

Use-case 1: splitting CPU-hungry tasks

Let’s say we have a CPU-hungry task.

For example, syntax-highlighting (used to colorize code examples on this page) is quite CPU-heavy. To highlight the code, it performs the analysis, creates many colored elements, adds them to the document – for a large amount of text that takes a lot of time.

While the engine is busy with syntax highlighting, it can’t do other DOM-related stuff, process user events, etc. It may even cause the browser to “hiccup” or even “hang” for a bit, which is unacceptable.

We can avoid problems by splitting the big task into pieces. Highlight first 100 lines, then schedule setTimeout (with zero-delay) for the next 100 lines, and so on.

To demonstrate this approach, for the sake of simplicity, instead of text-highlighting, let’s take a function that counts from 1 to 1000000000.

If you run the code below, the engine will “hang” for some time. For server-side JS that’s clearly noticeable, and if you are running it in-browser, then try to click other buttons on the page – you’ll see that no other events get handled until the counting finishes.

let i = 0;

let start = Date.now();

function count() {
    // do a heavy job
    for (let j = 0; j < 1e9; j++) {
        i++;
    }

    alert("Done in " + (Date.now() - start) + "ms");
}

count();

The browser may even show a “the script takes too long” warning.

Let’s split the job using nested setTimeout calls:

let i = 0;

let start = Date.now();

function count() {
    // do a piece of the heavy job (*)
    do {
        i++;
    } while (i % 1e6 != 0);

    if (i == 1e9) {
        alert("Done in " + (Date.now() - start) + "ms");
    } else {
        setTimeout(count); // schedule the new call (**)
    }
}

count();

Now the browser interface is fully functional during the “counting” process.

A single run of count does a part of the job (*), and then re-schedules itself (**) if needed:

  1. First run counts: i=1...1000000.
  2. Second run counts: i=1000001..2000000.
  3. …and so on.

Now, if a new side task (e.g. onclick event) appears while the engine is busy executing part 1, it gets queued and then executes when part 1 finished, before the next part. Periodic returns to the event loop between count executions provide just enough “air” for the JavaScript engine to do something else, to react to other user actions.

The notable thing is that both variants – with and without splitting the job by setTimeout – are comparable in speed. There’s not much difference in the overall counting time.

To make them closer, let’s make an improvement.

We’ll move the scheduling to the beginning of the count():

let i = 0;

let start = Date.now();

function count() {
    // move the scheduling to the beginning
    if (i < 1e9 - 1e6) {
        setTimeout(count); // schedule the new call
    }

    do {
        i++;
    } while (i % 1e6 != 0);

    if (i == 1e9) {
        alert("Done in " + (Date.now() - start) + "ms");
    }
}

count();

Now when we start to count() and see that we’ll need to count() more, we schedule that immediately, before doing the job.

If you run it, it’s easy to notice that it takes significantly less time.

Why?

That’s simple: as you remember, there’s the in-browser minimal delay of 4ms for many nested setTimeout calls. Even if we set 0, it’s 4ms (or a bit more). So the earlier we schedule it - the faster it runs.

Finally, we’ve split a CPU-hungry task into parts - now it doesn’t block the user interface. And its overall execution time isn’t much longer.

Use case 2: progress indication

Another benefit of splitting heavy tasks for browser scripts is that we can show progress indication.

Usually the browser renders after the currently running code is complete. Doesn’t matter if the task takes a long time. Changes to DOM are painted only after the task is finished.

On one hand, that’s great, because our function may create many elements, add them one-by-one to the document and change their styles – the visitor won’t see any “intermediate”, unfinished state. An important thing, right?

Here’s the demo, the changes to i won’t show up until the function finishes, so we’ll see only the last value:

<div id="progress"></div>

<script>
    function count() {
        for (let i = 0; i < 1e6; i++) {
            i++;
            progress.innerHTML = i;
        }
    }

    count();
</script>

…But we also may want to show something during the task, e.g. a progress bar.

If we split the heavy task into pieces using setTimeout, then changes are painted out in-between them.

This looks prettier:

<div id="progress"></div>

<script>
    let i = 0;

    function count() {
        // do a piece of the heavy job (*)
        do {
            i++;
            progress.innerHTML = i;
        } while (i % 1e3 != 0);

        if (i < 1e7) {
            setTimeout(count);
        }
    }

    count();
</script>

Now the <div> shows increasing values of i, a kind of a progress bar.

Use case 3: doing something after the event

In an event handler we may decide to postpone some actions until the event bubbled up and was handled on all levels. We can do that by wrapping the code in zero delay setTimeout.

In the chapter info:dispatch-events we saw an example: custom event menu-open is dispatched in setTimeout, so that it happens after the “click” event is fully handled.

menu.onclick = function () {
    // ...

    // create a custom event with the clicked menu item data
    let custo