Memory Management

  • Revision slug: JavaScript/Memory_Management
  • Revision title: Memory Management
  • Revision id: 305623
  • Created:
  • Creator: ethertank
  • Is current revision? No
  • Comment

Revision Content

Introduction

Low-level languages like C have manual memory management primitives such as malloc() and free(). In contrast, JavaScript allocates memory automatically when values (objects, strings, etc.) are created and frees it when they are no longer used. The latter process is called garbage collection. This "automatically" is a source of confusion: it gives JavaScript developers (and developers in other high-level languages) the impression that they can ignore memory management. This is a mistake.

Memory lifecycle

Regardless of the programming language, the memory lifecycle is almost always the same:

  1. Allocate the memory you need
  2. Use it (read, write)
  3. Release the allocated memory when it is not needed anymore

The second part is explicit in all languages. The first and last parts are explicit in low-level languages, but are mostly implicit in high-level languages like JavaScript.
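As a minimal sketch of the three steps in JavaScript (the names `buffer` and `total` are purely illustrative): allocation happens as a side effect of creating the array, use is ordinary reads and writes, and release is never written out by the program.

```javascript
// 1. Allocate: creating the array allocates memory for it.
let buffer = new Array(1000).fill(0);

// 2. Use: read from and write to the allocated memory.
buffer[0] = 42;
const total = buffer[0] + buffer[1];

// 3. Release: there is no free() call. Dropping the last reference only
// makes the array eligible for garbage collection at some later time.
buffer = null;
```

Note that setting `buffer` to null does not free anything by itself; it only removes the reference that kept the array in use.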

Allocation in JavaScript

Value initialization

In order not to bother the programmer with allocations, JavaScript allocates memory automatically when values are declared.

var n = 123; // allocates memory for a number
var s = "azerty"; // allocates memory for a string 

var o = {
  a: 1,
  b: null
}; // allocates memory for an object and contained values

var a = [1, null, "abra"]; // (like object) allocates memory for the array and contained values

function f(a){
  return a + 2;
} // allocates a function (which is a callable object)

// function expressions also allocate an object
someElement.addEventListener('click', function(){
  someElement.style.backgroundColor = 'blue';
}, false);

Allocation via function calls

Some function calls result in object allocation.

var d = new Date();
var e = document.createElement('div'); // allocates a DOM element

Some methods allocate new values or objects:

var s = "azerty";
var s2 = s.substring(0, 3); // s2 is a new string ("aze")
// Since strings are immutable values, an engine may decide not to allocate memory and just store the [0, 3) range.

var a = ["ouais ouais", "nan nan"];
var a2 = ["generation", "nan nan"];
var a3 = a.concat(a2); // new array with 4 elements: the elements of a followed by those of a2

Using values

Using values basically means reading from and writing to allocated memory. This happens when reading or writing the value of a variable or an object property, or when passing an argument to a function.
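The three kinds of use mentioned above can be sketched as follows. The object `point` and the helper `norm1` are hypothetical names chosen for illustration only.

```javascript
const point = { x: 1, y: 2 }; // allocation

const x = point.x;            // use: reading an object property
point.y = x + 10;             // use: writing an object property

// Hypothetical helper: sums the absolute values of both coordinates.
function norm1(p) {
  return Math.abs(p.x) + Math.abs(p.y);
}

// use: passing an argument; the function receives a reference to the
// same object, not a copy of it
const n = norm1(point);
```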

Release when the memory is not needed anymore

Most memory management issues occur in this phase. The hardest task here is to find when "the allocated memory is not needed any longer". In languages with manual memory management, this often requires the developer to determine where in the program a piece of memory is no longer needed and to free it.

High-level language runtimes embed a piece of software called a "garbage collector", whose job is to track memory allocation and use in order to find when a piece of allocated memory is no longer needed, in which case it automatically frees it. This process is an approximation, since the general problem of knowing whether some piece of memory is still needed is undecidable (it cannot be solved by an algorithm).

Garbage collection

As stated above, the general problem of automatically finding whether some memory "is not needed anymore" is undecidable. As a consequence, garbage collectors implement a restriction of a solution to the general problem. This section explains the notions necessary to understand the main garbage collection algorithms and their limitations.

References

The main notion garbage collection algorithms rely on is the notion of reference. Within the context of memory management, an object is said to reference another object if the former has access to the latter (either implicitly or explicitly). For instance, a JavaScript object has a reference to its prototype (implicit reference) and to its property values (explicit reference).

In this context, the notion of "object" is extended to something broader than regular JavaScript objects and also includes function scopes (or the global lexical scope).
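The kinds of reference described above can be sketched in a few lines. All names here (`proto`, `child`, `makeCounter`) are illustrative assumptions, not part of any API.

```javascript
const proto = { greet() { return "hi"; } };
const child = Object.create(proto); // implicit reference: child -> its prototype
child.data = { big: true };         // explicit reference: child -> property value

function makeCounter() {
  let count = 0;                    // lives in the scope of this call
  return function () {              // the returned function references that scope
    count += 1;
    return count;
  };
}

const counter = makeCounter();
const first = counter();            // the scope is still alive: count is now 1
const second = counter();           // count is now 2
```

The last part shows why function scopes count as "objects" for memory management purposes: as long as `counter` is referenced, the scope holding `count` must be kept.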

Reference-counting garbage collection

This is the most naive garbage collection algorithm. It reduces the definition of "an object is not needed anymore" to "an object has no other object referencing it". An object is considered garbage-collectable if there are zero references pointing at it.

Example

var o = {
  a: {
    b: 2
  }
}; // 2 objects are created. One is referenced by the other as one of its properties.
// The other is referenced by virtue of being assigned to the 'o' variable.
// Obviously, neither can be garbage-collected.

var o2 = o; // the 'o2' variable is the second thing that has a reference to the object
o = 1; // now the object that was originally in 'o' has a single reference, held by the 'o2' variable

var oa = o2.a; // reference to the 'a' property of the object
// This object now has 2 references: one as a property, the other as the 'oa' variable

o2 = "yo"; // The object that was originally in 'o' now has zero references to it.
// It can be garbage-collected.
// However, what was its 'a' property is still referenced by the 'oa' variable, so it cannot be freed.

oa = null; // What was the 'a' property of the object originally in 'o' now has zero references to it.
// It can be garbage-collected.

Limitation: cycles

This naive algorithm has a limitation: if objects reference one another (and thus form a cycle), they may be "not needed anymore" and yet never become garbage-collectable.

function f(){
  var o = {};
  var o2 = {};
  o.a = o2; // o references o2
  o2.a = o; // o2 references o

  return "azerty";
}

f();
// Two objects are created and reference one another, thus creating a cycle.
// They will not leave the function scope after the function call, so they
// are effectively useless and could be freed.
// However, the reference-counting algorithm considers that since each of the two objects
// is referenced at least once, neither can be garbage-collected.
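To make the bookkeeping concrete, here is a toy model of reference counting written in JavaScript itself. It is only a sketch: `Cell`, `retain`, `release`, and `setField` are hypothetical names, and real engines do this work internally rather than exposing anything like it.

```javascript
// Toy "object" with an explicit reference count.
class Cell {
  constructor(name) {
    this.name = name;
    this.refCount = 0;
    this.fields = new Map();
    this.freed = false;
  }
}

function retain(cell) {
  cell.refCount += 1;
}

function release(cell) {
  cell.refCount -= 1;
  if (cell.refCount === 0) {
    // No references left: free the cell and release what it referenced.
    cell.freed = true;
    for (const target of cell.fields.values()) release(target);
    cell.fields.clear();
  }
}

function setField(owner, key, target) {
  retain(target);
  const old = owner.fields.get(key);
  owner.fields.set(key, target);
  if (old) release(old);
}

// Acyclic case: freeing the parent also frees the child.
const parent = new Cell("parent");
const child = new Cell("child");
retain(parent);               // a variable refers to parent
setField(parent, "a", child); // parent.a = child
release(parent);              // the variable goes away

// Cyclic case: mirrors the function f() above.
const o = new Cell("o");
const o2 = new Cell("o2");
retain(o);                    // local variable o
retain(o2);                   // local variable o2
setField(o, "a", o2);         // o.a = o2
setField(o2, "a", o);         // o2.a = o
release(o);                   // the function returns: both locals go away
release(o2);
```

After this runs, `parent` and `child` are both marked as freed, while `o` and `o2` each still have a count of 1 because of the cycle and are never freed.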

Real life example

Internet Explorer 6 and 7 are known to have reference-counting garbage collectors for DOM objects. In these browsers, a common pattern is known to cause memory leaks systematically:

var div = document.createElement("div");
div.onclick = function(){
  doSomething();
}; // The div has a reference to the event handler via its 'onclick' property.
// The handler also has a reference to the div, since the 'div' variable can be accessed within the function scope.
// This cycle prevents both objects from being garbage-collected, which causes a memory leak.
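Under a reference-counting collector, one way to avoid this kind of leak is to break the cycle by hand before discarding the element. The sketch below assumes a hypothetical helper named `releaseElement`, and uses a plain object as a stand-in for the DOM element so that it can run outside a browser.

```javascript
// Hypothetical helper: clears the handler so the element no longer
// references the function (and, through the function's scope, itself).
function releaseElement(element) {
  element.onclick = null;
}

// A plain object stands in for document.createElement("div") here.
let div = { onclick: null };
div.onclick = function () {
  return div; // the handler's scope references the div: a cycle
};

const element = div;
releaseElement(element); // the element -> handler reference is removed
div = null;              // the variable no longer references the element
```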

Mark-and-sweep algorithm

This algorithm reduces the definition of "an object is not needed anymore" to "an object is unreachable".

This algorithm assumes the knowledge of a set of objects called roots (in JavaScript, the root is the global object). Periodically, the garbage collector starts from these roots, finds all objects that are referenced from them, then all objects referenced from those, and so on. Starting from the roots, the garbage collector thus finds all reachable objects and collects all non-reachable objects.

This algorithm is better than the previous one, since an object with zero references is necessarily unreachable. The converse is not true, as we have seen with cycles.
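The traversal described above can be sketched on a toy heap in which each node lists the nodes it references. This models the algorithm only; the `markAndSweep` function and the `refs` arrays are assumptions made for illustration and say nothing about how a real engine stores objects.

```javascript
function markAndSweep(heap, roots) {
  const marked = new Set();
  const stack = [...roots];

  // Mark phase: visit everything reachable from the roots.
  while (stack.length > 0) {
    const node = stack.pop();
    if (marked.has(node)) continue;
    marked.add(node);
    stack.push(...node.refs);
  }

  // Sweep phase: keep marked nodes, collect the rest.
  const live = heap.filter((node) => marked.has(node));
  const collected = heap.filter((node) => !marked.has(node));
  return { live, collected };
}

const globalObj = { name: "global", refs: [] };
const a = { name: "a", refs: [] };
const b = { name: "b", refs: [] };
const c = { name: "c", refs: [] };
globalObj.refs.push(a); // global -> a
b.refs.push(c);         // b -> c
c.refs.push(b);         // c -> b: a cycle that no root can reach

const result = markAndSweep([globalObj, a, b, c], [globalObj]);
const liveNames = result.live.map((node) => node.name);
const collectedNames = result.collected.map((node) => node.name);
```

Here `b` and `c` reference each other, so each has a non-zero reference count, yet both end up in `collectedNames` because neither is reachable from the root.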

As of 2012, all modern browsers ship a mark-and-sweep garbage collector. The improvements made in the field of JavaScript garbage collection over the last few years (generational, incremental, concurrent, and parallel garbage collection) are implementation improvements of this algorithm; they change neither the algorithm itself nor its reduction of the definition of when "an object is not needed anymore".

Cycles are not a problem anymore

In the first example above, after the function call returns, the two objects are no longer referenced by anything reachable from the global object. Consequently, the garbage collector will find them unreachable.

The same goes for the second example. Once the div and its handler are made unreachable from the roots, they can both be garbage-collected despite referencing each other.

Limitation: objects need to be made explicitly unreachable

Although this is marked as a limitation, it is rarely encountered in practice, which is why few developers pay much attention to garbage collection.
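One situation where the limitation does show up is a long-lived container that keeps references to data the program has finished with. The sketch below uses a hypothetical module-level `cache` and a hypothetical `loadReport` function; the point is only that the entries stay reachable until the program removes them.

```javascript
// Hypothetical long-lived cache, reachable for the lifetime of the program.
const cache = new Map();

function loadReport(id) {
  // Stand-in for an expensive, memory-heavy result.
  const report = { id, rows: new Array(10000).fill(id) };
  cache.set(id, report);
  return report;
}

loadReport("q1");
loadReport("q2");

// The reports are no longer needed, but they are still reachable through
// `cache`, so a mark-and-sweep collector has to keep them.
const sizeBefore = cache.size;

// Making them unreachable is the program's job.
cache.delete("q1"); // drop a single entry
cache.clear();      // or drop everything at once
const sizeAfter = cache.size;
```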

See also

  • IBM article on "Memory leak patterns in JavaScript" (2007): http://www.ibm.com/developerworks/web/library/wa-memleak/
  • Kangax article on how to register event handlers and avoid memory leaks (2010): http://msdn.microsoft.com/en-us/magazine/ff728624.aspx