Performance

  • Revision slug: Storage/Performance
  • Revision title: Performance
  • Revision id: 60918
  • Created:
  • Creator: BrettWilson
  • Is current revision? No
  • Comment: /* Caching */

Revision Content

Introduction

mozStorage uses sqlite as the database backend. It has generally good performance for a small embedded database. However, many things can cause database operations to be slow.

Queries

Careful reordering of the SQL statement or creating the proper indices can often give much better performance. See the sqlite optimizer overview on the sqlite web site for information on how sqlite uses indices and executes statements.

You might also want to try the "explain" feature on your statements to see whether they are using the indices you expect. Type "explain" followed by your statement to see the query plan. For example, explain select * from moz_history; The output is hard to read, but you should be able to see whether indices are being used.
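
As a sketch, the following session creates an index and then checks whether a query would use it. The index name and the url column on moz_history are assumptions for illustration only; adjust them to your actual schema:

  -- hypothetical index; assumes moz_history has a url column
  create index moz_history_url_index on moz_history (url);

  -- inspect the plan; look for references to the index in the output
  explain select * from moz_history where url = 'http://www.mozilla.org/';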

You can download the command line tool from the sqlite download page. Be sure your command line tool is the same version as (or possibly newer than) the one Mozilla is using. Currently (2006-04-10) Mozilla uses sqlite 3.3.4, but the latest precompiled version of the command line tools is not available for all platforms. An older tool will not recognise the file format and will report errors such as "database is encrypted". You may want to check the SQLITE_VERSION definition in {{template.Source("db/sqlite3/src/sqlite3.h")}} for the current version if you are having problems.
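
One quick way to check which library version a given sqlite3 instance was built against, without digging through headers, is to ask it directly; this works in the command line tool as well as in SQL run through mozStorage:

  -- prints the version of the sqlite library in use
  select sqlite_version();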

Caching

Sqlite has a cache of database pages in memory. It keeps pages associated with the current transaction so it can roll them back, and it also keeps recently used pages so it can run faster.

By default, it only keeps the pages in memory during a transaction (if you don't explicitly open a transaction, one will be opened for you enclosing each individual statement). At the end of a transaction, the cache is flushed. If you subsequently begin a new transaction, the pages you need will be re-read from disk (or hopefully the OS cache). This makes even simple operations block on OS file reads, which can be prohibitive on some systems with bad disk caches or networked drives.
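
For example, if you are going to run several statements in a row, wrapping them in one explicit transaction keeps the page cache alive between them instead of flushing it after each statement. The queries below are illustrative only:

  begin transaction;
  select count(*) from moz_history;      -- pages read here...
  select * from moz_history limit 10;    -- ...are still cached here
  commit;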

You can control the size of the memory cache using the cache_size pragma. This value controls the number of pages of the file that can be kept in memory at once. The page size can be set using the page_size pragma before any operations have been done on the file. You can see an example of setting the maximum cache size to be a percentage of memory in nsNavHistory::InitDB in {{template.Source("browser/components/places/src/nsNavHistory.cpp")}}.
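
A minimal sketch of the pragmas involved; the values here are arbitrary examples, not the ones nsNavHistory computes:

  -- must be set before any data has been written to the file
  pragma page_size = 4096;

  -- number of pages kept in memory; 2000 pages * 4KB is roughly 8MB of cache
  pragma cache_size = 2000;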

Disk writes

Sqlite provides the basic ACID rules of database theory:

  • Atomicity: each transaction is atomic and can not be "partially" committed.
  • Consistency: the database won't get corrupted.
  • Isolation: multiple transactions do not affect each other.
  • Durability: once a commit has returned, the data is guaranteed to be permanently stored.

The problem is that these requirements make some operations, especially commits, very slow. Each commit performs two file syncs to disk, among other file operations (see the "Atomic Commit" section of http://www.sqlite.org/php2004/slides-all.html for more information on how this works). These disk syncs are very slow and limit the speed of a commit to the rotational speed of the mechanical disk.
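
The practical consequence is that many small transactions are far slower than one large one: inserting N rows as N separate transactions costs roughly 2N disk syncs, while batching them costs two. A rough sketch, using a hypothetical table introduced only for illustration:

  -- hypothetical table used only for this example
  create table test_log (msg text);

  -- slow: without an explicit transaction each insert commits separately,
  -- paying two disk syncs per row
  insert into test_log values ('first');
  insert into test_log values ('second');

  -- much faster: one transaction, two disk syncs total for the whole batch
  begin transaction;
  insert into test_log values ('third');
  insert into test_log values ('fourth');
  commit;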

For the browser history, this overhead is unacceptably high. On some systems, the cost of committing a new page to the history database was as high as downloading the entire page (from a fast nearby page load test server) and rendering the page to the screen. As a result, Mozilla has implemented a lazy sync system.

Lazy writing

Mozilla has relaxed the ACID requirements in order to speed up commits. In particular, we have dropped durability. This means that when a commit returns, you are not guaranteed that the commit has gone through. If the power goes out right away, that commit may (or may not) be lost. However, we still support the other (ACI) requirements. This means that the database will not get corrupted. If the power goes out immediately after a commit, the transaction will behave as if it had been rolled back: the database will still be in a consistent state.

Higher commit performance is achieved by writing to the database from a separate thread (see {{template.Source("storage/src/mozStorageAsyncIO.cpp")}}, which is associated with the storage service in {{template.Source("storage/src/mozStorageService.cpp")}}). The main database thread does everything exactly as it did before. However, we have overridden the file operations so that everything comes through the AsyncIO module. This file is based on test_async.c from the sqlite distribution.

The AsyncIO module packages writes up in messages and puts them on the write thread's message queue. This write thread waits for messages and processes them as fast as it can. This means that writes, locking, and most importantly, disk syncs, only block the AsyncIO thread. Reads are done synchronously, taking into account unwritten data still in the buffer.

Shutdown

If you are doing many writes, the AsyncIO thread will fall behind. Hopefully, the application will give this thread enough time to flush before exiting. If there are still items in the write queue on shutdown, the storage service will block until all data has been written. It then goes into single-threaded mode where all operations are synchronous. This enables other services to still use the database after the storage service has received the shutdown message.

Durable transactions

There is currently no way to ensure durability for particularly important transactions where speed is less of an issue. A flush command that guarantees data has been written may be added in the future.

Vacuuming and zero-fill

Sqlite has a VACUUM command to reclaim unused space in the database. Sqlite works like a memory manager or a file system: when data is deleted, the associated bytes are marked as free but are not removed from the file. This means that the file will not shrink, and some of the deleted data may still be visible in the file. The way to work around this is to run the VACUUM command to remove this space.
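
A minimal illustration, reusing the hypothetical test_log table from the sketch above:

  -- deleting rows marks their pages as free but does not shrink the file
  delete from test_log where msg like 'old%';

  -- rewrites the database into a new file, dropping the free pages
  vacuum;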

Vacuuming is very slow. The vacuum command is essentially the same as the command line sqlite3 olddb .dump | sqlite3 newdb; mv newdb olddb. On some networked drives, vacuuming a 10MB database has been timed at over one minute. Therefore, you should avoid vacuuming whenever possible.

Some items in databases are privacy sensitive, such as deleted history items. Users expect that deleting items from their history will remove all traces of them from the database. As a result, Mozilla enables the SQLITE_SECURE_DELETE preprocessor flag. This flag causes deleted items to be filled with 0s on disk. This eliminates the need to vacuum except to reclaim disk space, and makes many operations much faster.

Zero-filling can have significant performance overhead in some situations. For example, the history service used to delete many database items at shutdown when expiring old history items. This operation is not necessarily slow on its own, but writing 0s to disk in an "ACI" database still is. This made shutdown very slow because the AsyncIO thread would block shutdown ({{template.Bug(328598)}}). Shutdown times of more than 30 seconds were seen. As a result, this bug introduced incremental history expiration, eliminating the need to write many 0s to disk on shutdown.

Unfortunately, zero-fill is a compile-time setting and can not be controlled on a per-transaction or per-connection basis. Some operations will benefit, while others will be hurt.
