Tuesday, January 5, 2010
Understanding Exchange 2010 Storage Architecture: Part 1
By Mahmoud Magdy
In this article, we will take a close look at the Exchange 2010 storage architecture. First, let us go back to the basics by reviewing how ESE stores data, and then delve into the enhancements that were introduced with Exchange 2010.
A brief review of the ESE basics: Microsoft’s Extensible Storage Engine (ESE) is an ISAM (Indexed Sequential Access Method) data storage technology. Its purpose is to allow applications to store and retrieve data via indexed and sequential access. ESE is suitable for server applications because it supports highly concurrent transactions, yet it is lightweight enough to work well for auxiliary applications. Worried about losing stored data if the system crashes? ESE provides transacted data update and retrieval, and its crash recovery mechanism keeps the data consistent should the system crash.
As you all know, ESE relies on the B+ tree in order to store data. The following diagram features a simple tree that illustrates how information is stored in the data tree:
Since sorting and searching through mounds of data is time-consuming, ESE stores data in trees, which keep the data ordered and speed up searches. ESE uses the B+ tree, a refinement of the regular tree model, to allow faster and more efficient access to sorted data.
There are two types of data sorting: internal and external. Internal sorting means that the system can hold and sort the data entirely in memory. However, since not every data set fits in memory, the system must often store the data on disk and sort it there (external sorting), and this is where the B+ tree comes in.
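The internal/external distinction can be illustrated with a small external merge sort. This is a generic sketch, not ESE code: the function name `external_sort` and the `chunk_size` parameter are made up for the example, and temporary files stand in for the disk.

```python
import heapq
import os
import tempfile


def external_sort(values, chunk_size):
    """Sort integers by building small sorted runs on disk, then merging them.

    Only chunk_size items are held in memory while the runs are built.
    For simplicity the merged result is returned as a list; a real
    external sort would stream it back to disk instead.
    """
    run_paths = []

    def write_run(chunk):
        # Internal sort: this chunk is small enough to sort in memory.
        chunk.sort()
        fd, path = tempfile.mkstemp(suffix=".run")
        with os.fdopen(fd, "w") as f:
            f.writelines(f"{v}\n" for v in chunk)
        run_paths.append(path)

    chunk = []
    for v in values:
        chunk.append(v)
        if len(chunk) == chunk_size:
            write_run(chunk)
            chunk = []
    if chunk:
        write_run(chunk)

    # External phase: read the sorted runs sequentially and merge them.
    files = [open(p) for p in run_paths]
    try:
        streams = [map(int, f) for f in files]
        return list(heapq.merge(*streams))
    finally:
        for f in files:
            f.close()
        for p in run_paths:
            os.remove(p)
```

Note that the merge phase reads each run front to back, which is exactly the kind of sequential disk access the rest of this article keeps coming back to.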
Data in the ESE is stored based on the following hierarchy:
- A property is created and placed in a table record. Keep in mind that MAPI uses properties to define data and its structure at the lowest level.
- Multiple properties are placed in a record.
- The record is stored on a node, and a corresponding key is used both to index the record and to access it quickly. One thing to remember is that the leaf nodes (the end nodes) are logically linked together, which allows horizontal traversal across the leaves of the B+ tree.
- A record is placed into lines, which are then stored on a page, with the page being the smallest unit that the database reads from or writes to disk. Page sizes in previous versions of Exchange: in Exchange 2003 the page size was 4 KB. That number doubled to 8 KB in Exchange 2007, and then quadrupled to 32 KB in Exchange 2010.
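To make the hierarchy above more concrete, here is a heavily simplified sketch of keyed records stored in linked leaf nodes. It is not how ESE is implemented: the tree is built once from sorted data, it has a single index level, and every name in it (`Leaf`, `build`, `lookup`, `range_scan`, `leaf_capacity`) is hypothetical. It assumes at least one record.

```python
from bisect import bisect_right


class Leaf:
    """A leaf node: sorted keys, their records, and a link to the next leaf."""

    def __init__(self, keys, records):
        self.keys = keys
        self.records = records
        self.next = None  # logical link to the right-hand sibling


def build(records, leaf_capacity):
    """Bulk-load a dict of key -> record into linked leaves plus one index level."""
    items = sorted(records.items())
    leaves = []
    for i in range(0, len(items), leaf_capacity):
        part = items[i:i + leaf_capacity]
        leaves.append(Leaf([k for k, _ in part], [r for _, r in part]))
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    # The index holds the first key of every leaf except the leftmost one.
    separators = [leaf.keys[0] for leaf in leaves[1:]]
    return separators, leaves


def find_leaf(separators, leaves, key):
    """Use the index to pick the one leaf that could contain the key."""
    return leaves[bisect_right(separators, key)]


def lookup(separators, leaves, key):
    """Return the record stored under key, or None if it is absent."""
    leaf = find_leaf(separators, leaves, key)
    if key in leaf.keys:
        return leaf.records[leaf.keys.index(key)]
    return None


def range_scan(separators, leaves, low, high):
    """Collect records with low <= key <= high by walking the leaf links."""
    leaf = find_leaf(separators, leaves, low)
    found = []
    while leaf is not None:
        for k, r in zip(leaf.keys, leaf.records):
            if k > high:
                return found
            if k >= low:
                found.append(r)
        leaf = leaf.next  # horizontal move to the neighbouring leaf
    return found
```

For example, after `separators, leaves = build({10: "a", 20: "b", 30: "c", 40: "d", 50: "e"}, leaf_capacity=2)`, a call to `range_scan(separators, leaves, 20, 40)` descends through the index only once and then follows the leaf links to collect the matching records.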
How did Microsoft improve the storage engine in Exchange 2010?
Exchange 2007 introduced significant enhancements to storage usage and optimization; however, Microsoft wanted to improve on these further with the release of Exchange 2010. While doing preliminary research to determine which areas of storage use and optimization most needed attention, Microsoft found that enterprises faced several challenges with current storage technologies, including but not limited to:
- Random IO and disk limits: Current disks deliver limited throughput for random IOs; however, most current systems can handle several hundred requests when the IOs are sequential.
- Storage Design flexibility: As email communication increases, enterprises are continually demanding improved and flexible options for storing users’ growing amounts of data.
- Using SATA disks and JBOD technologies: Enterprises were held to the capacity limits of SAS/SCSI disks, whereas 2 TB SATA disks are now available (provided that Exchange can work within the limited throughput of a SATA disk).
Task 1: change the ESE storage scheme:

In previous versions of Exchange, as illustrated in the first diagram, there were multiple tables per database that contained the users’ data. As figure 2 shows, Exchange 2007 kept multiple tables (for example: mailbox table, folders table, messages table, etc.) per mailbox database. Thus, in order to open a user’s mailbox, Exchange had to perform multiple small IOs.
In Exchange 2010, Microsoft moved to a table per mailbox, making it faster and easier to open a user’s mailbox. Opening a mailbox and reading specific email messages stored inside now requires fewer, larger IOs. This is because the underlying storage design was modified in Exchange 2010 in order to reduce IOPS (input/output operations per second). Exchange 2010 dramatically reduced IOPS: a full 70% reduction over Exchange 2007 and a 90% reduction over Exchange 2003.
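A quick back-of-the-envelope check shows what the two percentages imply relative to each other. The sketch below simply normalizes the Exchange 2003 IOPS level to 1.0; the function name `implied_levels` is made up for the example, and the resulting figures are arithmetic on the two numbers quoted above, not measurements.

```python
def implied_levels(reduction_vs_2007=0.70, reduction_vs_2003=0.90):
    """Relative IOPS per version, with Exchange 2003 normalized to 1.0."""
    level_2003 = 1.0
    # A 90% reduction over 2003 leaves 10% of the 2003 level.
    level_2010 = level_2003 * (1.0 - reduction_vs_2003)
    # 2010 is also 30% of the 2007 level, so 2007 = 2010 / 0.30.
    level_2007 = level_2010 / (1.0 - reduction_vs_2007)
    return {"2003": level_2003, "2007": level_2007, "2010": level_2010}


levels = implied_levels()
for version, level in levels.items():
    print(f"Exchange {version}: {level:.2f} x the Exchange 2003 IOPS level")
```

In other words, the two figures together put Exchange 2007 at roughly one third of the Exchange 2003 level, and Exchange 2010 at one tenth of it.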
In addition to the aforementioned features introduced in Exchange 2010, other enhancements have also been made to further reduce IOPS, including the Lazy View update and the use of the ‘pay to play’ method. Remember that in previous versions of Exchange, custom views were updated as soon as the store received an email. Although this technique gave end users a better experience, it had a negative impact on Exchange, forcing the system to continuously update the view and generate small random IOs in order to keep the view in the store current. With the Lazy View update, a view is only updated when the end user requests it.
Exchange 2010 uses Lazy View technology, in which views are updated when the user attempts to access them. Although this increases the time it takes to open a view, it dramatically improves Exchange IO performance, because it is faster for the disk to read data stored in larger, sequential pieces than for the disk head to gather smaller chunks of data spread out across the disk.
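The difference between updating a view on every delivery and updating it lazily can be sketched with a toy model. This is not Exchange’s implementation: the `View` class, its `lazy` flag and the `update_ops` counter are hypothetical, and a single counter increment stands in for the IO work of one view update.

```python
class View:
    """A sorted view over delivered messages, maintained eagerly or lazily."""

    def __init__(self, sort_key, lazy):
        self.sort_key = sort_key
        self.lazy = lazy
        self.rows = []       # the view as last materialized
        self.pending = []    # deliveries not yet merged into the view
        self.update_ops = 0  # number of view maintenance operations

    def on_delivery(self, message):
        if self.lazy:
            # Lazy: just remember the message; the view is left alone.
            self.pending.append(message)
        else:
            # Eager: every delivery triggers its own small update.
            self._merge([message])

    def open(self):
        """Called when the user opens the view; catches up if needed."""
        if self.pending:
            self._merge(self.pending)
            self.pending = []
        return list(self.rows)

    def _merge(self, messages):
        self.rows.extend(messages)
        self.rows.sort(key=self.sort_key)
        self.update_ops += 1


def by_subject(message):
    return message["subject"]


eager = View(by_subject, lazy=False)
lazy = View(by_subject, lazy=True)
for subject in ["status", "budget", "lunch"]:
    eager.on_delivery({"subject": subject})
    lazy.on_delivery({"subject": subject})
```

After the three deliveries the eager view has already been updated three times, while the lazy view has not been touched; it performs one larger update the first time `open()` is called, and both views then contain the same rows.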
In order to create a table per mailbox, Microsoft had to remove SIS (Single Instance Storage). Some of you may complain about this initially, but never fear: Microsoft provided a workaround known as database compression. This technology compresses the content of the database (especially text and HTML content), and offsets the loss of SIS.
Now take another look at the Exchange 2010 ESE and compare it to Exchange 2007’s ESE. In Exchange 2007, in order to open a message in Joe’s mailbox, Exchange had to open the mailbox table, read the message header, open the message and read the attachment (all examples of small random IOs).
In Exchange 2010, the Exchange system can open the mailbox table, read the message header, and open the message directly. It is important to note that since these tables are now logically connected, it is more convenient for Exchange to access them, and thanks to the new page size in Exchange 2010, Exchange can read the entire message body in a single IO. If additional IOs are needed they can be done, but in order to streamline the data gathering process, these requests are now grouped into larger, sequential IOs.
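The effect of the larger page size is easy to estimate. The sketch below assumes, purely for illustration, that each page costs one IO and that the whole page is available for message data (real pages also carry headers and other overhead); the 20 KB message body and the names `PAGE_SIZES_KB` and `pages_needed` are made up for the example.

```python
import math

# Page sizes quoted earlier in this article, in KB.
PAGE_SIZES_KB = {"Exchange 2003": 4, "Exchange 2007": 8, "Exchange 2010": 32}


def pages_needed(body_kb, page_kb):
    """Number of pages a message body of body_kb spans at a given page size."""
    return math.ceil(body_kb / page_kb)


for version, page_kb in PAGE_SIZES_KB.items():
    print(f"{version}: {pages_needed(20, page_kb)} page(s) for a 20 KB body")
```

Under these assumptions, a body that spans several pages at 4 KB or 8 KB fits in a single 32 KB page.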
Let us pause at this point and revisit our discussion of Microsoft’s enhancements to the ESE in Exchange 2010 in Part 2, at which time we will delve deeper into the topics of physical and logical contiguity.
Labels: Exchange 2010, Exchange Information Stores