Internal of DBMS consist of 5 core components , and we know that … it was more like logical structure … but the internals I am talking about here is how data is handled, or can say how data is read, write, stored physically specifically in SQL Server (I think Oracle does it a little differently). Knowledge of internal mechanism is very important because it provides us insight of how stuff works. It is very helpful when it comes to optimize / improve performance of DBMS as a whole system.
The little most physical chunk of storage device that is used to store data by storage engine is called PAGE and it is of 8KB in size (same size is used by ORACLE DBMS too)including 96 byte header which leaves about 8060 bytes of space for records. Its header contains sequence number of that page, and pointer to previous and next page. So we can say that, longer the row record fewer the entries in page and so more pages are required. Here one thing is to remember, Log files (.LDF) doesn’t contain data pages, it just contain series of log records.
The storage engine prefers to read and write in group of 8 pages … known as EXTENT. So Extent is collection of 8 contiguous pages and 64KB in size. So we always format our storage device which will be used to store Data and Log files, with sector (its again HDD 101 😀 , tracks, sectors, cylinders … !!!)size of 64KB so we can use the space optimally.
Now this extents are stored with help of GAM (Global Allocation Mapping) and SGAM (Stored Global Allocation Mapping). And extents can be Uniform or Mixed type. Uniform extents are those who contains data of only same table but Mixed extents contain data from more than one place. Also, as these extents are not always full (it’s another way to optimize performance, we will see it some other time), its free space is recorded using another flag, its called PAGE FREE SPACE (PFS). Here you should note that, GAM and SGAM can store upto 64K extent info (nearly 4GB) and PFS can store upto 8k page info. So we will find page repeating at every 8000 pages and GAM/SGAM at every 64,000 extent after first.
There is also another element is used to track down different indexes its called INDEX ALLOCATION MAP (IAM). So now when it comes to reading or writing data at data page (if you still remember, its PAGE of 8KB), storage engine quickly looks at IAM and PFS to find a place to write or read … and to make this operation faster, both IAM and PFS are stored at Buffer POOL (or in layman term, RAM :-D) …
Now, when we
write data Change data… Storage engine always keep record of it with help of two type of page systems. It is called Differential Changed Map (DCM) and Bulk Changed Map (BCM). Later one is used only when we are using Bulked Logged Recover Model (don’t ask, we will discuss it later) …
There is one super great tool available to download from CodePlex. It’s called Internals Viewer for SQL Server. In real life it won’t help much, but for learning more about internals its really great tool to understand how SQL Server internally play with data.
I’ve taken images and part of story from MSDN.
It’s Just a Thought …