SQL Server … Searching Tree !!!

If by any chance you happen to know a little history of computers, you will be surprised that most of the early innovators had taken their primary field of study as Mathematics or Physics !!! At first glance, you may wonder what a mathematician is doing in field of computers right ?? But even if you read the word computer , it can be seen as a device that compute (or say calculate) stuffs .. right ? So, if you want to work at very low level (or say Machine level) you need to know very good math … or more precisely different algorithms. Computer itself is a machine which requires set of instructions to perform “any” task and to do that task , it always use algorithms (… now you know why ALGORITHM was core subject of CS right ?) … some of the algorithms like Store Algorithm, Sort Algorithm, Search Algorithm … but as long as you don’t want to develop your own OS with your own File System you don’t need to worry about it.

Here you may ask, what the hell it has to do with SQL Server ?? Fact is, these algorithms are not only used to store, search or sort data at OS but they are also used in any DBMS system and yes in SQL Server too. It is not like your carrier will end if you don’t know these algorithms but having idea of these algorithm can come handy when it comes to performance tuning or even will help you to understand, how SQL Server search, store/retrieve or sorts data physically.

And The Winner is ….

Most commonly used (and efficient one) algorithm is called B-tree … its variant of Binary Tree. It allows SQL Server to efficiently search, sort and store data physically. Now, you may ask why B-Tee … well then, I actually found one comparison table of different algorithm which actually compares Sort time required for various Algorithms and Binary Tree is the most effective algorithm for all sort, search and store. And B-tree is even better then Binary Tree. It is constructed with Nodes, namely Parent, Leaf and Intermediate. And as name suggests, Parent node is a node which is followed by one or more then one leaf nodes. And Leaf node is the last one in chain, and it doesn’t have any other nodes after it. All other nodes between parent and leaf nodes are called Intermediate nodes. Here it is also required to note that, this B-tree used in SQL Server is a Balanced B-tree, which means all leaf nodes of all parent nodes are kept at same level of depth (in other words, if your B-tree is 3 level deep then ALL Last Leaf node will be level 3 no more no less.)

B-tree

How values are inserted into tree is a bit “off” the topic, but in general it can be explained as, … if new value comes (by insert or update) then it is compared with parent node and then it is stored accordingly to leaf node. Consider that we have already structured B-tree as above, and now we want to insert value 10, it will be inserted between 9 and 12. But if want to insert value 3 it goes to first leaf node but as we can see its already full so that leaf will split into two leafs (1,2) and (5,6) and parent node will be (3,7,16) .. (if i have understood it correctly fingerscrossed) … here we have taken some assumptions like, our tree is only 1 level deep and its leaf node can have only 4 members at most.

Here I can’t help my self from mentioning point of performance optimization. As you can see, if we add a new member to a leaf which is already full, it has to split the leaf and add member to parent node. If you can imagine, it will take a little more processing time then usual insertion where leaf node is having free space, right ?? What happens if we are inserting 100s of not if 1000s of rows or even updating them, this fraction of time increase will sum up to very noticeable amount which is not good at all. To overcome this problem, there is one parameter called “Fill Factor” is defined when we create indexes (clustered indexes physically stores data on data pages, i will write about it some other time).

I think, I should end there and I hope that I have made you understood, why it is important to know algorithms (if not all then at least B-tree) !!!

If you haven’t checked already, image is from Wikipedia

Its Just A Thought … fingerscrossed

Leave a Reply

Your email address will not be published. Required fields are marked *