Atomicity, Consistency, Isolation and Durability (ACID): these are key properties of any Database Management System, whether it's SQL Server, Oracle, MySQL or any other … and so it is essential for any DBMS to be able to recover / restore data to its original state, or to a state at any point in time defined by its user. Now, I am using SQL Server mainly, so I am going to talk about it …
In SQL Server, whenever a transaction (a set of instructions) is performed, it gets registered, or say logged, into the log file of the database where the transaction is being performed. Now, in the event of a disaster it is really important for the database to recover with minimal or no loss of data, because the enterprise itself depends on that data !! And in that event this log file (or say the Logs) plays a really important part. Let me explain it in more detail …. whenever we do any transaction it gets logged in the log file, but it doesn't get written to the data file immediately. So, it is possible that if we made changes to 100 records, those changes won't exist yet in the real data file (but if we query those records we still see the new data, because SQL Server keeps the modified pages in memory and marks them as dirty, so any query referencing them reads the new records from memory, not the older data still stored in the data file).

Now I know it sounds a bit confusing, but in one line just understand that if we make a change to the database it doesn't get written to the data file immediately; instead the database engine marks those records (or the data pages where those records are stored) as dirty, and they only get written after some given time or after the amount of dirty data reaches some value (btw, the processes responsible for writing these dirty pages to the data file are the Checkpoint process and the "Lazy Writer"). So, like I was saying, suppose you have changed 100 records of some table, say Credit Card details, and that data is still not changed in the real data pages … meanwhile something goes wrong and the server crashes … or in another case say someone deleted the whole Customer Information table. What do you think will happen ?? Well, the answer lies in something called the "Recovery Model" of SQL Server.
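If you are curious, you can actually peek at those dirty pages sitting in memory. Here is a rough sketch using the sys.dm_os_buffer_descriptors DMV (the grouping and ordering are just one way to look at it):

-- Count dirty (modified-in-memory) pages per database in the buffer pool
SELECT DB_NAME(database_id) AS database_name,
       COUNT(*) AS dirty_page_count
FROM sys.dm_os_buffer_descriptors
WHERE is_modified = 1
GROUP BY DB_NAME(database_id)
ORDER BY dirty_page_count DESC;

-- Force the dirty pages of the current database to be written to the data file
CHECKPOINT;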
Also, before we go into more details there is another concept you need to understand … it is called Point-in-time recovery. It can be explained as the ability to restore data to exactly the same state it was in at any given point in time in the past. So, with the help of a full backup and log backups, we can restore our database to its state at any given time.
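Just to give a rough idea of how that looks in practice, here is a sketch of restoring a database to a specific moment using a full backup plus a log backup (the database name MyDB, the file paths and the timestamp are only placeholders):

-- Restore the full backup first, but leave the database in a restoring state
RESTORE DATABASE MyDB
  FROM DISK = N'D:\Backups\MyDB_Full.bak'
  WITH NORECOVERY;

-- Then replay the log backup only up to the desired point in time
RESTORE LOG MyDB
  FROM DISK = N'D:\Backups\MyDB_Log.trn'
  WITH STOPAT = '2023-05-01 18:30:00', RECOVERY;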
There are 3 types of Recovery Models available in SQL Server, and the user should choose one based on the purpose / usage of the database.
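By the way, you can check which recovery model each of your databases is currently using with a simple query against sys.databases:

SELECT name, recovery_model_desc
FROM sys.databases;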
Simple Recovery Model : as the name suggests, it is really simple to use and requires the least administrative effort. If you can recall, earlier I told you that whenever we make changes those changes are written to the log file first and then written to the data file after a certain amount of time or size. In the Simple Recovery Model, the log is also truncated automatically by the database engine itself once those changes are written to the data file. The advantage of this method is that your log file size doesn't keep growing with time, because the log is cleared out frequently by the database engine itself. But the main disadvantage is the inability to do point-in-time recovery, because we can't take any log backups in this model. So if we have a full backup from last night and something goes wrong the next evening, then we will lose all the data changed between those two points !!! So, you can assume this recovery model is generally not used in a Production environment (of course there are always exceptions) …
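If you do want to use it, switching a database to Simple recovery is just one statement (MyDB and the backup path are placeholders); a full backup is then the only restore point you will have:

ALTER DATABASE MyDB SET RECOVERY SIMPLE;

-- The full backup is the only kind of backup that matters in this model
BACKUP DATABASE MyDB
  TO DISK = N'D:\Backups\MyDB_Full.bak';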
Full Recovery Model : This is generally the default recovery model (but actually it depends on the Model database settings). In this recovery model, the records are kept in the log file until they are backed up by a process called a Log Backup, which is also what allows the log space to be reused. And as you can guess, this recovery model allows us to do point-in-time recovery. But you also need to remember that since the log file doesn't get flushed out by itself, you need some process to take log backups regularly (and the interval between backups depends on how much data is being changed) … so don't be surprised if you find your log file has grown to double the size of your data file itself …
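A typical sketch for the Full model looks like this, with periodic log backups (normally you would schedule these through an Agent job or maintenance plan; names and paths are placeholders):

ALTER DATABASE MyDB SET RECOVERY FULL;

-- Full backup once a day (for example)
BACKUP DATABASE MyDB
  TO DISK = N'D:\Backups\MyDB_Full.bak';

-- Log backups every 15 or 30 minutes keep the log file from growing forever
BACKUP LOG MyDB
  TO DISK = N'D:\Backups\MyDB_Log.trn';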
Bulk-Logged Recovery Model : This recovery model is used mainly for bulk operations (e.g. bulk inserts or index creation). As you know, whenever we do a transaction it gets registered in the log file, and if the frequency of data change is high then the size of the log file grows very fast. So, when you are performing bulk operations this can be real troublesome; in that case we can use this recovery model, or temporarily change the Full recovery model to the Bulk-Logged Recovery Model. It behaves essentially the same as the Full recovery model when no bulk operation is going on … but while bulk operations are being performed it logs them minimally instead of in full detail, which keeps the size of the log file from exploding; the trade-off is that you can't do point-in-time recovery to a moment inside a log backup that contains those bulk-logged operations.
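In practice people usually flip to Bulk-Logged just around the bulk operation and then flip back, something like this sketch (the database name, table dbo.CreditCardDetails and the file paths are placeholders I made up for illustration):

-- Switch to Bulk-Logged just before the bulk operation
ALTER DATABASE MyDB SET RECOVERY BULK_LOGGED;

BULK INSERT dbo.CreditCardDetails
  FROM 'D:\Import\cards.csv'
  WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');

-- Switch back to Full and take a log backup right away,
-- so point-in-time recovery is possible again going forward
ALTER DATABASE MyDB SET RECOVERY FULL;

BACKUP LOG MyDB
  TO DISK = N'D:\Backups\MyDB_Log_AfterBulk.trn';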
When you create a new database, its default recovery model will be whatever the Model database's recovery model is. So I think if you are using the database just for yourself then you should keep it in the Simple Recovery Model; otherwise you will need to always make sure that you take log backups regularly …
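And if you want every new database to default to Simple, you can change the recovery model of the Model database itself (you need sufficient permissions for this, typically sysadmin):

-- New databases inherit their recovery model from the Model database
ALTER DATABASE model SET RECOVERY SIMPLE;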
Well, I think this is enough for today …
Remember, It’s Just a Thought …