SQL Concepts and Interview Questions
Saturday, 26 November 2011
Using CASE in an UPDATE Statement
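The technique in the title can be sketched in a few lines; the table and column names below are hypothetical, made up for the example:

```sql
-- Hypothetical Employees(EmployeeID, Grade, Salary) table.
-- CASE lets one UPDATE apply a different change per row:
UPDATE Employees
SET Salary = CASE Grade
                 WHEN 'A' THEN Salary * 1.10  -- 10% raise for grade A
                 WHEN 'B' THEN Salary * 1.05  -- 5% raise for grade B
                 ELSE Salary                  -- everyone else unchanged
             END;
```

The CASE expression is evaluated once per row, so a single statement replaces what would otherwise be several UPDATEs with different WHERE clauses.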
Sunday, 16 October 2011
SELECT Query Syntax
[ WITH <common_table_expression>]
SELECT select_list [ INTO new_table ]
[ FROM table_source ] [ WHERE search_condition ]
[ GROUP BY group_by_expression ]
[ HAVING search_condition ]
[ ORDER BY order_expression [ ASC | DESC ] ]
The UNION, EXCEPT and INTERSECT operators can be used between queries to combine or compare their results into one result set.
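As a concrete illustration of the clauses above (the Orders table and its columns are made up for the example):

```sql
-- Total spend per customer, keeping only customers over 1000,
-- sorted from highest total to lowest.
SELECT CustomerID, SUM(Amount) AS TotalAmount
FROM Orders
WHERE Amount > 0
GROUP BY CustomerID
HAVING SUM(Amount) > 1000
ORDER BY TotalAmount DESC;
```

Note the division of labour: WHERE filters rows before grouping, HAVING filters the groups after aggregation.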
Using the INSTEAD OF Trigger in SQL Server 2005
I, like probably a lot of people, am going to have to do a bit of DB refactoring in order to be able to take advantage of Linq to SQL and/or the ADO.NET Entity Framework. While the framework is certainly flexible, there's plenty of databases out there that were designed without an O/R layer in mind, and there's definitely spots where you can pigeonhole yourself if you didn't take O/R into account during the initial design.
So, one great way to simplify your model is to create views that abstract the complicated underlying model. Views work great for just about any read-only operation, however, when you get into the inserting and deleting is when you hit some heavy limitations. From the SQL Server 2005 BOL:
You can modify the data of an underlying base table through a view, as long as the following conditions are true:
- Any modifications, including UPDATE, INSERT, and DELETE statements, must reference columns from only one base table.
- The columns being modified in the view must directly reference the underlying data in the table columns. The columns cannot be derived in any other way, such as through the following:
- An aggregate function: AVG, COUNT, SUM, MIN, MAX, GROUPING, STDEV, STDEVP, VAR, and VARP.
- A computation. The column cannot be computed from an expression that uses other columns. Columns that are formed by using the set operators UNION, UNION ALL, CROSS JOIN, EXCEPT, and INTERSECT amount to a computation and are also not updatable.
- The columns being modified are not affected by GROUP BY, HAVING, or DISTINCT clauses.
- TOP is not used anywhere in the select_statement of the view together with the WITH CHECK OPTION clause.
The first condition is obviously the biggest issue (at least for me). It's often likely that you've got more than one table that aggregates into a single entity.
So, one option you have then is the INSTEAD OF trigger. Apparently this feature has been around since SQL 2000, but in my ignorance I hadn't heard of it until recently. An INSTEAD OF trigger can be placed on any view or table to replace the standard action of an INSERT, UPDATE, or DELETE statement.
As with all triggers, this is one of those "Use with great caution" features, as it definitely convolutes the normal execution path. That said, it's pretty powerful, and gets you around most of the updatable view limitations.
Now, I'm going to show a simple example. I'm perfectly aware that this example is easy to do in the ADO.NET Entity Framework, and it doesn't represent my own database issues, which are why I am investigating going this route; I'm just trying to keep it simple here. If you'd like to hear about my database issues, go ahead and comment, and I'd love to discuss and see if there are better options out there.
So, here's the schema:
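The schema diagram does not reproduce here, but from the view and trigger below, the underlying tables presumably look something like this (column types and constraints are an assumption):

```sql
-- Reconstructed from the columns the view and trigger reference.
CREATE TABLE Person (
    PersonID  UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
    FirstName VARCHAR(50)      NOT NULL,
    LastName  VARCHAR(50)      NOT NULL
);

CREATE TABLE Employee (
    EmployeeID UNIQUEIDENTIFIER NOT NULL PRIMARY KEY,
    PersonID   UNIQUEIDENTIFIER NOT NULL REFERENCES Person (PersonID),
    Salary     MONEY            NOT NULL
);
```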
Now, in my application, I want to represent a single entity called Employee, that is an aggregation of the Person attributes and the Employee attributes. So, what I can do then is create a view.
CREATE VIEW [dbo].[Employees]
AS
SELECT dbo.Employee.EmployeeID, dbo.Employee.Salary, dbo.Person.FirstName, dbo.Person.LastName, dbo.Person.PersonID
FROM dbo.Employee INNER JOIN
dbo.Person ON dbo.Employee.PersonID = dbo.Person.PersonID
Now, this view is going to work great, except when I want to update the data in the tables. Now, I can write some insert code using the new Linq To SQL functionality:
MyDatabaseClassesDataContext ctx = new MyDatabaseClassesDataContext();
Employee emp = new Employee();
emp.FirstName = "Jim";
emp.LastName = "Fiorato";
emp.Salary = 400000;
ctx.Employees.InsertOnSubmit(emp);
ctx.SubmitChanges();
However, that will throw an exception like the following:
System.Data.SqlClient.SqlException: View or function 'dbo.Employees' is not updatable because the modification affects multiple base tables.
So, now what I can do is create the INSTEAD OF trigger on the view. I'll just do an example of the insert trigger. That trigger will look like this (note this doesn't support multiple rows in the inserted rowset at all):
CREATE TRIGGER [dbo].[TR_INSERT_Employees]
ON [dbo].[Employees]
INSTEAD OF INSERT
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
DECLARE @PersonID AS UNIQUEIDENTIFIER
DECLARE @EmployeeID AS UNIQUEIDENTIFIER
SET @PersonID = NEWID();
SET @EmployeeID = NEWID();
-- Person Table
INSERT INTO Person (PersonID, FirstName, LastName)
SELECT @PersonID, FirstName, LastName FROM inserted;
-- Employee Table
INSERT INTO Employee (EmployeeID, PersonID, Salary)
SELECT @EmployeeID, @PersonID, Salary FROM inserted;
END
This all makes me wonder if we're just pushing the impedance mismatch work-around further into the database from the code. Is that a good or bad thing? It's likely bad, but just how bad? I can't tell yet.
I think people usually feel better about what's in their database vs. what's in their code, because relational databases are something you can wrap your head around: boundaries are clear, and people make quick and mostly accurate assumptions about what is going on. Whereas in your C# code, you'll often find it hard to grasp it all, because it could be doing so many things, in so many different ways.
Clustered and Non-Clustered Index
1. Introduction
We all know that data entered in the tables is persisted on the physical drive in the form of database files. Think about a table, say Customer (for any leading bank in India), that has around 16 million records. When we try to retrieve records for two or three customers based on their customer id, all 16 million records are scanned and compared to find a match on the supplied customer ids. Think about how much time that will take if it is a web application and there are 25 to 30 customers that want to access their data through the internet. Does the database server do 16 million x 30 searches? The answer is no, because all modern databases use the concept of an index.
2. What is an Index
An index is a database object which can be created on one or more columns (16-column combination at most). When the index is created, it reads the column(s) and forms a relevant data structure to minimize the number of data comparisons. The index improves the performance of data retrieval but adds some overhead to data modification operations such as insert, delete and update. So it depends on how much data retrieval is performed on the table versus how much DML (Insert, Delete and Update) activity there is. In this article, we will see how to create an index. The below two sections are taken from my previous article, as they are required here. If your database already has the changes from the next two sections, you can go directly to section 5.
3. First Create Two Tables
To explain these constraints, we need two tables. First, let us create them. Copy and paste the code below into a new Query Editor window, then execute it.
CREATE TABLE Student(StudId smallint, StudName varchar(50), Class tinyint);
CREATE TABLE TotalMarks(StudentId smallint, TotalMarks smallint);
Go
Note that there are no constraints at present on these tables. We will add the constraints one by one.
4. Primary Key Constraint
A table column with this constraint is called the key column for the table. This constraint ensures that the column's values are not repeated and that there are no null entries. We will mark the StudId column of the Student table as the primary key. Follow these steps:
- Right-click the Student table and click on the Modify button.
- From the displayed layout, select the StudId row by clicking the small square-like button on the left side of the row.
- Click on the Set Primary Key toolbar button to set the StudId column as the primary key column.
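If you prefer T-SQL to the table designer, the same primary key can be added like this (the constraint name is my own choice; the column must be made NOT NULL first):

```sql
-- Primary key columns cannot be nullable.
ALTER TABLE Student ALTER COLUMN StudId smallint NOT NULL;

-- Equivalent of the Set Primary Key designer button.
ALTER TABLE Student ADD CONSTRAINT PK_Student PRIMARY KEY (StudId);
```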
Now this column does not allow null values or duplicate values. You can try inserting values that violate these conditions and see what happens. A table can have only one primary key. Multiple columns can participate in the primary key; in that case, uniqueness is enforced across the combination of all the participating columns' values.
5. Clustered Index
The primary key created for the StudId column will create a clustered index on the StudId column. A table can have only one clustered index on it. When creating the clustered index, SQL Server 2005 reads the StudId column and forms a binary tree on it. This binary tree information is then stored separately on the disk. Expand the table Student and then expand its Indexes node. You will see the following index, created for you when the primary key was created. With the use of the binary tree, the search for a student based on StudId now needs far fewer comparisons. Let us assume that you had entered the following data in the table Student: The index will form the binary tree specified below. Note that a given parent has at most two children. The left side always holds a lesser value and the right side a greater value compared to the parent. (The tree can also be constructed the reverse way, i.e. left side higher and right side lower.) Now let us assume that we had written queries like the ones below:
Select * from student where studid = 103;
Select * from student where studid = 107;
Execution without the index will return a value for the first query after the third comparison, and for the second query at the eighth comparison.
Execution of the first query with the index will return a value at the first comparison.
Execution of the second query with the index will return the value at the third comparison. Look below:
- Compare 107 vs 103 : Move to right node
- Compare 107 vs 106 : Move to right node
- Compare 107 vs 107 : Matched, return the record
Now think about a table like YahooLogin. Let us assume there are 33 million users around the world that have a Yahoo email id, and each is stored in YahooLogin. When a user logs in by giving the user name and password, the comparisons required number between 1 and 25 with the binary tree, that is, with the clustered index. Look at the above picture and guess for yourself how fast you will reach level 25. Without the clustered index, up to 33 million comparisons are required.
Got the usage of Clustered index? Let us move to Non-Clustered index.
6. Non Clustered Index
A non-clustered index is useful for columns that have repeated values. Say, for example, the AccountType column of a bank database may have 10 million rows, but the distinct values of account type may number only 10-15. A clustered index is created automatically when we create the primary key for the table; the creation of a non-clustered index we need to take care of ourselves. Follow the steps below to create a non-clustered index on our table Student based on the column Class:
- After expanding the Student table, right-click on Indexes and click on New Index.
- From the displayed dialog, type the index name as shown below and then click on the Add button to select the column(s) that participate in the index. Make sure the Index type is Non-Clustered.
- In the select column dialog, place a check mark against the column Class. This tells that we need a non-clustered index for the column Student.Class. You can also combine more than one column to create the index. Once the column is selected, click on the OK button. You will return to the dialog shown above with the selected column marked in blue. Our index has only one column. If you select more than one column, you can change the order of the indexed columns using the Move Up and Move Down buttons. When you are using a combination of columns, always put the most repeated column first and the more unique columns further down the list. For example, let us assume the correct order for creating the non-clustered index is: Class, DateOfBirth, PlaceOfBirth.
- Click on the Indexes folder on the right side and you will see that the non-clustered index based on the column Class has been created for you.
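The same index can be created in T-SQL instead of through the dialog (the index name here is illustrative):

```sql
-- Non-clustered index on the Class column of the Student table.
CREATE NONCLUSTERED INDEX IX_Student_Class
    ON Student (Class);

-- A multi-column version would list the columns in the order discussed,
-- assuming the table actually had these extra columns:
-- CREATE NONCLUSTERED INDEX IX_Student_Class_DOB_POB
--     ON Student (Class, DateOfBirth, PlaceOfBirth);
```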
7. How Does a Non-Clustered Index Work?
A table can have more than one non-clustered index, but it should have only one clustered index, which works based on the binary tree concept. A non-clustered index always depends on the clustered index in the database. This can be easily explained with the concept of a book and its index pages at the end. Let us assume that you go to a bookshop and find a big 1500-page C# book that claims to cover all of C#. When you glance at the book, it has all beautiful color pages and shiny paper. But that is not the only measure of a good book, right? Once you are impressed, you want to see your favorite topic of Regular Expressions and how it is explained in the book. What will you do? I just peeped at you from behind and recorded what you did, as below:
- You went to the Index page (it has total 25 pages). It is already sorted and hence you easily picked up Regular Expression that comes on page Number 17.
- Next, you noted down the number displayed next to it which is 407, 816, 1200-1220.
- Your first target is page 407. You opened the book somewhere in the middle, and the page number is greater than 500.
- Then you moved back to a somewhat lower page, but it still reads 310.
- Then you moved to a slightly higher page. You are very lucky: you got exactly page 407. [Yes man, you got it. Otherwise I would need to write more. OK?]
- That's all; you started exploring what is written about Regular Expressions on that page, keeping in mind that you also need to find page 816.
Here, the Class column, with distinct values 1, 2, 3 ... 12, will store the clustered index column's value along with it. Let us take only the class value of 1. The index goes like this:
1: 100, 104, 105
So here, you can easily get all the records that have the value class = 1. Map this to the book index example now.
Friday, 7 October 2011
Tips to avoid sql injection
- Encrypt sensitive data.
- Access the database using an account with the least privileges necessary.
- Install the database using an account with the least privileges necessary.
- Ensure that data is valid.
- Do a code review to check for the possibility of second-order attacks.
- Use parameterised queries.
- Use stored procedures.
- Re-validate data in stored procedures.
- Ensure that error messages give nothing away about the internal architecture of the application or the database.
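As a sketch of the parameterised-query tip in T-SQL itself, sp_executesql passes user input as a typed parameter rather than splicing it into the SQL string (the table and parameter names are illustrative):

```sql
-- The user-supplied value travels as data, never as SQL text,
-- so the classic injection payload below is harmless.
DECLARE @sql  nvarchar(200) = N'SELECT * FROM Users WHERE UserName = @name';
DECLARE @name nvarchar(50)  = N'x''; DROP TABLE Users; --';

EXEC sp_executesql @sql, N'@name nvarchar(50)', @name = @name;
-- Simply returns no matching rows; the Users table is untouched.
```

The same principle applies from client code: use SqlParameter objects rather than string concatenation.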
Monday, 3 October 2011
TRANSACTION Isolation Levels in SQL Server
SQL Server 2005 has some unique features to deal with transactions in the database world. It has distinct mechanisms to take care of every kind of transaction; technically, it gives us discrete ways to isolate transactions from deadlocks and crashes.
Before going deeper into the isolation levels that SQL Server provides to distinguish types of transaction, let's have a look at the definition of a TRANSACTION. What does a transaction mean in the real world and in a database scenario?
Transaction: When you give something to me and I take it, that's a transaction. When you withdraw money from an ATM machine and receive the money, that is also a kind of transaction. Here, what I am trying to get at is a simple question: is the transaction above valid or consistent? What if I deny it, claiming that you haven't given me anything, and that maybe you gave it to someone else instead of me? What if, after a withdrawal of money from your account, your account balance still shows the same amount as before? (Oh! For this case you would have to be lucky enough.) And what will happen if you and your partner are withdrawing all your money from the joint account at the same time from different ATMs?
So there must be some methodology to keep track of all these things and manage them perfectly, so that even in such disaster conditions, the database and the information regarding any transaction remain in a consistent form.
To achieve the above thought in the database system, we have Locking mechanism. It acts like this, suppose there is a room and that is electronically locked and only the person who knows the password can enter, provided the room is empty or he has to wait till the room is evacuated by the other person. But here we have a little controversy, like the person who is waiting outside may have some different task than the person who is already inside. And it may be possible that both of the tasks may not interfere to each other or may interfere slightly that may be manageable. So at the end of the discussion, we may conclude that the security system must provide different types of security code or passwords to the corresponding person. Let’s have a deeper look on this.
Suppose you are doing a transaction for withdrawing money from the ATM machine and at the same time the bank manager is doing a routine checkup of your transaction which is totally a different operation and suppose at the same time the bank teller is checking your account for the remaining balance. All these operations are different but accessing the same entity or resource and that is your account information that is kept inside the database. Out of these operations only you are doing write operation in the database as you are withdrawing money and the remaining balance has to be updated in the database. So a proper security mechanism must be implemented here to ensure non-conflict or smooth going of these operations. Here the security can be ensured by putting locks (and of course the type of locks) for each type of operations, which means you are isolating the resources from other transactions that may hamper its consistency. Here comes the role of Isolation levels.
The isolation levels are categorized depending on the type of locks each level uses. At a lower level of isolation, more users can access the same resource without conflict, but they may face concurrency-related issues such as dirty reads and data inaccuracy (described below). At a higher isolation level, these types of issues are eliminated, but only a limited number of users can access the resource at a time.
Let’s have a look on Locks and type of Locks. Locks can be treated as a policy which will prevent you or a process to perform any action (that may conflict with other actions) on an object or resource if that object or resource is already occupied by any other process or user. It’s something like you are going to propose someone who is already with someone else. But situation matters (may be you are lucky enough for this). Like it depends on what you are going to do and on what the other person is doing. So for such type of situations, we have types of locks.
Types of Locks:
- Shared Locks (S): This lock is useful when you are doing read operations and no manipulation, like write operations (update/delete/insert). It is compatible with other shared locks, update locks and intent shared locks. It can prevent users from performing dirty reads (described below).
- Exclusive Locks (X): These locks are the big, possessive type: they are not compatible with any other locks. An exclusive lock cannot be taken if any other lock is already held on the resource, nor will it let other locks be taken on the resource until it finishes its job. This lock is used for data-modification operations such as INSERT, UPDATE or DELETE.
- Update Locks (U): This can be treated as a mixture and perfect collaboration of the above two locks (shared and exclusive). Let's take an example. You are going to perform an update operation on a table at row number 23. You are doing two kinds of operation here: one is searching for record 23, which can be achieved with a shared lock, and the other is updating the record once it has been found, which is achieved with an exclusive lock. So the update lock transforms into an exclusive lock when it finds its target; until then it behaves like a shared lock. This prevents deadlocks to a great extent. This lock is compatible with intent shared and shared locks.
- Intent Locks (also called Demand Locks): These are used to establish a lock hierarchy; they protect the placing of a shared (S) lock or exclusive (X) lock on a resource lower in the lock hierarchy. For example, suppose you are performing a read operation on a piece of data with a shared lock, and at the same time another user wants to modify the data with an exclusive lock. The shared lock is compatible with other shared locks, so any number of shared locks can be obtained on that piece of data, and hence the user wanting the exclusive lock may have to wait indefinitely for all the shared lock operations to complete. To avoid this kind of starvation, intent locks are very useful: if the second user takes an intent exclusive lock, then no further transaction can grab a shared lock, and the second user can claim the exclusive lock once the first transaction completes.
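You can watch these lock modes being granted with the sys.dm_tran_locks view (available from SQL Server 2005 onwards); the session id below is a placeholder you would replace with the real one:

```sql
-- Session 1: start a transaction that holds locks.
BEGIN TRAN;
UPDATE dbo.car_info SET EngineType = 'diesel' WHERE Car_Sl_No = 2;

-- Session 2: inspect the locks held (replace 53 with session 1's SPID).
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE request_session_id = 53;
-- Expect an X lock on the KEY/RID plus IX locks on the PAGE and OBJECT.

-- Session 1: release the locks when done.
ROLLBACK TRAN;
```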
There are basically three types of intent locks that are most popular:
a) Intent shared (IS)
b) Intent exclusive (IX)
c) Shared with intent exclusive (SIX)
To get more information on intent locks, refer to the link below:
http://msdn.microsoft.com/en-us/library/aa213039(SQL.80).aspx
- Schema Locks: These locks protect the schema of the database. They deal with DDL (Data Definition Language) commands such as adding or dropping columns of a table, renaming a table, or dropping a table, and with blocking DDL operations during the execution of a query. There are two types of schema locks:
a) Schema modification (Sch-M): This lock is applied when the SQL Server engine is modifying the structure of the schema, like adding or dropping columns of a table. During this period, if any other transaction tries to access that object, it will be denied or delayed.
b) Schema stability (Sch-S): This lock indicates that a query using the table is being compiled. It will not block any transactional locks, like shared or exclusive locks, from operating on the data. But while the query is running, it will prevent the execution of any DDL commands on that table.
- Bulk Update Locks: This lock is useful while performing a BULK operation on a table, like BULK INSERT. It prevents any other types of normal T-SQL operations from being executed on the table, except bulk processing of the data.
Now let us explore some buzzwords in Isolation Level:
Lost updates: This generally occurs when more than one transaction tries to update a specific record at a time, i.e. when one update is successfully written to the database but a second update from a different transaction accidentally overwrites the previous update. This is called a lost update.
Non-repeatable reads (also called inconsistent analysis): This means dealing with inconsistent data: suppose you read a value from a table and start working with it, but meanwhile some other process modifies the value at the source, producing a false result in your transaction. Let's have a more practical example. Suppose that before withdrawing money from your account you always perform a balance check, and you find a balance of 90$. You then try to withdraw 60$, but meanwhile the bank manager debits 50$ from your account as a penalty for being below the minimum balance (100$); as a result you now have only 40$. So your transaction either fails, as the demanded amount (60$) is not there, or it would show -20$ (which is quite impossible given banking constraints). More simply, we can say non-repeatable reads take place when a transaction reads the same row several times and gets a different value each time.
Repeatable Reads: This specifies that transactions cannot read data that has been modified by other transactions but not yet committed and if the current transaction is reading some data then no other transactions can modify that data until the current transaction completes.
Phantom reads: Don't be afraid, we are not talking about ghosts or the Phantom of the Opera. Here, phantom means unexpected or unrealistic. It occurs when two identical queries are executed and the set of rows returned by the second query is different from the first. Let's have a simple example: suppose your bank's policy changes so that the minimum balance should be 150$ instead of 100$ for each account type. This is not a big deal for a database administrator: he will run an update statement for each account type where the minimum balance is less than 150$ and set the value to 150$. But unfortunately, when the manager checks the database, he finds one record with a minimum balance less than 150$ in the same table. The DBA is surprised: how is this possible when he ran the update statement over the whole table?
This is called a phantom read. Phantom reads are very rare, as they need the right circumstances and timing: in the above example, someone inserted one new record with a minimum balance less than 150$ at the very moment the DBA executed the UPDATE statement. And as it was a new record, it didn't interfere with the UPDATE transaction, which executed successfully. This type of phantom read can be avoided using a higher level of isolation, i.e. SERIALIZABLE (described below).
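The SERIALIZABLE fix mentioned above would look roughly like this; the Accounts table and MinBalance column are invented for the example:

```sql
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

BEGIN TRAN;
-- The range of rows this UPDATE reads is range-locked, so no
-- concurrent INSERT can slip in a qualifying "phantom" row
-- until the transaction ends.
UPDATE Accounts
SET MinBalance = 150
WHERE MinBalance < 150;
COMMIT TRAN;
```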
Dirty reads: This is one of the types of Non-repeatable Reads. This happens when a process tries to read a piece of data while some other process is performing some update operations on that piece of data and is not completed yet.
Now coming to the root point of the article i.e. the Isolation levels; we have basically five types of Isolation level in SQL Server 2005. Each one is described below:
Here we consider a simple example for all the below cases. The data shown in the table is taken by assumption and is only used for example purpose; the data given may or may not be right as per real scenario. The table information is given below:
Database Name: OLAP
Table Name: dbo.car_info
Table Column Information:
| Column_name | Type |
| Car_Sl_No | int |
| CarCompany | varchar |
| CarBodyType | varchar |
| CarName | varchar |
| EngineType | varchar |
Table Data:
| Car_Sl_No | CarCompany | CarBodyType | CarName | EngineType |
| 1 | Maruti | small | Maruti-800 | petrol |
| 2 | Honda | sedan | City | petrol |
| 3 | Maruti | small | Maruti-800 | petrol |
| 4 | Maruti | small | Waganor Duo | petrol |
| 5 | Honda | sedan | City | petrol |
| 6 | TATA | small | indica | diesel |
| 7 | Mahindra | SUV | Scorpio | diesel |
| 8 | TATA | SUV | Sumo | diesel |
| 9 | Maruti | sedan | SX4 | petrol |
| 10 | Maruti | sedan | Swift-Dzire | diesel |
| 11 | TATA | small | Nano | petrol |
Assumption: Here in all our examples, two different transactions can be considered as done by two different users. For testing, you can achieve this by two separate Query windows or two separate instances for SQL Server Management Studio (SSMS). But you have to be careful enough to run the queries for both the connections simultaneously or immediately.
1. READ UNCOMMITTED Isolation Level: This is very useful in case you need higher concurrency in the transactions. Here one transaction can access the data that has been modified by the second transaction even if the second transaction is not committed.
Syntax:
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
Example: Suppose the User1 is trying to update the EngineType from ‘petrol’ to ‘diesel’ for Car_Sl_No with value 2. And at the same time User2 is trying to read the data for the Car_Sl_No with value 2. Under normal condition or default setting, User2 cannot read the data from that row. But if the User2 sets the transaction isolation level to ‘Read Uncommitted’, then it is possible to read that row with updated information even if the transaction is not committed by User1.
For User1:
USE OLAP
GO
BEGIN TRAN
UPDATE [OLAP].[dbo].[car_info]
SET [EngineType] = 'diesel'
WHERE Car_Sl_No = 2
Here, note that the transaction is still running, as there is no COMMIT statement in the above code. Under the default setting, the query run by User2 would keep waiting until User1 commits the transaction.
For User2:
USE OLAP
GO
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
-- The above statement is used to read the updated value
-- even if the transaction is not committed.
SELECT [Car_Sl_No], [CarCompany], [CarBodyType], [CarName], [EngineType]
FROM [OLAP].[dbo].[car_info]
WHERE Car_Sl_No = 2
As in the above code, we set the transaction isolation level to ‘Read Uncommitted’; User2 can access that record with updated data.
Output:
Although it increases the concurrency of the transactions, did you notice the disadvantage? What if User1 rolls back his transaction, or if User1's Management Studio crashes or hangs? (As the transaction is not committed yet, it will roll back by itself, leaving User2 with a false or inconsistent value.)
Limitations:
- Dirty-reads
- Lost Updates
- Phantom reads
- Non-repeatable reads
Advantages:
- Higher Concurrency
In SSIS (SQL Server Integration Service): To achieve the above norm in SSIS, select the task or container on which you want to set the isolation level. Then go to Properties, and set the property named ‘IsolationLevel’ to “ReadUncommitted”.
The benefit here is that more than one task can access the same table simultaneously in case of parallel execution of the package.
2. READ COMMITTED Isolation Level: This is the default level in SQL Server 2005 and the immediate higher level above the READ UNCOMMITTED isolation level. It prevents transactions from reading data while another transaction is performing an update operation on that data, and so eliminates dirty reads: uncommitted data cannot be read. But it still suffers from other demerits, like lost updates.
Syntax:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED
Example: Continuing our previous example, suppose the EngineType for the Car_Sl_No with value 2 is NULL, and User1 is trying to update the EngineType to 'petrol'. At the same time, User2 starts a new transaction, checks the value, finds it NULL, and updates the record to 'diesel' before the transaction is committed by User1. As a result, User1's updated value is lost; it is overwritten by User2.
For User1:
USE OLAP
GO
BEGIN TRAN
DECLARE @EngineType varchar(20)
SELECT @EngineType = [EngineType]
FROM [OLAP].[dbo].[car_info]
WHERE Car_Sl_No = 2
-- The WAITFOR below stands in for the other operations User1 performs in this transaction.
WAITFOR DELAY '00:00:10' -- For simulating real-time concurrency in this example
IF @EngineType IS NULL
BEGIN
    UPDATE [OLAP].[dbo].[car_info]
    SET [EngineType] = 'petrol'
    WHERE Car_Sl_No = 2
END
ELSE
BEGIN
    PRINT 'Record is already updated'
END
COMMIT TRAN
For User2:
USE OLAP
GO
BEGIN TRAN
DECLARE @EngineType varchar(20)
SELECT @EngineType = [EngineType]
FROM [OLAP].[dbo].[car_info]
WHERE Car_Sl_No = 2
-- The WAITFOR is the same for User2.
WAITFOR DELAY '00:00:10' -- For simulating real-time concurrency in this example
IF @EngineType IS NULL
BEGIN
    UPDATE [OLAP].[dbo].[car_info]
    SET [EngineType] = 'diesel'
    WHERE Car_Sl_No = 2
END
ELSE
BEGIN
    PRINT 'Record is already updated'
END
COMMIT TRAN
Here both users successfully update the value, but the value written by User2 persists and User1's update is lost.
Output: The final output for the record is
Limitations:
- Lower Concurrency than ReadUncommitted
- Lost Updates
Advantage:
- Eliminates Dirty Reads
In SSIS (SQL Server Integration Services): Select the task or container on which you want to set the isolation level. Then go to Properties, and set the property named 'IsolationLevel' to "ReadCommitted".
3. REPEATABLE READ Isolation Level: This is the next level up, and the main point is that shared locks taken for reads are not released until the transaction completes. In simple terms, a transaction cannot read data that has been modified by another transaction but not yet committed, and no other transaction can modify data that the current transaction has read until the current transaction completes. Concurrency is therefore quite low. This eliminates lost updates, non-repeatable reads, etc. But one big problem remains, called the phantom read. Let's look at an example.
Syntax:
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
Example: Suppose the manager of a showroom decides to transfer all cars manufactured by Honda to another showroom and to keep a proper record of this operation. We add one more column, 'TransferredSatus', to indicate whether a car has been transferred. The DBA checks for Honda cars not yet transferred by examining the value of 'TransferredSatus'; for each one found, the corresponding transfer operations are performed and the column is updated to 1 (i.e. transferred). By using the Repeatable Read isolation level here, we can eliminate lost updates, dirty reads, and non-repeatable reads. But what if, while the database is being updated, someone in the inventory system inserts a record for a new Honda car that has just arrived at the showroom? Let's see the effect.
For User1:
USE OLAP
GO
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN
-- Check for the existence of Honda company cars
DECLARE @Car_Sl_No int
DECLARE TransferingCarsCursor CURSOR FOR
    SELECT Car_Sl_No FROM dbo.car_info
    WHERE CarCompany = 'Honda' AND TransferredSatus = 0
OPEN TransferingCarsCursor
FETCH NEXT FROM TransferingCarsCursor INTO @Car_Sl_No
WHILE @@FETCH_STATUS = 0
BEGIN
    ----------------------------------
    ------ Car transferring operations
    ----------------------------------
    FETCH NEXT FROM TransferingCarsCursor INTO @Car_Sl_No
END
CLOSE TransferingCarsCursor
DEALLOCATE TransferingCarsCursor
WAITFOR DELAY '00:00:10' -- For simulating real-time concurrency in this example
-- This is when the other user inserts the record for the new Honda car.
UPDATE dbo.car_info
SET TransferredSatus = 1
WHERE CarCompany = 'Honda' AND TransferredSatus = 0
COMMIT TRAN
Here the cursor finds only 2 Honda records.
For User2:
USE OLAP
GO
BEGIN TRAN
INSERT INTO [OLAP].[dbo].[car_info]
    ([CarCompany], [CarBodyType], [CarName], [EngineType], [TransferredSatus])
VALUES ('Honda', 'sedan', 'Civic GX', 'petrol', 0)
COMMIT TRAN
But in the middle of User1's transaction, User2 inserts a record for the new Honda car. Since the record goes in before User1's UPDATE statement runs, User1 updates the new record along with the earlier ones, affecting 3 rows instead of 2 and producing wrong information in the record. This is called a phantom read. Even the Repeatable Read isolation level cannot prevent it; for that you need the next higher level, SERIALIZABLE.
Output for User1:
(3 row(s) affected)
Limitations:
- Lower Concurrency
- Phantom Reads
Advantage:
- Eliminates Dirty Reads
- Eliminates Lost Updates
- Eliminates Non-Repeatable Reads
In SSIS (SQL Server Integration Services): Select the task or container on which you want to set the isolation level. Then go to Properties, and set the property named 'IsolationLevel' to "RepeatableRead".
4. SERIALIZABLE Isolation Level: This is the highest isolation level, so its concurrency is the lowest. But it eliminates all concurrency issues: dirty reads, non-repeatable reads, lost updates, and even phantom reads. Under this isolation level:
- Statements cannot read data that another transaction has updated but not yet committed.
- No other transaction can modify data read by the current transaction until the current transaction completes.
- Most importantly, it takes range locks based on the filter conditions used to read the data: it locks not only the existing rows but also the key range into which new rows matching the filter would fall. In simple language, no other transaction can insert new rows that match the current filter condition until the transaction completes.
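While a SERIALIZABLE transaction is holding its range locks, they can be observed from another session; a sketch using the sys.dm_tran_locks DMV (key-range locks appear with request_mode values such as RangeS-S):

```sql
-- Run from a second session while the SERIALIZABLE transaction is still open
SELECT resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE resource_type = 'KEY'
```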
Considering our previous example, we will set the isolation level to Serializable.
Syntax:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
For User1:
USE OLAP
GO
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
-- Check for the existence of Honda company cars
DECLARE @Car_Sl_No int
DECLARE TransferingCarsCursor CURSOR FOR
    SELECT Car_Sl_No FROM dbo.car_info
    WHERE CarCompany = 'Honda' AND TransferredSatus = 0
OPEN TransferingCarsCursor
FETCH NEXT FROM TransferingCarsCursor INTO @Car_Sl_No
WHILE @@FETCH_STATUS = 0
BEGIN
    ----------------------------------
    ------ Car transferring operations
    ----------------------------------
    FETCH NEXT FROM TransferingCarsCursor INTO @Car_Sl_No
END
CLOSE TransferingCarsCursor
DEALLOCATE TransferingCarsCursor
WAITFOR DELAY '00:00:10' -- For simulating real-time concurrency in this example
-- This is when the other user tries to insert the record for the new Honda car.
UPDATE dbo.car_info
SET TransferredSatus = 1
WHERE CarCompany = 'Honda' AND TransferredSatus = 0
COMMIT TRAN
For User2:
USE OLAP
GO
BEGIN TRAN
INSERT INTO [OLAP].[dbo].[car_info]
    ([CarCompany], [CarBodyType], [CarName], [EngineType], [TransferredSatus])
VALUES ('Honda', 'sedan', 'Civic GX', 'petrol', 0)
COMMIT TRAN
Output for User1:
(2 row(s) affected)
Here User2's transaction waits until User1's transaction completes, avoiding phantom reads.
Limitations:
- Lower Concurrency
Advantage:
- Eliminates Dirty Reads
- Eliminates Lost Updates
- Eliminates Non-Repeatable Reads
- Eliminates Phantom Reads
In SSIS (SQL Server Integration Services): Select the task or container on which you want to set the isolation level. Then go to Properties, and set the property named 'IsolationLevel' to "Serializable".
5. SNAPSHOT Isolation Level: This level guarantees that the data a transaction reads is consistent and valid for that transaction, and remains the same throughout the whole transaction. It uses row versioning to isolate data for each transaction: a separate version of each modified row is kept in the tempdb version store, dedicated to that transaction. Updates to the original rows do not affect the current transaction.
The ALLOW_SNAPSHOT_ISOLATION database option must be set to ON before you can start a transaction that uses the SNAPSHOT isolation level. It is OFF by default because of the performance overhead of row versioning.
To enable SNAPSHOT isolation level, use the below alter database command.
ALTER DATABASE OLAP SET ALLOW_SNAPSHOT_ISOLATION ON
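You can confirm that the option took effect by checking the sys.databases catalog view:

```sql
SELECT name, snapshot_isolation_state_desc
FROM sys.databases
WHERE name = 'OLAP'
```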
We will consider a small example to illustrate the above condition.
Syntax:
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
Example: We will have User1 insert a new record into the [car_info] table while, at the same time, User2 tries to fetch the records.
For User1:
USE OLAP
GO
BEGIN TRAN
INSERT INTO [OLAP].[dbo].[car_info]
    ([CarCompany], [CarBodyType], [CarName], [EngineType], [TransferredSatus])
VALUES ('Honda', 'sedan', 'Civic Hybrid', 'petrol', 0)
Note: The above transaction is not committed yet.
For User2:
USE OLAP
GO
SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRAN
SELECT * FROM dbo.car_info WHERE CarCompany = 'Honda'
COMMIT TRAN
Output for User1:
(1 row(s) affected)
Output for User2:
One record has been inserted by User1, but a consistent version of the data as it existed before the transaction started is kept in the version store (in tempdb). So User2 reads from the version store and cannot see the newly inserted record.
Now commit User1's transaction with the COMMIT TRAN command and run User2's transaction again; the output will be as below:
You can inspect the version store for the current transaction, along with other information about it, by running the DMV (Dynamic Management View) query below before committing User1's transaction.
select * from sys.dm_tran_active_snapshot_database_transactions
Output:
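The row versions themselves can also be listed with another DMV; note that it scans the entire version store, so it can be expensive on a busy server:

```sql
-- One row per versioned record currently held in tempdb
SELECT * FROM sys.dm_tran_version_store
```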
Limitations:
- Low performance due to versioning in tempdb
Advantage:
- Eliminates Dirty Reads
- Eliminates Lost Updates
- Eliminates Non-Repeatable Reads
- Allows multiple updates by versioning
In SSIS (SQL Server Integration Services): Select the task or container on which you want to set the isolation level. Then go to Properties, and set the property named 'IsolationLevel' to "Snapshot".
Other Isolation Levels in SSIS:
- Chaos Isolation Level: Behaves like ReadUncommitted, with the additional features stated below:
- It permits viewing uncommitted changes by other transactions.
- It checks other uncompleted update transactions running at more restrictive isolation levels to ensure no conflicts are raised, i.e. pending changes from more highly isolated transactions cannot be overwritten.
- Rollback is not supported in this Isolation level.
If you only need to read the data once per transaction, the Chaos isolation level may be appropriate.
In SSIS (SQL Server Integration Services): Select the task or container on which you want to set the isolation level. Then go to Properties, and set the property named 'IsolationLevel' to "Chaos".
- Unspecified Isolation Level: When the isolation level of a transaction cannot be determined, it falls under the Unspecified isolation level, i.e. an isolation level different from the ones above is in use. For example, when performing a custom transaction operation such as an ODBC transaction, if the user does not set the transaction level, it executes with whatever isolation level the ODBC driver associates with it.
In SSIS (SQL Server Integration Services): Select the task or container on which you want to set the isolation level. Then go to Properties, and set the property named 'IsolationLevel' to "Unspecified".
Optimistic vs. Pessimistic Concurrency:
Optimistic concurrency: Here SQL Server assumes that resource conflicts between transactions are rare, though not impossible. It therefore lets transactions execute without locking any resources; only when data is modified does it check for conflicts, and only then does it lock accordingly. In simple terms, we assume every transaction will proceed without problems, apart from some exceptional cases.
Pessimistic concurrency: Here resources are locked regardless of the type of transaction, to ensure transactions complete successfully without conflicts. We assume conflicts are likely and take significant steps up front to avoid them.
Let’s have an example on this:
Suppose a customer at a car showroom wants a test drive; before the manager agrees, it must be clear that the car is free and ready to drive. What if another customer has already requested a test drive of the same car? If the manager lets both of them arrange their drives, relying on mutual understanding between the customers to resolve any clash, that is optimistic concurrency. If instead the manager wants to be certain there is no conflict and lets the customers test-drive one by one, that is pessimistic concurrency.
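In SQL Server, optimistic concurrency is often implemented with a rowversion column: read the version together with the row, and make the UPDATE conditional on the version being unchanged. A sketch, assuming a hypothetical [RowVer] rowversion column added to car_info:

```sql
-- Hypothetical: car_info extended with a column [RowVer] of type rowversion
DECLARE @ver binary(8)
SELECT @ver = [RowVer] FROM dbo.car_info WHERE Car_Sl_No = 2

-- ... the user edits the row without holding any locks ...

UPDATE dbo.car_info
SET [EngineType] = 'diesel'
WHERE Car_Sl_No = 2 AND [RowVer] = @ver -- matches 0 rows if another user changed it first

IF @@ROWCOUNT = 0
    PRINT 'Conflict: the record was modified by another user'
```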