03 September, 2010

Efficient Method of Getting Middle Rows From Sql Server 2008, 2005 OR Efficient Method of Paging With Sort In Sql Server 2008, 2005

Step 1: Create a table with 3 fields: ID int, Name varchar(50), Marks float.

Step 2: Put some rows in this table.

Step 3:

Alter Procedure Efficient_Paging_Student
@SortColumn varchar(50),
@SortDirection varchar(4),
@PageSize int,
@PageNumber int
AS

DECLARE @MaxRows INT
SET @MaxRows = @PageSize * @PageNumber

SET ROWCOUNT @MaxRows

SELECT ID, Name, Marks
INTO #Temp
FROM student
--WHERE --conditions
ORDER BY (SELECT CASE @SortColumn
WHEN 'ID' THEN convert(sql_variant, ID)
WHEN 'Name' THEN Name
WHEN 'Marks' THEN Marks
END),
ID

SET ROWCOUNT @PageSize

SELECT ID, Name, Marks
INTO #Foo
FROM #Temp
ORDER BY (SELECT CASE @SortColumn
WHEN 'ID' THEN convert(sql_variant, ID)
WHEN 'Name' THEN Name
WHEN 'Marks' THEN Marks
END) DESC,
ID DESC

SET ROWCOUNT 0

IF(@SortDirection = 'ASC')
BEGIN
SELECT t.ID, t.Name, t.Marks
FROM #Foo
AS bar
INNER JOIN student AS t ON bar.ID = t.ID
ORDER BY (SELECT CASE @SortColumn
WHEN 'ID' THEN convert(sql_variant, bar.ID)
WHEN 'Name' THEN bar.Name
WHEN 'Marks' THEN bar.Marks
END),
bar.ID
END
ELSE IF(@SortDirection = 'DESC')
BEGIN
SELECT t.ID, t.Name, t.Marks
FROM #Foo
AS bar
INNER JOIN student AS t ON bar.ID = t.ID
ORDER BY (SELECT CASE @SortColumn
WHEN 'ID' THEN convert(sql_variant, bar.ID)
WHEN 'Name' THEN bar.Name
WHEN 'Marks' THEN bar.Marks
END) DESC,
bar.ID
END

Step 4: Test

Efficient_Paging_Student 'ID', 'DESC', 4, 2

Efficient_Paging_Student 'Name', 'ASC', 4, 2

Explanation:

Using this example we can get any page (set of rows) having any number and any page size. For example if page number is 2 and page size is 4 then we get row numbers 5, 6, 7, 8. Note that row number here do not means id. Its not a fixed field. Its row number 5th, 6th, 7th and 8th AFTER sorting is done according to desired column name.

To make it efficient, we have to limit the rows returned in each query. Using "top" keyword don't work here because top keyword don't take a variable. However, ROWNUMBER can be set to any variable:

SET ROWNUMBER @VAR

This works.

The whole procedure is divided into 3 select statements, first of them returned a different number of rows than the last two. We can't combine the last two because of this error:

Msg 1033, Level 15, State 1, Procedure Efficient_Paging_Student, Line 41
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP or FOR XML is also specified.

It means that we can't apply "order by" clause to derived tables. Therefore we have to store that table value (coming from the second select statement) in a separate hash table called #foo. Its not an issue because the size of this table is same as size of page at front end (web page, windows form etc) which is usually a very small size (5 to 100 rows).

The actual load of this method is in the first select statement, its because here the number of rows selected is equal to the product of page size and page number. It means that if number of rows can be evenly divided by page size (for example 100 rows and 5 rows each table) then for the last page the query select the entire table. I don't know anyway to avoid this because the sorting requires us to go to that level, that is each row have to be accessed once no matter what (unless we are using views but thats a different story).

Once the first select statement is executed, the rest of the procedure runs in flash because the second and third select statements deal with only as much rows as the size of page. Since size of page is always very little (discussed above) therefore its not an issue at all.

At last we have to do the final sorting. Remember that "order by" clause sort by default in ascending order so I didn't wrote "asc" keyword anywhere in any query. To find out whether to use ascending order or descending order I had to use an if statement.

An important part here is selecting of column name. See:

(SELECT CASE @SortColumn
WHEN 'ID' THEN convert(sql_variant, bar.ID)
WHEN 'Name' THEN bar.Name
WHEN 'Marks' THEN bar.Marks
END)

Look that each case statement returns a column name but not in quotes !!!. It is 100% valid sql code for sql 2005 and above. Whatever column name is returns is fit in above query. Following two statements parse to same query if @SortColumn is equal to 'Name':

SELECT t.ID, t.Name, t.Marks
FROM #Foo
AS bar
INNER JOIN student AS t ON bar.ID = t.ID
ORDER BY Name DESC,
bar.ID

SELECT t.ID, t.Name, t.Marks
FROM #Foo
AS bar
INNER JOIN student AS t ON bar.ID = t.ID
ORDER BY (SELECT CASE @SortColumn
WHEN 'ID' THEN convert(sql_variant, bar.ID)
WHEN 'Name' THEN bar.Name
WHEN 'Marks' THEN bar.Marks
END) DESC,
bar.ID

Its a very important thing to remember and greatly simplify the code. For prior versions of sql server, one have to use separate if/else or case statements to get this behavior, would be very messy. Note that the select case statement above have to be written in the stored procedure, there is no way to make a separate function for that. A function cannot return name of column.

Note that RowCount is specific to current thread. Value of RowCount set to some value in one thread don't affect its value in another thread. This can be tested by opening to "query windows" in management studio:

1st window:
type and execute: SET ROWCOUNT 1

keep this window open

2nd window:
type and execute: select * from student

RESULT: All rows of student table are returned. If there are 10 rows then all 10 rows are returned.

Go back to the first window (it should be open already). Type and execute "select * from student". Only 1 row should be returned. Its because in this thread (a query window is thread, so do a procedure, a function etc) the rowcount is set to 1. Now set rowcount to 5 and execute "select * from student". Now 5 rows should be returned.

Note that a thread is specific to a user as well as a function. It means two users simultaneously executing a stored procedure are still in two different threads.

No comments:

Post a Comment