Category nameļ¼šDatabases

Paging is not so easy as most developers think it is

A lot of websites use somekind of paging mechanish to limit the results shown. Either to save bandwidth where results are served in pages or because search criteria should be made more specific by not having paging but just limit the maximum records returned.

DataGrid paging

The datagrid supports paging but this paging mechanish first fetches all records from the database and then storing the datatable in the viewstate and only showing a subset of that data to the user.

Depending on the type of data either a table or index scan will be performed by the database. If you have a filter/where clause then hopefully you will have a matching index that is also ordered as expected.

Pro’s:

  • Easy

Con’s:

  • Slow
  • A lot of network traphic between webserver and database
  • Large webpages because of the viewstate

A workaround to limit the viewstate is by reloading the dataset at each page hit and disabling viewstate on the datagrid.

 

Database paging

Instead of letting the database return all records we ask it to return only a subset of records. Most applications work with a current page and a page size. So when your page size is 10 and the current page is 2 then rows 20-29 should returned. Some databases have native support for paging like MySql with its LIMIT keyword but paging really is hell with older Microsoft SQL Server versions and with SQL 2005 you can easily hack paging in a query by using the ROW_NUMBER() function and AFAIK SQL 2008 does not add functionality to make paging just as easy as MySql’s LIMIT statement.

So what you actually want is to not be bothered with such code and the easiest way to do that is by either making use of linq to sql, the entity framework, nhibernate or any other database abstraction layer.

NHibernate

NHibernate for example has the SetMaxResults and the SetFirstResult methods. These are so easy to use to fetch a certain ‘page’.

   27             criteria

   28                 .SetFirstResult(currentPage*pageSize)

   29                 .SetMaxResults(pageSize)

   30                 .List();

 

Linq to sql

Linq to sql has the similar methods Skip and Take and work in the same way as NHibernates methods.

   27 IQueryable<T> source;

   28             …

   29             source

   30                 .Skip(currentPage*pageSize)

   31                 .Take(pageSize)

   32                 .ToList();

 

Pro’s

  • Quite fast if good indexed are available in the database.
  • Only the needed data is tranfered from the database to the webserver

Con’s

  • Paging properties must be passed from front-end to back-end
  • Cannot use most ASP.NET controls that support paging

Paging issues

Index scan

But even this can have performance issues. Imagine a table that a contains billions of records. A query that performs an index scan will be slow too! So you have to be sure that your query will use one or more indexes but that it will not be performing an index scan. For example when you have a LIKE ‘%x%’ statement. Depending on the database you use this can even result in a table scan.

Out-of-date data

When for example you are viewing a log that results in paging a log table then it is highly imaginable that log rows are being added while you are browsing the log. So when you are watching page 0 and waiting for a couple of seconds and then viewing page two could result in even newer data being shown than on page 0 because data has been added. This is due to the results not being cached and the results ordered descending on creation data (or identity).

What you would like is that when you view page 0 that page 1 will contain older data (in this log example). Here paging like described previously will not work.

The first (0) page query would be something like:

SELECT
    top 20 id, title, status
FROM
    log
ORDER BY
    id DESC

Then the query for the second page (1) would be needing the id value of the LAST row of the previous query result

SELECT
    top 20 id, title, status
FROM
    log
WHERE
    id<@lastRowId
ORDER BY
    id DESC

The solution here is not making use of a “page” but only a “next” set. This is exactly what you need in most situations and if you are not sure about this then I can refer to an excellent application that (I think) is making use of this mechanism, Google Reader. You scroll down and then data is fetched at the same time and all previous downloaded items are not removed.

Pro’s

  • Easy to implement
  • Fast
  • Page result is what you would expect

Con’s

  • Sorting will always result in viewing the result at the first ‘page’
  • Requires unique indexes / combined columns

 

Supporting non unique indexes / order by clause

So what if your table does not have a unique index for the where ORDER clause that you specified? Imagine a result where the filter column is a “name”, the page size is 10 but you have 16 people sharing the same name. This would result in a page only showing the first 10 people and the next page showing people with another name. The viewer will miss 6 people! It is this reason that we want to see all ties in the result.

It could be that you do not want to extend the ORDER BY because that could result in a table scan because the ORDER BY needs data not available in the index.

This can be solved by making use of the sql keyword WITH TIES. This makes sure that the result will always contain all rows matching the where in combination with the order clause.

SELECT
    top 10 WITH TIES name, dob, county, city, accountnr
FROM
   clients
ORDER BY
   name

This query will result in a page containing at least 16 people in the situation I described above. The page will be larger then 10 but this will not be a problem in an application that fetches data like google reader.

Pro’s

  • Very fast and most efficient usage of indexes.

Con’s

  • There it not a standard SQL syntax. Most vendors support similar behaviour (see http://troels.arvin.dk/db/rdbms/).
  • Not supported natively by frameworks like NHibernate, Linq to sql, EF
  • Recent Posts
  • Recent Comments
  • Archives
  • Categories
  • Meta