Clustered Indexes in SQL 2000

OK—

So I am teaching SQL Administration this week, and I come across an example of database objects in SQL.  Now, I have taught this class at least 10 times over the last three years and it has never really bothered me, but today it did.  Probably because I have donXML’s article on “Viral Coding Examples…”

In this particular example, they were showing a clustered index on the Lastname of a customers table.  Now, why is that a “bad” thing?  Well, first an investigation of clustered/non-clustered indexes is in order.

To start this off, a non-indexed table is called a heap.  Any time data is retrieved, a table scan is run to look for the correct data.  In a table with an inordinate number of rows, this is a bad thing.  So, we build an index.

So the DBA/developer pulls up SQL Enterprise Manager and decides to build an index on last name to make things run a bit faster.  Up pops the index property page, and the user enters the name and clicks the column or columns for the index and then stall out on the “Clustered” checkbox.  Most developers I know have no clue what the “Clustered” checkbox on the SQL index properties even does.  (I have even stood behind a developer who tried creating three of four clustered indexes **SQL gurus chuckle here**)  So what is it that the “Clustered” index actually does.

First, that unordered pile of data pages we called the heap—that gets shuffled into an ordered list of records.  On top of that, then, we create a BTree structure to quickly find the data page.  In this case, the leaf level page of the index is the actual data page.  So that being the case, what happens if I try to created a second clustered index…  Well, we rebuild the data ordered by the new index and throw out the old clustered index.

Now, there is a lot more we could say about indexes, but this entry is about “Viral” badness.  So why is it bad to create the clustered index with the Lastname as the key value?

Well, now we have to move to the insert process.  What happens when you add a new row to the database?  Well, we have to put the data in the correct spot in the clustered index (in Lastname order, right?).  So what happens if the data page that we are trying to insert into is full?  We get a page split.  The more we add records to the database, the more likely we will have page splits and the increased likelihood of database fragmentation.  Over time, this becomes a problem.  Was this the best choice for a clustering key?  Do you even have to have a clustered index?  The answers—No and No.

But wait—this is MOC (MS Official Curriculum).  Granted, the slide is not talking about the indexes.  The point of the slide isn’t indexes.  The problem is that a casual user (those who don’t take the time to figure out what clustered means) sees this and assumes that a Clustered Index on Lastname is “OK.”  My next few posts will be about indexing, but I need to substantiate the posts with pictures, so they will take a bit to draw.

Stay tuned.

Print | posted on Monday, November 08, 2004 9:29 AM

Feedback

# re: Clustered Indexes in SQL 2000

left by Don Demsak at 11/17/2004 2:43 AM Gravatar
I'm glad that my post helped someone else re-evaluate the examples we are using. And, since my wife is a development DBA, we both thank you for noticing the problem.

# re: Clustered Indexes in SQL 2000

left by DavidG at 1/25/2006 1:10 PM Gravatar
I understand the point about inserts with a clustered index on lastname.

As a new DBA, I think it would be helpful to include a supplemental case study article that goes through the calculations used to determine a good clustered index. If you add one customer a day, it seems that a clustered index on lastname could be a good choice; especially if there is opportunity to rebuild indexes.

# re: Clustered Indexes in SQL 2000

left by Jason at 8/2/2007 8:57 AM Gravatar
for employees, create a rowid column (rowguid), use it as clustered unique index, build non-clustered index on lastname, firstname, employeenumber or whatever else. Period.

# re: Clustered Indexes in SQL 2000

left by satnam singh at 9/19/2007 8:30 AM Gravatar
the article is excellent
thanking you
satnam singh

# re: Clustered Indexes in SQL 2000

left by manisha bhatia at 2/29/2008 1:22 AM Gravatar
hi
the articale is good ,it help to understandin of indexes in sql
Title  
Name
Email (never displayed)
Url
Comments   
Please add 7 and 1 and type the answer here: