Change indexes on Student_ table for improved performance

Brian Lewis reporter

In Sql Server, a CLUSTERED index on a table means that the physical records are sorted according to the order of that index. This means that when reading a range of records from the index - say, from 'C' -> 'F' ; all the required records in the table willl be contiguous, making it faster to read them.

Downside is that when a new record is written, or when a key is changed, the record needs to be inserted in the table, it has to be placed in its correct place in the sequence. If the clustered index is based on an Identity, this is not a problem - the new key is always at the end, and the key is never changed. But if the key is a uniqueidentifier; as it is on Student_ ; then keys are not allocated in sequence, so new records need to get inserted into their correct place, which can mean every insert results in a large part of the table being rewritten ( to make the space available for the insert).

RULE: indexes based on uniqueidentifiers should not be CLUSTERED !

This is a bit of a trap, because SSMS always defaults the primary key to CLUSTERED, even if it is a uniqueidentifier. Always worth checking this when building a new table: which index - if any - should be the clustered index?

2019-04-18T00:47:50+00:00

Brian Lewis reporter

Another issue relates to multi part indexes:

to find all Students not accounted for in an enrolment year, loadndoeStudents repeatedly runs:

studentID not in (Select stuID from StudentEnrolment_ WHERE stueYear = @SurveyYear)

Clearly an index on stueYear will help - but how do we get the stuID for each of the records in that year?

If the index just uses stuID, then the query will need to go back to the record in the table to get this value.

An alternative would be to make a multi-part key StuYear, stuID : but this complicates inserts, since the index entry now gets inserted into a specific place in the index, and the tree structure of the index may end up being frequently rebalanced.

Compromise position is to use the INCLUDE clause on the index:

CREATE NONCLUSTERED INDEX [IX_StudentEnrolment_Year] ON [dbo].[StudentEnrolment_]
(
    [stueYear] ASC
)
INCLUDE ( [stuID])

This stores the stuID on the index entry, but does not make it part of the index. This means that the original query expression:

studentID not in (Select stuID from StudentEnrolment_ WHERE stueYear = @SurveyYear)

can be evaluated without reading the table data, everything needed is in the index.

But adding a record is still simple because any new record can just be placed at the end of the index entries for the year.

2019-04-18T01:00:05+00:00

Brian Lewis reporter

changed status to closed

Pull Request #306 resolves

2019-04-18T01:13:42+00:00

Comments (3)