Hash table, parallelism and Cursor performance in practice

Oopps! Upgrade your browser pretty please. Oopps! Upgrade your browser pretty please.

This Christmas my kids finally showed some interest in Lego and I got the opportunity to practice Hash tables, parallelism and cursors in reality. I saved all my Lego from my childhood to be able to pass them on to my kids. It’s more than 20 000 pieces (records) stored in one single box (like a table in SQL server) with the measure 70x40x40cm. The pieces are unsorted like a hash table without a clustered index, well, without any index at all.

The building instructions are very sequential, like a cursor, you have to get piece X from the box and connect to your construction before you continue to next step. In this case, my kid wanted to build a Bionicle fortress and to build it we need to get approximately 500 pieces and complete more than 50 steps.

Every piece (record) we need to get from the box (table) has some attributes (columns) that have to match, like color, size and shape, so we have to scan the pieces in the box to find the matching piece.

As you already figured out, the pieces are unsorted and can be anywhere in the box. For every piece we need to get, we have to scan all the pieces until we find the right one. In the worst case, we need to iterate through all 20 000 pieces >500 times. We can illustrate this with this pseudo code:


SELECT size, color, shape FROM [building instruction]

OPEN curr


FETCH NEXT FROM curr INTO @size, @color, @shape

            SELECT TOP 1 Piece FROM legoBox
            WHERE size=@size AND color=@color AND shape=@shape           

            FETCH NEXT FROM curr INTO @size, @color, @shape
CLOSE curr


The fun part with lego is not the search part, it’s the building part and to keep my joy for my kid we divided the work between us. He is the client requesting pieces, and I had to act as SQL engine.

When the client requested a piece, I had only one choice as the data is unsorted:

–          Scan the box and stop when I found the piece, and keep doing this until all requests are handled

As you can see, this can be time consuming, due to the number of requests.

Well, I actually had a second choice. I could pick up pieces in chunks (data pages), sort them, keep them in my workspace (cache) and scan the workspace (cache) on subsequent requests. The sort phase of this approach is time consuming, but could save some time on subsequent requests.

The problem with the second approach is that the sort is time consuming and my workspace (cache) is limited and when I run out of workspace, I have to put the pieces (records) back into the box (table).

Do I have another option? Well, I could try to use some kind of qualified guessing and pick up pieces that I might need (read ahead). This can save some time, but consumes workspace with pieces that I might need.

I did try to use more kids (worker threads) to run the scan of the box in parallel to reduce the time, but the amount of work wasn’t evenly spread between the kids and me so one worker completed the scan faster than the others and had to wait for the rest of the worker (cxpacket waits) until the piece was found. The problem still exists, I have to scan the complete box for every piece that we needed.

The best option would have been to pick up all the pieces, sort them and stored them sorted in another box. That could save time depending on how I sort them. If I would give all pieces a unique number (a fictive key) and sort them by the number I wouldn’t save much time as I need to get them depending on the attributes:

–          Size

–          Color

–          Shape

If I sort them based on the attributes I save a lot of time as I can to a quick seek (index seek) for every individual piece that the client requests, but I still get 500 requests.

The best option would be sorting them based on the Lego number, then I could get all the pieces I need for the building instruction into the workspace. As all 500 pieces fits in the workspace, I only need to scan the workspace, not seek as the pieces are not sorted by attributes.

Actually, the absolute best option would be sorting the pieces based on attributes and store all pieces for a construction in separate boxes (partitions). In this case I only need to seek for my piece in a subset of pieces (partition elimination). This will only take a fraction of the original time.

The problem with the last two options are that I haven’t stored the Lego number along with my pieces.


–          It is time consuming to scan unsorted data over and over again

–          It is time consuming to sort the data, avoid doing it when you get your data

–          Parallelism is good when the amount of work is evenly spread

–          Iterations like cursors are time consuming

–          Fetching sorted data can be really fast, depending on how the data is sorted

–          Partitioning can increase the performance a lot, depending on the partitioning key

–          Design your data model wisely from the beginning.

Recommendation for SQL server:

–          Always create at least a clustered  index for tables that contains more than just a few records

–          Choose the clustered index wisely, a fictive key (like an identity column) doesn’t help you in your search as you never search for the fictive number

–          Don’t waste your clustered index on columns you never use in filtering

–          The clustered index need to be as narrow as possible as it is included in any other non clustered index.

–          Create additional non clustered indexes to cover other search criteria

Some people recommends using a narrow and ever incrementing column as the clustered index, like an identity column, but that might cause you some trouble with:

–          Key lookups, due to the fact that you never search the fictive key

–          Hot spots in the last data page, as all data should be placed in the last page

–          Statistics that are out of date as 20% of the table has to be changed before the statistics are auto updated.

On the other hand, if you don’t use an incrementing column you end up with page splits and fragmentation of the index.

To be able to tune your database for best performance you need to know your data and how it’s going to be used.