SQL Server and My Disaster Recovery Experience

How true when they say, you learn best from experience.

Last month, I got a call from a client requesting an onsite visit to check why their SQL Server backups are failing. So I went and immediately ran a full database backup. A dreaded error message greeted me. The following is the error:

10 percent processed.
20 percent processed.
30 percent processed.
Msg 3271, Level 16, State 1, Line 1
A nonrecoverable I/O error occurred on file “K:\DATA\xxxxx.mdf:” 1117(The request could not be performed because of an I/O device error.).
Msg 3013, Level 16, State 1, Line 1
BACKUP DATABASE is terminating abnormally.

I felt a brief terror and the first step that I did in solving the situation was post the error message as a status on my Facebook to convey to my friends how my day is gearing up to be undesirable. How genius of me. Right? 🙂

Then I checked the last known good backup and found out that the server has not been backing up successfully for the last 10 days from when they reported the problem. The client never thought the problem would be critical as the applications are running smoothly uninterrupted since they can remember.

As I investigate the problem, it was concluded that one disk in one RAID 10 array is defective. Worst, the array was meant to store the MDF files. The mirror in the RAID 10 array doesn’t seem to kick in too. The other arrays where the LOGs, the TEMPDBs are located SEEM to be working fine. Hardware wasn’t my expertise so I never bothered to spend time on what went wrong with the hard disks and RAID controllers and decided to just focus on how get the data intact through other means since backing up is failing.

Luckily for me, the TRANSACTION LOGs are doing fine and can be copied. However, as I restored from the last known known good full backup and restored all the transaction log backups that were done right after, it was determined that TRANSACTION LOG backups have stopped too when SQL Server Agent stopped running few days prior. When I attempted to backup the transaction log, I got a similar I/O error. I then remember one of Kimberly Tripps’ PASS SUMMIT videos on disaster recovery. In the video, she demo’d how SQL Server handled when an entire disk volume/array went black. Then hope came on to me. Though no one encounters this every day, and probably no SQL Server professional has encountered this in his/her entire SQL Server career, I felt some degree of hope that the problem I am about to deal with is solvable. The mere fact that the applications depending on the database that can’t be backed up are running perfectly fine tells me something in SQL Server is allowing the database to stay and used online.

My first instincts based on my limited actual experience on disaster and data recovery were to do the following:

  • Never to turn off/restart the server nor restart the services, which I did
  • Ask the client if they can go offline while I resolve the issue, which they allowed so I turned off web server running the applications
  • Find a way to get a full backup of the current state of the database, which I failed
  • Find a way to copy all the associated files (MDFs, LDFS, etc) and having it restored on another SQL Server instance, which I also failed. The MDF file simply just can’t be copied using various tools.
  • Google for relevant info. There were lots but none was as specific to assist me on how to go about it one step at a time
  • Ask fellow SQL Server practitioners about it, but I never came to actually go about this during the whole exercise of data recovery
  • Watch again Kimberly Tripp’s video, which I did and gave me some indirect insights how SQL Server behaves when it is crippled.

What eventually allowed me to recover the data wasn’t accidental but a result of some sound practice in database deployment. There were two things that allowed for recovery:

  1. The LOG files saved the day for me as they happened to be stored on a separate container (disk array) since the day the server was first put up.
  2. SQL Server had ample memory (RAM). It has more than enough memory (RAM) to contain more than twice the size of the database concerned.

So what really happened? How did the storing of the TRANSACTION LOG files on a separate disk array allowed the system to stay good and recoverable? How did memory helped out.

  • SQL Server has the ability to keep data for as long as it can in memory. For read only queries, SQL Server will never bother to read data from data files on disk for as long as it can, provided data aren’t flushed out of memory (e.g. when a need for another memory is requested and there is not enough memory for SQL Server to grant the request another chunk of memory).
  • UPDATES are also reflected on memory and the transaction is recorded in the TRANSACTION LOGs when a database is in Full Recovery Mode. SQL Server does not need to update data files for as long as it can. Data are persisted when TRANSACTION LOGS are backed up in this mode.
  • Coincidentally (well not really as the client followed the advice to have ample memory based on earlier recommendations), the ample amount of memory the SQL Server has allowed the memory manager not to flush out data from memory to accommodate other requests for other data.
  • What appeared to be a problem though was READing from the disks. It appears SQL Server can write to the LOG files but can’t back them up, thus, it allowed SQL Server to continue updating data in memory and record the transactions in the log files allowing the applications to still working. I can do a SELECT * FROM table despite the disk read problems (for SQL Server is getting it from memory). I just can’t backup the database whether full or transactional.

What I eventually did to successfully recover 100% of the critical data are the following:

  • I restored the last known good backup sets (FULL and TRANSACTION LOGS) to another defect-free location
  • I compared the row counts and contents of tables (FROM MEMORY vs LAST KNOWN GOOD BACKUP). Turned out only 2 tables had differences and these tables are the most critical and frequently used ones.
  • I simply copied data from memory to another database located in a defect-free disk array.

That was it. I recovered everything I needed.

After I made sure everything was fully recovered and have the database installed on a new server and have the applications running for a few days, the defective server was decommissioned. Before laying it to rest, I played with it with fear of losing data and knowing it is still essentially an active ‘production’ server.

Here are some experiments I did:

  • Tried various file recovery tools in copying the almost 2GB MDF file to no avail. The attempt always ends up with an I/O error.
  • I tried updating the database, insert new records in the two critical tables and it worked flawlessly as if nothing has gone wrong. I was able to INSERT 100k records with no redflags. I was able to edit 5k records and delete 5k records with no clamor whatsoever.
  • I attempted to try FULL BACKUP one last time and the error is now different. It says:The operating system returned error 21(The device is not ready.) to SQL Server during a write at offset 0x0000000023c000 in file ‘ K:\DATA\xxxxx.mdf ‘. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online. (Microsoft SQL Server, Error: 823)Apparently, the array can’t be accessed anymore. Maybe I had subjected it with added stress it gave out. But HEY, I can still access the database and do some updates.
  • Lastly, I rebooted the ‘newly decommissioned’ database server and it just won’t start anymore.

Hope I don’t get to experience this again until I retire from doing SQL Server 😀

**************************************************
Toto Gamboa is a consultant specializing on databases, Microsoft SQL Server and software development operating in the Philippines. He is currently a member and one of the leaders of Philippine SQL Server Users Group, a Professional Association for SQL Server (PASS) chapter and is one of Microsoft’s MVP for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com

mvp

Performance Tuning : Tiny Queries Help in Identifying Memory Pressure

Often, when given a database server experiencing severe performance problem, our attention is focused more on what is the obvious. And typically, we consider large, complex and long running queries as immediate suspects. We often consider simple, small, and short running queries the least of our worries. In this blog, I’ll show you how these tiny queries help in determining if your SQL Server box is under memory pressure.

However, you have to understand first how SQL Server utilizes memory. In a nutshell, queries that need memory allocation ‘requests’ a certain amount of memory from SQL Server’s memory broker. SQL Server then computes how much memory it can give to the query. Initially, 25% of the query memory is computed and goes through some more belt-tightening schemes to determine the final X amount of memory to grant to the query. When memory is available, the query is immediately granted an X amount of memory and the query executes and returns back the memory right after it has done its work. When memory isn’t available on the other hand, the query is put on hold in a queue until its next turn to do the cycle of ‘requesting’ memory or until it has timed out (few seconds up to 24 hours).

On a server experiencing severe memory pressure where simple, small and short running queries are put on hold for execution due to lack of memory to use, a few lines of query that needs 20KB of memory and used to execute for under a second can take minutes, hours of waiting before it gets its requested memory. Then it continous to execute and finishes its work. It can even timeout if memory isn’t ever granted. Thus, this gives everyone the impression that something is very wrong with the query and/or the server.

If you have these tiny queries put on hold very often by the memory broker, your SQL Server badly needs additional memory (RAM) even if you have 256GB of memory in your machine. If you have these tiny queries timed out after 24 hours of waiting, your server is funny. Though you may have 256GB of RAM in your server, it is simply not enough.

For a server experiencing severe memory pressure, you would get a handful of these tiny queries, which I would personally define as queries needing 2 Megabytes or less of memory to execute, put on hold every 5 seconds. Your circumstances might be different from mine so you can vary this number. For a server with plenty of memory to spare, you would not see these tiny queries put on hold for days or weeks or months or ever.

SQL Server provides a way though to determine if tiny queries are put on hold often. You may capture the results of the following query in a regular interval for days to monitor. I’d usually do 7-day monitoring runs:

SELECT
    COUNT(*) AS small_queries_on_hold
FROM sys.dm_exec_query_memory_grants  
WHERE
    requested_memory_kb <= 2048  — 2 MB
    AND grant_time IS NULL

Do this for every 15 seconds or less depending on how your server is experiencing the pressure. For a quick simple run, you may execute this query in SSMS when you are experiencing severe slowdown and have F5 pressed is regular intervals for a few minutes.

If the result for small_queries_on_hold  is always zero, you may vary/increase requested_memory_kb for up to 10MB. If you get zero result still, you may be experiencing problems other than having memory pressure.

A non-zero result for small_queries_on_hold suggest you may be experiencing severe memory pressure. It means, even small queries have a hard time getting memory grants from SQL Server. This should prompt you right away to request for a purchase of  additional RAM for SQL Server.

You may also want to run the following query to see if there are queries, regardless of memory size requirements, that are put on hold.

SELECT
    COUNT(*) AS small_queries_on_hold
FROM sys.dm_exec_query_memory_grants  
WHERE
   grant_time IS NULL

Once you have upgraded your server’s memory and give more memory to SQL Server, you may want to run the above queries once again to see if there are small queries that are put on hold by SQL Server’s memory broker. Chances are, you would see less occurrences of small_queries_on_hold or you have completely eliminated the occurrence of small queries having put on hold by SQL Server.

Increase your RAM until you don’t see the occurrences or if you can tolerate the number of small_queries_on_hold on a certain time window.

This is all folks. Hope this helps!

**************************************************
Toto Gamboa is a consultant specializing on databases, Microsoft SQL Server and software development operating in the Philippines. He is currently a member and one of the leaders of Philippine SQL Server Users Group, a Professional Association for SQL Server (PASS) chapter and is one of Microsoft’s MVP for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com

Making a Case for SQL Server over SQL Database in Azure

When Windows Azure went online some years back, the buzz prompted me to check on it too. But my main reason to check out Microsoft’s cloud solution Azure was Azure’s SQL Database (formerly SQL Azure). Initially, the Azure SQL Database was pretty lame if you start comparing it to its on-premise roots (the Microsoft SQL Server). A SQL Server veteran’s top 10 lists of things he can find in Azure SQL Database would probably be composed of things that SQL Database doesn’t have.

  • no way of backing up databases
  • no automatic database sharding mechanisms
  • no fulltext indexing and seaching
  • and yeah it doesn’t do this
  • and it doesn’t do that
  • etc …

Then Microsoft introduced FEDERATION as a way to make things elastic in terms of scaling database growth and people have hailed the feature as something that will allow Azure SQL Database to make a name of its own as the feature isn’t even available in SQL Server.

Then, one begs to ask if Microsoft could have allowed SQL Server to run on Windows Azure instead. Today, Microsoft gave us that power to provision, install and run SQL Server on an Azure Virtual Machine just like what we do on an in-premise environment. To some, this could put Azure’s SQL Database to the back end of the long line.

Why SQL Server Over Azure SQL Database?

The answer is simple. It is so we can run existing applications on Azure that are using SQL Server features not found in Azure SQL Database.

However, if the goal is just to migrate existing SQL Server based applications to the cloud, then we might be missing some of the major reasons why we have the cloud in the first place. And we dont have to go to that direction for this article. People are aware that aside from wanting to get away with dealing with the nuances of an in-premise infra, to some, going for the cloud simply means cloud scale load. And for those of us who still wanted an SQL way of dealing with data in the cloud, Azure SQL Database, can definitely handle any cloud scale load thrown its way as it has inherent mechanisms, like FEDERATION, to address issues related to handling load on a massive scale. But there are also a lot of things to be desired that are still not available in Azure SQL Database. Eventually, Azure SQL Database will mature and we will not have to deal with the missing pieces but some of us can’t wait any longer.

So the questions now are the challenges one need to face to use SQL Server in Azure instead of Azure SQL Database, specifically one that will allow cloud scaling. The same questions have been bugging my company so we can transition our applications and services over to the cloud (Azure specifically) without throwing away so much of the current SQL Server related assets that we have. The primary reason for this is that, aside from adopting the cloud, we are still maintaining our in-premise code base until when there are no more clients depending on it. But I don’t see our existing clients going in that direction so soon. The way I see it, the challenge would be our ability to maintain and run two code-bases. One for in-premise, and the other for the cloud. Eventually, the code for the cloud will become the code for in-premise.

So to rationalize all these based on my own and company’s scenario, it seems running SQL Server on an Azure VM will be the way to go. The challenge will be to use SQL Server on Azure as storage and data management engine of choice for a multi-client SAAS (software-as-a-service) type of system. I have identified a few major issues that I need to deal with from the get go:

(1) how to handle horizontal scaling

How would I shard? I would think a single database per client would be easier. Given that SQL Server wasn’t designed to be automatically elastic, I would probably need to develop a mechanism that would create a new database on some SQL Server instance on an Azure VM as soon as a new client registers. These VMs can be on the same or different Azure instance. The SQL SErver instance on Azure VM can be pre-determined, provisioned and should be running on standby and wait for new registrants.

(2) full text

Fulltext indexing is currently not available in Azure SQL Database. To provide a similar service, one needs to look at 3rd party search solutions like Lucene.NET. The difference between the fulltext indexing in SQL Server and that of Lucene is that, the former allows you to mix things up in your TSQL queries in the same TSQL statements. By using SQL Server on an Azure VM, one can have fulltext indexing and searching. However, SQL Server fulltext can only be configured on a per database basis. To do fulltext across multiple databases across multiple SQL Server instances on various Azure VM instances, one has to probably make SQL SErver fulltext work like that of Lucene.NET. To probably do that, one has to treat each client database as a document.

(3) reporting services

The simplest implementation for a multi-client setup I can think of right now is to include the login information in the report’s parameter and hi-jack the info to get the right database to query. But I am not sure yet if this is a good approach security wise. The other option is to use the report RDL control and access the RDL files from the apps.

(4) High Availability & Backup

Going for SQL Server on an Azure VM means you have to deal with High Availability issues like you do with in-premise implementations. Though Microsoft made sure that the Azure instances one is using all have redundant instances, there were downtimes that you don’t have control that you have to deal with. The good thing with using SQL Server on an Azure VM is that, full database backups like those you do in in-premise setup are available.

Given the above challenges, I would definitely go for SQL Server in an Azure VM over Azure SQL Database. It is probably much easier to fashion out a custom sharding mechanism or horizontal scaler, a lucene.NET like implementation using SQL Server’s fulltext than deal with a lot of Azure SQL Database missing pieces. By also picking SQL Server as the storage of choice, I could probably port the entire system with negligible changes to other cloud infrastructure provider. Should a client decides to run everything off the cloud, then that wouldn’t be a problem either.

It seems to me now that Microsoft has provided me a potent option with having SQL Server on an Azure VM. I don’t have to wait until they get everything I need in Azure SQL Database. Surely, my personal and company’s needs  would make a case for SQL Server in Azure.

**************************************************

f677754256
Maira-ira Beach
Photographed in Pagudpud, Ilocos Norte

**************************************************
Toto Gamboa is a consultant specializing on databases, Microsoft SQL Server and software development operating in the Philippines. He is currently a member and one of the leaders of Philippine SQL Server Users Group, a Professional Association for SQL Server (PASS) chapter and is one of Microsoft’s MVP for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com

Building, Deploying Databases Incrementally On Multiple Servers

SSDT + DACFx = Cool Database Version Control

 

For years, building and deploying a database to multiple clients was all pain and a lot of work. I have relied mostly in hand-coded solution to keep my sanity intact. The goal was simply to synchronize all databases deployed to various clients from a single schema which our company maintains. The goal was very simple, yet it was easier said than done. Often, one or more of these databases gets less and less updates over time which made things difficult to track. Some of these are of our own doing and some due to the inherent nature of our business as we don’t necessarily and automatically update databases as we wish. Clients need to sign up and approve of the updates and upgrades to the system and the databases.

In our case, there are tens of thousands of stored procedures to maintain and hundreds of tables, indexes and other SQL objects to track.These objects should be also in synched with the applications as version 1.0 of apps have different sets of Sql Server objects than application version 2.00. And we maintain up to 3 different versions hoping the clients would sign-up to upgrade to the latest faster than we solve all these complications. Some of these objects evolved from more than a decade of operation. In the end, databases get out of synched from the model that we have at hand as versions upon versions in various clients change. Today, the challenges are even greater. As we transition to a more faster and smaller roll out of updates (people refer nowadays to a more ‘agile’ way of evolving applications), new features, changes, fixes, deprecations, or whatever is on the plate increase the chances of our database out there getting out of synched.

A Viable Solution : SSDT and DACFx

Now enter the realm of SSDT and DACFx, it has provided us a free and automated way of doing what yesterday was a major headache for most of us. Some semblance of these workings started with SQL Server 2005 SP4 but I have waited for SSDT to have gone out to fully embrace the solution. Now there is that ‘silver bullet’ that would allow us to slay a werewolf of database proportions. So what are these tools to build databases incrementally and have all the increments deployed in an automatic manner (yes, it only involves a single command with a few parameters to dabble with!)?

The Tools

All we need now are the following:

  • SQL Server Data Tools (SSDT). A free database development tool that will allow us to design our databases and package it for deployment from version 1.0 to a series of minor and major updates to its schema. It is better to check out what it is if you still have not heard of SSDT. Coupled with DACFx, you can build Data Tier Applications (DAC).
  • DACFx. SQL Server Data-Tier Application Framework which is a set of tools, API’s and services intended to simplify our development, deployment and management of SQL Server schemas. The framework specifically allows us to build Data Tier Applications (DAC).
  • SQLPackage.Exe, a DAC tool use to deal with DACPACs and a lot more.

How These Things Work

Here, I would simply present a typical scenario everyone can relate and see how these tools work. Our goal is to deploy TESTDB v1.0 and update it with TESTDB v1.1. Using SSDT, I will have to author my TESTDB v1.0 with probably Table1 and StoredProcedure1 in it. Then I have to get the database snapshot.

snapshotdb

This forces SSDT to generate a DACPAC file containing metadata and instructions on how my TESTDB v1.0 is like when it is deployed. For this article I got the file TESTDB_20130301_11-32-20.dacpac under the snapshots folder of the project. The file naming gives me instant version management of some sort and that is neat. Using SQLPackage.Exe, I could go to any client’s SQL Server, fire up CMD and run TESTDB_20130301_11-32-20.dacpac.

sqlpackage

SQLPackage.Exe will proceed to deploy TESTDB v1.0 on any instance of SQL Server that I want. I basically only need to know the proper connection string that I can use for the process. But of course there is more to that, but I leave all that outside of this article for now as I wanted to show how easy it is all these can be achieved.

At this point in time, I got the following:

  • A simple, stand-alone dacpac file that I can use to deploy TESTDB v1.0 on more servers. I can have others or the clients do that for me. No more of those hand-coded scripts that could easily get tampered or accidentally modified for me. I can keep this file knowing this sets up the schema version 1.0 for my TESTDB. And I can use it weeks, months or even years after without worrying to recall what it needs to do.
  • A ‘testdb1’ database sitting on a new server
  • and a head-ache free afternoon

At some point in time, I probably might need to upgrade TESTDB with new tables and stored procs. We can refer to this as TESTDB v1.1. I can do that with SSDT by opening the TESTDB project, add the new objects and have a snapshot once again of the latest version of TESTDB to get a new DACPAC associated to TESTDB’s new version/updates. I just have to run the new DACPAC file on new or existing servers. This is where these tools get interesting and there are possibly varying outcomes depending on what SQLPackage.Exe encounters:

  • if testdb1 exists in the target server, SQLPackage.EXE will update the schema as necessary. It will check existing objects, leave it as is or update it as necessary and add new ones.  It will do this without DATA LOSS. Yes, no DATA LOSS.
  • if testdb1 does not exist in the target server, it will simply create testdb1. Objects from previous versions (e.g. TESTDB v1.0) will be recreated up to the latest version (which is TESTDB v1.1)
  • A possibility of DATA LOSS is also detected during this process. For example, if a certain column from v1.0 was changed from VARCHAR to INT in v1.1, SQLPackage.Exe prompts the user and cancels the entire update. THIS prevents DATA LOSS.
  • A version drift is also detected during this process. A user is prompted if a modification of the database schema outside this formal and controlled process of deployment is present. Once detected, it also cancels the entire update. THIS prevents DATA LOSS and structural modifications on databases that were tweaked outside of this process for some valid reasons.

With all these free and simple yet very effective deployment tools, I’d have my weekends for more non-tech activities. Oh well, I should have gone fishing but these tools tell me to tell others so they can go fishing too. 🙂

**************************************************

IMG_0468
𝗦𝘂𝗹𝗳𝘂𝗿-𝗯𝗶𝗹𝗹𝗲𝗱 𝗡𝘂𝘁𝗵𝗮𝘁𝗰𝗵, 𝘚𝘪𝘵𝘵𝘢 𝘰𝘦𝘯𝘰𝘤𝘩𝘭𝘢𝘮𝘺𝘴
Photographed in Mountain Province

**************************************************
Toto Gamboa is a consultant specializing on databases, Microsoft SQL Server and software development operating in the Philippines. He is currently a member and one of the leaders of Philippine SQL Server Users Group, a Professional Association for SQL Server (PASS) chapter and is one of Microsoft’s MVP for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com

Database Server Provisioning: Items Of Concern Using Commodity Servers

There were series of requests for info directed at me lately from clients and acquaintances on how to best deal with their need to come up with the right server specification for their SQL Server based applications.

Upon some substantial researching, I came to a conclusion that preparing a new environment for our SQL Server isn’t an exact science. It almost always followed by lots of questions where answers are hard to come by easily. The complexity of coming up with a sound specification is heightened by the lack or low frequency of our exposure to the task at hand. I mean, how many of us are tasked to come up with a new server for some new SQL Server based application on a daily or even weekly basis?

So I thought of compiling what I have stumbled upon the Web, the online fora, and from interactions with various hardware vendors. This compilation of information would probably serve as a starting point to coming up with sound configuration of a machine that would run our SQL Server and would also serve as my guide and reminder of the things that I need to be concerned of.

Today, we have so many options to choose from. But I only need to focus on those that will affect SQL Server’s performance using commodity servers. So what is in the market today that needs our attention?

CAUTION: I’ll only assume that the server will be configured as dedicated only to SQL Server. I won’t cover virtualized scenarios here.

Processing  Unit

  • 64bit. Going 64bit is the way to go in most of us. This should allow us  to go beyond the memory addressing limits of a 32bit system. SQL Server need not task itself too with the extra memory handling posed by 32bit.
  • Speed. Faster core speed would generally give us better performance. So I’d say buy the fastest per core speed processing unit your money can have.
  • Cache. Larger cache are better. Cache allows the CPU to keep frequently used data nearer and faster to the processing unit. The more it can accommodate, the better.
  • Thread. Today’s processing units offer flavors that have multi-threading in the cores. Extra threads allow us to have more ‘processors’ (logical) than what we actually have (physical) to work on a given task. SQL Server takes advantage of this. In fact, the maker of SQL Server states that the compute capacity of the 2 logical processors in the hyperthreaded core is greater than the compute capacity of the same core with hyperthreading disabled. So go for the processors with threading.
  • Cores. Today’s staple are from commodity servers have 4, 6, 8 cores per socket. Go for the highest numbers your money can buy. Be aware though that starting with SQL Server 2012, the number of cores that you have comes with licensing implications, so you need to check on these concerns too.
  • Sockets. Today’s commodity servers have 1, 2 and 4 sockets. Though your application may need only 1 single processor, it is sensible to factor in growth and future expansion options. You may opt for a server with spare sockets for growth. However, do take note that you may have difficulties acquiring additional processors for your server if you decide to expand after several years. It also make sense that if you have enough money to fill the spare sockets with the processors you need. Be aware though that starting with SQL Server 2012, the number of sockets that you have comes with licensing implications, so you need to check on these concerns too.

Memory

  • Size. More is better. Period! 🙂
  • Density. Go for the denser memory modules so don’t quickly fill the limited number of memory slots meant for expansion. E.g., If your server has 4 memory slots and can handle up to 32GB max and you need 16GB of memory, it is better to get two 8GB modules than four 4GB modules so you still have two slots for expansion. Expanding to 32GB will render your 4GB modules unusable.

Storage and Related Components

Probably the slowest components in your server, choosing the right combination of storage and related components can lead you to a better, performant SQL Server database server.

  • Capacity. In SQL Server terms, server storing capacity isn’t disk size. SQL Server has various ways of utilizing storage. You can have multiple file groups across several arrays or disks in multiple enclosures. Size is really dictated by what your application needs. Determine first your size estimates and have enough buffer for growth. If you can expand size in the future without having to shut down and reconfigure your system, that is for me is the most sensible path to take.
  • Disk Size. Individual disk size may or may not affect SQL Server. But do note that size can relatively affect seek speed. Go easy with larger disks. Heads can travel longer in larger capacity drives. So if you are building an array of multiple drives, you may prefer building it with smaller capacity drives.
  • SATA, SAS. To make things simple, SAS drives are faster but are more expensive than SATA drives.
  • SSD. Solid State Drives are the future but it is here now. Though I have not personally tried this yet in a production server, vendors and customer testimonies tout SSDs are superbly faster in a lot of cases and increases significantly the server’s performance. There are still reliability issues that I have read but things are getting better each day. I’d probably want to utilize SSDs for my tempdb.
  • RAID. There should be more than enough information on RAID and SQL Server on the web. It is one piece of component an SQL Server DBA need to be concerned of and familiarity is the key. There are plenty of storage/server options that allows you to have multiple RAID controllers and build various array types. It is just impossible to have it covered here in this article. I’d say it is always better to check with your vendor. However, I have the belief that in SQL Server, the more RAID arrays you have spread across several controllers using the fastest, most reliable and less dense disk drives are the way to go. This way, you can freely spread load such as separating transaction logs, tempdb, data files to their own arrays. You may check on other blogs on this matter.

Ethernet Ports

  • Bandwidth. Servers with 1Gbps ports are now a norm but I have seen some server models with 10Gbps ports. However, you can only take advantage of these high volume ports if your network (cabling, switches, etc) supports it. I would assume that within the datacenter, or the least within your servers’ rack/enclosure that you have the fastest links. SQL Server will take advantage of whatever bandwidth you have in your box.

Blades, Storage Area Networks, NAS and External Storage options

These types of options present a bit different set of concerns to an SQL Server DBA as compared to standard/commodity servers.  But the nuances of the common components I have mentioned above can still be applied on these options. I would love to expand this article to cover these options. This is all for now!

**************************************************

IMG_0103_whiskered_treeswift
Whiskered Tree-Swift, 𝘏𝘦𝘮𝘪𝘱𝘳𝘰𝘤𝘯𝘦 𝘤𝘰𝘮𝘢𝘵𝘢 (female)
Photographed in Mt. Palay Palay, Cavite

**************************************************
Toto Gamboa is a consultant specializing on databases, Microsoft SQL Server and software development operating in the Philippines. He is currently a member and one of the leaders of Philippine SQL Server Users Group, a Professional Association for SQL Server (PASS) chapter and is one of Microsoft’s MVP for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com

Some Major Changes to SQL Server Licensing in 2012 Version

Last month, Microsoft has published changes to SQL Server’s licensing model when the 2012 version hits gold. A number of major changes will likely catch everyone’s attention. Here are those that caught my while:

  • Editions now are streamlined to just three (3) flavors:
    • Enterprise
    • Business Intelligence (new)
    • Standard
  • Enterprise Edition will now be licensed per CPU core instead of the usual per CPU Socket (processor).
  • Server + CAL Licensing will only be available in the Standard and Business Intelligence flavors.
  • Core licenses will be sold in Packs of 2 cores.
  • Cost of per core license will be 1/4 the price of current SQL Server 2008R2’s per processor cost license
  • Cost of Server CAL Licensing has increased by more or less 25%
  • To license your server based on cores, Microsoft requires you to buy a minimum of 4 core licenses per CPU socket.

These changes will definitely impact your upgrading plans. To carefully plot your licensing strategies, you can visit SQL Server 2012 Licensing Overview.

Where I See These Changes Have An Impact

  • Scenario 1: Upgrading from Servers with dual cores or less. If you are upgrading to a very old server with 2 CPUs with single core for each, you need to buy 4 core licenses for each CPU. That means you have to buy 8 core licenses. It would be rather logical that when doing this, you might as well upgrade your hardware. Besides, your hardware upgrade seems to be long overdue.
  • Scenario 2: Upgrading from Servers with more than 4 cores. In the past, If one experiences the need to have more CPU power for SQL Server, one only has to buy a new server with as many cores per processor that money can get, reinstall your SQL Server and you are good to go … all without minding licensing issues and cost. With SQL Server 2012, you will be spending more as you go beyond 4 cores. Going 6 cores, you need 2 more core licenses. Going 8 and 10 core, you need to buy additional 4 and 6 core licenses respectively.
  • Scenario 3: Upgrading from Servers with Server + CAL Licensing. There is a 25% increase in cost for CALs.

For those of you who are still using versions 2000 and 2005 and are planning to upgrade but find these 2012 changes unpalatable, you may want to rush upgrading to SQL Server 2008 R2 instead and get stuck in that version’s perks and licensing model before your resellers remove the older SKUs with older licensing models from their sales portfolios and force you to buy SQL Server 2012.

Of course, SQL Server 2012 Express will still be FREE!

Juneau That You Are On Steroids With Denali?

Supporting multiple versions of SQL Server for a single application typically means headache. And that is what I have been in since I started making commercial applications (banking mostly on the benefits provided by T-SQL). We have not abstracted our databases with non-TSQL objects in the mid-tier for very valid and interesting reasons. In my current situation, I am dealing with SQL Server 2005, 2008, 2008/R2 supporting bits of features from each version. Thank heavens all our clients have upgraded already from version 2000. My world though is still 80% SQL Server 2005, and being still a very formidable version, I don’t think I will be able to part ways with this build anytime soon. Case in point, clients cannot just easily upgrade to higher versions every time one comes out.

As a result, I am mired in managing our SQL Server development assets using various tools. Anything that can help tract and manage everything. Ever since, I had always wished for a tool that will address most of my SQL Server development concerns if not all. All these years, I have heavily relied on SSMS and supported by various other tools. When Database Projects arrive with Visual Studio, it got me interested but it tanks out every time I load it up with what I have. The last one was with Visual Studio 2010. With the sheer number of objects that we have, capacity and performance was always the issue and each time, I had to resort back to SSMS without exploring further those tools.

When Juneau, now officially called SQL Server Developer Tools, or SSDT, was announced as one of the touted tool in the next version of SQL Server  codenamed “Denali”, I was pretty excited about the possibilities it will bring to my situation. After getting my hands on it with Denali CTP3, I say it really got me excited. Though I haven’t tried steroids, based on the effects that I have seen, SSDT (sounds like drug testing haha) gets my nod despite some issues that I have encountered. I am aware that what I got is preview build and for sure Microsoft will eventually iron out those. The benefits far outweigh the quirks that I have seen. It would be worth to upgrade even with SSDT alone.

So what is in SQL Server Developer Tools?

I barely scraped the surface yet but here is what I have found that will a big impact with my situation:

  • It is a unified environment/tool for SQL Server, Business Intelligence, SSIS. Everything I need is provided for within the SSDT shell. I dont have to switch back and forth between Visual Studio, SSMS, and BIDS. Creation of new reports and SQL Server objects such as tables, views, stored procedures, file groups, full text catalogs, indexes … all can be done within SSDT. Existing ones can be added too. When I had the chance to explore MySQL Workbench, I find it cool to have objects that I create scripted first before it is created physically. This capability is now available in SSDT.
  • “Intellisense support for TSQL” for SQL Server 2005. With SSDT, it has not actually upgraded SQL Server 2005 then gave it TSQL Intellisense. TSQL Intellisense is outright available within SSDT regardless of your targetted SQL Server platform. With SSDT, you model your database schema/objects without the tool actually being connected to a targetted version 2005 live database.  This way, you work with your TSQL with Intellisense provided by SSDT, and you can publish your project to an SQL Server 2005 database by choosing 2005 as your target platform. Build/error checks are done prior to the publishing of the database project.
  • SQL Server Platform Targetting. SSDT supports versions 2005, 2008, 2008 R2, Denali and SQL Azure. You do your thing and just select your target platform when you are done. During the build, you will be prompted for platform violations.
  • Refactoring (Rename, wildcard expansion, Fully qualified naming). While in TSQL writing mode, you can rename a field and SSDT does the rest to all affected objects (tables, stored procedures, etc). Neat. No more 3rd party stuff to achieve this. 🙂
  • Supports Incremental Project Publishing. You can keep a development version of your database and upgrade your live version incrementally either via script or while connected to the database. It can even prompt you when your changes/upgrades causes a data loss. Very cool.
  • SSDT detects changes done outside of SSDT. It prompts you during publishing of a database project when it detects objects created outside of the tool. This is necessary so you know if somebody is messing up with your databases. I do wish though that SSDT would have the capability to reverse engineer these detected objects when it is determined to be legit as sometimes, you get to tweak directly the database using other tools like SSMS.
  • Importing of Existing Database Schema (from 2005 to Denali) into a database development project. One need not start from an empty project, existing databases can be dealt with.
  • Scalability. I am glad SSDT successfully imported the more than 20,000 objects in our current database. The previous attempts I had using Visual Studios’ Database Project (both 2008 and 2010) just hanged midstream after hours of importing all the objects. Took SSDT more than 20 minutes to import our database using a lowly test Denali test machine.

This is all for now folks! Regardless of some unwanted quirks I have encountered when using SSDT, I can still say, it is worth looking at. I am just waiting for some decent Denali machine to use so I get to load up my projects in SSDT and take the next level from there.

**************************************************
Toto Gamboa is a consultant specializing on databases, Microsoft SQL Server and software development operating in the Philippines. He is currently a member and one of the leaders of Philippine SQL Server Users Group (PHISSUG), a Professional Association for SQL Server (PASS) chapter and is one of Microsoft’s MVP for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com

My First Shot At Sharding Databases

Sometime in 2004, I was faced with a question whether to have our library system’s design, which I started way back in 1992 using Clipper on Novell Netware and later was ported to ASP/MSSQL, be torn apart and come up with a more advanced, scalable and flexible design. The usual problem I would encounter most often is that sometimes in an academic organization, there could be varying structures. In public schools, they have regional structures where libraries are shared by various schools in the region. In some organizations, a school have sister schools with several campuses each with one or more libraries in it but managed by only one entity. In one setup, a single campus can have several schools in it, with each having one or more libraries. These variations pose a lot of challenge in terms of programming and deployment. A possible design nightmare. Each school or library would often emphasize their independence and uniqueness against other schools and libraries, for example wanting to have their own library policies and control over their collections and users and customers and yet have that desire to share their resources to others and interoperate with one another. Even within inside a campus, one library can even operate on a different time schedule from the other library just a hallway apart. That presented a lot of challenge in terms of having a sound database design.

The old design from 1992 to 2004 was a single database with most tables have references to common tables called “libraries” and “organizations”. That was an attempt to partition content by libraries or organization (a school). Scaling though wasn’t a concern that time as even the largest library in the country won’t probably consume a few gigs of harddisk space. The challenge came as every query inside the system has to filter everything by library or school. As features of our library system grew in numbers and became more advanced and complex, it is apparent that the old design, though very simple when contained in a single database, would soon burst into a problem. Coincidentally though, I have been contemplating to advance the product in terms of feature set. Flexibility was my number one motivator, second was the possibility of doing it all over the web.

Then came the ultimate question, should I retain the design and improve on it, or should I be more daring and ambitious. I scoured over the Internet for guidance of a sound design and after a thorough assessment of current and possibly future challenges that would include scaling, I ended up with a decision to instead break things apart and abandon the single database mindset. The old design went into my garbage bin. Consequently, that was the beginning of my love of sharding databases to address issues of library organization, manageability and control and to some extent, scalability.

The immediate question was how I am gonna do the sharding. Picking up the old schema from the garbage bin, it was pretty obvious that breaking them apart by libraries is the most logical. I haven’t heard the concept of a “tenant” then, but I dont have to as the logic behind in choosing it is as ancient as it can be. There were other potential candidate for keys to shard the database like “schools” or “organization”, but the most logical is the “library”. It is the only entity that can stand drubbing and scrubbing. I went on to design our library system with each database containing only one tenant, the library. As of this writing, our library system have various configurations: one school have several libraries inside their campus, another have several campuses scattered all over metro manila with some campus having one or more libraries but everything sits on a VPN accessing a single server.

Our design though is yet to become fully sharded at all levels as another system acts as a common control for all the databases. This violates the concept of a truly sharded design where there should be no shared entity among shards. Few tweaks here and there though would fully comply with the concept. Our current design though is 100% sharded at the library level.

So Why Sharding?

The advent of computing in the cloud present to us new opportunities, especially with ISVs. With things like Azure, we will be forced to rethink our design patterns. The most challenging perhaps is on how to design not only to address concerns of scalability, but to make our applications tenant-aware and tenant-ready. This challenge is not only present in the cloud, but a lot of on-premise applications can be designed this way. This could help in everyone’s eventual transition to the cloud. But cloud or not, we could benefit a lot on sharding. In our case, we can pretty much support any configuration out there. We also got to mimic the real world operation of libraries. And it eases up on a lot of things like security and control.

Developing Sharded Applications

Aside from databases, applications need to be fully aware that it is not anymore accessing a single database where it can easily query everything with ease without minding other data exists somewhere. Could be on a different database, sitting on another server. Though the application will be a bit more complex in terms of design, often, it is easier to comprehend and develop if you have an app instance mind only a single tenant as oppose to an app instance trying to filter out other tenants just to get the information set of just one tenant.

Our Library System

Currently our library system runs on sharded mode both on premise and on cloud-like hosted environments. You might want to try its online search:

Looking at SQL Azure

Sharding isn’t automatic to any Microsoft SQL Server platform including SQL Azure. One needs to do it by hand and from ground up. This might change in the future though. I am quite sure Microsoft will see this compelling feature. SQL Azure is the only Azure based product that currently does not have natural/inherent support for scaling out.  If I am Microsoft, they should offer a full SQL Server service like shared windows hosting sites do along side SQL Azure so it eases up adoption. Our systems database design readiness (being currently sharded) would allow us to easily embrace the new service. But I understand, it would affect, possibly dampen their SQL Azure efforts if they do it. But I would try to reconsider it than offering a very anemic product.

As of now, though we may have wanted to take our library system to Azure with few minor tweaks, we just can’t in this version of SQL Azure for various reasons as stated below:

  • SQL Server Full Text Search. SQL Azure does not support this in its current version.
  • Database Backup and Restore. SQL Azure does not support this in its current version.
  • Reporting Services. SQL Azure does not support this in its current version.
  • $109.95 a month on Azure/SQL Azure versus $10 a month shared host with a full-featured IIS7/SQL Server 2008 DB

My Design Paid Off

Now I am quite happy that the potentials of a multi-tenant, sharded database design, though is as ancient, it is beginning to get attention with the advent of cloud computing. My 2004 database design is definitely laying the groundwork for cloud computing adoption. Meanwhile, I have to look for solutions to address what’s lacking in SQL Azure. There could be some easy work around.

I’ll find time on the technical aspects of sharding databases in my future blogs. I am also thinking that PHISSUG should have one of this sharding tech-sessions.

**************************************************
Toto Gamboa is a consultant specializing on databases, Microsoft SQL Server and software development operating in the Philippines. He is currently a member and one of the leaders of Philippine SQL Server Users Group, a Professional Association for SQL Server (PASS) chapter and is one of Microsoft’s MVP for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com

Speed Matters : Subquery vs Table Variable vs Temporary Table

This has been perhaps written quite a number of times by a lot of TSQL gurus out there but earlier I was presented with a TSQL code that prompted me to recheck my notions on subqueries, table variables and temporary tables. The TSQL code presented to me is something built dynamically that depends heavily on user inputs. The issue presented was performance. The code used to run acceptably fine until earlier when things are annoyingly slow.

Upon seeing the code, I said to myself, problem solved. Instantly, I thought I was able to spot what is causing the performance issue. The code was written using TSQL’s temporary tables. Based from experience, I would automatically shoot down any attempt to use temporary tables when executing a TSQL batch. And though Microsoft has introduced table variables as a better alternative to temporary tables, better in terms of technicality and programmability, delving with the two options would always be my last resort. In most cases, I’d rather tackle difficult scenarios requiring handling of temporal data using subqueries. In fact, I immediately ruled a rewrite of the code to minimize the use of temporary tables, and at least replace it with table variables which were supposed to be a lot better. But upon further investigation, I ruled a rewrite because of some other things not related to the topics at hand here. On the other hand, am trying to figure out how things can be converted using subqueries which I would primarily prefer.

In my investigation though, something interesting came up and I thought I’d better time the executions of my experiment while I make the templates of how things are done using subqueries. The results now intrigues me a bit, and probably would take a second look at the code I was shown earlier. I might be looking at the wrong place. Nevertheless, the results of my experiment might interest others out there. So here it is:

The Experiment 

I always love to use one of our company’s product, the library system, to test my experiments with anything to do with TSQL as its database is as diverse in design and available test data is in great abundance. I picked one small database sample for my experiment and this is how it went:

  • Test 1 is intended to check how different approaches fare on small sets of data. I have to query all library materials that were written in ‘Filipino’ (which would result into 974 rows out of the 895701 rows in a single table). Based on the result, I have to search for a string pattern “bala” in all of the materials’ titles using the LIKE operator which would eventually result to 7 rows.
  • Test 2 is basically the same query against the same environment with different query conditions to see how the three approaches fare on large sets of data. So I queried all library materials that were written in ‘English’ (which would result into 43896 rows out of the 895701 rows). Based on the result, I have to search for a string pattern “life” in all of the materials’ titles using the LIKE operator which would eventually result to 440 rows.

I wrote three queries that each would use a specific approach to query my desired output in this experiment.

First query using SUB QUERIES:

Second query uses TABLE VARIABLES:

The third and last query uses TEMPORARY TABLES:

I run all three queries (3x each) after a service restart to clean-up tempdb and I got the following results. Figures are presented in the number of seconds:

  Total Rows Temporal Data Count Final Result Sub Queries Table Variables Temporary Tables
Test 1 895701 rows 974 rows 7 rows 2 secs 4 secs 2 secs
Test 2 895701 rows 43896 rows 440 rows 3 secs 23 secs 4 secs

 

The results pretty much proved interesting. I thought, based on the queries I wrote that temporary tables would be slightly slower of the three approaches. The execution plan for table variables and temporary tables are pretty much the same. I was a bit surprised that the execution time of the table variable is almost twice that of a temporary table on small sets of data but temporary tables trumps table variables by a large margin in large set of data. I was expecting a slight difference only as, though both are a bit different in terms of scoping and technicality, and to some extent purpose, both uses TEMPDB as their memory (contrary to the notion that table variables use in-memory) when handling data. What is notable though is that, disk I/O is costlier with Table Variables. I am not sure at this point what causes this cost. Will try to dig deeper when I have more time.

Nevertheless, this particular experiment does not provide a general conclusion that one of the three approaches is better than among the three. I still subscribe to the general notions:

  • That sub queries are neat and would perform better in most if not all cases
  • That temporary tables would be slower on environments with heavy concurrency as sql server would handle more locks especially when using it within a transaction. I have not tested this scenario here though.
  • That temporary tables may be a faster solution if constraints/indexes are needed in between queries, especially when a large set of data is expected, statistics on table variables columns arent created and it can’t have indexes. But surprisingly, this experiment shows Table Variables performed poorly among the three with handling large data set as being its slowest.
  • That table variables are a good option technically/programmatically

You just be the judge. It would be prudent to conduct tests. I can’t say my experiment here is conclusive at best, but from hereon, I would avoid using Table Variables whenever I can. For very simple queries like in the tests, it performed poorly. I can’t imagine how things will be with larger datasets and more complex queries.

I am on to my next experiment!

**************************************************
Toto Gamboa is a consultant specializing on databases, Microsoft SQL Server and software development operating in the Philippines. He is currently a member and one of the leaders of Philippine SQL Server Users Group, a Professional Association for SQL Server (PASS) chapter and is one of Microsoft’s MVP for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com

Why Am I Still Stuck with T-SQL?

Not a day pass without something new coming out from software companies like Microsoft. And it has been a challenge to keep up for application developers like me. I happen to start my all-Microsoft stack effort during the heydays of Visual Basic 3 and Access. Prior to that, I was a Borland kind of kid mesmerized at how neat my code was when printed on reams of continuous paper.

I was really fast then in absorbing new software technologies for application development and related products but I am no longer a kid I used to be. I am now a slow slumbering oldie when it comes to absorbing software development technologies. I actually envy those guys now who are doing and using technologies that just came out of the baking oven. I feel they are so smart to figure out new things so fast. Isn’t that great?

And every time I get to mingle with these developers, they often wonder why I am still stuck with Transact-SQL (T-SQL). Every time .. and it never fails. It now makes me wonder why I am stuck with T-SQL when there are a lot of new alternatives. This makes me beg to question if I am still relevant with the times. Well, let’s see.

Am I in? Or Out?

My first serious brush with T-SQL was when I was contracted to develop a school library system. I thought of using VB3 + Access but being a fast kid then, I opted to choose what was the latest and ended up using VB4 and SQL Server 6.5. I was mostly doing VB code and using DAO/ADO to connect to my databases and still doing client-side cursors when going through my tables here and there. I had trouble adapting to sql’s set-based processing and mindset with my Clipper-heavy and Access background. In no time, I was able to absorb T-SQL and began moving some of my data manipulation code to stored procedures.

When VB5 came out I decided to upgrade the school library application with an improved database structure with all data manipulation in stored procedures. This time, no more non-TSQL code for my data. I was able to re-use some TSQL code taken from the app’s previous version.

VB6 came out and Microsoft touted a better data access component in RDO. Around that time, I was able to get more libraries to use my system so I virtually kept up with anything new from Microsoft. I upgraded the front-end portion of my library system while I was able to re-use all my stored procedures.

Shortly after, the Web’s irresistible force dawned on me and I took up ASP and VB Script and migrated a portion of my library application to face the web. During this time, I also upgraded to SQL Server 7.0. I had some inline SQL codes which were prone to SQL Injection but I was able to retain all my stored procedures.

When .NET came out, I had my library system upgraded to an entirely new schema, platform and language (ASP.NET, Windows Forms, ADO.NET, SQL Server 2000/2005). This time, I hired somebody else to code it for me.

Never Obsolete

With all the changes I made to the library system, the only technology that remained constant was T-SQL. In most cases, I was able to re-use code. In all cases, I was able to take advantage of the benefits of re-using my T-SQL experience .. all these while I managed to bulk up my knowledge on T-SQL.

VB4 is gone; my T-SQL is still here. VB5 is gone; my T-SQL is still here. VB6 is gone; my T-SQL is still here. ASP is gone; still my T-SQL is here. DAO, ADO, and RDO are gone but my T-SQL remained. I moved from VB to C#, yet I am still using T-SQL.

Today we have ADO.NET, we have LINQ. Soon they will be gone (I have heard LINQ-To-SQL is now deprecated). And tomorrow I can probably move from ASP.NET to Silverlight or some else new, or from ASP.NET to PHP, but I have the feeling I still be using T-SQL. Microsoft is even realizing in going back to something as basic as tables and queues with Azure but it can’t ignore T-SQL, thus we have SQL Azure.

Betting My Future

I am inclined to think that my investments on T-SQL have already been paid back immensely and I am still reaping its benefits. With the advent of the relatively new cloud computing, and having various players offering various cloud computing technologies and services, I can’t help the urge in identifying which part of the cloud computing technology stack will survive against the onslaught of constant change and would manage to stay relatively stable. I am afraid some of our current investments to other technologies wont be as useful in the cloud but I am betting my future once again with SQL.

What about you?

**************************************************
Toto Gamboa is a consultant specializing on databases, Microsoft SQL Server and software development operating in the Philippines. He is currently a member and one of the leaders of Philippine SQL Server Users Group, a Professional Association for SQL Server (PASS) chapter and is one of Microsoft’s MVP for SQL Server in the Philippines. You may reach him by sending an email to totogamboa@gmail.com