All you need to know about adding a new EBS Volume

So for this month's blog we are going to delve into adding a new EBS Volume to an EC2 instance. Although this is a reasonably simple operation, we will drill down into some of the options you have and how they can impact performance long-term.

We will look at the following:

  • GPT v MBR
  • IO Block Size is not the same as NTFS Allocation Size
  • Should you format your drive to 64KB
  • Adding a Volume

I will use the terms volume/drive/disk interchangeably throughout this blog.

Ok, let's get into this and first look at GPT v MBR. GPT (GUID Partition Table) and MBR (Master Boot Record) are two partitioning schemes used for hard drives. We'll keep this at a fairly high level and summarise that GPT is probably the way to go when you are adding an EBS Volume. MBR has a size limit of 2TB, whereas GPT does not, and GPT also gives you some extra protection against corruption as it writes its info into several drive areas and includes a secondary backup table.

One of the biggest myths in the data world is the misconception that I/O block size and allocation unit size are the same thing. This is just not true. You may often have read that SQL Server does I/O operations in 64KB extents (8 x 8KB pages), and this is where the recommendation for an allocation unit size of 64KB comes from. In reality, SQL Server I/O can be of varying sizes depending on what operation is being performed.

Let's take transaction log writes as an example. These need to be hardened to disk every time a commit is issued, so these writes might only be 512 or 4096 bytes in size, depending on the underlying storage (transaction log writes are sector-aligned in size, more on this later). Even with the use of the log buffer, where log records are kept in cache and flushed when it becomes full (if not committed), log flushes can typically be up to 60KB in size, and the WAL protocol may also cause the log buffer to flush. So in short, transaction log writes can be anything from 512 bytes to 60KB in size.
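
If you want to see what this looks like on your own instance, a rough sketch like the one below (not from the original post, so adjust to taste) works out the average write size per file for the current database since the last restart; the log file typically comes out well under 64KB:

--Average write size per file for the current database (figures are cumulative since the last restart)
SELECT  mf.name,
        mf.type_desc,                                                -- ROWS or LOG
        vfs.num_of_writes,
        vfs.num_of_bytes_written / NULLIF(vfs.num_of_writes, 0) AS avg_write_size_bytes
FROM    sys.dm_io_virtual_file_stats(DB_ID(), NULL) AS vfs
JOIN    sys.master_files AS mf
        ON  mf.database_id = vfs.database_id
        AND mf.file_id = vfs.file_id;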

Other examples of SQL Server I/O operations that would not conform to a size of 64KB include:

  • Bulk Loads
  • Backup and Restore Operations (the I/O size will depend on where you are backing up to)
  • Checkpoint
  • ColumnStore
  • File Initialization

Ok then, is it worth formatting a drive that is to be used by SQL Server to 64KB, and what exactly is this doing for us?

So a quick recap: disks are made up of a series of tracks. These are too large to be a single unit of storage, so they are broken down further into what are called sectors. Sectors are the smallest unit of storage on a drive, and they can vary in size from 512 bytes to 4KB, depending on the disk. Remember we said transaction log I/Os are aligned to the sector size of the disk.

A Cluster or Allocation Unit, is a group of sectors, typically between 4 to 8.

So let's run over that again. A sector is the smallest unit of storage that can be written, and an allocation unit (cluster) is the smallest unit of storage a file can take up.

Windows drives on your laptop are typically formatted with a 4KB allocation unit. This makes sense as you will likely have many small files, and you don't want loads of wasted space. However, on a drive that is storing database files, which are generally very large, it makes sense to have a larger allocation unit size; for SQL Server that is recommended to be 64KB.

So how does having a bigger Allocation Unit Size help then?

The bigger the AU, the less metadata you need to store. Remember, to read a file you must first read in all of its metadata; this gives you the information as to where to find the physical location of its clusters on disk. If you have large files and a small AU, then you will have a lot more metadata that needs to be read in, thereby increasing the amount of I/O that is required.

So really, once a file has grown, the work of the allocation unit is done; during a physical read it is the I/O block size that determines the impact on I/O.

A useful command for examining the details of your drive, such as the allocation unit (cluster) size and the sector size, is:

fsutil fsinfo ntfsinfo <drive>:

Right, that’s all great to know, now how do we actually attach an EBS volume to an EC2 instance?

  • EC2 Console – Volumes – Create Volume – Select all your desired settings
  • Volume State = ‘Available’ – this means it has not yet been attached to an EC2 instance
  • Actions – Attach Volume – Select EC2 instance you want to attach to
  • Volume State = ‘In-use’, happy days, but we are not done yet, we need to make the new drive available to the Windows System.
  • RDP to EC2 – Disk Management – Right Click on Disk and bring online – Right Click and Initialize Disk
  • MBR or GPT – OK – Disk will now be online

So what's left? We must specify the size of the volume and whether we want to format it.

  • Right Click – New Simple Volume – Specify disk size – Assign drive letter
  • Next – Allocation Unit Size – Change from default (4kb) and select 64kb for drive hosting data files – this may take a few mins and then you’ll be done 🙂

I hope this month’s blog post gives you a little better understanding of how the underlying storage volumes work and how the type and formatting used can impact IO performance on your SQL Server Instance.

As always, if you liked this blog please follow me on Twitter to be kept up-to-date with my latest blog releases and more fun stuff.

Until next time, Slán!

Disclaimer: Never just believe everything you read in a post, test this out for yourself, we all can make mistakes! Happy coding.

AWS DMS Hidden Gotchas Part I

So every day at work we use AWS DMS to replicate data from our core databases, which are SQL Server instances installed on EC2 machines, to various data stores in the AWS cloud. And don't get me wrong, this is a great service that is super fast and also fairly durable, but there are a few hidden gotchas that can crop up depending on how you initially set up the first DMS task to run on your source server.

When creating a DMS task to pull from a source database on MS SQL Server, if the user that DMS uses to connect to SQL Server is a sysadmin, then DMS will automatically handle all the MS-Replication setup for you. (One caveat being that Replication must be enabled on the instance.)

Wow, this is great: it takes care of setting up the publications, the artifacts to be copied and the Log Reader Agent job, so what's the problem? Well, they're not problems so much as things the user should be aware of.

  1. First up, the user needs to be a sysadmin. Now I know if you do this properly then that user should only be used for DMS replication and its password should be in the secret store, but it is still a sysadmin, and it doesn't need to be: if you are prepared to set up the replication yourself, rather than asking DMS to do it, the user does not need this elevated permission. Remember we should always go by the rule of 'minimum privileges to do your job', and we are not following that rule in this case.
  2. Because DMS has done the replication setup for you, it's easy to gloss over what is actually going on under the hood. For example, DMS tasks give you the option to filter out data, but did you know that this filter is not applied on the source database but on the replication instance? So if you had a table called items and you wanted to exclude items that were made in 'Russia', you could add a filter on the country column, but items made in 'Russia' would still be replicated to the replication instance; they just don't make it to the target. Think of all the extra I/O that is happening, not to mention the extra cost if you are sending data to a different AZ. If you had instead added a filter on the artifact itself in the source database this would not happen.
  3. Again, similar to the above, all columns on a table are replicated to the replication instance. So going back to our example of the 'items' table, if this table was 70 columns wide but you only needed 10 of those columns, it doesn't matter: DMS has set up the replication to replicate all the columns, so all 70 are replicated to the DMS replication instance, and it is only at this point that the columns are filtered out before the data is sent to the target.
  4. Stopping/deleting a DMS task does not stop the replication on the source database. The Log Reader Agent will continue to read the transaction log and, for all artifacts marked for replication, send the changes to the distribution database, where they will live for a while before being deleted. Think of the extra CPU being used here when it's no longer needed, not to mention the extra I/O and memory. It's very easy to forget about the underlying replication when you allow DMS to do the setup (see the query sketch after this list for a quick way to check what is still in place).
  5. My final point is more of a 'beware of this potentially happening' one. We replicate a table at work that has a number of columns, one of them being of type 'text' (yes I know, this is bad!), and we have the DMS task set up to do limited LOB of 35KB, all good so far. We also monitor blocking on that table, and sometimes we receive blocking messages that our AWS DMS user is attempting to query this text column for a particular row in the table but is being blocked by a session that is updating that row. But why would it be reading from the table? DMS uses transactional replication, which reads from the transaction log. I hold my hands up and say I cannot reproduce this in our test environment, but we do see it a few times a month, so does DMS need to query the table for text columns on certain occasions?

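To see for yourself what DMS has left behind (point 4 above), a quick check like the one below, run in the published source database, lists any publications and articles that are still in place after a task has been stopped or deleted. These are the standard replication system tables rather than anything DMS-specific, so treat it as a rough sketch:

--Is the database still published?
SELECT name, is_published FROM sys.databases WHERE name = DB_NAME();

--What publications and articles have been created/left behind? (run in the published database)
SELECT name, status FROM dbo.syspublications;
SELECT name, dest_table FROM dbo.sysarticles;
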
Again let me say, these are not bugs or flaws in AWS DMS; it is correct to not apply filtering at the source database and to replicate all columns. It's just that a lot of the time the user is actually unaware of what is going on in the back-end.

As always, if you liked this blog please follow me on Twitter to be kept up-to-date with my latest blog releases and more fun stuff.

Until next time, Slán!

Disclaimer: Never just believe everything you read in a post, test this out for yourself, we all can make mistakes! Happy coding.

Modifying an Existing Capture Instance in MS SQL Server CDC

So let's say that you want to add/drop a column from an existing capture instance, but you don't want to lose any data that is currently in the capture instance and you also cannot afford any downtime. How do you achieve this? Read on to find out.

So Microsoft have provided you with the ability to have two capture instances on a table at any one time, I'd imagine to allow for this very situation.

For the purposes of illustration, let's say I have a table called dbo.Country and I now want the column countryid to be included in the already existing capture instance.

Step 1 – Create a 2nd version of the capture instance to now include the column countryid

----Step 1 - Create a 2nd version of the capture instance, we now want the capture instance to include countryid
--Enabling CDC on a table with net changes turned on
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name = N'country',
    @role_name = NULL,
    @filegroup_name = NULL,
    @supports_net_changes = 1,
    @captured_column_list = N'countryname,countryid',
    @capture_instance = N'dbo_country_2';

Step 2 – Now copy all of the data from the original capture instance (cdc.dbo_country_ct) into the newly created capture instance (cdc.dbo_country_2_ct).

INSERT INTO cdc.dbo_country_2_CT
    (__$start_lsn, __$end_lsn, __$seqval, __$operation, __$update_mask, countryname)
SELECT
    __$start_lsn,
    __$end_lsn,
    __$seqval,
    __$operation,
    __$update_mask,
    countryname
FROM cdc.dbo_country_ct;

Step 3 – Make a note of the start_lsn for the dbo_Country capture instance; we will need this later on when we reset the lsn that is stored in cdc.change_tables.

select * from [cdc].[change_tables] where capture_instance = 'dbo_Country'
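
If you are running Steps 3 to 7 in a single session, a minimal sketch like this captures the value into a variable so it can be reused in Step 7 (otherwise just note the hex value down):

DECLARE @startLsn binary(10);

SELECT @startLsn = start_lsn
FROM cdc.change_tables
WHERE capture_instance = 'dbo_country';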

Step 4 – So you now have everything you need from the original capture instance (cdc.dbo_country_ct). It can now be dropped, and recreating it will mean the column countryid is included in it.

--Dropping the original capture instance
EXEC sys.sp_cdc_disable_table
    @source_schema = N'dbo',
    @source_name = N'country',
    @capture_instance = N'dbo_country';

--Creating a new capture instance where countryid will also be captured
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name = N'Country',
    @role_name = NULL,
    @filegroup_name = NULL,
    @supports_net_changes = 1,
    @captured_column_list = N'countryid,countryname',
    @capture_instance = N'dbo_country';

Step 5 – Copy all the data from capture instance 2 (cdc.dbo_country_2_ct) into the newly created version of the original capture instance (cdc.dbo_country_ct).

INSERT INTO cdc.dbo_country_ct
    (__$start_lsn, __$end_lsn, __$seqval, __$operation, __$update_mask, countryId, countryname)
SELECT
    __$start_lsn,
    __$end_lsn,
    __$seqval,
    __$operation,
    __$update_mask,
    countryId,
    countryname
FROM cdc.dbo_country_2_ct;

Step 6 – Now that both capture instances have the same data we can drop our version 2 of the capture instance, as we no longer need it.

EXEC sys.sp_cdc_disable_table
    @source_schema = N'dbo',
    @source_name = N'country',
    @capture_instance = N'dbo_country_2';

Step 7 – Ok, so now we are almost done. We are back to having just one capture instance on our dbo.country table, with the same name as the original capture instance, but it now contains the additional column countryid, and we achieved all that without losing any data. So just one final thing to do, and that is resetting the start_lsn for this table.

Why do we need to do this? By creating a new capture instance, the start_lsn in cdc.change_tables for the table dbo.country gets set to the latest lsn in the database. That is fine for any changes going forward, but remember we have copied older changes into our new capture instance, and if we want to be able to access them then we need to reset the start_lsn to a point further in the past. We can set it back to its original value: remember the value we saved before we dropped the original instance (Step 3).

--This query does not return any data
DECLARE @beginTime AS date = CAST(GETDATE() - 1 AS date);
DECLARE @endTime AS date = CAST(GETDATE() AS date);
DECLARE @from_lsn binary(10), @to_lsn binary(10), @lowest_lsn binary(10), @max_lsn binary(10);

SET @from_lsn = sys.fn_cdc_map_time_to_lsn('smallest greater than or equal', @beginTime);
SET @to_lsn = sys.fn_cdc_map_time_to_lsn('largest less than or equal', @endTime);
SET @lowest_lsn = sys.fn_cdc_get_min_lsn('dbo_country');
SET @max_lsn = sys.fn_cdc_get_max_lsn();

IF (@from_lsn IS NULL) OR (@lowest_lsn > @from_lsn)
BEGIN
    SET @from_lsn = @lowest_lsn;
END

IF (@to_lsn IS NULL) OR (@max_lsn < @to_lsn)
BEGIN
    SET @to_lsn = @max_lsn;
END

SELECT * FROM cdc.fn_cdc_get_net_changes_dbo_country(@from_lsn, @to_lsn, 'all');

The reason that no data is being returned is that the function sys.fn_cdc_get_min_lsn('dbo_country') returns the min lsn by reading the start_lsn from the cdc.change_tables table for the capture instance 'dbo_country', and we know that the start_lsn was reset when we re-created the capture instance.

So to overcome this we will run the following code.

--@startLsn is the value that you saved in Step 3, before dropping the original
--capture instance (see the sketch under Step 3 for capturing it into a variable).
UPDATE [cdc].[change_tables]
SET [start_lsn] = @startLsn
WHERE [capture_instance] = 'dbo_country';

Ok, so you're probably wondering why go to all this bother, right? Why don't we just use the 2nd version of the capture instance (cdc.dbo_country_2_ct) and away we go? The problem is, when you create a capture instance, the functions that you are recommended to use for querying that capture instance are created with the capture instance name in them. So for example, if you had some other application that was looking for the function cdc.fn_cdc_get_all_changes_dbo_country, this would no longer exist, as the function name would now incorporate that of the new capture instance (dbo_country_2), and your application breaks. Hence why, IMHO, it's best to keep the capture instance name the same, so the function names remain the same, and any applications that might be querying the functions (good practice) or querying the capture instance directly (bad practice) will not break.

As always, if you liked this blog please follow me on Twitter to be kept up-to-date with my latest blog releases and more fun stuff.

Until next time, Slán!

Disclaimer: Never just believe everything you read in a post, test this out for yourself, we all can make mistakes! Happy coding.

AWS DMS Replication Recovery Checkpoint

The Data Migration Service, or DMS for short, is one of AWS's most useful services. It allows you to migrate data from one data source to another, provided at least one of those data sources is hosted in AWS.

Before we get into the main topic of this article, explaining the Recovery Checkpoint of DMS, let's have a quick recap of what DMS offers.

It offers both an initial 're-Load' and a continuous replication feature. For the 're-Load', a schema modification lock is briefly taken out on all of the tables specified in the DMS task at the beginning of the process, and then a shared lock is held while the data is read directly from each table. This allows you to reload all of the data.

For continuous replication, either MS-Replication or MS-CDC is used to allow whatever changes are happening on the source to be replicated over to the target. This is done by reading the changes from the transaction log. DMS prevents unread changes from being truncated from the log by creating a dummy transaction in the transaction log. It has been found that DMS can lose track of this transaction and it can just run and run, preventing the transaction log from being truncated and also preventing any changes from being replicated over; the best thing to do in this case is to stop and restart the DMS task. If a table has a Primary Key, MS-Replication will handle the changes; for tables without a PK, CDC needs to be enabled on those tables.

Ok so, let's get back to explaining the recovery checkpoint.

When you stop a 'Replication Only' DMS task (more on the 'Replication Only' bit later), you can either 'Resume' that DMS task or 'Restart' it, so what's the difference?

‘Resuming’ the DMS task will just continue the DMS replication from this point on, so any changes which were made to the database while the DMS task was stopped do not get replicated over, but any changes from this point onwards will.

'Restarting' a DMS task allows you to continue the replication from a specific point, which can be any of the following:

  • Specific point in time
  • A log sequence number
  • A recovery checkpoint

Let's explore each option.

The 'Specific point in time' option allows you to specify, down to the second, the time in the past from which you wish this DMS task to resume. AWS DMS is able to convert this point in time into a native start point, such as an LSN for SQL Server. See the official docs here.

The 'Specified LSN' option allows you to enter an LSN in the case of SQL Server, or an SCN in the case of Oracle; this is a specific log record that was written out to the transaction log.

The 'Recovery Checkpoint' is the last checkpoint that was recorded by the DMS task. This can be obtained from the public.awsdms_txn_state table in the target database, provided the 'TaskRecoveryTableEnabled' setting is enabled in the DMS task (which causes that table to be created), or else the checkpoint can be obtained from the DMS task UI.
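
If the recovery table has been enabled, a simple query on the target shows the last recorded checkpoint (a sketch only; the table lives in the public schema as named above):

--Last checkpoint recorded by the DMS task (only exists if TaskRecoveryTableEnabled is set)
SELECT * FROM public.awsdms_txn_state;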

This ability to 'Restart a DMS task from a specific point in time' will only work for 'Replication Only' DMS tasks; if you restart a DMS task that 'Loads and Replicates', it will attempt to do a full re-load.

Ok, so that's super interesting, but how is the DMS task able to do this?

So as we mentioned at the beginning of the article, DMS continuous replication reads from the transaction log. Taking SQL Server as the source, what it is essentially doing is calling an API that reads from the function fn_dblog, which returns the changes currently in the transaction log. Ok great, but one small problem: don't most databases have regular transaction log backups taken, say every 15 minutes, which truncate the transaction log? Yes they do. So if DMS can't find what it is looking for by calling fn_dblog, it can also call the function fn_dump_dblog(), which allows DMS to read from the transaction log backups, provided those backups still exist and have not been moved to some other location. If they have been moved, then you have a problem!
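
Purely for illustration (this is not something DMS needs you to run, and fn_dblog is an undocumented function, so treat this as a curiosity only), you can peek at the active portion of the log yourself:

--A peek at the current contents of the transaction log
SELECT TOP (10) [Current LSN], Operation, AllocUnitName
FROM fn_dblog(NULL, NULL);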

Ok, so this feature can be super useful. You can stop a DMS task for a number of hours, say, then edit that task and tell it: when I restart you, I want you to start from this point in time, or this log sequence number, or this checkpoint. This can be used to save costs where a replication instance might only need to run during daytime hours, or even for DR, where a snapshot could be taken daily of the target database; this could then be restored if needed, a replication-only DMS task created, and replication re-started from that date and time onwards. Again, the only thing to make sure of is that the transaction log backups are still accessible.

** If you like my monthly blogs, please do follow me on Twitter. @sqlrebel

Until next time Slán!

Disclaimer: Never just believe everything you read in a post, test this out for yourself, we all can make mistakes! Happy coding.

Collations and Encoding – PostgreSQL

Ok, not the most exciting topic in the world, but before you blow off reading this month's blog, know this: the collation that you have set at the database level in PostgreSQL impacts the storage engine's ability to use an index on a wildcard search. Say what? Yes, you heard me right, read on for some more detail!

So what are Collations and Encodings?

A Collation is a SQL schema object that maps a SQL name to an operating system locale. (See PostgreSQL docs). It mainly consists of two parts:

  • LC_COLLATE – Concerned with sorting
  • LC_CTYPE – Concerned with character classification (what counts as a letter, upper/lower case and so on)

An encoding is concerned with how many bytes are used to store the data.

As we may know already from our blog on SQL Server Collations, the database system uses the collation to perform ordering, for example as you may guess when the ‘order by’ clause is used. i.e. It tells us in the English language that ‘a’ comes before ‘b’.

The 'default' collation is what you will be using unless you specify otherwise, and it means you are using the default collation of the database server that you are on, which may differ depending on which region the database exists in. This is set upon creation of the PostgreSQL cluster and inherited by any databases created afterwards, but you can explicitly specify a collation when creating a database.

The C and POSIX collations both specify the programming language C behavior, which basically means:

  • The characters A-Z are treated as letters and nothing else
  • And that sort order is then based solely on byte comparison.

This would differ if, say, you used the collation for French: then the sort order would not be based on byte values but on cultural rules.

‘C’ is the North American/English friendly UNIX default.

You can see why just using ‘C’ collation would be popular but if storing text from a language other than English then it is not the appropriate collation to use.

The command below will show you what the default collation on the server is.

show lc_collate
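
And to see the per-database values, including the encoding, something along these lines against the pg_database catalog does the job:

select datname, datcollate, datctype, pg_encoding_to_char(encoding) as encoding
from pg_database;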

Ok, that's all very interesting, but let's get to the point: you said this impacted the use of indexes!

So how does a locale impact indexing?

  • An Index is an ordered data set – when PostgreSQL looks up an Index it has to make certain assumptions as to where it will find the data based on the sort order of the text. So Indexes only work because they are sorted.
  • So for example, if the locale is set to C then 'a' will be found before 'b', but what of the character á (a fada in the Irish language)? If C is the collation then PostgreSQL would not treat this as a letter and would just sort it based on its byte value, and so this could lead to your indexes becoming less useful, as the data is not sorted the way your language expects.
  • If your database is using any collation other than 'C' as the default, then any 'like' searches that are performed will not be able to use the standard B-Tree indexes!
  • So the standard B-tree indexes that you create on a table can only be used in wild card searches if the ‘C’ collation is being used.

Why is this the case?

The 'LIKE' operator needs to compare on byte position and not on character rules. If a collation other than 'C' is specified, then the sort order is determined by the cultural rules of that collation and not the byte values, and hence, if a LIKE operator is used, the storage engine will not be able to use the standard indexes in the search, as it cannot rely on the sorting of the index.

So what’s the solution then?

If you know that you will need to do pattern searching, then create an additional index, and during index creation specify an operator class (this tells the storage engine which comparison functions to use when sorting the data for storage).

The following operator classes are most useful:

  • text_pattern_ops for type text
  • varchar_pattern_ops for type varchar
  • char_pattern_ops for type char

The difference between these operator classes and the default operator class used by an index is that these compare values strictly by byte position rather than by locale-specific collation rules.

create index text_index on test_table(col varchar_pattern_ops);
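
To convince yourself that the new index can actually be used, a minimal sketch like this (the table, column and data are all made up, and the planner will only pick the index once the table has enough rows and fresh statistics) shows a left-anchored LIKE using it:

create table test_table (col varchar(100));

insert into test_table (col)
select md5(g::text) from generate_series(1, 100000) as g;

create index text_index on test_table (col varchar_pattern_ops);
analyze test_table;

-- left-anchored pattern: eligible to use text_index (a leading % still is not)
explain select * from test_table where col like 'abc12%';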

Until next time, Slán!

Disclaimer: Never just believe everything you read in a post, test this out for yourself, we all can make mistakes! Happy coding.

HOT Updates – PostgreSQL

So you might be aware that PostgreSQL uses MVCC to handle transaction isolation and this has loads of advantages such as –

  • Very few locks are required, meaning readers don’t block writers and writers don’t block readers.
  • ROLLBACK can be very fast as it does not have to undo any changes.

So what is MVCC? It basically means that each time a row is to be modified, rather than updating the existing row, a new version of the row is created with the updated data. So yes, this means that you can have multiple versions of the same row in the database at any given time. The older row versions are safe to be deleted when there is no longer any transaction that can still see them.

But back to what we are here to talk about today – HOT Updates.

So in fact, each update in PostgreSQL is essentially an insert followed some time later by a delete. This MVCC approach has two major drawbacks –

  • Tables that experience heavy update load can become bloated with old ‘Deleted’ rows that at some point need to be removed.
  • Each update causes a new row to be inserted, which could mean all of the indexes need to be updated to be made aware of the new position of the row. (Remember, in PostgreSQL tables are heaps and so the physical position of the row is stored in each index, as opposed to, say, SQL Server, where you can make a table a clustered index, in which case a logical pointer to the row is stored, so even if the physical position of the row changes the other indexes do not need to be updated as they point to the clustered index.)

So a feature called HOT updates, introduced in PostgreSQL 8.3, can help with this. Here's how.

As already mentioned, PostgreSQL tables are heaps, meaning that the physical pages are not stored in order and neither are the rows within those pages. Instead, PostgreSQL has an array of pointers at the start of each page that point to the rows on that page. Because of this it doesn't matter how the rows are ordered within each page; any process trying to access a particular row just goes straight to the pointer array at the beginning of the page to find the row it is looking for. In fact a TID (tuple identifier), which is used to identify a row in a table, consists of a page number and an offset number telling you which item in the pointer array to read in order to get the physical location of the row on the page.

HOT stands for Heap Only Tuple, meaning a tuple that is not referenced from outside the table block. When a new version of a row is added, a pointer is added to the old row version pointing to this new version, provided that both versions of the row are on the same page. So to access the heap-only tuple, PostgreSQL follows the 'hot chain' within the page.

So why is this good?

So the external address of the row stays the same. Hence, as long as the column that was changed is not indexed, there is no need to make any change to any index; instead an index scan just follows the 'hot chain' to find the version of the row that is the most current for that transaction.

There is also the added advantage that dead rows can be removed without the need for a VACUUM, which means update-heavy tables don't have to rely on VACUUM to clear out old rows. Any process that detects a 'hot chain', even a SELECT statement, will try to lock that page and reorganise it, removing the old tuples. This is made possible because that process knows these tuples are only referenced internally within that page.

Ok then, so these HOT Updates sound great, what conditions are necessary for these to happen?

  1. The columns being updated must not be referenced by an index
  2. There must be enough space within the page for the addition of the updated row (remember, this only works if both the old and new versions of the row reside on the same page), so you need to set your fill factor accordingly – see the sketch below for how to set it.

The default fill factor for tables (heaps) is 100%, and for B-tree indexes it is 90%.
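
Setting it is a one-liner. A minimal sketch (the table name is made up); note that the new fill factor only applies to pages written from then on, and existing data is only repacked if the table is rewritten (for example by VACUUM FULL):

alter table busy_table set (fillfactor = 90);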

So what should I set the fill factor on my table to then?

As with most things in I.T., the answer is 'it depends': firstly on whether the table in question experiences a lot of updates. If so, it then becomes a balancing act: set the fill factor too low and yes, you gain from HOT updates, but on the other hand your table will take up more pages, which means less data can fit into memory, which could mean less efficient queries. Set it too high and you won't be able to get the gains that can be achieved from HOT updates. Of course, it also depends on how heavily indexed that table is.

My advice would be the following:

  1. For sure change the fill factor from 100% (even start out and try 90%) for tables that are update heavy
  2. If you can, run tests in your environment to determine which fill factor % gives you the best performance.

The reloptions field in the pg_class catalog is a text array of all of the options set on each table; if you have a fill factor set it should show up here, and if not then assume the default fill factor of 100% is being used.

select reloptions,* from pg_class
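
To judge whether a change is paying off, the pg_stat_user_tables view tracks how many updates were HOT; something along these lines gives you a quick ratio per table:

select relname, n_tup_upd, n_tup_hot_upd
from pg_stat_user_tables
order by n_tup_upd desc;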

Until next time, Slán!

Disclaimer: Never just believe everything you read in a post, test this out for yourself, we all can make mistakes! Happy coding.

UTF-16 and UTF-8 Encoding – SQL Server

When storing a string in SQL Server we use char(n), varchar(n), nchar(n) or nvarchar(n). So what really is the difference?

So we might know that char and varchar store ASCII characters and use 1 byte per character, whereas nchar and nvarchar store Unicode characters and use 2 bytes per character. While this is mostly true, let's dig a little deeper.

UTF-16 is the Unicode encoding that is used for nchar, nvarchar in SQL Server. As already mentioned it uses 2 bytes to store each character. So nvarchar(20) can store up to 20 Unicode characters and will take up 40 bytes of space.

** Important **

The 'n' refers to the string length in byte-pairs, not the character length; it just so happens that for Unicode characters 0-65,535 one character can be stored per byte-pair. So for example, nvarchar(10) is saying that as long as the characters stored fall within the 0-65,535 range, you can store 10 characters here, but this will cost you 20 bytes of storage. However, if you were storing characters above that range, you might not fit 10 characters in, as they may need 2 byte-pairs per character.

Prior to SQL Server 2019, if you wanted to store Unicode data then nchar and nvarchar were your only options. However, arriving with SQL Server 2019 is support for the UTF-8 Unicode encoding on char and varchar fields. Say what? Does this mean we should never use nchar or nvarchar again then? Not so.

Ok, let's firstly look at what UTF-8 is.

So UTF-8 is a widely used Unicode character encoding (for example it's used in Postgres). It differs from UTF-16 in how many bytes it uses to store data. Characters that fall within the ASCII range (0-127) are encoded using 1 byte; above that range, and similar to UTF-16, 2 bytes are used to represent characters, but only up to a point, beyond which 3 bytes can be used to represent a character, and above that again 4 bytes. So in summary, with UTF-8, depending on the character it can take anything from 1 byte to 4 bytes of storage, so essentially you have a split of 4 ranges.

With UTF-16 there is only a split of 2 ranges, the lower range will always use 2 bytes of storage and the upper range will always use 4 bytes of storage.

Ok, looks interesting, so how do we use UTF-8 in SQL Server 2019 then?

So UTF-8 is used in the char and varchar data types, and it is enabled when you change or create a column or table collation to a collation that has a UTF8 suffix.

Changing LATIN1_GENERAL_100_CI_AS_SC to LATIN1_GENERAL_100_CI_AS_SC_UTF8.
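
In practice that might look like the sketch below (the table and column names are hypothetical); be aware that converting an existing column re-encodes the data stored in it, which can be expensive on a large table, and re-state the column's nullability explicitly:

ALTER TABLE dbo.SomeTable
ALTER COLUMN SomeText varchar(100) COLLATE LATIN1_GENERAL_100_CI_AS_SC_UTF8 NOT NULL;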

So the UTF-8 character encoding is available only to Windows collations that support supplementary characters.

The UTF-16 character encoding used by nchar and nvarchar will remain unchanged.

So let's look at some examples.

Take Latin1_General_100_CI_AI_SC_UTF8: if this collation was used on a column that was, say, char(15), the column will store data using the UTF-8 character encoding. What this means is that if ASCII characters (in the range 0-127) are stored in this column, then it will take up a max of 15 bytes of space. Above that range, 2 bytes, 3 bytes or even up to 4 bytes per character can be used to store a character.

Compare that to a column of type nvarchar(15), which will use the UTF-16 character encoding; this has two ranges, in which it can store characters using either 2 or 4 bytes, BUT the range of characters it can store using only 2 bytes is much larger than that of UTF-8.

So the moral of the story here is: before you choose whether to use UTF-8 or UTF-16 encoding for a database or column, consider the type of data that will be stored in it.

For example, if it is mostly ASCII characters that you are storing, these require 1 byte in UTF-8 and 2 bytes in UTF-16, so storing this data in a char or varchar column with a UTF8 collation, as opposed to storing it in an nchar or nvarchar column using UTF-16, could lead to a 50% reduction in storage.
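
A quick way to see that difference for yourself; the demo table below is made up, and DATALENGTH reports the bytes actually stored:

CREATE TABLE dbo.utf8_demo
(
    val_utf8  varchar(30) COLLATE Latin1_General_100_CI_AI_SC_UTF8,
    val_utf16 nvarchar(30)
);

INSERT INTO dbo.utf8_demo (val_utf8, val_utf16)
VALUES (N'hello world', N'hello world');

--ASCII-only text: expect 11 bytes for UTF-8 vs 22 bytes for UTF-16
SELECT DATALENGTH(val_utf8) AS utf8_bytes,
       DATALENGTH(val_utf16) AS utf16_bytes
FROM dbo.utf8_demo;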

Above the ASCII range, however, characters such as those in Latin-based languages mostly require 2 bytes per character in both encodings, so there is not much saving to be gained.

On the other hand, if it was mostly Asian data that you were storing, where each character would typically require 3 bytes in UTF-8, this could be stored much more efficiently using UTF-16.

So to sum up: coming with SQL Server 2019, you now have the ability to store Unicode characters in char and varchar data types, provided the correct UTF-8 collations are used. Whether you use char/varchar or nchar/nvarchar depends on the data you are storing and which character range it falls into. If you have not upgraded to 2019 yet and you need to store Unicode characters, then nchar/nvarchar is your only option.

Until next time, Slán!

Disclaimer: Never just believe everything you read in a post, test this out for yourself, we all can make mistakes! Happy coding.

CDC Anomaly with Metadata-Only change with Default Constraint – SQL Server

Recently at work we had a requirement to change an existing column on a table from NULL to NOT NULL. Now the column in question (call it ColumnA) belonged to a very large table (over 1 billion rows), and for a lot of the rows in this table that column was null. Rather than running an UPDATE statement, we came up with the idea of just putting a default constraint on ColumnA. This would be much quicker than the UPDATE statement, as from SQL Server 2012 onwards this is a metadata-only change.

A metadata what? So rather than updating all of the rows which had the value NULL in ColumnA in our VLT, with a metadata-only change the default value is stored in the metadata of that table. So no rows are updated at all and the change is instantaneous. The system view below is where the metadata is actually stored.

sys.system_internals_partition_columns
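
For the curious, a rough sketch like the one below (the table name is hypothetical, and this is an undocumented internal view so the column list may vary between versions) should surface the stored default in the has_default/default_value columns:

SELECT ipc.partition_column_id, ipc.has_default, ipc.default_value
FROM sys.system_internals_partition_columns AS ipc
JOIN sys.partitions AS p
    ON p.partition_id = ipc.partition_id
WHERE p.object_id = OBJECT_ID('dbo.MyVeryLargeTable');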

I know what you're thinking: doesn't this cause an issue when querying that table? No, there is no impact from not having the data in the actual row. It can still be included when the table is queried and it can still be indexed; it's basically the same as if every row had been updated with that value, without having to bear the cost of performing the physical update.

What happens if one of these rows gets updated? If a row that has the default value for ColumnA gets updated, the actual default value will be physically stored in ColumnA after the update (even for an update that is not updating ColumnA itself).

So that's great, all good, BUT WAIT! There's always a catch, right? 🙂

So at work we happened to have CDC on the VLT and on ColumnA. After a while we started to notice some strange behavior: if a row was deleted, this was recorded in the capture instance, but guess what? The value for ColumnA was null!!

WHAT? So it seems CDC reads the actual value in that column and does not read from the metadata, and for most rows the physically stored value for ColumnA would be null.

What can you do? It depends on your use case. For some people this may be acceptable; for us it wasn't, so we really only had one choice: we had to run that update statement after all, setting the default value for all rows where ColumnA was null. Of course this only impacted legacy rows; for any row updated or any new row added, the default value would be stored at the row level.

All in all, this doesn't seem like correct behavior to me; either metadata-only changes work for all scenarios or they don't at all, would be my (probably rather simplistic) view of how it should work.

Until next time Slán!

Disclaimer: Never just believe everything you read in a post, test this out for yourself, we all can make mistakes! Happy coding.

Trigrams – PostgreSQL

We have some Postgres functions at my work where wildcard searches are done, with the wildcard symbol % added to both the front and back of the search term. Prefixing the symbol to the front of the search term means that a B-Tree index cannot be used in the search, and so a full table/index scan needs to be performed. But what if there was a better way? Hello Trigrams 🙂

So a trigram is just a sequence of three consecutive characters. In PostgreSQL you can create a trigram GIN index, and all this does is split each word in the indexed column up into its trigrams.

Say what? So take the word dog: this would be split into the following trigrams – ("  d", " do", "dog", "og "). You'll notice the first two trigrams and the last one have spaces in them; this is because each word is prefixed with two spaces and suffixed with one.
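
You can see this for yourself with the show_trgm() function that comes with the pg_trgm extension (the extension needs to be installed first, see the steps further down):

SELECT show_trgm('dog');
-- {"  d"," do","dog","og "}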

So how does the trigram help? If you had the following query:

SELECT * FROM dbo.animal WHERE TYPE like '%dog%';

The query engine determines the trigrams of the search term that has been entered and will then do a bitmap index scan of the GIN trigram index looking for all rows that contain these trigrams. This can produce some false positives, as not all words that share trigrams are the same word; to eliminate them, the rows found via the trigram match are then re-checked against the original pattern.

See the sample code below for how to create a trigram index in PostgreSQL:

  1. Create the Trigram Extension
  2. Create a GIN Index that uses the trigram operator class
  3. Test a wildcard search using LIKE or ILIKE

-- 1. Create the Trigram Extension
CREATE EXTENSION pg_trgm;

-- 2. Create a GIN Index that uses the trigram operator class
CREATE INDEX idx_tablename_colname_gintrig ON tablename USING GIN (colname gin_trgm_ops);

-- 3. Test a wildcard search using LIKE or ILIKE
SELECT * FROM dbo.tablename WHERE colname LIKE '%somevalue%';

Disclaimer: Never just believe everything you read in a post, test this out for yourself, we all can make mistakes! Happy coding.