Geeks With Blogs
John Conwell: aka Turbo Research in the visual exploration of data
I recently profiled a sproc that makes heavy use of the TSQL SUBSTRING function (hundreds of thousands of times) to see how it performs on a SQL 2005 database compared to a SQL 2000 database.  Much to my surprise the SQL 2005 database performed worse...dramatically worse than SQL 2000.

After much researching it turns out the problem is that the column the text was stored in was an NTEXT, but SQL 2005 has deprecated the NTEXT in favor of NVARCHAR(MAX).  Now, you'd think that string functions on NTEXT would have the same performance on 2005 as it did on 2000, but thats not the case. 

Ok, so NTEXT is old badness, and NVARCHAR(MAX) is new goodness.  Then the next logical step would be to convert the column to be a NVARCHAR(MAX) data type, but here lies a little but very important gotcha.

By default NTEXT stores the text value in the LOB structure and the table structure just holds a pointer to the location in the LOB where the text lives. 

Conversely, the default setting for NVARCHAR(MAX) is to store its text value in the table structure, unless the text is over 8,000 bytes at which point it behaves like an NTEXT and stores the text value in the LOB , and stores a pointer to the text in the table.

So, just to recap, the default settings for NTEXT and NVARCHAR(MAX) are completely opposite.

Now, what do you think will happen when you execute an ALTER COLUMN on a NTEXT column that changes the data type to a NVARCHAR(MAX)?  Where do you think the data will be stored?  In the LOB structure or the table structure?

Well, lets walk through an example.  First create a table with one NTEXT column:

CREATE TABLE [dbo].[testTable](
    [testText] [ntext] NULL
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

Next, put 20 rows in the table:

INSERT INTO testTable SELECT 'hmmm...i wonder if this will work'

Then run a select query with IO STATISTICS:

SET STATISTICS IO ON
SELECT * FROM testTable
SET STATISTICS IO OFF

Now, looking at the IO stats, we see there was only 1 logical read, but 60 LOB logical reads.  This is pretty much as expected as NTEXT stores its text value in the LOB not the table:

Table 'testTable'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 60, lob physical reads 0, lob read-ahead reads 0.

Now, lets alter the table to be an NVARCHAR(MAX):

ALTER TABLE testTable ALTER COLUMN testText NVARCHAR(MAX) null

Now when we run the select query again with UI STATISTICS we still get a lot of LOB reads (though less than we did with NTEXT).  So its obvious that when SQL Server did the alter table, it didn't use the default NVARCHAR(MAX) setting of text in row, but kept the text in the LOB and still uses pointers lookups to get the text out of the LOB.

Table 'testTable'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 40, lob physical reads 0, lob read-ahead reads 0.

This is not as expected and can be devastating for performance if you don't catch it, since NVARCHAR(MAX) with text not in row actually performs WORSE than NTEXT when doing SUBSTRING calls.

So how do we fix this problem?  Its actually fairly easy.  After running your alter table, run an update statement setting the column value to itself, like so:

UPDATE testTable SET testText = testText

SQL server moves the text from the LOB structure to the table (if less than 8,000 bytes).  So when we run the select again with IO STATISTICS we get 0 LOB reads. 

Table 'testTable'. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

YEA!  This is what we want.

Now, just for grins, what do you think happens if we change the NVARCHAR(MAX) back to NTEXT?  Well it turns out that SQL Server moves the text back to the LOB structure.  Completely backwards from what it did when converting NTEXT to NVARCHAR(MAX).



Posted on Wednesday, April 16, 2008 2:34 PM SQL Server Performance | Back to top


Comments on this post: NTEXT vs NVARCHAR(MAX) in SQL 2005

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Good to know, thanks. The last paragraph doesn't surprise me, but the rest I probably wouldn't have considered.
Left by Alex on Apr 17, 2008 9:11 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
I just converted a table using this technique (90,000 records). I had three columns using the ntext type. I converted one column at a time and a simple select query which took 4 secs on the unchanged table still took 4s until I had changed all the columns using ntext to nvarchar, and then it came down to 3secs avg. In retrospect I think I would have like to have done some more intense queries just to see for myself what would happen.

Good to know, thanks!
Left by Rhett on May 05, 2008 6:58 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
very helpful thanks. I discovered today that NTEXT is to be deprecated. I discovered this totally by chance through some very strange LINQ exception that I got because I mistyped a property. I guess I am 3+ years behind with SQL server:)
Left by Stilgar on May 05, 2008 6:41 PM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Thanks for the information. I too am apparently a bit behind on SQL server, as I wasn't aware that the TEXT/NTEXT type were going to be deprecated. Now to learn more about LOBs, BLOBs, CLOBs, etc.
Left by Brian on May 21, 2008 1:16 PM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Thanks for the info!
Any idea what the conversion does to the table size? Is the table larger because data is stored in-row or smaller because NVARCHAR is of a variable length?
Left by Narine on Jun 16, 2008 11:50 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
If you use the default settings for NVARCHAR(MAX) then yes, your table will be larger, which can cause other performance problems because less rows can now fit in a page of memory.

Its really a trade off. You need to look at the entire ecosystem of your database (queries, indexes, and tables) and see what makes sense for your specific needs.

In my case, I ended up switching from NTEXT to NVARCHAR(MAX) to get the string processing speed, but then vertically split the table, moving the big string columns off to a new 1-to-1 lookup table, because adding the text to the table data structure hurt other queries. With the vertical partition and the NVARCHAR(MAX) changes, I got very good sproc throughput.
Left by John on Jun 17, 2008 7:52 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Thanks for the post, but I'm confused about something. The NTEXT had 60 lob logical reads and when converted to NVARCHAR(MAX) is was 40 lob logical reads. Once the update was run, it was 0 lob logical reads indicating that less is better. Isn't 40 better than 60? Why did you indicate that 40 was worse than 60?
Left by don on Aug 01, 2008 8:30 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Thank you so much for the post. I am just about to upgrade my SQL 2000 database to SQL 2005 and I have multiple tables which use ntext data type. You mentioned that "SQL server moves the text from the LOB structure to the table (if less than 8,000 bytes)." On my db, the ntext column hold more that 8000 characters. Do you know if SQL 2005 will move all data over?
Left by Wei on Aug 29, 2008 8:26 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Thank you.

Just upgraded from SQL2000 to SQL2005, and trying to update some data into our new DB, but hit datatype issue.

Because the col is "ntext" it will NOT allow such a long substring of text to be stored/entered, so it looks like I need to convert the col to "nvarchar(max)" in order for the DB to accept this long string of char/HTML.

*Does anyone have a more extensive list of the datatypes that are being deprecated?*

ciao ciao Ntext!
Left by Fennel Doyle on Oct 01, 2008 10:41 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Great post! I know this might be a old hat, but here's a conversion script:

select
table_name,
column_name,
(
'alter table [' + table_name + '] ' +
'alter column [' + column_name +'] ' +
case when data_type = 'text' then 'varchar(max)' else 'nvarchar(max)' end + ' ' +
case when is_nullable = 'yes' then 'null' else 'not null' end + ';' + char(10) +
'update [' + table_name + '] set [' + column_name + '] = [' + column_name + '];'
+ char(10) +char(13) ) [sql]
into
#tmp
from
information_schema.columns
where
data_type = 'ntext' or data_type = 'text'

while exists(select null from #tmp)
begin
set nocount on
declare @sql nvarchar(max)
select top 1 @sql = [sql] from #tmp
set nocount off
exec sp_executesql @sql
set nocount on
delete from #tmp where [sql] = @sql
end
drop table #tmp
Left by Robert Baskette on Oct 28, 2008 5:06 PM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Interesting results. Now a question about this data being stored in row and IO performance.
Say I have a simple table
Id BigInt, Title Varchar(500), Detail Varchar(Max).
Id is the Clustered (and only) Index. I want to display 100 records at a time selecting just Id, Title and then if a record is selected fetch the Detail on demand.

Now with the Detail as text, I am going to fit a fairly large number of records per page and the IO cost to get a list of records will be low. Now what this is telling me is that if the Detail is stored in row unless it exceeds 8K then the number of records stored per page is going to decrease massively and IO use is going to go through the roof.
Left by Simon Porter on Nov 10, 2008 11:48 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Rebert, I think this is more accurate:

select
o.name as table_name,
c.name as column_name,
(
'alter table [' + o.name + '] ' +
'alter column [' + c.name +'] ' +
case when c.system_type_id = TYPE_ID('text') then 'varchar(max)' else 'nvarchar(max)' end + ' ' +
case when c.is_nullable = 1 then 'null' else 'not null' end + ';' + char(10) +
'update [' + o.name + '] set [' + c.name + '] = [' + c.name + '];'
+ char(10) +char(13) ) [sql]
into
#tmp
from sys.columns c
left join sys.objects o on (c.object_id = o.object_id)
where
o.type = 'U' and o.is_ms_shipped = 0 and c.system_type_id = TYPE_ID('ntext') or c.system_type_id = TYPE_ID('text')

while exists(select null from #tmp)
begin
set nocount on
declare @sql nvarchar(max)
select top 1 @sql = [sql] from #tmp
set nocount off
exec sp_executesql @sql
set nocount on
delete from #tmp where [sql] = @sql
end
drop table #tmp
Left by Francois Nel on Feb 25, 2009 7:58 AM

# ok..so what about the foot print on disk
Requesting Gravatar...
I have a a table which is 14 GIG when using (ntext).

After I convert this table to varchar(max), will will the footprint on the hard disk look like.

Will it be 1/2 the size to be about 7 GB?
or
Will it be the same size of 14GB?
or
Will it actually become bigger 21GB?


I choose to not recreate the table, since I had other idenity columns which I would have to turn
set identity off //on.

So I used alter table / alter column signature varchar(max).

In my first tests...it appears the table is actually increasing in size to be more than 21 GB in size? Not what I was anticpating...

Anynow, if someone could help me answer why this may happen the(table size actually increasing by 1/3)?

I even ran a backup and shrank the database to reclaim additional space (if any existed)...But the only thing that shrank was the log file (ldf).
to note: I had to set the log file size to be 16 GIG when originally performing the task becuase it said the log file was not large enough, even though I had set the log file to autogrow by 5%. This could be related to some other task running too....but expanding the log file before running alter table/alter column statement seem to at least fix that part.

--Still working on this situation...My goal was to get with the "new" standard of varchar(max) before looking into sql 2008 and to hopefully also shrink the size of our overall database/backups.

Any help would be appreciated.
Thanks, Doug Lubey of Louisiana

Left by Douglas Lubey of Louisiana on Mar 07, 2009 10:10 AM

# update to foot print size of varchar(max) ....RESULTS
Requesting Gravatar...
Just like this article suggests...
After I altered the data column type to varchar(max) the table will EVENTUALLY SHRINK TO BE 1/2 THE SIZE..if you follow your alter table statement with and UPDATE TABLE STATEMENT.

ORIGINALLY WITH JUST RUNNING THE ALTER TABLE:
The table size did double....why? Behind the scenes SQL server 2005 actually creates a duplicate feild of the same data. One field with the data type "text" and one field with the datatype varchar(max).

BASED ON THE SUGGESTION ABOVE:
I then ran an update statement

UPDATE StSignaturesTbl
set signatureString=signatureString,
where Date_Created >= '1/1/2002'
================
At first the dataTable did not shrink even after running this statement...but slowly it did..after checking back on a hourly basis.

HERE IS THE OUTLINE OF WHAT HAPPENED:
-----
1>original table size: 14 GB
2>Alter Table/Alter Column: 21 GB (it increased)
3>Waited....it stayed the same ...21 gigabytes
4>Ran the update statement(Above)..no changes at first
5>Waited ....3-6 hours...the table eventually shrank to be 1/2 the size of the original foot print size on the hard disk..

END RESULT: After both tasks completed and after waiting patiently between tasks. THE FINAL SIZE FO THE DATATABLE IS NOW 7 GB in size....

THIS IS WHAT WE WERE EXPECTED...50% gain....in hard disk space..

Thanks again for this article...
Doug Lubey of Louisiana


ONE THING TO NOTE: make sure you ADD enough space to your log file (.ldf) to cover an additional 100% of the TABLE SIZE...before running the alter column/alter table statements or you will get a run-time error once the log file size reaches it upper limit (EVEN IF YOU have auto-increase set to 10%)...I Did have it said to increase automatically by 10%...SQL SERVER 2005 still said it could not reclaim disk space fast enough ..do not know why...but increasing the log file size to the original size + THE table size did the trick. USED THE properties(right-click_) of the database to do this.
Left by Douglas Lubey of Louisiana on Mar 08, 2009 4:17 PM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
So much helpful.
Thank you Mate
Left by Ethan on Apr 21, 2009 5:40 PM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
If migration performance is a consideration, you could use SELECT INTO with a conversion function into a new table. Then drop the old table and rename the newly created table. This would have the sideaffects of creating the storage properly, and operating faster in non-logged mode, not increasing the transaction log size. Of course, the columns null/identity options would need to be alters and constraints, indexes would need to be readded, so it is a bit more work. Recommend for a few very large tables.
Left by Chip in Tampa on Apr 22, 2009 11:26 AM

# update to foot print size of varchar(max) ....RESULTS
Requesting Gravatar...
Douglas,
Keep in mind that going from NTEXT to VARCHAR(MAX) you are going from a datatype that supports unicode characters to one that does not. Hence, while you have saved half of the storage space for that data, any data that does not fit in your single byte VARCHAR(MAX) character set code page was lost and the potential for storing characters outside this single byte character set is also lost. So, TEXT and VARCHAR are good datatypes for singe-byte character sets and NTEXT and NVARCHAR are good for supporting all characters for all languages (i.e. Unicode character set). The reason the size is double with the N-Types is that they use a two-byte (USC2) encoding for the Unicode characters they represent. So, be sure to know what data you will be storing in these fields and then you will know which datatype is best for you (N vs Non-N types).
Left by DBall on May 09, 2009 1:04 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Thanks this was exactly what I was looking for.
Left by Aaron on Jul 23, 2009 12:56 PM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
These are good differences that i wanted to know..Thanx
Left by Danish on Sep 30, 2009 2:29 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
thank you, this was exactly what I needed!
Left by Daniel Williams on Oct 05, 2009 5:13 PM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Great article! Lots of good information and follow through!
Left by Jayelliii on Jan 27, 2010 6:50 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Great article I agree with the many others here. The LOB was something I was definitely overlooking in my initial conversions.

Here is what I am trying to come up with. I have many databases that we take care on our hosting servers (all for the same ecommerce application). This particular DB design was built primarily for a SQL2000 platform and consequently is littered with a lot of [ntext] columns. As times change, we would like to put all these DB's onto our newer SQL2008 server but I am adamant about eliminating the ntext data type from databases that are going onto this server.

So, I need a script that can be run against a given DB that will identify all the tables/columns (possibly put them into an array). Then, one by one (n+1) we need to convert each of these columns to nvarchar(MAX) and set the column value to itself.

Any assistance would be appreciated.
Left by Mark on Feb 23, 2010 1:07 PM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Thanks! This i dnot' know about tese future
Left by Prain (PG) on Apr 03, 2010 3:54 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
thanks for the post.. this is the article I looking for

part time jobs search engine | free mp3 download
Left by mp3 song free download on Apr 06, 2010 12:53 PM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
SQL is getting competative now. Its still one of the best though
Left by Noise Cancelling Earbuds on Apr 08, 2010 1:53 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Argh! So frustrating, I just want to append some text onto an ntext column! But it looks like I am going to have to convert it to the new nvarchar(MAX).. or rewrite the whole damn query and then I read ntext is deprecated anyway.

Ugh, I have been using ntext for 10 years without needing to do this obviously, never would have expected such a restriction to be in place.

Fun fun fun! ;-)
Left by Baby Names Pronounced on Apr 09, 2010 2:01 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Thanks for the info it was very useful. Your last complaint is unjustified as of course SQL Server moves the data back out of the table structure after altering back to NTEXT. That's how NTEXT works. NVARCHAR can support both that's why the original problem was there, but I agree that SQL Server should automatically pull < 8000 char data back into the table.
Left by Alan Macdonald on Mar 10, 2011 6:25 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Great Article! Helped me out three years later, hehe (:
Left by iulo on Apr 29, 2011 5:43 AM

# re: NTEXT vs NVARCHAR(MAX) in SQL 2005
Requesting Gravatar...
Thanks, good article!
I'm having some problems during the transition to {SQL Server Native Client 10.0} and converting the type of text, ntext, image in the varchar (max), nvarchar (max), varbinary (max). Before that we used {SQL Server} and types of text, ntext, image, then {SQL Native Client} and type text, ntext, image problems did not exist.
Problem description:
If you convert type to varchar (max), nvarchar (max), varbinary (max) and using a driver ODBC {SQL Native Client} or {SQL Server Native Client 10.0}, the data are cut (retained only 12 characters). Maybe someone knows why?
Left by Igor on Sep 28, 2011 1:26 AM

Your comment:
 (will show your gravatar)


Copyright © John Conwell | Powered by: GeeksWithBlogs.net | Join free