This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

Brief Tutorial on Text, Ntext, and Image

Joseph Gama

Text, ntext, and image data have been around a long time, but their nuances can be easy to overlook. In this tutorial, Joseph Gama provides a quick overview of the implementation and usage of these special data types. He dedicates the article to Dr. Aaron Gordon, Dr. Earl Hasz, Dr. Jerry Shultz, and Dr. Shahar Boneh from Metropolitan State College of Denver for their extraordinary teachings and to Adam and Karen Schwartz for their fantastic support.

Databases are growing in size and complexity, in part because today's hardware and software allow us to store mind-boggling amounts of data—including multimedia and document data. JPG, PNG, MP3, DOC/RTF, HTML, Unicode, and XML data can all be stored as image, text, or ntext in SQL Server databases.

Generally speaking, you use text to store huge ASCII character strings, ntext for Unicode character strings, and image for binary image data. Worried about size? Text gives you up to 2^31 - 1 (2,147,483,647) variable-length non-Unicode characters, ntext up to 2^30 - 1 (1,073,741,823) characters, and image up to 2^31 - 1 (2,147,483,647) bytes. The actual storage size, in bytes, for ntext is two times the number of characters entered. The SQL-92 synonym for ntext is national text.

So how do they work? They use pointers to reference the data. Special functions allow the pointer to add, extract, or remove data from them. Table 1 provides a summary of their strengths and limitations.

Table 1. Strengths and limitations of text, ntext, and image data types.

Can do

Can't do

Serve as input or output parameters for stored procedures.

Work with DECLARE, SET, or FETCH, meaning that you can't use them as variables the same way as with other data types.

Serve as input into a UDF.

Be a return data type for UDFs (nor can timestamp).

Have up to 8,000 bytes be converted into other data types.

Be retrieved from a cursor with a FETCH unless you do a conversion.

Participate in a UNION ALL clause.

Work with sql_variant.

 

Be compared, sorted, or used in a GROUP BY clause. The only exception is when using IS NULL or LIKE. (An easy workaround is to use CONVERT or user-defined functions that would return other data types.)

 

Participate in a UNION clause because this type of UNION is equivalent to a DISTINCT clause and it would cause an error because text, ntext, and image data types can't be sorted.

Text and image functions

Here I'll describe a number of text and image functions, showing the syntax and an example with output for each.

TEXTPTR

TEXTPTR returns a varbinary of length 16 bytes, which is the text-pointer value that references a text, ntext, or image column.

Syntax:

TEXTPTR ( column )

--TEXTPTR sample, create a text-pointer, see its value
create table #t (n ntext)
insert #t values('abcdef')
DECLARE @ptrval binary(16)
SELECT @ptrval = TEXTPTR(n) FROM #t
print @ptrval
drop table #t

Output:

  0xFFFF6900000000004D00000001000000

TEXTVALID

TEXTVALID returns an int, which will be of value 1 if the text-pointer is valid, 0 otherwise.

Syntax:

TEXTVALID ( 'table.column' , text_ptr )

--TEXTVALID sample, creates a text-pointer, tests it
create table #t (n ntext)
insert #t values('abxyef')
DECLARE @ptrval binary(16), @ptrval2 binary(16)
SELECT @ptrval = TEXTPTR(n) FROM #t
if TEXTVALID('#t.n',@ptrval)=1
print '@ptrval has a valid text pointer.'
else	
print '@ptrval has an invalid text pointer.'
if TEXTVALID('#t.n',@ptrval2)=1	
print '@ptrval2 has a valid text pointer.'
else  print '@ptrval2 has an invalid text pointer.'
drop table #t

Output:

  @ptrval has a valid text pointer.
@ptrval2 has an invalid text pointer.

SET TEXTSIZE

SET TEXTSIZE sets the size, an int value, of text and ntext data to be returned when using a SELECT statement.

Syntax:

SET TEXTSIZE { number }

--SET TEXTSIZE sample
create table #t (n ntext)
insert #t values('abcdefghijk')
SET TEXTSIZE 10--ntext is unicode, 2 bytes/character
select * from #t
SET TEXTSIZE 20--ntext is unicode, 2 bytes/character
select * from #t
drop table #t

Output:

  abcde
abcdefghij

@@TEXTSIZE

@@TEXTSIZE returns the size, an int value, of text and ntext data to be returned when using a SELECT statement. This value is set with SET TEXTSIZE.

Syntax:

@@TEXTSIZE

--@@TEXTSIZE sample
SET TEXTSIZE 10--ntext is unicode, 2 bytes/character
print @@TEXTSIZE
SET TEXTSIZE 20--ntext is unicode, 2 bytes/character
print @@TEXTSIZE

Output:

  10
20

WRITETEXT

WRITETEXT overwrites the data from a text, ntext, or image column.

Syntax:

WRITETEXT { table.column text_ptr } [ WITH LOG ] { data }

--WRITETEXT sample
create table #t (n ntext)
insert #t values('abc')
DECLARE @ptrval binary(16)
SELECT @ptrval = TEXTPTR(n)
FROM #t
WRITETEXT #t.n @ptrval 'def'
select * from #t
drop table #t

Output:

  def

UPDATETEXT

UPDATETEXT changes the data from an existing text, ntext, or image column. **

Syntax:

UPDATETEXT { table_name.dest_column_name dest_text_ptr } { NULL | insert_offset } { NULL | delete_length } [ WITH LOG ] [ inserted_data | { table_name.src_column_name src_text_ptr } ]

--UPDATETEXT sample insertion only
create table #t (n ntext)
insert #t values('bd')
DECLARE @ptrval binary(16), @i int
SELECT @ptrval = TEXTPTR(n)
FROM #t
UPDATETEXT #t.n @ptrval 0 0 'a'--insert at beginning
select * from #t
UPDATETEXT #t.n @ptrval 2 0 'c'--insert in the middle
select * from #t
set @i=(select DATALENGTH(n) from #t)/2
--/2 only if ntext, 2 bytes/character
print @i
UPDATETEXT #t.n @ptrval @i 0 'e'--insert at the end
select * from #t
drop table #t

Output:

  abd
abcd
abcde

Sample deletion and insertion:

  --UPDATETEXT sample deletion+insertion
create table #t (n ntext)
insert #t values('abxyef')
DECLARE @ptrval binary(16), @i int
SELECT @ptrval = TEXTPTR(n) 
FROM #t
UPDATETEXT #t.n @ptrval 2 2 'cd'--insert 2, delete 2 
--chars starting at position 2
select * from #t
drop table #t

Output:

  abcdef

READTEXT

READTEXT reads a certain amount of data from a text, ntext, or image column.

Syntax:

READTEXT { table.column text_ptr offset size } [ HOLDLOCK ]

--READTEXT sample
create table #t (n ntext)
insert #t values('abcdefghijk')
DECLARE @ptrval binary(16)
SELECT @ptrval = TEXTPTR(n) FROM #t
READTEXT #t.n @ptrval 3 8
--read 8 characters starting at position 3
drop table #t

Output:

  defghijk

DATALENGTH

DATALENGTH returns the size (number of bytes) of a text, ntext, or image column.

Syntax:

DATALENGTH ( expression )

--DATALENGTH sample
create table #t (n ntext)
insert #t values('1234567890')
DECLARE @i int
set @i=(select DATALENGTH(n) from #t)
--it should return the length in bytes=2*UNICODE length
PRINT @i
drop table #t

Output:

  20

PATINDEX

PATINDEX returns the location, an int value, of the first occurrence of a pattern in a text, ntext, or image column, or 0 if the pattern wasn't found.

Syntax:

PATINDEX ( '%pattern%' , expression )

--PATINDEX sample
create table #t (n ntext)
insert #t values('Hello Tim, long time no see!')
SELECT PATINDEX('%tim%', n) FROM #t
SELECT PATINDEX('%time%', n) FROM #t
drop table #t

Output:

  7
17

CONVERT

CONVERT returns an expression converted from one data type to another.

Syntax:

CONVERT ( data_type [ ( length ) ] , expression [ , style ] )

--CONVERT sample
create table #t (n ntext)
insert #t values('Hello Tim, long time no see!')
DECLARE @c nvarchar(5)
SET @c=(select convert(nvarchar(5),n) from #t)
print @c
drop table #t

Output:

  Hello

CAST

CAST returns an expression casted (converted) from one data type to another.

Syntax:

CAST ( expression AS data_type )

--CAST sample
create table #t (n ntext)
insert #t values('Hello Tim, long time no see!')
DECLARE @c nvarchar(5)
SET @c=(select CAST ( n  AS nvarchar(5) ) from #t)
print @c
drop table #t

Output:

  Hello

Common implementations

Now that we have a little background, let's look at some common uses.

Saving a text, ntext, or image column to a file

I created three separate stored procs that show you how you can save one column of type text, ntext, or image to a file. You'll find the code for these and all other examples in the accompanying Download file.

--saveText2file sample
create table ##t (n text)
insert ##t values('Hello Tim, long time no see!')
EXEC saveText2file 'c:\test.txt', '##t','n', ''
drop table ##t

--saveNtext2file sample create table ##t (n ntext) insert ##t values('Hello Tim, long time no see!') EXEC saveNtext2file 'c:\test.txt', '##t','n', '' drop table ##t

--saveImage2file sample exec saveImage2file 'c:\Category1.bak', 'Northwind..Categories', 'Picture', 'where categoryid=1'

Updating a text, ntext, or image column from a file

Because TEXTPTR, WRITETEXT, and UPDATETEXT don't allow variable names to define the table or column parameters, reading the contents of a file into a column requires you to use dynamic SQL. Stored proc readImageFromfile can handle both image and varchar data types because it reads the data as binary and writes it without using temporary tables. Ntext can be read using readNtextFromfile.

--readImageFromfile sample
--reading a text column from a file
create table ##t (n text)
insert ##t values('Hi Tim, long time no see!')
EXEC readImageFromfile 'c:\hello.txt', '##t','n', ''
select * from ##t
drop table ##t

Output:

  Hello

The same code will work with an image column. The only thing to remember is that ntext columns shouldn't receive data from ASCII text files—nor should text columns receive Unicode data—because there's no conversion performed by this stored procedure. The data comes in a raw format.

The stored procedure readImageFromfile2 appends data rather than replacing the original contents, and readNtextFromfile reads Ntext (Unicode) data. Appending data is easier than updating because updating is performed by first replacing the original data with the first block of new data and then continuously appending new blocks of data, while appending doesn't require that first step. Reading ntext is basically the same as reading any other data type, other than the fact that Unicode must be Little Endian. (Many text processors add 0xFFFE to the beginning of a Unicode text file.) The stored procedure readNtextFromfile has some extra code to remove 0xFFFEm, the "Byte Order Mark" (BOM), which is used to distinguish between the Big Endian and Little Endian variants of Unicode.

See the Download file for another stored procedure called readNtextFromfileBigEndian that will read Unicode Big Endian from a file and convert it into Little Endian.

Saving the text of an object to a file

By using a temporary table with an ntext column, it's possible to read the text of a rule, default, unencrypted stored proc, UDF, trigger, or view to a file. The temporary table is created and then a new record is inserted with the INSERT EXEC technique. This record will contain the object's text retrieved from sysobjects with the system stored procedure sp_helptext. This is done in the stored procedure saveobj with the following code:

create table #temp(s nvarchar(4000))
insert #temp
exec sp_helptext @object

The stored procedure saveobj can save any rule, default, unencrypted stored proc, UDF, trigger, or view to a text file, making it very useful for backup and documentation purposes.

  --saveobj sample, saving object hello
exec saveobj 'c:\hello.sql', 'hello'

Reading the text of an object from a file

Reading a file that's more than 8,000 bytes requires the use of dynamic SQL because many buffers will be filled accordingly to the size of the file and then executed later. The stored proc readobj works with a loop that will create the dynamic SQL using a loop to determine the number of temporary variables to create. Each temporary variable holds 8,000 bytes of data, which doesn't limit the size of the file imported. Using temporary variable names with a character and increasing numbers to differentiate them will achieve that purpose.

--readobj sample, reading object hello
exec readobj 'c:\hello.sql'

INSERT or UPDATE statements over 8,000 bytes

Stored procedure readobj can create rule, default, unencrypted stored procedure, UDF, trigger, or view objects from text files. It also allows INSERT, UPDATE, DELETE, or other statements greater than 8,000 bytes. If the statements aren't in a text file, but are instead in a giant SQL string, a simple stored procedure can execute it:

CREATE PROCEDURE exec_ntext (@SQL ntext)
--Execute a gigantic SQL statement
AS
EXEC (@SQL)

Using user-defined functions to compare text, ntext, or image columns

The fastest and easiest way to compare two columns is to compare them by size, but this technique has obvious limitations since two columns might have the same length but different contents:

CREATE FUNCTION  testlength (@a ntext, @b ntext)
--returns true if both inputs have the same length
RETURNS bit AS
BEGIN declare @temp_bit bit if datalength(@a)=datalength(@b) set @temp_bit= 1 else set @temp_bit=0 return @temp_bit END

(The input in the preceding example was ntext, but it could have been text or image.)

The best way to compare two columns is to compare either the entire contents or a subset thereof:

  CREATE FUNCTION  testequality (@a ntext, @b ntext)
--returns 1 if the comparison result is true
RETURNS bit AS  
BEGIN 
declare @temp_bit bit
if  @a like @b
  set @temp_bit= 1
else
  set @temp_bit=0
return @temp_bit
END

Pattern matching (the second input variable), defined by using wildcard characters, makes this technique extremely powerful. You probably know that SQL Server's wildcard characters are:

  • % A string of any size, even an empty one.
  • _ A string with only one character.
  • [ ] One character included in a set of characters ([fhsvf]) or in a range of characters ([k-t]).
  • [^] One character not included in a set of characters ([^fhsvf]) or in a range of characters ([^k-t]).

Determining the type of an image

A quick-and-dirty way to find out if an image is a bitmap, JPEG, PNG, or TIFF is to check the first bytes of the image to determine whether they fit into the header specifications for those formats. This technique isn't bulletproof, because it can fail in the case of raw data, but it's the first step to a more elaborate identification function:

CREATE FUNCTION  getImageType (@a image)
--returns the type of image format
RETURNS varchar(4) AS
BEGIN declare @out varchar(4), @temp varbinary(8) SET @temp=convert(varbinary(8), @a) SET @out=CASE WHEN LEFT(@temp,2)=0x424D THEN 'BMP' WHEN LEFT(@temp,2)=0xFFD8 THEN 'JPG' WHEN LEFT(@temp,8)=0x89504E470D0A1A0A THEN 'PNG' WHEN (LEFT(@temp,2)=0x4949)OR(LEFT(@temp,2)=0x4D4D) THEN 'TIFF' ELSE '' END return @out END

These and other samples are available in the Download file. Have fun exploring, remembering, and adapting for your own applications!

Download NTEXTCODE.exe

To find out more about SQL Server Professional and Pinnacle Publishing, visit their website at http://www.pinpub.com/html/main.isx?sub=57

Note: This is not a Microsoft Corporation website. Microsoft is not responsible for its content.

This article is reproduced from the July 2003 issue of SQL Server Professional. Copyright 2003, by Pinnacle Publishing, Inc., unless otherwise noted. All rights are reserved. SQL Server Professional is an independently produced publication of Pinnacle Publishing, Inc. No part of this article may be used or reproduced in any fashion (except in brief quotations used in critical articles and reviews) without prior consent of Pinnacle Publishing, Inc. To contact Pinnacle Publishing, Inc., please call 1-800-493-4867 x4209.