Defining Column-Specific Properties
To complete the data model described in Chapter 8, "Building Your Data Model," you must define column-specific properties for each column. The column-specific properties describe the constraints and identify the set of valid values that attributes (or columns) can assume.
The purpose of column-specific properties is to guard the semantic integrity of the data in the model; that is, to ensure that it reflects reality in a sensible way. The integrity of the data model is at risk if you can substitute a name for a telephone number or if you can enter a fraction where only integers are allowed.
To define column-specific properties, first define the constraints that a data value must satisfy before it can be part of the column. Use the following constraints to specify column-specific properties:
You can identify the primary and foreign keys in each table to place referential constraints on columns. For more information on primary and foreign keys, see Chapter 8, "Building Your Data Model."
Extended Data Types
In addition to the built-in data types that Universal Server supports, you can create and use the following data types to specify the data type of a column:
- Opaque data types. You can use these encapsulated data types to define columns in the same way that you use built-in types. When you create an opaque data type, you also define the functions, operators, and aggregates to operate on the type. For information about opaque types, see the CREATE OPAQUE TYPE statement in the Informix Guide to SQL: Syntax and the user guide Extending INFORMIX-Universal Server: Data Types.
- Distinct data types. These data types have the same representation as, but are distinct from, existing data types. You can create a distinct type from a built-in type, opaque type, named row type, or other distinct type. For information about distinct types, see Chapter 3, "Environment Variables" in the Informix Guide to SQL: Reference
and the CREATE DISTINCT TYPE statement in the Informix Guide to SQL: Syntax.
- Complex data types. These data types combine one or more existing data types to create a new data type. A complex data type allows access to each of its component data types. For information about complex types, see Chapter 10, "Understanding Complex Data Types."
Built-In Data Types
A built-in data type is a data type that the database server defines. INTEGER, CHAR, DATE, and DECIMAL are examples of built-in data types. To help you choose the appropriate data types for implementing your data model, this section provides a description of the built-in data types. For additional information on the built-in data types, see Chapter 3, "Environment Variables," in the Informix Guide to SQL: Reference.
Choosing a Data Type
The first constraint on any column is the one that is implicit in the data type for the column. When you choose a data type, you constrain the column so that it contains only values that can be represented by that type.
Every column in a table must have a data type that is chosen from the built-in or extended types that the database server supports. The choice of data type is important for the following reasons:
Using Data Types in Referential Constraints
Almost all data type combinations must match when you are trying to pick columns for primary and foreign keys. For example, if you define a primary key as a CHAR data type, you must also define the foreign key as a CHAR data type. However, when you specify a SERIAL data type on a primary key in one table, you specify an INTEGER on the foreign key of the relationship. Similarly, when you specify a SERIAL8 data type on a primary key in one table, you specify an INT8 on the foreign key of the relationship.The only data type combinations that you can mix in a relationship are as follows:
Figure 9-1 shows the decision tree that summarizes the choices among built-in data types. The choices are explained in the following sections.
Numeric Data Types
Informix database servers support eight numeric data types. Some are best suited for counters and codes, some for engineering quantities, and some for money.
Counters and Codes: INTEGER, SMALLINT, and INT8
The INTEGER and SMALLINT data types hold small whole numbers. They are suited for columns that contain counts, sequence numbers, numeric identity codes, or any range of whole numbers when you know in advance the maximum and minimum values to be stored.
Both types are stored as signed binary integers. INTEGER values have 32 bits and can represent whole numbers from -(231 -1) through 231-1; that is, from -2,147,483,647 through 2,147,483,647. (The maximum negative number, -2,147,483,248, is reserved and cannot be used.)
SMALLINT values have only 16 bits. They can represent whole numbers from -32,767 through 32,767. (The maximum negative number, -32,768, is reserved and cannot be used.)
The INTEGER and SMALLINT data types have the following advantages:
The disadvantage to using INTEGER and SMALLINT is the limited range of values that they can store. The database server does not store a value that exceeds the capacity of an integer. Of course, such excess is not a problem when you know the maximum and minimum values to be stored.
The INT8 data type is stored as a signed binary integer, which uses 8 bytes per value. Although INT8 takes up twice the space as the INTEGER data type, INT8 has the advantage of a significantly larger range of data representation. INT8 can represent integers ranging from - (2 63 -1) through 2 63 -1; that is from - 9,223,372,036,854,775,807 through 9,223,372,036,854,775,807. (The maximum negative number, - 9,223,372,036,854,775,808, is reserved and cannot be used.)
Automatic Sequences: SERIAL and SERIAL8
The SERIAL data type is simply INTEGER with a special feature. Similarly, the SERIAL8 data type is INT8 with a special feature. Whenever a new row is inserted into a table, the database server automatically generates a new value for a SERIAL or SERIAL8 column. A table can have only one SERIAL or SERIAL8 column, but it can have both a SERIAL column and a SERIAL8 column. Because the database server generates them, the serial values in new rows are always different even when multiple users are adding rows at the same time. This service is useful, because it is quite difficult for an ordinary program to coin unique numeric codes under those conditions.
The SERIAL data type can yield up to 231-1 positive integers. Consequently, the database server uses all the positive serial numbers by the time it inserts 231-1 rows in a table. For most users the exhaustion of the positive serial numbers is not a concern, however, because a single application would need to insert a row every second for 68 years, or 68 applications would need to insert a row every second for a year, to use all the positive serial numbers. However, if all the positive serial numbers were used, the database server would continue to generate new numbers. It would treat the next serial quantity as a signed integer. Because the database server uses only positive values, it would simply wrap around and start to generate integer values that begin with a 1.
The SERIAL8 data type can yield up to 263 -1 positive integers. With a reasonable starting value, it is virtually impossible to cause a SERIAL8 value to wrap around during insertions.
For SERIAL and SERIAL8 data types, the sequence of generated numbers always increases. When rows are deleted from the table, their serial numbers are not reused. Rows that are sorted on a SERIAL or SERIAL8 column are returned in the order in which they were created. That cannot be said of any other data types.
You can specify the initial value in a SERIAL or SERIAL8 column in the CREATE TABLE statement. This makes it possible to generate different subsequences of system-assigned keys in different tables. The stores7 database uses this technique. In stores7, the customer numbers begin at 101, and the order numbers start at 1001. As long as this small business does not register more than 899 customers, all customer numbers have three digits, and order numbers have four.
A SERIAL or SERIAL8 column is not automatically a unique column. If you want to be perfectly sure that no duplicate serial numbers occur, you must apply a unique constraint (see "Using CREATE TABLE"). If you define the table using the interactive schema editor in DB-Access or SQL Editor, it automatically applies a unique constraint to any SERIAL or SERIAL8 column.
The SERIAL and SERIAL8 data types have the following advantages:
The SERIAL and SERIAL8 data types have the following disadvantages:
Altering the next SERIAL or SERIAL8 number
The starting value for a SERIAL or SERIAL8 column is set when the column is created (see "Using CREATE TABLE"). You can use the ALTER TABLE statement later to reset the next value, the value that is used for the next-inserted row.
You cannot set the next value below the current maximum value in the column because doing so causes the database server to generate duplicate numbers in certain situations. However, you can set the next value to any value higher than the current maximum, thus creating gaps in the sequence.
Approximate Numbers: FLOAT and SMALLFLOAT
In scientific, engineering, and statistical applications, numbers are often known to only a few digits of accuracy, and the magnitude of a number is as important as its exact digits.
The floating-point data types are designed for these applications. They can represent any numerical quantity, fractional or whole, over a wide range of magnitudes from the cosmic to the microscopic. For example, they can easily represent both the average distance from the Earth to the Sun (1.5 ¥ 109 meters) or Planck's constant (6.625 ¥ 10-27). Their only restriction is their limited precision. Floating-point numbers retain only the most significant digits of their value. If a value has no more digits than a floating-point number can store, the value is stored exactly. If it has more digits, it is stored in approximate form, with its least-significant digits treated as zeros.
This lack of exactitude is fine for many uses, but you should never use a floating-point data type to record money or any other quantity whose least significant digits should not be changed to zero.
Two sizes of floating-point data types exist. The FLOAT type is a double-precision, binary floating-point number as implemented in the C language on your computer. A FLOAT data type value usually takes up 8 bytes. The SMALLFLOAT (also known as REAL) data type is a single-precision, binary floating-point number that usually takes up 4 bytes. The main difference between the two data types is their precision. A FLOAT column retains about 16 digits of its values; a SMALLFLOAT column retains only about 8 digits.
Floating-point numbers have the following advantages:
The main disadvantage of floating-point numbers is that digits outside their range of precision are treated as zeros.
Adjustable-Precision Floating Point: DECIMAL(p)
The DECIMAL(p) data type is a floating-point data type similar to FLOAT and SMALLFLOAT. The important difference is that you specify how many significant digits it retains. The precision you write as p can range from 1 to 32, from fewer than SMALLFLOAT up to twice the precision of FLOAT.
The magnitude of a DECIMAL(p) number ranges from 10-130 to 10124.
It is easy to be confused about decimal data types. The one under discussion is DECIMAL(p); that is, DECIMAL with only a precision specified. The size of DECIMAL(p) numbers depends on their precision; they occupy 1+p/2 bytes (rounded up to a whole number, if necessary).
DECIMAL(p) has the following advantages over FLOAT:
The DECIMAL(p) data type has the following disadvantages compared to FLOAT:
Fixed-Point Numbers: DECIMAL and MONEY
Most commercial applications need to store numbers that have fixed numbers of digits on the right and left of the decimal point. Amounts of money are the most common examples. Amounts in U.S. and other currencies are written with two digits to the right of the decimal point. Normally, you also know the number of digits needed on the left, depending on the kind of transactions that are recorded: perhaps 5 digits for a personal budget, 7 digits for a small business, and 12 or 13 digits for a national budget.
These numbers are fixed-point numbers because the decimal point is fixed at a specific place, regardless of the value of the number. The DECIMAL(p,s) data type is designed to hold them. When you specify a column of this type, you write its precision (p) as the total number of digits that it can store, from 1 to 32. You write its scale (s) as the number of those digits that fall to the right of the decimal point. (Figure 9-2 shows the relation between precision and scale.) Scale can be zero, meaning it stores only whole numbers. When only whole numbers are stored, DECIMAL(p,s) provides a way of storing integers of up to 32 digits.
Like the DECIMAL(p) data type, DECIMAL(p,s) takes up space in proportion to its precision. One value occupies 1+p/2 bytes, rounded up to a whole number of bytes.
The MONEY type is identical to DECIMAL(p,s), but with one extra feature. Whenever the database server converts a MONEY value to characters for display, it automatically includes a currency symbol.
The advantages of DECIMAL(p,s) over INTEGER and FLOAT are that much greater precision is available (up to 32 digits as compared with 10 digits for INTEGER and 16 digits for FLOAT), and both the precision and the amount of storage required can be adjusted to suit the application.
The disadvantages are that arithmetic operations are less efficient and that many programming languages do not support numbers in this form. Therefore, when a program extracts a number, it usually must convert the number to another numeric form for processing. (However, INFORMIX-4GL programs can use DECIMAL(p,s) and MONEY values directly.)
Choosing a currency format
Each nation has its own way of displaying money values. When an Informix database server displays a MONEY value, it refers to a currency format that the user specifies. The default locale specifies a U.S. English currency format of the following form:
For non-English locales, you can change the current format by means of the MONETARY category of the locale file. For more information on using locales, refer to Chapter 1 of the Guide to GLS Functionality.
To customize this currency format, choose your locale appropriately or set the DBMONEY environment variable. For more information, see Chapter 3, "Environment Variables" of the Informix Guide to SQL: Reference.
Chronological Data Types
Informix database servers support three data types for recording time. The DATE data type stores a calendar date. DATETIME records a point in time to any degree of precision from a year to a fraction of a second. The INTERVAL data type stores a span of time; that is, a duration.
Calendar Dates: DATE
The DATE data type stores a calendar date. A DATE value is actually a signed integer whose contents are interpreted as a count of full days since midnight on December 31, 1899. Most often it holds a positive count of days into the current century.
The DATE format has ample precision to carry dates into the far future (58,000 centuries). Negative DATE values are interpreted as counts of days prior to the epoch date; that is, a DATE value of -1 represents the day December 30, 1899.
Because DATE values are integers, Informix database servers permit them to be used in arithmetic expressions. For example, you can take the average of a DATE column, or you can add 7 or 365 to a DATE column. In addition, a rich set of functions exists specifically for manipulating DATE values. (See the Informix Guide to SQL: Syntax.)
The DATE data type is compact, at 4 bytes per item. Arithmetic functions and comparisons execute quickly on a DATE column.
Choosing a date format
You can punctuate and order the components of a date in many ways. When an Informix database server displays a DATE value, it refers to a date format that the user specifies. The default locale specifies a U.S. English date format of the form:
To customize this date format, choose your locale appropriately or set the DBDATE environment variable. For more information, see Chapter 3 of the Informix Guide to SQL: Reference.
For languages other than English, you can also change the date format by means of the TIME category of the locale file. For more information on using locales, refer to the Guide to GLS Functionality.
Exact Points in Time: DATETIME
The DATETIME data type stores any moment in time in the era that begins 1 A.D. In fact, DATETIME is really a family of 28 data types, each with a different precision. When you define a DATETIME column, you specify its precision. The column can contain any sequence from the list year, month, day, hour, minute, second, and fraction. Thus, you can define a DATETIME column that stores only a year, only a month and day, or a date and time that is exact to the hour or even to the millisecond. The size of a DATETIME value ranges from 2 to 11 bytes depending on its precision, as Figure 9-3 shows.
The advantage of DATETIME is that it can store dates more precisely than to the nearest day, and it can store time values. Its sole disadvantage is an inflexible display format, but you can circumvent this advantage. (See "Forcing the format of a DATETIME or INTERVAL value".)
Durations: INTERVAL
The INTERVAL data type stores a duration, that is, a length of time. The difference between two DATETIME values is an INTERVAL, which represents the span of time that separates them. The following examples might help to clarify the differences:
Like DATETIME, INTERVAL is a family of types with different precisions. An INTERVAL value can represent a count of years and months; or it can represent a count of days, hours, minutes, seconds, or fractions of seconds; 18 precisions are possible. The size of an INTERVAL value ranges from 2 to 12 bytes, depending on the formulas that Figure 9-4 shows.
INTERVAL values can be negative as well as positive. You can add or subtract them, and you can scale them by multiplying or dividing by a number. This is not true of either DATE or DATETIME. You can reasonably ask, "What is one-half the number of days until April 23?" but not, "What is one-half of April 23?"
Forcing the format of a DATETIME or INTERVAL value
The database server always displays the components of an INTERVAL or DATETIME value in the order year-month-day hour:minute:second.fraction. It does not refer to the date format that is defined to the operating system, as it does when it formats a DATE value.
You can write a SELECT statement that displays the date part of a DATETIME value in the system-defined format. The trick is to isolate the component fields using the EXTEND function and pass them through the MDY() function, which converts them to a DATE. The following code shows a partial example:
Choosing a DATETIME Format
When an Informix database server displays a DATETIME value, it refers to a DATETIME format that the user specifies. The default locale specifies a U.S. English DATETIME format of the following form:
For languages other than English, you change the DATETIME format by means of the TIME category of the locale file. For more information on using locales, refer to the Guide to GLS Functionality.
To customize this DATETIME format, choose your locale appropriately or set the GL_DATETIME or DBTIME environment variable. For more information, see the Guide to GLS Functionality.
Boolean Data Type
The BOOLEAN data type is a one byte data type. In DB-Access or SQL Editor, legal values are true ('t'), false ('f') or NULL. The values are case insensitive.
The following table shows how the BOOLEAN data type is represented.
You can compare a BOOLEAN column against another BOOLEAN column, or against Boolean values ('t','f' ). For example, suppose you create the following table:
The following query returns rows from the emp_info table where bool_col values are true.
The following query returns rows from the emp_info table where bool_col values are null.
You can also use a column that is assigned the BOOLEAN data type to capture the results of an expression as shown in the following example:
Character Data Types
The database server supports the NCHAR data type and NVARCHAR, the special-use character data type.
Character Data: CHAR(n) and NCHAR(n)
The CHAR(n) data type contains a sequence of n bytes. These characters can be a mixture of English and non-English characters and can be either single byte or multibyte (Asian). The length n ranges from 1 to 32,767. Whenever a CHAR(n) value is retrieved or stored, exactly n bytes are transferred. If an inserted value is shorter than n, the database server extends the value by using single byte ASCII space characters to make up n bytes.
Data in CHAR columns is sorted in code-set order. For example, in the ASCII code set, the character a has a code-set value of 97, b has 98, and so forth. The database server sorts CHAR(n) data in this order.
The NCHAR(n) data type also contains a sequence of n bytes. These characters can be a mixture of English and non-English characters and can be either single byte or multibyte (Asian). The length of n has the same limits as the CHAR(n) data type. Whenever an NCHAR(n) value is retrieved or stored, exactly n bytes are transferred. The number of characters transferred can be less than the number of bytes if the data contains multibyte characters. If an inserted value is shorter than n, the database server extends the value by using single byte ASCII space characters to make up n bytes.
The database server sorts data in NCHAR(n) columns according to the order that the locale specifies. For example, the French locale specifies that the character ê is sorted after the value e but before the value f. In other words, the sort order dictated by the French locale is e, ê, f, and so on. For more information on using locales, refer to the Guide to GLS Functionality.
A CHAR(n) or NCHAR(n) value can include tabs and spaces but normally contains no other nonprinting characters. When rows are inserted using INSERT or UPDATE, or when rows are loaded with a utility program, no means exists for entering nonprintable characters. However, when rows are created by a program using embedded SQL, the program can insert any character except the null (binary zero) character. It is not a good idea to store nonprintable characters in a character column because standard programs and utilities do not expect them.
The advantage of the CHAR(n) or NCHAR(n) data type is its availability on all database servers. The only disadvantage of CHAR(n) or NCHAR(n) is its fixed length. When the length of data values varies widely from row to row, space is wasted.
Varying-Length Strings: CHARACTER VARYING(m,r), VARCHAR(m,r), NVARCHAR(m,r), and LVARCHAR
For the following data types, m represents the maximum number of bytes and r represents the minimum number of bytes.
Often the items in a character column have different lengths; that is, many have an average length, and only a few have the maximum length. The following data types are designed to save disk space when you store such data:
- CHARACTER VARYING (m,r). The CHARACTER VARYING (m,r) data type contains a sequence of, at most, m bytes or at the least, r bytes. This data type is the ANSI-compliant format for character data of varying length. CHARACTER VARYING (m,r) supports code-set order for comparisons of its character data.
- VARCHAR (m,r). VARCHAR (m,r) is an Informix-specific data type for storing character data of varying length. In functionality, it is the same as CHARACTER VARYING(m,r).
- NVARCHAR (m,r). NVARCHAR (m,r) is also an Informix-specific data type for storing character data of varying length. It compares character data in the order that the locale specifies.
- LVARCHAR. LVARCHAR is an Informix-specific data type for storing character data of varying length for values greater than 256 bytes but less than 32 kilobytes. LVARCHAR supports code-set order for comparisons of its character data
When you define columns of VARCHAR(m,r), CHARACTER VARYING(m,r), or VARCHAR(m,r) data types, you specify m as the maximum number of bytes. If an inserted value consists of fewer than m bytes, the database server does not extend the value with single-byte spaces (as with CHAR(n) and NCHAR(n) values.) Instead, it stores only the actual contents on disk, with a 1-byte length field. The limit on m is 254 bytes for indexed columns and 255 bytes for non-indexed columns.
The second parameter, r, is an optional reserve length that sets a lower limit on the number of bytes required by the value that is being stored on disk. Even if a value requires fewer than r bytes, r bytes are nevertheless allocated to hold it. The purpose is to save time when rows are updated. (See "Varying-Length Execution Time".)
The advantages of the CHARACTER VARYING(m,r) or VARCHAR(m,r) data type over the CHAR(n) data type are as follows:
These advantages also apply to the NVARCHAR(m,r) data type in comparison to the NCHAR(n) data type.
The following list describes the disadvantages of using CHARACTER VARYING(m,r), VARCHAR(m,r), and NVARCHAR(m,r) data types:
Varying-Length Execution Time
When you use the CHARACTER VARYING(m,r), VARCHAR(m,r), or NVARCHAR(m,r) data types, the rows of a table have a varying number of bytes instead of a fixed number of bytes. The speed of database operations is affected when the rows of a table have a varying number of bytes.
Because more rows fit in a disk page, the database server can search the table with fewer disk operations than if the rows were of a fixed number of bytes. As a result, queries can execute more quickly. Insert and delete operations can be a little quicker for the same reason.
When you update a row, the amount of work the database server must do depends on the number of bytes in the new row as compared with the number of bytes in the old row. If the new row uses the same number of bytes or fewer, the execution time is not significantly different than it is with fixed-length rows. However, if the new row requires a greater number of bytes than the old one, the database server might have to perform several times as many disk operations. Thus, updates of a table that use CHARACTER VARYING(m,r), VARCHAR(m,r), or NVARCHAR(m,r) data can sometimes be slower than updates of a fixed-length field.
To mitigate this effect, specify r as a number of bytes that covers a high proportion of the data items. Then most rows use the reserve number of bytes, and padding wastes only a little space. Updates are slow only when a value using the reserve number of bytes is replaced with a value that uses more than the reserve number of bytes.
Large Object Data Types
Universal Server supports both simple large objects and smart large objects to handle data that exceeds a length of 255 bytes and non-ASCII character data.
Smart large objects refer to columns that are assigned a BLOB or CLOB data type. A smart large object allows an application program to randomly access column data, which means you can read or write to any part of a BLOB or CLOB column in any arbitrary order.
Simple large objects refer to columns that are assigned a TEXT or BYTE data type. A simple large object can store and retrieve character data or binary data, but cannot randomly access portions of the column data. In other words, TEXT or BYTE data can be inserted or deleted but cannot be modified. The database server simply stores or retrieves the TEXT or BYTE data in a single SQL statement.
The following sections describe additional differences between simple large objects and smart large objects.
Smart Large Objects: CLOB
The CLOB data type stores a block of text. It is designed to store ASCII text data, including formatted text such as HTML or PostScript. Although you can store any data in a CLOB object, Informix tools expect a CLOB object to be printable, so restrict this data type to printable ASCII text.
CLOB values are not stored with the rows of which they are a part. They are allocated in whole disk pages, usually areas away from rows. (For more information, see the INFORMIX-Universal Server Administrator's Guide.)
The CLOB data type is similar to the TEXT data type except that the CLOB data type provides the following advantages:
The disadvantages of the CLOB data type are as follows:
Smart Large Objects: BLOB
The BLOB datatype is designed to hold any data that a program can generate: graphic images, satellite images, video clips, audio clips, or formatted documents saved by any word processor or spreadsheet. The database server permits any kind of data of any length in a BLOB column.
BLOB data items are stored in whole disk pages in separate disk areas from normal row data.
The advantage of the BLOB data type, as opposed to CLOB, is that it accepts any data. Otherwise, the advantages and disadvantages of the BLOB data type are the same as for the CLOB data type.
Using Smart Large Objects
To store columns of a CLOB or BLOB data type, you must allocate an sbspace. An sbspace is a logical storage unit that stores BLOB and CLOB data in the most efficient way possible. You can write INFORMIX-ESQL/C programs that allow users to fetch and store CLOB and BLOB data. Application programmers who want to access and manipulate large objects directly can consult the INFORMIX-ESQL/C Programmer's Manual.
In any SQL statement, interactive or programmed, a CLOB or BLOB column cannot be used in the following ways:
However, DataBlade developers have the capability to create indexes on CLOB columns.
In a SELECT statement entered interactively, a CLOB or BLOB column can:
From an ESQL/C program, you can use the ifx_lo_stat() function to determine the length of CLOB or BLOB data.
Copying smart large objects
Universal Server provides functions that you can call from within an SQL statement to import and export smart large objects. Figure 9-5 shows the smart-large-object functions. For detailed information and the syntax of smart-large-object functions, see the Expression segment in the Informix Guide to SQL: Syntax.
Figure 9-5
SQL Functions for Smart Large Objects
You can use any of the functions that Figure 9-5 shows in the SELECT, UPDATE, and INSERT statements. (The following examples assume that the SBSPACENAME parameter has been specified as sbspace1 .)
Suppose you create the following inmate and fbi_list tables:
The following INSERT statement uses the FILETOBLOB() and FILETOCLOB() functions to insert a row of the inmate table.
In the preceding example, the first argument for the FILETOBLOB() and FILETOCLOB() functions specifies the path of the source file to be copied into the BLOB and CLOB columns respectively. The second argument for each function specifies whether the source file is located on the client computer ('client') or server computer ('server'). The following rules apply for specifying the path of a filename in a function argument, depending on whether the file resides on the client or server computer:
The following UPDATE statement uses the LOCOPY() function to copy BLOB data from the mugshot column of the fbi_list table into the picture column of the inmate table:
The first argument for LOCOPY() specifies the column (mugshot) from which the large object is exported. The second and third arguments specify the name of the table (fbi_list) and column (mugshot) whose storage characteristics are used for the newly created large object. After execution of the UPDATE statement, the picture column contains data from the mugshot column. Because LOCOPY() uses the storage defaults of the column that it exports, this instance of the picture column is stored in sbspace3, which is the default storage specified for the mugshot column of the fbi_list table.
The following SELECT statement uses the LOTOFILE() function to copy data from the felony column into the felon_322.txt file that is located on the client computer:
The first argument for LOTOFILE() specifies the name of the column from which data is to be exported. The second argument specifies the name of the file into which data is to be copied. The third argument specifies whether the target file is located on the client computer ('client') or server computer ('server'). (See the previous discussion for the rules that apply to specifying the path of a filename for client and server computers.)
Inheritance of characteristics for smart large objects
The database administrator can specify, at the column level, the estimated extent size for CLOB or BLOB data to override sbspace defaults. An extent is the unit of storage allocation that is used when a large object needs additional storage. For information about how to specify an extent size when you create a table, see the CREATE TABLE statement in the Informix Guide to SQL: Syntax. If the size of the smart large object is not specified at the column-level, the CLOB or BLOB column inherits characteristics from the sbspace defaults or (if sbspace defaults are not specified) from the default values in the database server.
ESQL/C programs that access CLOB or BLOB data from a row can override some column-level defaults at the time that the program creates a BLOB or CLOB instance. For information about how to override column-level defaults when you create an instance of a smart large object from an ESQL/C program, see the INFORMIX-ESQL/C Programmer's Manual.
Figure 9-6 shows the precedence rules that Universal Server uses to determine which characteristics a smart large object inherits.
For information about Universal Server defaults and sbspace defaults, see the INFORMIX-Universal Server Administrator's Guide.
Simple Large Objects: TEXT
The TEXT data type stores a block of text. It is designed to store self-contained documents: business forms, program source or data files, or memos. Although you can store any data in a TEXT item, Informix tools expect a TEXT item to be printable, so restrict this data type to printable ASCII text.
TEXT values are not stored with the rows of which they are a part. They are allocated in whole disk pages, usually in areas away from rows. (See the INFORMIX-Universal Server Administrator's Guide.)
The advantage of the TEXT data type over CHAR(n) and VARCHAR(m,r) is that the size of a TEXT data item has no limit except the capacity of disk storage to hold it. The disadvantages of the TEXT data type are as follows:
You can display TEXT values in reports that you generate with INFORMIX-4GL programs or the ACE report writer. You can display TEXT values on a screen and edit them using screen forms generated with INFORMIX-4GL programs or with the PERFORM screen-form processor.
Simple Large Objects: BYTE
The BYTE data type is designed to hold any data that a program can generate: graphic images, program object files, and formatted documents saved by any word processor or spreadsheet. The database server permits any kind of data of any length in a BYTE column.
As with TEXT, BYTE data items are stored in whole disk pages in separate disk areas from normal row data.
The advantage of the BYTE data type, as opposed to TEXT or CHAR(n), is that it accepts any data. Its disadvantages are the same as those of the TEXT data type.
Using Simple Large Objects
To store columns of a TEXT or BYTE data type, you must allocate a blobspace. A blobspace is a logical storage unit that stores TEXT and BYTE data in the most efficient way possible. Normally, you use INFORMIX-ESQL/C or NewEra programs to fetch and store TEXT and BYTE data. In such a program, you can fetch, insert, or update a simple large object value in a manner similar to the way that you read or write a sequential file.
In any SQL statement, interactive or programmed, a TEXT or BYTE column cannot be used in the following ways:
In a SELECT statement entered interactively, or in a form or report, a TEXT or BYTE column can:
In an interactive INSERT statement, you can use the VALUES clause to insert a simple-large-object value, but the only value that you can give that column is null. However, you can use the SELECT form of the INSERT statement to copy a simple large object value from another table.
In an interactive UPDATE statement, you can update a simple-large-object column to null or to a subquery that returns a simple-large-object column.
Changing the Data Type
After the table is built, you can use the ALTER TABLE statement to change the data type that is assigned to a column. Although such alterations are sometimes necessary, you should avoid them for the following reasons:
Restrictions apply for using the ALTER TABLE statement to change the data type that is assigned to a column of a table in an inheritance hierarchy. For information about altering a table in an inheritance hierarchy, see "Altering the Structure of a Table in a Table Hierarchy".
Null Values
Columns in a table can be designated as containing null values. A null value means that the value for the column can be unknown or not applicable. For example, in the telephone-directory example in Chapter 8, the anniv column of the name table can contain null values; if you do not know the person's anniversary, you do not specify it. Do not confuse null value with zero or blank value. To specify that the value of a column is null, you use the NULL keyword. For example, the following statement inserts a row into the manufact table and specifies that the value for the lead_time column is null:
Columns that are collection types cannot contain null elements. For more information, see "Collection Data Types".
Default Values
A default value is the value that is inserted into a column when an explicit value is not specified in an INSERT statement. A default value can be a literal character string that either you define or one of the following SQL null, constant expressions defines:
Not all columns need default values, but as you work with your data model, you might discover instances where the use of a default value saves data-entry time or prevents data-entry error. For example, the telephone-directory model has a State column. While you are looking at the data for this column, you discover that more than 50 percent of the addresses list California as the state. To save time, you specify the string "CA" as the default value for the State column.
Check Constraints
Check constraints specify a condition or requirement on a data value before data can be assigned to a column during an INSERT or UPDATE statement. If a row evaluates to false for any of the check constraints that are defined on a table during an insert or update, the database server returns an error. To define a constraint, use the CREATE TABLE or ALTER TABLE statements. For example, the following requirement constrains the values of an integer domain to a certain range:
To express constraints on character-based domains, use the MATCHES predicate and the regular-expression syntax that it supports. For example, the following constraint restricts a telephone domain to the form of a U.S. local telephone number:
For additional information about check constraints, see the CREATE TABLE and ALTER TABLE statements in the Informix Guide to SQL: Syntax.
Domains
A domain is an alias that you create to substitute for the name of a data type. In particular, domains provide a useful shorthand notation for collection data types that have long typenames. (For information about collection data types, see "Collection Data Types".) Once you create a domain you can use it anywhere the typename would be used. You can use a domain to specify a data type only; a domain does not have any other properties such as constraints or default values.
You cannot use the CREATE DOMAIN statement to create an alias for the following data types:
Creating a Domain
To create or drop a domain you use the CREATE DOMAIN or DROP DOMAIN statement. For example, suppose you want to use the following collection data type to define columns in different tables:
The following statement creates a domain name for the collection data type.
You can use the domain anywhere you might use the typename. For example, the following statement use the d_employee domain to define the data type of a column in a table:
For more information, see the description of the CREATE DOMAIN statement in the Informix Guide to SQL: Syntax.
Dropping a Domain
To drop a domain you use the DROP DOMAIN statement. You can drop a domain that is currently being used to specify the data type of a column. For example, suppose you want to drop the d_employee domain that was used to define a column of the department table in the preceding section. The following statement shows how to drop a domain:
After you drop a domain, any columns that currently are defined on the domain continue to retain the original data type assigned to the column. For example, consider the following sequence of SQL statements:
The first SELECT statement returns the employee column, which is of type, SET(VARCHAR(30)NOT NULL), even though the d_employee domain has been dropped.The second SELECT statement returns the employee column, which is also of type SET(VARCHAR(30)NOT NULL) even though the d_employee domain has been recreated as a different data type. As the examples illustrate, once you define a column on a domain, the type of the column does not change.
For more information, see the description of the DROP DOMAIN statement in the Informix Guide to SQL: Syntax.
To change the data type of a column defined on a domain
1. Drop the domain
2. Create a new domain
3. Use the ALTER TABLE statement to modify the column data type
For example, to change the data type of the employees column of the department table (shown in the preceding examples), you might construct the following statements:
Although execution of the DROP DOMAIN and CREATE DOMAIN statements changes the domain definition, to change the data type of the employees column, which has been defined on the d_employee domain, you must use the ALTER TABLE statement.
|