Abstract
Indexes can be created on the result of a function applied to columns in a table. That can be useful in a variety of ways, such as fast case-insensitive searching on character data. There is also a particular solution explored using built-in function “ifx_checksum” for very compact indexes on columns wider than normally allowed.
Content
Examples in this article were produced using the provided “stores_demo” database in a Docker container from informix-dockerhub with Informix Dynamic Server (IDS) updated to 14.10.FC11.
Functional indexes are documented here. For example, you might want to search rapidly through a character column beginning with supplied text in either lower or upper case (without installing the Basic Text Search extension). The following table was created to test this, with alternate rows upshifted:
SELECT company AS name FROM customer INTO company;
UPDATE company SET name = UPPER(name) WHERE MOD(ROWID,2) = 0;
This SQL uses built-in function UPPER to get the required results:
SET EXPLAIN ON; -- query plan appended to ~/sqexplain.out
SELECT * FROM company WHERE UPPER(name) LIKE 'SPORT%';
SET EXPLAIN OFF;
The results are:
SPORTS SPOT
Sport Stuff
Sports Center
Sportstown
SPORTING PLACE
However, the query plan is:
1) informix.company: SEQUENTIAL SCAN
Filters: UPPER(informix.company.name ) LIKE 'SPORT%'
That would be a problem on a large table as it would scan all rows.
You might then try and fail to create an index on the built-in function:
CREATE INDEX ix_company ON company (UPPER(name));
9844: Invalid function (upper) used in a functional key.
You can only create an index on a user-defined function. This can be a stored procedure language (SPL) function, so we can create a simple wrapper function:
CREATE FUNCTION sp_upper(data VARCHAR(255))
RETURNING VARCHAR(255) AS data
WITH (NOT VARIANT);
RETURN UPPER(TRIM(data));
END FUNCTION;
- You must declare the function as invariant, meaning it will always return the same output for given input.
- Index keys have a limit on the total number of bytes, so set a sensible length on the input and output.
- Use TRIM in case the intput data is CHAR to discard trailing spaces.
We can now create a functional index:
CREATE INDEX ix_company ON company (sp_upper(name));
Retesting as follows returns the same results:
SET EXPLAIN ON; -- query plan appended to ~/sqexplain.out
SELECT * FROM company WHERE sp_upper(name) LIKE 'SPORT%';
SET EXPLAIN OFF;
This time, the query plan contains:
1) informix.company: INDEX PATH
(1) Index Name: informix.ix_company
Index Keys: informix.sp_upper(name) (Serial, fragments: ALL)
Lower Index Filter: informix.sp_upper(informix.company.name )LIKE 'SPORT%'
Having seen the basics of functional indexes for fast case-insensitive queries, we can now look at the second purpose of this article. What if we need to index a wide column set which might require excessive disk space and/or exceed the allowed limits? The following example on the “stores_demo” database is based on a real case where a unique index was reduced from 14GB to 1GB.
The “stores_demo” database contains a table with the following schema:
CREATE TABLE warehouses
(
warehouse_name LVARCHAR,
warehouse_id INT,
warehouse_spec LVARCHAR
);
Note that LVARCHAR is equivalent to LVARCHAR(1024).
Firstly, we might want to make “warehouse_id” a unique SERIAL number as probably intended:
ALTER TABLE warehouses MODIFY
(
warehouse_id SERIAL NOT NULL PRIMARY KEY
);
Let’s say we then want to make “warehouse_name” unique. The obvious solution fails:
CREATE UNIQUE INDEX ix_warehouse_name ON warehouses (warehouse_name);
-517: The total size of the index is too large or too many parts in index.
More details on any error number can be obtained with “finderr” as in this case:
$ finderr -517
-517 The total size of the index is too large or too many parts in index.
Informix Dynamic Server has limits on the number of columns that can be
included in an index and on the total number of bytes in a key (the sum
of the widths of the columns). This CREATE INDEX statement would exceed
that limit for this database server. Informix Dynamic Server
allows 16 key parts (columnar or functional) and a width of 390 bytes on
a 2K page platform or in a 2K page dbspace. On 4K page platforms or using a
dbspace with a non-default page size allows greater than 390 bytes for the
width of indexed columns.
Fortunately, there is a built-in function “ifx_checksum” we can use that is described here:
https://docs.deistercloud.com/content/Databases.30/IBM Informix.2/Tips/Checksum.xml?embedded=true
IBM and HCL Informix documentation refer to it in the context of Enterprise Replication, but does not otherwise cover it.
We can create the index with:
DROP INDEX IF EXISTS ix_warehouse_name;
DROP FUNCTION IF EXISTS sp_checksum(LVARCHAR);
CREATE FUNCTION sp_checksum(data LVARCHAR)
RETURNING INT AS checksum
WITH (NOT VARIANT);
/*
Function wrapper for built-in "ifx_checksum"
to index an LVARCHAR column
Doug Lawry, Oninit Consulting, December 2024
*/
RETURN ifx_checksum(data,0);
END FUNCTION;
CREATE INDEX ix_warehouse_name ON warehouses (sp_checksum(warehouse_name)) ONLINE;
- The index cannot be unique as there could be checksum collisions, so uniqueness must be enforced by triggers.
- IDS 11.70+ supports the ONLINE keyword to create an index while the table is in use.
If you are creating more than one checksum index, you might need different variations of “sp_checksum” with varying data types and numbers of parameters. Those with more than one would have a combined checksum expression as in this example:
RETURN
ifx_checksum(data1,
ifx_checksum(data2,
ifx_checksum(data3,0)));
Once you have functional indexes, you will be prevented from dropping the function:
DROP FUNCTION IF EXISTS sp_checksum(LVARCHAR);
705: Cannot drop/modify procedure (informix.sp_checksum). It is currently in use.
Experience shows that a separate function to check rapidly whether a row already exists is more useful in a separate function, so it can be used both in triggers and applications:
DROP FUNCTION IF EXISTS sp_warehouses_lookup;
CREATE FUNCTION sp_warehouses_lookup
(
p_warehouse_name LVARCHAR,
p_except INT DEFAULT 0
)
RETURNING INT AS id;
/*
Function to get serial primary key value
using ifx_checksum functional index for speed
but also comparing actual values for exactness
Doug Lawry, Oninit Consulting, December 2024
*/
DEFINE l_warehouse_id INT;
LET l_warehouse_id = NULL;
-- Fetch first row with specified value
SELECT {+INDEX(w ix_warehouse_name)}
FIRST 1 warehouse_id
INTO l_warehouse_id
FROM warehouses AS w
WHERE
sp_checksum(warehouse_name) =
sp_checksum(p_warehouse_name)
AND warehouse_name = p_warehouse_name
AND warehouse_id != p_except;
RETURN l_warehouse_id;
END FUNCTION;
- There should be a parameter for each column in the index (one in our example).
- An optional last parameter is needed for triggers to ignore the row being checked.
- Optimizer directive +INDEX ensures the checksum index will be used.
- The warehouse name must match as well as the checksum in case of collision.
- A third condition is present to ignore the specified row if provided for triggers.
We can now create the trigger procedure:
DROP PROCEDURE IF EXISTS sp_warehouses_unique;
CREATE PROCEDURE sp_warehouses_unique()
REFERENCING OLD AS o NEW AS n
FOR warehouses;
/*
Trigger procedure to enforce uniqueness
using function to look up any duplicate
Doug Lawry, Oninit Consulting, December 2024
*/
IF
sp_warehouses_lookup
(
n.warehouse_name,
n.warehouse_id
)
IS NOT NULL
THEN
-- Any exception will roll it back
IF INSERTING THEN
RAISE EXCEPTION -239, -100, 'ix_warehouses_checksum';
ELSE
RAISE EXCEPTION -346, -100;
END IF
END IF
END PROCEDURE;
The triggers are then as follows:
DROP TRIGGER IF EXISTS ins_warehouses;
DROP TRIGGER IF EXISTS upd_warehouses;
CREATE TRIGGER ins_warehouses INSERT ON warehouses
FOR EACH ROW
(
EXECUTE PROCEDURE sp_warehouses_unique()
WITH TRIGGER REFERENCES
);
CREATE TRIGGER upd_warehouses UPDATE ON warehouses
FOR EACH ROW
(
EXECUTE PROCEDURE sp_warehouses_unique()
WITH TRIGGER REFERENCES
);
TIP: set undocumented parameter NONEXCLTRIG so you can drop or create triggers while the table is in use.
You should now find that you cannot create duplicates in “warehouse_name”:
INSERT INTO warehouses (warehouse_name) VALUES ('Liverpool, England');
239: Could not insert new row - duplicate value in a UNIQUE INDEX column
100: ISAM error: duplicate value for a record with unique key.
UPDATE warehouses SET (warehouse_name) = ('Liverpool, England')
WHERE warehouse_name = 'Seattle, USA';
346: Could not update a row in the table.
100: ISAM error: duplicate value for a record with unique key.
Conclusion
The techniques in this article can provide fast fuzzier searches, and massively reduce the space required to index wide column sets.
Disclaimer
Suggestions above are provided “as is” without warranty of any kind, either express or implied, including without limitation any implied warranties of condition, uninterrupted use, merchantability, fitness for a particular purpose, or non-infringement.
Contact Us
If you have any questions or would like to find out more about this topic, please contact us.