Functional indexes and ifx_checksum

Abstract

Indexes can be created on the result of a function applied to columns in a table. That can be useful in a variety of ways, such as fast case-insensitive searching on character data. There is also a particular solution explored using built-in function “ifx_checksum” for very compact indexes on columns wider than normally allowed.

Content

Examples in this article were produced using the provided “stores_demo” database in a Docker container from informix-dockerhub with Informix Dynamic Server (IDS) updated to 14.10.FC11.

Functional indexes are documented here. For example, you might want to search rapidly through a character column beginning with supplied text in either lower or upper case (without installing the Basic Text Search extension). The following table was created to test this, with alternate rows upshifted:

Copy to Clipboard

SELECT company AS name FROM customer INTO company;
UPDATE company SET name = UPPER(name) WHERE MOD(ROWID,2) = 0;

 
SELECT company AS name FROM customer INTO company;UPDATE company SET name = UPPER(name) WHERE MOD(ROWID,2) = 0;

This SQL uses built-in function UPPER to get the required results:

Copy to Clipboard

SET EXPLAIN ON; -- query plan appended to ~/sqexplain.out
SELECT * FROM company WHERE UPPER(name) LIKE 'SPORT%';
SET EXPLAIN OFF;

 
SET EXPLAIN ON; -- query plan appended to ~/sqexplain.outSELECT * FROM company WHERE UPPER(name) LIKE 'SPORT%';SET EXPLAIN OFF;

The results are:

Copy to Clipboard

SPORTS SPOT
Sport Stuff
Sports Center
Sportstown
SPORTING PLACE

 
SPORTS SPOTSport StuffSports CenterSportstownSPORTING PLACE

However, the query plan is:

Copy to Clipboard

  1) informix.company: SEQUENTIAL SCAN

        Filters: UPPER(informix.company.name ) LIKE 'SPORT%'

​x
 
  1) informix.company: SEQUENTIAL SCAN​        Filters: UPPER(informix.company.name ) LIKE 'SPORT%'

That would be a problem on a large table as it would scan all rows.

You might then try and fail to create an index on the built-in function:

Copy to Clipboard

CREATE INDEX ix_company ON company (UPPER(name));

9844: Invalid function (upper) used in a functional key.

 
CREATE INDEX ix_company ON company (UPPER(name));​9844: Invalid function (upper) used in a functional key.

You can only create an index on a user-defined function. This can be a stored procedure language (SPL) function, so we can create a simple wrapper function:

Copy to Clipboard

CREATE FUNCTION sp_upper(data VARCHAR(255))
RETURNING VARCHAR(255) AS data
WITH (NOT VARIANT);
    RETURN UPPER(TRIM(data));
END FUNCTION;

 
CREATE FUNCTION sp_upper(data VARCHAR(255))RETURNING VARCHAR(255) AS dataWITH (NOT VARIANT);    RETURN UPPER(TRIM(data));END FUNCTION;

You must declare the function as invariant, meaning it will always return the same output for given input.
Index keys have a limit on the total number of bytes, so set a sensible length on the input and output.
Use TRIM in case the intput data is CHAR to discard trailing spaces.

We can now create a functional index:

Copy to Clipboard

CREATE INDEX ix_company ON company (sp_upper(name));

CREATE INDEX ix_company ON company (sp_upper(name));

Retesting as follows returns the same results:

Copy to Clipboard

SET EXPLAIN ON; -- query plan appended to ~/sqexplain.out
SELECT * FROM company WHERE sp_upper(name) LIKE 'SPORT%';
SET EXPLAIN OFF;

 
SET EXPLAIN ON; -- query plan appended to ~/sqexplain.outSELECT * FROM company WHERE sp_upper(name) LIKE 'SPORT%';SET EXPLAIN OFF;

This time, the query plan contains:

Copy to Clipboard

  1) informix.company: INDEX PATH

    (1) Index Name: informix.ix_company
        Index Keys: informix.sp_upper(name)   (Serial, fragments: ALL)
        Lower Index Filter: informix.sp_upper(informix.company.name )LIKE 'SPORT%'

 
  1) informix.company: INDEX PATH​    (1) Index Name: informix.ix_company        Index Keys: informix.sp_upper(name)   (Serial, fragments: ALL)        Lower Index Filter: informix.sp_upper(informix.company.name )LIKE 'SPORT%'

Having seen the basics of functional indexes for fast case-insensitive queries, we can now look at the second purpose of this article. What if we need to index a wide column set which might require excessive disk space and/or exceed the allowed limits? The following example on the “stores_demo” database is based on a real case where a unique index was reduced from 14GB to 1GB.

The “stores_demo” database contains a table with the following schema:

Copy to Clipboard

CREATE TABLE warehouses
(
    warehouse_name   LVARCHAR,
    warehouse_id     INT,
    warehouse_spec   LVARCHAR
);

 
CREATE TABLE warehouses(    warehouse_name   LVARCHAR,    warehouse_id     INT,    warehouse_spec   LVARCHAR);

Note that LVARCHAR is equivalent to LVARCHAR(1024).

Firstly, we might want to make “warehouse_id” a unique SERIAL number as probably intended:

Copy to Clipboard

ALTER TABLE warehouses MODIFY
(
    warehouse_id     SERIAL NOT NULL PRIMARY KEY
);

 
ALTER TABLE warehouses MODIFY(    warehouse_id     SERIAL NOT NULL PRIMARY KEY);

Let’s say we then want to make “warehouse_name” unique. The obvious solution fails:

Copy to Clipboard

CREATE UNIQUE INDEX ix_warehouse_name ON warehouses (warehouse_name);

-517: The total size of the index is too large or too many parts in index.

 
CREATE UNIQUE INDEX ix_warehouse_name ON warehouses (warehouse_name);​-517: The total size of the index is too large or too many parts in index.

More details on any error number can be obtained with “finderr” as in this case:

Copy to Clipboard

$ finderr -517
-517    The total size of the index is too large or too many parts in index.

Informix Dynamic Server has limits on the number of columns that can be
included in an index and on the total number of bytes in a key (the sum
of the widths of the columns). This CREATE INDEX statement would exceed
that limit for this database server. Informix Dynamic Server
allows 16 key parts (columnar or functional) and a width of 390 bytes on
a 2K page platform or in a 2K page dbspace. On 4K page platforms or using a
dbspace with a non-default page size allows greater than 390 bytes for the
width of indexed columns.

 
$ finderr -517-517    The total size of the index is too large or too many parts in index.​Informix Dynamic Server has limits on the number of columns that can beincluded in an index and on the total number of bytes in a key (the sumof the widths of the columns). This CREATE INDEX statement would exceedthat limit for this database server. Informix Dynamic Serverallows 16 key parts (columnar or functional) and a width of 390 bytes ona 2K page platform or in a 2K page dbspace. On 4K page platforms or using adbspace with a non-default page size allows greater than 390 bytes for thewidth of indexed columns.

Fortunately, there is a built-in function “ifx_checksum” we can use that is described here:

https://docs.deistercloud.com/content/Databases.30/IBM Informix.2/Tips/Checksum.xml?embedded=true

IBM and HCL Informix documentation refer to it in the context of Enterprise Replication, but does not otherwise cover it.

We can create the index with:

Copy to Clipboard

DROP INDEX    IF EXISTS ix_warehouse_name;
DROP FUNCTION IF EXISTS sp_checksum(LVARCHAR);

CREATE FUNCTION sp_checksum(data LVARCHAR)
    RETURNING INT AS checksum
    WITH (NOT VARIANT);
/*
    Function wrapper for built-in "ifx_checksum"
    to index an LVARCHAR column

    Doug Lawry, Oninit Consulting, December 2024
*/
    RETURN ifx_checksum(data,0);

END FUNCTION;

CREATE INDEX ix_warehouse_name ON warehouses (sp_checksum(warehouse_name)) ONLINE;

 
DROP INDEX    IF EXISTS ix_warehouse_name;DROP FUNCTION IF EXISTS sp_checksum(LVARCHAR);​CREATE FUNCTION sp_checksum(data LVARCHAR)    RETURNING INT AS checksum    WITH (NOT VARIANT);/*    Function wrapper for built-in "ifx_checksum"    to index an LVARCHAR column​    Doug Lawry, Oninit Consulting, December 2024*/    RETURN ifx_checksum(data,0);​END FUNCTION;​CREATE INDEX ix_warehouse_name ON warehouses (sp_checksum(warehouse_name)) ONLINE;

The index cannot be unique as there could be checksum collisions, so uniqueness must be enforced by triggers.
IDS 11.70+ supports the ONLINE keyword to create an index while the table is in use.

If you are creating more than one checksum index, you might need different variations of “sp_checksum” with varying data types and numbers of parameters. Those with more than one would have a combined checksum expression as in this example:

Copy to Clipboard

    RETURN
        ifx_checksum(data1,
        ifx_checksum(data2,
        ifx_checksum(data3,0)));

 
    RETURN        ifx_checksum(data1,        ifx_checksum(data2,        ifx_checksum(data3,0)));

Once you have functional indexes, you will be prevented from dropping the function:

Copy to Clipboard

DROP FUNCTION IF EXISTS sp_checksum(LVARCHAR);

705: Cannot drop/modify procedure (informix.sp_checksum). It is currently in use.

 
DROP FUNCTION IF EXISTS sp_checksum(LVARCHAR);​705: Cannot drop/modify procedure (informix.sp_checksum). It is currently in use.

Experience shows that a separate function to check rapidly whether a row already exists is more useful in a separate function, so it can be used both in triggers and applications:

Copy to Clipboard

DROP FUNCTION IF EXISTS sp_warehouses_lookup;

CREATE FUNCTION sp_warehouses_lookup
(
    p_warehouse_name    LVARCHAR,
    p_except            INT DEFAULT 0
)
    RETURNING INT AS id;
/*
    Function to get serial primary key value
    using ifx_checksum functional index for speed
    but also comparing actual values for exactness

    Doug Lawry, Oninit Consulting, December 2024
*/
    DEFINE l_warehouse_id INT;

    LET l_warehouse_id = NULL;

    -- Fetch first row with specified value
    SELECT {+INDEX(w ix_warehouse_name)}
    FIRST 1 warehouse_id
    INTO l_warehouse_id
    FROM warehouses AS w
    WHERE
        sp_checksum(warehouse_name) =
        sp_checksum(p_warehouse_name)
    AND warehouse_name = p_warehouse_name
    AND warehouse_id != p_except;

    RETURN l_warehouse_id;

END FUNCTION;

 
DROP FUNCTION IF EXISTS sp_warehouses_lookup;​CREATE FUNCTION sp_warehouses_lookup(    p_warehouse_name    LVARCHAR,    p_except            INT DEFAULT 0)    RETURNING INT AS id;/*    Function to get serial primary key value    using ifx_checksum functional index for speed    but also comparing actual values for exactness​    Doug Lawry, Oninit Consulting, December 2024*/    DEFINE l_warehouse_id INT;​    LET l_warehouse_id = NULL;​    -- Fetch first row with specified value    SELECT {+INDEX(w ix_warehouse_name)}    FIRST 1 warehouse_id    INTO l_warehouse_id    FROM warehouses AS w    WHERE        sp_checksum(warehouse_name) =        sp_checksum(p_warehouse_name)    AND warehouse_name = p_warehouse_name    AND warehouse_id != p_except;​    RETURN l_warehouse_id;​END FUNCTION;

There should be a parameter for each column in the index (one in our example).
An optional last parameter is needed for triggers to ignore the row being checked.
Optimizer directive +INDEX ensures the checksum index will be used.
The warehouse name must match as well as the checksum in case of collision.
A third condition is present to ignore the specified row if provided for triggers.

We can now create the trigger procedure:

Copy to Clipboard

DROP PROCEDURE IF EXISTS sp_warehouses_unique;

CREATE PROCEDURE sp_warehouses_unique()
    REFERENCING OLD AS o NEW AS n
    FOR warehouses;
/*
    Trigger procedure to enforce uniqueness
    using function to look up any duplicate

    Doug Lawry, Oninit Consulting, December 2024
*/
    IF
        sp_warehouses_lookup
        (
            n.warehouse_name,
            n.warehouse_id
        )
        IS NOT NULL
    THEN
        -- Any exception will roll it back
        IF INSERTING THEN
            RAISE EXCEPTION -239, -100, 'ix_warehouses_checksum';
        ELSE
            RAISE EXCEPTION -346, -100;
        END IF
    END IF

END PROCEDURE;

 
DROP PROCEDURE IF EXISTS sp_warehouses_unique;​CREATE PROCEDURE sp_warehouses_unique()    REFERENCING OLD AS o NEW AS n    FOR warehouses;/*    Trigger procedure to enforce uniqueness    using function to look up any duplicate​    Doug Lawry, Oninit Consulting, December 2024*/    IF        sp_warehouses_lookup        (            n.warehouse_name,            n.warehouse_id        )        IS NOT NULL    THEN        -- Any exception will roll it back        IF INSERTING THEN            RAISE EXCEPTION -239, -100, 'ix_warehouses_checksum';        ELSE            RAISE EXCEPTION -346, -100;        END IF    END IF​END PROCEDURE;

The triggers are then as follows:

Copy to Clipboard

DROP TRIGGER IF EXISTS ins_warehouses;
DROP TRIGGER IF EXISTS upd_warehouses;

CREATE TRIGGER ins_warehouses INSERT ON warehouses
FOR EACH ROW
(
    EXECUTE PROCEDURE sp_warehouses_unique()
        WITH TRIGGER REFERENCES
);

CREATE TRIGGER upd_warehouses UPDATE ON warehouses
FOR EACH ROW
(
    EXECUTE PROCEDURE sp_warehouses_unique()
        WITH TRIGGER REFERENCES
);

 
DROP TRIGGER IF EXISTS ins_warehouses;DROP TRIGGER IF EXISTS upd_warehouses;​CREATE TRIGGER ins_warehouses INSERT ON warehousesFOR EACH ROW(    EXECUTE PROCEDURE sp_warehouses_unique()        WITH TRIGGER REFERENCES);​CREATE TRIGGER upd_warehouses UPDATE ON warehousesFOR EACH ROW(    EXECUTE PROCEDURE sp_warehouses_unique()        WITH TRIGGER REFERENCES);

TIP: set undocumented parameter NONEXCLTRIG so you can drop or create triggers while the table is in use.

You should now find that you cannot create duplicates in “warehouse_name”:

Copy to Clipboard

INSERT INTO warehouses (warehouse_name) VALUES ('Liverpool, England');

239: Could not insert new row - duplicate value in a UNIQUE INDEX column
100: ISAM error: duplicate value for a record with unique key.

 
INSERT INTO warehouses (warehouse_name) VALUES ('Liverpool, England');​239: Could not insert new row - duplicate value in a UNIQUE INDEX column100: ISAM error: duplicate value for a record with unique key.

Copy to Clipboard

UPDATE warehouses SET (warehouse_name) = ('Liverpool, England')
WHERE warehouse_name = 'Seattle, USA';

346: Could not update a row in the table.
100: ISAM error:  duplicate value for a record with unique key.

 
UPDATE warehouses SET (warehouse_name) = ('Liverpool, England')WHERE warehouse_name = 'Seattle, USA';​346: Could not update a row in the table.100: ISAM error:  duplicate value for a record with unique key.

Conclusion

The techniques in this article can provide fast fuzzier searches, and massively reduce the space required to index wide column sets.

Disclaimer

Suggestions above are provided “as is” without warranty of any kind, either express or implied, including without limitation any implied warranties of condition, uninterrupted use, merchantability, fitness for a particular purpose, or non-infringement.