Abstract
Updated July 2024
Thanks to Andreas Ledger, RFE INFX-I-368 has been delivered, and this feature is now available in all IDS Editions from version 14.10.FC11 onwards.
Best practice has always been not to create indexes on highly duplicate data. Scanning the entire table for a common value may be quicker, and updating an index can be very costly when many pointers to rows with the same value are spread over several pages. A work-around when an index is essential in this scenario is to extend the index with a more selective column, but this obviously makes it bigger.
For the first time, a far better solution was provided in IDS 14.10.FC2. It’s known as Informix 14.10 Partial Indexes
Partial Indexes (click for the relevant page in the IBM Knowledge Center).
In this article, we will demonstrate how to identify where such indexes might be appropriate, how to create them, and how much smaller they can potentially be.
Content
A classic example is the index on the “status” column of the Sage Line 500 “opheadm” table which contains sales orders. 99% of rows will typically have the value “8” for “Invoiced” in the status column: if we wanted to retrieve all those rows, the query optimizer will probably find it quicker to scan the whole table without using any index. However, most of the time, we are looking for rows with one of the other values, for which we do need an index. The following simulates the scenario (the real table obviously has many more columns):
DROP TABLE IF EXISTS opheadm;
CREATE RAW TABLE opheadm (order_no CHAR(10), status CHAR(1));
DROP PROCEDURE IF EXISTS opheadm_insert;
CREATE PROCEDURE opheadm_insert()
DEFINE x, s INTEGER;
FOR x = 0 TO 999999
LET s = MOD(x,1000) + 1;
IF s > 7 THEN
LET s = 8;
END IF
INSERT INTO opheadm VALUES(LPAD(x,6,'0'),s);
END FOR;
END PROCEDURE;
EXECUTE PROCEDURE opheadm_insert();
CREATE INDEX ix_opheadm_status ON opheadm(status);
SELECT status, COUNT(*) AS number
FROM opheadm
GROUP BY 1
ORDER BY 1;
That returns:
status | number |
---|---|
1 | 1000 |
2 | 1000 |
3 | 1000 |
4 | 1000 |
5 | 1000 |
6 | 1000 |
7 | 1000 |
8 | 993000 |
The standard index has 2783 used pages, 2766 leaves, and 3 levels.
SQL to replace this with a partial index excluding invoiced rows is:
DROP INDEX ix_opheadm_status;
CREATE INDEX 'informix'.ix_opheadm_status ON opheadm (status)
FRAGMENT BY EXPRESSION
PARTITION part_0 (status IN ('8')) IN datadbs INDEX OFF,
PARTITION part_1 REMAINDER IN datadbs ONLINE;
- Stating the dbspace name before “INDEX OFF” is recommended as it’s necessary for mixed page sizes: see Caveats.
- Partition names can be whatever you like instead of “part_0” and “part_1” within the normal Informix object name rules.
The new index has 24 used pages, 22 leaves, and 2 levels.
The following function helps you identify indexes that could be candidates, and generates SQL to replace them with partial indexes:
It checks all duplicate indexes on single columns in a specified database to see if there are values accounting for more than a given percentage of rows in the table. For example, create and populate the example table as described in the IBM Knowledge Center Partial Indexes page, but with this index:
CREATE INDEX 'informix'.idx2 ON tab1 (n2);
Then run the new function:
EXECUTE FUNCTION sp_partial_indexes(10); -- 10% threshold, current database
SQL generated:
DROP INDEX idx2; -- 33 leaves, 2 levels
CREATE INDEX 'informix'.idx2 ON tab1 (n2)
FRAGMENT BY EXPRESSION
PARTITION part_0 (n2 IS NULL) IN datadbs INDEX OFF,
PARTITION part_1 REMAINDER IN datadbs ONLINE;
The number of leaves and levels is shown so that you can decide which indexes are big enough to matter.
Run on the standard demo database:
EXECUTE FUNCTION sp_partial_indexes(50,'stores_demo'); -- 50% threshold
SQL generated:
RENAME INDEX 108_20 TO ix_108_20; -- system generated constraint index names begin with space
DROP INDEX ix_108_20; -- 1 leaves, 1 levels
CREATE INDEX 'informix'.ix_108_20 ON cust_calls (call_code)
FRAGMENT BY EXPRESSION
PARTITION part_0 (call_code IN ('I')) IN datadbs INDEX OFF,
PARTITION part_1 REMAINDER IN datadbs ONLINE;
The extra first statement above will be some help in dealing with indexes implied by primary or foreign key constraints, whose actual name begins with a space to prevent alteration: see the RENAME INDEX documentation page.
However, CREATE INDEX still fails with error -350 “Index already exists on the column” as the index has not in fact been dropped but only hidden again. The following is a complete solution using more meaningful object names:
ALTER TABLE cust_calls DROP CONSTRAINT r108_20; -- also drops the hidden index
CREATE INDEX 'informix'.ix_cust_calls_call_code ON cust_calls (call_code)
FRAGMENT BY EXPRESSION
PARTITION part_0 (call_code IN ('I')) IN datadbs INDEX OFF,
PARTITION part_1 REMAINDER IN datadbs ONLINE;
ALTER TABLE cust_calls ADD
CONSTRAINT FOREIGN KEY (call_code) REFERENCES call_type
CONSTRAINT 'informix'.fk_cust_calls_call_code NOVALIDATE;
As you can see, the SQL generated is only a guide, and you may well need to edit the results, as well as experimenting with different thresholds.
Caveats
Tables comprised of multiple fragments (partitions) is part of the parallelisation features reserved for Enterprise Edition. Unfortunately, Informix 14.10 Partial Indexes have been implemented using the same FRAGMENT BY EXPRESSION (or PARTITION) syntax, and this is rejected on lower editions (except Developer) with error 26453 “Fragmentation is not supported in this edition of IDS” prior to version 14.10.FC11 (resolved by RFE INFX-I-368).
This article was updated on 18th August 2020 with new information from Roland Wintgen who has opened a case with IBM. A problem occurs if the database or table is in a dbspace with a non-default page size. For example, in an instance with 2KB pages by default, if you create a 4KB page dbspace “data4kb” then a new database in it, the following simpler form of SQL for one of our examples produces an error:
CREATE INDEX 'informix'.ix_opheadm_status ON opheadm (status)
FRAGMENT BY EXPRESSION
(status IN ('8')) INDEX OFF,
REMAINDER IN data4kb ONLINE;
-- SQL Error -26015: All fragments of the table or index need to be of same pagesize.
You have to declare a dbspace name for both fragments in this situation. Perhaps it otherwise defaults to the root dbspace, causing the error. However, it still fails:
CREATE INDEX 'informix'.ix_opheadm_status ON opheadm (status)
FRAGMENT BY EXPRESSION
(status IN ('8')) IN data4kb INDEX OFF,
REMAINDER IN data4kb ONLINE;
-- SQL Error -858: Cannot specify the same partition/space name twice in a fragmentation specification.
The solution is also to name the fragments (aka. partitions):
CREATE INDEX 'informix'.ix_opheadm_status ON opheadm (status)
FRAGMENT BY EXPRESSION
PARTITION part_0 (status IN ('8')) IN data4kb INDEX OFF,
PARTITION part_1 REMAINDER IN data4kb ONLINE;
The stored procedure and examples in this article have been amended accordingly. If accepted as a bug by IBM, we will provide further updates here with the defect number and IDS version containing the fix when either are available. Meanwhile, the above is a full work-around.
Conclusion
If Informix 14.10 Partial Indexes are available in your version and edition, they may save you considerable disk space and make reads, inserts and deletes lighter.
Disclaimer
Suggestions above are provided “as is” without warranty of any kind, either express or implied, including without limitation any implied warranties of condition, uninterrupted use, merchantability, fitness for a particular purpose, or non-infringement.
Contact us
If you have any questions regarding Informix 14.10 Partial Indexes or would like to find out more, simply contact us.