Abstract
When testing a database migration to a new platform that is not compatible, for example with a different OS or locale, you cannot restore from a level 0 archive on the new server. Data must be copied either via flat files or directly using SQL tools. If this is from a live database that is being updated and cannot be exclusively locked (which rules out dbexport), you cannot guarantee consistency of the data being reloaded, so NOVALIDATE must be set to ensure foreign keys can be reapplied.
This article provides stored procedure language (SPL) code to identify and delete any resulting invalid data.
Content
The following SQL generator was posted in the dbexport alternative thread on the IBM Informix community forum in May 2021, the only change below being that PROCEDURE has been replaced with FUNCTION which is more correct if data is returned:
The foreign key hierarchy is traversed for the entire current database. You will need to create the function in every database that needs to be checked: table names are assumed to be in the database containing the stored procedure or function unless explicitly stated otherwise, and not in the database to which the calling session is connected.
The main FOREACH SELECT statement examines system catalog tables within the database being checked as follows:
Alias | Table | Description |
---|---|---|
dc | sysconstraints | Foreign key constraint (FK) |
do | sysobjstate | FK object state |
dt | systables | FK table |
di | sysindices | FK index |
dr | sysreferences | Constraint relationship |
pc | sysconstraints | Primary key constraint (PK) |
pt | systables | PK table |
pi | sysindices | PK index |
Criterion | Description |
---|---|
dc.constrtype = 'R' | Referential constraint (FK) |
do.objtype = 'C' | Constraint object |
do.state = 'E' | Enabled (remove to include all) |
As well as the necessary object names and table IDs, “sysindices.indexkeys” are selected for both the foreign and primary key. We are using table sysindices which contains the column numbers in an index as an array so is easier to use here, rather than the more commonly used sysindexes view which represents the array as 16 separate columns. We can then loop through the 16 possible elements using an undocumented internal function with this specification:
Overall usage is as given in the source code comments:
UNLOAD is implemented client-side, so will only work in dbaccess or AGS Server Studio. If using other SQL clients, you will need to save the result set in such a way that semi-colons are added to each line, or edit the SPL function to do that.
Note also the under-documented trick to treat function output as a sub-query with:
The file produced by executing this in the standard “stores_demo” database created by dbaccessdemo contains:
There are repeating groups of 4 statements per foreign key. An example group with linefeeds inserted for readability follows:
The first statement identifies rows in the foreign key table for which the column values are not present in the corresponding primary key, and uses the technique described here:
https://www.oninitgroup.com/faq-items/sub-queries-a-faster-alternative-to-not-exists
NULL values in foreign key columns are exempted from referential checking, so are excluded with IS NOT NULL criteria.
The ROWIDs are saved in a temporary table, which is then used to:
- save row contents in standard pipe-delimited UNLOAD format to file “table-name.foreign-key-name.unl”;
- delete those rows.
WITH NO LOG is appended as a precaution, though you should always have configuration parameter TEMPTAB_NOLOG set to 1, making this the default anyway. This does not cause any error if the database is not logged.
The temporary table is then dropped as the name will be reused in the next repeating group of statements.
Should you wish to report invalid data without fixing it, the RETURN of delete statements can be commented out in the code, or lines beginning with DELETE removed from what is generated afterwards.
The result can then be executed to save a flat file per foreign key. Empty files can be ignored, or removed on Unix or Linux with:
If you decided not include DELETE statements, file contents can be reviewed before fixing data manually.
Caveats
The code assumes ROWID is available for all tables involved. If you have Enterprise Edition and are using tables with multiple partitions, WITH ROWIDS must be applied to them. Alternatively, if every table contains a SERIAL, SERIAL8 or BIGSERIAL column, the SPL function could be recoded to use those instead. Another work-around from IDS 14.10.xC6 onwards would be to use the undocumented IFX_ROW_ID pseudo-column: queries on them no longer sequentially scan the table.
Conclusion
The tool provided in this article means that a database can be copied while fully in use, and any resulting inconsistencies in referential integrity identified and fixed afterwards as necessary.
References
Documentation links not shown in full in this article can be queried at the new IBM Informix 14.10 site:
Disclaimer
Suggestions above are provided “as is” without warranty of any kind, either express or implied, including without limitation any implied warranties of condition, uninterrupted use, merchantability, fitness for a particular purpose, or non-infringement.