Updated December 2021
Abstract
This feature, documented here, was introduced with IBM Informix Dynamic Server (IDS) version 14.10. It is an enhancement to Enterprise Replication (ER) that allows data changes to be handled by a stored procedure rather than applied directly to a replica table. It will typically be used in a Loopback Replication arrangement, so that the stored procedure fired is in the same database as the source table.
Such a replicate is also known as an “asynchronous post-commit trigger”: unlike a conventional trigger, it runs in the background with respect to the calling application and fires after the transaction has been committed. The ER infrastructure should ensure reliable buffering and delivery even if a backlog builds up or the server is restarted.
This article gives an example of how this might be useful to perform whatever type of action you wish when data changes have been committed.
Content
Links in this article not shown in full are for pages within the new IBM Documentation site:
https://www.ibm.com/docs/en/informix-servers/14.10
Included ER “cdr” (continuous data replication) commands are clickable documentation links and not described in detail for brevity.
Relevant material is here:
https://github.com/nagaraju-inturi/informix-async-postcommit-triggers
The test system used for this article is an IDS 14.10.FC5 Developer Edition Linux Docker container created with the image available from:
https://hub.docker.com/r/ibmcom/informix-developer-database
See also our article on Loopback Replication which contains pre-requisite instructions.
The documentation page for Replication to SPL routine describes two alternative usages that require target procedures with very different parameter lists:
- one per replicated column plus additional control columns;
- a single JSON document.
Simple tests of both follow, using the “stores_demo” database provided with IDS and installed with the “dbaccessdemo” command. In both cases, changes to a table will be written to a staging/audit table.
Example SQL to set up database objects for the first approach is:
```sql
CREATE TABLE state_log (
    op_type     CHAR(1),
    source_id   INT,
    commit_time INT,
    txn_id      BIGINT,
    userid      INT,
    session_id  INT,
    old_code    CHAR(2),
    old_sname   CHAR(15),
    new_code    CHAR(2),
    new_sname   CHAR(15)
);

CREATE PROCEDURE sp_state_log (
    op_type     CHAR(1),
    source_id   INT,
    commit_time INT,
    txn_id      BIGINT,
    userid      INT,
    session_id  INT,
    old_code    CHAR(2),
    old_sname   CHAR(15),
    new_code    CHAR(2),
    new_sname   CHAR(15)
)
    INSERT INTO state_log VALUES (
        op_type, source_id, commit_time, txn_id, userid, session_id,
        old_code, old_sname, new_code, new_sname
    );
END PROCEDURE;
```
Note that “userid” and “session_id” were added in IDS 14.10.FC6 and should be removed from the above for earlier versions. The rest of this article has example output from FC5 but requires no code changes.
- That’s rather laborious: the control columns have to be listed 3 times and the columns in the original table 6 times.
- Parameter types must conform to the documentation/schema and be in the right order, whereas their names are up to you.
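Because the column lists are so repetitive, it can be worth generating the DDL rather than typing it. The following is a hypothetical Python sketch (not part of the feature) that emits the staging table and target procedure for the "state" table; the column names and types are those used above, with the FC6-only "userid"/"session_id" control columns omitted for brevity:

```python
# Hypothetical DDL generator for the column-parameter style of target
# procedure: control columns and table columns are typed only once here,
# then repeated as the schema requires.
control = [("op_type", "CHAR(1)"), ("source_id", "INT"),
           ("commit_time", "INT"), ("txn_id", "BIGINT")]
table_cols = [("code", "CHAR(2)"), ("sname", "CHAR(15)")]

cols = (control
        + [("old_" + n, t) for n, t in table_cols]
        + [("new_" + n, t) for n, t in table_cols])
decls = ",\n    ".join(f"{n} {t}" for n, t in cols)
names = ", ".join(n for n, _ in cols)

ddl = (f"CREATE TABLE state_log (\n    {decls}\n);\n"
       f"CREATE PROCEDURE sp_state_log (\n    {decls}\n)\n"
       f"    INSERT INTO state_log VALUES ({names});\n"
       f"END PROCEDURE;")
print(ddl)
```

Adapting it for another table is a matter of changing the `table_cols` list.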
Shell commands to define and start the replicate are:
```shell
cdr define replicate repl_state -C always -S row -A -R \
    --splname=sp_state_log \
    "P stores_demo@g_informix:informix.state" "select * from informix.state" \
    "R stores_demo@g_loopback:informix.state" "select * from informix.state"
cdr start replicate repl_state
```
See our Loopback Replication article for a description of the standard options chosen above.
SQL for a basic test is:
```sql
INSERT INTO state VALUES (53, 'GB');
UPDATE state SET sname = 'UK' WHERE code = 53;
DELETE FROM state WHERE code = 53;
SELECT * FROM state_log;
```
Results were:
op_type | source_id | commit_time | txn_id | old_code | old_sname | new_code | new_sname |
---|---|---|---|---|---|---|---|
I | 1 | 1621534543 | 12124707279076 | | | 53 | GB |
U | 1 | 1621534543 | 12124707287288 | 53 | GB | 53 | UK |
D | 1 | 1621534543 | 12124707295776 | 53 | UK | | |
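The “commit_time” values above are Unix epoch seconds. Outside the server they can be decoded with any standard library; a Python sketch:

```python
from datetime import datetime, timezone

# commit_time is seconds since 1970-01-01 00:00:00 UTC; inside the server,
# DBINFO('utc_to_datetime', n) performs the same decoding, but into the
# server's local time zone.
commit_time = 1621534543  # value from the rows above
print(datetime.fromtimestamp(commit_time, tz=timezone.utc)
      .strftime("%Y-%m-%d %H:%M:%S"))
# → 2021-05-20 18:15:43
```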
Recommended SQL statements to set up database objects for the second approach are:
```sql
CREATE TABLE data_change_log (data BSON);

CREATE INDEX data_change_log ON data_change_log
    (BSON_VALUE_BIGINT(data, 'commit_time')) USING BSON;

CREATE PROCEDURE sp_data_change_capture (data JSON)
    INSERT INTO data_change_log VALUES (data);
END PROCEDURE;
```
- That’s massively simpler: column/parameter names do not need specifying.
- The same log table and procedure can be used for multiple replicates (the data contains the table name).
- BSON has been chosen for the table column type (the INSERT implicitly recasts the JSON parameter) because it has the value-extraction functions we need.
- We can then index “commit_time” as shown, which might be necessary should the table become sizeable.
Shell commands to define and start the replication differ only on line 2:
```shell
cdr define replicate repl_state -C always -S row -A -R \
    --jsonsplname=sp_data_change_capture \
    "P stores_demo@g_informix:informix.state" "select * from informix.state" \
    "R stores_demo@g_loopback:informix.state" "select * from informix.state"
cdr start replicate repl_state
```
- Option “--splname” has been replaced with “--jsonsplname”.
- A shared stored procedure can be used for all source tables.
After rerunning the same simple test as before, the recorded data can be shown with spaces between JSON elements so that it word-wraps more readably. The trick below is a plain textual substitution: it changes only the display, not how the parsed JSON document behaves (though it would also touch any data value that itself happens to contain the sequence `,"`):
```sql
SELECT REPLACE(data::JSON::LVARCHAR, ',"', ', "') AS data
FROM data_change_log;
```
data |
---|
{"operation":"insert", "table":"state", "owner":"informix", "database":"stores_demo", "txnid":12124695617764, "commit_time":1621529490, "rowdata":{"code":"53", "sname":"GB"}} |
{"operation":"update", "table":"state", "owner":"informix", "database":"stores_demo", "txnid":12124695625976, "commit_time":1621529490, "rowdata":{"code":"53", "sname":"UK"}, "before_rowdata":{"code":"53", "sname":"GB"}} |
{"operation":"delete", "table":"state", "owner":"informix", "database":"stores_demo", "txnid":12124695638244, "commit_time":1621529490, "rowdata":{"code":"53", "sname":"UK"}} |
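That the spacing trick is purely cosmetic can be checked with any JSON parser: both the compact and the spaced form decode to the same document. A Python sketch, using a trimmed copy of the insert row above:

```python
import json

# The captured rows are standard JSON; the display-only REPLACE() above
# does not change the parsed result.
compact = ('{"operation":"insert","table":"state","owner":"informix",'
           '"database":"stores_demo","txnid":12124695617764,'
           '"commit_time":1621529490,"rowdata":{"code":"53","sname":"GB"}}')
spaced = compact.replace(',"', ', "')
assert json.loads(compact) == json.loads(spaced)
print(json.loads(spaced)["rowdata"])
# → {'code': '53', 'sname': 'GB'}
```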
The flexibility of the BSON data is such that we could write general purpose code applying to any number of tables. For example, we might want a report showing only columns updated during a specific date range as recorded in the staging/audit table:
```sql
CREATE FUNCTION sp_data_change_diff (
    p_tabname    VARCHAR(128) DEFAULT '*',
    p_datetime_1 DATETIME YEAR TO SECOND DEFAULT NULL,
    p_datetime_2 DATETIME YEAR TO SECOND DEFAULT NULL
)
RETURNING
    DATETIME YEAR TO SECOND AS date_time,
    VARCHAR(128) AS table_name,
    VARCHAR(255) AS key_value,
    VARCHAR(128) AS column_name,
    VARCHAR(255) AS value_before,
    VARCHAR(255) AS value_after;

{   Show columns updated in JSON loopback replication output
    See also https://en.wikipedia.org/wiki/Year_2038_problem
    Doug Lawry, Oninit Consulting, May 2021

    Example usage:

    EXECUTE FUNCTION sp_data_change_diff();
    EXECUTE FUNCTION sp_data_change_diff('customer');
    EXECUTE FUNCTION sp_data_change_diff
        ('*', '2021-05-19 12:27:00', '2021-05-19 12:29:00');
}

    DEFINE l_data      BSON;
    DEFINE l_tabname   VARCHAR(128);
    DEFINE l_utc_time  BIGINT;
    DEFINE l_datetime  DATETIME YEAR TO SECOND;
    DEFINE l_colno     SMALLINT;
    DEFINE l_colname   VARCHAR(128);
    DEFINE l_key_value VARCHAR(255);
    DEFINE l_value_1   VARCHAR(255);
    DEFINE l_value_2   VARCHAR(255);

    FOREACH
        SELECT data INTO l_data
        FROM data_change_log
        WHERE BSON_VALUE_VARCHAR (data, 'table') MATCHES p_tabname
        AND BSON_VALUE_VARCHAR (data, 'operation') = 'update'
        AND BSON_VALUE_BIGINT (data, 'commit_time')
            BETWEEN sp_datetime_to_utc (NVL(p_datetime_1, MDY(01,02,1970)))
            AND sp_datetime_to_utc (NVL(p_datetime_2, MDY(01,18,2038)))

        LET l_tabname  = BSON_VALUE_VARCHAR (l_data, 'table');
        LET l_utc_time = BSON_VALUE_BIGINT (l_data, 'commit_time');
        LET l_datetime = DBINFO ('UTC_TO_DATETIME', l_utc_time);

        FOREACH
            SELECT colno, colname INTO l_colno, l_colname
            FROM systables AS t
            JOIN syscolumns AS c ON c.tabid = t.tabid
            WHERE tabname = l_tabname
            ORDER BY colno

            LET l_value_1 = BSON_VALUE_VARCHAR (l_data, 'before_rowdata.' || l_colname);
            LET l_value_2 = BSON_VALUE_VARCHAR (l_data, 'rowdata.' || l_colname);

            IF l_colno = 1 THEN
                LET l_key_value = l_value_1;
            END IF

            IF NVL(l_value_1,'') != NVL(l_value_2,'') THEN
                RETURN l_datetime, l_tabname, l_key_value,
                       l_colname, l_value_1, l_value_2
                WITH RESUME;
            END IF
        END FOREACH
    END FOREACH

END FUNCTION;
```
That calls one other SPL function:
```sql
CREATE FUNCTION sp_datetime_to_utc (
    date_time DATETIME YEAR TO SECOND DEFAULT CURRENT YEAR TO SECOND
)
RETURNING BIGINT AS utc_time;

{   Inverse of DBINFO('utc_to_datetime',INT)
    Doug Lawry, Oninit Consulting, May 2021

    Check same result to test correctness:

    SELECT DBINFO('utc_current'), sp_datetime_to_utc()
    FROM systables WHERE tabid = 1;

    Maximum safe range with up to 24 hours TZ offset:
    '1970-01-02 00:00:00' -- zero is 1970-01-01
    '2038-01-18 00:00:00' -- Google "Year 2038"

    This function will cope after that using BIGINT,
    but DBINFO('utc_to_datetime',INT) will need fixing.
}

    DEFINE t INTERVAL DAY(7) TO SECOND;
    DEFINE d INTERVAL DAY(7) TO DAY;
    DEFINE s INTERVAL SECOND(7) TO SECOND;

    -- Time since UTC start:
    LET t = date_time - DBINFO('utc_to_datetime', 0);

    -- Split days/seconds as max INTERVAL scale is 9:
    LET d = t;
    LET s = t - d;

    -- Recombine total seconds, allowed via CHAR:
    RETURN d :: CHAR(9) * 24 * 60 * 60 + s :: CHAR(9);

END FUNCTION;
```
IDS has the built-in function DBINFO('utc_to_datetime',INT) but not DBINFO('datetime_to_utc',DATETIME).
There is a feature request for this submitted in 2015:
https://ibm-data-and-ai.ideas.aha.io/ideas/INFX-I-139
Searching for an SPL solution shows that Jonathan Leffler’s is the most widely shared:
https://stackoverflow.com/questions/14187569/convert-datetime-to-unix-epoch-in-informix
However, that doesn’t handle time zones correctly, and our version above is simpler.
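The day/second split used by sp_datetime_to_utc can be sanity-checked outside the server. A Python sketch of the same arithmetic (not Informix code):

```python
from datetime import datetime, timezone

def datetime_to_utc(dt):
    """Split the interval since the epoch into whole days plus leftover
    seconds and recombine as total seconds, mirroring sp_datetime_to_utc."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    delta = dt.astimezone(timezone.utc) - epoch
    return delta.days * 24 * 60 * 60 + delta.seconds

# The commit_time seen in the first test above decodes back correctly:
print(datetime_to_utc(datetime(2021, 5, 20, 18, 15, 43, tzinfo=timezone.utc)))
# → 1621534543
```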
To test “sp_data_change_diff”, two similar JSON replicates were created using a convenient shell script:
```shell
$ cat repl_define.sh
[ $# -lt 2 ] && exec echo "Usage: $0 database table-names"
set -e          # exit on error
DB=$1 ; shift   # database name
TO=informix     # table owner
G1=g_informix   # groups in sqlhosts
G2=g_loopback
for TN          # table name
do
    RP=repl_$TN # replicate name
    T1=$TO.$TN  # owner + table names
    T2=$T1
    cdr define replicate $RP -C always -S row -A -R \
        --jsonsplname=sp_data_change_capture \
        "P $DB@$G1:$T1" "select * from $T1" \
        "R $DB@$G2:$T2" "select * from $T2"
    cdr start replicate $RP
done

$ repl_define.sh stores_demo customer stock
Interpreting this replicate as a master replicate.
Verification of stores_demo@g_informix:informix.customer started
Verification of stores_demo@g_informix:informix.customer is successful
Verification of stores_demo@g_loopback:informix.customer started
Verification of stores_demo@g_loopback:informix.customer is successful
Interpreting this replicate as a master replicate.
Verification of stores_demo@g_informix:informix.stock started
Verification of stores_demo@g_informix:informix.stock is successful
Verification of stores_demo@g_loopback:informix.stock started
Verification of stores_demo@g_loopback:informix.stock is successful
```
SQL statements to clear the staging/audit table and run suitable test transactions were:
```sql
TRUNCATE data_change_log;

UPDATE customer SET city = 'Sunny Vale' WHERE city = 'Sunnyvale';
UPDATE customer SET city = 'Sunnyvale'  WHERE city = 'Sunny Vale';

UPDATE stock SET description = REPLACE(description, 'tires', 'tyres')
    WHERE description MATCHES '* tires';
UPDATE stock SET description = REPLACE(description, 'tyres', 'tires')
    WHERE description MATCHES '* tyres';
```
We can produce a report across all the data changes captured with:
```sql
EXECUTE FUNCTION sp_data_change_diff();
```
That returns:
date_time | table_name | key_value | column_name | value_before | value_after |
---|---|---|---|---|---|
21/05/2021 18:14:29 | customer | 101 | city | Sunnyvale | Sunny Vale |
21/05/2021 18:14:29 | customer | 109 | city | Sunnyvale | Sunny Vale |
21/05/2021 18:14:29 | customer | 111 | city | Sunnyvale | Sunny Vale |
21/05/2021 18:14:29 | customer | 101 | city | Sunny Vale | Sunnyvale |
21/05/2021 18:14:29 | customer | 109 | city | Sunny Vale | Sunnyvale |
21/05/2021 18:14:29 | customer | 111 | city | Sunny Vale | Sunnyvale |
21/05/2021 18:14:29 | stock | 101 | description | bicycle tires | bicycle tyres |
21/05/2021 18:14:29 | stock | 101 | description | bicycle tires | bicycle tyres |
21/05/2021 18:14:29 | stock | 101 | description | bicycle tyres | bicycle tires |
21/05/2021 18:14:29 | stock | 101 | description | bicycle tyres | bicycle tires |
Parameters can be given to match for specific tables and/or a date range: see comments in code above.
Should you need to match for multiple table names, that can be done by treating the result set as a table:
```sql
SELECT *
FROM TABLE (FUNCTION sp_data_change_diff())
    AS data (date_time, table_name, key_value,
             column_name, value_before, value_after)
WHERE table_name IN ('customer', 'stock');
```
The same results are returned in this case.
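The same per-column diff idea can also be applied outside the database, since the captured documents are plain JSON. A minimal Python sketch, assuming documents shaped like those shown earlier (not a replacement for the SPL function):

```python
import json

def data_change_diff(records, tabname="*"):
    """For each captured 'update' document, report columns whose value
    changed, mirroring the logic of sp_data_change_diff."""
    for rec in records:
        doc = json.loads(rec)
        if doc.get("operation") != "update":
            continue
        if tabname != "*" and doc["table"] != tabname:
            continue
        before, after = doc.get("before_rowdata", {}), doc.get("rowdata", {})
        for col, old in before.items():
            if old != after.get(col):
                yield doc["table"], col, old, after.get(col)

rec = ('{"operation":"update","table":"state","txnid":12124695625976,'
       '"commit_time":1621529490,'
       '"rowdata":{"code":"53","sname":"UK"},'
       '"before_rowdata":{"code":"53","sname":"GB"}}')
print(list(data_change_diff([rec])))
# → [('state', 'sname', 'GB', 'UK')]
```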
Caveats
IDS 14.10 is required. See other prerequisites in our Loopback Replication article.
Limitations restated from the documentation are:
- The target table must exist even if otherwise unused.
- Do not combine participant definitions so that a replicate targets both a table and an SPL routine.
- Out-of-row data types such as TEXT, BYTE, BLOB, and CLOB are not supported.
We raised two support cases which now have defect numbers:
IT37197: cdr list catalog is not showing ‘replication to SPL routine’ which was defined in cdr define repl
IT36450: --jsonsplname of “cdr repl” with CHAR/VCHAR column replicated containing single/double quote JSON won’t be constructed correctly.
The first means you cannot easily find out what procedure name was given (also not included in “cdr list repl” output) or correctly regenerate your “cdr define repl” commands. The second is self-explanatory and is a show-stopper if your data is affected.
See also these defects:
IT36973: -208 error in RIS file for transactions > 65535 rows replicating to SPL due to CDR data sync thread exhausting SQL statement IDs
IT36987: SPL conflict resolution or post commit triggers potentially failing with errors 674 or 201
Conclusion
Although the “replication to SPL routine” feature seems still to be a work-in-progress as at IDS 14.10.xC5, it is very flexible and could be a unique solution to unusual problems. Other use cases described elsewhere include real-time analytics such as aggregates or leader boards.
Disclaimer
Suggestions above are provided “as is” without warranty of any kind, either express or implied, including without limitation any implied warranties of condition, uninterrupted use, merchantability, fitness for a particular purpose, or non-infringement.