Abstract
Avoiding sub-queries can be essential to reduce long run-times on large data sets.
However, it is sometimes necessary to identify rows in one table for which there is no associated row in another table. An example would be an order for which there are no order detail lines. The conventional approach is to use the NOT EXISTS (sub-query)
, but this can be extremely slow for large data sets. This article demonstrates how an ANSI join can achieve the same result much more efficiently, even on large data sets.
Content
Run-time of such SELECT statements and sub-queries can be significantly reduced by using an ANSI outer join and post-join condition for NULLs, as shown in the real example below used by the Oninit Archiver module on the Openbet application:
Informix, in theory, attempts to “flatten” sub-queries, but this does not seem very effective in the above real-world scenario. In the “before” version, for each row in the outer query, a separate fetch is made via the inner query. In the “after” version, the two tables are joined in a single pass in one query, and rows discarded if the resulting joined key value is NULL.
Conclusion
Avoiding sub-queries can be essential to reduce long run-times on large data sets, and the above technique demonstrates how to achieve this while at the same time producing identical results to “NOT EXISTS”.
Disclaimer
The code fix suggested above is provided “as is” without warranty of any kind, either express or implied, including without limitation any implied warranties of condition, uninterrupted use, merchantability, fitness for a particular purpose, or non-infringement.
Contact us
If you have any questions or would like to find out more about sub-queries, ANSI joins and Informix, simply contact us.