Tuesday 9 April 2024

Oracle APEX ZIP API - an Upgrade which is not officially documented

Recently I had a request from a customer to unzip files in the database. Of course, I used one of the most popular Oracle APEX APIs - APEX_ZIP - which is implemented in the APEX long time ago... I think in version 4 or 5.

At the moment I'm writing this blog I've been using the latest Oracle APEX 23.2. and I noticed some new functionality in this API... which is not yet officially documented in the Oracle API documentation.

https://docs.oracle.com/en/database/oracle/apex/23.2/aeapi/APEX_ZIP.html#GUID-270BFF3A-5FB1-4089-894E-978608F9BD87

So, what's new?

It goes for a new function named "get_dir_entries" and a new function "get_file_content" adapted for a new dir_entries approach.

Warning! Old function get_files is getting deprecated! This is stated in the package comments:

In which APEX version is this new functionality implemented?

I checked all APEX versions going from 23.2 toward previous versions and figured out that those new functionalities are implemented in version 21.2

Version 21.1:

Version 21.2:


But in the official documentation there is no mention of those new functions:


New approach vs old approach - benchmark

In the package comment it is stated that this new function is much faster and much more efficient.

So I tested performances on extracting 10112 files from a ZIP archive. There are 2 steps in the process: 

  1. get a list of all files from the ZIP file with get_dir_entries or ger_files function
  2. extract all files from the ZIP file with get_file_content function

Results are:

  • new approach with get_dir_entries took cca 30 seconds to finish
  • old approach with get_files - after more than 1 hour I stopped the script execution

Looking in details for a new approach:

  • the first step of preparing a file list took most of the time... around 29 seconds
  • to extract 10112 files - it took only a second.

Benchmark details (dbms output):


The code I used for testing:

DECLARE
    l_zip blob;
    l_files apex_zip.t_dir_entries;
    l_file blob;
    l_counter pls_integer := 1;
    l_index varchar2(32767);
    
    PROCEDURE p_out (p_text varchar2) IS
    BEGIN
        dbms_output.put_line(to_char(systimestamp, 'hh24:mi:ssXff') || ' - ' || p_text);
    END p_out;
    
BEGIN
    SELECT blob_content 
    INTO l_zip
    FROM import_zip_xml 
    WHERE id = 2;
    
    dbms_output.put_line('ZIP file size: ' || dbms_lob.getLength(l_zip) );
    p_out('Getting dir entries list...');
    
    l_files := apex_zip.get_dir_entries (
        p_zipped_blob => l_zip,
        p_only_files => true
    );
    
    p_out('Total files in the list: ' || l_files.count);
    p_out('Start extracting files...');

    --unzip files
    l_index := l_files.first;
    
    LOOP
        EXIT WHEN l_index is null;
        
        --p_out('Processing file ' || to_char(l_counter, 'fm00000') || ' ' || l_index || ': ');
        
        l_file := apex_zip.get_file_content (
            p_zipped_blob => l_zip,
            p_dir_entry => l_files( l_index ) 
        );
        
        l_index := l_files.next(l_index);
        l_counter := l_counter + 1;
    END LOOP;
    p_out('Finish.');
END;


One more detail I noticed - a function get_dir_entries is returning not only a collection of filenames found in the ZIP file but for every file it returns an uncompressed file size too.

This is the record type and the collection it returns:


A Conclusion

If You have a newer version of APEX, >= 21.2, this new approach is definitely worth using.

Performances are blazing fast.

Plus, there are additional metadata provided for files, like an uncompressed file size for example.

No comments:

Post a Comment