Recently I had a request from a customer to unzip files in the database. Of course, I used one of the most popular Oracle APEX APIs - APEX_ZIP - which is implemented in the APEX long time ago... I think in version 4 or 5.
At the moment I'm writing this blog I've been using the latest Oracle APEX 23.2. and I noticed some new functionality in this API... which is not yet officially documented in the Oracle API documentation.
So, what's new?
It goes for a new function named "get_dir_entries" and a new function "get_file_content" adapted for a new dir_entries approach.
Warning! Old function get_files is getting deprecated! This is stated in the package comments:
In which APEX version is this new functionality implemented?
I checked all APEX versions going from 23.2 toward previous versions and figured out that those new functionalities are implemented in version 21.2
Version 21.1:
Version 21.2:
But in the official documentation there is no mention of those new functions:
New approach vs old approach - benchmark
In the package comment it is stated that this new function is much faster and much more efficient.
So I tested performances on extracting 10112 files from a ZIP archive. There are 2 steps in the process:
- get a list of all files from the ZIP file with get_dir_entries or ger_files function
- extract all files from the ZIP file with get_file_content function
Results are:
- new approach with get_dir_entries took cca 30 seconds to finish
- old approach with get_files - after more than 1 hour I stopped the script execution
Looking in details for a new approach:
- the first step of preparing a file list took most of the time... around 29 seconds
- to extract 10112 files - it took only a second.
Benchmark details (dbms output):
The code I used for testing:
DECLARE
l_zip blob;
l_files apex_zip.t_dir_entries;
l_file blob;
l_counter pls_integer := 1;
l_index varchar2(32767);
PROCEDURE p_out (p_text varchar2) IS
BEGIN
dbms_output.put_line(to_char(systimestamp, 'hh24:mi:ssXff') || ' - ' || p_text);
END p_out;
BEGIN
SELECT blob_content
INTO l_zip
FROM import_zip_xml
WHERE id = 2;
dbms_output.put_line('ZIP file size: ' || dbms_lob.getLength(l_zip) );
p_out('Getting dir entries list...');
l_files := apex_zip.get_dir_entries (
p_zipped_blob => l_zip,
p_only_files => true
);
p_out('Total files in the list: ' || l_files.count);
p_out('Start extracting files...');
--unzip files
l_index := l_files.first;
LOOP
EXIT WHEN l_index is null;
--p_out('Processing file ' || to_char(l_counter, 'fm00000') || ' ' || l_index || ': ');
l_file := apex_zip.get_file_content (
p_zipped_blob => l_zip,
p_dir_entry => l_files( l_index )
);
l_index := l_files.next(l_index);
l_counter := l_counter + 1;
END LOOP;
p_out('Finish.');
END;
One more detail I noticed - a function get_dir_entries is returning not only a collection of filenames found in the ZIP file but for every file it returns an uncompressed file size too.
This is the record type and the collection it returns:
A Conclusion
If You have a newer version of APEX, >= 21.2, this new approach is definitely worth using.
Performances are blazing fast.
Plus, there are additional metadata provided for files, like an uncompressed file size for example.