Super Zip Utility - ZipUtilityLZH 1.0.0

Lempel-Ziv/Huffman data compression archival CTOS shareware utility.
(c) 1994 S. Kurowski (SJCSJK) 94/07/28

This utility performs three basic functions:

(1) Archives a list of files to an archive file, using LZWH compression.
(2) Restores a list of files from an archived file, decompressing them.
(3) Same as (1), but the output archive is a self-extracting run file.

Environment Requirements:

OS levels:      BTOS II 3.0 or later, CTOS I 3.3 or later,
                CTOS II 3.3 or later, CTOS III 1.0 or later.
Standard SW:    BTOS II 3.0 or later, CTOS 12.0 or later.
Disk storage:   170 sectors for all installed utility software.
                27 sectors (13k) overhead is used by run file archive
                extraction code in self-extracting archives.
Memory:         368k for Zip Archive and Unzip Archive.
                132k for self-extracting archive files.

Archive Contents:

ZipUtilityLZH.run               Version x1.0.0 utility
ZipUtilityMsg.bin               Version x1.0.0 binary messages
ZipUtility.data                 Self-extracting executable code
<$000>SuperZipCmds.sub          Version x1.0.0 Executive commands
<$000>ZipUtilityMsg.txt         PLK text version of binary messages
<$000>SuperZip.ReadMeFirst      (this file)

Revision History:

1.0.0   94/07/28   First release.

The three commands added by Submit of <$000>SuperZipCmds.sub are:

Zip Archive (Case 00)
  [File list]
  [File prefix(es) from]
  [Archive data set (.zpt)]
  [Delete existing archive data set?]
  [Confirm each?]
  [Print file]
  [Zip to run file?]

Unzip Archive (Case 01)
  [Archive data set (.zpt)]
  [File list from (<*>*)]
  [File list to (<*>*)]
  [Overwrite okay?]
  [Confirm each?]
  [Print file]
  [List files only?]

Unzip Run Archive          - or -    Run              (both are case 00)
  Run file archive                     Run file       ('Run' is NOT added!)
                                       [CASE]
                                       [Command]
  [File list from (<*>*)]              [Parameter 1]
  [File list to (<*>*)]                [Parameter 2]
  [Overwrite okay?]                    [Parameter 3]
  [Confirm each?]                      [Parameter 4]

Using the SuperZip utility:

SuperZip Features -
Most of the command parameters work as with the regular 12.x CTOS Standard
Software except where noted.
The <$000>ZipUtilityMsg.txt file may be nationalized to replace the provided
ZipUtilityMsg.bin file. The yes/no/blank Executive parameters are also
nationalized when a local Nls.sys (or RMOS NLSService) is available.

When archiving or unarchiving files, SuperZip shows how large the output
file data is in comparison to the input file, as a percentage. For example,
if the input file was 2000 bytes long and the compressed output file was
1000 bytes long, the ratio would be displayed as "... (50.0%) done.". If you
are restoring this same file to its original form, the ratio would be
displayed as "... (200.0%) done.". Please note that this hopefully more
intuitive representation of compression ratios is not the same as the
formally defined 'compression ratio' quantity, which is
( 100% - displayed ratio ).

SuperZip archive files always have the suffix ".zpt". For each file
archived, the archive contains a variable-length header containing:

  - a compression method identifier,
  - a checksum value used to verify file data integrity at the time it is
    decompressed,
  - the uncompressed file size,
  - the compressed file size, and
  - the name of the file, including the source directory.

The number of files processed is displayed at the conclusion of the archival
or restore operation.

During compression or decompression of files, the "spinning" activity
indicator shows the relative compression rate and lets the user know the
software is actually working on a file. The indicator "ticks" every 4
sectors that are bit-written during compression or bit-read during
decompression. A complete turn of the indicator means 16 sectors have been
processed at the bit level. A pause of the spinning indicator is normal for
this algorithm as the compression stages take turns - only the secondary
Huffman stage outputs or inputs bit data. A long pause indicates the file is
very highly compressible and is using a single 8k dictionary for large data
runs.
There is no activity indicator for self-extracting archives.

Super Zip Archive -
The Zip Archive command "[File list]" parameter accepts any list of files.
Wildcard filespecs are expanded by the Executive. The [File prefix(es) from]
parameter works in association with the [File list] parameter in a fashion
identical to that of the LCopy utility to build file names to archive from
different sources. All successfully archived input files are stored in a
single output archive or self-extracting run file archive.

Self-Extracting Run File Archives -
The Zip Archive parameter "[Zip to run file?]" optionally archives the
dataset to a self-extracting archive run file (always suffixed with ".run").
The resultant run file archive dataset may then be run to extract its
contents. To keep the self-extracting code as small as possible (the primary
criterion in the first place), no message files are used by the
self-extractor. This parameter has SuperZip append an otherwise regular .zpt
archive file to a copy of the [Sys]ZipUtility.data file, which is the
self-extracting executable.

Please note that quick arithmetic, assuming average file compression at
archival of around 40% and a 28 sector extractor overhead, shows that
zip-to-run archiving of less than around 45 sectors of file data (depending
largely upon data types) will result in an executable archive larger than
the unarchived file data. Normally larger data sets are archived.

When you run a self-extracting archive run file it will accept four optional
parameters. These parameters may be typed directly into the Executive Run
command by an archive recipient, or you may instead use the (rather
cheesy-looking) Unzip Run Archive command also included in SuperZipCmds.sub.
Note that if you use the Executive Run command you will need to enclose the
file list subparameters in literals ('') to prevent the Executive from
expanding your wildcards when restoring.
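
The displayed percentage and the break-even estimate for zip-to-run
archiving can be checked with a short sketch. It is illustrative only and
is not part of SuperZip: the function names are hypothetical, the "around
40%" figure is assumed to mean the displayed ratio (compressed size as a
percentage of the original), and archive headers are ignored. This file
quotes the extractor overhead as both 27 and 28 sectors, so both are tried.

```python
def displayed_ratio(input_bytes: int, output_bytes: int) -> float:
    """Output size as a percentage of input size (the displayed figure)."""
    if input_bytes <= 0:
        raise ValueError("input size must be positive")
    return 100.0 * output_bytes / input_bytes


def break_even_sectors(overhead_sectors: float, ratio_percent: float) -> float:
    """Data size N (in sectors) where N == overhead + N * ratio.

    Below this size, a self-extracting archive is larger than the data.
    Assumption: ratio_percent is the displayed ratio, not the savings.
    """
    ratio = ratio_percent / 100.0
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be at least 0% and below 100%")
    return overhead_sectors / (1.0 - ratio)


if __name__ == "__main__":
    # Archiving 2000 bytes down to 1000, then restoring the same file.
    print(f"... ({displayed_ratio(2000, 1000):.1f}%) done.")
    print(f"... ({displayed_ratio(1000, 2000):.1f}%) done.")
    # Break-even data size for both overhead figures quoted in this file.
    for overhead in (27, 28):
        size = break_even_sectors(overhead, 40.0)
        print(f"{overhead} sector overhead: break-even near {size:.1f} sectors")
```

Under these assumptions the break-even point falls at roughly 45 to 47
sectors, which agrees with the "around 45 sectors" estimate.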

When running a self-extracting SuperZip archive, user interaction is
facilitated by the following conventions:

(a) The [File list from (<*>*)] and [File list to (<*>*)] parameters work as
    usual. (Use literals around wildcards when using Run.) Additionally, a
    null literal in [File list to (<*>*)] acts as a filespec placeholder and
    restores the file using its original name.

(b) Files being unarchived are displayed in the format:

        filespecfrom ==>> filespecto

(c) If an error occurs, the CTOS error code is displayed at the end of the
    line in parentheses - for example, if the file already exists:

        filespecfrom ==>> filespecto (224)

(d) If the output file is being overwritten, an asterisk is displayed at the
    end of the line:

        filespecfrom ==>> filespecto *

    By default, files are not overwritten.

(e) If the [Confirm each?] parameter is set to 'yes' OR the output file
    already exists and the [Overwrite okay?] parameter is left at its
    default, the self-extractor displays the same information as in (b),
    but in parentheses and followed by a question mark prompting a user
    decision:

        ( filespecfrom ==>> filespecto ) ?

    At this point the extractor pauses for a GO, CANCEL, or FINISH
    keystroke from the user. Each of these keystrokes operates just as it
    would in SuperZip's Unzip Archive utility.

SuperZip Unzip Archive -
The Unzip Archive restore parameter "[List files only?]" optionally shows
the contents of an archive without restoring it (NOT a run file archive,
however). It lists the names of the files and their uncompressed and
compressed sizes in bytes and sectors, and computes the total uncompressed
and compressed file sectors for the archive. The total compressed sectors
value is not strictly the sum of the individual compressed sector counts of
the archive's content files, but rather the actual stored sector sum minus
the archive file headers.
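
The self-extractor display conventions (b) through (e) above can be
summarized in a short sketch. It is illustrative only and is not taken from
the extractor, which is written in C: the function names, argument names,
and the sample filespecs are hypothetical, and needs_prompt reflects one
reading of rule (e), in which "default" means the [Overwrite okay?]
parameter was left blank.

```python
from typing import Optional


def status_line(src: str, dst: str, error: Optional[int] = None,
                overwritten: bool = False, prompt: bool = False) -> str:
    """Format one self-extractor line following conventions (b) to (e)."""
    base = f"{src} ==>> {dst}"          # (b) plain progress line
    if prompt:
        return f"( {base} ) ?"          # (e) waits for GO, CANCEL or FINISH
    if error is not None:
        return f"{base} ({error})"      # (c) CTOS error code in parentheses
    if overwritten:
        return f"{base} *"              # (d) existing file was overwritten
    return base


def needs_prompt(confirm_each: bool, target_exists: bool,
                 overwrite_given: bool) -> bool:
    """Rule (e): prompt when confirming each file, or when the target
    exists and [Overwrite okay?] was left blank (an assumed reading)."""
    return confirm_each or (target_exists and not overwrite_given)


if __name__ == "__main__":
    src, dst = "<Src>Report.txt", "<Work>Report.txt"  # hypothetical names
    print(status_line(src, dst))
    print(status_line(src, dst, error=224))
    print(status_line(src, dst, overwritten=True))
    if needs_prompt(confirm_each=False, target_exists=True,
                    overwrite_given=False):
        print(status_line(src, dst, prompt=True))
```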

If you attempt to restore an archive not produced by this utility it will
report CTOS error code 7 (Not Implemented) and abort processing of the file.
Also, if the checksum computed for a decompressed file does not match the
checksum in its header, CTOS error code 245 (Run File Invalid Checksum) is
reported. In the latter case, the corrupt decompressed file is not deleted,
so that any of its intact contents can be recovered. The utility does not
terminate, so the remaining files in the archive file are still processed.

SuperZip Design Notes -
The LZH algorithm implemented here is based in part upon earlier work by
Okumura (~1990) (used also in LHarc). The first stage uses an LZ78
dictionary variation, while the secondary Huffman statistical stage
compresses the contents of the buffered compressed dictionary data. It
separately counts the dictionary unencoded data and pointer/length values
and builds a Huffman tree encoding these as variable-length integral-bit
codes. When decompressing a file, these two stages are reversed: Huffman
first, then LZ78. Substantially more processing is required for compression
than decompression to generate the statistics in the second stage;
consequently, decompression is much faster than compression.

The disk bytestream I/O used for the self-extracting archive code is a 2k
'micro' bytestream module replacing all of the CTOS.lib synchronous disk
bytestream calls with equivalent forward-only operations (in C).

An advanced version of this algorithm (LZA1) is under development that
replaces the secondary Huffman stage with table-driven arithmetic coding per
Fenwick and Gutmann (U. of Auckland, NZ, 1993) and further work of Fenwick
(1993) using a new statistical binary indexed tree structure, both of which
significantly improve arithmetic coding speed. An order-0 prototype (LZWA0)
of this has already been completed at DSD San Jose. Check the MailOrder/TS2
shareware system for this new version later this year.
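
For readers unfamiliar with the binary indexed tree mentioned above, the
following is a minimal generic sketch of the structure used as a cumulative
frequency table. It is not the LZA1 code, and it shows neither the
table-driven arithmetic coder nor any LZ stage. The class and method names
are hypothetical. The point is that both updating a symbol count and
reading a cumulative count take a number of steps proportional to the
logarithm of the alphabet size, which is what an adaptive arithmetic coder
needs for every symbol it codes.

```python
from typing import Tuple


class FenwickTree:
    """Binary indexed tree over frequencies of symbols 0..size-1."""

    def __init__(self, size: int) -> None:
        self.size = size
        self.tree = [0] * (size + 1)    # slot 0 unused; 1-based internally

    def add(self, symbol: int, delta: int = 1) -> None:
        """Increase the frequency of `symbol` by `delta`."""
        i = symbol + 1
        while i <= self.size:
            self.tree[i] += delta
            i += i & -i                 # move to the next covering node

    def cumulative(self, symbol: int) -> int:
        """Total frequency of all symbols strictly below `symbol`."""
        total = 0
        i = symbol
        while i > 0:
            total += self.tree[i]
            i -= i & -i                 # strip the lowest set bit
        return total

    def interval(self, symbol: int) -> Tuple[int, int, int]:
        """(low, high, total) counts an arithmetic coder would use."""
        return (self.cumulative(symbol),
                self.cumulative(symbol + 1),
                self.cumulative(self.size))


if __name__ == "__main__":
    counts = FenwickTree(256)           # order-0 model over byte values
    for byte in b"abracadabra":
        counts.add(byte)
    print(counts.interval(ord("b")))    # counts below 'b', through 'b', total
```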

****  Ideas & suggestions regarding this utility are welcome.
****
****  You may distribute this utility freely so long as it is understood to be
****  shareware for no cost.
****
****  San Jose, California - sjk (SJCSJK)