PerlIO-gzip-0.18000755 001765 001765 00000000000 10510035477 013414 5ustar00nicknick000000 000000 PerlIO-gzip-0.18/gzip.pm000644 001765 001765 00000006326 10510001426 014774 0ustar00nicknick000000 000000 package PerlIO::gzip; use 5.008; use strict; use warnings; use XSLoader (); our $VERSION = '0.18'; XSLoader::load "PerlIO::gzip", $VERSION; 1; __END__ =head1 NAME PerlIO::gzip - Perl extension to provide a PerlIO layer to gzip/gunzip =head1 SYNOPSIS use PerlIO::gzip; open FOO, "<:gzip", "file.gz" or die $!; print while ; # And it will be uncompressed... binmode FOO, ":gzip(none)" # Starts reading deflate stream from here on =head1 DESCRIPTION PerlIO::gzip provides a PerlIO layer that manipulates files in the format used by the C program. Compression and Decompression are implemented, but not together. If you attempt to open a file for reading and writing the open will fail. =head1 EXPORT PerlIO::gzip exports no subroutines or symbols, just a perl layer C =head1 LAYER ARGUMENTS The C layer takes a comma separated list of arguments. 4 exclusive options choose the header checking mode: =over 4 =item gzip The default. Expects a standard gzip file header for reading, writes a standard gzip file header. =item none Expects or writes no file header; assumes the file handle is immediately a deflate stream (eg as would be found inside a C file) =item auto Potentially dangerous. If the first two bytes match the C header "\x1f\x8b" then a gzip header is assumed (and checked) else a deflate stream is assumed. No different from gzip on writing. =item autopop Potentially dangerous. If the first two bytes match the C header "\x1f\x8b" then a gzip header is assumed (and checked) else the layer is silently popped. This results in gzip files being transparently decompressed, other files being treated normally. Of course, this has sides effects such as File::Copy becoming gunzip, and File::Compare comparing the uncompressed contents of files. In autopop mode Opening a handle for writing (or reading and writing) will cause the gzip layer to automatically be popped. =back Optionally you can add this flag: =over 4 =item lazy For reading, defer header checking until the first read. For writing, don't write a header until the first buffer empty of compressed data to disk. (and don't write anything at all if no data was written to the handle) By default, gzip header checking is done before the C (or C) returns, so if an error is detected in the gzip header the C or C will fail. However, this will require reading some data, or writing a header. With lazy set on a file opened for reading the check is deferred until the first read so the C should always succeed, but any problems with the header will cause an error on read. open FOO, "<:gzip(lazy)", "file.gz" or die $!; # Dangerous. while () { print; } # Whoa. Bad. You're not distinguishing between errors and EOF. If you're not careful you won't spot the errors - like the example above you'll think you got end of file. lazy is ignored if you are in autopop mode. =back =head1 AUTHOR Nicholas Clark, Enwc10+perlio-gzip@colon.colondot.netE =head1 SEE ALSO L, L, L (the gzip file format specification), L (DEFLATE compressed data format specification) =cut PerlIO-gzip-0.18/t000755 001765 001765 00000000000 10510035477 013657 5ustar00nicknick000000 000000 PerlIO-gzip-0.18/Changes000644 001765 001765 00000013116 10510027316 014761 0ustar00nicknick000000 000000 Revision history for Perl extension PerlIO::Gzip. 0.01 Sat Feb 3 13:51:15 2001 - original version; created by h2xs 1.21 with options -c -n Layer::Gzip /usr/local/include/sfio.h 0.02 Sunday parse gzip header, 40 regression tests 0.03 Tue Feb 6 18:50:01 GMT 2001 Change name from Layer::Gzip to PerlIO::gzip following feedback on p5p. Add checks to Makefile.PL to ensure you have 5.7.0 [or later :-)] and PerlIO configure. 0.04 Tue Feb 6 23:56:16 GMT 2001 Added autopop mode. There must be a better name for it. Changed order in PerlIOGzip_pushed to call PerlIObuf first (D'oh) and to pop without error in autopop mode with a write. 0.05 Sun Feb 11 22:12:42 GMT 2001 Mmm. CRCs at the ends of files - maybe I should check them? Now gives error on close if the z_stream is at the end and either the CRC or the length recorded in the gzip trailer is wrong. Doesn't try these checks for files with no gzip header. PROTOTYPES: DISABLE in the .xs 0.06 Fri Feb 16 23:59:18 GMT 2001 We have compression! Hmm. Didn't check the return from close on perl.gz - transpires there was a bug in my code to read the length of uncompressed data from the gzip file Hmm. Why was I passing *my* args down to the PerlIOBuf I'm a derived class of? On the other hand, why not? ISA or HASA? My args, HASA, I guess, so I'll give it NULL args. 0.07 Sun Feb 18 23:21:48 GMT 2001 s/to/too/ in the BARF message in Makefile.PL [1 character change. Look at the paperwork that generated :-)] use XSLoader rather than Dynaloader. 0.08 Mon Feb 19 21:14:45 GMT 2001 fix bug in get_more's return value that basically prevented you from reading any header with embedded filename from an unbuffered layer. [Obscure, but it's supposed to work with unbuffered] However, still problems with unread, so you can't inflate any gzip file with an original filename (or comment) in the header. This is related to the /* fix me */ mentioned in perlio.c. Hopefully it will be soon. 0.09 Sat Apr 21 16:14:54 BST 2001 The perlio.c layer API has changed by the 5.7.1 release. 0.08 and earlier won't compile - 0.09 mainly consists of necessary changes to function prototypes and the layer struct. 5.7.1 contains a known bug - when opening a file with layers specified, failure of a layer to push is supposed to cause the open to fail. In 5.7.0 it did. In 5.7.1 it doesn't. Commented the tests that will fail due to this bug. Attempted to detect the "OS" type at compile time to set the default OS for the gzip header. 0.10 Sun Aug 26 13:05:25 BST 2001 By 5.7.2 it seems that the pushed argument is passed in as &sv_undef rather than NULL. I wasn't testing SvOK() and was getting use of uninitialized value errors. 0.11 Mon Oct 29 20:28:38 GMT 2001 perlio.c layer API has changed again with the addition of a dup() function for cloning layers across. (both within and across ithreads) Currently I don't want to pretend that PerlIO::gzip can cope with this, hence it will croak. Fixing this properly is a TODO. 0.12 Mon Mar 18 21:03:52 GMT 2002 Catch up with the 5.7.3 perlio API changes. 0.13 Tue Jul 2 21:55:08 BST 2002 Richard Clamp sent me a complete patch to catch up with the 5.8.0-RC2 perlio API changes, and sprinkles Cie dust to make it compile for a threaded perl. (But please don't create any threads or fork on Win32 while a gzip layer is active, as bad things will happen (duplicate frees)) 0.14 Fri Jul 19 23:19:24 BST 2002 Whoops. Same undefined behaviour bug in my argument passing code as PerlIO::subfile. No surprise there, as PerlIO::subfile got the code from PerlIO::gzip. So I'll migrate the fix back from there to here. Well spotted valgrind. Use valgrind. http://developer.kde.org/~sewardj/ It's far more effective than coffee. 0.15 Tue Jul 15 21:07:11 BST 2003 gzip.xs was using PerlIOBuf_close, which isn't on the list of exported symbols. Hence gzip.xs won't link on platforms such as AIX and Windows. This is fixed in 5.8.1 - Inline the code for PerlIOBuf_close when building for 5.8.0. 0.16 Fri Jun 25 09:38:28 BST 2004 Merge this in from Sun Nov 4 15:27:24 GMT 2001: Split the tests from test.pl into t/read.t, t/write.t use Test::More; [Happy Schwern? :-)] and loop over several buffering possibilities. Actually unlink the test perl.gz file. 0.17 Wed Jun 30 18:29:30 BST 2004 Track down the cause of the problem reported by some Linux users. It's another manifestation of the core perlio bug, whereby perlio loses data if you push another layer onto a file handle with unread data. In this case the layer push is happening on some Linux systems because the stdio layer isn't supporting Perl's fast buffer snooping, presuambly because glibc has decided to use mmap() for the file. Work around is for PerlIO::gzip to push its buffering layer before it calls unread, which is a somewhat messy hack, but works. This means that all the TODO tests can be taken out, as they now pass. Fix small typo in read.t 0.18 Sun Oct 1 22:08:32 BST 2006 Patch from alexchorny [at] gmail.com (#21469) - write.t fails (I had the skip arguments wrong) TODO 2 tests in read.t which fail. I'm infering that this is because :stdio opens in text mode, and there's no way to tell it to open in binary mode. Time to consult p5p. PerlIO-gzip-0.18/MANIFEST000644 001765 001765 00000000343 10510035477 014624 0ustar00nicknick000000 000000 Changes gzip.pm gzip.xs MANIFEST Makefile.PL README t/read.t t/write.t t/ok3.gz t/ok17.gz t/ok19 t/ok21 t/ok50.gz.short t/ok54.gz.len t/ok58.gz.crc META.yml Module meta-data (added by MakeMaker) PerlIO-gzip-0.18/gzip.xs000644 001765 001765 00000111465 10510035227 015021 0ustar00nicknick000000 000000 /* -*- c -*- */ /* gzip.xs * * Copyright (C) 2001, 2002, Nicholas Clark * * You may distribute this work under the terms of either the GNU General * Public License or the Artistic License, as specified in perl's README * file. * */ #define PERL_NO_GET_CONTEXT #include "EXTERN.h" #include "perl.h" #include "XSUB.h" #include #include "perliol.h" /* auto|gzip|none lazy csum name= extra= comment= */ /* stick a buffer on layer below turn of crlf zalloc in the zs struct being non-NULL is sign that we need to tidy up */ #define GZIP_HEADERSIZE 10 #define GZIP_TEXTFLAG 0x01 #define GZIP_HAS_HEADERCRC 0x02 #define GZIP_HAS_EXTRAFIELD 0x04 #define GZIP_HAS_ORIGNAME 0x08 #define GZIP_HAS_COMMENT 0x10 /* 0x20 is encrypted, which we'll treat as if its unknown. */ #define GZIP_KNOWNFLAGS 0x1F #define LAYERGZIP_STATUS_NORMAL 0 #define LAYERGZIP_STATUS_INPUT_EOF 1 #define LAYERGZIP_STATUS_ZSTREAM_END 2 #define LAYERGZIP_STATUS_CONFUSED 3 #define LAYERGZIP_STATUS_1ST_DO_HEADER 4 #define LAYERGZIP_FLAG_GZIPHEADER 0x00 #define LAYERGZIP_FLAG_NOGZIPHEADER 0x01 /* No gzip file header */ #define LAYERGZIP_FLAG_MAYBEGZIPHEADER 0x02 /* Look for magic number */ #define LAYERGZIP_FLAG_AUTOPOP 0x03 #define LAYERGZIP_FLAG_READMODEMASK 0x03 #define LAYERGZIP_FLAG_LAZY 0x04 /* defer header check */ #define LAYERGZIP_FLAG_OURBUFFERBELOW 0x08 /* We own the buffer below us */ #define LAYERGZIP_FLAG_INFL_INIT_DONE 0x10 /* Done inflate init */ #define LAYERGZIP_FLAG_DO_CRC_AT_END 0x20 /* Check CRC at Z_STREAM_END */ #define LAYERGZIP_FLAG_DEFL_INIT_DONE 0x40 /* Done deflate init */ #define LAYERGZIP_FLAG_NO_TIMESTAMP 0x80 #define LAYERGZIP_FLAG_CLOSING_FILE 0x100 #define LAYERGZIP_GZIPHEADER_GOOD 0 #define LAYERGZIP_GZIPHEADER_ERROR 1 #define LAYERGZIP_GZIPHEADER_BADMAGIC 2 #define LAYERGZIP_GZIPHEADER_BADMETHOD 3 #define LAYERGZIP_GZIPHEADER_NOTGZIP 4 /* BEWARE. If you get this your buf pointer is now invalid */ #ifndef LAYERGZIP_DEFAULT_OS_TYPE #define LAYERGZIP_DEFAULT_OS_TYPE 255 /* "Unknown" - see rfc1952 */ #endif #define OUTSIZE 4096 #define LAYERGZIP_DEF_MEM_LEVEL 8 typedef struct { PerlIOBuf base; z_stream zs; /* zlib's struct. */ int status; /* state of the inflater */ int flags; /* bitmap */ unsigned long crc; /* ongoing CRC of data */ long time; /* timestamp to write to the header */ Bytef *outbuf; /* Our malloc'd output buffer */ int level; /* compression level for deflate */ unsigned char os_type; /* OS type flag for the header */ } PerlIOGzip; /***************************************************************************** * * Reading stuff * *****************************************************************************/ /* Logic of the header passer: buffer is where we're reading from. It may point into the fast_gets buffer of the layer below, or into our private SV. We start, if possible in the fast_gets buffer. When we exhaust it (or if we can't use it) we allocate a private SV and store everything that we've read into it. */ static SSize_t get_more (PerlIO *below, SSize_t wanted, SV **sv, unsigned char **buffer) { dTHX; /* fetch context */ SSize_t get, done, read; unsigned char *read_here; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip get_more f=%p wanted=%08"UVxf" sv=%p buffer=%p\n", below, wanted, *sv, *buffer); #endif if (!*sv) { /* We know there were not enough bytes available in the layer below's buffer. We know that we started at the beginning of it, so we can calculate how many bytes we've passed over (but not consumed, as we didn't alter the pointer and count). */ done = *buffer - (unsigned char*) PerlIO_get_ptr(below); get = done + wanted; /* Need to read the lot into our SV. */ *sv = newSVpvn("", 0); if (!*sv) return -1; read_here = (unsigned char *) SvGROW(*sv, get); *buffer = read_here + done; } else { done = SvCUR(*sv); read_here = *buffer = (unsigned char *) SvGROW(*sv, done + wanted) + done; get = wanted; /* Only need to read the next section */ } #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip get_more sv=%p buffer=%p done=%08"UVxf" read_here=%p get=%08"UVxf" \n", *sv, *buffer, done, read_here, get); #endif read = PerlIO_read (below, read_here, wanted); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip get_more read=%08"UVxf"\n", read); #endif if (read == -1) { /* Read error. Messy. Don't know what state our buffer is, and whether we should unread it. Probably not. */ SvREFCNT_dec(*sv); *sv = NULL; return read; } if (read_here == *buffer) { /* We were appending. */ SvCUR(*sv) += read; return read; } /* We were reading into the whole buffer. */ SvCUR_set(*sv, read); return read - done; } static SSize_t eat_nul (PerlIO *below, SV **sv, unsigned char **buffer) { dTHX; /* fetch context */ SSize_t munch_size = 256; /* Pick a size to read in. Should this double each loop? */ #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip eat_nul f=%p sv=%p buffer=%p\n", below, *sv, *buffer); #endif if (!*sv) { /* Buffer below supposed fast_gets. */ unsigned char *end = (unsigned char *) PerlIO_get_base(below) + PerlIO_get_bufsiz(below); unsigned char *here = *buffer; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip eat_nul here=%p end=%p\n", here, end); #endif while (here < end) { if (*here++) continue; *buffer = here; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip eat_nul found it! here=%p end=%p, returning %08" UVxf"\n", here, end, (UV) (end-here)); #endif return end-here; } *buffer = here; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip eat_nul no joy here=%p end=%p\n", here, end); #endif } #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip eat_nul about to loop\n"); #endif while (1) { unsigned char *end, *here; SSize_t avail = get_more (below, munch_size, sv, buffer); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip eat_nul sv=%p buffer=%p wanted=%08"UVxf" avail=%08"UVxf"\n", *sv, *buffer, munch_size, (UV)avail); #endif if (avail == -1 || avail == 0) return -1; end = (unsigned char *)SvEND(*sv); here = *buffer; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip eat_nul here=%p end=%p\n", here, end); #endif while (here < end) { if (*here++) continue; *buffer = here; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip eat_nul found it! here=%p end=%p, returning %08" UVxf"\n", here, end, (UV) (end-here)); #endif return end-here; } /* as *sv is not NULL, get_more doesn't use the input value of *buffer, so don't waste time setting it. We've eaten the whole SV - that's all get_more cares about. So loop and munch some more. */ } } /* gzip header is Magic number 0,1 Compression type 2 Flags 3 Time 4-7 XFlags 8 OS Code 9 */ static int check_gzip_header (PerlIO *f) { dTHX; /* fetch context */ PerlIO *below = PerlIONext(f); int code = LAYERGZIP_GZIPHEADER_GOOD; SSize_t avail; SV *temp = NULL; unsigned char *header; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header f=%p below=%p fast_gets=%d\n", f, below, PerlIO_fast_gets(below)); #endif if (PerlIO_fast_gets(below)) { avail = PerlIO_get_cnt(below); if (avail <= 0) { avail = PerlIO_fill(below); if (avail == 0) avail = PerlIO_get_cnt(below); else avail = 0; } } else avail = 0; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header avail=%08"UVxf"\n", (UV)avail); #endif if (avail >= GZIP_HEADERSIZE) header = (unsigned char *) PerlIO_get_ptr(below); else { temp = newSVpvn("", 0); if (!temp) return LAYERGZIP_GZIPHEADER_ERROR; header = (unsigned char *) SvGROW(temp, GZIP_HEADERSIZE); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header below=%p header=%p size %d\n", below, header, GZIP_HEADERSIZE); #endif avail = PerlIO_read(below,header,GZIP_HEADERSIZE); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header read=%08"UVxf"\n", (UV)avail); #endif SvCUR_set(temp, avail); if (avail < 0) { code = LAYERGZIP_GZIPHEADER_ERROR; goto bad; } else if (avail < 2 ) { code = LAYERGZIP_GZIPHEADER_BADMAGIC; goto bad; } else if (avail < GZIP_HEADERSIZE) { /* Too short, but if magic number isn't there, it's not a gzip file */ if (header[0] == 0x1f && header[1] == 0x8b) { /* It's trying to be a gzip file. */ code = LAYERGZIP_GZIPHEADER_ERROR; } else code = LAYERGZIP_GZIPHEADER_BADMAGIC; goto bad; } } #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header header=%p\n", header); #endif avail -= GZIP_HEADERSIZE; if (header[0] != 0x1f || header[1] != 0x8b) code = LAYERGZIP_GZIPHEADER_BADMAGIC; else if (header[2] != Z_DEFLATED) code = LAYERGZIP_GZIPHEADER_BADMETHOD; else if (header[3] & !GZIP_KNOWNFLAGS) code = LAYERGZIP_GZIPHEADER_ERROR; else { /* Check the header, and skip any extra fields */ int flags = header[3]; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header flags=%02X\n", flags); #endif header += GZIP_HEADERSIZE; if (flags & GZIP_HAS_EXTRAFIELD) { Size_t len; if (avail < 2) { /* Need some more */ avail = get_more (below, 2, &temp, &header); if (avail < 2) { code = LAYERGZIP_GZIPHEADER_ERROR; goto bad; } } /* 2 byte little endian quantity, which we now know is in the buffer. */ len = header[0] | (header[1] << 8); header += 2; avail -= 2; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header header=%p avail=%08"UVxf " extra len=%d\n", header, (UV)avail, (int)len); #endif if (avail < len) { /* Need some more */ avail = get_more (below, len, &temp, &header); if (avail < len) { code = LAYERGZIP_GZIPHEADER_ERROR; goto bad; } } header += len; avail -= len; } if (flags & GZIP_HAS_ORIGNAME) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header header=%p avail=%08"UVxf " has origname\n", header, (UV)avail); #endif avail = eat_nul (below, &temp, &header); if (avail < 0) { code = LAYERGZIP_GZIPHEADER_ERROR; goto bad; } } if (flags & GZIP_HAS_COMMENT) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header header=%p avail=%08"UVxf " has comment\n", header, (UV)avail); #endif avail = eat_nul (below, &temp, &header); if (avail < 0) { code = LAYERGZIP_GZIPHEADER_ERROR; goto bad; } } if (flags & GZIP_HAS_HEADERCRC) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header header=%p avail=%08"UVxf " has header CRC\n", header, (UV)avail); #endif if (avail < 2) { /* Need some more */ avail = get_more (below, 2, &temp, &header); if (avail < 2) { code = LAYERGZIP_GZIPHEADER_ERROR; goto bad; } } header += 2; avail -= 2; } } if (code == LAYERGZIP_GZIPHEADER_GOOD) { /* Adjust the pointer here. or free the SV */ if (temp) { SSize_t unread; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header finished. unreading header=%p " "avail=%08"UVxf"\n", header, (UV)avail); #endif if (avail) { if (!(PerlIOBase(below)->flags & PERLIO_F_FASTGETS)) { #if DEBUG_LAYERGZIP PerlIO_debug("check_gzip_header HACK around core PerlIO bug\n"); #endif if (!PerlIO_push(aTHX_ below,&PerlIO_perlio,"r",&PL_sv_undef)) { #if DEBUG_LAYERGZIP PerlIO_debug("check_gzip_header failed to push new layer\n"); #endif code = LAYERGZIP_GZIPHEADER_ERROR; goto bad; } PerlIOSelf(f,PerlIOGzip)->flags |= LAYERGZIP_FLAG_OURBUFFERBELOW; below = PerlIONext(f); } unread = PerlIO_unread (below, header, avail); if (unread != avail) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header finished. only unread %08"UVxf"\n", unread); #endif code = LAYERGZIP_GZIPHEADER_ERROR; } } SvREFCNT_dec(temp); } else { PerlIO_debug("PerlIOGzip check_gzip_header finished. setting ptrcnt " "header=%p avail=%08"UVxf"\n", header, (UV)avail); PerlIO_set_ptrcnt(below, (STDCHAR *) header, avail); } } else { /* Unread the whole the SV. Maybe I should try to seek first. */ bad: if (temp) { STRLEN len; STDCHAR *ptr = SvPV(temp, len); PerlIOGzip *g = PerlIOSelf(f,PerlIOGzip); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header failed. unreading ptr=%p len=%08"UVxf"\n", ptr, (UV)len); #endif if (((g->flags & LAYERGZIP_FLAG_READMODEMASK) == LAYERGZIP_FLAG_MAYBEGZIPHEADER) && !(PerlIOBase(below)->flags & PERLIO_F_FASTGETS)) { #if DEBUG_LAYERGZIP PerlIO_debug("check_gzip_header HACK around core PerlIO bug\n"); #endif if (PerlIO_push(aTHX_ below,&PerlIO_perlio,"r",&PL_sv_undef)) { g->flags |= LAYERGZIP_FLAG_OURBUFFERBELOW; below = PerlIONext(f); } else { #if DEBUG_LAYERGZIP PerlIO_debug("check_gzip_header failed to push new layer\n"); #endif } } PerlIO_unread (below, ptr, len); SvREFCNT_dec(temp); } if (code != LAYERGZIP_GZIPHEADER_BADMAGIC) PerlIOBase(f)->flags |= PERLIO_F_ERROR; } return code; } static int check_gzip_header_and_init (PerlIO *f) { dTHX; /* fetch context */ PerlIOGzip *g = PerlIOSelf(f,PerlIOGzip); int code; z_stream *z = &g->zs; PerlIO *below = PerlIONext(f); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header_and_init f=%p below=%p flags=%02X\n", f, below, g->flags); #endif if ((g->flags & LAYERGZIP_FLAG_READMODEMASK) != LAYERGZIP_FLAG_NOGZIPHEADER) { g->flags |= LAYERGZIP_FLAG_DO_CRC_AT_END; code = check_gzip_header (f); if (code != LAYERGZIP_GZIPHEADER_GOOD) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip check_gzip_header_and_init code=%d\n", code); #endif if (code != LAYERGZIP_GZIPHEADER_BADMAGIC) return code; else { int mode = g->flags & LAYERGZIP_FLAG_READMODEMASK; if (mode == LAYERGZIP_FLAG_MAYBEGZIPHEADER) { /* There wasn't a magic number. But flags say that's OK. And we won't be checking the CRC at the end */ g->flags &= ~LAYERGZIP_FLAG_DO_CRC_AT_END; } else if (mode == LAYERGZIP_FLAG_AUTOPOP) { /* There wasn't a magic number. Muahahaha. Treat it as a normal file by popping ourself. */ return LAYERGZIP_GZIPHEADER_NOTGZIP; } else { return code; } } } } g->status = LAYERGZIP_STATUS_NORMAL; /* (any header validated) */ if (PerlIOBase(below)->flags & PERLIO_F_FASTGETS) { #if DEBUG_LAYERGZIP PerlIO_debug("check_gzip_header_and_init :-). f=%p %s fl=%08X\n", below, PerlIOBase(below)->tab->name, (int)PerlIOBase(below)->flags); #endif } else { /* Bah. Layer below us doesn't support FASTGETS. So we need to add a layer to provide our input buffer. */ #if DEBUG_LAYERGZIP PerlIO_debug("check_gzip_header_and_init :-(. f=%p %s fl=%08X\n", below, PerlIOBase(below)->tab->name, (int) PerlIOBase(below)->flags); #endif if (!PerlIO_push(aTHX_ below,&PerlIO_perlio,"r",&PL_sv_undef)) return LAYERGZIP_GZIPHEADER_ERROR; g->flags |= LAYERGZIP_FLAG_OURBUFFERBELOW; below = PerlIONext(f); } assert (PerlIO_fast_gets(below)); z->next_in = (Bytef *) PerlIO_get_base(below); z->avail_in = z->avail_out = 0; z->zalloc = (alloc_func) 0; z->zfree = (free_func) 0; z->opaque = 0; /* zlib docs say that next_out and avail_out are unchanged by init. Implication is that they don't yet need to be initialised. */ if (inflateInit2(z, -MAX_WBITS) != Z_OK) { #if DEBUG_LAYERGZIP PerlIO_debug("check_gzip_header_and_init failed to inflateInit2"); #endif if (g->flags & LAYERGZIP_FLAG_OURBUFFERBELOW) { g->flags &= ~LAYERGZIP_FLAG_OURBUFFERBELOW; PerlIO_pop(aTHX_ below); } return LAYERGZIP_GZIPHEADER_ERROR; } g->flags |= LAYERGZIP_FLAG_INFL_INIT_DONE; if (g->flags & LAYERGZIP_FLAG_DO_CRC_AT_END) g->crc = crc32(0L, Z_NULL, 0); return LAYERGZIP_GZIPHEADER_GOOD; } /***************************************************************************** * * Writing stuff * *****************************************************************************/ static int write_gzip_header (PerlIO *f) { dTHX; /* fetch context */ PerlIOGzip *g = PerlIOSelf(f,PerlIOGzip); char header[10]; unsigned long timestamp = 0; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip write_gzip_header f=%p flags=%02X\n", f, g->flags); #endif header[0] = 0x1f; header[1] = 0x8b; header[2] = Z_DEFLATED; header[3] = 0; /* TEXT, CRC, EXTRA, NAME, COMMENT */ if (!(g->flags & LAYERGZIP_FLAG_NO_TIMESTAMP)) { timestamp = g->time; if (timestamp == 0) { /* time_t is signed, I want unsigned for my shifting below */ time_t now = time(NULL); timestamp = (now == -1) ? 0 : now; } } /* All quantities are little endian. */ header[4] = timestamp & 0xFF; header[5] = (timestamp >> 8) & 0xFF; header[6] = (timestamp >> 16) & 0xFF; header[7] = (timestamp >> 24) & 0xFF; header[8] = 0; /* XFlags can be zero. */ header[9] = g->os_type; if (PerlIO_write(PerlIONext(f), header, sizeof(header)) != sizeof(header)) return -1; return 0; } static int write_gzip_header_and_init (PerlIO *f) { dTHX; /* fetch context */ PerlIOGzip *g = PerlIOSelf(f,PerlIOGzip); int code; z_stream *z = &g->zs; #if DEBUG_LAYERGZIP PerlIO *below = PerlIONext(f); PerlIO_debug("PerlIOGzip write_gzip_header_and_init f=%p below=%p flags=%02X\n", f, below, g->flags); #endif if ((g->flags & LAYERGZIP_FLAG_READMODEMASK) != LAYERGZIP_FLAG_NOGZIPHEADER) { g->flags |= LAYERGZIP_FLAG_DO_CRC_AT_END; code = write_gzip_header (f); if (code) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip write_gzip_header_and_init code=%d\n", code); #endif return code; } } g->status = LAYERGZIP_STATUS_NORMAL; Renew(g->outbuf, OUTSIZE, Bytef); z->next_in = (Bytef *) NULL; z->avail_in = 0; z->next_out = (Bytef *) g->outbuf; z->avail_out = OUTSIZE; z->zalloc = (alloc_func) 0; z->zfree = (free_func) 0; z->opaque = 0; /* zlib docs say that next_out and avail_out are unchanged by init. Implication is that they don't yet need to be initialised. */ if (deflateInit2(z, g->level, Z_DEFLATED, -MAX_WBITS, LAYERGZIP_DEF_MEM_LEVEL, Z_DEFAULT_STRATEGY) != Z_OK) { #if DEBUG_LAYERGZIP PerlIO_debug("write_gzip_header_and_init failed to deflateInit2"); #endif return LAYERGZIP_GZIPHEADER_ERROR; } g->flags |= LAYERGZIP_FLAG_DEFL_INIT_DONE; if (g->flags & LAYERGZIP_FLAG_DO_CRC_AT_END) g->crc = crc32(0L, Z_NULL, 0); return LAYERGZIP_GZIPHEADER_GOOD; } /***************************************************************************** * * Methods * *****************************************************************************/ static SV * PerlIOGzip_getarg(pTHX_ PerlIO *f, CLONE_PARAMS *param, int flags) { PerlIOGzip *g = PerlIOSelf(f,PerlIOGzip); SV *sv; register const char *mode; switch (g->flags & LAYERGZIP_FLAG_READMODEMASK) { case LAYERGZIP_FLAG_GZIPHEADER: if (!(g->flags & LAYERGZIP_FLAG_AUTOPOP)) { /* Default */ sv = newSVpvn("",0); return sv ? sv : &PL_sv_undef; } mode = "gzip"; break; case LAYERGZIP_FLAG_NOGZIPHEADER: mode = "none"; break; case LAYERGZIP_FLAG_MAYBEGZIPHEADER: mode = "auto"; break; case LAYERGZIP_FLAG_LAZY: mode = "lazy"; break; } sv = newSVpv (mode, 4); if (!sv) return &PL_sv_undef; if (g->flags & LAYERGZIP_FLAG_AUTOPOP) sv_catpv (sv, ",autopop"); return sv; } PerlIO_funcs PerlIO_gzip; static IV PerlIOGzip_pushed(pTHX_ PerlIO *f, const char *mode, SV *arg, PerlIO_funcs *tab) { PerlIOGzip *g = PerlIOSelf(f,PerlIOGzip); IV code = 0; STRLEN len; const char *argstr; if (arg && SvOK(arg)) argstr = SvPV(arg, len); else { argstr = NULL; len = 0; } #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_pushed f=%p %s %s fl=%08"UVxf" g=%p\n", f,PerlIOBase(f)->tab->name,(mode) ? mode : "(Null)", PerlIOBase(f)->flags, g); if (argstr) PerlIO_debug(" len=%d argstr=%.*s\n", (int)len, (int)len, argstr); #endif code = PerlIOBuf_pushed(aTHX_ f,mode,&PL_sv_undef,&PerlIO_gzip); if (code) return code; g->flags = LAYERGZIP_FLAG_GZIPHEADER; g->status = LAYERGZIP_STATUS_1ST_DO_HEADER; g->outbuf = NULL; g->level = Z_DEFAULT_COMPRESSION; g->os_type = LAYERGZIP_DEFAULT_OS_TYPE; if (len) { const char *end = argstr + len; while (1) { int arg_bad = 0; const char *comma = memchr (argstr, ',', end - argstr); STRLEN this_len = comma ? (comma - argstr) : (end - argstr); #if DEBUG_LAYERGZIP PerlIO_debug(" processing len=%d argstr=%.*s\n", (int)this_len, (int)this_len, argstr); #endif if (this_len == 4) { if (memEQ (argstr, "none", 4)) { g->flags &= ~LAYERGZIP_FLAG_READMODEMASK; g->flags |= LAYERGZIP_FLAG_NOGZIPHEADER; } else if (memEQ (argstr, "auto", 4)) { g->flags &= ~LAYERGZIP_FLAG_READMODEMASK; g->flags |= LAYERGZIP_FLAG_MAYBEGZIPHEADER; } else if (memEQ (argstr, "lazy", 4)) { g->flags &= ~LAYERGZIP_FLAG_READMODEMASK; g->flags |= LAYERGZIP_FLAG_LAZY; } else if (memEQ (argstr, "gzip", 4)) { g->flags &= ~LAYERGZIP_FLAG_READMODEMASK; g->flags |= LAYERGZIP_FLAG_GZIPHEADER; } else arg_bad = 1; } else if (this_len == 7) { if (memEQ (argstr, "autopop", 7)) { g->flags &= ~LAYERGZIP_FLAG_READMODEMASK; g->flags |= LAYERGZIP_FLAG_AUTOPOP; } else arg_bad = 1; } if (arg_bad) { dTHX; /* fetch context */ /* XXX This will mangle UTF8 in error messages */ Perl_warn(aTHX_ "perlio: layer :gzip, unrecognised argument \"%.*s\"", (int)this_len, argstr); } if (!comma) break; argstr = comma + 1; } } if (PerlIOBase(f)->flags & PERLIO_F_CANWRITE) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_pushed f=%p fl=%08"UVxf" including write (%X)\n", f, PerlIOBase(f)->flags, PERLIO_F_CANWRITE); #endif /* autopop trumps writing. */ if ((g->flags & LAYERGZIP_FLAG_READMODEMASK) == LAYERGZIP_FLAG_AUTOPOP) { PerlIO_pop(aTHX_ f); return 0; } else if ((g->flags & LAYERGZIP_FLAG_READMODEMASK) == LAYERGZIP_FLAG_MAYBEGZIPHEADER) { /* This makes no sense for writing. */ return -1; } if (PerlIOBase(f)->flags & PERLIO_F_CANREAD) return -1; if (!(g->flags & LAYERGZIP_FLAG_LAZY) || ((g->flags & LAYERGZIP_FLAG_READMODEMASK) == LAYERGZIP_FLAG_NOGZIPHEADER)) { code = write_gzip_header_and_init (f); if (code != LAYERGZIP_GZIPHEADER_GOOD) return -1; } } else if (PerlIOBase(f)->flags & PERLIO_F_CANREAD) { /* autopop trumps lazy. (basically, it's going to confuse upstream far too much if on the first read we pop our buffered layer off to reveal an unbuffered layer below us) */ if (!(g->flags & LAYERGZIP_FLAG_LAZY) || ((g->flags & LAYERGZIP_FLAG_READMODEMASK) == LAYERGZIP_FLAG_AUTOPOP)) { code = check_gzip_header_and_init (f); if (code != LAYERGZIP_GZIPHEADER_GOOD) { if (code == LAYERGZIP_GZIPHEADER_NOTGZIP) { PerlIO_pop(aTHX_ f); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_pushed just popped f=%p\n", f); #endif return 0; } return -1; } } } else { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_pushed f=%p fl=%08"UVxf " neither read nor write\n", f, PerlIOBase(f)->flags); #endif return -1; } if (g->flags & LAYERGZIP_FLAG_DO_CRC_AT_END) g->crc = crc32(0L, Z_NULL, 0); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_pushed f=%p g->status=%d g->flags=%02X\n", f, g->status, g->flags); #endif return 0; } static IV PerlIOGzip_popped(pTHX_ PerlIO *f) { PerlIOGzip *g = PerlIOSelf(f,PerlIOGzip); IV code = 0; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_popped f=%p %s flags=%02X\n", f,PerlIOBase(f)->tab->name, g->flags); #endif if (g->flags & LAYERGZIP_FLAG_INFL_INIT_DONE) { g->flags &= ~LAYERGZIP_FLAG_INFL_INIT_DONE; code = inflateEnd (&(g->zs)) == Z_OK ? 0 : -1; } if (g->flags & LAYERGZIP_FLAG_DEFL_INIT_DONE) { g->flags &= ~LAYERGZIP_FLAG_DEFL_INIT_DONE; code = deflateEnd (&(g->zs)); PerlIO_debug("PerlIOGzip_popped code=%d\n", code); code = (code == Z_OK) ? 0 : -1; } #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_popped code=%d\n", code); #endif Safefree (g->outbuf); g->outbuf = NULL; if (g->flags & LAYERGZIP_FLAG_OURBUFFERBELOW) { PerlIO *below = PerlIONext(f); assert (below); /* This must be a layer, or our flags a screwed, or someone else has been screwing with our buffer. */ PerlIO_pop(aTHX_ below); g->flags &= ~LAYERGZIP_FLAG_OURBUFFERBELOW; } #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_popped f=%p %s %d\n", f,PerlIOBase(f)->tab->name, (int)code); #endif return code; } static IV PerlIOGzip_close(pTHX_ PerlIO *f) { IV code = 0; PerlIOGzip *g = PerlIOSelf(f,PerlIOGzip); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_close f=%p %s za=%p g->status=%d\n", f,PerlIOBase(f)->tab->name, g->zs.zalloc, (int) g->status); #endif /* Signal to anything (eg the flush()) that the sky *is* falling down. Can't simply move the status to EOF, as status on "1ST_DO_HEADER" is used by lazy write to mean "write the gzip header on first write" and there's a real chance (certainly in the regression tests :-)) that we have all the data to compress ready in the buffer with nothing actually deflated right now at close time. */ g->flags |= LAYERGZIP_FLAG_CLOSING_FILE; if ((g->flags & LAYERGZIP_FLAG_DEFL_INIT_DONE) || (PerlIOBase(f)->flags & PERLIO_F_WRBUF)) { code = PerlIO_flush(f); } if (g->flags & LAYERGZIP_FLAG_DO_CRC_AT_END) { if ((PerlIOBase(f)->flags & PERLIO_F_CANREAD) && (g->status == LAYERGZIP_STATUS_ZSTREAM_END)) { unsigned char buffer[8]; PerlIO *below = PerlIONext(f); SSize_t got = PerlIO_read(below,buffer,8); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_close g->crc=%08"UVxf" next=%p got=%d\n", g->crc, below, (int)got); #endif if (got != 8) code = -1; else { U32 crc = buffer[0] | (buffer[1] << 8) | (buffer[2] << 16) | (buffer[3] << 24); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_close crc=%08"UVxf"\n", crc); #endif if (crc != (g->crc & 0xFFFFFFFF)) code = -1; else { U32 len = buffer[4] | (buffer[5] << 8) | (buffer[6] << 16) | (buffer[7] << 24); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_close len=%08"UVxf" total=%08"UVxf"\n", len, g->zs.total_out); #endif if (len != (g->zs.total_out & 0xFFFFFFFF)) code = -1; } } } else if ((PerlIOBase(f)->flags & PERLIO_F_CANWRITE) && (code == 0)) { /* Don't come in here if the flush failed (ie code != 0). */ unsigned char buffer[8]; PerlIO *below = PerlIONext(f); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_close crc=%08"UVxf" len=%08"UVxf"\n", g->crc, g->zs.total_in); #endif buffer[0] = g->crc & 0xFF; buffer[1] = (g->crc >> 8) & 0xFF; buffer[2] = (g->crc >> 16) & 0xFF; buffer[3] = (g->crc >> 24) & 0xFF; buffer[4] = g->zs.total_in & 0xFF; buffer[5] = (g->zs.total_in >> 8) & 0xFF; buffer[6] = (g->zs.total_in >> 16) & 0xFF; buffer[7] = (g->zs.total_in >> 24) & 0xFF; code = (PerlIO_write(below,buffer,8) == 8 ? 0 : -1); } } if (g->flags & (LAYERGZIP_FLAG_DEFL_INIT_DONE | LAYERGZIP_FLAG_INFL_INIT_DONE | LAYERGZIP_FLAG_OURBUFFERBELOW)) code |= PerlIOGzip_popped(aTHX_ f); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_close f=%p %d\n", f, (int)code); #endif #if PERL_VERSION > 8 || PERL_SUBVERSION > 0 /* 5.8.1 correctly exports PerlIOBuf_close */ code |= PerlIOBuf_close(aTHX_ f); /* Call it whatever. */ #else /* 5.8.0 doesn't, so platforms such as AIX and Windows can't see it. Inline it here: */ code |= PerlIOBase_close(aTHX_ f); { PerlIOBuf *b = PerlIOSelf(f, PerlIOBuf); if (b->buf && b->buf != (STDCHAR *) & b->oneword) { Safefree(b->buf); } b->buf = NULL; b->ptr = b->end = b->buf; PerlIOBase(f)->flags &= ~(PERLIO_F_RDBUF | PERLIO_F_WRBUF); } #endif return code ? -1 : 0; /* Only returns 0 if both succeeded */ } static IV PerlIOGzip_fill(pTHX_ PerlIO *f) { PerlIOGzip *g = PerlIOSelf(f,PerlIOGzip); PerlIOBuf *b = PerlIOSelf(f,PerlIOBuf); PerlIO *n = PerlIONext(f); SSize_t avail; int status = Z_OK; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_fill f=%p g->status=%d\n", f, g->status); #endif if (g->status == LAYERGZIP_STATUS_CONFUSED || g->status == LAYERGZIP_STATUS_ZSTREAM_END) return -1; /* Error state, or EOF has been seen. */ if (g->status == LAYERGZIP_STATUS_1ST_DO_HEADER) { if (check_gzip_header_and_init (f) != LAYERGZIP_GZIPHEADER_GOOD) { g->status = LAYERGZIP_STATUS_CONFUSED; PerlIOBase(f)->flags |= PERLIO_F_ERROR; return -1; } } if (!b->buf) PerlIO_get_base(f); /* allocate via vtable */ b->ptr = b->end = b->buf; g->zs.next_out = (Bytef *) b->buf; g->zs.avail_out = b->bufsiz; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_fill next_out=%p avail_out=%08x status=%d\n", g->zs.next_out,g->zs.avail_out, g->status); #endif assert (PerlIO_fast_gets(n)); /* loop while we see no output. */ while (g->zs.next_out == (Bytef *) b->buf) { /* If we have run out of input then read some more. */ avail = PerlIO_get_cnt(n); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_fill avail=%08"UVxf" status=%d\n", (UV)avail, g->status); #endif /* Someone is going to give us compressed input on a tty some day, and there we'll only see EOF once, before a read will block again. So if we see EOF, remember it. inflate will stall with an error if more input were really needed and this EOF turns out to have been premature. */ if (avail <= 0 && (g->status != LAYERGZIP_STATUS_INPUT_EOF)) { avail = PerlIO_fill(n); if (avail == 0) { avail = PerlIO_get_cnt(n); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_fill refill, avail=%08"UVxf"\n",(UV)avail); #endif } else { /* To make this non blocking friendly would we need to change this? */ if (PerlIO_error(n)) { /* I'm assuming that the error on the input stream is persistent, and that as there is going to be output space, I'll get Z_BUF_ERROR if no progress is possible because I've used all the input I got before the error. */ avail = 0; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_fill error, avail=%08"UVxf"\n",(UV)avail); #endif } else if (PerlIO_eof(n)) { g->status = LAYERGZIP_STATUS_INPUT_EOF; avail = 0; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_fill input eof, avail=%08"UVxf"\n",(UV)avail); #endif } else { avail = 0; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_fill how did I get here?, avail=%08"UVxf "\n",(UV)avail); #endif } } } g->zs.avail_in = avail; g->zs.next_in = (Bytef *) PerlIO_get_ptr(n); /* Z_SYNC_FLUSH to get as much output as possible if there's no input left. This may be pointless, but I'm hoping that this is enough to make non- blocking work by forcing as much output as possible if the input blocked. */ #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_fill preinf next_in=%p avail_in=%08x\n", g->zs.next_in,g->zs.avail_in); #endif status = inflate (&(g->zs), avail ? Z_NO_FLUSH : Z_SYNC_FLUSH); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_fill postinf next_in=%p avail_in=%08x status=%d\n", g->zs.next_in,g->zs.avail_in, status); #endif /* And we trust that zlib gets these two correct */ PerlIO_set_ptrcnt(n, (STDCHAR *) g->zs.next_in, g->zs.avail_in); if (status != Z_OK) { if (status == Z_STREAM_END) { g->status = LAYERGZIP_STATUS_ZSTREAM_END; PerlIOBase(f)->flags |= PERLIO_F_EOF; } else { PerlIOBase(f)->flags |= PERLIO_F_ERROR; } break; } } /* loop until we read enough data. hopefully not literally forever. Z_BUF_ERROR should be generated if there is a buffer problem. Z_OK will only appear if there is progress - ie either input is consumed (it must be available for this) or output is generated (there must be space for this). Hence not consuming any input whilst also not generating any more output is an error we will spot and barf on. */ #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_fill leaving next_out=%p avail_out=%08x\n", g->zs.next_out,g->zs.avail_out); #endif if (g->zs.next_out != (Bytef *) b->buf) { /* Success if we got at least one byte. :-) */ b->end = (STDCHAR *) g->zs.next_out; /* Update the crc */ if (g->flags & LAYERGZIP_FLAG_DO_CRC_AT_END) g->crc = crc32(g->crc, (Bytef *) b->buf, b->end - b->buf); PerlIOBase(f)->flags |= PERLIO_F_RDBUF; return 0; } return -1; } IV PerlIOGzip_flush(pTHX_ PerlIO *f) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_flush f=%p fl=%08"UVxf"\n", f, PerlIOBase(f)->flags); #endif if (PerlIOBase(f)->flags & PERLIO_F_ERROR) return -1; if (PerlIOBase(f)->flags & PERLIO_F_CANWRITE) { /* Must come in here even if there's no buffered data, in case we need to finish */ PerlIOGzip *g = PerlIOSelf(f,PerlIOGzip); PerlIOBuf *b = PerlIOSelf(f,PerlIOBuf); z_stream *z = &g->zs; if (g->status == LAYERGZIP_STATUS_1ST_DO_HEADER) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_flush 1st do header b->buf=%p b->ptr=%p\n", b->buf, b->ptr); #endif /* OK. In lazy mode. */ if (b->ptr == b->buf) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_flush no data, write no header\n"); #endif g->status = LAYERGZIP_STATUS_ZSTREAM_END; return 0; } if (write_gzip_header_and_init (f)) { g->status = LAYERGZIP_STATUS_CONFUSED; PerlIOBase(f)->flags |= PERLIO_F_ERROR; return -1; } } z->next_in = (Bytef *) b->buf; z->avail_in = b->ptr - b->buf; if (g->flags & LAYERGZIP_FLAG_DO_CRC_AT_END) g->crc = crc32(g->crc, z->next_in, z->avail_in); while (z->avail_in || ((g->flags & LAYERGZIP_FLAG_CLOSING_FILE) && g->status == LAYERGZIP_STATUS_NORMAL)) { int status; #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_flush predef next_in= %p avail_in= %08x\n" " next_out=%p avail_out=%08x" "\n", g->zs.next_in, g->zs.avail_in, g->zs.next_out, g->zs.avail_out); #endif status = deflate (&(g->zs), (g->flags & LAYERGZIP_FLAG_CLOSING_FILE) ? Z_FINISH : 0); #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_flush postdef next_in= %p avail_in= %08x\n" " next_out=%p avail_out=%08x " "status=%d\n", g->zs.next_in,g->zs.avail_in, g->zs.next_out,g->zs.avail_out, status); #endif if (status == Z_STREAM_END) { assert (z->avail_in == 0); g->status = LAYERGZIP_STATUS_ZSTREAM_END; } if (status == Z_OK || status == Z_STREAM_END) { if (z->avail_out == 0 || status == Z_STREAM_END) { PerlIO *n = PerlIONext(f); SSize_t avail = OUTSIZE - z->avail_out; STDCHAR *where = (STDCHAR *) g->outbuf; while (avail > 0) { SSize_t count = PerlIO_write(n, where, avail); if (count > 0) { where += count; avail -= count; } else if (count < 0 || PerlIO_error(n)) { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_flush write failed, data lost\n"); #endif PerlIOBase(f)->flags |= PERLIO_F_ERROR; return -1; #if DEBUG_LAYERGZIP } else { PerlIO_debug("PerlIOGzip_flush wrote 0 - aren't we spinning?\n"); #endif } } z->next_out = (Bytef *) g->outbuf; z->avail_out = OUTSIZE; } } else { #if DEBUG_LAYERGZIP PerlIO_debug("PerlIOGzip_flush deflate failed %d, data lost\n", status); #endif PerlIOBase(f)->flags |= PERLIO_F_ERROR; return -1; } } b->ptr = b->end = b->buf; PerlIOBase(f)->flags &= ~(PERLIO_F_WRBUF); if (PerlIO_flush(PerlIONext(f)) != 0) return -1; } return 0; } /* Hmm. These need to be public? */ static IV PerlIOGzip_seek_fail(pTHX_ PerlIO *f, Off_t offset, int whence) { return -1; } PerlIO * PerlIOGzip_dup(pTHX_ PerlIO *f, PerlIO *o, CLONE_PARAMS *param, int flags) { croak ("PerlIO::gzip can't yet clone active layers"); return NULL; } PerlIO_funcs PerlIO_gzip = { sizeof(PerlIO_funcs), "gzip", sizeof(PerlIOGzip), PERLIO_K_BUFFERED, /* XXX destruct */ PerlIOGzip_pushed, PerlIOGzip_popped, PerlIOBuf_open, PerlIOBase_binmode, PerlIOGzip_getarg, PerlIOBase_fileno, PerlIOGzip_dup, PerlIOBuf_read, PerlIOBuf_unread, /* I am not convinced that this is going to work */ PerlIOBuf_write, PerlIOGzip_seek_fail, /* PerlIOBuf_seek, */ PerlIOBuf_tell, PerlIOGzip_close, PerlIOGzip_flush, /* PerlIOBuf_flush, Hmm. open() expects to flush :-( */ PerlIOGzip_fill, PerlIOBase_eof, PerlIOBase_error, PerlIOBase_clearerr, PerlIOBase_setlinebuf, PerlIOBuf_get_base, PerlIOBuf_bufsiz, PerlIOBuf_get_ptr, PerlIOBuf_get_cnt, PerlIOBuf_set_ptrcnt, }; MODULE = PerlIO::gzip PACKAGE = PerlIO::gzip PROTOTYPES: DISABLE BOOT: PerlIO_define_layer(aTHX_ &PerlIO_gzip); PerlIO-gzip-0.18/Makefile.PL000644 001765 001765 00000004411 10510031371 015432 0ustar00nicknick000000 000000 #!perl -w use 5.008; # Stop perl 5.005 or earler barfing on the v string # require 5.7.0; use ExtUtils::MakeMaker; use Config; unless ($Config{useperlio} eq 'define' and $Config{usesfio} eq 'false') { die < 'PerlIO::gzip', 'VERSION_FROM' => 'gzip.pm', # finds $VERSION 'PREREQ_PM' => {}, # e.g., Module::Name => 1.1 ABSTRACT_FROM => 'gzip.pm', # retrieve abstract from module AUTHOR => 'Nicholas Clark ', 'LIBS' => ['-lz'], 'DEFINE' => $ostype, ((eval $ExtUtils::MakeMaker::VERSION > 6.30) ? (LICENSE => 'perl') : ()), # 'DEFINE' => '-g -Wall -DDEBUG_LAYERGZIP', ); # Check out Radioactive::Decay at http://belfast.pm.org/Modules/ # It might be more useful than this module :-) __END__ OS types, from RFC 1952 0 - FAT filesystem (MS-DOS, OS/2, NT/Win32) 1 - Amiga 2 - VMS (or OpenVMS) 3 - Unix 4 - VM/CMS 5 - Atari TOS 6 - HPFS filesystem (OS/2, NT) 7 - Macintosh 8 - Z-System 9 - CP/M 10 - TOPS-20 11 - NTFS filesystem (NT) 12 - QDOS 13 - Acorn RISCOS 255 - unknown PerlIO-gzip-0.18/README000644 001765 001765 00000003237 10510001362 014342 0ustar00nicknick000000 000000 PerlIO::gzip version 0.18 ========================= A layer for the PerlIO system to transparently gzip/gunzip files. **DON'T** trust it with your data. YOU NEED PERL 5.8.0 or later. INSTALLATION To install this module type the following: perl Makefile.PL make make test echo I know that I am using this at my own risk make install That echo statement is particularly important. DEPENDENCIES This module requires perl 5.8.0 or later compiler with PerlIO. (which is now the default), and the zlib compression library. BUGS Threads - Currenly this module isn't threadsafe as it can't clone an open layer. flush() doesn't really flush, just empty the buffer TODO Lots: Check that concatenating to files works. Add a way of reading multiple contactenated gzip files. Test more Add an open method so that things we don't like (read/write opens) are faulted early (ie before a file is created on disk) and so that we add "b" flag to the open for the layer below. Test more Rewrite flush to really flush, and write a PerlIOGzip_write() rather than rely on PerlIOBuf_write, which calls flush() to empty the buffer each time. Test more Cope with crlf systems. Should this layer turn off crlf on the layer below and turn it on on itself? Cope with utf8. Should the utf8 flag be hiked above us? Test more Worry about threads. Writing headers with fancy stuff (filename, comment, extra info, header crc) What to about the "TEXT" flag in the gzip header Test more COPYRIGHT AND LICENCE You may distribute this work under the terms of either the GNU General Public License or the Artistic License, as specified in perl's README file. Copyright © 2001-2004, 2006 Nicholas Clark PerlIO-gzip-0.18/META.yml000644 001765 001765 00000000676 10510035477 014755 0ustar00nicknick000000 000000 --- #YAML:1.0 name: PerlIO-gzip version: 0.18 abstract: Perl extension to provide a PerlIO layer to gzip/gunzip license: perl generated_by: ExtUtils::MakeMaker version 6.30_02 author: Nicholas Clark distribution_type: module requires: meta-spec: url: ; version: 1.1 PerlIO-gzip-0.18/t/ok19000644 001765 001765 00000000010 10357731317 014440 0ustar00nicknick000000 000000 ËÏV0´äPerlIO-gzip-0.18/t/ok50.gz.short000644 001765 001765 00000000031 10357731317 016213 0ustar00nicknick000000 000000 ‹ÔžŽ:ËÏV05àØ ƒPerlIO-gzip-0.18/t/ok54.gz.len000644 001765 001765 00000000037 10357731317 015644 0ustar00nicknick000000 000000 ‹3ŸŽ:ok54ËÏV05áÜÈïPerlIO-gzip-0.18/t/read.t000644 001765 001765 00000015041 10510027072 015027 0ustar00nicknick000000 000000 #!perl -w # Before `make install' is performed this script should be runnable with # `make test'. After `make install' it should work as `perl test.pl' ######################### use strict; use warnings; use Test::More tests => 301; use File::Compare; # This is standard in all distributions that have layers. use File::Spec; use Config; use PerlIO::gzip; ok(1, "Does it even load?"); # If we made it this far, we're ok. chdir 't' if -d 't'; ######################### # Test numbers in file names reflect the original numbering in test.pl # There were TODO tests but they've been hacked around. # Currently the perl core can't unread onto :unix (and other non-fast buffered # layers), then push another layer atop it, without losing the unread data. # This shafts gzip() when the gzip file has embedded filenames or comments # so it hacks round it by pushing the buffering layer just before the unread. # Grrr. my $perlgz = "perl.gz"; my $done_perlgz; my $command = "gzip -c --fast $^X >$perlgz"; my $unread_bug = "Can't unread then push layer on :unix [core perlio bug]"; my $unread_stdio_bug = "Can't unread the push layer on :stdio [core perlio bug]"; # I think that the problem is that you can't specify "b" on the fopen() my $win32_stdio_hairy = ":stdio is a bit hairy on Win32"; my $stdio = 'Not really a layer name'; $stdio = ':stdio' unless $Config{d_faststdio} and $Config{usefaststdio}; my $readme = File::Spec->catfile(File::Spec->updir(), "README"); END {if (-f $perlgz) {unlink $perlgz or die "Can't unlink $perlgz: $!"}} foreach my $buffering ('', ':unix', ':stdio', ':perlio') { # default # check with no args # check with explict gzip header # check with lazy header check # both foreach my $layer ('', '()', '(gzip)', '(lazy)', '(gzip,lazy)') { local $/; ok (open (FOO, "<$buffering:gzip$layer", "ok3.gz"), "open ok3.gz with <$buffering:gzip$layer"); is (, "ok 3\n"); ok (eof (FOO), 'should be end of file'); ok (close (FOO), "close it again"); } # This should open ok ((open FOO, "<$buffering", $readme), "README should open"); # This should fail to open ok (!(open FOO, "<$buffering:gzip", $readme), "README should not open [core perlio bug fixed post 5.7.2 12827]"); { local $/; # This file has an embedded filename. Being short it also checks get_more # (called by eat_nul) and the unread of the excess data. ok (open (FOO, "<$buffering:gzip", "ok17.gz"), "open ok17.gz with <$buffering:gzip"); TODO: { # local $TODO = $unread_bug if $buffering eq ':unix'; # local $TODO = $unread_stdio_bug if $buffering eq $stdio; is (, "ok 17\n"); } ok (eof (FOO), 'should be end of file'); TODO: { # local $TODO = $unread_bug if $buffering eq ':unix'; # local $TODO = $unread_stdio_bug if $buffering eq $stdio; ok (close (FOO), "close it"); # As TODO as the read } ok (open (FOO, "<$buffering:gzip(none)", "ok19"), "open ok19 with <$buffering:gzip(none)"); is (, "ok 19\n"); } ok (open (FOO, "<$buffering", "ok21"), "open ok21 with <$buffering"); is (, "ok 21\n"); ok (binmode (FOO, ":gzip"), "Ho ho ho. Switch to gunzip mid stream."); is (, "ok 23\n"); # Test auto mode foreach (['auto', 'ok19', "ok 19\n"], ['auto', 'ok3.gz', "ok 3\n"], ['lazy,auto', 'ok19', "ok 19\n"], ['auto,lazy', 'ok3.gz', "ok 3\n"], ) { my ($args, $file, $contents) = @$_; local $/; ok (open (FOO, "<$buffering:gzip($args)", $file), "open $file with <$buffering:gzip($args)"); TODO: { # local $TODO = $unread_bug if $buffering eq ':unix' and $file eq 'ok19'; # local $TODO = $unread_stdio_bug # if $buffering eq $stdio and $file eq 'ok19'; is (, $contents); } ok (eof (FOO), 'should be end of file'); TODO: { # local $TODO = $unread_bug if $buffering eq ':unix' and $file eq 'ok19'; # local $TODO = $unread_stdio_bug # if $buffering eq $stdio and $file eq 'ok19'; ok (close (FOO), "close it"); # As TODO as the read } } foreach my $args ('lazy', 'auto', 'auto,lazy') { # This should open # (auto will find no gzip header and assume deflate stream) # (lazy defers test) ok ((open FOO, "<$buffering:gzip($args)", $readme), "README should open in $args mode"); # For lazy gzip header check is on first read it should fail here # For auto it's not (meant to be) a deflate stream it (hopefully) will go # wrong here my $line = ; ok (!defined $line, "but should fail on first read") or print "# got $_\n"; } if (!defined $done_perlgz) { # Attempt this the first time only print "# Attempting to run '$command'\n"; $done_perlgz = system $command; } SKIP: { skip "$command failed", 3 if $done_perlgz; ok ((open GZ, "<$buffering:gzip", "perl.gz"), "open perl.gz"); TODO: { # local $TODO = $unread_bug if $buffering eq ':unix'; local $TODO = $win32_stdio_hairy if $buffering eq ':stdio' && $^O eq 'MSWin32'; ok (compare ($^X, \*GZ) == 0, "compare with original $^X"); } ok (eof (GZ), 'should be end of file'); TODO: { # local $TODO = $unread_bug if $buffering eq ':unix'; local $TODO = $win32_stdio_hairy if $buffering eq ':stdio' && $^O eq 'MSWin32'; ok ((close GZ), "close perl.gz"); } } # OK. autopop mode. muhahahahaha ok ((open FOO, "<$buffering:gzip(autopop)", $readme), "open README with <$buffering:gzip(autopop)"); ok (defined , "read first line"); like (, qr/^======/, "check second line"); { local $/; ok ((open FOO, "<$buffering:gzip(autopop)", "ok3.gz"), "open ok3.gz with <$buffering:gzip(autopop)"); is (, "ok 3\n"); } # Verify that short files get an error on close # Verify that files with erroroneous lengths get an error on close # Verify that files with erroroneous crc get an error on close foreach (['', 'ok50.gz.short', "ok 50\n"], ['', 'ok54.gz.len', "ok 54\n"], ['', 'ok58.gz.crc', "ok 58\n"], ) { my ($layer, $file, $contents) = @$_; local $/; ok (open (FOO, "<$buffering:gzip$layer", $file), "open $file with <$buffering:gzip$layer"); TODO: { # ok54.gz.len has an embedded filename. # local $TODO = $unread_bug # if $buffering eq ':unix' and $file eq 'ok54.gz.len'; # local $TODO = $unread_stdio_bug # if $buffering eq $stdio and $file eq 'ok54.gz.len'; is (, $contents); } ok (eof (FOO), "should be end of file"); ok (!(close FOO), "close should fail"); } } PerlIO-gzip-0.18/t/write.t000644 001765 001765 00000014366 10510002700 015246 0ustar00nicknick000000 000000 #!perl -w # Before `make install' is performed this script should be runnable with # `make test'. After `make install' it should work as `perl test.pl' ######################### use strict; use warnings; use Test::More tests => 260; use File::Compare; # This is standard in all distributions that have layers. use Config; use PerlIO::gzip; chdir 't' if -d 't'; ######################### undef $/; my $sh; if (-s $Config{sh}) { open FOO, "<", $Config{sh} or die $!; binmode FOO; $sh = ; die "Can't slurp $Config{sh}: $!" unless defined $sh; die sprintf ("Slurped %d, but disk file $Config{sh} is %d bytes", length $sh, -s $Config{sh}) unless length $sh == -s $Config{sh}; close FOO or die "Close failed: $!"; } foreach my $buffering ('', ':unix', ':stdio', ':perlio') { ok ((open FOO, ">$buffering:gzip", 'foo'), "open foo as >$buffering:gzip"); ok (close (FOO), 'close it straight away'); is (-s 'foo', 20, 'empty gzip file should be 20 bytes') or printf "# it's %d bytes\n", -s 'foo'; ok ((open FOO, ">$buffering:gzip", 'foo'), "open foo as >$buffering:gzip"); my $message = "ok 68\n"; ok ((print FOO $message), 'print to it') or print "# \$! is $!\n"; ok (close (FOO), 'close it'); ok ((open FOO, "<$buffering:gzip", 'foo'), 'open foo for reading'); is (, $message, 'check we get same thing back'); ok (eof (FOO), 'should be end of file'); ok (close (FOO), 'close it'); unlink 'foo' or die "unlink 'foo' failed: $!"; # autopop writes should work ok ((open FOO, ">$buffering:gzip(autopop)", 'foo'), "open foo as >$buffering:gzip(autopop)"); $message = "ok 45\n"; ok ((print FOO $message), 'print to it') or print "# \$! is $!\n"; ok (close (FOO), 'close it'); ok ((open FOO, "<", 'foo'), "open foo for reading [just '<']"); is (, $message, 'check we get same thing back'); ok (eof (FOO), 'should be end of file'); ok (close (FOO), 'close it'); unlink 'foo' or die "unlink 'foo' failed: $!"; SKIP: { skip "Your configured shell, '$Config{sh}', is missing or size 0",7 unless defined $sh; ok ((open GZ, ">$buffering:gzip", 'foo'), sprintf "open >$buffering:gzip [about to write %d bytes]", length $sh); ok ((print GZ $sh), "print contents of $Config{sh}") or print "# \$! is $!\n"; ok ((close GZ), 'close it'); ok ((open GZ, "<$buffering:gzip", 'foo'), "open <$buffering:gzip"); ok (compare (\*GZ, $Config{sh}) == 0, "compare compressed copy with '$Config{sh}'"); ok (eof (FOO), 'should be end of file'); ok ((close GZ), 'close it'); unlink 'foo' or die "unlink 'foo' failed: $!"; } ok ((open FOO, ">$buffering:gzip(lazy)", "empty"), "open empty as >$buffering:gzip(lazy)"); ok ((close FOO), 'close it'); ok (-z "empty", "check it is zero length") or printf "# -s empty is %d\n", -s "empty"; unlink "empty" or die "unlink 'empty' failed: $!"; ok ((open GZ, ">$buffering:gzip(lazy)", 'foo'), "open foo as >$buffering:gzip(lazy)"); $message = "ok 87\n"; my $message2 = "ok 88"; ok ((print GZ $message), 'print to it') or print "# \$! is $!\n"; { local $\ = "\n"; ok ((print GZ $message2), 'print to it with $\ set') or print "# \$! is $!\n"; } ok ((close GZ), 'close it'); ok ((open FOO, "<$buffering:gzip", 'foo'), 'open foo for reading'); { local $/ = "\n"; is (, $message, 'check we get message back'); is (, "$message2\n", 'check we get message2 back'); } ok (eof (FOO), 'should be end of file'); ok (close (FOO), 'close it'); unlink 'foo' or die "unlink 'foo' failed: $!"; ok ((open FOO, ">$buffering:gzip(none)", 'foo'), "open foo as >$buffering:gzip(none)"); $message = "ok 95\n"; ok ((print FOO $message), 'print to it') or print "# \$! is $!\n"; ok (close (FOO), 'close it'); ok (!(open FOO, "<$buffering:gzip", "foo"), "no header, so open <$buffering:gzip should fail"); ok ((open FOO, "<$buffering:gzip(none)", 'foo'), 'open foo for reading'); is (, $message, 'check we get same thing back'); ok (eof (FOO), 'should be end of file'); ok (close (FOO), 'close it'); unlink 'foo' or die "unlink 'foo' failed: $!"; while (-f "empty") { # VMS is going to have several of these, isn't it? unlink "empty" or die $!; } # Read/writes don't work ok (!(open FOO, "+<$buffering:gzip", "empty"), "open +<$buffering:gzip should fail, as read/write unsupported"); ok (!-e 'empty', "check file empty was not created") or printf "# file empty has size %d\n", -s 'empty'; if (-f "empty") { unlink "empty" or die $!; } ok (!(open FOO, "+>$buffering:gzip", "empty"), "open +>$buffering:gzip should fail, as read/write unsupported"); TODO: { local $TODO = "read/write open still creates file"; ok (!-e 'empty', "check file empty was not created") or printf "# file empty has size %d\n", -s 'empty'; if (-f "empty") { unlink "empty" or die $!; } } # Touch empty so that +< successfuly opens an existing file open FOO, ">empty" or die "Can't open 'empty': $!"; close FOO or die "Can't close 'empty': $!"; ok ((open FOO, "+<$buffering", "empty"), "open +<$buffering"); ok (!(binmode FOO, ":gzip"), "binmode ':gzip' should fail on read/write"); ok (close (FOO), 'close it'); unlink "empty" or die $!; ok ((open FOO, "+>$buffering", "empty"), "open +>$buffering"); ok (!(binmode FOO, ":gzip"), "binmode ':gzip' should fail on read/write"); ok (close (FOO), 'close it'); unlink "empty" or die $!; ok ((open FOO, ">$buffering", 'foo'), "open foo as >$buffering"); $message = "uncompressed\n"; ok ((print FOO $message), 'print to it') or print "# \$! is $!\n"; ok ((binmode FOO, ":gzip(none)"), "binmode ':gzip(none)'"); $message2 = "compressed\n"; ok ((print FOO $message2), 'print to it') or print "# \$! is $!\n"; ok (close (FOO), 'close it'); ok ((open FOO, "<$buffering", 'foo'), "open foo as <$buffering"); { local $/ = "\n"; is (, $message, 'check we get uncompressed message'); ok ((binmode FOO, ":gzip(none)"), "binmode ':gzip(none)'"); is (, $message2, 'check we get compressed message'); } ok (eof (FOO), 'should be end of file'); ok (close (FOO), 'close it'); unlink 'foo' or die "unlink 'foo' failed: $!"; } PerlIO-gzip-0.18/t/ok21000644 001765 001765 00000000040 10357731317 014434 0ustar00nicknick000000 000000 ok 21 ‹üÇ}:ËÏV02æžHá5PerlIO-gzip-0.18/t/ok17.gz000644 001765 001765 00000000037 10357731317 015066 0ustar00nicknick000000 000000 ‹fÅ}:ok17ËÏV04çÃ3ËSPerlIO-gzip-0.18/t/ok3.gz000644 001765 001765 00000000031 10357731317 014773 0ustar00nicknick000000 000000 ‹¢¦}:ËÏV0æM­”PerlIO-gzip-0.18/t/ok58.gz.crc000644 001765 001765 00000000032 10357731317 015634 0ustar00nicknick000000 000000 ‹ËÏV0µàЇNÓ