Bug 715113: Update Snappy to r56. r=bent

dothq · Jan 5, 2012 · 20035c7
1 parent 89088b2
commit 20035c7

Showing 9 changed files with 221 additions and 55 deletions.
diff --git a/.hgignore b/.hgignore
@@ -36,3 +36,6 @@ _OPT\.OBJ/
 
 # Java HTML5 parser classes
 ^parser/html/java/(html|java)parser/
+
+# SVN directories
+\.svn/
diff --git a/other-licenses/snappy/README b/other-licenses/snappy/README
@@ -6,6 +6,8 @@ Mozilla does not modify the actual snappy source with the exception of the
 Snappy comes from:
   http://code.google.com/p/snappy/
 
+We are currently using revision: 56
+
 To upgrade to a newer version:
   1. Check out the new code using subversion.
   2. Update 'snappy-stubs-public.h' in this directory with any changes that were
@@ -20,3 +22,4 @@ To upgrade to a newer version:
        - 'autogen.sh'
        - 'configure.ac'
        - 'Makefile.am'
+  5. Update the revision stamp in this file.
diff --git a/other-licenses/snappy/src/framing_format.txt b/other-licenses/snappy/src/framing_format.txt
@@ -0,0 +1,124 @@
+Snappy framing format description
+Last revised: 2011-12-15
+
+This document describes a framing format for Snappy, allowing compressing to
+files or streams that can then more easily be decompressed without having
+to hold the entire stream in memory. It also provides data checksums to
+help verify integrity. It does not provide metadata checksums, so it does
+not protect against e.g. all forms of truncations.
+
+Implementation of the framing format is optional for Snappy compressors and
+decompressors; it is not part of the Snappy core specification.
+
+
+1. General structure
+
+The file consists solely of chunks, lying back-to-back with no padding
+in between. Each chunk consists first of a single byte of chunk identifier,
+then a two-byte little-endian length of the chunk in bytes (from 0 to 65535,
+inclusive), and then the data if any. The three bytes of chunk header are not
+counted in the data length.
+
+The different chunk types are listed below. The first chunk must always
+be the stream identifier chunk (see section 4.1, below). The stream
+ends when the file ends -- there is no explicit end-of-file marker.
+
+
+2. File type identification
+
+The following identifiers for this format are recommended where appropriate.
+However, note that none have been registered officially, so this is only to
+be taken as a guideline. We use "Snappy framed" to distinguish between this
+format and raw Snappy data.
+
+  File extension:         .sz
+  MIME type:              application/x-snappy-framed
+  HTTP Content-Encoding:  x-snappy-framed
+
+
+3. Checksum format
+
+Some chunks have data protected by a checksum (the ones that do will say so
+explicitly). The checksums are always masked CRC-32Cs.
+
+A description of CRC-32C can be found in RFC 3720, section 12.1, with
+examples in section B.4.
+
+Checksums are not stored directly, but masked, as checksumming data and
+then its own checksum can be problematic. The masking is the same as used
+in Apache Hadoop: Rotate the checksum by 15 bits, then add the constant
+0xa282ead8 (using wraparound as normal for unsigned integers). This is
+equivalent to the following C code:
+
+  uint32_t mask_checksum(uint32_t x) {
+    return ((x >> 15) | (x << 17)) + 0xa282ead8;
+  }
+
+Note that the masking is reversible.
+
+The checksum is always stored as a four-byte integer, in little-endian order.
+
+
+4. Chunk types
+
+The currently supported chunk types are described below. The list may
+be extended in the future.
+
+
+4.1. Stream identifier (chunk type 0xff)
+
+The stream identifier is always the first element in the stream.
+It is exactly six bytes long and contains "sNaPpY" in ASCII. This means that
+a valid Snappy framed stream always starts with the bytes
+
+  0xff 0x06 0x00 0x73 0x4e 0x61 0x50 0x70 0x59
+
+The stream identifier chunk can come multiple times in the stream besides
+the first; if such a chunk shows up, it should simply be ignored, assuming
+it has the right length and contents. This allows for easy concatenation of
+compressed files without the need for re-framing.
+
+
+4.2. Compressed data (chunk type 0x00)
+
+Compressed data chunks contain a normal Snappy compressed bitstream;
+see the compressed format specification. The compressed data is preceded by
+the CRC-32C (see section 3) of the _uncompressed_ data.
+
+Note that the data portion of the chunk, i.e., the compressed contents,
+can be at most 65531 bytes (2^16 - 1, minus the checksum).
+However, we place an additional restriction that the uncompressed data
+in a chunk must be no longer than 32768 bytes. This allows consumers to
+easily use small fixed-size buffers.
+
+
+4.3. Uncompressed data (chunk type 0x01)
+
+Uncompressed data chunks allow a compressor to send uncompressed,
+raw data; this is useful if, for instance, incompressible or
+near-incompressible data is detected, and faster decompression is desired.
+
+As in the compressed chunks, the data is preceded by its own masked
+CRC-32C (see section 3).
+
+An uncompressed data chunk, like compressed data chunks, should contain
+no more than 32768 data bytes, so the maximum legal chunk length with the
+checksum is 32772.
+
+
+4.4. Reserved unskippable chunks (chunk types 0x02-0x7f)
+
+These are reserved for future expansion. A decoder that sees such a chunk
+should immediately return an error, as it must assume it cannot decode the
+stream correctly.
+
+Future versions of this specification may define meanings for these chunks.
+
+
+4.5. Reserved skippable chunks (chunk types 0x80-0xfe)
+
+These are also reserved for future expansion, but unlike the chunks
+described in 4.4, a decoder seeing these must skip them and continue
+decoding.
+
+Future versions of this specification may define meanings for these chunks.
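
Editor's note on framing_format.txt (not part of the patch): the checksum
masking in section 3 and the chunk header layout in sections 1 and 4.1 can be
sketched in C as below. This is a minimal illustration under stated
assumptions, not code from Snappy r56. Only mask_checksum comes from the
specification text; unmask_checksum, struct chunk_header, parse_chunk_header
and is_stream_identifier are hypothetical names introduced here.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Section 3: rotate the CRC-32C right by 15 bits, then add a constant.
 * Unsigned arithmetic wraps around, as the specification requires. */
static uint32_t mask_checksum(uint32_t x) {
  return ((x >> 15) | (x << 17)) + 0xa282ead8u;
}

/* Hypothetical inverse (the spec only notes that masking is reversible):
 * subtract the constant, then rotate left by 15 bits. */
static uint32_t unmask_checksum(uint32_t m) {
  uint32_t r = m - 0xa282ead8u;
  return (r << 15) | (r >> 17);
}

/* Hypothetical in-memory form of the three-byte chunk header. */
struct chunk_header {
  uint8_t type;     /* chunk identifier, e.g. 0xff, 0x00, 0x01 */
  uint16_t length;  /* data length, excluding the three header bytes */
};

/* Section 1: one identifier byte, then a two-byte little-endian length.
 * Returns 0 on success, -1 if fewer than three bytes are available. */
static int parse_chunk_header(const uint8_t *buf, size_t len,
                              struct chunk_header *out) {
  if (len < 3) {
    return -1;
  }
  out->type = buf[0];
  out->length = (uint16_t)(buf[1] | (buf[2] << 8));
  return 0;
}

/* Section 4.1: a framed stream starts with 0xff 0x06 0x00 "sNaPpY".
 * Returns 1 if buf begins with that nine-byte sequence, else 0. */
static int is_stream_identifier(const uint8_t *buf, size_t len) {
  static const uint8_t magic[9] = {0xff, 0x06, 0x00, 0x73, 0x4e,
                                   0x61, 0x50, 0x70, 0x59};
  return len >= sizeof(magic) && memcmp(buf, magic, sizeof(magic)) == 0;
}
```

A decoder built on this sketch would read a header, dispatch on the type byte
(0x00 and 0x01 carry data, 0x02-0x7f are an error, 0x80-0xfe are skipped), and
compare the unmasked stored checksum against a CRC-32C it computes itself; the
CRC-32C routine is deliberately left out here.
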
diff --git a/other-licenses/snappy/src/snappy-stubs-internal.h b/other-licenses/snappy/src/snappy-stubs-internal.h
@@ -86,10 +86,9 @@ using namespace std;
 // version (anyone who wants to regenerate it can just do the call
 // themselves within main()).
 #define DEFINE_bool(flag_name, default_value, description) \
-  bool FLAGS_ ## flag_name = default_value;
+  bool FLAGS_ ## flag_name = default_value
 #define DECLARE_bool(flag_name) \
-  extern bool FLAGS_ ## flag_name;
-#define REGISTER_MODULE_INITIALIZER(name, code)
+  extern bool FLAGS_ ## flag_name
 
 namespace snappy {
 

diff --git a/other-licenses/snappy/src/snappy-test.cc b/other-licenses/snappy/src/snappy-test.cc
@@ -353,7 +353,6 @@ int ZLib::CompressAtMostOrAll(Bytef *dest, uLongf *destLen,
   // compression.
   err = deflate(&comp_stream_, flush_mode);
 
-  const uLong source_bytes_consumed = *sourceLen - comp_stream_.avail_in;
   *sourceLen = comp_stream_.avail_in;
 
   if ((err == Z_STREAM_END || err == Z_OK)
@@ -397,7 +396,6 @@ int ZLib::CompressChunkOrAll(Bytef *dest, uLongf *destLen,
 int ZLib::Compress(Bytef *dest, uLongf *destLen,
                    const Bytef *source, uLong sourceLen) {
   int err;
-  const uLongf orig_destLen = *destLen;
   if ( (err=CompressChunkOrAll(dest, destLen, source, sourceLen,
                                Z_FINISH)) != Z_OK )
     return err;

diff --git a/other-licenses/snappy/src/snappy-test.h b/other-licenses/snappy/src/snappy-test.h
@@ -135,7 +135,7 @@ namespace File {
     while (!feof(fp)) {
       char buf[4096];
       size_t ret = fread(buf, 1, 4096, fp);
-      if (ret == -1) {
+      if (ret == 0 && ferror(fp)) {
         perror("fread");
         exit(1);
       }