Improve docs
Signed-off-by: Stephen Levine <[email protected]>
sclevine committed Feb 12, 2022
1 parent 3ed81d6 commit 096c329
Showing 3 changed files with 29 additions and 8 deletions.
20 changes: 17 additions & 3 deletions README.md
@@ -1,8 +1,10 @@
# ztgrep

Search for file names and contents through nested archives.
[![GoDoc](https://pkg.go.dev/badge/github.com/sclevine/ztgrep?status.svg)](https://pkg.go.dev/github.com/sclevine/ztgrep)

Useful for locating data lost within many levels of compressed archives.
Search for file names and contents within nested compressed archives.

Useful for locating data lost within many levels of compressed archives without using additional storage.

Supports the following compression formats for **both archives and files**:
- gzip
@@ -19,6 +21,7 @@ Nested archives and compressed files must have a recognizable file extension to

If multiple paths are specified, they are searched in parallel with nondeterministic output order.
However, output order is deterministic for any single path.
Only one path per CPU is searched concurrently.

Nested ZIP files must be read into memory to be searched.
By default, ZIP files larger than 10 MB are not searched.
@@ -31,11 +34,22 @@ Usage:
Search Options:
-b, --skip-body Skip file bodies
-n, --skip-name Skip file names inside of tarballs
-z, --max-zip-size= Maximum zip file size to search (default: 10 MB)
-z, --max-zip-size= Maximum zip file size to search in bytes (default: 10 MB)
General Options:
-v, --version Return ztgrep version
Help Options:
-h, --help Show this help message
```

### Installation

Binaries for macOS, Linux, and Windows are [attached to each release](https://github.com/sclevine/ztgrep/releases).

`ztgrep` is also available as a [Docker image](https://hub.docker.com/r/sclevine/ztgrep).

### Go Package

ztgrep may be imported as a Go package.
See [godoc](https://pkg.go.dev/github.com/sclevine/ztgrep) for details.
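
As a rough illustration of the `Result` shape documented in this commit, the sketch below copies the `Result` struct from the `ztgrep.go` diff into a standalone program and prints each match. The `formatResult` helper and the `:` separator are hypothetical and not part of the package; the channel is a stand-in for the one returned by `Start`.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Result mirrors the Result type shown in this commit's ztgrep.go diff.
// Each entry in Path[1:] is a file nested in the previous archive.
type Result struct {
	Path []string
	Err  error
}

// formatResult is a hypothetical helper (not part of ztgrep) that renders
// a nested path as a single line, or the error if one occurred.
func formatResult(r Result) string {
	if r.Err != nil {
		return "error: " + r.Err.Error()
	}
	return strings.Join(r.Path, ":")
}

func main() {
	// Stand-in for the channel returned by (*ZTgrep).Start.
	results := make(chan Result, 2)
	results <- Result{Path: []string{"backup.tar.gz", "logs.zip", "app.log"}}
	results <- Result{Err: errors.New("example failure")}
	close(results)

	for r := range results {
		fmt.Println(formatResult(r))
	}
}
```

With the real package, the same loop would presumably range over the channel returned by `Start`; see the godoc for the actual API.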
2 changes: 1 addition & 1 deletion cmd/ztgrep/main.go
@@ -15,7 +15,7 @@ type Options struct {
Search struct {
SkipBody bool `short:"b" long:"skip-body" description:"Skip file bodies"`
SkipName bool `short:"n" long:"skip-name" description:"Skip file names inside of tarballs"`
MaxZipSize int64 `short:"z" long:"max-zip-size" default:"0" default-mask:"10 MB" description:"Maximum zip file size to search"`
MaxZipSize int64 `short:"z" long:"max-zip-size" default:"0" default-mask:"10 MB" description:"Maximum zip file size to search in bytes"`
} `group:"Search Options"`

General struct {
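
Because the updated description states that `--max-zip-size` is given in bytes, a quick calculation may help when choosing a value. The sketch below reuses the `defaultMaxZipSize` constant from `ztgrep.go`; the `mib` helper is hypothetical and only illustrates the arithmetic.

```go
package main

import "fmt"

// defaultMaxZipSize is copied from ztgrep.go in this commit.
const defaultMaxZipSize = 10 << (10 * 2) // 10 MB

// mib is a hypothetical helper that converts a count of
// 1,048,576-byte units into bytes.
func mib(n int64) int64 { return n << 20 }

func main() {
	fmt.Println(defaultMaxZipSize)                 // prints 10485760
	fmt.Printf("--max-zip-size=%d\n", mib(50))     // prints --max-zip-size=52428800
}
```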
15 changes: 11 additions & 4 deletions ztgrep.go
@@ -24,9 +24,11 @@ import (
const defaultMaxZipSize = 10 << (10 * 2) // 10 MB

var cpuLock = semaphore.NewWeighted(int64(runtime.NumCPU()))

func acquireCPU() { cpuLock.Acquire(context.Background(), 1) }
func releaseCPU() { cpuLock.Release(1) }

// New returns a *ZTgrep given a regular expression following https://golang.org/s/re2syntax
func New(expr string) (*ZTgrep, error) {
exp, err := regexp.Compile(expr)
if err != nil {
@@ -38,18 +40,23 @@ func New(expr string) (*ZTgrep, error) {
}, nil
}

// ZTgrep searches for file names and contents within nested compressed archives.
type ZTgrep struct {
MaxZipSize int64
SkipName bool
SkipBody bool
exp *regexp.Regexp
MaxZipSize int64 // maximum size of zip file to search (held in memory)
SkipName bool // skip file names
SkipBody bool // skip file contents

exp *regexp.Regexp
}

// Result contains each matching path in Path.
// Each entry in Path[1:] represents a file nested in the previous archive.
type Result struct {
Path []string
Err error
}

// Start searches paths in parallel, returning results via a channel
func (zt *ZTgrep) Start(paths []string) <-chan Result {
// TODO: restrict number of open files
// TODO: buffer output to guarantee order
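
The `cpuLock` semaphore above is what backs the README's "one path per CPU" statement. The following is a minimal standalone sketch of that pattern, not the code from this commit: it swaps `golang.org/x/sync/semaphore` for a buffered channel so it needs only the standard library, and `limiter`, `searchAll`, and the no-op search function are all hypothetical names.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// limiter is a counting semaphore built on a buffered channel
// (a stand-in for semaphore.Weighted in the real code).
type limiter chan struct{}

func newLimiter(n int) limiter { return make(limiter, n) }
func (l limiter) acquire()     { l <- struct{}{} }
func (l limiter) release()     { <-l }

// searchAll runs search once per path with at most limit calls in flight
// and returns the peak number of concurrent searches it observed.
func searchAll(paths []string, limit int, search func(string)) int {
	lim := newLimiter(limit)
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
		peak    int
	)
	for _, p := range paths {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			lim.acquire()
			defer lim.release()

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			search(p)

			mu.Lock()
			running--
			mu.Unlock()
		}(p)
	}
	wg.Wait()
	return peak
}

func main() {
	paths := []string{"a.tar.gz", "b.zip", "c.tar", "d.tgz"}
	peak := searchAll(paths, runtime.NumCPU(), func(string) {})
	fmt.Println(peak <= runtime.NumCPU()) // prints true
}
```

Every goroutine starts immediately but blocks in `acquire` until a slot is free, so the number of active searches never exceeds the limit while the output order across paths remains nondeterministic.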