erip/bin

Description

This is a collection of utilities I use a lot, many of which fill what I consider to be gaps in the standard UNIX toolset. Released into the public domain. Matt Post

  • zpaste. A version of "paste" that also accepts compressed files as input.

  • expose. Allows a standard UNIX command or pipeline to be exposed as a lightweight web service.

  • iso639. Converts among ISO 639 codes and names.

  • mean, sum. Computes the mean/sum of a list of numbers.

  • mid. My favorite. Prints the line at a specified line number. For example,

     cat big_file.txt | mid 100

    prints the 100th line. For input with at least 100 lines, this is equivalent to

     head -n 100 big_file.txt | tail -n 1

  • philog. Personal mods to Philipp Koehn's command-line logging script.

  • rand-sample. Simple reservoir sampler from Ben Van Durme that grabs N random lines from a stream on STDIN.

  • roll. Simple dice roller.

  • shuffle. Wrapper around Perl's "shuffle()". Warning: slurps all input! Better to use "shuf" or "sort -R" if available (note that GNU "sort -R" groups identical lines together).

  • splittabs. Writes each tab-delimited column on STDIN to the corresponding file name given as a script argument. For example,

     paste file1.en file1.fr | remove_bad_lines | splittabs file1_trimmed.en file1_trimmed.fr

    pastes two parallel files together, passes them through an imaginary filtering script (remove_bad_lines), and writes the surviving lines back out as two files. Useful in machine translation research!
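
The reservoir sampling behind rand-sample can be sketched as follows. This is a minimal illustration of the classic "Algorithm R", not the actual rand-sample source; the function name `reservoir_sample` and its parameters are hypothetical.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(
    stream: Iterable[T], n: int, rng: Optional[random.Random] = None
) -> List[T]:
    """Return up to n items chosen uniformly at random from stream.

    Hypothetical helper (not the rand-sample script itself). It keeps only
    n items in memory, so it works on streams of unknown length.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < n:
            # Fill the reservoir with the first n items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability n / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < n:
                reservoir[j] = item
    return reservoir
```

Calling `reservoir_sample(sys.stdin, 10)` would pick 10 lines from standard input in a single pass. Unlike shuffle, which slurps its whole input, this never holds more than N lines at a time.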

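The column-splitting idea behind splittabs can be sketched in a few lines. This is an assumption-laden illustration rather than the real script: the name `split_tabs` is hypothetical, and raising an error when the column count does not match the number of output files is a design choice made here, not documented behavior of splittabs.

```python
from contextlib import ExitStack
from typing import Iterable, Sequence


def split_tabs(lines: Iterable[str], paths: Sequence[str]) -> None:
    """Write column k of each tab-delimited line to paths[k].

    Hypothetical helper; the real splittabs may handle malformed lines
    differently.
    """
    with ExitStack() as stack:
        # Open every output file and close them all when the block exits.
        outputs = [
            stack.enter_context(open(path, "w", encoding="utf-8"))
            for path in paths
        ]
        for lineno, line in enumerate(lines, start=1):
            fields = line.rstrip("\n").split("\t")
            if len(fields) != len(outputs):
                # Assumption: treat a column-count mismatch as an error.
                raise ValueError(
                    f"line {lineno}: expected {len(outputs)} columns, "
                    f"got {len(fields)}"
                )
            for out, field in zip(outputs, fields):
                out.write(field + "\n")
```

Called as `split_tabs(sys.stdin, sys.argv[1:])`, this would mirror the pipeline shown for splittabs, with one output file per column.
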