forked from Yelp/mrjob
CHANGES.txt
v0.2.6, 2011-05-24 -- Hadoop 0.20 in EMR, inline runner, and more
* Set the Hadoop version to run on EMR with --hadoop-version (Issue #71).
  * Default is still 0.18, but will change to 0.20 in mrjob v0.3.0.
* New inline runner, for testing locally with a debugger
* New --strict-protocols option, to catch unencodable data (Issue #76)
* Added steps_python_bin option (for use with virtualenv)
* mrjob no longer chokes when asked to run on an EMR job flow running
  Hadoop 0.20 (Issue #110)
* mrjob no longer chokes on job flows with no LogUri (Issue #112)
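
The idea behind --strict-protocols in v0.2.6 can be illustrated with a minimal sketch. This is not mrjob's code: encode_pair() and its strict argument are hypothetical names, and the non-strict behavior shown here (silently dropping the pair) is an assumption made for illustration.

```python
import json


def encode_pair(key, value, strict=False):
    """Encode a key/value pair as a tab-separated line of JSON.

    Hypothetical helper for illustration only; not part of mrjob.
    """
    try:
        return json.dumps(key) + "\t" + json.dumps(value)
    except TypeError:
        # json.dumps() raises TypeError for objects it cannot serialize,
        # such as sets.
        if strict:
            raise  # strict mode: surface the problem immediately
        return None  # lenient mode (assumed): drop the pair


# A set is not JSON-encodable, so the lenient call drops the pair...
dropped = encode_pair("word", {1, 2})

# ...while the strict call raises, which is what catches the bad data.
try:
    encode_pair("word", {1, 2}, strict=True)
    strict_raised = False
except TypeError:
    strict_raised = True
```

Failing loudly is useful while developing a job, since silently skipped records are otherwise easy to miss.
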
v0.2.5, 2011-04-29 -- Hadoop input and output formats
* Added hadoop_input/output_format options
* You can now specify a custom Hadoop streaming jar (hadoop_streaming_jar)
* extra args to hadoop now come before -mapper/-reducer on EMR, so
  that e.g. -libjar will work (worked in hadoop mode since v0.2.2)
* hadoop mode now supports s3n:// URIs (Issue #53)
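
The argument-ordering change in v0.2.5 matters because Hadoop streaming expects generic options ahead of streaming options such as -mapper and -reducer. A rough sketch of that ordering follows; build_streaming_args() is a hypothetical helper, not an mrjob function, and the paths and jar names are placeholders.

```python
def build_streaming_args(streaming_jar, extra_args, mapper, reducer,
                         input_path, output_path):
    """Build a Hadoop streaming command line as a list of strings.

    Hypothetical helper for illustration only; not part of mrjob.
    """
    args = ["hadoop", "jar", streaming_jar]
    # Extra (generic) args go first, before any streaming options.
    args.extend(extra_args)
    args.extend([
        "-input", input_path,
        "-output", output_path,
        "-mapper", mapper,
        "-reducer", reducer,
    ])
    return args


# Hadoop spells its generic option -libjars; my.jar is a placeholder.
cmd = build_streaming_args(
    "hadoop-streaming.jar", ["-libjars", "my.jar"],
    "python job.py --mapper", "python job.py --reducer",
    "input/", "output/")
```
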
v0.2.4, 2011-03-09 -- fix bootstrapping mrjob
* Fix bootstrapping of mrjob in hadoop and local mode (Issue #89)
* SSH tunnels try to use the same port for the same job flow (Issue #67)
* Added mr_postfix_bounce and mr_pegasos_svm to examples.
* Retry on spurious 505s from EMR API
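
Retrying on spurious API errors, as in the last v0.2.4 item, generally follows a simple pattern. The sketch below is a generic version under stated assumptions: TransientError and retry() are hypothetical names, and the attempt count and backoff are arbitrary. It does not reproduce how mrjob wraps its EMR calls.

```python
import time


class TransientError(Exception):
    """Stand-in for a spurious server-side error (hypothetical)."""


def retry(func, max_tries=3, backoff=0.0):
    """Call func(), retrying on TransientError up to max_tries times."""
    for attempt in range(1, max_tries + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_tries:
                raise  # out of attempts: let the caller see the error
            time.sleep(backoff * attempt)  # wait a little longer each time


calls = []


def flaky():
    """Fail twice, then succeed, to exercise the retry loop."""
    calls.append(1)
    if len(calls) < 3:
        raise TransientError("spurious error")
    return "ok"


result = retry(flaky)
```

Only the error type known to be transient is retried; anything else propagates on the first failure.
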
v0.2.3, 2011-02-24 -- boto compatibility
* Fix incompatibility with boto 2.0b4 (Issue #91)
v0.2.2, 2011-02-15 -- GET/POST EMR issue
* Use POST requests for most EMR queries (EMR was choking on large GETs)
* find_probable_cause_of_failure() ignores transient errors (Issue #31)
* --hadoop-arg now actually works (Issue #79)
* on Hadoop, extra args are added first, so you can set e.g. -libjar
* S3 buckets may now have . in their names
* MRJob scripts now respect --quiet (Issue #84)
* added --no-output option for MRJob scripts (Issue #81)
* added --python-bin option (Issue #54)
v0.2.1, 2010-11-17 -- laststatechangereason bugfix
* Don't assume EMR sets laststatechangereason
v0.2.0, 2010-11-15 -- Many bugfixes, Windows support
* New Features/Changes:
  * EMRJobRunner now prints % of mappers and reducers completed when you
    enable the SSH tunnel.
  * Added mr_page_rank example
  * Added mrjob.tools.emr.audit_usage script (Issue #21)
  * You can specify alternate job owners with the "owner" option. Useful for
    auditing usage. (Issue #59)
  * The job_name_prefix option has been renamed to label (the old name still
    works but is deprecated)
  * bootstrap_cmds and bootstrap_scripts no longer automatically invoke sudo
* Bugs Fixed/Cleanup:
  * bootstrap files no longer get uploaded to S3 twice (Issue #8)
  * When using add_file_option(), show_steps() can now see the local version
    of the file (Issue #45)
  * Now works on Windows (Issue #46)
  * No longer requires external jar, tar, or zip binaries (Issue #47)
  * mrjob-* scratch bucket is only created as needed (Issue #50)
  * Can now specify us-east-1 region explicitly (Issue #58)
  * mrjob.tools.emr.terminate_idle_job_flows leaves Hive jobs alone (Issue #60)
v0.1.0, 2010-10-28 -- Same code, better version. It's official!
v0.1.0-pre3, 2010-10-27 -- Pre-release to run Yelp code against
* Added debian packaging
* mrjob bootstrapping can now deal with symlinks in site-packages/mrjob
* MRJobRunner.stream_output() can now be called multiple times
v0.1.0-pre2, 2010-10-25 -- Second pre-release after testing
* Fixed small bugs that broke Python 2.5.1 and Python 2.7
* Fixed reading mrjob.conf without yaml installed
* Fix tests to work with modern simplejson and pipes.quote()
* Auto-create temp bucket on S3 if we don't have one (Issue #16)
* Auto-infer AWS region from bucket (Issue #7)
* --steps now passes in all extra args (e.g. --protocol) (Issue #4)
* Better docs
v0.1.0-pre1, 2010-10-21 -- Initial pre-release. YMMV!