Issue python#13165: stringbench is now available in the Tools/stringbench folder.

It used to live in its own SVN project.
pitrou committed Apr 9, 2012
1 parent 75d9aca commit 1584ae3
Showing 4 changed files with 1,560 additions and 0 deletions.
6 changes: 6 additions & 0 deletions Misc/NEWS
@@ -57,6 +57,12 @@ Tests
- Issue #14355: Regrtest now supports the standard unittest test loading, and
will use it if a test file contains no `test_main` method.

Tools / Demos
-------------

- Issue #13165: stringbench is now available in the Tools/stringbench folder.
It used to live in its own SVN project.


What's New in Python 3.3.0 Alpha 2?
===================================
3 changes: 3 additions & 0 deletions Tools/README
@@ -32,6 +32,9 @@ scripts A number of useful single-file programs, e.g. tabnanny.py
tabs and spaces, and 2to3, which converts Python 2 code
to Python 3 code.

stringbench A suite of micro-benchmarks for various operations on
strings (both 8-bit and unicode).

test2to3 A demonstration of how to use 2to3 transparently in setup.py.

unicode Tools for generating unicodedata and codecs from unicode.org
68 changes: 68 additions & 0 deletions Tools/stringbench/README
@@ -0,0 +1,68 @@
stringbench is a set of performance tests comparing byte string
operations with unicode operations. The two string implementations
are loosely based on each other and sometimes the algorithm for one is
faster than the other.

This test set was started at the Need For Speed sprint in Reykjavik
to identify which string methods could be sped up quickly and to
identify obvious places for improvement.

Here is an example of a benchmark:


    @bench('"Andrew".startswith("A")', 'startswith single character', 1000)
    def startswith_single(STR):
        s1 = STR("Andrew")
        s2 = STR("A")
        s1_startswith = s1.startswith
        for x in _RANGE_1000:
            s1_startswith(s2)

The bench decorator takes three parameters. The first is a short
description of what the code does. In most cases this is a Python code
snippet. It is not the code that is actually run, because the real
code is hand-optimized to focus on the method being tested.

The second parameter is a group title. All benchmarks with the same
group title are listed together. This lets you compare different
implementations of the same algorithm, such as "t in s"
vs. "s.find(t)".

The last is a count. Each benchmark loops over the algorithm either
100 or 1000 times, depending on the algorithm's performance. The
reported time is the time for one whole benchmark call, so the reader
needs the count to scale it down to a single operation.

These parameters become function attributes.
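
A minimal sketch of how such a decorator could store its three
parameters as function attributes is shown below. The attribute names
(comment, group, repeat_count) and the definition of _RANGE_1000 are
assumptions made for illustration; they are not taken from the
stringbench source.

```python
def bench(comment, group, repeat_count):
    """Return a decorator that records benchmark metadata on the function.

    The attribute names used here are hypothetical.
    """
    def decorator(func):
        func.comment = comment            # short description / code snippet
        func.group = group                # title used to list related benchmarks together
        func.repeat_count = repeat_count  # size of the internal loop
        return func
    return decorator


# Assumed definition: a reusable range for the inner loop.
_RANGE_1000 = range(1000)


@bench('"Andrew".startswith("A")', 'startswith single character', 1000)
def startswith_single(STR):
    s1 = STR("Andrew")
    s2 = STR("A")
    s1_startswith = s1.startswith  # bind the method once, outside the loop
    for x in _RANGE_1000:
        s1_startswith(s2)


# STR converts a literal to the string type under test.
startswith_single(str)                            # unicode strings
startswith_single(lambda s: s.encode("latin-1"))  # byte strings
print(startswith_single.group, startswith_single.repeat_count)
```

Passing the string constructor as STR lets one function body exercise
both string types, which is what makes the two timing columns
comparable.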


Here is an example of the output:


========== count newlines
38.54   41.60   92.7    ...text.with.2000.newlines.count("\n") (*100)
========== early match, single character
1.14    1.18    96.8    ("A"*1000).find("A") (*1000)
0.44    0.41    105.6   "A" in "A"*1000 (*1000)
1.15    1.17    98.1    ("A"*1000).index("A") (*1000)

The first column is the run time in milliseconds for byte strings.
The second is the run time for unicode strings. The third is a
percentage: byte time / unicode time. It expresses the speed of unicode
relative to byte strings, so a value above 100 means unicode was
faster and a value below 100 means it was slower.

The last column contains the code snippet and the repeat count for the
internal benchmark loop.

The times are computed with 'timeit.py', which repeats the test more
and more times until the total time exceeds 0.2 seconds, and returns
the best time for a single iteration.
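
The timing strategy described above can be sketched with the standard
timeit module as follows. This is an illustration only, not the code
stringbench uses: the helper name, the tenfold growth of the call
count, and the number of repeats are all assumptions.

```python
import timeit


def best_time_per_call(func, min_total=0.2, repeats=3):
    """Return the best observed time, in seconds, for one call of func.

    Hypothetical helper: the call count grows tenfold until one timing
    run lasts at least min_total seconds; the run is then repeated and
    the fastest total is divided by the call count.
    """
    timer = timeit.Timer(func)
    number = 1
    while True:
        total = timer.timeit(number)
        if total >= min_total:
            break
        number *= 10
    # Keep the run that crossed the threshold and add a few more runs.
    totals = [total] + timer.repeat(repeat=repeats - 1, number=number)
    return min(totals) / number


seconds = best_time_per_call(lambda: "A" in "A" * 1000, min_total=0.05)
print("%.3f microseconds per call" % (seconds * 1e6))
```

Taking the minimum rather than the mean is the usual choice with
timeit, because slower runs mostly reflect interference from other
processes rather than the code being measured.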

The final line of the output is the cumulative time for byte and
unicode strings, and the overall performance of unicode relative to
bytes. For example:

4079.83 5432.25 75.1    TOTAL

However, this total has no real meaning, because it weights every test evenly.
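
The percentage column can be reproduced from the two times, as this
small sketch shows using the TOTAL line above. The function names are
hypothetical, and the per-operation scaling assumes that the reported
milliseconds cover the whole internal loop, as the README describes.

```python
def relative_speed(bytes_ms, unicode_ms):
    """Byte time as a percentage of unicode time (the third column)."""
    return 100.0 * bytes_ms / unicode_ms


def per_operation_us(call_ms, repeat_count):
    """Scale the time of one benchmark call to microseconds per operation."""
    return call_ms * 1000.0 / repeat_count


# TOTAL line: 4079.83 ms for bytes, 5432.25 ms for unicode.
print(round(relative_speed(4079.83, 5432.25), 1))  # -> 75.1

# A call that takes 1.14 ms with a repeat count of 1000.
print(per_operation_us(1.14, 1000))
```

Recomputing the percentage for individual rows from the printed times
may not match the printed value exactly, since the times shown are
rounded to two decimal places.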
