LUCENE-4291: reduce jflex buffer sizes
git-svn-id: https://svn.apache.org/repos/asf/lucene/dev/branches/branch_4x@1369892 13f79535-47bb-0310-9956-ffa450edef68
rmuir committed Aug 6, 2012
1 parent f689567 commit 777f25e
Showing 22 changed files with 2,027 additions and 2,020 deletions.
4 changes: 4 additions & 0 deletions lucene/CHANGES.txt
@@ -119,6 +119,10 @@ Optimizations
making them substantially more lightweight. Behavior is unchanged.
(Robert Muir)

+* LUCENE-4291: Reduced internal buffer size for Jflex-based tokenizers
+such as StandardTokenizer from 32kb to 8kb.
+(Raintung Li, Steven Rowe, Robert Muir)

Bug Fixes

* LUCENE-4109: BooleanQueries are not parsed correctly with the
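
The 32kb and 8kb figures in the CHANGES entry are consistent with the `%buffer 4096` directive in this commit if the buffer is counted in Java chars of two bytes each. Below is a minimal sketch of that arithmetic. It assumes the previous buffer held 16,384 chars, which is inferred from the 32kb figure and is not shown in this diff. The class and method names are illustrative only and are not part of Lucene or JFlex.

```java
public class BufferSizeSketch {

    // A Java char is one UTF-16 code unit, i.e. two bytes (Character.BYTES == 2).
    static int bufferBytes(int bufferChars) {
        return bufferChars * Character.BYTES;
    }

    public static void main(String[] args) {
        int newChars = 4096;   // value given to %buffer in this commit
        int oldChars = 16384;  // assumption: previous size implied by "32kb"

        System.out.println("old: " + oldChars + " chars = " + bufferBytes(oldChars) / 1024 + " KB");
        System.out.println("new: " + newChars + " chars = " + bufferBytes(newChars) / 1024 + " KB");
    }
}
```

The saving applies per scanner instance, so it matters most when many tokenizers are alive at once.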

@@ -141,9 +141,9 @@ InlineElment = ( [aAbBiIqQsSuU] |
[vV][aA][rR] )


-%include src/java/org/apache/lucene/analysis/charfilter/HTMLCharacterEntities.jflex
+%include HTMLCharacterEntities.jflex

-%include src/java/org/apache/lucene/analysis/charfilter/HTMLStripCharFilter.SUPPLEMENTARY.jflex-macro
+%include HTMLStripCharFilter.SUPPLEMENTARY.jflex-macro

%{
private static final int INITIAL_INPUT_SEGMENT_SIZE = 1024;

@@ -36,6 +36,7 @@ import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
%function getNextToken
%pack
%char
+%buffer 4096

%{

@@ -14,7 +14,7 @@
* limitations under the License.
*/

-// Generated using ICU4J 49.1.0.0 on Friday, July 27, 2012 6:24:21 AM UTC
+// Generated using ICU4J 49.1.0.0 on Monday, August 6, 2012 5:23:08 PM UTC
// by org.apache.lucene.analysis.icu.GenerateJFlexSupplementaryMacros


@@ -44,8 +44,9 @@ import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
%implements StandardTokenizerInterface
%function getNextToken
%char
+%buffer 4096

-%include src/java/org/apache/lucene/analysis/standard/SUPPLEMENTARY.jflex-macro
+%include SUPPLEMENTARY.jflex-macro
ALetter = ([\p{WB:ALetter}] | {ALetterSupp})
Format = ([\p{WB:Format}] | {FormatSupp})
Numeric = ([\p{WB:Numeric}] | {NumericSupp})

@@ -47,8 +47,9 @@ import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
%implements StandardTokenizerInterface
%function getNextToken
%char
+%buffer 4096

-%include src/java/org/apache/lucene/analysis/standard/SUPPLEMENTARY.jflex-macro
+%include SUPPLEMENTARY.jflex-macro
ALetter = ([\p{WB:ALetter}] | {ALetterSupp})
Format = ([\p{WB:Format}] | {FormatSupp})
Numeric = ([\p{WB:Numeric}] | {NumericSupp})
@@ -88,7 +89,7 @@ HiraganaEx = {Hiragana} ({Format} | {Extend})*
// RFC-5321: Simple Mail Transfer Protocol
// RFC-5322: Internet Message Format

-%include src/java/org/apache/lucene/analysis/standard/ASCIITLD.jflex-macro
+%include ASCIITLD.jflex-macro

DomainLabel = [A-Za-z0-9] ([-A-Za-z0-9]* [A-Za-z0-9])?
DomainNameStrict = {DomainLabel} ("." {DomainLabel})* {ASCIITLD}

@@ -38,8 +38,9 @@ import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
%implements StandardTokenizerInterface
%function getNextToken
%char
+%buffer 4096

-%include src/java/org/apache/lucene/analysis/standard/std31/SUPPLEMENTARY.jflex-macro
+%include SUPPLEMENTARY.jflex-macro
ALetter = ([\p{WB:ALetter}] | {ALetterSupp})
Format = ([\p{WB:Format}] | {FormatSupp})
Numeric = ([\p{WB:Numeric}] | {NumericSupp})

@@ -38,8 +38,9 @@ import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
%implements StandardTokenizerInterface
%function getNextToken
%char
+%buffer 4096

-%include src/java/org/apache/lucene/analysis/standard/std31/SUPPLEMENTARY.jflex-macro
+%include SUPPLEMENTARY.jflex-macro
ALetter = ([\p{WB:ALetter}] | {ALetterSupp})
Format = ([\p{WB:Format}] | {FormatSupp})
Numeric = ([\p{WB:Numeric}] | {NumericSupp})
@@ -77,7 +78,7 @@ ExtendNumLetEx = {ExtendNumLet} ({Format} | {Extend})*
// RFC-5321: Simple Mail Transfer Protocol
// RFC-5322: Internet Message Format

-%include src/java/org/apache/lucene/analysis/standard/std31/ASCIITLD.jflex-macro
+%include ASCIITLD.jflex-macro

DomainLabel = [A-Za-z0-9] ([-A-Za-z0-9]* [A-Za-z0-9])?
DomainNameStrict = {DomainLabel} ("." {DomainLabel})* {ASCIITLD}

@@ -36,8 +36,9 @@ import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
%implements StandardTokenizerInterface
%function getNextToken
%char
+%buffer 4096

-%include src/java/org/apache/lucene/analysis/standard/std34/SUPPLEMENTARY.jflex-macro
+%include SUPPLEMENTARY.jflex-macro
ALetter = ([\p{WB:ALetter}] | {ALetterSupp})
Format = ([\p{WB:Format}] | {FormatSupp})
Numeric = ([\p{WB:Numeric}] | {NumericSupp})