org.eclipse.jdt.core.internal.tools

How to support a new Unicode version in the scanner:

1) Go to http://www.unicode.org/Public/
2) Select the folder for the Unicode version for which you want to generate the scanner resource files.
3) Select the ucdxml folder and download the file called ucd.all.flat.zip.
4) Unzip that file on your disk. This creates a file called ucd.all.flat.xml.
5) To generate the resource files for identifier starts, invoke
org.eclipse.jdt.core.internal.tools.unicode.GenerateIdentifierStartResources with the following arguments:
	- first argument: Unicode version
	- second argument: path to the ucd.all.flat.xml file
	- third argument: folder in which the resource files will be generated
For example:
	8.0 c:/unicode8.0.0/ucd.all.flat.xml c:/unicode8.0.0/res

6) To generate the resource files for identifier parts, invoke
org.eclipse.jdt.core.internal.tools.unicode.GenerateIdentifierPartResources with the same three arguments as in step 5.
7) Once this is done, edit org.eclipse.jdt.internal.compiler.parser.ScannerHelper to add a new table for the new Unicode version.

For example:
- add the new method:
	static void initializeTable19() {
		Tables9 = initializeTables("unicode8"); //$NON-NLS-1$
	}
- add the new static field Tables9.
- add a new folder unicode8 as a sub folder of org/eclipse/jdt/internal/compiler/parser/.
- put into this folder all resource files generated in steps 5 and 6.
- modify
	org.eclipse.jdt.internal.compiler.parser.ScannerHelper.isJavaIdentifierPart(long, int)
	org.eclipse.jdt.internal.compiler.parser.ScannerHelper.isJavaIdentifierStart(long, int)
  to use the new Tables9 values based on the compliance value by adding a new else if condition.

For org.eclipse.jdt.internal.compiler.parser.ScannerHelper.isJavaIdentifierPart(long, int), the last else becomes
an else if that handles the previous 1.8 compliance, and a new last else handles the new tables:
	else if (complianceLevel <= ClassFileConstants.JDK1_8) {
		// java 8 supports Unicode 6.2
		if (Tables8 == null) {
			initializeTable18();
		}
		switch((codePoint & 0x1F0000) >> 16) {
			case 0 :
				return isBitSet(Tables8[PART_INDEX][0], codePoint & 0xFFFF);
			case 1 :
				return isBitSet(Tables8[PART_INDEX][1], codePoint & 0xFFFF);
			case 2 :
				return isBitSet(Tables8[PART_INDEX][2], codePoint & 0xFFFF);
			case 14 :
				return isBitSet(Tables8[PART_INDEX][3], codePoint & 0xFFFF);
		}
	} else {
		// java 9 supports Unicode 8
		if (Tables9 == null) {
			initializeTable19();
		}
		switch((codePoint & 0x1F0000) >> 16) {
			case 0 :
				return isBitSet(Tables9[PART_INDEX][0], codePoint & 0xFFFF);
			case 1 :
				return isBitSet(Tables9[PART_INDEX][1], codePoint & 0xFFFF);
			case 2 :
				return isBitSet(Tables9[PART_INDEX][2], codePoint & 0xFFFF);
			case 14 :
				return isBitSet(Tables9[PART_INDEX][3], codePoint & 0xFFFF);
		}
	}
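
The switch above splits a code point into a plane number (bits 16 to 20) and a 16-bit offset inside that plane, then tests one bit in the table for that plane. The following is a minimal, self-contained sketch of that decomposition. The class PlaneLookupSketch and the helpers newPlaneTable, setBit and tableIndexForPlane are hypothetical names, and this isBitSet is an assumed implementation (one bit per code point, 64 bits per long), not necessarily the one in ScannerHelper. Returning false for a plane without a table is also an assumption.

```java
// Sketch only: all names here are hypothetical and do not come from ScannerHelper.
class PlaneLookupSketch {

    // Assumed layout: one bit per code point of a 64K plane, so 65536 / 64 = 1024 longs.
    static long[] newPlaneTable() {
        return new long[1024];
    }

    // Marks the code point at the given offset (0..0xFFFF) as accepted.
    static void setBit(long[] table, int offset) {
        table[offset >>> 6] |= 1L << (offset & 63);
    }

    // Assumed implementation of a bit test over the layout described above.
    static boolean isBitSet(long[] table, int offset) {
        return (table[offset >>> 6] & (1L << (offset & 63))) != 0;
    }

    // Maps a plane number to the table slot used by the switch in the text:
    // planes 0, 1, 2 and 14 use slots 0, 1, 2 and 3. Other planes have no table.
    static int tableIndexForPlane(int plane) {
        switch (plane) {
            case 0:
                return 0;
            case 1:
                return 1;
            case 2:
                return 2;
            case 14:
                return 3;
            default:
                return -1;
        }
    }

    // Same decomposition as in the text: the plane selects the table,
    // the low 16 bits select the bit inside it.
    static boolean lookup(long[][] tables, int codePoint) {
        int plane = (codePoint & 0x1F0000) >> 16;
        int offset = codePoint & 0xFFFF;
        int index = tableIndexForPlane(plane);
        return index >= 0 && isBitSet(tables[index], offset);
    }
}
```

For example, code point 0x10400 decomposes into plane 1 and offset 0x0400, so it is looked up in the second table; code point 0xE0100 decomposes into plane 14 and offset 0x0100, so it is looked up in the fourth table.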

8) Do the same set of changes for org.eclipse.jdt.internal.compiler.parser.ScannerHelper.isJavaIdentifierStart(long, int).
9) Add a regression test class in org.eclipse.jdt.core.tests.compiler.regression similar to org.eclipse.jdt.core.tests.compiler.regression.Unicode18Test.
You can get the character value for the regression test by searching the ucd.all.flat.xml file for an entry whose age attribute equals the
Unicode version you want to check (e.g. for Unicode 8, age="8.0").
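
Searching the file by hand works, but the lookup can also be scripted. The following is a minimal sketch, not part of the tool: the class AgeScanSketch and its method are hypothetical names. It assumes the file uses char elements with cp, age and gc attributes, and it skips range entries that use first-cp/last-cp instead of cp. It keeps only code points whose general category matches the rule documented for Character.isJavaIdentifierStart (letters, letter numbers, currency symbols, connector punctuation). Whether the generator classes apply exactly the same rule is not confirmed here.

```java
import java.io.Reader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Sketch only: AgeScanSketch is a hypothetical helper, not part of the JDT tools.
class AgeScanSketch {

    // General categories accepted by the documented rule for Character.isJavaIdentifierStart:
    // letters (Lu, Ll, Lt, Lm, Lo), letter numbers (Nl), currency symbols (Sc),
    // connector punctuation (Pc).
    private static final Set<String> START_CATEGORIES =
            new HashSet<>(Arrays.asList("Lu", "Ll", "Lt", "Lm", "Lo", "Nl", "Sc", "Pc"));

    // Returns up to 'limit' code points whose age attribute equals 'age'
    // and whose general category qualifies as an identifier start.
    static List<Integer> identifierStartsWithAge(Reader xml, String age, int limit) {
        List<Integer> result = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
            try {
                while (reader.hasNext() && result.size() < limit) {
                    if (reader.next() != XMLStreamConstants.START_ELEMENT) {
                        continue;
                    }
                    if (!"char".equals(reader.getLocalName())) {
                        continue;
                    }
                    String cp = reader.getAttributeValue(null, "cp");
                    if (cp == null) {
                        continue; // range entry (first-cp/last-cp), skipped in this sketch
                    }
                    if (!age.equals(reader.getAttributeValue(null, "age"))) {
                        continue;
                    }
                    String category = reader.getAttributeValue(null, "gc");
                    if (category != null && START_CATEGORIES.contains(category)) {
                        result.add(Integer.parseInt(cp, 16));
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not parse the XML input", e);
        }
        return result;
    }
}
```

With the real file, the reader could be opened with java.nio.file.Files.newBufferedReader on the path used in step 5, and the age argument would be "8.0" for Unicode 8. Any returned code point is a candidate for the regression test.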

If you have any questions regarding this tool, please comment in the bug report 506870: https://bugs.eclipse.org/bugs/show_bug.cgi?id=506870