Missing support for google's Noto CJK and Adobe's SourceHan font families #31

Open
lemzwerg opened this issue Nov 8, 2018 · 11 comments

Comments

@lemzwerg

lemzwerg commented Nov 8, 2018

It seems that cjk-gs-integrate doesn't recognize fonts from the SourceHan or NotoCJK families.

@aminophen
Member

I already have support for these locally, but I haven't had enough time to commit it yet.

@lemzwerg
Author

lemzwerg commented Nov 8, 2018

Great! So... hopefully you will soon find some time :-)

@aminophen
Member

Added in 18dcc41, but not yet well tested, mainly for the following reasons:

  • The priority field has not been decided yet, because I have not had time to check which format (OTC or OTF) is more commonly available and which causes less trouble.
  • I suspect Ghostscript will not be able to handle the Noto/SourceHan font sets, because they come in a rather new format: they are based on Adobe-Identity-0, not on Adobe-{Japan1,Korea1,GB1,CNS1}. In that case, Ghostscript will fail to display or embed the proper glyphs.
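The distinction in the second point can be checked per font. The following is a minimal sketch, not part of cjk-gs-integrate: `classify_ros` and `read_ros` are hypothetical helper names, and `read_ros` assumes the third-party fontTools package is installed and that the input is a single CID-keyed OTF (not an OTC collection).

```python
# Orderings named in the comment above as the "classic" Adobe collections.
LEGACY_ORDERINGS = {"Japan1", "Korea1", "GB1", "CNS1"}


def classify_ros(registry, ordering):
    """Classify a Registry-Ordering pair.

    Returns 'legacy' for Adobe-{Japan1,Korea1,GB1,CNS1},
    'identity' for Adobe-Identity, and 'other' for anything else.
    """
    if registry != "Adobe":
        return "other"
    if ordering in LEGACY_ORDERINGS:
        return "legacy"
    if ordering == "Identity":
        return "identity"
    return "other"


def read_ros(otf_path):
    """Return (registry, ordering, supplement) from the 'CFF ' table, or None.

    Assumption: fontTools is available; it is imported lazily so that
    classify_ros can be used without it. None is returned when the font
    is not CID-keyed (no ROS entry in the top dictionary).
    """
    from fontTools.ttLib import TTFont

    font = TTFont(otf_path)
    top_dict = font["CFF "].cff.topDictIndex[0]
    return getattr(top_dict, "ROS", None)
```

For a SourceHan OTF one would expect `read_ros` to report the Adobe-Identity-0 collection mentioned above, which `classify_ros` maps to 'identity'.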

@lemzwerg
Author

It seems to work, thanks. The created link from e.g., SourceHanSerif-Medium.otf to CIDFont/SourceHanSerif-Medium makes ghostscript accept the OTF.
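For illustration, the kind of link described here could be created by hand roughly as follows. This is only a sketch of the idea, not what cjk-gs-integrate actually runs: `link_otf_as_cidfont` is a hypothetical helper, `resource_dir` stands for whichever Ghostscript Resource directory is in use, and the link name is assumed to equal the file name without its extension.

```python
from pathlib import Path


def link_otf_as_cidfont(otf_path, resource_dir):
    """Create <resource_dir>/CIDFont/<stem> as a symlink to the OTF file.

    Assumption: the file stem (e.g. 'SourceHanSerif-Medium') is the name
    under which Ghostscript should find the font.
    """
    otf = Path(otf_path).resolve()
    cidfont_dir = Path(resource_dir) / "CIDFont"
    cidfont_dir.mkdir(parents=True, exist_ok=True)

    link = cidfont_dir / otf.stem
    # Replace a stale or dangling link from an earlier run.
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(otf)
    return link
```

Calling it twice with the same arguments simply recreates the link, so it is safe to rerun after moving fonts around.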

My use-case is the music score typesetting engine lilypond, which has the command line option --pspdfopt=TeX-GS to produce PDFs without embedded fonts. Multiple PDFs (with CJK characters) are included in a XeTeX document (also with CJK characters), and the resulting PDF file is post-processed with ps2pdf.

Note, however, that I get a far better result if I convert SourceHanSerif-Medium.otf to a real CIDFont resource (I use fontforge for that purpose): Using embedded CJK fonts everywhere, XeTeX creates a 13MByte file, for example. Doing it the cjk-gs-support way (with the above --pspdfopt option for lilypond) the size is reduced to 1.3MByte, and with a CIDFont resource instead of cjk-gs-support the resulting file is only 232kByte! I guess it is a limitation of ghostscript (tested version is 9.26) that OTF files are not as efficiently handled.

@norbusan
Member

Hello Werner,
(wherever I try to hide, you find me ;-)

Thanks for your comments and confirmation that it works. When I first wrote this script I didn't plan it to be a universal font installer for CJK fonts, though over time and with the hard work of Hironobu it has grown a lot.

What do you mean by "far better" here? The size difference, I suppose, because there shouldn't be any visual difference.

Best

Norbert

@lemzwerg
Author

[Hehe, I lift up some random stone, and underneath there is already a project you have your fingers in :-)]

Yes, I mean the very noticeable size difference when using the current Ghostscript version, 9.26. Maybe it's worth investigating whether such size reductions are also possible for cjk-gs-support (i.e., working with CIDFont resource files in addition to CID-keyed OTFs if there is plenty of disk space).

Just a note of warning: Currently, XeTeX happily accepts CIDFont resource files as fonts if offered by fontconfig. However, it can't use them, and xdvipdfmx aborts with a wrong error message, cf. https://sourceforge.net/p/xetex/bugs/156.

@aminophen
Member

Sorry, I don't know what a "CIDFont resource created by fontforge" looks like, as I've never used fontforge. Could you give me more information about it, or tell me how to create one? (I'm currently working on macOS, and MacPorts seems to have a "fontforge" port ...)

@lemzwerg
Author

Assuming that you can get fontforge's GUI to work, run

fontforge SourceHanSerif-Medium.otf 

(this can take quite a long time, especially if fontforge has to regenerate its fontconfig database), then select 'File->Generate Fonts->PS CID', change the output font name to 'SourceHanSerif-Medium'
and press the 'Generate' button. This results in a 41MByte file, to be put into ~/.fonts/CIDFont, for example. To avoid the XeTeX problem described above, you can add the following to your fonts.conf file:

<selectfont>
  <rejectfont>
    <pattern>
      <patelt name="fontformat" >
          <string>CID Type 1</string>
      </patelt>
    </pattern>
  </rejectfont>
</selectfont>
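If you prefer not to edit fonts.conf by hand, the same rule can be written to a separate file. This is a sketch under assumptions: `write_reject_rule` is a hypothetical helper, the rule is wrapped in a `<fontconfig>` root element so that the file is a complete configuration, and the target path (for example something like ~/.config/fontconfig/conf.d/60-reject-cid-type1.conf) is only read if your fontconfig setup includes that directory.

```python
from pathlib import Path

# The rejectfont rule from above, wrapped so it forms a standalone file.
REJECT_CID_TYPE1 = """<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <selectfont>
    <rejectfont>
      <pattern>
        <patelt name="fontformat">
          <string>CID Type 1</string>
        </patelt>
      </pattern>
    </rejectfont>
  </selectfont>
</fontconfig>
"""


def write_reject_rule(conf_path):
    """Write the reject rule to conf_path, creating parent directories."""
    path = Path(conf_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(REJECT_CID_TYPE1, encoding="utf-8")
    return path
```

After writing the file, refreshing the fontconfig cache (e.g. with fc-cache) should make the CIDFont resources disappear from the list of fonts offered to XeTeX.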

@aminophen
Member

I installed fontforge using MacPorts, but it does not work on my computer:

$ fontforge --version
Copyright (c) 2000-2012 by George Williams.
 Executable based on sources from 14:57 GMT 31-Jul-2012-ML-NoPython.
 Library based on sources from 14:57 GMT 31-Jul-2012.
fontforge 20120731
libfontforge 20120731-ML

$ fontforge /path/to/SourceHanSerif-Medium.otf
Copyright (c) 2000-2012 by George Williams.
 Executable based on sources from 14:57 GMT 31-Jul-2012-ML-NoPython.
 Library based on sources from 14:57 GMT 31-Jul-2012.
Abort trap: 6

$ fontforge /path/to/SourceHanSerif-Bold.otf
Copyright (c) 2000-2012 by George Williams.
 Executable based on sources from 14:57 GMT 31-Jul-2012-ML-NoPython.
 Library based on sources from 14:57 GMT 31-Jul-2012.
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Internal Error:
Reference found in CID font. Can't fix it up
Segmentation fault: 11

Luckily, $ fontforge /path/to/SourceHanSans-Bold.otf worked fine, but it is hard to figure out what fontforge is doing.

@aminophen
Member

I found that MacPorts can activate fontforge+python27, and that variant works.

$ fontforge --version
Copyright (c) 2000-2014 by George Williams. See AUTHORS for Contributors.
 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
 with many parts BSD <http://fontforge.org/license.html>. Please read LICENSE.
 Based on sources from 14:14 UTC 12-Sep-2018-D.
 Based on source from git with hash: 
fontforge 14:14 UTC 12-Sep-2018
libfontforge 20180912

I first wrote ffscript.pe:

Open("SourceHanSerif-Bold.otf")
Generate("SourceHanSerif-Bold.cid")

and then ran $ fontforge -script ffscript.pe to generate SourceHanSerif-Bold.cid. It is difficult for me to read the resulting PS CID...
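To convert several weights in one go, the two-line script can be generated rather than typed. A minimal sketch, assuming file names without double quotes and output into the current directory; `build_ff_script` and `run_ff_script` are hypothetical helper names, and only the `Open`, `Generate`, and `fontforge -script` calls shown above are used.

```python
import subprocess
from pathlib import Path


def build_ff_script(otf_names):
    """Return fontforge script text with one Open/Generate pair per font."""
    lines = []
    for name in otf_names:
        out = Path(name).stem + ".cid"  # SourceHanSerif-Bold.otf -> SourceHanSerif-Bold.cid
        lines.append(f'Open("{name}")')
        lines.append(f'Generate("{out}")')
    return "\n".join(lines) + "\n"


def run_ff_script(script_path):
    """Run `fontforge -script <script_path>`; requires fontforge on PATH."""
    subprocess.run(["fontforge", "-script", str(script_path)], check=True)
```

Typical use would be to write `build_ff_script([...])` to ffscript.pe and then call `run_ff_script("ffscript.pe")`.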

@lemzwerg
Author

Good to know that you can create such files on the Mac!

The generated resource file is similar to a large Type1 PostScript font using the CID framework (i.e., putting the glyphs into various font dictionaries). The font encoding is identical to the data in the OTF's 'CFF ' table ('Adobe-Identity-0', which essentially means unordered). This implies that the metrics and cmaps from the OTF file must be used to generate the PDF; only for ps2pdf it makes sense to use the generated CIDFont resource.

Note that it also works if you generate a 'bare CFF CID-keyed font' with fontforge. However, the PDF created by ps2pdf isn't as compact as when using the PS CID font. For the use-case discussed above, the resulting PDF has a size of 957kByte; it is thus smaller than when making Ghostscript use the OTF, but still much larger than with the PS CID format.
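To put the four sizes reported in this thread side by side, here is a small calculation. It is only arithmetic on the numbers quoted above, assuming 1 MByte = 1000 kByte; the dictionary keys are descriptive labels chosen for this sketch.

```python
# Output PDF sizes in kByte, as reported in this thread for one use-case.
SIZES_KB = {
    "fonts embedded everywhere (XeTeX)": 13000,
    "OTF via cjk-gs-support": 1300,
    "bare CFF CID-keyed font": 957,
    "PS CID font resource": 232,
}


def ratios_to_smallest(sizes):
    """Return each size divided by the smallest one, rounded to one decimal."""
    smallest = min(sizes.values())
    return {name: round(size / smallest, 1) for name, size in sizes.items()}


for name, factor in ratios_to_smallest(SIZES_KB).items():
    print(f"{name}: {factor}x")
```

With these numbers, the OTF route ends up roughly 5.6 times larger than the PS CID result and the bare CFF route roughly 4.1 times larger.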
