Commit

Merge remote-tracking branch 'upstream/master'
rmorlok committed Mar 16, 2013
2 parents ff59857 + b6ba80a commit af8e9c1
Showing 21 changed files with 189 additions and 91 deletions.
20 changes: 10 additions & 10 deletions CMakeLists.txt
@@ -7,7 +7,7 @@ cmake_minimum_required(VERSION 2.6.0 FATAL_ERROR)

include_directories(${CMAKE_SOURCE_DIR}/src)

set(PDF2HTMLEX_VERSION "0.7")
set(PDF2HTMLEX_VERSION "0.8")
set(ARCHIVE_NAME pdf2htmlex-${PDF2HTMLEX_VERSION})
add_custom_target(dist
COMMAND git archive --prefix=${ARCHIVE_NAME}/ HEAD
@@ -43,7 +43,7 @@ if(FONTFORGE_FOUND)
link_directories(${FONTFORGE_LIBRARY_DIRS})
set(PDF2HTMLEX_LIBS ${PDF2HTMLEX_LIBS} ${FONTFORGE_LIBRARIES})
else()
message("Trying to locate fontforge...")
message("Trying to locate old versions of fontforge...")
find_path(FF_INCLUDE_PATH fontforge/fontforge.h)
if(FF_INCLUDE_PATH)
message("Found fontforge.h: ${FF_INCLUDE_PATH}/fontforge/fontforge.h")
@@ -61,6 +61,14 @@ else()
else()
message(FATAL_ERROR "Error: cannot locate fontforge.h")
endif()
find_path(FF_CONFIG_INCLUDE_PATH config.h PATHS
${FONTFORGE_INCLUDE_DIRS} NO_DEFAULT_PATH)
if(FF_CONFIG_INCLUDE_PATH)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -include ${FF_CONFIG_INCLUDE_PATH}/config.h")
message("Found config.h: ${FF_CONFIG_INCLUDE_PATH}/config.h")
else()
message("Cannot locate config.h for fontforge")
endif()

macro(wl_find_library LIB_NAME RESULT)
unset(${RESULT})
@@ -98,14 +106,6 @@ else()
set(PDF2HTMLEX_LIBS ${PDF2HTMLEX_LIBS} ${PYTHON_LIBRARIES})
endif()

find_path(FF_CONFIG_INCLUDE_PATH config.h PATHS
${FONTFORGE_INCLUDE_DIRS} NO_DEFAULT_PATH)
if(FF_CONFIG_INCLUDE_PATH)
set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -include ${FF_CONFIG_INCLUDE_PATH}/config.h")
message("Found config.h: ${FF_CONFIG_INCLUDE_PATH}/config.h")
else()
message("Cannot locate config.h for fontforge")
endif()

# debug build flags (overwrite default cmake debug flags)
set(CMAKE_C_FLAGS_DEBUG "-ggdb")
7 changes: 6 additions & 1 deletion ChangeLog
@@ -1,8 +1,13 @@
Latest v0.7
Latest v0.8

v0.7
2013.03.01

* Process outline
* Fix build with poppler
* Many code cleaning jobs [John Hewson]
* Experimental printing support
* Lots of code refinements

v0.6
2013.01.26
14 changes: 8 additions & 6 deletions README.md
@@ -9,8 +9,8 @@ A beautiful demo is worth a thousand words:
- **Scientific Paper**: [Default](http://coolwanglu.github.com/pdf2htmlEX/demo/demo.html) / [MediaFire](http://www.mediafire.com/view/?6po429kz9czcga2) / [Original](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.148.349&rep=rep1&type=pdf)
- **Full Circle Magazine**: [Default](http://coolwanglu.github.com/pdf2htmlEX/demo/issue65_en.html) / [MediaFire](http://www.mediafire.com/view/?6hxmt94k2vppnpb) / [Original](http://dl.fullcirclemagazine.org/issue65_en.pdf) <sub>The 1st link might be slow</sub>
- **Chinese**: [Default](http://coolwanglu.github.com/pdf2htmlEX/demo/chn.html) / [MediaFire](http://www.mediafire.com/view/?6550ldag9w0uuq3) / [Original](http://files.cnblogs.com/phphuaibei/git%E6%90%AD%E5%BB%BA.pdf)
- Try your own files on [MediaFire](http://www.mediafire.com), which uses pdf2htmlEX for its PDF preview feature.
- [Try your own files](https://github.com/coolwanglu/pdf2htmlEX/wiki/UploadDemo)

## Introduction

pdf2htmlEX renders PDF files in HTML, utilizing modern Web technologies.
@@ -29,13 +29,15 @@ The generated HTML file is static, Javascript is not required.
- Correct font & position & styles
- Proper reencoding
- Generated HTML file is of similar size as the original (uncompressed) PDF file
- Fallback (image + hidden text) - better accuracy and compatibility
* Output modes
- Normal HTML
- All-in-one HTML - portable & easy to share
- One HTML per page - best for dynamic pages
* More PDF stuffs that you love
- Links
- Outline
- Printing (experimental)

[Full list](https://github.com/coolwanglu/pdf2htmlEX/wiki/Feature-List)
[Compare with others](https://github.com/coolwanglu/pdf2htmlEX/wiki/Comparison)
@@ -83,15 +85,15 @@ Thanks to all packagers!
## Usage

pdf2htmlEX /path/to/foobar.pdf
pdf2htmlEX --help
man pdf2htmlEX

[Quick Start](https://github.com/coolwanglu/pdf2htmlEX/wiki/QuickStart)

## FAQ

* [Troubleshooting compilation errors](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ#wiki-compile)
* [How can I help](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ#wiki-help)
* [I want more features](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ#wiki-feature_commission)
* [More about pdf2htmlEX](https://github.com/coolwanglu/pdf2htmlEX/wiki/)
* [More...](https://github.com/coolwanglu/pdf2htmlEX/wiki/FAQ)

## LICENSE

@@ -114,7 +116,7 @@ pdf2htmlEX is maintained by one person in spare time, and it needs your help!

* Lu Wang <[email protected]>
* For personal enquiries only
* Accepting messages in **Chinese**, **English** or **Japanese**.
* Accepting messages in **中文**, **English** or **日本語**.

## Acknowledge

11 changes: 4 additions & 7 deletions TODO
@@ -1,19 +1,15 @@
clean css class names
print css for draw/link/image...

== Future: ==

Too difficult/complicated to implement:
- integrate splash/cairo
- naive support for image/drawing (SVG?)
- naive image/drawing (SVG?)
- type 3 fonts (convert to SVG fonts?)
- reflowable text/combine lines/unwrapping
- Printing
- multi-thread

Not enough motivated/Lazy
- argument auto-completion
- use absolute positioning for long whitespace
- color invert
- detect duplicate base fonts when embedding
- disable selection if we know unicode is wrong
- check if we can add information to the font, and let browsers show ligatures automatically
@@ -23,6 +19,7 @@ Not enough motivated/Lazy
- merge sub/sup into one line
- precise link dest: zoom
- multiple charcode mapped to a same glyph
- don't dump image when there is nothing
- don't dump image when it is empty
- minimum line width of css drawing
- ajax in pdf2htmlEX for separated pages
- separate classes for annotations (such that we don't have to hide all css drawings for printing)
13 changes: 13 additions & 0 deletions debian/changelog
@@ -1,3 +1,16 @@
pdf2htmlex (0.8-1~git201303011406r3bc73-0ubuntu1) quantal; urgency=low

* Experimental printing support
* New version

-- WANG Lu <[email protected]> Fri, 01 Mar 2013 14:06:42 +0800

pdf2htmlex (0.7-1~git201302282259r3bc73-0ubuntu1) quantal; urgency=low

* suggests ttfautohint

-- WANG Lu <[email protected]> Thu, 28 Feb 2013 22:59:45 +0800

pdf2htmlex (0.7-1~git201302271054r3bc73-0ubuntu1) precise; urgency=low

* Packaging for 12.04
1 change: 1 addition & 0 deletions debian/control
@@ -9,5 +9,6 @@ Homepage: http://github.com/coolwanglu/pdf2htmlEX
Package: pdf2htmlex
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}, libpoppler27 (>= 0.20.3) | libpoppler28, libpng12-0, libfontforge1
Suggests: ttfautohint
Description: Converts PDF to HTML without losing format
pdf2htmlEX converts PDF to HTML while retaining text, format & style as much as possible
4 changes: 4 additions & 0 deletions pdf2htmlEX.1.in
@@ -87,6 +87,10 @@ Specify the filename of the generated outline file, if not embedded.

If it's empty, the file name will be determined automatically.

.TP
.B --fallback <0|1> (Deafult: 0)
Output in fallback mode, for better accuracy and browser compatibility, but the size becomes larger.

.TP
.B --process-nontext <0|1> (Default: 1)
Whether to process non-text objects (as images)
19 changes: 17 additions & 2 deletions share/base.css.in
@@ -72,6 +72,9 @@
overflow:visible;
background-color:transparent;
}
.@CSS_CSS_DRAW_CN@ {
display:none;
}
}
/* Part 2: Page Elements: Modify with caution
* The followings are base classes, which are meant to be override by PDF specific classes
@@ -113,6 +116,17 @@
.@CSS_PAGE_CONTENT_BOX_CN@.opened { /* used by pdf2htmlEX.js, to show/hide pages */
display:block;
}
.@CSS_BACKGROUND_IMAGE_CN@ {
position:absolute;
left:0;
top:0;
width:100%;
height:100%;
-ms-user-select:none;
-moz-user-select:none;
-webkit-user-select:none;
user-select:none;
}
@media print {
.@CSS_PAGE_DECORATION_CN@ {
margin:0;
@@ -150,11 +164,12 @@ span { /* text blocks within a line */
color:transparent;
z-index:-1;
}
/* selection background should not be opaque, for fallback mode */
::selection{
background: rgba(127,255,255,1);
background: rgba(127,255,255,0.4);
}
::-moz-selection{
background: rgba(127,255,255,1);
background: rgba(127,255,255,0.4);
}
.@CSS_PAGE_DATA_CN@ { /* info for Javascript */
display:none;
10 changes: 8 additions & 2 deletions share/pdf2htmlEX.js.in
@@ -15,6 +15,7 @@ var pdf2htmlEX = (function(){
page_decoration : '@CSS_PAGE_DECORATION_CN@',
page_content_box : '@CSS_PAGE_CONTENT_BOX_CN@',
page_data : '@CSS_PAGE_DATA_CN@',
background_image : '@CSS_BACKGROUND_IMAGE_CN@',
link : '@CSS_LINK_CN@',
__dummy__ : 'no comma'
};
@@ -126,11 +127,12 @@ var pdf2htmlEX = (function(){
this.outline = $('#'+this.outline_id);
this.container = $('#'+this.container_id);

// need a better design
// Open the outline if nonempty
if(this.outline.children().length > 0) {
this.outline.addClass('opened');
}

// collect pages
var new_pages = new Array();
var pl= $('.'+CSS_CLASS_NAMES['page_frame'], this.container);
/* don't use for(..in..) */
@@ -140,14 +142,18 @@ var pdf2htmlEX = (function(){
}
this.pages = new_pages;

// register schedule rendering
var _ = this;
this.container.scroll(function(){ _.schedule_render(); });

//this.zoom_fixer();

// used by outline/annot_link etc
// handle links
this.container.add(this.outline).on('click', '.'+CSS_CLASS_NAMES['link'], this, this.link_handler);

// disable background image draging
$('.'+CSS_CLASS_NAMES['background_image'], this.container).on('dragstart', function(e){return false;});

this.render();
},
pre_hide_pages : function() {
18 changes: 10 additions & 8 deletions src/BackgroundRenderer/SplashBackgroundRenderer.cc
@@ -20,17 +20,19 @@ void SplashBackgroundRenderer::drawChar(GfxState *state, double x, double y,
CharCode code, int nBytes, Unicode *u, int uLen)
{
// draw characters as image when
// - there is special filling method
// - in fallback mode
// - OR there is special filling method
// - OR using a writing mode font
// - OR using a Type 3 font
if(( (state->getFont())
&& ( (state->getFont()->getWMode())
|| (state->getFont()->getType() == fontType3)
)
)
if((param->fallback)
|| ( (state->getFont())
&& ( (state->getFont()->getWMode())
|| (state->getFont()->getType() == fontType3)
)
)
)
{
SplashOutputDev::drawChar(state,x,y,dx,dy,originX,originY,code, nBytes, u, uLen);
SplashOutputDev::drawChar(state,x,y,dx,dy,originX,originY,code,nBytes,u,uLen);
}
}

@@ -42,7 +44,7 @@ void SplashBackgroundRenderer::render_page(PDFDoc * doc, int pageno, const strin
{
doc->displayPage(this, pageno, param->h_dpi, param->v_dpi,
0,
(param->use_cropbox == 0),
(!(param->use_cropbox)),
false, false,
nullptr, nullptr, &annot_cb, nullptr);

4 changes: 4 additions & 0 deletions src/HTMLRenderer/HTMLRenderer.h
@@ -241,6 +241,9 @@ class HTMLRenderer : public OutputDev
double text_scale_factor1;
double text_scale_factor2;

// 1px on screen should be printed as print_scale()pt
double print_scale (void) const { return 96.0 / DEFAULT_DPI / text_zoom_factor(); }


////////////////////////////////////////////////////
// states
@@ -294,6 +297,7 @@ class HTMLRenderer : public OutputDev
RiseManager rise_manager;
LeftManager left_manager;
////////////////////////////////////////////////
BGImageSizeManager bgimage_size_manager;

// optimize for web
// we try to render the final font size directly
2 changes: 1 addition & 1 deletion src/HTMLRenderer/TextLineBuffer.cc
@@ -275,7 +275,7 @@ int HTMLRenderer::TextLineBuffer::State::diff(const State & s) const

// the order should be the same as in the enum
const char * const HTMLRenderer::TextLineBuffer::State::css_class_names [] = {
CSS::FONT_NAME_CN,
CSS::FONT_FAMILY_CN,
CSS::FONT_SIZE_CN,
CSS::FILL_COLOR_CN,
CSS::STROKE_COLOR_CN,
10 changes: 5 additions & 5 deletions src/HTMLRenderer/font.cc
@@ -727,7 +727,7 @@ void HTMLRenderer::export_remote_font(const FontInfo & info, const string & suff
}

f_css.fs << "@font-face{"
<< "font-family:" << CSS::FONT_NAME_CN << info.id << ";"
<< "font-family:" << CSS::FONT_FAMILY_CN << info.id << ";"
<< "src:url(";

{
@@ -749,8 +749,8 @@ void HTMLRenderer::export_remote_font(const FontInfo & info, const string & suff
f_css.fs << ")"
<< "format(\"" << format << "\");"
<< "}" // end of @font-face
<< "." << CSS::FONT_NAME_CN << info.id << "{"
<< "font-family:" << CSS::FONT_NAME_CN << info.id << ";"
<< "." << CSS::FONT_FAMILY_CN << info.id << "{"
<< "font-family:" << CSS::FONT_FAMILY_CN << info.id << ";"
<< "line-height:" << round(info.ascent - info.descent) << ";"
<< "font-style:normal;"
<< "font-weight:normal;"
@@ -772,12 +772,12 @@ static string general_font_family(GfxFont * font)
// TODO: this function is called when some font is unable to process, may use the name there as a hint
void HTMLRenderer::export_remote_default_font(long long fn_id)
{
f_css.fs << "." << CSS::FONT_NAME_CN << fn_id << "{font-family:sans-serif;visibility:hidden;}" << endl;
f_css.fs << "." << CSS::FONT_FAMILY_CN << fn_id << "{font-family:sans-serif;visibility:hidden;}" << endl;
}

void HTMLRenderer::export_local_font(const FontInfo & info, GfxFont * font, const string & original_font_name, const string & cssfont)
{
f_css.fs << "." << CSS::FONT_NAME_CN << info.id << "{";
f_css.fs << "." << CSS::FONT_FAMILY_CN << info.id << "{";
f_css.fs << "font-family:" << ((cssfont == "") ? (original_font_name + "," + general_font_family(font)) : cssfont) << ";";

string fn = original_font_name;