<?xml version="1.0" encoding="windows-1252"?>
<node id="878526" title="Re: Convert PDF file into HTML file" created="2010-12-22 08:47:49" updated="2010-12-22 08:47:49">
<type id="11">
note</type>
<author id="708738">
LanX</author>
<data>
<field name="doctext">
The answer depends heavily on the nature of your PDFs and on the result you want! &lt;P&gt;

There is no simple answer to such a general question: PDF is a pure print format, HTML is a flowing format, and because the two differ by nature, any conversion needs heuristics (as already mentioned).&lt;P&gt;

The following post lists some possibilities (especially [http://linux.die.net/man/1/pdftohtml|pdftohtml -xml]) and links to other related discussions: &lt;P&gt;

[id://831190]&lt;P&gt;


&lt;!-- Node text goes above. Div tags should contain sig only --&gt;
&lt;div class="pmsig"&gt;&lt;div class="pmsig-708738"&gt;
&lt;p&gt;Cheers Rolf
&lt;/div&gt;&lt;/div&gt;</field>
<field name="root_node">
878483</field>
<field name="parent_node">
878483</field>
</data>
</node>
