[links-list] Re: PDF in Textmode

Cliff Cunnington ccnix at ccnet.xs4all.nl
Mon Nov 11 13:35:36 PST 2002


* Karel Kulhavy <clock at atrey.karlin.mff.cuni.cz> [2002-11-11 21:11:39 +0100]:

> > > Why don't you use the pdf to txt translation from google, instead of
> > > showing binary trash?
> > 
> > You can use Lua for that in ELinks, I believe ;-).
> 
> How?
> 

I'd do it this way.

 -- Add to (pw)hooks.lua

function pre_format_html_hook (url, html)
    -- Depending on a matching URL string, this function filters and modifies
    -- requested documents.

    -- begin-snip
    
    -- Converts PDF files to text and displays the result in ELinks.
    -- Requires pdftotext(1), packaged with xpdf.
    -- URL: <http://www.foolabs.com/xpdf/>
    elseif strfind (url, "%.pdf$") then
        -- Save the downloaded PDF to a temporary file for pdftotext.
        local tmp = tmpname ()
        writeto (tmp) write (html) writeto ()
        -- "-" sends the converted text to stdout; errors are discarded.
        html = pipe_read ("(pdftotext "..tmp.." -) 2>/dev/null")
        remove (tmp)
        ret = 1
    end
    -- end-snip
end


Usage:
    After following a link to a PDF document, ELinks will prompt for
    Save, View, or Cancel. Select View. The raw PDF document will start to
    display in the browser (i.e. "junk"). Once the document is fully
    downloaded, it will be filtered through the above function and, with
    luck, rendered as human-readable ASCII.
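
Since the snippet above is only a fragment of an existing if/elseif
chain, here is a rough sketch of how it might look as a complete,
standalone hook. It assumes an ELinks built against Lua 5.x (so it uses
os.tmpname, io.open and string.find rather than the Lua 4 globals),
that pipe_read is supplied by ELinks at runtime, and that returning nil
from the hook leaves the document unchanged. The helper name
pdf_to_text is made up for illustration.

```lua
-- Sketch only: assumes Lua 5.x and the ELinks-provided pipe_read().

-- Hypothetical helper: write the raw PDF bytes to a temporary file,
-- run pdftotext(1) on it, and return whatever it prints on stdout.
function pdf_to_text (data)
    local tmp = os.tmpname ()
    local f = io.open (tmp, "wb")
    if not f then return nil end
    f:write (data)
    f:close ()
    -- "-" makes pdftotext write to stdout; errors are discarded.
    local text = pipe_read ("pdftotext '" .. tmp .. "' - 2>/dev/null")
    os.remove (tmp)
    return text
end

function pre_format_html_hook (url, html)
    if string.find (url, "%.pdf$") then
        return pdf_to_text (html)
    end
    return nil  -- no modification for anything that is not a PDF
end
```

Outside ELinks there is no pipe_read, so the hook can only be tried
there with a stand-in function of the same name.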



A native Links function that permits pre-filtering would be nice. E.g.
an extra option in Links Setup -> Associations -> "Filter with app;
display output in Links".

For Links (text-mode), even being able to pre-filter HTML would be a
good idea. Users could then add a (Lua ;-) script to clean up stupid
<img alt="spacer"> tags, as well as other bad HTML practices.
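
To give an idea of how small such a cleanup script could be, here is a
minimal sketch, assuming Lua 5.x string functions. The function name
strip_spacer_images is invented for this example, and the pattern only
handles the double-quoted alt="spacer" form shown above.

```lua
-- Sketch only: remove <img> tags whose attributes include alt="spacer".
-- [^>]- matches lazily and never crosses the end of the tag.
function strip_spacer_images (html)
    local pattern = '<[Ii][Mm][Gg][^>]-alt="spacer"[^>]*>'
    -- Parentheses drop gsub's second return value (the match count).
    return (string.gsub (html, pattern, ""))
end
```

Such a function could be called from a pre-format hook before the
document is rendered; images with any other alt text are left alone.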

This would be especially useful for braille terminal users, who
could build their own custom alt text from the image filename. (See
links-discuss May/June 2002; thread title: "visible image tags".)
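
As an illustration of the filename idea, the following sketch turns an
image path into a short label and replaces each image tag with it. It
assumes Lua 5.1 or later (for string.match); the names alt_from_src and
label_images are made up, and the code ignores any alt text that is
already present.

```lua
-- Sketch only: derive a readable label from an image path.
function alt_from_src (src)
    local name = string.match (src, "([^/]+)$") or src  -- last path component
    name = string.gsub (name, "%.%w+$", "")             -- drop the extension
    name = string.gsub (name, "[_%-]+", " ")            -- separators to spaces
    return name
end

-- Replace every <img ... src="..."> tag with its bracketed label.
function label_images (html)
    local pattern = '<[Ii][Mm][Gg][^>]-src="([^"]*)"[^>]*>'
    return (string.gsub (html, pattern, function (src)
        return "[" .. alt_from_src (src) .. "]"
    end))
end
```

A user could swap alt_from_src for any naming rule that suits their
own terminal or reading habits.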

And finally, aalib fans, who have littered this list with requests
for aalib support since 1999, could simply(?) filter requested
images into a Links text display.


Regards,



Cliff
