convert PDF to plain text
Posted: 2018-09-07T15:40:58-07:00
by vargheseg
I have a business need to convert a PDF file that contains tabular data. Is it possible to convert the PDF document to a single text file while keeping the tabular context?
I do not need the table/cell boundaries, but I need the columns and rows to stay in the correct context.
Re: convert PDF to plain text
Posted: 2018-09-07T16:21:25-07:00
by snibgo
ImageMagick is a raster image processor. So it can read a PDF and create raster images from it. If you want formatted text, IM is not the appropriate tool. Try something like pdftotext.
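As a rough illustration of the pdftotext suggestion, here is a minimal Python sketch. It assumes the pdftotext command-line tool (from Poppler or Xpdf) is installed and on PATH. The function names and file names are made up for this example. The -layout option asks pdftotext to keep the physical layout of the page as best it can, which usually keeps table columns aligned with spaces; results depend on how the PDF was produced.

```python
import shutil
import subprocess


def build_pdftotext_cmd(pdf_path, txt_path):
    """Return the argument list for a layout-preserving pdftotext run.

    Hypothetical helper for this sketch, not part of any library.
    """
    # -layout: maintain the original physical layout as far as possible,
    # so that table columns come out as space-aligned text columns.
    return ["pdftotext", "-layout", pdf_path, txt_path]


def pdf_to_text(pdf_path, txt_path):
    """Run pdftotext; raise if the tool is missing or exits non-zero."""
    if shutil.which("pdftotext") is None:
        raise FileNotFoundError("pdftotext not found on PATH")
    subprocess.run(build_pdftotext_cmd(pdf_path, txt_path), check=True)


if __name__ == "__main__":
    # Example file names; replace with real paths before running.
    print(" ".join(build_pdftotext_cmd("report.pdf", "report.txt")))
```

Calling pdf_to_text("report.pdf", "report.txt") would then write a single text file. Whether the columns line up well enough for further processing has to be checked per document.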
Re: convert PDF to plain text
Posted: 2018-09-07T18:21:05-07:00
by vargheseg
Is it possible to convert to a PostScript file?
Re: convert PDF to plain text
Posted: 2018-09-07T18:22:32-07:00
by vargheseg
I doubt it, as PostScript files have no raster images.
Re: convert PDF to plain text
Posted: 2018-09-08T03:52:00-07:00
by snibgo
vargheseg wrote: Is it possible to convert to a PostScript file?
With IM, you can:
... but that will rasterize the PDF and embed the raster image in the PS file. If you want the text to remain as editable text, IM is the wrong tool.
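For reference, a typical ImageMagick PDF-to-PostScript conversion might look like the Python sketch below. This is an assumption about a common invocation, not necessarily the exact command from the post. The helper name, file names, and the density value of 150 are illustrative. ImageMagick 7 uses the magick command and ImageMagick 6 uses convert; both normally rely on Ghostscript to read PDFs.

```python
import shutil
import subprocess


def build_im_ps_cmd(pdf_path, ps_path, density=150):
    """Return an ImageMagick command that rasterizes a PDF into a PS file.

    Hypothetical helper for this sketch. The output PS contains raster
    images of the pages, not editable text.
    """
    # Prefer the IM7 "magick" binary; fall back to the IM6 "convert" name.
    exe = "magick" if shutil.which("magick") else "convert"
    # -density before the input sets the resolution (DPI) used when the
    # PDF pages are rendered to pixels.
    return [exe, "-density", str(density), pdf_path, ps_path]


def pdf_to_raster_ps(pdf_path, ps_path, density=150):
    """Run the conversion; raises CalledProcessError on failure."""
    subprocess.run(build_im_ps_cmd(pdf_path, ps_path, density), check=True)


if __name__ == "__main__":
    # Example file names; replace with real paths before running.
    print(" ".join(build_im_ps_cmd("in.pdf", "out.ps")))
```

A higher density gives sharper pages and a larger file, but the text is still pixels, so it cannot be selected or extracted afterwards.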