I have a PDF file from which I need to extract the text, which is entirely in a Hindi font. The PDF appears to be a scanned image. Please guide me on how to extract it into a text or Excel file.
Sample File
https://www.dropbox.com/s/kxbgp3cxb606i ... e.pdf?dl=0
Thanks
Need to Extract Hindi Text from PDF(Image) File
Re: Need to Extract Hindi Text from PDF(Image) File
Have you tried dedicated OCR software?
I tried part of your first page on http://www.i2ocr.com/free-online-hindi-ocr. It was a bit slow and the result was not 100% correct, but I would think you could edit the output. I doubt any OCR software would be 100% accurate.
Personally, unless you have hundreds of these to do, I would type it out manually: by the time you have checked that the results are correct, you could have typed it yourself.
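If you do go the OCR route and want the corrected output in Excel, one possible approach is to save it as a CSV file with a UTF-8 byte order mark, since Excel generally needs that marker to display Devanagari correctly. The snippet below is only a rough sketch: the function name `save_lines_for_excel`, the file name `hindi_text.csv`, and the two sample lines are made up for illustration and are not taken from your file.

```python
import csv


def save_lines_for_excel(lines, csv_path):
    """Write one line of text per row into a CSV file that Excel can open."""
    # "utf-8-sig" prepends a byte order mark, which Excel uses to detect UTF-8;
    # without it, Devanagari characters often show up as garbage.
    with open(csv_path, "w", newline="", encoding="utf-8-sig") as handle:
        writer = csv.writer(handle)
        writer.writerow(["line_number", "text"])
        for number, line in enumerate(lines, start=1):
            writer.writerow([number, line.strip()])


# Placeholder text standing in for the hand-corrected OCR output.
corrected = ["नमस्ते", "यह एक उदाहरण है"]
save_lines_for_excel(corrected, "hindi_text.csv")
```

Opening `hindi_text.csv` in Excel should then show one line of Hindi text per row.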
Re: Need to Extract Hindi Text from PDF(Image) File
Yes, I tried that, both before posting to this forum and again after your reply.
I got the error "Invalid Input Image Type".
I chose "Hindi" as the input language.
Thanks
Re: Need to Extract Hindi Text from PDF(Image) File
I did not download your whole file. I took a screen capture with the Microsoft Snipping Tool, which saved it as a PNG.
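If screen-capturing every page by hand gets tedious, the same PDF-to-PNG-to-OCR steps could probably be scripted. This is a minimal sketch, not something I have run against your file. It assumes the `pdftoppm` (Poppler) and `tesseract` command-line tools are installed, along with Tesseract's Hindi language data (`hin`); the helper names `pdftoppm_command`, `tesseract_command` and `ocr_pdf` are my own.

```python
import subprocess
import tempfile
from pathlib import Path


def pdftoppm_command(pdf_path, out_prefix, dpi=300):
    """Build the pdftoppm call that renders each PDF page to a PNG file."""
    return ["pdftoppm", "-png", "-r", str(dpi), str(pdf_path), str(out_prefix)]


def tesseract_command(image_path, lang="hin"):
    """Build the tesseract call that prints the recognised text to stdout."""
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def page_number(png_path):
    """Extract the page number from a name such as page-1.png or page-01.png."""
    return int(png_path.stem.split("-")[-1])


def ocr_pdf(pdf_path, lang="hin", dpi=300):
    """Return one recognised-text string per page of an image-only PDF."""
    pages = []
    with tempfile.TemporaryDirectory() as tmp:
        prefix = Path(tmp) / "page"
        # Assumption: pdftoppm is on the PATH; it writes page-<n>.png files.
        subprocess.run(pdftoppm_command(pdf_path, prefix, dpi), check=True)
        for png in sorted(Path(tmp).glob("page-*.png"), key=page_number):
            # Assumption: tesseract is on the PATH and has the "hin" data.
            result = subprocess.run(
                tesseract_command(png, lang),
                check=True, capture_output=True, encoding="utf-8",
            )
            pages.append(result.stdout)
    return pages
```

Calling `ocr_pdf("sample.pdf")` (with your own file name) would return the text page by page. The result will still need proofreading, as with the online tool.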
Re: Need to Extract Hindi Text from PDF(Image) File
OK. Will try it using a PNG file. Thanks