Hi Team,
I am looking for help removing all lines from a scanned PDF. I have tax bills in which the values sit inside rectangular boxes, so I want to remove exactly the rectangle borders and keep only the text.
I need the code in C#. I see many related posts for PHP, but I need some assistance with C#.
I am using Magick.NET-Q16-x64.dll to process the images before passing them to Tesseract to extract the text.
If you need images, I can provide those too.
Thanks,
Shobhit
Remove Lines from Scanned Images
-
- Posts: 4
- Joined: 2019-01-29T08:26:46-07:00
- Authentication code: 1152
- fmw42
- Posts: 25562
- Joined: 2007-07-02T17:14:51-07:00
- Authentication code: 1152
- Location: Sunnyvale, California, USA
Re: Remove Lines from Scanned Images
What have you tried? Can you provide an example image?
Re: Remove Lines from Scanned Images
Thanks for the reply.
Please find a sample image at the link below:
https://www.dropbox.com/s/4liae22jjwwzk ... l.pdf?dl=0
Below is a brief explanation of what exactly I am looking for:
Registration Evaluation Vin Bill Number
AP31BE3785 16800 ABCD123DRF3GH 23005
I need the data to look like this, but in the scan all of it sits inside a table with rectangular borders, so I need to remove those borders.
Thanks,
Shobhit
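
To make the goal concrete: table borders are long, unbroken straight runs of dark pixels, while text strokes are short, so one common way to drop the borders is to erase every horizontal or vertical run longer than some threshold. The following is only a minimal sketch of that idea in plain Python, not C# and not Magick.NET code. It assumes the page has already been binarized into a grid of 0 (background) and 1 (ink); the names `long_run_indices`, `remove_lines`, and `min_len` are hypothetical and chosen for illustration.

```python
def long_run_indices(values, min_len):
    """Return indices of `values` that lie inside a run of 1s of length >= min_len."""
    hits = []
    start = None
    # A trailing 0 acts as a sentinel so a run touching the end is closed too.
    for i, v in enumerate(list(values) + [0]):
        if v == 1 and start is None:
            start = i
        elif v != 1 and start is not None:
            if i - start >= min_len:
                hits.extend(range(start, i))
            start = None
    return hits


def remove_lines(grid, min_len):
    """Return a copy of a 0/1 grid with long horizontal and vertical runs cleared.

    grid:    list of equal-length rows, 1 = ink, 0 = background (assumed input).
    min_len: runs at least this long are treated as table borders.
    """
    height = len(grid)
    width = len(grid[0]) if height else 0
    line_pixels = set()

    # Both passes scan the ORIGINAL grid, so corners where a horizontal and a
    # vertical border cross are still detected in each direction.
    for y in range(height):
        for x in long_run_indices(grid[y], min_len):
            line_pixels.add((y, x))
    for x in range(width):
        column = [grid[y][x] for y in range(height)]
        for y in long_run_indices(column, min_len):
            line_pixels.add((y, x))

    return [
        [0 if (y, x) in line_pixels else grid[y][x] for x in range(width)]
        for y in range(height)
    ]
```

Calling `remove_lines(grid, min_len)` with `min_len` set well above the longest stroke of any character should clear the box outlines and leave the glyph pixels alone. On a real scan the threshold would need tuning, and slightly skewed or broken lines would need extra handling that this sketch does not attempt.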