1. Questions to answer:
a. Have you worked with bboxes, Tesseract, OpenCV or a similar solution, or with NLP?
b. Have you built rule-based text extraction?
If yes to either, please tell me what you have done for the question you answered yes to.
2. Work to do
We have a template-based OCR solution that is working almost...
In the previous version it could extract text from fixed bboxes. A developer who worked for us made some changes, and now it is not extracting properly.
It runs on Tesseract 5. The previous version worked by cropping the image to the bbox and then extracting the text. The downside of that approach is that it cannot extract strings that extend beyond the bbox. The developer solved this at the time, but the solution no longer extracts these strings properly.
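To make the "string exceeds the bbox" problem concrete, here is a minimal sketch of one possible approach. It is not our current code. It assumes OCR is run on the whole page so that word-level boxes are available (for example from pytesseract's `image_to_data`, which reports `text`, `left`, `top`, `width` and `height` per word), that each field is a single line, and that a pixel gap threshold (`max_gap`, a hypothetical parameter) decides where the string ends. `Word` and `extract_field` are illustrative names only.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One OCR word with its pixel box (illustrative structure)."""
    text: str
    left: int
    top: int
    width: int
    height: int

    @property
    def right(self) -> int:
        return self.left + self.width


def extract_field(words, bbox, max_gap=15):
    """Return the text of a single-line field anchored at bbox.

    bbox is (left, top, right, bottom). Words overlapping the bbox are
    taken, and words to the right of it are appended as long as the gap
    to the previously taken word is at most max_gap pixels.
    """
    b_left, b_top, b_right, b_bottom = bbox

    def in_rows(w):
        # Keep words whose vertical centre lies inside the bbox rows.
        centre_y = w.top + w.height / 2
        return b_top <= centre_y <= b_bottom

    row_words = sorted(
        (w for w in words if w.text.strip() and in_rows(w)),
        key=lambda w: w.left,
    )

    picked = []
    for w in row_words:
        if w.right <= b_left:
            continue  # entirely to the left of the bbox
        if w.left < b_right:
            picked.append(w)  # overlaps the bbox horizontally
        elif picked and w.left - picked[-1].right <= max_gap:
            picked.append(w)  # continuation beyond the bbox edge
        else:
            break  # gap too large: the field has ended
    return " ".join(w.text for w in picked)


page_words = [
    Word("Invoice", 10, 5, 60, 20),
    Word("Acme", 100, 5, 40, 20),
    Word("Trading", 150, 5, 60, 20),
    Word("Ltd", 218, 5, 25, 20),
    Word("2024-01-05", 400, 5, 80, 20),
    Word("Footer", 100, 50, 50, 20),
]
field_text = extract_field(page_words, bbox=(95, 0, 160, 30))
print(field_text)
```

Here the bbox ends at x=160, in the middle of "Trading", yet "Ltd" is still picked up because it follows closely, while the distant date on the same row is left out. The right value for `max_gap` would depend on scan resolution and font size.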
Steps:
Phase 1:
1. Get the extraction of strings that exceed the bbox to work 100%.
2. Configure an existing script of ours to load JSON annotation files that contain multiple documents in the same annotation file.
Phase 2:
3. Create a rule-based text extraction so that text can be extracted automatically. We have an embryo of it, but it must be finalised.
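As an illustration of what step 3 means by rule-based extraction, here is a minimal sketch. It is not the existing embryo. The field names, the regular expressions in `RULES`, and the function `extract_fields` are all hypothetical and assume English-language invoices with labels such as "Invoice No" and "Total"; real rules would have to be built from our own documents.

```python
import re

# Hypothetical rules: one regular expression per field, with the value
# in capture group 1. These labels are assumptions for illustration.
RULES = {
    "invoice_number": re.compile(
        r"\binvoice\s*(?:no\.?|number|#)\s*[:\-]?\s*([A-Z0-9][A-Z0-9\-/]*)",
        re.IGNORECASE,
    ),
    "invoice_date": re.compile(
        r"\b(?:invoice\s+)?date\s*[:\-]?\s*"
        r"(\d{4}-\d{2}-\d{2}|\d{1,2}[./]\d{1,2}[./]\d{2,4})",
        re.IGNORECASE,
    ),
    "total": re.compile(
        # \b keeps "Subtotal" from matching the "total" rule.
        r"\btotal(?:\s+amount)?\s*[:\-]?\s*([0-9][0-9 .,]*[0-9])",
        re.IGNORECASE,
    ),
}


def extract_fields(text):
    """Apply every rule to the OCR text; unmatched fields become None."""
    result = {}
    for name, pattern in RULES.items():
        match = pattern.search(text)
        result[name] = match.group(1).strip() if match else None
    return result


sample_text = (
    "Invoice No: INV-2024/017\n"
    "Date: 2024-01-05\n"
    "Subtotal 1 000,00\n"
    "Total: 1 250,00\n"
)
fields = extract_fields(sample_text)
print(fields)
```

Keeping the rules in a plain dictionary means new fields can be added without touching the extraction loop. A finished solution would likely also need position-based rules that use the bboxes, which this sketch leaves out.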
3. Skills: 5+ years of Python coding. Use of Git. MySQL stored procedures/routines.
You MUST have done this before with bboxes and rule-based extraction from invoices.
4. Time: Phase 1 takes 1-2 days if you are experienced and extremely good.
Phase 2 takes 1-1.5 weeks if you have done it previously. Phase 1 and Phase 2 can be done by different developers; there is no need to have both skill sets.
Please, no fake nationalities, such as Chinese developers pretending to be from Russia. How can I trust you when you lie about where you are from?