| July, 2002 Diablo Blue Page 9 |
| pages of scanning. A word with a registration mark (®, the circled R) next to it is an example of a suspect word. The program guessed correctly and I only had to confirm its choice. The interpret rating was 100% and, again, the document type was a 10.

Test Two

All right, the party’s over; time to see just what this program can do. For the second test I selected a sample of a legal document, reproduced many, many times, originally bound in a plastic binder with half-inch slots and later converted to three-hole-punched paper. Because of numerous copies over the years, the document was heavily speckled. I called it the document from hell. This is characteristic of many of the scans I had done in the past, and it is where the personal productivity factor went in the dumper. Lots of interpretation was needed, and in the older OCR programs the remnants of holes and slots were interpreted as characters. With older OCR programs the documents also had to be cleaned of speckles and marks manually. Further, the document had a number stamped on it in a different size and slant. With document quality 2 at best on my scale, I decided to name this part of the test ‘Extreme Scanning.’

Four pages were scanned and the overall scan accuracy was 93.93%. On the first page the program highlighted 17 words. Of those, it had guessed correctly 15 times and I had only to confirm the choice. I had to select from a provided list of choices twice, and the correct words were in the list. I’m convinced that lawyers invent words so nobody knows what they are talking about. The net of the OCR test was that the document was loaded with legalese and OmniPage Pro flew through the legal words just fine. The additional pages had slightly fewer selections to decide; apparently the IntelliScan feature was doing its job. The parts of the document that contained remnants of holes and slots were passed over by the program, so there was no need to clean them beforehand or eliminate wrong interpretations on the part of OmniPage.
It’s beginning to get too easy for this program (and me), so for the final scan of the legal document I decided to place it in the scanner at about a 10-degree angle. It scanned fine. The program automatically aligns the text. When I took the document off the scanner I realized I had also put it in upside down. The program had fixed that, too. I became a believer. At this point I was sold; the effect on my overall productivity in the OCR area was great. No more speckle and hole removal or spending countless hours interpreting information. But I figured there had to be one more test I could do.

Test Three

I had achieved my goal of watching the program pass the extreme scanning test, but what else would put it to the test after that? Looking over my bookshelf I saw an old copy of the Bible. The pages are 4½ inches wide and 7½ inches long; the font was too small for me to measure but is about 7 point or less. Lots of unusual words, numbered verses and italicized footnotes. I placed it on the scanner at a 10-degree angle and hit the scan button. The result was 100% accuracy, with 13 suspect words and no changes. On the scale of document quality I put it at about a 6. I’m amazed at this performance, and now we’re in bonus territory in our extreme scanning test.

The Bonus Round

Added to the program with this release is the ability to OCR Acrobat files. To do this test I found a ‘readme’ file in my Acrobat folder and scanned it in. It is a 21-page file, and all 11,000+ words were input at 100% accuracy with 64 suspect words. Again, the registration symbol and similar characters required confirmation that the interpretation was correct. No changes had to be made. |