Tuesday 3 October 2017

scanning - What features are important in a scanner + sheet feeder for old personal documents?

I would like to scan some old text documents. My purpose is twofold: disaster recovery (e.g. fire), and to save space on bulky documents I rarely refer to (e.g. old phone bills).


After scanning, I intend to destroy the bulky originals that I rarely refer to. The rest I will keep and continue referring to. I do not intend to OCR the documents.


I estimate there are a few thousand sides of A4 to scan, and I am aiming for only a few failures (missed or illegible sides) per 1000 sides scanned. By illegible I mean text that a human cannot read reliably.


I would like to do this myself rather than using a commercial service.


I believe the documents are fairly typical of what home users will have collected in their filing cabinets over the past, say, 10 or 20 years:



  • Mostly (perhaps 80%) standard paper size or close to it (A4; presumably this would be US Letter elsewhere)

  • Some bills that are longer than A4 (less than 10%)

  • A small number of "very miscellaneous" pages (less than 10%)

  • Mostly relatively flat good quality paper

  • The documents are printed on various papers since they include bills, receipts, letters, etc.

  • Many but not all documents are printed on both sides

  • A mixture of colour and black-and-white documents. Most of the documents do not use colour in an important way

  • A minority of pages with some graphics and pictures, etc. (perhaps 5 or 10%)

  • A minority of yellowed pages (less than 5%)


I would like to scan in colour because I do not want to verify that all of the colour information is unimportant. I will exclude large format documents (e.g. A3), but I would ideally like to scan bills that are longer than A4.


I don't mind scanning the "awkward cases" sheet by sheet, but I would like to save time by using a sheet feeder where possible. However, I anticipate that a high-end professional scanner isn't really called for. Also, as long as the documents remain human-legible, damage to the paper is not very important.


Aside from dpi, what features in a scanner and sheet feeder are important for a job like this? By "features" I mean specific technical features (or performance characteristics) of the design, rather than broad categories like "reliability".


I am not looking for product recommendations. I would like to know what features are relevant for this scale of application.
