Using a Scanner to Reduce Paper Clutter

March 17, 2010

Ok, this might not be the most exciting subject, but I normally find it easiest to blog about things that are going on in my life. Two weeks ago I completed my 2009 tax return online using H&R Block. To be perfectly honest I had very mixed feelings about doing my taxes completely online, but that is the subject of another blog post. Anyway, ever since I got out into “the real world” I have found myself with an ever-increasing amount of paperwork and documents that I either “need for tax purposes” or “need to keep for my records”. I absolutely despise clutter because I feel as though it slows me down. I dislike trying to find some cryptic piece of paper in a pile of junk. This large increase in paperwork led me to purchase a scanner to store my documents electronically.

The idea is quite simple (and hardly original). You get a document, you scan it in, and you save it as a PDF. Then, once it’s on your computer, you can set up whatever directory structure makes sense (I store by year and organization). Further, if you use consistent file naming, you can search your files by name for a document that you need. If you have been reading this blog for any length of time it should not be a surprise to learn that I am a Linux user. Presently I run Kubuntu 9.10, so it was important to me that my scanner work in Linux and that I be able to make said PDF files in Linux. The other consideration is that I wanted a cheap scanner, since I am scanning mostly black and white documents. I do most of my shopping on Newegg, and using the wonderful comments I was able to locate a Linux-compatible scanner (more on the setup of that in a future post, but it was not trivial).

In any case, once I had the scanner working, I could scan my documents into GIMP. Once they were in GIMP I saved them as PNG files and used everyone’s favorite “convert” (part of ImageMagick) to make them PDF files.

convert sample-png.png sample-pdf.pdf

If a document was more than one page, I used pdftk to merge the pages into one giant document.

pdftk *.pdf cat output final-document.pdf

The trick to getting the *.pdf to concatenate in the correct page order is to prefix the file names with zero-padded numbers, since the shell expands the wildcard in sorted order: 00-ImportantTaxThing.pdf, 01-ImportantTaxThing.pdf, 02-ImportantTaxThing.pdf, etc.
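
If renaming by hand gets tedious, a small shell function can apply the prefixes. This is only a sketch: number_pages is a hypothetical helper name, it assumes every file has an extension, and two digits only covers up to 100 pages.

```shell
# Hypothetical helper: rename the given files, in the order listed, to
# 00-<name>.<ext>, 01-<name>.<ext>, ... so a wildcard expands in page order.
# Assumes each file name has an extension and there are at most 100 pages.
number_pages() {
    local name i file ext
    name=$1
    shift
    i=0
    for file in "$@"; do
        ext=${file##*.}    # text after the last dot, e.g. "pdf"
        mv -- "$file" "$(dirname -- "$file")/$(printf '%02d' "$i")-$name.$ext" || return 1
        i=$((i + 1))
    done
}
```

You list the files in page order, for example: number_pages ImportantTaxThing cover.pdf middle.pdf last.pdf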

The method is a bit roundabout, but I’m sure a bit of Bash magic could speed things up a little. Perhaps if I’m feeling motivated I’ll write a little Python graphical front end to all this.
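
For what it’s worth, the Bash magic might look something like the following. Treat it as a rough sketch rather than a polished tool: scan_to_pdf is a made-up function name, and it assumes ImageMagick’s convert and pdftk are installed and that the PNG pages already carry the numbered prefixes described above.

```shell
# Sketch: convert every PNG page in a directory to PDF, then merge the
# pages into one document. Assumes `convert` (ImageMagick) and `pdftk`
# are installed, and that the PNG names already sort in page order.
scan_to_pdf() {
    local dir out png pdf
    dir=$1
    out=$2
    set --                            # reuse "$@" as the list of page PDFs
    for png in "$dir"/*.png; do
        [ -e "$png" ] || continue     # the glob matched nothing
        pdf="${png%.png}.pdf"
        convert "$png" "$pdf" || return 1
        set -- "$@" "$pdf"
    done
    if [ "$#" -eq 0 ]; then
        echo "no PNG files found in $dir" >&2
        return 1
    fi
    pdftk "$@" cat output "$out"
}
```

Running scan_to_pdf . final-document.pdf in the directory of scanned pages would do both steps at once. Because it only hands pdftk the pages it just converted, a final-document.pdf left over from an earlier run does not get swept back in, which is something a bare *.pdf would do.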

In any case, I have found this to be quite nice and it’s really easy to find these documents on the rare occasions I need them. Now, I should mention there is one flaw in my plan which my dad brought up. I don’t know what the laws are for needing physical copies of documents if you were, for example, audited by the IRS. So, for any extra special documents I just store them away in a folder that is unsorted on the off chance I might need them. Still, this allows me to keep a very clean desk, and still have access to all my documents.