Document management and work-flow

I guess we have all had to deal with computerised document storage in one way or another. Unlike most, I have worked at a software house that primarily developed document management solutions. I will try to relate the good things about document management, and the things one actually needs in order to turn one's network file system into a "proper" document management solution.

Document storage

Versioning

We need to be able to retrieve older versions of a document (both the whole document and individual component files).
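
As a rough illustration of what that requirement means, here is a minimal in-memory sketch. The class and method names (VersionedDocument, save, document_at, file_at) are made up for this example and do not come from any particular product; a real system would persist snapshots and probably store differences rather than full copies.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedDocument:
    """Hypothetical document that keeps every saved snapshot of its files."""
    name: str
    # One entry per version; each maps a component file name to its contents.
    snapshots: list[dict[str, bytes]] = field(default_factory=list)

    def save(self, files: dict[str, bytes]) -> int:
        """Store a new snapshot and return its 1-based version number."""
        self.snapshots.append(dict(files))
        return len(self.snapshots)

    def _snapshot(self, version: int) -> dict[str, bytes]:
        if not 1 <= version <= len(self.snapshots):
            raise KeyError(f"{self.name} has no version {version}")
        return self.snapshots[version - 1]

    def document_at(self, version: int) -> dict[str, bytes]:
        """Return the whole document as it was at the given version."""
        return dict(self._snapshot(version))

    def file_at(self, version: int, filename: str) -> bytes:
        """Return one component file as it was at the given version."""
        return self._snapshot(version)[filename]
```

Both retrieval paths are there: document_at gives back the whole document, file_at a single component file.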

Sensible locking

We need to be able to "check out" a document, locking it against updates by anyone else. With sufficiently simple underlying file formats, we can envision an off-line editing mode, where we fetch a snapshot and have to perform a merge on check-in. Unfortunately, since we're talking general systems here, we cannot presume the existence of sufficiently good tools to compare versions and perform that merge.

In conjunction with this, we might need an administrative "forced check-in" function.
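
The check-out, check-in and forced check-in operations could look something like the sketch below. It is only an assumption about how such a lock table might be shaped; LockTable, CheckoutError and the method names are hypothetical, and the sketch does not check that the caller of force_check_in really is an administrator.

```python
class CheckoutError(Exception):
    """Raised when a check-out or check-in is not allowed."""


class LockTable:
    """Hypothetical record of who has which document checked out."""

    def __init__(self):
        self._holder = {}  # document id -> user id of the lock holder

    def check_out(self, doc_id, user):
        """Lock the document for one user; anyone else is refused."""
        holder = self._holder.get(doc_id)
        if holder is not None and holder != user:
            raise CheckoutError(f"{doc_id} is checked out by {holder}")
        self._holder[doc_id] = user

    def check_in(self, doc_id, user):
        """Release the lock; only the holder may do this."""
        if self._holder.get(doc_id) != user:
            raise CheckoutError(f"{doc_id} is not checked out by {user}")
        del self._holder[doc_id]

    def force_check_in(self, doc_id):
        """Administrative release; returns the previous holder, if any."""
        return self._holder.pop(doc_id, None)
```

The forced variant exists for the case where the lock holder is unavailable and the document would otherwise stay closed to everyone else.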

Fine-grained access control

A document management system needs fine-grained access control. Ideally, we want access control fine-grained enough to reveal only sections of a document, but again we are talking general systems, so we cannot really realise this.

Hierarchical storage

We need to impose some sort of structure on our document storage (this might be inherent in the storage back-end or just something that emerges in the document storage API). Without a decent structure, finding anything becomes nigh-on impossible.

Documents, not files

This might sound counter-intuitive. Most people consider "document" and "file" to be synonyms. Surely an MS Word document is a single file and any single file is a document?

Not necessarily so. Look, for example, at LaTeX documents: they're often decomposed into multiple files, with one file for the over-all document structure and then one file per chapter (or similar), possibly even with tables separated out from the rest of the text (that is certainly the structure used for my larger pen&paper RPGs).

So, the first thing that becomes obvious is that we need to have a larger scope than "stores and tracks single files". We still need to be able to get at underlying files, for editing. We might need to be able to get at them for printing or other processing.

Document typing and templating

This ties in with "documents, not files". We want to type our documents, not the files they consist of. It's much more valuable having a visual indication that a given document (or file) is a "customer complaint" or "audit" or "design document" than "word processor file", "spreadsheet file" or "diagram tool file".

We might, conceivably, have a list of what each document type can be, in terms of underlying file formats (it might not be too clever to allow customer complaints to exist as JPEG image files, but we might want to allow TIFFs).

Once we have a concept of "document type", we can then have a better "create new document". Instead of saying "Give me a new Open Office text document", we can say "Give me a new Purchase Order" and a suitable template gets loaded in the right application.
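
To make the idea a bit more concrete, here is a small sketch of a document type that knows which file formats it accepts and which template to start from. Everything in it is an assumption made for illustration: the DocumentType class, the file suffixes chosen and the template paths are invented, not taken from a real system.

```python
from dataclasses import dataclass
from pathlib import PurePath


@dataclass(frozen=True)
class DocumentType:
    """Hypothetical business-level document type."""
    name: str
    allowed_suffixes: frozenset  # file formats this type may consist of
    template: str                # made-up path to the starting template

    def accepts(self, filename: str) -> bool:
        """True if the file's format is allowed for this document type."""
        return PurePath(filename).suffix.lower() in self.allowed_suffixes


# Illustrative registry; the formats and paths are assumptions.
TYPES = {
    "Customer Complaint": DocumentType(
        "Customer Complaint",
        frozenset({".odt", ".tif", ".tiff"}),
        "templates/complaint.odt",
    ),
    "Purchase Order": DocumentType(
        "Purchase Order",
        frozenset({".ods"}),
        "templates/purchase-order.ods",
    ),
}


def template_for(type_name: str) -> str:
    """'Give me a new Purchase Order' becomes a template lookup."""
    return TYPES[type_name].template
```

With this, a scanned TIFF is accepted as a customer complaint while a JPEG is refused, and the client only needs the type name to find the right template.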

Life-cycle management

Ideally, we want to model the document life-cycle within the document storage. We want a system that guides us through initial document creation, initial review, revisioning, more review, useful life, archiving and possibly destruction. This can, but shouldn't necessarily, be done using a work-flow system (ideally using a completely separate map, not showing up in the normal "what's to be done" view).
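
One way to picture this is as a small state machine over life-cycle stages. The sketch below is an assumption about how the stages named above might be encoded; the Stage names and the table of allowed moves are my own illustration, and a real system would let them be configured per document type.

```python
from enum import Enum


class Stage(Enum):
    DRAFT = "draft"
    REVIEW = "review"
    REVISION = "revision"
    IN_USE = "in use"
    ARCHIVED = "archived"
    DESTROYED = "destroyed"


# Which stage may follow which (an illustrative choice, not a standard).
ALLOWED = {
    Stage.DRAFT: {Stage.REVIEW},
    Stage.REVIEW: {Stage.REVISION, Stage.IN_USE},
    Stage.REVISION: {Stage.REVIEW},
    Stage.IN_USE: {Stage.ARCHIVED},
    Stage.ARCHIVED: {Stage.DESTROYED},
    Stage.DESTROYED: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the new stage, refusing moves the life-cycle does not allow."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot go from {current.value} to {target.value}")
    return target
```

The review and revision stages form a loop, so a document can go round as many times as it needs before entering its useful life.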

Work-flow

Why work-flow

Still, all the fancy document storage in the world is not going to make our lives sufficiently simpler just by itself. Documents in a normal business are seldom the reason for their own existence. They exist as part of a business process (let's call it a "work-flow") and it's only in the context of this paper-shuffling that the document carries any meaning. At least up until the document has finished its useful life and goes for archiving (or destruction).

What work-flow

This is where a work-flow system comes in. Simply speaking, a work-flow is a way of automating paper-shuffling (to a degree). It does this by having a number of states ("actions"), connected by arcs ("transitions") into an automaton ("map"). Each action is a single thing that needs to be done to or with a document. From each action, there's either a single "next action" transition or more (if there is an actual choice). Each document flowing through a map is represented by a work-item (the work item usually refers only to a single document and each document is usually represented by a single item).
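
The vocabulary above (actions, transitions, map, work-item) can be sketched as a couple of small classes. This is a simplified illustration under my own assumptions: the class names and the tiny purchasing example in the usage note are invented, and splits and joins are left out entirely.

```python
from dataclasses import dataclass, field


@dataclass
class WorkflowMap:
    """Hypothetical map: action name -> {transition label -> next action}."""
    transitions: dict = field(default_factory=dict)

    def add_transition(self, source: str, label: str, target: str) -> None:
        self.transitions.setdefault(source, {})[label] = target
        self.transitions.setdefault(target, {})  # make the target known too

    def choices(self, action: str) -> list:
        """Labels of the transitions leaving an action."""
        return sorted(self.transitions[action])


@dataclass
class WorkItem:
    """Represents one document flowing through a map."""
    document_id: str
    action: str

    def advance(self, wf_map: WorkflowMap, label: str = None) -> None:
        """Follow a transition; a label is only needed for a real choice."""
        options = wf_map.transitions[self.action]
        if label is None:
            if len(options) != 1:
                raise ValueError("this action needs an explicit choice")
            label = next(iter(options))
        if label not in options:
            raise ValueError(f"no transition {label!r} from {self.action}")
        self.action = options[label]
```

For instance, a made-up map with "write order" leading to "approve order", which in turn offers "approved" and "rejected", lets a work-item move forward without a label at the first action but requires one at the second.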

Ah, work-flow

Of course there might be a need to restrict visibility of work-flow maps (or specific actions). Ideally, the access control is shared with the document storage access control, so user-ids and passwords can be re-used.

Manage documents

So, with the combination of a work-flow management solution and a document storage solution, we have a document management solution, where all information flow is (hopefully) made explicit and thus slightly easier to comprehend, tweak and the like. With a bit of luck, it also makes it harder for any specific business process to fall between cracks, since there is machinery to make sure that everything is tracked.

Of course, there is a need for a suitable client that presents both document storage and work-flow items at the same time, so one can switch between them. Ideally, this client can be tied in to all document-processing tools and their "New", "Open", "Save" and "Save as..." dialogs.

More work-flow

Processes change

No matter how well you have modeled whatever business process you have modeled, it will change over time (as a reaction to market pressure, as a reaction to changing circumstances or for other reasons). This is where a work-flow server that is separate from application logic wins. The application only needs to be able to ask "what states are there that I should be interested in, what transitions do they have and what items are there in them?", so the underlying flow can change. Not necessarily without restarting the application, mind you, but it can certainly change on a week-to-week or medium-term basis, as business processes are refined.
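
The three questions the application asks can be written down as a narrow interface. The sketch below is an assumption about what such an interface might look like; WorkflowServer, InMemoryServer and describe are hypothetical names, and the in-memory class merely stands in for a real, separate server.

```python
from typing import Protocol


class WorkflowServer(Protocol):
    """The only things the application needs to be able to ask."""

    def states(self) -> list: ...
    def transitions(self, state: str) -> list: ...
    def items(self, state: str) -> list: ...


class InMemoryServer:
    """Stand-in server; the flow lives here, not in the application."""

    def __init__(self, flow: dict, items: dict):
        self._flow = flow    # state -> list of next states
        self._items = items  # state -> list of work-item ids

    def states(self) -> list:
        return sorted(self._flow)

    def transitions(self, state: str) -> list:
        return sorted(self._flow[state])

    def items(self, state: str) -> list:
        return list(self._items.get(state, []))


def describe(server: WorkflowServer) -> dict:
    """Application-side view, built purely from the three questions."""
    return {
        state: (server.transitions(state), server.items(state))
        for state in server.states()
    }
```

Because describe never hard-codes a state name, swapping in a server with a different flow changes what the application shows without changing the application code.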

An example

It's usually easier to show a work-flow process as a graph. Here is one I prepared earlier (it's a purchasing process), illustrating some interesting points. Firstly, it has splits and joins (the transitions with extra bows on them), to make some parts of the process inherently parallel. Secondly, it's actually a broken process that should be fixed. Thirdly, it's the example/test map I use for Creek, my next work-flow server project (the first one, Dribble, uses, I believe, a different map and is somewhat lacking in essentials like "login" and "user awareness").

I'll leave "find the obvious bug" as an exercise for the interested reader. The purchase order flows from left to right. I've only labeled the state transitions that are Actual Choices (they all have labels in the database version). A previous version of this document is available here.

Comments on this can be sent to ingvar -at- hexapodia -dot- net.

This is one of Ingvar's essays

By: Darnesha
2011-07-04 07:59

Very true! Makes a change to see someone spell it out like that. :)
