Document Management

Document Management Systems Challenge

The Exponential Growth of Information

As far back as 1990, Coopers and Lybrand estimated that fresh, previously unavailable, knowledge was being generated at the rate of 800 words per second. At that time, the nation’s printed records were stored on three trillion pieces of paper, and represented approximately 70 percent of all recorded human knowledge. In those days, the industry was called Document Management, and was largely taken up with managing the digitization, storage and retrieval of these paper archives. Since that time, the numbers of records and pages have increased exponentially, riding a surging wave of computing power, storage capacity and communication bandwidth. While paper archives are still enormous, and are still growing, the storage and transfer of information via digital means is growing at an even greater pace. Even the term ‘document’ has been called into question by purists who complain that the concept is dead in the face of electronic and Internet commerce.

Today, the pursuit of knowledge is an obsolete, almost silly objective. We no longer have to search for knowledge. Rather, we are pummeled by, and buried in, a deluge of largely unfiltered and unorganized information. Huge repositories of information sit waiting to be tapped, analyzed, and filtered. Printed, visual, audio, and digital records are available in such quantities and from so many sources that it is almost impossible to know where to start assimilating the available knowledge to make even minor business decisions.

Every Challenge is an Opportunity

Clearly, a major challenge facing the information systems analysts of the future will be to develop faster and more efficient means of filtering the flow of information, and more robust, scalable methods of storing massive amounts of data and large numbers of files and objects. In the new millennium, companies are successful not because they can provide fresh knowledge to the world, but because they are able to capture and distribute information more effectively than their competition.

Choosing an Experienced Partner

You wouldn’t think of heading out into uncharted wilderness without choosing a seasoned guide to accompany you. You would select someone who could steer you away from trouble and provide a sure path to your goal. By the same token, you shouldn’t start storing digital documents without someone to guide you through the challenges you will encounter in the process.

Choose a partner with a strong background in converting a massive document backfile into an indexed volume of digital documents. Maintaining document storage repositories involves dealing with hundreds of gigabytes of RAID storage. The document repository should be designed and configured to have minimal impact on your organization when you need to move a million pages of information from magnetic hard disk to optical media. Consider the impact of storing and migrating millions of document files spread across storage servers and accessible by hundreds of users. Therefore, when choosing your partner, evaluate their know-how in the special hardware and software components that go into a successful solution, and in the products that make your document archive strategy a reality.

The World Needs Enterprise-wide Solutions

Today, with the availability of Email, workflow objects and a host of other interdepartmental documents, a departmental solution is just part of the big picture. Employees in every department, facility and level need to avail themselves of the total ‘corporate memory’. It’s not enough to search the department’s document base. To make sound and effective decisions, today’s worker must have access to every piece of information in existence that relates to the decision at hand.

Network Independence

Today’s high growth industries are enjoying double- and triple-digit growth spurts more because of mergers and acquisitions than simple increases in sales. While this keeps a grin on the faces of Wall Street analysts, it furrows the brows of IT administrators who are suddenly faced with the challenges of managing disparate network and host operating platforms. Hundreds of millions of dollars which could otherwise be spent on marketing or research are expended in resolving incompatible network topologies and operating platforms.

Flexibility with No Loss Of Performance

The World Wide Web and the plethora of Internet search engines have raised the expectation bar of the I-enabled public. When we search, we don’t expect to be limited to a narrow scope of sites. And we don’t expect to have to wait forever for a hit list.

Your repository should allow your documents to be distributed across an unlimited number of volumes on an unlimited number of servers on an unlimited number of networks. The system administrators should maintain control over the contents and location of each virtual volume, providing a distributed implementation that takes into consideration the usage and migration requirements of each group of documents. Documents can be stored on servers that are local to their primary users, and still be available to a more global set of virtual users.
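
One way to picture the virtual volume idea is a small catalog that maps each logical volume name to its current physical location. The sketch below is only an illustration under stated assumptions: the class VolumeCatalog, its methods, and the volume and path names used with it are hypothetical and do not describe any particular product.

```python
from dataclasses import dataclass, field
from pathlib import PureWindowsPath
from typing import Dict


@dataclass
class VolumeCatalog:
    """Hypothetical catalog mapping logical volume names to physical roots."""

    roots: Dict[str, PureWindowsPath] = field(default_factory=dict)

    def register(self, volume: str, root: str) -> None:
        # The administrator assigns a logical volume to a drive or share.
        self.roots[volume] = PureWindowsPath(root)

    def relocate(self, volume: str, new_root: str) -> None:
        # Moving a volume changes only the catalog entry. Document
        # references of the form (volume, relative path) stay valid.
        if volume not in self.roots:
            raise KeyError(f"unknown volume: {volume}")
        self.roots[volume] = PureWindowsPath(new_root)

    def resolve(self, volume: str, relative_path: str) -> PureWindowsPath:
        # Turn a logical reference into a physical path at retrieval time.
        return self.roots[volume] / relative_path
```

Because users and applications hold only the logical reference, an administrator could place a volume on a server close to its primary users, or move it later, without touching the references themselves.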

Open Architecture Keeps Your Documents Alive

Documents come in various flavors. There are images, word processing files, spreadsheets, databases, text files, Email, binary files, multimedia files, and a host of other formats and types. Each general type has its own list of syntaxes and specifications, sometimes numbering in the hundreds. It is important when choosing a document repository that you can be assured of using a document in its native format when you need to access it. Otherwise, your repository has become a static archive. The documents are dead: by definition, they are no longer in your active workflow.

Beware of document repository products that impose proprietary or specific storage formats onto the user.

Choose the one that can store and retrieve documents in their original format. A preferred repository would be a dynamic collection of files and folders whose only restriction is that they can be stored in a Windows file system. Period. Beyond that, the repository should allow files to be stored on RAID, discrete hard drives, CD, DVD, optical, WORM, and Storage Area Networks. Document folders can be created on any network server, or on any Linux, AIX or UNIX host platform, that provides Windows-compatible file systems. Documents are stored in their original format, maximizing availability to the user base and minimizing the need for retrieval-time conversions or proprietary viewing applications.

Migratory Habits of Electronic Documents

Some traditional document management systems provide quite robust storage and retrieval functionality. Some can even store and index a few million documents. But when it’s time to start freeing space in active storage, it’s time to start thinking about major system services costs, downtime, full system conversions and even application development. In some cases, documents are just destroyed to free up space because it is such a major production to maintain links to documents that have moved. Once again, the elegant design of the repository has to address this issue, and reduce the challenge to almost insignificant proportions.

The result should be extremely fast and efficient retrieval and deployment of objects, regardless of their type and storage media. “Media independence” allows documents and folders to be freely moved among on-line, near-line and off-line media depending on the usage requirements of each object. Recently used documents stay in an on-line ‘cache’, while less frequently used objects are migrated to slower, less accessible media (near-line, near-off-line, off-line, etc.). Documents can be migrated from one storage medium to another without the need for intricate database updates, and without impacting the on-line user. If a physical volume cannot hold all of the objects required of it, the volume is simply migrated to a virtual media pool, without impacting the underlying database or the users. This migration can be done as an automatic process or as part of an administrative workflow.
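
The migration described above can be sketched in a few lines. This is a minimal illustration, not a description of any product: it assumes each storage tier is simply a directory tree, that a document is identified by its path relative to the tier root, and that the file system records last-access times reliably (many systems disable this). The function names migrate_stale and locate are hypothetical.

```python
import shutil
import time
from pathlib import Path
from typing import List, Optional


def migrate_stale(online: Path, nearline: Path, max_idle_seconds: float,
                  now: Optional[float] = None) -> List[Path]:
    """Move files that have not been accessed recently to the near-line tier.

    Returns the relative paths of the files that were moved.
    """
    now = time.time() if now is None else now
    moved = []
    # Materialize the listing first so moves do not disturb the walk.
    for src in sorted(p for p in online.rglob("*") if p.is_file()):
        if now - src.stat().st_atime < max_idle_seconds:
            continue  # recently used: keep it in the on-line cache
        rel = src.relative_to(online)
        dest = nearline / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        moved.append(rel)
    return moved


def locate(relative_path: str, tiers: List[Path]) -> Path:
    """Find a document by trying each tier in order, fastest first."""
    for root in tiers:
        candidate = root / relative_path
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(relative_path)
```

In this sketch the index never stores a physical location. Retrieval checks the tiers in order, so a move between tiers requires no database update, which is the property the paragraph above asks for. A real system would also have to handle off-line media that must be mounted before they can be read.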

Simple AND Scalable

The repository should offer simple scalability by allowing new disk devices, jukeboxes, servers, and other sites to be added without bringing the object management system down. Each object set is stored in a single volume, on a specific site. An unlimited number of volumes can be added. An unlimited number of sites can be configured.

Recoverability is Critical

Disaster strikes without warning. Your network crashes, your storage hardware crashes; a million things can happen. It is usually just a matter of time before your installation is affected by one or more system or physical disasters.

Table 1. Average Hourly Cost Of Downtime

Type of Business/Technology                  Cost of Downtime
Brokerage House / Large E-commerce Site      $6.4 million
Credit Card Sales and Authorization          $2.6 million
Catalog Sales                                $90 thousand
Package Shipping / Transportation Industry   $28 thousand
UNIX Networks                                $75 thousand
PC LANs                                      $18 thousand

It’s one thing to have incredible scalability, adaptability and migratory capabilities, but without the ability to recover from a disaster, the value of a document repository is fragile at best. Disaster recovery support should be provided at a variety of levels:

First, every transaction that impacts a volume should be recorded in an audit file. These files should be readily accessible, say as formatted ASCII text, so that they can be used to completely recover the indexing portion of the repository.
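
As a rough sketch of this first level, the code below appends each transaction to a plain ASCII audit file and rebuilds an index by replaying that file in order. The record layout (operation, document identifier and path separated by tabs) and the function names are assumptions made for illustration; a real audit format would carry more fields, such as timestamps and user names.

```python
from pathlib import Path
from typing import Dict


def append_audit(log: Path, op: str, doc_id: str, path: str = "") -> None:
    """Append one transaction as a tab-separated ASCII line."""
    with log.open("a", encoding="ascii", newline="\n") as fh:
        fh.write(f"{op}\t{doc_id}\t{path}\n")


def rebuild_index(log: Path) -> Dict[str, str]:
    """Replay the audit file in order to reconstruct doc_id -> path."""
    index: Dict[str, str] = {}
    with log.open("r", encoding="ascii") as fh:
        for line_no, line in enumerate(fh, start=1):
            line = line.rstrip("\n")
            if not line:
                continue  # tolerate blank lines
            parts = line.split("\t")
            if len(parts) != 3:
                raise ValueError(f"line {line_no}: malformed record")
            op, doc_id, path = parts
            if op == "ADD":
                index[doc_id] = path
            elif op == "DELETE":
                index.pop(doc_id, None)
            else:
                raise ValueError(f"line {line_no}: unknown operation {op!r}")
    return index
```

Because the file is append-only text, an administrator can read it in any editor, and replaying it from the first line reproduces the state of the index at the time of the last recorded transaction. This sketch assumes identifiers and paths contain no tab characters and only ASCII text.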

Next, the repository should be configured to store indexing information inside the object collection itself. This allows recovery even if the entire set of audit files has been destroyed. Repository utilities can be used to process the entire object collection, rebuilding audit files with as much information as is available in the object base.
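
One possible way to keep indexing information inside the object collection is a small ‘sidecar’ file written next to each document. The sketch below assumes that approach purely for illustration: the .idx.json suffix, the JSON layout and the function names are hypothetical, and the document itself is written unchanged in its native format.

```python
import json
from pathlib import Path
from typing import Dict

SIDECAR_SUFFIX = ".idx.json"  # hypothetical naming convention


def store_with_index(volume: Path, relative_path: str,
                     content: bytes, fields: dict) -> None:
    """Write a document in its native format plus a sidecar of index fields."""
    doc = volume / relative_path
    doc.parent.mkdir(parents=True, exist_ok=True)
    doc.write_bytes(content)
    sidecar = doc.with_name(doc.name + SIDECAR_SUFFIX)
    sidecar.write_text(json.dumps(fields, sort_keys=True), encoding="utf-8")


def scan_volume(volume: Path) -> Dict[str, dict]:
    """Rebuild index entries by reading every sidecar found in the volume."""
    index: Dict[str, dict] = {}
    for sidecar in sorted(volume.rglob("*" + SIDECAR_SUFFIX)):
        doc = sidecar.with_name(sidecar.name[: -len(SIDECAR_SUFFIX)])
        if not doc.is_file():
            continue  # sidecar whose document is missing: skip it
        rel = doc.relative_to(volume).as_posix()
        index[rel] = json.loads(sidecar.read_text(encoding="utf-8"))
    return index
```

A recovery utility built along these lines could walk an orphaned volume and regenerate index or audit records from whatever sidecars survive, with no dependence on the host database.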

In the case of a host database crash, utilities can be used to rebuild the configuration files and to fully analyze orphan volumes, extracting data from the volumes themselves.

The repository should not be a proprietary ‘black box’. With proper permissions, a trained administrator can rebuild most components with a file manager and the system Notepad. All configuration information should be available to the administrator as simple unencrypted ASCII data sets.

Security Issues

All document repository objects are files and folders that are fully compatible with the Windows file system in which they are written. As such, these objects are subject to as much perusal and modification as you wish to give the user. The objects can be protected from prying eyes or destruction in exactly the same way that you protect the rest of your sensitive files: through user permissions and through document-level, folder-level, network and/or host database security. User permission strategies can be as robust as the system administrator allows.

Enterprise Integration

So, now you’ve got this totally flexible, fully distributed, completely scalable document repository deployed across your enterprise. Document capture centers are channeling images, faxes, Email, word-processing files, and who knows what else into this powerful repository. The next logical step is to make this storehouse available to your line-of-business and decision support software. The repository index should be easily integrated into client/server, host and web applications. A key for most larger businesses, institutions and enterprises is to purchase an integrated document management system that leverages your existing IT assets!
