by Bernard Chester, e-Doc Magazine
01-01-2006
Data migration is the transfer of data between storage types, data formats, or computer systems. It is necessary when organizations change computer systems, upgrade to new systems, or move data from one storage device to another. Data migration is usually undertaken for business reasons: to implement new functionality, to consolidate existing and acquired applications after a merger or acquisition, or to outsource to a service provider. The process can be carried out manually but is usually supported by a set of customized programs or scripts. Data migration is successful if the data in the new format, system, or location is an accurate and usable representation of the original. Examples of data migration projects are:
• Moving content from a no longer supported imaging system to a new one.
• Converting records from an old 14-inch optical platter jukebox to a new DVD-based jukebox.
• Converting WordStar™ files to XML.
The data migration process has three major stages: analysis, testing, and production. Within the analysis stage, there are three major steps.
1. Develop a clear understanding of the content and format of the existing documents and records and their metadata. An inventory of the existing electronic content is an essential first step. The goal is to obtain a list of the existing file formats and the volumes of each. A file viewer or conversion tool can help here. The older file formats may be undocumented or little understood. If so, you will need to do some research and find a tool that can read them and produce a usable format.
Next, examine the metadata associated with the content. How much of this will need to be carried into the new system? Of the metadata being kept, how accurate and consistent is it? Are entries in a field structured the same way, with the same abbreviations used throughout? It may pay to schedule time to clean up the metadata before loading it into the next system, including developing some tools to automate the changes and reduce the overall work. The metadata may support relationships between documents, such as annotations and redactions, or complex documents like a web site. These will need to be understood, and the relationships maintained.
Finally, determine how you are going to extract the information from the old system. Is there an export tool, or will you have to develop your own?
2. Decide on the target formats, and how you will get there. First make sure you understand the new application and its expected usage. Then decide on the target format for the information. What formats will provide the functionality you'll need? Is there a transformation tool or algorithm from each source format to its target? What format will the metadata need to be in to be loaded into the new application? Are there going to be length, coding, or other issues in fitting the old information into the new system?
3. Develop a detailed conversion plan, covering the tools, process, and performance issues involved in the transformation. The new application may predetermine the import approach, or you may need to choose between several approaches. It will be important to pick an approach that permits errors to be detected and prevented, even if it is slower than another technique. What information format and control files are required by the tool that will load the new system? If the format is unknown, convert the metadata into an easily read layout, such as a comma-delimited file or a relational database.
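As an illustration of the metadata cleanup and the comma-delimited layout mentioned in the steps above, here is a minimal Python sketch. The field names, the abbreviation map, and the function names are hypothetical, chosen only for the example; a real map would come from reviewing the old system's metadata.

```python
import csv
import io

# Hypothetical abbreviation map (an assumption for this sketch); build the
# real one from a review of the values actually found in the old metadata.
ABBREVIATIONS = {
    "dept": "Department",
    "dept.": "Department",
    "corr": "Correspondence",
}


def normalize_value(value, abbreviations=ABBREVIATIONS):
    """Collapse stray whitespace and expand known abbreviations."""
    cleaned = " ".join(value.split())
    return abbreviations.get(cleaned.lower(), cleaned)


def write_metadata_csv(records, fieldnames, out):
    """Write cleaned metadata records to 'out' as a comma-delimited file.

    'records' is an iterable of dicts exported from the old system.
    Only the fields listed in 'fieldnames' are carried forward.
    """
    writer = csv.DictWriter(out, fieldnames=fieldnames)
    writer.writeheader()
    for record in records:
        writer.writerow(
            {name: normalize_value(str(record.get(name, ""))) for name in fieldnames}
        )


if __name__ == "__main__":
    sample = [{"doc_id": 17, "doc_type": " corr ", "owner": "Dept."}]
    buffer = io.StringIO()
    write_metadata_csv(sample, ["doc_id", "doc_type", "owner"], buffer)
    print(buffer.getvalue())
```

Keeping the cleanup in one small function makes it easy to rerun on every export and to review the rules with the people who know the records.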
Set up a test bed for checking on your process and uncovering flaws in the tools or invalid assumptions about the data. A test run of a sample is essential to ensure that the process will work in production. This will also allow you to estimate the time required for the full conversion. Since conversion is likely to require a number of days or weeks, you will need to schedule how much you will convert with each run, set the sequence for conversion (usually tied to the deployment plan and information usage), and identify logical checkpoints and quality checks. Check the results thoroughly before you start formally converting. The fact that data loaded into the target is not enough!
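The time estimate from a test run is simple arithmetic, sketched below in Python. The function and parameter names are hypothetical, and the sketch assumes the full conversion runs at about the same rate as the sample, which may not hold if the sample is unrepresentative.

```python
def estimate_schedule(sample_docs, sample_seconds, total_docs, hours_per_run):
    """Project a full conversion from a timed sample run.

    Returns (total_hours, docs_per_run, runs). Assumes the production
    rate matches the sample rate, which is only an approximation.
    """
    if sample_docs <= 0 or sample_seconds <= 0:
        raise ValueError("sample run must convert at least one document")

    # Total time scales linearly with the number of documents.
    total_hours = total_docs * sample_seconds / sample_docs / 3600

    # How many documents fit in one conversion window.
    docs_per_run = int(sample_docs * hours_per_run * 3600 // sample_seconds)
    if docs_per_run == 0:
        raise ValueError("conversion window is too short for one document")

    # Ceiling division: a partial final batch still needs its own run.
    runs = -(-total_docs // docs_per_run)
    return total_hours, docs_per_run, runs


if __name__ == "__main__":
    hours, per_run, runs = estimate_schedule(
        sample_docs=1000, sample_seconds=600, total_docs=2_000_000, hours_per_run=8
    )
    print(f"{hours:.1f} hours in total, {per_run} documents per run, {runs} runs")
```

The number of runs gives a natural set of checkpoints: each run boundary is a place to stop, check quality, and decide whether to continue.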
After all that preparation, you are finally ready to migrate your content according to your plan. It should go smoothly, but do not become complacent! Check each run for errors, not only by examining the tool logs, but also by checking document and page counts and by random sampling. Keep all your research and logs of every run and all problem corrections, as these will be needed if the new content is ever questioned.
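The count checks and random sampling for each run could look something like the following Python sketch. It assumes you can obtain, from both the old and the new system, a mapping of document identifier to page count; how those mappings are produced depends on your tools, and the function name is hypothetical.

```python
import random


def check_run(source_pages, target_pages, sample_size=25, seed=None):
    """Compare one migration run against its source.

    'source_pages' and 'target_pages' map document id -> page count
    (an assumed input format). Returns a report dict for the run log.
    """
    source_ids = set(source_pages)
    target_ids = set(target_pages)
    migrated = sorted(source_ids & target_ids)

    # Documents present on both sides whose page counts disagree.
    mismatched = [d for d in migrated if source_pages[d] != target_pages[d]]

    # Random documents for a person to open and compare by eye.
    rng = random.Random(seed)
    to_review = rng.sample(migrated, min(sample_size, len(migrated)))

    return {
        "source_docs": len(source_ids),
        "target_docs": len(target_ids),
        "source_page_total": sum(source_pages.values()),
        "target_page_total": sum(target_pages.values()),
        "missing_in_target": sorted(source_ids - target_ids),
        "unexpected_in_target": sorted(target_ids - source_ids),
        "page_count_mismatch": mismatched,
        "sample_for_manual_review": to_review,
    }


if __name__ == "__main__":
    report = check_run({"A": 3, "B": 10, "C": 1}, {"A": 3, "B": 9}, sample_size=1)
    for key, value in report.items():
        print(f"{key}: {value}")
```

Saving each report alongside the tool logs gives you a record of what was checked in every run.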
If you take the time to be careful and thorough, you won't lose anything.
Bernard Chester is a consultant on ECM who focuses on implementation and integration issues. He may be reached at bchester@imergeconsult.com. He is a principal with IMERGE Consulting, an independent ECM consulting firm.