In the HVTO Industry, the term "Data Transformation" describes all document processing steps that transform or adapt documents or their data for subsequent processes. It covers several types of processes:
- Converting documents into another format required by a specific target device (e.g. a special printer) or target system (e.g. archiving system).
- Restructuring document datastreams, such as splitting document spools into separate documents, sorting documents within a spool by specific criteria (e.g. zip codes), or filtering out pages from spools or documents.
- Classifying and indexing documents by analyzing the variables inside them, e.g. by NOPs or TLEs in AFP documents, by identifying text at particular positions on the page, or by reading barcodes.
- Changing the content or attributes of a page by adding OMR marks, barcodes, text, or formatted data, removing text, changing the page size, shifting the page content, or rotating pages.
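As an illustration of the restructuring step, the following minimal Python sketch splits a spool into documents, sorts them by zip code, and filters out blank pages. It is only a sketch under stated assumptions: the `Page` and `Document` types, the `doc_id` and `zip_code` index fields, and the helper functions are hypothetical and do not correspond to any particular output management product or datastream format.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import Callable, List


@dataclass
class Page:
    doc_id: str    # hypothetical index value naming the owning document
    zip_code: str  # hypothetical index value used as the sort criterion
    text: str


@dataclass
class Document:
    doc_id: str
    zip_code: str
    pages: List[Page]


def split_spool(spool: List[Page]) -> List[Document]:
    """Split a flat spool: consecutive pages sharing a doc_id form one document."""
    docs = []
    for doc_id, group in groupby(spool, key=lambda p: p.doc_id):
        pages = list(group)
        docs.append(Document(doc_id, pages[0].zip_code, pages))
    return docs


def sort_by_zip(docs: List[Document]) -> List[Document]:
    """Sort documents within the spool by zip code (stable sort)."""
    return sorted(docs, key=lambda d: d.zip_code)


def filter_pages(docs: List[Document], keep: Callable[[Page], bool]) -> List[Document]:
    """Return documents containing only the pages for which keep(page) is true."""
    return [
        Document(d.doc_id, d.zip_code, [p for p in d.pages if keep(p)])
        for d in docs
    ]


# Assumed sample spool: three documents, one of which contains a blank page.
spool = [
    Page("A", "90210", "Invoice 1, page 1"),
    Page("A", "90210", ""),
    Page("B", "10001", "Invoice 2, page 1"),
    Page("C", "60601", "Invoice 3, page 1"),
    Page("C", "60601", "Invoice 3, page 2"),
]

documents = filter_pages(
    sort_by_zip(split_spool(spool)),
    keep=lambda p: p.text.strip() != "",
)
result = [(d.doc_id, len(d.pages)) for d in documents]
print(result)
```

In practice the index values would come from the classification and indexing step described above (for example TLEs in AFP documents) rather than from fields set by hand.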
Data Transformation in the HVTO Industry must be distinguished from the same term as used in data management for databases and data warehousing. While those fields deal with the selection, adjustment, conversion, and collection of data from different sources, Data Transformation in the HVTO Industry processes formatted data in documents; here, "data" means document objects and their attributes. Data Transformation is therefore a basic function of almost all processes in output management.