This is the second of three articles describing the Serving XML pipeline language.
This article discusses pipelines where the input or output (or both) are sequences of records.
Table 1. Analogous XML pipeline and record stream elements.
| XML Pipeline Elements | | Record Stream Elements | |
|---|---|---|---|
| px:task | px:serialize | px:task | px:writeRecords |
| | px:transform | | px:processRecords |
| px:emitter | px:xmlEmitter | px:recordWriter | px:flatFileWriter |
| | px:customEmitter | | px:sqlWriter |
| px:filter | px:transform | px:recordFilter | px:processRecords |
| | px:choose | | |
| | px:saxFilter | | px:customRecordFilter |
| | px:style | | px:restrictRecordFilter |
| | px:taskRunnerFilter | | px:taskRunnerRecordFilter |
| | px:removeEmptyElementFilter | | |
| | msv:msvFilter | | msv:msvRecordFilter |
| px:content | px:document | px:recordReader | px:flatFileReader |
| | px:dynamicContent | | px:sqlReader |
| | px:recordContent | | px:directoryReader |
| | px:emptyDocument | | px:parameterReader |
| | | | px:xmlRecordReader |
The example below illustrates the idea of record readers and writers with a flat file reader that reads a stream of records from a positional flat file, and a flat file writer that writes the stream to a delimited flat file. Here, we pair a px:flatFileReader with a px:flatFileWriter, but we could just as easily pair a px:flatFileReader with a px:sqlWriter, or a px:sqlReader with a px:flatFileWriter.
Figure 1. Record readers and writers
<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML">
  <px:service name="new-books">
    <px:writeRecords>
      <px:flatFileWriter>
        <px:flatFile ref="newBooksFlatFile"/>
      </px:flatFileWriter>
      <px:flatFileReader>
        <px:flatFile ref="oldBooksFlatFile"/>
      </px:flatFileReader>
    </px:writeRecords>
  </px:service>
  <px:flatFile name="newBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="newBookType"/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="newBookType"/>
    </px:flatFileBody>
  </px:flatFile>
  <px:flatRecordType name="newBookType">
    <px:fieldDelimiter value="|"/>
    <px:delimitedField name="author" label="Author"/>
    <px:delimitedField name="category" label="Category"/>
    <px:delimitedField name="title" label="Title"/>
    <px:delimitedField name="price" label="Price"/>
  </px:flatRecordType>
  <px:flatFile name="oldBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="oldBookType"/>
      <px:annotationRecord/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="oldBookType"/>
    </px:flatFileBody>
    <px:flatFileTrailer>
      <px:annotationRecord></px:annotationRecord>
      <px:annotationRecord>This is a trailer record</px:annotationRecord>
    </px:flatFileTrailer>
  </px:flatFile>
  <px:flatRecordType name="oldBookType">
    <px:positionalField name="category" width="1"/>
    <px:positionalField name="author" width="30"/>
    <px:positionalField name="title" width="30"/>
    <px:positionalField name="price" width="10" justify="right"/>
  </px:flatRecordType>
</px:resources>
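Because the reader and the writer are configured independently, the same building blocks can be rearranged. The sketch below reverses the direction of Figure 1, reading a delimited file and writing a positional one. It uses only elements that appear in this article's figures. The service name books-to-positional and the names delimitedBooksFlatFile, delimitedBookType, positionalBooksFlatFile and positionalBookType are hypothetical, and the configuration has not been run against ServingXML, so treat it as an assumption about how the pieces compose rather than a verified setup.

```xml
<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML">
  <!-- Hypothetical service: delimited input, positional output -->
  <px:service name="books-to-positional">
    <px:writeRecords>
      <px:flatFileWriter>
        <px:flatFile ref="positionalBooksFlatFile"/>
      </px:flatFileWriter>
      <px:flatFileReader>
        <px:flatFile ref="delimitedBooksFlatFile"/>
      </px:flatFileReader>
    </px:writeRecords>
  </px:service>
  <!-- Input layout: same shape as newBooksFlatFile in Figure 1 -->
  <px:flatFile name="delimitedBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="delimitedBookType"/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="delimitedBookType"/>
    </px:flatFileBody>
  </px:flatFile>
  <px:flatRecordType name="delimitedBookType">
    <px:fieldDelimiter value="|"/>
    <px:delimitedField name="author" label="Author"/>
    <px:delimitedField name="category" label="Category"/>
    <px:delimitedField name="title" label="Title"/>
    <px:delimitedField name="price" label="Price"/>
  </px:flatRecordType>
  <!-- Output layout: positional fields, with labels assumed to supply the header text -->
  <px:flatFile name="positionalBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="positionalBookType"/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="positionalBookType"/>
    </px:flatFileBody>
  </px:flatFile>
  <px:flatRecordType name="positionalBookType">
    <px:positionalField name="category" label="Category" width="1"/>
    <px:positionalField name="author" label="Author" width="30"/>
    <px:positionalField name="title" label="Title" width="30"/>
    <px:positionalField name="price" label="Price" width="10" justify="right"/>
  </px:flatRecordType>
</px:resources>
```

The fields are matched by name rather than by position, which is why the two record types can list them in a different order.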
The next example shows how to use a px:recordContent element to adapt a record stream to XML content. Once we have XML content, we can apply all of the XML pipeline instructions described in Serving XML: Pipeline Language.
Figure 2. Adapting a record stream to XML content
<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML"
              xmlns:myns="http://www.mydomain.com/MyNamespace">
  <px:service name="books">
    <px:serialize>
      <px:transform>
        <px:content ref="books"/>
      </px:transform>
    </px:serialize>
  </px:service>
  <px:recordContent name="books">
    <px:flatFileReader>
      <px:flatFile ref="oldBooksFlatFile"/>
    </px:flatFileReader>
    <px:recordMapping ref="booksToXmlMapping"/>
  </px:recordContent>
  <px:recordMapping name="booksToXmlMapping">
    <myns:books>
      <px:onRecord>
        <myns:book>
          <px:fieldElementMap field="title" element="myns:title"/>
          <px:fieldAttributeMap field="category" attribute="categoryCode"/>
          <px:fieldElementMap field="author" element="myns:author"/>
          <px:fieldElementMap field="price" element="myns:price"/>
        </myns:book>
      </px:onRecord>
    </myns:books>
  </px:recordMapping>
  <px:flatFile name="oldBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="oldBookType"/>
      <px:annotationRecord/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="oldBookType"/>
    </px:flatFileBody>
    <px:flatFileTrailer>
      <px:annotationRecord></px:annotationRecord>
      <px:annotationRecord>This is a trailer record</px:annotationRecord>
    </px:flatFileTrailer>
  </px:flatFile>
  <px:flatRecordType name="oldBookType">
    <px:positionalField name="category" width="1"/>
    <px:positionalField name="author" width="30"/>
    <px:positionalField name="title" width="30"/>
    <px:positionalField name="price" width="10" justify="right"/>
  </px:flatRecordType>
</px:resources>
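To make the mapping concrete, the fragment below sketches the shape of the XML that booksToXmlMapping would be expected to produce for a file containing a single record. It is inferred from the mapping alone, not captured from a run. The bracketed text stands in for field values, and details such as whitespace, the placement of the namespace declaration, and the trimming of padded positional fields are assumptions.

```xml
<!-- Expected shape only; [ ... ] marks where a record's field value would appear -->
<myns:books xmlns:myns="http://www.mydomain.com/MyNamespace">
  <!-- px:onRecord emits one myns:book per record in the stream -->
  <myns:book categoryCode="[category]">
    <myns:title>[title]</myns:title>
    <myns:author>[author]</myns:author>
    <myns:price>[price]</myns:price>
  </myns:book>
</myns:books>
```

The category field becomes an attribute because it is mapped with px:fieldAttributeMap, while the other three fields become child elements, presumably in the order in which their px:fieldElementMap entries are listed.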
The next example shows how to use a px:xmlRecordReader element to adapt XML content to a record stream. Once we have a record stream, we can apply any record writer, including the px:flatFileWriter or the px:sqlWriter.
Figure 3. Adapting XML content to a record stream
<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML"
              xmlns:myns="http://www.mydomain.com/MyNamespace">
  <px:service name="books2pos">
    <px:writeRecords>
      <px:flatFileWriter>
        <px:flatFile ref="booksFlatFile"/>
      </px:flatFileWriter>
      <px:xmlRecordReader>
        <px:inverseRecordMapping ref="booksToFileMapping"/>
        <px:transform>
          <px:document/>
        </px:transform>
      </px:xmlRecordReader>
    </px:writeRecords>
  </px:service>
  <px:inverseRecordMapping name="booksToFileMapping">
    <px:documentFragmentMap path="/myns:books/myns:book">
      <px:fragmentRecordMap recordType="book">
        <px:fragmentFieldMap select="myns:title" field="title"/>
        <px:fragmentFieldMap select="@categoryCode" field="category"/>
        <px:fragmentFieldMap select="myns:author" field="author"/>
        <px:fragmentFieldMap select="myns:price" field="price"/>
        <px:fragmentFieldMap select="myns:reviews/myns:review[1]" field="review1"/>
        <px:fragmentFieldMap select="myns:reviews/myns:review[2]" field="review2"/>
      </px:fragmentRecordMap>
    </px:documentFragmentMap>
  </px:inverseRecordMapping>
  <px:flatFile name="booksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="bookType"/>
      <px:annotationRecord/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="bookType"/>
    </px:flatFileBody>
    <px:flatFileTrailer>
      <px:annotationRecord></px:annotationRecord>
      <px:annotationRecord>This is a trailer record</px:annotationRecord>
    </px:flatFileTrailer>
  </px:flatFile>
  <px:flatRecordType name="bookType">
    <px:positionalField name="category" label="Category" width="1"/>
    <px:positionalField name="author" label="Author" width="30"/>
    <px:positionalField name="title" label="Title" width="30"/>
    <px:positionalField name="price" label="Price" width="10" justify="right"/>
  </px:flatRecordType>
</px:resources>
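The XPath expressions in booksToFileMapping imply a particular input document. As an illustration, the hypothetical document below has the structure those expressions select from; the bracketed text is placeholder content, not real data, and nothing here is taken from an actual ServingXML sample file.

```xml
<!-- Hypothetical input; element content is placeholder text -->
<myns:books xmlns:myns="http://www.mydomain.com/MyNamespace">
  <!-- Each element matched by path="/myns:books/myns:book" becomes one record -->
  <myns:book categoryCode="[category]">
    <myns:title>[title]</myns:title>
    <myns:author>[author]</myns:author>
    <myns:price>[price]</myns:price>
    <myns:reviews>
      <!-- myns:reviews/myns:review[1] selects the first review, [2] the second -->
      <myns:review>[review1]</myns:review>
      <myns:review>[review2]</myns:review>
    </myns:reviews>
  </myns:book>
</myns:books>
```

Note that bookType in Figure 3 declares no review1 or review2 field, so those two values would presumably not reach the positional output unless matching fields were added to the record type.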
In the previous examples, we paired a record reader and a record writer inside a px:writeRecords element. The reader reads a stream of records, and the writer writes them out. A record reader can also contain record filters that process the records as they pass through. Normally the records go on to a writer, but the writer is optional: the processing can take place entirely within the filters. The example below shows a lone record reader inside a px:processRecords element. This record reader is a px:directoryReader, which reads all the file names in the data directory; a px:restrictRecordFilter skips any that do not match the pattern "books.*[.]txt". The resulting stream of file names passes through a px:taskRunnerRecordFilter, which runs a px:writeRecords task for each books file, reading it and writing its records to a similarly named file with a -new suffix in the output directory.
Figure 4. Processing selected files in a directory
<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML">
  <px:service name="all-books">
    <px:processRecords>
      <px:directoryReader directory="data">
        <px:restrictRecordFilter>
          <px:restrictField field="name" match="books.*[.]txt"/>
        </px:restrictRecordFilter>
        <px:taskRunnerRecordFilter>
          <px:parameter name="output-file">
            <px:replace match="(books.*)[.]txt" replaceWith="$1-new.txt"><px:toString value="{name}"/></px:replace>
          </px:parameter>
          <px:writeRecords>
            <px:flatFileReader>
              <px:fileSource directory="{parentDir}" file="{name}"/>
              <px:flatFile ref="oldBooksFlatFile"/>
            </px:flatFileReader>
            <px:flatFileWriter>
              <px:fileSink directory="output" file="{$output-file}"/>
              <px:flatFile ref="newBooksFlatFile"/>
            </px:flatFileWriter>
          </px:writeRecords>
        </px:taskRunnerRecordFilter>
      </px:directoryReader>
    </px:processRecords>
  </px:service>
  <px:flatFile name="newBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="newBookType"/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="newBookType"/>
    </px:flatFileBody>
  </px:flatFile>
  <px:flatRecordType name="newBookType">
    <px:fieldDelimiter value="|"/>
    <px:delimitedField name="author" label="Author"/>
    <px:delimitedField name="category" label="Category"/>
    <px:delimitedField name="title" label="Title"/>
    <px:delimitedField name="price" label="Price"/>
  </px:flatRecordType>
  <px:flatFile name="oldBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="oldBookType"/>
      <px:annotationRecord/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="oldBookType"/>
    </px:flatFileBody>
    <px:flatFileTrailer>
      <px:annotationRecord></px:annotationRecord>
      <px:annotationRecord>This is a trailer record</px:annotationRecord>
    </px:flatFileTrailer>
  </px:flatFile>
  <px:flatRecordType name="oldBookType">
    <px:positionalField name="category" width="1"/>
    <px:positionalField name="author" width="30"/>
    <px:positionalField name="title" width="30"/>
    <px:positionalField name="price" width="10" justify="right"/>
  </px:flatRecordType>
</px:resources>
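Where the goal is simply to record which files matched, the filtered stream of file records could go straight to a writer instead of a task runner. The following is a minimal sketch, assuming that px:directoryReader records expose the name and parentDir fields used in Figure 4 and that they can be written like any other record. The service name list-books-files, the names fileListFlatFile and fileEntryType, and the output file books-files.txt are hypothetical, and the configuration is untested.

```xml
<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML">
  <!-- Hypothetical service: write the matching file names to a delimited file -->
  <px:service name="list-books-files">
    <px:writeRecords>
      <px:directoryReader directory="data">
        <!-- Same restriction as Figure 4: keep only the books files -->
        <px:restrictRecordFilter>
          <px:restrictField field="name" match="books.*[.]txt"/>
        </px:restrictRecordFilter>
      </px:directoryReader>
      <px:flatFileWriter>
        <px:fileSink directory="output" file="books-files.txt"/>
        <px:flatFile ref="fileListFlatFile"/>
      </px:flatFileWriter>
    </px:writeRecords>
  </px:service>
  <px:flatFile name="fileListFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="fileEntryType"/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="fileEntryType"/>
    </px:flatFileBody>
  </px:flatFile>
  <!-- One delimited line per file: its directory, then its name -->
  <px:flatRecordType name="fileEntryType">
    <px:fieldDelimiter value="|"/>
    <px:delimitedField name="parentDir" label="Directory"/>
    <px:delimitedField name="name" label="File"/>
  </px:flatRecordType>
</px:resources>
```

This variant is a quick way to check which files the match pattern selects before running the full conversion.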