Serving XML: Record Streams

Daniel Parker


Record readers and writers
Adapting a record stream to XML content
Adapting XML content to a record stream
Performing tasks repeatedly within a record filter

This is the second of three articles describing the Serving XML pipeline language.

This article discusses pipelines where the input or output (or both) are sequences of records.

Table 1. Analogous XML pipeline and record stream elements.

XML Pipeline Elements                        Record Stream Elements
px:task      px:serialize                    px:task          px:writeRecords
             px:transform                                     px:processRecords
px:emitter   px:xmlEmitter                   px:recordWriter  px:flatFileWriter
             px:customEmitter                                 px:sqlWriter
px:filter    px:transform                    px:recordFilter  px:processRecords
             px:choose
             px:saxFilter                                     px:customRecordFilter
             px:style                                         px:restrictRecordFilter
             px:taskRunnerFilter                              px:taskRunnerRecordFilter
             px:removeEmptyElementFilter
             msv:msvFilter                                    msv:msvRecordFilter
px:content   px:document                     px:recordReader  px:flatFileReader
             px:dynamicContent                                px:sqlReader
             px:recordContent                                 px:directoryReader
             px:emptyDocument                                 px:parameterReader
                                                              px:xmlRecordReader

Record readers and writers

The example below illustrates the idea of record readers and writers with a flat file reader that reads a stream of records from a positional flat file, and a flat file writer that writes the stream to a delimited flat file. Here, we pair a px:flatFileReader with a px:flatFileWriter, but we could just as easily pair a px:flatFileReader with a px:sqlWriter, or a px:sqlReader with a px:flatFileWriter.

Figure 1. Record readers and writers


<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML">
   
  <px:service name="new-books"> 
    <px:writeRecords>
      <px:flatFileWriter>
        <px:flatFile ref="newBooksFlatFile"/>
      </px:flatFileWriter>
      <px:flatFileReader>
        <px:flatFile ref="oldBooksFlatFile"/>
      </px:flatFileReader>
    </px:writeRecords>
  </px:service>
  
  <px:flatFile name="newBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="newBookType"/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="newBookType"/>
    </px:flatFileBody>
  </px:flatFile>      

  <px:flatRecordType name="newBookType">
    <px:fieldDelimiter value="|"/>
    <px:delimitedField name="author" label="Author"/>
    <px:delimitedField name="category" label="Category"/>
    <px:delimitedField name="title" label="Title"/>
    <px:delimitedField name="price" label="Price"/>
  </px:flatRecordType>
  
  <px:flatFile name="oldBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="oldBookType"/>
      <px:annotationRecord/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="oldBookType"/>
    </px:flatFileBody>
    <px:flatFileTrailer>
      <px:annotationRecord></px:annotationRecord>
      <px:annotationRecord>This is a trailer record</px:annotationRecord>
    </px:flatFileTrailer>
  </px:flatFile>      

  <px:flatRecordType name="oldBookType">
    <px:positionalField name="category" width="1"/>
    <px:positionalField name="author" width="30"/>
    <px:positionalField name="title" width="30"/>
    <px:positionalField name="price" width="10" justify="right"/>
  </px:flatRecordType>
  
</px:resources>
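
Because the reader and the writer each refer to their own px:flatFile definition, the output layout can be changed without touching the input side. The fragment below is a minimal sketch of that idea, not part of the original example: the names csvBooksFlatFile and csvBookType are hypothetical, and it assumes that px:fieldDelimiter accepts a comma in the same way it accepts the pipe character in Figure 1.

```xml
<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML">

  <!-- Hypothetical output layout: same shape as newBooksFlatFile in Figure 1,
       but pointing at a comma-delimited record type. -->
  <px:flatFile name="csvBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="csvBookType"/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="csvBookType"/>
    </px:flatFileBody>
  </px:flatFile>

  <!-- Assumption: px:fieldDelimiter accepts "," just as it accepts "|". -->
  <px:flatRecordType name="csvBookType">
    <px:fieldDelimiter value=","/>
    <px:delimitedField name="author" label="Author"/>
    <px:delimitedField name="category" label="Category"/>
    <px:delimitedField name="title" label="Title"/>
    <px:delimitedField name="price" label="Price"/>
  </px:flatRecordType>

</px:resources>
```

To use such a layout, the px:flatFileWriter in the service would reference csvBooksFlatFile instead of newBooksFlatFile; the px:flatFileReader and its oldBooksFlatFile definition would stay as they are.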

Adapting a record stream to XML content

The next example shows how to use a px:recordContent element to adapt a record stream to XML content. Once we have XML content, we can apply all of the XML pipeline instructions described in Serving XML: Pipeline Language.

Figure 2. Adapting a record stream to XML content


<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML"
                         xmlns:myns="http://www.mydomain.com/MyNamespace">

  <px:service name="books"> 
    <px:serialize>
      <px:transform>
        <px:content ref="books"/>
      </px:transform>
    </px:serialize>
  </px:service>
  
  <px:recordContent name="books">
    <px:flatFileReader>
      <px:flatFile ref="oldBooksFlatFile"/>
    </px:flatFileReader>
    <px:recordMapping ref="booksToXmlMapping"/>
  </px:recordContent>

  <px:recordMapping name="booksToXmlMapping">
    <myns:books>
      <px:onRecord>
        <myns:book>
          <px:fieldElementMap field="title" element="myns:title"/>  
          <px:fieldAttributeMap field="category" attribute="categoryCode"/>
          <px:fieldElementMap field="author" element="myns:author"/>
          <px:fieldElementMap field="price" element="myns:price"/>
        </myns:book>  
      </px:onRecord>
    </myns:books>
  </px:recordMapping>  
   
  <px:flatFile name="oldBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="oldBookType"/>
      <px:annotationRecord/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="oldBookType"/>
    </px:flatFileBody>
    <px:flatFileTrailer>
      <px:annotationRecord></px:annotationRecord>
      <px:annotationRecord>This is a trailer record</px:annotationRecord>
    </px:flatFileTrailer>
  </px:flatFile>      

  <px:flatRecordType name="oldBookType">
    <px:positionalField name="category" width="1"/>
    <px:positionalField name="author" width="30"/>
    <px:positionalField name="title" width="30"/>
    <px:positionalField name="price" width="10" justify="right"/>
  </px:flatRecordType>
  
</px:resources>
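
To make the mapping in Figure 2 concrete, the document below sketches the kind of XML that booksToXmlMapping would be expected to produce for a single input record. The field values are invented for illustration, and the exact serialization (indentation, placement of the namespace declaration, trimming of padded positional fields) is an assumption rather than something stated in this article.

```xml
<!-- Hypothetical result for one record with category "F". -->
<myns:books xmlns:myns="http://www.mydomain.com/MyNamespace">
  <!-- px:fieldAttributeMap: the category field becomes the categoryCode attribute. -->
  <myns:book categoryCode="F">
    <!-- px:fieldElementMap: each mapped field becomes a child element. -->
    <myns:title>An Example Title</myns:title>
    <myns:author>Example Author</myns:author>
    <myns:price>12.50</myns:price>
  </myns:book>
</myns:books>
```

Each record in the stream contributes one myns:book element, because the myns:book template sits inside px:onRecord, while the enclosing myns:books element is written once around all of them.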

Adapting XML content to a record stream

The next example shows how to use a px:xmlRecordReader element to adapt XML content to a record stream. Once we have a record stream, we can apply any record writer, including the px:flatFileWriter or the px:sqlWriter.

Figure 3. Adapting XML content to a record stream


<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML"
                        xmlns:myns="http://www.mydomain.com/MyNamespace">
   
  <px:service name="books2pos"> 
    <px:writeRecords>
      <px:flatFileWriter>
        <px:flatFile ref="booksFlatFile"/>
      </px:flatFileWriter>
      <px:xmlRecordReader>
        <px:inverseRecordMapping ref="booksToFileMapping"/>
        <px:transform>
          <px:document/>
        </px:transform>
      </px:xmlRecordReader>
    </px:writeRecords>
  </px:service>

  <px:inverseRecordMapping name="booksToFileMapping">
    <px:documentFragmentMap path="/myns:books/myns:book">
      <px:fragmentRecordMap recordType="book">
        <px:fragmentFieldMap select="myns:title" field="title"/>
        <px:fragmentFieldMap select="@categoryCode" field="category"/>
        <px:fragmentFieldMap select="myns:author" field="author"/>
        <px:fragmentFieldMap select="myns:price" field="price"/>
        <px:fragmentFieldMap select="myns:reviews/myns:review[1]" field="review1"/>
        <px:fragmentFieldMap select="myns:reviews/myns:review[2]" field="review2"/>
      </px:fragmentRecordMap>
    </px:documentFragmentMap>
  </px:inverseRecordMapping>

  <px:flatFile name="booksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="bookType"/>
      <px:annotationRecord/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="bookType"/>
    </px:flatFileBody>
    <px:flatFileTrailer>
      <px:annotationRecord></px:annotationRecord>
      <px:annotationRecord>This is a trailer record</px:annotationRecord>
    </px:flatFileTrailer>
  </px:flatFile>      

  <px:flatRecordType name="bookType">
    <px:positionalField name="category" label="Category" width="1"/>
    <px:positionalField name="author" label="Author" width="30"/>
    <px:positionalField name="title" label="Title" width="30"/>
    <px:positionalField name="price" label="Price" width="10" justify="right"/>
  </px:flatRecordType>
  
</px:resources>
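
For reference, here is a sketch of an input document that the path and select expressions in Figure 3 would match. The content is hypothetical; only the element and attribute names are taken from booksToFileMapping.

```xml
<!-- Hypothetical input for the books2pos service. -->
<myns:books xmlns:myns="http://www.mydomain.com/MyNamespace">
  <!-- Each myns:book matches path="/myns:books/myns:book" and yields one record. -->
  <myns:book categoryCode="F">
    <myns:title>An Example Title</myns:title>
    <myns:author>Example Author</myns:author>
    <myns:price>12.50</myns:price>
    <myns:reviews>
      <!-- myns:review[1] maps to field review1, myns:review[2] to review2. -->
      <myns:review>First hypothetical review</myns:review>
      <myns:review>Second hypothetical review</myns:review>
    </myns:reviews>
  </myns:book>
</myns:books>
```

Note that the mapping produces review1 and review2 fields, while the bookType layout declares only category, author, title and price. This article does not say how the writer treats fields that the record type does not declare.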

Performing tasks repeatedly within a record filter

In the previous examples we pair a record reader and a record writer inside a px:writeRecords element. The reader reads a stream of records and the writer writes out the records.

A record reader can contain record filters that process the records as they pass through. Normally the records go on to a writer, but a writer is optional: the processing can take place entirely within the filters. The example below shows a lone record reader inside a px:processRecords element. This record reader is a px:directoryReader, which reads all the file names in the data directory; a px:restrictRecordFilter skips any that do not match the pattern "books.*[.]txt". The resulting stream of file names passes through a px:taskRunnerRecordFilter, which runs a nested px:writeRecords task for each name: it reads the books file and writes out the records to a similarly named file with a -new suffix in the output directory.

Figure 4. Processing selected files in a directory


<px:resources xmlns:px="http://www.presentingxml.com/PresentingXML">

  <px:service name="all-books"> 
    <px:processRecords>
      <px:directoryReader directory="data">
        <px:restrictRecordFilter>
          <px:restrictField field="name" match="books.*[.]txt"/>
        </px:restrictRecordFilter>
        <px:taskRunnerRecordFilter>
          <px:parameter name="output-file">
            <px:replace match="(books.*)[.]txt" replaceWith="$1-new.txt"><px:toString value="{name}"/></px:replace>
          </px:parameter>   
          <px:writeRecords>
            <px:flatFileReader>
              <px:fileSource directory="{parentDir}" file="{name}"/>
              <px:flatFile ref="oldBooksFlatFile"/>
            </px:flatFileReader>
            <px:flatFileWriter>
              <px:fileSink directory="output" file="{$output-file}"/> 
              <px:flatFile ref="newBooksFlatFile"/>
            </px:flatFileWriter>
          </px:writeRecords>
        </px:taskRunnerRecordFilter>
      </px:directoryReader>
    </px:processRecords>
  </px:service>

  <px:flatFile name="newBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="newBookType"/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="newBookType"/>
    </px:flatFileBody>
  </px:flatFile>      

  <px:flatRecordType name="newBookType">
    <px:fieldDelimiter value="|"/>
    <px:delimitedField name="author" label="Author"/>
    <px:delimitedField name="category" label="Category"/>
    <px:delimitedField name="title" label="Title"/>
    <px:delimitedField name="price" label="Price"/>
  </px:flatRecordType>

  <px:flatFile name="oldBooksFlatFile">
    <px:flatFileHeader>
      <px:flatRecordType ref="oldBookType"/>
      <px:annotationRecord/>
    </px:flatFileHeader>
    <px:flatFileBody>
      <px:flatRecordType ref="oldBookType"/>
    </px:flatFileBody>
    <px:flatFileTrailer>
      <px:annotationRecord></px:annotationRecord>
      <px:annotationRecord>This is a trailer record</px:annotationRecord>
    </px:flatFileTrailer>
  </px:flatFile>      

  <px:flatRecordType name="oldBookType">
    <px:positionalField name="category" width="1"/>
    <px:positionalField name="author" width="30"/>
    <px:positionalField name="title" width="30"/>
    <px:positionalField name="price" width="10" justify="right"/>
  </px:flatRecordType>
  
</px:resources>
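
The output file name in Figure 4 is derived from the input file name by a regular expression replacement. The fragment below repeats that parameter with comments added. The file names in the comments are hypothetical, and the description assumes the usual regular expression semantics, in which $1 refers back to the first parenthesized group.

```xml
<!-- Fragment only: in Figure 4 this parameter sits inside
     px:taskRunnerRecordFilter. -->
<px:parameter name="output-file"
              xmlns:px="http://www.presentingxml.com/PresentingXML">
  <!-- {name} is the name field of the current directory record,
       for example the hypothetical "books-fiction.txt". -->
  <!-- (books.*) captures everything before ".txt" and $1 puts it back,
       so that name would be expected to become "books-fiction-new.txt". -->
  <px:replace match="(books.*)[.]txt" replaceWith="$1-new.txt"><px:toString value="{name}"/></px:replace>
</px:parameter>
```

The nested px:flatFileWriter then picks the value up through the {$output-file} reference in its px:fileSink element.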