Handling Inline Data

Trailer Lines and Inline Entity Data

In the Footwear Orders sample MAPPER report, the asterisk lines that follow the last line item in an order are known as trailer lines. For example:

Footwear Orders Trailer Lines

Trailer lines often contain data for an entity that is separate from, but related to, the entity represented by the column-formatted, tab data lines. For the Footwear Orders report, the trailer lines represent the customer contact information or "addressee" for the preceding order items:

Footwear Order Line Item - Addressee Relationship

MJ considers entity data embedded in trailer lines as "inline".

Identifying Inline Data Tuples

As with the tab lines that hold the columnar data fields for the primary entity in the MAPPER report, it is important to identify which trailer lines comprise a tuple (or row) of inline entity data, and how those tuples are organized. A single trailer line may represent a distinct inline data tuple or, as in the Footwear Orders sample report, multiple trailer lines may together comprise one tuple. Additionally, the trailer lines may not be column-formatted: in Footwear Orders, the trailer lines contain delimited, variable-length fields.
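The grouping step described above can be sketched in plain Java. This is only an illustration and is not MJ code: the class name TrailerTupleGrouper, the method groupTrailerLines, and the sample lines are hypothetical, and the sketch assumes that every run of consecutive asterisk lines forms one inline tuple.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of MJ: groups consecutive asterisk (trailer)
// lines into candidate inline data tuples.
final class TrailerTupleGrouper {

    /** Collects each run of consecutive lines starting with '*' into one tuple. */
    static List<List<String>> groupTrailerLines(List<String> reportLines) {
        List<List<String>> tuples = new ArrayList<>();
        List<String> current = null;
        for (String line : reportLines) {
            if (line.startsWith("*")) {
                if (current == null) {
                    // First trailer line of a new run starts a new tuple.
                    current = new ArrayList<>();
                    tuples.add(current);
                }
                current.add(line);
            } else {
                // Any other line type ends the current run of trailer lines.
                current = null;
            }
        }
        return tuples;
    }

    public static void main(String[] args) {
        // Made-up sample lines; a real report will differ.
        List<String> lines = List.of(
                "\titem A", "\titem B",
                "*\tcontact line 1", "*\tcontact line 2",
                "\titem C",
                "*\tcontact line 3");
        System.out.println(groupTrailerLines(lines).size() + " tuples found");
    }
}
```

A report in which each trailer line is its own tuple would need a different grouping rule, which is why the organization of the trailer lines has to be settled before any fields are parsed.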

IInlineDataHandler Interface

Translation of inline MAPPER report data is informed by a Spring bean that implements IInlineDataHandler, an interface purposely designed for composing tuples from trailer lines. MJ provides IInlineDataHandler beans useful in handling common occurrences of inline data, such as parsing data fields with regular expressions and storing inline text as CLOBs. Custom inline handlers may be devised for specially formatted trailer lines by implementing the IInlineDataHandler interface.

Handling Inline Data with Regular Expressions

The RegexInlineDataHandler bean recognizes and extracts fields from one or more trailer lines, mapping and loading the fields into attributes of a data entity. Noteworthy properties of the RegexInlineDataHandler bean include:

matchFullInput: Selects the field-parsing mode. When false, parsing runs in match-and-advance mode: the portion of the input text that was matched is discarded before the next field is matched. When true, parsing runs in match-full-input mode: each regular expression is matched against the full input text.
textStripRegex: Regular expression to match and strip from the input text before fields are parsed.
textReplaceRegex: Regular expression to match and replace in the input text before fields are parsed.
unixLines: Controls use of java.util.regex.Pattern.UNIX_LINES for the regular expressions employed by the bean.
multiline: Controls use of java.util.regex.Pattern.MULTILINE for the regular expressions employed by the bean.
caseInsensitive: Controls use of java.util.regex.Pattern.CASE_INSENSITIVE for the regular expressions employed by the bean.
fieldParseRegex<N>: Regular expression used to parse field N from the input text. Each property names the field and supplies a regular expression that recognizes the field and selects its contents as a capturing group.
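The difference between the two parsing modes can be illustrated with java.util.regex directly. The sketch below is not the RegexInlineDataHandler implementation: the class FieldParseModes and its parse method are hypothetical, and the sketch assumes that match-and-advance discards everything up to the end of each match.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the two modes; not MJ code.
final class FieldParseModes {

    /**
     * Applies each pattern in order and returns capturing group 1 of each
     * match (null where a pattern does not match).
     */
    static List<String> parse(String input, List<Pattern> patterns, boolean matchFullInput) {
        List<String> fields = new ArrayList<>();
        String remaining = input;
        for (Pattern p : patterns) {
            // Full-input mode always searches the original text;
            // match-and-advance searches only what is left over.
            Matcher m = p.matcher(matchFullInput ? input : remaining);
            if (m.find()) {
                fields.add(m.group(1));
                if (!matchFullInput) {
                    // Assumption: drop everything through the end of the match.
                    remaining = remaining.substring(m.end());
                }
            } else {
                fields.add(null);
            }
        }
        return fields;
    }

    public static void main(String[] args) {
        List<Pattern> twoWords = List.of(
                Pattern.compile("([a-z]+)"),
                Pattern.compile("([a-z]+)"));
        // Match-and-advance: the second pattern sees only ",green,blue".
        System.out.println(parse("red,green,blue", twoWords, false)); // [red, green]
        // Match-full-input: both patterns see the whole text.
        System.out.println(parse("red,green,blue", twoWords, true));  // [red, red]
    }
}
```

In match-and-advance mode the same simple expression can be reused for successive fields, whereas in match-full-input mode each expression has to locate its own field within the complete text.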

For example, this bean is employed to translate the customer address information in the trailer (asterisk) lines of the Footwear Orders sample report into the FootwearOrderAddressee entity. Below are the fieldParseRegex properties that recognize and extract the addressee and street from the first trailer line of a Footwear Orders tuple:

<bean id="addrInlineHandler"
      class="com.arsi.mj.maprpt.parser.tuple.inline.RegexInlineDataHandler"
      . . .
      p:matchFullInput="false"
      p:multiline="true"
      p:unixLines="true"
      p:textStripRegex="^\*">

  <property name="fieldParseRegex1">
    <!-- column name, field selector regex -->
    <list>
      <value>addressee</value>
      <value>^[^\t]*\t([^\t]*)</value>
    </list>
  </property>

  <property name="fieldParseRegex2">
    <!-- column name, field selector regex -->
    <list>
      <value>street</value>
      <value>\t([^\t]*)$</value>
    </list>
  </property>
  . . .
</bean>

Note that setting matchFullInput to false selects match-and-advance mode, and that the textStripRegex property strips the leading asterisk (the line type) from the trailer line before it is parsed.

The fieldParseRegex1 and fieldParseRegex2 bean properties match fields in the first of these trailer (asterisk) lines from the Footwear Orders report:

Footwear Orders Trailer Lines

The screenshots that follow are from a Java regex tester.

Regular expression property fieldParseRegex1 matches the addressee, "Dave Bennet".

<property name="fieldParseRegex1">
  <!-- column name, field selector regex -->
  <list>
    <value>addressee</value>
    <value>^[^\t]*\t([^\t]*)</value>
  </list>
</property>
Regular Expression To Parse Addressee

Regular expression property fieldParseRegex2 matches the street number and name, "123 Main St".

<property name="fieldParseRegex2">
  <!-- column name, field selector regex -->
  <list>
    <value>street</value>
    <value>\t([^\t]*)$</value>
  </list>
</property>
Regular Expression To Parse Street
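The same two expressions can be tried outside of MJ with java.util.regex. The following is a minimal sketch under stated assumptions: the class AddresseeRegexDemo is hypothetical, the sample line "*<tab>Dave Bennet<tab>123 Main St" is an assumed layout consistent with the expressions above rather than a copy of the report, and the strip and advance steps only approximate what the bean does.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-alone check of the two field regexes; not MJ code.
final class AddresseeRegexDemo {

    // Mirrors p:multiline="true" and p:unixLines="true" from the bean definition.
    static final int FLAGS = Pattern.MULTILINE | Pattern.UNIX_LINES;

    static final Pattern STRIP = Pattern.compile("^\\*", FLAGS);                     // textStripRegex
    static final Pattern ADDRESSEE = Pattern.compile("^[^\\t]*\\t([^\\t]*)", FLAGS); // fieldParseRegex1
    static final Pattern STREET = Pattern.compile("\\t([^\\t]*)$", FLAGS);           // fieldParseRegex2

    /** Returns {addressee, street} parsed from a single trailer line. */
    static String[] parseFirstTrailerLine(String line) {
        // Strip the asterisk line type before parsing any fields.
        String text = STRIP.matcher(line).replaceFirst("");

        Matcher a = ADDRESSEE.matcher(text);
        if (!a.find()) {
            throw new IllegalArgumentException("no addressee field");
        }
        // Match-and-advance: discard the text consumed by the first match.
        String rest = text.substring(a.end());

        Matcher s = STREET.matcher(rest);
        if (!s.find()) {
            throw new IllegalArgumentException("no street field");
        }
        return new String[] { a.group(1), s.group(1) };
    }

    public static void main(String[] args) {
        String[] fields = parseFirstTrailerLine("*\tDave Bennet\t123 Main St");
        System.out.println(fields[0] + " | " + fields[1]); // Dave Bennet | 123 Main St
    }
}
```

Because the addressee match consumes the text up to the second tab, the street expression only needs to anchor on the remaining tab and the end of the line.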

Defining the RDBMS Table and Columns For Inline Data

Once the recognition and extraction of inline data fields are specified, the target database table and columns where those fields are stored must be provided. In the following example for Footwear Orders, the database table and entity class are listed in the tableName and className properties, while database columns are defined by collaborating column definition beans:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:p="http://www.springframework.org/schema/p"
       xmlns:util="http://www.springframework.org/schema/util"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
                           http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
                           http://www.springframework.org/schema/util
                           http://www.springframework.org/schema/util/spring-util-3.0.xsd">

  <bean id="addrInlineHandler"
        class="com.arsi.mj.maprpt.parser.tuple.inline.RegexInlineDataHandler"
        p:className="com.arsi.mj.testapp.hibgen.model.FootwearOrderAddressee"
        p:baseClassSuffix="Base"
        p:tableName="FOOTWEAR_ORDER_ADDRESSEE"

        p:column1-ref="addrInlineHandler.coldefAddressee"
        p:column2-ref="addrInlineHandler.coldefStreet"
        p:column3-ref="addrInlineHandler.coldefCityState"
        p:column4-ref="addrInlineHandler.coldefPostalCode"
        p:column5-ref="addrInlineHandler.coldefEmail"
        p:column6-ref="addrInlineHandler.coldefPhone"
        . . .
        >
</bean>

Below are example column definition beans for the addressee and street fields in the Footwear Orders report. Each column is defined via arguments passed to the ColumnDef constructor:

<bean id="addrInlineHandler.coldefAddressee"
      class="com.arsi.mj.config.atoms.ColumnDef">
  <!-- entity attrname, column name, length, NULLABLE, data type -->
  <constructor-arg index="0" value="addressee"/>
  <constructor-arg index="1" value="addressee"/>
  <constructor-arg index="2" value="50"/>
  <constructor-arg index="3" value="false"/>
  <constructor-arg index="4">
    <util:constant static-field="org.hibernate.type.StandardBasicTypes.STRING"/>
  </constructor-arg>
</bean>

<bean id="addrInlineHandler.coldefStreet"
      class="com.arsi.mj.config.atoms.ColumnDef">
  <!-- entity attrname, column name, length, NULLABLE, data type -->
  <constructor-arg index="0" value="street"/>
  <constructor-arg index="1" value="street"/>
  <constructor-arg index="2" value="80"/>
  <constructor-arg index="3" value="false"/>
  <constructor-arg index="4">
    <util:constant static-field="org.hibernate.type.StandardBasicTypes.STRING"/>
  </constructor-arg>
</bean>

Columns are defined with the ColumnDef bean instead of AnnotatedColumnDef because the inline data is not column-formatted and does not contain heading lines that give the MAPPER name and size of each column. The entity attribute name provided as the first argument to the ColumnDef constructor must match the field name specified in the corresponding fieldParseRegex property ("addressee" and "street", for example).

RegexInlineDataHandler Example: FootwearOrderAddressee

See the full Spring XML for the Footwear Orders sample MAPPER report, including configuration for translating inline "addressee" entity data into the FootwearOrderAddressee persistence class and FOOTWEAR_ORDER_ADDRESSEE table.