Filtering Input Data
BDT supports filtering of input data. For example, in a case with one million input rows, you can designate a subset for transformation. To do so, specify subset criteria in a policy file to filter the input records.
The element subset
is a table-level element, and is applicable only to that table. Refer to BDT Policy Files for more information on Table attributes.
Filter Input Rows from Source Files
The following example shows how to filter input rows from source files:
"subset":{
"offset":"10",
"limit":1000,
"recurrency": 2,
"filters":[
{
"selector":"name",
"operator":"like",
"expression":"'p.*'"
}
]
}
Specify the starting offset
and the limit
, which define the number of rows to transform.
The recurrency
attribute specifies whether to transform rows in a sequence or skip rows. If recurrency is set to 2, then every second row in the sequence is transformed and the row in between is skipped.
filters
specifies the list of conditions to match. The row should satisfy all conditions specified in the list.
selector
is the name of the column that the filter should evaluate.operators
Options: "=","<",">","<=",">=","!=","like".expression
Used to evaluate the filter on the selector using specified operator. For file transformation ifoperator
is specified aslike
, then specifyregex expression
to match in the column value.
Note
To create a regex expression, see Regex Tutorial for details.
Filter Database Table Rows
The following example shows how to filter input database table rows:
"subset":{
"offset":"10",
"limit":1000,
"recurrency": 1,
"orderBy":"ID",
"order":"DESC",
"filters":[
{
"selector":"name",
"operator":"like",
"expression":"'p.*'"
},
{
"selector":"salary",
"operator":"<",
"expression":"100000"
}
]
}
Compared with file transformations, the database transformation subset
can include two additional attributes:
orderBy
: Use the name of column which should be used to order records. Records are ordered first, then offset and limit attributes are applied.order
: Define ascending (ASC) or descending (DESC) order.
For the like
operator, use database-specific syntax for the expression to evaluate.