Filtering Input Data
BDT supports filtering of input data. For example, in a case with one million input rows, you can designate a subset for transformation. To do so, specify subset criteria in a policy file to filter the input records.
The element subset is a table-level element, and is applicable only to that table. Refer to BDT Policy Files for more information on Table attributes.
Filter Input Rows from Source Files
The following example shows how to filter input rows from source files:
"subset":{
"offset":"10",
"limit":1000,
"recurrency": 2,
"filters":[
{
"selector":"name",
"operator":"like",
"expression":"'p.*'"
}
]
}
Specify the starting offset and the limit, which define the number of rows to transform.
The recurrency attribute specifies whether to transform rows in a sequence or skip rows. If recurrency is set to 2, then every second row in the sequence is transformed and the row in between is skipped.
filters specifies the list of conditions to match. The row should satisfy all conditions specified in the list.
selectoris the name of the column that the filter should evaluate.operatorsOptions: "=","<",">","<=",">=","!=","like".expressionUsed to evaluate the filter on the selector using specified operator. For file transformation ifoperatoris specified aslike, then specifyregex expressionto match in the column value.
Note
To create a regex expression, see Regex Tutorial for details.
Filter Database Table Rows
The following example shows how to filter input database table rows:
"subset":{
"offset":"10",
"limit":1000,
"recurrency": 1,
"orderBy":"ID",
"order":"DESC",
"filters":[
{
"selector":"name",
"operator":"like",
"expression":"'p.*'"
},
{
"selector":"salary",
"operator":"<",
"expression":"100000"
}
]
}
Compared with file transformations, the database transformation subset can include two additional attributes:
orderBy: Use the name of column which should be used to order records. Records are ordered first, then offset and limit attributes are applied.order: Define ascending (ASC) or descending (DESC) order.
For the like operator, use database-specific syntax for the expression to evaluate.