Working with XML and Textual Files

Enterprise systems rarely rely exclusively on relational databases to store data. Legacy systems in particular often use other data stores, for example to support importing and exporting data. XML is commonly used for data interchange. BizDataX lets you set up rules that anonymize data in existing XML files or generate new XML files with synthetic data.

In the example below we want to anonymize data in an XML file containing master data about customers.

Figure 62: Sample XML file that can be anonymized with BizDataX masking rules

In general, XML can be very different from the tabular structures typical of relational databases. However, in this particular case, if we use an expression such as xml.Descendants("Customer") as the query, we can process the XML structure with BizDataX's masking engine and masking iterator. The engine iterates over every Customer element in the XML structure.

Figure 63: Specifying XLINQ expressions to iterate over elements in XML structures
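To make the query expression concrete, here is a minimal sketch in plain LINQ to XML, outside of BizDataX. The sample document and its CustomerID and FirstName attributes are assumptions made for illustration; the real file may be structured differently, and the masking engine performs the iteration itself.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical sample document; element and attribute names are assumptions.
var xml = XDocument.Parse(
    "<Customers>" +
    "<Region><Customer CustomerID='1' FirstName='Anna' /></Region>" +
    "<Customer CustomerID='2' FirstName='Peter' />" +
    "</Customers>");

// Descendants returns every matching element at any depth, in document order.
var customers = xml.Descendants("Customer").ToList();

foreach (XElement customer in customers)
{
    string id = (string)customer.Attribute("CustomerID");
    string firstName = (string)customer.Attribute("FirstName");
    Console.WriteLine(id + ": " + firstName);
}
```

Because Descendants matches elements at any nesting level, both Customer elements are visited even though one of them sits inside a Region element.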

Setting up the rules for changing names is very similar to setting up rules that change names in the columns of a relational table. We can use the appropriate tools from the BizDataX toolbox. However, when working with XML, we have to be very specific about the property that the rule changes. With relational data it is always a column; with XML it could be, for example, an attribute or the inner text of a child element.

Figure 64: Masking the FirstName attribute in an XML structure using replacement values from the Swiss names list
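The difference between an attribute and the inner text of a child element can be sketched with plain LINQ to XML. This is an illustration only, not the code that BizDataX generates; the LastName child element and the replacement values are assumptions.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical input: one value is stored as an attribute,
// the other as the inner text of a child element.
var xml = XDocument.Parse(
    "<Customers>" +
    "<Customer FirstName='Anna'><LastName>Keller</LastName></Customer>" +
    "</Customers>");

XElement customer = xml.Descendants("Customer").First();

// Case 1: the value lives in an attribute of the Customer element.
customer.SetAttributeValue("FirstName", "Lena");

// Case 2: the value is the inner text of a child element.
customer.SetElementValue("LastName", "Meier");

Console.WriteLine(xml);
```

A rule that targets the wrong kind of property would not find the value, which is why the property has to be named precisely.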

The rest of the rule setup is no different from what it was when we were working with relational data.

Additionally, when you work with XML files, you have to load and save the data.

Figure 65: Loading and saving XML content

You can define more than one query: for example, load the XML structure, anonymize xml.Descendants("Customer"), xml.Descendants("CustomerAddress"), and xml.Descendants("SalesOrderHeader"), and then save.
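The load, query, and save sequence can be sketched in plain LINQ to XML as follows. The file paths, the City and Comment attributes, and the fixed replacement values are assumptions for illustration; in BizDataX the replacements come from the masking rules.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml.Linq;

// Hypothetical paths and sample content so that the sketch runs on its own.
string inputPath = Path.Combine(Path.GetTempPath(), "customers.xml");
string outputPath = Path.Combine(Path.GetTempPath(), "customers.masked.xml");
File.WriteAllText(inputPath,
    "<Data>" +
    "<Customer FirstName='Anna' />" +
    "<CustomerAddress City='Bern' />" +
    "<SalesOrderHeader Comment='call Anna' />" +
    "</Data>");

// Load the document once.
XDocument xml = XDocument.Load(inputPath);

// Run several queries against the same in-memory document.
foreach (XElement c in xml.Descendants("Customer"))
    c.SetAttributeValue("FirstName", "Lena");
foreach (XElement a in xml.Descendants("CustomerAddress"))
    a.SetAttributeValue("City", "Basel");
foreach (XElement o in xml.Descendants("SalesOrderHeader"))
    o.SetAttributeValue("Comment", "");

// Save once, after all queries have been processed.
xml.Save(outputPath);
```

Writing to a separate output file keeps the original intact, which makes it easy to compare the two while the rules are being developed.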

Also, you are in no way limited when defining the rules. You can define Repeating to produce XML files that stay in sync with the rest of the data in relational databases; see Enforcing Referential Integrity for details. Conditional masking works just as it does with relational data. The same applies to the rest of the BizDataX Designer concepts.
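As a rough illustration of the two ideas, the sketch below applies a condition and a consistent replacement in plain LINQ to XML. It assumes that Repeating means the same original value always receives the same replacement; the dictionary, the replacement pool, and the Test attribute are hypothetical and do not reflect how BizDataX implements these features.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical input; the Test attribute is an assumption used as the condition.
var xml = XDocument.Parse(
    "<Customers>" +
    "<Customer FirstName='Anna' Test='false' />" +
    "<Customer FirstName='Anna' Test='false' />" +
    "<Customer FirstName='Demo' Test='true' />" +
    "</Customers>");

// Hypothetical replacement pool and lookup table (assumed Repeating behavior).
string[] pool = { "Lena", "Mia", "Noah" };
var repeating = new Dictionary<string, string>();

// Returns the stored replacement for a value, or assigns the next one from the pool.
string Mask(string original)
{
    if (!repeating.TryGetValue(original, out var replacement))
    {
        replacement = pool[repeating.Count % pool.Length];
        repeating[original] = replacement;
    }
    return replacement;
}

foreach (XElement c in xml.Descendants("Customer"))
{
    // Condition: leave records flagged as test data untouched.
    if ((string)c.Attribute("Test") == "true") continue;
    c.SetAttributeValue("FirstName", Mask((string)c.Attribute("FirstName")));
}

Console.WriteLine(xml);
```

If the same lookup table were shared with the rules that mask the relational data, the same customer would carry the same replacement name in both places, which is the effect the paragraph above describes.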