Sensitive data discovery in BizDataX starts by connecting to one or more databases and inspecting data samples and metadata. Discovery rules, algorithms, and heuristics (e.g. name lists, tax IDs, credit card numbers, home addresses) are applied to produce discovery findings. Hit rates and database statistics can be used to sort and filter the results.
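To make the idea of a discovery rule and its hit rate concrete, here is a minimal sketch in Python. It is not BizDataX code: the function names (`luhn_valid`, `looks_like_card_number`, `hit_rate`) are hypothetical, and the rule shown (a digit-length pattern plus the standard Luhn checksum) is only one plausible way to recognize credit card numbers in a column sample.

```python
import re

# Assumption: a card number is 13 to 19 digits once spaces and dashes are removed.
CARD_PATTERN = re.compile(r"^\d{13,19}$")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_like_card_number(value) -> bool:
    """Hypothetical discovery rule: pattern match followed by a Luhn check."""
    if value is None:
        return False
    compact = re.sub(r"[ -]", "", str(value))
    return bool(CARD_PATTERN.match(compact)) and luhn_valid(compact)


def hit_rate(sample, rule) -> float:
    """Share of non-null sampled values that satisfy the rule."""
    non_null = [v for v in sample if v is not None]
    if not non_null:
        return 0.0
    return sum(1 for v in non_null if rule(v)) / len(non_null)


# Well-known test card numbers, not real accounts.
sample = ["4111 1111 1111 1111", "5555555555554444", "hello", None]
rate = hit_rate(sample, looks_like_card_number)
```

A column whose sample yields a high rate would surface near the top of the findings list, while a low rate suggests a coincidental match that a reviewer may want to classify as a miss.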
Each finding can be classified as sensitive (hit) or not sensitive (miss). Before making a final decision, users can inspect the hit rate, peek at the data, or check whether the same column has been classified as a finding elsewhere. If there is any doubt about the data found, users can add comments and collaborate with the DBA or the application development team. Alternatively, they can request a larger sample or modify the rule to get a new and possibly shorter list of findings.
Focusing only on the hits (i.e. only the tables and columns containing sensitive data), the team specifies how to mask the data: which masking algorithms and referential integrity strategies to use. Discovery results, comments, statistics, hit rates, and probabilities remain accessible in context to support the process.
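One common way to keep referential integrity while masking is deterministic pseudonymization: the same input always maps to the same masked value, so joins between tables still line up. The sketch below illustrates that general technique with a keyed HMAC from the Python standard library. It is an assumption made for illustration, not a description of the masking algorithms BizDataX ships, and the table contents, the key, and the `mask_value` helper are all hypothetical.

```python
import hashlib
import hmac


def mask_value(value: str, key: bytes, length: int = 12) -> str:
    """Deterministically replace a value with a keyed pseudonym."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


# Hypothetical secret; in practice it would be stored outside the code.
KEY = b"example-masking-key"

# Two hypothetical tables that share the sensitive column tax_id.
customers = [
    {"tax_id": "123-45-6789", "name": "Ann"},
    {"tax_id": "987-65-4321", "name": "Bob"},
]
orders = [
    {"tax_id": "123-45-6789", "total": 40},
    {"tax_id": "123-45-6789", "total": 15},
    {"tax_id": "987-65-4321", "total": 99},
]

masked_customers = [{**row, "tax_id": mask_value(row["tax_id"], KEY)} for row in customers]
masked_orders = [{**row, "tax_id": mask_value(row["tax_id"], KEY)} for row in orders]

# The join on tax_id still resolves after masking.
ann_id = masked_customers[0]["tax_id"]
ann_totals = [row["total"] for row in masked_orders if row["tax_id"] == ann_id]
```

Because the mapping depends on the key, the same key has to be used for every table that shares the column. Truncating the digest shortens the masked value at the cost of a small collision risk.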
Huge databases can be inspected in a reasonable time because data is sampled rather than read in full and rule evaluations run in parallel. Major relational databases are supported, including DB2, MSSQL, and Oracle.
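The combination of sampling and parallel rule evaluation can be sketched as follows. This is an illustrative outline only: it uses plain random sampling over in-memory lists instead of a real database connection, and the rule set, the `discover` function, and its parameters are hypothetical rather than part of BizDataX.

```python
import random
import re
from concurrent.futures import ThreadPoolExecutor

EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical rule set: rule name -> predicate over a single value.
RULES = {
    "email": lambda v: bool(EMAIL.match(str(v))),
    "digits": lambda v: str(v).isdigit(),
}


def sample_column(values, size, seed=0):
    """Take at most `size` values; a fixed seed keeps runs reproducible."""
    values = list(values)
    if len(values) <= size:
        return values
    return random.Random(seed).sample(values, size)


def evaluate(task):
    """Evaluate one rule against one column sample and return its hit rate."""
    column, rule_name, sample = task
    if not sample:
        return column, rule_name, 0.0
    hits = sum(1 for v in sample if RULES[rule_name](v))
    return column, rule_name, hits / len(sample)


def discover(columns, sample_size=100, workers=4):
    """Run every rule on every sampled column in parallel, best hit rate first."""
    tasks = [
        (name, rule_name, sample_column(values, sample_size))
        for name, values in columns.items()
        for rule_name in RULES
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(evaluate, tasks))
    return sorted(results, key=lambda r: r[2], reverse=True)


columns = {
    "contact": ["a@b.com", "c@d.org"],
    "qty": ["1", "2", "x", "4"],
}
findings = discover(columns)
```

In CPython, threads mainly help when the work is I/O-bound, such as waiting on database reads; for CPU-heavy rules a process pool would be the more natural choice.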