Skip to main content

Re-Write data in SQL file [Resolved]

So my solution to anonymize data on our backup server where we store backups of MySQL databases in .sql files is the following:

Created a script that copies the .sql file from the backup server to my local workstation, automatically imports it into my database server, straight anonymizes the sensitive data, exports the database into another .sql file, removes the original .sql and also drops the database. Thus I end up with an anonymized .sql file. However, I wish to make it faster by directly re-writing the values in the original .sql file it self.

What do you think, what is the easiest method for this or is my solution already the best there is? Was thinking about using the sed command someway, but I just have no clue ATM how to tell the script what to replace, where, how many times and with what. :/


Question Credit: Bert
Question Reference
Asked July 20, 2019
Posted Under: Network
8 views
3 Answers

Things to consider:

  • How many times do you "hit the disk". Writing a dump file or a .csv costs something because of the disk hits. Sometimes a workaround is to "pipe" steps together.
  • UPDATE in place goes through all the transactional stuff; this leads to extra hits. So, perhaps UPDATEing is not the best approach.
  • CREATE TABLE and DROP TABLE take some effort, but not nearly as much as reading or writing an entire table.

I would consider the following (if it works for your situation):

CREATE DATABASE AnonDB ...;
USE AnonDB;
CREATE TABLE t1 LIKE RealDB.t1;
INSERT INTO t1
    SELECT id, ok_data, Anonymizer(sensitive_data), ...
        FROM RealDB.t1;
... t2 ...
etc.

Notes...

  • Need DROP RealDB at the end?
  • You need the function(s) Anonymizer.
  • Everything is manual (which table to convert; which columns; etc).
  • No CSV or sed.

credit: Rick James
Answered July 20, 2019

At the end, all your answers were good, but did not increase the performance and reduce the time that is needed to serve an anonymous database for the developers. Here is my bash script first:

#!/bin/bash

if [ -z "$1" ]; then
    echo ""
    echo "No anonymization PHP file present."
    echo "Please create first a project specific anonymization PHP file!"
    echo ""
    exit
fi

if [ -z "$2" ]; then
    echo ""
    echo "You have not specified where the backup SQL files are located"
    echo "Example: /backup/daily-raid/websites/projectnumberone/sqldump/"
    echo ""
    exit
fi

phpconfig=$1
route=$2

file=$(ssh backup "ls -t $route | head -1 && exit")
dbname=$(echo $file | head -c -11)
dbslug=$(echo $dbname | sed -e "s/-/_/g")
createdb="create database $dbslug"
dropdb="drop table $dbslug"


scp backup:$route$file /home/sysadmin/dumps

mysql -u root -e "$dropdb"
mysql -u root -e "$createdb"
mysql -u root $dbslug < /home/sysadmin/dumps/$file

rm -rf /home/sysadmin/dumps/$file

php $phpconfig $dbslug

exit

Please ignore the fact that the local DB servers root has no password.

The PHP file is based on these GitHub projects:

Just create a php file that knows what data needs to be anonymized and the bash script will run it. OFC you need PHP to be installed on your local workstation and MySQL server.


credit: Bert
Answered July 20, 2019
Your Answer
D:\Adnan\Candoerz\CandoProject\vQA