Securing data at rest in a data integration tool is critical for both data security and compliance with industry standards. This becomes even more significant when the data is stored on a LAN. It is therefore imperative to encrypt data at rest to protect it from misuse.
Let’s look at encryption for data at rest in the context of a data integration application, using an example. First, we should understand that encryption on any system requires three components:
(1) Data that needs to be encrypted
(2) A method to encrypt the data using a cryptographic algorithm
(3) Encryption keys to be used in conjunction with the data and the algorithm
Programming languages like Java provide libraries with a wide range of cryptographic algorithms, such as the Advanced Encryption Standard (AES). Choosing the right algorithm involves analysis and research based on security and performance.
A data integration tool that companies use for data transactions generally handles sensitive information about their clients. This information can be as common as a Social Security number (SSN) or as confidential as credit card details. The tool stores data on the file system for processing, and an application interacts with it through the Java IO API. In this case, the tool needs to encrypt the file before it is persisted on the file system, and decrypt it before the application reads it for parsing and processing.
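The encrypt-on-write, decrypt-on-read flow described above can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: the class name `EncryptedFileStore`, the choice of AES in CBC mode, and the layout that stores the IV at the start of the file are all assumptions made for the example. It relies on `InputStream.readNBytes`, so it assumes Java 11 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.CipherOutputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper: encrypts data before it is persisted and decrypts it on read.
final class EncryptedFileStore {
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";
    private static final int IV_LENGTH = 16; // AES block size in bytes

    // Writes a fresh random IV followed by the ciphertext.
    static void write(Path file, byte[] plaintext, SecretKey key)
            throws IOException, GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        try (OutputStream raw = Files.newOutputStream(file)) {
            raw.write(iv); // the IV is not secret, so it is stored in the clear
            try (CipherOutputStream out = new CipherOutputStream(raw, cipher)) {
                out.write(plaintext); // closing the stream writes the final padded block
            }
        }
    }

    // Reads the IV from the start of the file, then decrypts the remainder.
    static byte[] read(Path file, SecretKey key)
            throws IOException, GeneralSecurityException {
        try (InputStream raw = Files.newInputStream(file)) {
            byte[] iv = raw.readNBytes(IV_LENGTH);
            Cipher cipher = Cipher.getInstance(TRANSFORMATION);
            cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(iv));
            try (CipherInputStream in = new CipherInputStream(raw, cipher)) {
                return in.readAllBytes();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumption: a throwaway key generated in memory, for demonstration only.
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        SecretKey key = generator.generateKey();

        Path file = Files.createTempFile("records", ".enc");
        write(file, "sensitive record".getBytes(StandardCharsets.UTF_8), key);
        System.out.println(new String(read(file, key), StandardCharsets.UTF_8));
        Files.delete(file);
    }
}
```

In a real deployment the key would come from protected storage rather than being generated on each run, and an authenticated mode may be preferable where tamper detection matters.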
To secure data, we need a good algorithm for encrypting it at rest. A commonly sought-after method for encrypting and decrypting data is PGP (Pretty Good Privacy). PGP uses both symmetric and asymmetric keys to encrypt data being transferred across networks. Asymmetric encryption uses two different keys for encryption and decryption; the two keys are mathematically related and are created at the same time. Together they form a key pair, consisting of a public key and a private key. Data is encrypted with the public key and can only be decrypted with the matching private key. This means that anyone who holds only the public key cannot decrypt data that was previously encrypted with it. Another benefit of asymmetric encryption is that it allows an authentication check. This seemed to be a viable option, but with some limitations. Take a look at the public key encryption demo depicted below.
PGP is mainly beneficial when sensitive data is exchanged between partners, that is, when the information is shared over the network. It works well when public key cryptography is what you need. However, PGP requires more computational resources, which can lead to performance issues and make the process cumbersome. Hence, one must be aware of other algorithms, such as AES, for encryption. Let’s delve into AES to see how it differs from PGP.
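The public/private key relationship can also be sketched in code. The example below is not PGP itself (PGP needs an OpenPGP library); it is a minimal illustration of asymmetric encryption using the RSA support built into the JDK. The class name `PublicKeyDemo` and the 2048-bit key size are assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import javax.crypto.Cipher;

// Hypothetical demo class: encrypt with the public key, decrypt with the private key.
final class PublicKeyDemo {
    private static final String TRANSFORMATION = "RSA/ECB/OAEPWithSHA-256AndMGF1Padding";

    // Both halves of the key pair are generated together.
    static KeyPair newKeyPair() throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        return generator.generateKeyPair();
    }

    // Anyone holding the public key can encrypt.
    static byte[] encrypt(byte[] plaintext, PublicKey publicKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, publicKey);
        return cipher.doFinal(plaintext);
    }

    // Only the holder of the matching private key can decrypt.
    static byte[] decrypt(byte[] ciphertext, PrivateKey privateKey) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, privateKey);
        return cipher.doFinal(ciphertext);
    }

    public static void main(String[] args) throws Exception {
        KeyPair pair = newKeyPair();
        byte[] ciphertext = encrypt("partner payload".getBytes(StandardCharsets.UTF_8), pair.getPublic());
        byte[] recovered = decrypt(ciphertext, pair.getPrivate());
        System.out.println(new String(recovered, StandardCharsets.UTF_8));
    }
}
```

RSA can only encrypt a short message directly (well under the key size), which is one reason systems such as PGP combine it with a symmetric cipher for the bulk data.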
Another algorithm available is AES. AES is a symmetric key encryption algorithm, which means the same key is used for both encryption and decryption of data. A computer program takes clear text, processes it with an encryption key, and returns ciphertext. If the data needs to be decrypted, the program processes the ciphertext with the same key and reproduces the clear text.
AES vs PGP: AES requires fewer computational resources than PGP, which means a lower performance impact. For encrypting data at rest, AES therefore seems to be a promising solution.
There are a few important points that need to be noted while implementing AES in the application:
1. Initialization Vector (IV): The role of the IV is to insert new randomness into the process each time a message is encrypted. As a result, the same message is encrypted to a different ciphertext each time, and similar messages do not result in similar ciphertexts. This is why IVs need to be random but never need to be secret; the IV can be stored at the beginning of the ciphertext. There are all sorts of attacks on CBC mode when IVs are predictable. For example, the BEAST attack on SSL/TLS takes advantage of predictable IVs.
2. Generating AES keys and storing them in a JCEKS keystore: A JCEKS (Java Cryptography Extension KeyStore) keystore is created using the “keytool” utility provided with the Java JDK. Storing keys in a keystore is one measure to prevent your encryption keys from being exposed. It is simple to manage a keystore with keytool. A keystore must be created either together with a new key or by importing an existing keystore. To create a new key and keystore, type:
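The IV handling described above can be sketched as follows: a fresh random IV is generated for every encryption and stored in front of the ciphertext, so the same message encrypts differently each time. The class name `AesCbcWithIv` and the in-memory demo key are assumptions made for illustration only.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.IvParameterSpec;

// Hypothetical helper: AES in CBC mode with a random IV prepended to the ciphertext.
final class AesCbcWithIv {
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";
    private static final int IV_LENGTH = 16; // AES block size in bytes
    private static final SecureRandom RANDOM = new SecureRandom();

    // Returns IV || ciphertext.
    static byte[] encrypt(byte[] plaintext, SecretKey key) throws GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        RANDOM.nextBytes(iv); // unpredictable IV for every message
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key, new IvParameterSpec(iv));
        byte[] ciphertext = cipher.doFinal(plaintext);

        byte[] out = new byte[IV_LENGTH + ciphertext.length];
        System.arraycopy(iv, 0, out, 0, IV_LENGTH);
        System.arraycopy(ciphertext, 0, out, IV_LENGTH, ciphertext.length);
        return out;
    }

    // Reads the IV back from the first 16 bytes, then decrypts the rest.
    static byte[] decrypt(byte[] ivAndCiphertext, SecretKey key) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.DECRYPT_MODE, key, new IvParameterSpec(ivAndCiphertext, 0, IV_LENGTH));
        return cipher.doFinal(ivAndCiphertext, IV_LENGTH, ivAndCiphertext.length - IV_LENGTH);
    }

    public static void main(String[] args) throws Exception {
        KeyGenerator generator = KeyGenerator.getInstance("AES");
        generator.init(256);
        SecretKey key = generator.generateKey();

        byte[] message = "same message".getBytes(StandardCharsets.UTF_8);
        byte[] first = encrypt(message, key);
        byte[] second = encrypt(message, key);

        // The two outputs differ because each call used its own IV.
        System.out.println("identical ciphertexts: " + Arrays.equals(first, second));
        System.out.println(new String(decrypt(first, key), StandardCharsets.UTF_8));
    }
}
```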
keytool -genseckey -keystore aes-keystore.jck -storetype jceks -storepass mystorepass -keyalg AES -keysize 256 -alias jceksaes -keypass mykeypass
genseckey: Generate a SecretKey. This flag indicates the creation of a symmetric key, which will become our AES key
keystore: Location of the keystore. If the keystore does not exist, the tool will create a new store. Paths can be relative or absolute but must be local
storetype: The type of store (JKS, PKCS12, JCEKS, etc). JCEKS is used here to store symmetric keys (AES) that are not contained within a certificate
storepass: Password for the keystore. Creating a strong passphrase for the keystore is highly recommended
keyalg: Algorithm used to create the key (AES)
keysize: Size of the key in bits (128, 192, or 256 for AES)
alias: Alias given to the newly created key
keypass: Password protecting the use of the key
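Once the keystore exists, the application needs to read the key back from it. A minimal sketch of that step, using the standard `java.security.KeyStore` API, might look like this. The class name `KeystoreKeyLoader` is an assumption; the file name, alias, and passwords in `main` simply mirror the keytool command above and would not be hard-coded in practice.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.GeneralSecurityException;
import java.security.Key;
import java.security.KeyStore;
import javax.crypto.SecretKey;

// Hypothetical helper: loads a symmetric key from a JCEKS keystore.
final class KeystoreKeyLoader {

    static SecretKey load(Path keystore, char[] storePass, String alias, char[] keyPass)
            throws IOException, GeneralSecurityException {
        KeyStore store = KeyStore.getInstance("JCEKS");
        try (InputStream in = Files.newInputStream(keystore)) {
            store.load(in, storePass); // storepass protects the keystore as a whole
        }
        Key key = store.getKey(alias, keyPass); // keypass protects the individual entry
        if (!(key instanceof SecretKey)) {
            throw new GeneralSecurityException("No secret key found under alias " + alias);
        }
        return (SecretKey) key;
    }

    public static void main(String[] args) throws Exception {
        // Assumes the keytool command above was run in the current working directory.
        SecretKey key = load(
                Paths.get("aes-keystore.jck"),
                "mystorepass".toCharArray(),
                "jceksaes",
                "mykeypass".toCharArray());
        System.out.println(key.getAlgorithm() + " key, " + key.getEncoded().length * 8 + " bits");
    }
}
```

Passing passwords as `char[]` rather than `String` follows the KeyStore API and lets callers clear them from memory after use.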
3. Cipher Instance: When getting a Cipher instance in Java, there is an option to specify the security provider, such as BouncyCastle. Instead of recreating the provider instance multiple times (which is a bad idea), register it once with Security.addProvider(new BouncyCastleProvider()) and then refer to it by the name "BC". This approach will save you a few milliseconds (~200) on each call to get the Cipher instance.
So, instead of using:
// required only once