Security and Cryptography
Python is one of the most popular languages in computer and network security. This topic covers Python's cryptographic features and implementations, from their uses in computer and network security to hashing and encryption/decryption algorithms.
- hashlib.pbkdf2_hmac(name, password, salt, rounds, dklen=None)
Many of the methods in hashlib require you to pass values interpretable as buffers of bytes, rather than strings. This is the case for the update() method of the object returned by hashlib.new() as well as for hashlib.pbkdf2_hmac. A string literal can be written as a bytes literal by prefixing it with the character b, and an existing string can be converted to bytes with its encode() method.
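A minimal sketch of both conversions follows. The sample strings are illustrative, and UTF-8 is assumed as the encoding.

```python
import hashlib

# A bytes literal: the b prefix only works on literals written in source code.
literal = b"hello world"

# An existing str object has to be encoded explicitly (UTF-8 assumed here).
text = "hello world"
encoded = text.encode("utf-8")

# Both forms are accepted by hashlib and produce the same digest.
digest_from_literal = hashlib.sha256(literal).hexdigest()
digest_from_encoded = hashlib.sha256(encoded).hexdigest()
```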
Asymmetric RSA encryption using pycrypto
Asymmetric encryption has the advantage that a message can be encrypted without exchanging a secret key with the recipient of the message. The sender merely needs to know the recipient's public key. This allows encrypting the message in such a way that only the designated recipient (who has the corresponding private key) can decrypt it. Currently, a third-party module like pycrypto is required for this functionality.
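The encryption step might be sketched as follows. The function name encrypt_message and the idea of passing the public key as PEM data are assumptions made for illustration. The Crypto package is provided by pycrypto (and by the API-compatible pycryptodome fork), and it is imported inside the function only so that the sketch loads where the package is missing.

```python
def encrypt_message(message, public_key_pem):
    """Encrypt message (bytes) so that only the private key holder can read it."""
    # Third-party imports, kept local so the sketch loads without pycrypto.
    from Crypto.Cipher import PKCS1_OAEP
    from Crypto.PublicKey import RSA

    key = RSA.importKey(public_key_pem)   # parse the recipient's public key
    cipher = PKCS1_OAEP.new(key)          # PKCS#1 OAEP encryption scheme
    return cipher.encrypt(message)
```

A call could look like encrypt_message(b'This is a very secret message.', pem_data), where pem_data holds the contents of the recipient's public key file. OAEP only accepts short messages, roughly two hundred bytes for a 2048-bit key.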
The recipient can decrypt the message then if they have the right private key:
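A sketch of the decryption side, under the same assumptions: decrypt_message is a hypothetical helper name, the private key is passed in as PEM data, and the Crypto package comes from pycrypto or the compatible pycryptodome fork.

```python
def decrypt_message(encrypted, private_key_pem):
    """Decrypt bytes produced with PKCS#1 OAEP and the matching public key."""
    # Third-party imports, kept local so the sketch loads without pycrypto.
    from Crypto.Cipher import PKCS1_OAEP
    from Crypto.PublicKey import RSA

    key = RSA.importKey(private_key_pem)  # parse the recipient's private key
    cipher = PKCS1_OAEP.new(key)
    return cipher.decrypt(encrypted)      # raises ValueError on a wrong key
```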
Note: The above examples use the PKCS#1 OAEP encryption scheme. pycrypto also implements the PKCS#1 v1.5 encryption scheme, but that one is not recommended for new protocols due to known caveats.
Available Hashing Algorithms
hashlib.new requires the name of an algorithm when you call it to produce a hash object. To find out which algorithms are available in the current Python interpreter, use hashlib.algorithms_available.
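A minimal sketch of the lookup, where "sha256" is only an example name:

```python
import hashlib

# Names of all algorithms usable with hashlib.new() in this interpreter.
available = hashlib.algorithms_available

# Check before relying on a particular algorithm.
if "sha256" in available:
    hasher = hashlib.new("sha256")
```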
The returned set will vary according to platform and interpreter, so make sure you check that your algorithm is available.
There are also some algorithms that are guaranteed to be available on all platforms and interpreters; these are listed in hashlib.algorithms_guaranteed.
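As a short sketch, the guaranteed names can be inspected the same way, and they form a subset of the available ones:

```python
import hashlib

# Algorithms every platform and interpreter is expected to provide.
guaranteed = hashlib.algorithms_guaranteed

# Algorithms that happen to be available here beyond the guaranteed ones.
platform_specific = hashlib.algorithms_available - guaranteed
```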
Calculating a Message Digest
The hashlib module allows creating message digest generators (hash objects) via the new function. These objects turn an arbitrary byte string into a fixed-length digest:
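A minimal sketch, assuming SHA-256 and an arbitrary sample message:

```python
import hashlib

h = hashlib.new("sha256")
h.update(b"Nobody expects the Spanish Inquisition.")

# digest() returns the raw bytes; SHA-256 always yields 32 of them.
digest = h.digest()
```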
Note that you can call update an arbitrary number of times before calling digest, which is useful for hashing a large file chunk by chunk. You can also get the digest in hexadecimal format by using hexdigest.
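The following sketch feeds the same sample message in two pieces and reads the result with hexdigest; the split point is arbitrary.

```python
import hashlib

h = hashlib.new("sha256")
h.update(b"Nobody expects ")
h.update(b"the Spanish Inquisition.")
hex_digest = h.hexdigest()  # str of hexadecimal characters

# Hashing the whole message in one call gives the same result.
one_shot = hashlib.sha256(b"Nobody expects the Spanish Inquisition.").hexdigest()
```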
A hash is a function that converts a variable length sequence of bytes to a fixed length sequence. Hashing files can be advantageous for many reasons. Hashes can be used to check if two files are identical or verify that the contents of a file haven't been corrupted or changed.
You can use hashlib to generate a hash for a file:
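A sketch of this, where hash_file is a hypothetical helper name and a temporary file stands in for a real one:

```python
import hashlib
import os
import tempfile


def hash_file(path, algorithm="sha256"):
    """Return the hex digest of the file at path, reading it in one go."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:   # binary mode: hashlib needs bytes
        hasher.update(f.read())
    return hasher.hexdigest()


# Demonstration on a throwaway temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example file contents")
try:
    file_digest = hash_file(tmp.name)
finally:
    os.remove(tmp.name)
```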
For larger files, a buffer of fixed length can be used:
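One way to sketch the buffered variant is shown below. The helper names and the 64 KiB block size are assumptions; an in-memory stream stands in for a large file.

```python
import hashlib
import io

BLOCK_SIZE = 65536  # 64 KiB per read, an arbitrary choice


def hash_stream(stream, algorithm="sha256", block_size=BLOCK_SIZE):
    """Hash a binary stream block by block without loading it all at once."""
    hasher = hashlib.new(algorithm)
    # iter() with a sentinel keeps calling read() until it returns b"".
    for block in iter(lambda: stream.read(block_size), b""):
        hasher.update(block)
    return hasher.hexdigest()


def hash_large_file(path, algorithm="sha256"):
    """Open the file at path in binary mode and hash it in blocks."""
    with open(path, "rb") as f:
        return hash_stream(f, algorithm)


# Demonstration with data spanning several blocks.
data = b"x" * (3 * BLOCK_SIZE + 123)
stream_digest = hash_stream(io.BytesIO(data))
```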
Generating RSA signatures using pycrypto
RSA can be used to create a message signature. A valid signature can only be generated with access to the private RSA key, whereas validating it is possible with merely the corresponding public key. So as long as the other side knows your public key, they can verify that the message was signed by you and is unchanged, an approach used for email, for example. Currently, a third-party module like pycrypto is required for this functionality.
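Signing might be sketched as follows. The helper name sign_message, the PEM-encoded key argument and the choice of SHA-256 as the message hash are assumptions for illustration; the Crypto package comes from pycrypto or the API-compatible pycryptodome fork and is imported locally so the sketch loads without it.

```python
def sign_message(message, private_key_pem):
    """Return an RSA signature (bytes) over the SHA-256 hash of message."""
    # Third-party imports, kept local so the sketch loads without pycrypto.
    from Crypto.Hash import SHA256
    from Crypto.PublicKey import RSA
    from Crypto.Signature import PKCS1_v1_5

    key = RSA.importKey(private_key_pem)
    hasher = SHA256.new(message)        # the hash is signed, not the raw message
    signer = PKCS1_v1_5.new(key)        # PKCS#1 v1.5 signing algorithm
    return signer.sign(hasher)
```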
Verifying the signature works similarly but uses the public key rather than the private key:
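A sketch of the verification side under the same assumptions (hypothetical helper name, PEM-encoded key, SHA-256 as the message hash):

```python
def verify_signature(message, signature, public_key_pem):
    """Return True if signature matches message under the given public key."""
    # Third-party imports, kept local so the sketch loads without pycrypto.
    from Crypto.Hash import SHA256
    from Crypto.PublicKey import RSA
    from Crypto.Signature import PKCS1_v1_5

    key = RSA.importKey(public_key_pem)
    hasher = SHA256.new(message)        # must be the same hash used for signing
    verifier = PKCS1_v1_5.new(key)
    return bool(verifier.verify(hasher, signature))
```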
Note: The above examples use the PKCS#1 v1.5 signing algorithm, which is very common. pycrypto also implements the newer PKCS#1 PSS algorithm; replacing PKCS1_v1_5 by PKCS1_PSS in the examples should work if you want to use that one. Currently there seems to be little reason to use it, however.
Secure Password Hashing
The PBKDF2 algorithm exposed by the hashlib module can be used to perform secure password hashing. While this algorithm cannot prevent brute-force attacks that try to recover the original password from the stored hash, it makes such attacks very expensive.
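A minimal sketch follows. The sample password, the 16-byte salt, the round count of 100000 and the helper verify_password are illustrative assumptions rather than fixed recommendations.

```python
import hashlib
import hmac
import os

ROUNDS = 100000  # illustrative value; higher is slower for attackers too

password = b"correct horse battery staple"  # sample password only
salt = os.urandom(16)                        # fresh random salt per password
hashed = hashlib.pbkdf2_hmac("sha256", password, salt, ROUNDS)


def verify_password(candidate, salt, expected, rounds=ROUNDS):
    """Re-derive the hash with the stored salt and compare in constant time."""
    candidate_hash = hashlib.pbkdf2_hmac("sha256", candidate, salt, rounds)
    return hmac.compare_digest(candidate_hash, expected)
```

Both salt and hashed would be stored; verify_password(entered, salt, hashed) then checks a later login attempt.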
PBKDF2 can work with any digest algorithm; the above example uses SHA256, which is usually recommended. The random salt should be stored along with the hashed password, since you will need it again in order to compare an entered password to the stored hash. It is essential that each password is hashed with a different salt. As for the number of rounds, it is recommended to set it as high as your application can tolerate.
If you want the result in hexadecimal, you can use the binascii module.
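A short sketch of the conversion, with an illustrative password, salt size and round count:

```python
import binascii
import hashlib
import os

hashed = hashlib.pbkdf2_hmac("sha256", b"sample password", os.urandom(16), 100000)

# hexlify() returns bytes; decode it if a str is needed.
hex_hash = binascii.hexlify(hashed)
hex_text = hex_hash.decode("ascii")
```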
Symmetric encryption using pycrypto
Python's built-in crypto functionality is currently limited to hashing. Encryption requires a third-party module like pycrypto. For example, it provides the AES algorithm which is considered state of the art for symmetric encryption. The following code will encrypt a given message using a passphrase:
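A sketch of such an encryption routine is given here. The function name, the size constants, the round count and the use of CFB mode (chosen because it needs no padding) are assumptions. The Crypto package comes from pycrypto or the API-compatible pycryptodome fork and is imported locally so the sketch loads without it.

```python
import hashlib
import os

IV_SIZE = 16     # 128 bits, the AES block size
KEY_SIZE = 32    # 256 bits, selecting AES-256
SALT_SIZE = 16   # arbitrary choice for this sketch
ROUNDS = 100000  # PBKDF2 iterations, also an arbitrary choice


def encrypt(cleartext, password):
    """Return salt + ciphertext for cleartext (bytes) under password (bytes)."""
    # Third-party import, kept local so the sketch loads without pycrypto.
    from Crypto.Cipher import AES

    salt = os.urandom(SALT_SIZE)
    # Derive IV and key together from the passphrase and the random salt.
    derived = hashlib.pbkdf2_hmac("sha256", password, salt, ROUNDS,
                                  dklen=IV_SIZE + KEY_SIZE)
    iv, key = derived[:IV_SIZE], derived[IV_SIZE:]
    # The salt is not secret; it is prepended so decryption can re-derive both.
    return salt + AES.new(key, AES.MODE_CFB, iv).encrypt(cleartext)
```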
The AES algorithm takes three parameters: encryption key, initialization vector (IV) and the actual message to be encrypted. If you have a randomly generated AES key, you can use it directly and merely generate a random initialization vector. A passphrase doesn't have the right size, however, nor is it advisable to use it directly, given that it isn't truly random and thus has comparatively little entropy. Instead, we use the built-in implementation of the PBKDF2 algorithm to generate a 128-bit initialization vector and a 256-bit encryption key from the password.
Note the random salt, which is important in order to have a different initialization vector and key for each message encrypted. This ensures in particular that two equal messages won't result in identical encrypted text, but it also prevents attackers from reusing work spent guessing one passphrase on messages encrypted with another passphrase. This salt has to be stored along with the encrypted message in order to derive the same initialization vector and key for decrypting.
The following code will decrypt our message again:
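A matching decryption sketch is shown below. It assumes the encrypted data consists of a 16-byte salt followed by the ciphertext, and that the key and IV were derived with PBKDF2-HMAC-SHA256, 100000 rounds and CFB mode. All of these are assumptions that must mirror whatever the encrypting side used.

```python
import hashlib

IV_SIZE = 16     # 128 bits, the AES block size
KEY_SIZE = 32    # 256 bits, selecting AES-256
SALT_SIZE = 16   # must match the size used when encrypting
ROUNDS = 100000  # must match the round count used when encrypting


def decrypt(encrypted, password):
    """Recover the cleartext from salt + ciphertext using password (bytes)."""
    # Third-party import, kept local so the sketch loads without pycrypto.
    from Crypto.Cipher import AES

    salt, ciphertext = encrypted[:SALT_SIZE], encrypted[SALT_SIZE:]
    # Same derivation as on the encrypting side yields the same IV and key.
    derived = hashlib.pbkdf2_hmac("sha256", password, salt, ROUNDS,
                                  dklen=IV_SIZE + KEY_SIZE)
    iv, key = derived[:IV_SIZE], derived[IV_SIZE:]
    return AES.new(key, AES.MODE_CFB, iv).decrypt(ciphertext)
```

Note that a wrong passphrase does not raise an error in this sketch; it simply produces unreadable output.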