# 2 Dec — Security basics & cryptography

## Category of OS security threats

**Breach of confidentiality**- retrieving secret data
**Breach of integrity**- changing data illicitly
**Breach of availability**- deleting data illicitly - may be easier than changing data
**Theft of service**- using resources illicitly (e.g., botnets)
**Denial of service**- preventing system from legitimate use - for example, flood system with invalid requests so that legitimate requests rarely succeed

## Symmetic key encryption

- In secure location, A & B agree on a secret
`S`. - When insecure, A sends message encrypted with
`S`. - B decrypts message using
`S`.

Simple example: One-time pad, where `S` is a random
sequence of bits. Given message `M`, A encrypts by
computing bitwise exclusive-or `M` ^ `S`. B decrypts
by repeating bitwise exclusive-or (`M` ^ `S`) ^ `S`, which
matches `M` since (`M` ^ `S`) ^ `S` = `M` ^ (`S` ^ `S`) = `M` ^ 0 = `M`.
By the way, one-time pad is perfectly secure; but it is only
secure as long as secret is used *only once* (hence the
name *one-time* pad). In practice, this isn't very practical.

Other examples:

- ENIGMA (used by Germans in World War II - Polish and British gained tremendous advantage by breaking it)
- DES - was once widely used, but no longer secure
- RC4 - used in WEP wireless encryption and SSL/HTTPS. In general, RC4 is considered on the verge of being broken. The usage in WEP encryption is already broken.
- AES - recommended for today

## Public key cryptography

- B generates pair of keys: private key
`S`and public key`K`. - B announces
`K`to world. - A encrypts message using
`K`. - B decrypts message using
`S`.

Example: RSA (for Rivest-Shamir-Adleman). B starts by
generating two large prime numbers `p` and `q`,
computing their product `N` = `p` ⋅ `q`.
B then selects a number `s` that is relatively prime
to (`p` − 1) ⋅ (`q` − 1).
Then B computes `k` so that
(`k` ⋅ `s`) mod ((`p` − 1) ⋅ (`q` − 1)) = 1.
Public key is the pair (`k`, `N`), private key is `s`.

When A wants to send a message `M`, A encrypts by
computing `M`^{k} mod `N`. B can decrypt ciphertext `C`
by computing `C`^{s} mod `N`. This depends on the fact
from number theory that if
(`k` ⋅ `s`) mod ((`p` − 1) ⋅ (`q` − 1)) = 1,
then `x`^{k ⋅ s} mod (`p` ⋅ `q`) = `x` for any `x`.

Example: B chooses `p` = 11, `q` = 13, two prime numbers.
It finds that (`p` − 1) ⋅ (`q` − 1) is 120, so it chooses
77 as a number that's relatively prime to that; this is our `s`.
As it happens, 53 is the correct number to choose for `k`,
since (77 ⋅ 53) mod 120 = 1.
B announces 53 and 143 (since 11 ⋅ 13 = 143) as his public key.
Now suppose that A wants to send 89 as her message;
she computes 89^{53} mod 143, which is 111, so she sends
111 to B.
B decodes this by computing 111^{77} mod 143, which indeed
does translate back to 89.

RSA was the first example of a decent public key cryptosystem, and it is still dominant today. However, elliptic curve crytography is an alternative.

## Cryptographic hash functions

A cryptographic hash function takes a plaintext `X` of
any length and generates a fixed-length result `H`(`X`)
that should be *one-way*:
Given `H`(`X`) but not `X`, it should be
impossible to determine `X` aside from simply
trying many values until we find a `Y` for which
`H`(`Y`) = `H`(`X`).
Also, a cryptographic hash function should
avoid collisions: Given two values `X` and `Y`,
`H`(`X`) and `H`(`Y`) should be completely different
values if `X` and `Y` are similar,
and they should be equal with as small a probability as
possible.

Several cryptographic hash functions have been proposed over the years. Two older hash functions still being used are MD5 and SHA-1. However, MD5 has been shown to be prone to collisions, and people have found weaknesses in SHA-1 leading to concern about whether it is prone to collisions; so neither is recommended. Recommended hash functions include SHA-256 and SHA-512.

## An aside on passwords

Hash functions are particularly important for storing passwords.

Tips for storing passwords:

Rather than store passwords directly — a clear security problem — we store the cryptographic hash of each password. When a user enters a password, we hash it and see whether it matches the hashed version that is stored.

Use a

*slow*hash function. Some cryptographic hash functions (MD5) are designed to be fast; but password crackers work by trying a variety of possible passwords, and a fast hash function allows the cracker to try passwords more rapidly.Use

**salt**with the passwords. “Salt” is just a random sequence of characters that is stored alongside the password: That is, instead of storing the hash of the password, we store the salt and the hash of the result of appending the password to the salt. When a user enters a password, we read the salt, append the two, compute the hash of the result, and compare the hashed value to the saved hash value. Salt slows password crackers markedly: Without salt, a cracker can go through a list of popular passwords, doing one hash on each and then looking through the list to find a match; with the salt, though, the cracker has to re-hash the password for every salt value.

Tips for using passwords:

Use passwords of at least 10 randomly chosen characters (I suggest 12). Good password crackers today can nearly try all possible 8-character passwords. And they can easily crank through the 10,000,000 most popular passwords, so a random sequence will be best at defeating them.

Use a different password for each purpose. This way, if somebody acquires one password, then they have only broken one.

To remember the variety of passwords, use a password manager — a program that stores an encrypted database of your passwords. (Keepass and Password Safe are two decent alternatives.) Somebody is unlikely to get this password database without explicit access to your computer, but it makes it much easier to deal with the variety of passwords you need.

## Signatures

- B generates pair of keys: private key
`S`and public key`K`. - B announces
`K`to world. - B sends message
`M`followed by a*certificate*computed using`S`. - A checks that the received message
`M`actually came from B by verifying the certificate using`K`.

One can again use RSA to do this:
B generates the key exactly as before. B computes the certificate
by first hashing the message to get `H`(`M`) and
then computing (`H`(`M`))^{s} mod `N`.
A verifies the certificate by computing `C`^{k} mod `N` and
confirming that the result matches `H`(`M`) for the message A
receives.

## Signatures application: TLS/HTTPS

Applying certificates requires avoiding the man-in-the-middle attack: Person E may intercept all communication between A and B, using information to decode. For example, when A sends its public key to B, E may intercept that and instead send its own public key to B. Consequently, when E sends messages to B, B thinks they come from A.

The Web uses TLS (on which HTTPS is based); to get around the man-in-the-middle attack, it depends on &lqduo;certificate authorities” (CAs), trusted providers of authenticity.

- Site generates public and private key, pays CA to create a certificate for the site's public key using the CA's private key.
- All users' browsers are distributed including the public keys for several well-known CAs.
- When browser sends request to the server, the server sends the site's public key to the browser along with the CA-generated certificate.
- Browser verifies the certificate using its built-in public key for that CA.
- Browser generates a symmetric key, sending it to the server encrypted by the site's public key.
- Server decrypts the symmetric key using the site's private key.
- All subsequent communication between browser and server use the symmetric key.

(Note that the CA is only involved in the setup, not in each individual browser request.)

## Other forms of cryptography

There are other niches for specialized cryptographic systems: steganography (hiding a message in some other information, like a picture) and secure voting.