
Network security

Most people think of security and cryptography as something used by the military during wars to communicate without the enemy knowing or by governments to keep their secrets.

During the Second World War the Germans used their Enigma machines to send messages. They believed those messages were unbreakable, since Enigma applied a different substitution to each letter of a message. However, a group of British codebreakers managed to crack the Enigma codes with the most advanced computing machines of the time and read Hitler's mail, which helped the Allies win the war in Europe and Africa. During the same war, on the other side of the globe, the United States was breaking the codes of Japan with its cryptanalysis team at Pearl Harbor. Until the end of the war, the Japanese did not believe their codes could be broken. The Japanese system represented each word by a randomly assigned group of five digits. For example, 78934 might stand for "Tokyo", and 78935 for "Suicide". Every time the Japanese changed their codes, it might be months before the US could read them again. The United States managed to read about 25% of the Japanese messages, and that was enough.

Two thousand years earlier, Julius Caesar used a system of cryptography on the messages he sent to his troops. He rotated each letter of the message by a fixed number of letters; for example, the word ATTACK would become CVVCEM (rotating each letter 2 letters ahead). In those days this system was enough against the semi-literate barbarian spies of the enemy. Today, any 10-year-old could crack Julius Caesar's messages; after all, there are only 26 possibilities to check...
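The rotation described above, and the reason it is so easy to break, can be sketched in a few lines of Python. This is only an illustration: the function names caesar and brute_force are made up for this example, and only the letters A to Z are rotated.

```python
import string

ALPHABET = string.ascii_uppercase

def caesar(text: str, shift: int) -> str:
    """Rotate each letter A-Z forward by `shift` positions; leave other characters alone."""
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

def brute_force(ciphertext: str) -> list[str]:
    """Undo every possible shift; a reader then picks the candidate that makes sense."""
    return [caesar(ciphertext, -shift) for shift in range(26)]

print(caesar("ATTACK", 2))        # CVVCEM
print(brute_force("CVVCEM")[2])   # ATTACK
```

Because the whole key is a single number between 0 and 25, listing every candidate plaintext takes no real effort, which is the weakness the text points out.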

Security in a Computer Network

Security in a network can take many different forms. There are basic mechanisms that most networks provide, like access control and passwords. Some organizations need better protection of their data and need more sophisticated means, like the ability to encrypt messages so that only the receiver will be able to read them (it is quite easy to read messages sent over a normal network). What we may want in our local network:

Control of Access to computers and information

Many operating systems use a password mechanism to control access to the computer. Each user has a login name and a password, and whenever he wishes to enter the computer, he needs to enter his password. When accessing the computer from a terminal (through the network), it's not a good idea to transfer the password as-is, because it is possible to wiretap the network. We might need to encrypt the password, or find another way to use it safely.

When the information is extremely sensitive, we might simply not allow access to it through the network. It is always easier to break through network security than to break into an isolated computer!

Data Protection

Sometimes we need to protect a certain file, so that it won't be available to all. Some operating systems provide such protection (in UNIX, a user can decide who can view a file, who can write to it, and who can execute it), and one could always use a program to encrypt the file. When encrypting the file, we save a scrambled version of it, and then only those who are allowed to read the file can decrypt (un-scramble) it. The simplest use of encryption needs a key. The encryption program produces a new file, given the original file and the key. It looks like this:

Encrypted File = Encrypt (Original File, Key).

When we want to open the Encrypted file, we need the Key again :

Original File = Decrypt ( Encrypted File, Key)
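The two formulas above can be sketched in Python as follows. This is a toy under stated assumptions, not a cipher to rely on: the keystream construction (SHA-256 over the key and a counter) and the function names are choices made for this illustration only, and the text does not prescribe any particular algorithm.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key by hashing key + counter."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def encrypt(original: bytes, key: bytes) -> bytes:
    """Encrypted File = Encrypt(Original File, Key): XOR the data with the keystream."""
    return bytes(a ^ b for a, b in zip(original, keystream(key, len(original))))

def decrypt(encrypted: bytes, key: bytes) -> bytes:
    """Original File = Decrypt(Encrypted File, Key): XOR with the same keystream undoes it."""
    return encrypt(encrypted, key)

original = b"quarterly payroll data"
scrambled = encrypt(original, b"shared key")
print(decrypt(scrambled, b"shared key") == original)
```

The point to notice is that the same key appears on both sides, so anyone who is to read the file must already hold it.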

Mail protection

When we send a mail message (or a file through the network, for that matter) we need to be sure that only the intended receiver will be able to read it. Because most networks cannot guarantee that, messages are usually encrypted. But the encryption scheme described before is not suitable here. We need to use a key, but we cannot transmit the key to the receiver, because the transmission isn't safe... So the receiver would have to know the key in advance in order to decrypt the message! A better method is Public Key Encryption. It works like this: everyone has a public key, which is known to all. When we want to send someone an encrypted message, we use his public key. In addition to the public key, everyone also has a private key, known only to himself. A message encrypted using the public key can only be decrypted using the private key! So we can send someone a message and be sure of its safety, without needing to agree upon a key. It works like this:

Encrypted Message = Encrypt ( Original Message, Receiver Public Key)

When the receiver gets the message, he opens it, using his Private key :

Original Message = Decrypt (Encrypted Message, Receiver Private Key)
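As a rough illustration of the two formulas above, here is textbook RSA with deliberately tiny numbers. The primes 61 and 53 and the exponent 17 are arbitrary choices for this sketch; real keys are hundreds of digits long and real systems add padding, so this shows only the principle.

```python
# Toy key pair for the receiver (tiny primes chosen only for readability).
p, q = 61, 53
n = p * q                  # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent: the public key is (n, e)
d = pow(e, -1, phi)        # private exponent: the private key is (n, d)

def encrypt(message: int, public_key: tuple[int, int]) -> int:
    """Encrypted Message = Encrypt(Original Message, Receiver Public Key)."""
    modulus, exponent = public_key
    return pow(message, exponent, modulus)

def decrypt(ciphertext: int, private_key: tuple[int, int]) -> int:
    """Original Message = Decrypt(Encrypted Message, Receiver Private Key)."""
    modulus, exponent = private_key
    return pow(ciphertext, exponent, modulus)

receiver_public = (n, e)
receiver_private = (n, d)

original = 65              # the message, encoded as a number smaller than n
encrypted = encrypt(original, receiver_public)
print(decrypt(encrypted, receiver_private))   # 65
```

Only (n, e) ever needs to be published; the sender never learns d, yet the receiver can still undo the encryption.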

PGP (Pretty Good Privacy) is a high-security cryptographic application. PGP allows you to exchange information (whether files or messages) with privacy, authentication and convenience. This means that only the intended receiver will be able to access the information, and that the identity of the sender is known. PGP is very easy to use, as there is no need for a dedicated secure channel.

PGP uses two algorithms. One is the Rivest-Shamir-Adleman (RSA) public key cryptosystem and the second is the IDEA algorithm. An IDEA key is generated for the message and used to encrypt it. IDEA is a conventional cryptosystem, so the same key will be used to decrypt the message. RSA is then used to encrypt the IDEA key, using the recipient's public key. When the recipient receives the message, he uses his private RSA key to decrypt the IDEA key, which is then used to decrypt the entire message.
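The two-step scheme described above can be sketched as follows. This is not PGP's actual code: a hash-based XOR stream stands in for IDEA, the RSA key uses tiny illustrative primes, and all names (stream_cipher, pgp_style_encrypt, pgp_style_decrypt) are invented for this example.

```python
import hashlib
import secrets

# Recipient's toy RSA key pair (tiny primes, for illustration only).
N, E = 61 * 53, 17            # public key
D = pow(E, -1, 60 * 52)       # private key

def stream_cipher(data: bytes, session_key: int) -> bytes:
    """Conventional (same-key) cipher standing in for IDEA: XOR with a hashed keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{session_key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def pgp_style_encrypt(message: bytes) -> tuple[int, bytes]:
    session_key = secrets.randbelow(N - 2) + 2    # fresh key for this message only
    body = stream_cipher(message, session_key)    # bulk data: conventional cipher
    wrapped_key = pow(session_key, E, N)          # session key: recipient's public key
    return wrapped_key, body

def pgp_style_decrypt(wrapped_key: int, body: bytes) -> bytes:
    session_key = pow(wrapped_key, D, N)          # private key recovers the session key
    return stream_cipher(body, session_key)

wrapped, body = pgp_style_encrypt(b"Meet at noon.")
print(pgp_style_decrypt(wrapped, body))           # b'Meet at noon.'
```

The design choice worth noting is that the slow public-key operation is applied only to the short session key, while the fast conventional cipher handles the message itself.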

Authentication

Another problem with networks is that we are never sure who sent us a message. It's very easy to write a message pretending to be someone else. A technique called a Digital Signature was developed for that. The sender 'signs' his message, using a key that only he knows. The receiver can then decrypt the signature, just like a regular encrypted message. Again, we usually use a private/public key combination: a signature can only be created using a private key, and can be decrypted using the matching public key. In that way, the receiver can be sure who sent the message. To ensure both authentication and privacy, we can sign the message and then encrypt it. The receiver first decrypts the message using his own private key, and then authenticates it by checking the signature with the public key of the sender.
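A minimal sketch of signing and checking, again using textbook RSA with tiny illustrative primes. Reducing a SHA-256 digest modulo N is a shortcut taken only so the toy key can handle it; the names digest, sign and verify are assumptions of this example, not part of any real library.

```python
import hashlib

# Sender's toy RSA key pair (tiny primes; real keys are far larger).
N, E = 61 * 53, 17            # public key, known to everyone
D = pow(E, -1, 60 * 52)       # private key, known only to the sender

def digest(message: bytes) -> int:
    """Reduce the message to a number smaller than N so the toy key can sign it."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can compute this value."""
    return pow(digest(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Anyone can check the signature using the public exponent E."""
    return pow(signature, E, N) == digest(message)

msg = b"Transfer 100 to Bob"
sig = sign(msg)
print(verify(msg, sig))              # True
print(verify(msg, (sig + 1) % N))    # False
```

Note the reversal compared with encryption: here the private key is used first and the public key second, which is what lets any receiver check who the sender was.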

Host Security

The Characteristics of a Secure System and the TCSEC

The Department of Defence Trusted Computer System Evaluation Criteria (TCSEC), also known as the "Orange Book", sets out simple but fundamental computer security requirements. The table below lists the possible classes of evaluation within the TCSEC.



Class   Requirements
A1      Verified Design
B3      Security Domains
B2      Structured Protection
B1      Labelled Security Protection
C2      Controlled Access Protection
C1      Discretionary Security Protection
D       Minimal Protection



The TCSEC covers trusted systems from Class D (no trust), to class A1 (as trustworthy as the state-of-the-art allows).

Most major operating systems either contain or are easily modifiable to contain the features which are described below. A secure network system has many characteristics with the baseline measurement being the C2-level of security. The C2-level security is defined in the TCSEC and its requirements are :

1. Discretionary Access Control: Discretionary access controls (DAC) are controls which allow a given user to indicate who may or may not view a file that the given user owns. In its simplest form, DAC allows the user to restrict read, write and execute permissions to specific individuals listed in the access control list (ACL).
2. Object Reuse: Object reuse is the requirement that whenever an object within the Trusted Computing Base (TCB) is no longer required by a given subject, its contents are erased by the TCB before it is returned to the free object area. This can be accomplished by overwriting the contents a given number of times with a predefined pattern, followed by a random pattern. For a system to achieve C2, it must perform this for objects returned to memory as well as objects returned to disk.
3. Identification and Authentication: Identification and authentication (I&A) is the requirement that the user identify himself or herself to the TCB. Once the user is identified, the system will request information in order to authenticate the user's identity. The simplest form of I&A is the login ID and password pair.
4. Auditing: Audit is the requirement that the system maintain, and protect from modification, an audit trail of all access to the objects the TCB protects. Access to the audit records will be restricted by the TCB to specific system administrative users.
5. Operational Assurance: This is divided into two aspects, system architecture and system integrity. With regard to system architecture, the system must be able to isolate the resources that are required to be protected, so that they can be subject to access control and audit, and it must be able to protect itself from external interference or tampering. To attain system integrity, the TCB must contain hardware/software elements which can be used periodically to validate the correct operation of the hardware.
6. Life-Cycle Assurance: Life-cycle assurance is composed of various tests to ensure that the TCB functions as indicated in the documentation and that there are no obvious methods of bypassing or defeating the security mechanisms of the TCB.
7. Documentation

The Security of Windows NT

Windows NT offers features that meet the practical security requirements of business. For every-day users, NT restricts who uses the computer and controls what each authorized user does. For administrators, NT provides tracking and auditing capabilities, enabling network managers to monitor who attempts to use a particular computer and what each user attempts to do.

Networks of computers are becoming increasingly important to most businesses. Networks are used to share key information and resources among many users throughout organizations of various sizes. Frequently, the information stored on a network is sensitive information that is intended for use only by specific individuals. Therefore, the ability of these networks to prevent unauthorized access to information is paramount to the security and competitiveness of an organization. The NT operating system can maintain the security of these networks.

NT comes in two configurations: NT Server, which is shared among many users over a network, and NT Workstation, which is intended for people logging on to the computer itself. Both are used in many universities and offices. Although they are similar, there are security differences which make NT Workstation more vulnerable to attack.

Windows NT Security Architecture

Windows NT was designed to be a portable operating system with minimal dependence on a processor's unique hardware features. However, all NT implementations rely on the processor to provide two execution modes. These are kernel and user. Kernel mode is used by the privileged operating system code to protect system data. Code running in this mode communicates directly with the hardware whereas code executing in the user mode must use operating system calls to modify system data and access the hardware. NT consists of an executive and several protected subsystems.

Executive

The executive lies on top of the hardware abstraction layer (HAL) and operates in kernel mode. It consists of several components. Each of them implements two sets of functions. One is the system services, which the environment subsystems and other executive components can call. The other is the internal routines, which are available only to the executive components.

Executive components are independent of each other. Each of them creates and manages a system data structure. The description of each component is as follows :

Protected Subsystems

Each protected subsystem provides an Application Programmer Interface (API) that programs can call. When an application calls an API function, a message is sent to the corresponding server by the NT executive. The server performs the function, then returns the result by sending a message to the caller. NT has two types of protected subsystems: environment subsystems and integral subsystems. An environment subsystem is a user-mode server that provides the API of a particular operating system. These components interact with users directly. The integral subsystems are servers that perform some important operating system functions. The security subsystem and the NT networking software are integral subsystems. The security subsystem runs in user mode and records the security policies on the local computer. In addition, the security subsystem maintains a database of information about user accounts, including account names, passwords, group information and the special privileges that each user owns. It also accepts user logons.

User processes execute only in user mode and must make requests to the subsystems in order to obtain access to the computer's facilities.

The security components of NT consist of two parts that execute in user mode. These are WinLogon and a protected server called the Local Security Authority (LSA). The LSA relies on the Security Accounts Manager (SAM) and two executive components, the Object Manager and the Security Reference Monitor (SRM), to determine access privileges and obtain system resources. The diagram in Figure 10.1 shows the security architecture of the NT operating system including the protected subsystems and the executive.

  
Figure 10.1: Architecture of NT



Object Security of Windows NT

Sharable resources, such as processes, threads, files, and shared memory, are implemented as objects in the NT executive. Hence, the NT object system serves as a check point for resource security. Whenever a process opens a handle to an NT object, NT security is involved. Each object has an access control list (ACL), which contains information about which processes can access the object and what they can do with it. When a process opens a handle to an object, it specifies the operations that it wants to perform. The security subsystem then checks whether the process is allowed to perform those operations. Because a process must open a handle to an object before it can do anything with it, its behaviour is always checked by the security subsystem. Since no process can bypass security checking, the NT object manager becomes a security gateway.

Access Tokens

An access token is essentially NT's identification card for a user. Every process, and potentially every thread, has a related access token which identifies the user account that the process is running under, together with a number of default values to be used when creating new objects. An access token contains a good deal of security information about the user, including:

Access tokens are created through login services (e.g. the login dialogue or a network share login), through impersonation (temporary assumption of a user's security attributes by a service's thread), or through the NT login.

Access Control Lists (ACL)

All objects, including files, threads, events and even access tokens, are assigned security descriptors when they are created. The security descriptor consists mainly of an Access Control List (ACL), which is a list of protections that are applied to the object. The owner of the object has discretionary access control over the object and can change the object's ACL.

Each entry in an ACL is called an Access Control Entry (ACE). Each ACE contains a security ID and a set of access rights. A user with a matching security ID has the listed rights to the object. An ACE can also be created for a group security ID. The maximum rights granted by an ACL are the union of the rights in all of its ACEs.

The Access Control Entry (ACE)

This is the most basic unit of permission in NT. An ACE has two forms, access allowed and access denied, which are used to grant or refuse access respectively. An ACE contains a SID, which indicates the user or group of users the permission applies to, and a permission mask that indicates exactly what kind of permission is being granted or refused.

The permission mask is broken into parts: permissions specific to a particular object type (called specific rights) and permissions that apply to all objects (standard and generic rights). The standard rights, which control the accessibility and exclusivity of all objects, are summarized in the table below.



Right          Description
DELETE         The ability to delete an object.
READ_CONTROL   The ability to inspect the object's security information.
WRITE_DAC      The ability to change the object's list of permissions.
WRITE_OWNER    The ability to change the owner of the object.
SYNCHRONIZE    The ability to synchronize on (wait for) the object, used to enforce mutual exclusion.



A single ACE is seldom enough to fully describe the accessibility of an object to different users or groups on the system.
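To make the interplay of several ACEs concrete, here is a simplified model of an ACL check in Python. It is a sketch of the idea only, not NT's actual access-check routine: the ACE class, the access_check function and the SID labels such as "S-staff" are hypothetical, while the numeric values of the standard rights are the ones defined in the Windows SDK header winnt.h.

```python
from dataclasses import dataclass

# Standard-rights bit values as defined in winnt.h.
DELETE       = 0x00010000
READ_CONTROL = 0x00020000
WRITE_DAC    = 0x00040000
WRITE_OWNER  = 0x00080000
SYNCHRONIZE  = 0x00100000

@dataclass(frozen=True)
class ACE:
    allow: bool   # True = access-allowed ACE, False = access-denied ACE
    sid: str      # user or group the entry applies to (hypothetical labels)
    mask: int     # rights granted or refused

def access_check(acl: list[ACE], token_sids: set[str], desired: int) -> bool:
    """Walk the ACL in order; a deny on a still-ungranted desired right refuses access."""
    granted = 0
    for ace in acl:
        if ace.sid not in token_sids:
            continue
        if not ace.allow and ace.mask & desired & ~granted:
            return False
        if ace.allow:
            granted |= ace.mask & desired
        if granted == desired:
            return True
    return False

acl = [
    ACE(False, "S-guests", DELETE),
    ACE(True, "S-staff", READ_CONTROL | DELETE),
    ACE(True, "S-alice", WRITE_DAC),
]
alice = {"S-alice", "S-staff"}
print(access_check(acl, alice, READ_CONTROL | WRITE_DAC))   # True
print(access_check(acl, {"S-guests", "S-staff"}, DELETE))   # False
```

In this model Alice's rights come from two different entries, one for her own SID and one for a group she belongs to, which is why a single ACE is rarely sufficient.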

The Security Descriptor(SD)

While an ACL describes the accessibility of an object, it does not completely describe the security attributes of an object. The complete set of security attributes is kept in an object called a security descriptor (SD). The SD contains

In addition to an ACL, an object's security descriptor also contains a field for Auditing. It is used to refer to the security system's audit capability. Whenever a user tries to change files in the directory, NT makes an entry in the audit log. Regularly reviewing the audit log can help reduce the risk of computer tampering.

The Security Identifier (SID)

This is the most basic object in NT. An SID is a unique identifier used to identify a user or a group of users that exist on a particular computer or in a particular domain.

Each of these objects builds upon the others in order to provide certain security features.

Security Procedures

Identification and Authentication

Identification and authentication are the most fundamental security features of an operating system. To log on to an NT server or workstation, the key combination Ctrl-Alt-Del is pressed. This implements a feature called a "Trusted Path", which is a requirement of the "Orange Book". This requirement assures the user that the prompt for username and password comes from the operating system and not from a program trying to capture passwords. The user must identify himself or herself with a username and authenticate with a password.

The sequence for authorizing is as follows :

Every user belongs to one or more groups, and a few special groups are built in. Each group has a name and a set of user rights. Users have the rights of all the groups they belong to, plus any special rights that they are given, for example by the system administrator.

Auditing

Because no system is absolutely secure, administrators need to be able to determine whether their system has been the target of an attack, or has been exposed to the misadventures of a non-malicious user. For NT, the auditing policy is set and controlled with the User Manager. The User Manager provides an easy interface for specifying the level of auditing. Because the auditing process contributes to the system overhead, the amount of audit information to be captured has to be carefully weighed against the overall requirements.

NT divides audited user actions into several categories including file and object access, logging on and off and exercise of user rights. Actions within each category can be audited for success, failure or both.

Audit records consist of three different logs: system events, application events and security events. Each event record is time-stamped and both the process and the user attempting the operation are identified. A log is an object like other resources controlled by NT and therefore it has its own access control list associated with it.

Object Reuse

Underlying all of NT's objects are physical RAM and disk space, both of which are continually being recycled for new processes and files. Object Reuse is a security requirement that prevents a user from accessing the remains of another user's work particularly when the operating system creates new objects from previously used resources.

When NT creates a new object for a user, the object is initially empty. For file objects, NT prevents the user from reading past a file's logical end-of-file marker and thus possibly peeking at data from an erased file. Also, if a user has the right to extend a file, NT overwrites that area on disk before granting access to it.

When a program allocates memory, NT first clears a section of RAM that a newly created memory object will occupy. This prevents a user from probing random locations in RAM, searching for the vestiges of documents or file buffers that might contain confidential information.

Internet Security Issues

The initial release of NT had some gaping security holes which would allow anyone on the Internet to easily delete any of the files on your computer to which that Web server had delete access. Not only that, but their actions wouldn't even have been logged.

NT supports several types of networking:

If you attach such a workstation to the Internet, anyone can connect to any shared directories on that machine, log in as Guest and wreak havoc with the file system of that computer. Or they can connect to the registry on that machine (which is always shared, as described below) and mess it up.

Remote Registry Access

NT installs by default with Everyone given write access to much of the registry. In NT 3.51, this was a major problem due to the remote registry access feature of the Registry Editor. Any user could manipulate the registry on any server or workstation on which this user has an account, or on which the guest account is enabled. NT 4.0 fixed this problem by introducing the following registry key:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurePipeServers\winreg

This key is present by default on NT 4.0 server. It is NOT present on NT 4.0 workstation, but can be added. The presence of this key disables remote registry access, other than to administrators. Another way to prevent remote registry access is to remove the permission for Everyone at the root of the HKEY_LOCAL_MACHINE hive. This is the appropriate way to protect the registry for NT 3.51.

NT programs use remote procedure calls (RPCs) to allow various system services to be performed on a remote computer. For example, the ability to modify the registry on remote computers is implemented using remote procedure calls. There are mechanisms in NT for the RPC server to learn the username of the RPC client and then to limit the functions it will perform based on that username.

FTP Access

When using the FTP server that came with NT 3.51, the home directory you specify for the FTP service is only the initial current directory. FTP users can change their current directory. So if you specify a home directory of c:\ftp, any FTP user can change to c:\ and thence to any sub-directory under c:\. Normal NTFS permissions will apply, of course, to whatever account the FTP user is running under. If you don't want FTP users to be able to see the root directory of your primary partition, you should create a separate partition for FTP and then configure FTP so that it can only read and/or write to that partition. The IIS FTP server in NT 4.0 does not have this problem.

Analysis of the Security of UNIX

The UNIX operating system, although now in widespread use in environments concerned about security, was not really designed with security in mind. This does not mean that UNIX does not provide any security mechanisms; indeed, several very good ones are available. However, most "out of the box" installation procedures from companies such as Sun Microsystems still install the operating system in much the same way as it was installed 15 years ago, with little or no security enabled.

UNIX was originally designed by programmers for use by other programmers. The environment in which it was used was one of open co-operation, not one of privacy. Programmers typically collaborated with each other on projects, and hence preferred to be able to share their files with each other without having to climb over security hurdles. Because the first sites outside of Bell Laboratories to install UNIX were university research laboratories, where a similar environment existed, no real need for greater security was seen until some time later.

In the early 1980s, many universities began to move their UNIX systems out of the research laboratories and into the computer centers, allowing (or forcing) the user population as a whole to use this new and wonderful system. Many businesses and government sites began to install UNIX systems as well, particularly as desktop workstations became more powerful and affordable. Thus, the UNIX operating system is no longer being used only in environments where open collaboration is the goal.

To complicate matters, new features have been added to UNIX over the years, making security even more difficult to control. Perhaps the most problematic features are those relating to networking : remote login, remote command execution, network file systems, diskless workstations, and electronic mail. All of these features have increased the utility and usability of UNIX by untold amounts. However, these same features, along with the widespread connection of UNIX systems to the Internet and other networks, have opened up many new areas of vulnerability to unauthorized abuse of the system.

Like most multi-user operating systems, UNIX offers a wide variety of security measures. However, because of its roots as a research platform and its development both inside AT&T Bell Laboratories and in academia, UNIX has become synonymous with poor security. To make UNIX compatible with the C2 level of security, modifications and enhancements have to be integrated into it. The following presents the security of a typical UNIX operating system.

UNIX's Inherent Security

The security features commonly found on the UNIX OS are described below.

Passwords

Under the UNIX OS, all passwords are usually stored in the /etc/passwd file. These passwords are encrypted with a modified version of the Data Encryption Standard (DES), so that the encryption is one-way. When a user logs in, the password entered at the password prompt is encrypted using the modified DES routine and compared against the encrypted password stored in the /etc/passwd file. If the two match, the user is allowed to log into the system.

To reduce the effectiveness of key search, a defence called "password salting" is used. This method uses a random 12-bit number, the salt, which is appended to the password when the password is first entered into the system. The resulting string (the 12-bit number and the password) is encrypted and stored in the password file. When a user attempts to log in, the salt stored in the password file is appended to the supplied password, and the result is encrypted and compared to the stored encrypted string. If the strings match, the user is allowed access to the OS. The use of salts increases the complexity of breaking any given password by a factor of 2^12 (4096).
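The salting procedure just described can be sketched as follows. This is an illustration under stated assumptions: SHA-256 stands in for the modified DES routine, and the function names one_way, make_entry and check_login are invented for this example rather than taken from any UNIX library.

```python
import hashlib
import secrets

def one_way(salt: int, password: str) -> str:
    """Stand-in for the modified-DES crypt: any one-way function shows the idea."""
    return hashlib.sha256(f"{salt:03x}{password}".encode()).hexdigest()

def make_entry(password: str) -> tuple[int, str]:
    """Pick a random 12-bit salt and keep it next to the one-way result."""
    salt = secrets.randbelow(2 ** 12)      # one of 4096 possible values
    return salt, one_way(salt, password)

def check_login(entry: tuple[int, str], supplied: str) -> bool:
    """Combine the supplied password with the stored salt, re-hash and compare."""
    salt, stored = entry
    return secrets.compare_digest(one_way(salt, supplied), stored)

entry = make_entry("correct horse")
print(check_login(entry, "correct horse"))   # True
print(check_login(entry, "wrong guess"))     # False
```

Because the same password can be stored in 4096 different ways, an attacker can no longer encrypt a dictionary once and compare it against every entry in the file.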

To help ensure the quality of passwords chosen, many UNIX OS's,

Discretionary Access Control

All UNIX systems contain a form of Discretionary Access Control (DAC). These controls are of the form :

-rwxrwxrwx user group filename

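where the leading - is replaced by one of d (directory), c (character device), b (block device), s (special) or - (a regular file), and r, w, x are read, write and execute permission respectively. The three rwx groups apply, in order, to the file's owner, its group and all other users.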

Each file has a filename, an owner and an associated group. In conjunction with the standard access controls available on UNIX, there is a feature called umask which allows the user to define a default file and directory creation mode.
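The permission string and the effect of umask can be illustrated with Python's standard stat module. The helper names describe and apply_umask are made up for this sketch, and the umask value 022 is just a common example, not a recommendation from the text.

```python
import stat

def describe(mode: int) -> str:
    """Render a numeric file mode the way `ls -l` does."""
    return stat.filemode(mode)

def apply_umask(requested: int, umask: int) -> int:
    """A new file gets the requested permission bits minus any bits set in the umask."""
    return requested & ~umask

# Programs usually request 666 for new files and 777 for new directories.
file_mode = stat.S_IFREG | apply_umask(0o666, 0o022)
dir_mode = stat.S_IFDIR | apply_umask(0o777, 0o022)

print(describe(file_mode))   # -rw-r--r--
print(describe(dir_mode))    # drwxr-xr-x
```

With a umask of 022, write permission is removed for the group and for others, so new files are readable by everyone but writable only by their owner.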

Auditing

UNIX audits many things. Common audit items include :

Most of this data is available to the normal user. Last login time is normally printed at login so the user can see if this is correct. A discrepancy may force the user to change his/her password. The audit records concerning the root are never accessible to anyone but the system administrator.

Networking

Although a system which is connected to a network cannot be considered "trusted" in the "Orange Book" sense, most modern UNIX OS's are indeed networked. The most common network configuration on a UNIX OS is the NFS system pioneered by Sun. This system allows for transparent connection of a wide assortment of hardware. Networked UNIX systems also allow a user to log in from one machine to another via a utility called rlogin without having to specify a password. This feature is known as trusted users/trusted hosts. This configuration is one of the reasons why the Internet Worm of November 1988 spread so quickly. The problem can be eliminated by making the user log in every time he/she wishes to connect to another computer.

UUCP, E-Mail and FTP

UNIX to UNIX Copy (uucp) is a means of transferring files between two machines. uucp is also used to carry e-mail and other information.

FTP is a file transfer protocol. It allows for a special type of file transfer known as "anonymous ftp". If this is set up correctly, users will only be allowed access to specific files. However, if it is set up incorrectly, users may have access to the whole file system. To maintain the tightest security, ftp capability should be removed.

The Network File System

The Network File System (NFS) is designed to allow several hosts to share files over the network. One of the most common uses of NFS is to allow diskless workstations to be installed in offices, while keeping all disk storage in a central location. As distributed by Sun, NFS has no security features enabled. This means that any host on the Internet may access your files via NFS, regardless of whether you trust it or not. Fortunately, there are several easy ways to make NFS more secure. The more commonly used methods are described in this section, and these can be used to make your files quite secure from unauthorized access via NFS.

Making a UNIX System C2

As already mentioned, to make UNIX C2 compliant it is necessary to change some of the present features of UNIX. The necessary steps to do this are explained as follows :

Passwords

In most UNIX systems, the user passwords are stored encrypted in the /etc/passwd file. To attain C2 would require the use of a "shadow" file to store the encrypted passwords as well as any password aging information.

Object Reuse

Whenever an object that the system manipulates is returned to a common pool of similar objects, the returned object must have all of its internal information removed. To do this, one must successively write binary patterns over the object followed by a random pattern.
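The overwrite procedure can be sketched for an in-memory buffer as follows. The function name scrub, the particular fixed patterns and the injectable random_bytes parameter are choices made for this illustration. A real TCB would do this inside the kernel for memory pages and disk blocks, and a high-level language like Python cannot guarantee that no other copy of the data exists.

```python
import os
from typing import Callable

def scrub(buf: bytearray,
          patterns: tuple[int, ...] = (0x00, 0xFF, 0xAA),
          random_bytes: Callable[[int], bytes] = os.urandom) -> None:
    """Overwrite buf in place with each fixed pattern in turn, then with random data."""
    size = len(buf)
    for byte in patterns:
        buf[:] = bytes([byte]) * size
    buf[:] = random_bytes(size)

secret = bytearray(b"alice's salary: 90000")
scrub(secret)          # the buffer keeps its length but not its contents
print(len(secret))
```

Only after such a pass would the object be handed back to the common pool, so that the next user to receive it finds no trace of the previous contents.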

Discretionary Access Controls

The National Computer Security Centre (NCSC) has stated that the access controls that currently exist within UNIX are sufficient for the C2-level. Guidelines for higher levels of classification are available and reflect requirements of the B3 class.

Auditing

To ensure an appropriate level of trust in the access controls, audit considerations must be addressed. The controls may require an increased level of protection from attack. This requirement is relative to the existing controls of the UNIX system. Most UNIX systems currently perform some level of audit. However, to achieve a C2 rating, a few additions are required. To achieve a C2 level of trust, the following changes must occur.

Operational Assurance

The most straightforward method of achieving operational assurance is to use a hardware platform which has protected states. As already stated in Discretionary Access Controls, certain hardware comes equipped to protect the OS from external interference or tampering. The use of this type of hardware, with the addition of a memory manager, allows for the simple creation of a domain for the TCB's execution.

Documentation

There are three manuals that must be supplied by the manufacturer. These three manuals are the detailed user manual, the detailed administrator manual, and the detailed design documents. Insufficient or incorrect documentation must be corrected before a product can be properly evaluated.

UNIX Versus Windows NT

UNIX and NT are strikingly similar in design and capabilities, but their differences are significant. Both support advanced file systems with long filenames and both offer powerful peer sharing and network services. UNIX still has the edge in distributed resources, with the ability to share applications, files, printers, modems and remote procedures across WAN and LAN connections. However, NT's native file-sharing is generally faster and more efficient than NFS. It also serves files and printers to Windows, Win 95 and Macintosh clients without requiring optional software.

UNIX currently has a market lock on serving applications. If a user can get across to a UNIX host through any WAN or LAN network connection, the user can tap into all its resources. NT lacks the native ability to share graphical applications across network connections, a failure that also makes it harder to do remote administration. This is one of NT's serious shortcomings.

Both OS's support remote procedure calls (RPC), and object-sharing standards are rapidly evolving for both. UNIX is a better overall application server than NT. No UNIX implementation can rival NT's ease of set-up and management. NFS is the UNIX standard for file-sharing which has recently seen enhancements.

Security

NT has excellent standard security features. Commercial UNIX implementations offer varying levels of security, but none can rival NT's simple administrative interface. Also, it has been pointed out that the security features of NT match the requirements of the "Orange Book" more closely than those of UNIX, which has to be modified to achieve this level.

NT and UNIX both offer read, write and execute permissions on each file. NT adds "take ownership" and "change permissions" to these. NT access control lists apply not only to files, but to all objects managed by the OS. Each of the operating systems requires the user to log in. In NT, pressing Ctrl-Alt-Del before logging in sets up a "trusted path" to the operating system. In UNIX, this "trusted path" feature is not present, so care must be taken when selecting passwords. There are guidelines laid down to ensure that a good password is chosen. Both OS's use file-level access permissions. However, few UNIX OS's offer file access control lists, which NT provides as standard. NT carries out auditing, whereas only some UNIX OS's do.

Manageability

UNIX is easier to manage from a distance than NT, but a user at the console will find NT much easier to maintain. DHCP makes adding a host to a LAN very easy.

Network Advantages

If a user wants to be connected to the Internet, then UNIX is preferred. UNIX will let you implement firewalls, proxy servers, security enhancements, etc.

Firewalls

The need for firewalls no longer seems to be in question today. As the Internet and internal corporate networks continue to grow, such a safeguard has become all but mandatory. As a result, network administrators increasingly need to know how to effectively design a firewall. This article explains the basic components and major architectures used in constructing firewalls.

The "right solution" to building a firewall is seldom a single technique; it's usually a carefully crafted combination of techniques to solve different problems. Which problems you need to solve depend on what services you want to provide your users and what level of risk you're willing to accept. Which techniques you use to solve those problems depend on how much time, money, and expertise you have available.

Some protocols (such as Telnet and SMTP) lend themselves to packet filtering. Others (such as FTP, Archie, Gopher, and WWW) are more effectively handled with proxies. Most firewalls use a combination of proxying and packet filtering.

Before we explore various firewall architectures, let's discuss two major approaches used to build firewalls today: packet filtering and proxy services.

Packet filtering

Packet filtering systems route packets between internal and external hosts, but they do it selectively. They allow or block certain types of packets in a way that reflects a site's own security policy.

Every packet has a set of headers containing certain information. The main information is:

In addition, the router knows things about the packet that aren't reflected in the packet headers, such as:

The fact that servers for particular Internet services reside at certain port numbers lets the router block or allow certain types of connections simply by specifying the appropriate port number (such as TCP port 23 for Telnet connections) in the set of rules specified for packet filtering.
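A minimal sketch of such a rule set, assuming a first-match-wins policy with a default of "block", might look as follows. The Packet and Rule types, the rule list and the addresses (taken from the ranges reserved for documentation) are all hypothetical; real screening routers have their own rule syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    protocol: str   # "tcp" or "udp"
    src_addr: str
    dst_addr: str
    dst_port: int


@dataclass(frozen=True)
class Rule:
    action: str                      # "allow" or "block"
    protocol: Optional[str] = None   # None matches any protocol
    dst_port: Optional[int] = None   # None matches any port

    def matches(self, packet: Packet) -> bool:
        return (self.protocol in (None, packet.protocol)
                and self.dst_port in (None, packet.dst_port))


def filter_packet(rules, packet, default="block"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(packet):
            return rule.action
    return default


RULES = [
    Rule("block", protocol="tcp", dst_port=23),  # no Telnet connections
    Rule("allow", protocol="tcp", dst_port=25),  # SMTP is permitted
]

telnet = Packet("tcp", "192.0.2.7", "198.51.100.10", 23)
mail = Packet("tcp", "192.0.2.7", "198.51.100.10", 25)
print(filter_packet(RULES, telnet))  # block
print(filter_packet(RULES, mail))    # allow
```

Defaulting to "block" when no rule matches is a design choice of this sketch: anything the policy does not explicitly mention is refused.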

Here are some examples of ways in which you might program a screening router to selectively route packets to or from your site:

To understand how packet filtering works, let's look at the difference between an ordinary router and a screening router.

An ordinary router simply looks at the destination address of each packet and picks the best way it knows to send that packet towards that destination. The decision about how to handle the packet is based solely on its destination. There are two possibilities: the router knows how to send the packet towards its destination, and it does so; or the router does not know how to send the packet towards its destination, and it returns the packet, via an ICMP "destination unreachable" message, to its source.

A screening router, on the other hand, looks at packets more closely. In addition to determining whether or not it can route a packet towards its destination, a screening router also determines whether or not it should. "Should" or "should not" are determined by the site's security policy, which the screening router has been configured to enforce.
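The distinction between "can" and "should" can be sketched in a few lines. Everything here is an assumption made for illustration: the two-entry routing table, the interface names, and the sample policy that refuses inbound Telnet.

```python
import ipaddress

INTERNAL = ipaddress.ip_network("10.0.0.0/8")   # hypothetical internal net

ROUTES = {  # hypothetical routing table: prefix -> outgoing interface
    INTERNAL: "internal-if",
    ipaddress.ip_network("0.0.0.0/0"): "upstream-if",
}


def ordinary_route(dst: str):
    """Ordinary router: longest-prefix match on the destination only."""
    addr = ipaddress.ip_address(dst)
    candidates = [net for net in ROUTES if addr in net]
    if not candidates:
        return None  # would trigger ICMP "destination unreachable"
    best = max(candidates, key=lambda net: net.prefixlen)
    return ROUTES[best]


def policy_allows(src: str, dst: str, dst_port: int) -> bool:
    """Hypothetical site policy: no Telnet from outside to internal hosts."""
    inbound = (ipaddress.ip_address(dst) in INTERNAL
               and ipaddress.ip_address(src) not in INTERNAL)
    return not (inbound and dst_port == 23)


def screening_route(src: str, dst: str, dst_port: int) -> str:
    next_hop = ordinary_route(dst)              # can the packet be routed?
    if next_hop is None:
        return "unreachable"
    if not policy_allows(src, dst, dst_port):   # should it be routed?
        return "dropped"
    return next_hop


print(screening_route("192.0.2.7", "10.1.2.3", 23))  # dropped
print(screening_route("192.0.2.7", "10.1.2.3", 25))  # internal-if
print(screening_route("10.1.2.3", "192.0.2.7", 23))  # upstream-if
```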

Although it is possible for only a screening router to sit between an internal network and the Internet, as shown in the diagram above, this places an enormous responsibility on the screening router. Not only does it need to perform all routing and routing decision-making, but it is the only protecting system; if its security fails (or crumbles under attack), the internal network is exposed. Furthermore, a straightforward screening router can't modify services. A screening router can permit or deny a service, but it can't protect individual operations within a service. If a desirable service has insecure operations, or if the service is normally provided with an insecure server, packet filtering alone can't protect it.

A number of other architectures have evolved to provide additional security in packet filtering firewall implementations.

Proxy services

Proxy services are specialized application or server programs that run on a firewall host: either a dual-homed host with an interface on the internal network and one on the external network, or some other bastion host that has access to the Internet and is accessible from the internal machines. These programs take users' requests for Internet services (such as FTP and Telnet) and forward them, as appropriate according to the site's security policy, to the actual services. The proxies provide replacement connections and act as gateways to the services. For this reason, proxies are sometimes known as application-level gateways.

Proxy services sit, more or less transparently, between a user on the inside (on the internal network) and a service on the outside (on the Internet). Instead of talking to each other directly, each talks to a proxy. Proxies handle all the communication between users and Internet services behind the scenes.

Transparency is the major benefit of proxy services. It's essentially smoke and mirrors. To the user, a proxy server presents the illusion that the user is dealing directly with the real server. To the real server, the proxy server presents the illusion that the real server is dealing directly with a user on the proxy host (as opposed to the user's real host).

Note: Proxy services are effective only when they're used in conjunction with a mechanism that restricts direct communications between the internal and external hosts. Dual-homed hosts and packet filtering are two such mechanisms. If internal hosts are able to communicate directly with external hosts, there's no need for users to use proxy services, and so (in general) they won't.

The proxy server doesn't always just forward users' requests on to the real Internet services. The proxy server can control what users do, because it can make decisions about the requests it processes. Depending on your site's security policy, requests might be allowed or refused. For example, the FTP proxy might refuse to let users export files, or it might allow users to import files only from certain sites. More sophisticated proxy services might allow different capabilities to different hosts, rather than enforcing the same restrictions on all hosts.
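The FTP example can be sketched as a decision function that a proxy might consult before relaying each command. This is not a working proxy; the function name, the allow-list and the host names are hypothetical. STOR, STOU, APPE and RETR are the standard FTP commands for storing and retrieving files.

```python
ALLOWED_IMPORT_SITES = {"ftp.example.org"}   # hypothetical allow-list
UPLOAD_COMMANDS = {"STOR", "STOU", "APPE"}   # commands that export files


def ftp_proxy_decision(command: str, server: str) -> str:
    """Return "forward" or "refuse" for one FTP command line."""
    words = command.split()
    verb = words[0].upper() if words else ""
    if verb in UPLOAD_COMMANDS:
        return "refuse"   # users may not export files
    if verb == "RETR" and server not in ALLOWED_IMPORT_SITES:
        return "refuse"   # imports only from approved sites
    return "forward"


print(ftp_proxy_decision("STOR report.txt", "ftp.example.org"))  # refuse
print(ftp_proxy_decision("RETR tool.tar", "ftp.example.org"))    # forward
print(ftp_proxy_decision("RETR tool.tar", "ftp.example.net"))    # refuse
```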

Firewall architectures

There are a variety of ways to put various firewall components together. Let's examine some of these approaches in detail.

Dual-homed host architecture

A dual-homed host architecture is built around the dual-homed host computer, a computer that has at least two network interfaces. Such a host could act as a router between the networks these interfaces are attached to; it is capable of routing IP packets from one network to another. However, to implement a dual-homed host type of firewall architecture, you disable this routing function. Thus, IP packets from one network (such as the Internet) are not directly routed to the other network (such as the internal, protected network). Systems inside the firewall can communicate with the dual-homed host, and systems outside the firewall (on the Internet) can communicate with the dual-homed host, but these systems can't communicate directly with each other. IP traffic between them is completely blocked.

  
Figure 10.2: Dual-homed host architecture



The network architecture for a dual-homed host firewall is pretty simple (see Figure 10.2): the dual-homed host sits between, and is connected to, the Internet and the internal network. Dual-homed hosts can provide a very high level of control. If you aren't allowing packets to go between external and internal networks at all, you can be sure that any packet on the internal network that has an external source is evidence of some kind of security problem. In some cases, a dual-homed host will allow you to reject connections that claim to be for a particular service but that don't actually contain the right kind of data. (A packet filtering system, on the other hand, has difficulty with this level of control.) However, it takes considerable work to consistently take advantage of the potential advantages of dual-homed hosts.

A dual-homed host can provide services only by proxying them, or by having users log into the dual-homed host directly. User accounts present significant security problems by themselves. They present special problems on dual-homed hosts, where they may unexpectedly enable services you consider insecure. Furthermore, most users find it inconvenient to use a dual-homed host by logging into it.

Proxying is much less problematic, but may not be available for all services you're interested in.

Screened host architecture

Whereas a dual-homed host architecture provides services from a host that's attached to multiple networks (but has routing turned off), a screened host architecture provides services from a host that's attached to only the internal network, using a separate router. In this architecture, the primary security is provided by packet filtering. (For example, packet filtering is what prevents people from going around proxy servers to make direct connections.)

  
Figure 10.3: Screened host architecture



The bastion host sits on the internal network as in Figure 10.3. The packet filtering on the screening router is set up in such a way that the bastion host is the only system on the internal network that hosts on the Internet can open connections to (for example, to deliver incoming email). Even then, only certain types of connections are allowed. Any external system trying to access internal systems or services will have to connect to this host. The bastion host thus needs to maintain a high level of host security.

The packet filtering also permits the bastion host to open allowable connections to the outside world. The packet filtering configuration in the screening router may do one of the following:

You can mix and match these approaches for different services; some may be allowed directly via packet filtering, while others may be allowed only indirectly via proxy. It all depends on the particular policy your site is trying to enforce.

Because this architecture allows packets to move from the Internet to the internal networks, it may seem more risky than a dual-homed host architecture, which is designed so that no external packet can reach the internal network. In practice, however, the dual-homed host architecture is also prone to failures that let packets actually cross from the external network to the internal network. (Because this type of failure is completely unexpected, there are unlikely to be protections against attacks of this kind.) Furthermore, it's easier to defend a router, which provides a very limited set of services, than it is to defend a host. For most purposes, the screened host architecture provides both better security and better usability than the dual-homed host architecture.

Compared to other architectures, however, such as the screened subnet architecture discussed in the following section, there are some disadvantages to the screened host architecture. The major one is that if an attacker manages to break in to the bastion host, there is nothing left in the way of network security between the bastion host and the rest of the internal hosts. The router also presents a single point of failure; if the router is compromised, the entire network is available to an attacker. For this reason, the screened subnet architecture has become increasingly popular.

Screened subnet architecture

The screened subnet architecture adds an extra layer of security to the screened host architecture by adding a perimeter network that further isolates the internal network from the Internet, as in Figure 10.4.

  
Figure 10.4: Screened subnet architecture



Why do this? By their nature, bastion hosts are the most vulnerable machines on your network. Despite your best efforts to protect them, they are the machines most likely to be attacked, because they're the machines that can be attacked. If, as in a screened host architecture, your internal network is wide open to attack from your bastion host, then your bastion host is a very tempting target. There are no other defenses between it and your other internal machines (besides whatever host security they may have, which is usually very little). If someone successfully breaks into the bastion host in a screened host architecture, he's hit the jackpot.

By isolating the bastion host on a perimeter network, you can reduce the impact of a break-in on the bastion host. It is no longer an instantaneous jackpot; it gives an intruder some access, but not all.

With the simplest type of screened subnet architecture, there are two screening routers, each connected to the perimeter net. One sits between the perimeter net and the internal network, and the other sits between the perimeter net and the external network (usually the Internet). To break into the internal network with this type of architecture, an attacker would have to get past both routers. Even if the attacker somehow broke in to the bastion host, he'd still have to get past the interior router. There is no single vulnerable point that will compromise the internal network.

Some sites go so far as to create a layered series of perimeter nets between the outside world and their interior network. Less trusted and more vulnerable services are placed on the outer perimeter nets, farthest from the interior network. The idea is that an attacker who breaks into a machine on an outer perimeter net will have a harder time successfully attacking internal machines because of the additional layers of security between the outer perimeter and the internal network. This is only true if there is actually some meaning to the different layers, however; if the filtering systems between each layer allow the same things between all layers, the additional layers don't provide any additional security.

The perimeter network is another layer of security, an additional network between the external network and your protected internal network. If an attacker successfully breaks into the outer reaches of your firewall, the perimeter net offers an additional layer of protection between that attacker and your internal systems.

Here's an example of why a perimeter network can be helpful. In many network setups, it's possible for any machine on a given network to see the traffic for every machine on that network. This is true for most Ethernet-based networks (and Ethernet is by far the most common local area networking technology in use today); it is also true for several other popular technologies, such as token ring and FDDI. Snoopers may succeed in picking up passwords by watching for those used during Telnet, FTP, and rlogin sessions. Even if passwords aren't compromised, snoopers can still peek at the contents of sensitive files people may be accessing, interesting email they may be reading, and so on; the snooper can essentially "watch over the shoulder" of anyone using the network.

With a perimeter network, if someone breaks into a bastion host on the perimeter net, he'll be able to snoop only on traffic on that net. All the traffic on the perimeter net should be either to or from the bastion host, or to or from the Internet. Because no strictly internal traffic (that is, traffic between two internal hosts, which is presumably sensitive or proprietary) passes over the perimeter net, internal traffic will be safe from prying eyes if the bastion host is compromised.

Obviously, traffic to and from the bastion host, or the external world, will still be visible.

With the screened subnet architecture, you attach a bastion host (or hosts) to the perimeter net; this host is the main point of contact for incoming connections from the outside world; for example:

and so on.

Outbound services (from internal clients to servers on the Internet) are handled in either of these ways:

The interior router (sometimes called the choke router in firewalls literature) protects the internal network from both the Internet and the perimeter net.

The interior router does most of the packet filtering for your firewall. It allows selected services outbound from the internal net to the Internet. These services are the services your site can safely support and safely provide using packet filtering rather than proxies. (Your site needs to establish its own definition of what "safe" means. You'll have to consider your own needs, capabilities, and constraints; there is no one answer for all sites.) The services you allow might include outgoing Telnet, FTP, WAIS, Archie, Gopher, and others, as appropriate for your own needs and concerns.

The services the interior router allows between your bastion host (on the perimeter net itself) and your internal net are not necessarily the same services the interior router allows between the Internet and your internal net. The reason for limiting the services between the bastion host and the internal network is to reduce the number of machines (and the number of services on those machines) that can be attacked from the bastion host, should it be compromised.

You should limit the services allowed between the bastion host and the internal net to just those that are actually needed, such as SMTP (so the bastion host can forward incoming email), DNS (so the bastion host can answer questions from internal machines, or ask them, depending on your configuration), and so on. You should further limit services, to the extent possible, by allowing them only to or from particular internal hosts; for example, SMTP might be limited only to connections between the bastion host and your internal mail server or servers. Pay careful attention to the security of those remaining internal hosts and services that can be contacted by the bastion host, because those hosts and services will be what an attacker goes after (indeed, will be all the attacker can go after) if the attacker manages to break in to your bastion host.
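One way to picture such a restriction is as a short list of (service, internal host) pairs that the interior router accepts from the bastion host. The addresses, names and the choice of services below are assumptions for illustration only, and the sketch covers just connections opened by the bastion host.

```python
BASTION = "172.16.0.2"              # hypothetical bastion host address
SERVICE_PORTS = {"smtp": 25, "dns": 53}

# Hypothetical policy: which internal host may be reached for which service.
ALLOWED_FROM_BASTION = {
    ("smtp", "10.0.0.25"),   # internal mail server
    ("dns", "10.0.0.53"),    # internal name server
}


def interior_router_allows(src: str, dst: str, dst_port: int) -> bool:
    """Allow bastion-to-internal traffic only for listed pairs."""
    if src != BASTION:
        return False  # other traffic is outside the scope of this sketch
    return any(dst == host and dst_port == SERVICE_PORTS[service]
               for service, host in ALLOWED_FROM_BASTION)


print(interior_router_allows(BASTION, "10.0.0.25", 25))  # True
print(interior_router_allows(BASTION, "10.0.0.99", 25))  # False
print(interior_router_allows(BASTION, "10.0.0.25", 23))  # False
```

If the bastion host is compromised, only the two listed servers are directly exposed to it, which is exactly the set of machines the paragraph above says deserves the most careful attention.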

In theory, the exterior router (sometimes called the access router in firewalls literature) protects both the perimeter net and the internal net from the Internet. In practice, exterior routers tend to allow almost anything outbound from the perimeter net, and they generally do very little packet filtering. The packet filtering rules to protect internal machines would need to be essentially the same on both the interior router and the exterior router; if there's an error in the rules that allows access to an attacker, the error will probably be present on both routers.

Frequently, the exterior router is provided by an external group (for example, your Internet provider), and your access to it may be limited. An external group that's maintaining a router will probably be willing to put in a few general packet filtering rules, but won't want to maintain a complicated or frequently changing rule set. You also may not trust them as much as you trust your own routers. If the router breaks and they install a new one, are they going to remember to reinstall the filters? Are they even going to bother to mention that they replaced the router so that you know to check?

The only packet filtering rules that are really special on the exterior router are those that protect the machines on the perimeter net (that is, the bastion hosts and the internal router). Generally, however, not much protection is necessary, because the hosts on the perimeter net are protected primarily through host security (although redundancy never hurts).

The rest of the rules that you could put on the exterior router are duplicates of the rules on the interior router. These are the rules that prevent insecure traffic from going between internal hosts and the Internet. To support proxy services, where the interior router will let the internal hosts send some protocols as long as they are talking to the bastion host, the exterior router could let those protocols through as long as they are coming from the bastion host. These rules are desirable for an extra level of security, but they're theoretically blocking only packets that can't exist because they've already been blocked by the interior router. If they do exist, either the interior router has failed, or somebody has connected an unexpected host to the perimeter network.

So, what does the exterior router actually need to do? One of the security tasks that the exterior router can usefully perform, a task that usually can't easily be done anywhere else, is the blocking of any incoming packets from the Internet that have forged source addresses. Such packets claim to have come from within the internal network, but actually are coming in from the Internet.

The interior router could do this, but it can't tell if packets that claim to be from the perimeter net are forged. While the perimeter net shouldn't have anything fully trusted on it, it's still going to be more trusted than the external universe; being able to forge packets from it will give an attacker most of the benefits of compromising the bastion host. The exterior router is at a clearer boundary. The interior router also can't protect the systems on the perimeter net against forged packets.
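The forged-address check is simple to sketch: a packet that arrives on the Internet-facing interface but claims a source address inside the site cannot be genuine. The address plan below (one internal network, one perimeter network) is hypothetical.

```python
import ipaddress

# Hypothetical address plan for the site.
INSIDE_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),      # internal network
    ipaddress.ip_network("172.16.0.0/24"),   # perimeter network
]


def forged_source(src_addr: str) -> bool:
    """True if a packet from the Internet claims an inside source address."""
    src = ipaddress.ip_address(src_addr)
    return any(src in net for net in INSIDE_NETS)


def exterior_router_accepts(src_addr: str) -> bool:
    """Decision for a packet arriving on the Internet-facing interface."""
    return not forged_source(src_addr)


print(exterior_router_accepts("192.0.2.7"))    # True
print(exterior_router_accepts("10.1.2.3"))     # False
print(exterior_router_accepts("172.16.0.2"))   # False
```

Because the exterior router sees both inside networks from the outside, it can reject forged perimeter addresses as well, which the interior router cannot do.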

Variations on firewall architectures

The architectures described so far are the most common ones, but there is a lot of variation in practice. There is a good deal of flexibility in how you can configure and combine firewall components to best suit your hardware, your budget, and your security policy.

Internal firewalls

In some situations, you may also be protecting parts of your internal network from other parts. There are a number of reasons why you might want to do this:

Laboratory and test networks are often the first networks that people consider separating from the rest of an organization via a firewall (usually as the result of some horrible experience where something escapes the laboratory and runs amok). Unless people are working on routers, this type of firewall can be quite simple. Neither a perimeter net nor a bastion host is needed, because there is no worry about snooping (all users are internal anyway), and you don't need to provide many services (the machines are not people's home machines). In most cases, you'll want a packet filtering router that allows any connection inbound to the test network, but only known safe connections from it. In a few cases (for example, if you are testing bandwidth on the network), you may want to protect the test network from outside traffic that would invalidate tests, in which case you'll deny inbound connections and allow outbound connections.

If you are testing routers, it's probably wisest to use an entirely disconnected network; if you don't do this, then at least prevent the firewall router from listening to routing updates from the test network.

Joint venture firewalls

Sometimes, organizations come together for certain limited reasons, such as a joint project; they need to be able to share machines, data, and other resources for the duration of the project. For example, look at the decision of IBM and Apple to collaborate on the PowerPC, a common microprocessor architecture; undertaking one joint project doesn't mean that IBM and Apple have decided to merge their organizations or to open up all their operations to each other.

Although the two parties have decided to trust each other for the purposes of this project, they are still competitors. They want to protect most of their systems and information from each other. It isn't just that they may distrust each other; it's also that they can't be sure how good the other's security is. They don't want to risk that an intruder into their partner's system might, through this joint venture, find a route into their system as well. This security problem occurs even if the collaborators aren't also competitors.

Shared perimeter networks are a good way to approach joint networks. Each party can install its own router, under its own control, onto a perimeter net between the two organizations.

What the future holds

Systems that might be called "third-generation firewalls" - firewalls that combine the features and capabilities of packet filtering and proxy systems into something more than both - are just starting to become available.

More and more client and server applications are coming with native support for proxied environments. For example, many WWW clients include proxy capabilities, and lots of systems are coming with run-time or compile-time support for generic proxy systems such as the SOCKS package.

Packet filtering systems continue to grow more flexible and gain new capabilities, such as dynamic packet filtering. With dynamic packet filtering the packet filtering rules are modified "on the fly" by the router in response to certain triggers. For example, an outgoing UDP packet might cause the creation of a temporary rule to allow a corresponding, answering UDP packet back in.
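The UDP example can be sketched as a small table of temporary rules keyed on the addresses and ports of the expected reply. The class name, the 30-second lifetime and the injectable clock are assumptions of this sketch, not features of any particular product.

```python
import time


class DynamicUdpFilter:
    """Admit an inbound UDP packet only if it answers a recent outbound one."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._temp_rules = {}   # reply 4-tuple -> expiry time

    def outgoing(self, local_addr, local_port, remote_addr, remote_port):
        # The outgoing packet triggers a temporary rule for the reply.
        key = (remote_addr, remote_port, local_addr, local_port)
        self._temp_rules[key] = self.clock() + self.ttl

    def incoming_allowed(self, remote_addr, remote_port,
                         local_addr, local_port):
        key = (remote_addr, remote_port, local_addr, local_port)
        expiry = self._temp_rules.get(key)
        if expiry is None:
            return False                 # nobody asked for this packet
        if self.clock() > expiry:
            del self._temp_rules[key]    # the temporary rule has lapsed
            return False
        return True


now = [0.0]   # fake clock so the example is deterministic
fw = DynamicUdpFilter(ttl_seconds=30.0, clock=lambda: now[0])
fw.outgoing("10.0.0.8", 40000, "192.0.2.53", 53)
print(fw.incoming_allowed("192.0.2.53", 53, "10.0.0.8", 40000))   # True
print(fw.incoming_allowed("203.0.113.9", 53, "10.0.0.8", 40000))  # False
now[0] = 31.0
print(fw.incoming_allowed("192.0.2.53", 53, "10.0.0.8", 40000))   # False
```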

While firewall technologies are changing, so are the underlying technologies of the Internet, and these changes will require corresponding changes in firewalls.

The underlying protocol of the Internet, IP, is currently undergoing major revisions, partly to address the limitations imposed by the use of four-byte host addresses in the current version of the protocol (which is version 4; the existing IP is sometimes called IPv4), and the blocks in which they're given out. Basically, the Internet has been so successful and become so popular that four bytes simply do not provide enough distinct values to assign a unique address to every host that will join the Internet over the next few years, particularly because addresses must be given out to organizations in relatively large blocks.
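The arithmetic behind this is short. In the sketch below, the helper name and the example organization (2000 hosts holding a block with a 16-bit network prefix) are assumptions chosen only to show how much of a large block can go unused.

```python
def block_utilization(hosts_in_use: int, prefix_length: int) -> float:
    """Fraction of an IPv4 block actually used, given its prefix length."""
    block_size = 2 ** (32 - prefix_length)
    return hosts_in_use / block_size


print(2 ** 32)                               # 4294967296 possible addresses
print(2 ** (32 - 16))                        # 65536 addresses in one block
print(f"{block_utilization(2000, 16):.1%}")  # 3.1%
```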

Attempts to solve the address size limitations by giving out smaller blocks of addresses (so that a greater percentage of them are actually used) raise problems with routing protocols. Stop-gap solutions to both problems are being applied but won't last forever. Estimates for when the Internet will run out of new addresses to assign vary, but the consensus is that either address space or routing table space (if not both) will be exhausted sometime within a few years after the turn of the century.

While they're working "under the hood" to solve the address size limitations, the people designing the new IP protocol (which is often referred to as "IPng" for "IP next generation"; officially, it will be IP version 6, or IPv6, when the standards are formally adopted and ratified) are taking advantage of the opportunity to make other improvements in the protocol. Some of these improvements have the potential to cause profound changes in how firewalls are constructed and operated; however, it's far too soon to say exactly what the impact will be.

The underlying network technologies are also changing. Currently, most networks involving more than two machines (i.e., almost anything other than dial-up or leased lines) are susceptible to snooping; any node on the network can see at least some traffic that it's not supposed to be a party to. Newer network technologies, such as frame relay and Asynchronous Transfer Mode (ATM), pass packets directly from source to destination, without exposing them to snooping by other nodes in the network.

Cryptography [IntroCrypt]

People mean different things when they talk about cryptography. Children play with toy ciphers and secret languages. However, these have nothing to do with real security and strong encryption. Strong encryption is the kind of encryption that can be used to protect information of real value against organized criminals, multinational corporations, and major governments. Strong encryption used to be only military business; however, in the information society it has become one of the central tools for maintaining privacy and confidentiality.

As we move into an information society, the technological means for global surveillance of millions of individual people are becoming available to major governments. Cryptography has become one of the main tools for privacy, trust, access control, electronic payments, corporate security, and countless other fields.

Cryptography is no longer a military thing that should not be messed with. It is time to demystify cryptography and make full use of the advantages it provides for the modern society. Widespread cryptography is also one of the few defenses people have against suddenly finding themselves in a totalitarian surveillance society that can monitor and control everything they do.

Basic Terminology

Suppose that someone wants to send a message to a receiver, and wants to be sure that no-one else can read the message. However, there is the possibility that someone else opens the letter or hears the electronic communication.

In cryptographic terminology, the message is called plaintext or cleartext. Encoding the contents of the message in such a way that hides its contents from outsiders is called encryption. The encrypted message is called the ciphertext. The process of retrieving the plaintext from the ciphertext is called decryption. Encryption and decryption usually make use of a key, and the coding method is such that decryption can be performed only by knowing the proper key.

Cryptography is the art or science of keeping messages secret. Cryptanalysis is the art of breaking ciphers, i.e. retrieving the plaintext without knowing the proper key. People who do cryptography are cryptographers, and practitioners of cryptanalysis are cryptanalysts.

Cryptography deals with all aspects of secure messaging, authentication, digital signatures, electronic money, and other applications. Cryptology is the branch of mathematics that studies the mathematical foundations of cryptographic methods.

Basic Cryptographic Algorithms

A method of encryption and decryption is called a cipher. Some cryptographic methods rely on the secrecy of the algorithms; such algorithms are only of historical interest and are not adequate for real-world needs. All modern algorithms use a key to control encryption and decryption; a message can be decrypted only if the key matches the encryption key. The key used for decryption can be different from the encryption key, but for most algorithms they are the same.

There are two classes of key-based algorithms, symmetric (or secret-key) and asymmetric (or public-key) algorithms. The difference is that symmetric algorithms use the same key for encryption and decryption (or the decryption key is easily derived from the encryption key), whereas asymmetric algorithms use a different key for encryption and decryption, and the decryption key cannot be derived from the encryption key.

Symmetric algorithms can be divided into stream ciphers and block ciphers. Stream ciphers can encrypt a single bit of plaintext at a time, whereas block ciphers take a number of bits (typically 64 bits in modern ciphers), and encrypt them as a single unit.

Asymmetric ciphers (also called public-key algorithms or generally public-key cryptography) permit the encryption key to be public (it can even be published in a newspaper), allowing anyone to encrypt with the key, whereas only the proper recipient (who knows the decryption key) can decrypt the message. The encryption key is also called the public key and the decryption key the private key or secret key.

Modern cryptographic algorithms cannot really be executed by humans. Strong cryptographic algorithms are designed to be executed by computers or specialized hardware devices. In most applications, cryptography is done in computer software, and numerous cryptographic software packages are available.

Generally, symmetric algorithms are much faster to execute on a computer than asymmetric ones. In practice they are often used together, so that a public-key algorithm is used to encrypt a randomly generated encryption key, and the random key is used to encrypt the actual message using a symmetric algorithm.

Many good cryptographic algorithms are widely and publicly available in any major bookstore, scientific library, or patent office, and on the Internet. Well-known symmetric functions include DES and IDEA. RSA is probably the best known asymmetric algorithm.

Digital Signatures

Some public-key algorithms can be used to generate digital signatures. A digital signature is a block of data that was created using some secret key, and there is a public key that can be used to verify that the signature was really generated using the corresponding private key. The algorithm used to generate the signature must be such that without knowing the secret key it is not possible to create a signature that would verify as valid.

Digital signatures are used to verify that a message really comes from the claimed sender (assuming only the sender knows the secret key corresponding to his/her public key). They can also be used to timestamp documents: a trusted party signs the document and its timestamp with his/her secret key, thus testifying that the document existed at the stated time.

Digital signatures can also be used to testify (or certify) that a public key belongs to a particular person. This is done by signing the combination of the key and the information about its owner by a trusted key. The reason for trusting that key may again be that it was signed by another trusted key. Eventually some key must be a root of the trust hierarchy (that is, it is not trusted because it was signed by somebody, but because you believe a priori that the key can be trusted). In a centralized key infrastructure there are very few roots in the trust network (e.g., trusted government agencies; such roots are also called certification authorities). In a distributed infrastructure there need not be any universally accepted roots, and each party may have different trusted roots (such as the party's own key and any keys signed by it). This is the web of trust concept used e.g. in PGP.

A digital signature of an arbitrary document is typically created by computing a message digest from the document, and concatenating it with information about the signer, a timestamp, etc. The resulting string is then encrypted using the private key of the signer using a suitable algorithm. The resulting encrypted block of bits is the signature. It is often distributed together with information about the public key that was used to sign it. To verify a signature, the recipient first determines whether it trusts that the key belongs to the person it is supposed to belong to (using the web of trust or a priori knowledge), and then decrypts the signature using the public key of the person. If the signature decrypts properly and the information matches that of the message (proper message digest etc.), the signature is accepted as valid.

Several methods for making and verifying digital signatures are freely available. The most widely known algorithm is RSA.

Cryptographic Hash Functions

Cryptographic hash functions are typically used to compute the message digest when making a digital signature. A hash function compresses the bits of a message to a fixed-size hash value in a way that distributes the possible messages evenly among the possible hash values. A cryptographic hash function does this in a way that makes it extremely difficult to come up with a message that would hash to a particular hash value.

Cryptographic hash functions typically produce hash values of 128 or more bits. The number of possible hash values (2^128 or more) is vastly larger than the number of different messages likely to ever be exchanged in the world.

Many good cryptographic hash functions are freely available. Well-known ones include MD5 and SHA.

Cryptographic Random Number Generators

Cryptographic random number generators generate random numbers for use in cryptographic applications, such as for keys. Conventional random number generators available in most programming languages or programming environments are not suitable for use in cryptographic applications (they are designed for statistical randomness, not to resist prediction by cryptanalysts).

In the optimal case, random numbers are based on true physical sources of randomness that cannot be predicted. Such sources may include the noise from a semiconductor device, the least significant bits of an audio input, or the intervals between device interrupts or user keystrokes. The noise obtained from a physical source is then "distilled" by a cryptographic hash function to make every bit depend on every other bit. Quite often a large pool (several thousand bits) is used to contain randomness, and every bit of the pool is made to depend on every bit of input noise and every other bit of the pool in a cryptographically strong way.

When true physical randomness is not available, pseudo-random numbers must be used. This situation is undesirable, but often arises on general purpose computers. It is always desirable to obtain some environmental noise - even from device latencies, resource utilization statistics, network statistics, keyboard interrupts, or whatever. The point is that the data must be unpredictable for any external observer; to achieve this, the random pool must contain at least 128 bits of true entropy.

Cryptographic pseudo-random generators typically have a large pool ("seed value") containing randomness. Bits are returned from this pool by taking data from the pool, optionally running the data through a cryptographic hash function to avoid revealing the contents of the pool. When more bits are needed, the pool is stirred by encrypting its contents by a suitable cipher with a random key (that may be taken from an unreturned part of the pool) in a mode which makes every bit of the pool depend on every other bit of the pool. New environmental noise should be mixed into the pool before stirring to make predicting previous or future values even harder.

Even though cryptographically strong random number generators are not very difficult to build if designed properly, they are often overlooked. The importance of the random number generator must thus be emphasized - if done badly, it will easily become the weakest point of the system.

Strength of Cryptographic Algorithms

Good cryptographic systems should always be designed so that they are as difficult to break as possible. It is possible to build systems that cannot be broken in practice (though this cannot usually be proved). This does not significantly increase system implementation effort; however, some care and expertise is required. There is no excuse for a system designer to leave the system breakable. Any mechanisms that can be used to circumvent security must be made explicit, documented, and brought to the attention of the end users.

In theory, any cryptographic method with a key can be broken by trying all possible keys in sequence. If using brute force to try all keys is the only option, the required computing power increases exponentially with the length of the key. A 32 bit key takes 2^32 (about 4 x 10^9) steps. This is something any amateur can do on his/her home computer. A system with 40 bit keys (e.g. US-exportable version of RC4) takes 2^40 steps - this kind of computing power is available in most universities and even smallish companies. A system with 56 bit keys (such as DES) takes a substantial effort, but is quite easily breakable with special hardware. The cost of the special hardware is substantial but easily within reach of organized criminals, major companies, and governments. Keys with 64 bits are probably breakable now by major governments, and will be within reach of organized criminals, major companies, and lesser governments in a few years. Keys with 80 bits may become breakable in future. Keys with 128 bits will probably remain unbreakable by brute force for the foreseeable future. Even larger keys are possible; in the end we will encounter a limit where the energy consumed by the computation, using the minimum energy of a quantum mechanical operation for the energy of one step, will exceed the energy of the mass of the sun or even of the universe.
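The exponential growth described above can be made concrete with a short calculation. The following is only a sketch: the function names are made up for illustration, and the trial rate passed in is an assumption chosen by the caller, not a measurement of any real machine.

```c
#include <stdio.h>

/* Number of keys in a key space of the given bit length, as a double
 * (an exact integer type would overflow beyond 64 bits). */
double key_space(int key_bits)
{
    double keys = 1.0;
    int i;

    for (i = 0; i < key_bits; i++)
        keys *= 2.0;
    return keys;
}

/* Worst-case seconds needed to try every key at an ASSUMED trial rate. */
double brute_force_seconds(int key_bits, double keys_per_second)
{
    return key_space(key_bits) / keys_per_second;
}

/* Print a small table for the key lengths discussed in the text. */
void print_brute_force_table(double keys_per_second)
{
    static const int bits[] = { 32, 40, 56, 64, 80, 128 };
    size_t i;

    for (i = 0; i < sizeof bits / sizeof bits[0]; i++)
        printf("%3d bits: %.3g seconds\n", bits[i],
               brute_force_seconds(bits[i], keys_per_second));
}
```

Each additional key bit doubles the work, so going from 32 to 40 bits multiplies the search time by 256 whatever rate is assumed.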

However, key length is not the only relevant issue. Many ciphers can be broken without trying all possible keys. In general, it is very difficult to design ciphers that could not be broken more effectively using other methods. Designing your own ciphers may be fun, but it is not recommended in real applications unless you are a true expert and know exactly what you are doing.

One should generally be very wary of unpublished or secret algorithms. Quite often the designer is then not sure of the security of the algorithm, or its security depends on the secrecy of the algorithm. Generally, no algorithm that depends on the secrecy of the algorithm is secure. Particularly in software, anyone can hire someone to disassemble and reverse-engineer the algorithm. Experience has shown that a vast majority of secret algorithms that have become public knowledge later have been pitifully weak in reality.

The key lengths used in public-key cryptography are usually much longer than those used in symmetric ciphers. There the problem is not that of guessing the right key, but deriving the matching secret key from the public key. In the case of RSA, this is equivalent to factoring a large integer that has two large prime factors. In the case of some other cryptosystems it is equivalent to computing the discrete logarithm modulo a large integer (which is believed to be roughly comparable to factoring). Other cryptosystems are based on yet other problems.

To give some idea of the complexity, for the RSA cryptosystem, a 256 bit modulus is easily factored by ordinary people. 384 bit keys can be broken by university research groups or companies. 512 bits is within reach of major governments. Keys with 768 bits are probably not secure in the long term. Keys with 1024 bits and more should be safe for now unless major algorithmic advances are made in factoring; keys of 2048 bits are considered by many to be secure for decades.

It should be emphasized that the strength of a cryptographic system is usually equal to its weakest point. No aspect of the system design should be overlooked, from the choice of algorithms to the key distribution and usage policies.

Cryptanalysis and Attacks on Cryptosystems

Cryptanalysis is the art of deciphering encrypted communications without knowing the proper keys. There are many cryptanalytic techniques. Some of the more important ones for a system implementor are described below.

Ciphertext-only attack: This is the situation where the attacker does not know anything about the contents of the message, and must work from ciphertext only. In practice it is quite often possible to make guesses about the plaintext, as many types of messages have fixed format headers. Even ordinary letters and documents begin in a very predictable way. It may also be possible to guess that some ciphertext block contains a common word.

Known-plaintext attack: The attacker knows or can guess the plaintext for some parts of the ciphertext. The task is to decrypt the rest of the ciphertext blocks using this information. This may be done by determining the key used to encrypt the data, or via some shortcut.

Chosen-plaintext attack: The attacker is able to have any text he likes encrypted with the unknown key. The task is to determine the key used for encryption. Some encryption methods, particularly RSA, are extremely vulnerable to chosen-plaintext attacks. When such algorithms are used, extreme care must be taken to design the entire system so that an attacker can never have chosen plaintext encrypted.

Man-in-the-middle attack: This attack is relevant for cryptographic communication and key exchange protocols. The idea is that when two parties are exchanging keys for secure communications (e.g., using Diffie-Hellman), an adversary puts himself between the parties on the communication line. The adversary then performs a separate key exchange with each party. Each party ends up using a different key, and both keys are known to the adversary. The adversary then decrypts any communication with the key of the party that sent it, and re-encrypts it with the other key before passing it on to the other party. The parties will think that they are communicating securely, but in fact the adversary is hearing everything.

One way to prevent man-in-the-middle attacks is that both sides compute a cryptographic hash function of the key exchange (or at least the encryption keys), sign it using a digital signature algorithm, and send the signature to the other side. The recipient then verifies that the signature came from the desired other party, and that the hash in the signature matches that computed locally.

Timing Attack: This very recent attack is based on repeatedly measuring the exact execution times of modular exponentiation operations. It is relevant to at least RSA, Diffie-Hellman, and Elliptic Curve methods. More information is available in the original paper and various followup articles.
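To see why execution time can leak anything at all, consider a naive square-and-multiply exponentiation. The sketch below is not the attack itself, only a simplified illustration with made-up function names: it counts the modular multiplications performed, which in this straightforward implementation depends on how many bits of the (secret) exponent are set.

```c
/* Right-to-left square-and-multiply: computes base^exp mod m and counts
 * the modular multiplications performed.  The modulus must stay below
 * 2^32 so that the products fit in an unsigned long long. */
unsigned long long modpow_counted(unsigned long long base,
                                  unsigned long long exp,
                                  unsigned long long m,
                                  unsigned *mul_count)
{
    unsigned long long result = 1 % m;

    *mul_count = 0;
    base %= m;
    while (exp > 0) {
        if (exp & 1) {                 /* extra multiply only for 1 bits */
            result = (result * base) % m;
            (*mul_count)++;
        }
        base = (base * base) % m;      /* squaring happens for every bit */
        (*mul_count)++;
        exp >>= 1;
    }
    return result;
}

/* Convenience wrapper: multiplications needed for a given exponent,
 * using an arbitrary toy base and modulus. */
unsigned mul_count_for_exponent(unsigned long long exp)
{
    unsigned count;

    (void)modpow_counted(42, exp, 1887, &count);
    return count;
}
```

Two exponents of the same length, 0x80 and 0xFF, cost 9 and 16 multiplications respectively here, so the running time differs with the secret value. Careful implementations avoid such data-dependent behaviour.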

There are many other cryptographic attacks and cryptanalysis techniques. However, these are probably the most important ones for a practical system designer. Anyone contemplating designing a new encryption algorithm should have a much deeper understanding of these issues. One place to start looking for information is the excellent book Applied Cryptography by Bruce Schneier.

Cryptographic Algorithms

This section describes some of the better known cryptographic algorithms, and presents some details as to the operation of selected algorithms.

Public Key Algorithms

Public key algorithms use a different key for encryption and decryption, and the decryption key cannot (practically) be derived from the encryption key. Public key methods are important because they can be used to transmit encryption keys or other data securely even when the parties have no opportunity to agree on a secret key in private. All known methods are quite slow, and they are usually only used to encrypt session keys (randomly generated "normal" keys), that are then used to encrypt the bulk of the data using a symmetric cipher (see below).

RSA

RSA (Rivest-Shamir-Adleman) is the most commonly used public key algorithm. It can be used both for encryption and for signing. It is generally considered to be secure when sufficiently long keys are used (512 bits is insecure, 768 bits is moderately secure, and 1024 bits is good). The security of RSA relies on the difficulty of factoring large integers. Dramatic advances in factoring large integers would make RSA vulnerable. RSA is currently the most important public key algorithm. It is patented in the United States (expires year 2000), and free elsewhere.

At present, 512 bit keys are considered weak, 1024 bit keys are probably secure enough for most purposes, and 2048 bit keys are likely to remain secure for decades.

One should know that RSA is very vulnerable to chosen plaintext attacks. There is also a new timing attack that can be used to break many implementations of RSA. The RSA algorithm is believed to be safe when used properly, but one must be very careful when using it to avoid these attacks.

It works as follows: take two large primes, p and q, and find their product n=pq; n is called the modulus. Choose a number, e, less than n and relatively prime to (p-1)(q-1), and find its inverse, d, \( mod\, [(p-1)(q-1)] \), which means that \( ed=1\, mod\, [(p-1)(q-1)] \); e and d are called the public and private exponents, respectively. Two numbers are relatively prime if they have no prime factors in common. The public key is the pair (n,e); the private key is (n,d). The factors p and q must be kept secret, or destroyed.

Example:

\begin{eqnarray*}p & = & 37\\
q & = & 51\\
Then\, n & = & 1887\\
(p-1)(q-1) & = & 1800\\
Choose\, e & = & 637\\
Find\, d & = & 373\, (637\times 373=237601=1800\times 132+1)
\end{eqnarray*}


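Thus the public key is (1887,637), and the private key is (1887,373). Note that 51 = 3 x 17 is not actually prime, so this is strictly a toy example; the numbers happen to work anyway, but a real key must use two genuine primes. A small program for finding d is shown below: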

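#include <stdio.h>

int main(void)
{
  long d;

  /* try each d below 1800 until (d * 637) mod 1800 == 1 */
  for (d = 1; d < 1800; d++)
    if (((d * 637) % 1800) == 1)
      printf("%ld\n", d);
  return 0;
}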

It is difficult (presumably) to obtain the private key (n,d) from the public key (n,e). If one could factor n into p and q, however, then one could obtain the private exponent d. Thus the entire security of RSA is predicated on the assumption that factoring is difficult; an easy factoring method would "break" RSA.
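Trying every candidate d is fine for a four-digit modulus but hopeless for realistic sizes. A common alternative is the extended Euclidean algorithm; the sketch below is one possible way to write it (the function name is made up), and it only handles values that fit in a long.

```c
/* Modular inverse by the extended Euclidean algorithm.
 * Returns d with (a * d) mod m == 1, or -1 if gcd(a, m) != 1. */
long mod_inverse(long a, long m)
{
    long r0 = m, r1 = a % m;   /* remainders */
    long t0 = 0, t1 = 1;       /* coefficients of a */

    while (r1 != 0) {
        long q = r0 / r1;
        long tmp;

        tmp = r0 - q * r1; r0 = r1; r1 = tmp;
        tmp = t0 - q * t1; t0 = t1; t1 = tmp;
    }
    if (r0 != 1)
        return -1;             /* a has no inverse modulo m */
    return (t0 % m + m) % m;   /* normalize into 0..m-1 */
}
```

For the example key, mod_inverse(637, 1800) gives 373, the same d as the exhaustive search.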

Here is how RSA can be used for privacy and authentication (in practice, actual use is slightly different).

RSA privacy (encryption): suppose Alice wants to send a private message, m, to Bob. Alice creates the ciphertext c by exponentiating: \( c=m^{e}\, mod\, n \), where e and n are Bob's public key. To decrypt, Bob also exponentiates: \( m=c^{d}\, mod\, n \), and recovers the original message m; the relationship between e and d ensures that Bob correctly recovers m. Since only Bob knows d, only Bob can decrypt.

Example:


\begin{eqnarray*}Alice\, sends\, m & = & 42\\
c & = & 42^{637}\, mod\, 1887\\
& = & 315\\
Bob\, decrypts\, m & = & 315^{373}\, mod\, 1887\\
& = & 42
\end{eqnarray*}


RSA authentication: suppose Alice wants to send a signed document m to Bob. Alice creates a digital signature s by exponentiating: \( s=m^{d}\, mod\, n \), where d and n belong to Alice's key pair. She sends s and m to Bob. To verify the signature, Bob exponentiates and checks that the message m is recovered: \( m=s^{e}\, mod\, n \), where e and n belong to Alice's public key.

Thus encryption and authentication take place without any sharing of private keys: each person uses only other people's public keys and his or her own private key. Anyone can send an encrypted message or verify a signed message, using only public keys, but only someone in possession of the correct private key can decrypt or sign a message.
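Both operations reduce to modular exponentiation, which the following sketch performs by repeated squaring using the toy key (1887, 637) / (1887, 373) from the example. The function names are illustrative only, and the code works solely for moduli small enough that products fit in 64 bits; real RSA needs multi-precision arithmetic and message padding, neither of which is shown.

```c
/* Square-and-multiply modular exponentiation: base^exp mod m.
 * Only suitable for toy moduli below 2^32. */
unsigned long long modpow(unsigned long long base,
                          unsigned long long exp,
                          unsigned long long m)
{
    unsigned long long result = 1 % m;

    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* Toy key from the example: public (n, e) and private (n, d). */
static const unsigned long long RSA_N = 1887, RSA_E = 637, RSA_D = 373;

/* Privacy: anyone encrypts with the public key, the owner decrypts. */
unsigned long long rsa_encrypt(unsigned long long m) { return modpow(m, RSA_E, RSA_N); }
unsigned long long rsa_decrypt(unsigned long long c) { return modpow(c, RSA_D, RSA_N); }

/* Authentication: the owner signs with d, anyone verifies with e. */
unsigned long long rsa_sign(unsigned long long m) { return modpow(m, RSA_D, RSA_N); }

int rsa_verify(unsigned long long m, unsigned long long s)
{
    return modpow(s, RSA_E, RSA_N) == m;
}
```

With these definitions rsa_encrypt(42) yields 315 and rsa_decrypt(315) yields 42, matching the worked example, and a signature made with rsa_sign verifies only against the message that was signed.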

Diffie-Hellman

Diffie-Hellman is a commonly used public-key algorithm for key exchange. It is generally considered to be secure when sufficiently long keys and proper generators are used. The security of Diffie-Hellman relies on the difficulty of the discrete logarithm problem (which is believed to be computationally equivalent to factoring large integers). Diffie-Hellman is claimed to be patented in the United States, but the patent expires April 29, 1997. There are also strong rumors that the patent might in fact be invalid (there is evidence of it having been published over a year before the patent application was filed).

Diffie-Hellman is sensitive to the choice of the strong prime and the generator. One possible prime/generator pair is suggested in the Photuris draft. The size of the secret exponent is also important for its security. Conservative advice is to make the random exponent twice as long as the intended session key.

One should note the results presented in Brian A. LaMacchia and Andrew M. Odlyzko, Computation of Discrete Logarithms in Prime Fields, Designs, Codes and Cryptography 1 (1991), 47-62. Basically, they conclude that by doing pre-computations, it is possible to compute discrete logarithms relative to a particular prime efficiently. The work needed for the pre-computation is approximately equal or slightly higher than the work needed for factoring a composite number of the same size. In practice this means that if the same prime is used for a large number of exchanges, it should be larger than 512 bits in size, preferably 1024 bits.

There is also a new timing attack that can be used to break many implementations of Diffie-Hellman.
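The exchange itself is short enough to sketch. In the code below the prime 23, the generator 5 and the secret exponents used in the usage note are tiny illustrative assumptions, far too small for any real use, and the function names are made up. Each side publishes g^secret mod p and raises the other side's public value to its own secret.

```c
/* Square-and-multiply modular exponentiation: base^exp mod m.
 * Only suitable for toy moduli below 2^32. */
unsigned long long modpow(unsigned long long base,
                          unsigned long long exp,
                          unsigned long long m)
{
    unsigned long long result = 1 % m;

    base %= m;
    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % m;
        base = (base * base) % m;
        exp >>= 1;
    }
    return result;
}

/* Value sent over the (insecure) line: g^secret mod p. */
unsigned long long dh_public(unsigned long long g,
                             unsigned long long secret,
                             unsigned long long p)
{
    return modpow(g, secret, p);
}

/* Shared key: the peer's public value raised to our own secret. */
unsigned long long dh_shared(unsigned long long peer_public,
                             unsigned long long secret,
                             unsigned long long p)
{
    return modpow(peer_public, secret, p);
}
```

With p = 23 and g = 5, secrets 6 and 15 give public values 8 and 19, and both sides arrive at the shared key 2. Nothing in the exchange tells either side who sent the public value, which is exactly the gap a man-in-the-middle exploits.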

LUC

LUC is a public key encryption system. It uses Lucas functions instead of exponentiation. Its inventor Peter Smith has since then implemented four other algorithms with Lucas functions: LUCDIF, a key negotiation method like Diffie-Hellman; LUCELG PK, equivalent to El Gamal public-key encryption; LUCELG DS, equivalent to El Gamal digital signature; and LUCDSA, equivalent to the US Digital Signature Standard. LUC Encryption Technology Ltd (LUCENT) has obtained patents for cryptographic use of Lucas functions in United States and New Zealand.

Secret Key Algorithms (Symmetric Ciphers)

Secret key algorithms use the same key for both encryption and decryption (or one is easily derivable from the other).

DES

DES is an algorithm developed in the 1970s. It was made a standard by the US government, and has also been adopted by several other governments worldwide. It is widely used, especially in the financial industry.

DES is a block cipher with 64-bit block size. It uses 56-bit keys. This makes it fairly easy to break with modern computers or special-purpose hardware. DES is still strong enough to keep most random hackers and individuals out, but it is easily breakable with special hardware by government, criminal organizations, or major corporations. In large volumes, the cost of breaking DES keys is on the order of tens of dollars. DES is getting too weak, and should not be used in new designs.

A variant of DES, Triple-DES or 3DES, is based on using DES three times (normally in an encrypt-decrypt-encrypt sequence with three different, unrelated keys). Many people consider Triple-DES to be much safer than plain DES.

DES processes plaintext blocks of n=64 bits, producing 64 bit ciphertext blocks. The size of the secret key K is 56 bits, specified as 64 bits, 8 of which are used as parity bits. There is a belief that the parity bits were introduced to weaken DES, reducing the exhaustive key search by a factor of 256.

Encryption proceeds in 16 stages (rounds). For each round, a 48 bit sub-key \( K_{i} \) is generated from the input key K. Within each round, 8 fixed 6-to-4 bit substitution mappings (the S-boxes \( S_{i} \), collectively S) are used. The 64 bit plaintext is divided into 32 bit halves, \( L_{0} \) and \( R_{0} \). Each round takes the 32 bit inputs from the previous round and produces 32 bit outputs as follows:


\begin{eqnarray*}L_{i} & = & R_{i-1}\\
R_{i} & = & L_{i-1}\oplus f(R_{i-1},K_{i}),
\end{eqnarray*}


where

\( f(R_{i-1},K_{i})=P(S(E(R_{i-1})\oplus K_{i})) \)

Here E is a fixed expansion permutation mapping \( R_{i-1} \) from 32 to 48 bits. P is a fixed permutation on 32 bits. The operator \( \oplus \) represents exclusive or. An initial bit permutation precedes the first round; following the last round the left and right halves are exchanged and the resulting string is bit-permuted by the inverse of the initial permutation.

Decryption involves the same key and algorithm, but with sub-keys applied to the internal rounds in the reverse order.
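The reason the same algorithm decrypts when the sub-keys are reversed is a property of the round structure, not of the particular function f. The sketch below shows this with a generic 16-round network on 32 bit halves. Note the assumptions: toy_f is an arbitrary stand-in and is not the DES function, the initial and final bit permutations are left out, and all names are made up for illustration.

```c
#include <stdint.h>

#define ROUNDS 16

/* Toy round function standing in for DES's f = P(S(E(R) xor K)).
 * It is NOT the DES function; any function works in this structure. */
static uint32_t toy_f(uint32_t r, uint32_t k)
{
    uint32_t x = (r ^ k) * 2654435761u;   /* arbitrary mixing */

    return x ^ (x >> 15);
}

/* L_i = R_{i-1}, R_i = L_{i-1} xor f(R_{i-1}, K_i) for 16 rounds,
 * then the halves are exchanged, as the text describes. */
static void feistel(uint32_t *left, uint32_t *right,
                    const uint32_t subkeys[ROUNDS])
{
    uint32_t l = *left, r = *right;
    int i;

    for (i = 0; i < ROUNDS; i++) {
        uint32_t next_r = l ^ toy_f(r, subkeys[i]);

        l = r;
        r = next_r;
    }
    *left = r;    /* final exchange of the halves */
    *right = l;
}

void feistel_encrypt(uint32_t *l, uint32_t *r, const uint32_t k[ROUNDS])
{
    feistel(l, r, k);
}

/* Same network, sub-keys applied in the reverse order. */
void feistel_decrypt(uint32_t *l, uint32_t *r, const uint32_t k[ROUNDS])
{
    uint32_t reversed[ROUNDS];
    int i;

    for (i = 0; i < ROUNDS; i++)
        reversed[i] = k[ROUNDS - 1 - i];
    feistel(l, r, reversed);
}
```

Because each round only XORs one half with a value computed from the other half, running the rounds again with the keys reversed cancels every XOR in turn, even though toy_f itself is never inverted.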

IDEA

IDEA (International Data Encryption Algorithm) is an algorithm developed at ETH Zurich in Switzerland. It uses a 128 bit key, and it is generally considered to be very secure. It is currently one of the best publicly known algorithms. It is a fairly new algorithm, but it has already been around for several years, and no practical attacks on it have been published despite numerous attempts to analyze it.

IDEA is patented in the United States and in most of the European countries. The patent is held by Ascom-Tech. Non-commercial use of IDEA is free. Commercial licenses can be obtained by contacting idea@ascom.ch.

IDEA is a block cipher, which uses a 128-bit key to encrypt 64-bit blocks of data. The scheme uses 52 16-bit sub-keys. They are generated from the 128-bit key (the 'main' key) as follows:

The second step is repeated, until we have 52 sub-keys.
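The individual steps are not reproduced above. As a sketch, assuming the key schedule of the published IDEA description (take the key as eight 16-bit sub-keys, rotate the whole 128-bit key left by 25 bits, and repeat), the 52 sub-keys could be generated like this. The function name and the array layout are illustrative choices.

```c
#include <stdint.h>

#define IDEA_SUBKEYS 52

/* Expand a 128-bit key, given as eight 16-bit words (most significant
 * first), into the encryption sub-keys k[0..51] (k1..k52 in the text). */
void idea_expand_key(const uint16_t key[8], uint16_t k[IDEA_SUBKEYS])
{
    uint16_t cur[8], next[8];
    int i, produced = 0;

    for (i = 0; i < 8; i++)
        cur[i] = key[i];

    while (produced < IDEA_SUBKEYS) {
        /* take the current key as (up to) eight more sub-keys */
        for (i = 0; i < 8 && produced < IDEA_SUBKEYS; i++)
            k[produced++] = cur[i];

        /* rotate the 128-bit key left by 25 bits = one word + 9 bits */
        for (i = 0; i < 8; i++)
            next[i] = (uint16_t)((cur[(i + 1) % 8] << 9) |
                                 (cur[(i + 2) % 8] >> 7));
        for (i = 0; i < 8; i++)
            cur[i] = next[i];
    }
}
```

Six full batches of eight plus four words from a seventh batch give the 52 sub-keys.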

The 64-bit block of data (original data) is split into 4 16-bit segments. We'll call them s1,s2,s3 and s4. The sub-keys are k1, k2 ... k52.

The encryption is made in 8 rounds. Each round is made of the following steps :

Note: This process involves multiplication modulo \( 2^{16}+1 \) and addition modulo \( 2^{16} \). For the multiplication steps, a value of all zeros (such as an all-zero sub-key) is defined as being equal to \( 2^{16} \). The operation \( \oplus \) represents exclusive or.
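The two arithmetic operations and one round can be sketched as below. The round body assumes the 14-step round of the published IDEA description, with intermediate results named d1 to d14 to match the names used in this text; the function names are made up, and the code has not been checked against an official implementation.

```c
#include <stdint.h>

/* Multiplication modulo 2^16 + 1, where the 16-bit value 0 stands for 2^16. */
uint16_t idea_mul(uint16_t a, uint16_t b)
{
    uint32_t x = a ? a : 0x10000u;
    uint32_t y = b ? b : 0x10000u;
    uint64_t p = ((uint64_t)x * y) % 0x10001u;

    return (uint16_t)p;           /* 2^16 itself truncates back to 0 */
}

/* Addition modulo 2^16. */
uint16_t idea_add(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* One round on segments s[0..3] (s1..s4) with six sub-keys k[0..5].
 * out[] receives d11, d13, d12, d14: the input order of the next round. */
void idea_round(const uint16_t s[4], const uint16_t k[6], uint16_t out[4])
{
    uint16_t d1 = idea_mul(s[0], k[0]);
    uint16_t d2 = idea_add(s[1], k[1]);
    uint16_t d3 = idea_add(s[2], k[2]);
    uint16_t d4 = idea_mul(s[3], k[3]);
    uint16_t d5 = d1 ^ d3;
    uint16_t d6 = d2 ^ d4;
    uint16_t d7 = idea_mul(d5, k[4]);
    uint16_t d8 = idea_add(d6, d7);
    uint16_t d9 = idea_mul(d8, k[5]);
    uint16_t d10 = idea_add(d7, d9);

    out[0] = d1 ^ d9;    /* d11 */
    out[1] = d2 ^ d10;   /* d13 */
    out[2] = d3 ^ d9;    /* d12 */
    out[3] = d4 ^ d10;   /* d14 */
}
```

Mixing three incompatible operations (multiplication modulo a prime, addition modulo a power of two, and exclusive or) is what gives the round its strength, since no simple algebraic rule relates all three.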

After these steps, the blocks d11, d13, d12, d14 (in that order!) are used as input to the next round, with the next 6 keys. After the 8 rounds are over, we get 4 blocks : b1,b2,b3,b4. Then we perform :

The final blocks (c1 - c4) form the encrypted 64-bit block.

To decrypt the data, we use the same steps, but with a different set of sub-keys, derived from the encryption sub-keys as follows.

Final transformation uses k1* k2# k3# k4*

Explanation: k* is the multiplicative inverse of k modulo \( 2^{16}+1 \), and k# is the additive inverse of k modulo \( 2^{16} \).
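Both inverses are cheap to compute. The sketch below uses the fact that 2^16 + 1 = 65537 is prime, so the multiplicative inverse can be obtained as k^(p-2) mod p; this is one of several possible methods and not necessarily the one used by any particular IDEA implementation. The function names are illustrative.

```c
#include <stdint.h>

/* k#: additive inverse modulo 2^16. */
uint16_t idea_add_inv(uint16_t k)
{
    return (uint16_t)(0x10000u - k);
}

/* k*: multiplicative inverse modulo the prime 2^16 + 1, with the
 * 16-bit value 0 standing for 2^16.  Computed as k^(p-2) mod p. */
uint16_t idea_mul_inv(uint16_t k)
{
    const uint64_t p = 0x10001u;
    uint64_t base = k ? k : 0x10000u;
    uint64_t result = 1;
    uint32_t exp = 0x10001u - 2;

    while (exp > 0) {
        if (exp & 1)
            result = (result * base) % p;
        base = (base * base) % p;
        exp >>= 1;
    }
    return (uint16_t)result;      /* 2^16 truncates back to 0 */
}
```

For example, the inverse of 3 is 21846, since 3 x 21846 = 65538 = 65537 + 1, and the all-zero value (standing for 2^16) is its own inverse.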

RC4

RC4 is a cipher designed by RSA Data Security, Inc. It used to be a trade secret, until someone posted source code for an algorithm in Usenet News, claiming it to be equivalent to RC4. There is very strong evidence that the posted algorithm is indeed equivalent to RC4. The algorithm is very fast. Its security is unknown, but breaking it does not seem trivial either. Because of its speed, it may have uses in certain applications. It can also accept keys of arbitrary length. RC4 is essentially a pseudo random number generator, and the output of the generator is XORed with the data stream. For this reason, it is very important that the same RC4 key never be used to encrypt two different data streams.
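The publicly circulated algorithm is compact enough to show in full. The sketch below follows that widely published description (a key-dependent shuffle of a 256-byte table followed by a keystream generator); whether it is exactly RSA Data Security's RC4 rests on the evidence mentioned above. The type and function names are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t s[256];
    uint8_t i, j;
} rc4_state;

/* Key setup: shuffle the 256-byte state under control of the key. */
void rc4_init(rc4_state *st, const uint8_t *key, size_t key_len)
{
    size_t i;
    uint8_t j = 0, tmp;

    for (i = 0; i < 256; i++)
        st->s[i] = (uint8_t)i;
    for (i = 0; i < 256; i++) {
        j = (uint8_t)(j + st->s[i] + key[i % key_len]);
        tmp = st->s[i]; st->s[i] = st->s[j]; st->s[j] = tmp;
    }
    st->i = 0;
    st->j = 0;
}

/* XOR the keystream into buf; the same call encrypts and decrypts. */
void rc4_crypt(rc4_state *st, uint8_t *buf, size_t len)
{
    size_t n;
    uint8_t tmp;

    for (n = 0; n < len; n++) {
        st->i = (uint8_t)(st->i + 1);
        st->j = (uint8_t)(st->j + st->s[st->i]);
        tmp = st->s[st->i]; st->s[st->i] = st->s[st->j]; st->s[st->j] = tmp;
        buf[n] ^= st->s[(uint8_t)(st->s[st->i] + st->s[st->j])];
    }
}
```

Since the data is only XORed with the generator output, two streams encrypted under the same key have the same keystream, and XORing the two ciphertexts cancels it, leaving the XOR of the two plaintexts. That is why a key must never be reused.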

Source code and information about RC4 can be found in many cryptographic libraries, e.g. SSLeay, Crypto++, and the Ssh source code.

The United States government routinely approves RC4 with 40 bit keys for export. Keys that are this small can be easily broken by governments, criminals, and amateurs.

It is interesting to know that the exportable version of SSL (Netscape's Secure Socket Layer), which uses RC4-40, was recently broken by at least two independent groups. Breaking it took about eight days; in many major universities (or companies) the corresponding amount of computing power is available to any computer science major. More information about the incident can be found on Damien Doligez's SSL cracking page.

Skipjack

Skipjack is the encryption algorithm contained in the Clipper chip; it was designed by the NSA. It uses an 80-bit key to encrypt 64-bit blocks of data; the same key is used for the decryption. Skipjack can be used in the same modes as DES (see Block Cipher Modes below), and may be more secure than DES, since it uses 80-bit keys and scrambles the data for 32 steps, or "rounds"; by contrast, DES uses 56-bit keys and scrambles the data for only 16 rounds.

The details of Skipjack are classified. The decision not to make the details of the algorithm publicly available has been widely criticized. Many people are suspicious that Skipjack is not secure, either due to oversight by its designers, or by the deliberate introduction of a secret trapdoor. By contrast, there have been many attempts to find weaknesses in DES over the years, since its details are public. These numerous attempts (and the fact that they have failed) have made people confident in the security of DES. Since Skipjack is not public, the same scrutiny cannot be applied towards it, and thus a corresponding level of confidence may not arise.

Clipper is an encryption chip developed and sponsored by the U.S. government as part of the Capstone project. Announced by the White House in April, 1993, Clipper was designed to balance the competing concerns of federal law-enforcement agencies with those of private citizens and industry. The law-enforcement agencies wish to have access to the communications of suspected criminals, for example by wire-tapping; these needs are threatened by secure cryptography. Industry and individual citizens, however, want secure communications, and look to cryptography to provide it.

Clipper technology attempts to balance these needs by using escrowed keys. The idea is that communications would be encrypted with a secure algorithm, but the keys would be kept by one or more third parties (the "escrow agencies"), and made available to law-enforcement agencies when authorized by a court-issued warrant. Thus, for example, personal communications would be impervious to recreational eavesdroppers, and commercial communications would be impervious to industrial espionage, and yet the FBI could listen in on suspected terrorists or gangsters.

Each chip also contains a unique 80-bit unit key U, which is escrowed in two parts at two escrow agencies; both parts must be known in order to recover the key. Also present is a serial number and an 80-bit "family key" F; the latter is common to all Clipper chips. The chip is manufactured so that it cannot be reverse engineered; this means that the Skipjack algorithm and the keys cannot be read off the chip.

When two devices wish to communicate, they first agree on an 80-bit "session key" K. The method by which they choose this key is left up to the implementor's discretion; a public-key method such as RSA or Diffie-Hellman seems a likely choice. The message is encrypted with the key K and sent; note that the key K is not escrowed. In addition to the encrypted message, another piece of data, called the law-enforcement access field (LEAF), is created and sent. It includes the session key K encrypted with the unit key U, then concatenated with the serial number of the sender and an authentication string, and then, finally, all encrypted with the family key. The exact details of the law-enforcement field are classified.

The receiver decrypts the law-enforcement field, checks the authentication string, and decrypts the message with the key K.

Enigma

Enigma was the cipher used by the Germans in World War II. It is trivial to solve with modern computers; see the Crypt Breaker's Workbench tool. A simplified cipher of the same rotor-machine family is used by the Unix crypt(1) program, which should thus not be used.

Block Cipher Modes

Many commonly used ciphers (e.g., IDEA, DES, BLOWFISH) are block ciphers. This means that they take a fixed-size block of data (usually 64 bits) and transform it into another 64-bit block using a function selected by the key. For each key, the cipher defines a one-to-one mapping (a permutation) on the set of 64-bit values.

If the same block is encrypted twice with the same key, the resulting ciphertext blocks are the same (this method of encryption is called Electronic Code Book mode, or ECB). This information could be useful for an attacker.

In practical applications, it is desirable to make identical plaintext blocks encrypt to different ciphertext blocks. Two methods are commonly used for this:

The previous ciphertext block is usually stored in an Initialization Vector (IV). An initialization vector of zero is commonly used for the first block, though other arrangements are also in use.
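The difference between ECB and a chained mode can be sketched in a few lines. This is only an illustration under stated assumptions: the "cipher" is a toy four-round Feistel network that we build from SHA-256 (it is not DES, IDEA or Blowfish and is not secure), the chaining shown is cipher block chaining (CBC), which is one common arrangement, and all function names are our own.

```python
import hashlib

BLOCK = 8  # 64-bit blocks, as in the text


def _round(key: bytes, i: int, half: bytes) -> bytes:
    # Toy round function: 4 bytes derived from the key, round number and input half.
    return hashlib.sha256(key + bytes([i]) + half).digest()[:4]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Four-round Feistel network: a keyed one-to-one mapping on 8-byte blocks.
    left, right = block[:4], block[4:]
    for i in range(4):
        f = _round(key, i, right)
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def ecb_encrypt(key: bytes, data: bytes) -> list:
    # ECB: every block is encrypted independently.
    return [toy_encrypt_block(key, data[i:i + BLOCK])
            for i in range(0, len(data), BLOCK)]


def cbc_encrypt(key: bytes, data: bytes, iv: bytes = bytes(BLOCK)) -> list:
    # CBC: each plaintext block is XORed with the previous ciphertext block
    # (the all-zero IV stands in for it on the first block).
    out, prev = [], iv
    for i in range(0, len(data), BLOCK):
        mixed = bytes(a ^ b for a, b in zip(data[i:i + BLOCK], prev))
        prev = toy_encrypt_block(key, mixed)
        out.append(prev)
    return out


key = b"demo key"
plaintext = b"SAMEBLOK" * 2          # two identical 8-byte plaintext blocks
ecb = ecb_encrypt(key, plaintext)
cbc = cbc_encrypt(key, plaintext)
print("ECB blocks equal:", ecb[0] == ecb[1])
print("CBC blocks equal:", cbc[0] == cbc[1])
```

With ECB the repeated plaintext block shows up as a repeated ciphertext block; with chaining it does not, because the second block is mixed with the first ciphertext block before encryption.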

Cryptographic Hash Functions

MD5

MD5 (Message Digest Algorithm 5) is a secure hash algorithm developed at RSA Data Security, Inc. It can be used to hash an arbitrary-length byte string into a 128-bit value. MD5 is in wide use, and is considered reasonably secure.

However, some people have reported potential weaknesses in it, and "keyed MD5" (typically used for authentication: the two parties share a secret, and an authentication value is computed by hashing first the secret, as a key, and then the data to be authenticated) has been reported to be broken. It is also reported that one could build a special-purpose machine costing a few million dollars to find a plaintext matching a given hash value in a few weeks.
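As a small illustration, Python's standard hashlib module exposes MD5. The sketch below shows the 128-bit output size and the secret-then-data "keyed MD5" construction described above; the function name keyed_md5 and the sample secret are our own, and the construction is shown only to make the description concrete, not as a recommendation.

```python
import hashlib

# Any byte string, of any length, hashes to a 16-byte (128-bit) value.
digest = hashlib.md5(b"an arbitrary length byte string").digest()
print(len(digest) * 8, "bits")


def keyed_md5(secret: bytes, data: bytes) -> bytes:
    # "Keyed MD5" as described in the text: hash the shared secret first,
    # then the data. The text reports this construction as broken.
    return hashlib.md5(secret + data).digest()


tag = keyed_md5(b"shared secret", b"message to authenticate")
print(tag.hex())
```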

Random Number Generators

The generation of random numbers is critical to cryptographic systems. Symmetric ciphers such as DES, RC2, and RC5 all require a randomly selected encryption key. Public-key algorithms - like RSA, Diffie-Hellman, and DSA - begin with randomly generated values when generating prime numbers. At a higher level, SSL and other cryptographic protocols use random challenges in the authentication process to foil replay attacks.

But truly random numbers are difficult to come by in software-only solutions, where electrical noise and sources of hardware randomness are not available (or at least not convenient). This poses a challenge for software developers implementing cryptography. There are methods, however, for generating sufficiently random sequences in software that can provide an adequate level of security.

Random versus Pseudo-Random Numbers

What is a truly random number? The definition can get a bit philosophical. Knuth speaks of a sequence of independent random numbers with a specified distribution, each number being obtained by chance and not influenced by the other numbers in the sequence. Rolling a die would give such results. But computers are logical and deterministic by nature, and fulfilling Knuth's requirements is not something they were designed to do. So-called random number generators on computers actually produce pseudo-random numbers. Pseudo-random numbers are numbers generated in a deterministic way, which only appear to be random.

Most programming languages include a pseudo-random number generator, or "PRNG." This PRNG may produce a sequence adequate for a computerized version of blackjack, but it is probably not good enough to be used for cryptography. The reason is that someone knowledgeable in cryptanalysis might notice patterns and correlations in the numbers that get generated. Depending on the quality of the PRNG, one of two things may happen. If the PRNG has a short period, and repeats itself after a relatively short number of bits, the number of possibilities the attacker will need to try in order to deduce keys will be significantly reduced. Even worse, if the distribution of ones and zeros has a noticeable pattern, the attacker may be able to predict the sequence of numbers, thus limiting the possible number of resulting keys. An attacker may know that a PRNG will never produce 10 binary ones in a row, for example, and not bother searching for keys that contain that sequence.

A PRNG must have a high degree of entropy, which is a measurement of randomness. Cryptographers use the word entropy a lot, so it is worth knowing. In a system that produces the same output each time, each bit is fixed, so there is no uncertainty, or zero entropy per bit. If every output is equally likely (i.e. truly random) then there is total uncertainty, or one bit's worth of entropy in each bit. A true random number generator, like a hardware device, will have maximum entropy. A good PRNG will have a high degree of entropy, making the output unguessable, which is the goal.
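The "entropy per bit" idea can be made concrete with Shannon's formula for a single bit. This is a sketch of the standard definition, not something taken from any particular toolkit; the function name is ours.

```python
import math


def entropy_per_bit(p_one: float) -> float:
    """Shannon entropy, in bits, of one bit that is 1 with probability p_one."""
    if p_one in (0.0, 1.0):
        return 0.0  # the bit is fixed, so there is no uncertainty
    p_zero = 1.0 - p_one
    return -(p_one * math.log2(p_one) + p_zero * math.log2(p_zero))


print(entropy_per_bit(1.0))   # fixed output: zero entropy
print(entropy_per_bit(0.5))   # both values equally likely: one full bit
print(entropy_per_bit(0.9))   # biased bit: somewhere in between
```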

The first step in producing good random numbers in software, then, is to use a good PRNG. Important to note is that although the PRNG may produce statistically good looking output, it also has to withstand analysis to be considered strong. Since the one included with your compiler or operating system may or may not be, we recommend you don't use it. Instead, use a PRNG that has been verified to have a high degree of randomness. RSA's BSAFE toolkit uses the MD5 message digest function as a random number generator. BSAFE uses a counter that is digested with MD5. The strength of this approach relies on MD5 being a one-way function - from the random output bytes it is difficult to determine the counter, and hence the other output bytes remain secure. Similar generators can be constructed with other hash functions, such as SHA1.
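A counter-plus-digest generator of the kind described above might look roughly like this. It is our own sketch, not BSAFE's code or API: the class name, the way the seed and counter are concatenated, and the 8-byte counter width are all assumptions made for illustration.

```python
import hashlib


class CounterDigestPRNG:
    """Hypothetical generator: output block i is MD5(seed || counter_i)."""

    def __init__(self, seed: bytes):
        self._seed = seed
        self._counter = 0

    def random_bytes(self, n: int) -> bytes:
        out = b""
        while len(out) < n:
            counter_bytes = self._counter.to_bytes(8, "big")
            out += hashlib.md5(self._seed + counter_bytes).digest()
            self._counter += 1
        # Leftover bytes of the last block are simply discarded.
        return out[:n]


gen = CounterDigestPRNG(b"some unguessable seed material")
print(gen.random_bytes(20).hex())
```

Because the digest is one-way, seeing some output blocks should not reveal the seed or counter, which is the property the text relies on.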

The Seed

The second step in producing good random numbers is providing a random seed. A good PRNG like BSAFE's will produce a sequence that is sufficiently random for cryptographic operations, with one catch: it needs to be properly initialized, or "seeded". Using a bad seed (or no seed at all) is a common flaw in poorly implemented cryptographic systems. A PRNG will always generate the same output if started with the same seed. If you are using MD5 with the time of day as the seed, for example, an attacker has a high likelihood of re-creating your sequence of pseudo-random bytes by guessing the exact seeding time. Once he has the pseudo-random bytes, he can re-create your keys. The security issue becomes one of making sure an attacker cannot determine your seed.
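To see why a time-of-day seed is dangerous, consider this sketch. The generator, the seed value and the one-hour search window are all hypothetical; the point is only that an attacker who knows roughly when the seed was taken has very few candidates to try.

```python
import hashlib


def first_output(seed_time: int, n: int = 16) -> bytes:
    # Hypothetical generator: first n output bytes are MD5 of the seeding time.
    return hashlib.md5(seed_time.to_bytes(8, "big")).digest()[:n]


def recover_seed(observed: bytes, earliest: int, latest: int):
    # Brute force: try every second in the window until the output matches.
    for guess in range(earliest, latest + 1):
        if first_output(guess, len(observed)) == observed:
            return guess
    return None


victim_seed = 865527741            # hypothetical seeding time, in seconds
observed = first_output(victim_seed)

# The attacker only knows the seed was taken within an hour of some moment.
found = recover_seed(observed, victim_seed - 3600, victim_seed + 3600)
print("seed recovered:", found == victim_seed)
```

The search covers only 7201 candidates, so it finishes almost instantly; once the seed is known, every later output of the generator is known too.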

You may be wondering why use a random number generator to generate random bytes, if to use it, you need to first generate random bytes. Seeding is a bootstrap operation. Once done, generating subsequent keys will be more efficient. Another important point is that the information collected for the seed does not need to be truly random, but unguessable and unpredictable. Once the seed is fed into MD5, the output becomes pseudo-random. If attackers cannot guess or predict seeds, they will be unable to predict the output.

There are two aspects to a random seed: quantity and quality. They are related. The quality of a random seed refers to its entropy. Since the quality may vary, it is a good idea to account for this with quantity. Sufficient quantity makes it impractical for an attacker to exhaustively try all likely seed values.

In general, collect as much external random information as possible. Using a composite of many items makes the attacker's task more difficult. In an application where several keys will be generated, it may make sense to collect enough seed bytes for multiple keys, even before the first is generated. Be careful of information that moves across a network that could be intercepted by a dedicated attacker. Mouse movements on X-terminals, for example, may be available to anyone listening on the wire.

Now we get to the issue of quantity. A developer cannot assume that all of the bits collected are truly random, so a useful rule of thumb is to assume that for every byte of data collected at random, there is one bit of entropy. This is a bit conservative, but cryptographers are conservative by nature. To illustrate this rule of thumb, take the example of user keystrokes, which many consider to be a good source of randomness. Assuming ASCII keystrokes, bit 7 will always be zero. Many of the letters can be predicted: they will probably all be lowercase, and will often alternate between left and right hand. Analysis has shown that there is only one bit per byte of entropy per keystroke.

To guard against this kind of analysis, the idea is to collect one byte of seed for each bit required. This information will be fed into the PRNG to produce the first random output.

As an example, if the seed will be used to produce a random symmetric encryption key, the number of random bytes in the seed should at least equal the number of effective bits in the key. In the case of DES, this would be 56 random bits culled from a seed pool of 56 bytes. Any less and the number of possible starting keys is reduced from 2^56 to something smaller, reducing the amount of effort required by an attacker in searching the seed space by brute force. Attacks like this have recently been widely publicized on the Internet and in the press. For public-key algorithms, the goal is to make searching for the seed at least as difficult as the hard mathematical problem at their core. This will discourage attackers from searching for seeds instead of attacking problems like factoring composite numbers and calculating discrete logarithms. A seed of 128 bits (taken from a seed pool of 128 bytes) should be more than enough for the modulus sizes being used today.

One last thing that should be mentioned is updating the seed, or "re-seeding." It makes sense to allow an application to add seed bits as they become available. User events often provide additional sources of randomness, but obviously have not taken place when an application starts. These should be included as they occur. Re-seeding also frustrates attackers trying to find the seed state using a brute force attack. Since the seed will change, say, every thirty seconds, the seed state becomes a moving target and makes the brute force attack infeasible. The idea is to take the existing seed and mix it together with the new information as it becomes available.
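The one-bit-per-byte rule of thumb and the re-seeding idea can be combined in a small seed pool. This is a sketch under our own assumptions: the class and method names are invented, MD5 is used as the mixing function only because the text uses it, and the entropy estimate is nothing more than the rule of thumb above.

```python
import hashlib


class SeedPool:
    """Hypothetical seed pool: mixes in events and counts one bit per byte."""

    def __init__(self):
        self._state = bytes(16)
        self.bytes_collected = 0

    def add(self, material: bytes) -> None:
        # Re-seeding: mix the existing state together with the new material.
        self._state = hashlib.md5(self._state + material).digest()
        self.bytes_collected += len(material)

    def estimated_entropy_bits(self) -> int:
        # Conservative rule of thumb: one bit of entropy per collected byte.
        return self.bytes_collected

    def ready_for(self, key_bits: int) -> bool:
        return self.estimated_entropy_bits() >= key_bits

    def seed(self) -> bytes:
        return self._state


pool = SeedPool()
pool.add(b"x" * 55)                   # stand-in for 55 bytes of user events
print(pool.ready_for(56))             # not yet enough for a DES key
pool.add(b"y")                        # one more byte arrives later
print(pool.ready_for(56))
```

Note that with a 128-bit digest as the state, the pool can never hold more than 128 bits of entropy, however much material is added.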

Security in WWW

The Hypertext Transfer Protocol - HTTP/1.0 draft, proposed by the Internet Engineering Task Force or IETF HTTP Working Group, gave initial suggestions as to the possible security threats involved in HTTP.

Among the threats they have mentioned are:

1. Client/session authentication. The basic authentication scheme used by HTTP/1.0 does not provide a secure method of user authentication.
2. Idempotent Methods. Writers of client software should make sure that any actions taken by their software are safe and otherwise idempotent. The actual user of the software should be completely aware of any actions that may be taken by the software they are running.
3. Abuse of Server Log Information. Servers are in a position to collect data about the information requested by clients. This information is confidential in nature, and its handling may be restricted by law. Server providers should make sure that logging information is not distributed.

More threats and problems are:

As we said, SSL, SHTTP and Shen are proposed encryption and user authentication standards for the Web. SSL is supported by Netscape and thus has the best chance of becoming a standard; we mention the other two because you can't predict what will happen on the Internet in a few months' time. None of them is yet the universal solution to the secure data transmission problem, and to use any of them you need the right combination of WWW browser and server.

Secure Servers/HTTP

SSL - Secure Sockets Layer

SSL is the scheme proposed by Netscape Communication Corporation, and was contributed for free use. Netscape has designed and specified a protocol for providing data security layered between application protocols (such as HTTP, Telnet, NNTP, or FTP ) and TCP/IP. This security protocol, called Secure Sockets Layer (SSL), provides data encryption, server authentication, message integrity, and optional client authentication for a TCP/IP connection.

SSL is currently implemented commercially in several different browsers, including Secure Mosaic and the two most popular, Netscape Navigator and Internet Explorer, and in many different servers, including ones from Netscape, Microsoft, IBM, Quarterdeck, OpenMarket and O'Reilly and Associates.

The main goal of the SSL Protocol is to provide privacy and reliability between two communicating applications. The SSL Record Protocol is used for encapsulation of various higher level protocols. One such encapsulated protocol, the SSL Handshake Protocol, allows the server and client to authenticate each other and to negotiate an encryption algorithm and cryptographic keys before the application protocol transmits or receives its first byte of data. One advantage of SSL is that it is application protocol independent. A higher level protocol can layer on top of the SSL Protocol transparently.

The three basic things the SSL protocol provides are:

The current SSL version is 3.0.

How does it work ?

SSL uses RSA public-key cryptography, which is widely used for authentication and encryption in the computer industry.

Public-key encryption is a technique that uses a pair of asymmetric keys for encryption and decryption. Each pair of keys consists of a public key and a private key. The public key is made public by distributing it widely. The private key is never distributed; it is always kept secret.

Data that is encrypted with the public key can be decrypted only with the private key. Conversely, data encrypted with the private key can be decrypted only with the public key.
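Both directions can be tried out with "textbook" RSA on tiny numbers. This is only a sketch: the primes 61 and 53 and the exponent 17 are a classroom-sized example of our choosing, real keys are hundreds of digits long, and real systems add padding that is omitted here. The modular inverse via pow() needs Python 3.8 or later.

```python
# Toy key generation with very small primes.
p, q = 61, 53
n = p * q                              # modulus, part of both keys
e = 17                                 # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent (Python 3.8+)

message = 65                           # a "message" is just a number below n

# Encrypt with the public key, decrypt with the private key.
ciphertext = pow(message, e, n)
recovered = pow(ciphertext, d, n)

# Conversely: transform with the private key, undo with the public key.
signed = pow(message, d, n)
checked = pow(signed, e, n)

print(recovered == message, checked == message)
```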

Public Key Cryptography Can Be Used For Authentication

Authentication means verifying the identity, and checking if someone is who he claims to be.

Here's an example of using public key cryptography for authentication:

The basic idea is this: Say Yogev wants to authenticate Ron. Ron has two keys, a public key and a private key. Ron gives Yogev his public key (more on that later), then Yogev generates a random message and sends it to Ron. Ron encrypts the message he got with his private key and sends the encrypted message back. Then all Yogev has to do is decrypt what Ron sent, using Ron's public key. If the decrypted message is identical to the message Yogev generated in the beginning, he knows it's really Ron: anyone else isn't supposed to know Ron's private key, and so wouldn't be able to encrypt the challenge message correctly.

That's the basic idea, but there are some twists in how it is actually done. It's not a very good idea to encrypt something with your private key and send it to someone else without knowing what you're encrypting. This is because someone can use the encrypted value against you (Since only you could have done the encryption with your private key).

So Ron, instead of encrypting the original message Yogev sent him, takes that message, constructs a message digest out of it, and encrypts the digest. What's a message digest, you may ask (unless you already know)? A message digest is a value derived from the original message that has to be:

1. Difficult to reverse - so someone trying to impersonate Ron couldn't get the original message back from the digest.
2. Hard to collide - it must be hard to find a different message that has the same digest value.

In this way Ron can protect himself. He computes a digest from the random message sent by Yogev, encrypts the result, and sends the encrypted digest back to him. Then Yogev can compute the same digest himself and authenticate Ron by decrypting Ron's reply with Ron's public key and comparing the two values.

What we have just described is called a digital signature. Ron has signed a message that Yogev generated, and that's about as dangerous as encrypting a random value originated by Yogev. So the protocol takes another twist: some (or all) of the data needs to be originated by Ron.

Yogev-> Ron : Hey, is that you Ron?

Ron-> Yogev : Yogev, it's me, Ron [ digest[Yogev, it's me, Ron] ] Ron's-private key

Now Yogev can easily verify that Ron is Ron, and Ron hasn't signed anything he didn't originate himself.
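The exchange above can be sketched with a toy signature scheme. Everything here is illustrative: the key pair is a classroom-sized RSA example (modulus 3233), the digest is MD5 reduced modulo that tiny modulus, and the function names are ours. A modulus this small makes digest collisions easy, so the sketch shows only the shape of the protocol.

```python
import hashlib

# Ron's toy key pair: (N, E) is public, D is private.
N, E, D = 3233, 17, 2753


def digest(message: bytes) -> int:
    # Message digest squeezed into the toy modulus (far too small for real use).
    return int.from_bytes(hashlib.md5(message).digest(), "big") % N


def sign(message: bytes, private_exp: int = D) -> int:
    # Ron encrypts the digest of a message HE wrote with his private key.
    return pow(digest(message), private_exp, N)


def verify(message: bytes, signature: int) -> bool:
    # Yogev decrypts with Ron's public key and compares with his own digest.
    return pow(signature, E, N) == digest(message)


reply = b"Yogev, it's me, Ron"
signature = sign(reply)

print(verify(reply, signature))                 # genuine reply
print(verify(reply, (signature + 1) % N))       # tampered signature
```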

Is it really safe?

The version of SSL that is exportable from the United States is restricted to 40-bit keys (outside the export version, SSL can also use 128-bit keys), which means they can be broken by anyone with access to a reasonable amount of computing power (for example, a student who studies Computer Science at Rhodes University). The breaking can be done by brute force (which means, in simple words, trying all of the combinations...). Come to think of it, a Pentium PC can crack a 40-bit key in a matter of a month, more or less. If a criminal buys one, he can break 12 keys a year for 407 NIS per key...

SHTTP

SHTTP (Secure HTTP, also written S-HTTP) is the scheme designed by Enterprise Integration Technologies (EIT). It is a higher level protocol that only works with the HTTP protocol, but is potentially more extensible than SSL. S-HTTP is backwards compatible with HTTP. It is designed to incorporate different cryptographic message formats into WWW browsers and servers. This will include PEM, PGP, and PKCS-7. Non-S-HTTP browsers and servers should be able to communicate with S-HTTP ones without a discernible difference, unless they request protected documents.

SHTTP provides a wide variety of mechanisms to provide for confidentiality, authentication, and integrity. SHTTP is not tied to any particular cryptographic system, key infrastructure, or cryptographic format.

Shen

Shen is a security scheme proposed by CERN. Shen provides for three separate security related mechanisms:

1. Weak Authentication with low maintenance overhead and without patent or export restrictions.
2. Strong Authentication via public key exchange.
3. Strong Encryption of message content.

Firewalls and WWW proxies

A firewall is any one of several ways of protecting one network from another untrusted network. The actual mechanism whereby this is accomplished varies widely, but in principle, the firewall can be thought of as a pair of mechanisms: one which exists to block traffic, and the other which exists to permit traffic. Some firewalls place a greater emphasis on blocking traffic, while others emphasize permitting traffic.

You can use a firewall to enhance your site's security in a number of ways. The most straightforward use of a firewall is to create an "internal site", that is accessible only to computers within your own LAN. For that, you just need to place the server inside the firewall.

However, chances are you'd like to make your server available to the rest of the world, which means you'll have to put it outside the firewall. The safest way to do so is to put it completely outside of the LAN:

This is called a "sacrificial lamb" configuration. The server is at risk of being broken into, but at least when it's broken into it doesn't breach the security of the inner network.

In order to connect from the LAN to the outside world, a proxy is often installed on the Firewall machine. A proxy is a small program that can see both sides of the firewall. Requests for information from the Web server are intercepted by the proxy, forwarded to the server machine, and the response forwarded back to the requester. A proxy server mediates traffic between a protected network and the Internet. Many proxies contain extra logging or support for user authentication. Since proxies must "understand" the application protocol being used, they can also implement protocol specific security (e.g., an FTP proxy might be configurable to permit incoming FTP and block outgoing FTP).

Another way of contacting the outside world from behind a firewall is allowing the firewall to pass requests for port 80 that are bound to or returning from the WWW server machine. This has the effect of poking a small hole in the dike through which the rest of the world can send and receive requests to the WWW server machine.
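The "small hole in the dike" can be pictured as a simple packet-filter rule. This is a conceptual sketch, not the configuration syntax of any real firewall product: the Packet type, the function name and the server address are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


WWW_SERVER = "192.0.2.10"   # hypothetical address of the WWW server machine
HTTP_PORT = 80


def firewall_allows(pkt: Packet) -> bool:
    # Let through requests bound for port 80 on the WWW server...
    inbound_request = pkt.dst_ip == WWW_SERVER and pkt.dst_port == HTTP_PORT
    # ...and the replies returning from it. Everything else is blocked.
    outbound_reply = pkt.src_ip == WWW_SERVER and pkt.src_port == HTTP_PORT
    return inbound_request or outbound_reply


print(firewall_allows(Packet("198.51.100.7", WWW_SERVER, 40000, 80)))   # web request
print(firewall_allows(Packet("198.51.100.7", WWW_SERVER, 40000, 23)))   # telnet attempt
```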

Privacy in the Internet

Every time we use the Internet, either for surfing the World Wide Web or sending E-Mails, we leave our traces behind. These traces can be analyzed, and a lot of information can be taken from them.

How your personal information gets collected

Whenever you connect to a web server, view a web site, or send an E-Mail, the servers along the way log these activities. The information can be collected in two ways :

What information can be revealed about you

A web site with the right equipment can know a great deal. The information that can be revealed includes your E-mail address, your IP address, the files you viewed, and the pages you visited.

When you send E-mail, you're actually getting quite exposed. You can encrypt the message body, but you can't hide the headers, if you want the message to travel through the net. Let's look at an example E-Mail header, taken from a Netscape Mailer :

Return-Path: <yogevm@math.tau.ac.il>
Received: from bfmail4 ([206.156.198.174]) by e4000.artaxia.com (8.8.5/8.8.5) with
  SMTP id TAA08477 for <mertero@artaxia.com>; Thu, 5 Jun 1997 19:28:01 -0200 (GMT)
Received: from taurus.math.tau.ac.il (132.67.64.4) by bfmail4.bigfoot.com with SMTP
  ( Bigfoot SMTP Server May 8 1997 15:22:04 ); Thu, 05 Jun 1997 12:25:22 -400
  (Eastern Standard Time)
Received: from lune.math.tau.ac.il (yogevm@lune.math.tau.ac.il [132.67.96.11])
  by taurus.math.tau.ac.il (8.8.3/8.8.3) with SMTP id TAA23843;
  Thu, 5 Jun 1997 19:22:21 +0300 (GMT+0300)
Date: Thu, 5 Jun 1997 19:22:20 +0300 (GMT+0300)
From: Mashiach Yogev <yogevm@math.tau.ac.il>
To: Mertens Ron <mertero@bigfoot.com>
Subject: Re: [Fwd: Re: "Operating Systems" - The Exam]
In-Reply-To: <3396C160.7745@netvision.net.il>
Message-ID: <Pine.SUN.3.95.970605192120.27235C-100000@lune.math.tau.ac.il>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-UIDL: 011a6057e9a080092d8d36ce7f0fd9e8
Status: U
X-PMFLAGS: 36176000 0

We can see who sent this message (Mashiach Yogev), and we can also find his E-mail address. The message was sent to Ron Mertens. We can see the subject (an exam in Operating Systems), the date it was sent, and even the path the message went through to get to its destination.
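Pulling this information out takes no special tools. As a sketch, Python's standard email module can read an abridged copy of the header shown above (only one Received line is kept here, and the body text is a placeholder of ours):

```python
from email import message_from_string
from email.utils import parseaddr

raw = """\
Return-Path: <yogevm@math.tau.ac.il>
Received: from lune.math.tau.ac.il (yogevm@lune.math.tau.ac.il [132.67.96.11])
  by taurus.math.tau.ac.il (8.8.3/8.8.3) with SMTP id TAA23843;
  Thu, 5 Jun 1997 19:22:21 +0300 (GMT+0300)
Date: Thu, 5 Jun 1997 19:22:20 +0300 (GMT+0300)
From: Mashiach Yogev <yogevm@math.tau.ac.il>
To: Mertens Ron <mertero@bigfoot.com>
Subject: Re: [Fwd: Re: "Operating Systems" - The Exam]

(the body could be encrypted; the headers above cannot)
"""

msg = message_from_string(raw)
sender_name, sender_addr = parseaddr(msg["From"])
hops = msg.get_all("Received")      # one entry per server the mail passed

print("From:   ", sender_name, sender_addr)
print("To:     ", parseaddr(msg["To"])[1])
print("Subject:", msg["Subject"])
print("Hops:   ", len(hops))
```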

When you connect to a web server, things are much worse than that. Let's take a look at the NCSA server (it's quite popular). It includes a program called httpd. It maintains 3 log files :

The 'problem' lies within the HTTP protocol. It has some features that allow all that data to be collected. The TCP/IP protocol has a sort of caller-ID built in: when you connect, you send your IP address, from which your computer's name can be looked up. The refer_log is also a problem. It is used mainly for advertising, so that companies can target their advertisements more accurately, but it can be used for other purposes!

A newer source of trouble is 'Cookies'. Cookies are client-side persistent information. Almost all of the new browsers have this facility : it allows web sites to store information about your visit on your own hard disk. When you enter the site again, it will read your cookie, and thus know that you've been there already. It is used for nice tricks such as a personalized web page, and so on, but it can be a serious privacy breach.
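The cookie round trip can be sketched with Python's standard http.cookies module. The cookie name visitor_id and its value are made up for the example; real sites choose their own.

```python
from http.cookies import SimpleCookie

# First visit: the server asks the browser to store an identifier.
response = SimpleCookie()
response["visitor_id"] = "abc123"
response["visitor_id"]["path"] = "/"
header_line = response.output()     # the Set-Cookie header sent to the browser
print(header_line)

# Later visit: the browser sends the stored value back in its Cookie header,
# and the server recognises the returning visitor.
returned = SimpleCookie()
returned.load("visitor_id=abc123")
recognised = returned["visitor_id"].value
print("returning visitor:", recognised)
```

Nothing in this exchange requires the user to do, or even notice, anything, which is exactly why cookies are a privacy concern.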

If you're connected to the internet through a Proxy, you have still another problem. The proxy server logs every access to the outside web, by every member of the organization. Your IP address, and your host computer are written down as well.

Why should I care about all that?

Well, you should. Your privacy is your right as a human. If your privacy is violated, other freedoms (like the freedom of expression, or religion) might get threatened. Even if you have nothing to hide (as most people think), your right to privacy must be important to you! Of course, it seems safe to surf the internet, but be warned that you are being watched. Your privacy is not guaranteed, and that's a problem. Most of the information that can be collected is never used, but it can be, and there's not much you can do about it. A detailed profile of you can be created, and your tastes and preferences can be learned, just by recording your messages, links, and web activity. This information can fall into the wrong hands (marketers and governments, for example) and be used against your interests.

How can I assure my privacy?

You can wait for your country to pass a privacy law, although these things take time, and it may well never happen. In the meantime, you can use several available utilities to protect your privacy :

Problems with privacy control

The main problem is that the internet is a world-wide network. People from all around the world use it, and so it's hard to enforce laws on it. In the US, for example, there is no comprehensive law that protects people's privacy. There are several guidelines that protect some areas of your privacy, but it's not enough. And most countries are falling behind the US in that area. Another problem is that the web browsers are usually made by the same companies that make the web servers. The ability to collect information about site visitors is a major selling point, and your privacy can be endangered by this fact.

Emerging Trends

There are many emerging technologies these days that are competing in the Internet market. There are WWW browsers, E-mail clients, server software, interactive web technologies, new languages that are tailored to be used over a network, and more. All these new products are very exciting, and offer a better Internet experience, but they also pose a serious security problem.

Java

Java is the hottest name on the Internet today. In a nutshell, Java is just another object-oriented language, like many others. But the main power of Java is that it's possible to run Java on any platform (okay, almost all platforms: a special program, called the Interpreter, needs to be written for each platform, but once that is done, all programs will run!). That makes it ideal for the Internet. Java makes this possible by running a program in two steps : first the program is compiled to produce what is called 'byte-code', and then the byte-code is run on the Interpreter. As a consequence Java is not as efficient as C/C++ or other compiled languages, but it's usually fast enough. One exciting aspect of Java is the ability to create 'applets'. An applet is a program that runs within a browser (one that supports Java, of course). So you can put a Java program right inside your page, turning a static HTML page into a dynamic one.

Of course, all those great abilities pose a serious security threat. Downloading a program from the internet to be run on your computer can be quite dangerous. The developers of Java have taken this into account. There are two types of protection : the language restrictions, and the Virtual Machine.

Safety features built into the Java language

The guys who wrote Java wanted to ensure that programs could only access memory in a structured, safe way. This helps make Java programs robust, and also safer. It's safer because if you allow unrestricted access to main memory, a program could easily crash the operating system, which is a serious security risk in some environments. Memory access could also be used to violate the system's security. The mechanisms that restrict a program's memory access are :

All these features are built into the byte-code, not the high-level language. That means that even if you write your program directly in byte-code (there are probably some assembler freaks that would enjoy that) you can't violate these rules.

The Java Virtual Machine

The Java program, once it's up and running, is confined to a Java Virtual Machine (or JVM). This JVM (or 'Sandbox') defines the area in which the program can run, and doesn't allow it to take any action outside the JVM. This means that the JVM prohibits many actions that can be dangerous, like:

The fundamental components in the JVM are :

The Class loader and the Security manager are customizable. This means that you can customize your JVM: you can put strict demands on security, or the other way around... Usually you don't have to worry about this, because the applets you run in your browser already have defined JVMs (the browser defines those).

So Java is totally safe?

Well, no. There is a way to penetrate the security barriers that the JVM maintains. A Java program may call a method from a dynamic library. This is called a Native Method. Native Methods don't go through the Java API (that's what they are for, actually), so the security manager doesn't apply to them. All of Java's security is breached! You can easily call a native method that can wreak havoc, like deleting some files on the local disk or crashing the OS. Luckily, the security manager can prevent a program from using Native Methods. This means that if you load a program that you don't trust (like EVERY web applet, for that matter) you can load it in a JVM that doesn't allow Native Methods. If you have a program that you do trust, you can allow it more control. So if you're careful, you can be sure that Java is quite safe. Quite. It is still possible that someone will crack the Java security. It has happened before : an early version of the JVM had a bug in it, and in March 1997 a Java security bug was found. It was fixed, but who can tell what will happen in the future.

Javascript

Javascript is another exciting way to liven up your HTML pages. Unlike Java, which is a stand-alone language, Javascript can only be used from within a browser. Javascript 'programs' work with browser objects, and they can be used to make dynamic HTML pages. Javascript is usually used for small programs, like menus or scrolling text. In Javascript you cannot 'hide' your source code; it is written directly into the HTML text. It's just a scripting language, like the old macros that were once used, or the newer languages used within many popular spreadsheets, word processors and other applications. So Javascript, in its essence, is quite safe. It is confined to the browser, so it cannot do real damage to your computer. Or can it? It was found that the earlier versions of Javascript had several bugs in them that could cause security breaches. For example, it was possible for a Javascript program to load files from a user's hard disk. This was found in March 1996 and was quickly fixed. Several other bugs were found and fixed, and there are sure to be more.

How do I protect myself from Java or Javascript programs?

Of course, the truly paranoid can simply choose not to load Java and Javascript programs. Every browser (okay, most of 'em for sure) can be configured so that these programs will be ignored. Most people want to use these facilities, of course, and they are the ones with a problem. What you SHOULD do is stay informed. Every so often a bug is found, and the browser companies quickly issue a patch for it. These patches are usually free, and small to download. Another piece of advice is to never use beta programs. These programs are pre-releases of a new browser (or any other program, for that matter). While it's very tempting to get your hands on new technology before it hits the market (especially since betas are usually free), it can be dangerous. Most of the bugs that are found are fixed in the final version, and the beta version you have may not be protected!

ActiveX

ActiveX controls are software components that can run inside other applications. Actually, ActiveX is an internet-enabled version of OLE (Object linking and Embedding, Microsoft's component architecture. ActiveX is a Microsoft standard, aimed to take Java's place as providing active WWW pages (ActiveX can also be used in Non-Internet applications, like Word processing or MIDI sequencing software).

ActiveX controls can do much more than Java applets, but this has a serious consequence : a control can take over the computer and shut it down. A malicious control can introduce a virus or harm a PC or a network.

Microsoft is aware of this serious security weakness in ActiveX, and its strategy is to require code signing : every control must be certified by a third party. But that is not enough. Recently a control named Exploder (written by Apropos Inc), which shuts down a PC, received certification from VeriSign, which is one such third party. The technology only attaches the author's name to the control. It does not scan for viruses or other security problems.
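The limitation described above, that a signature identifies the author without saying anything about what the code does, can be sketched in a few lines. This is only an illustration and not how ActiveX code signing is actually implemented: the key is a toy RSA key built from two small known primes with no padding, and the names `sign`, `verify`, `harmful_control` and "Example Author" are all hypothetical.

```python
import hashlib

# Toy RSA key built from two known Mersenne primes. Far too small for real
# use and with no padding; it only illustrates the idea of a signature.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(code: bytes) -> int:
    """Hash the control's bytes and reduce the hash to a number below N."""
    return int.from_bytes(hashlib.sha256(code).digest(), "big") % N


def sign(code: bytes, author: str) -> dict:
    """Attach an author name and a signature. The code is never inspected."""
    return {"author": author, "signature": pow(digest(code), D, N)}


def verify(code: bytes, tag: dict) -> bool:
    """Check that the code is unchanged since it was signed."""
    return pow(tag["signature"], E, N) == digest(code)


# A hypothetical harmful control: signing it works just as well as signing
# a harmless one, because nothing in sign() looks at what the code does.
harmful_control = b"shut_down_the_pc()"
tag = sign(harmful_control, "Example Author")

print(verify(harmful_control, tag))           # the signature matches
print(verify(harmful_control + b" ", tag))    # any change breaks the signature
```

Verification tells the user who signed the control and that it has not been modified since. Whether the control is safe to run is a separate question that the signature does not answer.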

Some method still needs to be found to solve the problem (an online virus scanner? a 'virtual PC' like the one Java uses?), but in the meantime Microsoft has also licensed Java for use with its browser. The direction that ActiveX will take is not clear at all.

If you want to be sure about your security, most experts suggest disabling ActiveX completely. It is simply not safe at the moment. The author's signature is not enough.

Certificates

The oldest form of security is to ask for a password. A password is a classic 'what you know' type of security. The problem, of course, is that anyone who knows the password can access your information. A certificate (also called a public-key certificate or Digital ID) is a 'what you know and what you have' type of security. To access information you need a specific file on your disk that authenticates you. These files are encrypted to provide a high level of security.
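The difference between 'what you know' and 'what you know and what you have' can be sketched as a check that needs both a password and the contents of a file. This is a simplified illustration under stated assumptions: real certificate schemes prove possession with a private-key operation instead of comparing file contents, and the function names `enroll` and `authenticate` and the sample values are hypothetical.

```python
import hashlib
import hmac
import os


def enroll(password: str, key_file_bytes: bytes) -> dict:
    """Store only derived values, never the password or the file itself."""
    salt = os.urandom(16)
    return {
        "salt": salt,
        "password_hash": hashlib.pbkdf2_hmac(
            "sha256", password.encode(), salt, 100_000
        ),
        "key_fingerprint": hashlib.sha256(key_file_bytes).digest(),
    }


def authenticate(record: dict, password: str, key_file_bytes: bytes) -> bool:
    """Grant access only if the user knows the password AND has the file."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), record["salt"], 100_000
    )
    knows = hmac.compare_digest(record["password_hash"], candidate)
    has = hmac.compare_digest(
        record["key_fingerprint"], hashlib.sha256(key_file_bytes).digest()
    )
    return knows and has


# Made-up sample values for illustration.
record = enroll("s3cret", b"contents of the key file")
print(authenticate(record, "s3cret", b"contents of the key file"))
print(authenticate(record, "s3cret", b"some other file"))
```

Someone who learns the password but does not hold the file is still refused, which is the extra protection the second factor is meant to give.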

There are many certificate standards. The most popular one is X.509v3 (by the ITU). An X.509v3 certificate holds the following information :

The issuer is an entity that attests to the identity of the holder of the certificate. The issuer is usually an external company (like VeriSign) whose whole business is verifying identities.
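As a rough illustration, the main fields defined by the X.509 v3 standard can be modelled as a small record. This is a sketch only: the class name, the field names and the helper `is_current` are chosen here for readability and are not the standard's ASN.1 identifiers, and every sample value is made up.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Certificate:
    version: int                  # 3 for X.509 v3
    serial_number: int            # unique per issuer
    signature_algorithm: str      # algorithm the issuer used to sign
    issuer: str                   # who attests to the holder's identity
    not_before: datetime          # start of the validity period
    not_after: datetime           # end of the validity period
    subject: str                  # the holder of the certificate
    subject_public_key: bytes     # the holder's public key
    extensions: dict = field(default_factory=dict)  # optional v3 extensions
    signature: bytes = b""        # issuer's signature over the fields above


def is_current(cert: Certificate, now: datetime) -> bool:
    """True if 'now' falls inside the certificate's validity period."""
    return cert.not_before <= now <= cert.not_after


# Made-up sample certificate for illustration.
example = Certificate(
    version=3,
    serial_number=1001,
    signature_algorithm="example-algorithm",
    issuer="Example Issuer",
    not_before=datetime(1999, 1, 1),
    not_after=datetime(2000, 1, 1),
    subject="Example Holder",
    subject_public_key=b"\x00\x01",
)

print(is_current(example, datetime(1999, 6, 1)))
```

A real check would also verify the issuer's signature over the other fields. The validity test above only shows how the dates are meant to be used.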

Certificates are very useful when extra security is needed. A certificate is your Digital ID, and can be used to identify you in cyberspace (your electronic network). The certificate provides several important services :

Certificates work both ways. If you connect to a server, you can view its certificate to be sure whom you are talking to. Newer browsers also support client-side certificates, so the server can know who YOU are.

Secure E-Mail (S/MIME)

S/MIME is a new standard for secure E-Mail. It is an open standard (the specifications are available to all, which means many companies can produce an S/MIME-compatible E-Mail client) used for encrypted, signed mail.

S/MIME has these basic features :

The main advantages of S/MIME are its interoperability and the fact that it is an open standard. It has a good chance of becoming the de facto standard for secure E-Mail.

  
Exercises

1.
Most operating systems lose all file access control if the machines are booted up with a boot floppy containing a program that can interpret the values on the sectors of the hard drive, and reconstruct a view of the file system.

(a)
Why can this sort of program not be used when the operating system is running?
(b)
Windows NT comes with a new type of file system for which documentation is not easily available. Is this an adequate solution to this problem?
(c)
How does the use of networked file servers alleviate this problem?
2.
Most server programs written in C make use of the string library to read in requests from clients (possibly via the network). Many of these functions do not check how much memory is available before they copy values into it. This allows clients to overwrite portions of memory on the server, possibly altering the code for the server.

(a)
How could this cause security problems?
(b)
Suggest ways in which this problem could be prevented.
(c)
Is there any way to detect when a program has been corrupted in this way?
3.
Unix provides the rlogin mechanism which allows users on specific machines to log in to other machines without having to type in a password.

(a)
In what circumstances would this be good for the security of a system?
(b)
In what circumstances would this be bad for the security of a system?
4.
How does a firewall work? If you have a network that is connected to the Internet, where would you put a firewall?
5.
How are viruses transmitted through the Internet?
6.
Consider a company which is selling images of fractals over the Internet. How should it make use of firewalls to prevent:

(a)
Illegal use being made of its computation servers.
(b)
Illegal access to its archive of images on an HTTP server.
7.
What is private-key data encryption?


Shaun Bangay
1999-11-02