WD Passport Hard Drive Troubleshooting

Frequently Asked Questions
Q: Why does the drive not power up?
A: Be sure the drive is plugged in to a power source. A special cable may be needed for computers with limited
bus power. For more information in the U.S., visit our Web site at store.westerndigital.com. Outside the U.S.,
contact WD Technical Support in your region.
Q: Why is the drive not recognized under My Computer or on the computer desktop?
A: If your system has a USB 2.0 PCI adapter card, be sure that its drivers are installed before connecting your
WD USB 2.0 external storage product. The drive is not recognized correctly unless USB 2.0 root hub and
host controller drivers are installed. Contact the adapter card manufacturer for installation procedures.
Q: Why won't my PC boot when I connect my USB drive to the computer before booting?
A: Depending on your system configuration, your computer may attempt to boot from your WD portable USB
drive. Refer to your system’s motherboard BIOS setting documentation to disable this feature or visit
support.wdc.com and see Knowledge Base answer ID 1201. For more information about booting from
external drives, refer to your system documentation or contact your system manufacturer.
Q: How do I partition and reformat the drive?
A: For recommended formats for this device, search our knowledge base for answer ID 207 at support.wdc.com.
Q: Why is the data transfer rate slow?
A: Your system may be operating at USB 1.1 speeds due to an incorrect driver installation of the USB 2.0
adapter card or a system that does not support USB 2.0.
Q: What is Hi-Speed USB?
A: Hi-Speed USB is another name for USB 2.0, which provides transfer rates up to 40 times faster than
USB 1.1. Upgrading to USB 2.0 is highly recommended because of the significant reduction in file transfer
time versus USB 1.1.
Q: How do I determine whether or not my system supports USB 2.0?
A: Refer to your USB card documentation or contact your USB card manufacturer.
Note: If your USB 2.0 controller is built into the system motherboard, be sure to install the appropriate
chipset support for your motherboard. Refer to your motherboard or system manual for more information.
Q: What happens when a USB 2.0 device is plugged into a USB 1.1 port or hub?
A: USB 2.0 is backward-compatible with USB 1.1. When connected to a USB 1.1 port or hub, a USB 2.0
device operates at the USB 1.1 full speed of up to 12 Mbps.
Q: Can USB 1.1 cables be used with USB 2.0 devices?
A: Although USB 1.1 cables work with USB 2.0 devices, it is recommended that USB 2.0 certified cables be
used with USB 2.0 peripherals and USB 2.0 PCI adapter cards.
If your system includes a PCI slot, you can achieve Hi-Speed USB transfer rates by installing a USB 2.0 PCI
adapter card. Contact the card manufacturer for installation procedures and more information.

SSH Protocol

SSH Protocol
As already mentioned, the Telnet protocol is poorly suited to controlling a remote server because it is far from secure. Yet the need to control servers remotely is considerable. A large network contains many servers, and it is inconvenient to run from monitor to monitor to configure each of them. Any administrator wants to be able to manage the entire network complex without leaving the workplace, and to do so over secure channels.

During server management sessions, the administrator sends a great deal of confidential information over the network (e.g., root passwords) that should under no circumstances be intercepted by eavesdropping utilities. There are numerous programs for providing secure communications. The most popular of them is SSH, which is included in most Linux distributions.

Using this utility, you can administer your network servers remotely from one workplace without having to equip each of them with a monitor and run to each server every time a minor configuration change is needed. This is how I administer my network: from one monitor, which I can connect to any machine if the problem cannot be solved over the network.

The advantage of SSH is that this protocol allows commands to be executed remotely but requires authentication and encrypts the communications channel. An important feature is that even user authentication passwords are transmitted encrypted.

Presently, there are two versions of the SSH protocol, numbered 1 and 2. The second version employs a stronger encryption algorithm and fixes some of the bugs found in the first version. Linux supports both versions.

In Linux, the SSH protocol is handled by the OpenSSH program. The native platform of this program is another UNIX-like operating system, OpenBSD, from which it has been ported to all other UNIX platforms, including Linux. Even now, the OpenBSD name can sometimes be encountered in configuration files.

The SSH protocol requires a server and a client for operation. The server waits for connection and executes user commands. The client is used to connect to the server and to send it commands to execute. Thus, both parts of the protocol have to be properly configured for normal operation.

Configuration Files
All configuration files for the SSH protocol are located in the /etc/ssh/ directory. These are the following:

The SSH server configuration file: sshd_config

The SSH client configuration file: ssh_config

The key files for different algorithms:

ssh_host_dsa_key

ssh_host_dsa_key.pub

ssh_host_key

ssh_host_key.pub

ssh_host_rsa_key

ssh_host_rsa_key.pub

What is the reason for so many key files? It is that SSH uses different encryption algorithms, including the two most popular and secure ones: the Digital Signature Algorithm (DSA) and RSA. (The latter abbreviation is composed of the first letters of the last names of its creators: R.L. Rivest, A. Shamir, and L.M. Adleman.) The ssh_host_dsa_key and ssh_host_dsa_key.pub files are used by DSA, and the ssh_host_rsa_key and ssh_host_rsa_key.pub files are used by the RSA algorithm. The remaining two key files — ssh_host_key and ssh_host_key.pub — store the keys for version 1 of the SSH protocol. Each algorithm requires two files: The file with the .pub extension stores the public key, and the file without this extension stores the private key.

Data sent to the server are encrypted using the public key and can only be decrypted with the help of the private key. The private key cannot be recovered by any known algorithm within a reasonable length of time. It can, however, be stolen; thus, you should take all necessary measures to prevent this from happening.

SSH Server Main Configuration Parameters
Consider the contents of the SSH server (sshd) configuration file (Listing 5.1). The sshd file is not large, so its entire contents are listed with only some comments deleted.

Listing 5.1: The sshd configuration file


#Port 22
#Protocol 2,1
#ListenAddress 0.0.0.0
#ListenAddress ::

# HostKey for protocol version 1
#HostKey /etc/ssh/ssh_host_key
# HostKeys for protocol version 2
#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_dsa_key

# Lifetime and size of ephemeral version 1 server key
#KeyRegenerationInterval 3600
#ServerKeyBits 768

# Logging
#obsoletes QuietMode and FascistLogging
#SyslogFacility AUTH
#SyslogFacility AUTHPRIV
#LogLevel INFO

# Authentication:

#LoginGraceTime 600
#PermitRootLogin yes
#StrictModes yes

#RSAAuthentication yes
#PubkeyAuthentication yes
#AuthorizedKeysFile .ssh/authorized_keys

# Rhosts authentication should not be used.
#RhostsAuthentication no
# Don't read the user's ~/.rhosts and ~/.shosts files.
#IgnoreRhosts yes
# For this to work, you will also need host keys
# in /etc/ssh/ssh_known_hosts.
#RhostsRSAAuthentication no
# Similar for protocol version 2
#HostbasedAuthentication no
# Change to yes if you don't trust ~/.ssh/known_hosts for

# RhostsRSAAuthentication and HostbasedAuthentication.
#IgnoreUserKnownHosts no

# To disable tunneled clear text passwords, change to
# no here!
#PasswordAuthentication yes
#PermitEmptyPasswords no

# Change to no to disable s/key passwords.
#ChallengeResponseAuthentication yes

# Kerberos options
# KerberosAuthentication automatically enabled if
# the key file exists.
#KerberosAuthentication yes
#KerberosOrLocalPasswd yes
#KerberosTicketCleanup yes

# AFSTokenPassing automatically enabled if k_hasafs()
# is true.
#AFSTokenPassing yes

# Kerberos TGT passing only works with the AFS kaserver.
#KerberosTgtPassing no

#PAMAuthenticationViaKbdInt yes

#X11Forwarding no
#X11Forwarding yes
#X11DisplayOffset 10
#X11UseLocalhost yes
#PrintMotd yes
#PrintLastLog yes
#KeepAlive yes
#UseLogin no

#MaxStartups 10
# No default banner path
#Banner /some/path
#VerifyReverseMapping no

# Override default of no subsystems.
Subsystem sftp /usr/libexec/openssh/sftp-server





The main parameters that you may have to use are the following:

Port — Specifies the port, on which the server accepts connections. By default, this is port 22. Some administrators like to change this value and move the server to another port. This action is justified to an extent: for example, if you do not run a Web server, the port it normally uses can be given to SSH. Hackers will take it for a Web server and will not try to break into it.

Protocol — Gives the protocol versions supported. Note that version 2 is specified first, and then version 1. This means that the server will first try to connect over the version 2 protocol and only then over version 1. I recommend uncommenting this line and deleting the 1 so that only the latest protocol version is used (a sample configuration fragment follows this list). It is high time we updated client software and started using more secure technologies; clinging to old software only causes losses.

ListenAddress — Specifies the addresses, on which to listen for connections. Your server may be equipped with several network adapters, and by default all of their interfaces are monitored. You should specify only those interfaces that will be used for SSH connections. For example, one network adapter is often used to connect to the local network and another to connect to the Internet. If SSH is used only within the local network, only that adapter should be monitored. The value is specified in the address:port format. Several ListenAddress entries can be made to describe all necessary interfaces.

HostKey — Specifies the paths to the files containing the encryption keys. Only the private keys, used to decrypt incoming packets, need to be specified.

KeyRegenerationInterval — In protocol version 1, the server key can be regenerated during the session. The purpose of regeneration is to make it impossible to decrypt intercepted packets by later stealing the keys from the machine. Setting this value to 0 disables regeneration. If you followed my recommendation not to use the version 1 protocol (see the Protocol parameter), this parameter has no effect.

ServerKeyBits — Gives the length of the server key. The default value is 768; the minimal value is 512.

SyslogFacility — Specifies the types of messages to be stored in the system log.

LogLevel — Specifies the level of the event to be logged. The possible levels correspond to the system levels, which are considered in Section 12.5.6.

LoginGraceTime — Gives the time interval, within which the user has to enter the correct password before the connection is broken.

PermitRootLogin — Specifies whether the root user can log in using SSH. It was already said that root is the god in the system, and this account's privileges must be used with care. If logging in as root regularly is inadvisable, doing so over SSH is all the more so. Change this parameter to no at once.

StrictModes — Specifies whether sshd should check the ownership and permissions of the user's files and home directory before accepting the login. It is desirable to set this parameter to yes, because many novice users make their files writable by everyone.

RSAAuthentication — Specifies whether RSA authentication is permitted. This option is valid for protocol version 1 only.

PubkeyAuthentication — Specifies whether public key authentication is allowed. This option is valid for protocol version 2 only.

AuthorizedKeysFile — Specifies the file storing the public keys that can be used for user authentication. If a user logs into the system with a key stored in this file, no further authentication is performed.

RhostsAuthentication — Allows authentication using the ~/.rhosts and /etc/hosts.equiv files. The default value is no. It should not be changed without a justified need, because this may have a negative effect on security.

IgnoreRhosts — When set to yes, the ~/.rhosts and ~/.shosts files are not read. The value should not be changed unless really necessary, because doing so may have a negative effect on security.

RhostsRSAAuthentication — When this parameter is set to yes, rhosts authentication is combined with a check of the host key stored in the /etc/ssh/ssh_known_hosts file. The parameter is used in protocol version 1 only.

IgnoreUserKnownHosts — When the parameter is set to no, hosts listed in ~/.ssh/known_hosts are trusted during RhostsRSAAuthentication and HostbasedAuthentication. Because you should not trust anyone, it is better to set this parameter to yes.

PasswordAuthentication — When set to yes, a password will be requested. When authentication is performed using keys, the parameter can be set to no.

PermitEmptyPasswords — Specifies whether empty passwords can be used. The default value is no, and it ought not to be changed.

KerberosAuthentication — Specifies whether Kerberos authentication of the user password should be performed. This authentication has been gaining popularity lately because of the security it provides.

KerberosOrLocalPasswd — When set, if the Kerberos password authentication failed, the password is validated using the /etc/shadow file mechanism.

KerberosTicketCleanup — When set, the user's Kerberos ticket cache file is destroyed on logout.

Banner — Specifies the path to a file, the contents of which are displayed as a warning message before the login procedure.
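
Putting these recommendations together, a minimal hardened fragment of sshd_config might look like the following sketch (the 192.168.1.1 address is an assumed local network interface, not part of the original listing):

Port 22
Protocol 2
ListenAddress 192.168.1.1:22
PermitRootLogin no
PermitEmptyPasswords no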

The sshd Server Access Parameters
In addition to those listed in Listing 5.1, the following keywords can be used in the sshd configuration file:

AllowGroups — Allows only the users of the specified groups to log into the system. The user group names are listed after the keyword, separated by spaces.

AllowUsers — Allows only the users listed after the keyword to enter the system. The user names are listed separated by spaces.

DenyGroups — Denies login to users of the specified groups. The user group names are listed after the keyword, separated by spaces.

DenyUsers — Denies login to the users listed after the keyword. The user names are listed separated by spaces. This parameter comes in handy when a user belonging to a permitted group has to be denied login.

I recommend specifying the names of the groups and users that can log into the system over SSH explicitly.
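
For example, the following hypothetical entries allow SSH logins only for members of the admins group and for the users flenov and robert (the group and user names here are assumptions for illustration):

AllowGroups admins
AllowUsers flenov robert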

Configuring the SSH Client
The SSH client configuration contains even fewer parameters. The global settings for all of the system's users are stored in the /etc/ssh/ssh_config file, but any of them can be redefined for a particular user in the ~/.ssh/config file in that user's home directory. The contents of the global configuration file (with some comments omitted) are shown in Listing 5.2.

Listing 5.2: The contents of the /etc/ssh/ssh_config configuration file


# Site-wide defaults for various options

# Host *
# ForwardAgent no
# ForwardX11 no
# RhostsAuthentication yes
# RhostsRSAAuthentication yes
# RSAAuthentication yes
# PasswordAuthentication yes
# FallBackToRsh no
# UseRsh no
# BatchMode no
# CheckHostIP yes
# StrictHostKeyChecking ask
# IdentityFile ~/.ssh/identity
# IdentityFile ~/.ssh/id_rsa
# IdentityFile ~/.ssh/id_dsa
# Port 22
# Protocol 2,1
# Cipher 3des
# Ciphers aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour,aes192-cbc,aes256-cbc
# EscapeChar ~
Host *
Protocol 1,2





Some of the parameters in this file are the same as in the server configuration file. One of these is the Protocol parameter, which specifies the SSH version being used. But in the case of the client, the version 1 protocol should not be disabled. This will not affect the security of the client but will help you avoid problems when connecting to a server that supports only this protocol.

The following are the most common client parameters (a sample per-user configuration follows the list):

Host — Specifies the host or hosts, to which the following declarations apply.

CheckHostIP — If this parameter is set to yes, the server's IP address will be checked against the known_hosts file.

Compression — Enables (yes) or disables (no) data compression.

KerberosAuthentication — Enables (yes) or disables (no) Kerberos authentication.

NumberOfPasswordPrompts — Specifies the number of password entry attempts. If no correct password is entered, the connection is broken.

IdentityFile — Specifies the name of the file containing the private user keys.

PasswordAuthentication — Specifies authentication by password.
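
As an illustration, a per-user ~/.ssh/config file might define a shorthand for a frequently used server; the host name and key file in this sketch are assumptions:

Host flenovm
    HostName flenovm.domain.com
    Protocol 2
    IdentityFile ~/.ssh/myrsakey

With such an entry, the command ssh flenovm connects to flenovm.domain.com using the specified key.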

Examples of Using the SSH Client
Consider how to connect to a remote server. This is done by executing the following command:

ssh user@server

For example, to connect to the server flenovm as the user flenov, the following command has to be executed:

ssh flenov@flenovm

This will be answered by the following message:

The authenticity of host 'localhost (127.0.0.1)' can't be established.
RSA1 key fingerprint is f2:a1:6b:d6:fc:d0:f2:a1:6b:d6:fc:d0.
Are you sure you want to continue connecting (yes/no)?

The program informs you that the authenticity of the host localhost cannot be established and displays a fingerprint of the RSA key. To continue establishing the connection, type yes. The following message will be displayed:

Permanently added 'localhost' (RSA1) to the list of known hosts.

This message informs you that the key has been added to the list of the known hosts. This means that the known_hosts file containing the remote system's key was created (or updated) in the .ssh/ subdirectory of your home directory.

The program then prompts you to enter the user's password. If the authentication succeeds, you will be connected to the remote system and will be able to execute commands there as if entering them from its keyboard.
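
SSH can also execute a single command on the remote machine without opening an interactive session. For example, the following hypothetical command displays disk usage on the flenovm server from the earlier example and returns immediately:

ssh flenov@flenovm df -h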

Authentication by Key
Authentication by key is more convenient and more secure than authentication by password; the latter can even be disabled. Password access over SSH is not entirely safe: the same password may be intercepted when it is entered in some other, unencrypted program. What is the use of encrypting the SSH connection if the password can be sniffed while you work with other programs?

This can be prevented by using different passwords for each connection. But it is difficult to remember all of the passwords, so it is better to perform authentication by keys, which are protected exceedingly well. All you have to do for this is modify the configuration slightly.

Start by creating a new key. This is done with the help of the ssh-keygen program. It has to be passed the following parameters:

-t — The key type. This can be rsa or dsa for the second SSH version, or rsa1 for the first version. The rsa key type will be used in the example.

-f — The file, in which the private key is to be stored. The public key file will be named the same but with the .pub extension.

-b — The key length, which should be 512 minimum. The default value is 1024, which you should leave in place.

The key is generated by executing the following command:

ssh-keygen -t rsa -f ~/.ssh/myrsakey

Note that I specified the place to store the key as the .ssh subdirectory of my home directory, indicated by the ~ character. SSH looks for all configuration settings in this directory. If you have not connected to the server yet, this directory does not exist. The situation is corrected by opening the user's home directory and creating the .ssh directory:

cd /home/flenov
mkdir .ssh

If the file, in which to store the key, is not specified when the key is generated, an RSA key file named id_rsa will be created in the ~/.ssh/ directory by default. The key file for DSA encryption will be stored in the same directory but will be named id_dsa. I specified the key file name on purpose to show how to do this.

If you did everything right, the program should display the following message:

Generating public/private rsa key pair.
Enter passphrase (empty for no passphrase):

I recommend specifying a password at least 10 characters long or, even better, a passphrase. The password is submitted by pressing the Enter key, after which the program asks you to confirm the password.

If the password confirmation is successful, the following messages are displayed:

Your identification has been saved in ~/.ssh/myrsakey.
Your public key has been saved in ~/.ssh/myrsakey.pub.


The first message informs you that the private key has been saved in the ~/.ssh/myrsakey file. The public key is saved in the ~/.ssh/myrsakey.pub file.

The ~/.ssh/myrsakey.pub key file has to be sent to the remote computer for the SSH server to use it for authentication. The file can be sent over open communication channels, because even if it is intercepted by nefarious individuals, it is useless without the password you entered when the keys were created and without the private key.
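
One way to transfer the file is the scp utility, which is included in the SSH package and encrypts everything it sends. A minimal sketch, reusing the flenovm server from the earlier examples:

scp ~/.ssh/myrsakey.pub flenov@flenovm:myrsakey.pub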

The administrator of the remote server has to add the contents of the public key file to the .ssh/authorized_keys file. This can be done by executing the following command:

cat myrsakey.pub >> .ssh/authorized_keys

You can now connect to the server using the public key instead of a password to authenticate your identity. But before you do this, make sure that the server configuration file contains the following directives:

RSAAuthentication yes
PubkeyAuthentication yes

To connect to the server, execute the following command:

ssh -i ~/.ssh/myrsakey flenov@flenovm

The -i parameter specifies the identity (private key) file. If this parameter is not used, the id_rsa file specified by the IdentityFile directive in the SSH client configuration file will be used by default.

Now the server will ask you not for the password but for the passphrase specified when generating the public key.

Enter passphrase for key

Setting the PasswordAuthentication parameter in the SSH server configuration file to no dispenses with password checking, and the authentication will be performed based on the keys only. This is sufficient to provide secure communications.

Running X11 in the Terminal
Using the command line to control the remote system reduces traffic significantly. But sometimes the graphical mode is necessary. I personally do not recommend using the graphical mode, for reasons of security and efficiency, but many Windows users simply cannot accept the command line as the only interface. If you are one of them, the SSH server can redirect X11 (the Linux graphical subsystem) to your local terminal. For this, the following three directives have to be specified in the sshd_config file (a client-side example follows the list):

X11Forwarding yes — Self-explanatory.

X11DisplayOffset 10 — The first display number available to the SSH server. The default value is 10, and there is no reason to change it.

X11UseLocalhost yes — If this parameter is set to yes, the local X server will be used. In this case, the client will work with the local X11, and the service information sent over the network will be encrypted.
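
On the client side, X11 forwarding is requested with the -X option of the ssh command. A minimal sketch, again assuming the flenovm server from the earlier examples:

ssh -X flenov@flenovm

Graphical programs started in such a session are displayed on the local screen, with the X11 traffic passing through the encrypted channel.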

If you want to connect to the Linux graphical shell from Windows, you will need an X server program for that operating system. I can recommend the X-Win32 client, which can be downloaded from www.starnet.com.

I do not recommend using X11, because this technology is still in the development stage and there are methods to fake or break into the connection.

Secure Data Transfer
The SSH package also includes two useful utilities: the sftp server (an FTP-like server that supports data encryption) and the sftp client (used to connect to the sftp server). Examine the last line of the SSH server /etc/ssh/sshd_config configuration file:

Subsystem sftp /usr/libexec/openssh/sftp-server

The Subsystem directive defines supplementary services. It launches the OpenSSH sftp server.

Working with the sftp client is no different from working with the SSH client. Execute the sftp localhost command; the login prompt will appear. Enter the correct password, and you will be taken to the sftp client command line, from which you can send and receive files using FTP commands. This protocol is considered in detail in Chapter 10; for now, you only need to know that most of its commands are similar to the Linux file handling commands.

Try to connect to your system through the sftp client. After logging in, you can execute the ls or cd commands to verify that the connection is working. To quit sftp, execute the exit command. The main FTP commands are listed in Appendix 1.
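
A typical session might look like the following sketch (the directory and file names are assumptions for illustration):

sftp flenov@flenovm
sftp> cd /var/www
sftp> ls
sftp> get index.html
sftp> put report.txt
sftp> exit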

If you have to upload confidential information (for example, accounting files or the password file) to the server, or download it from the server, you should do this over a secure sftp connection. Regular FTP clients transfer files in plaintext; consequently, anyone can monitor the traffic and obtain information that can be used to compromise your server.

You should keep in mind, however, that not all FTP servers and clients support SSH encryption. Ascertain that your software supports this protocol before using it.

How to Set Up Network Security

Network Security
Making your server secure is a complex task requiring you to control the operation of the entire network. The least you have to do is monitor all communication channels to know which of them are being used.

The easiest way of doing this is to use the nmap utility. It can ping an entire network at once, revealing which servers and computers are currently accessible. If some computer has not responded, you have to investigate the cause. Perhaps it is only a power loss or an unscheduled reboot. But it may also be a successful DoS attack, and you should be the first to know about it.
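
For example, the following command pings every address in an assumed 192.168.1.0/24 network without scanning ports (-sP is the nmap ping-scan option):

nmap -sP 192.168.1.0/24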

The nmap utility is extremely handy for a one-time check but inconvenient for constant monitoring. I prefer using the CyD NET Utils (www.cydsoft.com) utility for this purpose. The utility, however, has a serious shortcoming: It only works under Windows.

Unfortunately, despite my extensive search on the Internet, I have not been able to find any comparable Linux program and I assume there is none. Thus, until one is developed, the nmap utility remains your only choice for network monitoring under Linux. Despite its inconvenience, it is better than nothing.

How to Set Up Log Security

Log Security
I want to conclude the system-message logging topic with a section about log security. Even though the original purpose of logs is to monitor the system and detect attacks, they can also be used to break into the system.

Consider a classic break-in example using logs. As you know, when a failed authorization attempt is logged, the password, albeit an incorrect one, is not saved, so as not to give hackers a starting point for figuring it out. But suppose that the user accidentally enters the password instead of the login. This happens often, especially in the mornings, and especially on Monday mornings. Logins, unlike passwords, are recorded in log messages; thus, the password becomes available to any hacker who obtains access to the log file.

Therefore, it is important to make logs inaccessible to unauthorized people. Check the access permissions of the log files by executing the following command:

ls -al /var/log

The results produced by the command look similar to the following:

drwxr-xr-x 9 root root 4096 Jan 12 13:18 .
drwxr-xr-x 21 root root 4096 Jan 24 23:00 ..
drwx------ 2 root root 4096 Jan 12 11:14 bclsecurity
-rw-r----- 1 root root 83307 Jan 12 13:18 boot.log
-rw-r----- 1 root root 89697 Jan 6 09:01 boot.log.1
-rw-r----- 1 root root 48922 Jan 30 11:45 boot.log.2
-rw-r----- 1 root root 64540 Jan 23 19:55 boot.log.3

-rw-r----- 1 root root 36769 Jan 16 12:36 boot.log.4
-rw-r----- 1 root root 8453 Jan 12 13:18 cron
-rw-r----- 1 root root 8507 Jan 6 09:06 cron.1
-rw-r----- 1 root root 7189 Jan 30 11:50 cron.2
-rw-r----- 1 root root 6935 Jan 23 20:01 cron.3
-rw-r----- 1 root root 4176 Jan 16 12:41 cron.4
...
...

The owner of all files should be root. Also, make sure that only the administrator has full access rights, with the rest of the users unable to work with the files.

By default, the read rights for most files belong to the owner and the members of the owner group. The logs most often belong to the root group. If only the administrator belongs to this group in your system, you have no cause for alarm. But if this group comprises several users, which I am against, a special group with minimal rights has to be created and all logs must be switched to this group.

The following sequence of commands creates a new group, named logsgroup, and changes the group membership of all log files to this group:

groupadd logsgroup
cd /var/log
chgrp -R logsgroup .

Only the administrator should have the read and write rights to the files in the /var/log directory. The group users should only have the read rights, with the others having no rights. To set these permissions to all log files, execute the following sequence of commands:

cd /var/log
find . -type f | xargs chmod 640

The first line consists of two commands. The first command — find . -type f — searches the current directory for all f-type objects, that is, for all files. The second command — xargs chmod 640 — changes the access permissions of all objects found by the first command to 640. These permissions can even be lowered to 600 to give the read and write rights to the administrator only.

Moreover, no ordinary user should have write rights to the /var/log/ directory itself, so as to prevent unauthorized deletion of the log files. If hackers cannot modify log records, they may settle for the second best: deleting the logs themselves. True, deleted logs are a strong indication that someone unauthorized visited your system. But it is small consolation, because without the log information you will not be able to learn how the break-in was perpetrated or find the culprit.

Remember: if hackers can read the logs, they can use information accidentally recorded in them to raise their rights in the system. If hackers can modify the logs, they can cover their tracks by deleting all entries pertaining to their activities.

But maximum access protection alone is not enough. The essence of logging is that the operating system only appends new entries to the log files; it neither deletes the files nor modifies the entries already logged. Thus, log files can be further protected from deletion or modification with the help of file attributes. File attributes in the Ext2 and Ext3 file systems can be changed with the help of the chattr command. One such extended attribute is append-only: the file can only be added to. It is set by executing the chattr command with the +a option. For example, the following command sets this attribute on the /var/log/boot.log file:

chattr +a /var/log/boot.log

Now, any attempt to delete or modify the file will fail. The only shortcoming of this attribute is that you will not be able to clean the file. Log files tend to grow constantly and rather rapidly, but it is usually unnecessary to keep a record of events that took place a month or even a year ago. To clean up, remove the append-only attribute as follows:

chattr -a /var/log/boot.log

Just don't forget to set it again after you are done.
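
The current attributes can be verified with the lsattr command:

lsattr /var/log/boot.log

The a flag in the output indicates that the append-only attribute is set.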

In addition to protecting logs, programs used for analyzing them have to be protected. After all, what's the use of protecting log files from reading if they can be viewed using these programs? Log-analyzing programs are protected from unauthorized users by making sure that their SUID and SGID bits are not set.

How to Harden Linux Logs

Handling Logs
By now, you know what system logs there are, where they are stored, and the nature and format of their contents. All this information is useful, but analyzing several megabytes of text is inconvenient and difficult.

In a system processing numerous requests, logs grow rapidly. For example, the daily log on my Web server can exceed 4 MB. This is a lot of text information, in which finding a specific entry within a short time is practically impossible.

This is why programmers and administrators have written and continue writing log-analyzing software. Logs should be analyzed every day or, preferably, every hour. To maintain a secure system, you cannot afford to miss any important messages.

The most effective log-analyzing programs are those that analyze log entries as they are recorded in the log. This is relatively simple to implement, especially on a remote computer that receives log entries from the server over the network. As entries come in, they are analyzed and recorded in the log files for storage and more detailed future analysis. It is usually difficult to detect an attack by one system message; sometimes a dynamic picture is necessary. For example, one failed authorization attempt does not mean anything, whereas ten or more attempts look quite suspicious.

Unfortunately, no known log-analyzing software can do effective dynamic analysis. Most of this software only creates rules, according to which certain entries are considered either suspicious or not. Thus, all failed system login entries are considered suspicious and must subsequently be analyzed manually. Every day, at least one user hits the wrong key when entering the password, especially if it is a complex one. It would make no sense to react to every such message.

There is another shortcoming to analyzing logs line by line. Suppose that the log-analyzing utility issued a message informing of an attempt to access a restricted disk area. Such log entries for most services will contain only information about the attempt, not information about the user account used.

For example, a log entry recording unauthorized access to the ftp directory will contain the IP address of the client but not the user account. To find out which user produced the failed attempt, you have to open the log and look over the connection history from this IP address manually. This problem can be avoided by dynamic log analysis.

The Tail Utility
When I am working directly at the server, I launch the following command in a new terminal window:

tail -f /var/log/messages

This command displays updates to the log file in real time; that is, whenever a new entry is added to the log, the utility displays it.

This is convenient if only a few entries are recorded into the log. In this way, you can work in one terminal and periodically switch to the other to check the new log messages. But if there are too many system messages (e.g., many users are working with the server), checking all new entries becomes impossible. In this case, you need a special utility to filter the messages and display only those deemed suspicious.
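
A crude filter can be built from standard tools before resorting to specialized utilities. For example, the following sketch displays in real time only the new entries that mention failures (the search pattern is an assumption; adjust it to your logs):

tail -f /var/log/messages | grep -i "fail"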

The Swatch Utility
This is a powerful log message-analyzing utility written in Perl. Perl is a rather simple language that many administrators know, so you can easily modify the program and add new functions. The program can be downloaded from http://sourceforge.net/projects/swatch.

The program can analyze log entries on a schedule (if it is run by the cron task manager) or immediately as they are entered into the log.

Because Swatch is a Perl program, the installation process differs from the usual one. It is done by executing the following sequence of commands:

tar xzvf swatch-3.1.tgz
cd swatch-3.1
perl Makefile.PL
make test
make install
make realclean

That the program is written in Perl is also its shortcoming. I have already mentioned that any software that hackers can use to enter the system should not be installed on the server unless necessary. The Perl interpreter is necessary for a Web server using scripts written in this language. In other cases, I recommend against installing a Perl interpreter, because hackers often use this language for writing their rootkits.

The Logsurfer Utility
This is one of the few programs that can examine logs dynamically. The program can be downloaded from sourceforge.net/projects/logsurfer. As was said, most log-analyzing programs examine logs line by line, which is ineffective because a lot of trash is produced.

The powerful features of the program make it more difficult to configure. This is a shortcoming, because configuration errors may result in an important event going undetected.

The Logcheck/LogSentry Utility
This is the easiest program to use. It was developed by the same programmers who created the PortSentry utility considered earlier. LogSentry uses various templates to filter out suspicious log messages.

The program is user-friendly, but I am concerned about its future. It looks like there will be no more updates, and sooner or later the current features will not be enough and a substitute will be necessary.

Still, I have high hopes for the program's prospects. Its operation was considered in Section 12.4, together with the operation of the PortSentry program.

How to Detect Attacks

Detecting Attacks
A good administrator must do everything to nip in the bud any attack attempts on his or her system. What is the first thing hackers do to break into a system? They collect information about it, as was discussed at the beginning of this book. Hackers try to learn as much as possible about the system they want to break into, and administrators must do everything to give away as little information as possible or, even better, throw hackers off the track with false information.

The simplest and initial information-gathering technique is port scanning. To determine who tried to scan ports on your machine, when, and from where, you have to detect any nonstandard port events. Doing this manually is difficult, and a good specialized program is called for.

Automated port-scanning detection programs are a rather good attack-detection tool but, unfortunately, not in all cases. For example, popular servers are scanned often. I believe that servers such as www.yahoo.com or www.microsoft.com are scanned thousands, if not millions, of times a day. It is useless to pay attention to each of these countless scans. Moreover, automatic attack detection consumes computing resources, sometimes a quite substantial amount. If every scanning attempt is logged, hackers can devise attack-imitating packets; then all the server will do is handle these supposed attacks. The effect will be a classic DoS attack, because the server will no longer process client requests.

However, detecting scanning attempts on a company or home network server is a sure way to prevent a break-in.

Another shortcoming of automated scanning detection is that you will not be able to use your own security-scanning utilities, because their activity will be interpreted as an attack by the scanning-detection utilities. Consequently, when you perform scanning yourself, you have to disable the scanning-detection utilities for your scanning to work.

The Klaxon Utility
One of the simplest and most effective scanning-detection utilities is Klaxon (www.eng.auburn.edu/users/doug/second.html). The utility monitors ports unused by the system, and when it detects attempts to access them it gathers as much information as possible about the IP address, from which the scanning is conducted, and saves it in a log file.

The program is simple to install. After installing the program to the /etc/local/klaxon directory, add the following lines to the /etc/inetd.conf file:

#
# Local testing counterintelligence
#
rexec stream tcp nowait root /etc/local/klaxon klaxon rexec
link stream tcp nowait root /etc/local/klaxon klaxon link
supdup stream tcp nowait root /etc/local/klaxon klaxon supdup
tftp dgram udp wait root /etc/local/klaxon klaxon tftp

The preceding directives redirect the listed services to the Klaxon utility, letting you log who attempts to access these services and when.

This is useful because the remote command execution (REXEC) service is not needed by regular users and is mostly sought by hackers trying to penetrate the system. If an attempt, even an unsuccessful one, to access the REXEC service was made from some address, you should make a note that someone at this IP address is casing your server for vulnerabilities, and keep your eyes open for it.

I recommend installing Klaxon on no more than three services, because too many ports may cause the hackers to become suspicious. Moreover, with Klaxon installed on more than five ports, repeated scanning can divert system resources to Klaxon, resulting in a successful DoS attack.

The PortSentry Utility
This program comes as source code, which has to be extracted from the archive at sourceforge.net/projects/sentrytools and compiled. This should present no difficulties.

Extract the program by executing the following command:

tar xzvf portsentry-1.2.tar.gz

In my case, the program was extracted into the portsentry_beta directory. The directory name may be different on your machine, because the program version may have changed by the time this book is published. The extraction directory and the file being extracted are displayed during the extraction process in the directory_name/file_name format.

Change to the newly created directory containing the source code:

cd portsentry_beta

The PortSentry program works under all UNIX-like systems, such as Solaris, FreeBSD, OpenBSD, and, of course, Linux. When compiling the program, you have to specify the installed operating system explicitly:

make linux

By default, the program is installed into the /usr/local/psionic directory; the install directory, however, can be changed by specifying the necessary directory as the value of the INSTALLDIR parameter in the Makefile file. The compiled program is installed by the following command:

make install

To view the logs created by the utility, you also have to install the Logcheck program. It is available from the same site as the PortSentry program. It is installed in the same way as the PortSentry utility using the following commands:

tar xzvf logcheck-1.1.1.tar.gz
cd logcheck-1.1.1
make linux
make install

By default, the Logcheck program installs into the /usr/local/etc directory. The directory can be changed by editing the INSTALLDIR parameter in the Makefile file.

The program's configuration settings are located in the /usr/local/psionic/portsentry/portsentry.conf file. By default, all settings are commented out and you have to remove the comments from the necessary settings.

For example, for monitoring ports the configuration file contains three port-monitoring options. To enable monitoring of the selected ports, remove the pound sign (#) from the corresponding entry. For example, the uncommented third port-monitoring option looks like the following:

TCP_PORTS="1,11,15,79,111,119,143,540,635,1080,1524,2000,5742,6667"
UDP_PORTS="1,7,9,69,161,162,513,635,640,641,700,32770,32771,32772"


In addition to the monitoring capabilities, the program has an excellent security feature: When it detects an attack attempt, the utility can configure the firewall to prohibit any traffic from the address, from which the attack was attempted. By default, this feature is also disabled and is enabled by removing the comments (the pound sign) from the corresponding directives.

The firewall most often used in Linux is ipchains. It is configured by the following directive:

KILL_ROUTE="/sbin/ipchains -I input -s $TARGET$ -j DENY -l"

Before doing this, make sure that the firewall is installed at the specified path (/sbin/ipchains). This can be done by executing the following command:

which ipchains

If you are using the iptables firewall instead of ipchains, it is configured by the following directive:

KILL_ROUTE="/usr/local/bin/iptables -I INPUT -s $TARGET$ -j DROP"

I consider the capability for an automatic firewall configuration in response to attack detection powerful indeed. On the other hand, any program can make a mistake and disallow access to someone that should not be blocked. Hackers can imitate an attack as coming from another user's address — your boss's, for example. PortSentry cannot tell who is hiding behind any address and will cut your boss off from the Internet. This will not be a welcome development.

I conducted an experiment in my test system and tried to throw packets requesting connection to various ports at the server with the source IP address set to different addresses. This rendered the server inaccessible from those IP addresses. You, however, should control how the monitoring program configures the firewall; otherwise, hackers can flood your server with requests and deny other computers access to it.

The monitoring program is launched by the following commands:

/usr/local/psionic/portsentry/portsentry -atcp
/usr/local/psionic/portsentry/portsentry -audp

The first command launches monitoring of the TCP ports, and the second one starts monitoring of the UDP ports. All activities of the program are saved in a log file, which can be viewed using the Logcheck program. I recommend scheduling this program to execute regularly (no less frequently than every 15 minutes) and inform the administrator about system happenings.

Start by configuring the Logcheck program. Open the /usr/local/etc/logcheck.sh file and add the following entry to it (if it is not already there):

"mailto:SYSADMIN=admin@server.com"

Replace admin@server.com with the email address, to which you want notification messages about log entries created by PortSentry to be sent. To run the /usr/local/etc/logcheck.sh script at the specified interval, use the crontab service.
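
For example, the following hypothetical crontab entry (added with the crontab -e command) runs the script every 15 minutes:

*/15 * * * * /usr/local/etc/logcheck.sh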

To test the program, I configured it as described previously and started the CyD NET Utils (www.cydsoft.com) port-scanning utility. It showed only the first two ports as open: even though more ports were open, the rest appeared closed to the port scanner. On the Linux server, I executed the cat /etc/hosts.deny command to view the /etc/hosts.deny file, which stores the IP addresses of all computers prohibited from connecting to the server.

The last entry in the displayed file contents was the IP address of the computer, from which I conducted the port scanning:

ALL: 192.168.77.10

The PortSentry program reacted rapidly and efficiently, adding to the /etc/hosts.deny file a prohibition for using any service from the 192.168.77.10 address. This prohibition can only be removed by deleting the corresponding directive in the /etc/hosts.deny file.

It must be said that some ports can be used intensively enough in the process of normal operations for the program to interpret this as a break-in attempt. Such are the ident (113) and NetBIOS (139) ports. It is best if these ports are not included in the list of ports to monitor. Find the ADVANCED_EXCLUDE_TCP and ADVANCED_EXCLUDE_UDP entries in the /usr/local/psionic/portsentry/portsentry.conf file and add the necessary ports to the lists. By default, the following ports are excluded from monitoring:

ADVANCED_EXCLUDE_TCP="113,139"
ADVANCED_EXCLUDE_UDP="520,138,137,67"

As you can see, ports 113 and 139 are already excluded from being monitored.

The LIDS Utility
Even though I do not like to patch the kernel, I consider the Linux intrusion detection/defense system (LIDS) package worthy of consideration, because it offers comprehensive capabilities and makes it possible to enhance system security significantly.

Its configuration files are encrypted, which makes modifying them difficult. Neither is it easy to shut the utility down, because doing so requires knowing the system administrator's password.

Detection of port-scanning attempts is a small fraction of what this package can do. One of the handy LIDS features is the ability to limit file access not on the user level but on the program level. This expands the rights-assignment capabilities and enhances overall security. For example, the ls and cat programs and text editors can be disallowed from working with the /etc directory. This will make it difficult for hackers to view the /etc/passwd file.

Installing LIDS is not an easy task because it requires patching the kernel source code, compiling the patched codes, and installing the kernel. Here is where some problems may be encountered, because there is no guarantee that the patched kernel will work as intended. The source codes may become corrupted and not compile. When a new kernel version is introduced, it should be tested on a test machine before installing it on the production system. There still is a chance that the new kernel will not work properly on the production machine even after it has been checked on a test machine. But not updating the kernel is not an option, because failure to do this may result in faulty operation in the future.

You can obtain detailed information on LIDS at the utility's official site (www.lids.org).

How to Shut SUID and SGID Doors

Shutting SUID and SGID Doors

If you are an administrator or a security specialist, you should know your system inside and out. You already know that programs with the SUID or SGID bit set are one of the potential security problems. You have to clear these bits for all programs that you are not using. But how can you find the programs that have these bits set? Use the following command:

find / \( -perm -02000 -o -perm -04000 \) -ls

This command will find all files that have the 02000 or 04000 permission bits set, which correspond to the SGID and SUID bits, respectively. The following is an example of the command's output:

130337 64 -rwsr-xr-x 1 root root 60104 Jul 29 2002 /bin/mount
130338 32 -rwsr-xr-x 1 root root 30664 Jul 29 2002 /bin/umount
130341 36 -rwsr-xr-x 1 root root 35040 Jul 19 2002 /bin/ping
130365 20 -rwsr-xr-x 1 root root 19072 Jul 10 2002 /bin/su
...

The most dangerous thing security-wise in this list is that all of these programs are owned by root and, thanks to the set bits, run with root permissions when executed by any user or group member. There are programs with the SUID and SGID bits set that belong to other users in the system, but most are owned by root.

If you do not use a program, either delete it or clear the bits. If you think that there are no unnecessary programs in your system, think again. Perhaps, there is something you can do without. For example, if a program is not a must for a server, its SUID bit can be cleared.

If, after the initial paring, there are still many privileged programs left, I recommend clearing the bits for all of them. This will make it impossible for users to mount devices or change their passwords. But do they need these services? If some users need some of these services, you can always restore them by setting the SUID bit again.
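
The bits themselves are cleared and set with the chmod command. A sketch, assuming you decided that ordinary users do not need to mount devices:

chmod u-s /bin/mount
chmod u+s /bin/mount

The first command clears the SUID bit; the second sets it again if the service turns out to be needed after all.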

You can also change programs' ownerships to less privileged accounts. Even though this is difficult to implement, because you will have to change quite a few permissions, you will sleep better at night.

Why is it so important to regularly check files for SUID or SGID bits set? Because after penetrating a system, hackers often try to fortify their positions in it to remain invisible yet retain maximum privileges. The easiest way of achieving this is setting the SUID bit on the bash command interpreter. This will result in the interpreter executing any user's commands with the root rights, meaning that the hackers can have guest rights but perform operations requiring root privileges — that is, anything they may feel like.

Squid Browser Caching

Browser Caching
In addition to page caching by a central proxy server, page caching can be done by local programs. For example, the Mozilla browser can cache visited Web pages on the local hard drive. When a previously visited page is requested again, the browser does not retrieve it from the proxy server cache but loads it from its local cache.

Figure a shows the dialog window for configuring the Mozilla cache. The Memory Cache parameter sets the maximum amount of memory allocated to caching pages; its default value is 4,096 KB. Using the memory cache speeds up browsing within the same site, because most of its graphical objects are saved in memory and retrieved from there instead of from the hard drive.



Figure a: Configuring Mozilla cache
The Disk Cache parameter sets the size of the disk cache. Usually, its default value is set to 50,000 KB (about 50 MB). This amount is too small for regular Web surfing and will be used up quickly. If your hard drive allows, I recommend increasing this value. The Disk Cache Folder parameter specifies the folder, in which the disk cache is stored.

You can also specify when a page in the cache should be compared with the page on the network. The following four options are available:

Every time I view the page — Self-explanatory.

When the page is out of date — Ditto.

Once per session — Every time the browser is started.

Never — The page will always be loaded from the local cache; you can reload it by clicking the Reload button on the browser's toolbar.

When working with the Internet through a proxy server or local browser caching, you should remember that the pages you load can be outdated. To load the fresh version of a page, click the Reload button.

Working with Squid Proxy Server

Working with Squid
Here I will consider some security aspects of the squid service and the supplementary features that can accelerate Internet operations.

Squid Security
When I first read the squid documentation, I found the following two directives interesting: cache_effective_user and cache_effective_group. If squid is run as root, the user and group identifiers will be replaced with those specified by these directives. Both identifiers are set to squid by default:

cache_effective_user squid
cache_effective_group squid

In this way, squid will not work with root rights; when an attempt is made to run it so, the service will itself lower its rights to squid. I do not recommend modifying these directives. There is no need to give the squid service greater rights, because the rights to the cache directory are sufficient for it.

Site Acceleration
Squid can be used to access a certain site more rapidly by acting as an httpd accelerator. The following parameters have to be specified for this (a sample configuration fragment follows the list):

httpd_accel_host address — This indicates the host name of the accelerated server.

httpd_accel_port port — This sets the port, to which the accelerated requests are to be forwarded. Most often, this is the default port (port 80).

httpd_accel_uses_host_header on|off — The HTTP header contains a Host field, which is not checked by squid. This may be a source of security problems. The developers recommend setting this option to off. It should be set to on if squid is operating in transparent mode.

httpd_accel_with_proxy on|off — This needs to be set to on for the cache to function as both a Web cache and an accelerator.
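
Putting it together, a minimal acceleration fragment of the squid.conf file might look like the following sketch (the host name is an assumption):

httpd_accel_host www.yourserver.com
httpd_accel_port 80
httpd_accel_uses_host_header off
httpd_accel_with_proxy on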

User Agent Field
Many statistical systems do not count, or do not allow entry to, users whose requests have a blank User Agent field. This field being blank indicates that the request was channeled through a proxy.

Another company I used to work for limited Internet access by IP addresses. I was the only programmer and the network administrator in my department. Only the department head, his assistant, and I were allowed Internet access. A few hours after I was hired, all other department workers had Internet access. How? Simple: I installed a proxy server on my computer, to which all of my coworkers could connect without having to go through an authentication process. The proxy redirected all requests from my coworkers to the main corporate proxy. Because all these requests were coming from me, the main proxy did not suspect anything.

It could have become suspicious, though, because there is a small flaw in this charitable solution: the User Agent field, which was blanked out when requests passed through my proxy. This can be fixed by filling the field out manually in the configuration file with the fake_user_agent directive. For example, the following line emulates requests coming from a Netscape browser:

fake_user_agent Netscrape/1.0 (CP/M; 8-bit)

Network Protection
The squid service is a two-edged sword: it can be used both to protect the network and to penetrate it. To prevent outside users from using the proxy server to connect to computers in the local network, the following directives have to be added to the configuration file:

tcp_incoming_address downstream_address
tcp_outgoing_address upstream_address
udp_incoming_address downstream_address
udp_outgoing_address upstream_address

In the preceding list, downstream_address is the address of the computer with squid installed whose network connection is directed to the local network; upstream_address is the address of the network connection directed to the Internet. If addresses are specified incorrectly, it will be possible to connect to the local network's computer from the outside. The following is an example of squid configured incorrectly:

tcp_incoming_address upstream_address
tcp_outgoing_address downstream_address
udp_incoming_address upstream_address
udp_outgoing_address downstream_address
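
A correct configuration, using placeholder addresses (assume the squid machine's local network interface has the address 192.168.1.1 and its Internet interface has the address 195.18.1.1), would look like this:

tcp_incoming_address 192.168.1.1
tcp_outgoing_address 195.18.1.1
udp_incoming_address 192.168.1.1
udp_outgoing_address 195.18.1.1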


Fighting Banners and Popup Windows
It was already mentioned that most traffic from any site is graphics. Most browsers allow the image-viewing feature to be disabled; this, however, will make Web surfing less convenient. Without graphics, some sites become less informative and more difficult to navigate; thus, it is not possible to dispense with graphics display altogether.

But there is a type of graphics that irritates and does not carry any useful information — the graphics we would love to, and can, get rid of. I am talking about banners. Consider how to disable banner display at the proxy server level. For this, first define the following rules in the squid.conf file:

acl banners_regex url_regex "/usr/etc/banners_regex"
acl banners_path_regex urlpath_regex "/usr/etc/banners_path_regex"
acl banners_exclusion url_regex "/usr/etc/banners_exclusion"

The first entry creates an ACL named banners_regex of the url_regex type that allows a complete URL to be searched. The last parameter specifies the /usr/etc/banners_regex file, in which the URLs of banner systems will be stored.

The second entry creates an ACL named banners_path_regex of the urlpath_regex type, which searches only the path part of the URL (without the protocol and host name). The last parameter here specifies the /usr/etc/banners_path_regex file, in which the path templates to be disallowed will be defined.

The third entry creates an ACL of the same type as the first one, named banners_exclusion and linked to the /usr/etc/banners_exclusion file. In the first two files, descriptions of URLs or templates to be used for killing banners will be stored. Sometimes, however, you may want to view a particular banner. In this case, its URL can be recorded in this file and the banner will be loaded.

Next, specify the following operators for the created ACLs:

http_access deny banners_path_regex !banners_exclusion
http_access deny banners_regex !banners_exclusion

Both directives do basically the same thing: They prohibit loading from the addresses specified in the banners_path_regex and banners_regex lists unless they are included in the banners_exclusion list.

Consider the following fragment of the contents of the /usr/etc/banners_regex file:

^http://members\.tripod\.com/adm/popup/.+html
^http://www\.geocities\.com/ad_container/pop\.html

As you should remember, this file contains template URL paths, and all addresses that match them will be filtered out.

The first entry describes a template that prohibits loading of addresses of the following type:

http://members.tripod.com/adm/popup/popup.html

As you can see, it is easy to do away with the popup windows from the www.tripod.com site. If you know how to build regular expressions, you will be able to create a similar list for any banner system and cut off the most sophisticated paths of these graphical pests. The subject of regular expressions is not considered in this book, because it is too extensive and deserves a book of its own.

In your fight with banners, be prepared for the resurrection of banners you thought you had killed off. Banners are simply commercials that allow sites to earn money to stay in business, and some especially clever administrators are constantly looking for ways to prevent users from getting rid of them. One of the ways they achieve this is by changing the addresses, from which the banners are served, to neutralize regular expressions.

Replacing Banners
Even though in most cases banners and popup windows are irritating pests, they provide some artistic dressing for pages. Having eliminated them, you may find pages dull and unattractive. This problem can be alleviated by replacing removed banners and popup windows with your own images, which are stored on the local server and, thus, do not have to be loaded from the Internet.

The tool to implement this task is a redirector. In squid, this is an external program that replaces addresses. For example, if the page code contains an address for a banner and your banner-filter program detects it, the redirector replaces the address of the other guy's banner with the address of whatever you may want to load in its place.

There is only one little problem with this: Linux has no ready program for this task and you will have to write it yourself. Any programming language will do, but I will show an example implemented in Perl. If you know how to program in this language, I am certain you will like replacing banners better than simply killing them using ACLs.

Listing 9.2 shows an example of a classic redirector program. I tried to simplify it as much as possible to make it easier to adapt for your needs.

Listing 9.2: Perl redirector program


#!/usr/bin/perl

$| = 1;

# Specify the URL on your Web server where the images
# are stored.
$YOURSITE = 'http://yourserver.com/squid';
$LOG = '/usr/etc/redirectlog';
$LAZY_WRITE = 1;

if ($LOG) {
open LOG, ">> $LOG";
unless ($LAZY_WRITE) {
select LOG;
$| = 1;
select STDOUT;
}
}
# Add descriptions of the 468 x 60 banners' URLs here.
# (The comments are kept outside the qw() list because
# qw() would treat them as list elements.)
@b468_60 = qw (
www\.sitename\.com/cgi/
);

# Add descriptions of the 100 x 100 banners' URLs here.
@b100_100 = qw (
www\.sitename\.com/cgi/
);

# Add descriptions of nonstandard-size banners' URLs here.
@various = qw (
www\.sitename\.com/cgi/
);

# Add descriptions of popup windows' URLs here.
@popup_window = qw (
^http://members\.tripod\.com/adm/popup/.+html
^http://www\.geocities\.com/ad_container/pop\.html
^http://www\.geocities\.com/toto\?
);

# Descriptions of where images are located
$b468_60 = "$YOURSITE/468_60.gif";
$b100_100 = "$YOURSITE/100_100.gif";
$various = "$YOURSITE/empty.gif";
$closewindow = "$YOURSITE/close.htm";

while (<>)
{
($url, $who, $ident, $method) = /^(\S+) (\S+) (\S+) (\S+)$/;
$prev = $url;

# A check for 468 x 60 banners

$url = $b468_60 if grep $url =~ m%$_%, @b468_60;

# A check for 100 x 100 banners
$url = $b100_100 if grep $url =~ m%$_%, @b100_100;

# A check for non-standard size banners
$url = $various if grep $url =~ m%$_%, @various;

# A check for popup windows
$url = $closewindow if grep $url =~ m%$_%, @popup_window;

# An individual site not included in the list at the
# beginning of the file
$url = "$YOURSITE/empty.gif" if $url =~ m%hitbox\.com/Hitbox\?%;

if ($LOG and $url ne $prev)
{
my ($sec, $min, $hour, $mday, $mon, $year) = localtime;
printf LOG "%02d.%02d.%04d %02d:%02d:%02d: %s\r\n",
$mday, $mon + 1, $year + 1900, $hour, $min, $sec,
"$who $prev > $url";
}
print "$url $who $ident $method\n";
}
close LOG if $LOG;




Save this program in the /usr/etc/redirector file and give squid the rights to execute it. Afterward, add the following entry to the squid.conf file:

redirect_program /usr/etc/redirector
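
The execution rights can be given, for example, as follows (assuming that squid runs under the squid user, as recommended earlier):

chown squid:squid /usr/etc/redirector
chmod 755 /usr/etc/redirector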

For the program to work, you will have to create the following files on your Web server:

468_60.gif — A 468 × 60 image.

100_100.gif — A 100 × 100 image.

empty.gif — An image that will replace all nonstandard banners. It is best to make it 1 × 1 pixels so that it does not spoil the aesthetics of the site's design.

close.htm — An HTML file to close popup windows. It contains the window.close() JavaScript function to close the windows. Listing 9.3 shows a minimal example of the contents of this file.

Listing 9.3: JavaScript for killing popup windows


<html>
<head>
<script type="text/javascript">
window.close();
</script>
</head>
<body>
</body>
</html>
All these files should be stored on the Web server in one directory. Don't forget to specify the correct path to this directory in the script's $YOURSITE variable.

I tried to explain the most important code areas in Listing 9.2 with comments. If you have Perl programming experience, you will have no problems making it all work.

Barring Sites
I had a conversation with an acquaintance not long ago, and he offered a definition of the Internet that I found amusing: The World Wide Web was created for and lives by pornography. Although I do not completely agree with him, I feel he might be partially right in that the sites with sexy content are most frequently visited (if you don't take into account the Microsoft update site, from which users download patches for software from this company).

No employers will be happy if their workers visit sites with illicit content during work hours. This produces not only wasted traffic but also other expenses unrelated to work. Parents do not want their children to be exposed to sites like these either, and they strive to shelter them from too many facts of life too early. I am saying this as a father of two children.

Pornography sites can be easily banned using the same methods as those used to kill banners. For example, you could disallow any site whose URL contains the word "sex." But this method can produce false positives. For example, an address may contain the text "GasExpo" in it. Because it contains a letter combination that spells "sex," this site will be barred. This is a real-life example, in which a user was not allowed to load a gas-equipment exhibition site.

Although creating lists of prohibited sites is a difficult task, it is not an impossible one. Currently, most sites of erotic persuasion have folded their activities in the com domain and are settling down in other domains, which usually belong to small island nations. In some of these domains, almost 90% of the sites are of the adult entertainment nature. These you could bar without any fear that someone won't be able to catch up on the latest in gas-equipment developments.
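
As with banners, the barring itself is done with ACLs. The following is a minimal sketch (the ACL name badsites and the file /usr/etc/porn_domains are arbitrary examples) that bars whole domains listed, one per line, in a file:

acl badsites dstdomain "/usr/etc/porn_domains"
http_access deny badsites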

Limiting Bandwidth
Frequently, when organizing Internet access, some users have to be provided with a higher-speed connection than others. How can this be accomplished if, by default, all users are peers and can access the Internet at the maximum speed available? You have to establish some priorities to achieve this.

If a user requires a wide bandwidth channel to work with applications requiring a high data-exchange rate (e.g., for presentations), you have to reserve for this user a channel of wider bandwidth than that set aside for the rest of the users. This can be achieved only by borrowing bandwidth from other users.

Limiting the external channel is easy to accomplish using squid. The following example lists the directives used to achieve this:

delay_pools 3
delay_class 1 1
delay_class 2 2
delay_class 3 1
delay_parameters 1 256000/256000
delay_access 1 deny all
delay_access 1 allow admins
delay_parameters 2 256000/256000 4000/8000
delay_access 2 allow all
delay_access 2 deny admins
delay_parameters 3 64000/64000
delay_access 3 deny all
delay_access 3 allow bigboss

Add this code to the /etc/squid/squid.conf configuration file after the following comment:

# DELAY POOL PARAMETERS (all require DELAY_POOLS compilation option).
#--------------------------------------------------------------------

Most of these parameters are already present in the file with their default values and only have to be modified.

The first line — delay_pools n — specifies that there will be n delay pools (rules describing access speeds) to use. By default, n equals 0, meaning that no delay pools are defined. Because you are going to create three pools, n is set to 3.

Next, the pools are actually created using the delay_class n c directive, where n is the pool number and c is the class number.

There are three different pool classes. These are the following:

1 — The download rates of all connections in the class are added together, and the aggregate is kept below a given maximum value. For example, you can limit the download speed from all adult entertainment sites (defined in advance using acl tag) to 32 Kb/sec. If your Internet connection bandwidth is, for example, 256 Kb/sec, no matter how many people try to download hot stuff, they will have only 32 Kb/sec to share, with the rest of the users guaranteed the remaining 224 Kb/sec of bandwidth.

2 — The aggregate bandwidth for all connections in the class and the bandwidth of each connection in the class is limited. For example, with a 256 Kb/sec Internet connection, you can limit a certain class of users to 128 Kb/sec and ensure that no single user gets more than his or her fair share of this bandwidth.

3 — The aggregate bandwidth for all connections and the bandwidth for each IP range and the bandwidth for each connection is limited. Suppose you have four IP ranges (subnetworks) in your local network and an Internet connection speed of 512 Kb/sec. You want to leave 64 Kb/sec available for mail and other service traffic. This leaves 512 - 64 = 448 Kb/sec for all four subnetworks. Each of the four subnetworks is further limited to about 112 Kb/sec. Each user of each subnetwork is then limited to his or her share of the subnetwork's bandwidth, the actual bandwidth depending on the number of users and their download habits.
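
To make the class 3 arithmetic concrete, here is a standalone sketch (the pool number and the per-user value are illustrative; note that delay_parameters takes bytes per second, so 448 Kb/sec is 56,000 bytes per second and 112 Kb/sec is 14,000 bytes per second):

delay_pools 1
delay_class 1 3
# Aggregate, per-subnetwork, and per-user limits
delay_parameters 1 56000/56000 14000/14000 4000/4000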

In the example, I used delay pools class 1, class 2, and class 1 again. I did it on purpose to make the example more illustrative.

Next, speed limits are set on each pool as follows:

delay_parameters delay_pool aggregate_bandwidth network_bandwidth user_bandwidth

The delay_pool parameter is the pool number whose bandwidth is being limited. In the example, the following line limits the bandwidth of the first pool:

delay_parameters 1 256000/256000

Because pool 1 is of the type 1 class (delay_class 1 1) — that is, only its aggregate bandwidth can be limited — the directive takes only one parameter: aggregate_bandwidth (the value 256000/256000). The parameter's value consists of two numbers separated by a slash. The first number is the actual speed limit (in bytes per second). The second number is the threshold, in bytes downloaded, at which this speed limit kicks in. For example, with a value of 4000/16000, the first 16,000 bytes of a large file will be downloaded at the normal speed, whatever it is. But then the limit will kick in, and the remainder of the file will download at 4,000 bytes per second (32 Kb/sec).

The number of parameters depends on the pool class. Only two parameters have to be specified for the class 1 pool, which limits the aggregate connection bandwidth:

delay_parameters delay_pool aggregate_bandwidth

The directive for the second pool class looks as follows:

delay_parameters delay_pool aggregate_bandwidth user_bandwidth

Thus, the first directive sets the aggregate bandwidth of all connections to 256,000 bytes per second (or 2 Mb/sec). No bandwidth limit is imposed if it is specified as -1.

After the bandwidth limitations for the first pool are specified, access rights to the pool are set by the delay_access directive as follows:

delay_access delay_pool allow|deny acl


The first parameter is the pool number. This is followed by the allow or deny option for the members of the list given as the last parameter (acl).

In the example, access rights to pool 1 are set for two groups: all and admins:

delay_access 1 deny all
delay_access 1 allow admins

The first directive bars all users from working at the given bandwidth, and the second gives access to it only to members of the admins ACL. It is assumed that only administrators are such members.

Next, the bandwidth limitations and access rights for the second pool are given:

delay_parameters 2 256000/256000 4000/8000
delay_access 2 allow all
delay_access 2 deny admins

The second pool is of the type 2 class. Here, the aggregate bandwidth limitation is specified (256,000 bytes per second), as well as the bandwidth limitation for individual connections (4,000 bytes per second). All users but the administrators will work at this speed.

Finally, there could be some problems if you limit the boss to a bandwidth of 4,000 bytes per second like a regular user. To avoid potential problems, separate permission is given to the boss as follows:

delay_parameters 3 64000/64000
delay_access 3 deny all
delay_access 3 allow bigboss

The bandwidth limitation feature can be used to bar loading of multimedia files during work hours. Listing 9.4 shows how to configure squid to read Web pages at regular speeds but to limit speeds for loading media files during work hours.

Listing 9.4: Limiting speed for loading media during work hours


# ACL describing the network
acl fullspeed url_regex -i 192.168.1
# ACL describing media files that must put the brakes on
# during work hours
acl mediaspeed url_regex -i ftp .exe .mp3 .avi .mpeg .iso .wav
# The period, during which the restriction on the
# download speed of media files applies
acl day time 08:00-18:59

# Two class 2 pools are needed.
delay_pools 2
delay_class 1 2
delay_class 2 2

# The first pool has no restrictions for anyone.
delay_parameters 1 -1/-1 -1/-1
delay_access 1 allow fullspeed

# The second pool restricts daytime speed.
delay_parameters 2 4000/100000 4000/100000
delay_access 2 allow day
delay_access 2 deny !day
delay_access 2 allow mediaspeed




I believe the comments to the code are sufficient to understand how it functions. The media file download speed, however, is limited for all users. If you want to exempt certain users from this restriction, you can create an ACL for them (named, for example, allowfull) and add the following line at the end of the listing:

delay_access 2 deny !allowfull
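
The allowfull ACL itself can be defined by source addresses; for example (the addresses are hypothetical):

acl allowfull src 192.168.1.5 192.168.1.7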

Squid Access Rights
This is the sorest subject for any administrator. Access rights to the various squid functions are defined in the /etc/squid/squid.conf configuration file. But because the main emphasis of this book is on the security aspects of Linux, I devote this subject to a separate section.

Access Control List
The first thing to consider is the ACL, which is a powerful tool for configuring site access rights. An ACL groups actions or users under a single name. The tag is issued in the following format:

acl name type decision_string

The functions of the tag's three parameters are the following:

name — This can be any name, preferably descriptive of the actions performed.

decision_string — This is a template whose function depends on the type of operation specified in the type parameter.

type — This parameter can take on the following values: src, dst, srcdomain, dstdomain, url_regex, urlpath_regex, time, port, proto, proxy_auth, method, browser, or user. The functions of the main types, specifying how to interpret the preceding parameter (decision_string), are as follows:

src — Access is controlled by source IP addresses.

dst — Access is controlled by destination IP addresses.

port — Access is controlled by the destination port number.

proto — A list of protocols is given delimited by a space.

method — This specifies the request method; for example, POST or GET.

proxy_auth — This requires an external authentication program to check user name and password combinations. Putting REQUIRED as the user name (i.e., acl password proxy_auth REQUIRED) allows any valid user name to be accepted.

url_regex — This instructs the function to search the entire URL for the regular expression you specify.

time — This indicates the time in the format day h1:m1-h2:m2. This string can be used to restrict access to only specified days of the week and times of day. The abbreviations for the days of the week are the following: S for Sunday, M for Monday, T for Tuesday, W for Wednesday, H for Thursday, F for Friday, and A for Saturday. An example is given after this list.
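
For example, the following ACL (the name workhours is arbitrary) covers Monday through Friday from 8:00 AM to 6:00 PM:

acl workhours time MTWHF 8:00-18:00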

The configuration file already contains several lists that are ready to use and usually do not have to be edited. These are shown in Listing 9.1.

Listing 9.1: Default ACL rules in the /etc/squid/squid.conf configuration file


acl all src 0.0.0.0/0.0.0.0
acl manager proto cache_object
acl localhost src 127.0.0.1/255.255.255.255
acl SSL_ports port 443 563
acl Safe_ports port 80 # http
acl Safe_ports port 21 # ftp
acl Safe_ports port 443 563 # https, snews
acl Safe_ports port 70 # gopher
acl Safe_ports port 210 # wais
acl Safe_ports port 1025-65535 # unregistered ports
acl Safe_ports port 280 # http-mgmt
acl Safe_ports port 488 # gss-http
acl Safe_ports port 591 # filemaker
acl Safe_ports port 777 # multiling http
acl CONNECT method CONNECT





The preceding list is the minimum recommended ACL configuration.

The first entry specifies an acl named all. The src type of decision string means this list covers all users whose IP address matches 0.0.0.0/0.0.0.0, that is, all users.

The next entry creates an ACL class named manager. It defines access to the cache_object protocol, as specified by the proto type and the cache_object decision string. And so on.
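
These ACLs are used by companion http_access rules, which the stock configuration file also contains. A typical set (it may differ slightly between squid versions) looks like this:

http_access allow manager localhost
http_access deny manager
http_access deny !Safe_ports
http_access deny CONNECT !SSL_ports
http_access deny all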

Now, try to create your own ACL class. Suppose you have to allow access to the Internet for ten computers in your network with addresses from 192.168.8.1 to 192.168.8.10 (the subnet mask is 255.255.255.0). Access should be denied to all other computers in the network.

When creating the list, you should take the approach of denying access to everyone and allowing it only to those who require it. A class for all users already exists in the default list: acl all src 0.0.0.0/0.0.0.0. A list for the ten computers is named, for example, AllowUsers; it is of the src type, the decision string itself being the range of addresses in question. Here is how it looks:

acl AllowUsers src 192.168.8.1-192.168.8.10/255.255.255.0

This ACL class, named AllowUsers, includes all computers in the specified address range.

Assigning Rights
After access lists have been created, access rights for each of them can be assigned using the following commands:

http_access allow|deny ACL_name — Specifies access rights to HTTP. In the following example, all users, except those specified in the AllowUsers ACL, are prohibited access to HTTP. Note that squid applies the first rule that matches, so the allow directive must come before the one denying everyone:

http_access allow AllowUsers
http_access deny all

By specifying access rights for the AllowUsers ACL, it takes only one line to allow access for all computers included in this ACL. This eliminates the need to specify rights for each computer and makes the lives of administrators of big networks much easier.

In the previous example, only computers in the 192.168.8.1 to 192.168.8.10 address range were allowed access to the Internet. Access will be denied to any computer trying to access the Internet from any other address.

icp_access allow|deny ACL_name — Specifies access rights to the proxy server over ICP. By default, access is denied to all:

icp_access deny all

miss_access allow|deny ACL_name — Specifies rights to receive the MISSES reply. In the following example, only local users have the rights to receive the MISSES reply; all other users can only receive the HITS reply:

acl localclients src 172.16.0.0/16
miss_access allow localclients
miss_access deny !localclients

Authentication
Using an IP address to limit access rights does not guarantee security, because an IP address can be faked. Moreover, there always exists a possibility that the wrong people can obtain physical access to a computer that is allowed access to the Internet. Once they do, what they do with it is up to their good, or ill, will.

I used to work for a company, in which each employee was allotted a certain monthly download limit, with the excess paid for by the employee. The authentication procedure was based on the IP address.

Note: Authentication does not work if squid is configured to work in the transparent mode.


Once, several employees were noticed to have gone over their traffic limit significantly. This would have been no big deal, except these guys were away on vacation. Someone was faking their IP addresses and using their share of the Internet traffic.

To prevent something similar from happening to you, you should employ supplementary protection by checking the user name and password. This is done using the following directive:

authenticate_program path_to_program path_to_pswdfile

The directive specifies the path to the external authentication program and the path to the password file. By default, the authenticator program is not used. The traditional proxy-authentication program can be specified by the following directive:

authenticate_program /usr/lib/squid/ncsa_auth /usr/etc/passwd

The path to the ncsa_auth program may be different for your system.

You must have at least one ACL of the proxy_auth type to be able to use the authentication feature of the proxy server.
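
For example, a minimal sketch combining the preceding directives (the ACL name password and the file paths follow the earlier examples; adapt them to your system) could look like this:

authenticate_program /usr/lib/squid/ncsa_auth /usr/etc/passwd
acl password proxy_auth REQUIRED
http_access allow password
http_access deny all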

Consider the following directives:

authenticate_children n — Specifies the number of concurrent authentication processes to spawn. One process cannot perform authentication of several clients at once; consequently, if too few processes are running, users will have to wait their turn to be authenticated before they can access the Internet through the proxy server.

authenticate_ttl n hour — Indicates the time in hours that the authenticated user name-password pair remains cached. During this time, the user can work without having to undergo the authentication process again. The default value is 1 hour; however, if a wrong password is entered, the pair is removed from the cache.

authenticate_ip_ttl n seconds — Specifies how long a proxy authentication will be bound to a specific IP address. The purpose of this directive is to prevent password sharing. Setting it to 0 will prevent users logging in with the same password from different IP addresses. For dial-up users, this value can be increased to 60 seconds so that the user can redial in case of a connection break. However, dynamic IP addresses are normally used for dial-up connections, with a new address given for each connection; consequently, it is not guaranteed that the original address will be given for the repeated call.

authenticate_ip_ttl_is_strict on|off — If set to on, access from other IP addresses is disallowed until the time specified in authenticate_ip_ttl expires.

Proxy Server Operation
Initially, proxy servers were intended for solving a specific task, namely, caching data received from the Internet. For example, you may have a network of a hundred computers that all connect to the Internet using one physical communications link. It is well known that most users load the same pages several times a day. Repeatedly loading the same page wastes the shared connection's bandwidth.

Do a simple calculation. Every day you use a search engine, for example, Yahoo (www.yahoo.com) or Google (www.google.com). Assuming that, on average, 10 requests are made from each of the 100 computers, about 1,000 loads of the same page will be made every day. I will not calculate how many megabytes this is, for it is already obvious that bandwidth is wasted.

A proxy server solves this problem by storing (caching) a Web page on the local disk the first time it is accessed. The next time a local user asks for this page, instead of requesting it from the remote server, the local server serves it from the local disk cache. The savings are obvious. With time, these features have been enhanced and currently offer the following functions:

Caching documents received from the network

Caching the results of DNS requests

Organizing a network access gateway

Controlling Internet access

Providing anonymous Internet access by hiding addresses

Reducing IP address use

In this chapter, the most popular Linux proxy server, squid, is considered.

To reduce the bandwidth load and to increase the loading speed, a special program is installed on the server that provides access to the Internet (Fig. 9.1). When a page, for example, www.yahoo.com, is loaded on one of the local network's computers for the first time, all of its contents are saved in the proxy's cache. The next time the same page is requested from the local network, the images it contains are loaded not from the Internet but from the proxy server, and the text (depending on the contents of the page and the changes to it) may be loaded from the source server.


Fig. 9.1: Accessing the Internet via a proxy server

As a rule, the graphical contents of a page take up most of its volume. The text part of a page does not usually exceed 15 KB, but the graphical part can be 100 KB and more. Loading the latter information from the local proxy server makes it possible to reduce the bandwidth load and increase the page-loading speed.

The loading speed is increased because the proxy server sends most of the Web page data (all graphics and the unmodified text) at the local network rate, which currently is 100 Mb/sec on even the cheapest network equipment. The dial-up Internet connection speed is much lower, with real throughput ranging from 2 to 8 KB/sec. At this lower rate, only the data that have changed and thus cannot be served from the cache are loaded (most often, the HTML text of the page).

In addition to caching Web pages, a proxy server can cache the results of DNS requests. This can also have a positive effect on performance. Although humans prefer to use symbolic Web page names, computers convert them to the corresponding numerical IP addresses. Thus, before a page can be loaded, some time is taken to convert the symbolic address to its IP form. However, if the site has already been accessed, its IP address will be saved in the proxy's cache. So instead of going to a DNS server for the IP address, the proxy will take it from its cache.

As the World Wide Web has been developing and the requirements of its users have been increasing, capabilities of proxy servers have also been growing. Now, a proxy server can perform gateway functions and provide Internet access without additional software or equipment. Moreover, it serves as a shield guarding the network against invasions from the outside. When any of the proxy clients sends a Web page request to the Internet, the proxy server hides the client's IP address and sends the packets on its own behalf. This means that hackers can see only the address of the proxy server and will attempt to break into it and not into the computers it services. This makes it much easier to organize defense against outside attacks, because you can give more attention to one computer, that is, the proxy server, instead of spreading it among all client computers. However, the protection capabilities of proxy servers are too basic and are easily circumvented, so they should be supplemented by a good firewall and an eagle-eyed administrator.

The IP address concealment feature also makes it possible to save IP addresses. Because only the proxy server has the actual Internet connection, only it must have a public IP address. The rest of the computers in the local network can have unroutable addresses reserved for private networks (in the 192.168.x.x or 10.x.x.x ranges).

There are two types of proxy servers: transparent and anonymous. Transparent proxies simply forward a client's packets to the requested Web server without changing the sender's address. A proxy that conceals the sender's IP address is called anonymous. This server communicates with the external world on behalf of clients under its own name. This feature is often taken advantage of by miscreants. For example, hackers do their break-ins through anonymous proxy servers so that the owners of the burglarized machines will not be able to determine, from which address the break-in was perpetrated.

Today, there are many servers on the Internet claiming to offer anonymous proxy services, but not all of them actually do. Some of them make the request's source IP available to the system, to which the request is directed; others log all traffic activities, including IP addresses, with the logs available to law-enforcement agencies. Consequently, you can never be sure that a server is as anonymous as it claims to be.

Because not all of a network's computers are allowed Internet access, user authentication can be carried out on the proxy server level.

Some proxy versions have a handy feature: They can exchange their cache data. For example, several offices may share one local network, but each of them has separate Internet access through its own proxy server to keep the Internet bills separate. The individual proxies can be combined into a sort of a proxy network so that if one of them does not have the information requested in its cache, it will check the caches of the other proxies for it.

Most often, this cache-sharing feature is implemented using the Internet Cache Protocol (ICP). If one server does not find the requested document in its cache, it sends an ICP request to the other proxies. If one of the proxies gives a positive reply, the information will be taken from its cache.

Using cache sharing does not lead to a significant loading-speed increase when requesting small documents, because it takes extra time to search for the documents in the shared caches. With a large request load on the servers and a sizable cache base, the search time may even be so long that it eliminates any speed advantages. The bandwidth savings remain, however, which may be important for those who have to watch each megabyte of traffic.

Not all proxy servers have all the features just considered. It all depends on the purposes, for which a particular proxy was developed; some proxies are intended to address only one task.

To work through a proxy server, you have to properly configure the program you want to use the proxy with. Consider the Mozilla browser as an example. Launch the browser and select the Edit/Preferences menu sequence. A tree of configurable categories is located in a pane on the left of the Preferences dialog window. Select the Advanced/Proxies category to configure proxy server connections. The default is Direct connection to the Internet (no proxy). Select the Manual proxy configuration option and specify the IP address and port of the proxy server for each protocol.



Configuring a proxy server in Mozilla

When configured to use a proxy server, the browser will send all requests to the proxy server, which will then forward them to the destination server. The proxy server must always be running and must listen on the specified port (or on several ports for different protocols).

A separate port is allocated for each protocol. For HTTP, intended to load Web pages, port 8080 is most often used; however, this value depends on the server and can be changed. Before using proxy server software, make sure that it has the necessary features and supports the necessary protocols. If the proxy does not support a certain protocol, traffic for that protocol will go to the Internet directly.

To enhance the security of your network, you should configure the firewall to prohibit incoming connections from the outside to the ports used by the squid proxy service. For example, squid uses port 3128 for HTTP proxying. Prohibiting incoming connections to this port will prevent the proxy server from being used for purposes other than those it is intended for, for example, breaking into the network.
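
With iptables, such a rule could look as follows (a minimal sketch; eth0 is assumed to be the interface facing the Internet, so substitute your own interface name):

iptables -A INPUT -i eth0 -p tcp --dport 3128 -j DROP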