A prototype application Firewall to filter out dangerous e-mails MSc in DMS 1999-2000
Elias Vouropoulos - 1 -
Contents
1. About the project
2. Internet security
  2.1 The need for security
  2.2 Cryptography
  2.3 Firewalls
    2.3.1 Network level firewalls
    2.3.2 Application level firewalls
  2.4 IP addresses
  2.5 The three myths of firewalls
3. E-mail
  3.1 Introduction
  3.2 Message Format
  3.3 MIME (Multipurpose Internet Mail Extensions)
  3.4 Message Transfer-SMTP
  3.5 E-mail gateways
  3.6 POP3 (Post Office Protocol, version 3)
4. E-mail problems
  4.1 Spam e-mail
  4.2 E-mail threats
  4.3 Address spoofing
5. E-mail filtering
  5.1 The fundamentals of e-mail filtering
  5.2 E-mail filtering products
    5.2.1 Eudora
    5.2.2 Pegasus
    5.2.3 Procmail
6. Design, implementation and evaluation of the project
  6.1 Outline of the design
  6.2 The creation of the e-mail reader
    6.2.1 Design
    6.2.2 Implementation
    6.2.3 Evaluation of the e-mail reader
  6.3 Establishing the filters
    6.3.1 Design
    6.3.2 Implementation
    6.3.3 The implementation of the filter
    6.3.4 Manipulation of the files containing rejected e-mail and IP address, domains, suspicious words and dangerous attachments
    6.3.5 Evaluation of the e-mail filter
7. Future Improvements
8. Conclusions
9. References
10. Bibliography
APPENDIX A: Experiences gained through the project
APPENDIX B: Project Objectives and Deliverables
APPENDIX C: Project Interim Report
APPENDIX D: The manpage of Mail::POP3Client
1. About the project
The purpose of this project was for me to become familiar with some aspects of
Internet and network security. It was the first time that I had come close to this
subject, so I was entirely unacquainted with the area. The purpose of my MSc project
was therefore not to make me an expert in network security, or more specifically in
e-mail filtering and security. Its real purpose was to give me the basic knowledge of
the area that I need in order to cope with real e-mail and other network security
problems, in particular e-mail filtering.
Originally the project title was “Firewalls, encryption and other aspects of security”.
This was so vague and potentially so wide that it would not have been possible to
implement anything in the time allowed. My studying for the interim report also made
me understand that it was impossible to cover all the aspects of Internet security in
an MSc project. My new supervisor, Mr Bill Whyte, who started to supervise me after
the interim report, and my assessor were of the same opinion after they had read my
interim report. So at the first meeting with my new supervisor we agreed that the
project had to be redefined in order to become more achievable. My new supervisor
proposed that the project should be the design and implementation of a prototype
firewall for filtering e-mails. I found his proposal very interesting, so my project
was redefined and its new title was “A prototype application firewall to filter out
dangerous e-mails”.
2. Internet security
2.1 The need for security
Computer networks are one of the most rapidly developing areas of computing, and they
affect not only computer science but also many other areas of the modern world.
Almost all enterprises around the world use computers that are connected with each
other in order to exchange information with their clients and the rest of the world.
Consequently every organisation and enterprise that wants to survive the hard
competition and the speedy changes of the market has to use computer network
facilities. This is even more imperative with the rise of e-commerce as the new
standard for business. Thus Internet security has become a topical and crucial matter.
The most basic question that someone would ask is why Internet security, and network
security in general, is needed. To answer that question the implementation of the
Internet has to be examined. The Internet is the most widely known network. It is
sometimes called the “network of networks” and uses the TCP/IP (Transmission Control
Protocol/Internet Protocol) protocol suite. TCP/IP is not something new. Its origins
can be found in the creation of ARPANET. The first ARPANET mainly provided high
bandwidth connectivity between some major US computing sites, such as government and
educational organisations and research laboratories. It gave its users the ability to
transfer files and e-mails from one site to another [Hare, Siyan, 1996]. Although
TCP/IP was created many years ago it is still, with some new versions, the basis of
the Internet.
The fundamental problem is that the Internet was not designed to be very secure. The
explanation can be found by examining two of its significant characteristics:
distributed processing and open communications. What does that mean? To put it simply,
when one computer communicates with another, almost all the other members of the
network may observe that communication and consequently access the information that
the two computers exchange. That phenomenon stems from the physical connection of the
computers that comprise the entire network. The two computers that communicate with
each other are not directly physically connected (by a cable or an optical fibre
linking just these two computers) [Feghhi et al, 1999].
Figure 1: the network physical connection (workstations attached to the entire network of intermediate nodes A, B, C, D, F and G)
Although the two computers appear to be connected directly, what really exists is a
virtual path between them. Their physical connection is established through a large
number of intermediate computers that are physically connected with each other. All
the computers that comprise the physical path between the two computers are able to
read the information. So if this information is sensitive, for instance the number of
a VISA card, the sender has to find a way (a function) to convert the data into a
stream of bits that is incomprehensible to the other members of the network. The
receiver has to perform the inverse of the sender's function in order to restore the
information to its original comprehensible form. The reason that only these two
computers, the sender and the receiver, can perform the transformation from meaningful
information to meaningless bits and back is that only they share a secret, a key, that
allows them to do so. That technique is called encryption.
Apart from encrypting data, Internet security also requires protecting the information
on a server from unauthorised users. This is what firewalls are responsible for. A
firewall is a set of mechanisms that protects one network from another. The firewall
is placed between an internal secure network and the rest of the Internet.
2.2 Cryptography
Cryptography is the science of converting an original message, called plaintext, into
an unintelligible message, called ciphertext. Cryptography has two purposes:
1. To make the cost of breaking the cipher greater than the value of the encrypted
information.
2. To make the time that an attacker has to spend breaking the ciphertext so long that
the information has lost its value by then.
The cryptographic algorithms are divided into two categories:
1. Secret-key algorithms
2. Public-key algorithms
Secret-key encryption: the algorithms in this category use the same secret key for
encryption and decryption. Any holder of the key can encrypt and decrypt information
[Stallings, 1995].
Figure 2: secret-key encryption (plaintext -> encryption -> ciphertext -> decryption -> plaintext, with the same secret key used for both steps)
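The idea of one shared key serving both directions can be illustrated with a small sketch. Python is used here purely for illustration, and the cipher below (XOR with a keystream derived from SHA-256) is a toy chosen for brevity, not a scheme discussed in this report and not one that should be used to protect real data. The names keystream and xor_cipher and the sample key are assumptions made for the example.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the shared secret key (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


secret_key = b"shared secret"          # known only to sender and receiver
plaintext = b"4111 1111 1111 1111"     # made-up sensitive data
ciphertext = xor_cipher(secret_key, plaintext)   # what intermediate hosts see
recovered = xor_cipher(secret_key, ciphertext)   # the receiver applies the same key
```

The same function and the same key are used in both directions, which is exactly the property that makes key distribution the weak point of secret-key schemes.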
Public-key encryption: this involves the use of two different keys: the private key,
which must be kept secret, and the public key, which can be freely shared with anyone.
The public key is used for encryption and the private key for decryption. Public-key
algorithms are used not only for encryption but also for the creation of digital
signatures [Stallings, 1995].
Figure 3: public-key encryption (plaintext -> encryption with the public key -> ciphertext -> decryption with the secret (private) key -> plaintext)
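As a hedged illustration of the two-key idea, the sketch below uses textbook RSA with very small primes. RSA is only one example of a public-key algorithm and is not named in the text above; the primes, the exponent and the function names are assumptions chosen to keep the arithmetic readable, and numbers this small offer no security. The modular inverse via pow requires Python 3.8 or later.

```python
# Toy RSA key generation with tiny primes (illustration only).
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, shared with anyone
d = pow(e, -1, phi)            # private exponent, kept secret


def encrypt(message: int) -> int:
    """Anyone holding the public key (n, e) can encrypt."""
    return pow(message, e, n)


def decrypt(cipher: int) -> int:
    """Only the holder of the private key (n, d) can decrypt."""
    return pow(cipher, d, n)


message = 65                   # a message must be a number smaller than n
cipher = encrypt(message)
restored = decrypt(cipher)
```

Swapping the roles of the two exponents (transforming with d, checking with e) is the basis of the digital signatures mentioned above.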
2.3 Firewalls
A firewall is a set of mechanisms that protects one network from another. In practice
a firewall is a pair of mechanisms: one is used to block traffic and the other is used
to permit traffic [1]. Generally firewalls are designed to deny access to unauthorised
users coming from an external network. More sophisticated firewalls block traffic from
outside to inside but allow traffic from inside to outside. So, “a firewall is a
filtering mechanism placed between the private network and the outside world
(Internet) so that all incoming and outgoing traffic is forced to pass through it to
prevent unwanted and potentially damaging intrusion” [2].
The basic idea of the firewall is that the inner network remains theoretically
invisible (or at least unreachable) to anyone who has no authorisation to reach the
trusted network from the outside world. So all communication with the rest of the
Internet takes place through the firewall. The firewall receives the request from
someone who is trying to reach a computer inside the inner network and decides
whether or not to allow the request to pass through.
Figure 4: the use of the firewall
Although firewalls are usually placed between a trusted network and the Internet,
there are some cases, especially in big organisations and enterprises, where firewalls
are used to create different sub-nets within the network and consequently different
levels of access to information for employees.
There are different implementations of firewalls, but they can be divided into two
basic types: network level and application level firewalls.
2.3.1 Network level firewalls
Network level firewalls are also known as packet filtering firewalls. They are usually
router based, so the rules about who and what can access the inner network are applied
at the router level. These firewalls accept or deny access based on the source
address. The router examines each packet coming from the outside world. After the
source IP address has been identified the router decides, according to its rules,
whether to forward the packet or reject it. The philosophy used by the router is that
everything that is not forbidden is permitted. This type of firewall is not very
secure because it can be bypassed by attackers using forged IP addresses. On the other
hand it is fast, because it is easy for the router to identify the source IP address
of an incoming packet and check whether it is restricted or not [3].
[1] Ranum, M J and Curtin, M (1998), Internet Firewalls Frequently Asked Questions, http://www.hideaway.net/texts/fwfaq.html
[2] Khalid Al-Tawil and Ibrahim A. Al-Kaltham (1999), Evaluation and Testing of Internet Firewalls, International Journal of Network Management, Int. J. Network Mgmt. 9, pp. 135-149
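The "everything that is not forbidden is permitted" rule can be sketched in a few lines. This is a minimal illustration in Python, not the behaviour of any particular router: the list BLOCKED_NETWORKS and the function forward_packet are hypothetical names, and the blocked ranges are reserved documentation addresses rather than real offenders.

```python
import ipaddress

# Hypothetical rule set: source networks that the filter must reject.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def forward_packet(source_ip: str) -> bool:
    """Return True if a packet from source_ip should be forwarded.

    Default-permit policy: only sources that match a blocked network are rejected.
    """
    address = ipaddress.ip_address(source_ip)
    return not any(address in network for network in BLOCKED_NETWORKS)
```

Because the decision uses nothing but the claimed source address, a packet carrying a forged address from outside the blocked ranges passes the check, which is the weakness described above.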
2.3.2 Application level firewalls
This type of firewall is also known as an application gateway. In contrast to network
level firewalls, which are hardware (router) based, application level firewalls use
server programs, called proxies, that run on the firewall. So a computer is used as
the firewall instead of a router. When a remote user sends a request to a network that
uses an application gateway, the gateway intercepts the remote connection. Instead of
immediately allowing access to the internal network the gateway examines various
fields in the request. If these meet the predefined rules then the gateway allows the
remote user to access the internal network. Application gateways require a proxy for
each service, such as FTP, HTTP etc, that is to be supported through the firewall. The
application gateway is considered to be the most secure type of firewall [4].
2.4 IP addresses
As has already been mentioned, one of the most important elements that firewalls take
into consideration when deciding whether to allow a user to access the internal
network is the source IP address of the request. IP addresses are also one of the
filters that have been implemented in this project. So it is useful to discuss briefly
what IP addresses are.
[3] Ranum, M J and Curtin, M (1998), Internet Firewalls Frequently Asked Questions, http://www.hideaway.net/texts/fwfaq.html
[4] Ranum, M J and Curtin, M (1998), Internet Firewalls Frequently Asked Questions, http://www.hideaway.net/texts/fwfaq.html
Every computer connected to the Internet has a unique number so that other computers
can identify it. That is the IP address. IP addresses are 32-bit numbers that identify
Internet hosts. These numbers are placed in the header of each packet sent over TCP/IP
and are used to route the packets to their destination [5]. E-mail also travels over
TCP/IP, and the headers of an e-mail usually record the IP address of the sending
host. So we can use that information to help identify the sender of an e-mail.
As mentioned earlier, an IP address is a 32-bit number. It is usually written as four
fields separated by periods, each representing an 8-bit number in the range 0 to 255.
So an IP address looks like the following: “129.11.147.188”. IP addresses can be
divided into 4 classes: A, B, C and D. The value of the first field determines the
class to which an IP address belongs. Class D addresses are used for multicast
applications. The range of these values for each class is listed below [6]:
Class   First field   Allocation
A       1-126         N.H.H.H
B       128-191       N.N.H.H
C       192-223       N.N.N.H
D       224-239       Not applicable
N = Network and H = Host
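The dotted notation and the class table above can be expressed as a short sketch. Python is used for illustration only; the function names ip_to_int and ip_class are assumptions, and values of the first field that fall outside the table (for example 127) are simply reported as having no class.

```python
# Class boundaries copied from the table above: (class, lowest, highest first field).
CLASS_RANGES = [("A", 1, 126), ("B", 128, 191), ("C", 192, 223), ("D", 224, 239)]


def ip_to_int(address: str) -> int:
    """Pack the four 8-bit fields of a dotted address into one 32-bit number."""
    fields = [int(field) for field in address.split(".")]
    if len(fields) != 4 or any(not 0 <= field <= 255 for field in fields):
        raise ValueError(f"not a valid dotted IP address: {address!r}")
    value = 0
    for field in fields:
        value = (value << 8) | field
    return value


def ip_class(address: str):
    """Return 'A', 'B', 'C' or 'D' according to the first field, or None."""
    first_field = ip_to_int(address) >> 24
    for name, low, high in CLASS_RANGES:
        if low <= first_field <= high:
            return name
    return None
```

For the example address used in the text, ip_class("129.11.147.188") falls in the 128-191 range and is therefore a class B address.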
An IP address can be either static or dynamic.
• A static address is permanent. It is the address of a computer that is always
connected to the Internet.
• A dynamic IP address is one that is temporarily assigned to a node each time it
connects to the Internet. Dynamic addresses are used by ISPs for dial-up access:
each time a node dials up, a different IP address is assigned to it.
[5] Connected: An Internet Encyclopedia (April, 1997), IP Address, http://noc.ucsc.edu/cie/Topics/23.htm
[6] Chris Lewis (2000), IP 101: All About IP Addresses, http://www.networkcomputing.com/netdesign/ip101.html
2.5 The three myths of firewalls
The following three myths about firewalls are taken from Bob Blakley, a security
architect at IBM [7].
1. “We have got the place surrounded”: Firewalls assume that the only way to access
the inner network from the Internet is through the firewall. But that assumption is
not always true, because sometimes there are back doors into the inner network.
Usually these back doors are created by the users of that network. For instance, in
a huge enterprise users may set up their own back doors using modems and the
appropriate programs so that they can work from home.
2. “Nobody is here but us chickens”: Firewalls also assume that all the users in the
inner network are trustworthy. But that is not a realistic assumption, since many
computer crimes have been committed by insiders.
3. “Sticks and stones may break my bones, but words will never hurt me”: Word macros,
JavaScript, Java and other types of executable commands can be embedded inside
data, so a security system that is not aware of that fact may be totally insecure.
[7] MIT Kerberos team (2000), The Three Myths of Firewalls, http://web.mit.edu/kerberos/www/firewalls.html
3. E-mail
3.1 Introduction
Electronic mail, or e-mail as it is widely known, is one of the most popular uses of
the Internet. Millions of people use it to send mail to other people all around the
world. The first e-mail systems were simple file transfer protocols, with the
convention that the first line of every file sent as e-mail contained the recipient's
address. But that implementation had several problems. Among them, a message could not
be sent to more than one recipient, and it had no internal structure that would make
it easier for the server to process.
Over time the need for better standards grew steadily. “In 1982, the ARPANET e-mail
proposals were published as RFC 821 (transmission protocol) and RFC 822 (message
format). These have since become the de facto Internet standards” [Tanenbaum, 1996].
Two years later CCITT made another proposal for e-mail, X.400. After a decade of hard
competition between the two standards, e-mail systems using RFC 822 are in common use
and those using X.400 have almost disappeared. The reason was the very poor and
complex design of X.400 rather than the good implementation of the RFC 822 proposal.
An e-mail system consists of two parts: the Message User Agent (MUA) and the Message
Transfer Agent (MTA). The MUA is a client-based program that provides users with an
interface, graphical or not, for interacting with their e-mails. This project involved
creating a basic MUA, or e-mail reader, that applies some filtering rules. It uses the
POP3 protocol, which is explained later in this chapter, to communicate with the mail
server in order to access the e-mails. The MTA is the part of the system that moves
the messages from their source to their destination. A typical MTA uses SMTP (Simple
Mail Transfer Protocol, RFC 821). A typical e-mail system has to perform five basic
functions [Tanenbaum, 1996]:
• Composition: it refers to the creation of the e-mail.
• Transfer: it refers to the sending of the messages.
• Reporting: informs the sender if the e-mail was delivered.
• Display: it displays the incoming messages.
• Disposition: it allows the user to delete or save a message.
Function      MUA   MTA
Composition   x
Transfer            x
Reporting           x
Display       x
Disposition   x
3.2 Message Format
The message format (RFC 822) defines messages as having two parts: a header and a
body. Both parts consist of ASCII text. In the early days of e-mail there was no need
for the body of a message to be anything other than text. Later, however, the
increasing need of users to send files through e-mail led to the definition of MIME
(Multipurpose Internet Mail Extensions) [Peterson, Davie, 2000].
In RFC 822 each header field consists of one line of ASCII text. The most important
header fields are the following [Tanenbaum, 1996]:
• To: contains the DNS (Domain Name System) address of the primary receiver. Multiple
receivers are also allowed.
• Cc (Carbon Copy): contains the DNS address of the secondary receiver. As with the
To field, multiple receivers are allowed. For the e-mail system there is no
difference between the To and Cc fields. The difference is purely psychological
for the user.
• Bcc (Blind Carbon Copy): similar to the To and Cc fields, but this line is removed
from the copies that are sent to the primary and secondary receivers.
• From: identifies the person who sent the e-mail, although not always accurately, as
will be discussed later. Usually it contains their name and their DNS address.
• Received: this header is added by each MTA along the way. It contains information
about the agent's identity and the date and time.
• Date: the date and time the message was sent.
• Message-ID: a unique identifier for the message.
• Subject: a brief summary of what the e-mail contains.
RFC 822 also allows users to add new headers for their own private use. These headers
have to start with “X-”, for instance “X-No-Archive:”.
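To make the header/body split concrete, the following sketch parses a small message with the email package from the Python standard library. Python is used for illustration only, and the message itself (names, addresses, dates and the IP address in the Received line) is entirely made up.

```python
from email import message_from_string
from email.utils import parseaddr

# A made-up RFC 822 style message: header lines, one blank line, then the body.
raw_message = """Received: from mail.example.org ([192.0.2.10]) by mx.example.com; Mon, 4 Sep 2000 10:15:00 +0100
From: Alice Example <alice@example.org>
To: bob@example.com
Cc: carol@example.com
Subject: Project meeting
Date: Mon, 4 Sep 2000 10:14:58 +0100
Message-ID: <20000904101458.1@example.org>
X-No-Archive: yes

See you at ten.
"""

message = message_from_string(raw_message)

sender_name, sender_address = parseaddr(message["From"])  # split name and address
subject = message["Subject"]                 # header lookup is case-insensitive
private_header = message["X-No-Archive"]     # user-defined "X-" header
received_lines = message.get_all("Received") # one entry per MTA along the way
body = message.get_payload()                 # everything after the blank line
```

A filter that inspects the sender, the subject or the relaying hosts only needs the header values extracted this way; the body can be examined separately.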
3.3 MIME (Multipurpose Internet Mail Extensions)
As mentioned earlier, RFC 822 specifies that the body of the message consists of
US-ASCII characters. MIME extends the format of messages in order to allow [8]:
• Textual messages in character sets other than US-ASCII.
• An extensible set of different formats for non-textual message bodies.
• Multipart message bodies.
• Text header information in character sets other than US-ASCII.
MIME defines five new message headers. If a message does not have these fields, it is
handled as a plain US-ASCII message.
• MIME-Version: uses a number to identify the MIME version.
• Content-Type: specifies the nature of the data in the body of the message.
• Content-Transfer-Encoding: identifies the way the body of the message is encoded
for transmission through the network. The most appropriate way to encode binary
messages is base64 encoding [Tanenbaum, 1996].
• Content-ID: similar to the standard Message-ID header.
• Content-Description: a human-readable description of what the message contains. It
is similar to the Subject header defined in RFC 822.
[8] Multipurpose Internet Mail Extensions (RFC 2045) (1996), http://andrew2.andrew.cmu.edu/rfc/rfc2045.html
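The MIME headers listed above can be observed by building a message with a binary attachment. The sketch below uses EmailMessage from the Python standard library purely as an illustration; the addresses, the file name report.bin and the attachment bytes are hypothetical.

```python
from email.message import EmailMessage

message = EmailMessage()
message["From"] = "alice@example.org"
message["To"] = "bob@example.com"
message["Subject"] = "Report attached"
message.set_content("The report is attached.")   # plain text part

binary_data = bytes(range(256))                   # stand-in for a real file
message.add_attachment(
    binary_data,
    maintype="application",
    subtype="octet-stream",
    filename="report.bin",
)

attachment = next(message.iter_attachments())
content_type = attachment.get_content_type()
transfer_encoding = attachment["Content-Transfer-Encoding"]
decoded = attachment.get_content()                # the library undoes the encoding
```

Adding the attachment turns the message into a multipart body, and the binary part is carried in base64 so that it survives transport as ASCII text. An attachment filter would typically inspect the content type and the file name of each part.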
3.4 Message Transfer-SMTP
The message transfer subsystem of an e-mail system is responsible for transmitting the
message from its source to its destination. The simplest way to do that is to
establish a transport connection from the source machine to the destination machine
and then just transfer the message, although very often the receiving machine of a
connection is just an intermediate one and not the ultimate destination. Within the
Internet e-mails are moved by establishing a TCP connection to port 25 between the
sender-SMTP and the receiver-SMTP. The sender-SMTP generates SMTP commands and the
receiver-SMTP answers with replies. On each host there is an e-mail daemon (program).
The MUA tells the daemon that it wishes to send an e-mail, and the daemon uses SMTP to
send the e-mail to the daemon that runs on the receiving machine. The most popular
implementation of an e-mail daemon is sendmail on UNIX.
After the connection is established the sender acts as client and the receiver as
server. The server sends a message to the client giving its identity and stating
whether it accepts e-mails or not. If the server is willing to accept e-mails the
client sends a message to the server identifying the sender and the receiver of the
e-mail (their DNS addresses). If the receiver exists on the server, the server prompts
the client to send the e-mail.
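The client side of the exchange described above can be sketched as the list of lines a sender-SMTP would transmit. This is an illustrative Python sketch that opens no network connection; the function name smtp_client_commands and the host and mailbox names are assumptions. The command keywords (HELO, MAIL FROM, RCPT TO, DATA, QUIT) are those of RFC 821, and a real client would wait for the server's reply after each command before continuing.

```python
def smtp_client_commands(client_host, sender, recipients, message_lines):
    """Return the lines a client sends for one message, in order."""
    commands = [f"HELO {client_host}", f"MAIL FROM:<{sender}>"]
    commands.extend(f"RCPT TO:<{recipient}>" for recipient in recipients)
    commands.append("DATA")
    for line in message_lines:
        # A leading dot is doubled so the line is not mistaken for the end marker.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")       # a line with a single dot ends the message
    commands.append("QUIT")
    return commands


dialogue = smtp_client_commands(
    "client.example.org",
    "alice@example.org",
    ["bob@example.com"],
    ["Subject: hello", "", "See you at ten."],
)
```

Note that the sender and receiver given in MAIL FROM and RCPT TO are independent of the From and To header lines inside the message, which is one reason the From header cannot be trusted blindly.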
3.5 E-mail gateways
SMTP is a protocol that runs on top of TCP/IP, an Internet standard. However, some
companies do not wish to be connected to the Internet but still wish to send and
receive e-mails. In that case the company's server is not connected directly to the
Internet. It is connected only to an e-mail gateway, which acts as a firewall because
it protects the company's server by allowing only e-mails to reach it.
Figure 5: the use of an e-mail gateway as a “firewall” (an e-mail daemon on the Internet side and, on the LAN side, a server running SMTP or X.400 with its e-mail daemon, each linked to the e-mail gateway by a TCP or TP4 connection)
There are also cases where the sender or the receiver speaks only RFC 822 and the
other end of the connection speaks only X.400. The solution in both cases is the use
of application layer e-mail gateways. The e-mail gateway is an intermediate node that,
like the hosts, runs sendmail. Its job is to store and forward e-mails.
Figure 6: the use of an e-mail gateway node to connect different networks (a server running SMTP, with its e-mail reader and e-mail daemon, reaches the gateway over a TCP connection; a server running X.400, with its e-mail reader and e-mail daemon, reaches it over a TP4 connection)
3.6 POP3 (Post Office Protocol, version 3)
Up to now only the way that e-mails are moved from one host to another has been
discussed. PCs or workstations that do not have the resources to run an e-mail daemon
need a way to access the e-mails held in a mailbox server. A simple protocol that is
used in these cases is POP3. That protocol is used in the implementation of this
project.
Figure 7: the POP3 protocol (a client running POP3 opens a TCP connection to port 110 on a server running POP3; commands and responses are exchanged after the connection has been established)
The mailbox server starts the POP3 service by listening on TCP port 110. When a client
wishes to retrieve e-mails using the protocol, it establishes a connection with the
server on that port. When the connection has been established the server sends a
greeting. Afterwards the client and the server exchange commands and responses
respectively until the end of the connection [9].
The commands in POP3 consist of a keyword followed in some cases by an argument. The
responses consist of a status indicator and a keyword, sometimes followed by
additional information. There are two status indicators: the positive (“+OK”) and the
negative (“-ERR”). A CRLF pair terminates each command and each response.
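The line format just described is simple enough to sketch directly. The two helpers below are illustrative Python written for this explanation, not part of any POP3 library; their names are assumptions. They handle a single line only, so multi-line responses such as the text of a retrieved message are outside their scope.

```python
def format_pop3_command(keyword, *arguments) -> bytes:
    """Build one command line: keyword, optional arguments, then CRLF."""
    parts = [keyword.upper(), *(str(argument) for argument in arguments)]
    return (" ".join(parts) + "\r\n").encode("ascii")


def parse_pop3_response(line: bytes):
    """Split one response line into (success, remaining text)."""
    if not line.endswith(b"\r\n"):
        raise ValueError("a response must end with a CRLF pair")
    indicator, _, rest = line[:-2].decode("ascii").partition(" ")
    if indicator == "+OK":
        return True, rest
    if indicator == "-ERR":
        return False, rest
    raise ValueError(f"unknown status indicator: {indicator!r}")
```

For example, format_pop3_command("retr", 1) yields the bytes for "RETR 1" followed by CRLF, and a reply beginning with "-ERR" is reported as a failure together with the server's explanatory text.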
[9] Post Office Protocol - Version 3 (RFC 1939) (1996), http://www.faqs.org/rfcs/rfc1939.html
During a POP3 connection there are three different stages. The first is the
AUTHORIZATION stage. At the beginning of that stage the server returns a one-line
greeting to the client. The client now has to identify itself. There are three
commands in that stage:
• USER “username”
• PASS “password”
• QUIT
If the client logs in properly then the server locks the maildrop and passes the
connection to the next stage, TRANSACTION. In that stage the client issues commands
and the server replies with a positive response if the command is valid, plus some
additional information if necessary. The commands are the following:
• STAT: the server returns a line containing information about the maildrop (the
number of messages and their total size).
• LIST msg (msg is optional): if the client gives a message number, the server
returns information about that e-mail; otherwise it returns information about every
message in the maildrop.
• RETR msg: the server returns the message with this number.
• DELE msg: the message with this number is marked as deleted.
• NOOP: the server just replies with a positive response.
• LAST: the server returns the highest message number that has been accessed. (This
command comes from earlier versions of the POP3 specification and is not part of
RFC 1939.)
• RSET: any messages that have been marked as deleted are unmarked.
The final stage is UPDATE. The connection enters that stage when the client issues the
QUIT command in the TRANSACTION stage. The server deletes all the messages marked as
deleted and unlocks the maildrop.
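The three stages can be summarised as a small state machine. The sketch below is an in-memory Python model written only to illustrate the stage transitions; it does no networking, covers a subset of the commands, and measures sizes in characters rather than octets. The class name Pop3Session, the reply texts after the status indicator and the extra CLOSED state are assumptions made for the example.

```python
class Pop3Session:
    """Toy model of the AUTHORIZATION, TRANSACTION and UPDATE stages."""

    def __init__(self, username, password, messages):
        self.username = username
        self.password = password
        self.messages = dict(enumerate(messages, start=1))  # number -> text
        self.deleted = set()          # numbers marked by DELE
        self.pending_user = None
        self.state = "AUTHORIZATION"

    def handle(self, line):
        keyword, _, argument = line.strip().partition(" ")
        keyword = keyword.upper()
        if self.state == "AUTHORIZATION":
            return self._authorization(keyword, argument)
        if self.state == "TRANSACTION":
            return self._transaction(keyword, argument)
        return "-ERR session is over"

    def _authorization(self, keyword, argument):
        if keyword == "USER":
            self.pending_user = argument
            return "+OK send PASS"
        if keyword == "PASS":
            if self.pending_user == self.username and argument == self.password:
                self.state = "TRANSACTION"   # the maildrop is now locked
                return "+OK maildrop locked"
            return "-ERR invalid login"
        if keyword == "QUIT":
            self.state = "CLOSED"            # leaving here skips UPDATE
            return "+OK bye"
        return "-ERR unknown command"

    def _transaction(self, keyword, argument):
        live = {n: m for n, m in self.messages.items() if n not in self.deleted}
        if keyword == "STAT":
            return f"+OK {len(live)} {sum(len(m) for m in live.values())}"
        if keyword in ("RETR", "DELE"):
            if not argument.isdigit() or int(argument) not in live:
                return "-ERR no such message"
            number = int(argument)
            if keyword == "RETR":
                return f"+OK message follows\r\n{live[number]}\r\n."
            self.deleted.add(number)
            return f"+OK message {number} marked as deleted"
        if keyword == "RSET":
            self.deleted.clear()
            return "+OK"
        if keyword == "NOOP":
            return "+OK"
        if keyword == "QUIT":
            self.state = "UPDATE"            # deletions only take effect now
            for number in self.deleted:
                del self.messages[number]
            self.deleted.clear()
            return "+OK bye"
        return "-ERR unknown command"
```

The point of the model is that DELE only marks a message: nothing is removed until QUIT moves the session into UPDATE, so an RSET (or a dropped connection) before that leaves the maildrop unchanged.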
4. E-mail problems
4.1 Spam e-mail
Junk or spam e-mail, as it is widely known, is a big and growing problem of the Internet. Spam
is flooding the Internet with many copies of the same message, sending it to lots of
receivers who would not otherwise choose to receive it. Spam e-mail is usually
commercial advertising or some get-rich-quick scheme10.
There are two main types of spam e-mailing. The first one is sending an
e-mail to many newsgroups. The second type is sending the spam mail
directly to many different receivers. When the first type occurs it is the
responsibility of the administrator of the newsgroup to block the spam e-mail. This
project is targeted at the second type of spam.
Besides being annoying, spam e-mail wastes the time and money of its
receiver. The waste of time is obvious. It is also a waste of money
because many people dial up to an ISP, so while they read it they
are paying some bills, for instance the charge for the line. Many companies choose this form
of advertising instead of the traditional mail system because it is cheaper.
Generally, spam e-mail is the only form of advertising that costs the
customer more than the company. The sender of spam e-mails can find the e-mail addresses
of the receivers in many ways. Two of them are searching the net for them
and stealing some lists of different newsgroups.
4.2 E-mail threats
In the early days of e-mail, when only text could be transferred via e-mails,
no system could be damaged by a malicious program
arriving through e-mail. Today, however, the use of MIME allows e-mails to carry files in
their bodies. These files may contain malicious programs. E-mails may also
contain embedded HTML code in their bodies. With the use of Java
and JavaScript it is also possible to add harmful executable commands to the
HTML code. If the e-mail reader can execute the HTML code immediately, it will
10 Scott Hazen Mueller (1999), What is spam?, http://spam.abuse.net/whatisspam.html
execute the malicious one. Consequently it is possible for a nasty program to be
executed just by reading the e-mail. E-mail threats may be Trojan horses, viruses or
worms.
• Trojan horse: it is the most elementary form of malicious code. A Trojan horse
consists of instructions hidden inside an otherwise useful program that do bad
things. The term Trojan is usually given to a program when these instructions are
added to it at the time the program is written [Kaufman,
1995]. When a Trojan horse program is executed it may destroy files or create a
“back door” entry allowing an intruder to access your system. A Trojan horse
program does not propagate itself from one computer to another. Propagation is a
characteristic of the other two threats, the virus and the worm11.
• Virus: it is a set of instructions that, when executed, inserts copies of itself
into other programs [Kaufman, 1995]. When the infected program runs, the virus
code gets a chance to inspect its environment for other programs and infect these
files. If a user sends that program to another user, or if a storage medium (for
instance a floppy disk) that contains an infected file is moved from one machine
to another, the virus may spread rapidly. Melissa is one of the most famous
viruses that spread through e-mails. This virus managed to infect over
100,000 PCs all over the world.
• Worm: it is a program that replicates itself by installing copies of itself on other
machines across the network [Kaufman, 1995]. A worm does not alter other files
as Trojan horses and viruses do, but resides in active memory and duplicates itself
through the network. Worms are invisible to the users. They are noticed only when
their uncontrolled replication consumes system resources, slowing or halting other
programs. So the harm that worms cause is the consumption of
computing resources. Some new worms, like Worm.ExploreZip, reside in the
computer memory and replicate themselves, like all worms, but they also
contain a malicious payload12.
11 Zdnet (2000), Help & How-To: Trojan Horse, Virus or Worms,
http://www.zdnet.com/zdhelp/stories/main/0,5594,2435378,00.html?chkpt=zdnnmoreon
12 Zdnet (2000), Help & How-To: Definition of a Worms,
http://www.zdnet.com/zdhelp/stories/main/0,5594,2435378-3,00.html
4.3 Address spoofing
When an e-mail is received its header ought to indicate who sent it. However, we
cannot be absolutely sure that the address shown in the header is the sender’s one. It
is possible for the headers to be forged so that they indicate that the message was sent
from an e-mail address other than the real one. This is called e-mail address spoofing.
This attack can be prevented by a technique called origin authentication.
Origin authentication checks whether or not the sender is who they claim to be.
To achieve this, public key cryptography is used. This kind of cryptography uses
two different keys: a private key which only the sender knows (each sender has their
own private key) and a public key, corresponding to this private key, that can be
found easily. The sender uses their private key to encrypt the message and the
receiver uses the public key to decrypt the e-mail. If the message
remains unintelligible after the decryption, the sender did not use the appropriate private
key and consequently is not who they claim to be [Hare, 1996]. This project does
not implement any cryptography or origin authentication techniques.
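Although the project does not implement it, the idea of origin authentication can be illustrated with a toy example. The sketch below, in Python, uses a textbook RSA key pair built from two very small primes; the numbers and the function names are chosen purely for illustration and offer no real security.

```python
# Toy key pair built from the small primes 61 and 53 (illustration only).
N = 61 * 53          # modulus, 3233
E = 17               # public exponent, known to everybody
D = 2753             # private exponent: (E * D) % ((61 - 1) * (53 - 1)) == 1


def seal(message_number: int) -> int:
    """Sender side: transform the value with the PRIVATE key."""
    return pow(message_number, D, N)


def open_sealed(sealed: int) -> int:
    """Receiver side: transform the value back with the PUBLIC key."""
    return pow(sealed, E, N)


def is_authentic(claimed: int, sealed: int) -> bool:
    """Accept the origin only if the public key recovers the claimed value."""
    return open_sealed(sealed) == claimed
```

Here the message is reduced to a single number smaller than N. A value sealed with any other private exponent will not open to the claimed value, which is exactly the "unintelligible after decryption" case described above.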
5. E-mail filtering
5.1 The fundamentals of e-mail filtering
As has become clear from the above, e-mail is a very useful facility and tool
of computer networks, although sometimes it can be very dangerous (e-mail threats)
or annoying (spam e-mail). Consequently it is necessary to use some filtering
techniques in order to isolate dangerous or annoying e-mails before the user
opens them or runs their attachments.
The following are the fundamentals of e-mail filtering, which have been compiled
from information gathered in different discussion forums and newsgroups:
1. The source of the e-mail: the source should be checked in order to make sure that
it is an acceptable one. In other words, if there is an exclusion list of sources, the
e-mail should not be accepted if its source appears in the
exclusion list. The source might be the individual (for instance
vouropoulos@hotmail.com) or the domain (hotmail.com). Both of them have to be
checked separately. The IP address of the sender also has to be checked and, if it is
an excluded one, the e-mail needs to be quarantined.
2. Subject line: if that header contains some “dangerous” words, then the e-mail
needs to be quarantined.
3. The main body of the text: it has to be checked whether or not it contains any of the
words belonging to a list of trigger words. It is vital that, in the search for
trigger words in both the body of the message and the subject line, the words
are normalized before the search, usually by changing them to mono-case.
4. Attachments: e-mails containing dangerous attachments need to be isolated.
If the attachment is a known virus, the e-mail has to be kept in quarantine.
In addition, for files with extensions such as .vbs, .shs, .jpg,
.gif, .exe, .bat, .com or others that might have active content, the e-mail reader has
to inform the user about the risk of receiving and executing these files.
5. Viruses: it would also be useful, although complicated, for the e-mail reader to
include an exit for a standard antiviral program, so that each attachment file can
be checked.
Most of these fundamentals have been implemented in this project.
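Two of these fundamentals, the mono-case normalization of trigger words and the separate check of the individual address and its domain, can be sketched as follows. The sketch is in Python rather than the project's PERL; the function names are hypothetical, and it assumes that the exclusion lists are already stored in lower case.

```python
def normalise(text: str) -> str:
    """Fold to lower case so that 'FREE', 'Free' and 'free' compare equal."""
    return text.lower()


def contains_trigger_word(text: str, trigger_words) -> bool:
    """True if any trigger word occurs in the normalised text.

    This is a plain substring search, so 'rich' also matches 'enrich'.
    """
    folded = normalise(text)
    return any(normalise(word) in folded for word in trigger_words)


def source_is_excluded(address: str, excluded_addresses, excluded_domains) -> bool:
    """Check the individual address and its domain separately."""
    folded = normalise(address)
    domain = folded.rpartition("@")[2]
    return folded in excluded_addresses or domain in excluded_domains
```

Normalizing both the text and the list entries means the lists do not have to contain every capitalization of a word.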
5.2 E-mail filtering products
Almost all e-mail readers have some filtering abilities. For this report just three
products will be examined: two e-mail readers for MS Windows and one tool for Unix. The
two MUAs for Windows that will be examined are the two commonly used in the
University of Leeds computer clusters: Eudora and Pegasus. The Unix filtering tool is
Procmail.
5.2.1 Eudora
Eudora allows the user to filter incoming mail, outgoing mail or both,
automatically or manually whenever the user desires to do so13. In order
to define a new filter, the user has to select the “Filters” option from the “Tools” menu.
Figure 8: Eudora’s “Filters” option
When the filter form is displayed on the screen, the user can define the criteria
for a new filter. The first thing the user has to decide is which header
should be checked; one of the options is the body of the message. After that
the user has to enter a phrase or just a few words that must appear in the
chosen header of the e-mail in order to activate the filter. At that stage it is
13 QUALCOM Incorporated (1999-2000), Tutorial: how to use filters, http://www/eudora.com/techsupport/tutorials/win_filters.html
possible for the user to choose a second header to be checked as well, and consequently
a combination of the two conditions (“or”, “and”, “unless” etc.) to activate the filter.
After the user has decided which situation will activate the filter, the next step is to
decide what will happen if an e-mail meets the filtering rules.
Figure 9: Eudora’s “Filters” form
The user can choose among different actions that Eudora will perform when an
e-mail activates the filtering rules. This part of the program is updated in every
new version of Eudora: new actions are added, as well as the ability to combine
more actions in each of the filtering rules. The options that Eudora 4.3 offers
for that stage can be viewed in the following figure.
Figure 10: Eudora’s “Actions” option
5.2.2 Pegasus
The filtering abilities of Pegasus are richer than those of Eudora: there are eight different
types of rules14. In order to create a filtering rule, the user has to go to the
“Tools” menu and choose successively “Mail filtering rules”,
“Edit new mail filtering rules” and then “Rules applied when folder is opened” or “Rules
applied when folder is closed”. The following figure displays this
process.
14 AIESEC Organization (2000), E-mail Management Tips: Filtering, http://www.aisec.org/help/filtering.html
Figure 11: choice of Pegasus filtering option
Two of the filtering rules, “Standard header match” and “Regular expression match”,
are similar to the filtering rules of Eudora. Two other types of rules, “Message date”
and “Message Age”, allow the filter to handle each e-mail according to its date. The
actions that the filter will perform when an e-mail meets the filtering rules are much
the same as those of Eudora.
Figure 12: the Pegasus filtering option
5.2.3 Procmail
Procmail is a mail processing utility for Unix that can also filter e-mails. Procmail
was written by Stephen van den Berg. The advantage of procmail is that it can process
messages either as they arrive or after they are already stored in the mailbox15, in
contrast to the two previous programs and this project, which process the messages
after they are stored in the maildrop. This significant difference stems from the fact
that procmail is not an MUA like the other three but a Mail Delivery Agent (MDA)16.
An MDA is the program used by servers in order to deliver e-mail messages to the
mailboxes of the system’s users. Some servers use the MTA (usually sendmail) as
MDA, but others use a pure MDA program like procmail. If procmail is not the
MDA of the system, the first step for someone who wants to use it is some
configuration, so that all the e-mails can be processed by procmail. What is
essential is to create a .forward file so that each e-mail is forwarded to procmail.
Afterwards the .procmailrc file has to be created. Each .procmailrc file is
like a small program consisting of two parts, assignments and recipes17. The former
set up some variables so that procmail knows where everything it needs is stored.
The latter are the part where the filtering is done. Part of a recipe from a .procmailrc
file in the procmail man page follows:
:0
* ^Subject:.*Flame
/dev/null
Briefly, this code moves all the e-mails whose Subject contains the
word “Flame” to /dev/null, the “bit bucket” of Unix, meaning that
procmail deletes these e-mails.
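The decision taken by this one recipe can be mimicked in a few lines. The sketch below is Python and is not part of procmail; the function deliver_to and the default mailbox name "inbox" are hypothetical. It relies on the fact that procmail ignores case in its conditions unless the D flag is given.

```python
import re

# The recipe's condition; procmail ignores case unless the D flag is given.
CONDITION = re.compile(r"^Subject:.*Flame", re.IGNORECASE | re.MULTILINE)


def deliver_to(header_block: str, default_mailbox: str = "inbox") -> str:
    """Return the destination a message would get under this single recipe."""
    if CONDITION.search(header_block):
        return "/dev/null"       # the message is discarded
    return default_mailbox       # hypothetical name for normal delivery
```

Because the pattern is anchored to the start of a line, the word “flame” in another header, such as the sender's address, does not trigger the recipe.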
15 Infinitive Ink (2000), Procmail quick start, http://www.ii.com/internet/robots/procmail/qs/
16 Jim Dennis (1997), Promail Mini tutorial: Automated Mail Handling, http://www/linuxgazette.com/issue14/procmail.html
17 Ian Soboroff (1997), Mail filtering with procmail, http://www.gl.umbc.edu/~ian/procmail
6. Design, implementation and evaluation of the project
6.1 Outline of the design
This project is a client-based e-mail reader, meaning that the user has to copy the
necessary files into some directories and then run the “mail.pl” file in order to
access his/her e-mails on the mail server. The method used to access the e-mails on
the mail server is POP3, already described in a previous chapter. The
program is written in PERL so as to be platform independent, and it can run in both
Unix and Windows environments (although some small changes are necessary before it
can run in Windows (DOS)). The PERL module Mail::POP3Client,
written by Sean Dowd, was used for the POP3 communication with the server. The
design and implementation of the project was divided into two parts. The former is
the creation of an e-mail reader, the latter the formation of some filters for these
e-mails. Each part comprises three stages: design, implementation and evaluation.
6.2 The creation of the e-mail reader
6.2.1 Design
The purpose of this project is the creation of an application firewall to filter out
e-mails. So the first part should be the creation of an e-mail reader into which these
filters will later be incorporated. Everyone who has used e-mail knows the basic
principles of an e-mail reader. An e-mail reader has to perform some basic operations,
including the composition, display and disposition of e-mails. The e-mail reader of
this project performs two of these operations, display and disposition. It displays the
e-mails and allows the users to delete the e-mails that they no longer want stored
in their mailbox. The display operation has to perform two tasks. The first
one is the display of the e-mail list, which informs the user about the e-mails
existing in his/her mailbox. That list has to display information
about each e-mail such as its subject and its sender. The second task is to display on the
screen some of the headers of an e-mail, which can be the same ones that are displayed for the
first task, together with the body of the e-mail. The disposition operation of the e-mail
reader allows the user to delete messages, or to undelete messages already marked to be
deleted.
6.2.2 Implementation
The PERL module Mail::POP3Client by Sean Dowd was used for the implementation
of the e-mail reader (for more information about this module see the Appendix No
??). During my research into how an e-mail reader for filtering e-mails could be
created, I participated in different discussion groups and newsgroups while at the same
time searching the Internet for additional data. The majority of the other participants
suggested writing appropriate scripts in PERL and using the
Mail::POP3Client module in order to retrieve the e-mails from the server. However,
one tutorial from the Internet proposed the Net::POP3 module for retrieving
e-mails18. The Net::POP3 module was already installed on the Unix machines, so only
Mail::POP3Client had to be installed in my Unix account. A choice between those two
modules had to be made. After a short examination of the two,
Mail::POP3Client seemed to be the better one, so it was used in the project. The e-mail
reader has four main options.
Figure 13: the four different options of the e-mail reader
It allows the user to see a list of their e-mails, to read a specific e-mail by giving its serial
number in the list, to delete a specific e-mail, or to undelete an e-mail already marked as
deleted.
18 About.com (2000), How to retrieve e-mails, http://perl.about.com/compute/perl/library/weekly/aa022700a.html
The delete and undelete operations are performed within the main body of the
program. A hash (associative array) called “deleted” is used in order to delete or
undelete an e-mail. When a user wants to delete an e-mail, an entry for that
e-mail is added to the hash. If the user wishes to undelete an e-mail already marked as
deleted, the entry for that e-mail is removed from the hash. The e-mails
marked as deleted will be deleted from the mailbox after the user exits the program.
Two different subroutines are used for the display of the e-mail list and the retrieval
of a specific e-mail. The list is displayed by a separate subroutine called
“header”. Briefly, for each e-mail this subroutine displays its serial
number in the list together with its From and Subject lines.
If the e-mail has been marked as deleted, a label (the capital
letter D) is displayed next to the serial number of that e-mail so that the user is
aware that the e-mail is going to be deleted from the mailbox.
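The bookkeeping described above needs very little code. The following sketch uses a Python dictionary in place of the PERL hash; apart from the name “deleted”, the function names are hypothetical.

```python
deleted = {}   # message number -> True, for messages the user has marked


def mark_deleted(number: int) -> None:
    """Add an entry for the message the user wants to delete."""
    deleted[number] = True


def undelete(number: int) -> None:
    """Remove the entry again; no error if the message was never marked."""
    deleted.pop(number, None)


def messages_to_remove_on_exit():
    """Numbers that would be deleted from the mailbox when the user quits."""
    return sorted(deleted)
```

Nothing is sent to the server while the user marks and unmarks messages; only the entries still present in the hash at exit lead to real deletions.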
Figure 14: displaying the e-mail list
The second subroutine, which displays a specific e-mail when the user gives its serial
number, is called “retrieve”. That subroutine displays the From and Subject headers of
that specific e-mail and its body, too. If the specific e-mail is a multipart
one, its body contains a plaintext part, if any, and the binary file(s), which
are stored on the mail server as long unintelligible text after being encoded, usually
with base64 encoding. A “boundary” string that can be found in the Content-Type header,
which is added to the e-mail by MIME, separates the parts of the e-mail. So,
whenever the subroutine finds that string in the body of the e-mail, it examines
whether the following part is plain text, in which case it displays it; otherwise the type
of that part and the name of the file are displayed.
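The same separation of a multipart e-mail into a displayable text part and named attachments can be sketched with Python's standard email package. This is a swapped-in technique, not the project's approach: the “retrieve” subroutine searches for the boundary string itself, whereas here the library does the splitting. The function name describe_parts and the sample message are made up for illustration.

```python
from email import message_from_string


def describe_parts(raw_message: str):
    """Return the plain text parts and a (type, filename) pair for the others."""
    message = message_from_string(raw_message)
    texts, attachments = [], []
    for part in message.walk():
        if part.is_multipart():
            continue                      # a container, not a leaf part
        if part.get_content_type() == "text/plain" and part.get_filename() is None:
            texts.append(part.get_payload())
        else:
            attachments.append((part.get_content_type(), part.get_filename()))
    return texts, attachments


SAMPLE = (
    "From: craig@someone.com\n"
    "Subject: report\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "See the attached file.\n"
    "--XYZ\n"
    'Content-Type: application/octet-stream; name="report.doc"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "AAECAw==\n"
    "--XYZ--\n"
)
```

Calling describe_parts(SAMPLE) yields the text of the first part and the type and file name of the second, which is the information the e-mail reader shows instead of the encoded data.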
Figure 15: displaying a specific e-mail
6.2.3 Evaluation of the e-mail reader
As mentioned before, the Mail::POP3Client module was used for the
implementation of that part of the project. The following problem came up during the
implementation: although the script could find the
POP3Client module, and the code used to establish the connection with the mail server
was the right one (it was the one given in the module), the connection
could not be established. Different discussion groups and newsgroups were consulted but
initially none was of any particular help. Finally the solution came from the
discussion group of the author of the module, Sean Dowd19. The three obligatory
fields that, according to the manpage of the module, should be used in the constructor are
“USER”, “PASSWORD” and “HOST”. All the other fields have
default values, which they take when they are not mentioned in the constructor.
One of these fields is “AUTH_MODE”. “The valid values for
AUTH_MODE are “PASS” and “APOP”. APOP implies that an MD5 checksum will
be used instead of passing your password in cleartext. However, if the server does not
support APOP, the cleartext method will be used”20. The default value of that field is
19 Deja (Discussions>>comp.lang.perl.modules) (2000), Re: POP3Client, http://x70.deja.com/getdoc.xp?AN=644432010&CONTEXT=966869437.408289281&hitnum=1
20 As mentioned in the Mail::POP3Client manpage.
“PASS”. But there is a problem with some servers and in these cases the constructor
has to include the following line:
AUTH_MODE=> PASS
After that change the connection with the server was established. For evaluation
purposes, connections to different mail servers were attempted and the e-mail
reader seemed to work properly. There is, however, a problem in that part of the project when
the program is used under a WINDOWS (DOS) platform. When the
program prompts the user for his/her password, the password must not be visible
on the screen, for security reasons. There is a PERL module called Term::ReadKey
that can be used for that purpose. Nevertheless, it could not be installed properly, so
the “stty -echo” command of Unix was used instead. That command does
not work in DOS, and a similar DOS command could not be found. So when the
program runs under a WINDOWS platform, that command has to be removed and the
password will be visible.
6.3 Establishing the filters
6.3.1 Design
The majority of the e-mail filtering principles discussed in a previous chapter
are included in the e-mail filtering rules of this project. This project contains six
different types of filters, and it decides whether or not to quarantine a
specific e-mail after taking into account the following factors (the sequence of these
factors is the one that is followed in the program):
• The individual e-mail address of the sender.
• The domain from which the e-mail comes.
• The IP addresses that the headers of the e-mail contain.
• The attachments of the e-mail.
• The e-mail Subject line.
• The number of the individual receivers of this e-mail.
[Flowchart: the e-mail passes through six checks (the individual e-mail address, the domain, the IP addresses, the attachments, the Subject line and the number of the receivers). An “OK” result leads to the next check and, after the last one, to “wanted e-mail”; a “Not OK” result at any check leads to “quarantine the e-mail”.]
Figure 16: the sequence of the six filters
The first filter that the program performs concerns the individual e-mail address
of the sender. The e-mail addresses existing in all the e-mail headers (excluding the
“To:” and “Cc:” ones) are checked to see whether or not they belong to an exclusion
list of senders; if so, the e-mail is quarantined. The next filter is similar to the previous
one except that, this time, the domains in the e-mail headers (again the “To:” and “Cc:”
ones are excluded) are checked to see whether or not they belong to a list of unwelcome
domains.
The third filter of the program concerns the IP addresses contained in the headers of
an e-mail. Usually the IP addresses of the different MTAs involved in the
transportation of the e-mail along the way are added in the “Received:” header of the
e-mail. So, there is a list with some unwanted IP addresses and when an e-mail
contains one of these, it is quarantined.
The fourth filter decides whether or not an e-mail is dangerous according to its
attachments, if any. If it contains files whose filenames are included in a list
of dangerous files (worms, viruses and Trojan horses), the e-mail is kept
in quarantine. The fifth filter checks whether or not the “Subject:” line of the e-mail
contains suspicious words. There is also a list of these words, and if
the “Subject:” line contains one of them, the e-mail is quarantined. The last filter
monitors the number of individual receivers; if this number exceeds a
given limit, the e-mail is isolated. The last three filters try to isolate
e-mails containing dangerous files. It is known that when such dangerous files,
especially viruses, infect a computer, they are transmitted automatically from
the infected machine to other users whose e-mail addresses are stored in the mailbox of the
infected computer. These automatically sent e-mails have a specific subject
(depending on the specific virus), they are sent to many persons and they contain
the hazardous files in their bodies. So these last filters deal with such suspicious
e-mails.
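The sequence of the six filters can be pictured as a chain in which the first failing check decides the fate of the e-mail. The sketch below is in Python, not the project's PERL; the message layout (a dictionary) and all list contents are hypothetical stand-ins for the project's text files.

```python
def first_failed_check(message, checks):
    """Apply the checks in order and stop at the first one that fails."""
    for name, is_bad in checks:
        if is_bad(message):
            return name      # "Not OK": quarantine, skip the remaining checks
    return None              # all checks were "OK": a wanted e-mail


# Hypothetical stand-ins for the exclusion lists and the receiver limit.
CHECKS = [
    ("sender",     lambda m: m["sender"] in {"spammer@bad.example"}),
    ("domain",     lambda m: m["sender"].rpartition("@")[2] in {"bad.example"}),
    ("ip",         lambda m: bool(set(m["ips"]) & {"192.0.2.66"})),
    ("attachment", lambda m: bool(set(m["attachments"]) & {"zipped_files.exe"})),
    ("subject",    lambda m: m["subject"].lower() in {"iloveyou"}),
    ("receivers",  lambda m: len(m["receivers"]) > 20),
]
```

Since the checks stop at the first failure, an e-mail that would fail both the IP check and the Subject check is reported as failing the IP check, which comes earlier in the sequence.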
6.3.2 Implementation
The implementation of the filters is divided into two parts: the filter itself and the
manipulation of the files containing the lists of unwanted IP and e-mail
addresses, domains, suspicious words and dangerous filenames.
6.3.3 The implementation of the filter
The six filters performed by the program are included in a subroutine called
“find_emails”. This subroutine checks each e-mail individually in order to decide
whether or not it is a wanted one. The unwanted e-mails are not deleted from the
mailbox immediately. Instead these e-mails are quarantined, so the users cannot
see or access them. When the user exits the program, these e-mails are deleted from
the mailbox. The quarantine is achieved through a small trick. There is an
array called “deleted_f” whose length is the number of the e-mails in the mailbox. It
has to be made clear that this is a different array from the associative one (hash) called
“deleted”, which contains the e-mails the user wishes to delete. Each element
of this array refers to an e-mail in the mailbox. It has to be mentioned here that the
elements of the array are counted from 0, whereas the e-mails in the mailbox are
counted from 1. So, element zero of the array refers to the first e-mail in the mailbox
and so on. This array is initialized at the beginning of the program, with all of its elements
set equal to 0. After this initialization the e-mail reader checks all the e-mails to
find any unwanted ones.
Figure 17: the quarantine process
[Flowchart: for each e-mail contained in the mailbox, is e-mail N an unaccepted one (N stands for the number of the e-mail in the mailbox)? If yes, quarantine the e-mail: deleted_f[N-1]=1. In either case, check the next e-mail.]
When an e-mail is considered unwanted, the filter sets the element of the
“deleted_f” array corresponding to this e-mail equal to 1. So, at the end of that
process all the unwanted e-mails have their corresponding element in the array set
equal to 1. The e-mail reader uses this information for displaying the list with the
wanted e-mails and for accessing a specific –wanted- e-mail (retrieve or delete it).
Whenever the user wishes to see the list with the e-mails, the e-mail reader checks the
element corresponding to each e-mail in that array. If it is 0, then it displays the
necessary information for this e-mail or else it skips it.
Figure 18: list displaying process
That list, as mentioned earlier, shows before the information for each e-mail
a number that looks like its serial number in the mailbox, but is not. This
number is just a counter of the wanted e-mails. So whenever a user wishes to
read, delete or undelete an e-mail, he/she gives the e-mail reader the number
of this counter. Consequently the program has to find the true number of
the e-mail in the mailbox. Another array called “real_mail” is used for this purpose. It
stores the true serial number of each wanted e-mail, corresponding to the number
of the counter the user sees in the list.
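The interplay of the two arrays can be sketched as follows, in Python rather than PERL. The array names come from the text; the function names, and the assumption that “real_mail” is indexed by the displayed counter minus one, are illustrative guesses.

```python
def build_listing(message_count, is_unwanted):
    """Return (deleted_f, real_mail) for a mailbox numbered 1..message_count."""
    deleted_f = [0] * message_count          # index 0 refers to message 1
    for number in range(1, message_count + 1):
        if is_unwanted(number):
            deleted_f[number - 1] = 1        # quarantined
    # real_mail[k - 1] is the true mailbox number of the k-th listed e-mail
    real_mail = [n for n in range(1, message_count + 1) if deleted_f[n - 1] == 0]
    return deleted_f, real_mail


def real_number(real_mail, shown_number):
    """Translate the counter shown in the list into the mailbox number."""
    return real_mail[shown_number - 1]
```

With five e-mails of which the second and fourth are quarantined, the user sees a list numbered 1 to 3, and the second entry of that list is really e-mail number 3 in the mailbox.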
Let us now examine in more detail how each filter works. The filters
will be examined in the same sequence as before. The first filter checks the
sender of the e-mail. It has a text file containing the e-mail addresses to be rejected.
The program reads them from this file and stores them in a hash. The e-mail reader
searches the headers for e-mail addresses. An e-mail address consists of a sequence of
word characters, digits, “_” and “-”, followed by “@”, followed by at least two
strings joined with a dot (the PERL regular expression that “grabs” the e-mail address
from the header is:
“/([-\w]{1,}\@{1}[-\w]{1,}\.{1}[-\w.]{1,})/g”).
[Figure 18 flowchart: for each e-mail with number N contained in the mailbox, is e-mail N an unaccepted one (deleted_f[N-1]=1)? If yes, skip it; if no, print the necessary information for this e-mail. Then go to the next e-mail.]
When an e-mail address is found and it is not contained in the “To:” and “Cc:” fields,
it is checked to see whether or not it exists in the hash containing the rejected
ones. The reason for excluding these two fields from the
search for rejected e-mail addresses is that they do not show the sender but the receivers.
It is possible that an accepted sender sends the same e-mail to us and to
a receiver from whom we do not want to receive any e-mail, so that an unwanted
address appears in the “To:” or “Cc:” headers. For example, let us suppose that
Craig is an accepted e-mail sender whose e-mail address is craig@someone.com.
Craig is sending the same e-mail to myself (me@someone.com) and also to Claire
(claire@someone.com), whose e-mail address is an unaccepted one. Consequently,
those two e-mail addresses are included in the “To:” and “Cc:” headers; either of
them might be contained in either header. If we checked for
unaccepted e-mail addresses in those two fields, the unaccepted e-mail address of
Claire would be found, so the e-mail would be quarantined and we would not be able to
access it, although we would like to retrieve it because Craig is an accepted sender.
That is the reason we do not search in those two fields for rejected e-mail addresses.
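The sender filter and the Craig and Claire example can be sketched as follows. The sketch is in Python, with the address pattern translated from the PERL expression given in the text; the function name sender_is_rejected is hypothetical, and folded (multi-line) headers are ignored for simplicity.

```python
import re

# Python rendering of the address pattern described in the text.
ADDRESS = re.compile(r"([-\w]+@[-\w]+\.[-\w.]+)")


def sender_is_rejected(header_lines, rejected):
    """Check every header except To: and Cc: for a rejected address."""
    for line in header_lines:
        if line.lower().startswith(("to:", "cc:")):
            continue                     # these name receivers, not senders
        for address in ADDRESS.findall(line):
            if address.lower() in rejected:
                return True
    return False


HEADERS = [
    "From: Craig <craig@someone.com>",
    "To: me@someone.com",
    "Cc: claire@someone.com",
    "Subject: hello",
]
```

With Claire's address in the rejected list the function returns False for HEADERS, so Craig's e-mail is still delivered; it returns True only when the rejected address appears outside the “To:” and “Cc:” headers.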
The filter for rejected domains is similar to the previous one. A text file contains
the rejected domains and a name for each one of them, and the e-mail reader stores
this information in a hash. A domain is the part of an e-mail address that follows the
“@”. So, whenever a domain is found in any e-mail header except “To:” and “Cc:”, it is
checked to see whether or not it is a rejected one. The third filter deals with
unwanted IP addresses. These are sorted and stored in a text file. The e-mail reader
reads the IP addresses from this file and stores them in an array. When an IP address
is found in any header, the program checks whether it belongs to the unwanted ones. To
identify an IP address in the headers, the program searches for a string of four
numbers, of one to three digits each, joined with dots (the PERL regular expression
that “grabs” the IP address from the header is:
“/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g”).
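As a rough sketch of these two filters, the fragment below derives the domain from an address and collects IP address candidates with the regular expression quoted above. The sub names and the list entries are assumptions made for illustration. Note that the pattern only checks the shape of the string, so a value such as 999.1.1.1 would also match.

```perl
use strict;
use warnings;

# Hypothetical entries; the project loads them from its text files.
my %rejected_domains = ( 'spam.example' => 'Known spammer' );
my @unwanted_ips     = ( '10.1.2.3', '192.0.2.44' );

# The domain is whatever follows the "@" in an address.
sub domain_of {
    my ($address) = @_;
    return $address =~ /\@([\w.-]+)/ ? lc($1) : undef;
}

sub is_rejected_domain {
    my ($address) = @_;
    my $domain = domain_of($address);
    return defined $domain && exists $rejected_domains{$domain};
}

# Collect every dotted-quad candidate in a header line, using the
# regular expression quoted in the text.
sub ips_in {
    my ($line) = @_;
    my @found = $line =~ /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g;
    return @found;
}

# Number of entries in the unwanted list equal to $ip (0 means allowed).
sub is_unwanted_ip {
    my ($ip) = @_;
    return scalar grep { $_ eq $ip } @unwanted_ips;
}

for my $ip (ips_in('Received: from relay.spam.example ([192.0.2.44])')) {
    print "$ip is ", (is_unwanted_ip($ip) ? 'unwanted' : 'allowed'), "\n";
}
```

A hash would make the IP lookup faster than scanning an array, but the array is kept here because that is the structure the text describes.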
The next three filters check for dangerous e-mail threats. Two text files store
dangerous filenames (files that are known threats) and suspicious phrases, for
instance ILOVEYOU (this text appears in the Subject line of the e-mails carrying the
virus of the same name). The entries of these files are also stored in two hashes. If
the e-mail is a multipart one, the filename of an attachment can be found in the body
of the message, following the “name=” string in a line starting with “Content-Type:”.
If the filename of the attachment is a known e-mail threat, the e-mail is quarantined.
In the case of the Subject line filter, the string following “Subject:” is compared
with the suspicious words and, if it is one of them, the e-mail is isolated. The last
filter concerns the number of individual receivers of the e-mail. The e-mail addresses
in the “To:” and “Cc:” fields are counted and, if their number exceeds a limit, the
e-mail is quarantined.
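A minimal sketch of these three checks follows. Everything named here is an assumption for illustration (the sub names, the example list entries and the limit of 20 receivers), and the input is assumed to be lists of header and body lines with the line endings already removed.

```perl
use strict;
use warnings;

# Hypothetical list contents and limit; the project reads the lists from text files.
my %dangerous_files    = ( 'love-letter-for-you.txt.vbs' => 1 );
my %suspicious_phrases = ( 'ILOVEYOU' => 1 );
my $max_receivers      = 20;

# Attachment names announced as name=... on Content-Type: lines of the body.
sub attachment_names {
    my (@body) = @_;
    my @names;
    for my $line (@body) {
        next unless $line =~ /^Content-Type:/i;
        push @names, lc($1) if $line =~ /name="?([^";]+)"?/i;
    }
    return @names;
}

# Text that follows "Subject:" in the headers ('' if there is no such header).
sub subject_of {
    my (@headers) = @_;
    for my $line (@headers) {
        return $1 if $line =~ /^Subject:\s*(.*?)\s*$/i;
    }
    return '';
}

# Number of distinct addresses in the To: and Cc: headers.
sub receiver_count {
    my (@headers) = @_;
    my %seen;
    for my $line (grep { /^(?:To|Cc):/i } @headers) {
        $seen{ lc($1) }++ while $line =~ /([\w.-]+\@[\w.-]+)/g;
    }
    return scalar keys %seen;
}

# Return a reason for quarantining the message, or undef to let it through.
sub threat_reason {
    my ($headers, $body) = @_;
    for my $name (attachment_names(@$body)) {
        return "dangerous attachment: $name" if $dangerous_files{$name};
    }
    my $subject = subject_of(@$headers);
    return "suspicious subject: $subject" if $suspicious_phrases{$subject};
    return 'too many receivers' if receiver_count(@$headers) > $max_receivers;
    return;
}
```

The subject is compared as a whole string, as the text describes; matching the phrase anywhere inside the subject would be a different (and stricter) design.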
Figure 19: filtering of some e-mails
6.3.4 Manipulation of the files containing rejected e-mail and IP addresses,
domains, suspicious words and dangerous attachments
As mentioned earlier, there are five text files containing the lists of rejected
e-mail addresses, IP addresses, domains, suspicious words and dangerous attachments.
The user is able to modify these files by following the prompts of the interface.
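The operations behind such prompts could look like the sketch below. It assumes a file format of one entry per line, which the text does not specify (the domain file, for example, also holds a name for each entry), and the sub names are hypothetical.

```perl
use strict;
use warnings;

# Read a list file into a hash (one entry per line is an assumed format).
sub load_list {
    my ($path) = @_;
    my %entries;
    open my $in, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$in>) {
        chomp $line;
        $entries{$line} = 1 if length $line;
    }
    close $in;
    return %entries;
}

# Write the entries back, sorted so that the file stays easy to read.
sub save_list {
    my ($path, %entries) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} "$_\n" for sort keys %entries;
    close $out or die "Cannot close $path: $!";
    return;
}

sub add_entry {
    my ($path, $entry) = @_;
    my %entries = load_list($path);
    $entries{$entry} = 1;
    save_list($path, %entries);
    return;
}

sub delete_entry {
    my ($path, $entry) = @_;
    my %entries = load_list($path);
    delete $entries{$entry};
    save_list($path, %entries);
    return;
}
```

A call such as add_entry('rejected_addresses.txt', 'claire@someone.com') would then add one address; the file name here is only an example.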
Figure 20: the manipulation of the files
The modification of the files takes place before the user logs in to the mail server.
It has to be done this way because, once the user is logged in, a timeout defined by
the POP3 protocol applies. If the user does not send any command to the server for a
specific period while the connection is established, the server aborts the
connection. So, if the files were manipulated after login, the user would probably
send no command to the server while modifying them, and the server would abort the
connection.
6.3.5 Evaluation of the e-mail filter
After the implementation of the filter came its testing and evaluation. Several
people were asked to send me e-mails in order to test how well the filter works.
During that evaluation stage two problems with the filtering process occurred, both
of them related to the e-mail address filter. The majority of MUAs and MTAs use the
following format for e-mail addresses: “<someone@a.domain>”. However, during that
stage it became clear to me that this format is not obligatory. Some servers do not
add the “<” and “>” characters before and after the e-mail address; for instance, the
“uom.gr” server does not add them when an e-mail address belongs to a group and not
to a specific person. So, these two characters were removed from the search for
e-mail addresses. The second problem was that the filter could not identify an e-mail
address if there was a “-” character in it. I was not aware that an e-mail address
could contain that character, so it was not among the characters the filter allowed.
When the filter came across this problem, the “-” character was added to the
allowable ones. After these two corrections the filter worked properly and isolated
all the unwanted messages.
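The two corrections can be illustrated with a pattern like the one below. It is a sketch, not the expression used in the project: the set of allowed characters (word characters, dot and hyphen) and the sub name addresses_in are assumptions.

```perl
use strict;
use warnings;

# Assumed set of allowed characters: word characters, dot and hyphen.
# The angle brackets are deliberately not part of the pattern, so both
# "<someone@a.domain>" and a bare "someone@a.domain" are found.
my $address_re = qr/([\w.-]+\@[\w-]+(?:\.[\w-]+)+)/;

# Return every address found in a header line, in lower case.
sub addresses_in {
    my ($line) = @_;
    my @found = $line =~ /$address_re/g;
    return map { lc $_ } @found;
}

print join(', ', addresses_in('From: Jean-Paul <jean-paul@some-one.com>')), "\n";
print join(', ', addresses_in('From: group-list@a.domain')), "\n";
```

Both header forms yield the address, with or without the “<” and “>” characters and with a “-” on either side of the “@”.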
7. Future Improvements
The general purpose of this project was the design of a prototype Application Firewall
able to filter out dangerous e-mails and the implementation of a program able to
perform this task. This purpose has been fulfilled. However, the program cannot be
considered a fully working e-mail reader, and several improvements could make it
better. Some of these improvements are the following:
• As mentioned before, an e-mail reader has to perform three basic operations:
composition, display and disposition of e-mails. However, the e-mail reader
implemented for this project performs only two of these tasks: it can display and
dispose of e-mail messages. Consequently, in order to make it a complete e-mail
reader, the ability to send e-mail has to be added.
• A better user interface has to be designed. When the user wants to see the list of
e-mails or to read a certain e-mail, a fixed number of e-mails, or a fixed number of
lines of the body of that e-mail, is displayed each time, and the user has to press
the return key in order to see the rest. These numbers are predefined, constant and
independent of the screen size. So, when the screen is relatively small, some
information will not appear on it. There are two solutions to this problem. The
first is a scroll bar at the edge of the window. The second is to make the number of
e-mails and the number of body lines displayed at a time variable and based on the
size of the screen, which can be obtained with readily available PERL modules. These
modules could therefore be used to implement a better interface and solve the
problem.
• Some text files contain the lists of the various unwanted elements, such as e-mail
addresses, domains etc. Anyone who wants to use this program also has to copy these
files to his/her account. When the filter is installed in a new account these files
have to be empty, so that no filters exist, but they still have to exist, because
the program terminates when it cannot find any one of them. An improved version of
the program could check for the existence of these files and, in case they do not
exist, ask the user whether or not he/she wants to create them instead of
terminating.
• The attachments of each e-mail are encoded before being sent, most often using
base64 encoding, which converts them into an unintelligible sequence of ASCII
characters. So they are stored in the mail server as text. The e-mail reader that
retrieves the e-mails containing these attachments has to decode them in order to
restore their binary form. This operation is not performed by the current program
and could be included in future improvements.
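Two of the improvements listed above, creating missing list files and decoding base64 attachments, can be sketched as follows. The sub names and the confirmation callback are hypothetical; MIME::Base64 is a standard PERL module, and its decode_base64 function does the decoding.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Create any missing list file as an empty file instead of terminating.
# $confirm is a code reference that is called with the path and returns
# true when the file should be created (for example after asking the user).
# Returns the names of the files that were created.
sub ensure_list_files {
    my ($confirm, @paths) = @_;
    my @created;
    for my $path (@paths) {
        next if -e $path;
        next unless $confirm->($path);
        open my $out, '>', $path or die "Cannot create $path: $!";
        close $out;
        push @created, $path;
    }
    return @created;
}

# Turn the base64 text of an attachment back into its binary form and
# write it to a file.
sub save_attachment {
    my ($encoded_text, $path) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    binmode $out;
    print {$out} decode_base64($encoded_text);
    close $out or die "Cannot close $path: $!";
    return;
}
```

Passing the confirmation as a callback keeps the file handling separate from the interface, so the same sub works whether the answer comes from a prompt or from a fixed policy.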
8. Conclusions
The aim of this project was to become acquainted with various aspects of e-mail and,
more specifically, with the fundamental principles of e-mail filtering. The
objectives of this project were:
1. Specification and design of a prototype Application Firewall to filter out
dangerous e-mails.
2. Identification of commercial components that will do some of it.
3. Development and demonstration of at least part of the design.
4. Development of an admin tool kit that will allow this to be customised.
5. Test of it to see how successful it is in rejecting bad but passing good e-mails.
This project can be divided into two parts. The first part was the necessary
literature survey, including academic books, discussion forums and newsgroups, in
order to become familiar with e-mail technology, e-mail filtering techniques and
other security issues. The second part was the programming: the implementation of
the e-mail reader, which is also responsible for the e-mail filtering. PERL was the
programming language used to write the code. An existing PERL module,
Mail::POP3Client by Sean Dowd, was used to retrieve the e-mails from the server.
The basic steps of the filtering process are the following:
• Check the e-mail address of the sender.
• Check the domain of the server.
• Check the different IP addresses in the e-mail header.
• Check the attachments, if any.
• Check the “Subject:” field for suspicious words.
• Check the number of the individual receivers.
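One way to picture the steps above as a sequence is an ordered list of checks, where the first check that objects decides the outcome. The sketch below is only an illustration: the sub name first_objection, the message layout and the three stand-in checks are hypothetical and much simpler than the filters of the project.

```perl
use strict;
use warnings;

# Run the checks in order. Each check is a pair of a name and a code
# reference that returns a reason when the message should be quarantined
# and undef otherwise. The first reason found is returned.
sub first_objection {
    my ($message, @checks) = @_;
    for my $check (@checks) {
        my ($name, $code) = @$check;
        my $reason = $code->($message);
        return "$name: $reason" if defined $reason;
    }
    return;
}

# Hypothetical stand-in checks.
my @checks = (
    [ 'sender'    => sub { $_[0]{from} eq 'claire@someone.com' ? 'rejected address' : undef } ],
    [ 'subject'   => sub { $_[0]{subject} eq 'ILOVEYOU' ? 'suspicious phrase' : undef } ],
    [ 'receivers' => sub { @{ $_[0]{to} } > 20 ? 'too many receivers' : undef } ],
);

my %message = (
    from    => 'craig@someone.com',
    subject => 'ILOVEYOU',
    to      => ['me@someone.com'],
);
my $verdict = first_objection(\%message, @checks);
print defined $verdict ? "quarantined ($verdict)\n" : "delivered\n";
```

Keeping the checks in a list makes their order explicit and lets a new filter be added without touching the loop.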
Different text files are used to store the lists of rejected e-mail and IP addresses,
domains, suspicious words and filenames of dangerous files. The user is able to modify
the files, adding new entries or deleting existing ones, and so can create a personal
filter that matches his/her own preferences.
9. References
1. Ranum, M J and Curtin, M (1998), Internet Firewalls Frequently Asked
Questions, http://www.hideaway.net/texts/fwfaq.html
2. Khalid Al-Tawil and Ibrahim A. Al-Kaltham (1999), Evaluation and Testing of
Internet Firewalls, International Journal of Network Management, Int. J. Network
Mgmt. 9, pp. 135-149
3. Connected: An Internet Encyclopedia (April, 1997), IP Address,
http://noc.ucsc.edu/cie/Topics/23.htm
4. Chris Lewis (2000), IP 101: All About IP Addresses,
http://www.networkcomputing.com/netdesign/ip101.html
5. MIT Kerberos team (2000), The Three Myths of Firewalls,
http://web.mit.edu/kerberos/www/firewalls.html
6. Multipurpose Internet Mail Extensions (RFC 2045) (1996),
http://andrew2.andrew.cmu.edu/rfc/rfc2045.html
7. Post Office Protocol - Version 3 (RFC 1939) (1996),
http://www.faqs.org/rfcs/rfc1939.html
8. Scott Hazen Mueller (1999), What is spam?,
http://spam.abuse.net/whatisspam.html
9. Zdnet (2000), Help & How-To: Trojan Horse, Virus or Worms,
http://www.zdnet.com/zdhelp/stories/main/0,5594,2435378,00.html?chkpt=zdnnmoreon
10. Zdnet (2000), Help & How-To: Definition of a Worms,
http://www.zdnet.com/zdhelp/stories/main/0,5594,2435378-3,00.html
11. QUALCOMM Incorporated (1999-2000), Tutorial: how to use filters,
http://www/eudora.com/techsupport/tutorials/win_filters.html
12. AIESEC Organization (2000), E-mail Management Tips: Filtering,
http://www.aisec.org/help/filtering.html
13. Infinitive Ink (2000), Procmail quick start,
http://www.ii.com/internet/robots/procmail/qs/
14. Jim Dennis (1997), Procmail Mini tutorial: Automated Mail Handling,
http://www/linuxgazette.com/issue14/procmail.html
15. Ian Soboroff (1997), Mail filtering with procmail,
http://www.gl.umbc.edu/~ian/procmail
16. About.com (2000), How to retrieve e-mails,
http://perl.about.com/compute/perl/library/weekly/aa022700a.html
17. Deja (Discussions>>comp.lang.perl.modules) (2000), Re: POP3Client,
http://x70.deja.com/getdoc.xp?AN=644432010&CONTEXT=966869437.408289281&hitnum=1
10. Bibliography
1. Charlie Kaufman, Radia Perlman, Mike Speciner (1995), Network security:
private communication in a public world, Prentice Hall
2. William Stallings (1995), Network and internetwork security : principles and
practice, Prentice Hall
3. Andrew S. Tanenbaum (1996), Computer networks (3rd edition), Prentice Hall
4. Chris Hare, Karanjit Siyan (1996), Internet firewalls and network security (2nd
edition), New Riders
5. Jalal Feghhi, Jalil Feghhi, Peter Williams (1999), Digital certificates : applied
Internet security, Addison-Wesley
6. Larry L. Peterson & Bruce S. Davie (2000), Computer networks : a systems
approach, Morgan Kaufmann
7. Marcus Gonçalves (1998), Firewalls complete, McGraw-Hill
8. Larry Wall, Tom Christiansen, and Randal L. Schwartz with Stephen Potter
(1996), Programming Perl (2nd edition), O'Reilly & Associates
APPENDIX A: Experiences gained through the project
Having finished the project undertaken for the MSc programme, I would like to record
some thoughts on the effort made and the degree to which the objectives of the
project have been fulfilled.
The general impression is that I ended up satisfied with the final outcome, even
though a more general field of research had initially been chosen: “Firewalls,
encryption and other aspects of security”. The reason for not pursuing this subject
was that, after discussing it with my new supervisor and after the short study I made
in order to prepare the interim report, we both perceived it as too broad to be
covered thoroughly. The comments made by the assessor agreed with this observation.
Therefore, we redefined the project and finally decided to engage in something more
achievable, namely e-mail firewalls.
Throughout the project I came across many problems, such as the difficulty of
establishing the POP3 connection with the mail server. Because it was the first time
I was engaged in a project totally on my own, such problems seemed quite significant;
still, they were overcome quickly with the help of various discussion forums and
newsgroups. Their help was essential not only when a problem arose but also where
bibliography was concerned.
Perhaps the most important aspect of this project was that it was the first time I
became so seriously and deeply engaged in matters of networks and Internet security.
Hopefully, the knowledge and the experience acquired will be beneficial for my
further engagement in these matters in the future, this time in a professional
capacity.
APPENDIX B: Project Objectives and Deliverables
School of Computer Studies MSC PROJECT OBJECTIVES AND DELIVERABLES
This form must be completed by the student, with the agreement of the supervisor of each project, and submitted to the MSc project co-ordinator (Mrs A. Roberts) by 7th April 2000. A copy should be given to the supervisor and a copy retained by the student. Amendments to the agreed objectives and deliverables may be made by agreement between the student and the supervisor during the project. Any such revision should be noted on this form.* At the end of the project, a copy of this form must be included in the Project Report as an Appendix.
NOTE: this form includes amendments to the Project Interim Report, added on 6 June 2000 as agreed with the supervisor, which address issues raised in the assessor's interim report.
Student: Elias Vouropoulos
Programme of Study: Distributed Multimedia Systems
Supervisor: Mr. Bill Whyte
Title of project: A prototype Application Firewall to filter out dangerous e-mails
External Organisation* : _______________________________________
* (if applicable)
AGREED MARKING SCHEME
Understand the Problem    Produce a Solution *    Evaluation    Write-Up    Appendix A    TOTAL %
20                        40                      20            15          5             100
* This category includes Professionalism (see handbook)
OVERALL OBJECTIVES (continue overleaf if necessary):
1. Specify and design a prototype Application Firewall to filter out dangerous e-mails.
2. Identify commercial components that will do some of it.
3. Develop and demonstrate at least part of the design.
4. Develop an admin tool kit that will allow this to be customised.
5. Test it to see how successful it is in rejecting bad but passing good e-mails.
DELIVERABLE(s):
1. A project report.
2. The construction of an e-mail Application Firewall
APPENDIX C: Project Interim Report
School of Computer Studies MSC PROJECT INTERIM REPORT
Student: Elias Vouropoulos
Programme of Study: Distributed Multimedia Systems
Title of project: Firewalls, encryption and other aspects of security
Supervisor: Mr. Martyn Clark
External Company (if appropriate):
AGREED MARKING SCHEME
Understand the problem    Produce a solution *    Evaluation    Write-up    Appendix A    TOTAL %
20                        40                      20            15          5             100
Overall Objectives
1. Investigate the types of information and the different categories of users in SCS.
2. Investigate current systems and future requirements.
3. Examine the technical requirements in building web sites and develop appropriate
personal skills.
4. Design and implement secure web service for SCS.
Deliverable(s):
1. A project report.
2. The construction of an appropriate web site.
Following are the comments of my assessor for the interim report:
Following are the comments of my supervisor for the interim report:
As discussed, we decided to re-define your project, in order to make it more achievable. Here is what we discussed:
• Specify and design a prototype Application Firewall to filter out dangerous e-mails.
• Identify commercial components that will do some of it.
• Develop and demonstrate at least part of the design.
• Develop an admin tool kit that will allow this to be customised.
• Test it to see how successful it is in rejecting bad but passing good e-mails.
APPENDIX D: The manpage of Mail::POP3Client
NAME
    Mail::POP3Client - Perl 5 module to talk to a POP3 (RFC1939) server

SYNOPSIS
    use Mail::POP3Client;
    $pop = new Mail::POP3Client( USER     => "me",
                                 PASSWORD => "mypassword",
                                 HOST     => "pop3.do.main" );
    for( $i = 1; $i <= $pop->Count(); $i++ ) {
      foreach( $pop->Head( $i ) ) {
        /^(From|Subject):\s+/i && print $_, "\n";
      }
    }
    $pop->Close();

    # OR

    $pop2 = new Mail::POP3Client( HOST => "pop3.otherdo.main" );
    $pop2->User( "somebody" );
    $pop2->Pass( "doublesecret" );
    $pop2->Connect() || die $pop2->Message();
    $pop2->Close();

DESCRIPTION
    This module implements an Object-Oriented interface to a POP3 server.
    It implements RFC1939 (http://www.faqs.org/rfcs/rfc1939.html)

EXAMPLES
    Here is a simple example to list out the From: and Subject: headers in
    your remote mailbox:

    #!/usr/local/bin/perl
    use Mail::POP3Client;
    $pop = new Mail::POP3Client( USER     => "me",
                                 PASSWORD => "mypassword",
                                 HOST     => "pop3.do.main" );
    for ($i = 1; $i <= $pop->Count(); $i++) {
      foreach ( $pop->Head( $i ) ) {
        /^(From|Subject):\s+/i and print $_, "\n";
      }
      print "\n";
    }

CONSTRUCTORS
    Old style (deprecated):
        new Mail::POP3Client( USER, PASSWORD [, HOST, PORT, DEBUG, AUTH_MODE] );
    New style (shown with defaults):
        new Mail::POP3Client( USER      => "",
                              PASSWORD  => "",
                              HOST      => "pop3",
                              PORT      => 110,
                              AUTH_MODE => 'PASS',
                              DEBUG     => 0,
                              TIMEOUT   => 60,
                            );

    * USER is the userID of the account on the POP server
    * PASSWORD is the cleartext password for the user ID
    * HOST is the POP server name or IP address (default = 'pop3')
    * PORT is the POP server port (default = 110)
    * DEBUG - any non-null, non-zero value turns on debugging (default = 0)
    * AUTH_MODE - pass 'APOP' to attempt APOP (MD5) authorization. (default is 'PASS')
    * TIMEOUT - set a timeout value for socket operations (default = 60)

METHODS
    These commands are intended to make writing a POP3 client easier. They do
    not necessarily map directly to POP3 commands defined in RFC1081 or
    RFC1939, although all commands should be supported. Some commands return
    multiple lines as an array in an array context.

    new( USER => 'user', PASSWORD => 'password', HOST => 'host', PORT => 110,
         DEBUG => 0, AUTH_MODE => 'PASS', TIMEOUT => 60 )
        Construct a new POP3 connection with this. You should use the
        hash-style constructor. The old positional constructor is deprecated
        and will be removed in a future release. It is strongly recommended
        that you convert your code to the new version.
        You should give it at least 2 arguments: USER and PASSWORD. The
        default HOST is 'pop3' which may or may not work for you. You can
        specify a different PORT (be careful here).
        new will attempt to Connect to and Login to the POP3 server if you
        supply a USER and PASSWORD. If you do not supply them in the
        constructor, you will need to call Connect yourself.
        The valid values for AUTH_MODE are 'PASS' and 'APOP'. APOP implies
        that an MD5 checksum will be used instead of passing your password in
        cleartext. However, if the server does not support APOP, the cleartext
        method will be used. Be careful.
        If you enable debugging with DEBUG => 1, messages about command will
        go to STDERR.
        Another warning, it's impossible to differentiate between a timeout
        and a failure.

    Head( MESSAGE_NUMBER )
        Get the headers of the specified message, either as an array or as a
        string, depending on context.
        You can also specify a number of preview lines which will be returned
        with the headers. This may not be supported by all POP3 server
        implementations as it is marked as optional in the RFC. Submitted by
        Dennis Moroney <dennis@hub.iwl.net>.

    Body( MESSAGE_NUMBER )
        Get the body of the specified message, either as an array of lines or
        as a string, depending on context.

    HeadAndBody( MESSAGE_NUMBER [, PREVIEW_LINES ] )
        Get the head and body of the specified message, either as an array of
        lines or as a string, depending on context.
        Example

            foreach ( $pop->HeadAndBody( 1, 10 ) ) {
                print $_, "\n";
            }

        prints out a preview of each message, with the full header and the
        first 10 lines of the message (if supported by the POP3 server).

    Retrieve( MESSAGE_NUMBER )
        Same as HeadAndBody.

    Delete( MESSAGE_NUMBER )
        Mark the specified message number as DELETED. Becomes effective upon
        QUIT. Can be reset with a Reset message.

    Connect
        Start the connection to the POP3 server. You can pass in the host and
        port.

    Close
        Close the connection gracefully. POP3 says this will perform any
        pending deletes on the server.

    Alive
        Return true or false on whether the connection is active.

    Socket
        Return the file descriptor for the socket.

    Size
        Set/Return the size of the remote mailbox. Set by POPStat.

    Count
        Set/Return the number of remote messages. Set during Login.

    Message
        The last status message received from the server.

    State
        The internal state of the connection: DEAD, AUTHORIZATION, TRANSACTION.

    POPStat
        Return the results of a POP3 STAT command. Sets the size of the
        mailbox.

    List
        Return a list of sizes of each message.

    ListArray
        Return a list of sizes of each message. This returns an indexed array,
        with each message number as an index (starting from 1) and the value
        as the next entry on the line. Beware that some servers send
        additional info for each message for the list command. That info may
        be lost.
    Uidl( [MESSAGE_NUMBER] )
        Return the unique ID for the given message (or all of them). Returns
        an indexed array with an entry for each valid message number. Indexing
        begins at 1 to coincide with the server's indexing.

    Last
        Return the number of the last message, retrieved from the server.

    Reset
        Tell the server to unmark any message marked for deletion.

    User( [USER_NAME] )
        Set/Return the current user name.

    Pass( [PASSWORD] )
        Set/Return the current password.

    Login
        Attempt to login to the server connection.

    Host( [HOSTNAME] )
        Set/Return the current host.

    Port( [PORT_NUMBER] )
        Set/Return the current port number.

AUTHOR
    Sean Dowd <pop3client@dowds.net>

CREDITS
    Based loosely on News::NNTPClient by Rodger Anderson <rodger@boi.hp.com>.

SEE ALSO
    perl(1).