A prototype application Firewall to filter out dangerous e-mails MSc in DMS 1999-2000

Elias Vouropoulos

Contents

1. About the project
2. Internet security
   2.1 The need for security
   2.2 Cryptography
   2.3 Firewalls
       2.3.1 Network level firewalls
       2.3.2 Application level firewalls
   2.4 IP addresses
   2.5 The three myths of firewalls
3. E-mail
   3.1 Introduction
   3.2 Message Format
   3.3 MIME (Multipurpose Internet Mail Extensions)
   3.4 Message Transfer - SMTP
   3.5 E-mail gateways
   3.6 POP3 (Post Office Protocol, version 3)
4. E-mail problems
   4.1 Spam e-mail
   4.2 E-mail threats
   4.3 Address spoofing
5. E-mail filtering
   5.1 The fundamentals of e-mail filtering
   5.2 E-mail filtering products
       5.2.1 Eudora
       5.2.2 Pegasus
       5.2.3 Procmail
6. Design, implementation and evaluation of the project
   6.1 Outline of the design
   6.2 The creation of the e-mail reader
       6.2.1 Design
       6.2.2 Implementation
       6.2.3 Evaluation of the e-mail reader
   6.3 Establishing the filters
       6.3.1 Design
       6.3.2 Implementation
       6.3.3 The implementation of the filter
       6.3.4 Manipulation of the files containing rejected e-mail and IP address, domains, suspicious words and dangerous attachments
       6.3.5 Evaluation of the e-mail filter
7. Future Improvements
8. Conclusions
9. References
10. Bibliography
APPENDIX A: Experiences gained through the project
APPENDIX B: Project Objectives and Deliverables
APPENDIX C: Project Interim Report
APPENDIX D: The manpage of Mail::POP3Client

1. About the project

The purpose of this project was for me to become familiar with some aspects of Internet and network security. It was the first time I had approached this subject, so I was entirely new to the area. The purpose of my MSc project was therefore not to make me an expert in network security or, more specifically, in e-mail filtering and security. Its real purpose was to give me the basic knowledge of the area that I need in order to cope with real e-mail and other network security problems, in particular e-mail filtering.

Originally the project title was “Firewalls, encryption and other aspects of security”. This was so vague and potentially so wide that it would not have been possible to implement anything in the time allowed. My study in preparation for the interim report also made me understand that it was impossible to cover all the aspects of Internet security in an MSc project. My new supervisor, Mr Bill Whyte, who started to supervise me after the interim report, and my assessor came to the same opinion after they had read my interim report. So at the first meeting with my new supervisor we agreed that the project had to be redefined in order to become more achievable. My new supervisor proposed that the project should be the design and implementation of a prototype firewall for filtering e-mails. I found his proposal very interesting, so my project was redefined and its new title was “A prototype application firewall to filter out dangerous e-mails”.

2. Internet security

2.1 The need for security

Computer networks are one of the most rapidly developing areas of computing, and they affect not only computer science but also many other areas of the modern world. Almost all enterprises around the world use computers that are connected with each other so that they can exchange information with their clients and the rest of the world. Consequently, every organisation and enterprise that wants to survive the hard competition and the speedy changes of the market has to use the facilities that computer networks offer. This is even more imperative with the rise of e-commerce as the new standard for business. Thus the need for Internet security has become a current and crucial matter.

The most basic question that someone would ask is why Internet security, and network security in general, is needed. To answer that question, the way the Internet is implemented has to be examined. The Internet is the most widely known network. It is sometimes called the “network of networks” and uses the TCP/IP (Transmission Control Protocol/Internet Protocol) protocol suite. TCP/IP is not something new. Its origins can be found in the creation of ARPANET. The first ARPANET mainly provided high-bandwidth connectivity between some major US computing sites, such as government and educational organisations and research laboratories. It gave its users the ability to transfer files and e-mails from one site to another [Hare, Siyan, 1996]. Although TCP/IP was created many years ago, it is still, with some newer versions, the basis of the Internet.

The fundamental problem is that the Internet was not designed to be very secure. The explanation can be found by examining two of its significant characteristics: distributed processing and open communications. What does that mean? To put it simply, when one computer communicates with another, other members of the network may be able to observe that communication and consequently access the information that the two computers exchange. That phenomenon stems from the physical connection of the computers that comprise the entire network. The two computers that communicate with each other are not physically connected by a dedicated link (a cable or an optical fibre directly connecting the two computers) [Feghhi et al, 1999].

Figure 1: the network physical connection (three workstations, each attached to “the entire network”)

Although the two computers appear to be connected directly, what really exists between them is a virtual path. Their physical connection is established with the help of a large number of intermediate computers, each physically connected to the next, which together link the two computers that exchange information. All the computers that comprise the physical path between these two computers are able to read this information. So if the information is sensitive (for instance the number of a VISA card), the sender has to find a way (a function) to convert the data into a stream of bits that is incomprehensible to the other members of the network. The receiver has to perform exactly the opposite function to restore the information to its original, comprehensible form. Only these two computers, the sender and the receiver, can transform meaningful information into meaningless bits and back again, because only they share a secret, a key, that allows them to do so. That technique is called encryption.

Apart from encrypting data, Internet security also requires that the information on a server is protected from unauthorised users. This is what firewalls are responsible for. A firewall is a set of mechanisms that protects one network from another. It is placed between an internal, secure network and the rest of the Internet.

2.2 Cryptography

Cryptography is the science of converting an original message, called the plaintext, into an unintelligible message, called the ciphertext. Cryptography has two purposes:

1. To make the cost of breaking the cipher greater than the value of the encrypted information.

2. To make the time that the attacker has to spend breaking the cipher so long that the information has lost its value by then.

The cryptographic algorithms are divided into two categories:

1. Secret-key algorithms

2. Public-key algorithms

Secret-key encryption: the algorithms in this category use the same secret key for encryption and decryption. Anyone who holds the key can both encrypt and decrypt information [Stallings, 1995].

Figure 2: secret-key encryption (plaintext -> encryption -> ciphertext -> decryption -> plaintext, with the same secret key used for both steps)
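The idea that one shared key drives both directions can be shown with a deliberately simple sketch. The XOR transform below is my own toy illustration and is not secure. The key and the sample data are hypothetical, and no real cipher from [Stallings, 1995] is being reproduced.

```python
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """Toy secret-key transform: XOR every byte with a repeating key.

    Applying the function twice with the same key restores the input,
    so one function serves for both encryption and decryption.
    This is NOT secure; it only illustrates the shared-key idea.
    """
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


secret_key = b"shared-secret"        # hypothetical key known to both parties
plaintext = b"card number 1234"      # hypothetical sensitive data

ciphertext = xor_with_key(plaintext, secret_key)   # done by the sender
recovered = xor_with_key(ciphertext, secret_key)   # done by the receiver
```

Anyone who does not hold `secret_key` sees only `ciphertext`, while the receiver recovers the plaintext by applying the same key again.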

Public-key encryption: it involves the use of two different keys: the private key, which must be kept secret, and the public key, which can be freely shared with anyone. The public key is used for encryption and the private key for decryption. Public-key algorithms are used not only for encryption but also for the creation of digital signatures [Stallings, 1995].

Figure 3: public-key encryption (plaintext -> encryption with the public key -> ciphertext -> decryption with the secret key -> plaintext)
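As a rough illustration of the two-key idea, the sketch below uses textbook RSA with tiny numbers. RSA is my choice of example, since the text above does not name a particular algorithm. The primes, the exponent and the message are assumptions picked only to keep the arithmetic readable, and keys of this size offer no security at all.

```python
# Toy public-key example with deliberately tiny numbers (textbook RSA).
p, q = 61, 53                  # two small primes, kept secret
n = p * q                      # modulus, shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, chosen coprime with phi
d = pow(e, -1, phi)            # private exponent (needs Python 3.8 or later)

public_key = (e, n)            # may be given to anyone
private_key = (d, n)           # must be kept secret


def encrypt(message: int, key) -> int:
    """Encrypt an integer smaller than the modulus with the public key."""
    exponent, modulus = key
    return pow(message, exponent, modulus)


def decrypt(cipher: int, key) -> int:
    """Recover the integer with the matching private key."""
    exponent, modulus = key
    return pow(cipher, exponent, modulus)


message = 65
cipher = encrypt(message, public_key)
restored = decrypt(cipher, private_key)
```

Only the holder of `private_key` can turn `cipher` back into `message`, even though `public_key` is known to everyone.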

2.3 Firewalls

A firewall is a set of mechanisms that protects one network from another. In practice a firewall is a pair of mechanisms: one is used to block traffic and the other is used to permit traffic [1]. Generally, firewalls are designed to deny access to unauthorised users who are coming from an external network. More sophisticated firewalls block traffic from outside to inside but allow traffic from inside to outside. So, “a firewall is a filtering mechanism placed between the private network and the outside world (Internet) so that all incoming and outgoing traffic is forced to pass through it to prevent unwanted and potentially damaging intrusion” [2].

The basic idea of the firewall is that the inner network remains, in theory, invisible (or at least unreachable) to anyone who has no authorisation to reach the trusted network from the outside world. So all communication with the rest of the Internet takes place through the firewall. The firewall receives each request from someone who is trying to reach a computer inside the inner network and decides whether or not to let the request pass through.

Figure 4: the use of the firewall

Although firewalls are usually placed between a trusted network and the Internet,

there are some cases, especially in big organisations and enterprises, where firewalls

can be used in order to create different sub-nets of the network and consequently to

create different levels of access to information for its employees.

There are different implementations of firewalls. Still they can be divided into two

basic types: network level and application level firewalls.

[1] Ranum, M J and Curtin, M (1998), Internet Firewalls Frequently Asked Questions, http://www.hideaway.net/texts/fwfaq.html

[2] Khalid Al-Tawil and Ibrahim A. Al-Kaltham (1999), Evaluation and Testing of Internet Firewalls, International Journal of Network Management, 9, pp. 135-149

2.3.1 Network level firewalls

Network level firewalls are also known as packet filtering firewalls. They are usually router based, so the rules about who and what can access the inner network are applied at the router level. These firewalls accept or deny access based on the source address. The router examines each packet that arrives from the outside world. Once the source IP address has been identified, the router decides, according to its rules, whether to forward the packet or reject it. The philosophy used by the router is that everything that is not forbidden is permitted. This type of firewall is not very secure, because it can be bypassed when attackers use forged IP addresses. On the other hand it is fast, because it is easy for the router to identify the source IP address of an incoming packet and check whether it is restricted or not [3].
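The decision a packet filter makes can be sketched in a few lines. This is only an illustration of the “everything that is not forbidden is permitted” policy, not the rule language of any real router. The block list and the function name are hypothetical, and the blocked networks are reserved documentation ranges.

```python
import ipaddress

# Hypothetical block list; both entries are reserved documentation ranges.
FORBIDDEN_SOURCES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def forward_packet(source_ip: str) -> bool:
    """Return True if a packet from source_ip should be forwarded.

    Default-permit policy: a packet is rejected only if its source
    address falls inside one of the forbidden networks.
    """
    address = ipaddress.ip_address(source_ip)
    return not any(address in network for network in FORBIDDEN_SOURCES)
```

The check relies entirely on the source address that the packet claims to have, which is why a forged address is enough to get past a filter of this kind.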

2.3.2 Application level firewalls

Firewalls of this type are also known as application gateways. In contrast to network level firewalls, which are hardware (router) based, application level firewalls use server programs, called proxies, that run on the firewall. So a computer is used as the firewall instead of a router. When a remote user sends a request to a network that uses an application gateway, the gateway intercepts the remote connection. Instead of allowing immediate access to the internal network, the gateway examines various fields in the request. If these meet the predefined rules, the gateway allows the remote user to access the internal network. An application gateway requires a proxy for each service, such as FTP or HTTP, that is to be supported through the firewall. The application gateway is considered to be the most secure type of firewall [4].

[3] Ranum, M J and Curtin, M (1998), Internet Firewalls Frequently Asked Questions, http://www.hideaway.net/texts/fwfaq.html

[4] Ranum, M J and Curtin, M (1998), Internet Firewalls Frequently Asked Questions, http://www.hideaway.net/texts/fwfaq.html

2.4 IP addresses

As has already been mentioned, one of the most important elements that firewalls take into consideration when deciding whether to allow a user to access the internal network is the source IP address of the request. IP addresses are also one of the filters that have been implemented in this project. So it is useful to discuss briefly what IP addresses are.

Every computer connected to the Internet has a unique number so that other computers can identify it. That is the IP address. IP addresses are 32-bit numbers that identify Internet hosts. These numbers are placed in the header of each packet sent using TCP/IP and are used to route the packets to their destination [5]. E-mail is also carried over TCP/IP, so the headers of each e-mail record the IP address of the sender's host. So we can use that information to help identify the sender of an e-mail.
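One way to get at that information is sketched below. It assumes that the relaying hosts wrote the sender's address in square brackets inside the Received headers, which is a common convention but not something the source guarantees. The function name, the regular expression and the sample message are hypothetical.

```python
import re
from email import message_from_string

# Matches an IPv4 address written inside square brackets, e.g. [129.11.147.188].
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_message: str) -> list:
    """Collect bracketed IPv4 addresses from all Received headers, topmost first."""
    message = message_from_string(raw_message)
    ips = []
    for value in message.get_all("Received", []):
        ips.extend(BRACKETED_IPV4.findall(value))
    return ips


# Hypothetical message with two relay hops.
sample = (
    "Received: from relay.example.org (relay.example.org [192.0.2.10])\n"
    "\tby mail.example.com; Mon, 1 May 2000 10:00:00 +0100\n"
    "Received: from pc17 (unknown [129.11.147.188])\n"
    "\tby relay.example.org; Mon, 1 May 2000 09:59:58 +0100\n"
    "From: someone@example.org\n"
    "Subject: test\n"
    "\n"
    "body\n"
)
```

For `sample` the function yields the two addresses in the order in which the headers appear. Each header is written by the host that received the message, so the lowest entries are the ones a dishonest sender could most easily have forged.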

As mentioned earlier, an IP address is a 32-bit number. That number is usually represented as four fields, each an 8-bit number in the range 0 to 255, separated by periods. So an IP address looks like the following: “129.11.147.188”. IP addresses can be divided into four classes: A, B, C and D. The value of the first field determines the class to which an IP address belongs. Class D addresses are used for multicast applications. The range of first-field values for each class is listed below [6]:

Class   Range     Allocation
A       1-126     N.H.H.H
B       128-191   N.N.H.H
C       192-223   N.N.N.H
D       224-239   Not applicable

N = Network and H = Host
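The table above translates directly into a small lookup. The sketch below is only an illustration, and the function name is my own. First-field values outside the table (for example 0, 127 and 240 to 255) are reported as “unknown”, because the table says nothing about them.

```python
def ip_class(address: str) -> str:
    """Return the class letter (A to D) of a dotted-quad address,
    using the first-field ranges from the table above."""
    fields = [int(part) for part in address.split(".")]
    if len(fields) != 4 or any(not 0 <= field <= 255 for field in fields):
        raise ValueError(f"not a valid dotted-quad address: {address!r}")
    first = fields[0]
    if 1 <= first <= 126:
        return "A"
    if 128 <= first <= 191:
        return "B"
    if 192 <= first <= 223:
        return "C"
    if 224 <= first <= 239:
        return "D"
    return "unknown"   # values that the table does not cover
```

For the example address “129.11.147.188” the first field is 129, so the address falls in class B.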

An IP address can be either static or dynamic.

• A static address is permanent. It is the address of a computer that is always connected to the Internet.

• A dynamic IP address is one that is temporarily assigned to a node each time it connects to the Internet. Dynamic addresses are used by ISPs for dial-up access: each time a node dials up, a different IP address may be assigned to it.

[5] Connected: An Internet Encyclopedia (April, 1997), IP Address, http://noc.ucsc.edu/cie/Topics/23.htm

[6] Chris Lewis (2000), IP 101: All About IP Addresses, http://www.networkcomputing.com/netdesign/ip101.html

2.5 The three myths of firewalls

The following three myths about firewalls are taken from Bob Blakley, a security architect at IBM [7].

1. “We have got the place surrounded”: Firewalls assume that the only way to access the inner network from the Internet is through the firewall. That assumption is not always true, because sometimes there are back doors into the inner network. Usually these back doors are created by the users of that network. For instance, in a huge enterprise users may set up their own back doors using modems and the appropriate programs so that they can work from home.

2. “Nobody is here but us chickens”: Firewalls also assume that all the users in the inner network are trustworthy. That is not a realistic assumption, since many computer crimes have been committed by insiders.

3. “Sticks and stones may break my bones, but words will never hurt me”: Word macros, JavaScript, Java and other kinds of executable commands can be embedded inside data, so a security system that is not aware of that fact may be completely insecure.

[7] MIT Kerberos team (2000), The Three Myths of Firewalls, http://web.mit.edu/kerberos/www/firewalls.html

3. E-mail

3.1 Introduction

Electronic mail, or e-mail as it is widely known, is one of the most popular uses of the Internet. Millions of people use it to send messages to other people all around the world. The first e-mail system was a simple file transfer protocol, and every file that was an e-mail had the recipient's address in its first line. That implementation had several problems. Among them, a message could not be sent to more than one recipient, and it had no internal structure that would make it easier for the server to process.

As time passed, the need for better standards kept increasing. “In 1982, the ARPANET e-mail proposals were published as RFC 821 (transmission protocol) and RFC 822 (message format). These have since become the de facto Internet standards” [Tanenbaum, 1996]. Two years later, CCITT made another proposal for e-mail, X.400. After a decade of hard competition between the two standards, e-mail systems based on RFC 822 are in common use and those based on X.400 have almost disappeared. The reason was the very poor and complex design of X.400 rather than the quality of the RFC 822 proposal.

An e-mail system consists of two parts: the Message User Agent (MUA) and the Message Transfer Agent (MTA). The MUA is a client-based program that provides users with an interface, graphical or not, through which they interact with their e-mails. This project involved creating a basic MUA, or e-mail reader, that applies some filtering rules. It uses the POP3 protocol, which is explained later in this chapter, to communicate with the mail server in order to access the e-mails. The MTA is the part of the system that moves the messages from their source to their destination. MTAs typically use SMTP (Simple Mail Transfer Protocol, RFC 821). A typical e-mail system has to perform five basic functions [Tanenbaum, 1996]:

• Composition: it refers to the creation of the e-mail.

• Transfer: it refers to the sending of the messages.

• Reporting: informs the sender if the e-mail was delivered.

• Display: it displays the incoming messages.

• Disposition: it allows the user to delete or save a message.

Function      MUA   MTA
Composition   x
Transfer            x
Reporting           x
Display       x
Disposition   x

3.2 Message Format

The message format (RFC 822) defines that messages have two parts: a header and a body. Both parts consist of ASCII text. In the early days of e-mail there was no need for the body of a message to be anything other than text. Later, however, the increasing need of users to send files through e-mail led to the definition of MIME (Multipurpose Internet Mail Extensions) [Peterson, Davie, 2000].

In RFC 822 each header consists of one line of ASCII text. The most important

header fields are the following [Tanenbaum, 1996]:

• To: this field contains the DNS (Domain Name System) address of the primary receiver. Multiple receivers are also allowed.

• Cc (Carbon Copy): it contains the DNS address of the secondary receiver. As with the To field, multiple receivers are allowed. For the e-mail system there is no difference between the To and the Cc fields. The difference is purely psychological for the user.

• Bcc (Blind Carbon Copy): it is similar to the To and Cc fields, but this line is removed from the copies that are sent to the primary and secondary receivers.

• From: it identifies the person who sent the e-mail, although, as will be discussed later, it is not always accurate. Usually it contains the sender's name and DNS address.

• Received: this header is added by each MTA along the way. It contains information about the agent's identity, together with the date and time.

• Date: the date and time at which the message was sent.

• Message-ID: a unique identifier for the message.

• Subject: a brief summary of what the e-mail contains.

RFC 822 also allows users to add new headers for their own private use. These headers have to start with “X-”, for instance “X-No-Archive:”.
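The header and body structure described above can be explored with Python's standard `email` package. The message below is a made-up example and all addresses in it are hypothetical. This sketch only shows how the fields listed above appear to a program and is not the parser used in this project.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical RFC 822 message: header lines, one blank line, then the body.
raw = (
    "From: Alice Example <alice@example.org>\n"
    "To: bob@example.com, carol@example.com\n"
    "Cc: dave@example.com\n"
    "Date: Mon, 1 May 2000 10:00:00 +0100\n"
    "Message-ID: <12345@example.org>\n"
    "Subject: Meeting notes\n"
    "X-No-Archive: yes\n"
    "\n"
    "See you on Friday.\n"
)

message = message_from_string(raw)

subject = message["subject"]        # header lookup is case-insensitive
recipients = [address for _, address in getaddresses(message.get_all("To", []))]
private_headers = {name: value for name, value in message.items()
                   if name.lower().startswith("x-")}
body = message.get_payload()        # everything after the first blank line
```

Here `recipients` holds the two To addresses, `private_headers` holds only the “X-No-Archive” field, and `body` is the single line of text.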

3.3 MIME (Multipurpose Internet Mail Extensions)

As mentioned earlier, RFC 822 specifies that the body of the message consists of US-ASCII characters. MIME extends the format of messages in order to allow [8]:

• Textual messages in character sets other than US-ASCII.

• An extensible set of different formats for non-textual message bodies.

• Multipart message bodies.

• Text header information in character sets other than US-ASCII.

MIME defines five new message headers. If a message does not have these fields, it is handled as a plain US-ASCII message.

• MIME-Version: uses a number to identify the MIME version.

• Content-Type: specifies the nature of the data in the body of the message.

• Content-Transfer-Encoding: identifies the way in which the body of the message is encoded for transmission through the network. The most appropriate way to encode binary messages is base64 encoding [Tanenbaum, 1996].

• Content-ID: it is similar to the standard Message-ID header.

• Content-Description: a human-readable description of what the message contains. It is similar to the Subject header defined in RFC 822.
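To see these headers in practice, the following sketch builds a message with a binary attachment using Python's standard `email.message.EmailMessage` class. The addresses, the file name and the attachment data are hypothetical, and the library is used here only as a convenient way to show what MIME adds to a message.

```python
from email.message import EmailMessage

message = EmailMessage()
message["From"] = "alice@example.org"
message["To"] = "bob@example.com"
message["Subject"] = "Report attached"
message.set_content("The report is attached.")     # becomes a text/plain part

binary_data = bytes(range(256))                    # stand-in for a real file
message.add_attachment(binary_data,
                       maintype="application", subtype="octet-stream",
                       filename="report.bin")

# After the attachment is added the message is a multipart body whose
# second part carries the binary data in base64 form.
attachment = next(message.iter_attachments())
transfer_encoding = attachment["Content-Transfer-Encoding"]
decoded = attachment.get_content()                 # base64 decoded back to bytes
```

Printing `message.as_string()` shows the Content-Type and Content-Transfer-Encoding headers of each part together with the base64 text of the attachment.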

[8] Multipurpose Internet Mail Extensions (RFC 2045) (1996), http://andrew2.andrew.cmu.edu/rfc/rfc2045.html

3.4 Message Transfer - SMTP

The message transfer subsystem of an e-mail system is responsible for transmitting the message from its source to its destination. The simplest way to do that is to establish a transport connection from the source machine to the destination machine and then just transfer the message, although very often the receiving machine of a connection is only an intermediate one and not the ultimate destination. Within the Internet, e-mails are moved by establishing a TCP connection to port 25 between the sender-SMTP and the receiver-SMTP. The sender and the receiver generate and exchange SMTP commands. On each host there is an e-mail daemon (program). The MUA tells the daemon that it wishes to send an e-mail, and the daemon uses SMTP to send the e-mail to the daemon that runs on the receiving machine. The most popular implementation of an e-mail daemon is sendmail on UNIX.

After the connection has been established, the sender acts as the client and the receiver as the server. The server sends a message to the client giving its identity and stating whether it accepts e-mails or not. If the server is willing to accept e-mails, the client sends a message to the server identifying the sender and the receiver of the e-mail (their DNS addresses). If the receiver exists on the server, the server gives the client the go-ahead to send the e-mail.
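The client's side of that dialogue can be written down as a sequence of command lines. The sketch below uses the RFC 821 command names (HELO, MAIL FROM, RCPT TO, DATA, QUIT) and assumes that the server accepts every step. It opens no network connection, and the function name, host name and addresses are hypothetical.

```python
def smtp_client_commands(client_host, sender, recipients, message_text):
    """Return the lines an SMTP client would send for one message,
    in order, assuming the server accepts every step."""
    lines = [f"HELO {client_host}", f"MAIL FROM:<{sender}>"]
    lines += [f"RCPT TO:<{recipient}>" for recipient in recipients]
    lines.append("DATA")
    for line in message_text.splitlines():
        # A leading "." is doubled so it is not mistaken for the end marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")        # a line holding a single "." ends the message
    lines.append("QUIT")
    return lines


commands = smtp_client_commands(
    "pc17.example.org",
    "alice@example.org",
    ["bob@example.com"],
    "Subject: hi\n\n.hidden\nbye",
)
```

In a real session each of these lines is terminated by CRLF and answered by the server with a numeric reply code before the client sends the next one.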

3.5 E-mail gateways

SMTP is a protocol that runs on top of TCP/IP, the Internet standard. However, there are some companies that do not wish to be connected to the Internet but still wish to send and receive e-mails. In that case the company's server is not connected directly to the Internet. It is connected only to an e-mail gateway, which acts as a firewall, because it protects the company's server by allowing only e-mails to reach it.

Figure 5: the use of an e-mail gateway as a “firewall” (labels: Internet, e-mail daemon, e-mail gateway, TCP or TP4 connection, server running SMTP or X.400, e-mail daemon)

There are also cases where the sender or the receiver speaks only RFC 822 and the other end of the connection speaks only X.400. The solution in both cases is to use application layer e-mail gateways. The e-mail gateway is an intermediate node that, like the hosts, runs sendmail. Its job is to store and forward e-mails.

Figure 6: the use of an e-mail gateway node to connect different networks (labels: server running SMTP with e-mail reader and e-mail daemon, TCP connection, e-mail gateway with e-mail daemon, TP4 connection, server running X.400 with e-mail reader and e-mail daemon)

3.6 POP3 (Post Office Protocol, version 3)

So far we have only described how e-mails are moved from one host to another. PCs or workstations that do not have the resources to run an e-mail daemon need a way to access the e-mails held on a mailbox server. A simple protocol that is used in these cases is POP3. That protocol is used in the implementation of this project.

Figure 7: the POP3 protocol (a client running POP3 and a server running POP3, joined by a TCP connection on port 110, exchange commands and responses once the connection is established)

The mailbox server starts the POP3 service by listening on TCP port 110. When a client wishes to retrieve e-mails using that protocol, it establishes a connection with the server on that port. When the connection has been established, the server sends a greeting. Afterwards the client and the server exchange commands and responses respectively until the end of the connection [9].

The commands in POP3 consist of a keyword, followed in some cases by an argument. The responses consist of a success indicator and a keyword, sometimes followed by additional information. There are two success indicators: the positive one (“+OK”) and the negative one (“-ERR”). A CRLF pair terminates each command and each response.
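The line format just described is simple enough to handle with two small helper functions. They are a sketch of my own for illustration, with hypothetical names, and not the code of Mail::POP3Client or of this project.

```python
def parse_pop3_response(line: bytes):
    """Split one POP3 status line into (success, text).

    success is True for "+OK" and False for "-ERR"; text is whatever
    follows the indicator, with the terminating CRLF removed.
    """
    if not line.endswith(b"\r\n"):
        raise ValueError("POP3 lines must end with CRLF")
    indicator, _, text = line[:-2].decode("ascii").partition(" ")
    if indicator == "+OK":
        return True, text
    if indicator == "-ERR":
        return False, text
    raise ValueError(f"unknown status indicator: {indicator!r}")


def format_pop3_command(keyword: str, argument: str = "") -> bytes:
    """Build a command line: keyword, optional argument, then CRLF."""
    parts = [keyword.upper()] + ([argument] if argument else [])
    return " ".join(parts).encode("ascii") + b"\r\n"
```

For example, `format_pop3_command("retr", "1")` produces the bytes of the line “RETR 1” followed by CRLF, and a server reply beginning with “-ERR” is reported with `success` set to False.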

During a POP3 connection there are three different stages. The first is the AUTHORIZATION stage. At the beginning of that stage the server returns a one-line greeting to the client. The client now has to identify itself. There are three commands in that stage:

• USER “username”

• PASS “password”

• QUIT

[9] Post Office Protocol - Version 3 (RFC 1939) (1996), http://www.faqs.org/rfcs/rfc1939.html

If the client logs in to the server successfully, the server locks the maildrop and moves the connection to the next stage, the TRANSACTION stage. In that stage the client issues commands and the server replies with a positive response if the command is valid, together with some additional information if necessary. The commands are the following:

• STAT: the server returns a line containing information about the maildrop.

• LIST msg (the argument is optional): if the client gives a message number, the server returns some information about the e-mail with that number. Otherwise it returns information about each message in the maildrop.

• RETR msg: the server returns the message with this number.

• DELE msg: the message with this number is marked as deleted.

• NOOP: the server just replies with a positive response.

• LAST: the server returns the highest message number that has been accessed.

• RSET: any messages that have been marked as deleted are unmarked.

The final stage is the UPDATE stage. The connection enters it when the client issues the QUIT command in the TRANSACTION stage. The server deletes all the messages marked as deleted and unlocks the maildrop.
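The three stages can be modelled as a small state machine. The class below is a simplified sketch with no networking, and its name and attributes are hypothetical. It does not check passwords and only tracks which commands are allowed in each stage and when deletions really take effect.

```python
class Pop3Session:
    """Tracks which POP3 stage a session is in (no networking involved)."""

    AUTHORIZATION_COMMANDS = {"USER", "PASS", "QUIT"}
    TRANSACTION_COMMANDS = {"STAT", "LIST", "RETR", "DELE",
                            "NOOP", "LAST", "RSET", "QUIT"}

    def __init__(self, message_count):
        self.state = "AUTHORIZATION"
        self.messages = set(range(1, message_count + 1))
        self.marked_deleted = set()
        self.user_given = False

    def handle(self, keyword, argument=None):
        """Apply one command and return True (+OK) or False (-ERR)."""
        keyword = keyword.upper()
        if self.state == "AUTHORIZATION":
            if keyword not in self.AUTHORIZATION_COMMANDS:
                return False
            if keyword == "USER":
                self.user_given = True
            elif keyword == "PASS":
                if not self.user_given:
                    return False
                self.state = "TRANSACTION"   # credentials are not checked here
            else:                            # QUIT before login: nothing to update
                self.state = "CLOSED"
            return True
        if self.state == "TRANSACTION":
            if keyword not in self.TRANSACTION_COMMANDS:
                return False
            if keyword == "DELE":
                if argument is None or not str(argument).isdigit():
                    return False
                number = int(argument)
                if number not in self.messages or number in self.marked_deleted:
                    return False
                self.marked_deleted.add(number)   # only marked, not yet removed
            elif keyword == "RSET":
                self.marked_deleted.clear()
            elif keyword == "QUIT":
                self.state = "UPDATE"
                self.messages -= self.marked_deleted   # deletions happen only now
                self.marked_deleted.clear()
                self.state = "CLOSED"
            return True
        return False                              # session already closed
```

The design point worth noticing is that DELE only marks a message. Nothing is removed until QUIT moves the session through the UPDATE stage, so RSET can still undo the marks beforehand.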

4. E-mail problems

4.1 Spam e-mail

Junk e-mail, or spam as it is widely known, is a big and growing problem on the Internet. Spamming means flooding the Internet with many copies of the same message, sending it to many receivers who would not otherwise choose to receive it. Spam e-mail usually consists of commercial advertisements or get-rich-quick schemes [10].

There are two main types of spam. The first is sent by posting a message to many newsgroups. The second is sent directly to many different receivers. In the first case it is the responsibility of the newsgroup administrator to block the spam. This project is targeted at the second type of spam.

Besides being annoying, spam e-mail is also a waste of time and money for the receiver who reads it. Why it wastes time is obvious. It also wastes money, because many people dial up to an ISP, so while they are reading it they are paying charges, for instance the cost of the telephone line. Many companies choose this way of advertising instead of the traditional postal system because it is cheaper. Spam e-mail is unusual among forms of advertising in that it costs the customer more money than the company. The senders of spam e-mails can find the e-mail addresses of the receivers in many ways. Two of them are searching the net for addresses and stealing lists belonging to different newsgroups.

4.2 E-mail threats

In the early days of e-mail, when only text could be transferred, a system could not be damaged by a malicious program arriving through e-mail. Today, however, MIME allows e-mails to carry files in their bodies, and these files may contain malicious programs. E-mails may also contain embedded HTML code in their bodies, and with Java and JavaScript it is possible to add harmful executable commands to that HTML code. If the e-mail reader executes the HTML code immediately, it will execute the malicious code as well. Consequently a malicious program can be executed just by reading the e-mail. E-mail threats may be Trojan horses, viruses or worms.

[10] Scott Hazen Mueller (1999), What is spam?, http://spam.abuse.net/whatisspam.html

• Trojan horse: the most elementary form of malicious code. A Trojan horse consists of instructions hidden inside an otherwise useful program that do bad things. The term is usually applied when these instructions are added to the program at the time it is written [Kaufman, 1995]. When a Trojan horse is executed it may destroy files or create a “back door” that allows an intruder to access your system. A Trojan horse does not propagate itself from one computer to another; that is a characteristic of the other two threats, viruses and worms [11].

• Virus: a set of instructions that, when executed, inserts copies of itself into other programs [Kaufman, 1995]. When the infected program runs, the virus code gets a chance to inspect its environment for other programs and infect them. If a user sends an infected program to another user, or if a storage medium (for instance a floppy disk) containing an infected file is moved from one machine to another, the virus may spread rapidly. Melissa is one of the most famous viruses spread through e-mail; it managed to infect over 100,000 PCs all over the world.

• Worm: a program that replicates itself by installing copies of itself on other machines across the network [Kaufman, 1995]. A worm does not alter other files as Trojan horses and viruses do, but resides in active memory and duplicates itself through the network. Worms are invisible to users; they are noticed only when their uncontrolled replication consumes system resources, slowing or halting other programs. The harm that worms cause is therefore the consumption of computing resources. Some newer worms, such as Worm.ExploreZip, reside in memory and replicate like any other worm, but they also carry a malicious payload [12].

[11] ZDNet (2000), Help & How-To: Trojan Horse, Virus or Worms, http://www.zdnet.com/zdhelp/stories/main/0,5594,2435378,00.html?chkpt=zdnnmoreon

[12] ZDNet (2000), Help & How-To: Definition of a Worm, http://www.zdnet.com/zdhelp/stories/main/0,5594,2435378-3,00.html

4.3 Address spoofing

When an e-mail is received, its header ought to indicate who sent it. However, we cannot be absolutely sure that the address shown in the header is really the sender’s. The headers can be forged so that they indicate that the message was sent from an address other than the real one. This is called e-mail address spoofing. The attack can be prevented by a technique called origin authentication.

Origin authentication checks whether the sender is who he claims to be. To achieve this, public key cryptography is used. It employs two different keys: a private key that only the sender knows (each sender has their own private key) and a corresponding public key that anyone can obtain easily. The sender uses their private key to encrypt the message and the receiver uses the public key to decrypt it. If the message remains unintelligible after decryption, the sender did not use the appropriate private key and consequently is not who he claims to be [Hare, 1996]. This project does not implement any cryptography or origin authentication techniques.

5. E-mail filtering

5.1 The fundamentals of e-mail filtering

As the previous chapters have made clear, e-mail is a very useful facility of computer networks, although it can sometimes be very dangerous (e-mail threats) or annoying (spam). Consequently some filtering techniques are needed in order to isolate dangerous or annoying e-mails before the user opens them or runs their attachments.

The following fundamentals of e-mail filtering were identified by collecting information from various discussion forums and newsgroups:

1. The source of the e-mail: the source should be checked to make sure that it is an acceptable one. In other words, if there is an exclusion list of sources, the e-mail should not be accepted when its source appears in that list. The source may be an individual (for instance vouropoulos@hotmail.com) or a domain (hotmail.com), and the two have to be checked separately. The IP address of the sender also has to be checked, and if it is an excluded one the e-mail needs to be quarantined.

2. Subject line: if that header contains some “dangerous” words, then the e-mail

needs to be quarantined.

3. The main body of the text: it has to be checked for words belonging to a list of trigger words. It is vital that, when searching for trigger words in both the body and the subject line, the words are normalized before the search, usually by changing them to a single case.

4. Attachments: e-mails containing dangerous attachments need to be isolated. If the attachment is a known virus, the e-mail has to be kept in quarantine. In addition, for files with extensions such as .vbs, .shs, .jpg, .gif, .exe, .bat, .com or others that might have active content, the e-mail reader has to inform the user about the risk of receiving and executing these files.

5. Viruses: it would also be useful, although complicated, for the e-mail reader to include an exit to a standard anti-virus program, so that each attachment file can be checked.

Most of these fundamentals have been implemented in this project.
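The normalization mentioned in points 2 and 3 can be illustrated with a short Python sketch (the project itself is written in PERL). The function names and the contents of `TRIGGERS` are hypothetical; the real lists are kept in text files.

```python
def normalize(text):
    """Reduce text to a single case so that matching ignores capitalisation."""
    return text.lower()


def find_trigger_words(text, trigger_words):
    """Return the trigger words that occur in text, compared case-insensitively."""
    haystack = normalize(text)
    return [word for word in trigger_words if normalize(word) in haystack]


# Hypothetical trigger list, for illustration only.
TRIGGERS = ["iloveyou", "get rich quick"]

print(find_trigger_words("Subject: ILoveYou", TRIGGERS))  # ['iloveyou']
```

Without the normalization step, a sender could defeat the list simply by changing the capitalisation of a word.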

5.2 E-mail filtering products

Almost all e-mail readers have some filtering abilities. This report examines just three tools: two e-mail readers for MS Windows and one filtering tool for Unix. The two Windows MUAs are the ones commonly used in the University of Leeds computer clusters, Eudora and Pegasus. The Unix filtering tool is Procmail.

5.2.1 Eudora

Eudora allows the user to filter incoming mail, outgoing mail or both, either automatically or manually whenever the user wishes [13]. To define a new filter, the user selects the “Filters” option from the “Tools” menu.

Figure 8: Eudora’s “Filters” option

When the filtering form is displayed, the user can define the criteria for a new filter. The first decision is which header should be checked; one of the options is the body of the message. The user then enters a phrase or a few words that must appear in the chosen header in order to activate the filter. At this stage the user can also choose a second header to be checked, together with the way the two conditions are combined (“or”, “and”, “unless” etc.) to activate the filter. Once the user has decided which situation activates the filter, the next step is to decide what happens when an e-mail meets the filtering rules.

[13] QUALCOMM Incorporated (1999-2000), Tutorial: how to use filters, http://www/eudora.com/techsupport/tutorials/win_filters.html

Figure 9: Eudora’s “Filters” form

Eudora offers several actions that it can perform when an e-mail activates the filtering rules. This part of the program is updated in every new version of Eudora: new actions are added, as well as the ability to combine several actions in a single filtering rule. The options available at this stage in Eudora 4.3 are shown in the following figure.

Figure 10: Eudora’s “Actions” option

5.2.2 Pegasus

The filtering abilities of Pegasus are more extensive than those of Eudora: there are eight different types of rules [14]. To create a filtering rule, the user goes to the “Tools” menu and chooses, in succession, “Mail filtering rules”, “Edit new mail filtering rules” and then either “Rules applied when folder is opened” or “Rules applied when folder is closed”. The following figure displays this process.

[14] AIESEC Organization (2000), E-mail Management Tips: Filtering, http://www.aisec.org/help/filtering.html

Figure 11: choice of Pegasus filtering option

Two of the rule types, “Standard header match” and “Regular expression match”, are similar to the filtering rules of Eudora. Two others, “Message date” and “Message Age”, allow the filter to handle each e-mail according to its date. The actions that the filter performs when an e-mail meets the filtering rules are much the same as those of Eudora.

Figure 12: the Pegasus filtering option

5.2.3 Procmail

Procmail is a mail processing utility for Unix that can also filter e-mail. It was written by Stephen R. van den Berg. Its advantage is that it can process messages either as they arrive or after they have been stored in the mailbox [15], in contrast to the two previous programs and to this project, which process the messages after they have been stored in the maildrop. This significant difference stems from the fact that procmail is not a MUA like the other three but a Mail Delivery Agent (MDA) [16]. An MDA is the program that a server uses to deliver e-mail messages to the mailboxes of the system’s users. Some servers use their MTA (usually sendmail) as the MDA, while others use a pure MDA program such as procmail.

If procmail is not the MDA of the system, the first step is some configuration so that all e-mails can be processed by procmail: a .forward file has to be created so that each e-mail is forwarded to procmail. Next, a .procmailrc file has to be created. Each .procmailrc file is like a small program consisting of two parts, assignments and recipes [17]. The assignments set up variables that tell procmail where everything it needs is stored. The recipes are the part where the filtering is done. A recipe from the procmail man page follows:

:0
* ^Subject:.*Flame
/dev/null

Briefly, this recipe moves every e-mail whose Subject contains the word “Flame” to /dev/null, the “bit bucket” of Unix, which means that procmail deletes these e-mails.
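The condition line of the recipe is an ordinary regular expression applied to the header. The following Python sketch mimics that single test; it is not procmail itself, and the function name `is_discarded` and the sample header are assumptions made for the example.

```python
import re

# The condition "* ^Subject:.*Flame" is a regular expression applied to the
# header; procmail ignores case by default, hence re.IGNORECASE.
FLAME = re.compile(r"^Subject:.*Flame", re.IGNORECASE | re.MULTILINE)


def is_discarded(header):
    """Return True when the recipe would deliver the message to /dev/null."""
    return FLAME.search(header) is not None


header = "From: someone@a.domain\nSubject: Another flame war\n"
print(is_discarded(header))  # True
```

Because the pattern is anchored to the start of a line with `^Subject:`, the word appearing in any other header does not trigger the recipe.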

[15] Infinite Ink (2000), Procmail quick start, http://www.ii.com/internet/robots/procmail/qs/

[16] Jim Dennis (1997), Procmail Mini-Tutorial: Automated Mail Handling, http://www/linuxgazette.com/issue14/procmail.html

[17] Ian Soboroff (1997), Mail filtering with procmail, http://www.gl.umbc.edu/~ian/procmail

6. Design, implementation and evaluation of the project

6.1 Outline of the design

This project is a client-based e-mail reader, meaning that the user has to copy the necessary files into some directories and then run the “mail.pl” file in order to access his/her e-mails on the mail server. The e-mails are accessed through POP3, described in a previous chapter. The program is written in PERL so as to be platform independent, and it can run in both Unix and Windows environments (although some small changes are necessary before it can run in Windows (DOS)). The PERL module Mail::POP3Client, written by Sean Dowd, was used for the POP3 communication with the server. The design and implementation of the project were divided into two parts: the former is the creation of an e-mail reader and the latter the formation of filters for these e-mails. Each part comprises three stages: design, implementation and evaluation.

6.2 The creation of the e-mail reader

6.2.1 Design

The purpose of this project is the creation of an application firewall to filter out e-mails, so the first part is the creation of an e-mail reader into which the filters are later incorporated. Everyone who has used e-mail knows the basic principles of an e-mail reader: it has to perform some basic operations, including the composition, display and disposition of e-mails. The e-mail reader of this project performs two of these operations, display and disposition. It displays the e-mails and allows the user to delete those that he/she no longer wants to keep in the mailbox. The display operation has two tasks. The first is to display the list of e-mails, so that the user knows which e-mails exist in his/her mailbox; for each e-mail the list shows information such as its subject and its sender. The second task is to display some of the headers of a single e-mail, which can be the same ones shown in the list, together with its body. The disposition operation allows the user to delete messages, or to undelete messages already marked for deletion.

6.2.2 Implementation

The PERL module Mail::POP3Client by Sean Dowd was used for the implementation of the e-mail reader (for more information about this module see the Appendix No ??). While researching how an e-mail reader for filtering e-mails could be created, I took part in various discussion groups and newsgroups and at the same time searched the Internet for additional information. The majority of the other participants suggested writing a PERL script that uses the Mail::POP3Client module to retrieve the e-mails from the server. However, one tutorial on the Internet proposed the Net::POP3 module for retrieving e-mails [18]. The Net::POP3 module was already installed on the Unix machines, so only Mail::POP3Client had to be installed in my Unix account. A choice between the two modules had to be made; after a short examination Mail::POP3Client seemed to be the better one, so it was used in the project. The e-mail reader has four main options.

Figure 13: the four different options of the e-mail reader

It allows the user to see a list of their e-mails, to read a specific e-mail by giving its serial number in the list, to delete a specific e-mail, or to undelete an e-mail already marked as deleted.

[18] About.com (2000), How to retrieve e-mails, http://perl.about.com/compute/perl/library/weekly/aa022700a.html

The delete and undelete operations are performed within the main body of the program. A hash (associative array) called “deleted” is used for both. When the user wants to delete an e-mail, an entry for that e-mail is added to the hash; when the user wishes to undelete an e-mail already marked as deleted, its entry is removed from the hash. The e-mails marked as deleted are deleted from the mailbox when the user exits the program.

Two different subroutines are used for displaying the e-mail list and for retrieving a specific e-mail. The list is displayed by a subroutine called “header”. For each e-mail this subroutine displays its serial number in the list together with its From and Subject lines. If the e-mail has been marked as deleted, a label (the capital letter D) is displayed next to its serial number, so that the user is aware that this e-mail will be deleted from the mailbox.
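The bookkeeping described above can be sketched as follows. This is a Python illustration rather than the PERL code of the project; the function names and the `(sender, subject)` layout of each list entry are assumptions.

```python
deleted = {}   # plays the role of the "deleted" hash: key = serial number


def mark_deleted(number):
    """Add an entry for the e-mail, marking it for deletion."""
    deleted[number] = True


def undelete(number):
    """Remove the entry, which unmarks the e-mail."""
    deleted.pop(number, None)


def list_lines(mails):
    """Build one line per e-mail: serial number, optional D label, From, Subject.

    mails is a list of (sender, subject) pairs; this layout is an assumption.
    """
    lines = []
    for number, (sender, subject) in enumerate(mails, start=1):
        label = "D" if number in deleted else " "
        lines.append(f"{number:>3} {label} {sender}  {subject}")
    return lines
```

Nothing is sent to the server while the user marks and unmarks e-mails; the hash is only consulted when the program exits.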

Figure 14: displaying the e-mail list

The second subroutine, which displays a specific e-mail when the user gives its serial number, is called “retrieve”. It displays the From and Subject headers of the e-mail and also its body. If the e-mail is a multipart one, its body contains a plain-text part, if any, and the binary file(s), which are stored on the mail server as long stretches of unintelligible text after being encoded, usually with base64. A “boundary” string, found in the Content-Type header that MIME adds to the e-mail, separates the parts of the e-mail. So, whenever the subroutine finds that string in the body, it examines whether the following part is plain text; if so it displays the part, otherwise it displays the type of the part and the name of the file.
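The same behaviour can be sketched in Python. Note that this sketch swaps in a different technique: instead of searching for the boundary string by hand, as the “retrieve” subroutine does, it lets Python's standard `email` package split the parts. The sample message `RAW` and the function name `describe_parts` are invented for the example.

```python
from email import message_from_string

RAW = """From: craig@someone.com
Subject: report
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain

Hello, the file is attached.
--XYZ
Content-Type: application/octet-stream; name="report.exe"
Content-Transfer-Encoding: base64

TVqQAAMAAAAEAAAA
--XYZ--
"""


def describe_parts(raw):
    """Return the plain text of text parts and a summary of every other part."""
    out = []
    for part in message_from_string(raw).walk():
        if part.is_multipart():
            continue                      # the container itself has no content
        if part.get_content_type() == "text/plain":
            out.append(part.get_payload().strip())
        else:
            out.append(f"[{part.get_content_type()}: {part.get_filename()}]")
    return out
```

For the sample message, the text part is shown in full while the encoded attachment is reduced to its type and filename, which is the effect the thesis describes.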

Figure 15: displaying a specific e-mail

6.2.3 Evaluation of the e-mail reader

As mentioned before, the Mail::POP3Client module was used for the implementation of this part of the project. The following problem came up during the implementation: although the script could find the POP3Client module, and the code used to establish the connection with the mail server was the right one (it was the code given in the module itself), the connection could not be established. Various discussion groups and newsgroups were consulted, but initially none was of any particular help. Finally the solution came from the discussion group of the author of the module, Sean Dowd [19]. The three obligatory fields that, according to the manpage of the module, must be given in the constructor are “USER”, “PASSWORD” and “HOST”. All the other fields have default values, which apply when the fields are not mentioned in the constructor. One of these fields is “AUTH_MODE”. “The valid values for AUTH_MODE are “PASS” and “APOP”. APOP implies that an MD5 checksum will be used instead of passing your password in cleartext. However, if the server does not support APOP, the cleartext method will be used” [20]. The default value of that field is “PASS”, but with some servers there is a problem, and in these cases the constructor has to include the following line explicitly:

AUTH_MODE => 'PASS'

[19] Deja (Discussions>>comp.lang.perl.modules) (2000), Re: POP3Client, http://x70.deja.com/getdoc.xp?AN=644432010&CONTEXT=966869437.408289281&hitnum=1

[20] As mentioned in the Mail::POP3Client manpage.

After that change the connection with the server was established. For evaluation purposes, connections to different mail servers were attempted and the e-mail reader seemed to work properly. There is, however, a problem in this part of the project when the program is used on a WINDOWS (DOS) platform. When the program prompts the user for his/her password, the password must not be visible on the screen, for security reasons. There is a PERL module called Term::ReadKey that can be used for that purpose, but it could not be installed properly, so the Unix command “stty -echo” was used instead. That command does not work in DOS, and a similar DOS command could not be found. So when the program runs on a WINDOWS platform, the command has to be removed and the password will be visible.
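To make the difference between the two AUTH_MODE values more concrete, the following Python sketch computes the kind of MD5 digest that APOP sends in place of the cleartext password. It is an illustration only and not code from the project; the banner string and the password are hypothetical.

```python
import hashlib


def apop_digest(timestamp, password):
    """Return the hexadecimal MD5 digest that APOP sends in place of the password.

    timestamp is the banner string the server sends on connection, including
    the angle brackets; the digest covers the timestamp followed by the password.
    """
    return hashlib.md5((timestamp + password).encode("ascii")).hexdigest()


# Hypothetical banner and password, for illustration only.
digest = apop_digest("<1234.567890@mail.example.org>", "secret")
print(len(digest))  # 32
```

Because the server issues a new timestamp for each connection, the digest changes from session to session, whereas in PASS mode the same cleartext password crosses the network every time.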

6.3 Establishing the filters

6.3.1 Design

The majority of the e-mail filtering principles discussed in a previous chapter are included in the filtering rules of this project. The project contains six different types of filters, and it decides whether or not to quarantine a specific e-mail by taking the following factors into account (in the order in which the program applies them):

• The individual e-mail address of the sender.

• The domain from which the e-mail comes.

• The IP addresses that the headers of the e-mail contain.

• The attachments of the e-mail.

• The e-mail Subject line.

• The number of the individual receivers of this e-mail.

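[Flowchart: each e-mail passes through six checks in turn: the individual e-mail address, the domain, the IP addresses, the attachments, the Subject line and the number of the receivers. An e-mail that passes a check (OK) moves on to the next one and, after the last, is a wanted e-mail; an e-mail that fails any check (Not OK) is quarantined.]

Figure 16: the sequence of the six filters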

The first filter that the program applies concerns the individual e-mail address of the sender. The e-mail addresses found in all the e-mail headers (excluding “To:” and “Cc:”) are checked against an exclusion list of senders; if one of them belongs to the list, the e-mail is quarantined. The next filter is similar, except that this time the domains in the e-mail headers (again excluding “To:” and “Cc:”) are checked against a list of unwelcome domains.

The third filter of the program concerns the IP addresses contained in the headers of

an e-mail. Usually the IP addresses of the different MTAs involved in the

transportation of the e-mail along the way are added in the “Received:” header of the

e-mail. So, there is a list with some unwanted IP addresses and when an e-mail

contains one of these, it is quarantined.

The fourth filter decides whether an e-mail is dangerous according to its attachments, if any. If it contains files whose filenames are included in a list of dangerous files (worms, viruses and Trojan horses), the e-mail is kept in quarantine. The fifth filter checks whether the “Subject:” line of the e-mail contains suspicious words. There is a list of such words, and if the “Subject:” line contains one of them the e-mail is quarantined. The last filter monitors the number of individual receivers, and if this number exceeds a given limit the e-mail is isolated.

The last three filters try to isolate e-mails containing dangerous files. It is known that when such files, especially viruses, infect a computer, they send themselves automatically from the infected machine to other users whose e-mail addresses are stored in the mailbox of that computer. These automatically sent e-mails have a specific subject (depending on the virus), they are sent to many persons and they carry the hazardous files in their bodies. The last three filters deal with these suspicious e-mails.
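The idea of applying the filters one after another and quarantining at the first failure can be sketched in Python as follows. The names `first_failed_filter`, `BAD_SENDERS` and `MAX_RECEIVERS`, and the dictionary layout of an e-mail, are assumptions made for the example; only two stand-in checks are shown instead of the six.

```python
def first_failed_filter(mail, filters):
    """Run the filters in order; return the name of the first that rejects mail.

    filters is a list of (name, check) pairs, where check(mail) returns True
    when the e-mail passes. None means every filter accepted the e-mail.
    """
    for name, check in filters:
        if not check(mail):
            return name
    return None


# Two hypothetical checks standing in for the six described above.
BAD_SENDERS = {"claire@someone.com"}
MAX_RECEIVERS = 3

FILTERS = [
    ("sender", lambda mail: mail["from"] not in BAD_SENDERS),
    ("receivers", lambda mail: len(mail["to"]) <= MAX_RECEIVERS),
]

mail = {"from": "craig@someone.com", "to": ["me@someone.com"]}
print(first_failed_filter(mail, FILTERS))  # None
```

Keeping the filters in an ordered list makes the sequence explicit and lets later checks be skipped as soon as one of them rejects the e-mail.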

6.3.2 Implementation

The implementation of the filters is divided into two parts: the filters themselves, and the manipulation of the files containing the lists of unwanted IP addresses, e-mail addresses and domains, suspicious words and dangerous filenames.

6.3.3 The implementation of the filter

The six filters performed by the program are contained in a subroutine called “find_emails”. This subroutine checks each e-mail individually in order to decide whether it is wanted or not. The unwanted e-mails are not deleted from the mailbox immediately; instead they are quarantined, so that the user cannot see or access them. When the user exits the program, these e-mails are deleted from the mailbox. The quarantine is achieved with a small trick. There is an array called “deleted_f” whose length is the number of e-mails in the mailbox. (This array is different from the associative array (hash) called “deleted”, which contains the e-mails that the user wishes to delete.) Each element of the array refers to one e-mail in the mailbox. Note that array elements are counted from 0, whereas the e-mails in the mailbox are counted from 1, so element zero of the array refers to the first e-mail in the mailbox, and so on. The array is initialized at the beginning of the program, with all of its elements set to 0. After this initialization the e-mail reader checks all the e-mails to find any unwanted ones.

When an e-mail is considered unwanted, the filter sets the element of the “deleted_f” array that corresponds to this e-mail to 1. So, at the end of the process, all the unwanted e-mails have their corresponding element in the array set to 1.

[Flowchart: for each e-mail N contained in the mailbox (N stands for the number of the e-mail in the mailbox), is e-mail N an unaccepted one? Yes: quarantine the e-mail, deleted_f[N-1]=1, then check the next e-mail. No: check the next e-mail.]

Figure 17: the quarantine process

The e-mail reader uses this information for displaying the list of wanted e-mails and for accessing a specific (wanted) e-mail in order to retrieve or delete it. Whenever the user wishes to see the list of e-mails, the e-mail reader checks the element corresponding to each e-mail in the array. If it is 0, the reader displays the necessary information for this e-mail; otherwise it skips the e-mail.

Figure 18: list displaying process

As mentioned earlier, each entry in the list is preceded by a number that appears to be the serial number of the e-mail in the mailbox, but it is not: it is just a counter of the wanted e-mails. So whenever the user wishes to read, delete or undelete an e-mail, he/she gives the e-mail reader the number of this counter, and the program has to find the true number of the e-mail in the mailbox. Another array, called “real_mail”, is used for this purpose. For each wanted e-mail it stores the true serial number corresponding to the counter number that the user sees in the list.
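The two arrays can be illustrated with a short Python sketch (the project itself is in PERL). The function names, the test predicate and the sample subjects are invented; in particular, the zero-based layout of `real_mail` is an assumption of the sketch.

```python
def quarantine_flags(mails, is_unwanted):
    """Return the deleted_f array: 1 for an unwanted e-mail, 0 otherwise.

    Element N-1 refers to e-mail number N, because the mailbox counts from 1.
    """
    deleted_f = [0] * len(mails)
    for n, mail in enumerate(mails, start=1):
        if is_unwanted(mail):
            deleted_f[n - 1] = 1
    return deleted_f


def build_real_mail(deleted_f):
    """Map the counter shown to the user onto the true mailbox number.

    real_mail[0] is the mailbox number of the first wanted e-mail, and so on;
    the zero-based indexing of this list is an assumption of the sketch.
    """
    return [n for n, flag in enumerate(deleted_f, start=1) if flag == 0]


mails = ["hello", "ILOVEYOU", "minutes", "WIN CASH"]
flags = quarantine_flags(mails, lambda subject: subject.isupper())
print(flags, build_real_mail(flags))  # [0, 1, 0, 1] [1, 3]
```

In this example the user sees two e-mails numbered 1 and 2, and the second one is really message 3 in the mailbox.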

Let us now examine in more detail how each filter works, in the same order as before. The first filter checks the sender of the e-mail. It uses a text file containing the e-mail addresses to be rejected; the program reads them from this file and stores them in a hash. The e-mail reader then searches the headers for e-mail addresses. An e-mail address consists of a sequence of word characters (letters, digits and “_”) and “-”, followed by “@”, followed by at least two strings joined with a dot. The PERL regular expression that “grabs” the e-mail addresses from a header is:

/([-\w]{1,}\@{1}[-\w]{1,}\.{1}[-\w.]{1,})/g

[Flowchart for Figure 18: for each e-mail with number N contained in the mailbox, is e-mail N an unaccepted one (deleted_f[N-1]=1)? Yes: skip it and go to the next e-mail. No: print the necessary information for this e-mail and go to the next e-mail.]

When an e-mail address is found outside the “To:” and “Cc:” fields, it is looked up in the hash of rejected addresses. These two fields are excluded from the search because they show the receivers, not the sender. An accepted sender may send the same e-mail to us and to someone from whom we do not want to receive any e-mail, in which case an unwanted address appears in the “To:” or “Cc:” header. For example, suppose that Craig is an accepted sender whose e-mail address is craig@someone.com. Craig sends the same e-mail to me (me@someone.com) and to Claire (claire@someone.com), whose e-mail address is an unaccepted one. Both addresses are therefore included in the “To:” and “Cc:” headers; either of them may appear in either header. If we searched these two fields for unaccepted addresses, Claire’s address would be found and the e-mail would be quarantined, so we could not access it, although we would like to retrieve it because Craig is an accepted sender. That is why these two fields are not searched for rejected e-mail addresses.
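The Craig and Claire example can be reproduced with a small Python sketch. The regular expression is a transcription of the PERL one quoted earlier, on the assumption that the spaces printed inside its character classes are not part of the original; the function names are invented, and folded (multi-line) headers are not handled.

```python
import re

# Python transcription of the PERL expression (assumption: no spaces inside
# the character classes).
ADDRESS = re.compile(r"[-\w]+@[-\w]+\.[-\w.]+")


def sender_addresses(header_lines):
    """Collect the addresses from every header line except To: and Cc:."""
    found = []
    for line in header_lines:
        if line.lower().startswith(("to:", "cc:")):
            continue
        found.extend(ADDRESS.findall(line))
    return found


def is_rejected(header_lines, rejected):
    """True when any sender-side address is in the rejected set."""
    return any(addr.lower() in rejected for addr in sender_addresses(header_lines))


headers = [
    "From: Craig <craig@someone.com>",
    "To: me@someone.com",
    "Cc: claire@someone.com",
]
print(is_rejected(headers, {"claire@someone.com"}))  # False
```

Claire's address is on the rejected list, but because it only appears in “Cc:” the e-mail from Craig is not quarantined.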

The filter that searches for rejected domains is similar to the previous one. There is again a text file containing the rejected domains and the name of each one, and the e-mail reader stores this information in a hash. The domains are derived from the e-mail addresses, since the domain is the part that follows the “@”. So, whenever a domain is found in any e-mail header except “To:” and “Cc:”, it is checked to see whether it is a rejected one.

The third filter concerns the unwanted IP addresses. These are sorted and stored in a text file; the e-mail reader reads the IP addresses from this file and stores them in an array. When an IP address is found in any header, the program checks whether it belongs to the unwanted ones. To identify an IP address in the headers, the program searches for a string of four numbers, of one to three digits each, joined with dots. The PERL regular expression that “grabs” the IP addresses from a header is:

/(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/g
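A Python sketch of the IP address filter, using the same pattern, might look as follows. The function name, the sample “Received:” line and the addresses (taken from documentation ranges) are assumptions made for the example. Like the original expression, the pattern does not check that each number is at most 255.

```python
import re

IP_ADDRESS = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def unwanted_ips(header_lines, unwanted):
    """Return the IP addresses in the headers that appear in the unwanted list."""
    hits = []
    for line in header_lines:
        for ip in IP_ADDRESS.findall(line):
            if ip in unwanted:
                hits.append(ip)
    return hits


received = ["Received: from relay.example.org ([192.0.2.7]) by mx.example.net"]
print(unwanted_ips(received, ["192.0.2.7", "198.51.100.1"]))  # ['192.0.2.7']
```

An e-mail would be quarantined whenever the returned list is not empty.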

The next three filters check for dangerous e-mail threats. Two text files hold, respectively, dangerous filenames (files that are known threats) and suspicious phrases, for instance ILOVEYOU (this text appears in the Subject line of the e-mails that carry the virus of the same name). The entries of these files are also stored in two hashes. If the e-mail is a multipart one, the filename of an attachment can be found in the body of the message, following the string “name=” in a line that starts with “Content-Type:”. If the filename of the attachment is a known e-mail threat, the e-mail is quarantined. In the Subject line filter, the string following “Subject:” is compared with the suspicious words, and if it is one of them the e-mail is isolated. The last filter concerns the number of individual receivers of the e-mail: the e-mail addresses in the “To:” and “Cc:” fields are counted, and if their number exceeds a limit the e-mail is quarantined.
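Two of these checks, finding the attachment filename and counting the receivers, can be sketched in Python as follows. The regular expressions and function names are assumptions written for this illustration, not the expressions used in the PERL program, and counting distinct addresses is one possible reading of “individual receivers”.

```python
import re

ATTACHMENT_NAME = re.compile(r'^Content-Type:.*\bname="?([^";\r\n]+)"?',
                             re.IGNORECASE | re.MULTILINE)
ADDRESS = re.compile(r"[-\w]+@[-\w]+\.[-\w.]+")


def attachment_names(body):
    """Return the filenames given by name= on Content-Type: lines of the body."""
    return ATTACHMENT_NAME.findall(body)


def count_receivers(header_lines):
    """Count the distinct addresses found in the To: and Cc: header lines."""
    receivers = set()
    for line in header_lines:
        if line.lower().startswith(("to:", "cc:")):
            receivers.update(addr.lower() for addr in ADDRESS.findall(line))
    return len(receivers)
```

The returned filenames would then be looked up in the hash of dangerous filenames, and the receiver count compared with the configured limit.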

Figure 19: filtering of some e-mails

6.3.4 Manipulation of the files containing rejected e-mail and IP addresses, domains, suspicious words and dangerous attachments

As mentioned earlier, there are five text files containing the lists of rejected e-mail addresses, IP addresses, domains, suspicious words and dangerous attachments. The user is able to modify these files simply by following the prompts of the interface.

Figure 20: the manipulation of the files

The files are modified before the user logs in to the mail server. It has to be done this way because, once the user has logged in, a timeout defined by the POP3 protocol applies: if the client sends no command to the server for a certain time while the connection is established, the server aborts the connection. If the files were modified after the login, the user might send no command to the server while busy modifying the files, and the server would abort the connection.

6.3.5 Evaluation of the e-mail filter

After the filter had been implemented, it was tested and evaluated. Several people were asked to send me e-mails in order to test how well the filter works. During this evaluation stage two problems occurred in the filtering process, both related to the e-mail address filtering. The majority of MUAs and MTAs use the following format for e-mail addresses: “<someone@a.domain>”. However, it became clear during testing that this format is not obligatory. Some servers do not add the “<” and “>” characters before and after the e-mail address; for instance, the “uom.gr” server does not add them when an e-mail address belongs to a group and not to a specific person. These two characters were therefore removed from the pattern used to search for e-mail addresses. The second problem was that the filter could not identify an e-mail address containing a “-” character. I had not been aware that an e-mail address could contain this character, so it was not among the characters the filter allowed in an address. Once the filter ran into this problem, the “-” character was added to the allowed ones. After these two corrections the filter worked properly and isolated all the unwanted messages.
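The two corrections can be illustrated with a pattern of the following kind. It is a sketch only: the report does not list the exact set of allowed characters, so the character classes below (letters, digits, “_”, “.” and “-”) and the subroutine name `find_addresses` are assumptions.

```perl
use strict;
use warnings;

# Assumed address pattern. The "<" and ">" delimiters are deliberately not
# part of it, so both "<someone@a.domain>" and a bare "someone@a.domain"
# are found, and "-" is accepted on both sides of the "@".
my $ADDRESS = qr/[\w.-]+\@[\w-]+(?:\.[\w-]+)+/;

# Return every e-mail address found in a header line, lower-cased.
sub find_addresses {
    my ($line) = @_;
    my @found = $line =~ /($ADDRESS)/g;
    return map { lc($_) } @found;
}
```

Lower-casing the result is a design choice of the sketch; it lets an address be compared with the entries of the rejected list regardless of how the sender capitalised it.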

7. Future Improvements

The general purpose of this project was the design of a prototype Application Firewall able to filter out dangerous e-mails and the implementation of a program that performs this task. This purpose has been fulfilled. However, the program cannot be considered a fully working e-mail reader, and several improvements could make it better. Some of them are the following:

• As mentioned before, an e-mail reader has to perform three basic operations: composition, display and disposition of e-mails. The e-mail reader implemented for this project performs only two of them: it can display and dispose of e-mail messages. Consequently, to make it a complete e-mail reader, the ability to compose and send e-mail has to be added.

• A better user interface has to be designed. When the user asks for the list of e-mails or wants to read a certain e-mail, a fixed number of e-mails, or a fixed number of lines of the body of that e-mail, is displayed each time, and the user has to press the return key to see the rest. These numbers are predefined, constant and independent of the screen size, so when the screen is relatively small some information does not appear on it. There are two solutions to this problem. The first is to add a scroll bar at the edge of the window. The second is to make the number of e-mails and the number of body lines displayed at a time variable and based on the size of the screen, which can be obtained with readily available PERL modules. These modules could therefore be used to implement a better interface.

• Several text files contain the lists of unwanted elements such as e-mail addresses, domains etc. Anyone who wants to use this program also has to copy these files to his/her account. When the filter is installed in a new account these files have to be empty, so that no filters are active, but they still have to exist, because the program terminates when it cannot find any one of them. An improved version of the program could check whether these files exist and, if they do not, ask the user whether he/she wants to create them instead of terminating.

• The attachments of each e-mail are encoded before being sent, most often with the base64 encoding, which converts them into an unintelligible sequence of ASCII characters. They are therefore stored on the mail server as text. The e-mail reader that retrieves the e-mails containing these attachments has to decode them in order to convert them back to their binary form. This operation is not performed by the current program and could be included in a future version.
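Two of the improvements listed above are small enough to sketch: creating missing list files instead of terminating, and decoding base64 attachments. The following is an illustration under stated assumptions and not part of the delivered program: the file names in `@LIST_FILES` and the subroutine names are hypothetical, and the decoder relies on the MIME::Base64 module, which the project itself did not use.

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Hypothetical names for the five list files.
my @LIST_FILES = qw(addresses.txt ips.txt domains.txt words.txt attachments.txt);

# Create any missing list file as an empty file instead of terminating.
# $confirm is a code reference that asks the user and returns true or false.
sub ensure_list_files {
    my ($dir, $confirm) = @_;
    my @created;
    for my $name (@LIST_FILES) {
        my $path = "$dir/$name";
        next if -e $path;                  # already there, nothing to do
        next unless $confirm->($path);     # user declined to create it
        open my $fh, '>', $path or die "Cannot create $path: $!";
        close $fh;
        push @created, $name;
    }
    return @created;
}

# Decode a base64-encoded attachment body and save it in binary form.
sub save_attachment {
    my ($encoded_text, $out_path) = @_;
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    binmode $out;                          # no newline translation
    print {$out} decode_base64($encoded_text);
    close $out or die "Cannot close $out_path: $!";
    return;
}
```

In an interactive program the `$confirm` argument would print a question and read the answer from the keyboard; passing it in as a code reference keeps the file handling separate from the user interface.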

8. Conclusions

The aim of this project was to become acquainted with various aspects of e-mail, and more specifically with the fundamental principles of e-mail filtering. The objectives of the project were:

1. Specification and design of a prototype Application Firewall to filter out dangerous e-mails.
2. Identification of commercial components that will do some of it.
3. Development and demonstration of at least part of the design.
4. Development of an admin tool kit that will allow this to be customised.
5. Testing it to see how successful it is in rejecting bad but passing good e-mails.

The project can be divided into two parts. The first part was the necessary literature survey, including academic books, discussion forums and newsgroups, in order to become familiar with e-mail technology, e-mail filtering techniques and other security issues. The second part was the programming: the implementation of the e-mail reader, which is also responsible for the e-mail filtering. PERL was the programming language used to write the code. An existing PERL module, Mail::POP3Client by Sean Dowd, was used to retrieve the e-mails from the server.

The basic steps of the filtering process are the following:

• Check the e-mail address of the sender.

• Check the domain of the server.

• Check the different IP addresses in the e-mail header.

• Check the attachments, if any.

• Check the “Subject:” field for suspicious words.

• Check the number of individual recipients.

Separate text files store the lists of rejected e-mail addresses, IP addresses, domains, suspicious words and filenames of dangerous attachments. The user can modify these files, adding new entries or deleting existing ones, and so create a personal filter that reflects his/her own preferences.
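The first three checks in the list of steps above all reduce to looking up a header value in one of the user's lists. A minimal sketch follows; the list contents, the subroutine name `rejected_by`, the decision to take the domain from the “From:” address and the decision to scan every header line for IP addresses are assumptions made for illustration. The addresses used are reserved documentation values.

```perl
use strict;
use warnings;

# Hypothetical list contents; the program reads them from the text files.
my %rejected = (
    address => { 'spammer@bad.example' => 1 },
    domain  => { 'bad.example'         => 1 },
    ip      => { '192.0.2.7'           => 1 },
);

# Return the name of the first list that rejects the message, or '' if none does.
sub rejected_by {
    my @header = @_;    # header lines, one per element

    # Steps 1 and 2: sender address and its domain, taken from the From: line.
    for my $line (grep { /^From:/i } @header) {
        if ($line =~ /([\w.-]+)\@([\w.-]+)/) {
            my ($local, $domain) = (lc($1), lc($2));
            return 'address' if $rejected{address}{"$local\@$domain"};
            return 'domain'  if $rejected{domain}{$domain};
        }
    }

    # Step 3: every dotted-quad IP address that appears anywhere in the header.
    for my $line (@header) {
        while ($line =~ /\b(\d{1,3}(?:\.\d{1,3}){3})\b/g) {
            return 'ip' if $rejected{ip}{$1};
        }
    }
    return '';
}
```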

9. References

1. Ranum, M. J. and Curtin, M. (1998), Internet Firewalls Frequently Asked Questions, http://www.hideaway.net/texts/fwfaq.html
2. Khalid Al-Tawil and Ibrahim A. Al-Kaltham (1999), Evaluation and Testing of Internet Firewalls, International Journal of Network Management 9, pp. 135-149
3. Connected: An Internet Encyclopedia (April 1997), IP Address, http://noc.ucsc.edu/cie/Topics/23.htm
4. Chris Lewis (2000), IP 101: All About IP Addresses, http://www.networkcomputing.com/netdesign/ip101.html
5. MIT Kerberos team (2000), The Three Myths of Firewalls, http://web.mit.edu/kerberos/www/firewalls.html
6. Multipurpose Internet Mail Extensions (RFC 2045) (1996), http://andrew2.andrew.cmu.edu/rfc/rfc2045.html
7. Post Office Protocol - Version 3 (RFC 1939) (1996), http://www.faqs.org/rfcs/rfc1939.html
8. Scott Hazen Mueller (1999), What is spam?, http://spam.abuse.net/whatisspam.html
9. ZDNet (2000), Help & How-To: Trojan Horse, Virus or Worms, http://www.zdnet.com/zdhelp/stories/main/0,5594,2435378,00.html?chkpt=zdnnmoreon
10. ZDNet (2000), Help & How-To: Definition of a Worms, http://www.zdnet.com/zdhelp/stories/main/0,5594,2435378-3,00.html
11. QUALCOMM Incorporated (1999-2000), Tutorial: how to use filters, http://www.eudora.com/techsupport/tutorials/win_filters.html
12. AIESEC Organization (2000), E-mail Management Tips: Filtering, http://www.aisec.org/help/filtering.html
13. Infinitive Ink (2000), Procmail quick start, http://www.ii.com/internet/robots/procmail/qs/
14. Jim Dennis (1997), Procmail Mini tutorial: Automated Mail Handling, http://www.linuxgazette.com/issue14/procmail.html
15. Ian Soboroff (1997), Mail filtering with procmail, http://www.gl.umbc.edu/~ian/procmail
16. About.com (2000), How to retrieve e-mails, http://perl.about.com/compute/perl/library/weekly/aa022700a.html
17. Deja (Discussions>>comp.lang.perl.modules) (2000), Re: POP3Client, http://x70.deja.com/getdoc.xp?AN=644432010&CONTEXT=966869437.408289281&hitnum=1

10. Bibliography

1. Charlie Kaufman, Radia Perlman, Mike Speciner (1995), Network security: private communication in a public world, Prentice Hall
2. William Stallings (1995), Network and internetwork security: principles and practice, Prentice Hall
3. Andrew S. Tanenbaum (1996), Computer networks (3rd edition), Prentice Hall
4. Chris Hare, Karanjit Siyan (1996), Internet firewalls and network security (2nd edition), New Riders
5. Jalal Feghhi, Jalil Feghhi, Peter Williams (1999), Digital certificates: applied Internet security, Addison-Wesley
6. Larry L. Peterson & Bruce S. Davie (2000), Computer networks: a systems approach, Morgan Kaufmann
7. Marcus Gonçalves (1998), Firewalls complete, McGraw-Hill
8. Larry Wall, Tom Christiansen, and Randal L. Schwartz with Stephen Potter (1996), Programming Perl (2nd edition), O'Reilly & Associates

APPENDIX A: Experiences gained through the project

Having finished the project I undertook for the MSc programme, I would like to reflect on the effort made and on the degree to which the objectives of the project have been fulfilled.

My general impression is that I ended up satisfied with the final outcome, even though a more general field of research had been chosen initially: “Firewalls, encryption and other aspects of security”. This subject was dropped because, after discussing it with my new supervisor and after the short study I made to prepare the interim report, we both considered it too broad to be covered thoroughly. The comments made by the assessor agreed with this observation. We therefore redefined the project and decided on something more achievable, namely e-mail firewalls.

During the project I came across many problems, such as the difficulty of establishing the POP3 connection with the mail server. Because it was the first time I had worked on a project entirely on my own, such problems seemed quite significant; still, they were overcome quickly after I took part in various discussion forums and newsgroups. Their help was essential not only when a problem arose but also in finding bibliography.

Perhaps the most important aspect of this project was that it was the first time I had engaged so seriously and in such depth with matters of networks and Internet security. Hopefully, the knowledge and experience acquired will be beneficial for my further engagement with these matters in the future, this time professionally.

APPENDIX B: Project Objectives and Deliverables

School of Computer Studies MSC PROJECT OBJECTIVES AND DELIVERABLES

This form must be completed by the student, with the agreement of the supervisor of each project, and submitted to the MSc project co-ordinator (Mrs A. Roberts) by 7th April 2000. A copy should be given to the supervisor and a copy retained by the student. Amendments to the agreed objectives and deliverables may be made by agreement between the student and the supervisor during the project. Any such revision should be noted on this form.* At the end of the project, a copy of this form must be included in the Project Report as an Appendix.

NOTE: this form includes amendments to the Project Interim Report, added on 6 June 2000 as agreed with the supervisor, including changes that address the issues raised in the assessor's interim report.

Student: Elias Vouropoulos

Programme of Study: Distributed Multimedia Systems

Supervisor: Mr. Bill Whyte

Title of project: A prototype Application Firewall to filter out dangerous e-mails

External Organisation* : _______________________________________

* (if applicable)

AGREED MARKING SCHEME

Understand the Problem   Produce a Solution *   Evaluation   Write-Up   Appendix A   TOTAL
20                       40                     20           15         5            100

* This category includes Professionalism (see handbook)

OVERALL OBJECTIVES (continue overleaf if necessary):

1. Specify and design a prototype Application Firewall to filter out dangerous e-mails.

2. Identify commercial components that will do some of it.

3. Develop and demonstrate at least part of the design.

4. Develop an admin tool kit that will allow this to be customised.

5. Test it to see how successful it is in rejecting bad but passing good e-mails.

DELIVERABLE(s):

1. A project report.

2. The construction of an e-mail Application Firewall

APPENDIX C: Project Interim Report

School of Computer Studies MSC PROJECT INTERIM REPORT

Student: Elias Vouropoulos

Programme of Study: Distributed Multimedia Systems

Title of project: Firewalls, encryption and other aspects of security

Supervisor: Mr. Martyn Clark

External Company (if appropriate):

AGREED MARKING SCHEME

Understand the problem   Produce a solution *   Evaluation   Write-up   Appendix A   TOTAL %
20                       40                     20           15         5            100

Overall Objectives

1. Investigate the types of information and the different categories of users in SCS.
2. Investigate current systems and future requirements.
3. Examine the technical requirements in building web sites and develop appropriate personal skills.
4. Design and implement secure web service for SCS.

Deliverable(s):

1. A project report.
2. The construction of an appropriate web site.

Following are the comments of my assessor for the interim report:

Following are the comments of my supervisor for the interim report:

As discussed, we decided to re-define your project, in order to make it more achievable. Here is what we discussed:

• Specify and design a prototype Application Firewall to filter out dangerous e-mails.
• Identify commercial components that will do some of it.
• Develop and demonstrate at least part of the design.
• Develop an admin tool kit that will allow this to be customised.
• Test it to see how successful it is in rejecting bad but passing good e-mails.

APPENDIX D: The manpage of Mail::POP3Client

NAME

Mail::POP3Client - Perl 5 module to talk to a POP3 (RFC1939) server

SYNOPSIS

    use Mail::POP3Client;

    $pop = new Mail::POP3Client( USER     => "me",
                                 PASSWORD => "mypassword",
                                 HOST     => "pop3.do.main" );
    for( $i = 1; $i <= $pop->Count(); $i++ ) {
        foreach( $pop->Head( $i ) ) {
            /^(From|Subject):\s+/i && print $_, "\n";
        }
    }
    $pop->Close();

    # OR

    $pop2 = new Mail::POP3Client( HOST => "pop3.otherdo.main" );
    $pop2->User( "somebody" );
    $pop2->Pass( "doublesecret" );
    $pop2->Connect() || die $pop2->Message();
    $pop2->Close();

DESCRIPTION

This module implements an Object-Oriented interface to a POP3 server. It implements RFC1939 (http://www.faqs.org/rfcs/rfc1939.html).

EXAMPLES

Here is a simple example to list out the From: and Subject: headers in your remote mailbox:

    #!/usr/local/bin/perl

    use Mail::POP3Client;

    $pop = new Mail::POP3Client( USER     => "me",
                                 PASSWORD => "mypassword",
                                 HOST     => "pop3.do.main" );
    for ($i = 1; $i <= $pop->Count(); $i++) {
        foreach ( $pop->Head( $i ) ) {
            /^(From|Subject):\s+/i and print $_, "\n";
        }
        print "\n";
    }

CONSTRUCTORS

Old style (deprecated):

    new Mail::POP3Client( USER, PASSWORD [, HOST, PORT, DEBUG, AUTH_MODE] );

New style (shown with defaults):

    new Mail::POP3Client( USER      => "",
                          PASSWORD  => "",
                          HOST      => "pop3",
                          PORT      => 110,
                          AUTH_MODE => 'PASS',
                          DEBUG     => 0,
                          TIMEOUT   => 60,
                        );

* USER is the userID of the account on the POP server
* PASSWORD is the cleartext password for the user ID
* HOST is the POP server name or IP address (default = 'pop3')
* PORT is the POP server port (default = 110)
* DEBUG - any non-null, non-zero value turns on debugging (default = 0)
* AUTH_MODE - pass 'APOP' to attempt APOP (MD5) authorization. (default is 'PASS')
* TIMEOUT - set a timeout value for socket operations (default = 60)

METHODS

These commands are intended to make writing a POP3 client easier. They do not necessarily map directly to POP3 commands defined in RFC1081 or RFC1939, although all commands should be supported. Some commands return multiple lines as an array in an array context.

new( USER => 'user', PASSWORD => 'password', HOST => 'host', PORT => 110, DEBUG => 0, AUTH_MODE => 'PASS', TIMEOUT => 60 )
    Construct a new POP3 connection with this. You should use the hash-style constructor. The old positional constructor is deprecated and will be removed in a future release. It is strongly recommended that you convert your code to the new version.
    You should give it at least 2 arguments: USER and PASSWORD. The default HOST is 'pop3' which may or may not work for you. You can specify a different PORT (be careful here).
    new will attempt to Connect to and Login to the POP3 server if you supply a USER and PASSWORD. If you do not supply them in the constructor, you will need to call Connect yourself.
    The valid values for AUTH_MODE are 'PASS' and 'APOP'. APOP implies that an MD5 checksum will be used instead of passing your password in cleartext. However, if the server does not support APOP, the cleartext method will be used. Be careful.
    If you enable debugging with DEBUG => 1, messages about commands will go to STDERR.
    Another warning, it's impossible to differentiate between a timeout and a failure.

Head( MESSAGE_NUMBER )
    Get the headers of the specified message, either as an array or as a string, depending on context.
    You can also specify a number of preview lines which will be returned with the headers. This may not be supported by all POP3 server implementations as it is marked as optional in the RFC. Submitted by Dennis Moroney <dennis@hub.iwl.net>.

Body( MESSAGE_NUMBER )
    Get the body of the specified message, either as an array of lines or as a string, depending on context.

HeadAndBody( MESSAGE_NUMBER [, PREVIEW_LINES ] )
    Get the head and body of the specified message, either as an array of lines or as a string, depending on context.
    Example

        foreach ( $pop->HeadAndBody( 1, 10 ) ) {
            print $_, "\n";
        }

    prints out a preview of each message, with the full header and the first 10 lines of the message (if supported by the POP3 server).

Retrieve( MESSAGE_NUMBER )
    Same as HeadAndBody.

Delete( MESSAGE_NUMBER )
    Mark the specified message number as DELETED. Becomes effective upon QUIT. Can be reset with a Reset message.

Connect
    Start the connection to the POP3 server. You can pass in the host and port.

Close
    Close the connection gracefully. POP3 says this will perform any pending deletes on the server.

Alive
    Return true or false on whether the connection is active.

Socket
    Return the file descriptor for the socket.

Size
    Set/Return the size of the remote mailbox. Set by POPStat.

Count
    Set/Return the number of remote messages. Set during Login.

Message
    The last status message received from the server.

State
    The internal state of the connection: DEAD, AUTHORIZATION, TRANSACTION.

POPStat
    Return the results of a POP3 STAT command. Sets the size of the mailbox.

List
    Return a list of sizes of each message.

ListArray
    Return a list of sizes of each message. This returns an indexed array, with each message number as an index (starting from 1) and the value as the next entry on the line. Beware that some servers send additional info for each message for the list command. That info may be lost.

Uidl( [MESSAGE_NUMBER] )
    Return the unique ID for the given message (or all of them). Returns an indexed array with an entry for each valid message number. Indexing begins at 1 to coincide with the server's indexing.

Last
    Return the number of the last message, retrieved from the server.

Reset
    Tell the server to unmark any message marked for deletion.

User( [USER_NAME] )
    Set/Return the current user name.

Pass( [PASSWORD] )
    Set/Return the current password.

Login
    Attempt to login to the server connection.

Host( [HOSTNAME] )
    Set/Return the current host.

Port( [PORT_NUMBER] )
    Set/Return the current port number.

AUTHOR

Sean Dowd <pop3client@dowds.net>

CREDITS

Based loosely on News::NNTPClient by Rodger Anderson <rodger@boi.hp.com>.

SEE ALSO

perl(1).
