Copyright IBM Corporation 2010 Trademarks
Deep-protocol analysis of UNIX networks
Martin Brown
Freelance Writer
Author
08 June 2010
Whether you are monitoring your network to identify performance issues, debugging
an application, or have found an application on your network that you do not recognize,
occasionally you need to look deep into the protocols being used on your UNIX network to
understand what they are doing. Some protocols are easy to identify and understand, even
when used on non-standard ports. Others need more investigation to understand what they are
doing and what information they are exchanging. In this article, we will take a look at techniques
for performing detailed analysis of the protocols in use on your UNIX network.
Introduction
Networks have become so ubiquitous that in many cases we take the use of the network to
communicate with different machines, inside and outside of our network, for granted. Most of the
time this isn't an issue, but there are times when you need to take a closer look at your network
and find out what is going on.
There are a number of reasons to take a closer look at the contents of the network traffic. The first
is that you may simply be debugging an existing network application or one you are developing,
and want to monitor the traffic going past on your network. The second reason is to identify traffic
on your network that may be using up bandwidth and resources. For the former, you probably
already know the contents of the protocol, but you may want to get a more detailed look at the
actual data being transferred, for example, when using web services. For the latter, identifying the
contents of the packets requires some extensive knowledge of the protocols being used.
With both TCP/IP and UDP/IP communications, the key elements are the IP addresses used
to identify the hosts and the port number. The port number is used to provide additional
communication channels so that you can support multiple connections between two hosts. There
are some standards in the port definitions. For example, port 25 is for email (SMTP) traffic, and
most websites operate on port 80 (HTTP). These conventions are used to allow programs to
communicate with each other over a known channel in the same way as you would choose a
phone or fax number.
While these conventions exist, there is no limit or restriction on what ports you use. In fact, in many
cases, subversive network applications and some security methods will deliberately use non-standard
ports. For example, some will hide content by misusing a standard port with a different
protocol, like using HTTP over port 25. Other examples include using a different port from the
standard so that it is not obvious which port is being used for the traffic (like using port 99 for
HTTP), or by encapsulating specific protocol traffic within another protocol. This last method is
actually the one used by network tunneling and virtual private networks (VPNs).
Regardless of the reasons or complexities of the network traffic, the first step is always to start
recording the data.
Recording raw data
There are a number of different tools available if you want to record the raw network data so
that you can examine the information yourself. Most of the network sniffers will also decode and
decipher specific packet contents, which will help you when you want to study the content of a
recognized protocol.
Under Solaris you can use the snoop tool, or under AIX the iptrace tool. You can also try the cross-
platform tcpdump tool, which is supported on most UNIX and Linux operating systems. These
provide a combination of capturing and decoding, often performing the bulk of
the protocol analysis for you. Note that modern switches do not echo Ethernet packets
to every port, which often limits what you can capture to the traffic of the current host. Many
managed switches can be configured with a monitoring (mirror) port that carries a copy of all packets for exactly
this type of monitoring.
The primary complexity behind decoding network transmissions is the number of layers of information
contained within each network packet. In addition, much of this information is encoded in
binary format, and capturing pure raw packets off the network requires a significant amount of
work to pick out the data that you need. By using a tool that provides some of the processing, you
can simplify the process of decoding network data.
To give an example, on an Ethernet network when looking at a typical TCP/IP protocol, the data
transmitted over the network will include:
Ethernet packet headers, including the Ethernet source and destination address, packet size
and the Ethernet packet type.
IP header, consisting of the IP addressing (source and destination), protocol identity and IP
flags. You will also get information about the fragmentation and packet sequence.
TCP header, which includes information on the port, implied protocol, flags and sequencing
numbers.
Even with all this information, we still haven't hit the actual content. Beneath the TCP (or UDP)
protocol will be additional protocols, standard data protocols (including HTTP, SMTP and FTP), or
encapsulating protocols such as Remote Procedure Call (RPC) and the subtype of RPC such as
NFS.
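The layering described above can be made concrete with a short sketch. The following Perl fragment uses only core Perl (no CPAN modules). It builds an Ethernet frame by hand and then peels off the Ethernet, IP and TCP headers with unpack. The MAC addresses and the payload are invented for illustration, and the sketch assumes IPv4 over Ethernet with no VLAN tag; it is not how snoop or tcpdump are implemented.

```perl
use strict;
use warnings;

# Build a sample frame by hand. All values are illustrative assumptions.
my $eth = pack('H12 H12 n', '001122334455', '66778899aabb', 0x0800);
my $ip  = pack('C C n n n C C n a4 a4',
    0x45, 0, 47, 1, 0, 64, 6, 0,          # version/IHL, TOS, length, id, frag, TTL, proto, checksum
    pack('C4', 192, 168, 0, 112),         # source address
    pack('C4', 192, 168, 0, 2));          # destination address
my $tcp = pack('n n N N C C n n n',
    2049, 889, 1, 1, 5 << 4, 0x18, 32806, 0, 0);
my $frame = $eth . $ip . $tcp . 'payload';

# Layer 1: Ethernet header (14 bytes): destination, source, packet type.
my ($dst_mac, $src_mac, $type) = unpack('H12 H12 n', $frame);
my $ip_data = substr($frame, 14);

# Layer 2: IP header. The low four bits of the first byte give the
# header length in 32-bit words; protocol is at offset 9, addresses at 12.
my ($ver_ihl, $proto, $src_raw, $dst_raw) = unpack('C x8 C x2 a4 a4', $ip_data);
my $ip_hlen = ($ver_ihl & 0x0f) * 4;
my $src_ip  = join('.', unpack('C4', $src_raw));
my $dst_ip  = join('.', unpack('C4', $dst_raw));

# Layer 3: TCP header. The high four bits of byte 12 give the header
# length in 32-bit words; the following byte holds the flags.
my $tcp_data = substr($ip_data, $ip_hlen);
my ($src_port, $dst_port, $seq, $ack, $offset, $flags) = unpack('n n N N C C', $tcp_data);
my $tcp_hlen = ($offset >> 4) * 4;

# Whatever remains is the application data.
my $payload = substr($tcp_data, $tcp_hlen);

printf "Ethernet: %s -> %s, type 0x%04x\n", $src_mac, $dst_mac, $type;
printf "IP: %s -> %s, protocol %d, header %d bytes\n", $src_ip, $dst_ip, $proto, $ip_hlen;
printf "TCP: port %d -> %d, flags 0x%02x, header %d bytes\n", $src_port, $dst_port, $flags, $tcp_hlen;
printf "Payload: %s\n", $payload;
```

Each layer only needs to know its own header length to hand the remainder to the next decoder, which is why the tools can stack protocol decoders in this way.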
IP: Source address = 192.168.0.112, tiger.mcslp.pri
IP: Destination address = 192.168.0.2, bear.mcslp.pri
IP: No options
IP:
In Listing 3 you can see the TCP header. Again, this information is usually useful only for the
source port and destination port numbers, as these will either identify the expected protocol or
provide the information you need to further investigate traffic on this port.
Listing 3. TCP header
TCP: ----- TCP Header -----
TCP:
TCP: Source port = 2049
TCP: Destination port = 889 (Sun RPC)
TCP: Sequence number = 2834727685
TCP: Acknowledgement number = 2654368001
TCP: Data offset = 32 bytes
TCP: Flags = 0x18
TCP: 0... .... = No ECN congestion window reduced
TCP: .0.. .... = No ECN echo
TCP: ..0. .... = No urgent pointer
TCP: ...1 .... = Acknowledgement
TCP: .... 1... = Push
TCP: .... .0.. = No reset
TCP: .... ..0. = No Syn
TCP: .... ...0 = No Fin
TCP: Window = 32806
TCP: Checksum = 0x4852
TCP: Urgent pointer = 0
TCP: Options: (12 bytes)
TCP: - No operation
TCP: - No operation
TCP: - TS Val = 34449495, TS Echo = 253458642
TCP:
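The flag breakdown in the TCP header output can be reproduced with a few lines of Perl. This is a minimal sketch using core Perl only; the helper name tcp_flag_names is made up for this example. It maps each bit of the flags byte to its conventional abbreviation, in the same most-significant-first order used in the listing.

```perl
use strict;
use warnings;

# Bit masks for the eight TCP flag bits, most significant first.
my @flag_bits = (
    [0x80, 'CWR'], [0x40, 'ECE'], [0x20, 'URG'], [0x10, 'ACK'],
    [0x08, 'PSH'], [0x04, 'RST'], [0x02, 'SYN'], [0x01, 'FIN'],
);

# Return the names of all flags that are set in the given byte.
sub tcp_flag_names {
    my ($flags) = @_;
    return map { $_->[1] } grep { $flags & $_->[0] } @flag_bits;
}

my @set = tcp_flag_names(0x18);
print join(',', @set), "\n";    # prints "ACK,PSH"
```

The same arithmetic explains the data offset: the fixed TCP header is 20 bytes, and the 12 bytes of options shown in the listing bring the total to the reported 32 bytes.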
The penultimate section, Listing 4, shows the RPC header data.
Listing 4. RPC header data
RPC: ----- SUN RPC Header -----
RPC:
RPC: Record Mark: last fragment, length = 168
RPC: Transaction id = 3041181596
RPC: Type = 1 (Reply)
RPC: This is a reply to frame 63
RPC: Status = 0 (Accepted)
RPC: Verifier : Flavor = 0 (None), len = 0 bytes
RPC: Accept status = 0 (Success)
RPC:
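The record mark line in the RPC header can be decoded by hand. When RPC runs over TCP, each fragment is preceded by a four-byte, big-endian record mark: the top bit flags the last fragment and the remaining 31 bits carry the fragment length. The sketch below uses core Perl only and a hypothetical helper name, and it feeds in a hand-built value that corresponds to the listing rather than bytes taken from the capture.

```perl
use strict;
use warnings;

# Split a 4-byte RPC record mark into (last-fragment flag, length).
sub decode_record_mark {
    my ($bytes) = @_;
    my ($mark) = unpack('N', $bytes);            # 32-bit big-endian value
    my $last   = ($mark & 0x80000000) ? 1 : 0;   # top bit: last fragment
    my $length = $mark & 0x7fffffff;             # low 31 bits: fragment length
    return ($last, $length);
}

# 0x800000A8 = last-fragment bit set, length 0xA8 (168).
my ($last, $length) = decode_record_mark(pack('N', 0x800000A8));
printf "last fragment: %s, length = %d\n", ($last ? 'yes' : 'no'), $length;
```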
Finally, Listing 5 provides the content of the NFS packet, including the permissions (file mode),
file size, ownership and other information. In this case, the NFS operation requested is for the
filesystem statistics (triggered by the equivalent of an ls operation), hence the level of detail.
Listing 5. Content of NFS packet
NFS: ----- Sun NFS -----
NFS:
NFS: Proc = 18 (Get filesystem statistics)
NFS: Status = 0 (OK)
NFS: Post-operation attributes:
NFS: File type = 2 (Directory)
NFS: Mode = 0777
NFS: Setuid = 0, Setgid = 0, Sticky = 0
NFS: Owner's permissions = rwx
NFS: Group's permissions = rwx
NFS: Other's permissions = rwx
NFS: Link count = 24, User ID = 502, Group ID = 10
NFS: File size = 29, Used = 2560
NFS: Special: Major = 4294967295, Minor = 4294967295
NFS: File system id = 781684113418, File id = 4304616
NFS: Last access time = 28-Feb-10 15:49:51.042953989 GMT
NFS: Modification time = 25-Feb-10 09:39:07.965422590 GMT
NFS: Attribute change time = 25-Feb-10 09:39:07.965422590 GMT
NFS:
NFS: Total space = 759567510016 bytes
NFS: Available space = 659048374272 bytes
NFS: Available space - this user = 659048374272 bytes
NFS: Total file slots = 1288161604
NFS: Available file slots = 1287203856
NFS: Available file slots - this user = 1287203856
NFS: Invariant time = 0 sec
NFS:
In this case, we can see that the file being looked up was in fact a directory (see the File type line).
Although we do not get the actual path to the file, we could find the directory in question by using
find to look for the file or path with the corresponding inode number (see Listing 6).
Listing 6. Looking for a file with the corresponding inode number
$ find /scratch -xdev -inum 4304616
/scratch/installed/mysql-6.0.11
The best way to use these tools if you are trying to identify traffic is first to run them and collect as
much data as possible, and then manually examine the content looking for items that you don't
expect to see on your network.
Once you have identified suspicious traffic, you can then start to add specifications on the
command line to zero in on the detail of the traffic. For example, you can display only the
traffic to or from a given host using either of the commands shown in Listing 7.
Listing 7. Displaying only the traffic for a given host
$ snoop host 192.168.0.2
$ tcpdump host 192.168.0.2
To restrict things further, you can also specify the port: $ snoop host 192.168.0.2 and port 25
Parsing the raw data to understand the content
Another way to process the content from tcpdump is to save the raw network packet data to a file
and then process the file to find and decode the information that you want.
There are a number of modules in different languages that provide functionality for reading and
decoding the data captured by tcpdump and snoop. For example, within Perl, there are two
modules: Net::SnoopLog (for snoop) and Net::TcpDumpLog (for tcpdump). These will read the raw
data content. The basic interface of both modules is the same.
To start, you need to create a binary record of the packets going past on the network by writing
out the data to a file using either snoop or tcpdump. For this example, we'll use tcpdump and the
Net::TcpDumpLog module:
$ tcpdump -w packets.raw
Once you have amassed the network data, you can start to process its contents
to find the information you want. The Net::TcpDumpLog module parses the raw network data saved by
tcpdump. Because the data is in its raw binary format, parsing the information requires processing
this binary data. For convenience, another suite of modules, NetPacket::*, provides decoding of
the raw data.
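To show what the log-reading module has to deal with, here is a rough sketch of reading a capture file by hand with core Perl only. It assumes the classic libpcap file layout (a 24-byte file header followed by a 16-byte header in front of each packet) with the magic number 0xa1b2c3d4; newer formats such as pcapng are not handled. The function name parse_pcap and the in-memory sample capture are inventions for this example, not part of Net::TcpDumpLog.

```perl
use strict;
use warnings;

# Parse classic pcap data held in a string. Returns the link type and a
# reference to a list of [seconds, microseconds, packet bytes] records.
sub parse_pcap {
    my ($data) = @_;
    die "too short for a pcap file header\n" if length($data) < 24;

    # The magic number also reveals the byte order used by the writer.
    my $magic = unpack('N', $data);
    my ($u32, $u16);
    if    ($magic == 0xa1b2c3d4) { ($u32, $u16) = ('N', 'n'); }   # big-endian
    elsif ($magic == 0xd4c3b2a1) { ($u32, $u16) = ('V', 'v'); }   # little-endian
    else                         { die "not a classic pcap file\n"; }

    # File header: magic, version major/minor, zone, sigfigs, snaplen, link type.
    my (undef, $major, $minor, undef, undef, $snaplen, $linktype)
        = unpack("$u32 $u16 $u16 $u32 $u32 $u32 $u32", $data);

    my @packets;
    my $pos = 24;
    while ($pos + 16 <= length $data) {
        # Record header: timestamp seconds, microseconds, captured and original length.
        my ($sec, $usec, $incl, $orig) = unpack("${u32}4", substr($data, $pos, 16));
        $pos += 16;
        last if $pos + $incl > length $data;    # truncated final record
        push @packets, [$sec, $usec, substr($data, $pos, $incl)];
        $pos += $incl;
    }
    return ($linktype, \@packets);
}

# Build a tiny little-endian capture in memory for demonstration.
my $header = pack('V v v V V V V', 0xa1b2c3d4, 2, 4, 0, 0, 65535, 1);
my $record = pack('V4', 1275955200, 0, 4, 4) . 'abcd';
my ($linktype, $packets) = parse_pcap($header . $record);

printf "link type %d, %d packet(s), first is %d bytes\n",
    $linktype, scalar(@$packets), length($packets->[0][2]);
```

To try this on a real capture, read the file in binary mode (for example, with open and the '<:raw' layer) and pass its contents to the function. In practice the modules described here save you this work.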
For example, Listing 8 shows a simple script that prints out the IP address information for all of the
packets.
Listing 8. Simple script that prints out the IP address info for all packets
use strict;
use warnings;
use Net::TcpDumpLog;
use NetPacket::Ethernet;
use NetPacket::IP;

# Load the capture written by "tcpdump -w packets.raw"
my $log = Net::TcpDumpLog->new();
$log->read("packets.raw");

foreach my $index ($log->indexes)
{
    # Raw bytes of one captured packet
    my $packet = $log->data($index);
    my $ethernet = NetPacket::Ethernet->decode($packet);

    # 0x0800 is the Ethernet packet type for IP
    if ($ethernet->{type} == 0x0800)
    {
        my $ip = NetPacket::IP->decode($ethernet->{data});
        printf(" %s to %s protocol %s \n",
            $ip->{src_ip}, $ip->{dest_ip}, $ip->{proto});
    }
}
The first part is to extract each packet. The Net::TcpDumpLog module indexes the packets in the file, so
that we can read each packet by using its index. The data() method then returns the raw
data for the entire packet.
As with the output from snoop, we have to extract each of the blocks of data from the raw network
packet information. So in this example, we first need to extract the ethernet packet, including the
data payload, from the raw network packet. The NetPacket::Ethernet module does this for us.
Since we are looking for IP packets, we can check for IP packets by looking at the Ethernet packet
type. IP packets have an ID of 0x0800.
The NetPacket::IP module can then be used to extract the IP information from the data payload of
the Ethernet packet. The module provides the source IP, destination IP and protocol information,
among others, which we can then print.
Using this basic framework you can perform more complex lookups and decoding that do not rely
on the automated solutions provided by tcpdump or snoop. For example, if you suspect that there
is HTTP traffic going past on a non-standard port (i.e., not port 80), you could look for the string
HTTP on ports other than 80 from the suspected host IP using the script in Listing 9.
Listing 9. Looking for the string HTTP on ports other than 80
use strict;
use warnings;
use Net::TcpDumpLog;
use NetPacket::Ethernet;
use NetPacket::IP;
use NetPacket::TCP;

my $log = Net::TcpDumpLog->new();
$log->read("packets.raw");

foreach my $index ($log->indexes)
{
    my $packet = $log->data($index);
    my $ethernet = NetPacket::Ethernet->decode($packet);

    # Only IP packets (Ethernet type 0x0800)
    if ($ethernet->{type} == 0x0800)
    {
        my $ip = NetPacket::IP->decode($ethernet->{data});

        # Only packets sent by the suspected host
        if ($ip->{src_ip} eq '192.168.0.2')
        {
            # IP protocol 6 is TCP
            if ($ip->{proto} == 6)
            {
                my $tcp = NetPacket::TCP->decode($ip->{data});

                # Note: only the source port is tested, so requests sent
                # from an ephemeral port to port 80 also match. Test
                # dest_port as well to exclude them.
                if (($tcp->{src_port} != 80) &&
                    ($tcp->{data} =~ m/HTTP/))
                {
                    print("Found HTTP traffic on non-port 80\n");
                    printf("%s (port: %d) to %s (port: %d)\n%s\n",
                        $ip->{src_ip},
                        $tcp->{src_port},
                        $ip->{dest_ip},
                        $tcp->{dest_port},
                        $tcp->{data});
                }
            }
        }
    }
}
Running the above script on a sample packet set produced the output shown in Listing 10.
Listing 10. Running the script on a sample packet set
$ perl http-non80.pl
Found HTTP traffic on non-port 80
192.168.0.2 (port: 39280) to 168.143.162.100 (port: 80)
GET /statuses/user_timeline.json HTTP/1.1
Found HTTP traffic on non-port 80
192.168.0.2 (port: 39282) to 168.143.162.100 (port: 80)
GET /statuses/friends_timeline.json HTTP/1
In this particular case we're seeing traffic from the host to an external website (Twitter). Note
that these requests were sent to port 80; they match because the script tests only the source port.
Obviously, in this example, we are dumping out the raw data, but you could use the same basic
structure to decode the data in any format using any public or proprietary protocol structure. If
you are using or developing a protocol using this method, and know the protocol format, you could
extract and monitor the data being transferred.
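If the goal is to flag only HTTP that avoids the standard port at both ends of the connection, the port test can be tightened. The following is a minimal sketch using core Perl only; the function name is_nonstandard_http and the pattern used to recognize HTTP are assumptions for this example. It looks for an HTTP request line or status line at the start of the TCP payload, so it will only match the first segment of a request or response.

```perl
use strict;
use warnings;

# Return 1 when the payload starts like an HTTP request or response and
# neither end of the connection uses the standard port (80 by default).
sub is_nonstandard_http {
    my ($src_port, $dst_port, $payload, $standard) = @_;
    $standard = 80 unless defined $standard;

    # Traffic to or from the standard port is ordinary HTTP.
    return 0 if $src_port == $standard || $dst_port == $standard;

    # Request line ("GET /path HTTP/1.1") or status line ("HTTP/1.1 200 OK").
    return $payload =~ m{\A(?:[A-Z]+ \S+ HTTP/\d|HTTP/\d\.\d \d{3})} ? 1 : 0;
}

# Client request to port 80: ordinary HTTP, not flagged.
print is_nonstandard_http(39280, 80, "GET /index.html HTTP/1.1\r\n"), "\n";    # prints 0
# The same request sent to port 8080: flagged.
print is_nonstandard_http(39280, 8080, "GET /index.html HTTP/1.1\r\n"), "\n";  # prints 1
```

Within the packet loop, the function would be called with the src_port, dest_port and data fields of the decoded TCP packet.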
Using a protocol analyzer
Although, as already mentioned, tools like tcpdump, iptrace and snoop provide basic network
analysis and decoding, there are GUI-based tools that make the process even easier. Wireshark is
one such tool that supports a vast array of network protocol decoding and analysis.
One of the main benefits of Wireshark is that you can capture packets over a period of time (just
as with tcpdump) and then interactively analyze and filter the content based on the different
protocols, ports and other data. Wireshark also supports a huge array of protocol decoders,
enabling you to examine in minute detail the contents of the packets and conversations.
You can see a basic screenshot of Wireshark, listing all of the packets of all types,
in Figure 1. The window is divided into three main sections: the list of filtered packets, the decoded
protocol details, and the raw packet data in hex/ASCII format.
Figure 1. Wireshark interface
As an example of the level of information and decoding that is provided by the Wireshark tool,
while writing this article I noticed that there were some error packets being returned by one of the
MySQL servers on the network.
To zero in on the content, I first applied the MySQL filter to the output. You can do this
by typing an expression (like those provided to tcpdump, snoop or iptrace) into the Filter box.
Alternatively, you can click the Expression button and choose the filter from the built-in list. You can
see a sample of the filters available in Figure 2. Once you have chosen the filter, click Apply to
filter the packet list.
Figure 2. Choosing a Wireshark filter
By filtering on the MySQL protocol, I was able to identify the error packets. The MySQL protocol
returns a specific packet type with the error information. In this case, error 1242 means that the
query execution failed because there was a problem in a subquery. You can see the MySQL
protocol content details by expanding the MySQL protocol section of the Wireshark window, as
seen here in Figure 3.
Figure 3. Examining a MySQL error packet
Here we can see the detail of the error. By tracking back to the previous 'Request Query' packet, it
is possible to determine the query that triggered the error response (Figure 4).
Figure 4. The MySQL query that triggered the error response
By drilling down into the packets, I could identify a problem with the code I hadn't previously
noticed and identify both the error and the query that triggered the problem.
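As an illustration of what a protocol decoder does with such an error packet, the sketch below unpacks a MySQL error packet by hand using core Perl only. It assumes the layout used by the 4.1 and later client/server protocol: a three-byte little-endian payload length, a one-byte sequence number, a 0xFF marker byte, a two-byte little-endian error code, an optional '#' followed by a five-character SQL state, and then the message text. The sample packet is built by hand for this example and is not taken from the capture discussed here; the function name is also invented.

```perl
use strict;
use warnings;

# Decode a MySQL error packet. Returns a hash reference, or nothing if
# the payload does not start with the 0xFF error marker.
sub decode_mysql_err {
    my ($packet) = @_;

    # Packet header: 3-byte little-endian payload length, 1-byte sequence.
    my ($len_lo, $len_hi, $seq) = unpack('v C C', $packet);
    my $length  = $len_lo | ($len_hi << 16);
    my $payload = substr($packet, 4, $length);

    # Error payload: marker byte, then the 2-byte error code.
    my ($marker, $code) = unpack('C v', $payload);
    return unless defined $marker && $marker == 0xff;

    # Optional SQL state ('#' plus five characters), then the message.
    my $rest = substr($payload, 3);
    my $sqlstate = '';
    if ($rest =~ s/\A#(.{5})//s) {
        $sqlstate = $1;
    }
    return { seq => $seq, code => $code, sqlstate => $sqlstate, message => $rest };
}

# Hand-built sample: error 1242 with the SQL state and message text that
# the MySQL documentation lists for it.
my $body   = pack('C v', 0xff, 1242) . '#21000' . 'Subquery returns more than 1 row';
my $sample = pack('v C C', length($body) & 0xffff, length($body) >> 16, 1) . $body;

my $err = decode_mysql_err($sample);
printf "error %d (%s): %s\n", $err->{code}, $err->{sqlstate}, $err->{message};
```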
Because Wireshark supports such a wide variety of protocols and filters, you can get this level of
detail for most of the traffic on your network. Another common use is to monitor the exact content of detailed protocols, such as
web services. Figure 5 shows the detailed (and structured) content from a SOAP request used to
log status information.
Figure 5. Looking at the details of SOAP web service request
This kind of detail can be invaluable when trying to debug any network protocol that you are using.
Another useful feature is that Wireshark can work with live traffic as well as record
it for later filtering and processing. This means that you can capture specific
periods of suspicious traffic and then drill down into the information at your leisure to
find out exactly what was occurring on your network.
Summary
Protocol analysis of the information going across the wire of your UNIX network can be a
complex process. However, with the combination of some simple and widely available tools, you
can decode and examine the details of your network traffic from the basics of the source and
destination through to the specific protocol and data being exchanged.
As shown in this article, using tools like tcpdump, snoop or iptrace, you can extract a wide range
of data at the command line. With tools like Wireshark, you can go even deeper and get more
detailed information on a much wider range of protocols and content. For custom protocols and
data structures, you can use Perl to extract the raw data and get all the information you need.
About the author
Martin Brown
Martin Brown has been a professional writer for more than seven years. He is the
author of numerous books and articles across a range of topics. His expertise
spans myriad development languages and platforms -- Perl, Python, Java,
JavaScript, Basic, Pascal, Modula-2, C, C++, Rebol, Gawk, Shellscript, Windows,
Solaris, Linux, BeOS, Mac OS X and more -- as well as Web programming,
systems management, and integration. He is a Subject Matter Expert (SME) for
Microsoft and regular contributor to ServerWatch.com, LinuxToday.com, and IBM
developerWorks. He is also a regular blogger at Computerworld, The Apple Blog, and
other sites. You can contact him through his Web site.
Copyright IBM Corporation 2010
(www.ibm.com/legal/copytrade.shtml)Trademarks
(www.ibm.com/developerworks/ibm/trademarks/)