6
IS&T’s PICS Conference 427 IS&T’s 1998 PICS Conference 427 IS&T’s 1998 PICS Conference Copyright 1998, IS&T

IS&T’s PICS Conference IS&T’s 1998 PICS …...um b ers are used b y the serv er to sp ecify up date targets. CGI connection with the standard h ttp proto col, whic h is a stateless

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: IS&T’s PICS Conference IS&T’s 1998 PICS …...um b ers are used b y the serv er to sp ecify up date targets. CGI connection with the standard h ttp proto col, whic h is a stateless

Server-Side Image Processing using the World

Wide Web

Daniel Tretter and Andrew Patti

Hewlett-Packard Laboratories

Palo Alto, CA 94304

IS&T’s PICS ConferenceIS&T’s 1998 PICS ConferenceIS&T’s 1998 PICS Conference Copyright 1998, IS&T

Abstract

In many image management and image processing sce-narios, it is advantageous to have a central repository forstoring and processing the images, while users can inter-act with the images from multiple remote locations. Thisclient-server architecture allows the system resources andimage data to be concentrated in a single location, whichsimpli�es system maintenance and dissemination of thedata. In this paper, we describe a system in which aweb browser is used to access and manipulate image dataon a remote server. The images are stored in multipledatabases on the server. Clients can navigate and alterthese databases as well as select and process individualimages. Client interaction is managed through a Java ap-plet, while CGI programs on the server perform the nec-essary calculations and data manipulations. The clientand server communicate through CGI protocols and asecondary socket connection. The architecture supportsmultiple clients, with changes initiated by any individ-ual client immediately propagating to all a�ected clients.This paper will describe the system in detail and discussits advantages and disadvantages.

Introduction

In the past few years, the world wide web has gainedgreat popularity as a vehicle for distributing informa-tion, including both text and images. As a result, clientbrowsers are widely available for most computing plat-forms, and many computer users already have signi�cantexperience working with them. We would like to leveragethis existing codebase and the corresponding well-de�nedcommunications protocols to construct a distributed im-age processing system, where multiple clients interactwith images that are stored and manipulated on a server.We limit ourselves to a system architecture that does notrequire any special client software (other than a com-mercially available web browser) or any speci�c clientplatform. In this way the system is easily accessible andmore likely to attract users.

4242

Our system concentrates the processing on the serverrather than on the client platforms. This approach doesnot require any assumptions about the processing capa-bilities of the client machine, so the system will work forso-called \thin clients", which have very little processingpower. Concentrating the processing on the server alsogives us full control over the processing software. Onlya single version of the code is in use, so it can be easilyupdated or modi�ed without having to be redistributedto all the users. This ensures that all clients are usingthe same version of the code at all times since no codeis stored locally on the client.

Unfortunately, server-side processing also has severaldrawbacks. Since the processing is concentrated on asingle machine, the system is not scalable. Even a verypowerful server can be overwhelmed if the number ofactive clients grows too large. Also, for security and pri-vacy reasons the server is not able to access a client's�le system. If a client wants to process a local images�le, that �le must �rst be uploaded to the server. Theserver then processes the �le, and the output image isdownloaded by the client. The uploading and subse-quent downloading can add signi�cant overhead to theprocessing task, particularly if multiple images or largeimages are being processed.

Distributed imaging systems using the world wideweb have been developed by several researchers. Meng,Chang, and their associates at Columbia University havedeveloped a distributed system for searching and editingvideo databases[3, 5]. Their system, like ours, uses Javaon the client side for maximum portability, but also re-quires a Netscape plugin to be installed on the client.Bamberger proposes a system for distance learning thatallows image processing to be speci�ed by a client via theworld wide web. His approach uses CGI scripts on theserver and HTML forms on the client side[1]. Berman etal. propose a distributed system to allow clients to accessa database of X-ray images[2], and Hasegawa et al. de-vise a multimedia database that allows users to requestimage processing via the web and view the results[4].

We will next discuss the architecture of our system,

77

Page 2: IS&T’s PICS Conference IS&T’s 1998 PICS …...um b ers are used b y the serv er to sp ecify up date targets. CGI connection with the standard h ttp proto col, whic h is a stateless

IS&T’s PICS ConferenceIS&T’s 1998 PICS ConferenceIS&T’s 1998 PICS Conference Copyright 1998, IS&T

Figure 1: CGI scripts on the server communicate with aJava applet on each client. The data and processing re-side on the server, while the user controls the processingthrough the applet.

and then we go through a brief example interaction toillustrate how the system operates. A more detailed de-scription of our system can be found in [7].

Architecture

System Overview

Figure 1 illustrates our overall system architecture.We use CGI (Common Gateway Interface) Perl scriptson the server to perform the necessary server-side pro-cessing. CGI allows the web server to communicate withother programs on the server.

In typical CGI systems, a user enters information intoan HTML form and submits it to the server through aweb browser. The web server passes the form data onto CGI programs on the server that process the dataand generate a result, often an HTML page, that getssent back to the client's web browser. Our system usesa Java applet on the client in place of HTML forms tohandle user input and interpret server responses. Javaallows the user to view and interact with the images inways not possible through HTML. In addition, Java al-lows the use of a secondary sockets connection, which al-lows the server to initiate contact and update each clientwhen server status changes. This second connection willtypically fail if the client is connecting through a proxy,so our system will usually not operate properly across a�rewall.

Our system stores images in simple databases, whichwe represent as �lmstrips, with each frame of the �lm-strip a di�erent image. Images and databases are storedand maintained on the server. Users can access and mod-ify existing databases, create new databases, or processindividual images in a database. Images can be includedin multiple databases simultaneously, and databases cancontain any number of images. Users can only accessimages through a database, so all images must be in at

4242

least one database. When a user creates a new imageeither by processing an existing image or by uploadingan image from the client, the new image must be addedto a database immediately if it is to be kept on the serverfor future user access.

Since we allow multiple simultaneous users, we en-counter many of the same problems seen in any dis-tributed database application. We need to track eachclient's activity and current state, keep clients up to datewhen other clients make database changes, and preventcon icts from occurring when multiple clients access thesame database simultaneously. These tasks are the re-sponsibility of the server, which handles all client inter-actions and updates the clients using a second socketsconnection outside the CGI protocol. These issues arecovered in more detail below, where we discuss the serverresponsibilities.

One issue we do not address in our system is the prob-lem of security. The normal web protocols and Java pro-vide some basic security (the Java applet cannot accessthe client �lesystem, users cannot modify CGI scripts,etc.), but a number of security problems still exist. Froman individual user's point of view, we would like to im-plement protections to prevent other users from modi-fying or even accessing user-created databases and im-ages without permission. This would probably requiresome sort of password protection for both individual im-ages and entire databases, perhaps in conjunction withNetscape cookies to allow automatic recognition of users.Similarly, the server may want to maintain some read-only databases, and each user should be limited to a pre-de�ned amount of disk space on the server. Otherwise, auser could upload many very large images and overloadthe server. These issues, although not insurmountable,have not been addressed in our existing system.

Server Side

Server-side responsibilities fall into three major cate-gories, and the software code structure re ects this divi-sion. The three categories for which the server is respon-sible are client handling, image and database handling,and image processing. The server software de�nes sep-arate Perl classes for clients, databases, and processingalgorithms. Each client or image database is a separateinstance of the corresponding class, while the processingparameters and algorithm choice are carried out usingthe processing class.

Client handling includes storing and tracking multi-ple client states, communicating with the clients, andpreventing client collisions from occurring. When a newclient initiates a session, the server assigns a unique iden-ti�er and communicates this number to the client. Eachsubsequent request from the client will be prefaced by theclient with the identi�er, so the server can automaticallyidentify the request with the client. This identi�cationmechanism is necessary because client requests use the

88

Page 3: IS&T’s PICS Conference IS&T’s 1998 PICS …...um b ers are used b y the serv er to sp ecify up date targets. CGI connection with the standard h ttp proto col, whic h is a stateless

IS&T’s PICS ConferenceIS&T’s 1998 PICS ConferenceIS&T’s 1998 PICS Conference Copyright 1998, IS&T

Figure 2: The server communicates with each clientthrough both a CGI connection and a separate sock-ets connection. The client identi�cation numbers areused to associate CGI requests with clients, while the IPaddresses and port numbers are used by the server tospecify update targets.

CGI connection with the standard http protocol, whichis a stateless protocol. The server maintains no informa-tion between transactions, so each request must standalone, and any tracking required of the server needs tobe implemented explicitly.

When the server generates an identi�cation numberfor a new client, it also creates a state �le for that client.In this �le the server stores information identifying thedatabase being used by the client, the current knowl-edge of the client, and the last time this client contactedthe server. The active database and current knowledgeof the client are needed so the server knows when theclient needs to receive an update (client knowledge isstale) and so the server can correctly interpret client re-quests regarding individual database images (i.e. addnew image to the active database, etc.). The contacttime is used by the server to determine when a client isno longer active. If no requests have been received fora certain period of time, the client is declared inactiveand its identi�cation number is cleared for use by a newclient. This approach is similar to the one used in [2].

Figure 2 illustrates how the server identi�es and com-municates with multiple clients. The unique client iden-ti�cation number allows the server to keep track of eachclient, but this mechanism only works if the client ini-tiates the contact. Occasionally, the state of the serveritself will change. For instance, when a client modi�esan image database, other clients need to be informedof this change. The server cannot initiate a connectionthrough the standard web protocols (this sort of con-nection requires what is referred to as a push channel).We therefore open a separate sockets connection betweenthe server and the applet to allow this. Each client'sIP address is stored by the server, and the clients areeach assigned a port on which to \listen" for updates.The port number must be assigned dynamically by theserver in case multiple clients share an IP address, as il-lustrated in Figure 2. If too many clients share the sameIP address, the probability of a port con ict may become

4242

unacceptably high.

Client collisions are prevented using a simple \�rstcome, �rst served" strategy. When a client is in the pro-cess of modifying a database, the database is locked toprevent other clients from accessing it. Other clients us-ing that database are then updated through the socketsconnection and the database is unlocked for access. Inthis way, only one client at a time can be modifying anygiven database.

The server is also responsible for handling the imagesand databases. Whenever a new image is generated oracquired from a client, two scaled copies are automati-cally generated and stored on the server. A thumbnailversion is used to represent the image in the database�lmstrip, and a somewhat larger view image is displayedby the applet when the user requests the image from thedatabase. The original image, which may be of higherresolution than either of the others, is used for process-ing and downloading to the client if desired. In additionto the multiple image resolutions, the server also keepstrack of how many databases are using each image. If animage is ever removed from the only database that cur-rently includes it, it is automatically deleted from theserver since clients will no longer be able to access it.

Our system allows a user to specify a processing al-gorithm to be performed on the chosen image. The pro-cessing occurs on the server, which stores the output im-age in a temporary location until the user decides whatto do with it. The user can choose to replace the origi-nal image with the processed image, either in the activedatabase only or in all databases that contain the image.The user can also choose to download the processed im-age to the client, add it to the active database, delete it,or process it further.

The processing module on the server has been de-signed to allow new algorithms to be added easily with-out signi�cantly changing the code. All processing al-gorithms are assumed to take a single input image andproduce a single output image. The client is allowed tospecify a variety of processing parameters, including in-teger, oating point, and binary parameters as well asimage regions.

Client Side

The client side is responsible for allowing the user tointeract with the database and providing a reasonablemeans of specifying image processing operations that willbe carried out on the server. These tasks necessitatea reasonably functional GUI to be implemented at theclient side. In addition, the client must be able to acceptupdates regarding the server state that may occur at anytime. Lastly, the client must provide a mechanism foruploading to the server images that reside on the client.This task is especially challenging because Java does notallow direct access to the client local storage.

The client architecture that provides required func-

99

Page 4: IS&T’s PICS Conference IS&T’s 1998 PICS …...um b ers are used b y the serv er to sp ecify up date targets. CGI connection with the standard h ttp proto col, whic h is a stateless

IS&T’s PICS ConferenceIS&T’s 1998 PICS ConferenceIS& IS&T

Figure 3: Client Java applet classes (boxes) in relation to the server and HTML forms (ellipses).

T’s 1998 PICS Conference Copyright 1998,

tionality is depicted in Figure 3. In the �gure, Javaclasses are depicted by rectangular boxes, while non-Javaentities such as HTML forms and server CGI scripts aredepicted with ellipses. The client functionality is dividedinto classes in the following manner. A local database ismaintained at the client via the \CGI Manage + ClientDBase" class. This class manages all CGI-based con-nections to the server and can be thought of as a localserver, since the GUI classes act as clients that requestthe local data.

The \Socket Manager" class maintains a thread whichmonitors a TCP port for updates from the server. Whenthe server contacts the client to provide an update, thisclass forwards the information to the \CGI Manage +Client DBase" class in the form of a Java InputStream.This enables the client-side data to be updated and main-tained in a single place.

The \Server DBase GUI" class presents the user withthe capability to select an image strip for viewing, man-age the strip display, initiate an image upload to thestrip, and select an image for viewing and manipulationat full resolution. The image strip is composed of thumb-nails loaded from the server into the client database. Todisplay the strip, a separate thread is managed that con-tinuously scrolls the strip of images. The user is alsoprovided a scrollbar to directly control strip scrolling.To allow client image upload, this class initiates a CGIrequest to the server, but uses the Applet Context toredirect the server response to an HTML form that re-sides in a separate HTML frame on the page containingthe applet. Because the form does have access to thelocal storage this Java-based system therefore allows forclient upload. One last note is that this class is the JavaApplet derived class that the browser launches.

The last client class, \Image Processing GUI", al-lows for viewing and manipulation of the full sized im-age. This class is launched by the \Server DBase GUI"in response to a request to view an image in the strip, orwhen the server has �nished applying an image process-ing task and the image is to be presented to the user. The

4343

GUI allows the user to select a processing algorithm andprovide pixel-oriented input such as selecting speci�c re-gions of the image to apply the processing to. The actualrequest for processing on the server is directed throughthe \CGI Manage + Client DBase" class. Upon comple-tion of the processing, the client is alerted via the TCPsocket connection and a new viewer containing the pro-cessed output image is launched by the \CGI Manage +Client DBase" class.

Case Study

Now that we have described the architecture of our sys-tem, we will go through an example interaction to illus-trate its operation. Figure 4 illustrates a typical session.We will follow this general outline in our case study.

Suppose a new client accesses the system by enteringthe appropriate URL in a web browser. The browser en-counters and downloads the Java applet automatically.At this point, the browser has instantiated the \ServerDBase GUI" class. The applet begins to run on the clientmachine, establishing a CGI-based connection with theserver. This is carried out by the \CGI Manage + ClientDBase" class. The server will generate a unique identi-�cation number and a port number for this client. Thisinformation, along with a list of image database choicesand an initial default database, will be sent back to theclient. The server also creates a state �le for this client,storing the same information just sent to the client plusa time stamp to indicate the last time of contact withthis client. Before generating an identi�cation numberfor the client, the server checks the time stamps for allexisting clients and removes those clients that haven'tmade contact recently enough.

The client receives the information from the serverthrough the CGI connection and stores it locally. Theapplet opens a TCP socket connection on the speci�edport and starts up a thread to \listen" for server updates.This is accomplished by instantiating the \Socket Man-

00

Page 5: IS&T’s PICS Conference IS&T’s 1998 PICS …...um b ers are used b y the serv er to sp ecify up date targets. CGI connection with the standard h ttp proto col, whic h is a stateless

IS&T’s PICS ConferenceIS&T’s 1998 PICS ConferenceIS&T’s 1998 PICS Conference Copyright 1998, IS&T

Figure 4: This diagram indicates the communications for a typical client-server session. Note that the client sendsits identi�er with each request.

ager" class. It also gets the thumbnail images for theinitial database from the server and generates a screenfor the user showing the database selections and the ini-tial database. Figure 5 shows a screen shot of what theuser sees at this point.

Suppose the user selects a di�erent database from thelist. The \Server DBase GUI" class passes this request tothe "CGI Manage + Client DBase" class, which contactsthe server and updates the local database. Upon receiv-ing the request from the applet, the server responds witha list of images contained in this database. The appletthen retrieves thumbnails of the indicated images fromthe server and updates the user display. The server up-dates its state �le for this client to re ect the databasechange.

Now suppose the user selects an image from the data-base by clicking on one of the thumbnails in the �lmstrip.The applet retrieves the higher resolution version of theimage from the server and launches the \Image Process-ing GUI" to display it in a separate window. The usermight then elect to use the mouse to select an imageregion for processing. Figure 6 shows a screen shot ofwhat the user might see at this point. In this example,we are using a redeye reduction processing algorithm[6],so we have selected an eye region. For some processingalgorithms, the user might select parameters other thanan image region. After selecting a processing algorithmand parameter values, the user inputs a \process image"command.

The processing request will be passed to the serverthrough the \CGI Manage + Client DBase" class, andthe request will include the input image name, the cho-

4343

sen algorithm, parameter values, and its client identi�er.The server responds �rst by acknowledging the request.The acknowledgement is sent back to the applet, freeingit up so the user can continue to interact with the im-age database while the server is performing the desiredprocessing. The server then forks o� a child process toperform the requested image processing. After the pro-cessing is completed and an output image is available,the server sends the name of this processed image to theapplet through the secondary TCP socket. The appletthen retrieves this information via the \Socket Manager"class, which ultimately causes the processed image to bedisplayed via the \Image Processing GUI" class. Thedisplay window then allows the user to specify furtheractions on this image (ie. saving, further processing,etc. ).

Conclusion

We have proposed and implemented an architecture forperforming image processing and archiving on the web.The proposed system uses server-based processing, andno special software needs to be installed on the client. Byusing a combination of Java, CGI, and Perl programs,users are allowed pixelwise interaction with the images,and the server software can be easily upgraded withoutrequiring a change in client software. Future versions ofsuch a system would bene�t from the use of Flashpixformat images, which incorporate the desired multipleresolutions. This would eliminate the need to generateand store multiple versions of each image.

11

Page 6: IS&T’s PICS Conference IS&T’s 1998 PICS …...um b ers are used b y the serv er to sp ecify up date targets. CGI connection with the standard h ttp proto col, whic h is a stateless

IS&T’s PICS ConferenceIS&T’s 1998 PICS ConferenceIS&T’s 1998 PICS Conference Copyright 1998, IS&T

Figure 5: The user initially sees a scrolling �lmstrip ofimage thumbnails and a list of available databases.

References

[1] R. H. Bamberger. A prototype distance learning lab-oratory for image processing education. In Proc. 26thAnnual Frontiers in Education (FIE) Conference I,pages 51{54, November 1996.

[2] L. E. Berman, R. Long, and G. R. Thoma. Chal-lenges in providing general access to digitized x-raysover the internet. In Proceedings of the 23rd AIPRWorkshop, pages 183{193, Washington, D.C., Octo-ber 1994.

[3] Shih-Fu Chang, John R. Smith, and Horace J. Meng.Exploring image functionalities in www applications{ development of image/video search and editing en-gines. In Proc. IEEE International Conference onImage Processing III, pages 1{4, Santa Barbara, CA,October 1997.

[4] T. Hasegawa, T. Minamihaba, M. Daibo, T. Kuma-gai, M. Fujisawa, and N. Tayama. Development ofinteractive multimedia database. In Proc. SPIE {volume 2915, pages 183{190, November 1996.

[5] H. J. Meng, D. Zhong, and S.-F. Chang. Webclip: Awww video editing/browsing system. In Proc. IEEE

4343

Figure 6: After selecting an image, the user can interactwith the chosen image, using the mouse, for example, toselect an image region.

1st Multimedia Signal Processing Workshop, Prince-ton, NJ, June 1997.

[6] Andrew Patti, Konstantinos Konstantinides, DanielTretter, and Qian Lin. Automatic digital redeye re-duction. Technical Report HPL-97-74, HP Labora-tories, June 1997.

[7] Daniel Tretter and Andrew Patti. A distributed im-age processing system using the world wide web.Technical report, HP Laboratories, 1998. in prepa-ration.

22