Information overload and customization

0278-6648/06/$20.00 © 2006 IEEESEPTEMBER/OCTOBER 2006 9

INFORMATION OVERLOADAND CUSTOMIZATION

MOHAMED SALAH HAMDI

HISTORICALLY, MORE INFORMATIONhas almost always been a good thing.However, as the ability to collect infor-mation grew, the ability to process thatinformation did not keep up. Today, wehave large amounts of available infor-mation and a high rate of new informa-tion being added but contra-dictions in the availableinformation, a low signal-to-noise ratio (proportion ofuseful information found toall information found), andinefficient methods for com-paring and processing dif-ferent kinds of informationcharacterize the situation.The result is the “informa-tion overload” of the user,i.e., users have too muchinformation to make a deci-sion or remain informedabout a topic.

One interesting solutionto this problem may beobtained by reversing theconventional ways we findinformation. Instead of usersinvesting significant effort tofind the right information,the right information shouldfind the users. This approachwill, of course, require thedevelopment of appropriatesoftware. Such software isexpected to do more for ustoday, in more situationsthan we ever expected in thepast. This is the challenge ofcomplex environments. The complexitycomes from many dimensions. There area variety of users (professional/naive,techie/financial/clerical, etc). There area variety of systems and interactionsamong them (win/Mac/Unix,client/server, portable, etc). There are avariety of resources and goals (earlier,programmers had only to trade off timeversus space). Now, they also have toworry about bandwidth, security,

money, completeness of results, andquality of information.

Information customizationsystems

To cope with such complex environ-ments, the promise of information cus-

tomization (IC) systems is becoming high-ly attractive. IC systems customize infor-mation to the needs and interests of theuser. They function proactively (take theinitiative), continuously scan appropriateresources, analyze and compare content,select relevant information, and present itas visualizations or in a pruned format.

IC systems are different from con-ventional search engines or database

systems. When using a search engine,for example, not all information is easyto find because it is difficult for peopleto articulate what they want using alimited set of keywords. The wordsmight not match exactly and hencenothing is returned or, much more

commonly, too many URLs arereturned by the searchengine. Even when the rele-vant information is easy tofind, it will be perhaps bor-ing and time consuming forthe user to perform this task.The search engine is notable to identify and presentthe information with little orno user intervention.Furthermore, each query isindependent of the previousone, and no attempt ismade to customize theresponses to a particularindividual. A search enginereceives millions of queriesper day, and as a result, theCPU cycles devoted to satis-fying each individual queryare sharply curtailed. Thereis no time for intelligence.The search engine is alsonot able to inform the userwhen new informationbecomes available. The usershould continually observethe resources since theymay change.

IC systems, however, donot preclude users from self-

directed information finding, such asbrowsing and searching. They oftencombine user-directed functions withproactive search functions to meet theuser’s demands. They can also rely onexisting information services (likesearch engines) that do the resource-intensive part of the work.Consequently, they can be sufficientlylightweight, reside and run on anindividual user’s machine, and serve

© CORBIS & iSWOOP

10 IEEE POTENTIALS

as personal assistants. In this way,there is no need to turn down intelli-gence and information customizationbecomes possible.

An IC system, to work properlyand be accepted, must/may exhibitsome or all of the following character-istics. IC systems tend, by their verynature, to be distributed. The IC sys-tem developer must therefore plan forproblems of distributed systems. ICsystems are often expected to beadaptive. A system is adaptive when itis able to adapt, with little or no inter-vention by a programmer, to changesin the environment in which it runs. Itis also adaptive when it is able to usefeedback to improve its performanceor when it is able to change its behav-ior based on its previous experience.IC systems are also expected to beautonomous, i.e., able to act even ifall the details are not specified orwhen the si tuat ion changes. Ofcourse, an IC system must be robust(a working system, accessible all thetime) and fast (begins transmittinguseful information within seconds).Other considerations such as proac-tiveness (the program does not simplyreact in response to the environment)and mobility (where does the pro-gram run?) are also of importance.

Today, many systems are alreadymaking their very first steps in cus-tomization. In online search engines,for example, customizing a search isbeing looked into, and it is sensible toexpect searches to change in the nextfew years. To resolve information over-load to some extent and enable users tocollect just the information they requireusing customized search engines thatrun all over the Internet looking for thecorrect information, it seems to beimportant to distribute the search andto move as much of it as possible to theuser’s own machine, allowing morecustomization. The user’s own machinecan become more like a friend whenanswering the user’s questions. Thiswould agree with the general consen-sus: giving direct answers to questionsby extracting data from online sourcesrather than giving links to Web pages iswhat needs to be researched.Customizing a search seems to be animportant issue and will surely havestrong impact on search engine opti-mization efforts and how these willadapt to the possible future of search.

Google’s Personalized Search isjust an example of the ongoing efforts

along those lines to make the user’ssearch experience more relevant tohim or her. The personalized searchdirective orders the results of theuser’s search to be based on his/herpast searches as well as previoussearch results and news headlines thatwere clicked on. The user may notnotice a huge impact on his/hersearch results at first, but as the indi-vidual user’s search history builds,personalized search results will con-tinue to improve.

Many of the world’s most-visitedWeb pages, such as <www.yahoo.com>, are example of systems that arestarting to emphasize information cus-tomization. These Web pages are con-ducting major redesigns moving themaway from the 1990s “everythingunder the sun” portals to more user-focused homepages. The generalmotto is “it is time for a simpler,streamlined, and more accessibleidentity.” The redesigned pages try todo a good job at what should be thegoal of a homepage, namely, helpingpeople get to interesting content andservices.

That could be done by having ahomepage focus on doing two things:i) get people quickly to parts of thehomepage that they like and ii) helppeople discover parts of the home-page they might like but don’t knowabout. In other words, the homepageshould help users get where theywant to go as well as find new, use-ful, and interesting stuff. To help theuser get where he or she wants to go,the page should feature things theuser uses at that page. To help theuser discover new content, the siteshould make recommendations basedon what the user already uses. Theseare essentially internal ads for thehomepage content. If the user fails toshow interest, the ad should disap-pear and be replaced with somethingelse that might be useful to the user.Different users would see differentversions of the homepage based ontheir particular interests and needs.

Information customization isachieved by observing the needs ofthe user and providing the referredinformation based on user needs. Inan IC system, information access isenhanced by optimizing the percent-age of relevant information exposed tothe user. There may be varying formsor degrees of customization but theyall share a similar methodology and

cycle. First, user data is collected andused to create a profile of the user.These profiles can be either for indi-viduals or groups of users. Second,content is selected from the contentrepository that matches the the user’sprofile. Third, the selected content isdelivered or published to the designat-ed user in the desired format.

Collaborative filtering, intelligentinterfaces, agents, bots, and intermedi-aries are all deployed as the enablingtechnologies for information customiza-tion. Recently, Web mining, a naturalapplication of data-mining techniquesto the Web as a very large and unstruc-tured information source, has made agreat impact on information customiza-tion. Web mining is usually divided intothree categories: i) Web content mining,which deals with discovering usefulinformation from Web contents/data/documents; ii) Web structure mining,which is focused on discovering themodel underlying link structures on theWeb; and (iii) Web usage mining,which tries to make sense of data gen-erated by a Web surfer’s sessions orbehaviors. Web mining, particularlyWeb-usage mining, is considered a cru-cial component of any efficacious ICsystem. Through Web mining, we areable to gain a better understanding ofboth the Web and user preferences, aknowledge that is crucial for informa-tion customization.

MASACAD, the system described inthe following, is an example of an ICsystem that combines many of thetechnologies mentioned previously toapproach information customizationand others.

MASACAD: an IC systemfor academic advisingAcademic advising

The general goal of academicadvising is to help students developeducational plans that are consistentwith their academic, career, and lifegoals while providing the informationand skills they need to pursue thosegoals. The foundation of this supportis the establishment of educationalpartnerships between students andtheir academic advisors. Academicadvisers are the students’ consultantson their educational planning.Advisers assist students by guidingthem through the university’s educa-tional requirements, helping themschedule the most appropriate cours-

es, introducing them to pertinentresources, promoting leadership andcampus involvement, assist ing incareer development, helping themwith the timely completion of theirdegree, and helping them find waysto make their educational experiencepersonally relevant.

The idea of an IC system for acad-emic advising is enticing. The bene-fits of such an assistant, in the formof a computer program, are manifold.Such a system would improve theadvising process and help to over-come the many problems that canoccur when a l imited number ofadvisers are available to service alarge number of students. That chal-lenge is compounded by the fact thatadvisers are not available all the timeand that some advisers are new to theuniversity and do not have enoughknowledge/experience with advising.An IC system for academic advisingwould simplify the tasks of faculty,staff, students, and professional advis-ers; help save time and effort, preventmistakes; and help the universityintroduce new technologies.

Unfortunately, the goal of academ-ic advising, as described earlier, is toogeneral because it involves manyexperts and requires a huge amountof knowledge. Hence, realizing a soft-ware assistant that can deal with allthe details would be difficult, if notimpossible. Therefore, it is necessaryto restrict the goal of academic advis-ing. Academic advising could beunderstood as providing students withan opportunity to plan programs ofstudy, select appropriate classes, andschedule classes in a way that pro-vides the greatest potential for acade-mic success. The task is still interest-ing because, when planning a pro-gram of study and selecting classes,we must consider many things, suchas prerequisites, course availability,effective course sequencing, and workload. However, the task is now of amore moderate size.

Resources for academic advising

Many diverse resources arerequired to deal with the problem ofacademic advising. First of all, oneneeds the student profile. The studentprofile consists of all the informationconcerning courses the student hasalready taken. This information isstored in a database and access to it is

restricted to the administration. Thestudent profile also includes the stu-dent ’s desires concerning whichcourses to attend. To obtain this infor-mation, an adviser must ask the stu-dent. The IC system could attempt tolearn these desires, for example, bymonitoring students or looking attheir profiles. The student profile alsocontains other information about thestudent, such as educational focus(art, science, technology), work expe-rience, internships, programming lan-guages mastered, and languages spo-ken. The second resource needed forsolving the problem of advising arethe courses that are offered in thesemester for which advising is need-ed. This information is well main-tained by the university administrationin appropriate Web sites and is acces-sible for everyone. The third resourceneeded for solving the problem ofacademic advising is expert ise.Expertise is the extensive, task-specif-ic knowledge acquired from training,reading, and experience. It is avail-able as documented knowledge andconsists of all the details and regula-tions concerning courses, programs,and curriculums. This kind of knowl-edge is available in many forms suchas Web pages, booklets, handouts,and announcements and is accessibleto everyone. Expertise also consists ofthe knowledge of a human expert. Ahuman expert should posses morecomplex knowledge than you canfind in documented sources. Thisknowledge consists of heurist ics,strategies, and meta-knowledge. An ICsystem can gain this knowledgethrough a learning process.

What is MASACAD?MASACAD stands for multi-agent

system for academic advising. It isalso the Arabic word for “courses.”MASACAD approaches informationcustomization by combining a widerange of technologies and ideas fromthe following domains: e-learning,user modeling, agent systems,machine learning, and Web mining.

MASACAD uses Bee-gent (Bondingand Encapsulat ion EnhancementAgent) as a solution to the problemsof network communication. Bee-gentis a communication framework devel-oped by Toshiba that is based on themulti-agent model. Bee-gent “agenti-fies” the applications and the commu-nication between them, i.e., applica-

tions become agents and messagesbetween them are also carried byagents. Agent wrappers are used toprovide an agent interface for existingapplications and manage the states ofthe applications they are wrappedaround. Mediation agents supportinterapplication coordination by han-dling all communications. They movefrom the site of an application toanother where they interact with theagent wrappers.

Components of MASACAD MASACAD is made up of four

interacting agents: • A database agent, consisting of a

grading system that existes within anagent wrapper. The grading system isa database application for answeringqueries about the students, the cours-es they have taken, and the gradesthey received.

• A Web agent, which consists ofa course announcement system exist-ing within an agent wrapper. Thecourse announcement system is aWeb appl icat ion for answeringqueries about the courses expectedto be offered in the semester forwhich advising is needed.

• A learning agent, which consistsof a user system that exists within anagent wrapper. The user system is anapplication that is the heart of theadvising system. It is here whereintelligence resides. This applicationgives students the opportunity toexpress their desires concerning thecourses to be attended, to initiate aquery to obtain advice, and to see theresult returned by the advising sys-tem. It also alerts the student auto-matically via Email when somethingchanges in the offered courses or inthe student profile.

• A mobile mediation agent thatprovides the information retrievingservice for the learning agent by trav-eling from one agent to another.

The four agents cooperate in orderto accomplish the academic advisingtask (see Fig. 1). The learning agentcreates a mediation agent (searcher),which travels to the database agent inorder to retrieve the courses success-fully taken by the student with the cor-responding grades and then moves tothe Web agent to retrieve the coursesoffered in the semester for whichadvising is needed. When the media-tion agent returns back to the learningagent with the retrieved results, the

SEPTEMBER/OCTOBER 2006 11

12 IEEE POTENTIALS

learning agent uses this informationtogether with the student’s desires toinvoke the “advising procedure.” The“advising procedure” is responsible forproducing the advice for the student.

Implementing theadvising procedure

The machine’s learning adequacyfor gaining human expertise, com-bined with the availability of experi-ence with advising students, makesthe adoption of a paradigm of super-vised learning from examples obvious.For the neural network solution adopt-ed, the known information (input vari-ables) consisted of the student profileand the offered courses grouped in asuitable way. The unknown informa-tion (output variables) consisted of theadvice the student expects. For theneural network to be able to infer theunknown information, prior training tointegrate the academic advising exper-tise (available in the form of advisingexamples) into the network is neces-sary. The back-propagation algorithmwas used for training the neural net-work. The back-propagation networkseems interesting and viable enoughto use to automate the process ofknowledge acquisition and help inextracting human expertise.

Lessons learned from MASACAD

MASACAD, as an example of an ICsystem, provides a solution to a prob-lem that depends on distributed infor-mation existing in different formats.Many agents cooperate to solve thecomplex problem of academic advis-ing. Agent wrappers invoke (query or

search) the appropriate resources(Web site or database). A mediationagent migrates between the agentwrappers, transporting queries andcollecting answers. The main lessonlearned from MASACAD is the necessi-ty to rely on technologies from diversedisciplines and to focus on combiningthem appropriately in order to be ableto come up with a satisfactory solutionto any information customization task.MASACAD demonstrates, for example,the suitability of the multi-agent para-digm for dealing with information cus-tomization. Using agents adds a layerof abstraction that localizes decisionsabout dealing with local peculiaritiesof format and knowledge conven-tions, for example, and thus helps tounderstand and manage complexity.The successful marriage betweenneural networks and expert systems isalso noticeable. MASACAD uses aback-propagation neural network toautomate the process of knowledgeacquisition, i.e., acquire the expertiseof the human academic adviser.

ConclusionsIC systems are supposed to be the

answer to information overload. Theyallow users to narrow what they arelooking for and get informationmatching their needs. IC systems arealso a bargain of consummate effi-ciency. The value proposition of suchsystems is reducing the time spentlooking for information.

However, IC systems require usersto share their tastes and demands.They need to memorize user queriesand profiles, hence the criticism ofsuch systems being instruments of

control. In addition, there is the riskof violating the user’s privacy andmisusing the information gatheredabout the user.

Another problem with IC systems isthat tailoring information to the userlimits social discovery. Narrowcastinglimits a user’s view into more diverseand otherwise peripheral information.Over time, users are taught to relyupon this mode as their primarysource for conducting informationsearches. Sharing, conversat ions,remixing, and socializing informationdo not exist in this mode.

A lot of work with information cus-tomization is being done. All of it ispretty interesting and important for thefuture of the information age. What ispossible in our future and how canour online lives evolve? The next fewyears may shed light on these issues.

Read more about it• M.S. Hamdi, “MASACAD: A multi-

agent-based approach to informationcustomization,” IEEE Intelligent Syst.,vol. 21, no. 1, pp. 60–67, Jan./Feb.2006.

About the authorMohamed Salah Hamdi

([email protected]) is an assistantprofessor in the Department ofComputer Science at the University ofQatar. He received his Ph.D. in com-puter science from the University ofHamburg. His research interests are inintell igent autonomous agents, e-learning, information customization,machine learning, and artificial intelli-gence in general.

Fig. 1 Interaction between the agents of MASACAD

UserSystem

Learning Agent

9) Replay the Info

1) Request for Info

3) Request for Info

4) Replay for info

6) Request for Info

7) Replay for Info

2) Move

8) Move

MobileMediation

Agent

GradingSystem

Database Agent

Web Agent

CourseAnnouncement

System

Searcher

Searcher

Searcher

5) Move

Documents

Information overload and customization