The use of an intelligent forum crawler for data retrieval from e-learning portals
Miloš Pavković and Jelica Protić
University of Belgrade, School of Electrical Engineering, Belgrade, Serbia
6th International Conference on Education and New Learning Technologies
Barcelona, 7th - 9th of July 2014
Slide 3
Introduction
There are a large number of forums covering different topics
Forums are often used by students during their studies
A large amount of relevant information is scattered across different forums inside one university domain
Forums are based on different technologies
Slide 4
Issues
The same topic can appear across different forums inside one university domain
Official school forums vs. independent department forums
The same documents can be uploaded as post attachments to several different web forums
Similar courses exist at different schools
Slide 5
Solution
Specialized crawler
Specialized forum crawler
Aggregation of crawled data from multiple forums of a single university domain
Storing the data in a database
Forum modules that use this database to help students
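The aggregation and storage steps can be illustrated with a minimal sketch. The slides do not describe the actual database, so everything here is an assumption: the SQLite backend, the table and column names, and the use of a SHA-256 content hash to recognize the same document uploaded to several forums are all hypothetical choices made for illustration.

```python
import hashlib
import sqlite3


def open_store(path=":memory:"):
    """Create the (hypothetical) aggregation schema and return a connection."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS attachment (
            sha256   TEXT PRIMARY KEY,  -- content hash, used to spot duplicates
            filename TEXT NOT NULL      -- name under which it was first seen
        );
        CREATE TABLE IF NOT EXISTS post_attachment (
            forum    TEXT NOT NULL,     -- forum on which this copy was found
            post_url TEXT NOT NULL,
            sha256   TEXT NOT NULL REFERENCES attachment(sha256),
            PRIMARY KEY (post_url, sha256)
        );
        """
    )
    return conn


def store_attachment(conn, forum, post_url, filename, content):
    """Record one crawled attachment; identical files share one attachment row."""
    digest = hashlib.sha256(content).hexdigest()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO attachment (sha256, filename) VALUES (?, ?)",
            (digest, filename),
        )
        conn.execute(
            "INSERT OR IGNORE INTO post_attachment (forum, post_url, sha256) "
            "VALUES (?, ?, ?)",
            (forum, post_url, digest),
        )
    return digest


def forums_sharing(conn, digest):
    """List the forums on which a given document (by hash) was found."""
    rows = conn.execute(
        "SELECT DISTINCT forum FROM post_attachment WHERE sha256 = ? ORDER BY forum",
        (digest,),
    )
    return [row[0] for row in rows]
```

Keying attachments by content hash rather than by file name means a document re-uploaded under a different name on another forum is still stored once, while the link table keeps track of every post that references it.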
Slide 6
Forum structure
Always defined by the presented implicit paths
Example of a) a forum, b) a thread, c) attachments inside a post.
Slide 7
Crawler algorithm
FCbRE: Forum Crawler based on Regular Expressions
Automated system
Identifies the DOM structure and basic forum elements with regular expressions
Identifies forum implicit paths using regular expressions
Example:
index\.php\?showforum=\d+
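A minimal sketch of how regular expressions might be used to recognize implicit paths in a crawled page. This is not the FCbRE implementation. Only the `showforum` URL form comes from the slide; the `showtopic` and `attachment.php` patterns, the `PATTERNS` table, and the helper names `classify` and `extract_links` are hypothetical and would need to be adapted to each forum engine.

```python
import re

# Implicit-path patterns for one forum engine.
# "forum" follows the slide's example; "thread" and "attachment" are assumed.
PATTERNS = {
    "forum": re.compile(r"index\.php\?showforum=(\d+)"),
    "thread": re.compile(r"index\.php\?showtopic=(\d+)"),
    "attachment": re.compile(r"attachment\.php\?id=(\d+)"),
}

# Simplified anchor matcher: captures the double-quoted href of an <a> tag.
HREF = re.compile(r'<a\s[^>]*?href="([^"]+)"', re.IGNORECASE)


def classify(url):
    """Return (kind, numeric id) if the URL matches a known path, else None."""
    for kind, pattern in PATTERNS.items():
        match = pattern.search(url)
        if match:
            return kind, int(match.group(1))
    return None


def extract_links(html):
    """Collect every recognized forum, thread, or attachment link in a page."""
    found = []
    for href in HREF.findall(html):
        hit = classify(href)
        if hit is not None:
            found.append(hit)
    return found


if __name__ == "__main__":
    page = (
        '<a href="index.php?showforum=12">Algorithms</a> '
        '<a class="t" href="index.php?showtopic=345">Exam questions</a> '
        '<a href="about.html">About</a>'
    )
    print(extract_links(page))
```

Links that match no pattern, such as the "About" link in the sample page, are ignored, so the crawler would follow only the forum, thread, and attachment paths.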