Focused Crawler Optimization Using Genetic Algorithm
Abstract: As the Web continues to grow, searching it for useful information
has become more difficult. A focused crawler explores the Web in accordance
with a specific topic. This paper discusses the problems caused by local
search algorithms.
A crawler can become trapped within a limited Web community and overlook
suitable Web pages outside its path. A genetic algorithm, as a global search
algorithm, is modified to address these problems. The genetic algorithm is used
to optimize Web crawling and to select more suitable Web pages for the crawler
to fetch.
Several evaluation experiments were conducted to examine the effectiveness of
the approach. The crawler delivered collections consisting of 3396 Web pages
from 5390 visited links, that is, a filtering rate of 63% under Roulette-Wheel
selection, with a precision of 93% across 5 different categories. The results
showed that the genetic algorithm enabled the focused crawler to traverse the
Web comprehensively despite its relatively small collections. Furthermore, it
showed great potential for building exemplary collections compared with
traditional focused crawling methods.
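
The Roulette-Wheel selection and the filtering rate mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the fitness function or the crawler's data structures, so the function names, the candidate URLs, and the fitness scores below are all hypothetical. The sketch only shows the general technique, in which each candidate is drawn with probability proportional to its fitness.

```python
import bisect
import itertools
import random
from typing import List, Optional, Sequence


def roulette_wheel_select(
    candidates: Sequence[str],
    fitness: Sequence[float],
    k: int,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Draw k candidates (with replacement), each with probability
    proportional to its fitness. All names here are illustrative."""
    if len(candidates) != len(fitness):
        raise ValueError("candidates and fitness must have the same length")
    if any(score < 0 for score in fitness):
        raise ValueError("fitness scores must be non-negative")

    # Running totals mark the right edge of each candidate's wheel slice.
    cumulative = list(itertools.accumulate(fitness))
    if not cumulative or cumulative[-1] <= 0:
        raise ValueError("at least one fitness score must be positive")

    rng = rng or random.Random()
    total = cumulative[-1]
    selected = []
    for _ in range(k):
        spin = rng.random() * total  # a point in [0, total)
        # First slice whose right edge lies strictly beyond the spin;
        # zero-width slices (fitness 0) can never be chosen.
        index = bisect.bisect_right(cumulative, spin)
        selected.append(candidates[index])
    return selected


def filtering_rate(pages_kept: int, links_visited: int) -> float:
    """Fraction of visited links that ended up in the collection."""
    return pages_kept / links_visited


if __name__ == "__main__":
    # Hypothetical candidate links with made-up topical fitness scores.
    urls = ["http://example.org/a", "http://example.org/b", "http://example.org/c"]
    scores = [0.1, 0.7, 0.2]
    print(roulette_wheel_select(urls, scores, k=5, rng=random.Random(42)))
    # Figures reported in the abstract: 3396 pages from 5390 visited links.
    print(f"{filtering_rate(3396, 5390):.0%}")
```

Because low-fitness links keep a small but non-zero chance of being drawn, this kind of selection lets a crawler occasionally leave a locally attractive region, which is the global-search behaviour the abstract attributes to the genetic algorithm.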
Author: Banu Wirawan Yohanes,
Handoko Handoko, Hartanto Kusuma Wardana
Journal Code: jptkomputergg110050