The Combine Harvesting Robot

Simulation tool to study focused web crawling strategies

This simulation tool can be used to study the effects of different URL scheduling strategies for focused Web crawling. The tool works in the context of a pre-existing database generated by a relatively broad focused crawl with the Combine crawler. This makes it possible to experiment with different scheduling algorithms locally, without repeating the actual crawl each time. The tool itself is written in Java. A few Matlab routines are provided for post-processing of statistics.
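
The idea of replaying a recorded crawl under different scheduling strategies can be illustrated with a minimal sketch. This is not the simulator's code: the class and method names (`CrawlSimulation`, `Page`, `Scheduler`, `simulate`, `harvestRate`), the in-memory map standing in for the Combine database, and the two strategies shown (first-in-first-out and best-first by parent relevance) are all assumptions made for illustration.

```java
import java.util.*;

public class CrawlSimulation {
    /** Hypothetical stand-in for one record of the pre-crawled database. */
    public static final class Page {
        public final double relevance;       // topical score recorded by the earlier crawl
        public final List<String> outlinks;  // links found on the page
        public Page(double relevance, List<String> outlinks) {
            this.relevance = relevance;
            this.outlinks = outlinks;
        }
    }

    /** A scheduling strategy decides which frontier URL is visited next. */
    public interface Scheduler {
        void add(String url, double parentRelevance);
        String next();
        boolean isEmpty();
    }

    /** Breadth-first baseline: URLs are visited in discovery order. */
    public static final class FifoScheduler implements Scheduler {
        private final Deque<String> queue = new ArrayDeque<>();
        public void add(String url, double parentRelevance) { queue.addLast(url); }
        public String next() { return queue.pollFirst(); }
        public boolean isEmpty() { return queue.isEmpty(); }
    }

    /** Best-first: URLs linked from more relevant pages are visited earlier. */
    public static final class BestFirstScheduler implements Scheduler {
        private static final class Entry {
            final String url; final double score; final long seq;
            Entry(String url, double score, long seq) {
                this.url = url; this.score = score; this.seq = seq;
            }
        }
        private long counter = 0;
        // Highest score first; ties are broken by discovery order.
        private final PriorityQueue<Entry> queue = new PriorityQueue<>((a, b) -> {
            int c = Double.compare(b.score, a.score);
            return c != 0 ? c : Long.compare(a.seq, b.seq);
        });
        public void add(String url, double parentRelevance) {
            queue.add(new Entry(url, parentRelevance, counter++));
        }
        public String next() { return queue.poll().url; }
        public boolean isEmpty() { return queue.isEmpty(); }
    }

    /** Replays a crawl against the stored data; no network access is involved. */
    public static List<String> simulate(Map<String, Page> db, String seed,
                                        Scheduler scheduler, int budget) {
        List<String> visited = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        seen.add(seed);
        scheduler.add(seed, 1.0);
        while (!scheduler.isEmpty() && visited.size() < budget) {
            String url = scheduler.next();
            Page page = db.get(url);
            if (page == null) continue;  // URL was never fetched by the original crawl
            visited.add(url);
            for (String link : page.outlinks) {
                if (seen.add(link)) scheduler.add(link, page.relevance);
            }
        }
        return visited;
    }

    /** Fraction of visited pages whose recorded relevance reaches the threshold. */
    public static double harvestRate(Map<String, Page> db, List<String> visited,
                                     double threshold) {
        if (visited.isEmpty()) return 0.0;
        long relevant = visited.stream()
                .filter(u -> db.get(u).relevance >= threshold).count();
        return (double) relevant / visited.size();
    }

    /** Tiny made-up link graph used only to exercise the sketch. */
    public static Map<String, Page> sampleDatabase() {
        Map<String, Page> db = new HashMap<>();
        db.put("seed", new Page(1.0, List.of("a", "b")));
        db.put("a", new Page(0.1, List.of("c")));
        db.put("b", new Page(0.9, List.of("d")));
        db.put("c", new Page(0.0, List.of()));
        db.put("d", new Page(0.8, List.of()));
        return db;
    }

    public static void main(String[] args) {
        Map<String, Page> db = sampleDatabase();
        List<String> fifo = simulate(db, "seed", new FifoScheduler(), 4);
        List<String> best = simulate(db, "seed", new BestFirstScheduler(), 4);
        System.out.println("FIFO:       " + fifo + " harvest=" + harvestRate(db, fifo, 0.5));
        System.out.println("Best-first: " + best + " harvest=" + harvestRate(db, best, 0.5));
    }
}
```

Because both runs read from the same stored data, any difference in the visited pages comes from the scheduling strategy alone. A further strategy could be compared by adding another `Scheduler` implementation.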

The simulator is the result of Master's thesis work by Rafael Romero Trujillo.

Last modified 2006-04-21