|
The Combine Harvesting Robot |
|
| Announcements (2009-10-23) | Features | Documentation | Downloads | Installation |
What is Combine? A Focused Crawler System for the Web |
Combine is an open system for crawling [harvesting and threshing (indexing)] Internet resources. It can be used both as a general and as a focused crawler. The name is derived from the combine harvester, since the two perform their jobs in a similar way. It is implemented in Perl and is easily configured for most types of Web crawling. Together with the Zebra text indexing and retrieval engine, Combine makes an easy-to-use search engine in a box. In a few simple steps you can create a vertical search engine with structured searching. |
Features |
|
Who is using Combine? Services, demos |
The Combine Robot was first released in spring '98 and was primarily used within the Nordic NWI services (now out of commission). A modified version of Combine was also used in the Kulturarw Web archival project. We envisage it as being useful for regional web indexing in general. SOSIG uses Combine as the harvester for their Social Science Search Engine, based on URLs from the main catalogue. In the EU project ALVIS it is used (and further developed) as a focused crawler in order to produce data collections for the topic-specific, semantic-based search engine that ALVIS develops. Some ALVIS demos are available. Many of the KnowLib demos use Combine-crawled data. You may want to use it yourself. That is possible, since it is all built upon free software. See Downloads below. |
Where can I get further information? |
Documentation
Mailing list at SourceForge
|
Installation |
Installation Debian stable
The recommended procedure to install Combine is to use the Debian package system (apt).
Installation from CPAN
Combine is also available from CPAN.
Manual installation for the impatient
- Download the source version 4.003 (gzipped tar)
- Unpack and cd into combine-4.003
- Make sure you have all the dependencies installed
- The following command sequence will install Combine:
    perl Makefile.PL
    make
    make test
    make install
    mkdir /etc/combine
    cp conf/* /etc/combine/
    mkdir /var/run/combine
Test that it all works (run as root)
Test the installation
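The manual installation command sequence listed above can be collected into a small script. This is only a sketch: the `run` and `install_combine` helpers and the `DRY_RUN` variable are illustrative names, not part of Combine, and the script assumes it is started from inside the unpacked combine-4.003 directory. By default it only prints the commands, so nothing is changed until `DRY_RUN=0` is set explicitly.

```shell
#!/bin/sh
# Sketch of the manual installation sequence for Combine 4.003.
# DRY_RUN, run and install_combine are hypothetical helpers for illustration.
set -e

# Print each command when DRY_RUN=1 (the default); execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# The seven commands from the installation instructions, in order.
install_combine() {
    run perl Makefile.PL
    run make
    run make test
    run make install
    run mkdir /etc/combine
    run cp conf/* /etc/combine/
    run mkdir /var/run/combine
}

install_combine
```

Saved under a name of your choice (for example `install-combine.sh`, a hypothetical file name), the script would perform the real installation when run as root with `DRY_RUN=0 sh install-combine.sh`.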
Where can I get help? Examples, HowTo's and hints.
For reporting bugs, getting help or information, or just to let us
know that you are using Combine, send an e-mail to the
Combine general discussion list, which is a mailing list at SourceForge.
|
Downloads |
Also available from CPAN and
|
| History, acknowledgements |
The Combine system was initially developed as part of the Development of a
European Service for Information on Research and Education (DESIRE) project, which was funded by the European
Commission within the Telematics for Science Program.
It was later modified for focused crawling by integrating the automated topic classification algorithms, also developed in DESIRE, with the crawler. This work was funded by Vinnova, the Swedish Agency for Innovation Systems (project P22504-1 A), and the EU project ALVIS (IST-1-002068-STP). Currently supported by .SEs Internetfond in the project 'Vertical Search Engines'. |
Copyright |
From version 1.1, Combine is distributed under the GNU General Public License. |
|
Acknowledgements:
- The Combine Harvester was created at NetLab and
is maintained by the KnowLib group
at Dept. of Electrical and Information Technology, Lund University.
Currently supported by .SEs Internetfond.