Content Tagged with Information-Retrieval + Web

The Zettair Search Engine

"Zettair allows you to index and search HTML (or TREC) collections. It has been designed for simplicity as well as speed and flexibility, and its primary feature is the ability to handle large amounts of text."

open-source: del.icio.us tag/open-source

The Clair Library

"The Clair library is written in Perl and is intended to simplify a number of generic tasks in Natural Language Processing (NLP), Information Retrieval (IR), and Lexical Network Analysis. Its architecture also allows for external software to be plugged in ..."

open-source: del.icio.us tag/open-source

Sphider - a php spider and search engine

"Sphider is a lightweight web spider and search engine written in PHP, using MySQL as its back end database."

open-source: del.icio.us tag/open-source

Focused crawler - Combine System Homepage

"Combine is an open system for crawling [harvesting and threshing (indexing)] Internet resources. It can be used both as a general and focused crawler."

open-source: del.icio.us tag/open-source

Heritrix - Home Page

"Heritrix is the Internet Archive's open-source, extensible, web-scale, archival-quality web crawler project."

open-source: del.icio.us tag/open-source

WIRE (Web Information Retrieval Environment)::Center for Web Research

"The WIRE project is an effort started by the Center for Web Research for creating an application for information retrieval, designed to be used on the Web."

open-source: del.icio.us tag/open-source

Welcome to Nutch!

"Nutch is open source web-search software. It builds on Lucene Java, adding web-specifics, such as a crawler, a link-graph database, parsers for HTML and other document formats, etc."

open-source: del.icio.us tag/open-source

OSIR 2006: Open Source Information Retrieval

"The goal of the Open Source Information Retrieval Workshop (OSIR) is to bring together practitioners developing open source search technologies in the context of a premier IR research conference to share their recent advances, and to coordinate their str..."

open-source: del.icio.us tag/open-source