Content Tagged with prodei + Web

Jeff's Search Engine Caffè: Current Open Source Search Engine Libraries

"Here is my short list of the most important open source [free] information retrieval libraries being used today that are undergoing active development as of writing."

open-source: del.icio.us tag/open-source

JoBo

"JoBo is a simple program to download complete websites to your local computer. Internally it is basically a web spider. The main advantage to other download tools is that it can automatically fill out forms [...] and also use cookies for session handling."

open-source: del.icio.us tag/open-source

WebCAT :: A Web Content Analysis Tool

"WebCAT is an extensible tool to extract meta-data and generate RDF descriptions from existing Web documents."

open-source: del.icio.us tag/open-source

WebLA :: Web Linkage Analysis

"WebLA is a Java package for handling Web Graphs, implementing popular algorithms such as PageRank, HITS, CoCitation Similarity and SimRank. It is of particular interest for research in Information Retrieval, [...]"

open-source: del.icio.us tag/open-source

webgraph++

"Webgraph++: big graph, little footprint"

open-source: del.icio.us tag/open-source

Swish-e :: Home Page

"Swish-e is a fast, flexible, and free open source system for indexing collections of Web pages or other files. Swish-e is ideally suited for collections of a million documents or smaller."

open-source: del.icio.us tag/open-source

The Clair Library

"The Clair library is written in Perl and is intended to simplify a number of generic tasks in Natural Language Processing (NLP), Information Retrieval (IR), and Lexical Network Analysis. Its architecture also allows for external software to be plugged in [...]"

open-source: del.icio.us tag/open-source

Wayback - Home Page

"wayback is an open source Java implementation of the Internet Archive Wayback Machine."

open-source: del.icio.us tag/open-source

wera - Home Page

"WERA (Web ARchive Access) is a freely available solution for searching and navigating archived web document collections."

open-source: del.icio.us tag/open-source

Internet Archive ARC access tools - Home Page

"This is home for Internet Archive ARC file access tools."

open-source: del.icio.us tag/open-source