[1/2 OT] Indexing (and mergeable Index-algorithms)
From: Richard Jones <rich@a...>
Subject: Re: [Caml-list] [1/2 OT] Indexing (and mergeable Index-algorithms)
On Thu, Nov 17, 2005 at 12:49:55PM +0100, Florian Weimer wrote:
> Plenty.  Berkeley DB, SQLite, full-blown SQL database servers like
> PostgreSQL or MySQL.  The list is pretty long.

We use PostgreSQL's tsearch2[1] module to index web pages across our
main site and customer sites.  Today we have 38,437 pages including
old versions in the index.

Pros:

* Extremely easy to use - you just insert pages as rows in the database.
* Very featureful - does stemming, supports multiple languages, etc.
* Works from OCaml using, e.g., ocamldbi or the OCaml-PostgreSQL module.

Cons:

* Quite hard to install - you need to read the documentation carefully.
* Slow for lookups - I haven't quite got to the bottom of this so I
  don't know if it's inherently slow or if I haven't set up the indexes
  right.
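
To make the "insert pages as rows" point and the index question concrete,
here is a minimal sketch of the SQL involved, kept as OCaml strings so it
can be handed to whichever database binding you use. The table "pages", its
columns url, body and fti, and the index name are all hypothetical, and the
$1/$2 placeholder syntax is an assumption that depends on the driver.
to_tsvector, to_tsquery, the @@ operator and the 'default' configuration are
as described in the tsearch2 introduction. Whether a missing GiST index
explains the slow lookups above is not something this sketch settles.

```ocaml
(* Hypothetical schema: table "pages" with columns url, body and a
   tsvector column "fti". None of these names come from the post. *)

(* A GiST index on the tsvector column, so that lookups do not have to
   scan every row. *)
let create_index_sql =
  "CREATE INDEX pages_fti_idx ON pages USING gist (fti)"

(* Insert one page as one row. $1 = url, $2 = body; the placeholder
   syntax is an assumption and varies between drivers. *)
let insert_page_sql =
  "INSERT INTO pages (url, body, fti) \
   VALUES ($1, $2, to_tsvector('default', $2))"

(* Lookup: @@ matches the stored tsvector against a tsquery. *)
let search_sql =
  "SELECT url FROM pages WHERE fti @@ to_tsquery('default', $1)"

(* Join space-separated words with & so that all of them must match.
   This does no escaping of tsquery operators in the input. *)
let tsquery_of_words words =
  words
  |> String.split_on_char ' '
  |> List.filter (fun w -> w <> "")
  |> String.concat " & "
```

For example, tsquery_of_words "ocaml index" gives "ocaml & index", which
would be bound as $1 in search_sql.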

Rich.

[1] http://www.sai.msu.su/~megera/oddmuse/index.cgi/tsearch-v2-intro

-- 
Richard Jones, CTO Merjis Ltd.
Merjis - web marketing and technology - http://merjis.com
Team Notepad - intranets and extranets for business - http://team-notepad.com