porter's stemmer implementation
From: Yoann Padioleau <padator@w...>
Subject: Re: [Caml-list] porter's stemmer implementation

On Jan 16, 2010, at 11:01 PM, Grégoire Seux wrote:

> Hello,
> 
> I am looking for an implementation of Porter's stemmer in ocaml.

There is one in NLTK, a very complete Python library for NLP, and there is
ocamlpython to link OCaml and Python code. As the stemmer interface is
very simple (string -> string), it's easy to use ocamlpython for that.

Basically, you can put this in a Python file nltk_ocaml.py:

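import nltk

# One shared stemmer instance, created when the module is imported.
stemmer = nltk.PorterStemmer()

def stem(s):
    # string -> string: the only function the OCaml side needs to call.
    return stemmer.stem(s)

def test_nltk_ocaml():
    # Optional smoke test: opens NLTK's tree-drawing demo.
    print("test_nltk_ocaml")
    nltk.draw.tree.demo()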




and in an ML file nltk.ml:
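open Pycaml

module Py = Python
(* Import the Python module nltk_ocaml (it must be on the Python path)
   and get its namespace dictionary. *)
let modul = Py.cpr (Pycaml.pyimport_importmodule "nltk_ocaml")
let dict = Py.cpr (Pycaml.pymodule_getdict modul)

(* stem : string -> string, implemented by calling the Python "stem". *)
let stem s =
  (* convert the OCaml string to a Python string *)
  let py_str = Pycaml.pystring_fromstring s in
  (* look up the Python function by name *)
  let f = Py.cpr (Pycaml.pydict_getitemstring(dict, "stem")) in
  (* build the one-element argument tuple and call the function *)
  let args = Py.cpr (Pycaml.pytuple_fromsingle py_str) in
  let res = Py.cpr (Pycaml.pyeval_callobject (f,args)) in
  (* convert the Python result back to an OCaml string *)
  Pycaml.guarded_pystring_asstring res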
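
Before wiring up the OCaml side, it can help to check the Python half on its own. The following is only a minimal sketch: check_stem_interface is a hypothetical helper name, and the fallback branch is an assumed stand-in for machines without NLTK installed. It strips a few suffixes naively and is not Porter's algorithm; it is there only so the string -> string shape can still be exercised.

```python
# Sketch: check the Python half on its own, before involving OCaml.
# The fallback stemmer is a hypothetical stand-in for machines without
# NLTK; it is NOT Porter's algorithm.
try:
    import nltk
    _porter = nltk.PorterStemmer()

    def stem(s):
        return _porter.stem(s)
except ImportError:
    def stem(s):
        # Naive suffix stripping, only to keep the string -> string shape.
        for suffix in ("ing", "ed", "s"):
            if s.endswith(suffix) and len(s) > len(suffix) + 2:
                return s[:-len(suffix)]
        return s


def check_stem_interface(words):
    """Stem each word and verify that stem() maps a string to a string."""
    stems = {}
    for w in words:
        r = stem(w)
        if not isinstance(r, str):
            raise TypeError("stem(%r) returned %r" % (w, type(r)))
        stems[w] = r
    return stems


if __name__ == "__main__":
    sample = ["running", "indexes", "stemmer"]
    for word, stemmed in check_stem_interface(sample).items():
        print(word, "->", stemmed)
```

If this runs cleanly, any remaining problem is more likely in the binding layer (module path, argument tuple, result conversion) than in the stemmer itself.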







> Erik Arneson published a link to his implementation a few years ago, but the file is no longer available. Does anyone have a copy of the file, or another implementation?
> By the way, I am also looking for any library that could be used for index construction, in OCaml of course!
> 
> Thanks in advance,
> 
> Gregoire Seux