Date: Wed, 10 Mar 2004 20:25:00 -0000
Subject: Optimisation of program using ratios
From: Jonathon Read

Hi,

I've been implementing a Naive Bayes model for text classification in POP11.
While I'm rather impressed with the ratio datatype (I don't recall finding
perfect precision in the other languages I've studied), I've gotten myself
into a problem with speed.

My model iterates through the words in a piece of text, looking up the
probability of each word for a given class, and multiplying together all the
probabilities it looks up.  Since these are all very small probabilities, by
the time it's finished the final probability is extremely small - in fact if
I print it, it fills an 80 x 30 screen!  From this I presume that each
time it is used in a calculation, the memory manipulation going on must be
enormous.  Is this the case?
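
To illustrate the kind of loop I mean, here is a rough sketch in Python
rather than POP11, on the assumption that Python's Fraction type behaves
much like the POP11 ratio.  The words, the probabilities and the function
names are all made up for the example and are not taken from my real model.

```python
from fractions import Fraction

# Hypothetical per-word probabilities for one class (made-up values).
word_probs = {
    "the": Fraction(1, 1009),
    "cat": Fraction(3, 10007),
    "sat": Fraction(7, 100003),
}

def class_probability(words, probs):
    """Multiply together the exact probability of every word in the text."""
    product = Fraction(1)
    for word in words:
        product *= probs[word]  # exact arithmetic: nothing is rounded away
    return product

def denominator_digits(value):
    """Count the decimal digits in the denominator of an exact ratio."""
    return len(str(value.denominator))

# A made-up 600-word "document".
text = ["the", "cat", "sat"] * 200
result = class_probability(text, word_probs)

# The denominator gains a few digits with every word that is multiplied in.
print(denominator_digits(result))
```

Because nothing is ever rounded, the denominator grows with every word, so
each multiplication has to handle a longer number than the one before it.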

I'd be grateful if anybody could offer their comments and suggestions -
perhaps another datatype is more suitable?

Many thanks,
Jonathon Read