On Thu, 18 Dec 2003 15:53:20 +0000 (UTC), A.Sloman@cs.bham.ac.uk
wrote:
>Jonathan,
>
>> Thanks ... saves me doing it. I may have a quick play with the
>> numbers now ...
>
>One quick play in my current environment (just using Ved to answer mail,
>but with a lot of local-stuff precompiled):
[snip]
>;;; distribution of sizes
>vec =>
>** <shortvec 0 10 23 31 62 112 115 131 149 104 87 69 50 32 22 12 8 2 3 0 0 0 0 0 1>
> 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15....
> [Bucket size]
I forgot to mention, you need to change the line
    if char == `.` then `0` else char endif
to
    if char == `.` then `0`, ` ` else char endif
otherwise you won't count the empty buckets (you seem to have 29 empty
buckets, since vec adds up to 995, not 1024). Changing the dot to a
zero sticks a zero onto the front of the other numbers.
Or, to put it another way,
    length(dic_numbers()) =>
should be the size of the dictionary.
>Is that a poisson distribution? I don't recall the definition.
define poisson(x, n);
    exp(-x)*(x**n)/factorial(n)
enddefine;

define factorial(n);
    if n < 2 then 1 else n*factorial(n-1) endif
enddefine;

number_of_words/number_of_buckets -> x;
initv(infinity+1) -> number_of_buckets_with_n_words;
for n from 0 to infinity do
    poisson(x, n) * number_of_buckets ->
        number_of_buckets_with_n_words(n+1);
endfor;
If you add up the contents of number_of_buckets_with_n_words
it should sum to number_of_buckets (and adding up n times the entry
for n should give number_of_words). If it doesn't then infinity
isn't big enough.
I think, in your example, infinity was around 24 or 25 ;-).
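For anyone who wants to run this rather than wait for infinity, here
is a minimal sketch with an explicit cutoff. The names
expected_buckets, sum_vector, weighted_sum and max_n are invented for
the sketch, and the 3000 words in 1024 buckets at the end are made-up
figures, not taken from the dictionary discussed here.

```pop11
;;; Sketch only. factorial and poisson are repeated from above so
;;; that this fragment loads on its own.
define factorial(n);
    if n < 2 then 1 else n * factorial(n - 1) endif
enddefine;

define poisson(x, n);
    exp(-x) * (x ** n) / factorial(n)
enddefine;

;;; Expected number of buckets holding n words, for n = 0 .. max_n.
;;; Element n+1 of the result holds the figure for n words.
define expected_buckets(number_of_words, number_of_buckets, max_n) -> counts;
    lvars n, x;
    number_of_words / number_of_buckets -> x;
    initv(max_n + 1) -> counts;
    for n from 0 to max_n do
        poisson(x, n) * number_of_buckets -> counts(n + 1);
    endfor;
enddefine;

;;; Plain total of a vector of numbers.
define sum_vector(v) -> total;
    lvars i;
    0 -> total;
    for i from 1 to length(v) do
        total + v(i) -> total;
    endfor;
enddefine;

;;; Total weighted by bucket size: element i counts buckets of i-1 words.
define weighted_sum(v) -> total;
    lvars i;
    0 -> total;
    for i from 1 to length(v) do
        total + (i - 1) * v(i) -> total;
    endfor;
enddefine;

;;; Hypothetical figures: 3000 words in 1024 buckets, cutoff at 25.
vars expected;
expected_buckets(3000, 1024, 25) -> expected;
sum_vector(expected) =>      ;;; should come out close to 1024
weighted_sum(expected) =>    ;;; should come out close to 3000
```

If either total falls noticeably short, the cutoff max_n is too
small for that load factor.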
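Poisson distributions describe how many independent, random events
fall into a fixed slot when only the average rate is known (probably
best to google for the details), and I think that's the right thing
for measuring words in buckets :-).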
To the extent that the actual figures don't match the Poisson
distribution, you are not distributing things to buckets randomly.
See my other post.
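One rough way to put a number on the mismatch is Pearson's
chi-squared statistic. That choice is an assumption added here, not
something used anywhere in this thread, and the procedure name
chi_squared and the two small vectors in the usage line are purely
illustrative.

```pop11
;;; Sketch only: sum of (observed - expected)**2 / expected over all
;;; bucket sizes. Both vectors must be the same length, with element
;;; n+1 holding the count of buckets containing n words.
define chi_squared(observed, expected) -> total;
    lvars i, o, e;
    0 -> total;
    for i from 1 to length(observed) do
        observed(i) -> o;
        expected(i) -> e;
        ;;; skip sizes where nothing is expected, to avoid dividing by zero
        if e > 0 then
            total + ((o - e) ** 2) / e -> total;
        endif;
    endfor;
enddefine;

;;; Made-up counts, just to show the call.
chi_squared({10 20 30}, {12 18 30}) =>
```

A result near zero means the observed counts sit close to the
expected ones; the larger it gets, the less the distribution looks
like random assignment to buckets.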
>> Yes, you were too quick! I've just rebooted back into Windows
>> after having a look at it. (I must try out some Linux
>> newsreaders ...)
>
>I use the Bham versions of ved_gn, ved_postnews, etc., which use a
>socket connection to the site in the environment variable NNTPSERVER.
>You can get the package here:
(snip)
Thanks - I'll have a look.
>It re-defines ved_send(mr) to use sendmail so you may have to do some
>tweaking of sendmail stuff in the /etc/ directory.
>
>I should change it one day to allow alternative mailers. It used
>to pipe through 'mail' but that turned out to be too limiting.
>I forget why.
Sendmail is fairly standard, even for people who don't use it (e.g.
I use postfix in the office, and I think it links sendmail to
whatever is part of the postfix suite of programs -- so things that
think they are using sendmail aren't, but it's transparent).
>Some of the details are here:
> http://www.cs.bham.ac.uk/research/poplog/help/ved_getmail
>
>Alternatively if you are now a WYSIWYG person, install mozilla (not
>netscape), from www.mozilla.org
I've heard good things about mozilla: trying it out is on my todo
list somewhere, I think. [Starts scanning todo list. Gets out
telescope to see further reaches of todo list. Telescope not
powerful enough.] I'm sure it's there somewhere ... I'll try
re-inserting it nearer the top.
Jonathan
--
Use jlc at address, not spam.