[Aspell-user] Using aspell programmatically
John P. Hartmann
2010-05-04 07:29:09 UTC
I wish to use aspell in one of my own applications. I can look up words and
get spelling corrections through the C API, but word list management has me
stumped.

I have (from a mainframe application) a list of words that should be
considered valid in the context of specific files/applications/users.

So I tried to generate a personal word list:

j /home/john/aspell: aspell create personal < john.add
Sorry "create/merge personal" is currently unimplemented.

Hmmm.

j /home/john/aspell: aspell create master < john.add
Error: The language "en_GB" is not known. This is probably because: the file
"/usr/local/lib/aspell-0.60/en_GB.dat" can not be opened for reading.

Quite right, it ain't there and I don't want anyone messing about in that
directory.

I tried aspell -c john.add and it gave me a panel of the words, but pressing
a or A seems to have no permanent effect.

My configuration file:

dict-dir /usr/lib/aspell-0.60
lang en_GB
home-dir ~/aspell
personal pwl

The last two lines are out of desperation more than knowledge.

What did I do wrong?

j.
Kevin Atkinson
2010-05-04 11:51:11 UTC
Post by John P. Hartmann
I wish to use aspell in one of my own applications. I can look up words and
get spelling corrections through the C API, but word list management has me
stumped.
I have (from a mainframe application) a list of words that should be
considered valid in the context of specific files/applications/users.
j /home/john/aspell: aspell create personal < john.add
Sorry "create/merge personal" is currently unimplemented.
You can easily create one manually. The personal dict. is just a wordlist
with a header line. You probably want to use:
personal_ws-1.1 en 0
as the header.
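
For anyone after a concrete sample: the file is the header line above followed by one word per line. Below is a minimal sketch in C that writes such a file from an array of words. The function name write_personal_wordlist is made up for illustration and is not part of the Aspell API; the header is the one quoted above with the language filled in, and the trailing 0 is kept exactly as given there.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not an Aspell function.
 * Write `count` words to `path` as a personal word list for language `lang`.
 * Entries that are empty or contain whitespace are skipped, since the file
 * holds one word per line. Returns 0 on success, -1 on I/O failure. */
int write_personal_wordlist(const char *path, const char *lang,
                            const char *const *words, size_t count)
{
    assert(path != NULL && lang != NULL);
    FILE *out = fopen(path, "w");
    if (out == NULL)
        return -1;

    /* Header line as quoted in the thread, with the language filled in. */
    int ok = fprintf(out, "personal_ws-1.1 %s 0\n", lang) > 0;

    for (size_t i = 0; ok && i < count; i++) {
        const char *w = words[i];
        if (w[0] == '\0' || strpbrk(w, " \t\r\n") != NULL)
            continue;               /* not a single word: skip it */
        ok = fprintf(out, "%s\n", w) > 0;
    }
    if (fclose(out) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Called with the words CCW and mainframe and the language en, this produces a three-line file: the header, then CCW, then mainframe. Where the file has to live for Aspell to pick it up depends on the home-dir and personal settings, so treat the path as something to check against your own configuration.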
Post by John P. Hartmann
j /home/john/aspell: aspell create master < john.add
Error: The language "en_GB" is not known. This is probably because: the file
"/usr/local/lib/aspell-0.60/en_GB.dat" can not be opened for reading.
You need to provide a dictionary name. Also the language needs to be
specified as "en". en_GB is technically not a language but a dictionary
name, but it works when specified as a language in most cases (excluding
this one of course).
Post by John P. Hartmann
dict-dir /usr/lib/aspell-0.60
lang en_GB
home-dir ~/aspell
personal pwl
The last two lines are out of desperation more than knowledge.
Get rid of them, they are unnecessary and the last one is nonsense.
John P. Hartmann
2010-05-04 13:55:13 UTC
Post by Kevin Atkinson
You can easily create one manually. The personal dict. is just a wordlist
personal_ws-1.1 en 0
as the header.
Is there a sample of such a file anywhere?
You need to provide a dictionary name. Also the language needs to be
specified as "en". en_GB is technically not a language but a dictionary name,
but it works when specified as a language in most cases (excluding this one
of course).
j /home/john/aspell: aspell --lang=en create master ~/aspell/.aspell.en_GB.per <john.add
Error: The language "en" is not known. This is probably because: the file
"/usr/local/lib/aspell-0.60/en.dat" can not be opened for reading.

I'm clearly not doing it right.

On the programmatic front, I can add words to the dictionary by
aspell_speller_add_to_personal. How do I remove a word from the
dictionary?

Thanks,

j.
Kevin Atkinson
2010-05-04 21:29:45 UTC
Post by John P. Hartmann
Post by Kevin Atkinson
You can easily create one manually. The personal dict. is just a wordlist
personal_ws-1.1 en 0
as the header.
Is there a sample of such a file anywhere?
You need to provide a dictionary name. Also the language needs to be
specified as "en". en_GB is technically not a language but a dictionary name,
but it works when specified as a language in most cases (excluding this one
of course).
j /home/john/aspell: aspell --lang=en create master ~/aspell/.aspell.en_GB.per <john.add
Error: The language "en" is not known. This is probably because: the file
"/usr/local/lib/aspell-0.60/en.dat" can not be opened for reading.
Did you install the English dictionary? You need to do that, even if you
don't plan to use it as Aspell needs the language data files.

Also you need to specify the name of the dictionary you want to create.
Be sure to prefix it with a "./", otherwise Aspell will try to create it in
"/usr/local/lib/aspell-0.60/".
Post by John P. Hartmann
I'm clearly not doing it right.
On the programmatic front, I can add words to the dictionary by
aspell_speller_add_to_personal. How do I remove a word from the
dictionary?
Sorry, this is currently unsupported.
John P. Hartmann
2010-05-05 09:34:59 UTC
Did you install the English dictionary?  You need to do that, even if you don't plan to use it as Aspell needs the language data files.
My bad. I downloaded it. I could have sworn that I looked for any
kind of instructions, but obviously not.

Pity about not being able to take words out again. I can see that you
can clone.

So I did

j /home/john/aspell: aspell --lang en create master . < wordlist
Warning: The word "people-power" is invalid. The character '-' (U+2D)
may not appear in the middle of a word. Skipping word.
Unhandled Error: The file "/home/john/aspell/dicts/." can not be
opened for writing.
Aborted (core dumped)

Writing on top of a directory is, of course, frowned upon in most circles.

j /home/john/aspell: aspell --lang en create master ./private < wordlist
Warning: The word "people-power" is invalid. The character '-' (U+2D)
may not appear in the middle of a word. Skipping word.

So how do I enter double-barrel words into a dictionary? I guess the
answer is, one doesn't.

j /home/john/aspell: echo CCW|aspell list
Error: The file "/home/john/aspell/private" is not in the proper format.

od -c -tx1:

0000000 a s p e l l d e f a u l t s
61 73 70 65 6c 6c 20 64 65 66 61 75 6c 74 20 73
0000020 p e l l e r r o w l 1 . 1 0
70 65 6c 6c 65 72 20 72 6f 77 6c 20 31 2e 31 30

Guess I messed up again. Now what?

Thanks,

   j.
Kevin Atkinson
2010-05-05 22:26:51 UTC
Post by John P. Hartmann
j /home/john/aspell: aspell --lang en create master ./private < wordlist
Warning: The word "people-power" is invalid. The character '-' (U+2D)
may not appear in the middle of a word. Skipping word.
So how do I enter double-barrel words into a dictionary? I guess the
answer is, one doesn't.
Basically, if you really want it in there, there are ways to get around
this.
Post by John P. Hartmann
j /home/john/aspell: echo CCW|aspell list
Error: The file "/home/john/aspell/private" is not in the proper format.
I fail to see where Aspell is getting /home/john/aspell/private. If it is
a config variable remove it for now and use
aspell list -d ./private
If that doesn't work you might need to rename the dict. to private.rws
John P. Hartmann
2010-05-06 06:42:28 UTC
Post by Kevin Atkinson
Basically, if you really want it in there, there are ways to get around
this.
I do; how, please?
Post by Kevin Atkinson
I fail to see where Aspell is getting /home/john/aspell/private.  If it is a
config variable remove it for now and use
 aspell list -d ./private
If that doesn't work you might need to rename the dict. to private.rws
With this config:

dict-dir /home/john/aspell/dicts
data-dir /home/john/aspell/data/aspell5-en-6.0-0
lang en_GB
home-dir /home/john/aspell
personal private.rws

I get:

j /home/john/aspell: mv private private.rws
j /home/john/aspell: aspell list -d ./private.rws
<hangs, does not loop>

Commenting out personal in the config file:

j /home/john/aspell: aspell list -d ./private.rws
Error: The file "/home/john/aspell/private.rws" is not in the proper format.

Regenerating the dictionary using ./private.rws makes no difference in behaviour.

I then regenerated the dictionary without the hyphenated word and now
it hangs when I comment out personal in the config file and issues the
message when the line is active.

---

About pushing/popping private word lists:

const struct AspellWordList * aspell_speller_personal_word_list(struct
AspellSpeller * ths);

Gets me the word list, which has a string enumeration member. I
suppose that is what needs to be cloned/restored. Except there are no
C API functions that look like they do the job. Any suggestions?

Thanks,

j.
Kevin Atkinson
2010-05-07 22:20:02 UTC
Post by John P. Hartmann
Post by Kevin Atkinson
Basically, if you really want it in there, there are ways to get around
this.
I do; how, please?
You can get Aspell to accept any words when creating a dictionary using
the "--dont-validate-words" command line option. However, you won't be
able to use them unless you use the C API. Basically if you add "foo-bar"
and then have "foo-bar" in a document Aspell will still check for "foo"
and "bar" and never "foo-bar", but if you use the aspell_speller_check or
aspell_speller_suggest things will work as expected.
Post by John P. Hartmann
Post by Kevin Atkinson
I fail to see where Aspell is getting /home/john/aspell/private.  If it is a
config variable remove it for now and use
 aspell list -d ./private
If that doesn't work you might need to rename the dict. to private.rws
dict-dir /home/john/aspell/dicts
data-dir /home/john/aspell/data/aspell5-en-6.0-0
lang en_GB
home-dir /home/john/aspell
personal private.rws
j /home/john/aspell: mv private private.rws
j /home/john/aspell: aspell list -d ./private.rws
<hangs, does not loop>
j /home/john/aspell: aspell list -d ./private.rws
Error: The file "/home/john/aspell/private.rws" is not in the proper format.
Regenerating the dictionary using ./private.rws makes no difference in behaviour.
I then regenerated the dictionary without the hyphenated word and now
it hangs when I comment out personal in the config file and issues the
message when the line is active.
You did not create a personal word list. You created a read-only word
list, thus it should not be used as a personal one. "-d ./private.rws"
might not work because Aspell expected a ".multi" file. Look at the
installed dictionaries for the format. Also see
http://aspell.net/man-html/Using-Multi-Dictionaries.html#Using-Multi-Dictionaries
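
The two kinds of file can be told apart by their first bytes. The od dump earlier in the thread shows the compiled file starting with "aspell default speller rowl 1.10", whereas a personal word list starts with the "personal_ws-1.1" header. The sketch below checks a buffer against those two prefixes. It relies only on the strings seen in this thread, not on any documented guarantee about the file formats, and the names (classify_wordlist, the WL_ constants) are made up for illustration.

```c
#include <assert.h>
#include <string.h>

enum wordlist_kind { WL_PERSONAL, WL_COMPILED, WL_UNKNOWN };

/* Hypothetical helper, not an Aspell function.
 * Classify a word list from the NUL-terminated start of its contents. */
enum wordlist_kind classify_wordlist(const char *head)
{
    /* Prefixes taken from this thread: the personal header line and the
     * text decoded from the od dump of the compiled dictionary. */
    static const char personal_tag[] = "personal_ws-1.1 ";
    static const char compiled_tag[] = "aspell default speller rowl";

    assert(head != NULL);
    if (strncmp(head, personal_tag, sizeof personal_tag - 1) == 0)
        return WL_PERSONAL;   /* plain-text list, usable as "personal" */
    if (strncmp(head, compiled_tag, sizeof compiled_tag - 1) == 0)
        return WL_COMPILED;   /* output of "create master", read-only */
    return WL_UNKNOWN;
}
```

A file that comes back as WL_COMPILED is the read-only kind described above and should not be named in the personal option.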
Post by John P. Hartmann
const struct AspellWordList * aspell_speller_personal_word_list(struct
AspellSpeller * ths);
Gets me the word list, which has a string enumeration member. I
suppose that is what needs to be cloned/restored. Except there are no
C API functions that look like they do the job. Any suggestions?
You lost me.
John P. Hartmann
2010-05-08 10:45:31 UTC
use them unless you use the C API.  Basically if you add "foo-bar" and then
have "foo-bar" in a document Aspell will still check for "foo" and "bar" and
never "foo-bar", but if you use the aspell_speller_check or
aspell_speller_suggest things will work as expected.
Right, I am using the C API.   And I guess I did as you describe
simply by ignoring hyphens.  But when I add a hyphenated word by
aspell_speller_add_to_personal, I get this error

The word "built-in" is invalid. The character '-' (U+2D) may not
appear in the middle of a word.
You lost me.
Can't have that!

A script file (as they are called) can imbed other files that can
imbed yet other files and so on.

For each imbedded script file, I can have an addenda file that
contains words that are considered valid within the script file, but
not elsewhere.  Macro keywords, for example.

Thus, I would like to add a number of words to the personal list for
the duration of an imbedded file and then remove them again.  If there
were an opposite to aspell_speller_add_to_personal, I could run the
words through that, but there isn't.  A push/pop facility seems to be
what supports my requirement best.  That is, words added after a push
are all removed by the matching pop.

Right now, words added dynamically stay in the dictionary for the
duration and can thus mask errors in the rest of the document.

That said, however, I should also thank you for aspell and the API.
I've been spell-check challenged ever since I moved the script files
to the workstation.  The aspell API is an enormous help and I already
use it productively.

  j.
Kevin Atkinson
2010-05-08 22:53:59 UTC
Post by John P. Hartmann
use them unless you use the C API.  Basically if you add "foo-bar" and then
have "foo-bar" in a document Aspell will still check for "foo" and "bar" and
never "foo-bar", but if you use the aspell_speller_check or
aspell_speller_suggest things will work as expected.
Right, I am using the C API.   And I guess I did as you describe
simply by ignoring hyphens.  But when I add a hyphenated word by
aspell_speller_add_to_personal, I get this error
The word "built-in" is invalid. The character '-' (U+2D) may not
appear in the middle of a word.
Did you try setting validate-words to false? You can do this via the C
API.
Post by John P. Hartmann
You lost me.
Can't have that!
A script file (as they are called) can imbed other files that can
imbed yet other files and so on.
For each imbedded script file, I can have an addenda file that
contains words that are considered valid within the script file, but
not elsewhere.  Macro keywords, for example.
Thus, I would like to add a number of words to the personal list for
the duration of an imbedded file and then remove them again.  If there
were an opposite to aspell_speller_add_to_personal, I could run the
words through that, but there isn't.  A push/pop facility seems to be
what supports my requirement best.  That is, words added after a push
are all removed by the matching pop.
Some hints:

The aspell core should be able to handle this but I'm not sure I expose
enough of the internals to be able to do this. You can always
interface directly with Aspell using its native C++ API. But I offer
absolutely no guarantee that I won't break your code between releases.

Aspell can only have 1 personal dictionary, but additional dictionaries
can be loaded by using the add-extra-dicts option.

You might want to look into using the session dictionary which you can
clear by using aspell_speller_clear_session.
John P. Hartmann
2010-05-10 14:39:05 UTC
 The aspell core should be able to handle this but I'm not sure I expose
 enough of the internals to be able to do this.  You can always
 interface directly with Aspell using its native C++ API.  But I offer
 absolutely no guarantee that I won't break your code between releases.
 Aspell can only have 1 personal dictionary, but additional dictionaries
 can be loaded by using the add-extra-dicts option.
 You might want to look into using the session dictionary which you can
 clear by using aspell_speller_clear_session.
I found aspell_string_enumeration_clone and
aspell_string_enumeration_assign, but it looks like they don't do what
one would assume.

Then it finally dawned on me that I might as well maintain my own
addenda dictionaries. A hundred lines later and I am a happy camper.
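
The thread does not show that code, so what follows is only a guess at its general shape: a minimal sketch of a push/pop addenda list kept entirely outside Aspell. Every name in it (struct addenda and the addenda_* functions) is made up for illustration, and MAX_DEPTH is an arbitrary nesting limit. Words are stored as given, so hyphenated entries such as "built-in" need no special handling.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DEPTH 32            /* arbitrary limit on nested embedded files */

struct addenda {
    char **words;               /* words of all open scopes, oldest first */
    size_t count, capacity;
    size_t marks[MAX_DEPTH];    /* value of `count` at each push */
    size_t depth;
};

void addenda_init(struct addenda *a) { *a = (struct addenda){0}; }

/* Open a scope, e.g. on entering an embedded file. -1 if nested too deep. */
int addenda_push(struct addenda *a)
{
    if (a->depth == MAX_DEPTH)
        return -1;
    a->marks[a->depth++] = a->count;
    return 0;
}

/* Add a copy of `word` to the innermost scope. -1 on allocation failure. */
int addenda_add(struct addenda *a, const char *word)
{
    if (a->count == a->capacity) {
        size_t cap = a->capacity ? a->capacity * 2 : 16;
        char **grown = realloc(a->words, cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        a->words = grown;
        a->capacity = cap;
    }
    size_t len = strlen(word) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, word, len);
    a->words[a->count++] = copy;
    return 0;
}

/* Close the innermost scope: drop every word added since the matching push. */
void addenda_pop(struct addenda *a)
{
    assert(a->depth > 0);
    size_t mark = a->marks[--a->depth];
    while (a->count > mark)
        free(a->words[--a->count]);
}

/* Nonzero if `word` is valid in any open scope (linear scan). */
int addenda_contains(const struct addenda *a, const char *word)
{
    for (size_t i = 0; i < a->count; i++)
        if (strcmp(a->words[i], word) == 0)
            return 1;
    return 0;
}

/* Release everything and return to the freshly initialised state. */
void addenda_free(struct addenda *a)
{
    while (a->count > 0)
        free(a->words[--a->count]);
    free(a->words);
    addenda_init(a);
}
```

In use, the application would call addenda_push on entering an embedded file, addenda_add for each word of that file's addenda, and addenda_pop on leaving it. A word is then accepted if addenda_contains finds it, and only otherwise passed to aspell_speller_check, so nothing scoped to one file ever reaches the personal dictionary.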

Thanks for aspell!

Ciarán Ó Duibhín
2010-05-06 23:18:05 UTC
Post by Kevin Atkinson
Post by John P. Hartmann
So how do I enter double-barrel words into a dictionary? I guess the
answer is, one doesn't.
Basically, if you really want it in there, there are ways to get around
this.
I'd be interested in knowing how to do this, as I have a requirement to have
words containing hyphens and apostrophes in my dictionary. I would be using
Aspell as implemented in UltraEdit. At present I have to port my text from
UltraEdit to MSWord for spell-checking because of the above requirement.

Ciarán Ó Duibhín.