by Miro Romih, Director
With Slovenia's integration into Europe and the globalization of the
world's communication networks, language technologies are becoming
more important by the day. Large text corpora, electronic dictionaries,
translation support programs, systems for speech recognition and
synthesis, and programs for detecting grammatical errors in texts
are all fields that are flourishing as computers grow in speed and
power. Some years ago, Amebis decided to get involved in the
development of this field and to keep pace with the world by
cooperating with other groups in Slovenia and abroad.
Amebis d.o.o. was established in mid-1991. Its first major project
was the BesAna (word analysis) program, the first Slovene grammar
checker, which detected errors in Slovene texts written on a computer.
The program spread relatively quickly and, at the same time, served
as an important bridge between the company and its users. The need
for a smaller, simpler version of the program soon became apparent.
This home version was called micro-BesAna (micro word analysis). It
detected only spelling errors. Both micro-BesAna and BesAna were DOS
programs.
At that time, the market share of the Windows (3.1) operating system
was increasing fast. The first serious text editors appeared for
Windows, and with them the need for a spelling checker in a graphical
environment. In 1993, BesAna for Windows was created. Since then,
we have developed a new version of BesAna for every new version
of Word or WordPerfect. It is adapted to each editor, and the
dictionary is constantly being expanded.
Peter Holozan, Matej Pivec
At the same time, we began to cooperate with DZS d.d., the biggest
publisher of dictionaries in Slovenia. Various tools were required
for computer-based data preparation and error checking in
dictionaries, lexicons and encyclopedias. The tool collection
gradually grew in size and quality, and it made it possible for us
to handle large language databases quickly and effectively.
In addition to preparing paper dictionaries by computer, we began
to consider electronic versions of dictionaries. Since no suitable
tool for fast searching of large databases on a PC was available,
we decided to build our own system, ASP (Amebis database). In ASP,
we brought together all our experience and our users' wishes and
needs. In a single program (the ASP viewer), which initially ran
under DOS and Windows, we combined simplicity, speed of use and a
high level of data compression, which was very important in the age
of floppy disk installations. One of the special features of the ASP
system was that we were able to put different types of databases
(dictionaries, registers, catalogues, business relation databases,
texts) into the same ASP format. All these databases have very
different structures, yet the various collections can be read in
exactly the same way with a single program.
In order to spread the program among users, we had to provide
interesting collections. The first language collection in ASP
format was the Large German-Slovene dictionary of the DZS d.d.
publishing company, which was published in mid-1994. The user
response was very favourable, which gave further impetus to the
development of the ASP system. We also prepared a few small, useful
demonstration collections (Postal codes, Irregular English verbs...).
In view of the good sales of the Large German-Slovene dictionary,
we decided, together with DZS d.d., to move other dictionaries into
electronic form, too. The next dictionary in ASP format, similar in
appearance, was the Slovene-English dictionary. At the end of 1996
and the beginning of 1997, the next three electronic dictionaries
were published: the Dictionary of Slovene literary language, the
Great English-Slovene dictionary and the Great Slovene-German
dictionary. We are also planning new electronic dictionary editions,
which will also be available on CD-ROM with the new 32-bit ASP viewer.
During this period, we completed a number of other projects based
on the ASP system. A particularly attractive one was the CD-ROM
Region lexicon of Slovenia, a product of cooperation between the
Geographic institute ZRC SAZU, DZS d.d. and Amebis d.o.o. In contrast
to other collections, its data included pictures and photos, which
added to the attractiveness of the CD-ROM.
Iztok Grilc, Ales Veluscek
Simultaneously with the development of the ASP system, we continued
developing language modules. We developed various language modules
for Microsoft, for the Slovenian version of Microsoft Office. For
the Slovene market, we prepared a spelling checker, a hyphenation
module and a thesaurus. A grammar module is still under development.
For the Serbian-speaking market, we developed, together with our
Serbian partner, a spelling checker and a hyphenation module, which
are built into the Serbian version of Microsoft Office. We have also
made a Slovenian spelling checker for Corel and its WordPerfect editor.
In the development of language modules, above all the grammar
module, it became clear that we need a large number of sentences,
so we started to collect texts and build our own corpus.
The structure and functions of the corpus are based on our own
development system, ABIS, which provides not only storage and fast
data searching but also automatic execution of defined operations
on texts. We also cooperate in other corpus projects, both domestic
and foreign (FIDA, Copernicus MULTEXT-EAST).
We are also developing an acoustic interface. Taking advantage of
the fact that two very strong groups are already working in the
field of voice recognition, we have started to work on generating
the human voice. Primarily for the needs of the blind and visually
impaired, we are making a voice generation module for Windows, which
will be integrated into two specific applications.
We have recently started work on a hot new topic: the Internet. Among
other things, we have made an Internet presentation for the Chamber
of Craft of Slovenia, using an Internet version of ASP. With its
help, users can search very quickly through large databases such as
the Craft register OZS.
In the future, we plan to continue developing language tools,
especially in new fields such as computer translation support
systems. We will continue to devote attention to Internet
applications, the introduction of language modules, the expansion
into new ASP collections (new dictionaries, lexicons, books) and
multimedia applications.