L10n:Dictionaries

From MozillaWiki


If we ever want to improve our situation with dictionaries, it would be good to know:

  • How many different dictionary formats there are
  • Which ones different major projects use
  • What dictionaries exist in each
  • Who knows about their copyright status
To be included in the source tree, and therefore in builds shipped by Mozilla, a dictionary needs to have a licence compatible with the MPL 2.0. Examples of compatible licensing schemes include:
  • MPL 2.0
  • MPL 1.1/LGPL 2.1/GPL 2.0 tri-licence (the old Mozilla licence)
  • Apache 2.0
  • BSD or MIT-style licences
  • LGPL
  • Public domain
Use of some of these licences may require adding some text to about:license. Please file a mozilla.org::Licensing bug when including a new dictionary so the licensing team can check this.
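
As a rough illustration of the rule above, the compatibility check can be sketched in a few lines of Python. This is only a sketch, not a Mozilla tool: the licence labels are hypothetical simplified strings, the function name is made up for illustration, and the sketch assumes that a multi-licensed dictionary qualifies as soon as one of the offered licences is on the compatible list. The licensing team still makes the real decision.

```python
# Hypothetical, simplified labels for the compatible licences listed above.
COMPATIBLE_LICENCES = frozenset(
    {"MPL-2.0", "Apache-2.0", "BSD", "MIT", "LGPL", "Public-Domain"}
)


def can_ship_in_tree(offered):
    """Return True if at least one offered licence is on the compatible list.

    Assumption: for a multi-licensed dictionary, one compatible option is enough.
    """
    return any(licence in COMPATIBLE_LICENCES for licence in offered)


# Illustrative entries, with labels simplified by hand from this page.
examples = {
    "Afrikaans": ["LGPL"],
    "Italian": ["GPL"],
    "Romanian": ["GPL", "LGPL", "MPL-1.1"],
}

results = {lang: can_ship_in_tree(offered) for lang, offered in examples.items()}

for lang, ok in results.items():
    print(f"{lang}: {'compatible' if ok else 'not compatible'}")
```

Under these assumptions a GPL-only dictionary is rejected, while a tri-licensed one passes because of its LGPL option.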

Afrikaans

Afrikaans dictionaries are built for MySpell and aspell by translate.org.za. Releases are available at SourceForge. They are released under the LGPL. --Friedelwolff 07:57, 9 Feb 2006 (PST)

Albanian

Albanian has a GPL-licensed dictionary for Aspell, Ispell and MySpell. The project covering them is available at http://www.shkenca.org/k6i/index_sq.html.

Arabic

There are GPL/LGPL/MPL Hunspell dictionaries for Firefox and Thunderbird 3 on the http://ayaspell.sourceforge.net/ site. Nemethl 04:35, 28 January 2008 (PST)

Armenian

Armenian spell checker dictionary User:Armenzg

Asturian

Hunspell dictionary based on Aspell for Firefox (2.0 and above) and Thunderbird (2.0a1 and above), under the Mozilla tri-license. The current version is 0.04 (download).

Basque

MySpell (Firefox 2) and Hunspell (Firefox 3) dictionaries, licensed under GPL: http://www.euskara.euskadi.net/r59-20660/eu/contenidos/informacion/euskarazko_softwarea/eu_9567/xuxen.html.

Breton [br]

Hunspell dictionary provided by the An Drouizig team (http://www.drouizig.org). It is the same dictionary as in OpenOffice.org. Licensed under the Mozilla tri-license.

Bulgarian [bg]

There is only one project dealing with free Bulgarian dictionaries, and its releases can be found in the bgoffice SourceForge project files. Since 2010-04-15, the dictionary has been tri-licensed under GPLv2 (or later) / LGPLv2.1 (or later) / MPLv1.1.

Catalan

Catalan dictionaries are available in Hunspell, MySpell and Aspell formats under the GPL 2+ / LGPL 2.1+ licenses at https://www.softcatala.org/wiki/Corrector_ortogràfic --Toniher (talk) 06:53, 12 November 2013 (PST)

Czech

The only open Czech dictionary I know of is [Ispell], which was converted by the OOo developers to [MySpell]. Both dictionaries are under GPL/LGPL. However, the original Ispell is more or less abandoned and its author doesn't communicate at all. Therefore, our efforts at relicensing failed. --User:Pawell

Update 2017: With l10n.cz, we are trying to find open data for a new dictionary. So far we have found only data under viral copyleft licenses, so we would end up in the same situation we are in now, only the dictionary quality would be slightly better. --User:mstanke

Danish

An old Danish dictionary is available as aspell/ispell/myspell under GPL2. This is used in the add-on on AMO (used because Thunderbird 2 does not support Hunspell).

A newer dictionary is available as Hunspell under the MPL/GPL/LGPL tri-license. It has been part of the Danish build since Firefox 3.5. See Bug 422162.

British English [en-GB]

The [British English dictionary] is released under the LGPL.

Dutch

The Dutch dictionary is available as Hunspell. The project purposely wants to be very open, so that the word list can be used anywhere. Therefore, it is available under BSD and CC-BY.

German

Unfortunately, the German dictionary cannot ship with Firefox; it is available only as add-ons (German, extended for Austria, variant for Switzerland). We use the same dictionary as LibreOffice, which is the igerman98 dictionary from Björn Jacke, licensed under GPL v2/v3 and the OASIS distribution license agreement.

Hebrew

OpenOffice.org has a localized version in Hebrew that uses the Hspell format, which is similar to MySpell. They converted it to MySpell for use with Mozilla, and it's available here. The MySpell version is licensed under the GPL, and I think Hspell is too. --Tsahi

Hungarian

There are new MySpell (Firefox 2) and Hunspell (Firefox 3) add-ons on the Hungarian spelling dictionary site. Unfortunately, the Hungarian add-on 1.1.3 on the Mozilla Add-ons site is broken; see Bug 412386. The Hunspell dictionary is licensed under the MPL, so we would like it to be the default spelling dictionary of the Hungarian localized versions of Mozilla products: Bug 414344. Nemethl 04:25, 28 January 2008 (PST)

+1 Timar 04:40, 28 January 2008 (PST)

Indonesian

The MySpell id_ID dictionary is provided by OOo under GPL v2 [1]. License notes: https://ftp5.gwdg.de/pub/openoffice/contrib/dictionaries/README_id_ID.txt (in Indonesian and English). There are also aspell dictionaries under GPL v2 in the aspell project, made by the same author. The author (Benitius Brevoort) is willing to make the dictionary files from OOo available under GPLv2/LGPLv2.1/MPLv1.1 licenses to be used in Mozilla products. The email conversation and licenses are available at https://www.ewesewes.net/fxspellcheck/permissions.txt. We would like this dictionary to be included in the Firefox 3 release; see Bug 417095. --Rodin

Irish

The free Irish dictionary is available in aspell/ispell/myspell formats, licensed under the GPL. See http://borel.slu.edu/ispell/ --Kscanne

Italian

The free Italian dictionary is available in aspell/myspell/OOo formats and as an .xpi for Mozilla, licensed only under the GPL. We asked about relicensing it, but the answer was negative. See Linguistico project. --Prometeo

Kurdish

The free Kurdish dictionary is available as aspell/ispell/myspell, licensed under the GPL. If necessary, we can license it under something else, too. The URL is https://sourceforge.net/projects/myspellkurdish --Erdal Ronahi 07:00, 21 Feb 2006 (PST)

Latvian

The Latvian myspell dictionary is available under a BSD licence from https://sourceforge.net/projects/openoffice-lv/.

Lithuanian

The Lithuanian ispell/aspell/myspell dictionary is available under a BSD licence from https://files.akl.lt/ispell-lt/.

Macedonian [mk]

The Macedonian dictionary is public domain.

Polish

The free Polish dictionary is available as aspell/ispell/myspell, licensed under the MPL/GPL/LGPL and CC-SA: https://www.sjp.pl/

Other projects using this dictionary:

Marcoos 10:04, 19 July 2006 (PDT)

Portuguese

The base dictionaries come from Project Natura's myspell dictionary. For Firefox 2 the dictionaries were licensed GPL2/BSD. In the meantime the license changed to GPL3. They've offered a triple-licensed (GPL2+, LGPL2.1+, MPL1.1) version for Firefox.

Dictionaries: https://natura.di.uminho.pt/natura/natura?&topic=Dicion%E1rios

Joao Miguel Neves 15:51, 12 September 2006 (WET)

Romanian

The Hunspell dictionaries are available at http://rospell.sourceforge.net/index.html. There is also a Google group at https://groups.google.com/forum/#!forum/rospell. The dictionaries are released under the Mozilla tri-license: GPL 2.0/LGPL 2.1/MPL 1.1.

--alexxed 10:44, 3 May 2008 (PDT)

Russian

There are three Russian MySpell dictionaries in OpenOffice (all of them were converted from ispell format). Two of them are licensed under the BSD License (see ftp://scon155.phys.msu.ru/pub/russian/ispell/LICENSE) and one is licensed under GPL/LGPL/MPL/CC-SA. --Unghost

Slovak

The open Slovak dictionaries are maintained by the [sk-spell] project. Both the ispell and aspell dictionaries are released under GPL/LGPL. The myspell dictionary is licensed under GPL / LGPL / MPL. --User:Branor

Serbian

The Serbian dictionary is made with Myspell. It was produced by Igor Miletić (Игор Милетић), based on the Serbian Aspell dictionary made by Goran Rakić (Горан Ракић). See Serbian Aspell. The dictionary is released under MPL, GPL, and LGPL, the proof of which is given in the LICENSE.txt file of the localized spell checker directory. filmil

Swedish [sv-SE]

A Swedish Hunspell dictionary made by Göran Andersson is available from AMO. The dictionary is released under LGPL 3.0. (Hasse, Lakrits)

Thai [th]

Thai Hunspell dictionary (2006-12-12) - zipped, from OpenOffice.org; RPM, from Fedora 8.

Distributed under the LGPL (as stated in the RPM).

Maintained by Sila Chunwijitra <hin # opentle org>.

Words in the dictionary are partially from NECTEC's Lexitron dictionary (license), with geographical names added and words containing Mai Yamok (repetition mark) removed.

Ukrainian [uk]

There is a project which produces myspell/aspell/ispell Ukrainian dictionaries from a single source. More information can be found on the wiki page of the Knowledge base for Ukrainian language. Downloads, CVS etc. can be found on SourceForge. These spelling dictionaries are distributed under GPL, LGPL, and MPL.

Vietnamese

There's currently an Aspell dictionary for Vietnamese that is licensed under the LGPL. [2] I've been contacted by someone creating a spellchecker for a Vietnamese distribution of OpenOffice.org (Hunspell?). A demo of that spellchecker is available. [3] I haven't tried either, but I'd imagine creating a Vietnamese spell checker is a bit more straightforward than an English one, say. – Minh Nguyễn (talk, contribs) 01:25, 7 June 2007 (PDT)

There's now a Hunspell dictionary for Vietnamese, licensed under GPL v3. – Minh Nguyễn (talk, contribs) 16:36, 21 May 2008 (PDT)

I've packaged the dictionary as an extension, Vietnamese Dictionary. – Minh Nguyễn (talk, contribs) 22:51, 17 June 2008 (PDT)

The Hunspell dictionaries are now bundled with the Vietnamese localizations of Firefox 3.5 and above and Thunderbird 3.0 and above. – Minh Nguyễn (talk, contribs) 02:16, 5 July 2009 (UTC)

The Vietnamese Hunspell dictionary project has moved from Google Code to GitHub. – Minh Nguyễn (talk, contribs) 04:21, 10 November 2014 (PST)

The dictionary has been removed from the Vietnamese repository (see bug 1912392). As stated at the top of the page, dictionaries released only under the GPL are not suitable.

Walloon [wa]

There is a Walloon dictionary that is still being enriched. Its main target is aspell, but a myspell version is easily built from it (there is a "make myspell" rule). There is also a rule to make an XPI for Firefox, and a (quite crude) rule to make a myspell dictionary for OOo (copying the files to the local directory tree, ouch :) ).

It can be downloaded from http://chanae.walon.org/walon/aspell.php; the license is LGPL. -- Srtxg

Zulu

Provisional Zulu dictionaries are built for MySpell and aspell by translate.org.za. Releases are available at SourceForge. They are released under the LGPL. --Friedelwolff 07:57, 9 Feb 2006 (PST)