Topic: Transifex Translation Memory Fill-Up

Nordfriese

Topic Opener

Joined: 2017-01-17, 17:07
Posts: 1955
OS: Debian Testing
Version: Latest master
Ranking

One Elder of Players
Location: 0x55555d3a34c0

Posted at: 2022-05-21, 08:29

Transifex offers an option to "automatically translate phrases with exact matches from the Translation Memory" when a source string is updated. This reduces work when the same string appears in multiple contexts, but may also lead to incorrect autogenerated translations in case a string needs to be disambiguated in a target language but not in English. Should we enable this option, or is it better to leave it off?

tothxa

Joined: 2021-03-24, 11:44
Posts: 439
OS: antix / Debian
Version: some new PR I'm testing
Ranking

Tribe Member

Posted at: 2022-05-22, 12:36

Can we get some statistics on how much it would fill in without actually enabling it? Does Transifex mark these auto-filled translations? (I once uploaded some offline translations with such draft translator remarks, and Transifex just ignored them.) Between different source files it may be less risky, but IMO different contexts should be respected and manually reviewed.

I think a bigger issue is when there is a minor fix in the English text, especially if it's just punctuation, a typo or English-specific grammar that doesn't require changing translations that were done properly. gettext and Transifex handle these as if they were completely new strings, so the translation is lost and must be manually restored from translation memory. But of course these are better checked manually as well, because there's no automatic way to tell whether a small change requires a change in a given translation. Again, ideally these should be marked as minor changes that need reexamination.
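
To illustrate what "marking as a minor change" could look like, here is a rough sketch that flags an updated source string as a probable minor edit using Python's standard `difflib`. The function name `is_minor_change` and the 0.9 threshold are assumptions made for this example; neither gettext nor Transifex is claimed to work this way, and a flagged string would still need manual review.

```python
from difflib import SequenceMatcher


def is_minor_change(old: str, new: str, threshold: float = 0.9) -> bool:
    """Return True if `new` differs from `old` only slightly.

    The threshold is an arbitrary value chosen for illustration.
    """
    if old == new:
        return False  # nothing changed, so nothing to re-examine
    similarity = SequenceMatcher(None, old, new).ratio()
    return similarity >= threshold


# A punctuation-only fix is flagged as minor ...
print(is_minor_change("Build a warehouse", "Build a warehouse."))
# ... while a rewritten string is not.
print(is_minor_change("Build a warehouse", "Attack the enemy"))
```

Strings flagged this way could keep their old translation as a suggestion while being queued for a second look.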

Nordfriese

Topic Opener

Joined: 2017-01-17, 17:07
Posts: 1955
OS: Debian Testing
Version: Latest master
Ranking

One Elder of Players
Location: 0x55555d3a34c0

Posted at: 2022-05-22, 13:30

Transifex doesn't offer such statistics AFAIK, so I wrote a quick program (attached) to gather these stats. Plural forms and contexts are ignored for the sake of simplicity. The numbers of duplicate strings that would be filled in per language in current master are:

Language	Translated	Untranslated	Duplicates
ar	1330	6320	243
bg	3988	3662	289
br	866	6784	115
ca	7650	0	0
cs	7242	408	6
da	5578	2072	190
de	7645	5	0
el	2175	5475	220
en_GB	3585	4065	290
en_US	63	7587	6
eo	1247	6403	236
es	6447	1203	64
eu	331	7319	45
fa	93	7557	16
fi	7461	189	13
fr	7234	416	9
fy	1067	6583	210
ga	29	7621	0
gd	6592	1058	81
gl	1080	6570	185
he	308	7342	94
hi	33	7617	4
hr	741	6909	150
hu	7483	167	6
id	183	7467	48
ig	79	7571	24
it	5023	2627	140
ja	4537	3113	193
ka	32	7618	0
ko	6827	823	36
krl	75	7575	11
la	950	6700	202
lt	601	7049	72
ms	1335	6315	159
nb	2058	5592	229
nds	7645	5	0
nl	6030	1620	160
nn	747	6903	138
pl	5678	1972	227
pt	3959	3691	232
pt_BR	2211	5439	207
ro	271	7379	66
ru	7397	253	1
sk	3657	3993	244
sl	1285	6365	129
sr	206	7444	30
sr_RS	29	7621	0
sv	4600	3050	336
tr	621	7029	114
uk	932	6718	217
zh_CN	3373	4277	191
zh_TW	282	7368	62
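
For readers who don't want to open the attachment, the counting idea behind the table can be sketched as follows. This is not the attached Java program, just a minimal Python approximation under the same simplification (contexts and plural forms ignored); the function name and the sample entries are made up for illustration.

```python
from typing import Iterable, Tuple


def count_fillable_duplicates(
    entries: Iterable[Tuple[str, str]],
) -> Tuple[int, int, int]:
    """Return (translated, untranslated, duplicates) for one language.

    `entries` holds (source, translation) pairs; an empty translation
    means the string is untranslated.
    """
    entries = list(entries)
    # Translation memory: every source string translated somewhere.
    memory = {src for src, tr in entries if tr}
    translated = sum(1 for _, tr in entries if tr)
    untranslated = len(entries) - translated
    # An untranslated entry is fillable if its exact source text is in memory.
    duplicates = sum(1 for src, tr in entries if not tr and src in memory)
    return translated, untranslated, duplicates


sample = [
    ("Soldier", "Soldat"),  # translated in one resource
    ("Soldier", ""),        # same source text, untranslated elsewhere
    ("Warehouse", ""),      # no exact match in the memory
]
print(count_fillable_duplicates(sample))  # (1, 2, 1)
```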

This Transifex option is a simple on/off switch; we can't customise it to differentiate by resource or context.

There's unfortunately no way to treat two source strings as "similar" in gettext AFAIK.

Edited: 2022-05-22, 13:32

Attachment:

DuplicateTranslationsSearcher.java.txt (2.6 KB)

hessenfarmer

Joined: 2014-12-11, 22:16
Posts: 2656
Ranking

One Elder of Players
Location: Bavaria

Posted at: 2022-05-22, 15:00

Using this option would be problematic for all soldier-related translations. In English they are all "soldiers", but in German, for example, we have "Soldat" for Empire and Atlanteans, "Krieger" for Barbarians and Frisians, and "Kriegerin" for Amazons.
I would really prefer to stick with manual checks and use the suggested translation memory strings in these cases.
IIRC we made these soldier strings translatable in different ways by using pgettext; before that they were treated as the same string with multiple occurrences.
So AFAIK 100% matching strings are translated the same as long as they are not separated by some function like pgettext.
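
A small sketch of why this matters, assuming a toy catalog rather than the real Widelands translation files: with a context, the same English string maps to different German words, while a memory keyed on the source text alone can only hold one of them. The context names (`empire`, `barbarians`, `amazons`) and both helper functions are hypothetical; the `\x04` separator is the one GNU gettext uses to join context and message ID in compiled catalogs.

```python
EOT = "\x04"  # GNU gettext joins msgctxt and msgid with this character

# Toy German catalog; the context names are invented for this example.
catalog_de = {
    "empire" + EOT + "Soldier": "Soldat",
    "barbarians" + EOT + "Soldier": "Krieger",
    "amazons" + EOT + "Soldier": "Kriegerin",
}


def pgettext(context: str, msgid: str) -> str:
    """Context-aware lookup; falls back to the source string."""
    return catalog_de.get(context + EOT + msgid, msgid)


def memory_fill(msgid: str) -> str:
    """Exact-match fill that ignores context: the first match wins."""
    for key, translation in catalog_de.items():
        if key.split(EOT, 1)[1] == msgid:
            return translation
    return msgid


print(pgettext("amazons", "Soldier"))  # Kriegerin
print(memory_fill("Soldier"))          # Soldat, whatever the tribe
```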

tothxa

Joined: 2021-03-24, 11:44
Posts: 439
OS: antix / Debian
Version: some new PR I'm testing
Ranking

Tribe Member

Posted at: 2022-05-22, 16:17

300 strings, making up 4% of the total, is indeed in the range where it is annoying to some, but for others not annoying enough to readily give up manual checking. I'm in the latter camp.
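
For reference, the rough 4% figure can be checked against the table above, where translated plus untranslated adds up to 7650 strings in every row. The helper name `duplicate_share` is just for this example; the duplicate counts are copied from the table.

```python
TOTAL_STRINGS = 7650  # translated + untranslated, identical for every language


def duplicate_share(duplicates: int, total: int = TOTAL_STRINGS) -> float:
    """Return the share of fillable duplicates in percent, one decimal."""
    return round(duplicates / total * 100, 1)


# Three of the highest duplicate counts from the table, plus the round figure.
for label, count in [("sv", 336), ("en_GB", 290), ("bg", 289), ("~300", 300)]:
    print(f"{label}: {duplicate_share(count)}%")
```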
