Topic: Transifex Translation Memory Fill-Up
Nordfriese Topic Opener |
Posted at: 2022-05-21, 09:29
Transifex offers an option to "automatically translate phrases with exact matches from the Translation Memory" when a source string is updated. This reduces work when the same string appears in multiple contexts, but it can also produce incorrect autogenerated translations when a string needs to be disambiguated in a target language but not in English. Should we enable this option, or is it better to leave it off?
tothxa |
Posted at: 2022-05-22, 13:36
Can we get some statistics on how much it would fill in without actually doing it? Does Transifex mark these auto-filled translations? (I once uploaded some offline translations with such draft translator remarks, and Transifex just ignored them.) Between different source files it may be less risky, but IMO different contexts should be respected and manually reviewed.
I think a bigger issue is when there is a minor fix in the English text, especially if it's just punctuation, a typo, or English-specific grammar that doesn't require changing translations that were done properly. These are handled by gettext and Transifex as if they were completely new strings, so the translation is lost and must be manually restored from the translation memory.
But of course these are better checked manually as well, because there's no automatic way to tell whether a small change requires a change in a given translation. Again, ideally these should be marked as minor changes that need reexamination.
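To illustrate the "mark minor changes for reexamination" idea, here is a minimal sketch that pairs changed source strings with near-identical old ones instead of treating them as new. It is not a Transifex or gettext feature; the function names, the dictionary layout, and the 0.9 similarity threshold are all assumptions made for the example, and `<existing translation>` is a placeholder.

```python
from difflib import SequenceMatcher


def is_minor_change(old_source, new_source, threshold=0.9):
    """Return True if the two source strings differ, but only slightly.

    The threshold is an arbitrary assumption and would need tuning.
    """
    if old_source == new_source:
        return False
    ratio = SequenceMatcher(None, old_source, new_source).ratio()
    return ratio >= threshold


def needs_reexamination(old_catalog, new_sources, threshold=0.9):
    """Map each changed source string to near-matching old (source, translation) pairs.

    old_catalog: dict of old source string -> existing translation (hypothetical layout).
    new_sources: iterable of source strings in the updated catalog.
    """
    flagged = {}
    for new in new_sources:
        if new in old_catalog:
            continue  # unchanged string, translation is still valid
        candidates = [
            (old, translation)
            for old, translation in old_catalog.items()
            if is_minor_change(old, new, threshold)
        ]
        if candidates:
            flagged[new] = candidates
    return flagged


old = {"Build a warehouse": "<existing translation>"}
print(needs_reexamination(old, ["Build a warehouse.", "Attack the enemy"]))
```

The result only suggests candidates; a translator would still have to confirm whether the old translation can be kept.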
Nordfriese Topic Opener |
Posted at: 2022-05-22, 14:30
Transifex doesn't offer such statistics AFAIK, so I wrote a quick program (attached) to gather these stats. Plural forms and contexts are ignored for the sake of simplicity. The number of duplicate strings that would be filled in per language in current master are:
This Transifex option can only be switched on or off; we can't customise it to differentiate by resource or context. Unfortunately, there's no way to treat two source strings as "similar" in gettext AFAIK.
Edited: 2022-05-22, 14:32
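For readers without the attachment, a count of this kind could be sketched roughly as follows. This is not the attached program: the function name and the `(msgid, msgstr)` input layout are assumptions, and like the description above it ignores plural forms and contexts. It also reports how many fillable strings have more than one existing translation, which is where an exact-match fill-up could pick the wrong one.

```python
from collections import defaultdict


def count_tm_fillable(entries):
    """Count untranslated entries that an exact-match fill-up could translate.

    entries: iterable of (msgid, msgstr) pairs for one language, with all
    resources combined; an empty msgstr means "untranslated".
    Returns (fillable, ambiguous), where ambiguous counts fillable entries
    whose msgid already has more than one distinct translation.
    """
    entries = list(entries)

    # Collect every distinct existing translation per source string.
    known = defaultdict(set)
    for msgid, msgstr in entries:
        if msgstr:
            known[msgid].add(msgstr)

    fillable = 0
    ambiguous = 0
    for msgid, msgstr in entries:
        if not msgstr and msgid in known:
            fillable += 1
            if len(known[msgid]) > 1:
                ambiguous += 1
    return fillable, ambiguous


sample = [
    ("Soldier", "Soldat"),
    ("Soldier", "Krieger"),
    ("Soldier", ""),   # fillable, but two candidate translations exist
    ("Attack", ""),    # no existing translation, cannot be filled
]
print(count_tm_fillable(sample))
```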
hessenfarmer |
Posted at: 2022-05-22, 16:00
Using this option would be problematic for all soldier-related translations. In English they are all soldiers, but in German, for example, we have "Soldat" for Empire and Atlanteans, "Krieger" for Barbarians and Frisians, and "Kriegerin" for Amazons.
tothxa |
Posted at: 2022-05-22, 17:17
300 strings, making up 4% of the total, is indeed in the region that is annoying to some, but not annoying enough for others to readily give up manual checking. I'm in the latter camp.