New keyed translations format

Started by ison, September 04, 2018, 08:06:39 AM

ison

New keyed translations format
(this post describes a system which will be used in the upcoming unstable build)

We highly recommend running the translation cleaning tool before changing the keyed translations! This will update all "EN: " comments with the up-to-date English texts.

We've changed the keyed translations format a bit, but it's backward-compatible, which means everything should work the same as before even if you don't change anything.

Most keyed translations now have extra symbols you can use, such as the definite and indefinite forms of an item's label. You can access them by writing, for example, {0_X}, where X is the name of the sub-symbol. You can find the list of all available sub-symbols at the end of this post.

For example, if {0} refers to a person, you can now write:
{0_definite} took {0_possessive} backpack.
which will be resolved to: John took his backpack.
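
To make the substitution concrete, here is a minimal Python sketch of how such sub-symbols could be resolved. It is only an illustration, not the game's actual code: the JOHN dictionary, the SYMBOL pattern and the resolve() function are all hypothetical names invented for this example.

```python
import re

# Hypothetical data for one pawn. The keys mirror sub-symbol names from this post;
# the empty key stands for the plain {0} form.
JOHN = {
    "": "John",
    "definite": "John",
    "indefinite": "John",
    "pronoun": "he",
    "possessive": "his",
    "objective": "him",
}

# Matches {0}, {0_definite}, {PAWN_possessive} and so on.
SYMBOL = re.compile(r"\{([A-Za-z0-9]+)(?:_([A-Za-z]+))?\}")

def resolve(template, args):
    """Replace every {NAME} or {NAME_sub} with the matching entry from args."""
    def replace(match):
        name, sub = match.group(1), match.group(2) or ""
        return args[name][sub]
    return SYMBOL.sub(replace, template)

result = resolve("{0_definite} took {0_possessive} backpack.", {"0": JOHN})
print(result)  # John took his backpack.
```

The same function handles named symbols, e.g. resolve("{PAWN_definite}", {"PAWN": JOHN}).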

This flexibility may or may not be useful in your case, depending on what the sentence looks like in your language. Sometimes it's enough to use exactly the same symbols as in English, and sometimes you may want to use more.

Some symbols now have a name, but using numbers is still valid. For example, an English translation may contain text like this: {PAWN_definite}.

There used to be a problem with definite and indefinite articles in some languages. This problem is now fixed for the following languages: German, French, Korean, Russian, Italian, Spanish. So if you used to use e.g. "Un(e)" in French, you now may want to change it to {0_indefinite} (0 is an example here) which should resolve to either "un" or "une" based on the pawn's gender. If your language has definite and indefinite articles but they don't work, then please let us know so we can implement them.

Some languages use different words for each gender. In English this is solved by using {0_pronoun}, {0_objective}, and {0_possessive}. Some languages have many more pronouns though. This can now be solved by using a conditional expression like this: {PAWN_gender ? A : B}. This will resolve to either A or B based on the pawn's gender. If your language uses 3 genders, you can use {PAWN_gender ? A : B : C} (A for male, B for female, and C for neuter). It's possible to use this syntax to resolve indefinite and definite articles, e.g. in French {PAWN_gender ? un : une}, however we don't recommend doing this. In this case it's better to use {PAWN_indefinite}.
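
As an illustration of the rule described above, here is a rough Python sketch of how a gender conditional could be evaluated. It does not try to mimic the game's real parser; CONDITIONAL and resolve_gender() are hypothetical names, and the genders are passed in as a plain dictionary. The two-branch form follows the symbol list at the end of this post (A if male or neuter, B if female).

```python
import re

# Matches {NAME_gender ? A : B} and {NAME_gender ? A : B : C}.
CONDITIONAL = re.compile(r"\{(\w+)_gender\s*\?([^{}]*)\}")

def resolve_gender(template, genders):
    """Pick a branch per conditional; genders maps symbol names to 'male', 'female' or 'neuter'."""
    def replace(match):
        gender = genders[match.group(1)]
        branches = [branch.strip() for branch in match.group(2).split(":")]
        if gender == "female":
            return branches[1]
        if gender == "neuter" and len(branches) > 2:
            return branches[2]
        # Male, or neuter when only two branches were given.
        return branches[0]
    return CONDITIONAL.sub(replace, template)

print(resolve_gender("{PAWN_gender ? un : une}", {"PAWN": "female"}))  # une
```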

Our system can easily tell the gender of a pawn, but it can't tell the gender of any other object. This means that if you want to use a gender conditional expression on, for example, items, then you need to provide some lookup tables by creating the following files in your language folder:
WordInfo/Gender/Male.txt
WordInfo/Gender/Female.txt
WordInfo/Gender/Neuter.txt
and add one word per line to each file, e.g.: (Male.txt)
sword
computer
keyboard


then if an object whose label is "sword" is passed, its _gender will be male.

This makes it possible to do this:
{THING_label} is {THING_gender ? beautiful1 : beautiful2}
where "beautiful" is different based on the THING's gender (which happens in some languages).
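
The lookup described above can be pictured with the following Python sketch. It is an assumption-laden illustration, not the game's loader: load_gender_table() and gender_of() are hypothetical helpers, the labels are matched exactly as written, the fallback to "male" for unlisted words is a choice made for this sketch, and the "lamp" entry is invented sample data.

```python
import tempfile
from pathlib import Path

def load_gender_table(language_dir):
    """Read WordInfo/Gender/{Male,Female,Neuter}.txt into a word -> gender dict."""
    table = {}
    for gender in ("Male", "Female", "Neuter"):
        path = Path(language_dir) / "WordInfo" / "Gender" / f"{gender}.txt"
        if not path.exists():
            continue
        for line in path.read_text(encoding="utf-8").splitlines():
            word = line.strip()
            if word:  # skip blank lines
                table[word] = gender.lower()
    return table

def gender_of(label, table):
    # Assumption of this sketch: unlisted labels fall back to "male".
    return table.get(label, "male")

# Build a throwaway language folder so the example runs on its own.
with tempfile.TemporaryDirectory() as lang_dir:
    gender_dir = Path(lang_dir) / "WordInfo" / "Gender"
    gender_dir.mkdir(parents=True)
    (gender_dir / "Male.txt").write_text("sword\ncomputer\nkeyboard\n", encoding="utf-8")
    (gender_dir / "Female.txt").write_text("lamp\n", encoding="utf-8")  # invented entry
    table = load_gender_table(lang_dir)

print(gender_of("sword", table))  # male
```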

Note that we now use {} everywhere instead of []. The RulePackDef system (used by art descriptions and name generators) still uses [] though. Even though the RulePackDef system works similarly, the two are separate systems.

List of available symbols for Pawns:
{0}: John, a cat, a dog (short name if possible, otherwise indefinite form of the pawn kind)
{0_nameFull}: John Doe, a cat, a dog (full name if possible, otherwise indefinite form of the pawn kind)
{0_nameFullDef}: John Doe, the cat, the dog (full name if possible, otherwise definite form of the pawn kind)
{0_label}: John, Constructor; cat; dog (label (name + title))
{0_labelShort}: John, cat, dog (short name if possible, otherwise kind label)
{0_definite}: John, the cat, the dog (short name if possible, otherwise definite form of the pawn kind)
{0_nameDef}: John, the cat, the dog (short name if possible, otherwise definite form of the pawn kind)
{0_indefinite}: John, a cat, a dog (short name if possible, otherwise indefinite form of the pawn kind)
{0_nameIndef}: John, a cat, a dog (short name if possible, otherwise indefinite form of the pawn kind)
{0_pronoun}: he/she
{0_possessive}: his/her
{0_objective}: him/her
{0_factionName}: Some Community (faction name)
{0_factionPawnSingular}: colonist (faction member label)
{0_factionPawnSingularDef}: the colonist (faction member label (definite))
{0_factionPawnSingularIndef}: a colonist (faction member label (indefinite))
{0_factionPawnPlural}: colonists (faction members label)
{0_factionPawnPluralDef}: the colonists (faction members label, definite)
{0_factionPawnPluralIndef}: colonists (faction members label, indefinite)
{0_kind}: human, cat, dog (kind label)
{0_kindDef}: the human, the cat, the dog (kind label (definite))
{0_kindIndef}: a human, a cat, a dog (kind label (indefinite))
{0_kindPlural}: humans, cats, dogs (kind label (plural))
{0_kindPluralDef}: the humans, the cats, the dogs (kind label (plural), definite)
{0_kindPluralIndef}: humans, cats, dogs (kind label (plural), indefinite)
{0_kindBase}: human, cat, dog, deer (instead of buck/doe) (base genderless kind label)
{0_kindBaseDef}: the human, the cat, the dog, the deer (base genderless kind label (definite))
{0_kindBaseIndef}: a human, a cat, a dog, a deer (base genderless kind label (indefinite))
{0_kindBasePlural}: humans, cats, dogs, deer (base genderless kind label (plural))
{0_kindBasePluralDef}: the humans, the cats, the dogs, the deer (base genderless kind label (plural), definite)
{0_kindBasePluralIndef}: humans, cats, dogs, deer (base genderless kind label (plural), indefinite)
{0_lifeStage}: teenager, baby, adult (life stage label)
{0_lifeStageDef}: the teenager, the baby, the adult (life stage label (definite))
{0_lifeStageIndef}: a teenager, a baby, an adult (life stage label (indefinite))
{0_lifeStageAdjective}: teenage, baby, adult
{0_title}: constructor (pawn title)
{0_titleDef}: the constructor (pawn title (definite))
{0_titleIndef}: a constructor (pawn title (indefinite))
{0_gender}: male, female (pawn gender)
{0_gender ? A : B}: A if male or neuter, B if female
{0_gender ? A : B : C}: A if male, B if female, C if neuter
{0_humanlike ? A : B}: A if humanlike, B if not

k2ymg


Tynan

It doesn't seem to work for pawn traits. We should look at that.

Original report:
Quote
When trying to use {PAWN_gender ? A : B} in a label translation, it doesn't work. However, when used in the description translation, it does. Why is this?

The reason we are trying to use {PAWN_gender ? A : B} in a label is that some adjectives (i.e. traits) differ based on the gender of the pawn they belong to. So an abrasive male is "Diretto", but an abrasive female is "Diretta". I was wondering if I might be doing something wrong.

<!-- EN: abrasive -->
<Abrasive.degreeDatas.abrasive.label>{PAWN_gender ? Diretto : Diretta}</Abrasive.degreeDatas.abrasive.label>

k2ymg

I see. I'll try it once it's supported. It seems useful in cases like translating 'courtesan' as 'male prostitute' or 'female prostitute'. Thanks.

Elevator

It would be great if conditional formatting worked with ThoughtDef objects. Translators could then write:
<DivorcedMe.stages.divorced_me.label>{0_gender ? развёлся : развелась} со мной</DivorcedMe.stages.divorced_me.label>
instead of:
<DivorcedMe.stages.divorced_me.label>развелся(ась) со мной</DivorcedMe.stages.divorced_me.label>
This would give cleaner thought messages and help improve translations into different languages.

Elevator

It would also be great if conditional expressions worked with whitespace.

Currently
{PAWN_gender ? ваш бывший поселенец : ваша бывшая поселенка}
is transformed into "вашбывшийпоселенец", which doesn't look good.

One possible solution would be to use quotation marks like this:
{PAWN_gender ? "ваш бывший поселенец" : "ваша бывшая поселенка"}
so that the analyzer treats the text between quotation marks as a single string.

I know that I can write:
{PAWN_gender ? ваш : ваша} {PAWN_gender ? бывший : бывшая } {PAWN_gender ? поселенец : поселенка}
but that doesn't seem convenient at all.
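
The quotation-mark idea suggested in this post could be sketched roughly as follows in Python. This is only an illustration of the proposal, not how the game's analyzer works; split_branches() is a hypothetical helper that splits the part after the "?" into branches while keeping quoted text (including spaces and colons) intact.

```python
def split_branches(body):
    """Split 'A : B : C' on colons, except colons inside double quotes.

    The quotes only delimit a branch and are not kept in the result.
    Whitespace around each branch is trimmed; whitespace inside it is preserved.
    """
    branches, current, in_quotes = [], [], False
    for ch in body:
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == ":" and not in_quotes:
            branches.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    branches.append("".join(current).strip())
    return branches

branches = split_branches(' "ваш бывший поселенец" : "ваша бывшая поселенка"')
print(branches)  # ['ваш бывший поселенец', 'ваша бывшая поселенка']
```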

morticinus

Hello,
This helped greatly for Keyed and backstories (thank you, Ison), but could you also implement it for the DefInjected folders RULEPACK, TALE and INTERACTION?

I need this, for example:

original template:
[INITIATOR_nameDef] asked [RECIPIENT_nameDef] to join.

translated:
[INITIATOR_nameDef] se {INITIATOR_gender ? zeptal : zeptala} [RECIPIENT_nameDef], zda se nechce připojit.


Thank you

Harkeidos

Quote from: ison on September 04, 2018, 08:06:39 AM
Our system can easily tell the gender of a pawn, but it can't tell the gender of any other object. This means that if you want to use a gender conditional expression on, for example, items, then you need to provide some lookup tables by creating the following files in your language folder:
WordInfo/Gender/Male.txt
WordInfo/Gender/Female.txt
WordInfo/Gender/Neuter.txt
and add one word per line to each file, e.g.: (Male.txt)
sword
computer
keyboard


then if an object whose label is "sword" is passed, its _gender will be male.

This makes it possible to do this:
{THING_label} is {THING_gender ? beautiful1 : beautiful2}
where "beautiful" is different based on the THING's gender (which happens in some languages).


Hello,
is it possible to know which list the objects point to?
For example, when I find:

ANIMAL_labelShort {}
or
ANIMAL_label {}

is it possible to know in which file / folder all ANIMALS are listed?

Thank you

Adirelle

Quote from: Harkeidos on November 14, 2018, 01:03:27 PM
Hello,
is it possible to know which list the objects point to?
For example, when I find:

ANIMAL_labelShort {}
or
ANIMAL_label {}

is it possible to know in which file / folder all ANIMALS are listed?

Thank you

I have the same question: how do we extract the list of words to put in these files? From Strings/Words/Nouns/*.xml? From the labels in DefInjected/*/*.xml? Should we list the labels as-is, or split them into single words and put them in singular form?



b606

Hello,
I am trying to implement translations for animal kinds which are epicene words in French, that is, words which are always grammatically "male" or always "female".

For example, "tortue" (turtle), "gazelle" or "panthère" follow female grammar in French, while "écureuil" or "Yorkshire Terrier" follow male grammar. So it is a little shocking to read "un tortue" or "le tortue" when the ANIMAL_gender is male. Instead, "une tortue mâle" or "la tortue mâle" is correct. The same rule applies to their plurals.

An instance of these cases can be found in Keyed/Incidents.xml tags <LetterLabelAnimalInsanitySingle>, <AnimalInsanitySingle>, <LetterLabelAnimalSelfTame>, <LetterAnimalSelfTame> and <LetterAnimalSelfTameAndNameNumerical>.

As I cannot rely on ANIMAL_gender for these, I tried to fall back on the word gender of {0}. {0_gender} seems to work for <LetterLabelAnimalInsanitySingle> and <AnimalInsanitySingle>, even though neither {0} nor {0_gender} is referenced in the English sentence, provided the words are correctly classified in WordInfo/Gender. So far, tests on https://github.com/Ludeon/RimWorld-fr/tree/animal-epicene on these tags are satisfactory.

However, for the <*SelfTame*> tags, that did not work. A closer look gives some sort of explanation:

  • For <*Insanity*>, {0} translates to "tortue" (an X_labelShort), which is listed in WordInfo/Gender/Female.txt. {0_gender} then gives "female", and everything is OK.
  • For <*SelfTame*>, {0} translates to the incorrect "un tortue" (for a male turtle; a 0_indefinite, as expected from the first post of this thread), and this cannot be found in any WordInfo/Gender file, nor can "une tortue" (for a female turtle). In any case, {0_gender} defaults to "male" when the gender is not set!
I am stuck here with half a solution. Maybe there are two distinct problems here, but my questions are:

  • Is such an approach viable? Blindly using {0_gender} is non-standard, and putting "un tortue" in WordInfo/Gender/Female.txt to address the second case above is a no-go.
  • The tag <LetterLabelAnimalSelfTame> uses {1_kind}. I could not test it (I do not know how to trigger this tag in dev mode). Is that the solution?
  • Is there another way to deal with it? (some sort of {{ANIMAL_kind}_gender ?  ...})
Best regards.
b606



Harkeidos

It does not seem to work with the RulePacks_Combat.xml file.
This is the log shown in the game:

Quote
--    Keuneke contro Li
Perde cuore ha portato Li a spirare.
Il proiettile di fucile da caccia di Keuneke ha annientato {recipient_part_destroyed0_gender ? Il : La : Lo} cuore di Li.

--    Illuminati contro I Curbor Del Ruscello
Colpo d'arma da fuoco su torso ha portato Abereimar a morire.
Il proiettile di fucile da caccia di Keuneke ha perforato {recipient_part_destroyed0_gender ? Il : La : Lo} torso di Abereimar e ha colpito {recipient_part_damaged0_gender ? Il : La : Il} {recipient_part_damaged0_gender ? Suo : Sua : Suo} cassa toracica.
Il proiettile di fucile da caccia di Keuneke ha mancato.
Il proiettile di fucile da caccia di Keuneke ha trafitto come un genio {recipient_part_damaged0_gender ? Il : La : Lo} torso di Abereimar.
Keuneke ha fatto fuoco deliberatamente su Abereimar con {WEAPON_gender ? Il : La : Il} {WEAPON_gender ? Proprio : Propria : Proprio} fucile da caccia.

--    Illuminati contro I Curbor Del Ruscello
Rosso-croicaa è {SUBJECT_gender ? Svenuto : Svenuta : Svenuto} a causa di colpo d'arma da fuoco su testa.
{Recipient_part_damaged0_gender ? Il : La : Lo} testa di Rosso-croicaa viene ferito dal proiettile di fucile da caccia {WEAPON_projectile_gender ? Magistrale : Sapiente : Sapiente} di Keuneke.
Il proiettile di fucile da caccia di Keuneke ha eliminato {recipient_part_destroyed0_gender ? Il : La : Lo} polmone destro di Rosso-croicaa e ha danneggiato {recipient_part_damaged0_gender ? Il : La : Il} {recipient_part_damaged0_gender ? Suo : Sua : Suo} torso.
Keuneke ha fatto fuoco accuratamente su Rosso-croicaa con {WEAPON_gender ? Il : La : Il} {WEAPON_gender ? Proprio : Propria : Proprio} fucile da caccia.
Keuneke ha fatto fuoco accuratamente su Rosso-croicaa con {WEAPON_gender ? Il : La : Il} {WEAPON_gender ? Proprio : Propria : Proprio} fucile da caccia.
Il proiettile di fucile da caccia di Keuneke ha mancato di molto.
Il proiettile di fucile da caccia di Keuneke ha mancato.
{Recipient_part_damaged0_gender ? Il : La : Lo} braccio destro di Rosso-croicaa viene danneggiato dal proiettile di fucile da caccia di Keuneke.

Is there something wrong?

Thanks

TeiXeR

I think it does not work with any of the RulePackDef files because that is a separate system. This makes it hard to translate the combat logs properly, for example.

morticinus

Hi Ison
The Czech language has three genders: male, female, and neuter. I'm beginning to rework some sentences to sound better, using {0_gender ? A : B : C}.
However, I also need a fourth variant, D, like this: {0_gender ? A : B : C : D}, together with an added WordInfo/Gender/Plural.txt file
(and maybe, in the future, a customizable form like {0_gender ? A : B : C : D : etc : etc}, but only if it's easy to add).
A = Male
B = Female
C = Neuter
D = Plurale tantum/Classic Plural


Plurale tantum: a noun that appears only in the plural form and has no singular variant for referring to a single object. Czech examples include door, scissors, pants, etc.

Examples:
"A {1_label} has been short-circuited in the rain and started a fire."
{1_label} {1_gender ? způsobil : způsobila : způsobilo : způsobily } kvůli dešti zkrat a {1_gender ? začal : začala : začalo : začaly } hořet.


"{0} has died because of cold."
{0} kvůli chladu {0_gender ? uhynul : uhynula : uhynulo : uhynuly }.


Please add this to your code. Thank you!