Request: Help check the language workers for accuracy

Started by ison, September 05, 2018, 07:45:23 AM

Danub

#15
Now, I'm assuming that you're keeping separate lists of words for singular and plural forms. If not, and you're building the plurals on the fly, then you're in for a world of pain and may not be able to get a proper Romanian translation using this system, since plural formation in Romanian is irregular.
I'll update with proper code for the 2 methods (if actually possible) once I know how the system is designed.
Just to make sure I'm clear: Do you call
WithDefiniteArticle("horses", Gender.Male, true, false) to get "the horses" (the 'gender' and 'plural' params describe the 'string' param)
or
WithDefiniteArticle("horse", Gender.Male, true, false) to get the same result (the 'gender' and 'plural' params define the required result)?
Gender doesn't apply to English, but you get the point. If it's the second case, then you can't use it for Romanian.

Elevator

#16
The current implementation of GenText.ToCommaList() produces incorrect punctuation for Russian (and possibly for some other languages). In Russian there should be no comma before "and", so enumerations should look like: "Torso, neck, left shoulder and right shoulder".
Could you please add a virtual method ToCommaList(IEnumerable<string> items, bool useAnd) to the LanguageWorker base class? It could be called from GenText.ToCommaList() and overridden in subclasses.

Btw, SitePartWorker_Turrets.GetThreatsInfo() also uses the "AndLower" keyed value for the same purpose. I believe this method could be refactored to use GenText.ToCommaList(), as is done in all other contexts in the game.
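
A minimal sketch of what such an overridable hook could look like. The class names CommaListWorker and CommaListWorker_Russian, the AndWord and CommaBeforeAnd properties, and the English comma-before-"and" default are assumptions made for illustration only; they are not taken from the game's LanguageWorker or GenText code, where the "and" word comes from the "AndLower" keyed translation.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the LanguageWorker base class; only the proposed hook is modelled.
public class CommaListWorker
{
    // Assumption: hard-coded here, whereas the game would read the "AndLower" keyed value.
    protected virtual string AndWord => "and";

    // Assumption: the default places a comma before the final "and".
    protected virtual bool CommaBeforeAnd => true;

    public virtual string ToCommaList(IEnumerable<string> items, bool useAnd)
    {
        List<string> list = items.ToList();
        if (list.Count == 0) return "";
        if (list.Count == 1) return list[0];
        if (!useAnd) return string.Join(", ", list);
        if (list.Count == 2) return list[0] + " " + AndWord + " " + list[1];

        // Join everything but the last item, then attach the last item with "and".
        string head = string.Join(", ", list.Take(list.Count - 1));
        string separator = CommaBeforeAnd ? ", " : " ";
        return head + separator + AndWord + " " + list[list.Count - 1];
    }
}

// A language that omits the comma before "and" only overrides the two properties.
public class CommaListWorker_Russian : CommaListWorker
{
    protected override string AndWord => "и";
    protected override bool CommaBeforeAnd => false;
}
```

With this shape, a language that needs a completely different list format could still override ToCommaList itself, which is the point of making it virtual.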

Adirelle

#17
Hi there,

About French:

I/ Pluralization

Here are the proper pluralization rules (using the first match):

1) special cases (we should probably check if these words are actually used in the translations):
1a) "bail", "corail", "émail", "gemmail", "soupirail", "travail", "vantail", "vitrail": replace "ail" by "aux",
1b) "bleu", "émeu", "landau", "lieu", "pneu", "sarrau", "bal", "banal", "fatal", "final", "festival": append "s",
1c) "bijou", "caillou", "chou", "genou", "hibou", "joujou", "pou": append "x",
2) words ending in "s", "x" or "z": do not change anything,
3) words ending in "al": replace "al" by "aux",
4) words ending in "au" or "eu": append "x",
5) else append "s".
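
The rules above can be sketched as a first-match chain. This is an illustration only: the FrenchPlural class and its helper names are hypothetical, the method takes a bare word rather than the game's Pluralize(str, gender, count) signature, and compound or invariant nouns are not handled.

```csharp
using System;
using System.Collections.Generic;

public static class FrenchPlural
{
    // 1a) "-ail" words whose plural ends in "-aux".
    private static readonly HashSet<string> AilToAux = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "bail", "corail", "émail", "gemmail", "soupirail", "travail", "vantail", "vitrail" };

    // 1b) exceptions to rules 3 and 4 that simply take "s".
    private static readonly HashSet<string> AppendS = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "bleu", "émeu", "landau", "lieu", "pneu", "sarrau", "bal", "banal", "fatal", "final", "festival" };

    // 1c) "-ou" words that take "x".
    private static readonly HashSet<string> AppendX = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        { "bijou", "caillou", "chou", "genou", "hibou", "joujou", "pou" };

    public static string Pluralize(string word)
    {
        if (string.IsNullOrEmpty(word)) return word;

        // 1) special cases are checked first
        if (AilToAux.Contains(word)) return word.Substring(0, word.Length - 3) + "aux";
        if (AppendS.Contains(word)) return word + "s";
        if (AppendX.Contains(word)) return word + "x";

        // 2) already ends in "s", "x" or "z": unchanged
        if (EndsWithAny(word, "s", "x", "z")) return word;
        // 3) "-al" becomes "-aux"
        if (EndsWithAny(word, "al")) return word.Substring(0, word.Length - 2) + "aux";
        // 4) "-au" and "-eu" take "x"
        if (EndsWithAny(word, "au", "eu")) return word + "x";
        // 5) default
        return word + "s";
    }

    private static bool EndsWithAny(string word, params string[] suffixes)
    {
        foreach (string suffix in suffixes)
        {
            if (word.EndsWith(suffix, StringComparison.OrdinalIgnoreCase)) return true;
        }
        return false;
    }
}
```

For example, FrenchPlural.Pluralize("cheval") gives "chevaux" through rule 3, while "festival" is caught earlier by the 1b list and becomes "festivals".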

II/ PostProcessing

In addition to what b606 said, elision should be applied before word mangling.
All these rules are case-insensitive and also apply at the start of sentences, e.g. "si il" => "s'il" should mean .Replace(" si il ", " s'il ").Replace("Si il", "S'il"). This would be easier to write with regular expressions (e.g. "si il" => "s'il" becomes "\b([sS])i il" => "$1'il"). Are they allowed?

- words "de", "le", "la" followed by a vowel: replace by "d'", "l'" and "l'" respectively.
- "si il" => "s'il"
- "que il" => "qu'il"
- "lorsque il" => "lorsqu'il"
- "que elle" => "qu'elle"
- "lorsque elle" => "lorsqu'elle"
(existing rules:)
- "de le" => "du",
- "de les" => "des",
- "de des" => "des".
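
A rough sketch of the list above written with regular expressions, with elision running before the existing contractions. The FrenchElision class, its field names and the vowel set are assumptions for illustration, not the game's code; "h" and "y" are deliberately left out of the vowel set because some words starting with them do not elide.

```csharp
using System.Text.RegularExpressions;

public static class FrenchElision
{
    // Assumption: plain and accented vowels only.
    private const string Vowel = "[aeiouàâéèêëîïôöûù]";

    // "de" + vowel => "d'", "le"/"la" + vowel => "l'"
    private static readonly Regex De = new Regex(@"\b(d)e (?=" + Vowel + ")", RegexOptions.IgnoreCase);
    private static readonly Regex LeLa = new Regex(@"\b(l)[ea] (?=" + Vowel + ")", RegexOptions.IgnoreCase);
    // "si il(s)" => "s'il(s)"
    private static readonly Regex SiIl = new Regex(@"\b(s)i (ils?)\b", RegexOptions.IgnoreCase);
    // "que"/"lorsque" + il(s)/elle(s) => "qu'..."/"lorsqu'..."
    private static readonly Regex QueIlElle = new Regex(@"\b(lorsqu|qu)e (ils?|elles?)\b", RegexOptions.IgnoreCase);

    // Existing contractions: "de le" => "du", "de les"/"de des" => "des"
    private static readonly Regex DeLe = new Regex(@"\b(d)e le\b", RegexOptions.IgnoreCase);
    private static readonly Regex DeLesDes = new Regex(@"\b(d)e (?:les|des)\b", RegexOptions.IgnoreCase);

    public static string Apply(string text)
    {
        // Elision first; the captured first letter keeps its original case.
        text = De.Replace(text, "${1}'");
        text = LeLa.Replace(text, "${1}'");
        text = SiIl.Replace(text, "${1}'${2}");
        text = QueIlElle.Replace(text, "${1}'${2}");

        // Contractions second, so that "de le arbre" ends up as "de l'arbre" and not "du arbre".
        text = DeLe.Replace(text, "${1}u");
        text = DeLesDes.Replace(text, "${1}es");
        return text;
    }
}
```

The ordering matters in the "de le arbre" case: once "le arbre" has become "l'arbre", the "de le" contraction no longer matches, whereas "de le mur" is still contracted to "du mur".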

Bacilic

I have already translated 80% of version v0.18.1722 and now I want to adapt it to the new version of the game (v1). There have been several changes to the translation system so that each language can read as naturally as possible in the game. Because the new translation method is more complex (it involves definite and indefinite articles for each gender, pronouns, adjectives, etc.), I want to do the job right. Since the English language worker does not cover Greek, a language worker for Greek needs to be created.

How do I create it and set it up?

Harkeidos

How can I implement language workers in the game?

Thanks

Bacilic

Quote from: Harkeidos on March 03, 2019, 03:46:55 AM
How can I implement language workers in the game?

First of all, the language worker for your language(?) must be included in the game's code.
On the first page of this thread (https://ludeon.com/forums/index.php?topic=44000.0) you will see that the following language workers exist:

  • Catalan
  • Danish
  • Dutch
  • English
  • French
  • German
  • Hungarian
  • Italian
  • Korean
  • Norwegian
  • Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Turkish
If one of them is for your language, you have to declare it in the language's files:
<LanguageInfo>
  <friendlyNameNative>Ελληνικά</friendlyNameNative>
  <friendlyNameEnglish>Greek</friendlyNameEnglish>
  <canBeTiny>true</canBeTiny>
  <languageWorkerClass>LanguageWorker_Default</languageWorkerClass>
  <credits>

The line to change in the file "LanguageInfo.xml" is "<languageWorkerClass>LanguageWorker_YOUR_LANGUAGE</languageWorkerClass>".

After that, you adjust the text in your language during translation by using the appropriate keywords (https://ludeon.com/forums/index.php?topic=43979.0); they produce different results for each language.

Elevator

Quote from: Bacilic
First of all, the LanguageWorker of your language(?) must be included in the game's code.

Before you send your class to the developers you may want to test your LanguageWorker at runtime.
This can be done by creating a mod for the game.
I've created a LanguageWorker mod for Russian; you may use it as a template for your own:
https://github.com/Elevator89/RimWorld-LanguageWorker_Russian

ByJacob

Hi developers :D
Is it possible to add a Polish language worker to the code?

using System.Text.RegularExpressions;
using Verse;

public class LanguageWorker_Polish : LanguageWorker
{
    // Polish has no articles, so both article methods return the string unchanged.
    public override string WithIndefiniteArticle(string str, Gender gender, bool plural = false, bool name = false)
    {
        return str;
    }

    public override string WithDefiniteArticle(string str, Gender gender, bool plural = false, bool name = false)
    {
        return str;
    }

    private interface IResolver
    {
        // Returns the resolved text, or null if the arguments are malformed.
        string Resolve(string[] arguments);
    }

    private class ReplaceResolver : IResolver
    {
        // Usage: ^Replace( {0} | 'jeden'-'jedna' )^
        // Returns the 'new' value of the first 'old'-'new' pair whose 'old' value equals the input.
        private static readonly Regex _argumentRegex =
            new Regex(@"'(?<old>[^']*?)'-'(?<new>[^']*?)'", RegexOptions.Compiled);

        public string Resolve(string[] arguments)
        {
            if (arguments.Length == 0)
            {
                return null;
            }

            string input = arguments[0];

            if (arguments.Length == 1)
            {
                return input;
            }

            for (int i = 1; i < arguments.Length; ++i)
            {
                string argument = arguments[i];

                Match match = _argumentRegex.Match(argument);
                if (!match.Success)
                {
                    return null;
                }

                string oldValue = match.Groups["old"].Value;
                string newValue = match.Groups["new"].Value;

                if (oldValue == input)
                {
                    return newValue;
                }
            }

            // No pair matched: leave the input unchanged.
            return input;
        }
    }

    private class NumberCaseResolver : IResolver
    {
        // Usage: ^Number( {0} | '# komentarz' | '# komentarze' | '# komentarzy' | '# komentarza' )^
        // The four forms are used, in order, for: 1, 2 to 4, many (e.g. 10), fractions (e.g. 1/2).
        // "#" in the chosen form is replaced by the number.
        private static readonly Regex _numberRegex =
            new Regex(@"(?<floor>[0-9]+)(\.(?<frac>[0-9]+))?", RegexOptions.Compiled);

        public string Resolve(string[] arguments)
        {
            if (arguments.Length != 5)
            {
                return null;
            }

            string numberStr = arguments[0];
            Match numberMatch = _numberRegex.Match(numberStr);
            if (!numberMatch.Success)
            {
                return null;
            }

            bool hasFracPart = numberMatch.Groups["frac"].Success;

            string floorStr = numberMatch.Groups["floor"].Value;

            string formOne = arguments[1].Trim('\'');
            string formSeveral = arguments[2].Trim('\'');
            string formMany = arguments[3].Trim('\'');
            string fraction = arguments[4].Trim('\'');

            if (hasFracPart)
            {
                return fraction.Replace("#", numberStr);
            }

            // TryParse avoids an exception on numbers too large for an int.
            int floor;
            if (!int.TryParse(floorStr, out floor))
            {
                return null;
            }

            return GetFormForNumber(floor, formOne, formSeveral, formMany).Replace("#", numberStr);
        }

        private static string GetFormForNumber(int number, string formOne, string formSeveral, string formMany)
        {
            if (number == 1)
            {
                return formOne;
            }

            // Last digit 2 to 4, except 12 to 14 (and 112 to 114, etc.).
            if (number % 10 >= 2 && number % 10 <= 4 && (number % 100 < 10 || number % 100 >= 20))
            {
                return formSeveral;
            }

            return formMany;
        }
    }

    private static readonly ReplaceResolver replaceResolver = new ReplaceResolver();
    private static readonly NumberCaseResolver numberCaseResolver = new NumberCaseResolver();

    // Matches ^Name( arg | arg | ... )^ and captures every argument separately.
    private static readonly Regex _languageWorkerResolverRegex =
        new Regex(@"\^(?<resolverName>\w+)\(\s*(?<argument>[^|]+?)\s*(\|\s*(?<argument>[^|]+?)\s*)*\)\^",
            RegexOptions.Compiled);

    public override string PostProcessedKeyedTranslation(string translation)
    {
        translation = base.PostProcessedKeyedTranslation(translation);
        return PostProcess(translation);
    }

    public override string PostProcessed(string str)
    {
        str = base.PostProcessed(str);
        return PostProcess(str);
    }

    private static string PostProcess(string translation)
    {
        return _languageWorkerResolverRegex.Replace(translation, EvaluateResolver);
    }

    private static string EvaluateResolver(Match match)
    {
        string keyword = match.Groups["resolverName"].Value;

        Group argumentsGroup = match.Groups["argument"];

        string[] arguments = new string[argumentsGroup.Captures.Count];
        for (int i = 0; i < argumentsGroup.Captures.Count; ++i)
        {
            arguments[i] = argumentsGroup.Captures[i].Value.Trim();
        }

        IResolver resolver = GetResolverByKeyword(keyword);

        // An unknown keyword yields no resolver; treat it like a failed resolution.
        string result = resolver != null ? resolver.Resolve(arguments) : null;
        if (result == null)
        {
            Log.Error(string.Format("Error happened while resolving LW instruction: \"{0}\"", match.Value));
            return match.Value;
        }

        return result;
    }

    private static IResolver GetResolverByKeyword(string keyword)
    {
        switch (keyword)
        {
            case "Replace":
                return replaceResolver;
            case "Number":
                return numberCaseResolver;
            default:
                return null;
        }
    }
}

Arczi008TV


b606

Hi,

I noticed a few glitches in the LanguageWorker for French. It needs to be modified:

1. Some words do not take elision, for example "le husky" and not "l'husky" (while "l'humain" is correct).

2. For X_possessive: "sa" (female) should be replaced with "son" when followed by a vowel. Ex. "sa épée" (her sword) should be "son épée".

By the way, French does not formally have a neuter gender, so in Keyed/Grammar.xml the tag <Proits> is insufficient: is there a way to supersede it with the Strings/WordInfo/Gender/{Male,Female}.txt files so that the game gets "son" (<Prohis> or <ProitsCap>) or "sa" (<Proher> or <ProherCap>) when the word is listed in these files?
My use case for this is in the RulePacks_*.xml where X_gender is still not effective.

3. In the LanguageWorker, the Pluralize function needs to be much more complicated, especially for invariant or compound nouns. Wouldn't it be easier to use a labelPlural tag for the ThingDef? Examples of plurals:
"balles de revolver" (revolver bullets, "s" on the first word), "chapeaux de cowboy" (cowboy hats), "vestes renforcées" (flak jackets, which get two "s"). The algorithm as it is now would then be used for things that do not have labelPlural defined.
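
Points 1 and 2 could be handled along these lines. This is only a sketch: the FrenchWordFixes class and its short list of aspirated-h words are hypothetical, and a real list would have to be maintained by the translators (inflected forms included).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FrenchWordFixes
{
    // Hypothetical, deliberately short list of words whose initial "h" blocks elision.
    private static readonly HashSet<string> AspiratedH =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "husky", "hache", "hibou" };

    private const string Vowels = "aeiouàâéèêëîïôöûù";

    // "le"/"la" or "sa" followed by one word; the word itself is inspected in code.
    private static readonly Regex Article = new Regex(@"\b(l)[ea] (\w+)", RegexOptions.IgnoreCase);
    private static readonly Regex Possessive = new Regex(@"\b(s)a (\w+)", RegexOptions.IgnoreCase);

    // True for a leading vowel, or a leading "h" that is not in the aspirated list.
    private static bool StartsWithVowelSound(string word)
    {
        char first = char.ToLowerInvariant(word[0]);
        if (first == 'h')
            return !AspiratedH.Contains(word);
        return Vowels.IndexOf(first) >= 0;
    }

    public static string Apply(string text)
    {
        // 1. Elide the article only when the next word allows it.
        text = Article.Replace(text, m => StartsWithVowelSound(m.Groups[2].Value)
            ? m.Groups[1].Value + "'" + m.Groups[2].Value
            : m.Value);

        // 2. "sa" becomes "son" before a vowel sound.
        text = Possessive.Replace(text, m => StartsWithVowelSound(m.Groups[2].Value)
            ? m.Groups[1].Value + "on " + m.Groups[2].Value
            : m.Value);

        return text;
    }
}
```

With this list, "le husky" and "sa hache" are left alone, while "le humain" becomes "l'humain" and "sa épée" becomes "son épée".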

Lastly, is there any chance to get XXX_gender to work in RulePacks_*?

Best regards


b606


zerstrick

Hello ison, the truth is that I do not understand much about programming, but I do understand Spanish. Would it be possible to add this to the code? It will work for both Spanish variants (Latin American and Spain).

We need plural forms for the indefinite and definite articles:

using Verse;

public class LanguageWorker_Spanish : LanguageWorker
{
    public override string WithIndefiniteArticle(string str, Gender gender, bool plural = false, bool name = false)
    {
        // Names don't get articles
        if( name )
            return str;

        if( plural )
            return (gender == Gender.Female ? "unas " : "unos ") + str;

        return (gender == Gender.Female ? "una " : "un ") + str;
    }

    public override string WithDefiniteArticle(string str, Gender gender, bool plural = false, bool name = false)
    {
        // Names don't get articles
        if( name )
            return str;

        if( plural )
            return (gender == Gender.Female ? "las " : "los ") + str;

        return (gender == Gender.Female ? "la " : "el ") + str;
    }

    public override string OrdinalNumber(int number, Gender gender = Gender.None)
    {
        return number + ".º";
    }

    public override string Pluralize(string str, Gender gender, int count = -1)
    {
        if( str.NullOrEmpty() )
            return str;

        char last = str[str.Length - 1];
        char oneBeforeLast = str.Length >= 2 ? str[str.Length - 2] : '\0';

        if( IsVowel(last) )
        {
            // Stressed "í"/"ú" take "-es" ("sí" -> "síes"); other vowels take "-s".
            if( str == "sí" )
                return "síes";
            else if( last == 'í' || last == 'ú' || last == 'Í' || last == 'Ú' )
                return str + "es";
            else
                return str + 's';
        }
        else
        {
            // A final "z" becomes "c" before "-es": "lápiz" -> "lápices".
            if( last == 'z' || last == 'Z' )
                return str.Substring(0, str.Length - 1) + (last == 'z' ? "ces" : "CES");
            else if( (last == 'y' || last == 'Y') && IsVowel(oneBeforeLast) )
                return str + "es";
            else if( "lrndjsxLRNDJSX".IndexOf(last) >= 0 || (last == 'h' && oneBeforeLast == 'c') )
                return str + "es";
            else
                return str + 's';
        }
    }

    public bool IsVowel(char ch)
    {
        return "aeiouáéíóúAEIOUÁÉÍÓÚ".IndexOf(ch) >= 0;
    }
}

ison

Quote from: ByJacob on November 28, 2019, 02:59:51 PM
Hi developers :D
Is it possible to add a Polish language worker to the code?

Thanks for posting the LanguageWorker! However, since 1.3 we've introduced a new system which makes such custom code unnecessary. We now support "numCase" and "replace" resolvers natively and every language can use them. You could take a look at the Russian language for an example.

Arczi008TV

#27
Quote from: ison on August 12, 2021, 04:46:05 AM
Thanks for posting the LanguageWorker! However, since 1.3 we've introduced a new system which makes such custom code unnecessary. We now support "numCase" and "replace" resolvers natively and every language can use them. You could take a look at the Russian language for an example.

numCase is not working for me.

And what about faction and world map names? The English language worker capitalizes every word in them.
With the default worker, or with none, only the first word is capitalized.


J0anJosep

I also tried to use numCase and I get the same error as Arczi008TV. Do I need to add something to the language worker to use numCase?

I also updated the Catalan language worker. The updated version is here. Can it be added for the next RimWorld release? Thanks!