Found an awesome RegEx search/replace to strip XML for translation!

Started by joeyeti, May 07, 2014, 09:44:54 AM


joeyeti

On a different forum (boardgamegeek.com for those interested) I asked yesterday whether a RegEx search & replace could remove all XML tags besides "defname", "label" and "description" from a text file, so that we translators see only what we need to see :)

And voila! A helpful fella replied today with the RegEx below, which I successfully tried on one of the XMLs from the "Defs" folder.

The result is that only the lines containing these tags (and the text within them) are kept, and the lines between them are deleted from the file. This way you do not need to scroll through the file and search meticulously for all defnames, labels and descriptions; instead you have them all neatly one after the other. I left the "defname" in there as a string reference for the resulting translated file.

This does not convert the Defs file into the "language" file format, so it is more for reference and for checking what changed between versions (after which you can edit your translated files).

I am using Notepad++ and it works there, but I am not sure whether it works in ALL RegEx-enabled utilities.

Search:
(?s)(</label>|</description>|</defname>)(.*?)(<label>|<description>|<defname>)

Replace with:
\1\r\3
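
For anyone without Notepad++, here is a rough Python sketch of the same search/replace. It is only an illustration under a few assumptions: the function name and the sample XML are made up, the replacement uses "\n" instead of "\r", and matching is case-insensitive because real Defs files may capitalise the tags differently (for example defName). Note that the pattern only removes what sits between a closing tag and the next opening tag, so anything before the first match and after the last one stays in the output.

```python
import re

# Same pattern as the Notepad++ search above; (?s) lets "." match newlines.
# IGNORECASE is an assumption so that e.g. <defName> is matched as well.
PATTERN = re.compile(
    r"(?s)(</label>|</description>|</defname>)(.*?)(<label>|<description>|<defname>)",
    re.IGNORECASE,
)


def strip_between_tags(xml_text):
    """Drop everything between a kept closing tag and the next kept opening tag."""
    return PATTERN.sub(r"\1\n\3", xml_text)


# Made-up sample, only meant to resemble a Defs file.
SAMPLE = (
    "<Defs>\n"
    "  <ThingDef>\n"
    "    <defName>Wall</defName>\n"
    "    <label>wall</label>\n"
    "    <size>1</size>\n"
    "    <description>A wall.</description>\n"
    "  </ThingDef>\n"
    "</Defs>\n"
)

if __name__ == "__main__":
    print(strip_between_tags(SAMPLE))
```

In the sample, the <size> line disappears and the three kept tags end up on consecutive lines, while the surrounding <Defs> and <ThingDef> lines remain.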

Neone

Yeah, good point. However, you still have to check the files manually, because some strings sit in other tags:

ResearchProjectDef\BaseResearchProjects.xml -> descriptionDiscovered
FactionDef\BaseFactionTypes.xml -> pawnsPlural

Those are the only ones that I remember.

Tynan

Yes, there are some string fields besides those, I'm afraid. But it's an awesome start! You could handle the others case by case.
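
One possible way to handle the extra fields is to build the pattern from a list of tag names, so new ones can be added as they turn up. This is only a sketch: the helper names and the sample text are hypothetical, the tag list just combines the three original tags with the two fields mentioned above, and case-insensitive matching is an assumption.

```python
import re

# Original three tags plus the two extra fields mentioned in this thread.
TAGS = ["defname", "label", "description", "descriptionDiscovered", "pawnsPlural"]


def build_pattern(tags):
    """Build the 'closing tag ... next opening tag' pattern for a tag list."""
    closing = "|".join("</" + re.escape(t) + ">" for t in tags)
    opening = "|".join("<" + re.escape(t) + ">" for t in tags)
    # DOTALL plays the role of (?s); IGNORECASE is an assumption.
    return re.compile(
        "(" + closing + ")(.*?)(" + opening + ")",
        re.DOTALL | re.IGNORECASE,
    )


def keep_only(xml_text, tags=TAGS):
    """Remove whatever sits between one kept tag pair and the next."""
    return build_pattern(tags).sub(r"\1\n\3", xml_text)


# Made-up sample text for illustration.
SAMPLE = (
    "<defName>R1</defName>\n"
    "<cost>5</cost>\n"
    "<descriptionDiscovered>Found it.</descriptionDiscovered>\n"
    "<x/>\n"
    "<label>r</label>"
)

if __name__ == "__main__":
    print(keep_only(SAMPLE))
```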

Moving topic to Translations.

joeyeti

On top of the above, a perfect (and open source at that) tool for comparing two folders, including their subdirectories, is WinMerge (http://winmerge.org/).

In it you can open two files or folders, and once the tool has run its comparison it shows you details on the file pairings: whether the files are identical, how many differences there are between them, and so on. It even handles renamed files (though I believe only if a suffix is added).

It also has its own editor that shows color-coded lines for the changes between two compared files, so you have a complete package when you couple it with an external editor for your translated files. I use Notepad++, as already mentioned.

Before this I had to keep four tabs open in Total Commander and use its own comparison tool (much inferior to WinMerge), with much clicking and moving around in directories. Now I only need WinMerge to compare the old and new English original files and Notepad++ to edit my Slovak translations.

Of course I do not believe I am a genius, and surely WinMerge (or a similar utility like Beyond Compare, which is paid btw) is already used by people, but some of you might not use one yet, so this is for them :)