pofilter tests
The following are descriptions of the tests available in pofilter with some
details about what type of errors they are useful to test for and the
limitations of each test.
You can always run:
pofilter -l
to get a list of the current tests available in your installation.
If you have an idea for a new test then please help us to write it.
Test Classification
Some tests are more important than others so we have classified them to help
you determine which to run first.
* Critical - can break a program
escapes, variables
* Functional - may confuse the user
accelerators, acronyms, blank, filepaths, kdecomments, long, numbers,
purepunc, sentencecount, short, unchanged, xmltags
* Cosmetic - make it look better
brackets, doublequoting, doublespacing, endpunc, endwhitespace, puncspacing,
simplecaps, singlequoting, startpunc, startwhitespace
* Extraction - useful mainly for extracting certain types of string
compendiumconflicts, isfuzzy, isreview, untranslated
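The classification above can be written down as a small lookup table, which makes it easy to sort a set of tests so that the critical ones come first. This is only a sketch: the names TEST_CLASSES, PRIORITY, classify and run_order are hypothetical helpers for illustration and are not part of pofilter.

```python
# Hypothetical table copied from the classification list above.
# pofilter does not expose a structure with this name.
TEST_CLASSES = {
    "critical": ["escapes", "variables"],
    "functional": [
        "accelerators", "acronyms", "blank", "filepaths", "kdecomments",
        "long", "numbers", "purepunc", "sentencecount", "short",
        "unchanged", "xmltags",
    ],
    "cosmetic": [
        "brackets", "doublequoting", "doublespacing", "endpunc",
        "endwhitespace", "puncspacing", "simplecaps", "singlequoting",
        "startpunc", "startwhitespace",
    ],
    "extraction": ["compendiumconflicts", "isfuzzy", "isreview", "untranslated"],
}

# Most important class first.
PRIORITY = ["critical", "functional", "cosmetic", "extraction"]


def classify(test_name):
    """Return the class of a test, or None if the name is not in the table."""
    for category in PRIORITY:
        if test_name in TEST_CLASSES[category]:
            return category
    return None


def run_order(test_names):
    """Sort test names by class priority, then alphabetically."""
    def key(name):
        category = classify(name)
        rank = PRIORITY.index(category) if category else len(PRIORITY)
        return (rank, name)
    return sorted(test_names, key=key)
```

For example, run_order(["endpunc", "xmltags", "escapes"]) places escapes first, then xmltags, then endpunc.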
Test Descriptions
* accelerators
checks whether accelerators are consistent between the two strings.
Make sure you use the --mozilla, --kde, etc options so that pofilter knows
which type of accelerator it is looking for. The test will pick up
accelerators that are missing and ones that shouldn't be there.
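To illustrate why the project option matters, here is a rough sketch of an accelerator comparison. The marker characters in ACCEL_MARKERS are assumptions made for this example, and the functions are hypothetical; pofilter's own logic may differ.

```python
import re

# Assumed marker character per project style (illustrative only).
ACCEL_MARKERS = {"mozilla": "&", "kde": "&", "gnome": "_", "openoffice": "~"}


def count_accelerators(text, marker):
    """Count markers that are directly followed by a letter or digit."""
    pattern = re.escape(marker) + r"(?=[^\W_])"
    return len(re.findall(pattern, text))


def accelerators_consistent(source, target, style):
    """True when both strings contain the same number of accelerators."""
    marker = ACCEL_MARKERS[style]
    return count_accelerators(source, marker) == count_accelerators(target, marker)
```

With the wrong style the marker is never found, so a missing accelerator goes unnoticed: "&File" against "Fichier" fails under the "kde" style but passes under "gnome".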
* acronyms
checks that acronyms that appear are unchanged
If the acronym URL appears in the original this test will check that it
appears in the translation. Translating acronyms is a language decision,
but many languages leave them unchanged. In that case this test is useful
for tracking down translations of the acronym and correcting them.
* blank
checks whether a translation is totally blank
This will check to see if a translation has inadvertently been translated
as blank, i.e. as spaces. This is different from untranslated, which is
completely empty. This test is useful because something translated
as " " will appear to most tools as if it is translated.
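The difference between blank and untranslated can be sketched as follows. Both function names are hypothetical and the code only mirrors the description above, not pofilter's implementation.

```python
def is_blank(target):
    """True when the translation is not empty but holds only whitespace."""
    return target != "" and target.strip() == ""


def is_untranslated(target):
    """True when the translation is completely empty."""
    return target == ""
```

A translation of " " is blank but not untranslated, while "" is untranslated but not blank.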
* brackets
checks that the number of brackets in both strings match
If ([{ or }]) appear in the original this will check that the same number
appear in the translation.
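A minimal sketch of such a count, assuming each bracket character is counted separately (the function names are hypothetical):

```python
BRACKETS = "()[]{}"


def bracket_counts(text):
    """Return how often each bracket character occurs in the text."""
    return {bracket: text.count(bracket) for bracket in BRACKETS}


def brackets_match(source, target):
    """True when every bracket character occurs equally often in both strings."""
    return bracket_counts(source) == bracket_counts(target)
```

Because only counts are compared, a translation that moves a bracketed phrase to another position still passes.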
* compendiumconflicts
checks for Gettext compendium conflicts (#-#-#-#-#)
When you use msgcat to create a PO compendium it will insert #-#-#-#-# into
entries that are not consistent. If the compendium is used later in a
message merge then these conflicts will appear in your translations. This
test quickly extracts those for correction.
* doublequoting
checks whether doublequoting is consistent between the two strings
Checks on double quotes " to ensure that you have the same number in both
the original and the translated string.
* doublespacing
checks for bad double-spaces by comparing to original
This will identify if you have [space][space] in your translation when you
don't have it in the original, or if it appears in the original but not in
your translation.
Some of these are spurious and how you correct them depends on the
conventions of your language.
* endpunc
checks whether punctuation at the end of the strings match
This will ensure that the ending of your translation has the same
punctuation as the original. E.g. if it ends in :[space] then so should
yours. It is useful for ensuring that you have ellipses [...] in all your
translations. You may pick up some errors in the original; feel free to keep
your translation and notify the programmers. In some languages characters
such as ? ! are always preceded by a space, e.g. [space]?. Do what your
language customs dictate. Other false positives occur when changes in word
order put "), etc. at the end of the sentence. Do not change these; your
language's word order takes precedence.
If you are tempted to leave out a [full-stop] or [colon], or to add a
[full-stop] to a sentence, note that these choices have often been made for a
reason, e.g. a list where full stops make it look cluttered. So initially match
them with the English and make changes once the program is being used.
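The comparison can be sketched like this, assuming that "punctuation" means the ASCII punctuation characters plus the ellipsis character and that trailing spaces are part of the ending. Both assumptions and the function names are mine, not pofilter's.

```python
import string

# Assumed punctuation set: ASCII punctuation plus the ellipsis character.
PUNCTUATION = set(string.punctuation) | {"\u2026"}


def trailing_punctuation(text):
    """Return the run of punctuation and spaces at the end of the text."""
    end = len(text)
    while end > 0 and (text[end - 1] in PUNCTUATION or text[end - 1] == " "):
        end -= 1
    return text[end:]


def endpunc_matches(source, target):
    """True when both strings end in the same punctuation."""
    return trailing_punctuation(source) == trailing_punctuation(target)
```

Under these assumptions "Name: " has the ending ": ", so a translation ending in ":" without the space would be reported.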
* endwhitespace
checks whether whitespace at the end of the strings matches
Operates the same as endpunc but is only concerned with whitespace.
* escapes
checks whether escaping is consistent between the two strings
Checks escapes such as \n and \uNNNN to ensure that if they exist in the
original they also appear in the translation.
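As a rough illustration, the sketch below collects backslash escapes from each string and compares them. The regular expression is an assumption about what counts as an escape, and the strings are treated as raw PO text in which the backslash is a literal character.

```python
import re

# A backslash followed by uNNNN (four hex digits) or by any single character.
ESCAPE_RE = re.compile(r"\\(?:u[0-9A-Fa-f]{4}|.)")


def escapes(text):
    """Return a sorted list of the escape sequences found in the text."""
    return sorted(ESCAPE_RE.findall(text))


def escapes_consistent(source, target):
    """True when both strings contain the same escape sequences."""
    return escapes(source) == escapes(target)
```

Sorting makes the comparison independent of the order in which the escapes appear, which is a design choice of this sketch.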
* filepaths
checks that file paths have not been translated
Checks that paths such as /home/user1 have not been translated. Generally
you do not translate a file path unless it is being used as an example.
* isfuzzy
checks whether the PO element has been marked fuzzy
If a message is marked fuzzy in the PO file then it is extracted. Note that
this is different from the --fuzzy and --nofuzzy options, which specify
whether tests should be performed against messages marked fuzzy.
* isreview
checks whether the PO element has been marked for review
If you make use of the non-Gettext 'review' flag:
#, review[ - reason for review]
then a message marked for review in the PO file will be extracted.
Note that this is different from the --review and --noreview options, which
specify whether tests should be performed against messages marked for review.
* kdecomments
checks to ensure that no KDE style comments appear in the translation
KDE style translator comments appear in PO files as "_: comment\n".
New translators often translate the comment. This test tries to identify
instances where the comment has been translated.
* long
checks whether a translation is much longer than the original string
* numbers
checks whether numbers of various forms are consistent between the two strings
You will see some errors where you have either written the number in full
or converted it to digits in your translation. Changes in the order of the
numbers will also trigger this error.
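The behaviour described above can be reproduced with a small sketch that extracts digit sequences in order and compares the two lists. The pattern is an assumption and covers only plain digits with optional . or , separators.

```python
import re

# Digits, optionally joined by "." or "," (e.g. 10, 1.5, 1,000).
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")


def numbers(text):
    """Return the numbers in the text, in order of appearance."""
    return NUMBER_RE.findall(text)


def numbers_consistent(source, target):
    """True when both strings contain the same numbers in the same order."""
    return numbers(source) == numbers(target)
```

Both a number written out in full ("two files") and a change of order ("10 ... 3" instead of "3 ... 10") make this comparison fail, which matches the two kinds of error mentioned above.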
* puncspacing
checks for bad spacing after punctuation
In the case of [full-stop][space] in the original this test checks that your
translation does not remove the space. It also checks for [comma], [colon],
etc.
* purepunc
checks that strings that are purely punctuation are not changed
This extracts strings like "+" or "-" as these usually should not be
changed.
* sentencecount
checks that the number of sentences in both strings match
Counts the number of full stops to see that the sentence count is the same
between the original and translated string.
* short
checks whether a translation is much shorter than the original string
* simplecaps
checks the capitalisation of two strings isn't wildly different
This will pick up many false positives so don't be a slave to it. It is
useful for identifying translations that do not start with a capital when
they should or those that do when they shouldn't. It will also highlight
sentences that have extra capitals. Depending on the capitalisation
convention of your language you might want to change these to Title Case
or change them all to normal sentence case.
* singlequoting
checks whether singlequoting is consistent between the two strings
The same as doublequoting but checks for the ' character. Because this is
used in words like it's, user's, etc., this can cause spurious errors as
your language might not use such a system. If a quote appears at the end
of a sentence in the translation, i.e. '[full-stop], this might not be
detected properly by the check.
* startpunc
checks whether punctuation at the beginning of the strings match
Operates as endpunc but you will probably see fewer errors.
* startwhitespace
checks whether whitespace at the beginning of the strings matches
As in endwhitespace but you will see fewer errors.
* unchanged
checks whether a translation is basically identical to the original string
This checks whether the translation is just a copy of the English
original. Many times this is what you want, but sometimes you will detect
words that should have been translated.
* untranslated
checks whether a string has been translated at all
This check is really only useful if you want to extract untranslated
strings so that they can be translated independently of the main work.
* variables
checks whether variables of various forms are consistent between the two strings
This checks to make sure that variables that appear in the original also
appear in the translation. Make sure you use the --kde, --openoffice, etc
flags as these define what variables will be searched for. It does not at
the moment cope with variables that use the reordering syntax of Gettext PO
files.
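The sketch below shows the general idea of comparing variables per style. The two patterns are illustrative assumptions chosen for this example (simple printf-style %s and %d, and numbered %1-style placeholders); they are not the definitions that pofilter uses for any particular flag.

```python
import re

# Assumed, simplified patterns per style (illustrative only).
VARIABLE_PATTERNS = {
    "printf": re.compile(r"%[sd]"),
    "kde": re.compile(r"%\d+"),
}


def variables(text, style):
    """Return a sorted list of the variables of the given style in the text."""
    return sorted(VARIABLE_PATTERNS[style].findall(text))


def variables_consistent(source, target, style):
    """True when the same variables appear in both strings."""
    return variables(source, style) == variables(target, style)
```

Choosing the wrong style means the pattern finds nothing in either string, so a dropped variable would pass unnoticed. That is why the project flag matters.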
* xmltags
checks that XML/HTML tags have not been translated
This check finds the number of tags in the source string and checks that
the same number are in the translation. If the counts don't match then
either the tag is missing or it was mistakenly translated by the
translator, both of which are errors.
The check ignores tags, or things that look like tags, that cover the whole
string, e.g. "<Error>", but will produce false positives for things like "An
<Error> occurred", as here "Error" should be translated. It will also
flag a translated alt attribute, e.g. in <img src=bob.png alt="blah">, as an
error when in fact it is correct.
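The counting rule and the whole-string exception can be sketched as follows. The tag pattern is a deliberately simple assumption and the function names are hypothetical.

```python
import re

# Anything between < and > that contains no further angle brackets.
TAG_RE = re.compile(r"<[^<>]+>")


def tag_count(text):
    """Count the tag-like substrings in the text."""
    return len(TAG_RE.findall(text))


def xmltags_consistent(source, target):
    """True when tag counts match, ignoring sources that are one single tag."""
    if TAG_RE.fullmatch(source.strip()):
        return True  # e.g. "<Error>" covers the whole string and is skipped
    return tag_count(source) == tag_count(target)
```

In this sketch "An <Error> occurred" translated without the angle brackets is reported, which reproduces the false positive described above.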