pofilter tests

The following are descriptions of the tests available in pofilter with some 
details about what type of errors they are useful to test for and the 
limitations of each test.

You can always run:

  pofilter -l

to get a list of the current tests available in your installation.

If you have an idea for a new test then please help us to write it.

Test Classification

Some tests are more important than others so we have classified them to help
you determine which to run first.

* Critical - can break a program
escapes, variables

* Functional - may confuse the user
accelerators, acronyms, blank, filepaths, kdecomments, long, numbers, 
purepunc, sentencecount, short, unchanged, xmltags

* Cosmetic - make it look better
brackets, doublequoting, doublespacing, endpunc, endwhitespace, puncspacing, 
simplecaps, singlequoting, startpunc, startwhitespace

* Extraction - useful mainly for extracting certain types of string
compendiumconflicts, isfuzzy, isreview, untranslated


Test Descriptions

* accelerators    
	checks whether accelerators are consistent between the two strings.

	Make sure you use the --mozilla, --kde, etc options so that pofilter knows
	which type of accelerator it is looking for.  The test will pick up
	accelerators that are missing and ones that shouldn't be there.
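
	As a rough illustration of the idea, the sketch below counts accelerator
	markers in both strings.  It is not pofilter's implementation: the function
	names are hypothetical, and the default "&" marker is an assumption standing
	in for whatever marker the --mozilla, --kde, etc. options select.

```python
def accelerator_count(text: str, marker: str = "&") -> int:
    """Count markers that are directly followed by a letter or digit."""
    # The "&" default is an assumption; real projects use different markers.
    return sum(
        1
        for i, ch in enumerate(text[:-1])
        if ch == marker and text[i + 1].isalnum()
    )


def accelerators_ok(source: str, target: str, marker: str = "&") -> bool:
    """True when both strings carry the same number of accelerators."""
    return accelerator_count(source, marker) == accelerator_count(target, marker)
```

	With this sketch a source of "&File" and a translation of "Lêer" would be
	flagged as a missing accelerator.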

* acronyms
	checks that acronyms that appear are unchanged

	If the acronym URL appears in the original, this test checks that it also
	appears in the translation.  Whether to translate acronyms is a language
	decision, but many languages leave them unchanged.  In that case this test
	is useful for tracking down translated acronyms and correcting them.

* blank
	checks whether a translation is totally blank

	This checks whether a translation has inadvertently been entered as blank,
	i.e. as spaces only.  This differs from untranslated, where the translation
	is completely empty.  The test is useful because something translated as
	"   " appears to most tools as if it is translated.
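
	The difference between blank and untranslated can be sketched as follows.
	This is only an illustration with a hypothetical function name, not the
	code pofilter runs.

```python
def classify_translation(target: str) -> str:
    """Return 'untranslated', 'blank' or 'translated' for a target string."""
    if target == "":
        return "untranslated"  # completely empty
    if target.strip() == "":
        return "blank"  # whitespace only, yet most tools treat it as translated
    return "translated"
```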

* brackets
	checks that the number of brackets in both strings match
	
	If ([{ or }]) appear in the original this will check that the same number
	appear in the translation.
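
	A minimal sketch of such a count comparison, assuming only the six bracket
	characters named above are of interest (the function name is hypothetical
	and this is not pofilter's own code):

```python
BRACKETS = "([{}])"


def bracket_mismatches(source: str, target: str) -> dict:
    """Map each bracket whose counts differ to (source_count, target_count)."""
    return {
        ch: (source.count(ch), target.count(ch))
        for ch in BRACKETS
        if source.count(ch) != target.count(ch)
    }
```

	An empty result means the counts agree; anything else names the bracket
	that was dropped or added.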

* compendiumconflicts
	checks for Gettext compendium conflicts (#-#-#-#-#)

	When you use msgcat to create a PO compendium it will insert #-#-#-#-# into
	entries that are not consistent.  If the compendium is used later in a
	message merge then these conflicts will appear in your translations.  This
	test quickly extracts those for correction.

* doublequoting   
	checks whether doublequoting is consistent between the two strings

	Checks on double quotes " to ensure that you have the same number in both
	the original and the translated string.
	
* doublespacing   
	checks for bad double-spaces by comparing to original

	This will identify cases where you have [space][space] in your translation
	when it is not in the original, or where it appears in the original but not
	in your translation.  Some of these are spurious and how you correct them
	depends on the conventions of your language.
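
	The comparison described here can be sketched as below.  The function
	names are hypothetical and the real test may be more forgiving than this.

```python
def has_double_space(text: str) -> bool:
    """True when two consecutive spaces occur anywhere in the text."""
    return "  " in text


def doublespacing_ok(source: str, target: str) -> bool:
    """True when both strings agree on whether a double space is present."""
    return has_double_space(source) == has_double_space(target)
```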

* endpunc 
	checks whether punctuation at the end of the strings match

	This will ensure that the ending of your translation has the same
	punctuation as the original.  E.g. if the original ends in :[space] then so
	should yours.  It is useful for ensuring that you have ellipses [...] in all
	your translations.  You may pick up some errors in the original; feel free
	to keep your translation and notify the programmers.  In some languages
	characters such as ? ! are always preceded by a space, e.g. [space]? - do
	what your language customs dictate.  Other false positives appear when a
	change in word order puts "), etc. at the end of the sentence.  Do not
	change these: your language's word order takes precedence.

	Note that if you are tempted to leave out a [full-stop] or [colon], or to
	add a [full-stop] to a sentence, the original punctuation was often chosen
	for a reason, e.g. a list where full stops make it look cluttered.  So
	initially match the English and make changes once the program is being
	used.
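
	To make the comparison concrete, here is a minimal sketch that extracts
	the trailing run of punctuation and whitespace from each string and
	compares the two.  The function names are hypothetical, and limiting the
	check to ASCII punctuation is an assumption of this sketch, not a
	statement about how pofilter behaves.

```python
import string


def trailing_punctuation(text: str) -> str:
    """Return the run of ASCII punctuation and whitespace that ends `text`."""
    end = len(text)
    while end > 0 and (
        text[end - 1] in string.punctuation or text[end - 1].isspace()
    ):
        end -= 1
    return text[end:]


def endpunc_ok(source: str, target: str) -> bool:
    """True when both strings end in the same punctuation run."""
    return trailing_punctuation(source) == trailing_punctuation(target)
```

	A source of "Save as..." with a translation of "Stoor as" would be
	reported, which is exactly the missing ellipsis case mentioned above.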

* endwhitespace   
	checks whether whitespace at the end of the strings matches

	Operates the same as endpunc but is only concerned with whitespace.

* escapes 
	checks whether escaping is consistent between the two strings

	Checks escapes such as \n and \uNNNN to ensure that if they exist in the
	original you also have them in the translation.
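
	As an illustration only, the sketch below collects backslash escapes from
	both strings and compares them regardless of order.  The pattern and the
	function names are assumptions made for this example; pofilter may
	recognise a different set of escapes.

```python
import re

# A \uNNNN escape, or a backslash followed by any single character.
ESCAPE = re.compile(r"\\u[0-9a-fA-F]{4}|\\.")


def escapes_in(text: str) -> list:
    """Return every escape sequence found in the text, in order."""
    return ESCAPE.findall(text)


def escapes_ok(source: str, target: str) -> bool:
    """True when both strings contain the same escapes, in any order."""
    return sorted(escapes_in(source)) == sorted(escapes_in(target))
```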

* filepaths
	checks that file paths have not been translated

	Checks that paths such as /home/user1 have not been translated.  Generally
	you do not translate a file path unless it is being used as an example.

* isfuzzy
	checks whether the PO element has been marked fuzzy

	If a message is marked fuzzy in the PO file then it is extracted.  Note
	this is different from the --fuzzy and --nofuzzy options, which specify
	whether tests should be performed against messages marked fuzzy.

* isreview
	checks whether the PO element has been marked review
	
	If you make use of the non-Gettext 'review' flag:

		#, review[ - reason for review]

	Then if a message is marked for review in the PO file it will be extracted.  
	Note this is different from --review and --noreview options which specify 
	whether tests should be performed against messages marked review.

* kdecomments
	checks to ensure that no KDE style comments appear in the translation

	KDE style translator comments appear in PO files as "_: comment\n".
	New translators often translate the comment.  This test tries to identify
	instances where the comment has been translated.
	
* long    
	checks whether a translation is much longer than the original string

* numbers 
	checks whether numbers of various forms are consistent between the two strings

	You will see some errors where you have either written the number in full
	or converted it to the digit in your translation.  Also changes in order
	will trigger this error.
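
	A rough sketch of a number comparison that is sensitive to order, as
	described above.  The regular expression and the function names are
	assumptions for illustration; the "various forms" pofilter recognises are
	not reproduced here.

```python
import re

# Digits, optionally grouped with . or , (e.g. 10, 3.5, 1,000).
NUMBER = re.compile(r"\d+(?:[.,]\d+)*")


def numbers_in(text: str) -> list:
    """Return the numbers written as digits, in order of appearance."""
    return NUMBER.findall(text)


def numbers_ok(source: str, target: str) -> bool:
    """True when both strings contain the same numbers in the same order."""
    return numbers_in(source) == numbers_in(target)
```

	Both "2 files" translated as "twee lêers" and a translation that swaps
	the order of two numbers would be reported by this sketch.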

* puncspacing     
	checks for bad spacing after punctuation

	In the case of [full-stop][space] in the original this test checks that your
	translation does not remove the space.  It also checks for [comma], [colon],
	etc.

* purepunc        
	checks that strings that are purely punctuation are not changed

	This extracts strings like "+" or "-" as these usually should not be
	changed.
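
	One way to picture this test, assuming "punctuation" means ASCII
	punctuation and using hypothetical function names:

```python
import string


def is_pure_punctuation(text: str) -> bool:
    """True for non-empty strings made up only of ASCII punctuation."""
    stripped = text.strip()
    return stripped != "" and all(ch in string.punctuation for ch in stripped)


def purepunc_ok(source: str, target: str) -> bool:
    """A pure-punctuation source must be carried over unchanged."""
    return not is_pure_punctuation(source) or source == target
```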

* sentencecount
	checks that the number of sentences in both strings match

	Counts the number of full stops to see that the sentence count is the same
	between the original and translated string.
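
	Counting full stops is a blunt measure, as this small sketch shows.  The
	function names are hypothetical and the code is not taken from pofilter.

```python
def full_stop_count(text: str) -> int:
    """Count every full stop, including those inside an ellipsis."""
    return text.count(".")


def sentencecount_ok(source: str, target: str) -> bool:
    """True when both strings contain the same number of full stops."""
    return full_stop_count(source) == full_stop_count(target)
```

	Joining two sentences with a comma in the translation changes the count,
	so "Done. Restart now." against "Klaar, herbegin nou." is reported even
	though the translation may be perfectly acceptable.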

* short   
	checks whether a translation is much shorter than the original string

* simplecaps      
	checks the capitalisation of two strings isn't wildly different

	This will pick up many false positives so don't be a slave to it.  It is
	useful for identifying translations that do not start with a capital when
	they should, or those that do when they shouldn't.  It will also highlight
	sentences that have extra capitals; depending on the capitalisation
	convention of your language you might want to change these to Title Case
	or change them all to normal sentence case.

* singlequoting   
	checks whether singlequoting is consistent between the two strings

	The same as doublequoting but checks for the ' character.  Because this is
	used in words like - it's, user's, etc - this can cause spurious errors as
	your language might not use such a system.  If a quote appears at the end
	of a sentence in the translation, i.e. '[full-stop], this might not be
	detected properly by the check.

* startpunc       
	checks whether punctuation at the beginning of the strings match

	Operates as endpunc but you will probably see fewer errors.

* startwhitespace 
	checks whether whitespace at the beginning of the strings matches

	As in endwhitespace but you will see fewer errors.

* unchanged       
	checks whether a translation is basically identical to the original string

	This checks whether the translation is just a copy of the English
	original.  Many times this is what you want, but sometimes you will detect
	words that should have been translated.

* untranslated    
	checks whether a string has been translated at all

	This check is really only useful if you want to extract untranslated
	strings so that they can be translated independently of the main work.

* variables       
	checks whether variables of various forms are consistent between the two strings

	This checks to make sure that variables that appear in the original also
	appear in the translation.  Make sure you use the --kde, --openoffice, etc
	flags as these define what variables will be searched for.  It does not at
	the moment cope with variables that use the reordering syntax of Gettext PO
	files.

* xmltags
	checks that XML/HTML tags have not been translated

	This check finds the number of tags in the source string and checks that
	the same number are in the translation.  If the counts don't match then
	either the tag is missing or it was mistakenly translated by the
	translator, both of which are errors.

	The check ignores tags, or things that look like tags, that cover the whole
	string, e.g. "<Error>", but will produce false positives for things like "An
	<Error> occurred" as here "Error" should be translated.  It will also
	report a translated alt attribute in e.g. <img src=bob.png alt="blah"> as an
	error when in fact it is correct.
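
	The counting step and the whole-string exception can be sketched as
	follows.  This is a simplified illustration with hypothetical names and a
	deliberately naive tag pattern; it compares counts only and does not
	reproduce the other behaviour described above.

```python
import re

# Anything between < and > with no nested angle brackets.
TAG = re.compile(r"<[^<>]+>")


def tags_in(text: str) -> list:
    """Return everything that looks like an XML/HTML tag."""
    return TAG.findall(text)


def xmltags_ok(source: str, target: str) -> bool:
    """True when source and target hold the same number of tags."""
    source_tags = tags_in(source)
    # A tag-like string that covers the whole source, e.g. "<Error>", is ignored.
    if len(source_tags) == 1 and source_tags[0] == source.strip():
        return True
    return len(source_tags) == len(tags_in(target))
```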