Quotations
Thread poster: Wolfgang Schoene
Wolfgang Schoene
Wolfgang Schoene  Identity Verified
France
Local time: 19:31
Member (2007)
English to German
+ ...
Mar 30, 2022

I'm working on a quotation and am currently analyzing the files.
3 IDML files.

Wordfast Pro 7 counts 38,719 words / 26,107 repetitions
MemoQ counts 58,597 words / 40,145 reps
Trados Studio counts 59,602 words / 48,451 reps

Whom shall I trust?
The difference between Wordfast count and MemoQ and Studio counts seems absurd.
Any explanation for that?


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 19:31
Member (2006)
English to Afrikaans
+ ...
@Wolfgang Mar 30, 2022

Wolfgang wrote:
The difference between Wordfast count and MemoQ and Studio counts seems absurd.

Perhaps Wordfast doesn't extract all of the text.

You can test this by exporting the files to DOC/DOCX/RTF and comparing them in Word to see what are the extra bits of text that Trados and MemoQ are getting. In WFP7, it's Ctrl+5, then add the TXLF files, then "Bilingual export". In MemoQ, you create a view, then right-click and "Export bilingual". In Trados, right-click the project and go Batch Tasks > Export Bilingual Review.

Or, perhaps Wordfast quietly ignores certain tags that both Trados and MemoQ do consider, which leads to fewer 100% matches. What do the three tools say are the total word counts?
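
If comparing in Word is too slow for files of this size, the same check can be scripted. This is only a minimal sketch: it assumes the source column of each bilingual export has already been saved as a list of plain-text segments, and the sample segments and the names `extra_segments` and `count_words` are made up for illustration, not taken from any of the tools.

```python
from collections import Counter


def count_words(text: str) -> int:
    """Naive word count: split on any run of whitespace."""
    return len(text.split())


def extra_segments(tool_a, tool_b):
    """Return segments (with multiplicity) that tool_a extracted but tool_b did not."""
    def normalise(segment):
        # Collapse runs of whitespace so that pure spacing differences are ignored.
        return " ".join(segment.split())

    a = Counter(normalise(s) for s in tool_a if s.strip())
    b = Counter(normalise(s) for s in tool_b if s.strip())
    return a - b  # Counter subtraction keeps only positive differences


# Hypothetical sample data standing in for two exported source columns.
trados = ["Press the button.", "Press the button.", "Page 1", "Safety instructions"]
wordfast = ["Press  the button.", "Safety instructions"]

extra = extra_segments(trados, wordfast)
extra_words = sum(count_words(seg) * n for seg, n in extra.items())

for seg, n in extra.items():
    print(f"{n} x {seg!r}")
print("Words only in the first export:", extra_words)
```

Whatever shows up in `extra` is the text that one filter extracts and the other skips, which is usually quicker to judge than the raw totals.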


 
Wolfgang Schoene
Wolfgang Schoene  Identity Verified
France
Local time: 19:31
Member (2007)
English to German
+ ...
TOPIC STARTER
Total word counts Mar 30, 2022

Samuel Murray wrote:

Wolfgang wrote:
The difference between Wordfast count and MemoQ and Studio counts seems absurd.

Perhaps Wordfast doesn't extract all of the text.

You can test this by exporting the files to DOC/DOCX/RTF and comparing them in Word to see what are the extra bits of text that Trados and MemoQ are getting. In WFP7, it's Ctrl+5, then add the TXLF files, then "Bilingual export". In MemoQ, you create a view, then right-click and "Export bilingual". In Trados, right-click the project and go Batch Tasks > Export Bilingual Review.

Or, perhaps Wordfast quietly ignores certain tags that both Trados and MemoQ do consider, which leads to fewer 100% matches. What do the three tools say are the total word counts?


Hi Samuel
total word counts are those I quoted, that is:
Trados: 59,402
MemoQ: 58,597
Wordfast: 38,719
Just doing one more analysis with DVX


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 20:31
English to Russian
Different ways of trimming spaces? Mar 30, 2022

Probably, this may be caused by how different tools trim spaces, for example. Below are three identical segments in MS Word, Trados and memoQ.
[Screenshot: 2022-03-30_204939]
In this situation Trados seems to be most honest. Basically you can copy the entire source column from Trados and paste it into an MS Word document to see the real scope of work net of multiple spaces and tabs, tags, pictures, etc.


 
Wolfgang Schoene
Wolfgang Schoene  Identity Verified
France
Local time: 19:31
Member (2007)
English to German
+ ...
TOPIC STARTER
DVX analysis Mar 30, 2022

Wolfgang wrote:

Samuel Murray wrote:

Wolfgang wrote:
The difference between Wordfast count and MemoQ and Studio counts seems absurd.

Perhaps Wordfast doesn't extract all of the text.

You can test this by exporting the files to DOC/DOCX/RTF and comparing them in Word to see what are the extra bits of text that Trados and MemoQ are getting. In WFP7, it's Ctrl+5, then add the TXLF files, then "Bilingual export". In MemoQ, you create a view, then right-click and "Export bilingual". In Trados, right-click the project and go Batch Tasks > Export Bilingual Review.

Or, perhaps Wordfast quietly ignores certain tags that both Trados and MemoQ do consider, which leads to fewer 100% matches. What do the three tools say are the total word counts?


Hi Samuel
total word counts are those I quoted, that is:
Trados: 59,402
MemoQ: 58,597
Wordfast: 38,719
Just doing one more analysis with DVX


The DVX analysis says Total Words 32,869 / Duplicates 18,194 (no match 14,675)
So this looks more or less like the WFP analysis.
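
For the quotation itself, the totals matter less than what is left after repetitions. A quick sketch of that arithmetic, using only the figures posted in this thread (the Trados total is taken as 59,602 from the first post; it also appears as 59,402 later on, so treat that row with caution). The helper name `net_words` is just for illustration.

```python
# (total words, repetitions) as reported in this thread.
counts = {
    "Wordfast Pro 7": (38_719, 26_107),
    "memoQ": (58_597, 40_145),
    "Trados Studio": (59_602, 48_451),  # total also quoted as 59,402 in the thread
    "DVX": (32_869, 18_194),
}


def net_words(total: int, repetitions: int) -> int:
    """Words left once repetitions are subtracted from the total."""
    return total - repetitions


for tool, (total, reps) in counts.items():
    net = net_words(total, reps)
    share = reps / total  # fraction of the total that the tool treats as repeated
    print(f"{tool:15} net {net:6,d}  repetition share {share:.0%}")
```

The net figures come out at 12,612 (Wordfast), 18,452 (memoQ), 11,151 (Trados) and 14,675 (DVX, which matches its "no match" figure), so the tools disagree on the net volume as well as on the totals.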


 
Wolfgang Schoene
Wolfgang Schoene  Identity Verified
France
Local time: 19:31
Member (2007)
English to German
+ ...
TOPIC STARTER
Thanks Stepan Mar 30, 2022

Stepan Konev wrote:

Probably, this may be caused by how different tools trim spaces, for example. Below are three identical segments in MS Word, Trados and memoQ.
[Screenshot: 2022-03-30_204939]
In this situation Trados seems to be most honest. Basically you can copy the entire source column from Trados and paste it into an MS Word document to see the real scope of work net of multiple spaces and tabs, tags, pictures, etc.


Will do that


 
Sergei Leshchinsky
Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 20:31
Member (2008)
English to Russian
+ ...
IDML Mar 30, 2022

IDML is a "distilled" (plain-text marked-up) InDesign format (for translation) that cannot be exported to DOC. It can only be imported back to the same InDesign project to update it with translation.

I guess the problem is in the CAT-tools' filters and their ability to handle the InDesign-specific XML correctly.

Samuel Murray wrote:

Wolfgang wrote:
The difference between Wordfast count and MemoQ and Studio counts seems absurd.

Perhaps Wordfast doesn't extract all of the text.

You can test this by exporting the files to DOC/DOCX/RTF and comparing them in Word to see what are the extra bits of text that Trados and MemoQ are getting. In WFP7, it's Ctrl+5, then add the TXLF files, then "Bilingual export". In MemoQ, you create a view, then right-click and "Export bilingual". In Trados, right-click the project and go Batch Tasks > Export Bilingual Review.

Or, perhaps Wordfast quietly ignores certain tags that both Trados and MemoQ do consider, which leads to fewer 100% matches. What do the three tools say are the total word counts?


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 19:31
Member (2006)
English to Afrikaans
+ ...
@Sergei Mar 30, 2022

Sergei Leshchinsky wrote:
Samuel Murray wrote:
You can test this by exporting the files to DOC/DOCX/RTF and comparing them in Word to see what are the extra bits of text that Trados and MemoQ are getting. In WFP7, it's Ctrl+5, then add the TXLF files, then "Bilingual export". In MemoQ, you create a view, then right-click and "Export bilingual". In Trados, right-click the project and go Batch Tasks > Export Bilingual Review.

IDML is a "distilled" (plain-text marked-up) InDesign format (for translation) that cannot be exported to DOC. It can only be imported back to the same InDesign project to update it with translation.

The export formats mentioned in my post are actually possible for IDML.


 
Sergei Leshchinsky
Sergei Leshchinsky  Identity Verified
Ukraine
Local time: 20:31
Member (2008)
English to Russian
+ ...
Right Mar 30, 2022

Yes, one can export a bilingual table and compare it, even visually.
But this only helps to spot where the logic bug is.

Samuel Murray wrote:
You can test this by exporting the files to DOC/DOCX/RTF and comparing them in Word to see what are the extra bits of text


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
CafeTran Mar 31, 2022

Any particular reason not to use CafeTran?

Import the idml, replace [aeiuo] with m.

Export.

Use the Affinity Publisher trial to check the IDML.

Regarding the counting: you could verify via a bilingual export table.


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Images Mar 31, 2022

I used a very simple, publicly available IDML file (but from experience I know that CafeTran Espresso's filter is good and can handle complex documents).

Screen Shot 2022-03-31 at 08.02.18

Screen Shot 2022-03-31 at 08.01.00

Screen Shot 2022-03-31 at 08.01.43


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Net table Mar 31, 2022

Without the header and the numbering:

Screen Shot 2022-03-31 at 08.06.09


 
Rolf Keller
Rolf Keller
Germany
Local time: 19:31
English to German
Counting words vs. counting characters Mar 31, 2022

Stepan Konev wrote:

Probably, this may be caused by how different tools trim spaces for example.

Different handling of spaces might influence the character count but not the word count - at least not to such an extent.
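
A small illustration of that point, assuming a simple whitespace-based word count (real CAT tools use their own tokenisation rules, so this is a sketch, not how any of them actually counts). The sample string and function names are invented.

```python
import re


def word_count(text: str) -> int:
    """Count words as whitespace-separated tokens."""
    return len(text.split())


def char_count(text: str) -> int:
    """Count every character, spaces and tabs included."""
    return len(text)


def collapse_whitespace(text: str) -> str:
    """Replace each run of whitespace with one space and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


raw = "Press  the\tgreen   button. "  # double space, tab, triple space, trailing space
trimmed = collapse_whitespace(raw)

print("words:", word_count(raw), "->", word_count(trimmed))
print("chars:", char_count(raw), "->", char_count(trimmed))
```

Trimming changes the character count here (27 down to 23) while the word count stays at 4, which is why space handling alone is unlikely to account for a gap of some 20,000 words.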


 
Joakim Braun
Joakim Braun  Identity Verified
Sweden
Local time: 19:31
German to Swedish
+ ...
Nitpicks Mar 31, 2022

Sergei Leshchinsky wrote:

IDML is a "distilled" (plain-text marked-up) InDesign format



It's not distilled. IDML contains all the data necessary to recreate the complete InDesign document, the same as .indd files. In fact, it IS an InDesign document.

Sergei Leshchinsky wrote:
(for translation)


IDML is intended as a backwards-compatible XML format. That it works with CAT is just a coincidence.

Sergei Leshchinsky wrote:
It can only be imported back to the same InDesign project to update it with translation.


There's no such thing as an "InDesign project", and the IDML file is not "imported back to" anything. The CAT tool simply creates a new IDML file with updated text.
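
Since an IDML file is a ZIP package of XML parts, a rough independent word count is possible without any CAT tool. The sketch below assumes the usual package layout, with story text held in `Content` elements inside the `Stories/*.xml` parts; the function name `idml_story_word_count` is made up. It counts every story it finds, so text on hidden layers, master spreads or the pasteboard is included, and it ignores finer points such as special characters inside a `Content` element. Treat the result as a sanity check only.

```python
import xml.etree.ElementTree as ET
import zipfile


def idml_story_word_count(idml) -> int:
    """Sum whitespace-separated words in all Content elements of all story parts.

    `idml` may be a file path or a binary file-like object.
    """
    total = 0
    with zipfile.ZipFile(idml) as package:
        for name in package.namelist():
            # Story text is assumed to live in Stories/Story_*.xml.
            if not (name.startswith("Stories/") and name.endswith(".xml")):
                continue
            root = ET.fromstring(package.read(name))
            for element in root.iter():
                # Compare local names only, so namespace prefixes do not matter.
                local_name = element.tag.rsplit("}", 1)[-1]
                if local_name == "Content" and element.text:
                    total += len(element.text.split())
    return total


# Usage (the path is a placeholder):
# print(idml_story_word_count("brochure.idml"))
```

If this figure lands near one group of tools rather than the other, that at least hints at whether the higher counts include text that the lower ones leave out.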


 