How can the word-count analysis differ several hundred thousand words between Memsource and Studio?
Thread poster: Fredrik Pettersson
Fredrik Pettersson
Fredrik Pettersson  Identity Verified
Hong Kong
Local time: 02:50
Member (2009)
English to Swedish
+ ...
Sep 20, 2021

How can the word-count analysis differ several hundred thousand words between Memsource and Studio?

I applied the two TMs I received from the customer when running the analysis in SDL Trados Studio 2021, but I still get a total of 411,000 words, whereas the customer gets a total of 284,752 words in his Memsource analysis.
He gets 63,611 new words in Memsource; I get 121,838 new words in SDL Trados Studio.


Yaotl Altan
 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 02:50
Member (2006)
English to Afrikaans
+ ...
@Fredrik Sep 20, 2021

Fredrik Pettersson wrote:
I applied the two TMs I received from the customer when running the analysis in SDL Trados Studio 2021, but I still get a total of 411,000 words, whereas the customer gets a total of 284,752 words in his Memsource analysis.
He gets 63,611 new words in Memsource; I get 121,838 new words in SDL Trados Studio.


Did he send you a complete analysis report or just a short summary of one?

Perhaps your client can shed light on this -- did you ask him? It may be that he applies certain settings in Memsource that result in this discrepancy. Are you sure the two TMs that he sent you are the only TMs that he applies?


 
Fredrik Pettersson
Fredrik Pettersson  Identity Verified
Hong Kong
Local time: 02:50
Member (2009)
English to Swedish
+ ...
TOPIC STARTER
Memsource excludes the confirmed segments Sep 20, 2021

I received the explanation that Memsource excludes confirmed segments. But there is still a large discrepancy between the individual fuzzy-match bands. Can that also be explained by Memsource excluding the confirmed segments?

It might be that the customer applies a third TM, which would be their server TM. I will ask him.
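
As a rough illustration of how excluding confirmed segments changes a total, here is a minimal sketch. The `Segment` class, the `total_words` function, and the sample numbers are all made up for the example; this is not how Memsource or Studio implement their analysis.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    words: int       # source words in the segment
    confirmed: bool  # True if the segment is already confirmed in the job


def total_words(segments, exclude_confirmed=False):
    """Sum source words, optionally skipping confirmed segments."""
    return sum(
        s.words for s in segments
        if not (exclude_confirmed and s.confirmed)
    )


# Made-up job: three segments, one of them already confirmed.
job = [Segment(12, False), Segment(30, True), Segment(8, False)]

print(total_words(job))                          # 50: every segment counted
print(total_words(job, exclude_confirmed=True))  # 20: confirmed segment skipped
```

If one tool counts every segment and the other skips confirmed ones, the totals diverge even with identical TMs applied.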


 
Heloísa Helena Benetton Costa
Heloísa Helena Benetton Costa  Identity Verified
Brazil
Local time: 21:50
English to Portuguese
+ ...
Memsource wordcount Nov 19, 2021

Hi Fredrik,

I discovered that, if you download the .doc file, you'll see that you have a table with the original AND the pre-translation from TM. I believe this explains the difference.

On the other hand, if you select all the files in the portal and double-click them, you will open all the files joined together and get the same word count the client sent. But this does not happen if you use the Desktop Editor.

ANYWAY, whenever I translate using MemSource, I feel that I am working twice as much as with any other software. I always feel the wordcount is much higher. And MemSource does not allow you to run your own analysis. Unless, of course, you buy the software, which costs an arm and a leg!

Best,

Heloisa

[Edited at 2021-11-19 19:53 GMT]


Yaotl Altan
 
Artem Vakhitov
Artem Vakhitov  Identity Verified
Kyrgyzstan
English to Russian
+ ...
Had a somewhat similar problem Nov 24, 2021

I've had a somewhat similar problem with MemSource. See my topics:

"MemSource XLIFF analysis gives 3x as many repetitions vs. SDL Analyse on an exported DOCX" in MemSource Support

"MXLIFF word count, including repetitions—alternatives to MemSource?" in CAT Tools Technical Help

(For some reason, this post doesn't work correctly with links.)

"ANYWAY, whenever I translate using MemSource, I feel that I am working twice as much as with any other software."

My sentiments exactly. Well, they can argue that there's nothing wrong with the software/service per se, and technically that's probably true, but I'm afraid that much of MemSource's success is due to the powerful ways it offers to screw the translator.

[Edited at 2021-11-24 21:56 GMT]


Jean Dimitriadis
 
Roy Oestensen
Roy Oestensen  Identity Verified
Denmark
Local time: 02:50
Member (2010)
English to Norwegian (Bokmal)
+ ...
Problem is: what is a word, and what should be included in the count. Nov 25, 2021

Word count is an old discussion topic. See for instance
https://www.proz.com/forum/memsource_support/348913-unrealistic_analysis.html
You may also read the following:
https://helpcenter.memoq.com/hc/en-us/articles/360010268440-Why-are-my-memoQ-word-counts-different-from-Microsoft-Word-or-other-translation-tools
And here is another. https://www.proz.com/forum/cat_tools_technical_help/13822-trados_word_count_vs_practicount_word_count_vs_ms_word_word_count.html
I know I've seen a practical analysis of how different tools count words, but I am not able to find it again.

But it seems that, given the way Studio defines a word, it will always give a higher word count than other tools. I realise, though, that while the word definition is probably part of the reason, there seem to be other reasons lurking too, which I cannot quite see.
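
To show how much the definition of a word alone can move a count, here is a small sketch with two made-up counting rules. Neither rule is claimed to be what Studio, Memsource, or any other tool actually uses; they only show that the same sentence can yield very different totals.

```python
import re


def count_whitespace(text):
    """One word per whitespace-delimited token."""
    return len(text.split())


def count_split_punct(text):
    """Additionally break tokens at hyphens and slashes."""
    return len([t for t in re.split(r"[\s\-/]+", text) if t])


sample = "State-of-the-art input/output settings"

print(count_whitespace(sample))   # 3
print(count_split_punct(sample))  # 7
```

Over a few hundred thousand words, a difference in how hyphenated terms, slashes, or numbers are treated can add up to a noticeable gap.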


 
Fredrik Pettersson
Fredrik Pettersson  Identity Verified
Hong Kong
Local time: 02:50
Member (2009)
English to Swedish
+ ...
TOPIC STARTER
Working twice as much with Memsource, I agree Nov 26, 2021

I have another customer that, for some reason, states a preliminary word-count when the job offer is sent out to freelance translators. This preliminary word-count is then adjusted after the translator has delivered the completed translation. And these are not minor adjustments: I think I've seen payment reduced by around 40 percent in post-adjustments. This made me react, of course, and I asked for the reasons for this.

I don't remember the details of the agency's reply, but whatever the explanation, this fits with what I hear from you fellow translators about Memsource word-counts: that it's not a reliable way of counting words, and that its output differs considerably from that of most other CAT tools. Furthermore, Memsource doesn't fall into the top category of advanced CAT tools but is a rather simple CAT tool with limited functionality. Wouldn't that limited capacity include limitations as to detailed and accurate word-counts as well? There really should be some universal word-count tool to resort to in cases like this, when a less advanced CAT tool is used.


 
Stepan Konev
Stepan Konev  Identity Verified
Russian Federation
Local time: 03:50
English to Russian
Easy to compare Nov 27, 2021

You can easily check how Memsource counts. Memsource skips bullets and automatic numbering. You don’t translate bullets, do you? The longer your text is, the more such elements it contains. I tried to compare a “clean” text without bullets and other noise, and Memsource gave me exactly the same word count. Here are some images proving that: https://www.proz.com/forum/memsource_support/348913-unrealistic_analysis.html
In terms of work effort, Memsource shows you a realistic picture, net of noise like bullets, autonumbering, and non-translatables.
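
The effect of skipping list markers can be sketched on plain text. This assumes the markers appear as literal characters at the start of each line, which is a simplification: in a real DOCX, automatic numbering is stored separately from the text. The `LIST_MARKER` pattern and `count_words` function are illustrative only.

```python
import re

# Leading bullet character or "1." / "2)" style numbering (illustrative pattern).
LIST_MARKER = re.compile(r"^\s*(?:[•\-*]|\d+[.)])\s+")


def count_words(lines, skip_markers):
    """Count whitespace-delimited tokens, optionally dropping list markers."""
    total = 0
    for line in lines:
        if skip_markers:
            line = LIST_MARKER.sub("", line)
        total += len(line.split())
    return total


doc = [
    "1. Open the file",
    "2. Run the analysis",
    "• Check the report",
]

print(count_words(doc, skip_markers=False))  # 12: markers counted as words
print(count_words(doc, skip_markers=True))   # 9: markers ignored
```

In this toy document the markers inflate the count by a third; the share in a real file depends on how list-heavy it is.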

[Edited at 2021-11-27 10:13 GMT]


 
Roy Oestensen
Roy Oestensen  Identity Verified
Denmark
Local time: 02:50
Member (2010)
English to Norwegian (Bokmal)
+ ...
Different word count before and after the translation??? Nov 29, 2021

Fredrik Pettersson wrote:

I have another customer that, for some reason, states a preliminary word-count when the job offer is sent out to freelance translators. This preliminary word-count is then adjusted after the translator has delivered the completed translation. And these are not minor adjustments: I think I've seen payment reduced by around 40 percent in post-adjustments. This made me react, of course, and I asked for the reasons for this. I don't remember the details of the agency's reply, but whatever the explanation, this fits with what I hear from you fellow translators about Memsource word-counts: that it's not a reliable way of counting words, and that its output differs considerably from that of most other CAT tools. Furthermore, Memsource doesn't fall into the top category of advanced CAT tools but is a rather simple CAT tool with limited functionality. Wouldn't that limited capacity include limitations as to detailed and accurate word-counts as well? There really should be some universal word-count tool to resort to in cases like this, when a less advanced CAT tool is used.


I can see that the weighted word count would be different after a translation than before, since by then all the segments would be 100% matches. But post-adjustment? Perhaps there is more than one translator working on the project, and the agency then takes each translator's contribution into account when counting the words for the others. I would not accept this. When I start a translation, I expect to have an analysis that I will be paid according to, and I expect the agency not to adjust it downwards afterwards. If they did, I would say farewell to them, since I would not find them trustworthy or willing to pay what a translation is worth.

Roy
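
A small sketch of why a weighted count run after delivery comes out lower than one run before. The band names, the percentage weights in `WEIGHTS_PCT`, and the word figures are all hypothetical; real rate grids vary from agency to agency.

```python
# Hypothetical pay weights per match band, in percent of the full word rate.
WEIGHTS_PCT = {"new": 100, "fuzzy": 50, "100%": 10}


def weighted_count(words_per_band):
    """Weighted word count: each band's words scaled by its pay weight."""
    return sum(
        words * WEIGHTS_PCT[band] for band, words in words_per_band.items()
    ) / 100


# Same 1,000-word text analysed before and after translation.
before = {"new": 600, "fuzzy": 300, "100%": 100}
after = {"new": 0, "fuzzy": 0, "100%": 1000}  # TM now contains every segment

print(weighted_count(before))  # 760.0
print(weighted_count(after))   # 100.0
```

This is why an analysis agreed before the work starts is the only meaningful basis for payment: once the translator's own segments are in the TM, any re-analysis treats them as matches.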


 

