TMX editor with Regex support (Windows 10)? Thread poster: Hayley Leva
| Hayley Leva France Local time: 15:22 French to English + ...
Hello, I have been looking into TMX editors with the aim of cleaning up a large (and overly messy) translation memory. In particular, I would like to delete:
- number-only segments
- segments containing (only) product codes and/or brand names
- segments shorter than X characters.
I imagine an editor which supports Regex might be the answer, but does such a thing exist? I've already tried:
- Heartsome TMX editor - I can't find a way to filter as I need to
- Olifant TMX editor - I read that it supports Regex, but I get an error when trying to import my TMX file ("Item has already been added"), so I haven't got any further than that (I've made a separate post for that issue*)
Any ideas? Hayley
*Post regarding Olifant error: https://www.proz.com/forum/cat_tools_technical_help/329711-olifant_vers_20_build_5_error_when_importing_"item_has_already_been_added_key_in_dictionary".html
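As a rough illustration of the three deletion criteria listed above, here is a minimal Python sketch of a per-segment check. Everything in it is an assumption made for the example: the threshold `MIN_CHARS` stands in for the "X" in the post, the product-code shape (capital letters and digits with optional hyphens, containing at least one digit) is a guess, and the brand names are placeholders. It only classifies the text of a single segment; it does not read or write TMX.

```python
import re

# Hypothetical settings: adjust all of these to your own data.
MIN_CHARS = 4  # stands in for the "X" in the post
# Digits plus common separators and signs, e.g. "12 345,50" or "10%".
NUMBER_ONLY = re.compile(r"[\d\s.,:/%+\-]+")
# Assumed product-code shape: capitals/digits, optional hyphens, at least one digit.
PRODUCT_CODE = re.compile(r"(?=[A-Z0-9-]*\d)[A-Z0-9]+(?:-[A-Z0-9]+)*")
BRAND_NAMES = {"Acme", "Globex"}  # placeholder brand list


def should_delete(segment: str) -> bool:
    """Return True if the segment text meets one of the three criteria."""
    text = segment.strip()
    if len(text) < MIN_CHARS:
        return True  # shorter than X characters
    if NUMBER_ONLY.fullmatch(text):
        return True  # number-only segment
    # Only product codes and/or brand names: every token must be one or the other.
    return all(
        PRODUCT_CODE.fullmatch(token) or token in BRAND_NAMES
        for token in text.split()
    )
```

A segment such as "Install the AB-1234 pump" is kept because it contains ordinary words, whereas "AB-1234 Acme" is flagged. Multi-word brand names would need a different check than the token-by-token test used here.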
[Deleted]
[Edited at 2018-10-17 13:47 GMT] | | | Hayley Leva France Local time: 15:22 French to English + ... TOPIC STARTER CafeTran looks like a good option (but licence required for TMs with more than 1000 TUs) | Oct 17, 2018 |
Thank you Michael for your reply - from your screenshots it certainly looks like CafeTran might be able to do what I need. However, I checked the CafeTran licensing page, and it looks like the free version has a TM limit of 1000 TUs. As the TMs I want to clean up each have 15,000 - 30,000 TUs, if I did want to use CafeTran, I would have to pay for a licence. I'm not ruling that out as an option, but I'll need to do some more research first!
One specific query I have relates to compatibility, as I created (and currently use) my TMs in Trados Studio (2017). Can you confirm that I can do the following:
- export a TMX of my TM from Trados Studio (2017) (I know how to do this step)
- open and clean up* that TMX in CafeTran
- convert the edited TMX back into a Studio TM
*And in terms of clean-up in CafeTran, as well as deleting unwanted segments, do you know whether it is also possible to edit the values of custom fields in my TM?
Regards, Hayley
Maxprograms' TMXEditor supports regular expressions | Oct 17, 2018 |
Hi, TMXEditor (https://www.maxprograms.com/products/tmxeditor.html) supports filtering on regular expressions. Filter for the segments that you want, then delete them as you see fit. Regards, Rodolfo
| | | Selcuk Akyuz Türkiye Local time: 17:22 English to Turkish + ... Why don't you use DVX? | Oct 17, 2018 |
Hi Hayley, Deja Vu is listed on your profile page, why don't you use it for this task? You can easily filter (and delete) such segments using SQL filters. | | | No TUs restrictions for TMX Editing in CafeTran Espresso | Oct 18, 2018 |
Hi Hayley,
As I have been able to confirm recently, the demo version limits do NOT apply to TMX editing in CafeTran Espresso. You can edit big TMs without owning a license. Just drop the TMX on the dashboard and choose Edit Translation Memory. https://www.proz.com/forum/cat_tools_technical_help/326359-in_search_for_an_app_for_quick_translations.html

Hayley Leva wrote: *And in terms of clean-up in CafeTran, as well as deleting unwanted segments, do you know whether it is also possible to edit the values of custom fields in my TM?

Yes, from Michael's screenshot: Task > TMX memory > Set TMX property / Remove TMX property. This applies to all filtered segments.
Apart from the Filter menu, you can also filter segments by searching for them in the Quick Search Bar. For example:
- Type the number 5 and only the 5th segment will be selected.
- Type 5-95 and only that range will be selected.
Just make sure that when you press Ctrl+F to bring up the advanced search preferences, "Segment numbers" is enabled on the left side of the search window.

Hayley Leva wrote: One specific query I have relates to compatibility, as I created (and currently use) my TMs in Trados Studio (2017). Can you confirm that I can do the following: - export a tmx of my TM from Trados Studio (2017) (I know how to do this step) - open and clean up* that tmx in CafeTran - convert the edited tmx back into a Studio TM

Yes, SDL Trados can export and import TMX files, and CafeTran produces valid TMX files which can be imported back into SDL Trados. I have done so successfully in the 2015 version.
Jean
PS: For quickly customizing the UI to your liking, you can read my quick "Getting comfortable" document for CafeTran. In particular, you will want to test which of the 6 available window layouts (found in the View > Window layout submenu) is the most practical for you in the TMX editing scenario. Try this with a TMX already loaded.
[Edited at 2018-10-18 02:26 GMT] | | | Samuel Murray Netherlands Local time: 15:22 Member (2006) English to Afrikaans + ...
Hayley Leva wrote: Can you confirm that I can do the following: - export a tmx of my TM from Trados Studio (2017) (I know how to do this step) - open and clean up* that tmx in CafeTran - convert the edited tmx back into a Studio TM

I have no idea what CafeTran does to a TMX file, but even if we assume that CafeTran and Trados 2017 both fully accept valid TMX, there may still be a problem on Trados' end. Trados can export TMX files in several variants, and not all of those options are roundtrip compatible. You'd have to ask this question in the Trados forum, i.e. "if I want to export to TMX and then import from TMX again, what export/import settings should I use to ensure that the imported units contain all the same information as in the original TM?".
Theoretically, another thing can go wrong: CafeTran's XML engine may not be compatible with Trados' XML engine. I know from experience that Trados' engine is quite forgiving about non-valid XML characters in some cases (and unforgiving in other cases), and we don't know how CafeTran's engine would deal with such characters (would it refuse to open the TMX, would it open the TMX and silently replace the invalid characters with something else, would it open the TMX but then refuse to save it until you've removed the invalid characters, etc).
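One way to take the guesswork out of the invalid-character concern is to scan the exported TMX text before opening it in another tool. The following is a minimal Python sketch, not a feature of either Trados or CafeTran: it reports raw characters that fall outside the character range allowed by XML 1.0. The function name is made up for this example, and the sketch assumes the file has already been read into a string with the correct encoding. It does not check numeric character references such as `&#x1F;`.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
INVALID_XML_CHAR = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text: str) -> list[tuple[int, int, str]]:
    """Return (line, column, code point) for each raw invalid character."""
    hits = []
    # Split on "\n" only: str.splitlines() would also split on some of the
    # control characters we are trying to detect (e.g. U+000B, U+000C).
    for line_no, line in enumerate(text.split("\n"), start=1):
        for match in INVALID_XML_CHAR.finditer(line):
            code_point = f"U+{ord(match.group()):04X}"
            hits.append((line_no, match.start() + 1, code_point))
    return hits
```

An empty result only means that no raw invalid characters were found; it says nothing about whether the export/import settings are roundtrip compatible.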
Still you can use an ordinary text editor with regex support | Oct 18, 2018 |
I use the following pattern to find empty segments and segments containing only integers, so that I can delete them from a TMX file loaded into the Notepad++ editor (en-US and ru):
<tu>\r?\n?<tuv xml:lang=\"en-US\"><seg\/?>\d*\r?\n?(?:<\/seg>)?<\/tuv>\r?\n?<tuv xml:lang=\"ru\"><seg\/?>\d*\r?\n?(?:<\/seg>)?<\/tuv>\r?\n?<\/tu>\r?\n?
You can use it as a sample. Just change every “en-US” and “ru” to your source and target languages.
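The same idea can be tried outside Notepad++. The sketch below is a Python adaptation, not the poster's own code: it assumes the quantifier in the pattern is meant to be `\d*` and that there is a space in `<tuv xml:lang=`, and it builds the pattern from two language codes so they only need changing in one place. The names `tuv_pattern`, `EMPTY_OR_INTEGER_TU` and the sample data are invented for the example.

```python
import re

SRC_LANG = "en-US"  # change to your source language code
TGT_LANG = "ru"     # change to your target language code


def tuv_pattern(lang: str) -> str:
    """Pattern for one <tuv> whose <seg> is empty or holds only digits."""
    return (
        rf'<tuv xml:lang="{re.escape(lang)}"><seg/?>\d*\r?\n?'
        r"(?:</seg>)?</tuv>\r?\n?"
    )


# A whole <tu> in which both the source and target segments are empty or integers.
EMPTY_OR_INTEGER_TU = re.compile(
    r"<tu>\r?\n?" + tuv_pattern(SRC_LANG) + tuv_pattern(TGT_LANG) + r"</tu>\r?\n?"
)

# Made-up sample with one number-only unit and one real unit.
sample = (
    '<tu>\n<tuv xml:lang="en-US"><seg>42</seg></tuv>\n'
    '<tuv xml:lang="ru"><seg>42</seg></tuv>\n</tu>\n'
    '<tu>\n<tuv xml:lang="en-US"><seg>Hello</seg></tuv>\n'
    '<tuv xml:lang="ru"><seg>Привет</seg></tuv>\n</tu>\n'
)

# Deleting the matches leaves only the unit with real text.
cleaned = EMPTY_OR_INTEGER_TU.sub("", sample)
```

Like the Notepad++ pattern, this only matches bare `<tu>` and `<tuv>` elements laid out exactly this way. Exports that carry attributes on `<tu>` or extra child elements would need the pattern adjusted, or an XML parser instead of a regex.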
[Edited at 2018-10-18 11:09 GMT]