A free term extraction tool? Thread poster: eileenPL
| eileenPL Local time: 17:03 Polish to English
Hi all, could anyone recommend a free extraction tool for terms and phrases that can be used offline? Regards, E.
[Edited at 2018-01-28 21:21 GMT] | | | MuMyongCy (X) South Korea Local time: 00:03 English to Korean | Danesh Local time: 18:33 English to Persian (Farsi) Okapi Rainbow | Jan 29, 2018 |
Rainbow is a GUI application to launch various utilities related to translation and localization tasks, such as: text extraction (to XLIFF, OmegaT projects, RTF, etc.) and merging, pre-translation, encoding conversion, term extraction, file format conversions, quality verification, translation comparison, search and replace on filtered text, pseudo-translation, and much more. Using the framework's pipeline mechanism, you can use Rainbow to create chains of steps that perform a custom set of tasks specific to your needs. http://okapiframework.org/wiki/index.php?title=Rainbow | | | Samuel Murray Netherlands Local time: 17:03 Member (2006) English to Afrikaans + ...
eileenPL wrote: Could anyone recommend a free extraction tool for terms and phrases...? From what data sources will you be extracting? | |
In addition to Okapi... | Jan 29, 2018 |
you can also use the old free version of Xbench. But the tool I like most is AntConc; it's not exactly a term extraction tool, but the word list and, particularly, the "keyness" lists are great for this purpose, and the video tutorials are easy to follow for newcomers to corpus analysis. For specialized texts, I find that having a baseline like the Brown corpus is invaluable! | | | eileenPL Local time: 17:03 Polish to English TOPIC STARTER
Samuel Murray wrote: eileenPL wrote: Could anyone recommend a free extraction tool for terms and phrases...? From what data sources will you be extracting? I'd like to extract phrases from .doc files. I'm a Trados user, but when I think of manually copying terms from files into the translation memory... it'd take ages! Thanks for all the replies!
[Edited at 2018-01-29 20:07 GMT] | | | MuMyongCy (X) South Korea Local time: 00:03 English to Korean
eileenPL wrote: I'd like to extract phrases from .doc files. I'm a Trados user, but ... It works well with SDL Trados Studio project files, which means you can use it for all the file formats that SDL Trados Studio accepts. Regards | | | You could try my PhraseMiner tool: | Jan 30, 2018 |
eileenPL wrote: Could anyone recommend a free extraction tool for terms and phrases...? http://asap-traduction.com/PhraseMiner From a Word document, it extracts "internal" or "intra-document" fuzzy matches, sentences containing 5 or more common consecutive words, sentences that are subsegments of longer sentences, terms of two or more words that appear two or more times (filtered using stop-word lists), sentences containing two or more of such extracted terms, etc. David Turner | |
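For readers who want to experiment offline, the idea of "terms of two or more words appearing two or more times, using stop-word lists" can be sketched in a few lines of Python. This is only a rough illustration and not how PhraseMiner (or any other tool mentioned in this thread) actually works: the function name `extract_terms`, its parameters, and the tiny stop-word list are all made up for the example, and the input is assumed to be plain text already saved out of the .doc file.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are far longer
# and language-specific).
STOP_WORDS = {
    "a", "an", "the", "of", "to", "in", "on", "and", "or", "is", "are", "be",
    "for", "with", "that", "which", "will", "it", "this", "by", "as", "at",
}


def _count_ngrams(chunk, counts, min_words, max_words):
    """Count every run of min_words..max_words consecutive tokens in chunk."""
    for n in range(min_words, max_words + 1):
        for i in range(len(chunk) - n + 1):
            counts[" ".join(chunk[i:i + n])] += 1


def extract_terms(text, min_words=2, max_words=4, min_count=2):
    """Return {phrase: count} for repeated multi-word phrases without stop words."""
    counts = Counter()
    # Split on sentence punctuation so a phrase never spans two sentences.
    for sentence in re.split(r"[.!?;:\n]+", text.lower()):
        # Unicode letters only, keeping inner hyphens/apostrophes (e.g. "pre-translation").
        tokens = re.findall(r"[^\W\d_]+(?:[-'][^\W\d_]+)*", sentence)
        chunk = []
        # A stop word (or the end of the sentence) closes the current chunk.
        for tok in tokens + [None]:
            if tok is None or tok in STOP_WORDS:
                _count_ngrams(chunk, counts, min_words, max_words)
                chunk = []
            else:
                chunk.append(tok)
    # Keep only phrases that occur at least min_count times.
    return {phrase: c for phrase, c in counts.items() if c >= min_count}


if __name__ == "__main__":
    sample = (
        "The translation memory stores segments. A translation memory is reused. "
        "Term extraction helps. Term extraction is fast."
    )
    for phrase, count in sorted(extract_terms(sample).items()):
        print(count, phrase)
```

The resulting candidate list still needs manual review before it goes into a termbase; raising `min_count` or extending `STOP_WORDS` is the simplest way to cut down noise.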
Arianne Farah Canada Local time: 11:03 Member (2008) English to French There's a new free app in the SDL store | Jan 30, 2018 |
It works quite well. It takes a little tweaking at first to remove common words, but you can save those custom exclusion dictionaries and reuse them, so you'll only have to exclude 'that', 'which', 'will', 'allow', etc. once. You set your minimum word length and the minimum number of times a word is repeated, and it'll generate a cloud of words that you can then whittle down before creating an sdlxliff from it to populate a glossary. It sounds like a lot of steps, but it's quite painless. http://appstore.sdl.com/language/app/projecttermextract/817/ | | | MuMyongCy (X) South Korea Local time: 00:03 English to Korean functionality limited. | Jan 31, 2018 |
It cannot extract multi-word terms (2-word or 3-word phrases). It sees only single words.
[Edited at 2018-01-31 10:18 GMT] | | | 'Bilingual' Term Extraction? - Rainbow | May 1, 2018 |
Does Rainbow do 'Bilingual' Term Extraction? It doesn't appear that it does. Just source term extraction. | | | DZiW (X) Ukraine English to Russian + ... https://www.wordfast.net/wiki/PlusTools | May 1, 2018 |
Besides AntConc, for .doc files I prefer the free and flexible PlusTools, although it requires filtering of weighted words and collocations too. David, it's not very convenient to request software via email, but does it support Russian (Unicode/UTF)? TY | |