Euro Commission multilingual TM for the Acquis Communautaire publicly accessible
Thread poster: Clive Phillips
MikeTrans
Germany
Local time: 01:38
Italian to German
@Farkas, Jun 8, 2012

Thanks, yes, I will cross-check with Olifant to see more clearly. My 2007 release will stay on my external HD until I'm sure. My guess is that all these files are new, a supplement to the 2007 release.

Greets,
Mike


 
Clifford Marcus
Local time: 00:38
Spanish to English
use Phi search instead of loading the TM Jul 4, 2012

I have been playing around with this as well, and likewise have a snowball's chance in hell of loading a 2-million-segment TM. What I do instead is use Phi search to look inside it for terms. It is still a little clunky, but it is faster than googling around to chase things up.
Am still thinking of other ways to use it, though.
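
For anyone without such a search tool, the same idea (searching the TM on disk instead of loading it into a CAT tool) can be sketched in a few lines of Python. This is only a minimal sketch under assumptions not stated in the thread: that the TM is a TMX file, that each tuv element carries its language in an xml:lang (or older lang) attribute, and that matching on the first two letters of the language code is good enough. The function name search_tmx and the sample data are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny made-up TMX fragment, used only to demonstrate the function.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="EN-GB"><seg>industrial property rights</seg></tuv>
<tuv xml:lang="DE-DE"><seg>gewerbliche Schutzrechte</seg></tuv></tu>
<tu><tuv xml:lang="EN-GB"><seg>Subject matter</seg></tuv>
<tuv xml:lang="DE-DE"><seg>Gegenstand</seg></tuv></tu>
</body></tmx>"""


def search_tmx(source, term, src_lang, tgt_lang):
    """Yield (source_text, target_text) pairs whose source segment contains term.

    source may be a file name or a binary file object. The file is parsed
    as a stream, one translation unit at a time, so it is never held in
    memory as a whole.
    """
    term = term.lower()
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        segs = {}
        for tuv in elem.iter("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Key on the two-letter prefix, e.g. "en-gb" -> "en".
                segs[lang[:2]] = "".join(seg.itertext())
        src = segs.get(src_lang)
        tgt = segs.get(tgt_lang)
        if src and tgt and term in src.lower():
            yield src, tgt
        elem.clear()  # drop the contents of the unit just processed


if __name__ == "__main__":
    for src, tgt in search_tmx(io.BytesIO(SAMPLE), "property", "en", "de"):
        print(src, "=>", tgt)
```

For a real file, a path can be passed in place of the in-memory sample. The match is a plain case-insensitive substring test, so it finds terms but not fuzzy matches.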


 
Tomas Mosler, DipTrans IoLET MCIL MITI  Identity Verified
Czech Republic
Local time: 01:38
Member (2008)
English to Czech
SITE LOCALIZER
legal re-use Oct 6, 2012

What about the legal aspects of re-using the EU TM?

"Any re-use of the database or of the structured elements contained in it is required to be identified by the re-user, who is under an obligation to state the source of the documents used: the website address, the date of the latest update and the fact that the European Commission retains ownership of the data."
http://langtech.jrc.it/DGT-TM.html#Usage-Conditions

If you use any part of the DGT TM for commercial translation or any other publishing purposes, how do you adhere to this rule? I mean, you probably cannot add a footer saying "based on DGT-TM" to a translated paragraph intended for a glossy brochure. So what are the ways, if any? Thanks.


 
FarkasAndras  Identity Verified
Local time: 01:38
English to Hungarian
Re-use Oct 6, 2012

I don't think that translating with this TM would be classified as re-use. If you were to publish the sentence pairs extracted from the DGT-TM in another format (say, in a searchable spreadsheet available via an online interface), that would be another matter.

 
Tomas Mosler, DipTrans IoLET MCIL MITI  Identity Verified
Czech Republic
Local time: 01:38
Member (2008)
English to Czech
SITE LOCALIZER
re-use Oct 6, 2012

Well they have rules for it:

COMMISSION DECISION
of 12 December 2011
on the reuse of Commission documents
(2011/833/EU)

http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2011:330:0039:0042:EN:PDF

Article 1
Subject matter
This Decision determines the conditions for the reuse of documents held by the Commission or on its behalf by the Publications Office of the European Union (the Publications Office) [...]

Article 2
Scope
1. This Decision applies to public documents produced by the Commission or by public and private entities on its behalf:
(a) which have been published by the Commission or by the Publications Office on its behalf through publications, websites or dissemination tools [...]

Article 3
Definitions
For the purposes of this Decision, the following definitions shall apply:
(1) ‘document’ means:
(a) any content whatever its medium (written on paper or stored in electronic form or as a sound, visual or audiovisual recording);
(b) any part of such content;
(2) ‘reuse’ means the use of documents by persons or legal entities of documents, for commercial or non-commercial purposes other than the initial purpose for which the documents were produced. The exchange of documents between the Commission and other public sector bodies which use these documents purely in the pursuit of their public tasks does not constitute reuse.


So, strictly speaking, if I re-use an 80% fuzzy match (that being "part of any content"; isn't that quite a bullet-proof definition?) in my translation, how am I legally entitled to do so without the re-use identification mentioned above? Thanks.

(PS I can imagine someone would say "no one cares", but I would not take that as an answer to my question. :) )


 
FarkasAndras  Identity Verified
Local time: 01:38
English to Hungarian
produced by the Commission Oct 6, 2012

Tomas Mosler, MITI wrote:

Well they have rules for it:

Article 2
Scope
1. This Decision applies to public documents produced by the Commission or by public and private entities on its behalf:


Well, I'm not sure this Decision applies to the DGT-TM. It probably applies to background documents, feasibility studies and whatnot, not to legislation. The material in the DGT-TM is legislation, produced by the Commission, the Council and Parliament in various permutations through a variety of procedures. Some of the stuff in there was produced by the Commission alone, but I don't think they will apply different rules based on provenance.


 
septima
Local time: 01:38
Usage Conditions Oct 6, 2012

Usage conditions for the Acquis are online at http://langtech.jrc.it/JRC-Acquis.html (see below, it's a slightly different link from yours)

But maybe you should specify what this hypothetical usage scenario of yours is, Tomáš. I think the chances of you pulling up an 80% Acquis hit from a non-EU law text are pretty "slim". And if it was one of those terrible Trados matches which, although supposedly "80% the same", actually bears no resemblance apart from "the" and "if", then by the time you edit it into shape there won't be much unlicensed use left.

I guess you're just whiling away a rainy Saturday night with this thought experiment...

4) Usage conditions / Licensing issues
Acquis Communautaire corpus

According to an agreement with the European Commission's Office for Official Publications OPOCE, the AC corpus can be used and distributed for research purposes, but the following usage conditions must be adhered to:

The European Communities consider legislative and quasi-legislative documents published in the Official Journal of the European Union and related COM and SEC series as well as charters and treaties and ECJ case-law to be in the public domain. Prior written permission is thus not required for their reproduction/translation, and they may be reproduced/translated freely without restriction, including for the purpose of further non-commercial dissemination to final users, subject to the condition that appropriate acknowledgement is given to the European Communities and to the source, and provided that the additional guidelines set out below are respected.

Whenever a document is reproduced verbatim from a source other than the printed version of the Official Journal of the European Union, a prominently positioned disclaimer should read: 'Only European Community legislation printed in the paper edition of the Official Journal of the European Union is deemed authentic.'

For the reasons stated in the disclaimer above, it is advisable to ensure that translations are made from the printed, authentic version of the Official Journal. This precaution, while minimizing the risk of error, does not confer any legal status whatsoever to the translated text. The following notice shall accompany the translated text, printed below the acknowledgement: 'Originally published in the official languages of the European Union in the Official Journal of the European Union by the Office for Official Publications of the European Communities. Responsibility for the translation into [specify language] from the original [specify language] edition lies entirely with [name of translation copyright holder].' Moreover, please note that we do not consider a "further commercial dissemination" the inclusion, as reference material for consultation purposes, of small amounts of relevant legislative texts in articles/thesis/studies/reports/books issued by third-party authors or publishers, whatever the means, and disseminated subject to payment.
Eurovoc thesaurus

Unlike the AC corpus, the EuroVoc Thesaurus must not be used or disseminated without prior written permission from the European Commission's Office for Official Publications OPOCE. If you want to get the rights to use Eurovoc and to receive a copy of the multilingual thesaurus, please contact OPOCE at [email protected], mentioning the file reference number 2005-COP-395. To our knowledge, the licence is free of charge for research purposes. For a commercial licence, please contact OPOCE.


 
Tomas Mosler, DipTrans IoLET MCIL MITI  Identity Verified
Czech Republic
Local time: 01:38
Member (2008)
English to Czech
SITE LOCALIZER
DGT Oct 8, 2012

FarkasAndras wrote:
Well they have rules for it:

Article 2
Scope
1. This Decision applies to public documents produced by the Commission or by public and private entities on its behalf:


Well, I'm not sure this Decision applies to the DGT-TM. It probably applies to background documents, feasibility studies and whatnot, not to legislation. The material in the DGT-TM is legislation, produced by the Commission, the Council and Parliament in various permutations through a variety of procedures. Some of the stuff in there was produced by the Commission alone, but I don't think they will apply different rules based on provenance.

I understand that permutations arise along the procedural way, but the translation of the final texts (directives etc.) is still provided by entities on behalf of the Commission ( = translators/agencies hired by DGT), and as such the scope would apply. Am I wrong? Fine - why?

(Also:
"The DGT-TM database is the exclusive property of the European Commission."
http://langtech.jrc.it/DGT-TM.html )

Septima, I'm talking about DGT TM, not about Acquis (apart from that, the usage/license for Acquis you posted seems to be even more strict as it allows only research use).

The hypothetical scenario is quite simple: one can have either a) a classic legal/REACH text or b) anything that is not specifically legal (or specifically EU-related) but can still yield a nice match (in both cases, for example, just by replacing or adding the company's name).

For example only the earlier mentioned norm
http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2011:330:0039:0042:EN:PDF
itself contains several generic, yet potentially useful segments.

PS If you don't take my interest in the legal aspects of DGT TM usage as genuine, fine, but if you don't mind, I prefer other methods of communication than speculation on somebody else's behalf. :) I'm not sure it was raining here anyway. :)


 
FarkasAndras  Identity Verified
Local time: 01:38
English to Hungarian
Produced by the Commission Oct 9, 2012

Tomas Mosler, MITI wrote:

FarkasAndras wrote:
Well they have rules for it:

Article 2
Scope
1. This Decision applies to public documents produced by the Commission or by public and private entities on its behalf:


Well, I'm not sure this Decision applies to the DGT-TM. It probably applies to background documents, feasibility studies and whatnot, not to legislation. The material in the DGT-TM is legislation, produced by the Commission, the Council and Parliament in various permutations through a variety of procedures. Some of the stuff in there was produced by the Commission alone, but I don't think they will apply different rules based on provenance.

I understand that permutations arise along the procedural way, but the translation of the final texts (directives etc.) is still provided by entities on behalf of the Commission ( = translators/agencies hired by DGT), and as such the scope would apply. Am I wrong? Fine - why?


I think you are (wrong, that is). "Produced by the Commission" is not the same as "Translated by the Commission". I don't think the Commission claims IP rights on all documents it translates. Furthermore, I'm not sure the Commission translates all legislation. The EP has its own translation unit, and there's also the Translation Centre for the Bodies of the European Union. I'm not sure how work is distributed.

This is all academic of course. The DGT-TM is a rather special document set, so the related IP issues are surely governed by the terms of usage page you linked. If it refers to the Decision, then that's that.
I still don't think anyone at the Commission would classify use of the TM as re-use, though, or complain about the lack of attribution.

[Edited at 2012-10-09 16:07 GMT]


 
Tomas Mosler, DipTrans IoLET MCIL MITI  Identity Verified
Czech Republic
Local time: 01:38
Member (2008)
English to Czech
SITE LOCALIZER
DGT Oct 9, 2012

"Produced by the Commission" is not the same as "Translated by the Commission".

Of course, but what about produced on behalf of the Commission? If I translate a text for DGT, the translation is my "product" for the Commission.
(NB DGT is a unit of the Commission, and the DGT TM is the property of the Commission, so I'm not sure the argument about the separate EP/EU translation units can stand in the sole context of DGT.)

I don't think the Commission claims IP rights on all documents it translates.

Not exactly, but still they have their conditions.

I still don't think anyone at the Commission would classify use of the TM as re-use

"‘reuse’ means the use of documents by persons or legal entities of documents [NB this is defined as any part of any content], for commercial or non-commercial purposes other than the initial purpose for which the documents were produced."
Said directly by the Commission -
http://eur-lex.europa.eu/LexUriServ/LexUriServ.do?uri=OJ:L:2011:330:0039:0042:EN:PDF

IMHO the core question is if/why TM segments are (not) considered parts of any content.
(I'm waiting for septima's updated post, but the original argument that "content" doesn't cover words or punctuation didn't exactly click for me.)

or complain about the lack of attribution.

Sure, but IMHO "not complaining" doesn't yet mean "approving".

[Edited at 2012-10-09 20:03 GMT]


 
septima
Local time: 01:38
a few points Oct 10, 2012

Tomas, I think your interpretation of what constitutes unlicensed re-use of the DGT TM could be successfully challenged from many different angles, each probably sufficient on its own, and cumulatively constituting a complete argument against it. Here are some examples:

1) that the content in this case can only mean the database itself and its structured elements (paired language variants of EU law provisions), as that is what is the unique property of DG; the re-use condition therefore refers to, for example, sites like MyMemory or Glosbe which reproduce these structured elements from the database; this also makes sense of the conditions specified elsewhere about not altering the content, which would be rendered meaningless by your interpretation;

2) that these structured elements cannot be broken down below the level of segments (paired segments); the law consists of its unaltered provisions construed in the context of the whole law and defined by their meaning and effect; if you alter the text of a provision by changing the language or breaking it up in ways that divorce it from its original sense, that text ceases to be a provision of law and thus a sub-component of the structured elements of this database; if this weren't so, then any cutting and pasting of snippets of law out of context would constitute citation - this is, thankfully, not true;

3) that DG Translation does not actually have the power to specify conditions in relation to the use of the texts themselves; the Commission or a Member State, in respect of its own jurisdiction, could, after all, decide at any time that the text of its law is in the public domain; decisions of the Commission in particular are directly superior to any conditions imposed by DG Translation, except, perhaps, insofar as DGT has arranged the material in a specific way;

4) that interpreting the condition as applying to fragmented phrases of the texts themselves would, in any case, "unnecessarily restrict" the re-use of the content in the sense expressly prohibited by the Commission Decision and the governing Directive; this is a clear and established principle of legal citation: whole provisions are referenced explicitly in a text, but no explicit reference need be made every time a term or a fragment of a provision is used in, say, the middle of a sentence, even when italicized or given in quotes or marked otherwise;

Overall, it's clear that there is no problem with re-use per se, DGT has simply specified a condition that is "necessarily restrictive" in order to protect its rights to its property and require acknowledgement if that property is re-used; neither the provisions of law themselves, nor the words of the languages, nor (to drive the point home) the 0s and 1s in the binary code making up the electronic file are the property of DGT, and these cannot therefore be the subject of DGT's conditions. When a translator identifies a translated provision or fragment of a provision from the database, removes the text from its context of a structured segment in the database and inserts the isolated text into his/her translation, this does not constitute re-use of the DGT TM content, as in no way would it be possible to determine after the fact that the text belonged in any unique way to the database or could only have been drawn from it, applying "same or substantially similar" arguments; what does constitute unique content are the structured elements of paired, translated segments, and these should not be re-used without due acknowledgement.


 
Tomas Mosler, DipTrans IoLET MCIL MITI  Identity Verified
Czech Republic
Local time: 01:38
Member (2008)
English to Czech
SITE LOCALIZER
challenge Oct 10, 2012

Re your challenge:

1) that the content in this case can only mean the database itself and its structured elements (paired language variants of EU law provisions), as that is what is the unique property of DG

Again (Decision): "any part of such [any] content". Not "in case of DGT, only structured elements from the database".

the re-use condition therefore refers to, for example, sites like MyMemory or Glosbe which reproduce these structured elements from the database

I'm not sure why a private TM "database" should be any different.

2) that these structured elements cannot be broken down below the level of segments (paired segments)

Well, I can break a 10-word segment into new segments quite easily. Apart from that, there may be no need to break up a segment in order to successfully re-use it.

the law consists of its unaltered provisions construed in the context of the whole law and defined by their meaning and effect

I'm not sure what the point of the law reference is. The translator has a reference TM and has no idea whether the given segment (match) is law or anything else. (He may guess so given the source collection, but that is about it.) A sentence like "‘structured data’ is data organised in a way that allows reliable identification of individual statements of fact and all their components, as exemplified in databases and spreadsheets;" (taken from the mentioned Decision) could be seen as a normal data(base) definition in an article in any IT magazine.

if you alter the text of a provision by changing the language or breaking it up in ways that divorce it from its original sense, that text ceases to be a provision of law and thus a sub-component of the structured elements of this database

Again, I'm not sure what the (absent) status of a provision of law has to do with this. I see it as a collection of segments, if it makes legal sense or not is another thing.

if this weren't so, then any cutting and pasting of snippets of law out of context would constitute citation - this is, thankfully, not true

Yes, but AFAIK when you intentionally refer to law you usually mention it (even if by the way). Plus see below at 4).

3) that DG Translation does not actually have the power to specify conditions in relation to the use of the texts themselves

I never said they do.

the Commission or a Member State, in respect of its own jurisdiction, could, after all, decide at any time that the text of its law is in the public domain; decisions of the Commission in particular are directly superior to any conditions imposed by DG Translation

No problem with that, I quoted the Decision in this matter.

4) that interpreting the condition as applying to fragmented phrases of the texts themselves would, in any case, "unnecessarily restrict" the re-use of the content in the sense expressly prohibited by the Commission Decision and the governing Directive; this is a clear and established principle of legal citation: whole provisions are referenced explicitly in a text, but no explicit reference need be made every time a term or a fragment of a provision is used in, say, the middle of a sentence, even when italicized or given in quotes or marked otherwise;

In the public legal system, fine, but does this work for commercial use?

When a translator identifies a translated provision or fragment of a provision from the database, removes the text from its context of a structured segment in the database and inserts the isolated text into his/her translation, this does not constitute re-use of the DGT TM content

But didn't you yourself mention structured elements (the same as a fragment of a provision, perhaps?) as the lowest level of database (content)? Or does "remove the text" mean editing a single segment (as presented in CAT)? I'm now a bit confused about what (from the CAT segment point of view) you mean by "fragment", "structured segment" and "isolated text"; can you explain? Thanks.

If I re-use e.g. "to software or to documents covered by industrial property rights such as patents, trademarks, registered designs, logos and names" from the quoted Decision as a 100% match (given the layout in PDF it looks to me like a stand-alone segment), is that a structured segment/element or not?

as in no way would it be possible to determine after the fact that the text belonged in any unique way to the database or could only have been drawn from it

Even if I use only the part "industrial property rights such as patents, trademarks, registered designs, logos and names" - taken out of the italics text above - I still get only EU links on Google (plus, funnily enough, glosbe etc.). Apart from that, a limited (no) possibility of reverse tracking IMHO doesn't justify (nor prove) anything as such. I'm interested in what one can(not) do, not only in what one can be "seen" doing.

what does constitute unique content are the structured elements of paired, translated segments, and these should not be re-used without due acknowledgement.

Is the text in italics above a structured element of a paired segment?


 
septima
Local time: 01:38
clarification Oct 11, 2012

@Tomas,

It's important to make a distinction between two things here (which, I admit, are blurred by the way DGT/the Commission has set out this condition).

First, look back at the wording of the Decision, derived from the PSI Directive:

"Article 6
Conditions for reuse of documents
1. Documents shall be made available for reuse without application unless otherwise specified and without restrictions or, where appropriate, [with] an open licence or disclaimer setting out conditions explaining the rights of reusers.
2. Those conditions, which shall not unnecessarily restrict possibilities for reuse, may include the following:

(a) the obligation for the reuser to acknowledge the source of the documents;"

Re-use must mean reproduction, distribution etc. in whole or in part and in some shape or form. This is without restriction unless an open licence has been specified, which may set out a (not unnecessarily restrictive) condition requiring acknowledgment.

So re-use = reproduction/distribution, and, as far as I am aware, the Commission has not exercised its right to specify conditions on the re-use of EU law docs under an open licence. So, in fact, it seems you can even grab a piece of a text of EU law and use it without restriction.

Now, moving on, the DGT TM (really a database within the meaning of the Database Directive) is then included as a document under this definition, and in this case DGT decides that it will specify a condition on "re-use". But what does re-use mean in the case of a database? Again, it means reproduction/distribution of the database (or its structured elements = the paired segments + maybe the coding). It does not relate to the underlying documents, as this would be in contradiction of the fact that the Commission has specified no restrictive conditions on their re-use.

So apply that to the case in hand. The translator lawfully "extracts from" (but does not "re-use" = reproduce) the database in order to identify corresponding text fragments. Having found one, he/she actually does re-use something, but that something is a fragment of content on which no such condition has been imposed (as far as I am aware).

To summarise, we have this process:
1) Author re-uses text in document (unrestricted)
2) Translator "extracts" text from database (but does not "re-use" = reproduce the database)
3) Translator re-uses corresponding text in document (unrestricted)

A little more clarity appears if you note Recital 24 of the Directive:

"This Directive is without prejudice to [...] Directive 96/9/EC [...] on the legal protection of databases"

The provisions of the Database Directive therefore stand, notwithstanding what is specified by 2003/98/EC etc.

Now, Article 8 of the Database Directive reads as follows:

"Rights and obligations of lawful users

1. The maker of a database which is made available to the public in whatever manner may not prevent a lawful user of the database from extracting and/or re-utilizing insubstantial parts of its contents, evaluated qualitatively and/or quantitatively, for any purposes whatsoever. Where the lawful user is authorized to extract and/or re-utilize only part of the database, this paragraph shall apply only to that part."

The Database Directive is also more explicit on what this "extracting" and "utilizing" means:

"Article 7
2. For the purposes of this Chapter [i.e. Chapter III, under which Article 8 also falls]:

(a) 'extraction' shall mean the permanent or temporary transfer of all or a substantial part of the contents of a database to another medium by any means or in any form;

(b) 're-utilization' shall mean any form of making available to the public all or a substantial part of the contents of a database by the distribution of copies, by renting, by on-line or other forms of transmission. The first sale of a copy of a database within the Community by the rightholder or with his consent shall exhaust the right to control resale of that copy within the Community;"

Here, (b) actually gives you more of an idea of what "re-use" includes (it also includes various value-added ways of combining and developing data etc).

So, to finally sum up (before I disappear entirely), in my view, the translator's usage of the DGT TM involves permitted extraction of insubstantial amounts of text, followed by unrestricted re-use of the content of the underlying documents (pasting into the text etc.). What it does not involve is re-use of the DGT TM within the meaning of the DGT TM condition - as I said before, this would consist of reproducing or distributing the database or any part of its structured elements (content).

Anyways, thanks for drawing me into your interesting thought experiment. Probably you'll remain unconvinced, and who knows, maybe I'm wrong. But the real world calls.

Ahoj!

s


 

 