Can Copyrighted Content be Used for Machine Learning?

In December 2022, the Ministry of Justice published an opinion on whether Machine Learning enterprises can use copyrighted material to train Artificial Intelligence systems without permission.

Machine Learning (ML) refers to a computer’s ability to learn inductively, on its own, from databases (including text documents, sound, video, and so on) as part of developing the technological infrastructure for Artificial Intelligence systems. Because machine training requires very large volumes of content, it almost always involves content protected by third-party copyrights. According to the Ministry of Justice, the lack of clarity about the legality of using this content has created legal barriers to effective Machine Learning.

The published opinion reviews global trends and presents the Ministry of Justice’s principled position: as a general rule, and subject to exceptional cases, the use of copyright-protected content for machine training is permissible even without the approval of the rights-owners, by virtue of the “fair use” exception established in the Copyright Law, 2007.

An explanation follows:

According to the Copyright Law, copyright in a work is the exclusive right to take any action on the work, or on a substantial part of it. As a result, a work created by another person generally may not be used without that person’s consent.

The law provides an exception: a work may be used without its owner’s permission if the use constitutes “fair use.” Fair use is permitted by law for purposes such as “self-study, research, or criticism.” In addition, the use itself must satisfy various “fairness” tests, which take into account, among other things, the purpose and nature of the use, the nature of the work used, the scope of the use, and the effect of the use on the value of the work.

The question has been raised as to whether the use of copyrighted content for the purposes of Machine Learning falls under the “fair use” exception, which allows the use of content without obtaining permission from its creators under certain conditions.

The Ministry of Justice believes that Machine Learning should be considered “self-study” or “research” and that, in many cases, the use itself will be “fair,” because Machine Learning is typically a transformative process that does not affect the value of the work. For example, an autonomous vehicle’s driving system may “learn” from films that a pedestrian who disappears behind a car will reappear from behind it after a few seconds. This learning has no effect on the value of the films themselves; it uses them for a different purpose.

The opinion expressly excludes from its scope ML datasets composed solely of works created by a single author and used in order to compete with that author in existing markets. Such use will not be considered “fair use” and will require permission from the rights-holders.

Furthermore, the opinion presents the approaches of various countries to Machine Learning and to the related issue of copyright infringement. Some of the world’s dominant legal systems have established a specific legal arrangement for this issue, while others rely on general exceptions in the law, such as the “fair use” exception.

The opinion notes, for example, that the United Kingdom, Japan, Singapore, and the European Union have established an exception known as Text and Data Mining (TDM), which was intended to regulate the automatic collection and analysis of information in the digital space and has been applied to the creation of databases for Machine Learning. In contrast, the United States has refrained from creating a specific exception and instead allows the content of rights-holders to be used only in limited circumstances, under the “fair use” exception.

Regarding databases whose terms of use expressly prohibit the use of their contents, the opinion states that, to the extent these clauses are recognized as an “unduly disadvantageous provision in a standard contract,” they cannot prevent the use of the information for Machine Learning purposes. Where the contract is not a standard contract but one negotiated between the parties, it may be possible to determine that, under certain circumstances, a user’s waiver of the exception provided by law is justified. The opinion, however, declined to give a definitive answer to the question of whether the fair use arrangement stipulated by law can be made subject to contractual conditions.

It should be noted that the opinion addresses only the learning process carried out by the machine. It does not address the legality of the output of Artificial Intelligence systems based on such Machine Learning; the question of whether that output infringes the creators’ rights will be decided in accordance with the ordinary rules of the law.

This article is provided for general information only. It is not intended as legal advice or opinion and cannot be relied upon as such. Advice on specific matters may be provided by our group’s attorneys.
