Version: 1.0
The Apache-2.0 license and the Apache Individual Contributor License Agreement both remind contributors that they are responsible for disclosing any copyrighted materials in submitted contributions that are not their original creation. This is as true when using generative AI tooling as it is when using materials from public websites or code from other open-source projects.
When disclosing these materials, contributors should also identify the licensing for these materials. The ASF maintains a 3rd Party Licensing Policy that provides guidance on which licenses are acceptable, along with instructions on the treatment of 3rd Party Works.
In general, content generated by a non-human (e.g., a machine or a monkey) is not copyrightable. However, if content consists of some portions generated by AI and other portions authored by a human, the human-authored portions may be copyrightable.
As explained by the following U.S. Copyright Office Registration Guidance (3/16/2023):
“For example, a human may select or arrange AI-generated material in a sufficiently creative way that ‘the resulting work as a whole constitutes an original work of authorship.’ Or an artist may modify material originally generated by AI technology to such a degree that the modifications meet the standard for copyright protection. In these cases, copyright will only protect the human-authored aspects of the work, which are ‘independent of’ and do ‘not affect’ the copyright status of the AI-generated material itself.”
These human-authored portions may simply come from the prompt the human provided or from subsequent changes they make. However, a prominent concern with generative AI is the risk that a tool reproduces portions of the materials it was trained on, some of which may be copyrightable subject matter. Thus, a recommended practice when using generative AI tooling is to use tools with features that identify any included content that is similar to parts of the tool’s training data, as well as the license of that content.
Given the above, code generated in whole or in part using AI can be contributed if the contributor ensures that:
When providing contributions authored using generative AI tooling, a recommended practice is for contributors to indicate the tooling used to create the contribution. This should be included as a token in the source control commit message, for example including the phrase “Generated-by:
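As an illustration of the commit-message token described above, the following is a minimal sketch, not an ASF-provided tool. It assumes the token is written as a Git-style trailer line of the form `Generated-by: <tool name>`; the exact value format is an assumption, since this document only gives the `Generated-by:` prefix. The function name `add_generated_by` and the tool name `ExampleTool` are hypothetical.

```python
import re

# Key taken from the guidance above; the value format after the colon is an assumption.
TRAILER_KEY = "Generated-by"

# Loose pattern for a trailer-style line such as "Signed-off-by: Name <email>".
_TRAILER_LINE = re.compile(r"^[A-Za-z][A-Za-z0-9-]*: \S")


def add_generated_by(message: str, tool_name: str) -> str:
    """Return the commit message with a 'Generated-by: <tool_name>' line appended.

    The line is added only once. It joins an existing trailer block if the
    message already ends with one; otherwise it starts a new paragraph.
    """
    trailer = f"{TRAILER_KEY}: {tool_name}"
    lines = message.rstrip().splitlines()

    # Already tagged with this tool: return the normalized message unchanged.
    if any(line.strip() == trailer for line in lines):
        return "\n".join(lines) + "\n"

    body = "\n".join(lines)
    if not body:
        return trailer + "\n"

    # Collect the final paragraph (everything after the last blank line).
    last_paragraph = []
    for line in reversed(lines):
        if not line.strip():
            break
        last_paragraph.append(line)

    # Treat it as a trailer block only if it is separate from the subject line
    # and every line in it looks like "Key: value".
    in_trailer_block = len(lines) > len(last_paragraph) and all(
        _TRAILER_LINE.match(line) for line in last_paragraph
    )

    separator = "\n" if in_trailer_block else "\n\n"
    return body + separator + trailer + "\n"


if __name__ == "__main__":
    print(add_generated_by("Fix parser edge case\n\nHandle empty input.\n", "ExampleTool"))
```

A helper like this could be called from a commit-message hook or a wrapper script. Git's own `git interpret-trailers` command offers similar functionality and may be preferable where it is available.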
Finally, please note that while the above seems like a reasonable set of guidelines in June 2023, this is a rapidly evolving area. Whatever we recommend to PMCs today, policies will need to be re-evaluated and updated in response to:
We will continue communicating with PMC and ASF members as updates to this FAQ get discussed and merged in.
The above text applies to documentation as well. Take care with tools that place restrictive licensing on the generated content, and make sure any such content complies with the 3rd Party Licensing Policy and the guidance on 3rd Party Works.
As with documentation, the above principles still apply. Because images are a non-textual form, however, the details quickly become complex, and we expect this to continue to be a rapidly evolving area.
Don't second-guess a vendor's terms of use (TOU). Your usage of their tools is bound by the totality of the given TOU, and you are not expected to go outside of the TOU text for further clarification.
Refer to the 3rd Party Licensing Policy as with any other contribution.
It is not in the interest of the ASF to tell developers what tools to use. You may use whatever tools you wish provided that you follow the guidance in this document.