When working on an Apache Software Foundation (ASF) project, there are specific policies that must be followed. One of these policies is ensuring that all source files contain the appropriate license header.

The purpose of this article is to show an approach for enforcing the source file license header policy at an earlier stage of development. It does not focus on the wording or interpretation of the policy itself.

I am not a legal expert, and everything here is based on my interpretation of the ASF Source Header and Copyright Notice Policy. If you are a contributor or PMC member of an existing ASF project, or of a project in the incubation stage, and have legal or policy-related questions, I recommend contacting Apache Legal for further assistance.

The Questions

As developers who contribute to Apache projects, we ensure that the files across all of our repositories follow the license header policy before each release by using a tool called Apache Rat (Release Audit Tool).

Apache Rat is developed and maintained by the Apache Creadur project. It scans a project and reports files that are missing the required license headers. I would like to give a huge thanks and a big round of applause to the Apache Creadur project team for creating and maintaining this tool! It is an incredibly useful resource.

During the release process for our project, we run through a series of steps, from testing to verifying compliance with ASF policies.

Eventually, I started to question parts of the process.

Do all of these steps need to be executed during the creation of a release package, or could some be performed at an earlier stage in the development lifecycle?

Clearly, some steps should be run in every phase. However, this led me to additional questions:

Why not perform some of these checks during the development phase?

When a new file is created in a pull request (PR), shouldn’t it have already been audited?

If there are valid reasons to omit the ASF license header, shouldn’t they have been discussed within the PR? (These cases might require legal advice…)

The Solution

Considering the above questions, I concluded that it is more valuable to run the audits during the development phase, so that every file is vetted and none are missed. By doing so, we can catch issues before they reach the release stage. We can also have an open discussion with the PR’s author while they are still active and the change is fresh in their mind.

This is when I decided to create a reusable GitHub Action that runs Apache Rat.

If you are using GitHub Actions for your CI pipeline and would like to implement this workflow in your project, then let’s take a look at how to set it up:

Creating the Release Audit Workflow

  1. Create the workflow file in your GitHub repository.

    From your project’s root directory, run the following commands:

    mkdir -p .github/workflows
    touch .github/workflows/release-audit.yml
    

    Note: If your repository does not have existing workflows, you will need to create the .github/workflows directory to store them.

    Note: The filename is not fixed. For this guide, we use release-audit.yml, but you may rename it as preferred.

  2. Configure the Release Audit workflow.

    Add the following to .github/workflows/release-audit.yml:

    name: Release Auditing
    
    on:
      push:
        branches-ignore:
          - 'dependabot/**'
      pull_request:
        branches:
          - '*'
    
    jobs:
      test:
        name: Audit Licenses
        runs-on: ubuntu-latest
        steps:
          # Checkout project
          - uses: actions/checkout@v4
    
          # Check license headers
          - uses: erisu/apache-rat-action@555ae80334a535eb6c1f8920b121563a5a985a75
    

That’s it! But let’s break down what the above configuration does.

Breaking Down the Workflow Configuration

  1. Defines the workflow name as Release Auditing.

  2. Triggers the workflow on pull requests and commit pushes.

    Pushes to the dependabot/** branches are excluded, but Dependabot still opens an associated PR, and that PR triggers the workflow.

  3. Creates a single job, named Audit Licenses, which runs on an Ubuntu Linux runner (ubuntu-latest) and performs the following steps:

    1. Checks out the repository code.
    2. Runs the apache-rat-action, which executes Apache Rat.
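
One detail of the pull_request filter is worth noting: in GitHub Actions filter patterns, * does not match the / character, so '*' only covers PRs whose target branch name contains no slash. The sketch below is one possible variation, not part of the original workflow, that uses '**' so that PRs targeting a branch such as release/1.x (a hypothetical name) are audited as well:

```yaml
on:
  push:
    branches-ignore:
      - 'dependabot/**'
  pull_request:
    branches:
      # '**' also matches branch names containing '/', unlike '*'
      - '**'
```

Dropping the branches filter from pull_request entirely should have the same effect, since an unfiltered pull_request trigger runs for every target branch.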

For details on the apache-rat-action, you can view the source code here: erisu/apache-rat-action

As of writing, the current release of the GitHub Action, apache-rat-action@1.2.0, uses Apache Rat version 0.16.1.

You can configure the workflow to use any specific version of Apache Rat as long as it is available in the Apache Download CDN or Archives. To specify a version, use the rat-version argument in the with section:

        # Check license headers
        - uses: erisu/apache-rat-action@555ae80334a535eb6c1f8920b121563a5a985a75
          with:
            rat-version: 0.16.1

Bonus!

If you are working on a Node.js project and use various npm modules, you can enhance this Release Audit workflow by including the following GitHub Action, erisu/license-checker-action.

This is another Action that I created. It wraps the npm package license-checker-rseidelsohn, which is licensed under BSD-3-Clause. The package walks through the node_modules directory and checks each module’s package.json for an SPDX license identifier. If it fails to find one, it also attempts to detect the license from the following files: LICENSE, LICENCE, COPYING, and README.

To use this action, simply add the following steps to the Audit Licenses job:

      # Setup environment with node
      - uses: actions/setup-node@v4
        with:
          node-version: 20

      # Install node packages
      - name: npm install packages
        run: npm i

      # Check node package licenses
      - uses: erisu/license-checker-action@e929758f9416f30234ac454fc9054ca4b803871d
        with:
          license-config: 'licence_checker.yml'

The above steps will:

  1. Set up Node.js
  2. Install the project’s npm dependencies, which will be scanned.
  3. Run the license-checker-action with the licence_checker.yml configuration file.
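
Put together with the Apache Rat steps from earlier, the complete release-audit.yml might look like the sketch below. It only combines the snippets shown in this article; the step order (Apache Rat first, then the npm license check) is an assumption and can be rearranged:

```yaml
name: Release Auditing

on:
  push:
    branches-ignore:
      - 'dependabot/**'
  pull_request:
    branches:
      - '*'

jobs:
  test:
    name: Audit Licenses
    runs-on: ubuntu-latest
    steps:
      # Checkout project
      - uses: actions/checkout@v4

      # Check license headers
      - uses: erisu/apache-rat-action@555ae80334a535eb6c1f8920b121563a5a985a75

      # Setup environment with node
      - uses: actions/setup-node@v4
        with:
          node-version: 20

      # Install node packages
      - name: npm install packages
        run: npm i

      # Check node package licenses
      - uses: erisu/license-checker-action@e929758f9416f30234ac454fc9054ca4b803871d
        with:
          license-config: 'licence_checker.yml'
```

One reason to keep Apache Rat ahead of npm i is that node_modules presumably does not exist yet at that point, so the installed dependencies stay out of the header scan.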

The licence_checker.yml file is located in the project’s root directory and contains:

allowed-licenses:
  - 0BSD
  - AFL-3.0
  - Apache-1.1
  - Apache-2.0
  - APAFML
  - BlueOak-1.0.0
  - BSD-2-Clause
  - BSD-3-Clause
  - BSD-3-Clause-LBNL
  - BSL-1.0
  - CC-PDDC
  - CC0-1.0
  - EPICS
  - HPND
  - ICU
  - ISC
  - MIT
  - MIT-0
  - MS-PL
  - MulanPSL-2.0
  - NCSA
  - OGL-UK-3.0
  - PHP-3.01
  - PostgreSQL
  - PSF-2.0
  - SMLNJ
  - Unicode-DFS-2016
  - Unlicense
  - UPL-1.0
  - W3C
  - WTFPL
  - X11
  - Xnet
  - Zlib
  - ZPL-2.0

I compiled the above list of SPDX short identifiers from the third-party licenses currently allowed under the ASF’s “Category A: What can we include in an ASF project?” list.

If any npm packages need to be excluded from this check, you can list them under exclude-packages. For example:

exclude-packages:
  - spdx-exceptions@2.3.0
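
Both keys presumably live in the same licence_checker.yml file. A shortened sketch of how the file might look with the two combined (the allowed-licenses list is abbreviated here purely for illustration; use the full list above in practice):

```yaml
# licence_checker.yml (abbreviated sketch, not the complete allow list)
allowed-licenses:
  - Apache-2.0
  - BSD-3-Clause
  - MIT
  # ... remaining identifiers from the full list above

exclude-packages:
  # Entries use the name@version form shown in the example
  - spdx-exceptions@2.3.0
```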

Final Thoughts

After implementing these Actions and setting up a release audit workflow, all our repositories now validate files during development, ensuring:

  • All files have the proper license header.
  • All npm dependencies comply with the “ASF CATEGORY A” requirements.

Not only do we verify PRs, but we also cover direct individual commits. This early detection allows us to promptly address issues, ask questions, and discuss necessary changes within PRs or mailing list threads as soon as they arise.

P.S. Don’t forget to add the Apache license header to the YAML files!

# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied.  See the License for the
# specific language governing permissions and limitations
# under the License.

Resources: