Skip to content

Latest commit

 

History

History
168 lines (105 loc) · 9.66 KB

CONTRIBUTING.md

File metadata and controls

168 lines (105 loc) · 9.66 KB

First of all, thanks for taking the time to contribute! 🎉👍

Table of Contents


Workflow

Pull requests

We follow the GitHub Flow: all code contributions are submitted via a pull request towards the main branch.

Opening a pull request means you want that code to be merged. If you want to only discuss it, send a link to your branch along with your questions through whichever communication channel you prefer.

Peer reviews

All pull requests must be reviewed by at least one person who is not their original author.

To help reviewers, make sure to describe your pull request with a clear text explanation of your changes.

Continuous delivery

GitHub Actions is used to release the package on every merge to the main branch.

Branch protection is active, meaning that a merge to the main branch can only take place once all tests pass in CI, and that the peer review policy has been fulfilled.

Commit messages

We strive to follow this recommendation to write our commit messages, which contains the following rules:

We add this additional rule:

  • Do not rely on GitHub issue reference numbers in commit messages, as we have no guarantee the host system and its autolinking will be stable in time. Make sure the context is self-explanatory. If an external reference is given, use its full URL.

Changelog

All changes to the codebase that impact users must be documented in the CHANGELOG.md file.

The format to use is Common Changelog, with the following additional specifications:

  1. The unreleased section of Keep a Changelog is preserved with the addition of a tag to specify which type of release should be published and to foster discussions about it inside pull requests. This tag should be one of the names mandated by SemVer, within brackets: [patch], [minor] or [major]. For example: ## Unreleased [minor].
  2. Each listed change must provide an actionable way to adapt the user’s codebase, either directly in the changelog or through instructions or links.
  3. Changes are a single sentence with no punctuation (like all examples given in Common Changelog).
  4. Since each release is produced automatically from a single pull request, the notice is always used to link to the source pull request rather than references, which would always reference the same pull request. References can instead link to relevant specific parts of an RFC, decision record or diff.

Development

Documentation

Copywriting

Avoid “you” and “we” (who is “we” in a common anyway?). Use neutral or passive wordings.

- You can find federated public instances on GitHub.
+ Federated public instances can be found on GitHub.

CLI

Options case

Use snake case for multiple word options:

- ota validate --schemaOnly
+ ota validate --schema-only
docopt

For command-line examples and documentation, we follow the docopt usage patterns syntax. Quick recap of the main points:

  • mandatory arguments are given between < and >;
  • optional elements are given between [ and ];
  • mutually exclusive elements are given between ( and ) and separated by |.
- ota track --services [ $service_id ] [, $service_id, ...]
+ ota track [--services <service_id>...]
Long options

In order to improve the understandability of commands, we document all CLI options and examples with the long version of the options.

- ota track -s $service_id -r
+ ota track --services <service_id> --extract-only

Naming

Instances and repositories

An “instance” of Open Terms Archive is comprised of a server running Open Terms Archive and up to three repositories. An instance has a name describing the scope of services it aims at tracking. This scope is defined by one or several dimensions: jurisdiction, language, industry…

For example, the france instance tracks documents in the French jurisdiction and French language, while the dating instance tracks services from the dating industry.

The instance name is written in lowercase and is made of one word for each dimension it focuses on, separated by dashes.

For example, the france-elections instance tracks services in the French jurisdiction and French language that could impact the French electoral processes.

This name is used consistently in all communication, written references, and in the inventory of instances that are managed automatically. It is also used as the base for naming the database repositories, by suffixing it with each type:

  • The repository containing the declarations of services to be tracked is named <instance_name>-declarations. You can create it from a template.
  • The repository containing the snapshots of the tracked documents (unless the instance is storing them in an alternative database) is named <instance_name>-snapshots. You can create it from a template.
  • The repository containing the versions of the tracked documents (unless the instance is storing them in an alternative database) is named <instance_name>-versions. You can create it from a template.

Namespaces

We deploy identifiers for packages and namespaces across different universes: package managers, social networks, URLs… In order to unify these names across constraints, we reserve everywhere the name opentermsarchive, with no space, no dash, no underscore, no capital.

For example, NPM does not allow uppercase and spaces; Ansible does not allow dashes and spaces; Twitter does not allow spaces. The name ota is too unlikely to be available everywhere.

Practices

Errors handling

First of all it's important to distinguish two fundamentally different kinds of errors: operational and programmer errors.

  • Operational errors represent run-time problems experienced by correctly-written programs. These are not bugs in the program. These are usually problems with something else: the system itself (e.g. out of memory), the system’s configuration (e.g. no route to a remote host), the network (e.g. socket hang-up), or a remote service (e.g. a 500 error).

  • Programmer errors are bugs in the program. These are things that can always be avoided by changing the code. They can never be handled properly, since by definition the code in question is broken (e.g. tried to read property of undefined, or forget to await an asynchronous function).

So the very important distinction is that operational errors are part of the normal operation of a program whereas programmer errors are bugs.

Also noteworthy, failure to handle an operational error is itself a programmer error.

Handling operational errors

There are five ways to handle operational errors:

  • Deal with the failure directly. For example, create directory if it's missing.
  • Propagate the failure. If you don’t know how to deal with the error, the simplest thing to do is to abort whatever operation you’re trying to do, clean up whatever you’ve started, and propagate the error.
  • Retry the operation. For example, try to reconnect if the connection is lost.
  • Log the error — and do nothing else. If it's a minor error and there’s nothing you can do about, and there is no reason to stop the whole process.
  • Crash immediately. If the error cannot be handled and can affect data integrity.

In our case, we consider all fetch-related errors as expected, so as operational errors and we handle them by logging but we do not stop the whole process. We handle errors related to the notifier in the same way. In contrast, we consider errors from the recorder module as fatal, and we crash immediately.

Handling programmer errors

The best way to handle programmer errors is to crash immediately. Indeed, it is not recommended to attempt to recover from programmer errors — that is, allow the current operation to fail, but keep handling requests. Consider that a programmer error is a case that you didn’t think about when you wrote the original code. How can you be sure that the problem won’t affect the program itself and the data integrity?

This section is highly inspired, and in part extracted, from this error handling guide.