Skip to content

SciML/SciMLStyle

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

SciML Style Guide for Julia

SciML Code Style Global Docs

The SciML Style Guide is a style guide for the Julia programming language. It is used by the SciML Open Source Scientific Machine Learning Organization. It covers proper styles to allow for easily high-quality, readable, robust, safety, and fast code that is easy to maintain for production and deployment.

It is open to discussion with the community. Please file an issue or open a PR to discuss changes to the style guide.

Table of Contents

Code Style Badge

Let contributors know your project is following the SciML Style Guide by adding the badge to your README.md.

[![SciML Code Style](https://img.shields.io/static/v1?label=code%20style&message=SciML&color=9558b2&labelColor=389826)](https://github.com/SciML/SciMLStyle)

Overarching Dogmas of the SciML Style

Consistency vs Adherence

According to PEP8:

A style guide is about consistency. Consistency with this style guide is important. Consistency within a project is more important. Consistency within one module or function is the most important.

But most importantly: know when to be inconsistent -- sometimes the style guide just doesn't apply. When in doubt, use your best judgment. Look at other examples and decide what looks best. And don't hesitate to ask!

Some code within the SciML organization is old, on life support, donated by researchers to be maintained. Consistency is the number one goal, so updating to match the style guide should happen on a repo-by-repo basis, i.e. do not update one file to match the style guide (leaving all other files behind).

Community Contribution Guidelines

For a comprehensive set of community contribution guidelines, refer to ColPrac. A relevant point to highlight PRs should do one thing. In the context of style, this means that PRs that update the style of a package's code should not be mixed with fundamental code contributions. This separation makes it easier to ensure that large style improvements are isolated from substantive (and potentially breaking) code changes.

Open source contributions are allowed to start small and grow over time

If the standard for code contributions is that every PR needs to support every possible input type that anyone can think of, the barrier would be too high for newcomers. Instead, the principle is to be as correct as possible to begin with, and grow the generic support over time. All recommended functionality should be tested, and any known generality issues should be documented in an issue (and with a @test_broken test when possible). However, a function that is known to not be GPU-compatible is not grounds to block merging, rather it is encouraged for a follow-up PR to improve the general type support!

Generic code is preferred unless code is known to be specific

For example, the code:

function f(A, B)
    for i in 1:length(A)
        A[i] = A[i] + B[i]
    end
end

would not be preferred for two reasons. One is that it assumes A uses one-based indexing, which would fail in cases like OffsetArrays and FFTViews. Another issue is that it requires indexing, while not all array types support indexing (for example, CuArrays). A more generic and compatible implementation of this function would be to use broadcast, for example:

function f(A, B)
    @. A = A + B
end

which would allow support for a wider variety of array types.

Internal types should match the types used by users when possible

If f(A) takes the input of some collections and computes an output from those collections, then it should be expected that if the user gives A as an Array, the computation should be done via Arrays. If A was a CuArray, then it should be expected that the computation should be internally done using a CuArray (or appropriately error if not supported). For these reasons, constructing arrays via generic methods, like similar(A), is preferred when writing f instead of using non-generic constructors like Array(undef,size(A)) unless the function is documented as being non-generic.

Trait definition and adherence to generic interface is preferred when possible

Julia provides many different interfaces, for example:

Those interfaces should be followed when possible. For example, when defining broadcast overloads, one should implement a BroadcastStyle as suggested by the documentation instead of simply attempting to bypass the broadcast system via copyto! overloads.

When interface functions are missing, these should be added to Base Julia or an interface package, like ArrayInterface.jl. Such traits should be declared and used when appropriate. For example, if a line of code requires mutation, the trait ArrayInterface.ismutable(A) should be checked before attempting to mutate, and informative error messages should be written to capture the immutable case (or, an alternative code that does not mutate should be given).

One example of this principle is demonstrated in the generation of Jacobian matrices. In many scientific applications, one may wish to generate a Jacobian cache from the user's input u0. A naive way to generate this Jacobian is J = similar(u0,length(u0),length(u0)). However, this will generate a Jacobian J such that J isa Matrix.

Macros should be limited and only be used for syntactic sugar

Macros define new syntax, and for this reason, they tend to be less composable than other coding styles and require prior familiarity to be easily understood. One principle to keep in mind is, “can the person reading the code easily picture what code is being generated?”. For example, a user of Soss.jl may not know what code is being generated by:

@model (x, α) begin
    σ ~ Exponential()
    β ~ Normal()
    y ~ For(x) do xj
        Normal+ β * xj, σ)
    end
    return y
end

and thus using such a macro as the interface is not preferred when possible. However, a macro like