I am not particularly fond of coding style guides in general. However, as Julia matures, common practices are starting to materialize, many of which I applaud, but some of which I detest. I therefore felt compelled to create this guide as a rebuttal to some of the style practices of which I disapprove.
Some of this guide specifically addresses BlueStyle, currently the most commonly cited Julia style guide. Despite my emphasis to the contrary, most of this guide agrees with BlueStyle, and I am partially grateful that it has become so commonly followed, as Julia is expressive enough to allow for some much worse alternatives. However, I have some major gripes with BlueStyle: some quibbles about naming, the infuriating way they write multi-line function arguments, and especially their gratuitous insistence on return statements nearly everywhere.
In this guide I am most careful to record major points of departure between my style and some of the most common practices. Therefore, if I don't mention it here, I probably agree with BlueStyle, and almost certainly agree with the Julia language docs style guide.
As I've previously stated, I'm a bit dubious of the entire concept of style guides and as such, I frown upon dogmatic adherence to any style guide, including this one.
Readability, though subjective, is the most important thing.
Typing is fast and fun; even the best coders will spend far more time reading code than writing it.
The Julia style guide appearing in the base language documentation is widely adhered to and not particularly controversial. Indeed, I follow it here; there are few, if any, points of disagreement between what I present here and this base style.
Many functions with mostly just a few lines per function.
4 spaces per indentation level, no tabs.
Roughly 100 character line length limit. Shorter is ok, longer quickly starts getting bad.
Upper camel-case for modules and types.
All lowercase without underscores for functions by default. Underscores are encouraged in cases of overly verbose symbols, but these should be avoided.
Terse, succinct variable names. Single characters are ideal. Unicode is encouraged, particularly for idiomatic names.
using and import statements should group packages semantically, e.g. stdlibs together. using statements which import specific module contents should get their own line.
Method doc strings are strongly encouraged. Incomplete is better than missing.
Some whitespace for readability is a good thing, but don't go overboard or it becomes counter-productive.
No padding brackets with spaces.
The best code is both modular and generic. The former means that you can either take small pieces out of your code or put other things into it and still get something useful; the latter means the code has many applications.
Code which is modular and generic tends to look a certain way. For one, it cannot be emphasized enough that you should write functions, not just scripts. Functions can be taken out and used in different places for different purposes. Scripts are mostly useless except for the purpose and context for which they are written.
Usually making your code modular means that the vast majority of functions are just a few lines. If you are writing much longer functions, it is likely they can be split up into smaller components, many of which may have broader applicability. This is also related to the "don't repeat yourself" principle, since very large blocks of code usually contain some sort of repetition. You will likely find that code written this way is much easier to understand.
#GOOD
short1() = # small amount of code
short2() = # small amount of code
iteration(idx) = # some stuff involving the above
function long()
    A = Matrix{ComplexF64}(undef, n, m)
    foreach(iteration, CartesianIndices(A))
end

#BAD
function long()
    # lots of preparation in-line
    A = Matrix{ComplexF64}(undef, n, m)
    for idx ∈ CartesianIndices(A)
        # tons of complicated code in-line
    end
end
Keeping code modular is also extremely helpful for writing good unit tests.
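As a small illustration of the testing point, here is a minimal sketch using the Test standard library. The names phase and fillphases! are hypothetical, invented for this example only; the point is that each small function can be tested on its own.

```julia
using Test

# Small, single-purpose function: easy to test in isolation.
# (`phase` is a made-up example, not part of any library.)
phase(idx::CartesianIndex{2}) = cis(2π * (idx[1] + idx[2]) / 8)

# The "long" function is now only a thin loop over the small piece.
function fillphases!(A::AbstractMatrix{<:Complex})
    for idx ∈ CartesianIndices(A)
        A[idx] = phase(idx)
    end
    A
end

@testset "phase" begin
    @test phase(CartesianIndex(4, 4)) ≈ 1       # a full turn
    @test abs(phase(CartesianIndex(1, 2))) ≈ 1  # always on the unit circle
end

@testset "fillphases!" begin
    A = fillphases!(Matrix{ComplexF64}(undef, 2, 3))
    @test size(A) == (2, 3)
    @test A[2, 1] ≈ phase(CartesianIndex(2, 1))
end
```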
Types and modules should use upper camel case. In some cases it may be preferable to lowercase the non-initial characters of an acronym, as this can cut down on visual noise.
Again, this is consistent with both the base Julia style guide and BlueStyle; I will defer to those for more detail.
module FiberBundles end    # GOOD
module fiberbundles end    # BAD
struct ManifoldPoint end   # GOOD
struct Manifold_Point end  # BAD
In cases where there is likely to be confusion between type parameters which are types and those which are literals, parameters which are types should use a script character, for example 𝒯, to distinguish them as such. It is best to do this only for parameters which are actually types; for example, in Array{𝒯,N}, N is intended to be an Integer. Script characters are \mathcal in LaTeX and are prefixed with \scr in Julia's LaTeX-to-unicode completion (e.g. \scrT for 𝒯).
#GOOD (X,Y,Z,A are not types)
struct ManifoldPoint{𝒯<:Real,X,Y,Z,A} end
In more common cases in which there is no risk of confusion it is fine to use uppercase letters.
#GOOD
struct ManifoldPoint{T} end
# BAD (they look like literals)
struct ManifoldPoint{type} end
struct ManifoldPoint{TYPE} end
In some cases a short or special type name is idiomatic. In all cases, such special names should be some sort of special character. For example, it is reasonable to define
#GOOD
const ℝ = Real
const ℂ = Complex
#BAD
const R = Real
Function names should be all lowercase and run together without underscores. Particularly long or verbose names should have underscores for clarity, but the main reason for discouraging underscores is that short function names encourage good multiple dispatch code.
For example, it is very unlikely that you want two separate functions rand_int() and rand_float(); much more likely you want rand(𝒯), where the argument specifies the method behavior and return type. Looking for short function names can encourage you to consider what methods a function should have, and to combine similar functionality into functions that can be used generically.
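To sketch what this looks like in practice, the snippet below defines one function with two methods selected by a type argument. The names sample and samples are hypothetical (chosen to avoid adding methods to Base.rand); Base's own rand(Int8) and rand(Float32) already follow the same pattern.

```julia
# One function name; the type argument selects the method.
# `sample` and `samples` are made-up names for illustration.
sample(::Type{𝒯}) where {𝒯<:Integer} = rand(𝒯)
sample(::Type{𝒯}) where {𝒯<:AbstractFloat} = 2rand(𝒯) - 1  # uniform on [-1, 1)

# Generic code only needs to know the one name.
samples(::Type{𝒯}, n::Integer) where {𝒯} = [sample(𝒯) for _ ∈ 1:n]

samples(Int8, 3)     # a Vector{Int8}
samples(Float32, 3)  # a Vector{Float32}
```

With separate sample_int and sample_float functions, samples would need either two copies or a branch on the type.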
Particularly short function names of one or two characters are not often appropriate, but may be in some idiomatic cases. For example, if you are using the gamma function, by all means please call it Γ; it would be crazy not to.
Yes, it is possible to go too far. The main point of combining methods into functions is to facilitate generic functionality and make it easier to write generic code. This is different from coding "puns" in which a symbol is overloaded with methods that cannot possibly be used in analogous situations.
#GOOD
function run(s::Server)
function Γ(z::Number)
#BAD
function run_server(s::Server)
function gamma_function(z::Number)
It is often the case that particularly verbose names are appropriate for "internal" functions with very limited applicability. As Shannon taught us, efficient encodings use longer symbols for less common values. If a function is not intended to be used far from its definition, its name should be prefixed with an underscore (_). For example, an internal function might look something like _handle_bizarre_edge_case(a...).
Variable names should be short and terse, preferably one character. Naming conventions should be idiomatic where possible. This is likely to be one of the more controversial aspects of this style guide and I am importing it from math and physics.
Readability is of the utmost importance. Verbose variable names reduce the signal to noise ratio and, in the worst cases, make reading complicated expressions horrendous.
Certain conventions about which variables are typically used for what should be respected unless the context demands otherwise. For example, n, m, i, j, k are probably integers. x, y, z are probably numbers in a continuous space, either real or complex. θ, φ, ψ are likely continuous and dimensionless. κ is likely "constant" in some respect. v, w are likely not scalar. f, g may be functions, but it is often better to use script characters such as 𝒻, ℊ to show the distinction. A, B are likely operators or matrices.
The above should not be followed dogmatically, but is flexible depending on the context.
It is usually preferable to accompany a function with at least a minimal doc string, which in many cases will explain at least argument variables. Comments explaining variables are acceptable if you feel they are needed.
Some might object to this saying that expecting comments and doc strings to do the job of "self-documenting" verbose variable names is crazy. In my experience, this argument is based on a fantasy: either the use of the variables is mostly clear even without the verbose names or additional documentation is needed anyway. There is only a very narrow space in between and it isn't worth having unwieldy and verbose naming conventions in the hope that some day you're going to be lucky enough to land there.
#GOOD
n = 0
k += 1
z = x + im*y
v[idx]  # in some cases, such as if idx isa CartesianIndex
cfg = configure()
φ = connect()
B = A*x  # x is a vector in this case
α, β = divrem(N, m)

#BAD
number = 0
count += 1
value = real_part + im*imag_part
config = configure()
connection = connect()
lhs_matrix = coefficient_mat*vec
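To make the doc string point concrete, here is a minimal sketch in which a short doc string does the explaining and the argument names stay terse. The function rotate is hypothetical, invented for this example.

```julia
"""
    rotate(z, θ)

Rotate the complex number `z` about the origin by the angle `θ` (in radians).
"""
rotate(z::Number, θ::Real) = z * cis(θ)

rotate(1 + 0im, π/2)  # ≈ 0 + 1im
```

A signature like rotate(complex_number, rotation_angle_radians) would carry no more information than the doc string above, and the body would be harder to read.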
Some accommodation for fonts which may have poor unicode support is reasonable; for example, it's reasonable to hesitate to use 𝔄. It is also reasonable to avoid ν because some fonts display it as a v. However, excessive subservience to crap fonts should be avoided. If you are using a font that's so bad that you are afraid to use non-ASCII, change it.
Using terse variable names requires a little bit of conscientiousness on the part of the programmer. For example, in a function (or module) in which "counting" integers are extremely common, one might struggle to assign appropriate names for all of them.
For example:
#GOOD
f(i, j, n, k) = do_stuff_1()
f(i, j, name::AbstractString, idx) = do_stuff_2()
#BAD
f(i, j, n, k) = do_stuff_1()
# in the good example we saw that the 3rd and 4th args have different semantics here,
# so it's not good to keep the names (especially k)
f(i, j, n::AbstractString, k) = do_stuff_2()
In other words, yes, it's possible to go too far using terse variable names. This is not a good justification for extremely verbose variable names, but it is a good justification for not fanatically adhering to a specific convention. If you are struggling to come up with a good name, just use a more verbose one; that doesn't mean you should make everything verbose. (See above comments about dogmatic adherence to style guides being bad.)
Avoiding globals is good practice in most languages for a number of reasons, but they are particularly bad in Julia unless you know what you're doing. All globals should be const, except in very small scripts. Global names should be all uppercase; this serves to strongly distinguish them from locals, and to a lesser extent to warn you to be careful with them.
#GOOD (sometimes at least)
const GLOBAL_MUTABLE_STATE = Dict{Int,Float64}() # note this is fully-typed
#BAD, very bad
global_mutable_state = Dict()
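A minimal sketch of the recommended pattern follows. The names CACHE and cachedsqrt! are hypothetical. The global is const and fully typed, and the function takes the state as an explicit argument instead of reaching for the global itself.

```julia
# A `const` global has a fixed type, so code that reads it can be compiled efficiently.
const CACHE = Dict{Int,Float64}()

# Better still where practical: pass the state in explicitly as an argument.
function cachedsqrt!(cache::AbstractDict{<:Integer,<:Real}, n::Integer)
    get!(() -> sqrt(n), cache, n)
end

cachedsqrt!(CACHE, 4)  # computes 2.0 and stores it
cachedsqrt!(CACHE, 4)  # looks up the stored value
```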
Use unicode operator names where appropriate, particularly the built-ins.
#GOOD
A ⊆ B
a ≠ b
α ≡ β
x ∈ S
#BAD
issubset(A, B)
a != b
α === β
x in S
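For reference, the unicode and ASCII spellings above are the same operations in Base, so switching between them never changes behavior. The short check below also notes the tab completions I would expect to use in the REPL or an editor with LaTeX-to-unicode support.

```julia
A, B = Set([1, 2]), Set([1, 2, 3])

(A ⊆ B) == issubset(A, B)  # true; typed as \subseteq<TAB>
(1 ≠ 2) == (1 != 2)        # true; \ne<TAB>
(A ≡ A) == (A === A)       # true; \equiv<TAB>
(1 ∈ A) == (1 in A)        # true; \in<TAB>
```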
Please, for the love of god, stop using unnecessary return
An elegant and delightful convention which Julia has inherited from Lisp is that everything is an expression. To use return statements gratuitously is to deny the beauty and simplicity of this concept. It seems likely to me that many people want unnecessary return statements because they are not sufficiently comfortable with this concept. It is a good thing to get used to, as it provides many nice and expressive ways of writing code.
#GOOD
function 𝒻(x, y)
    if y ≥ 0  # (this should be in one line, but I expand it for illustrative purposes)
        x
    else
        -x
    end
end
function 𝒽(x, y)
    z = if rand() > 1/2
        x + y
    else
        x - y
    end
    cosh(z)
end
#BAD
function 𝒻(x, y)
    if y ≥ 0
        return x
    else
        return -x
    end
end
function 𝒽(x, y)
    if rand() > 1/2
        z = x + y
    else
        z = x - y
    end
    return cosh(z)
end
It has been argued that choosing not to include return statements makes it less clear what the intention of the function is. This is of course absurd, and the argument is likely imported from non-Lisp languages. Nothing is forcing you to return a meaningful value: if you really don't want one, return nothing; that's why it exists. Don't return something you don't intend to return.
#GOOD
function ℊ(A)
    # do some stuff to A
    # in most cases functions like this return A, but perhaps not
    nothing
end
#BAD
function ℊ(A)
    # do some stuff to A
    return nothing
end
Obviously, if you want to return a value before the end of the function block, you should use the return keyword.
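A minimal sketch of that legitimate case, using a hypothetical function firstnegative: the early exit needs return, while the fall-through value is still just the last expression.

```julia
# Early exit is a legitimate use of `return`.
function firstnegative(v)
    for x ∈ v
        x < 0 && return x
    end
    nothing  # reached only if no element was negative
end

firstnegative([3, -1, -4])  # -1
firstnegative((1, 2))       # nothing
```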
Arguments should usually have type assertions. It is sometimes falsely claimed that Julia relies on this for performance. This is not true, but type assertions on arguments are still useful because:
They catch many errors and usually result in a much more comprehensible error message than if they were absent. It's also much easier to unintentionally allow for undefined behavior if you omit them.
They can make the intended use of a method clearer.
They are useful for multiple dispatch. Even if you do not initially intend to define other methods for your function, using reasonable type assertions early on can save you from a lot of confusion later.
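The error-message point can be seen in a small sketch. The functions next and nextany are hypothetical; they differ only in whether the argument is annotated.

```julia
next(n::Integer) = n + 1  # annotated
nextany(n) = n + 1        # not annotated

# The annotated method fails at the call site, naming the function you actually called.
err = try
    next("ten")
catch e
    e
end
err isa MethodError && err.f === next  # true

# The unannotated one fails inside the body, complaining about `+` instead.
err = try
    nextany("ten")
catch e
    e
end
err isa MethodError && err.f === (+)  # true
```

In a one-line function the difference is small, but in a function that passes its argument several calls down, the second kind of error can surface far from the actual mistake.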
You should avoid overly specific type assertions. Type assertions which are too specific inappropriately limit the functionality of a method. The following might be educational:
Most of the time, if you want an integer, you want Integer. In most contexts, any integer makes sense. Sometimes Signed or Unsigned is appropriate.
Most of the time you want Real and not Float64. A notable exception is GPUs, which typically require Float32 to work efficiently.
Most of the time you want AbstractArray rather than Array. In many cases this will be AbstractVector or AbstractMatrix.
Avoid overly specific types for container type parameters. You probably want AbstractVector{<:Real}, not Vector{Float64}.
#GOOD (usually)
function 𝒻(z::Number, x::Real, n::Integer, v::AbstractVector{<:Complex})
#BAD (again, usually)
function 𝒻(z, x, n::Int, v::Vector{ComplexF64})
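To show what the generic signature buys you, here is a small sketch with two hypothetical functions, meansquare and meansquarestrict, that have identical bodies and differ only in their argument types.

```julia
# Generic signature: works for any vector of reals.
meansquare(v::AbstractVector{<:Real}) = sum(abs2, v) / length(v)

meansquare([1.0, 2.0])                # a Vector{Float64}: 2.5
meansquare(1:4)                       # a UnitRange{Int}: 7.5
meansquare(view(Float32[3, 4], 1:2))  # a SubArray of Float32: 12.5f0

# Over-specific signature: only the first of the calls above would be accepted.
meansquarestrict(v::Vector{Float64}) = sum(abs2, v) / length(v)

applicable(meansquarestrict, 1:4)  # false
```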
It's important to know when to "give up" on type assertions. Again, if they are too specific, your code won't be as useful as it could be. Unless you are new to the language, if you find yourself struggling to figure out an appropriate type assertion, it's time to give up and just leave it off. Usually you don't want to bother with Union assertions, though an important exception is for handling "null" values, such as Union{Nothing,Int}.
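A minimal sketch of the "null" case, with a hypothetical function firstitems in which nothing stands for "no limit". It uses Integer rather than Int in the Union, in keeping with the advice above, and assumes Julia 1.6 or later for the two-argument form of first.

```julia
# `nothing` stands in for "no limit"; the Union makes that explicit in the signature.
function firstitems(v::AbstractVector, limit::Union{Nothing,Integer}=nothing)
    isnothing(limit) ? v : first(v, limit)
end

firstitems([1, 2, 3])     # [1, 2, 3]
firstitems([1, 2, 3], 2)  # [1, 2]
```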
Another notable class of cases in which you should leave off type assertions is iterators. Iterators can be of any type and Julia has no formal way of specifying argument interfaces. If, e.g., an AbstractArray or Tuple is appropriate, you probably shouldn't bother with a type assertion.
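As a quick sketch of why the annotation is best left off here, the hypothetical function sumsquares below accepts anything that can be iterated, including types that share no useful common supertype.

```julia
# No annotation on `itr`: anything that can be iterated is accepted.
function sumsquares(itr)
    s = 0
    for x ∈ itr
        s += x^2
    end
    s
end

sumsquares([1, 2, 3])      # an Array: 14
sumsquares((1, 2, 3))      # a Tuple: 14
sumsquares(x for x ∈ 1:3)  # a Generator: 14
```

Annotating itr as AbstractArray would have rejected the last two calls for no benefit.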
You should adhere to the following:
Use one-line function if it fits!
The first arguments should always appear on the same line as the function name.
Use more lines for clarity if needed, especially for keyword arguments.
The closing bracket should either be on the same line as the function name, or in the same column as the opening bracket, never at the 1st column.
Start keyword arguments on a new line unless your signature is very short... trust me, I'm very bad with this and it has often cost me.
No spaces between keywords, = and their arguments.
#GOOD
𝒻(x::Number, y::Number) = x + y
function ℊ(z::𝒯;
           option1="some_option", option2="another_option",
           switch1::Bool=true, switch2::Bool=false,
          ) where {𝒯}
end
#BAD
function 𝒻(x, y)
    x + y
end
function ℊ(
    z::𝒯; option1 = "some_option", option2 = "another_option",
    switch1::Bool = true, switch2::Bool = false,
) where {𝒯}
end
Lists of items, whether arguments or an array definition, can be unrolled into multiple lines for clarity when appropriate. This is semantically meaningful in the case of arrays.
If a list is any longer than a single line, always put commas (or whatever delimiter is appropriate) after every element, including the last one. The reason for this should be obvious: if you add elements to or otherwise alter the list, you will cause a syntax error unless you remember to add the comma.
Excessive verbosity is discouraged. Feel free to make list elements more compact than similar syntax might appear in other situations. For example, pairs (=>) should not be padded with spaces.
#GOOD
const LOOKUP_TABLE = Dict("a"=>0x01, "b"=>0x02,
                          "c"=>0x03, "d"=>0x04,
                         )  # closing on the previous line is also ok
const LOOKUP_TABLE = Dict("a"=>0x01,
#BAD
const LOOKUP_TABLE = Dict("a" => 0x01,
                          "b" => 0x02,
                          "c" => 0x03,
                          "d" => 0x04 #← missing comma!
                         )
One should always use ∈, not in, and especially not = (the last of which should, frankly, be removed from the language).
#GOOD
for x ∈ v end
#BAD
for x in v end
#WTF, how is this even a thing?
for x = v end
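For what it's worth, all three spellings are accepted by the parser and iterate identically, so the choice really is purely one of style. A quick check using comprehensions, which take the same three forms:

```julia
v = [1, 2, 3]

# All three spellings produce the same iteration.
[2x for x ∈ v] == [2x for x in v] == [2x for x = v]  # true
```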