
This is a description of a programming and code formatting style for Mathematica programs that I call "munging style code layout".

Any self-contained part of a Mathematica program is a Mathematica expression. As the oft-repeated idiom says, in Mathematica "everything is an expression". A Mathematica expression looks like either function[arg1, arg2, ...] (a composite expression, often called a "function") or simply foobar (an atomic expression, such as a symbol). This is true for any part of a Mathematica program at any depth.
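As a small illustration of the point above, the built-in functions FullForm, Head and AtomQ expose this structure directly. The symbols a, b, c, f, x, y and foobar are placeholders and are assumed to have no values assigned.

```mathematica
(* Operators are only shorthand for nested function[arg1, arg2, ...] forms *)
FullForm[a + b*c]      (* Plus[a, Times[b, c]] *)
FullForm[{1, 2}]       (* List[1, 2] *)

(* A composite expression has a head and arguments *)
Head[f[x, y]]          (* f *)
AtomQ[f[x, y]]         (* False *)

(* A bare symbol is atomic *)
AtomQ[foobar]          (* True *)
Head[foobar]           (* Symbol *)
```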

Any Mathematica function is essentially a nested structure of the above building blocks. Complex Mathematica programs tend to have deeply nested expressions like

f4[arg41, f3[arg31, arg32, f2[arg21, f1[input], ...], ...], arg43, ...] (* comment explaining what f1, f2, f3, f4 do *)

Imagine a complex Mathematica program having hundreds of such expressions, sometimes nested inside each other. Code formatted this way is hard to read or understand because one constantly needs to jump into and back out of the brackets [ ... ]. In effect, the reader has to maintain a stack data structure in their head and navigate it frequently.

Over time I have gradually settled on a particular formatting style for complex Mathematica functions with such deeply nested structure. I think it helps in writing and understanding my Mathematica programs. I call it munging style code layout. It is nothing more than putting small parts of the code on separate lines and using Composition to chain them together:

Composition[
    f4[arg41, #, arg43, ...]&, (* comment for what f4 does *)
    f3[arg31, arg32, #, ...]&, (* comment for what f3 does *)
    f2[arg21, #, ...]&, (* comment for what f2 does *)
    f1 (* comment for what f1 does *)
][input]

I call it munging because it operates on a compact initial input, typically represented by a Mathematica symbol, and processes it little by little over multiple steps before returning the final desired output. Since Composition[f, g][x] evaluates to f[g[x]], the steps run from the bottom line upward.
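Here is a minimal concrete sketch of the layout, using only built-in functions. The input list and the particular steps (sort, keep the odd numbers, square, sum) are made up for illustration; they stand in for the f1 ... f4 placeholders above.

```mathematica
input = {3, 1, 4, 1, 5, 9, 2, 6};   (* sample data, chosen only for illustration *)

(* Conventional nested form: read from the innermost bracket outward *)
nested = Total[Map[#^2 &, Select[Sort[input], OddQ]]];

(* Munging form: the same computation, one step per line, bottom step first *)
munged = Composition[
    Total,               (* add up the squares *)
    Map[#^2 &, #] &,     (* square every remaining number *)
    Select[#, OddQ] &,   (* keep only the odd numbers *)
    Sort                 (* sort ascending *)
][input];

nested === munged   (* True; both evaluate to 117 *)
```

Steps that take the data as their only argument (Sort, Total) can be written as bare symbols, while steps that need extra arguments are wrapped as pure functions with # marking where the data goes.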

This way of laying out the code has a few advantages:

  1. The program body will spread out to multiple lines naturally according to each step's logic. The code is still logically nested, but not visually nested.

  2. Each munging step can be commented at the end of line, so it's easier to write and read the comment. One line of code is structurally and logically simple, so each line's comment will be short too. As a result, the comments for the complete program should also be easier to read and understand.

  3. It is easier to write and debug code. A typical scenario:

Write the first simple step:

Composition[ 
    f1[arg11, #, arg13]& (* comment for what f1 does *)
][input]

Run it and verify the first step does what it should do, then add a second step, and comment as you code:

Composition[ 
    f2[arg21, #, ...]&, (* comment for what f2 does *)
    f1[arg11, #, arg13]& (* comment for what f1 does *)
][input]

Notice that when writing the body of f2[...], one does not need to edit around f1[...], only to type on a separate new line. This may not sound like a big deal but, in my experience, it eliminates many chances of messing up f1[...] while typing f2[...]. If one decides that the f2 just typed shouldn't stay, one can simply select the whole line of f2 (typically a keyboard-only or mouse-only operation, depending on the programmer's habits and the editor) and delete it, without worrying about damaging any part of f1[...] or needing to copy f1[...] out safely first. So there is an ergonomic advantage in putting f1[...] and f2[...] on different lines. You write your code in small chunks, and manage the chunks as the composed terms of the entire logic.

In addition, one can easily print out the intermediate value between two steps by simply inserting a NOP step:

Composition[ 
    f2[arg21, #, ...]&, (* comment for what f2 does *)
    (Print@#; #)&,  (* NOP step to print out intermediate value for debugging *)
    f1 (* comment for what f1 does *)
][input]

When you have added more steps, you might find that the first few steps aren't perfect. You can then insert more NOP steps to print out more intermediate values:

Composition[
    ...,
    f4[arg41, #, arg43, ...]&, (* comment for what f4 does *)
    (Print@#; #)&,
    f3[arg31, arg32, #, ...]&, (* comment for what f3 does *)
    (Print@#; #)&,
    f2[arg21, #, ...]&, (* comment for what f2 does *)
    f1 (* comment for what f1 does *)
][input]

Notice that inserting these NOP steps again doesn't require editing the lines of actual code (f1[...]&, f2[...]&, ...). So there is a natural and convenient separation of debugging code and real code.
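When several NOP steps are in place at once, it can help to label what each one prints. The sketch below assumes a small helper named tap, which is hypothetical (not a built-in) and is introduced here only for illustration. The data and the real steps are likewise made up.

```mathematica
(* tap is a hypothetical helper, not a built-in:
   it prints a labelled value and passes the value on unchanged *)
tap[label_] := Function[x, Print[label, ": ", x]; x];

result = Composition[
    Total,                  (* add up the squares *)
    tap["after Map"],       (* debugging only; delete this line when done *)
    Map[#^2 &, #] &,        (* square every remaining number *)
    tap["after Select"],    (* debugging only; delete this line when done *)
    Select[#, OddQ] &,      (* keep only the odd numbers *)
    Sort                    (* sort ascending *)
][{3, 1, 4, 1, 5, 9, 2, 6}];

(* Prints, in this order:
   after Select: {1, 1, 3, 5, 9}
   after Map: {1, 1, 9, 25, 81}
   and result is 117, the same as without the tap lines *)
```

Newer versions of Mathematica also provide the built-in Echo, which prints its argument and returns it, so Echo on a line of its own may serve as the NOP step where it is available.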

All of this may appear too trivial a matter to document. But I have found Mathematica code harder to format than code in most other programming languages, partly because of the character of the Mathematica language (functional and symbolic) and partly because neither the Mathematica notebook nor Wolfram Workbench provides sophisticated and robust automatic code formatting/indentation. This little formatting rule seems to help me write better Mathematica code, and sometimes write it faster and more easily.
