Graphlet: a readable serialized representation of graphs

←	March 2021
S	M	T	W	T	F	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

blog	11
computing	10
note	8
programming	3
tip	3
Java	2
bash	2
china	2
chinese	2
data	2
emacs	2
git	2
journal	2
linguistics	2
mathematica	2
mathematics	2
mercurial	2
news	2
revision control	2
statistics	2

A tree graph is uniquely identified by the set of its edges

{1->2, 1->3, 2->4, 2->5, 2->6, 3->7, 3->8}

This syntax is used in Wolfram Language's graph specification. The actual picture of the graph can be displayed in Mathematica with:

Graph[{1 -> 2, 1 -> 3, 2 -> 4, 2 -> 5, 2 -> 6, 3 -> 7, 3 -> 8}]

It can be drawn in Wolfram|Alpha conveniently with query "graph 1 -> 2, 1 -> 3, 2 -> 4, 2 -> 5, 2 -> 6, 3 -> 7, 3 -> 8".

The question is: is there a better way to represent the graph in a compact, readable, and text-only form?

Why?

A few use cases which might give some motivation:

In tweets, sometimes, one might want to include graphs accurately represented by a short sequence of regular characters, which ideally should also be easy to parse as-is (instead of only after being processed by software such as Mathematica's Graph[...] function.)
When labeling/classifying objects, besides using tags which are a flat list of IDs of form {tag1, tag2, ...} and equivalent to a graph with vertexes but no edges, one can use a tree graph represented in a succinct form to carry more information about the hierarchical classification of the object.

One way: list of tags

One obvious way is to write the list of edges such as {1->2, 1->3,2->4, 2->5, 2->6, 3->7, 3->8}, which uniquely identifies the graph. It's not very compact -- vertexes with multiple children is repeated -- and neither very readable -- one need to do much mental processing and memorizing in order to understand and imagine the structure of the graph.

Another way: "graphlet"

Another way I designed is to write it into the following form, which I dubbed "graphlet representation" of (tree) graphs:

1`{2`{4, 5, 6}, 3`{7, 8}}

So, in graphlet representation:

The edges in a graph are represented by an edge character (backtick for instance);
Child vertexes are grouped by a pair of grouping characters ({ and } for instance).

Note that the graphlet representation is shorter than the flat list-of-edges representation.

Classification graphlet is more informative than list of tags

An object's classification is often represented as a list of tags

{tag1, tag2, ..., tag8}

However, if the classification is hierarchical, a graphlet representation can easily records more information about the classification structure:

tag1`{tag2`{tag4, tag5, tag6}, tag3`{tag7, tag8}}

Application examples

To represent the classification of a problem, a sequence of key words is often used, e.g.

astrophysics, cosmology, general-relativity, star, galaxy

One can actually additionally include its hierarchical classification structure by using a graphlet:

science`{physics`{astrophysics, cosmology, general-relativity}, astronomy`{star, galaxy}}

Advantages

Preserving the full graph structure, many graph characteristics can be exploited, and graph-theoretic methods can be used for analyzing metadata in this form. For example, graphlets can be systematically shorten/simplified by pruning leaves.