This is an add-on to my previous post, Wikis as Multigraphs of Text.
Right now, most source code is stored as text files in a file system, and this has worked out fine for the industry for decades. But there are other ways of storing programs. What if we stored source code in a wiki?
As a reminder, in my previous blog post I stated that a Wiki data type could be encoded as a multigraph, that is, a graph with at least two kinds of edges:
Outline, where we encode text block order for output, and
Hyperlink, where we track hyperlink references between blocks.
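To make the two edge kinds concrete, here is a minimal Python sketch of such a multigraph. The names Wiki, Edge, EdgeKind and neighbours are placeholders chosen for this illustration, not types from the previous post, and the storage layout (a dict of blocks plus a flat edge list) is just one assumption among many possible ones.

```python
from dataclasses import dataclass, field
from enum import Enum


class EdgeKind(Enum):
    OUTLINE = "outline"      # orders text blocks for output
    HYPERLINK = "hyperlink"  # records a reference from one block to another


@dataclass(frozen=True)
class Edge:
    kind: EdgeKind
    source: str  # id of the block the edge starts from
    target: str  # id of the block the edge points to


@dataclass
class Wiki:
    # Hypothetical layout: block id -> text, plus a flat list of typed edges.
    blocks: dict[str, str] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def neighbours(self, block_id: str, kind: EdgeKind) -> list[str]:
        """Targets of edges of one kind leaving a block, in insertion order."""
        return [e.target for e in self.edges
                if e.source == block_id and e.kind is kind]


wiki = Wiki()
wiki.blocks.update({"page": "Home", "intro": "Welcome.", "faq": "Questions."})
wiki.edges += [
    Edge(EdgeKind.OUTLINE, "page", "intro"),
    Edge(EdgeKind.OUTLINE, "page", "faq"),
    Edge(EdgeKind.HYPERLINK, "intro", "faq"),
    # The same pair of blocks can be joined by edges of different kinds.
    Edge(EdgeKind.HYPERLINK, "page", "faq"),
]

print(wiki.neighbours("page", EdgeKind.OUTLINE))
print(wiki.neighbours("page", EdgeKind.HYPERLINK))
```

Keeping the edge kind on the edge itself means the same set of blocks can be walked in outline order for output or followed by hyperlink for navigation.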
Why should we put source code in wikis?
For the purposes of this blog post, I’d like to set aside the question of whether this paradigm is useful and just say this: I think this is a fun thought experiment, not a call for change.
If/when I do implement some kind of system based on these ideas, I will report that in another post.
I think that a wiki could be an excellent way for a group of programmers to maintain a codebase over time. Utility scripts could be stored with their documentation, comments, and examples alongside. If the wiki were version controlled, then all of that information would be as well, right next to your code instead of in a different place. And if that wiki were a multigraph, then you could use graph theory to structure your code instead of a bytestream.
Here’s a made-up example: let’s say that there is a license/copyright comment block at the top of every file in your code base. In real-world code bases, there is often a script or some other automation that makes sure all of these comment blocks are up to date. If source code were encoded in a graph, then you could have a single vertex in the graph that holds the license text, and any number of references to it. If the system presents the code as a file system (as a particular view into your graph data), then all the files would have that same comment block at the top.
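The license example can be sketched in a few lines of Python. Everything here is hypothetical: the block ids, the file names, the license text, and the render_files helper are invented for illustration, and the graph is reduced to a dict of blocks plus an ordered list of block ids per file.

```python
# Hypothetical block store: block id -> text.
blocks = {
    "license": "# Copyright (c) Example Authors. MIT License.\n",
    "util_body": "def double(x):\n    return 2 * x\n",
    "main_body": "print('hello')\n",
}

# Hypothetical file outlines: file path -> ordered block ids.
# Both files reference the same single "license" vertex.
outlines = {
    "util.py": ["license", "util_body"],
    "main.py": ["license", "main_body"],
}


def render_files(blocks, outlines):
    """Present the graph as a file-system view: path -> file contents."""
    return {path: "".join(blocks[block_id] for block_id in ids)
            for path, ids in outlines.items()}


files = render_files(blocks, outlines)

# Editing the one license vertex changes every file on the next render,
# with no separate script needed to keep the headers in sync.
blocks["license"] = "# Copyright (c) Example Authors. Apache-2.0.\n"
updated = render_files(blocks, outlines)
```

The files never store their own copy of the header, so there is nothing to drift out of date.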
How does this relate to Literate Programming?
This is an evolution of Literate Programming, and as such, the tangle and weave functions from that paradigm are required. The weave function creates a website that allows users to view the graph data as prose. The tangle function creates the file system view of the graph data, with the source code in plaintext.
What types do we need for the multigraph?
To encode this change, the types would have to expand to include different kinds of text blocks and a new type of edge.
In the last post, the Wiki had only one type of block, which was just Text. To differentiate between kinds of blocks, we need at least two: Prose and Code. If we wanted our wiki to be a polyglot environment, the Code block could also record what language the code is in, either as Text or possibly as a sum type.
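As a rough sketch of those block types in Python: Prose and Code come from the paragraph above, while the Language enum and its three members, the Block alias, and the describe helper are assumptions added for illustration. The enum stands in for the sum-type option; swapping it for a plain string would give the Text option.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class Language(Enum):
    # Illustrative members only; a real wiki would list its own languages.
    PYTHON = "python"
    HASKELL = "haskell"
    SHELL = "shell"


@dataclass(frozen=True)
class Prose:
    text: str


@dataclass(frozen=True)
class Code:
    language: Language  # which language the block is written in
    text: str


# A block is either Prose or Code.
Block = Union[Prose, Code]


def describe(block: Block) -> str:
    """Return a short label showing how a block would be treated."""
    if isinstance(block, Code):
        return f"code ({block.language.value})"
    return "prose"


print(describe(Prose("Doubling is handy.")))
print(describe(Code(Language.PYTHON, "x = 1")))
```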
The Outline edge type can also be divided into two kinds, which I’m naming after the two Literate Programming functions: Tangle and Weave. Both the Tangle and Weave types could include a filepath as a property. So, during the tangle operation, the system would output the files as they should be for compilation, and the weave operation would generate the pages for the wiki for user viewing.
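Here is one way the two operations might look in Python, as a sketch rather than a design. The filepath property comes from the paragraph above. The position field, the OutlineEdge and Block names, the assemble helper, and the sample paths are all assumptions: the post does not say how ordering within a file would be stored, so an integer position is used here for simplicity.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    kind: str  # "prose" or "code"
    text: str


@dataclass(frozen=True)
class OutlineEdge:
    kind: str      # "tangle" or "weave"
    filepath: str  # output source file (tangle) or wiki page (weave)
    position: int  # assumed ordering of the block within that output
    block_id: str


def assemble(blocks, edges, kind):
    """Group block text by filepath along edges of one kind, in position order."""
    grouped = defaultdict(list)
    for edge in sorted(edges, key=lambda e: e.position):
        if edge.kind == kind:
            grouped[edge.filepath].append(blocks[edge.block_id].text)
    return {path: "\n".join(parts) + "\n" for path, parts in grouped.items()}


def tangle(blocks, edges):
    """Source files as they should be for compilation."""
    return assemble(blocks, edges, "tangle")


def weave(blocks, edges):
    """Wiki pages for user viewing, mixing prose and code."""
    return assemble(blocks, edges, "weave")


blocks = {
    "why": Block("prose", "Doubling is handy."),
    "double": Block("code", "def double(x):\n    return 2 * x"),
}
edges = [
    OutlineEdge("weave", "pages/double.md", 0, "why"),
    OutlineEdge("weave", "pages/double.md", 1, "double"),
    OutlineEdge("tangle", "src/double.py", 0, "double"),
]

source_files = tangle(blocks, edges)
wiki_pages = weave(blocks, edges)
```

The same code block appears in both outputs, but only the weave view carries the prose around it.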
This is a fairly new idea for me, and I don’t know a lot of the details yet. I’m going to read up on TinkerPop and Goblin and see if I can’t test out some of these ideas on my own.
Thanks for reading!