Jekyll2021-07-16T00:27:16-04:00http://cleare.st//The Clearest of Blue SkiesI'm technically a professional mathematician. I also like computers, words, and bad jokes. But I hate fun. Fun is banned.
Ilia ChtcherbakovSampling products from finite groups2021-07-12T13:56:59-04:002021-07-12T13:56:59-04:00http://cleare.st/math/sampling-finite-groups<p>I was recently reminded of a very simple chestnut about sampling elements from finite groups.
This is the kind of problem you give your intro group theory students to test whether or not they are paying attention, or at least to keep them busy; nothing special.
But the way it was worded suggested a harder version of the problem that immediately piqued my interest, and though I haven’t fully solved it yet, it ended up tying itself up in a neat little bow by the end of it.
<!--more--></p>
<p>So here’s the easy problem. What is the probability that <script type="math/tex">k</script> group elements sampled uniformly at random from the finite group <script type="math/tex">G</script> multiply together to form the identity?
To be precise, if <script type="math/tex">g_i \in G</script> were sampled uniformly randomly, what is <script type="math/tex">\mathrm P(g_1 \cdots g_k = 1)</script>?</p>
<p>I’m not going to beat around the bush and stall for time in order for you to pause and solve it any longer than it takes for you to finish reading this sentence, so look away if you must.
The answer is <script type="math/tex">1/\lvert G \rvert</script>, regardless of the value of <script type="math/tex">k</script>.
This is because, once you’ve sampled <script type="math/tex">g_1</script> through <script type="math/tex">g_{k-1}</script>, the final element must be exactly <script type="math/tex">(g_1 \cdots g_{k-1})^{-1}</script> in order to get the identity.</p>
<p>You can end the story here, and leave it as a cute problem to toss out if you ever wanna make sure someone remembers how groups work.</p>
<p>Or, you can do what I did, and mishear the problem statement as follows.
What is the probability that <script type="math/tex">k</script> group elements, sampled uniformly at random from the finite group <script type="math/tex">G</script>, multiply together <em>in some order</em> to form the identity?
To be precise, for <script type="math/tex">g_i</script> as before, what is</p>
<script type="math/tex; mode=display">p(G,k) := \mathrm P(\exists \sigma \in S_k: g_{\sigma_1} \cdots g_{\sigma_k} = 1)?</script>
<p>Now we’re talking about something interesting!
This problem seems hard in general. I haven’t found a solution, nor do I know where to find it in the literature, if it exists there at all.</p>
<p>For the sake of terminology, if <script type="math/tex">A</script> is a multiset of elements of <script type="math/tex">G</script>, say that a <strong>Wilsonian product</strong><sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> of <script type="math/tex">A</script> is some ordered product of all the elements of <script type="math/tex">A</script>. Let <script type="math/tex">w(A)</script> be the set of all Wilsonian products of <script type="math/tex">A</script>, so that we can restate</p>
<script type="math/tex; mode=display">p(G,k) = \mathrm P(1 \in w(\text{sample of size}\ k)).</script>
<p>So, some easy observations to begin with.
<script type="math/tex">p(G,k) \ge 1/\lvert G \rvert</script> because the easy version of this problem is a lower bound.
In fact, <script type="math/tex">p(G,1) = p(G,2) = 1/\lvert G \rvert</script>.</p>
<p>Let’s try to solve <script type="math/tex">p(G,3)</script>.
<script type="math/tex">g_3</script> must equal either <script type="math/tex">(g_1g_2)^{-1}</script> or <script type="math/tex">(g_2g_1)^{-1}</script> in order to make the identity possible, and those fail to be distinct elements iff <script type="math/tex">g_1</script> commutes with <script type="math/tex">g_2</script>.
So we find that</p>
<script type="math/tex; mode=display">p(G,3) = \frac{2 - \text{comm.prob.}}{\lvert G \rvert},</script>
<p>involving the <a href="https://en.wikipedia.org/wiki/Commuting_probability">commuting probability</a>, an atrocious number. Perhaps there is some slicker argument, but I don’t see any.
From where I’m standing this just looks like increasingly horrifying case analysis, so I propose we just let this go for now.</p>
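<p>If you would rather not trust the case analysis, the formula is small enough to check by exhaustive search. Below is a minimal sketch for <script type="math/tex">G = S_3</script>, with permutations stored as tuples; the helper names <code>compose</code> and <code>p_exact</code> are my own for illustration, and the brute force is only feasible for tiny groups and tiny <script type="math/tex">k</script>.</p>

```python
from fractions import Fraction
from itertools import permutations, product

def compose(p, q):
    """Product pq of two permutations stored as tuples: apply q first, then p."""
    return tuple(p[i] for i in q)

def p_exact(group, k):
    """p(G, k) by brute force over all |G|^k samples and all k! orderings."""
    identity = tuple(range(len(group[0])))
    hits = 0
    for sample in product(group, repeat=k):
        for order in permutations(sample):
            prod = identity
            for g in order:
                prod = compose(prod, g)
            if prod == identity:
                hits += 1
                break  # one good ordering is enough for this sample
    return Fraction(hits, len(group) ** k)

S3 = list(permutations(range(3)))

# Commuting probability: the fraction of ordered pairs (g, h) with gh = hg.
commuting_pairs = sum(compose(g, h) == compose(h, g) for g in S3 for h in S3)
comm_prob = Fraction(commuting_pairs, len(S3) ** 2)

print(p_exact(S3, 1), p_exact(S3, 2))              # both equal 1/|G|
print(p_exact(S3, 3), (2 - comm_prob) / len(S3))   # the two sides of the formula
```

<p>For <script type="math/tex">S_3</script> the commuting probability is <script type="math/tex">1/2</script>, so both sides of the formula come out to <script type="math/tex">1/4</script>.</p>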
<p>For the next round of observations, recall the derived group <script type="math/tex">G' = [G,G]</script> generated by all the commutators <script type="math/tex">[g,h] = ghg^{-1}h^{-1}</script>.
This is a normal subgroup of <script type="math/tex">G</script>, and the quotient <script type="math/tex">G/G'</script> is also known as the <strong>abelianization</strong> <script type="math/tex">\def\ab{\textrm{ab}}G^\ab</script>.</p>
<p>These are relevant because across all <script type="math/tex">\sigma \in S_k</script>, <script type="math/tex">g_{\sigma_1} \cdots g_{\sigma_k}</script> always has the same image under the quotient map <script type="math/tex">G \to G^\ab</script>.
So for instance, if the image <script type="math/tex">\overline{g_1 \cdots g_k}</script> is not equal to <script type="math/tex">\overline1 \in G^\ab</script>, then there’s no possible reordering that could give a product of the identity.
This amounts to an upper bound <script type="math/tex">p(G,k) \le 1/\lvert G^\ab \rvert</script>.
If <script type="math/tex">G</script> is abelian then <script type="math/tex">G = G^\ab</script> so <script type="math/tex">p(G,k) = 1/\lvert G \rvert</script>.</p>
<p>Seeing no more convenient inroads for specific <script type="math/tex">k</script>, we perform the standard mathematical give-up maneuver of turning to the asymptotics, in this case letting <script type="math/tex">k \to \infty</script>. Let’s abbreviate</p>
<script type="math/tex; mode=display">p(G,\infty) = \lim_{k\to\infty} p(G,k).</script>
<p>It behooves me to justify the existence of this limit in order to say that, but I’ll do you one better, by pinning down its value over the course of the rest of this ‘blog post. In fact, let’s get right into it:</p>
<blockquote>
<p><strong>Proposition.</strong> <script type="math/tex">p(G,\infty) = 1/\lvert G^\ab \rvert</script>.</p>
<p><strong>Proof Sketch.</strong> <script type="math/tex">p(G,k) \le 1/\lvert G^\ab \rvert</script> so <script type="math/tex">p(G,\infty) \le 1/\lvert G^\ab \rvert</script> also.
We may assume that <script type="math/tex">\overline{g_1 \cdots g_k} = \overline1</script> and our task is to find a reordering equal to <script type="math/tex">1</script> with arbitrarily high probability.
For any multiset <script type="math/tex">A</script> of elements of <script type="math/tex">G</script>, we may assume that <script type="math/tex">A</script> is a submultiset of our sample with arbitrarily high probability by taking <script type="math/tex">k</script> sufficiently large.
Suppose there exists a choice of multiset <script type="math/tex">A</script> such that the Wilsonian products <script type="math/tex">w(A)</script> contain every element of <script type="math/tex">\overline A{}^{-1} \in G^\ab</script>.
Then, for any ordered product of <script type="math/tex">(g_i) - A</script>, there exists an inverse in <script type="math/tex">w(A)</script>, so pick an arbitrary order of <script type="math/tex">(g_i) - A</script> and use the corresponding ordering of <script type="math/tex">A</script> to cancel it. ∎</p>
</blockquote>
<p>This limit trick reduces the proof of the proposition to the task of finding such an <script type="math/tex">A</script>—for convenience I’ll call it a <strong>flexible</strong> multiset—and it is due to Discord user smorc.
The smaller <script type="math/tex">|A|</script> is, the better the concrete bounds, but any flexible <script type="math/tex">A</script> will suffice to prove this in the limit.</p>
<p>My contribution to the problem was to actually construct a flexible <script type="math/tex">A</script>, first for <script type="math/tex">G = S_n</script> and then for all finite groups <script type="math/tex">G</script>.
It will have <script type="math/tex">w(A) = G'</script>, which is to say <script type="math/tex">\overline A = \overline 1 \in G^\ab</script>, so in what follows I will be talking as if that is a given,
even though strictly speaking it is a coincidence and flexibility is more general than that.</p>
<p>Recall that <script type="math/tex">G'</script> is generated by commutators <script type="math/tex">[g,h] = ghg^{-1}h^{-1}</script>.
Let <script type="math/tex">S</script> be a <strong>generating multiset</strong> of <script type="math/tex">G'</script> consisting entirely of commutators.
A generating multiset is like a generating set, but it has the multiplicities baked into it already, so that every element of <script type="math/tex">G'</script> can be expressed as an ordered subproduct<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup> of <script type="math/tex">S</script>.</p>
<p>These obviously exist, because you can take <script type="math/tex">S</script> to be a sufficiently large multiple of a plain old generating set, and decompose all its products of commutators into their factors.
We incur no penalty to the size of <script type="math/tex">S</script> in doing this, and because of the way we turn <script type="math/tex">S</script> into <script type="math/tex">A</script>—you’ll see in just a moment—it actually allows for more potential reductions in size.</p>
<p>Finally, let <script type="math/tex">A</script> be the multiset</p>
<script type="math/tex; mode=display">\sum_{[g,h] \in S} (g, g^{-1}, h, h^{-1}).</script>
<p>That is to say, for every commutator in <script type="math/tex">S</script>, take into <script type="math/tex">A</script> its four factors.
By construction, <script type="math/tex">w(A) = G'</script>. To see this, for <script type="math/tex">c \in G'</script>, write it as an ordered subproduct of <script type="math/tex">S</script>, so that it is also an ordered subproduct of <script type="math/tex">A</script>;
then for every remaining element of <script type="math/tex">A</script>, its inverse is also present, so you can tack them onto the end in inverse pairs.
This is not the most efficient <script type="math/tex">A</script> you can get out of <script type="math/tex">S</script>, but it’s the easiest to specify and justify.</p>
<p>And that’s the whole problem. This <script type="math/tex">A</script> slots right into the proof sketch above and we’re done!</p>
<p>That felt a little short to me, so let’s screw around a little more with examples and “historical” notes, before I end this post.</p>
<p>For <script type="math/tex">G = S_n</script>, we have <script type="math/tex">G' = A_n</script> and <script type="math/tex">G^\ab = \{\text{even}, \text{odd}\}</script>.
We can take <script type="math/tex">A</script> to be the multiset containing every transposition of <script type="math/tex">S_n</script> twice. You can even do a hands-on argument, as I originally did, where you handle arbitrary <script type="math/tex">\tau \in A_n</script> by doing each cycle separately and finishing off the induction by handling the case of double transpositions.</p>
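<p>The claim for small <script type="math/tex">n</script> can be checked by machine. The sketch below computes <script type="math/tex">w(A)</script> for the doubled-transposition multiset by recursing over submultisets rather than over all orderings; the function names are hypothetical helpers of mine, and parity is computed by counting inversions.</p>

```python
from functools import lru_cache
from itertools import combinations, permutations

def compose(p, q):
    """Product pq of two permutations stored as tuples: apply q first, then p."""
    return tuple(p[i] for i in q)

def wilsonian_products(elements):
    """The set w(A) of all ordered products of the multiset A (given as a list)."""
    distinct = sorted(set(elements))
    counts = tuple(elements.count(g) for g in distinct)
    identity = tuple(range(len(distinct[0])))

    @lru_cache(maxsize=None)
    def products(remaining):
        # remaining[i] = how many copies of distinct[i] are still unused
        if not any(remaining):
            return frozenset([identity])
        out = set()
        for i, c in enumerate(remaining):
            if c:
                rest = remaining[:i] + (c - 1,) + remaining[i + 1:]
                out.update(compose(distinct[i], tail) for tail in products(rest))
        return frozenset(out)

    return products(counts)

def transpositions(n):
    """All transpositions of S_n, as tuples."""
    result = []
    for i, j in combinations(range(n), 2):
        p = list(range(n))
        p[i], p[j] = p[j], p[i]
        result.append(tuple(p))
    return result

def is_even(p):
    """Parity via inversion count."""
    inversions = sum(p[i] > p[j] for i, j in combinations(range(len(p)), 2))
    return inversions % 2 == 0

n = 4
A = transpositions(n) * 2  # every transposition, twice
alternating = {p for p in permutations(range(n)) if is_even(p)}
print(wilsonian_products(A) == alternating)
```

<p>This only confirms the statement for the particular <script type="math/tex">n</script> you run it with; the state space grows like <script type="math/tex">3^{\binom n2}</script>, so it is a sanity check and not a substitute for the induction.</p>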
<p>I was pretty proud of this because I hadn’t had a nice problem in a while and I got to put to use my experience with fiddling with <script type="math/tex">A_n</script>.
You see, in 2016, I had to prove <script type="math/tex">A_n</script> was simple for <script type="math/tex">n\ge5</script> as part of an algebra assignment, so I am regrettably quite practiced with permutation groups.</p>
<p>In between solving this symmetric group case and finding an elementary solution for the general case, Forte Shinko brought to my attention the following theorem:</p>
<blockquote>
<p><strong>Theorem (Dénes–Hermann 1982).</strong> <script type="math/tex">w(G)</script> is a coset of <script type="math/tex">G'</script>.</p>
</blockquote>
<p>For such a short theorem statement, it is a remarkably strong fact.
<script type="math/tex">w(G)</script> is the entire coset, which means that if you take <script type="math/tex">A = G</script>—that is, every element of <script type="math/tex">G</script> taken exactly once—in our sampling problem, it will make a flexible set.
A more detailed version of this result actually characterizes which coset this is, and in most cases that coset is just <script type="math/tex">G'</script> itself, with only a small list of specific conditions resulting in the alternative, but for our purposes that is irrelevant.</p>
<p>Dénes–Hermann has a mightily complicated proof for its concise conclusion.
The first completed proof relied on the Feit–Thompson theorem—odd order implies solvable—of all things!
Feit–Thompson was the crown jewel of group theory at the time, and the Classification of Finite Simple Groups was a year away from being announced (and over two decades away from being correctly sketched (and as of 2021 we’re <em>still</em> waiting for it to be published)) so it was the biggest tool in the toolbox.</p>
<p>Subsequent work in this area, then and still known as the problem of L. Fuchs, focused on excising the dependency on Feit–Thompson, and was to my knowledge completed in 1972, but only published twenty years later (Yff 1991) for some reason?
The whole timestream is wacky around this problem:
Golomb started his inquiries into Wilsonian products of <script type="math/tex">G</script> in 1951, but did not conjecture [what is now known as the theorem of] Dénes–Hermann until 1970,
by which time the problem was already well-known (Fuchs 1964) and solved in the case that <script type="math/tex">G</script> is solvable (Rhemtulla 1969<sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>).</p>
<p>Luckily there is a much more elementary argument in our case. It’s a dinky little problem but you could probably guide a student of group theory through it without trouble.</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>This nomenclature comes to us more or less by way of Golomb. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>In case it is not clear, by this I mean an ordered product of a submultiset of <script type="math/tex">S</script>. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>This paper isn’t online for whatever reason, so if you need “help” finding it, drop me a line. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Ilia Chtcherbakov
The forbidden subword method2021-05-25T17:56:18-04:002021-05-25T17:56:18-04:00http://cleare.st/math/forbidden-subwords-method<p>I haven’t felt like writing anything hard or long, but I have felt like writing, so here is something easy that I have had cause to think about over the past couple days.
This is a post about a basic tool in the enumeration toolkit, which I have used countless times since I learned of it, but strangely I have never heard anyone else talk about it.
There are less general techniques that are well-known, and there are more general techniques that are significantly more complicated to use, but this Goldilocks zone seems woefully underdisturbed.
<!--more--></p>
<p>I learned this method in my intro to combinatorics course, taught by Chris Godsil in 2014.
I don’t think the rest of the class liked him all that much, because the class was early in the morning and he, in the parlance of the times, goes hard in the paint.
But I ended up taking four whole courses with him, so suffice it to say he left a good impression on me.
I am now feeling the compulsion to wax nostalgic for those days, so let me move on quickly.</p>
<p>A <em>language</em> is simply a subset of all the words—finite sequences—made from some alphabet <script type="math/tex">\Sigma</script> having <script type="math/tex">s</script> letters.
Suppose that <script type="math/tex">s</script> is finite, and you have a finite set of words <script type="math/tex">F \subseteq \Sigma^*</script>.
Then the basic form of the forbidden subword method gives you a generating function</p>
<script type="math/tex; mode=display">A(x) = \sum_{w \in L} x^{|w|} = \sum_{n \in \mathbb N} (\text{# of words of length}\ n)x^n</script>
<p>for the language <script type="math/tex">L</script> consisting of all words that do not contain anything in <script type="math/tex">F</script> as a subword.</p>
<p>Before I begin, I should hope I do not have to extol the virtues of generating functions to you, dear reader. They are a fundamental piece of enumerative technology, and they are exceedingly versatile.
I will give a sample of the applications for which you can use this method, but it is by no means exhaustive, and is truncated for concision.
If you want something to read, I perennially recommend Wilf’s book <a href="https://www2.math.upenn.edu/~wilf/DownldGF.html">generatingfunctionology</a>.</p>
<p>In case you need a refresher on how languages work, let me prime you with a couple of quick examples.</p>
<ul>
<li>The empty word is denoted <script type="math/tex">\epsilon</script>, and is a perfectly valid word.
It has length zero, so the generating function of the singleton language <script type="math/tex">\{\epsilon\}</script> is the constant <script type="math/tex">1</script>.</li>
<li><script type="math/tex">\Sigma</script> is a set of letters, which are words of length 1,
so its generating function is the monomial <script type="math/tex">sx</script>.</li>
<li>
<p>The set of all words <script type="math/tex">\Sigma^*</script> has all <script type="math/tex">s^n</script> possible words of length <script type="math/tex">n</script>, so its generating function is the geometric series</p>
<script type="math/tex; mode=display">\sum_{n \in \mathbb N} s^n x^n = \frac1{1 - sx}.</script>
</li>
<li>The generating function of the disjoint union of two languages is the sum of their generating functions.</li>
<li>
<p>If you have two languages <script type="math/tex">L</script> and <script type="math/tex">M</script>, their concatenation is the language</p>
<script type="math/tex; mode=display">L \cdot M = \{ ww' : w \in L, w' \in M \}.</script>
<p>The generating function of <script type="math/tex">L \cdot M</script> is the product of the generating functions of <script type="math/tex">L</script> and <script type="math/tex">M</script>.<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup></p>
</li>
</ul>
<hr />
<p>To start, let’s walk through the method’s internal logic in a simple case, with two letters and one word: <script type="math/tex">\Sigma = \{\texttt h,\texttt t\}</script> and <script type="math/tex">F = \{\texttt{hth}\}</script>.
The trick is to define an auxiliary language <script type="math/tex">M</script>, consisting of all words having precisely one occurrence of <script type="math/tex">\texttt{hth}</script>, and that occurrence being right at the end.
<script type="math/tex">L</script> and <script type="math/tex">M</script> are related to each other in some simple ways, and these will enable us to find a system of equations among them, which can be transformed into a system of equations in their generating functions.</p>
<p>The first observation comes from considering the language product <script type="math/tex">L \cdot \Sigma</script>. Every such concatenation is a nonempty word, and either it contains no occurrences of <script type="math/tex">\texttt{hth}</script>, or it contains precisely one, right at the end.
Accounting for the fact that the empty word <script type="math/tex">\epsilon</script> does indeed belong to <script type="math/tex">L</script>, we find the following equation of languages:</p>
<script type="math/tex; mode=display">\def\Cup{\mathbin{\mkern2mu\cup\mkern2mu}}\{\epsilon\} \Cup L \cdot \Sigma = L \Cup M.</script>
<p>The value of this observation is that it translates directly into an equation of generating functions. Letting <script type="math/tex">A(x)</script> be the generating function for <script type="math/tex">L</script> and <script type="math/tex">B(x)</script> for <script type="math/tex">M</script>, we see that</p>
<script type="math/tex; mode=display">1 + A(x) \cdot (2x) = A(x) + B(x).</script>
<p>If we can find a second, independent equation of languages, we will have a system in which we can solve for <script type="math/tex">A(x)</script>.</p>
<p>And for this, we make a second observation: if you simply append the entirety of <script type="math/tex">\texttt{hth}</script> to a word in <script type="math/tex">L</script>, the result definitely has occurrences of the forbidden word; you just need to track what they look like.
It is possible that you have a word in <script type="math/tex">L</script> ending in <script type="math/tex">\texttt{ht}</script>, such that the first <script type="math/tex">\texttt h</script> completes an occurrence of the forbidden word, and then there is a <script type="math/tex">\texttt{th}</script> hanging on afterwards.
But if not, then you simply get a word in <script type="math/tex">M</script>.</p>
<p>Putting this all together, we find</p>
<script type="math/tex; mode=display">L\cdot\texttt{hth} = M\cdot\texttt{th} \Cup M,</script>
<p>which overall gives us the following system of generating functions:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
1 + 2xA(x) &= A(x) + B(x) \\
x^3A(x) &= x^2B(x) + B(x)
\end{align*} %]]></script>
<p>Then you solve this system for <script type="math/tex">A(x)</script> just like you would in grade school: substitute <script type="math/tex">B(x) = \frac{x^3}{1+x^2}A(x)</script> into the top equation and rearrange to find</p>
<script type="math/tex; mode=display">A(x) = \frac{1}{1 - 2x + \frac{x^3}{1+x^2}} = \frac{1 + x^2}{1 - 2x + x^2 - x^3}.</script>
<p>Now you can do anything you can normally do with generating functions, such as extract coefficients.
Here’s an example of a spicier thing you can do: evaluating <script type="math/tex">x\frac{d}{dx}B(x)</script> at <script type="math/tex">1/2</script> gives the expected number of coin flips until you observe the sequence <script type="math/tex">\texttt{hth}</script>.
Feel free to verify that the answer in this case is 10.</p>
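<p>Both claims are easy to check by machine. The sketch below compares the coefficients of <script type="math/tex">A(x)</script>, read off by long division, against a brute-force count, and then evaluates the derivative of <script type="math/tex">B(x) = \frac{x^3}{1 - 2x + x^2 - x^3}</script> exactly at <script type="math/tex">1/2</script> using the quotient rule. The function names are my own.</p>

```python
from fractions import Fraction
from itertools import product

def brute_force_counts(n_max):
    """Number of h/t words of each length up to n_max that do not contain hth."""
    return [sum("hth" not in "".join(w) for w in product("ht", repeat=n))
            for n in range(n_max + 1)]

def series_counts(n_max):
    """Coefficients of (1 + x^2) / (1 - 2x + x^2 - x^3) by long division."""
    numerator = [1, 0, 1]
    a = []
    for n in range(n_max + 1):
        c = numerator[n] if n < len(numerator) else 0
        # The denominator gives a_n = c_n + 2 a_{n-1} - a_{n-2} + a_{n-3}.
        if n >= 1:
            c += 2 * a[n - 1]
        if n >= 2:
            c -= a[n - 2]
        if n >= 3:
            c += a[n - 3]
        a.append(c)
    return a

def expected_flips():
    """x B'(x) at x = 1/2, where B = N / D, via the quotient rule."""
    x = Fraction(1, 2)
    N, dN = x**3, 3 * x**2
    D, dD = 1 - 2 * x + x**2 - x**3, -2 + 2 * x - 3 * x**2
    return x * (dN * D - N * dD) / D**2

print(series_counts(10) == brute_force_counts(10))
print(expected_flips())
```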
<hr />
<p>We are ready to examine the general case, where <script type="math/tex">\Sigma</script> and <script type="math/tex">F \subseteq \Sigma^*</script> are arbitrary finite sets.</p>
<p>As before, we define <script type="math/tex">L</script> to have no subwords from <script type="math/tex">F</script>.
Now, there is one auxiliary language <script type="math/tex">L_f</script> for each <script type="math/tex">f \in F</script>, having no occurrence of any forbidden subwords except for a single occurrence of <script type="math/tex">f</script> at the end of the word.
Without loss of generality, we may assume no word in <script type="math/tex">F</script> is a proper subword of another word in <script type="math/tex">F</script>: if some <script type="math/tex">f</script> is properly contained in another <script type="math/tex">f'</script>, then <script type="math/tex">L_{f'}</script> is empty because any word ending in <script type="math/tex">f'</script> will contain an occurrence of <script type="math/tex">f</script> also.</p>
<p>Then we construct <script type="math/tex">\lvert F \rvert + 1</script> equations. The first is, as before, <script type="math/tex">\{\epsilon\} \cup L\cdot\Sigma = L \cup \bigcup_{f \in F} L_f</script>, which translates to</p>
<script type="math/tex; mode=display">1 + sx A(x) = A(x) + \sum_{f \in F} A_f(x).</script>
<p>Then for each <script type="math/tex">f \in F</script>, we obtain an equation whose LHS is <script type="math/tex">L\cdot f</script>.
On the RHS, for each triple <script type="math/tex">(u,v,w)</script> of words such that <script type="math/tex">|v|\ge1</script>, <script type="math/tex">uv \in F</script>, and <script type="math/tex">vw = f</script>, there will be an <script type="math/tex">L_{uv}\cdot w</script> term.</p>
<p>One way to manage this complexity is to package the <script type="math/tex">w</script>’s into sets of “quotients”</p>
<script type="math/tex; mode=display">\alpha\backslash f = \{ w : \alpha = uv, \lvert v \rvert \ge 1, f = vw \},</script>
<p>so that</p>
<script type="math/tex; mode=display">L\cdot f = \bigcup_{(u,v,w)} L_{uv}\cdot w = \bigcup_{f' \in F} L_{f'}\cdot(f'\backslash f).</script>
<p>This translates into the equation</p>
<script type="math/tex; mode=display">x^{|f|}A(x) = \sum_{f' \in F} A_{f'}(x) Q_{f'\backslash f}(x)</script>
<p>where <script type="math/tex">Q_{\alpha\backslash f}(x)</script> is the generating function of <script type="math/tex">\alpha\backslash f</script>, which will always be a polynomial of degree strictly less than <script type="math/tex">\lvert f\rvert</script> where each coefficient is 0 or 1.</p>
<p>Now you simply solve this as a system of equations, and you’re done.</p>
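<p>If you only want coefficients and not a closed form, the system can also be unrolled one degree at a time. Here is a minimal sketch under the assumptions stated above (every forbidden word is nonempty and none is a proper subword of another), so that the only empty word among the quotients is in <script type="math/tex">f\backslash f</script> and each equation can be solved for <script type="math/tex">A_f</script>. The names <code>quotient</code>, <code>count_avoiding</code> and <code>brute_force</code> are illustrative, not from any library.</p>

```python
from itertools import product

def quotient(alpha, f):
    """The set alpha\\f of words w with alpha = u v, f = v w, and v nonempty."""
    out = set()
    for n in range(1, min(len(alpha), len(f)) + 1):
        if alpha[-n:] == f[:n]:  # v is a suffix of alpha and a prefix of f
            out.add(f[n:])
    return out

def count_avoiding(alphabet, forbidden, n_max):
    """Coefficients of A(x) up to x^n_max, unrolled from the system of equations."""
    s = len(alphabet)
    F = list(forbidden)
    # Lengths of the nonempty w in g\f; the empty w (only in f\f) is the A_f term.
    shifts = {(g, f): [len(w) for w in quotient(g, f) if w] for g in F for f in F}
    A = [0] * (n_max + 1)
    Af = {f: [0] * (n_max + 1) for f in F}
    for n in range(n_max + 1):
        for f in F:
            # x^|f| A = sum_g A_g Q_{g\f}, rearranged to isolate A_f.
            total = A[n - len(f)] if n >= len(f) else 0
            for g in F:
                total -= sum(Af[g][n - m] for m in shifts[(g, f)] if m <= n)
            Af[f][n] = total
        # 1 + s x A = A + sum_f A_f, rearranged to isolate A.
        A[n] = (s * A[n - 1] if n else 1) - sum(Af[f][n] for f in F)
    return A

def brute_force(alphabet, forbidden, n_max):
    """Direct count of words with no forbidden subword, for comparison."""
    return [sum(not any(f in "".join(w) for f in forbidden)
                for w in product(alphabet, repeat=n))
            for n in range(n_max + 1)]

print(quotient("hth", "hth"))
print(count_avoiding("ht", ["hth"], 4))
print(count_avoiding("ht", ["hht", "thh"], 8) == brute_force("ht", ["hht", "thh"], 8))
```

<p>The recursion is well-founded because every term on the right-hand sides has strictly lower degree than the coefficient being computed.</p>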
<hr />
<p>And now, here’s a handful of quick applications and extensions of this method.</p>
<ol>
<li>
<p><em>What’s the probability of seeing at least <script type="math/tex">k</script> heads in a row if you flip a coin <script type="math/tex">n</script> times?</em></p>
<p>Take <script type="math/tex">\Sigma = \{\texttt h, \texttt t\}</script> and <script type="math/tex">F = \{ \texttt h^k \}</script>.
Once you have that <script type="math/tex">A(x) = \frac{1-x^k}{1-2x+x^{k+1}}</script>, you can scale these counts down to probabilities and pass to the complementary event by computing <script type="math/tex">\frac{1}{1-x} - A\bigl(\frac x2\bigr)</script>. Then you simply extract the <script type="math/tex">n</script>-th coefficient.</p>
</li>
<li>
<p><em>How many sequences of coinflips having <script type="math/tex">h</script> heads and <script type="math/tex">t</script> tails avoid the word <script type="math/tex">\texttt{htth}</script>?</em></p>
<p>Take <script type="math/tex">\Sigma = \{\texttt h, \texttt t\}</script> and <script type="math/tex">F = \{\texttt{htth}\}</script>, but this time use multiple variables in your generating functions: <script type="math/tex">x</script> for <script type="math/tex">\texttt h</script> and <script type="math/tex">y</script> for <script type="math/tex">\texttt t</script>. Now your system of equations looks like:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
1 + (x+y)A(x,y) &= A(x,y) + B(x,y) \\
x^2y^2A(x,y) &= (xy^2+1)B(x,y)
\end{align*} %]]></script>
</li>
<li>
<p><em>My friends and I are playing a game where each of us picks a sequence of letters in <script type="math/tex">\Sigma</script>, and then we sample it uniformly until one of us sees our word… What are my chances of winning?</em></p>
<p>Take <script type="math/tex">F</script> to be the set of all the players’ words, yours included.
The chance of <script type="math/tex">f \in F</script> winning is <script type="math/tex">A_f(1/s)</script>,
unless <script type="math/tex">f</script> is a superword of some other word, in which case it will either sometimes tie with its suffixes or always lose.
You can verify that <script type="math/tex">\sum_{f \in F} A_f(1/s) = 1</script> always.</p>
</li>
<li>
<p><em>How do I count lattice paths in the plane that can go in any direction but do not backtrack?</em></p>
<p>Take <script type="math/tex">\Sigma = \{\texttt n,\texttt e,\texttt s,\texttt w\}</script> and <script type="math/tex">F = \{\texttt{ns},\texttt{sn},\texttt{we},\texttt{ew}\}</script>.
You can speed up your working by observing that <script type="math/tex">A_{\texttt{ns}} = A_{\texttt{sn}}</script> and <script type="math/tex">A_{\texttt{we}} = A_{\texttt{ew}}</script>, by a simple bijection argument.</p>
</li>
<li>
<p><em>Can <script type="math/tex">F</script> be infinite?</em></p>
<p>Technically, yes.
There are only finitely many words of any length so the RHS of each equation will always converge<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup> and only finitely many equations will affect the prefixes of the generating functions <script type="math/tex">A(x)</script> and <script type="math/tex">A_f(x)</script> so <script type="math/tex">F</script> can be infinite.
However, this will become extremely difficult to solve by hand, unless <script type="math/tex">F</script> is nice in some analyzable way. If you can put this to use I’d love to see it.</p>
</li>
<li>
<p><em>Can <script type="math/tex">\Sigma</script> be infinite?</em></p>
<p>It can, but it has to at least be <em>locally finite</em>, meaning that only finitely many letters have any particular weight (cf. example 2).
Equivalently, it needs to have a well-defined generating function.
That said, interpreting what it is you’ve just counted becomes a little more nuanced in this case, so I would not take this as an easy generalization.</p>
</li>
<li>
<p><em>I want the answer to be Fibonacci!</em></p>
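<p>Uh, okay, I guess. Take <script type="math/tex">\Sigma = \{0,1\}</script> and <script type="math/tex">F = \{11\}</script> to find that <script type="math/tex">A(x) = \frac{1+x}{1-x-x^2}</script>, whose coefficients <script type="math/tex">1, 2, 3, 5, 8, \dots</script> are the Fibonacci numbers.</p>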
</li>
</ol>
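<p>To make the game in example 3 concrete: plugging <script type="math/tex">x = 1/s</script> into the system turns it into a linear system over the rationals, where the first equation collapses to <script type="math/tex">\sum_{f \in F} A_f(1/s) = 1</script>. The sketch below solves it exactly for two coin-flip words; the function names and the choice of words are mine, and it assumes no word is a subword of another and that the system is nonsingular.</p>

```python
from fractions import Fraction

def quotient_lengths(alpha, f):
    """Lengths of the words w with alpha = u v, f = v w, and v nonempty."""
    return [len(f) - n for n in range(1, min(len(alpha), len(f)) + 1)
            if alpha[-n:] == f[:n]]

def win_probabilities(words, s):
    """Solve the system at x = 1/s exactly, returning A_f(1/s) for each word f."""
    x = Fraction(1, s)
    k = len(words)
    # Unknowns: A_f(1/s) for each word, then A(1/s); the last column is the RHS.
    rows = []
    for f in words:
        row = [sum(x ** m for m in quotient_lengths(g, f)) for g in words]
        rows.append(row + [-x ** len(f), Fraction(0)])
    # First equation at x = 1/s: the A terms cancel, leaving sum of A_f = 1.
    rows.append([Fraction(1)] * k + [Fraction(0), Fraction(1)])
    # Gauss-Jordan elimination over the rationals.
    n = k + 1
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return {f: rows[i][n] for i, f in enumerate(words)}

print(win_probabilities(["hht", "thh"], 2))
```

<p>With these two words, <script type="math/tex">\texttt{thh}</script> wins three times out of four, even though both words are equally likely to appear in any given window of three flips.</p>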
<hr />
<p>Finally, I will extemporize a little on the other techniques that I mentioned earlier, and how this method sits among them.</p>
<p>The first that comes to mind is that all these languages <script type="math/tex">L</script> are <a href="https://en.wikipedia.org/wiki/Regular_language">regular languages</a>, which means in principle you can write out their regular expressions and then multiply together the appropriate generating functions without solving a system of equations.
True as that is, it is rare that the regular expression appropriate for the problem is simpler to find than writing out and solving this system of equations. Regular expressions are atrocious to work with in all but the nicest circumstances.</p>
<p>The other technique that comes to mind is the more powerful “state machine” method where you write down the adjacency matrix <script type="math/tex">A</script> of a digraph of states and transitions, and then compute</p>
<script type="math/tex; mode=display">E_T^\top(I - xA)^{-1}E_S^{\vphantom\top} = \frac{E_T^\top\operatorname{adj}(I-xA)E_S^{\vphantom\top}}{\det(I-xA)},</script>
<p>where <script type="math/tex">E_S = \sum_{s \in S} e_s</script> is the sum over all permissible source states <script type="math/tex">S \subseteq V</script> and <script type="math/tex">E_T</script> the target states.</p>
<p>This can certainly emulate and even trounce the forbidden subword method in some special cases, but for the generic problem that the forbidden subword method solves easily, this technique is bloated and difficult.
See, if <script type="math/tex">\ell</script> is the length of the longest word in <script type="math/tex">F</script> then naively you need <script type="math/tex">s^{\ell-1}</script> states, and reducing the number of states usually requires increasing the complexity of the matrix you’ve just described, which makes the calculations not all that much easier.</p>
<p>It’s all about picking the right tool for the job, in the end. And the forbidden subword method is the right tool for a surprising number of jobs.
And it’s super easy to remember how it goes, too: I internalized it once in 2014 and I haven’t forgotten it since.</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Technically this only works if each word in <script type="math/tex">L \cdot M</script> can be written uniquely as the concatenation of a word in <script type="math/tex">L</script> with a word in <script type="math/tex">M</script>. Otherwise, the product of generating functions will enumerate <script type="math/tex">L \cdot M</script> as a multiset, i.e. with multiplicities considered. In the language products considered in this post, this will never be an issue, but technically you’ve gotta be careful. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>To be clear, this is convergence in the <script type="math/tex">x</script>-adic topology, aka convergence as a formal power series. I know people usually talk about formal power series as not worrying about converging, but technically it’s a form of converging. This is not a generating functions post, so please do not make me get out the generating functions pedantry, I just wanna do a cute little bit of enumeration. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Ilia ChtcherbakovI haven’t felt like writing anything hard or long, but I have felt like writing, so here is something easy that I have had cause to think about over the past couple days.
This is a post about a basic tool in the enumeration toolkit, which I have used countless times since I learned of it, but strangely I have never heard anyone else talk about it.
There are less general techniques that are well-known, and there are more general techniques that are significantly more complicated to use, but this Goldilocks zone seems woefully underdisturbed.Chris Barker’s Iota-Jot-Zot family of esolangs2020-12-19T22:30:12-05:002020-12-19T22:30:12-05:00http://cleare.st/code/iota-jot-zot<p>The alternative, more poetic title that I was tempted to give this post was <em>A Most Esoteric Tragedy</em>.
I relented for three reasons. Morally: I detest clickbait.
Practically: it would unhelpfully obscure the topic on the rare occasion that someone might actually find this ‘blog helpful. (Ha!)
Most of all, emotionally: it would be comically hammy for me to entertain the idea that the contrivance contained in this analysis could be called a tragedy.
But hopefully, the fact that I spent the entire flavour text being snobby about this alternative title sufficiently conveys the self-indulgence and megalomania I feel for having come up with it.
<!--more--></p>
<p>Esoteric programming languages, or <em>esolangs</em>, are, as their name suggests,
deliberately minimalist or extravagant or otherwise obfuscated programming languages, which are mostly of recreational—at best, theoretical—value to the disciplines of computer science.
Having cut my teeth on them in high school, I feel a deep affinity for them, and their existence is what makes the idea of programming bearable to me.
For this reason, I will endeavour to make this readable to what I recall to be myself a decade ago, so if you are a recurring reader and you think I emphasize or clarify something that is out of character for me,
rest easy, because I am merely trying to insult my past self’s level of intelligence, not yours.</p>
<p>There are many genres of esolang, but in this post we will focus on one small family, all invented by linguist Chris Barker in 2001 and 2002, that falls into what might be called the <em>combinatory logic</em> genre.
I plan to explain these three languages, both in the literal sense of the specification, and in the metaphorical sense of ‘by what mechanisms and principles do they work’.
Partly to cap off my own recent study of them, and partly to give an alternative perspective to the internet and anyone else who has the same archaeological curiosity as my past self.</p>
<p><strong>Table of contents.</strong></p>
<ul>
<li><a href="#warmup">Warmup for combinatory logic esolangs</a></li>
<li><a href="#iota">Iota, the simplest one</a></li>
<li><a href="#jot">Jot and desyncategorematicization</a></li>
<li><a href="#zot">Zot and the question of I/O</a></li>
<li><a href="#conclusion">Conclusion and exercises</a></li>
</ul>
<p>To set the scene, combinatory logic is a rudimentary model of computation, which exists to be compared and contrasted with the lambda calculus.
Where lambda calculus seeks to reason about functions and computation by defining and reasoning about functions <em>analytically</em>,
combinator calculi<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> work <em>synthetically</em>, building complicated functions from a small set of simpler ones.
They both describe the same thing, and can live peacefully in the same world,
so I will be using ‘terms’ and ‘combinators’ almost interchangeably, but they have slightly different theoretical connotations which don’t matter unless you are a pedant like me.
I will be using both of them heavily in what follows, so familiarity with both is ideal;
but as with most esolang-related computer science, you can probably fake your way through it even if you aren’t, and have merely skimmed the <a href="https://en.wikipedia.org/wiki/Lambda_calculus">Wikipedia</a> <a href="https://en.wikipedia.org/wiki/Combinatory_logic">articles</a>.</p>
<p>Finally, I will admit up front that the analysis I will be doing is ahistorical.
I will be: assigning a sequential order to developments which did not necessarily occur in that (or any!) order; poetically assigning motivations and narratives that I have imagined; and speculating wildly overall.
Chris Barker, if you’re reading this, sorry for putting words in your mouth.
Oh, and while I have your attention, please finish renovating your website!
You’ve destroyed the links to the original documentation of these langs and I had to create an account on the Esolang wiki to <a href="https://esolangs.org/w/index.php?title=Jot&type=revision&diff=79341&oldid=72850">edit its References section</a> to include links to archived versions on the Wayback Machine.</p>
<h1 id="warmup">Warmup</h1>
<p>To begin, here are some combinators to consider.</p>
<script type="math/tex; mode=display">% <![CDATA[
\def\m{\mathrm}\begin{align*}
\m I &= \lambda x. x \\
\m K &= \lambda xy. x \\
\m S &= \lambda xyz. xz(yz) \\
\m B &= \lambda xyz. x(yz) \\
\m C &= \lambda xyz. xzy \\
\m W &= \lambda xy. xyy
\end{align*} %]]></script>
<p>Very famously, every term with no free variables in the lambda calculus can be written as some combination of the S and K combinators. Indeed, for any combinator <script type="math/tex">X</script>,<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup></p>
<script type="math/tex; mode=display">\m{SK}X = \lambda v.\m{SK}Xv = \lambda v.\m Kv(Xv) = \lambda v.v = \m I,</script>
<p>and then there is a simple algorithm by which you can recursively eliminate abstractions from any term using only S, K, and I:</p>
<script type="math/tex; mode=display">\lambda v.v = \m I, \quad \lambda v.w = \m Kw, \quad \lambda v.TT' = \m S(\lambda v.T)(\lambda v.T')</script>
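<p>The three rules above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: terms are nested tuples built by the hypothetical helpers <code>lam</code> and <code>app</code>, atoms are strings, the names <code>"S"</code>, <code>"K"</code> and <code>"I"</code> are reserved for the combinators, and <code>run</code> interprets the result with eagerly evaluated Python closures, which is fine for small terminating examples but is not a general normalizer.</p>

```python
# Curried Python stand-ins for the combinators (evaluation is eager).
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x
I = lambda x: x
ATOMS = {"S": S, "K": K, "I": I}


def lam(v, body):
    return ("lam", v, body)


def app(f, x):
    return ("app", f, x)


def abstract(v, t):
    """Rewrite (lambda v. t) without the lambda; t must already be lambda-free."""
    if t == v:
        return "I"                          # lambda v. v = I
    if isinstance(t, str):
        return app("K", t)                  # lambda v. w = K w
    _, f, x = t                             # lambda v. T T' = S (lambda v. T) (lambda v. T')
    return app(app("S", abstract(v, f)), abstract(v, x))


def eliminate(t):
    """Remove every abstraction from t, innermost first."""
    if isinstance(t, str):
        return t
    if t[0] == "lam":
        return abstract(t[1], eliminate(t[2]))
    return app(eliminate(t[1]), eliminate(t[2]))


def run(t):
    """Interpret a closed S/K/I term as a Python function."""
    if isinstance(t, str):
        return ATOMS[t]
    return run(t[1])(run(t[2]))


const = eliminate(lam("x", lam("y", "x")))                  # lambda xy. x
flip_apply = eliminate(lam("x", lam("y", app("y", "x"))))   # lambda xy. y x
```

<p>Here <code>const</code> comes out as the term S(KK)I, which behaves like K, so <code>run(const)("a")("b")</code> gives back <code>"a"</code>. The output is rarely the shortest possible term, but the point of the algorithm is only that some S, K, I term always exists.</p>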
<p>The reason this matters is to do with a little thing called Turing-completeness, you might have heard of it.
That maybe sounds like I’m being flippant for a joke, but actually I’m not going to talk about Turing-completeness at all, because it’s irrelevant.
For our purposes we will just leave it at: lambda calculus and combinatory logic are both Turing-complete, whatever that means, so any language or computation model that can express everything they can express, is also Turing-complete.</p>
<p>Anyway, the fact that SKI forms a basis for combinatory logic is actually the main conceit for the very first esolang in this combinatory logic genre, David Madore’s <a href="http://www.madore.org/~david/programs/unlambda/">Unlambda</a>.
Unlambda is so famous, it not only has entries on the <a href="https://esolangs.org/wiki/Unlambda">Esolang wiki</a> and on <a href="https://en.wikipedia.org/wiki/Unlambda">Wikipedia</a>, but it was even referenced on my ‘blog in a <a href="/code/call-cc-yin-yang-puzzle">previous post</a>!
To make a medium story short, Unlambda is SKI, plus some extra design quirks, wrapped in a spartan syntax, and then I/O hastily duct-taped on top.</p>
<p>Unlambda set a baseline for the genre, in that you can boil down (or at least taxonomize) the differences between its successors into worries about the aesthetics of those four facets: the combinator calculus, the other language design, the syntax, or the I/O capabilities.
For example, you can trade out SK for another basis, like BCKW.
Or maybe your name is John Tromp and you think you can do I/O better, so you come up with a different model for that and create <a href="https://tromp.github.io/cl/lazy-k.html">Lazy K</a>.</p>
<p>And this brings us to the first of Barker’s languages: Iota, from (near as I can place it) 2001.</p>
<h1 id="iota">Iota</h1>
<p>The main conceit of Iota, and what grants it its harsh minimalism, is primarily the use of a single combinator as the basis for its calculus.
This universal combinator, which Barker calls <script type="math/tex">\iota</script>, Iota calls <code class="highlighter-rouge">i</code>, and which I will call U, is <script type="math/tex">\lambda f.f\m{SK}</script>.
The other necessary ingredient is the notation for application, for which Barker follows in the footsteps of Unlambda and makes a prefix operator, which he decides to spell <code class="highlighter-rouge">*</code>.</p>
<p>Barker summarizes this information as follows.</p>
<table class="centered">
<thead>
<tr>
<th style="text-align: center">syntax</th>
<th style="text-align: center">semantics</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center"><script type="math/tex">F\to\texttt i</script></td>
<td style="text-align: center"><script type="math/tex">\lambda f.f\m{SK}</script></td>
</tr>
<tr>
<td style="text-align: center"><script type="math/tex">F\to\texttt *FF</script></td>
<td style="text-align: center"><script type="math/tex">[F][F]</script></td>
</tr>
</tbody>
</table>
<p>To prove U forms a basis by itself, it suffices to write another basis in terms of it. So here is SKI, and how to combine them:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\m I &= \m{UU} &&= \texttt{*ii} \\
\m K &= \m U(\m{UI}) &&= \texttt{*i*i*ii} \\
\m S &= \m{UK} &&= \texttt{*i*i*i*ii} \\
(AB) &= \texttt*(A)(B)
\end{align*} %]]></script>
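<p>To see the spellings above in action, here is a small Iota evaluator sketched in Python. It is not Barker's implementation: <code>parse</code> and <code>iota</code> are hypothetical names, combinators are modelled as curried Python closures, and evaluation is eager, which happens to be harmless for these particular terms.</p>

```python
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x
U = lambda f: f(S)(K)          # the universal combinator, spelled i


def parse(src, pos=0):
    """Evaluate one Iota term starting at src[pos]; return (value, next position)."""
    if src[pos:pos + 1] == "i":
        return U, pos + 1
    if src[pos:pos + 1] == "*":
        f, pos = parse(src, pos + 1)   # the operator
        x, pos = parse(src, pos)       # the operand
        return f(x), pos
    raise ValueError("expected 'i' or '*' at position %d" % pos)


def iota(src):
    """Evaluate a complete Iota program."""
    value, end = parse(src)
    if end != len(src):
        raise ValueError("trailing characters after the program")
    return value


identity = iota("*ii")
konst = iota("*i*i*ii")
ess = iota("*i*i*i*ii")
```

<p>For instance, <code>konst("a")("b")</code> returns <code>"a"</code>, and <code>ess(K)(K)</code> acts as the identity, as SKK should. Feeding the evaluator a bare <code>*</code> raises a <code>ValueError</code>, because the operator has nothing to apply.</p>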
<p>If you don’t like U you can also consider a variant of Iota wherein <code class="highlighter-rouge">i</code> codes for a different universal combinator.
Barker does not explicitly suggest this, but he includes discussion of alternatives, such as <script type="math/tex">\lambda f.f\m{KSK}</script> or <script type="math/tex">\lambda f.f\m S(\m{BKK})</script>.</p>
<p>In any case, Iota can write anything that SKI can, so it gets to enjoy the same Turing-completeness.
And it accomplishes this with only two language elements, <code class="highlighter-rouge">*</code> and <code class="highlighter-rouge">i</code>, which is generally where minimalist esolangs are content to draw the line.</p>
<p>Notably, Iota does not have I/O of any kind. I mean, if you are so inclined you can consider the program as coding for a combinator that takes input according to <a href="https://en.wikipedia.org/wiki/Church_encoding">one of the many data encodings</a> of lambda calculus and spits out a similarly coded output.
But it doesn’t really have a framework for that. We will see this addressed by other members of the language family, but not immediately.</p>
<p>Instead, the first issue Barker takes with Iota is the linguistic relationship between <code class="highlighter-rouge">*</code> and <code class="highlighter-rouge">i</code>.
In Barker’s own words, <code class="highlighter-rouge">*</code> is <em>syncategorematic</em><sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>, which is to say it does not really have its own semantic meaning, it is necessarily facilitating some modification of other elements, and cannot exist without them.
To wit, <code class="highlighter-rouge">i</code> is an Iota program, but <code class="highlighter-rouge">*</code> is not.
Barker is primarily a linguist, so this is probably a natural question to ask, whether application need be handled syncategorematically.</p>
<h1 id="jot">Jot</h1>
<p>In the middlest language of this family, Jot, Barker succeeds in lexicalizing Iota’s application operation, and consequently arrives at a language with a number of interesting properties.
However, as I hope to convey to you, there were a couple of design choices made that give me pause, and make me wonder how Barker thinks about his languages.
Here, I’ll start with the syntax/semantics table and we’ll work out from there.</p>
<table class="centered">
<thead>
<tr>
<th style="text-align: center">syntax</th>
<th style="text-align: center">semantics</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center"><script type="math/tex">F \to \varepsilon</script></td>
<td style="text-align: center"><script type="math/tex">\m I</script></td>
</tr>
<tr>
<td style="text-align: center"><script type="math/tex">F \to F\texttt0</script></td>
<td style="text-align: center"><script type="math/tex">\m U[F]</script></td>
</tr>
<tr>
<td style="text-align: center"><script type="math/tex">F \to F\texttt1</script></td>
<td style="text-align: center"><script type="math/tex">\m B[F]</script></td>
</tr>
</tbody>
</table>
<p>The high level overview of Jot’s syntax is that it is left-branching, so each next bit modifies the program, starting with the identity for the empty program.
<code class="highlighter-rouge">0</code> applies the universal combinator (which for now we will insist must remain U and not some other one, for a subtle reason) and <code class="highlighter-rouge">1</code> uses the lexicalized application.
The mapping to SKI is a little uglier now:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\m I &= \texttt{11010} \\
\m K &= \texttt{11100} \\
\m S &= \texttt{11111000} \\
(AB) &= \texttt1(A)(B)
\end{align*} %]]></script>
<p>In particular, the proofs that these work as they should require a little more subtlety than the simple transliteration we used in Iota.
Here’s how you prove that, say, K works as intended: for all programs <script type="math/tex">F</script>,</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
[F\texttt{11100}] &= \m U[F\texttt{1110}] = [F\texttt{1110}]\m{SK} \\
&= \m U[F\texttt{111}]\m{SK} = [F\texttt{111}]\m{SKSK} \\
&= \m B[F\texttt{11}]\m{SKSK} = [F\texttt{11}](\m{SK})\m{SK} \\
&= \m B[F\texttt1](\m{SK})\m{SK} = [F\texttt1](\m{SKS})\m K \\
&= \m B[F](\m{SKS})\m K = [F](\m{SKSK}) \\
&= [F]\m K.
\end{align*} %]]></script>
<p>And then for application, you observe that <script type="math/tex">[F\texttt1]XY = \m B[F]XY = [F](XY)</script>.
Finally, you can say the magic words, “by induction,” and the correctness follows.</p>
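<p>If you would rather check this by machine than by induction, here is a minimal Jot evaluator in Python that follows the table literally: start from I, and for each bit wrap the value so far in U or B. The function name <code>jot</code> is my own, combinators are curried Python closures, and evaluation is eager, so treat it as a sketch for small terminating programs and not as a faithful implementation.</p>

```python
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x
I = lambda x: x
B = lambda f: lambda x: lambda y: f(x(y))
U = lambda f: f(S)(K)


def jot(bits):
    """Evaluate a Jot program given as a string of '0' and '1' characters."""
    value = I                      # the empty program means I
    for bit in bits:
        if bit == "0":
            value = U(value)       # [F0] = U[F] = [F] S K
        elif bit == "1":
            value = B(value)       # [F1] = B[F]
        else:
            raise ValueError("Jot programs contain only 0 and 1")
    return value


JOT_I = "11010"
JOT_K = "11100"
JOT_S = "11111000"

# Application (AB) is spelled 1(A)(B); here that is the term K I.
k_applied_to_i = jot("1" + JOT_K + JOT_I)
```

<p>With these definitions <code>jot(JOT_K)("a")("b")</code> returns <code>"a"</code>, and <code>k_applied_to_i("a")("b")</code> returns <code>"b"</code>, since K I discards its first argument and passes the second through.</p>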
<p>We’re almost done describing what makes Jot cool, but we have enough for me to bring up my first quibble.
If you were paying attention to the universal combinator stuff in Iota, then you may have noticed that it’s actually kind of weird for <script type="math/tex">[w\texttt0]=\m U[w]</script> when it’s a lot more directly Iota-like for it to have been <script type="math/tex">[w]\m U</script>.
In fact, I’ll save you the trouble of checking: if Barker had done this then K would have been <code class="highlighter-rouge">1010100</code> and S would have been <code class="highlighter-rouge">101010100</code>, and Jot would have been literally the same language as Iota but spelled with <code class="highlighter-rouge">10</code> instead of <code class="highlighter-rouge">*i</code>.<sup id="fnref:4"><a href="#fn:4" class="footnote">4</a></sup>
If you were feeling whimsical, you could call this version of Jot a <em>resemanticization</em> of Iota that <em>categorematifies</em> <code class="highlighter-rouge">*</code>.</p>
<p>So why did he change it when moving to Jot? I don’t know for sure, but currently my running hypothesis is, “because he could”.
Barker likely noticed that using U in this way still gave access to roughly the same computations as the Iota way did, but the result made the coding of S and K a little bit shorter, so he went with that for Jot.
This mutation really isn’t that big of a deal, but you should keep the unsimplified version in mind when we move to the final language in the family.</p>
<p>The final observation that Barker makes about Jot is that, since every term in the SK combinator calculus can be written as a Jot program starting with <code class="highlighter-rouge">1</code>,
Jot constitutes a <em>Gödel numbering</em>, that is, for every program there exists a natural number coding for it.
The mapping from Gödel number to program is to express the number in binary and then read that as Jot code for a term in the combinator calculus;
the reverse mapping is to code your program into combinator calculus, express that in terms of S and K, and then read off the Jot number.</p>
<p>This is admittedly a pretty cool property for a language to have. It’s an extremely natural coding—it’s just binary, and every bit has a concrete meaning!—compared to the hacks that logicians have had to employ in the past.
But I feel like there is something devastatingly obvious that Barker missed about this coding, that has some serious implications for the aesthetics of the language and its Gödel numbering.</p>
<p>Here, take a look.
By fiat, the meaning of the empty Jot program is just the identity combinator.
Nothing wrong with that, it’s a perfectly natural choice.
But look at what happens to initial <code class="highlighter-rouge">1</code>s in a Jot program as a result:</p>
<script type="math/tex; mode=display">[\texttt1] = \m B[\varepsilon] = \m{BI} = \lambda xy.\m{BI}xy = \lambda xy.\m I(xy) = \lambda xy.xy = \m I = [\varepsilon]</script>
<p>They have no effect! Any Jot program can begin with arbitrarily few or many <code class="highlighter-rouge">1</code>’s and they would not affect the resultant program.
So the “every combinator can be written as a Jot program that starts with <code class="highlighter-rouge">1</code>” observation is kind of a sham, because you can freely add or remove <code class="highlighter-rouge">1</code>’s at no cost.</p>
<p>And here’s the kicker.</p>
<p>When you express a number in binary, you know what bit that you can freely add to or remove from the front of it, that doesn’t change its binary meaning?</p>
<p>The bit 0.</p>
<p>If Jot had all its bits flipped, it would still be a Gödel numbering—just drop all the leading zeroes from your transliteration of the SK expression—but it would also arithmetically respect the binary representations of numbers.</p>
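<p>As a sketch of what the flipped numbering looks like in practice, the Python below writes a natural number in binary, flips every bit, and runs the result as ordinary Jot. The names <code>jot</code> and <code>jomplement</code> and the <code>width</code> padding parameter are my own illustrative choices. One caveat: leading zeros only add eta-expansions, so the padded and unpadded programs agree extensionally, and the fair way to compare them in Python is on function arguments.</p>

```python
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x
I = lambda x: x
B = lambda f: lambda x: lambda y: f(x(y))
U = lambda f: f(S)(K)

FLIP = str.maketrans("01", "10")


def jot(bits):
    """Evaluate a standard Jot program: 0 wraps in U, 1 wraps in B."""
    value = I
    for bit in bits:
        value = U(value) if bit == "0" else B(value)
    return value


def jomplement(n, width=0):
    """Run the natural number n as a bit-flipped Jot program.

    `width` pads the binary numeral with leading zeros, which become
    leading 1s of Jot and so should not change the meaning.
    """
    numeral = format(n, "b").zfill(width)
    return jot(numeral.translate(FLIP))


# Transliterating the standard spellings and dropping leading zeros:
# K = 11100 flips to 00011, which is 3; I = 11010 gives 5; S = 11111000 gives 7.
k_from_3 = jomplement(3)
i_from_5 = jomplement(5)
s_from_7 = jomplement(7)
padded_k = jomplement(3, width=8)
```

<p>Under this reading K, I and S get the numbers 3, 5 and 7, and <code>padded_k(K)(I)("a")("b")</code> still returns <code>"a"</code>, just as K itself would.</p>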
<p>…Tragedy!</p>
<p>Of course, it behooves me to try to explain Barker’s design choice.
My hypothesis for this one relies on the original form of Barker’s syntax/semantics table for Jot.
In it, he writes <script type="math/tex">[F\texttt1] = \lambda xy.[F](xy)</script>, which is more explicit about its purpose. This explicit formulation probably muddles the computations just enough to make the extensional equivalence <script type="math/tex">[\texttt1] = [\varepsilon]</script> feel not so transparent.
Even the SK form of this expression, <script type="math/tex">[F\texttt1] = \m S(\m K[F])</script>, seems complicated enough that you could conceivably miss what happens for the empty program.
But it still feels like an oversight to me, and I can’t have been the only person to have noticed this, so the explanation is probably more nuanced than that.
Maybe Barker simply rejects extensional equivalence altogether?</p>
<p>In any case, I hereby declare my Jot relex, Jomplement, an even better Gödel numbering than Jot.</p>
<h1 id="zot">Zot</h1>
<p>In 2002, Barker finally tries to tackle I/O. To recap, Iota had no I/O to speak of.
Jot was not designed with I/O in mind, but strictly speaking, it can take further bits of input after the given program.
However, those bits manifest as combinators of arbitrary complexity, which are passed to the Jot program as arguments, making the task of determining the bits used all but impossible.
So for Zot, Barker had to devise some new model in which a program could conceivably take discernible bits as input, and then give it a compatible output model too.</p>
<p>Barker describes Zot as a variant of Jot, with the same syntax,
and “the only semantic difference is that the values have all been type-lifted using continuations”.
This is almost perfectly accurate, but you have to use the unsimplified Jot that I discussed, where <script type="math/tex">[F\texttt0] = [F]\m U</script>, and not the standard <script type="math/tex">\m U[F]</script> one.
If you recall, that is the one whose combinator transliterations of S and K were exactly the same as Iota’s, and it will be no coincidence that Zot spells S and K the same way too.</p>
<table class="centered">
<thead>
<tr>
<th style="text-align: center">syntax</th>
<th style="text-align: center">semantics</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center"><script type="math/tex">F \to \varepsilon</script></td>
<td style="text-align: center"><script type="math/tex">\lambda c.c\m I</script></td>
</tr>
<tr>
<td style="text-align: center"><script type="math/tex">F \to F B</script></td>
<td style="text-align: center"><script type="math/tex">[F][B]</script></td>
</tr>
<tr>
<td style="text-align: center"><script type="math/tex">B \to \texttt0</script></td>
<td style="text-align: center"><script type="math/tex">\lambda c.c\m U</script></td>
</tr>
<tr>
<td style="text-align: center"><script type="math/tex">B \to \texttt1</script></td>
<td style="text-align: center"><script type="math/tex">\lambda cL.L(\lambda\ell R.R(\lambda r.c(\ell r)))</script></td>
</tr>
</tbody>
</table>
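<p>The table transcribes almost directly into Python, which makes for a handy way to poke at the semantics. This is a sketch under my own assumptions and not Barker's implementation: <code>EMPTY</code>, <code>ZERO</code>, <code>ONE</code> and <code>zot</code> are names I made up, everything is a curried closure, and Python evaluates eagerly, which is fine for the small programs here.</p>

```python
S = lambda x: lambda y: lambda z: x(z)(y(z))
K = lambda x: lambda y: x
I = lambda x: x
U = lambda f: f(S)(K)

EMPTY = lambda c: c(I)                                             # [empty program]
ZERO = lambda c: c(U)                                              # the bit 0
ONE = lambda c: lambda L: L(lambda l: lambda R: R(lambda r: c(l(r))))  # the bit 1


def zot(bits):
    """Evaluate a Zot bitstring: each bit's meaning is fed to the value so far."""
    value = EMPTY
    for bit in bits:
        value = value(ZERO if bit == "0" else ONE)   # [F B] = [F][B]
    return value


ZOT_K = "1010100"          # the Iota spelling of K, with 10 in place of *i

k_program = zot(ZOT_K)
first_input = zot(ZOT_K + "0" + "1")   # K as a program, followed by two input bits
```

<p>Here <code>k_program("a")("b")</code> returns <code>"a"</code>, and <code>first_input</code> is the very object <code>ZERO</code>: once the program part is finished, K receives the meanings of the two following bits as its arguments and keeps the first.</p>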
<p>If you are not a computational linguist or someone else that has cause to think about continuations, you may say to yourself,
“Continuations? What are those needlessly complicated abstractions doing here?”
And I do not yet have a ‘blog post in which I explain continuations adequately, so I cannot point you there, for you to see the errors of your ways.
But suffice it to say that continuations, arcane as they may seem, are an important functional tool for understanding and manipulating control flow,
and for Barker, continuization is the method by which Zot is able to build a program and have that program take input, using the same semantics throughout.</p>
<p>I don’t really know a way to describe this that doesn’t require meditation on continuations, so I’ll do this.
First I’m going to give a lemma about Zot semantics.
Then, towards a sketch of a proof, I’m going to describe the sense in which <script type="math/tex">[\varepsilon]</script>, <script type="math/tex">\texttt0</script>, and especially <script type="math/tex">\texttt1</script> are continuized versions of their Iota and Jot counterparts.
After that, even if you don’t understand continuations, hopefully you will still come out with a sense of how Zot accomplishes what it does.</p>
<blockquote>
<p><strong>Lemma.</strong> If <script type="math/tex">F</script> is a bitstring
where every prefix has at least as many <code class="highlighter-rouge">1</code>’s as <code class="highlighter-rouge">0</code>’s,
then <script type="math/tex">[F] = \lambda c.cX</script> for some combinator <script type="math/tex">X</script>.</p>
</blockquote>
<p>The condition about prefixes should be familiar to any combinatoricists that are reading, and they are probably already coming up with alternative ways of phrasing it that they prefer.
One way in particular that I will single out, for its intuitive content, is that if you treat <code class="highlighter-rouge">1</code>’s like opening parentheses and <code class="highlighter-rouge">0</code>’s like closing parentheses,
then every closing paren in <script type="math/tex">F</script> matches up to some open one (and then maybe there are some extra open ones that haven’t been closed yet but that’s fine).</p>
<p>This <script type="math/tex">\lambda c.cX</script> form is, roughly, a signal that the meaning of the term is still continuized, so it will interact with other continuized terms in a “program-building” way.
But after that, if we reach a <code class="highlighter-rouge">0</code> bit in <script type="math/tex">F</script> where the <code class="highlighter-rouge">0</code>’s finally outnumber the <code class="highlighter-rouge">1</code>’s, the result is no longer guaranteed to be continuized,
and then Zot may stop permitting the next bits to contribute to the definition of the program and instead just treat them as a series of distinguishable arguments.</p>
<p>The argument at the start of the continuized form is meant to be the continuation of the program—to make a long story short, the future of the program ought to be packed into that one argument, so that you apply it to your result when you’ve finished making it.
You see, if you have two such continuized combinators, then they dance around each other in an interesting way:</p>
<script type="math/tex; mode=display">(\lambda c_1.c_1X)(\lambda c_2.c_2Y) = (\lambda c_2.c_2Y)X = XY</script>
<p>Did you catch that? It came out to <script type="math/tex">XY</script> at the end, but both of the continuized versions took a turn being at the head of the computation.
In this case they both simply ceded control to the other, so nothing special occurred,
but the point is that if you live in this continuized world where everybody promises to give everyone else a turn at the head of the computation,
then everyone has a chance to enact something at the top level if they really want to. If the <script type="math/tex">Y</script> continuization was <script type="math/tex">\lambda c.\m K(cY)</script> then there would be a K left on the outside of the computation.</p>
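<p>The dance is easy to replay in Python. In this sketch <code>lift</code> is a hypothetical helper for the constant continuization, and <code>X</code> and <code>Y</code> are arbitrary stand-ins (a string method and a string) picked only so that the result is visible.</p>

```python
K = lambda x: lambda y: x
lift = lambda x: lambda c: c(x)      # X becomes (lambda c. c X)

X = str.upper
Y = "hello"

# Both sides cede control to each other, leaving plain X Y.
plain = lift(X)(lift(Y))

# A continuization of Y that acts at the top level before finishing.
greedy_y = lambda c: K(c(Y))
wrapped = lift(X)(greedy_y)          # K (X Y)
```

<p>Here <code>plain</code> is <code>"HELLO"</code>, while <code>wrapped</code> is a K sitting around that same result, so it still has to swallow one more argument before handing <code>"HELLO"</code> back.</p>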
<p>This very short and oversimplificatory explanation is also the basis by which some computational linguists think that meaning in natural languages may also be continuized.
How else do you explain the ability of the word “everyone” to slip a universal quantifier around the sentence it contains?
Oh hey, come to think of it, I’ve some papers and even a book on the applicability of continuations to natural language<sup id="fnref:5"><a href="#fn:5" class="footnote">5</a></sup> by a certain linguist named Chris Barker. Where have I seen that name before…</p>
<p>Now we’ll talk about how our combinators specifically were continuized.
The simplest form of continuization is just as constants: even though combinators themselves are meant to be functions, you can continuize them as though they are data that are not necessarily intended to take arguments.
To do this, you just map your combinator <script type="math/tex">X</script> to <script type="math/tex">\lambda c.cX</script>, simple as that.</p>
<p>So <script type="math/tex">[\varepsilon] = \lambda c.c\m I</script> is the constant continuization of the identity combinator, that begins every Jot program,
and <script type="math/tex">\texttt0 = \lambda c.c\m U</script> is the constant continuization of our universal combinator, U. (It is notable that here, in the Zot documentation, is the first time Barker suggests that you can swap out the universal combinator if you are not satisfied with his choice, so long as you remember to continuize it.)</p>
<p><code class="highlighter-rouge">1</code> is different, however. It was never just B as data, it was B applied to the meaning of the program. In Jot, Barker phrased it as a function taking two arguments, applying them to each other, and then handing that off to the rest of the meaning of the program.</p>
<script type="math/tex; mode=display">[F\texttt1]_{\text{Jot}} = \lambda xy.[F](xy)</script>
<p>So we’re going to do a more complicated continuization, that lifts this combinator into the continuized world <em>as a function</em>.
Here’s the plan. We take the continuation first—everybody does, so we agree to play along—and that looks like <script type="math/tex">\lambda c.({-})</script>.
(I’m gonna use <script type="math/tex">({-})</script> to denote a blank, into which the rest of our work will go.)
Then we’re gonna do whatever other work we wanted to get done, and finally once we have our result, say <script type="math/tex">X</script>, we finish off by doing <script type="math/tex">cX</script>.</p>
<p>Then we start to take our arguments. But remember, our arguments are going to be continuized, so they look like <script type="math/tex">A = \lambda x.xa</script> or something,
and the only way we can get access to the juicy data they contain is if we give them a continuation and ask them very nicely to please use it.
So the blueprint for that is something like this: <script type="math/tex">\lambda A.A(\lambda a.({-}))</script>.
The <script type="math/tex">a</script> is the argument we wanted, but the <script type="math/tex">A</script> is the continuized argument that we initially get,
and we give it the continuation <script type="math/tex">\lambda a.({-})</script> under the assumption that it will surrender its argument to us, as <script type="math/tex">a</script>, when it gets a chance to be the head.</p>
<p><code class="highlighter-rouge">1</code> takes two arguments, the left one and the right one, so so far we’re looking at <script type="math/tex">\lambda c.\lambda L.L(\lambda\ell.\lambda R.R(\lambda r.({-})))</script>.
All we have to do is apply those arguments to each other (<script type="math/tex">\ell r</script>), and then finish off by handing that to <script type="math/tex">c</script> like we promised earlier we would (<script type="math/tex">c(\ell r)</script>).
So overall, we obtain:</p>
<script type="math/tex; mode=display">\lambda c. \underbrace{\lambda L.L(\lambda \ell}_{\text{left arg}}. \underbrace{\lambda R.R(\lambda r}_{\text{right arg}}. c(\ell r)))</script>
<p>This is exactly <code class="highlighter-rouge">1</code> as advertised in the semantics table.</p>
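<p>As a sanity check on that construction, here it is in Python, fed a continuation and two continuized arguments. The names <code>lift</code> and <code>final</code> are illustrative only, and <code>len</code> and <code>"abc"</code> are arbitrary stand-ins for the left and right arguments.</p>

```python
lift = lambda x: lambda c: c(x)      # constant continuization of x

# lambda c. lambda L. L(lambda l. lambda R. R(lambda r. c(l r)))
ONE = lambda c: lambda L: L(lambda l: lambda R: R(lambda r: c(l(r))))

# A continuation that just records what it was handed.
final = lambda result: ("finished with", result)

# Take the continuation, unwrap the left argument, unwrap the right one,
# apply them to each other, and hand the result to the continuation.
outcome = ONE(final)(lift(len))(lift("abc"))
```

<p>The value of <code>outcome</code> is <code>("finished with", 3)</code>: the left argument was applied to the right one, and only then did the continuation get its turn.</p>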
<p>With this understanding, the proof of the lemma is not difficult, just a little tedious to phrase correctly.
If I had to prove it, I’d reduce it to the following more explicit claim.</p>
<blockquote>
<p><strong>Claim.</strong> If <script type="math/tex">F</script> is a bitstring satisfying the prefix condition
and having <script type="math/tex">n</script> more <code class="highlighter-rouge">1</code>’s than <code class="highlighter-rouge">0</code>’s, then</p>
<script type="math/tex; mode=display">[F] = \lambda C_0.C_0(\lambda c_0.(\cdots(\lambda C_n.C_n(\lambda c_n.Xc_0\dots c_n))\cdots))</script>
<p>for some combinator <script type="math/tex">X</script>.
In other words, <script type="math/tex">[F]</script> is the <em>(<script type="math/tex">n+1</script>)-th order continuization</em> of <script type="math/tex">X</script>.</p>
</blockquote>
<p>And this is a perfect segue into Zot’s I/O facilities.
Zot takes program and input from the same bitstream, one after the other, so the delineation between the two is a lot fuzzier than in other languages.
Strictly speaking, you are free to declare any particular bitstring as the program you intended to write, and all subsequent bits input.
But from the Lemma, there is a natural dichotomy between the bits that Zot treats as program and the bits it treats as input.</p>
<p>Combining the bit-by-bit semantics with the Lemma, any bitstring where all prefixes have at least as many <code class="highlighter-rouge">1</code>’s as <code class="highlighter-rouge">0</code>’s will be continuized,
and is therefore required to cede the head of the computation to its argument.
But by the same token, the first prefix of a bitstring to have more <code class="highlighter-rouge">0</code>’s than <code class="highlighter-rouge">1</code>’s is no longer bound by the Lemma and can potentially treat the following bits purely as arguments.</p>
<p>And, being that <script type="math/tex">\texttt0 \ne \texttt1</script>, they can be distinguished.
Barker suggests the combinator <script type="math/tex">\m Q = \lambda f.f\m{IIIK}</script>, which satisfies</p>
<script type="math/tex; mode=display">\m Q\texttt0 = \m K \quad\text{and}\quad \m Q\texttt1 = \m{KI},</script>
<p>the standard encodings of true and false, respectively, for booleans in lambda calculus and combinatory logic.
(Yes, you read that right, it’s true on <code class="highlighter-rouge">0</code> and false on <code class="highlighter-rouge">1</code>. Barker gets this mixed up in his Zot ‘page. Another point for the Jomplement truthers!)
So now, if <script type="math/tex">B</script> is the Zot meaning of a bit, and <script type="math/tex">X</script> and <script type="math/tex">Y</script> are the combinators you want to evaluate on <code class="highlighter-rouge">0</code> or <code class="highlighter-rouge">1</code> respectively, then <script type="math/tex">\m QBXY = B\m{IIIK}XY</script> will accomplish that.
Set yourself up in a loop (infinite or finite) and you have a program taking bits of input.</p>
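<p>If you would like to poke at this yourself, here is a rough sketch using curried Python lambdas as stand-ins for combinators.
The definition of <code class="highlighter-rouge">zero</code> follows the description of <code class="highlighter-rouge">0</code> as feeding U to its argument; the definition of <code class="highlighter-rouge">one</code> is an assumption on my part, transcribed from Barker’s description of Zot, and <code class="highlighter-rouge">branch</code> is a hypothetical helper name.
Python evaluates eagerly, so this is only an extensional check on these particular terms and not a faithful Zot evaluator.</p>

```python
# Basic combinators, curried so that application is just a Python call.
I = lambda x: x
K = lambda x: lambda y: x
S = lambda x: lambda y: lambda z: x(z)(y(z))

# U feeds its argument S and then K.
U = lambda f: f(S)(K)

# Zot meanings of the two bits. `one` is an assumed transcription of
# \c L. L (\l R. R (\r. c (l r))).
zero = lambda c: c(U)
one = lambda c: lambda L: L(lambda l: lambda R: R(lambda r: c(l(r))))

# The bit-distinguishing combinator Q = \f. f I I I K.
Q = lambda f: f(I)(I)(I)(K)


def branch(bit, on_zero, on_one):
    """Compute Q B X Y, which selects X when B is 0 and Y when B is 1."""
    return Q(bit)(on_zero)(on_one)
```

<p>Here <code class="highlighter-rouge">Q(zero)</code> reduces all the way down to the very same <code class="highlighter-rouge">K</code> object, while <code class="highlighter-rouge">Q(one)</code> only behaves like KI, which is all that the selection trick needs.</p>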
<p>Output is less cute, and borne of a non-theoretical necessity, but I will try to describe it nonetheless.
You may have noticed a pattern among the terms <script type="math/tex">\m U, [\varepsilon], \texttt0, \m Q</script>: they all begin with a lambda abstraction, as in “<script type="math/tex">\lambda f.</script>”,
and then their action is simply to feed <script type="math/tex">f</script> a sequence of combinators.
For U it is <script type="math/tex">\langle \m S,\m K \rangle</script>, for <code class="highlighter-rouge">0</code> it is <script type="math/tex">\langle \m U \rangle</script>. In fact, the identity combinator itself is of this form, for the empty sequence <script type="math/tex">\langle\rangle</script>.</p>
<p>Predictably, Barker calls combinators of this form <em>sequences</em>. For him, <script type="math/tex">\langle \m I, \m I, \m I, \m K \rangle</script> is an alternative notation for Q.
He gives a recursive definition of sequences, and casts input in their terms.
You may have even noticed that a Zot program is precisely a binary sequence—that is, a sequence of <code class="highlighter-rouge">1</code>’s and <code class="highlighter-rouge">0</code>’s—applied to the continuization of the identity, which starts the whole process of peeling those continuization layers off and mingling them together into a program.
Because they are how program is read and how input is taken, Barker considers binary sequences the appropriate way to give output, too.</p>
<p>If your program is designed to take only certain well-formed inputs, then there is no need to do anything special, and your program will just take input bits until it gets a well-formed input, and then spit out its output.
But at this point, Barker imagines what it would be like to have interactive dialogues between input and output.
As the reasoning goes, the Zot program needs some method of receiving an EOL, optionally surrendering a partial output, and then going back to a mode in which it takes further input.
And this is where it gets a bit ill-specified.</p>
<p>Barker has some proof-of-concept Zot interpreters which signal end-of-input by first sending a new combinator that he calls “output” and that I will call E:</p>
<script type="math/tex; mode=display">\m E = \lambda abcd.\m K(\m{KI}) = \m K(\m K(\m K(\m K(\m K(\m{KI}))))) = \m K^6\m I;</script>
<p>and then, assuming the result is some binary sequence, further sending a printing function to it which recursively prints everything in the sequence it gets.
In case E looks silly at first blush, let me point out that it is designed so that <script type="math/tex">\m{QE} = \m K(\m{KI})</script>,
which in turn has the effect that the bit-distinguishing trick from our input discussion has the result <script type="math/tex">\m QBXY = \m I</script> when <script type="math/tex">B = \m E</script>.</p>
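<p>The two claims about E are easy to sanity-check with the same kind of curried Python lambdas.
This is only a sketch under the assumption that eager Python functions are an acceptable stand-in for combinators here; the name <code class="highlighter-rouge">K6I</code> is mine.</p>

```python
# Basic combinators, curried so that application is just a Python call.
I = lambda x: x
K = lambda x: lambda y: x

# The bit-distinguishing combinator Q = \f. f I I I K.
Q = lambda f: f(I)(I)(I)(K)

# E ignores four arguments and then returns K(KI).
E = lambda a: lambda b: lambda c: lambda d: K(K(I))

# The same combinator written out as six K's applied to I.
K6I = K(K(K(K(K(K(I))))))

# Q E X Y: the two branches X and Y are both discarded, leaving I.
leftover = Q(E)("X")("Y")
```

<p>After running this, <code class="highlighter-rouge">leftover</code> is the identity <code class="highlighter-rouge">I</code> itself, and both <code class="highlighter-rouge">E</code> and <code class="highlighter-rouge">K6I</code> throw away six arguments before returning the seventh.</p>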
<p>This is a fun start of an idea, but Barker seems to leave it unfinished, deferring to the statement, “This allows an appropriately designed system to have orderly dialogs of input and output.”
Perhaps the implication is that a Zot program that does not yet wish to halt would, upon receiving E followed by a printer, apply the printer to whatever bits it wishes to output, and then discard the printer and eat more input?
It’s unclear, at least to me.</p>
<p>Furthermore—and I understand that this might be a particularly modern or softwarey concern when Barker is a linguist and made Zot almost two decades ago, but still—is any of this stuff something that a Zot interpreter can accurately tell is occurring?
I think you bump up against some uncomputable problems if you try to verify whether a Zot program is actually adhering to any of these hypothetical contracts beyond the simple ones.
It is uncomputable, for instance, to tell if the print function you have handed to your Zot program is ever going to be used again, so you can’t use that to tell if Zot is done with it.
I haven’t checked, but it’s probably even uncomputable to tell if you have a finite binary sequence or not.</p>
<p>Anyway, this is merely nitpicking at this point. It’s not entirely fair for me to grill Chris Barker <em>in absentia</em>.
My real point is, the output is a proof of concept. Zot isn’t quite “a version of Jot with I/O”, as much as it is “(unsimplified) Jot with I and the potential for O”.</p>
<h1 id="conclusion">Conclusion</h1>
<p>These languages are slick and I think they deserve more eyes and more study.
They are a rather insular family, admittedly; I think it would be hard to learn specific lessons that are applicable to other languages, especially non-esoteric ones. But there is some juicy theory in there.</p>
<p>Jot in particular is extremely interesting to me—not only because it is a variant of the best Gödel numbering, Jomplement—but also because the semantic sleight of hand by which it operates is impressive.
This is perhaps a contradictory tone for me to take, after melodramatically saying it had at its heart a grand tragedy, but I am not so silly as to let what a thing actually is inform my opinion of it.
Jot is an idea that forms the dazzling centerpiece of an adorable<sup id="fnref:6"><a href="#fn:6" class="footnote">6</a></sup> little theme-and-variations progression in the world of esolangs.</p>
<p>And now, to ruin a perfectly good mood, and make good on the threat I made in my table of contents:</p>
<blockquote>
<p><strong>Exercises.</strong>
1.(i) Prove the combinators <script type="math/tex">\lambda f.f\m{KSK}</script>
and <script type="math/tex">\lambda f.f\m S(\m{BKK})</script> are universal.
(ii) Write down a universal combinator that is distinct from the three
discussed in this post. Do you notice any patterns?</p>
<p>2. Prove the Lemma. Optional advice: prove the Claim first.</p>
<p>3. Write down a translation table from SKI to Zot.</p>
<p>4. I have argued that Zot is a semantic continuization of
“unsimplified Jot”. Write down a continuized semantics for actual Jot.</p>
</blockquote>
<p>These exercises are ordered thematically.
In terms of difficulty, I reckon <script type="math/tex">% <![CDATA[
1(\m i) < 3 < 2 < 1(\m{ii}) <\mkern-7mu< 4 %]]></script> but your mileage may vary.
Remember, “write down” means you have to come up with it and then prove it’s correct!</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Yes, plural. I had half a mind to use the plural form for the lambda calculi too, because there’s more than one of those, but I don’t want to scare you off too quickly. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>I’m skipping some pedantry about extensional versus intensional equality and what implicit foundations the theory of computation rests on. In particular there are some bad evaluation models and bad <script type="math/tex">X</script>’s for which this won’t actually be the identity, but instead have the potential to fail to halt. It doesn’t really matter for the purposes of this ‘blog, but if it did matter there would be a lot of interesting stuff to say, so add that to the list of Things I’d Like to Write Down Sometime. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>I cannot adequately describe the sheer philological decadence the existence of this word inspires in me. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>This is why I insisted on U as the universal combinator for Jot, by the way. The sense in which it needs to be universal is different from Iota’s. If you wish to use a different combinator for <code class="highlighter-rouge">0</code>, then either it’s universal and you use the Iota-like semantics, or it’s a subtly different property from universality that it needs to satisfy. U satisfies both universality and this other property, so it’s okay. <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
<li id="fn:5">
<p>Here are a couple citations, just for you.</p>
<blockquote>
<p>Barker, Chris. <a href="https://semanticsarchive.net/Archive/902ad5f7/barker.continuations.pdf">Continuations and the nature of quantification</a>. <a href="https://doi.org/10.1023/A:1022183511876">doi:10.1023/A:1022183511876</a></p>
</blockquote>
<blockquote>
<p>Barker, Chris and Chung-chieh Shan. <a href="https://semanticsarchive.net/Archive/mJjY2YxO/barker.cw.pdf">Continuations and Natural Language</a>. <a href="https://doi.org/10.1093/acprof:oso/9780199575015.001.0001">doi:10.1093/acprof:oso/9780199575015.001.0001</a></p>
</blockquote>
<p><a href="#fnref:5" class="reversefootnote">↩</a></p>
</li>
<li id="fn:6">
<p>I’ve noticed I use the word ‘adorable’ a lot whenever I positively review something, lately. It’s in like half of my Steam reviews. I’m trying my best to vary but it’s just such a perfectly good little word. One might even say that it’s… nay, I mustn’t! <a href="#fnref:6" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
<p>Ilia Chtcherbakov</p>
<p>The alternative, more poetic title that I was tempted to give this post was A Most Esoteric Tragedy.
I relented for three reasons. Morally: I detest clickbait.
Practically: it would unhelpfully obscure the topic on the rare occasion that someone might actually find this ‘blog helpful. (Ha!)
Most of all, emotionally: it would be comically hammy for me to entertain the idea that the contrivance contained in this analysis could be called a tragedy.
But hopefully, the fact that I spent the entire flavour text being snobby about this alternative title sufficiently conveys the self-indulgence and megalomania I feel for having come up with it.</p>
<p><strong>pu pi toki pona: A case study in orthodoxy</strong> (2020-05-18T18:49:10-04:00, http://cleare.st/lang/toki-pona-and-orthodoxy)</p>
<p><em>toki pona</em> is a minimalist constructed language, created by linguist Sonja Lang between 2001 and 2014.
It features 9 consonants and 5 vowels in its phonemic inventory (cf. most dialects of English at 24 and 14-25), and those 14 phonemes combine into a 123-word vocabulary.
The standard reference document on the conlang is currently <a href="https://www.amazon.com/gp/product/0978292308">Lang’s book</a> from 2014.
There are many things to find fascinating about toki pona, but the one I’d like to explore in detail right now is this:
toki pona dedicates one of its 123 words to that specific book, despite not having a word for “book”.
<!--more-->
Before we get into what I think is so spicy about that, let’s all get on the same page about toki pona’s history, philosophy, and current state.</p>
<hr />
<p>toki pona’s main conceit is its minimalist philosophy.
You can see material evidence of this in even its phonemic inventory.
Here are all fourteen of its phonemes: they are rather basic and almost exactly match their <a href="https://en.wikipedia.org/wiki/International_Phonetic_Alphabet">IPA</a>.</p>
<table class="centered">
<thead>
<tr>
<th style="text-align: center">a</th>
<th style="text-align: center">e</th>
<th style="text-align: center">i</th>
<th style="text-align: center">o</th>
<th style="text-align: center">u</th>
<th style="text-align: center">j</th>
<th style="text-align: center">k</th>
<th style="text-align: center">l</th>
<th style="text-align: center">m</th>
<th style="text-align: center">n</th>
<th style="text-align: center">p</th>
<th style="text-align: center">s</th>
<th style="text-align: center">t</th>
<th style="text-align: center">w</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Near-open_central_vowel">ɐ</a>~<a href="https://en.wikipedia.org/wiki/Open_central_unrounded_vowel">ä</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Open-mid_front_unrounded_vowel">ɛ</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Close_front_unrounded_vowel">i</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Close-mid_back_rounded_vowel">o</a>~<a href="https://en.wikipedia.org/wiki/Open-mid_back_rounded_vowel">ɔ</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Close_back_rounded_vowel">u</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Voiced_palatal_approximant">j</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Voiceless_velar_stop">k</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Voiced_alveolar_lateral_approximant">l</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Voiced_bilabial_nasal">m</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Voiced_alveolar_nasal">n</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Voiceless_bilabial_stop">p</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Voiceless_alveolar_sibilant">s</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Voiceless_alveolar_stop">t</a>/</td>
<td style="text-align: center">/<a href="https://en.wikipedia.org/wiki/Voiced_labio-velar_approximant">w</a>/</td>
</tr>
</tbody>
</table>
<p>This inventory is fully compatible with English, as well as most of the world’s other most-spoken languages.
Maybe at a glance it sounds a little infantilised because the most aggressive sounds it can muster are /t/ and /k/, but it only really rears its head if you need to transliterate something.
I’m not going to get into the phonotactics, even though <a href="https://en.wikipedia.org/wiki/Toki_Pona#Phonology_and_phonotactics">it’d be easy to do so</a>, because it’s been done to death and I don’t really need it here.
Suffice it to say, the rules are simple and easy to learn.</p>
<p>The real meat of the minimalism of toki pona is in its vocabulary, though.
As I alluded to earlier, the vocabulary is incredibly small, just 123 words.
For reference, my <a href="/meta/introducing-comments">last post</a> alone used over a thousand unique English words, which is larger by a factor of eight.
The function of this vocabulary is to dissect the world into a comparatively small number of basic elements, and challenge you to build the world back out of those primitives with the grammar.
The grammar is fairly versatile, allowing you to recombine these atoms in a number of ways whenever you encounter something not having its own word,
but not so strong that you can just translate from a natural language word-for-word—you need to put some thought into what it really is, and what’s worth saying about it.</p>
<p>For example, <em>jan</em> is the word for ‘person’ and <em>pona</em> is the word for ‘good’ and ‘friend’ is traditionally analysed into a compound of these two words, <em>jan pona</em>: a good person.
For another example, a <em>tomo tawa</em> is a movement structure, which is to say a vehicle, and then a <em>tomo tawa waso</em> is a bird vehicle, which probably refers to a plane.</p>
<p>To get more mileage out of this vocabulary, toki pona often allows its words to occupy more than one part of speech—the adjectival <em>tawa</em> meaning ‘of movement’ can only be distinguished grammatically from the prepositional <em>tawa</em> meaning ‘to’ or ‘towards’ or ‘according to’, or the verbal <em>tawa</em> meaning ‘to move’ transitively (or ‘to move to’ intransitively!)—and wholly embraces the occasional syntactic/semantic ambiguity that arises.
“I have an airplane” would be <em>mi jo e tomo tawa waso</em>, but in a different context, that sentence might be read as “I have a house for the bird” instead!
You generally have to determine which makes sense based on context.
toki pona also does not decline for number or case or anything, preferring instead adjectives (like <em>mute</em>, ‘many’) or prepositions (like <em>lon</em>, ‘in, on, at’) or particles (like <em>e</em>, the direct object particle) or even word order.</p>
<p>The lesson that toki pona is trying to teach is—in my opinion, anyway—that you need to think about what you really mean, and how to most faithfully express it, and maybe even if it deserves to be disambiguated in the way you want it to.
Here, here’s a concrete example.
toki pona has three words for number: <em>wan</em>, <em>tu</em>, and <em>mute</em>; meaning one, two, and many, respectively.
Well, okay, there’s also <em>ala</em> for zero, but that’s just because it’s a polyseme for ‘none’ and also handles verb negation.
The claim here is that anything larger is not really important enough to disambiguate.
This is probably bold and unsettling to many people at first glance, given how much numbers and sequences of numbers are important in our modern lives.
There is an alternative system of numbers, in which <em>luka</em> (hand) is assigned 5, <em>mute</em> 20, and <em>ale</em> (all) 100, but this is still terribly unwieldy.
The rejection of large numbers is a statement, a conscious decision on the part of toki pona that they don’t—or shouldn’t—matter to a toki pona speaker.</p>
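<p>To get a feel for just how unwieldy the alternative system is, here is a short Python sketch that spells out a number with it.
I am assuming the system is purely additive, with the largest words said first and repeated as needed; the function name <code class="highlighter-rouge">to_toki_pona</code> and the greedy strategy are my own illustration, not something taken from the book.</p>

```python
# Values of the number words in the alternative system, largest first.
NUMBER_WORDS = [(100, "ale"), (20, "mute"), (5, "luka"), (2, "tu"), (1, "wan")]


def to_toki_pona(n):
    """Spell a non-negative integer additively, assuming greedy largest-first order."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n == 0:
        return "ala"  # 'none' doubles as zero
    words = []
    for value, word in NUMBER_WORDS:
        count, n = divmod(n, value)  # how many times this word fits
        words.extend([word] * count)
    return " ".join(words)
```

<p>Under these assumptions, 78 already takes eight words: <em>mute mute mute luka luka luka tu wan</em>.</p>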
<p>Here are some more quick examples, now that you get the picture.
Both ‘good’ and ‘simple’ are polysemously accommodated by <em>pona</em>, which means toki pona is intentionally conflating them, and asserts you should too.
The word <em>wile</em> means both ‘to want’ and ‘to need’, and you can interpret this as toki pona philosophically objecting to anyone that claims that you can want without needing, or need without wanting.
The lone third person pronoun is <em>ona</em>, subsuming all of ‘he’, ‘she’, singular ‘they’, and ‘it’, and the claim here (for English speakers at least!) is clear, I should hope.</p>
<p>There is lots to discuss and dissect here, but overall the result is a language that, while perhaps not being perfectly semantically minimal,
forces its speakers to engage with minimalist philosophy in order to speak/write the language profitably,
and in turn asks its speakers to consciously be understanding and interpretive in order to read/listen to the communication of others successfully.
It’s very fun, and makes for quite a positive community on the whole.
The simple vocabulary lends itself to a very low barrier to entry, too, which you can leisurely tackle in a month, or power through in as little as a weekend.
(Consider this my token endorsement of the language.)</p>
<hr />
<p>This, as I’ve described it, is toki pona, according to <em>jan Sonja</em>’s book: <em>Toki Pona: The Language of Good</em>.
Or, as the community calls it, <em>pu</em>.
And that’s not an in-joke of some kind, that’s the Official Toki Pona Name for that book. <em>TP:tLoG calls itself</em> pu, <em>as one of its 123 words</em>.
It’s not even polysemous with other words that could be related, like ‘textbook’ or ‘instruction’ or ‘orthodoxy’.
This in itself is interesting already—this level of self-awareness and self-reference places <em>pu</em> within comparison distance to other texts like religious texts and some manifestos—but it gets even more interesting when you look at the history of toki pona.</p>
<p>As I alluded to earlier, the first draft of toki pona was published online in 2001, and received periodic updates until <em>pu</em> was finally published in print in 2014.
It quickly grew from a Yahoo! mailing list to a phpBB forum to an expansive community that now reaches across several websites and social networks: Facebook, Reddit, Discord, Telegram, and likely several more.
This, you may have observed, is a long time to live with version 0.x of anything.
There are old timers that grew fluent in toki pona with words that had been deprecated by the time <em>pu</em> was released.</p>
<p>Furthermore, as with any thriving language, new words are coined and grammatical innovations are created regularly.
So toki pona exists in three states at once: I shall follow the community in referring to these states as pre-<em>pu</em>, <em>pu</em>, and post-<em>pu</em>.
Note that these are not literal temporal states, but more so attitudes towards the language that are causally related to <em>pu</em>.</p>
<p>What complicates the story further is that these states exist in tension with each other.
<em>pu</em> implicitly rejects all toki pona that is not <em>pu</em>.
It is a critical part of toki pona’s philosophy that you use the language as it is, instead of coining new words for the complex concepts that <em>pu</em> rejects.
And, somewhat cleverly, it enforces this by having a word for itself: it is <em>pu</em> to study <em>pu</em>, and whether or not a production is supported or rejected by <em>pu</em> is a discussion that can occur, <em>in toki pona</em>.</p>
<p>In practice, the tensions look something like this.
Pre-<em>pu</em> toki pona is rather widely and fluently spoken, and adheres to the same minimalist philosophy, but contains deprecated words, in mild but direct conflict with <em>pu</em>.
Post-<em>pu</em> toki pona, on the other hand, is wary (via <em>pu</em>) of some of these pre-<em>pu</em> words, but, being a language with users that have independent thought, it is generally free to take any of them that it pleases, as well as whatever new words it coins, which <em>pu</em> and pre-<em>pu</em> both take issue with.</p>
<p>It is easy to sympathize with <em>pu</em> here, if you take its philosophy seriously: new words are circumventing the minimalism of trying to analyse the world into its primitive elements.
Plus, as its fluent speakers constantly demonstrate, it is generally very possible to find the appropriate circumlocutions to talk about any desired topic.
A very common and powerful strategy is to explain a concept in as much toki pona as you need, and then refer to it as <em>ijo ni</em> (“that thing”) or <em>jan ni</em> (“that person”) or <em>nasin ni</em> (“that method”) or a <em>ni</em>-modified version of whatever other simple noun is appropriate.
And as a final nail in the coffin, <em>pu</em> itself admits that toki pona is a simple language (a <em>toki</em> that is <em>pona</em>) and it can very easily become unsuitable for a world as complicated as ours.
toki pona is not for technical documents, according to <em>pu</em>.</p>
<p>Let me argue in post-<em>pu</em>’s defence, though.
Post-<em>pu</em> toki pona having more words than <em>pu</em> is not necessarily inconsistent with the minimalist philosophy of <em>pu</em>, if the new words are carefully justified.
Consider, for example, the post-<em>pu</em> word <em>tonsi</em>.
In <em>pu</em>, there is a single third person pronoun, <em>ona</em>, and the only two words with any relation to gender are <em>meli</em> (female, femininity; etymologically from Tok Pisin <em>meri</em>) and <em>mije</em> (male, masculinity; etym. Finnish <em>mies</em>).
However, if you are trans or nonbinary, there is no word for you in toki pona.
Although you can circumlocute with any number of options, like <em>jan pi meli mije</em> or <em>jan pi meli ala pi mije ala</em> or something,
this is an unwieldy hassle to simply state who you are.</p>
<p>So define <em>tonsi</em> (nonbinary; etym. Mandarin <em>tóngzhì</em> for ‘comrade [no political connotation]’ and, in slang, ‘LGBT’).
Now nonbinary people don’t have to be circumlocuted or explained or <em>ni</em>-ed, they simply are, and have a language primitive to match the existing ways of talking about gender: <em>jan tonsi</em> to accompany <em>jan meli</em> and <em>jan mije</em>.
The assertion is that <em>tonsi</em> is perfectly in keeping with the <em>pu</em> philosophy, and more generally there can exist strong arguments for the introduction of words otherwise excluded or neglected by <em>pu</em>.</p>
<p>And this is merely one of the many “levels of strength” to which one can be post-<em>pu</em>.
It is “post-<em>pu</em>” both to introduce an exhaustive system for number-naming (rendering the <em>wan-tu-mute</em> trick irrelevant for the sake of utility) and to use a <em>pu</em> word in a new part of speech (taking its meaning there to be the obvious one suggested by the existing polysemy and the inferred grammar of this modification).
Perhaps I am giving off the impression that I am a post-<em>pu</em> sympathizer.
That’s true, but in my opinion, it’s not as clear-cut as <em>pu</em> or post-<em>pu</em> is prescriptivist or descriptivist, “right” or “wrong”;
and this is an immensely interesting place to be.</p>
<p>One final thing to note.
According to <em>pu</em>, <em>pu</em> means either the book, interacting with the book, or being in accordance with the book.
But a fairly weak post-<em>pu</em> novelty is that <em>pu</em> can mean orthodoxy in general.
And while everyone in the community generally has a pretty clear understanding of what it means for something to be <em>pu</em> or pre-<em>pu</em> or post-<em>pu</em>,
I find it just a little funny that depending on which side of the line you fall, you might be inclined to draw the line a little differently.</p>
<hr />
<p>Because I think this is so important as to deserve archival in a ‘blog post,
I would be remiss if I did not make good on my threat to compare <em>pu</em>’s self-reference to that of religious texts.
So I searched for prior art and came upon Donald Haase’s Master’s thesis in Religious Studies.<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup></p>
<blockquote>
<p>Haase, Donald. <em>Self-Referential Features in Sacred Texts.</em> FIU Electronic Theses and Dissertations (2018), 3726. <a href="https://doi.org/10.25148/etd.FIDC006911">doi:10.25148/etd.FIDC006911</a></p>
</blockquote>
<p>It is the only thing I could find when I looked for a sincere survey of self-reference in notable ideological texts—for fans of the ‘blog, that means
I am excluding <a href="https://en.wikipedia.org/wiki/G%C3%B6del,_Escher,_Bach">Hofstadter</a> from my search on the grounds that it is, as contemporary youth might pejoratively say, “a meme” and “kinda cringe”.</p>
<p>In his study, Haase considers a very broad definition of sacred texts—any fixed and bounded sequence of words that is considered by at least one person to be sacred—which serves our purposes handily.
He finds three major categories of self-reference: inlibration, necessity, and untranslatability.
<em>pu</em> satisfies the second of these very literally, but not so much the others.</p>
<p>Since I will be skipping over them, let me briefly summarize the inapplicable two categories.
Inlibration is the proclamation that a text is the textual manifestation of a deity or other sacred entity. Clearly irrelevant here until someone takes the step of deifying <em>jan Sonja</em>.
On the other hand, untranslatability is the instruction that any translation or other non-verbatim communication of the text will fail to inherit its sacredness, being merely mundane, if not outright profane.
This has slightly more applicability, if you interpret non-<em>pu</em> lesson materials as non-verbatim communication, as their details frequently come under scrutiny for adherence to <em>pu</em>.
But, especially given the existence of <a href="https://www.amazon.fr/dp/0978292359">an official translation of <em>pu</em> into French</a>, I think modified transmission of <em>pu</em> is not controversial,
at least so long as the linguistic rules and minimalist philosophy are represented faithfully.</p>
<p>The necessity self-referential, <a href="http://www.catb.org/~esr/jargon/html/O/on-the-gripping-hand.html">on the gripping hand</a>, is when a text insists on its own necessity for some (ostensibly important) purpose.
Haase suggests many traditionally religious necessities:</p>
<blockquote>
<p>A text can describe itself as necessary in (at least) the following ways:</p>
<ul>
<li><a href="https://en.wikipedia.org/wiki/Soteriology">Soteriologically</a> Necessary: Necessary for salvation.</li>
<li><a href="https://en.wikipedia.org/wiki/Eschatology">Eschatologically</a> Necessary: Necessary to bring about the end of the world, or to make the end of the world occur in a desired way.</li>
<li>Ritualistically Necessary: Necessary for the correct performance of a ritual.</li>
<li><a href="https://en.wikipedia.org/wiki/Ontology">Ontologically</a> Necessary: Necessary for the functioning or existence of reality.</li>
<li>Commanded Necessity: Mandated to be used in some way by a sacred source (without explicit invocation of one of the above reasons).<sup>114</sup></li>
</ul>
<p>What this is not is any kind of implied necessity of the contents of the text.
For instance a text merely describing a ritual or the end of the world does not make it ritualistically or eschatologically necessary.
The most common occurrences of this though are commands within a text that the text be read, recited, studied, copied, or otherwise transmitted.</p>
<p>114. Perhaps this could be separated into distinct types.
Sample statements that would fall under this would be commands to read, learn, teach, copy, pass down, or safeguard the text.
Each of these is impossible to fulfill without the text.</p>
<p><em>[links mine, footnote from source]</em></p>
</blockquote>
<p>Of those that he lists, <em>pu</em> satisfies mainly the ritualistical necessity and the commanded necessity,
by insisting that studying <em>pu</em> is the appropriate way to learn to speak toki pona, both in the main text and cheekily from the exercises.
I would argue that there is also an indirect component, in <em>pu</em>’s strict control of its specific models of grammar and of parts of speech, which I would like to metaphorize as a glorified nonprogrammer’s version of <a href="https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form">EBNF</a>.
However, Haase explicitly ignores this implied self-referentiality in his study, for the fair purposes of objective comparison.</p>
<p>From Haase’s examples, the only sacred text he considers that falls only under the category of necessity self-referentials is the <a href="https://en.wikipedia.org/wiki/Papyrus_of_Ani">Papyrus of Ani</a>.
This is a personalized funeral scroll, containing passages that would be read at someone’s funeral by a priest so that the deceased is ensured a proper passage into the afterlife.
It also contains instructions on how to read those passages and carry out the ceremony, and even has pictures of a priest reading from that same funeral scroll.</p>
<p>At the risk of making light of ancient funeral customs, I think <em>pu</em> and the Papyrus of Ani are both comparable in the degree of insistence of their ritualistical and commanded necessities.
For contrast, Haase’s other examples, like the Quran and the Book of Mormon, are generally far stricter about their own necessity, and possess further forms of self-referentiality to boot.
It would be immensely silly to leave the comparison at that, but I think I’ll have to, being that I don’t feel qualified to talk at length about any of the other texts that Haase uses.
So overall, though the comparison is not entirely fair, I think it still definitely places <em>pu</em> somewhere interesting on a hypothetical “sacral continuum”.</p>
<hr />
<p>I don’t have a conclusion to this case study that isn’t just gesturing towards Haase’s thesis and implying that I don’t have much to add.
Strictly speaking, my motivations are orthogonal to Haase’s, though his groundwork is definitely stuff I think needs to be done and I am lucky that I do not have to do it myself just to do this case study saliently.</p>
<p>And, don’t let me let you forget it, this is just a (rather shallow) case study centered around a single text.
As I alluded to earlier, this self-referentiality lens probably has some interesting things to say about ideological manifestos, too.
I’m more interested in this because it relates to a conlang I like,
but I don’t think there’s a similar dynamic going on with too many other conlangs.
There definitely might be a couple of the popular ones worth examining, though: my money’s on <a href="https://en.wikipedia.org/wiki/Lojban">Lojban</a>, because I don’t know enough about the history of <a href="https://en.wikipedia.org/wiki/Esperanto">Esperanto</a> to assess its usefulness there.</p>
<p>As far as my <em>pu</em> opinions go, I think <em>tonsi</em> specifically is a good word and belongs in toki pona, and there are very few non-<em>pu</em> words to which I extend the same privilege, even among the generally uncontroversial pre-<em>pu</em> lexicon.
An <a href="https://www.reddit.com/r/tokipona/comments/g9ne0s/survey_results_heres_how_real_these_tp_words_are/">informal survey</a> on Reddit recently concluded that, out of 86 respondents, more than half agree that <em>tonsi</em> is a valid toki pona word.
Only a slightly larger proportion reports in favour of the meme word <em>kijetesantakalu</em> that Sonja Lang introduced into toki pona on 1 April 2009, which I know is just a funny fact but I think also deserves to be called good progress for <em>tonsi</em>.</p>
<p>I have been working on a math post in toki pona.
For anyone that is familiar with toki pona, or that paid attention earlier when I said that toki pona was not built to discuss technical subjects:
yeah, I am indeed as stupid as I sound.
No ETA on it yet, because I’m not entirely sure that it’s even going to work out comprehensibly, but my fingers are crossed.
If there’s an easy topic you’d like to see, leave me a comment below.
I won’t make any guarantees, because I’d probably have to pivot the stuff I’ve already written, but inspiration is never unwelcome.</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>The lattice fans out there will be pleased to know that lattices tried to sneak their way into this post just as much as any of my mathematical ones: I kept typoing Haase as Hasse! If one slipped past the Ctrl+F, then you can officially declare the lattices as the winner of this round. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>
Ilia Chtcherbakov
toki pona is a minimalist constructed language, created by linguist Sonja Lang between 2001 and 2014.
It features 9 consonants and 5 vowels in its phonemic inventory (cf. most dialects of English at 24 and 14-25), and those 14 phonemes combine into a 123-word vernacular.
The standard reference document on the conlang is currently Lang’s book from 2014.
There are many things to find fascinating about toki pona, but the one I’d like to explore in detail right now is this:
toki pona dedicates one of its 123 words to that specific book, despite not having a word for “book”.
Introducing: Comments!
2020-05-07T17:16:20-04:00
http://cleare.st/meta/introducing-comments
<p>For a long time, I’ve wanted comments for my Jekyll-based ‘blog.
However, the existing alternatives, Disqus and Staticman, were not satisfactory, so I decided to roll my own implementation.
It has certain drawbacks that might make it unsuitable for most applications, but I would like to tell its story here anyway, because it would not have been possible without the ‘blog posts of others.
<!--more--></p>
<p>I can’t say in advance how good of a story this makes, but I do have a fair amount of code to share with you, to help you get off the ground too.
Obviously, not the full code—I’m not entirely convinced anyone else should actually literally follow in my footsteps and choose this as the basis for their comments engine—but enough that you’ll know what to do if you were me at the time I started:
you know how programming works in general but you don’t know how these tools work specifically, and you don’t necessarily have the patience to learn them all the way through just to get a couple of measly comments on your ‘blog.</p>
<p>The technologies I’ll be using are ‘blog-aware, static website generator <a href="https://jekyllrb.com/">Jekyll</a>—which is written in <a href="https://www.ruby-lang.org/">Ruby</a> but has its templates written in a combination of HTML markup, <a href="https://sass-lang.com/">SCSS</a> styles, and <a href="https://shopify.github.io/liquid/">Liquid</a> templating—and famously maligned garbage fire <a href="https://www.php.net/">PHP</a>, which are as of this writing being served on <a href="https://httpd.apache.org/">Apache</a>.
Any compatible combination of static site generator, CGI-ready language, and CGI-capable HTTP server will do, but obviously you will have to more liberally translate the code snippets.</p>
<hr />
<p>But let’s start from the beginning.
Jekyll is a static website generator, which means its goal is to take templates and content and combine them into static webpages, which are then served by the HTTP server.
The benefit of static web content is that it can be rather confidently cached, as it is not expected to change frequently.
This reduces load on your HTTP server in two ways: firstly, it doesn’t need to do any thinking when serving the content, and secondly, it is highly amenable to caching by other services, which handle requests before they even reach it.
And ‘blogs are some of the best suited to this publishing paradigm, because they consist almost entirely of content that rarely changes.
Sure, you might correct some typos or release updates to posts, and maybe you update frequently, but there isn’t really any significant computation needed at serving time.
Thence come static site generators like Jekyll.</p>
<p>However, being a static website means that any interactive facets of websites are off the table, unless they are somehow adapted to the static paradigm, because you can’t be both dynamic and static.
For ‘blogs, this is the elephant in the room… on the table… because since the invention of comments in 1998, they have become a mainstay on ‘blogs specifically and content-based websites more generally.
It is not that weird for <em>one</em> ‘blog to not have comments, but if Jekyll claims to be “‘blog-aware”, then it needs to have an answer to the inherently dynamic nature of comments.</p>
<p>Jekyll’s default solution is to <em>outsource</em> comments to <a href="https://disqus.com/">Disqus</a>, a comment hosting service.
The ‘blog itself remains static, and each commentable page contains a widget—which is to say, an element that is dynamic only on the client-side—that defers the work of handling comments to the service.
Features like comment moderation, user verification, and spam protection are handled either by Disqus or through it by logging in to an admin panel.
For most people, this is probably the simplest and easiest solution, as it hides all the dynamic components away from the ‘blogger.</p>
<p>As is probably obvious from the historical irony of the existence of this ‘blog post, I did not find this solution satisfactory.
A petty reason is that the design is ugly. It’s an ugly widget. Using your eyes on it is a negative experience.
But here’s a more pressing reason. Because the comments are hosted on Disqus, they belong to Disqus.
If Disqus goes down, your comments go down.
To use the comments (as commentor or as ‘blogger) you need to agree to the Disqus <a href="https://help.disqus.com/en/articles/1717102-terms-of-service">Terms of Service</a>, which include a User ToS, a set of content rules called the Service Rules, a Privacy Policy, an arbitration agreement, and a further Publisher ToS for the ‘blogger.<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>
No thanks. If I’m going to have comments, they will be mine.</p>
<p class="centered"><img src="/files/blog/disqus.png" alt="&quot;Do not sell my data&quot; Disqus button" /></p>
<p class="caption">I saw this button at the bottom of a Disqus comments widget. Why does this button exist?</p>
<p>Professional website person <a href="https://www.hawksworx.com/blog/adding-a-static-comments-system-to-my-jekyll-build/">Phil Hawksworth</a> made a related but only somewhat similar observation: Disqus has way too many features, and an old school web hacker like him would much prefer a minimalist form service, which by the way can generalize the scope of “comments” to other user-generated content, like reviews or whatever.
So in 2014 he created a project called Poole, which I won’t link because it was short-lived and is now defunct.
My specific complaint surfaced in a <a href="https://www.youtube.com/watch?v=BMve1OCKj6M">talk in 2015</a> from the mouth of Tom Preston-Werner, cofounder of Jekyll and also Github, although in more moral terms.
Ultimately Preston-Werner proposed that the future of comments on static sites was packaged alongside your ‘blog data and built by some omnipresent Jekyll-running service that also builds your ‘blog any time any other change occurs, such as a commit from your computer.</p>
<p>Fellow ‘blog theorist <a href="https://eduardoboucas.com/blog/2015/05/11/rethinking-the-commenting-system-for-my-jekyll-site.html">Eduardo Bouças</a>, also following these developments, built a prototype system which looked to be the start of exactly that:
it receives user comments via some minimal dynamic webpage (in his original case a PHP script), saves them as Jekyll-compatible data (the markup <em>lingua franca</em> is <a href="https://yaml.org/">YAML</a>), and then rebuilds your static website with the new content.
If you want a moderation step, then instead of building immediately you log it somewhere and wait for the owner to respond, etc.
The main point is this calculated violation of staticness.</p>
<p>Okay, that’s pretty good. I like it.</p>
<p>Eventually Bouças matured this project and bundled it into a service called <a href="https://staticman.net/">Staticman</a>, which can either be installed and run locally on your server,
or, if you use Github Pages, used as an easily integrated plugin that lives somewhere in the ephemeral cloud and that you can grant permission to commit to your ‘blog repository on your behalf.
And this is where the story turns sour for me: I’m not on Github Pages, so that’s out, and the local installation options are either <code class="highlighter-rouge">npm</code> up a server or something (kiiind of not interested) or install <a href="https://docs.docker.com/compose/">Docker and Docker Compose</a> just to run its little bundled container (Suuuuper Not Interested In That 64 & Knuckles).</p>
<p>I get that this strategy kind of relies on there being a server-like entity that can monitor incoming submissions and trigger the rebuild, but honestly, if this is what that takes, I’ll look for something else.
That said, I don’t really think there is something else.
But what there <em>is</em> is a lineage (<a href="https://google.com/?q=adding+static+comments+to+jekyll">an absolute <em>glut</em></a>) of ‘blog posts from Bouças and others on how they built their own comment engines in the static framework, which I can adapt to my own purposes.
It’s practically a trope for a coder’s personal Jekyll ‘blog to include a post on how you too can add static comments to your static ‘blog in exactly the same way as they did.
It’s CONTENT, dude, you can’t pass up CONTENT.
Well, who am I to break tradition?<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup></p>
<p>In the end, I decided on the following setup.
Like Bouças, my solution to comments will be to violate the static constraint in the form of a PHP script accepting comments.
Unlike Bouças, however, I am under no pressure to publish comments immediately upon submission, because I would like to moderate them.
I have no particular qualms about manually checking for and moderating comments, either, because I don’t expect to receive very many for now,
so I don’t even need any sort of tool to notify me externally.
Although… one would certainly be convenient. Maybe someday I’ll hack together something that sends me an email when I get a comment.</p>
<p>I am also not interested in most comment innovations, so I won’t be including those.
Nested replies, for instance, are temporal aberrations that belong only in large fora, and organizing and implementing such a system is far more work than I care for.
Replying to people is fine, sure—I’ll include some way to link to previous comments and commentors—but I like the flow of time the way it is.
I think the one thing that falls in this category is permitting formatting: I’ll allow full Markdown and MathJax in comments, and simply moderate anything that gets out of hand.</p>
<p>In fact, since I intend to go through these comments manually every time anyway, I’ve decided to rethink the purpose of comments a little bit.
Sure, maybe I’m being a little sentimental here, thinking that I have a legitimate thing to iterate upon in this twenty-year-old feedback mechanism.
But at the very least, it’s an experiment I want to try.</p>
<p>The innovation I am proposing is the <em>private</em> comment.
When you submit a comment, you have the option of ticking a checkbox that says “private”, and if you do, then when the comment is submitted, I’ll know it was meant for me only.
It won’t be put in the publishing queue, and even if I ever decide to stop moderating comments, it won’t find its way onto the website.
I imagine this to be a low-effort alternative to sending me an email, so that you don’t even have to leave the article you’re reading to contact me.
I also hope there might be a secondary effect of implying these comments are meant to be a legitimate line of contact with me, and not just some functionality tacked onto the end of a ‘blog out of obligation or audience engagement or whatever.</p>
<hr />
<p>Now let’s get to the implementation details.
If you don’t care about that, then this is as good a place as any to stop reading, because what follows is all code snippets and explanations of Jekyll’s internal mechanisms.
For everyone that does care but isn’t using the specific technologies I’m using, I’ve tried to insert some context so that you’ll have a handier time translating this into your own setting,
but I should once again advise against following directly in my footsteps, because this was a rather idiosyncratic setup that was customized to match the specific way my brain is broken.
There are likely saner solutions to this problem that you are perfectly comfortable with, and I encourage you to seek those out.</p>
<p>First of all, the architecture.
The relevant fraction of Jekyll’s working directory/mental model looks something like the following.
There’s a <code class="highlighter-rouge">_posts/</code> folder full of Markdown files containing ‘blog posts, which are each passed through a Markdown parser and each fed into a post template <code class="highlighter-rouge">_layouts/post.html</code>.
There is also a <code class="highlighter-rouge">_data/</code> folder where you can leave data files for Jekyll to parse and make available in its templating language Liquid.
Finally, there are some other miscellaneous files at the top level, which can include Markdown, HTML, SCSS, or other files that can specify tools they need for preprocessing (Markdown, Liquid, Sass, etc.) and optionally what layout in <code class="highlighter-rouge">_layouts/</code> they desire to be set in.</p>
<p>I desire comments only on select ‘blog posts, and not at all on any other pages, so my comments engine will interface with Jekyll roughly as follows.
I have a bespoke PHP script <code class="highlighter-rouge">commentor.php</code>, acting as a dynamic page which is executed via CGI, expecting a POST request with comment data.
This saves the comment to a YAML file in <code class="highlighter-rouge">_data/comments/</code>, so that Jekyll can parse them and make them available to Liquid.
I specify in each ‘blog post’s metadata whether or not it is to have comments, and if so, which comments file is desired.
Finally I modified <code class="highlighter-rouge">post.html</code> to loop through and display the comments attached to a ‘blog post, if any exist, as well as a form for comment submission to <code class="highlighter-rouge">commentor.php</code>, if the post permits it.
The remainder of the system—moderation, publishing—is handled manually, by me, via SSH or an SCP client.
Specifically, rebuilding will publish public comments by default, but I can check a log to see if there’s been anything new.</p>
<p>Is it stupid? Yes. Does it work? Also yes. I’m a simple man.</p>
<p>Now, let’s get into the nitty-gritty. <code class="highlighter-rouge">commentor.php</code> looks like this:</p>
<div class="language-php highlighter-rouge"><pre class="highlight"><code><span class="cp"><?php</span> <span class="c1">// commentor.php
</span>
<span class="sd">/** expected CGI variables:
* $_SERVER['REMOTE_ADDR'] = ip
* $_POST['submit'] = button
* $_POST['location'] = honeypot
* $_POST['page'] = 'blog page
* $_POST['name'] = username
* $_POST['contact'] = email or whatever
* $_POST['message'] = comment body
* $_POST['private'] = private? checkbox
**/</span>
<span class="nv">$date</span> <span class="o">=</span> <span class="nb">time</span><span class="p">();</span>
<span class="c1">// initialize $ipbans
</span><span class="nv">$ipbanned</span> <span class="o">=</span> <span class="nb">in_array</span><span class="p">(</span><span class="nv">$_SERVER</span><span class="p">[</span><span class="s1">'REMOTE_ADDR'</span><span class="p">],</span> <span class="nv">$ipbans</span><span class="p">);</span>
<span class="k">if</span><span class="p">(</span><span class="k">empty</span><span class="p">(</span><span class="nv">$_POST</span><span class="p">[</span><span class="s1">'submit'</span><span class="p">]))</span> <span class="p">{</span>
<span class="nv">$outcome</span> <span class="o">=</span> <span class="s2">"form misuse"</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">elseif</span><span class="p">(</span><span class="nv">$ipbanned</span><span class="p">)</span> <span class="p">{</span>
<span class="nv">$outcome</span> <span class="o">=</span> <span class="s2">"banned"</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">elseif</span><span class="p">(</span><span class="o">!</span><span class="k">empty</span><span class="p">(</span><span class="nv">$_POST</span><span class="p">[</span><span class="s1">'location'</span><span class="p">]))</span> <span class="p">{</span>
<span class="nv">$outcome</span> <span class="o">=</span> <span class="s2">"spam"</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">elseif</span><span class="p">(</span><span class="k">empty</span><span class="p">(</span><span class="nv">$_POST</span><span class="p">[</span><span class="s1">'name'</span><span class="p">]))</span> <span class="p">{</span>
<span class="nv">$outcome</span> <span class="o">=</span> <span class="s2">"no name"</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span> <span class="c1">// empty honeypot
</span> <span class="nv">$outcome</span> <span class="o">=</span> <span class="s2">"write attempt"</span><span class="p">;</span>
<span class="c1">// initialize $comments_file
</span> <span class="nv">$public</span> <span class="o">=</span> <span class="k">empty</span><span class="p">(</span><span class="nv">$_POST</span><span class="p">[</span><span class="s1">'private'</span><span class="p">])</span> <span class="o">?</span> <span class="s2">"true"</span> <span class="o">:</span> <span class="s2">"false"</span><span class="p">;</span>
<span class="c1">// $message = sanitized $_POST['message']
</span> <span class="nv">$comment</span> <span class="o">=</span> <span class="sh"><<<HEREDOC
...
HEREDOC;
</span> <span class="k">if</span><span class="p">(</span><span class="nb">file_put_contents</span><span class="p">(</span><span class="nv">$comments_file</span><span class="p">,</span> <span class="nv">$comment</span><span class="p">,</span> <span class="nx">FILE_APPEND</span> <span class="o">|</span> <span class="nx">LOCK_EX</span><span class="p">)</span> <span class="o">!==</span> <span class="kc">false</span><span class="p">)</span> <span class="p">{</span>
<span class="nv">$outcome</span> <span class="o">=</span> <span class="s2">"success"</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">else</span> <span class="p">{</span>
<span class="nv">$outcome</span> <span class="o">=</span> <span class="s2">"write fail"</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="c1">// log $outcome and stuff somewhere (optional)
// initialize e.g. $title, $body, ...
</span>
<span class="cp">?><!DOCTYPE html></span>
<span class="nt"><html></span>
<span class="nt"><head><title></span><span class="cp"><?=</span> <span class="nv">$title</span> <span class="cp">?></span><span class="nt"></title></head></span>
<span class="nt"><body></span><span class="cp"><?=</span> <span class="nv">$body</span> <span class="cp">?></span><span class="nt"></body></span>
<span class="nt"></html></span>
</code></pre>
</div>
<p>It’s half code and half pseudocode, but that’s okay because you probably shouldn’t do this (and I probably shouldn’t be giving out all the details).
One fun thing to notice is the rudimentary spam protection, in the form of IP bans and honeypots.
If you want or need more comprehensive solutions, you can make that more sophisticated, but this works for me.</p>
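<p>For anyone translating the script to another stack, the order of the checks is the part worth copying. Here is a rough sketch of that decision chain in JavaScript rather than PHP. The function name <code class="highlighter-rouge">classifySubmission</code> and its parameters are made up for illustration: <code class="highlighter-rouge">post</code> stands in for <code class="highlighter-rouge">$_POST</code> and <code class="highlighter-rouge">remoteAddr</code> for <code class="highlighter-rouge">$_SERVER['REMOTE_ADDR']</code>. It is not a line-for-line port, since PHP's <code class="highlighter-rouge">empty()</code> and JavaScript's truthiness disagree on a few edge cases such as the string "0".</p>

```javascript
// Hypothetical JavaScript rendering of the checks in commentor.php.
// `post` stands in for $_POST, `remoteAddr` for $_SERVER['REMOTE_ADDR'].
function classifySubmission(post, remoteAddr, ipBans) {
  if (!post.submit) return "form misuse";           // not sent via the form's button
  if (ipBans.includes(remoteAddr)) return "banned"; // sender is on the ban list
  if (post.location) return "spam";                 // honeypot was filled in
  if (!post.name) return "no name";                 // required field missing
  return "write attempt";                           // all checks passed; try to save
}

// Example addresses come from the reserved documentation ranges.
const bans = ["203.0.113.7"];
console.log(classifySubmission({ submit: "Send", name: "Dwigt", location: "" }, "198.51.100.1", bans));
// prints "write attempt"
console.log(classifySubmission({ submit: "Send", name: "Bot", location: "http://spam.example" }, "198.51.100.1", bans));
// prints "spam"
```

<p>The honeypot check sits before the name check on purpose, so that a bot filling in every field is logged as spam rather than as a legitimate write attempt.</p>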
<p>The comment itself (what you assemble in the HEREDOC) looks like this:</p>
<div class="language-yaml highlighter-rouge"><pre class="highlight"><code><span class="pi">-</span> <span class="s">page</span><span class="pi">:</span> <span class="s">blog-post</span>
<span class="s">date</span><span class="pi">:</span> <span class="s">1234567890</span>
<span class="s">name</span><span class="pi">:</span> <span class="s">Dwigt Rortugal</span>
<span class="s">contact</span><span class="pi">:</span> <span class="s">drortugal@fighting.baseball</span>
<span class="s">ip</span><span class="pi">:</span> <span class="s">69.420.19.94</span>
<span class="s">public</span><span class="pi">:</span> <span class="kt">!!bool</span> <span class="s">true</span>
<span class="s">message</span><span class="pi">:</span> <span class="pi">></span>
<span class="no">cool blog very good</span>
</code></pre>
</div>
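<p>The risky part of assembling that entry is that every field comes straight from a stranger, and a stray colon or newline in a name can corrupt the whole data file. Below is one possible way to build the entry safely, sketched in JavaScript rather than PHP. The helper <code class="highlighter-rouge">formatComment</code> and its parameter names are hypothetical. It leans on the fact that JSON string literals are, as far as I know, also valid YAML double-quoted scalars, and it indents every line of the message so that it stays inside the block scalar.</p>

```javascript
// Hypothetical helper: builds one YAML list entry in the shape shown above.
function formatComment({ page, date, name, contact, ip, isPublic, message }) {
  // JSON.stringify quotes the value and escapes quotes, backslashes, and
  // newlines, so single-line fields cannot break out of their scalar.
  const q = (value) => JSON.stringify(String(value));
  // Normalise line endings, trim, then indent each line of the message.
  const body = String(message)
    .replace(/\r\n?/g, "\n")
    .trim()
    .split("\n")
    .map((line) => "    " + line)
    .join("\n");
  return [
    `- page: ${q(page)}`,
    `  date: ${Number(date)}`,
    `  name: ${q(name)}`,
    `  contact: ${q(contact)}`,
    `  ip: ${q(ip)}`,
    `  public: !!bool ${isPublic ? "true" : "false"}`,
    `  message: >`,
    body,
    "",
  ].join("\n");
}

const entry = formatComment({
  page: "blog-post",
  date: 1234567890,
  name: "Dwigt Rortugal",
  contact: "drortugal@fighting.baseball",
  ip: "192.0.2.1",
  isPublic: true,
  message: "cool blog very good",
});
console.log(entry);
```

<p>This only keeps the YAML well-formed. It does nothing about HTML or scripts inside the message, which still need sanitizing before they are rendered.</p>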
<p>These would get appended to an appropriate .yml file in <code class="highlighter-rouge">_data/comments/</code>, so that it has a list of comments in it.
When building, Jekyll will parse them and then you can access their data as <code class="highlighter-rouge"><span class="p">{</span><span class="err">{</span><span class="w"> </span><span class="err">site.data.comments</span><span class="w"> </span><span class="p">}</span><span class="err">}</span></code> in Liquid.
Using that, to every layout in <code class="highlighter-rouge">_layouts/</code> which is intended to display comments, you could add something like this, to print out all the comments:</p>
<div class="language-html highlighter-rouge"><pre class="highlight"><code><span class="c"><!-- comment displayer --></span>
{% if page.viewcomments %}
<span class="nt"><h1></span>Comments ({{ site.data.comments[page.commentspage] | where: "public", true | size }} public)<span class="nt"></h1></span>
{% for comment in site.data.comments[page.commentspage] %}
{% if comment.public %}
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"comment"</span> <span class="na">id=</span><span class="s">"c{{ comment.date }}"</span><span class="nt">></span>
<span class="c"><!-- comment no. {% increment counter %} --></span>
<span class="nt"><p</span> <span class="na">class=</span><span class="s">"comment-author"</span><span class="nt">></span>{{ comment.name }} at {{ comment.date | date: "%R, %-d %b %Y" }}:<span class="nt"></p></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"comment-body"</span><span class="nt">></span>{{ comment.message }}<span class="nt"></div></span>
<span class="nt"></div></span>
{% endif %}
{% endfor %}
{% endif %}
</code></pre>
</div>
<p>Finally, the most important part, the form that actually receives the comments and pushes them out to <code class="highlighter-rouge">commentor.php</code>.</p>
<div class="language-html highlighter-rouge"><pre class="highlight"><code>{% if page.sendcomments %}
<span class="nt"><h2></span>Submit a comment<span class="nt"></h2></span>
<span class="nt"><form</span> <span class="na">id=</span><span class="s">"comment-form"</span> <span class="na">action=</span><span class="s">"{{ 'commentor.php' | relative_url }}"</span> <span class="na">method=</span><span class="s">"post"</span><span class="nt">></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"first-line"</span><span class="nt">></span>
<span class="nt"><input</span> <span class="na">type=</span><span class="s">"hidden"</span> <span class="na">name=</span><span class="s">"page"</span> <span class="na">value=</span><span class="s">"{{ page.commentspage }}"</span> <span class="nt">/></span>
<span class="nt"><input</span> <span class="na">type=</span><span class="s">"hidden"</span> <span class="na">name=</span><span class="s">"id"</span> <span class="na">value=</span><span class="s">"{% increment counter %}"</span> <span class="nt">/></span>
<span class="nt"><input</span> <span class="na">type=</span><span class="s">"text"</span> <span class="na">name=</span><span class="s">"location"</span> <span class="na">class=</span><span class="s">"honeypot"</span> <span class="na">autocomplete=</span><span class="s">"off"</span> <span class="nt">/></span>
<span class="nt"></div></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"line"</span><span class="nt">></span>
<span class="nt"><input</span> <span class="na">type=</span><span class="s">"text"</span> <span class="na">name=</span><span class="s">"name"</span> <span class="na">placeholder=</span><span class="s">"Name"</span> <span class="na">required</span> <span class="nt">/></span>
<span class="nt"><span></span>Your name<span class="nt"></span></span>
<span class="nt"></div></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"line"</span><span class="nt">></span>
<span class="nt"><input</span> <span class="na">type=</span><span class="s">"text"</span> <span class="na">name=</span><span class="s">"contact"</span> <span class="na">placeholder=</span><span class="s">"Email address"</span><span class="nt">/></span>
<span class="nt"><span></span>An optional email (private)<span class="nt"></span></span>
<span class="nt"></div></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"line"</span><span class="nt">></span>
<span class="nt"><textarea</span> <span class="na">id=</span><span class="s">"commentmsg"</span> <span class="na">rows=</span><span class="s">"10"</span> <span class="na">cols=</span><span class="s">"60"</span> <span class="na">name=</span><span class="s">"message"</span> <span class="na">placeholder=</span><span class="s">"Comment"</span> <span class="na">required</span><span class="nt">></textarea></span>
<span class="nt"></div></span>
<span class="nt"><div</span> <span class="na">class=</span><span class="s">"line"</span><span class="nt">></span>
<span class="nt"><input</span> <span class="na">type=</span><span class="s">"submit"</span> <span class="na">name=</span><span class="s">"submit"</span> <span class="na">value=</span><span class="s">"Send"</span> <span class="nt">/></span>
<span class="nt"><span></span>Private comment?<span class="nt"></span></span>
<span class="nt"><input</span> <span class="na">type=</span><span class="s">"checkbox"</span> <span class="na">name=</span><span class="s">"private"</span> <span class="na">value=</span><span class="s">"yes"</span> <span class="nt">/></span>
<span class="nt"></div></span>
<span class="nt"></form></span>
{% else %}
<span class="nt"><h2</span> <span class="na">class=</span><span class="s">"nothing-there"</span><span class="nt">></span>Comments have been disabled for this post.<span class="nt"></h2></span>
{% endif %}
</code></pre>
</div>
<p>The way I wrote it, I can enable or disable comments on a per-post basis, both for viewing (<code class="highlighter-rouge">page.viewcomments</code>, <code class="highlighter-rouge">page.commentspage</code>) and submission (<code class="highlighter-rouge">page.sendcomments</code>), by adding or removing those fields in the Jekyll preambles of the posts.</p>
<p>At this point, the sky’s the limit. You can style this how you want, in <code class="highlighter-rouge">assets/main.scss</code> or wherever you’re keeping your styles.
Here’s a hot tip to get you started: you can strip the comment body, like <code class="highlighter-rouge"><span class="p">{</span><span class="err">{</span><span class="w"> </span><span class="err">comment.message</span><span class="w"> </span><span class="err">|</span><span class="w"> </span><span class="err">strip</span><span class="w"> </span><span class="p">}</span><span class="err">}</span></code>, and then style it with <code class="highlighter-rouge">white-space: pre-line;</code> so that it preserves people’s newlines internally but cleans up any starting or ending whitespace.
Jekyll also extends Liquid with a <code class="highlighter-rouge">markdownify</code> filter, so you can enable Markdown in comments by running the message through that.</p>
<p>You want another tip? Sure, why not?
If you want to make replying easier, then give each comment an <code class="highlighter-rouge">id</code> (I used <code class="highlighter-rouge">c{{comment.date}}</code> in my example above) and then also throw an <code class="highlighter-rouge">onclick</code> attribute on the name, containing some JavaScript to append a Markdown link to that comment to the message box:</p>
<div class="language-javascript highlighter-rouge"><pre class="highlight"><code><span class="nb">document</span><span class="p">.</span><span class="nx">getElementById</span><span class="p">(</span><span class="s1">'commentmsg'</span><span class="p">).</span><span class="nx">value</span> <span class="o">+=</span> <span class="s1">'@[{{comment.name}}](#c{{comment.date}}) '</span>
</code></pre>
</div>
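<p>One wrinkle with pasting a name straight into an inline handler is that a name containing brackets or quotes can mangle the link. A slightly more defensive version might look like the following sketch. The function names <code class="highlighter-rouge">replyLink</code> and <code class="highlighter-rouge">appendReply</code> are invented for illustration, and the sketch assumes the reply is an inline Markdown link pointing at the comment's <code class="highlighter-rouge">id</code>.</p>

```javascript
// Hypothetical helpers for the reply-on-click behaviour described above.
function replyLink(name, commentId) {
  // Backslash-escape brackets and backslashes so an odd name
  // cannot terminate the Markdown link text early.
  const safeName = String(name).replace(/[\\\[\]]/g, "\\$&");
  return `@[${safeName}](#${commentId}) `;
}

function appendReply(textarea, name, commentId) {
  textarea.value += replyLink(name, commentId);
}

// Any object with a `value` property works, so this also runs outside a browser.
const fakeTextarea = { value: "" };
appendReply(fakeTextarea, "Dwigt Rortugal", "c1234567890");
console.log(fakeTextarea.value); // prints "@[Dwigt Rortugal](#c1234567890) "
```

<p>In the page itself, the textarea would be <code class="highlighter-rouge">document.getElementById('commentmsg')</code>, and the name and id would be read from <code class="highlighter-rouge">data-</code> attributes rather than interpolated into the script.</p>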
<p>Like I said, the sky’s the limit. Well, that and your webdesign skills, I suppose. And your sense of prudence. And your tolerance for bad ideas.
There are a lot of limits, actually. Whatever.</p>
<hr />
<p>As I conclude, I would like to remind you once more why it is a bad idea to implement comments the way I did.</p>
<p>For one, there is no notification system. All comments go through me and their timely publishing relies on me to check for them.
If you’re interested in engagement or public discourse or whatever then this is terrible because that means nothing can happen independent of your presence.
Even from a UX perspective this is kind of weird, because the only feedback that a comment was received is the confirmation page, and the comment doesn’t show up.
Corrective action for this looks like some kind of notification system, possibly via email.
I am loosely interested in this, because the problem of deciding when to notify seems like an interesting one, but it is not pressing, because I don’t think the downsides to the system as-is are that awful.</p>
<p>For another, the specific details of this implementation mean that people can sneak comments in if they comment right as I am building the website for an unrelated change.
This is fairly easily addressed, simply by adding a field for whether a comment has passed moderation.
I don’t expect voluminous comments, but this seems fairly easy to implement, and I can’t really think of a good reason for not going back and doing it now besides “I don’t feel like it”?</p>
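<p>The logic of that fix is small enough to sketch. The field name <code class="highlighter-rouge">moderated</code> is an assumption, since the entries shown earlier do not have it. In the real template this would be one more condition in the Liquid loop. It is written here as a JavaScript function only to make the rule explicit.</p>

```javascript
// `moderated` is a hypothetical field; the comment entries shown earlier lack it.
function publishable(comments) {
  // Render only comments that are public AND have been explicitly approved.
  // A freshly appended comment has no `moderated` flag, so it stays hidden
  // even if the site happens to be rebuilt before anyone has reviewed it.
  return comments.filter((c) => c.public === true && c.moderated === true);
}

const sample = [
  { name: "approved", public: true, moderated: true },
  { name: "fresh", public: true },                      // not yet reviewed
  { name: "private", public: false, moderated: true },  // never published
];
console.log(publishable(sample).map((c) => c.name)); // prints [ 'approved' ]
```

<p>Defaulting to hidden means that forgetting to moderate delays a comment instead of publishing it unreviewed.</p>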
<p>Finally, it’s a lot of work for something as simple as comments.
Just sell your soul to Disqus or Github for the convenience.
I gather that indifference about corporate contracts is pretty hip.
Maybe I’m sounding too cynical: as you can imagine I have pretty strong opinions about this stuff.
I don’t have a Facebook account even though I am in principle a member of the generation for whom it was hip to use Facebook in high school.
(Now I hear it’s Boomer central and all the cool kids are on <strike>Twitter</strike> <strike>Instagram</strike> <strike>LinkedIn?</strike> TikTok. IDK, I don’t keep up with social media demographics.)
Regardless, this may still be a valid reason for some people not to follow this implementation.</p>
<p>Upon reflection, I actually don’t know how that last one might be the problem for you if the previous two were okay.
Maybe you’re a younger version of me, and knowing that something is stupid isn’t going to stop you from doing it?
That’s probably as good a signal as any that I should stop trying to come up with stuff to write.</p>
<p>So that’s the end of my ‘blog post. If you liked this ‘blog post, give it a Like, or in some cases preferably a Dislike, so that The Algorithm will increase my Exposure to and Impressions on Suitable Demographics.
If you haven’t inflated my famousness number yet, you should definitely subscribe and activate all of the related notification options, so that a corporation can see you prostrating yourself on its altar and be content, however fleetingly, with your obsequiousness.</p>
<p>And, of course, leave a comment below! 😉</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Here’s some of the fine print in the <a href="https://help.disqus.com/en/articles/1717102-terms-of-service">Disqus Terms of Service</a>, specifically out of the User ToS (UToS), Publisher ToS (PToS), and Privacy Policy (PP). Disqus owns the comments (PToS§4.1) and can give them to whoever it wants (PToS§4.2, PP§§2.b.iv, 5.d.iv-vii, 6). Users grant (and must be able to grant) Disqus license to all creative rights to their comments (UToS§Rights Regarding User Content) except as is strictly unenforceable by intellectual property law (PToS§5). Your own use of your website’s comments, analytics, and moderation is on a revocable license (PToS§4.1) and you have to relinquish and/or destroy all of it at Disqus’ request (PToS§4.2), not to mention Disqus is under no obligation to give any of it to you if the agreement is terminated (PToS§10.2). Disqus can use your ‘blog’s branding in its marketing materials (PToS§11.8) and you have to notify them in writing if you do not agree to this, or any of the other handful of things you are legally allowed to disagree to. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>Yeah, yeah, I know there’s no level of self-awareness I can mount to excuse my own participation in this sordid tradition. But I think I understand what this kind of post means to someone, symbolically. For me this post is also: an announcement to everyone that my ‘blog now has comments; a proof to myself that I can still make computers do what I want and that I will not be bested by technology; and most importantly closure for a long-standing project. So I suppose I should be willing to extend this courtesy to all the bandwagoners and not just the truly innovating ones. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Ilia ChtcherbakovFor a long time, I’ve wanted comments for my Jekyll-based ‘blog.
However, the existing alternatives, Disqus and Staticman, were not satisfactory, so I decided to roll up my own implementation.
It has certain drawbacks that might make it unsuitable for most applications, but I would like to tell its story here anyway, because it would not have been possible without the ‘blog posts of others.The cheapest path problem and idempotent semirings2020-04-20T19:47:30-04:002020-04-20T19:47:30-04:00http://cleare.st/math/cheapest-paths-and-idempotent-semirings<p>Let me propose an interesting theoretical variant of the shortest path problem. You have a directed graph with source and target vertices, and every edge has a cost to traverse it.
However, not every cost is in good old American dollars—some are in alternative currencies, and you don’t know what the exchange rates will be until the day of your trip.
<!--more-->
You can’t tell for sure how much it’s going to cost, but there’s still a significant amount of precomputation that you can do, and we’re going to employ some fun algebra to do it.</p>
<p>So here’s the setup. You have a directed graph <script type="math/tex">G</script> and each edge <script type="math/tex">e \in E(G)</script> has a cost <script type="math/tex">c_e</script>.
You can add costs together <script type="math/tex">c+c'</script>, and you can compare costs, so we have an <strong>ordered monoid</strong>.
To be precise, this is a monoid <script type="math/tex">(M, {\cdot})</script> equipped with a partial order <script type="math/tex">\le</script> such that if <script type="math/tex">m \le m'</script> then <script type="math/tex">m \cdot n \le m' \cdot n</script> and <script type="math/tex">n \cdot m \le n \cdot m'</script>, for all <script type="math/tex">m, m', n \in M</script>.
I know I said you add costs together but I’m going to be calling the monoid operation <script type="math/tex">\cdot</script> from now on, so, like, I’m sorry for deceiving you for five seconds.
The monoid identity is gonna be called <script type="math/tex">1</script>, too, just so we’re clear.</p>
<p>Anyway, the task is to produce the set of all cheapest costs, which is to say, the set of all costs that are the cheapest in some linearization of the monoid <script type="math/tex">M</script>: some total order <script type="math/tex">\le'</script> extending the partial order.
Such a total order is a characterization of prices as they are on the day of travel.
The hope is that in a simple enough monoid, antichains would be reasonably small, so flattening an antichain to a final cost is not as complicated as the preprocessing that is done on the graph now.</p>
<p>Notice that we are not requesting very much of the monoid of costs:
in the currency example it would be very reasonable to stipulate that <script type="math/tex">M</script> be commutative (<script type="math/tex">m \cdot n = n \cdot m</script>), positive (<script type="math/tex">1 \le m</script>), etc.
(Actually, positive is an interesting condition to keep in mind as we proceed, especially if the digraph contains any directed cycles.)</p>
<p>The most important thing to not request is that the order be total: we are not guaranteed to compare any two monoid elements.
In fact, equality is a perfectly valid partial order—every monoid is an ordered monoid under “<script type="math/tex">m \le n</script> iff <script type="math/tex">m = n</script>”—and this represents the worst case of knowing absolutely nothing about the relative costs of anything, even compared to the trivial cost <script type="math/tex">1</script>.</p>
<p>So what are we to do? The only thing we intuitively <em>can</em> do: carry around all the possible costs that we see, only throwing out those <script type="math/tex">c</script> for which we find a provably better cost <script type="math/tex">% <![CDATA[
c' < c %]]></script> for the same task.
Then, I don’t know, I guess you can run some graph algo and just keep all the incomparable results you see and throw away the big ones and it probably works out, right?</p>
<p>Well, I mean, yeah, it does… but for some (arguably) subtle and (debatably) interesting reasons.</p>
<hr />
<p>So let’s start by making this precise.
When solving the graph problem, instead of carrying one cost <script type="math/tex">c</script>, we want to carry a subset of costs.
And furthermore, whenever two costs are comparable—comparable by the order in the monoid and also algorithmically permitted to be comparable by having the same source and target—we keep the lesser.
A more expensive subpath can’t possibly lead to a less expensive final path, so this checks out.
So, we won’t be encountering just any old subsets of the costs, we will be dealing with <em>antichains</em>.</p>
<p>An <strong>antichain</strong> <script type="math/tex">A \subseteq M</script> is a set of pairwise incomparable elements, which is to say, for all <script type="math/tex">a,b \in A</script>, if <script type="math/tex">a \le b</script> then <script type="math/tex">a = b</script>.
The set of finite antichains on a poset like <script type="math/tex">M</script> will be denoted by <script type="math/tex">\mathfrak A(M)</script>, because I haven’t used a fraktur letter in a while and I’m horny.
If we have two antichains, we can merge them together and take the best of both, by considering the minimal elements of their union,</p>
<script type="math/tex; mode=display">A \wedge B = \text{min'l}(A \cup B).</script>
<p>I am using the <script type="math/tex">\wedge</script> symbol for this operation because it makes <script type="math/tex">\mathfrak A</script> into a <strong>semilattice</strong>: it is associative, commutative, and <em>idempotent</em> (<script type="math/tex">A \wedge A = A</script>).
Formally, it makes no difference whether the semilattice operation is a join or a meet, but in our case we will intuitively be considering it a meet, because we are interested in minimal costs.
As such, the partial order on <script type="math/tex">\mathfrak A</script> induced by this meet is <script type="math/tex">A \le B</script> iff <script type="math/tex">A \wedge B = A</script>, iff for all <script type="math/tex">b \in B</script> there exists <script type="math/tex">a \in A</script> with <script type="math/tex">a \le b</script>.
This has the somewhat unfortunate side effect that subsets of antichains (which are still antichains) are larger than their supersets: if <script type="math/tex">A \subseteq B</script> then <script type="math/tex">A \ge B</script>.
Oh well. Not everything is sunshine and rainbows.</p>
<p>Now, if you’re me, you might be tempted to do an aside and talk about if this semilattice is a lattice.
And though it is indeed very tempting, it’s a little long to get into the details, so I will just spoil the ending: it is a lattice, at least whenever the set of common upper bounds below is adequately described by finitely many minimal elements,</p>
<script type="math/tex; mode=display">A \vee B = \text{min'l}\bigl( \{ m \in M : m \ge a \ \text{and}\ m \ge b \ \text{for some}\ a \in A,\ b \in B \} \bigr),</script>
<p>and the proof that it works is just ugly casework.</p>
<p>In any case, whether or not <script type="math/tex">\mathfrak A</script> has a lattice join doesn’t actually matter, because we only care about minimizing costs.
What does matter is the operation on antichains induced by the monoid operation.
We know what to do with the singleton antichains—<script type="math/tex">\{m\} \cdot \{n\} = \{ m \cdot n \}</script>—and since every antichain is the meet of the singletons of its elements, we can extend this to all antichains by distributivity:</p>
<script type="math/tex; mode=display">A \cdot B = \biggl( \bigwedge_{a \in A} \{a\} \biggr) \cdot \biggl( \bigwedge_{b \in B} \{b\} \biggr) = \bigwedge_{a \in A} \bigwedge_{b \in B} \{a \cdot b\}.</script>
<p>This is where we rely on the fact that we defined <script type="math/tex">\mathfrak A</script> to be the finite antichains.
Up until this point, we could do things for all antichains, but if <script type="math/tex">\mathfrak A</script> is not a complete semilattice then this infinite meet may not be defined.
You can’t even dodge this by externally declaring it’s just the minimal elements of the setwise product <script type="math/tex">\{ a \cdot b : a \in A, b \in B \}</script> because there’s no guarantee it has any, let alone is adequately described by them.<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup></p>
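<p>To make the two operations concrete, here is a minimal Python sketch under some loud assumptions. The monoid is taken to be pairs of nonnegative integers (think dollars and euros) with componentwise addition and the componentwise order, which is one convenient example and nothing more. The names <code>leq</code>, <code>minimal</code>, <code>meet</code>, and <code>times</code> are made up for this illustration.</p>

```python
from itertools import product

# Assumption: a cost is a pair (dollars, euros) of nonnegative integers.
# The monoid operation is componentwise addition, with identity (0, 0).

def leq(a, b):
    """Componentwise partial order: a costs no more than b in every currency."""
    return all(y >= x for x, y in zip(a, b))

def minimal(costs):
    """Throw away every cost that is beaten by a different cost in the set."""
    costs = set(costs)
    return frozenset(
        c for c in costs
        if not any(d != c and leq(d, c) for d in costs)
    )

def meet(A, B):
    """Meet of two antichains: the minimal elements of their union."""
    return minimal(set(A).union(B))

def times(A, B):
    """Product of two antichains: the minimal elements of the setwise product."""
    return minimal(
        tuple(x + y for x, y in zip(a, b)) for a, b in product(A, B)
    )

A = {(1, 0), (0, 2)}
B = {(0, 1), (3, 0)}
print(sorted(meet(A, B)))   # [(0, 1), (1, 0)]
print(sorted(times(A, B)))  # [(0, 3), (1, 1), (4, 0)]
```

<p>In the product, the candidate <code>(3, 2)</code> is discarded because <code>(1, 1)</code> beats it in both currencies, which is exactly the pruning that the meet is supposed to perform. Since everything here is a finite set, the worry about infinite meets never comes up.</p>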
<p>This package of data <script type="math/tex">(\mathfrak A(M), {\wedge}, {\cdot})</script> is an example of an <strong>idempotent semiring</strong>.
Recall that a semiring <script type="math/tex">(R, {+}, {\cdot})</script> is a set <script type="math/tex">R</script> equipped with two monoid operations, a commutative addition <script type="math/tex">+</script> with identity <script type="math/tex">0</script> and a not-necessarily-commutative multiplication <script type="math/tex">\cdot</script> with identity <script type="math/tex">1</script>, and the further stipulations that <script type="math/tex">\cdot</script> distributes over <script type="math/tex">+</script> and that <script type="math/tex">0</script> is absorbing for <script type="math/tex">\cdot</script>.
Of course, every ring is a semiring, and the most famous example not arising from a ring is the natural numbers <script type="math/tex">(\mathbb N, {+}, {\times})</script>.</p>
<p>A semiring is (additively) idempotent if <script type="math/tex">r + r = r</script> for all <script type="math/tex">r \in R</script>.
A particularly famous example is the <em>tropical semiring</em> <script type="math/tex">(\mathbb R \cup \{\infty\}, {\min}, {+}')</script>, where the multiplication <script type="math/tex">+'</script> is the usual real addition extended to have <script type="math/tex">\infty</script> as an absorbing element.
(Its fame comes from tropical geometry, a hot topic in algebraic geometry as of late.)
Idempotence means the addition is a semilattice operation, and as such defines a partial order on the semiring: <script type="math/tex">a \le b</script> iff <script type="math/tex">a + b = a</script>.<sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>
Furthermore, because of distributivity, this order is a monoid order on the multiplicative monoid <script type="math/tex">(R, {\cdot})</script>.</p>
<blockquote>
<p><strong>Exercise.</strong> Verify that for any idempotent semiring, <script type="math/tex">\le</script> is a semilattice ordering of the multiplicative monoid.
That is, show that <script type="math/tex">\le</script> is:</p>
<ul>
<li>reflexive: <script type="math/tex">a \le a</script>;</li>
<li>antisymmetric: <script type="math/tex">a \le b</script> and <script type="math/tex">b \le a</script> implies <script type="math/tex">a = b</script>;</li>
<li>transitive: <script type="math/tex">a \le b</script> and <script type="math/tex">b \le c</script> implies <script type="math/tex">a \le c</script>;</li>
<li>a meet-semilattice order: <script type="math/tex">a \le b</script> and <script type="math/tex">a \le c</script> iff <script type="math/tex">a \le b + c</script>;</li>
<li>a monoid order: <script type="math/tex">a \le b</script> implies <script type="math/tex">a \cdot c \le b \cdot c</script> and <script type="math/tex">c \cdot a \le c \cdot b</script>.</li>
</ul>
</blockquote>
<p>Let’s quickly take stock of our idempotent semiring <script type="math/tex">(\mathfrak A(M), {\wedge}, {\cdot})</script>.</p>
<ul>
<li><script type="math/tex">\mathfrak A(M)</script> is the set of finite antichains of our ordered monoid <script type="math/tex">M</script>.</li>
<li><script type="math/tex">\wedge</script> takes the minimal elements of the union of its two operands, so it’s associative, commutative, and its identity element is the empty antichain <script type="math/tex">\varnothing \in \mathfrak A</script>.</li>
<li><script type="math/tex">\wedge</script> can be interpreted as the meet of a semilattice, so it determines a partial order <script type="math/tex">\le</script>: the order it induces on the singleton antichains agrees with the monoid order on <script type="math/tex">M</script>, and the order it induces on subsets of any fixed antichain agrees with the superset order (if <script type="math/tex">A \supseteq A'</script> then <script type="math/tex">A \le A'</script>).</li>
<li><script type="math/tex">\cdot</script> takes the minimal elements of the setwise product of its operands, so it’s associative, and its identity element is <script type="math/tex">\{1\}</script>, the singleton containing the identity of <script type="math/tex">M</script>. <script type="math/tex">\cdot</script> is commutative iff <script type="math/tex">M</script> is.</li>
<li><script type="math/tex">\varnothing</script> is an absorbing element for <script type="math/tex">\cdot</script>: <script type="math/tex">\varnothing \cdot A = \varnothing = A \cdot \varnothing</script>.</li>
<li><script type="math/tex">\varnothing</script> is the greatest element of the poset of antichains—representing a literally priceless cost—and if <script type="math/tex">M</script> is positive then <script type="math/tex">\{1\}</script> is the least element.</li>
</ul>
<hr />
<p>Now that we have our costs in a cute little arithmetical package, we can unleash it on the problem.
Recall from the setup: <script type="math/tex">G</script> is a directed graph, and <script type="math/tex">c : E(G) \to M</script> is an assignment of costs to the edges.
The cost of a path <script type="math/tex">(e_1, e_2, \dots, e_n)</script> is the product of the costs along that path, <script type="math/tex">c(e_1) \cdot c(e_2) \cdots c(e_n)</script>.</p>
<p>Recall also that the goal is to find all possibly cheapest paths from some source <script type="math/tex">s \in V(G)</script> to some target <script type="math/tex">t \in V(G)</script>, subject to the indeterminacy of “some pairs of costs in <script type="math/tex">M</script> may not be comparable”.
In <script type="math/tex">\mathfrak A</script>, we still are not able to compare costs, but if they come from paths that have the same start and end points, we can combine them without much thought, by simply taking their meet in <script type="math/tex">\mathfrak A</script>.
By construction, we know how to interpret <script type="math/tex">M</script> in <script type="math/tex">\mathfrak A(M)</script>, as singleton sets are antichains.</p>
<p>Immediately we can observe that a cheapest path from <script type="math/tex">s</script> to <script type="math/tex">t</script> is only guaranteed to exist if every directed cycle costs at least as much as the identity, that is, every directed cycle <script type="math/tex">C</script> satisfies <script type="math/tex">c(C) \ge 1</script>.
If not, then there is some linearization of the monoid order—some cost conversion eventuality on the day of your trip—where the more you travel around that cycle, the cheaper the trip will be.
So in the following analysis I will silently ignore this possibility, because its treatment is exactly the same as in the shortest path problem.</p>
<p>A second observation is that if we have a cheapest path from <script type="math/tex">s</script> to <script type="math/tex">t</script>, then every subpath is also a cheapest path between its own start and endpoints.
That is to say, this problem is amenable to dynamic programming in precisely the same way as the shortest path problem is.
Some of you reading may already see where this is going, but for everyone else, I will take it one step at a time.</p>
<p>First of all, let’s look at cheapest paths of length at most one. I claim it’s pretty easy to see that it’s solved by the following function <script type="math/tex">c_1 : V(G)^2 \to \mathfrak A</script>, defined as</p>
<script type="math/tex; mode=display">% <![CDATA[
c_1(s,t) = \begin{cases} \{1\} & \text{if}\ s=t, \\ \bigwedge( c(e) : s \overset e\longrightarrow t ) & \text{otherwise}. \end{cases} %]]></script>
<p><script type="math/tex">\{1\}</script> of course is the least possible cost, representing free transit, which I am implicitly assuming is the cost for simply staying at a given vertex.
If the digraph has at most one edge between any two vertices, then the big meet is not necessary, so long as it is acknowledged that nonedges are equivalent to edges whose cost is <script type="math/tex">\varnothing \in \mathfrak A</script>.</p>
<p>Now, the cheapest paths of larger lengths <script type="math/tex">c_k : V(G)^2 \to \mathfrak A</script> are a breeze:</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
c_2(s,t) &= \bigwedge_{v \in V(G)} c_1(s,v) \cdot c_1(v,t), \\
c_3(s,t) &= \bigwedge_{v \in V(G)} c_2(s,v) \cdot c_1(v,t), \\
c_4(s,t) &= \bigwedge_{v \in V(G)} c_3(s,v) \cdot c_1(v,t), \dots
\end{align*} %]]></script>
<p>And since <script type="math/tex">c_1(t,t) = \{1\}</script>, we have <script type="math/tex">c_{k+1}(s,t) \le c_k(s,t)</script>, which means you just need to repeat this calculation until you are satisfied that there are no longer paths that need to be considered.</p>
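<p>The recursion above can be sketched in a few lines of Python. As before, this is an illustration under assumptions rather than a definitive implementation: costs are pairs of nonnegative integers under componentwise addition and order (so the monoid is positive and there are no profitable cycles), vertices are numbered from <code>0</code>, and the helper names <code>one_step</code>, <code>extend</code>, and <code>cheapest</code> are hypothetical.</p>

```python
from itertools import product

# Assumption: costs are pairs of nonnegative integers, added componentwise.

def leq(a, b):
    return all(y >= x for x, y in zip(a, b))

def minimal(costs):
    costs = set(costs)
    return frozenset(c for c in costs if not any(d != c and leq(d, c) for d in costs))

def meet(A, B):
    return minimal(set(A).union(B))

def times(A, B):
    return minimal(tuple(x + y for x, y in zip(a, b)) for a, b in product(A, B))

ONE = frozenset({(0, 0)})   # the antichain {1}: free transit
EMPTY = frozenset()         # the empty antichain: no path known

def one_step(n, edges):
    """Build c_1: {1} on the diagonal, the meet of parallel edge costs elsewhere."""
    c1 = [[ONE if s == t else EMPTY for t in range(n)] for s in range(n)]
    for s, t, cost in edges:
        if s != t:
            c1[s][t] = meet(c1[s][t], {cost})
    return c1

def extend(ck, c1):
    """Compute c_{k+1}(s, t) as the meet over v of c_k(s, v) * c_1(v, t)."""
    n = len(c1)
    out = [[EMPTY] * n for _ in range(n)]
    for s in range(n):
        for t in range(n):
            for v in range(n):
                out[s][t] = meet(out[s][t], times(ck[s][v], c1[v][t]))
    return out

def cheapest(n, edges):
    c1 = one_step(n, edges)
    ck = c1
    for _ in range(n):       # with positive costs, n passes are more than enough
        nxt = extend(ck, c1)
        if nxt == ck:        # c_{k+1} = c_k, so no later step can change anything
            break
        ck = nxt
    return ck

edges = [(0, 1, (1, 0)), (1, 2, (1, 0)), (0, 2, (0, 3)), (0, 2, (5, 0))]
print(sorted(cheapest(3, edges)[0][2]))  # [(0, 3), (2, 0)]
```

<p>In this toy graph the direct five-dollar edge is beaten by the two-dollar route through vertex <code>1</code>, while the three-euro edge survives because nothing is known about the exchange rate.</p>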
<p>Now, this may look kind of complicated, but you have probably seen an algorithm of this form before, though possibly in an unexpected way.
You see, in the semiring <script type="math/tex">\mathfrak A</script>, a computation of the form <script type="math/tex">\bigwedge_i a_i \cdot b_i</script> is a dot product.
We can actually view <script type="math/tex">c_1</script> as a <script type="math/tex">V(G) \times V(G)</script>-matrix with coefficients in the semiring <script type="math/tex">\mathfrak A</script>, and then <script type="math/tex">c_k</script> is just the matrix power <script type="math/tex">c_1^k</script>!
The addition is unorthodox compared to your run-of-the-mill linear algebra, sure, but in the arithmetic of <script type="math/tex">\mathfrak A</script> it is perfectly natural, and indeed you can view <script type="math/tex">c_1</script> as an obvious adjacency matrix for <script type="math/tex">G</script> with coefficients from <script type="math/tex">\mathfrak A</script>.<sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup></p>
<p>One final observation about this computation.
Because of the “idempotence” property <script type="math/tex">c_{k+1} \le c_k</script>, overshooting isn’t really a bad thing, so you can repeatedly square the matrix instead of naively multiplying on copies of <script type="math/tex">c_1</script>, taking full advantage of the “exponentiation by squaring” principle.
I don’t think this gets you any serious computational gain if you actually track the timespace complexity of building and computing on antichains, but it’s pretty cool, in my opinion.</p>
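<p>Here is a small sketch of the squaring trick. To keep it short I have swapped in the tropical semiring (plain numbers with <code>min</code> as addition and <code>+</code> as multiplication) in place of the antichain semiring; the same loop should work for any idempotent semiring whose matrix has the multiplicative identity on the diagonal. The function names are invented for this example, and nonnegative edge weights are assumed.</p>

```python
import math

INF = math.inf  # the tropical "zero": no path at all

def min_plus(A, B):
    """Matrix product over the tropical semiring: min plays +, and + plays *."""
    n = len(A)
    return [
        [min(A[s][v] + B[v][t] for v in range(n)) for t in range(n)]
        for s in range(n)
    ]

def closure_by_squaring(c1):
    """Square c_1 until the exponent reaches n - 1.

    Overshooting is harmless because c_{k+1} is never worse than c_k,
    which relies on the diagonal of c_1 being the multiplicative identity 0.
    """
    n = len(c1)
    power, k = c1, 1
    while n - 1 > k:
        power = min_plus(power, power)  # c_k becomes c_{2k}
        k *= 2
    return power

# A path 0 -> 1 -> 2 -> 3 with unit weights, plus an expensive shortcut 0 -> 3.
c1 = [
    [0,   1,   INF, 10],
    [INF, 0,   1,   INF],
    [INF, INF, 0,   1],
    [INF, INF, INF, 0],
]
print(closure_by_squaring(c1)[0][3])  # 3
```

<p>Two squarings reach the fourth power, which already covers every path with at most three edges, so the shortcut of weight 10 loses to the three-edge route.</p>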
<hr />
<p>To me, this solution is satisfying. To some of you, it might not be.</p>
<p>Perhaps you imagined a stricter variant of the problem, where the task is to produce a list of paths that enacts each of the cheapest costs.
Depending on who you are, this is either obviously equivalent or obviously inequivalent.
I am of the former position, but regardless of whether or not you agree with me, the procedure to accommodate you is standard.
In fact, this whole matrix-over-idempotent-semirings approach is essentially an algebraic recasting of <a href="https://en.wikipedia.org/wiki/Floyd%E2%80%93Warshall_algorithm">the Floyd–Warshall algorithm</a>, so that discussion may be your next destination.
I myself am not particularly interested in that line of study, as it lacks a certain elegance—yes, I think <script type="math/tex">\mathfrak A</script> is elegant—and is more of a necessary evil sort of math.</p>
<p>This topic is also a good springboard to talk about the use of idempotent semirings in combinatorial optimization.
Since the late 19th century, semirings have been trying to find a way to break into mainstream algebra.
While they have largely failed to break the stranglehold that rings have on algebra, they persist, and by the 1970s or so they finally started appearing in the work of applied mathematicians and computer scientists, who noted how much could be cast in their language.
Idempotent semirings are especially valued, precisely for their ability to be viewed as an operation with a compatible and pleasant partial order.
A <em>min-plus</em>–style semiring like <script type="math/tex">(\mathbb R \cup \{\infty\}, {\min}, {+})</script>, for example, allows one to perform optimization, like in this ‘blog post,
while a <em>max-plus</em> semiring like <script type="math/tex">(\mathbb R \cup \{-\infty\}, {\max}, {+})</script> is more handy for recasting scheduling-type problems.</p>
<p>The study is only about as old as computer science is, and generally lies neglected out of what I can only assume is a distaste for unnecessary algebra, but it is not without its textbooks.
I rather like the texture and mouthfeel<sup id="fnref:4"><a href="#fn:4" class="footnote">4</a></sup> of Golan, <a href="https://doi.org/10.1007/978-94-015-9333-5"><em>Semirings and their Applications</em></a>, but I suppose it would be dishonest not to mention the newer (and not extremely worse typeset) book by <a href="https://doi.org/10.1007/978-0-387-75450-5">Gondran and Minoux</a>.
Full disclosure, I haven’t read them all the way through (I actually never learned how to read, I just like the pretty pictures) so I can’t guarantee it’s not a waste of time, but I mean, it’s semiring theory.
You’ve already admitted you don’t value your time by reading this ‘blog post all the way through, so why stop after dipping your toes in?</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>The meet-semilattice of antichains almost coincides with the complete lattice of the upper sets. They coincide when <script type="math/tex">M</script> satisfies the <em>descending chain condition</em>—that no sequence <script type="math/tex">a_1 > a_2 > \cdots</script> can continue indefinitely—which at first blush sounds like a tough guarantee. However, it follows from the simple assumption that the monoid is positive, that is, <script type="math/tex">1 \le m</script> for all <script type="math/tex">m \in M</script>. On any graph, the set of costs that appear on edges is a finite set, and hence gives a finitely generated submonoid of <script type="math/tex">M</script> which inherits the order, and that alongside positivity gives you the preconditions for Higman’s Lemma from universal algebra. The conclusion is that the order is a wellquasiorder, which is equivalent to DCC plus the additional fact that all antichains are finite! So in a sense, upper sets are what we should really be looking at, and antichains are simply the computational approximation to them, and the only time they don’t work as an approximation is when the antichains are infinite anyway. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>Most semiring literature defines the partial order the other way, <script type="math/tex">a \le b</script> iff <script type="math/tex">a + b = b</script>, because addition feels more like a join-y operation. This also has the benefit of making the additive identity <script type="math/tex">0_R</script> the bottom of the semilattice, which matches the notational conventions of order theory. However, this would require an unintuitive flip somewhere in the setup of our cost minimization problem, so for exposition’s sake I will turn it around here. Still, I didn’t lie to you about the tropical semiring, the min-plus formulation is probably more common, and I don’t have an explanation for that, algebraic geometry is just weird. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>Depending on your persuasion, you could call it either luck or predestination that semirings are precisely the objects which can be coefficients of matrices, and an idempotent semiring is a natural way of recasting a semilattice-ordered monoid. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
<li id="fn:4">
<p>20. Sorry, I couldn’t think of a better place to put a weed joke. <a href="#fnref:4" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Ilia Chtcherbakov
Prime Filters in Distributive Lattices III2020-04-13T02:15:00-04:002020-04-13T02:15:00-04:00http://cleare.st/math/prime-filters-in-distributive-lattices-3<p>Recall from <a href="/math/prime-filters-in-distributive-lattices-2">PFDL II</a>, I gave an interesting characterization of Boolean algebras among distributive lattices, using a technique from formal logic.
Today I’d like to share some final musings on the topic, specifically in the form of a counterexample to a weakening of the hypotheses.
<!--more--></p>
<p>To remind ourselves of the work done so far in the series:</p>
<blockquote>
<p><a href="/math/prime-filters-in-distributive-lattices"><strong>Theorem 1.</strong></a> Let <script type="math/tex">L</script> be a Boolean algebra.
Then every nonempty prime filter in <script type="math/tex">L</script> is an ultrafilter.</p>
<p><a href="/math/prime-filters-in-distributive-lattices-2"><strong>Theorem 2.</strong></a> Let <script type="math/tex">L</script> be a distributive bounded lattice.
If every nonempty prime filter in <script type="math/tex">L</script> is an ultrafilter,
then <script type="math/tex">L</script> is a Boolean algebra.</p>
</blockquote>
<p>I’ve stated <a href="/math/excluded-subobjects-for-nondistributive-lattices">before</a> that this proof was a homework problem in a logic course I took.
But actually, that’s only half-right: as homework, we were assigned a weaker version of Theorem 2 in which the assumption of boundedness is omitted.
And the interesting thing about that weaker version is it has no proof, because it’s false!
I didn’t know at the time, but I couldn’t find a proof without assuming boundedness, so I told the professor as much and he accepted that submission.</p>
<p>To be clear, in the presence of Theorems 1 and 2, this weakening amounts to saying if every prime filter in a distributive lattice is an ultrafilter, then the lattice is bounded.
And for the longest time I couldn’t figure out how to show this, because it doesn’t seem like it has any reason to follow.
Well, in 2019, the problem floated back into my head and I was irritated enough that I wrote out some counterexamples, explaining very neatly why boundedness is necessary and killing this problem for good.</p>
<p>Let’s recall the terminology so you don’t have to click back to the previous installments. For a distributive lattice <script type="math/tex">L</script> and a subset <script type="math/tex">F \subseteq L</script>, we consider six properties:</p>
<ol>
<li>if <script type="math/tex">a \in F</script> and <script type="math/tex">a' \ge a</script> then <script type="math/tex">a' \in F</script>,</li>
<li>if <script type="math/tex">a, b \in F</script> then <script type="math/tex">a \wedge b \in F</script>,</li>
<li><script type="math/tex">\varnothing \subsetneq F \subsetneq L</script>,</li>
<li>if <script type="math/tex">a \vee b \in F</script> then <script type="math/tex">a \in F</script> or <script type="math/tex">b \in F</script>,</li>
<li>for all <script type="math/tex">a,b \notin F</script> there exists <script type="math/tex">x \in F</script> so that <script type="math/tex">a \wedge x \le b</script>,</li>
<li>there exists <script type="math/tex">a \in F</script> such that for all <script type="math/tex">a' \in F</script>, <script type="math/tex">a \le a'</script>.</li>
</ol>
<p>If <script type="math/tex">F</script> satisfies properties 1 and 2, it is a <strong>filter</strong>, and if it also satisfies property 3 then it is <strong>nontrivial</strong>.
We will almost exclusively be considering nontrivial filters.
If a nontrivial filter satisfies property 4, it is a <strong>prime filter</strong>, while if it satisfies property 5 it is an <strong>ultrafilter</strong>.
Property 5 easily implies property 4 in distributive lattices, and there is a short proof of this in <a href="/math/prime-filters-in-distributive-lattices">PFDL I</a>.
Finally, if a filter satisfies property 6, then it is a <strong>principal</strong> filter, and the least element is its <em>generator</em>—this has a complicated relationship with properties 4 and 5, but the terminology will be handy.</p>
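<p>If you would like to poke at these definitions by machine, here is a brute-force Python sketch on a small example of my choosing: the divisors of 12 ordered by divisibility, where meet is <code>gcd</code> and join is <code>lcm</code>. This lattice is distributive and bounded but not Boolean, so Theorem 2 suggests it should contain a prime filter that is not an ultrafilter. The function names are hypothetical, and <code>math.lcm</code> assumes Python 3.9 or later.</p>

```python
from itertools import product
from math import gcd, lcm

# Assumption: the lattice is the divisors of 12 under divisibility,
# so meet is gcd, join is lcm, the bottom is 1 and the top is 12.
L = [1, 2, 3, 4, 6, 12]

def leq(a, b):
    """a is below b exactly when a divides b."""
    return b % a == 0

def is_filter(F):
    """Properties 1 and 2: upward closed and closed under meets."""
    up_closed = all(b in F for a in F for b in L if leq(a, b))
    meet_closed = all(gcd(a, b) in F for a, b in product(F, F))
    return up_closed and meet_closed

def is_nontrivial(F):
    """Property 3: neither empty nor the whole lattice (F is a subset of L)."""
    return len(F) > 0 and len(L) > len(F)

def is_prime(F):
    """Property 4 on top of being a nontrivial filter."""
    return is_filter(F) and is_nontrivial(F) and all(
        a in F or b in F for a, b in product(L, L) if lcm(a, b) in F
    )

def is_ultra(F):
    """Property 5 on top of being a nontrivial filter."""
    outside = [a for a in L if a not in F]
    return is_filter(F) and is_nontrivial(F) and all(
        any(leq(gcd(a, x), b) for x in F) for a, b in product(outside, outside)
    )

def up(g):
    """The principal filter generated by g (property 6)."""
    return {a for a in L if leq(g, a)}

print(is_prime(up(4)), is_ultra(up(4)))  # True False
print(is_prime(up(2)), is_ultra(up(2)))  # True True
```

<p>The filter generated by 4 is prime but sits strictly inside the filter generated by 2, which is why property 5 fails for it: taking <script type="math/tex">a = 2</script> and <script type="math/tex">b = 1</script>, no <script type="math/tex">x \in \{4, 12\}</script> has <script type="math/tex">\gcd(2, x)</script> dividing 1.</p>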
<p>To show that boundedness in both directions is necessary in Theorem 2, I need to produce a counterexample with no top, and another counterexample with no bottom.</p>
<hr />
<p>First, unbounded above. Let <script type="math/tex">L</script> be the lattice of finite subsets of <script type="math/tex">\mathbb N</script>.
Obviously, this lattice is distributive, and is bounded below but not above.</p>
<blockquote>
<p><strong>Proposition 1.</strong> Every nonempty filter in <script type="math/tex">L</script> is principal.</p>
<p><strong>Proof.</strong> Let <script type="math/tex">F</script> be a nonempty filter of <script type="math/tex">L</script>, and let <script type="math/tex">f = \bigcap F</script> be the intersection of all the sets in <script type="math/tex">F</script>.
By property 1, it suffices to show that <script type="math/tex">f \in F</script>.
Consider <script type="math/tex">x \in F</script>. We have <script type="math/tex">f \subseteq x</script>, and for each <script type="math/tex">n \in x-f</script>,
there exists <script type="math/tex">a_n \in F</script> such that <script type="math/tex">n \notin a_n</script>. Then</p>
<script type="math/tex; mode=display">f = x \cap \bigcap_{n \in x-f} a_n.</script>
<p><script type="math/tex">x - f</script> is finite, so by property 2, <script type="math/tex">f \in F</script>. ∎</p>
</blockquote>
<p>With this characterization, analysing the prime filters and ultrafilters becomes a lot easier to manage.</p>
<blockquote>
<p><strong>Proposition 2.</strong> A filter in <script type="math/tex">L</script> is prime iff its generator is a singleton.</p>
<p><strong>Proof.</strong> First, note that if <script type="math/tex">f = \varnothing</script>, then <script type="math/tex">F = L</script> is a trivial filter, so it cannot be prime.</p>
<p>Else, suppose <script type="math/tex">f = \{n\}</script>.
Then for any two finite sets <script type="math/tex">a,b \in L</script>, if <script type="math/tex">a \cup b \in F</script>, then <script type="math/tex">n \in a \cup b</script>.
Obviously, either <script type="math/tex">n \in a</script> or <script type="math/tex">n \in b</script>, so at least one of <script type="math/tex">a</script> or <script type="math/tex">b</script> belongs to <script type="math/tex">F</script>. Therefore, <script type="math/tex">F</script> is prime.</p>
<p>Finally, suppose <script type="math/tex">f</script> is the union of two nonempty proper subsets.
Both are strict subsets of <script type="math/tex">f</script>, so neither belong to <script type="math/tex">F</script>, but their union is <script type="math/tex">f \in F</script>.
Thus, <script type="math/tex">F</script> is not prime.</p>
<p>These cases are jointly exhaustive. ∎</p>
</blockquote>
<p>And now we swing it home.</p>
<blockquote>
<p><strong>Claim.</strong> Every prime filter in <script type="math/tex">L</script> is an ultrafilter.</p>
<p><strong>Proof.</strong> Let <script type="math/tex">F</script> be a prime filter. By the previous proposition, <script type="math/tex">f = \{n\}</script>. Let <script type="math/tex">a, b \notin F</script>, that is, <script type="math/tex">n \notin a</script> and <script type="math/tex">n \notin b</script>.
Define <script type="math/tex">x = b \cup \{n\}</script>, so that <script type="math/tex">x \in F</script>, and observe that</p>
<script type="math/tex; mode=display">a \cap x = a \cap (b \cup \{n\}) = a \cap b \subseteq b.</script>
<p>Thus, <script type="math/tex">F</script> is an ultrafilter. ∎</p>
</blockquote>
<p>Therefore, boundedness above is required.</p>
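<p>If you would rather poke at this than trust me, here is a minimal sketch in Python that brute-forces Proposition 2 and the Claim on a small window of <script type="math/tex">L</script>, namely the subsets of <script type="math/tex">\{0,\dots,4\}</script>. The function names are mine and purely illustrative, and a finite check like this is a sanity test of the argument, not a proof.</p>

```python
from itertools import combinations

def finite_subsets(universe):
    """All subsets of a small finite universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def in_principal_filter(generator, a):
    """Membership of a in the principal filter {x : x contains generator}."""
    return generator <= a

def is_prime_on_window(generator, universe):
    """Property 4, tested only on subsets of the given window."""
    subs = finite_subsets(universe)
    return all(in_principal_filter(generator, a) or in_principal_filter(generator, b)
               for a in subs for b in subs
               if in_principal_filter(generator, a | b))

def check_ultrafilter_property(n, universe):
    """Property 5 for the filter generated by {n}, using x = b ∪ {n} as in the Claim."""
    gen = frozenset({n})
    outside = [a for a in finite_subsets(universe)
               if not in_principal_filter(gen, a)]
    for a in outside:
        for b in outside:
            x = b | {n}                       # the witness from the proof
            if not in_principal_filter(gen, x):
                return False
            if not (a & x) <= b:              # a ∧ x ≤ b
                return False
    return True
```

<p>On this window, a singleton generator passes both checks, while a two-element generator such as <script type="math/tex">\{0,1\}</script> already fails primality, as Proposition 2 predicts.</p>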
<hr />
<p>Now, we will give a second counterexample, this time to show that boundedness below is required.
Let <script type="math/tex">L^{\mathrm{op}}</script> be the dual lattice of <script type="math/tex">L</script>, which we will identify as the <em>cofinite</em> subsets of <script type="math/tex">\mathbb N</script>, that is, the sets whose complement is finite.
This case is essentially the same as the previous, but a little dirtier to prove.
For starters, we can no longer get away with the adjective ‘principal’, though we can still state something very similar.</p>
<blockquote>
<p><strong>Proposition 3.</strong> Every nonempty filter in <script type="math/tex">L^{\mathrm{op}}</script> is of the form <script type="math/tex">\{ x \in L^{\mathrm{op}} : x \supseteq S \}</script> for some subset <script type="math/tex">S \subseteq \mathbb N</script>.</p>
<p><strong>Proof.</strong> Let <script type="math/tex">F</script> be a filter and <script type="math/tex">S = \bigcap F</script>.
Let <script type="math/tex">x</script> be a cofinite set with <script type="math/tex">x \supseteq S</script>. It suffices to show that <script type="math/tex">x \in F</script>.
Consider its complement, <script type="math/tex">\bar x</script>, which is finite and disjoint from <script type="math/tex">S</script>.
Thus, for each <script type="math/tex">n \in \bar x</script>, there exists <script type="math/tex">a_n \in F</script> with <script type="math/tex">n \notin a_n</script>.</p>
<p>Let <script type="math/tex">a = \bigcap_{n \in \bar x} a_n</script> be their intersection.
<script type="math/tex">a \in F</script> by property 2, and <script type="math/tex">a \subseteq x</script>, so <script type="math/tex">x \in F</script> by property 1. ∎</p>
</blockquote>
<p>This is essentially the same as being principal, but you can’t literally call it that because your generator does not belong to the lattice.
As a result, the analogue of Proposition 2 goes through with little trouble.</p>
<blockquote>
<p><strong>Proposition 4.</strong> A filter in <script type="math/tex">L^{\mathrm{op}}</script> is prime iff its generator is a singleton.</p>
<p><strong>Proof.</strong> If <script type="math/tex">F</script> is empty or <script type="math/tex">S</script> is empty, then <script type="math/tex">F</script> is trivial, so it cannot be prime. If <script type="math/tex">S</script> is a singleton, then <script type="math/tex">F</script> is easily seen to be prime.
If <script type="math/tex">S</script> contains two distinct elements <script type="math/tex">m, n \in \mathbb N</script>, then clearly neither <script type="math/tex">\mathbb N - \{m\}</script> nor <script type="math/tex">\mathbb N - \{n\}</script> belongs to <script type="math/tex">F</script> but</p>
<script type="math/tex; mode=display">\bigl( \mathbb N - \{m\} \bigr) \cup \bigl( \mathbb N - \{n\} \bigr) = \mathbb N \in F,</script>
<p>so <script type="math/tex">F</script> is not prime. These cases are jointly exhaustive. ∎</p>
</blockquote>
<p>Because the characterization of the prime filters is exactly the same as in <script type="math/tex">L</script>, the proof that prime filters are ultrafilters goes through without any modification.
I’m not even going to copypaste it.
So, boundedness below is also required.</p>
<hr />
<p>Now, of the people I imagine to have read this far, there are two kinds.
The first kind would say, hey, this is boring, who even cares, why are you still beating this dead-ass horse.
To them, I say, first of all, the way this was presented to me was very different than the way it was presented to you.
You got three cute little blog posts all wrapped up in a bow, while I got Assignment 3 Question 1(c).
It burrowed its way into my brain and I am proud for having excised it and achieved the happy ending.</p>
<p>And second of all, I would continue, lattices are a dying art.
All the papers were written like a hundred years ago and the textbooks shortly afterwards.
You don’t see them until they get quickly glossed over when you need to learn about like Stone-Čech compactification or something real quick while you’re in the middle of some godforsaken functional analysis class.
Hell, half the uses of the word lattice nowadays are referring to discrete subgroups of <script type="math/tex">\mathbb R^n</script>, instead of special posets.
Nobody stans lattices.</p>
<p>The second kind of person that I imagine still reads my undying screeds would say to me, hold on a second there, you haven’t given a counterexample that is unbounded in both directions.
And to you, I say, truly you are my people.
And unto you I bestow the most sacred of gifts, that I have reserved for you and you alone:</p>
<blockquote>
<p><strong>Exercise.</strong> Show that every prime filter in the product lattice <script type="math/tex">L \times L^{\mathrm{op}}</script> is an ultrafilter.</p>
<p><strong>Hint.</strong> There are two ways of doing this: following the blueprint laid out in this ‘blog post, or developing a little general theory of filters and product lattices.</p>
</blockquote>Ilia ChtcherbakovRecall from PFDL II, I gave an interesting characterization of Boolean algebras among distributive lattices, using a technique from formal logic.
Today I’d like to share some final musings on the topic, specifically in the form of a counterexample to a weakening of the hypotheses.Review of This ‘Blog2020-04-12T12:18:48-04:002020-04-12T12:18:48-04:00http://cleare.st/meta/review-of-this-blog<p>This ‘blog has a lot of lattices in it.</p>
<p>Some of them are kinda hard to imagine, like <a href="/math/prime-filters-in-distributive-lattices">countable boolean algebras</a>, where you have to use your brain and count or something. Some of them are <em>really</em> hard to imagine, like <a href="/math/excluded-subobjects-for-nondistributive-lattices">the free modular lattice on three generators</a>, which for some strange reason is finite and proving that requires over a half hour of computations.
<!--more--></p>
<p>But none of them are geometric lattices, and I was promised matroids on this ‘blog, goddamnit. How can you call yourself a matroid theorist if you grade your posets but you’ve never even said the word semimodular once in a post??
Where are the 749 notes, huh? Where are you keeping them?</p>
<p>I rate this ‘blog 0 out of 5 stars. Would not recommend.</p>
<hr />
<p>April fools! You thought this ‘blog was dead, didn’t you? Well, me too.
But in the midst of an international crisis—PRIMS is threatening to publish Mochizuki’s purported proof of the <script type="math/tex">abc</script> conjecture despite the notable expert consensus that it is flawed!<sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup>—I know you all need every bit of comfort you can find on this cold remorseless beach of an earth.</p>
<p>I think the ‘blog’s issue was I dreamed too big.
I had plans of writing my own comments engine: you may notice below that comments are allegedly disabled, when in reality I am still working on them.
I wanted all of my posts to be novel and comprehensive, and retrospectively I think that strangles the spontaneity on which my projects thrive.</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>Oh, and COVID-19 got Conway. That’s very bad too. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>Ilia Chtcherbakov
An excluded subobject theorem for nondistributive lattices2017-07-07T14:51:53-04:002017-07-07T14:51:53-04:00http://cleare.st/math/excluded-subobjects-for-nondistributive-lattices<p>I gave a talk yesterday at the PMC’s SASMS. I typeset some notes, and I thought I would share them because I drew pictures.
This typeset version has more details and less visual intuition than what I presented on the blackboard, which is why the whole thing fit into a half hour.
Find a link to the notes and some flavour text under the cut.
<!--more--></p>
<p><a href="/files/sasms1175.pdf">Here is a link to the PDF version of my notes.</a></p>
<p><a href="http://sasms.puremath.club">Short Attention Span Math Seminars</a>—or SASMS—is an event held by the University of Waterloo’s <a href="http://puremath.club">Pure Math Club</a> every four months.
It’s a fun little evening where students can sign up to give short 30 minute talks on topics that interest them, and it’s a fun way to practice speaking.
I was club president for probably like a billion years, and I’ve signed up to give a talk every term since before my father was born, so of course I talked this term.
This is probably the last talk I’m going to give, so I wanted to do a good job for once. I think it went over well.</p>
<p>The talk is about lattices, of course, because lattices are basically my favourite things that aren’t matroids.
A lattice is <strong>distributive</strong> if its meet operation distributes over its join, or equivalently vice versa.
It turns out that there is a nice excluded-subobject theorem characterizing distributivity—akin to <a href="https://en.wikipedia.org/wiki/Kuratowski%27s_theorem">Kuratowski’s theorem</a> on planarity—and it “factors” nicely into two halves as far as lattice theory is concerned.</p>
<p>I think I’ll spend the rest of this post explaining the story behind why I gave this talk.</p>
<p>Last fall, I took a course called “Introduction to Substructural Logics”.
This was an undergraduate topics course in philosophy, so it counts as an elective on my transcript, but it was taught by a hardcore logician and was very heavily mathematical in nature.
Basically a perfect course for someone like me who is lazy and can’t stand doing things that aren’t math.
There were at most twelve people taking it, most of them probably upper-year philosophy students with logic leanings.
Me and my good friend Sean Harrap, a mathematician and a computer scientist respectively, were also students.</p>
<p>It proceeded slowly enough at first, with many breaks to talk about the history or some connections to philosophical ideas, but it walked and talked like an easy math course.
The grading scheme was to be based on the class’ performance on three assignments, with the explicit expectation that people should try to complete as many questions as they can and the grading curve would be decided holistically after the fact.
Having enough time and interest on our hands, Sean and I decided to try to answer every question. For the first two assignments, this was not very hard or time-consuming, as the exercises were fairly simple.</p>
<p>By the time the third assignment rolled around, however, we had begun to cover algebraic semantics, and while the lectures still proceeded at a reasonable pace, the assignment pulled out all the stops.
One of the questions was to prove that a bounded lattice was a Boolean algebra iff every prime filter was an ultrafilter—you may recall I squeezed a <a href="/math/prime-filters-in-distributive-lattices">series of two blog posts</a> out of this problem—and it was assigned to us as casually as anything.
Another question was to prove the previously mentioned excluded subobject result for lattice distributivity, which is an exercise I would not recommend for even the most intense training regimens.</p>
<p>I managed to walk out of the course with a hundo, so all’s well that ends well, but I worry the marking scheme was a little too adversarial
and might have left some of the less algebraically inclined students high and dry.</p>
<p>In any case, this was a tale I’d recounted many times since, and at one point, as a joke, my partner in substructural crime Sean Harrap suggested I give a talk about this course since I couldn’t stop yammering on about it.
So I put my money where my mouth was and set to work digesting this proof until it fit into 30 minutes.
Even then it’s a bit of a stretch, but at the very least all the necessary ingredients are there.</p>Ilia Chtcherbakov
sl(2) for Combinatorialists2017-05-24T20:38:49-04:002017-05-24T20:38:49-04:00http://cleare.st/math/sl2-for-combinatorialists<p>There is a long and terrific story to tell about Lie theory, and I wish I could do it justice, but there’s far too much to say in a single post.
What I have today is merely one application of one Lie algebraic idea, which ends up being a useful theoretical and practical tool in enumerative combinatorics.
<!--more--></p>
<p>The long and short of it is that the representation theory of a spooky object called <script type="math/tex">\def\sl{\mathfrak{sl}}\sl(2)</script> can be hijacked by combinatorialists
to prove that certain sequences of positive integers are <em>symmetric</em> and <em>unimodal</em>.
Typically symmetry is obvious but unimodality is quite hard to establish,
so this <script type="math/tex">\sl(2)</script> technology does make things somewhat neater.
The other big tool I know about for proving things about unimodality or related properties like log-concavity is the theory of <em>stable polynomials</em>,
which is also rather algebraic.</p>
<h1 id="lie-algebras">Lie algebras</h1>
<p>Fix the field <script type="math/tex">\def\C{\mathbb C}\C</script>. Some of the following math can be done over other fields and even over general rings, but <script type="math/tex">\C</script> is good enough for the combinatorialist and hence for this post.
Formally, a <strong>Lie algebra</strong> is a vector space <script type="math/tex">\def\g{\mathfrak g}\g</script> together with a special bilinear map <script type="math/tex">[{-},{-}] : \g \times \g \to \g</script> called the <em>Lie bracket</em>.
The Lie bracket must be antisymmetric, in that <script type="math/tex">[a,b] = -[b,a]</script> for all <script type="math/tex">a,b \in \g</script>, and it must also satisfy the <em>Jacobi identity</em>,</p>
<script type="math/tex; mode=display">[a,[b,c]] + [b,[c,a]] + [c,[a,b]] = 0,</script>
<p>for all <script type="math/tex">a,b,c \in \g</script>.</p>
<p>Lie algebras arise in a natural way from fantastic objects called <em>Lie groups</em>, which are essentially groups with smooth manifold structure.
There is an enormous amount of theory on this topic, of which I will be needing rather little, and most of what I will talk about today can be done without invoking any of the deep Lie theory underlying everything,
but I thought I would record at least a taste of what lies beneath.</p>
<p>Any associative <script type="math/tex">\C</script>-algebra <script type="math/tex">A</script> gives rise to a Lie algebra on <script type="math/tex">A</script>, by taking the Lie bracket to be the commutator <script type="math/tex">[a,b] = ab - ba</script>.
In particular, the matrix algebra <script type="math/tex">\mathrm{End}(V)</script> of endomorphisms of a finite-dimensional vector space gives a Lie algebra denoted <script type="math/tex">\def\gl{\mathfrak{gl}}\gl(V)</script>.
When the particular vector space is irrelevant we often abbreviate <script type="math/tex">\gl(n) = \gl(\C^n)</script>.</p>
<p>The Lie algebra <script type="math/tex">\sl(2)</script> is a sub–Lie algebra of <script type="math/tex">\gl(2)</script>, consisting of those matrices with zero trace.
The trace functional is not multiplicative, so <script type="math/tex">\sl(2)</script> is not a subalgebra of <script type="math/tex">\mathrm{End}(\C^2)</script>, but it is true that <script type="math/tex">\def\tr{\operatorname{tr}}\tr(AB) = \tr(BA)</script>, so that <script type="math/tex">\tr([A,B]) = \tr(AB) - \tr(BA) = 0</script> and then <script type="math/tex">\sl(2)</script> is closed under the Lie bracket.</p>
<p>To better discuss <script type="math/tex">\sl(2)</script>, let</p>
<script type="math/tex; mode=display">% <![CDATA[
X = \begin{bmatrix} 0&1\\0&0 \end{bmatrix}, \quad Y = \begin{bmatrix} 0&0\\1&0 \end{bmatrix}, \quad H = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}. %]]></script>
<p><script type="math/tex">\{X,Y,H\}</script> is a basis for <script type="math/tex">\sl(2)</script>, and we can compute that <script type="math/tex">[X,Y] = H</script>, <script type="math/tex">[H,X] = 2X</script>, and <script type="math/tex">[H,Y] = -2Y</script>.
In fact, <script type="math/tex">\{X,Y\}</script> together generate <script type="math/tex">\sl(2)</script> as a Lie algebra, in that the only subset of <script type="math/tex">\sl(2)</script> closed under finite linear combinations and brackets, and containing <script type="math/tex">X</script> and <script type="math/tex">Y</script>, is all of <script type="math/tex">\sl(2)</script>.</p>
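<p>These bracket computations are easy to check by machine. The following is a minimal sketch in plain Python, with matrices stored as lists of rows; the helper names <code>matmul</code>, <code>bracket</code> and <code>scale</code> are my own and not from any library.</p>

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    """The commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * entry for entry in row] for row in a]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

# The standard basis of sl(2) used in the text.
X = [[0, 1], [0, 0]]
Y = [[0, 0], [1, 0]]
H = [[1, 0], [0, -1]]

def jacobi(a, b, c):
    """Left-hand side of the Jacobi identity; should be the zero matrix."""
    return add(add(bracket(a, bracket(b, c)), bracket(b, bracket(c, a))),
               bracket(c, bracket(a, b)))
```

<p>With these definitions, <code>bracket(X, Y)</code> returns <code>H</code>, and <code>jacobi(X, Y, H)</code> returns the zero matrix, as it must for any commutator bracket.</p>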
<h1 id="representation-theory">Representation theory</h1>
<p>A <strong>representation</strong> of a Lie algebra <script type="math/tex">\g</script> is a linear map <script type="math/tex">\pi : \g \to \gl(V)</script> for some vector space <script type="math/tex">V</script>, such that <script type="math/tex">\pi([x,y]_\g) = [\pi(x),\pi(y)]_{\gl(V)}</script>.
One famous representation of any Lie algebra is the <em>adjoint representation</em> <script type="math/tex">\mathrm{ad} : \g \to \gl(\g)</script> where <script type="math/tex">\mathrm{ad}(x) = [x,{-}]</script>.
We’re going to investigate the representation theory of <script type="math/tex">\sl(2)</script><sup id="fnref:1"><a href="#fn:1" class="footnote">1</a></sup> and an arguably combinatorially useful property will fall out.</p>
<p>Let <script type="math/tex">\pi : \g \to \gl(V)</script> be a representation. A subspace <script type="math/tex">W \subseteq V</script> is <strong><script type="math/tex">\pi</script>-invariant</strong> if it is <script type="math/tex">\pi(x)</script>-invariant for all <script type="math/tex">x \in \g</script>, that is, if <script type="math/tex">\pi(x)W \subseteq W</script>.
<script type="math/tex">\pi</script> is <strong>irreducible</strong> if the only nontrivial invariant subspace is <script type="math/tex">V</script>.</p>
<p>One would hope, as in the representation theory of finite groups, that every complex finite-dimensional representation of a Lie algebra <script type="math/tex">\g</script> is a direct sum of irreducibles.
This doesn’t work out unless <script type="math/tex">\g</script> is <em>semisimple</em>.
The definition is a bit involved and doesn’t motivate itself, but it’s not wrong to say that <script type="math/tex">\g</script> is semisimple iff it is a direct sum of <em>simple</em> Lie algebras, which are those where the only nontrivial subspace <script type="math/tex">\mathfrak i</script> such that <script type="math/tex">[\g,\mathfrak i] \subseteq \mathfrak i</script> is <script type="math/tex">\g</script> itself.
Point is, <script type="math/tex">\sl(2)</script> is semisimple.</p>
<p>One common abuse of notation is to make <script type="math/tex">\pi : \g \to \gl(V)</script> implicit by declaring that <script type="math/tex">V</script> is a representation of <script type="math/tex">\g</script>
and that <script type="math/tex">xv = \pi(x)v</script> for <script type="math/tex">x \in \g</script> and <script type="math/tex">v \in V</script>.
Never having been one to rock the boat, I’ll do the same when discussing representations of <script type="math/tex">\sl(2)</script>.</p>
<p>Because <script type="math/tex">\{X,Y\}</script> generate <script type="math/tex">\sl(2)</script>, representations of <script type="math/tex">\sl(2)</script> are determined by the images of <script type="math/tex">X</script> and <script type="math/tex">Y</script>.
The coherence conditions they have to satisfy are <script type="math/tex">[H,X] = 2X</script> and <script type="math/tex">[H,Y] = -2Y</script>, where of course <script type="math/tex">H = [X,Y]</script>.
By the bilinearity and antisymmetry of the bracket, any pair of maps <script type="math/tex">(X,Y)</script> satisfying these (two) equations forms a representation of <script type="math/tex">\sl(2)</script>.</p>
<p>If we take for granted that every representation of <script type="math/tex">\sl(2)</script> decomposes as a direct sum of irreducible representations, often abbreviated <em>irreps</em>,
then it suffices to understand the irreps.
Once we have that knowledge I can explain what it’s good for and how a combinatorialist might use it.
(At this point you can skip to the next section if you believe me and don’t care why.)</p>
<p>So let <script type="math/tex">V</script> be a finite-dimensional irrep of <script type="math/tex">\sl(2)</script>.
By semisimplicity, we can use a principle called <em>the preservation of Jordan decomposition</em><sup id="fnref:2"><a href="#fn:2" class="footnote">2</a></sup>.
This tells us that <script type="math/tex">H</script> acts diagonalizably on <script type="math/tex">V</script>, since it itself is diagonal in <script type="math/tex">\sl(2)</script>, and likewise <script type="math/tex">X</script> and <script type="math/tex">Y</script> act nilpotently.
Because <script type="math/tex">H</script> is diagonalizable, let’s decompose <script type="math/tex">V = \bigoplus_\lambda V_\lambda</script> into <script type="math/tex">\lambda</script>-eigenspaces <script type="math/tex">V_\lambda</script> for <script type="math/tex">H</script>.
The eigenvalues <script type="math/tex">\lambda</script> that have nontrivial <script type="math/tex">V_\lambda</script> are called <strong>weights</strong> and the <script type="math/tex">V_\lambda</script> are called <strong>weight spaces</strong>.</p>
<p>If <script type="math/tex">v \in V_\lambda</script>, then</p>
<script type="math/tex; mode=display">HXv = (XH + [H,X])v = X(\lambda v) + (2X)v = (\lambda + 2)Xv</script>
<p>so <script type="math/tex">X(V_\lambda) \subseteq V_{\lambda+2}</script>, and likewise <script type="math/tex">Y(V_\lambda) \subseteq V_{\lambda-2}</script>.
For this reason <script type="math/tex">X</script> and <script type="math/tex">Y</script> are often called <em>raising</em> and <em>lowering</em> operators, respectively.</p>
<p>If some <script type="math/tex">\alpha \in \C</script> has a nontrivial <script type="math/tex">V_\alpha</script>, then <script type="math/tex">\bigoplus_{n \in \mathbb Z} V_{\alpha+2n}</script> is an invariant subrepresentation of <script type="math/tex">V</script>, and hence equals <script type="math/tex">V</script> by irreducibility.
So by finite dimensionality, these eigenvalues show up in an unbroken line as in <script type="math/tex">\alpha, \alpha+2, \alpha+4, \dots, \alpha+2k</script>.</p>
<p>Let <script type="math/tex">v \in V_\alpha</script> be a vector of lowest weight.
Then consider the cyclic subspace spanned by <script type="math/tex">\{v, Xv, X^2v, \dots\}</script>.
Obviously, <script type="math/tex">Yv = 0</script> by lowest weight.
By induction we can show <script type="math/tex">YX^nv = -n(\alpha+n-1)X^{n-1}v</script>, for <script type="math/tex">[X,Y] = H</script> gives <script type="math/tex">YX = XY - H</script>, and so</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
YX^{n+1}v &= XYX^nv - HX^nv \\
&= X\bigl( -n(\alpha+n-1) X^{n-1}v \bigr) - (\alpha+2n)X^nv \\
&= -\bigl( n\alpha + n(n-1) + \alpha+2n \bigr)X^nv \\
&= -\bigl( (n+1)\alpha + n(n+1) \bigr) X^nv \\
&= -(n+1)\bigl( \alpha + (n+1)-1 \bigr) X^nv.
\end{align*} %]]></script>
<p>It follows that this cyclic subspace is a subrepresentation, and by irreducibility, <script type="math/tex">V</script> is equal to this subrepresentation.
But now, because <script type="math/tex">V</script> is finite-dimensional, we can do some numerological magic.
<script type="math/tex">X^nv = 0</script> for some least <script type="math/tex">n = \dim V</script>, and then <script type="math/tex">0 = YX^nv = -n(\alpha+n-1)X^{n-1}v</script>.</p>
<p>Well, <script type="math/tex">X^{n-1}v</script> is a nonzero vector, so the coefficient <script type="math/tex">-n(\alpha+n-1)</script> must be zero.
If <script type="math/tex">V</script> is nontrivial, then <script type="math/tex">\alpha + n-1 = 0</script> and hence <script type="math/tex">\alpha</script> is a strictly negative integer!</p>
<p>Finally, recall that <script type="math/tex">X(V_\lambda) \subseteq V_{\lambda+2}</script>, so that the weight spaces of <script type="math/tex">V</script> have dimensions <script type="math/tex">1, 0, 1, 0, \dots</script>, starting at <script type="math/tex">\alpha</script>.
Also, since <script type="math/tex">k = n-1 = -\alpha</script> gives the last nonzero vector, the highest nontrivial weight space is <script type="math/tex">\alpha+2k = -\alpha = |\alpha|</script>.</p>
<p>Now we know all that we need to about irreps of <script type="math/tex">\sl(2)</script>.</p>
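<p>To make the classification concrete, here is a sketch that writes down the <script type="math/tex">n</script>-dimensional irrep as explicit matrices and checks the bracket relations. It assumes the basis <script type="math/tex">e_j = X^jv</script> for <script type="math/tex">0 \le j < n</script>, lowest weight <script type="math/tex">1-n</script>, and the lowering coefficients <script type="math/tex">Ye_j = j(n-j)e_{j-1}</script>, which are what <script type="math/tex">[X,Y] = H</script> forces in this basis. The function names are hypothetical.</p>

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(a, b):
    """The commutator [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def scale(c, a):
    """Scalar multiple of a matrix."""
    return [[c * entry for entry in row] for row in a]

def irrep(n):
    """Matrices (X, Y, H) of the n-dimensional irrep; column j is the image of e_j."""
    X = [[0] * n for _ in range(n)]
    Y = [[0] * n for _ in range(n)]
    H = [[0] * n for _ in range(n)]
    for j in range(n):
        H[j][j] = 2 * j - (n - 1)        # weight of e_j, starting at 1 - n
        if j + 1 < n:
            X[j + 1][j] = 1              # X e_j = e_{j+1}
        if j >= 1:
            Y[j - 1][j] = j * (n - j)    # Y e_j = j (n - j) e_{j-1}
    return X, Y, H

def is_representation(n):
    """Check [X,Y] = H, [H,X] = 2X and [H,Y] = -2Y for the matrices above."""
    X, Y, H = irrep(n)
    return (bracket(X, Y) == H
            and bracket(H, X) == scale(2, X)
            and bracket(H, Y) == scale(-2, Y))

def weights(n):
    """The eigenvalues of H, read off the diagonal."""
    _, _, H = irrep(n)
    return [H[j][j] for j in range(n)]
```

<p>For instance, <code>weights(4)</code> is <code>[-3, -1, 1, 3]</code>: one-dimensional weight spaces, two apart, symmetric about zero.</p>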
<h1 id="symmetry-and-unimodality">Symmetry and unimodality</h1>
<p>Taking stock of what just happened, we see that there is one irrep of any particular dimension, and its weight spaces have dimensions <script type="math/tex">1, 0, 1, 0, \dots, 0, 1</script>, symmetrically arranged around the 0-eigenspace.
It follows that any finite-dimensional representation is isomorphic to a direct sum of these.
That is to say, if <script type="math/tex">V = \bigoplus_i V_i</script> is a representation of <script type="math/tex">\sl(2)</script>, graded by its weight spaces, and <script type="math/tex">d_i = \dim V_i</script> is the dimension of the <script type="math/tex">i</script>-th weight space, then the following is true:</p>
<script type="math/tex; mode=display">\cdots \le d_{-4} \le d_{-2} \le d_0 \ge d_2 \ge d_4 \ge \cdots</script>
<script type="math/tex; mode=display">\cdots \le d_{-3} \le d_{-1} = d_1 \ge d_3 \ge \cdots</script>
<p>These two sequences, <script type="math/tex">(\dots, d_{-4}, d_{-2}, d_0, d_2, d_4, \dots)</script> and <script type="math/tex">(\dots, d_{-3}, d_{-1}, d_1, d_3, \dots)</script>, have two properties that are referred to as <strong>symmetry</strong>—that they can be reflected about their center and remain equal—and <strong>unimodality</strong>—that they rise monotonically to some peak, and subsequently fall monotonically.</p>
<p>Because of this, if you would like to prove that some sequence of positive integers is symmetric and unimodal, it would suffice to find a representation of <script type="math/tex">\sl(2)</script> with suitable weight spaces.
The encoding for a sequence <script type="math/tex">(d_i)_{i=0}^n</script> is usually to have <script type="math/tex">d_i</script> be the dimension of the <script type="math/tex">(2i-n)</script>-th weight space.<sup id="fnref:3"><a href="#fn:3" class="footnote">3</a></sup>
To show you some interesting examples, I’ll use a couple of bits of technology, but in principle I could give the coefficients explicitly and that would suffice.</p>
<p>Given any finite set <script type="math/tex">S</script>, there is a natural representation of <script type="math/tex">\sl(2)</script>, called the <em>Boolean representation</em>, on the free vector space <script type="math/tex">\C\mathcal P(S)</script> whose basis is indexed by subsets of <script type="math/tex">S</script>.
Denote the basis vector of <script type="math/tex">A \subseteq S</script> by <script type="math/tex">\tilde A</script>.
Then the representation of <script type="math/tex">\sl(2)</script> is given by</p>
<script type="math/tex; mode=display">X\tilde A = \sum_{a \notin A} (A \cup \{a\})^\sim \quad \text{and} \quad Y\tilde A = \sum_{a \in A} (A \smallsetminus \{a\})^\sim.</script>
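<p>As a sanity check that these two operators really do generate a representation, the sketch below applies <script type="math/tex">X</script> and <script type="math/tex">Y</script> to basis vectors and computes <script type="math/tex">H = [X,Y]</script>. Vectors are modelled as dictionaries from subsets to coefficients, which is my own encoding, and the function names are illustrative. One expects <script type="math/tex">H\tilde A = (2\lvert A\rvert - \lvert S\rvert)\tilde A</script>, so the subsets of size <script type="math/tex">i</script> span the weight space of weight <script type="math/tex">2i - \lvert S\rvert</script>.</p>

```python
from itertools import combinations

def subsets(s):
    """All subsets of the finite set s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def raise_op(a, s):
    """X applied to the basis vector of a: sum over ways to add one element."""
    return {a | {x}: 1 for x in s - a}

def lower_op(a, s):
    """Y applied to the basis vector of a: sum over ways to remove one element."""
    return {a - {x}: 1 for x in a}

def apply_op(op, vec, s):
    """Extend a basis-vector operator linearly to a vector {subset: coefficient}."""
    out = {}
    for a, c in vec.items():
        for b, d in op(a, s).items():
            out[b] = out.get(b, 0) + c * d
    return {b: c for b, c in out.items() if c != 0}

def h_op(a, s):
    """H = XY - YX applied to the basis vector of a."""
    xy = apply_op(raise_op, lower_op(a, s), s)
    yx = apply_op(lower_op, raise_op(a, s), s)
    out = dict(xy)
    for b, c in yx.items():
        out[b] = out.get(b, 0) - c
    return {b: c for b, c in out.items() if c != 0}
```

<p>The weight space dimensions here are the binomial coefficients <script type="math/tex">\binom{\lvert S\rvert}{i}</script>, which are of course symmetric and unimodal already; the interesting sequences come from passing to subrepresentations.</p>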
<p>Let <script type="math/tex">V</script> be a representation of <script type="math/tex">\sl(2)</script> and suppose it has an action by some group <script type="math/tex">G \le \mathrm{GL}(V)</script> as well.
If the <script type="math/tex">\sl(2)</script>-rep is <strong>equivariant</strong> with respect to the <script type="math/tex">G</script>-action, in that <script type="math/tex">gx = xg</script> for all <script type="math/tex">x \in \sl(2)</script> and <script type="math/tex">g \in G</script>,
then there exists a subrepresentation on the <script type="math/tex">G</script>-invariant vectors, i.e. on the vector space</p>
<script type="math/tex; mode=display">V^G = \{ v \in V : g.v = v\ \text{for all}\ g \in G \}.</script>
<p>To wit, if <script type="math/tex">\{v_1, \dots, v_n\}</script> is some orbit of <script type="math/tex">G</script>, then <script type="math/tex">\sum_i v_i \in V^G</script>, so in some sense <script type="math/tex">V^G</script> is the space of orbits of <script type="math/tex">G</script>.</p>
<p>Now, we can see a couple of examples.</p>
<p>First, let <script type="math/tex">g_n(k)</script> be the number of isomorphism classes of <script type="math/tex">n</script>-vertex <script type="math/tex">k</script>-edge graphs.
Clearly the sequence <script type="math/tex">g_n = ( g_n(k) : 0 \le k \le \binom{n}{2} )</script> is symmetric, via complementation, but unimodality is far far harder to show combinatorially.
Instead, we’ll use <script type="math/tex">\sl(2)</script>!</p>
<p>Let <script type="math/tex">E = \binom{[n]}2</script> be the edge set of the complete graph <script type="math/tex">K_n</script>.
The symmetric group <script type="math/tex">S_n</script> has a natural action on <script type="math/tex">E</script>, by permuting the vertices of <script type="math/tex">K_n</script> and bringing the edges along.
This induces an action on <script type="math/tex">\mathcal P(E)</script>, which can be interpreted as the set of all graphs on the vertex set <script type="math/tex">[n] = \{1,\dots,n\}</script>.
Notably, two graphs are in the same orbit iff they are isomorphic.
It follows that the dimensions of the weight spaces of <script type="math/tex">\C\mathcal P(E)^{S_n}</script> are precisely the <script type="math/tex">g_n(k)</script>’s above,
so by the invariant subrepresentation of the Boolean representation of <script type="math/tex">\sl(2)</script> on <script type="math/tex">E</script>, <script type="math/tex">g_n</script> is symmetric and unimodal.</p>
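<p>For small <code>n</code> the sequence can simply be computed by brute force, which makes for a reassuring sanity check. The sketch below is my own illustration and has nothing to do with the representation-theoretic proof: it counts orbits by picking the lexicographically smallest relabelling of each graph as a canonical form, which is only feasible for a handful of vertices.</p>

```python
from itertools import combinations, permutations

def graph_counts(n):
    """Number of isomorphism classes of n-vertex graphs, indexed by edge count."""
    edges = list(combinations(range(n), 2))
    perms = list(permutations(range(n)))
    seen = set()
    counts = [0] * (len(edges) + 1)
    for k in range(len(edges) + 1):
        for graph in combinations(edges, k):
            # Canonical form: the smallest edge list over all vertex relabellings.
            canon = min(
                tuple(sorted(tuple(sorted((p[u], p[v]))) for u, v in graph))
                for p in perms
            )
            if canon not in seen:
                seen.add(canon)
                counts[k] += 1
    return counts

def is_symmetric(seq):
    """True if the sequence reads the same forwards and backwards."""
    return list(seq) == list(reversed(seq))

def is_unimodal(seq):
    """True if the sequence weakly rises to a maximum and weakly falls after it."""
    peak = seq.index(max(seq))
    rising, falling = list(seq[: peak + 1]), list(seq[peak:])
    return rising == sorted(rising) and falling == sorted(falling, reverse=True)

g4 = graph_counts(4)
print(g4, is_symmetric(g4), is_unimodal(g4))
```

<p>For four vertices this gives the sequence 1, 1, 2, 3, 2, 1, 1, accounting for all 11 graphs on four vertices.</p>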
<p>As a second example, let <script type="math/tex">p_{a,b}(k)</script> be the number of <a href="https://en.wikipedia.org/wiki/Partition_%28number_theory%29">integer partitions</a> of <script type="math/tex">k</script> with at most <script type="math/tex">a</script> parts, each of which has size at most <script type="math/tex">b</script>.
It’s a classical result of <a href="https://en.wikipedia.org/wiki/Q-analog">q-combinatorics</a> that this is the coefficient of <script type="math/tex">q^k</script> in the Gaussian polynomial</p>
<script type="math/tex; mode=display">\def\qbinom{\genfrac{[}{]}{0pt}{}} \qbinom{a+b}{a}_q = \prod_{i=1}^a \frac{1-q^{b+i}}{1-q^i}.</script>
<p>Let <script type="math/tex">V</script> be the Boolean representation of <script type="math/tex">[a] \times [b]</script>, and let <script type="math/tex">G = S_b \wr S_a</script> be the <a href="https://en.wikipedia.org/wiki/Wreath_product">wreath product</a> of two symmetric groups.
If you don’t know what this group is, then know that its action on <script type="math/tex">\mathcal P([a] \times [b])</script> is exactly to permute the cells within each row, and then also to permute the rows afterwards.
(If you’re reading the Wikipedia article, then this is an action induced by the <em>imprimitive</em> action.)</p>
<p>Given a proper definition, it is not hard to show that each orbit of <script type="math/tex">G</script> on <script type="math/tex">\mathcal P([a] \times [b])</script> contains the Ferrers diagram of exactly one partition,
and hence <script type="math/tex">G</script> acts on <script type="math/tex">V</script> such that <script type="math/tex">V^G</script> is a vector space with basis indexed by these <script type="math/tex">a \times b</script>-bounded partitions.
The partitions of <script type="math/tex">k</script> all fall into the weight space with eigenvalue <script type="math/tex">2k - ab</script>,
so the sequence <script type="math/tex">p_{a,b} = ( p_{a,b}(k) : 0 \le k \le ab )</script> is symmetric and unimodal.
This is but one of many proofs of the celebrated unimodality of the coefficients of <script type="math/tex">\qbinom{n}{k}_q</script>.</p>
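<p>Both descriptions of these numbers are easy to compute for small boxes, so here is a hedged sketch comparing them. The helper names are hypothetical: one function enumerates the partitions that fit in an <code>a</code> by <code>b</code> box directly, and the other expands the product formula for the Gaussian polynomial one factor at a time, using exact integer polynomial division.</p>

```python
def bounded_partition_counts(a, b):
    """Counts of partitions with at most a parts, each at most b, indexed by size."""
    counts = [0] * (a * b + 1)

    def extend(parts_left, max_part, total):
        # Choose the remaining parts in weakly decreasing order (zeros allowed).
        if parts_left == 0:
            counts[total] += 1
            return
        for part in range(max_part + 1):
            extend(parts_left - 1, part, total + part)

    extend(a, b, 0)
    return counts

def gaussian_coefficients(a, b):
    """Coefficients of the product over i = 1..a of (1 - q^(b+i)) / (1 - q^i)."""
    poly = [1]
    for i in range(1, a + 1):
        # Multiply by (1 - q^(b+i)).
        shifted = [0] * (b + i) + poly
        poly = [c - s for c, s in zip(poly + [0] * (b + i), shifted)]
        # Divide by (1 - q^i): quotient[j] = poly[j] + quotient[j - i].
        quotient = []
        for j, c in enumerate(poly):
            quotient.append(c + (quotient[j - i] if j >= i else 0))
        # The division is exact, so the top i coefficients are zero.
        poly = quotient[: len(poly) - i]
    return poly

print(bounded_partition_counts(2, 2))
print(gaussian_coefficients(2, 2))
```

<p>Both lines print the coefficients 1, 1, 2, 1, 1 for the two by two box, and the two functions can be compared on any other small box in the same way.</p>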
<p>It is not very hard to give explicit coefficients for this particular representation, actually.
Writing partitions as their multiplicity vectors <script type="math/tex">(m_0, \dots, m_b)</script>, it turns out that</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
X\cdot(m_0, \dots, m_b) &= \sum_{i=0}^b (b-i)m_i \cdot(\dots, m_i-1, m_{i+1}+1, \dots), \\
Y\cdot(m_0, \dots, m_b) &= \sum_{j=0}^b jm_j \cdot(\dots, m_{j-1}+1, m_j-1, \dots).
\end{align*} %]]></script>
<p>At first glance it may appear that the above is not well-defined, but only valid partitions will show up in summands with nonzero coefficients, so all is well.</p>
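<p>These formulas are short enough to test directly. The following is a sketch under my own naming, with vectors stored as dictionaries from multiplicity vectors to integer coefficients. It checks that the bracket of the two operators scales each basis vector by twice the size of the partition minus <code>ab</code>, where the size is the sum of <code>i * m_i</code>.</p>

```python
from collections import defaultdict

def multiplicity_vectors(a, b):
    """All tuples (m_0, ..., m_b) of nonnegative integers summing to a."""
    def build(slots, remaining):
        if slots == 1:
            yield (remaining,)
            return
        for first in range(remaining + 1):
            for rest in build(slots - 1, remaining - first):
                yield (first,) + rest
    return list(build(b + 1, a))

def raise_X(m):
    """X on a basis vector: grow one part of size i to size i + 1."""
    b = len(m) - 1
    out = {}
    for i in range(b):  # the i = b term has coefficient (b - b) = 0
        coeff = (b - i) * m[i]
        if coeff:
            new = list(m)
            new[i] -= 1
            new[i + 1] += 1
            out[tuple(new)] = coeff
    return out

def lower_Y(m):
    """Y on a basis vector: shrink one part of size j to size j - 1."""
    out = {}
    for j in range(1, len(m)):  # the j = 0 term has coefficient 0
        coeff = j * m[j]
        if coeff:
            new = list(m)
            new[j] -= 1
            new[j - 1] += 1
            out[tuple(new)] = coeff
    return out

def apply_op(op, vector):
    """Extend an operator on basis vectors linearly to a dict-encoded vector."""
    result = defaultdict(int)
    for m, c in vector.items():
        for m2, c2 in op(m).items():
            result[m2] += c * c2
    return {m: c for m, c in result.items() if c}

def bracket_on_basis(m):
    """(XY - YX) applied to the basis vector m, with zero entries dropped."""
    xy = apply_op(raise_X, lower_Y(m))
    yx = apply_op(lower_Y, raise_X(m))
    diff = {k: xy.get(k, 0) - yx.get(k, 0) for k in set(xy) | set(yx)}
    return {k: v for k, v in diff.items() if v}

def weight(m):
    """Expected eigenvalue: twice the partition size minus a * b."""
    a, b = sum(m), len(m) - 1
    return 2 * sum(i * mi for i, mi in enumerate(m)) - a * b

a, b = 3, 4
print(all(
    bracket_on_basis(m) == ({m: weight(m)} if weight(m) else {})
    for m in multiplicity_vectors(a, b)
))
```

<p>Invalid multiplicity vectors never appear because a term with a zero coefficient is skipped before its target is built, which is the well-definedness remark above in code form.</p>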
<h1 id="lets-talk-about-posets-now">Let’s talk about posets now</h1>
<p>Because posets and lattices are my favourite thing on this blog, I would be remiss if I were not to mention a very obvious connection to posets.</p>
<p>A poset <script type="math/tex">P</script> is <strong>graded</strong> if it can be partitioned into disjoint ranks <script type="math/tex">P_i</script>, <script type="math/tex">i \in \{0, \dots, r\}</script>, such that the only covering relations are between adjacent ranks <script type="math/tex">P_i</script> and <script type="math/tex">P_{i+1}</script>.
In such a situation, you could prove that <script type="math/tex">P</script> is <em>rank-symmetric</em> and <em>rank-unimodal</em> by finding a representation of <script type="math/tex">\sl(2)</script> on the free vector space <script type="math/tex">\tilde P</script> whose weight spaces are the free subspaces <script type="math/tex">\tilde P_i</script>.</p>
<p>If you additionally require that the representation of <script type="math/tex">\sl(2)</script> respect the poset structure—by saying that <script type="math/tex">X</script> and <script type="math/tex">Y</script> only raise or lower along covering relations,
i.e. whenever <script type="math/tex">X\tilde a = \sum_i x_i \tilde b_i</script> for nonzero <script type="math/tex">x_i</script>, then <script type="math/tex">a \le b_i</script>—then we say that <script type="math/tex">P</script> carries that representation of <script type="math/tex">\sl(2)</script>.
In this case, you prove not only that <script type="math/tex">P</script> is rank-symmetric and rank-unimodal, but also that it has a third property: the union of any <script type="math/tex">k</script> antichains is at most as large as the union of the <script type="math/tex">k</script> largest ranks.</p>
<p>This is called the <a href="https://en.wikipedia.org/wiki/Sperner_property_of_a_partially_ordered_set"><em>strong Sperner</em> property</a>, and those of you who have heard of Sperner theory are probably already groaning and closing your browser window,
so I promise I won’t say much more about it.
Essentially, this property is saying that there are no clever collections of large antichains, and if you want a bunch of them you might as well take from the ranks.
In some sense it guarantees that your poset is not very lopsided.</p>
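<p>To make the property concrete, here is a brute-force sketch on a tiny poset: the partitions that fit in an <code>a</code> by <code>b</code> box, ordered by containment of diagrams. It relies on Mirsky's theorem, which says a set is a union of <code>k</code> antichains exactly when it has no chain of <code>k + 1</code> elements. All names are mine, and the search runs over every subset, so it is only sensible for very small boxes.</p>

```python
from itertools import combinations, product

def box_partitions(a, b):
    """Partitions with at most a parts, each at most b, as weakly decreasing tuples."""
    return [p for p in product(range(b, -1, -1), repeat=a)
            if all(p[i] >= p[i + 1] for i in range(a - 1))]

def contained(p, q):
    """True if the diagram of p sits inside the diagram of q."""
    return all(y >= x for x, y in zip(p, q))

def longest_chain(elements):
    """Number of elements in a longest chain inside the given collection."""
    best = {}
    # Anything strictly below p has a smaller sum, so it is processed before p.
    for p in sorted(elements, key=sum):
        below = [best[q] for q in best if contained(q, p)]
        best[p] = 1 + max(below, default=0)
    return max(best.values(), default=0)

def largest_k_family(poset, k):
    """Size of a largest subset with no chain of k + 1 elements."""
    for size in range(len(poset), -1, -1):
        for subset in combinations(poset, size):
            if k >= longest_chain(subset):
                return size

def top_ranks(poset, k):
    """Total size of the k largest ranks, where the rank of a partition is its sum."""
    sizes = {}
    for p in poset:
        sizes[sum(p)] = sizes.get(sum(p), 0) + 1
    return sum(sorted(sizes.values(), reverse=True)[:k])

poset = box_partitions(2, 3)
print([(largest_k_family(poset, k), top_ranks(poset, k)) for k in (1, 2, 3)])
```

<p>The two numbers in each pair agree, which is the strong Sperner property for this particular poset at those values of <code>k</code>.</p>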
<p>A poset has the <em>Peck property</em> if it is rank-symmetric, rank-unimodal, and strongly Sperner.
By a theorem of Proctor, a poset is Peck iff it carries a representation of <script type="math/tex">\sl(2)</script>.</p>
<p>The representations given as examples above are actually carried by posets.
The first is carried by the poset of isomorphism classes of <script type="math/tex">n</script>-vertex graphs, ordered by the subgraph relation,
and the second is carried by the lattice of bounded partitions, equivalently viewed as the lattice of order ideals in a product of two chains.</p>
<div class="footnotes">
<ol>
<li id="fn:1">
<p>By “investigate” I mean I’m just going to say some things which aren’t technically wrong, and you can look them up if you don’t believe me; and by “we” I mean I’m reading some <a href="http://csclub.uwaterloo.ca/~mlbaker/s14/">Lie rep theory notes</a> curated by a friend of mine, <a href="https://mlbaker.net/">Michael Baker</a>, and pilfering just enough of the relevant presentation to make me feel bad if I didn’t say anything. <a href="#fnref:1" class="reversefootnote">↩</a></p>
</li>
<li id="fn:2">
<p>Of course, this depends on semisimplicity. Properly, this is called the preservation of <a href="https://en.wikipedia.org/wiki/Jordan%E2%80%93Chevalley_decomposition">Jordan–Chevalley decomposition</a>, and a precise statement and proof can probably be found in something like Fulton and Harris. <a href="#fnref:2" class="reversefootnote">↩</a></p>
</li>
<li id="fn:3">
<p>It is more convenient to index from 0 when dealing with <script type="math/tex">\sl(2)</script>, for the same reason that both <script type="math/tex">\varnothing</script> and <script type="math/tex">S</script> are subsets of <script type="math/tex">S</script>. <a href="#fnref:3" class="reversefootnote">↩</a></p>
</li>
</ol>
</div>