The Czech-Slovak Olympiad problem: Exploring steps.

We visit the problem once more. This time we are looking for whether things work following heuristics implemented by hand. We will also be more explicit in naming stuff. We will also have a few more steps, which are flips of the earlier steps. In some cases we only consider the flips. This time we will not try to generate statements of lemmas, as this was (barely) successful last time.

The problem

Let ⋆ be a binary operation on a nonempty set $M$. That is, every pair $(a,b) \in M$ is assigned an element $a$ ⋆$ b$ in $M$. Suppose that ⋆ has the additional property that $(a $ ⋆ $b) $ ⋆$ b= a$ and $a$ ⋆ $(a$ ⋆$ b)= b$ for all $a,b \in M$. Show that $a$ ⋆ $b = b$ ⋆ $a$ for all $a,b \in M$.

We derive the following results.

  1. $ m * n = n * m $ (the final claim; by symmetry from 8)
  2. $ (m * n) * n = m $ (instantiation)
  3. $ (m * n)*((m * n) * n) = n$ (instantiation)
  4. $ n = (m * n)*((m * n)* n)$ (symmetry from 3)
  5. $ (m * n)*((m * n)* n) = (m * n)* m $ (left multiplication by $m * n$ of 2
  6. $ ((m * n)* m))* m = m * n$ (instantiation)
  7. $ n = (m * n)* m$ (transitivity from 4 and 5)
  8. $ n * m = ((m * n)* m)* m $ (right multiplication from 7)
  9. $ n * m = m * n $ (transitivity from 8 and 6)

In this notebook we do not consider selection of the correct lemmas, but just the generation.

In [1]:
import $cp.bin.`provingground-core-jvm-a09844e8fb.fat.jar`
import provingground._ , interface._, HoTT._, learning._ 
repl.pprinter() = {
  val p = repl.pprinter()
  p.copy(
    additionalHandlers = p.additionalHandlers.orElse {
      translation.FansiShow.fansiHandler
    }
  )
}
Out[1]:
import $cp.$                                              

import provingground._ , interface._, HoTT._, learning._ 
In [2]:
val M = "M" :: Type

val eqM = "eqM" :: M ->: M ->: Type

val a = "a" :: M
val b = "b" :: M
val c = "c" :: M

val m = "m" :: M

val n = "n" :: M

val mul = "mul" :: M ->: M ->: M
Out[2]:
M: Typ[Term] = M
eqM: Func[Term, Func[Term, Typ[Term]]] = eqM
a: Term = a
b: Term = b
c: Term = c
m: Term = m
n: Term = n
mul: Func[Term, Func[Term, Term]] = mul

We first make all the statements.

In [3]:
val results = Vector(
    eqM(mul(m)(n))(mul(n)(m)),
    eqM(mul(mul(m)(n))(n))(m),
    eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n),
    eqM(n)(mul(mul(m)(n))(mul(mul(m)(n))(n))),
    eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m)),
    eqM(mul(mul(mul(m)(n))(m))(m))(mul(m)(n)),
    eqM(n)(mul(mul(m)(n))(m)),
    eqM(mul(n)(m))(mul(mul(mul(m)(n))(m))(m)),
    eqM(mul(n)(m))(mul(m)(n))
)
Out[3]:
results: Vector[Typ[Term]] = Vector(
  eqM(mul(m)(n))(mul(n)(m)),
  eqM(mul(mul(m)(n))(n))(m),
  eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n),
  eqM(n)(mul(mul(m)(n))(mul(mul(m)(n))(n))),
  eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m)),
  eqM(mul(mul(mul(m)(n))(m))(m))(mul(m)(n)),
  eqM(n)(mul(mul(m)(n))(m)),
  eqM(mul(n)(m))(mul(mul(mul(m)(n))(m))(m)),
  eqM(mul(n)(m))(mul(m)(n))
)

We shall now introduce all the axioms.

In [4]:
val refl = "refl" :: a ~>: (eqM(a)(a))

val sym = "sym" :: a ~>: (b ~>: (eqM(a)(b) ->: eqM(b)(a)))

val trans =
    "trans" :: a ~>:
      (b ~>: (c ~>: ((eqM(a)(b)) ->: (eqM(b)(c)) ->: (eqM(a)(c)))))

val leftMul = "left-multiply" :: a ~>: (b ~>: (c ~>: (eqM(b)(c) ->: eqM(mul(a)(b))(mul(a)(c)))))
val rightMul = "right-multiply" :: a ~>: (b ~>: (c ~>: (eqM(b)(c) ->: eqM(mul(b)(a))(mul(c)(a)))))

val ass1 = "ass1" :: a ~>: (b ~>: eqM(mul(mul(a)(b))(b))(a))
val ass2 = "ass2" :: a ~>: (b ~>: eqM(mul(a)(mul(a)(b)))(b))
Out[4]:
refl: FuncLike[Term, Term] = refl
sym: FuncLike[Term, FuncLike[Term, Func[Term, Term]]] = sym
trans: FuncLike[Term, FuncLike[Term, FuncLike[Term, Func[Term, Func[Term, Term]]]]] = trans
leftMul: FuncLike[Term, FuncLike[Term, FuncLike[Term, Func[Term, Term]]]] = left-multiply
rightMul: FuncLike[Term, FuncLike[Term, FuncLike[Term, Func[Term, Term]]]] = right-multiply
ass1: FuncLike[Term, FuncLike[Term, Term]] = ass1
ass2: FuncLike[Term, FuncLike[Term, Term]] = ass2

Formal proving

Before considering proof discovery, we look for proof representation.

In [5]:
val pf1 = ass1(m)(n)
pf1.typ
Out[5]:
pf1: Term = ass1(m)(n)
res4_1: Typ[U] = eqM(mul(mul(m)(n))(n))(m)
In [6]:
pf1.typ == results(1)
Out[6]:
res5: Boolean = true
In [7]:
val mn = mul(m)(n)
val pf2 = ass2(mn)(n)
Out[7]:
mn: Term = mul(m)(n)
pf2: Term = ass2(mul(m)(n))(n)
In [8]:
pf2.typ
Out[8]:
res7: Typ[U] = eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)
In [9]:
pf2.typ == results(2)
Out[9]:
res8: Boolean = true
In [10]:
Unify.appln(sym, pf2)
Out[10]:
res9: Option[Term] = Some(
  sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n))
)
In [11]:
val pf3 = Unify.appln(sym, pf2).get
pf3.typ
Out[11]:
pf3: Term = sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n))
res10_1: Typ[U] = eqM(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))
In [12]:
pf3.typ == results(3)
Out[12]:
res11: Boolean = true
In [13]:
Unify.appln(leftMul(mn), pf1)
Out[13]:
res12: Option[Term] = Some(
  left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n))
)
In [14]:
val pf4 = Unify.appln(leftMul(mn), pf1).get
pf4.typ
pf4.typ == results(4)
Out[14]:
pf4: Term = left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n))
res13_1: Typ[U] = eqM(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))
res13_2: Boolean = true
In [15]:
val pf5 = ass1(mn)(m)
pf5.typ
pf5.typ == results(5)
Out[15]:
pf5: Term = ass1(mul(m)(n))(m)
res14_1: Typ[U] = eqM(mul(mul(mul(m)(n))(m))(m))(mul(m)(n))
res14_2: Boolean = true
In [16]:
Unify.appln(trans, pf3)
Out[16]:
res15: Option[Term] = Some(
  ($inr : M) ↦ trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))($inr)(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))
)
In [17]:
val step6 = Unify.appln(trans, pf3).get
Unify.appln(step6, pf4)
Out[17]:
step6: Term = ($lzd : M) ↦ trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))($lzd)(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))
res16_1: Option[Term] = Some(
  trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))
)
In [18]:
val pf6 = Unify.appln(step6, pf4).get
pf6.typ
pf6.typ == results(6)
Out[18]:
pf6: Term = trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))
res17_1: Typ[U] = eqM(n)(mul(mul(m)(n))(m))
res17_2: Boolean = true
In [19]:
val pf7 = Unify.appln(rightMul(m),pf6).get
pf7.typ
pf7.typ == results(7)
Out[19]:
pf7: Term = right-multiply(m)(n)(mul(mul(m)(n))(m))(trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n))))
res18_1: Typ[U] = eqM(mul(n)(m))(mul(mul(mul(m)(n))(m))(m))
res18_2: Boolean = true
In [20]:
val step8 = Unify.appln(trans, pf7).get
Out[20]:
step8: Term = ($zde : M) ↦ trans(mul(n)(m))(mul(mul(mul(m)(n))(m))(m))($zde)(right-multiply(m)(n)(mul(mul(m)(n))(m))(trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))))
In [21]:
val pf8 = Unify.appln(step8, pf5).get
Out[21]:
pf8: Term = trans(mul(n)(m))(mul(mul(mul(m)(n))(m))(m))(mul(m)(n))(right-multiply(m)(n)(mul(mul(m)(n))(m))(trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))))(ass1(mul(m)(n))(m))
In [22]:
pf8.typ
pf8.typ == results(8)
Out[22]:
res21_0: Typ[U] = eqM(mul(n)(m))(mul(m)(n))
res21_1: Boolean = true
In [23]:
val pf = Unify.appln(sym, pf8).get
pf.typ
pf.typ == results(0)
Out[23]:
pf: Term = sym(mul(n)(m))(mul(m)(n))(trans(mul(n)(m))(mul(mul(mul(m)(n))(m))(m))(mul(m)(n))(right-multiply(m)(n)(mul(mul(m)(n))(m))(trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))))(ass1(mul(m)(n))(m)))
res22_1: Typ[U] = eqM(mul(m)(n))(mul(n)(m))
res22_2: Boolean = true

Conclusions

  • We have successfully formalized the proof.
  • We will now try to generate the steps, ruthlessly narrowing.
In [24]:
val lp1 = LocalProver(TermState(FiniteDistribution.unif(ass1, ass2, m, n, mn), FiniteDistribution.empty)).noIsles.sharpen(10)
Out[24]:
lp1: LocalProverStep = LocalProver(
  TermState(
    FiniteDistribution(
      Vector(
        Weighted(ass1, 0.2),
        Weighted(mul(m)(n), 0.2),
        Weighted(ass2, 0.2),
        Weighted(m, 0.2),
        Weighted(n, 0.2)
      )
    ),
    FiniteDistribution(Vector()),
    Vector(),
    FiniteDistribution(Vector()),
    FiniteDistribution(Vector()),
    Empty
  ),
  TermGenParams(
    0.1,
    0.1,
    0.1,
    0.0,
    0.0,
    0.05,
    0.05,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.3,
    0.7,
    0.5,
    0.0,
    0.0,
    0.0,
    OrElse(
      OrElse(OrElse(OrElse(<function1>, <function1>), <function1>), <function1>),
...
In [25]:
val terms1T = lp1.nextState.map(_.terms).memoize
val r1 = terms1T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize
Out[25]:
terms1T: monix.eval.Task[FiniteDistribution[Term]] = Async(
  <function2>,
  false,
  true,
  true
)
r1: monix.eval.Task[Vector[(Typ[Term], Int)]] = Async(
  <function2>,
  false,
  true,
  true
)
In [26]:
import monix.execution.Scheduler.Implicits.global
terms1T.runToFuture
In [28]:
val f1 = r1.runToFuture

Interim conclusion

We see that all the instantiation lemmas have been obtained.

In [35]:
val lp2 = LocalProver(TermState(FiniteDistribution.unif(pf1, pf2, sym, leftMul, m, n, mn), FiniteDistribution.empty)).noIsles.sharpen(10)
Out[35]:
lp2: LocalProverStep = LocalProver(
  TermState(
    FiniteDistribution(
      Vector(
        Weighted(m, 0.14285714285714285),
        Weighted(sym, 0.14285714285714285),
        Weighted(mul(m)(n), 0.14285714285714285),
        Weighted(left-multiply, 0.14285714285714285),
        Weighted(ass2(mul(m)(n))(n), 0.14285714285714285),
        Weighted(n, 0.14285714285714285),
        Weighted(ass1(m)(n), 0.14285714285714285)
      )
    ),
    FiniteDistribution(Vector()),
    Vector(),
    FiniteDistribution(Vector()),
    FiniteDistribution(Vector()),
    Empty
  ),
  TermGenParams(
    0.1,
    0.1,
    0.1,
    0.0,
    0.0,
    0.05,
    0.05,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.3,
    0.7,
    0.5,
    0.0,
    0.0,
    0.0,
    OrElse(
...
In [36]:
val terms2T = lp2.nextState.map(_.terms).memoize
val r2 = terms2T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize
Out[36]:
terms2T: monix.eval.Task[FiniteDistribution[Term]] = Async(
  <function2>,
  false,
  true,
  true
)
r2: monix.eval.Task[Vector[(Typ[Term], Int)]] = Async(
  <function2>,
  false,
  true,
  true
)
In [37]:
terms2T.runToFuture
In [38]:
// val r2 = terms2T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize
val f2 = r2.runToFuture

The consequences 3 and 4 were found with moderately focussed generation. We next try using transitivity. This will be very focussed but with a lower cutoff.

In [39]:
val lp3 = LocalProver(TermState(FiniteDistribution.unif(pf3, pf4, trans), FiniteDistribution.empty)).noIsles
Out[39]:
lp3: LocalProver = LocalProver(
  TermState(
    FiniteDistribution(
      Vector(
        Weighted(
          sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)),
          0.3333333333333333
        ),
        Weighted(
          left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)),
          0.3333333333333333
        ),
        Weighted(trans, 0.3333333333333333)
      )
    ),
    FiniteDistribution(Vector()),
    Vector(),
    FiniteDistribution(Vector()),
    FiniteDistribution(Vector()),
    Empty
  ),
  TermGenParams(
    0.1,
    0.1,
    0.1,
    0.0,
    0.0,
    0.05,
    0.05,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.3,
    0.7,
    0.5,
    0.0,
    0.0,
...
In [40]:
val terms3T = lp3.nextState.map(_.terms).memoize
val r3 = terms3T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize
Out[40]:
terms3T: monix.eval.Task[FiniteDistribution[Term]] = Async(
  <function2>,
  false,
  true,
  true
)
r3: monix.eval.Task[Vector[(Typ[Term], Int)]] = Async(
  <function2>,
  false,
  true,
  true
)
In [41]:
terms3T.runToFuture
In [42]:
val f3 = r3.runToFuture

This worked, with a very narrow focus but quickly. We now aim for Lemma 7, which is again right multiplication.

In [43]:
val lp4 = LocalProver(TermState(FiniteDistribution.unif(pf6, rightMul, m, n, mn), FiniteDistribution.empty)).noIsles
val terms4T = lp4.nextState.map(_.terms).memoize
val r4 = terms4T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize
Out[43]:
lp4: LocalProver = LocalProver(
  TermState(
    FiniteDistribution(
      Vector(
        Weighted(right-multiply, 0.2),
        Weighted(mul(m)(n), 0.2),
        Weighted(m, 0.2),
        Weighted(
          trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n))),
          0.2
        ),
        Weighted(n, 0.2)
      )
    ),
    FiniteDistribution(Vector()),
    Vector(),
    FiniteDistribution(Vector()),
    FiniteDistribution(Vector()),
    Empty
  ),
  TermGenParams(
    0.1,
    0.1,
    0.1,
    0.0,
    0.0,
    0.05,
    0.05,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.3,
    0.7,
    0.5,
    0.0,
...
terms4T: monix.eval.Task[FiniteDistribution[Term]] = Async(
  <function2>,
  false,
  true,
  true
)
r4: monix.eval.Task[Vector[(Typ[Term], Int)]] = Async(
  <function2>,
  false,
  true,
  true
)
In [44]:
terms4T.runToFuture
In [45]:
val f4 = r4.runToFuture
In [46]:
f4.value
Out[46]:
res45: Option[scala.util.Try[Vector[(Typ[Term], Int)]]] = Some(
  Success(
    Vector(
      (eqM(n)(mul(mul(m)(n))(m)), 6),
      (eqM(mul(n)(m))(mul(mul(mul(m)(n))(m))(m)), 7)
    )
  )
)

The Lemma 7 was obtained with a fairly focus, but with the default cutoff. We now look for Lemma 8 and the Theorem. It will probably take two steps.

In [47]:
val lp5 = LocalProver(TermState(FiniteDistribution.unif(pf5, pf7, sym, trans), FiniteDistribution.empty)).noIsles.sharpen(10)
val terms5T = lp5.nextState.map(_.terms).memoize
val r5 = terms5T.map(fd => results.zipWithIndex.filter{case (tp, n) => fd.map(_.typ)(tp) > 0}).memoize
Out[47]:
lp5: LocalProverStep = LocalProver(
  TermState(
    FiniteDistribution(
      Vector(
        Weighted(ass1(mul(m)(n))(m), 0.25),
        Weighted(
          right-multiply(m)(n)(mul(mul(m)(n))(m))(trans(n)(mul(mul(m)(n))(mul(mul(m)(n))(n)))(mul(mul(m)(n))(m))(sym(mul(mul(m)(n))(mul(mul(m)(n))(n)))(n)(ass2(mul(m)(n))(n)))(left-multiply(mul(m)(n))(mul(mul(m)(n))(n))(m)(ass1(m)(n)))),
          0.25
        ),
        Weighted(sym, 0.25),
        Weighted(trans, 0.25)
      )
    ),
    FiniteDistribution(Vector()),
    Vector(),
    FiniteDistribution(Vector()),
    FiniteDistribution(Vector()),
    Empty
  ),
  TermGenParams(
    0.1,
    0.1,
    0.1,
    0.0,
    0.0,
    0.05,
    0.05,
    0.0,
    0.0,
    0.0,
    0.0,
    0.0,
    0.3,
    0.7,
    0.5,
    0.0,
    0.0,
...
terms5T: monix.eval.Task[FiniteDistribution[Term]] = Async(
  <function2>,
  false,
  true,
  true
)
r5: monix.eval.Task[Vector[(Typ[Term], Int)]] = Async(
  <function2>,
  false,
  true,
  true
)
In [48]:
terms5T.runToFuture
In [49]:
val f5 = r5.runToFuture
In [50]:
f5.value
Out[50]:
res49: Option[scala.util.Try[Vector[(Typ[Term], Int)]]] = Some(
  Success(
    Vector(
      (eqM(mul(m)(n))(mul(n)(m)), 0),
      (eqM(mul(mul(mul(m)(n))(m))(m))(mul(m)(n)), 5),
      (eqM(mul(n)(m))(mul(mul(mul(m)(n))(m))(m)), 7),
      (eqM(mul(n)(m))(mul(m)(n)), 8)
    )
  )
)

We got the theorem with symmetry and transitivity working together.

Conclusions

  • We could generate all the lemmas and the final theorem.
  • While we did focus, it was not too gross.