On the selectivity of a semantic subsumption index
Seminar Room 1, Newton Institute
Subsumption is a way of saying that one first order clause is ''more general'' than another. Millions of pairs of clauses need to be tested for subsumption in a single run of a typical first order resolution-based theorem prover. Deciding, for two given clauses, whether one of them subsumes another is NP-hard and thus subsumption tests take a lot of time, usually more than half of the running time of a prover.
Complicated data structures have been developed to support subsumption tests, including discrimination trees with their most advanced version, called code trees, implemented in Vampire. Since subsumption is a syntactic notion, it is the syntax of the clauses which is indexed in the trees.
We show, that in some cases, a simple semantical indexing technique can possibly prove more efficient in this context, than the known complicated syntactic data structures.
Our main experimental result is that:
(i) in most (up to 99 %) of "practical" negative cases, the non-subsumption can be shown by a counterexample being a structure consisting of 4 elements;
(ii) if a logic with many enough truth values is used, then the counterexamples are relatively frequent, so they can be constructed by a random guess.
Finally, we provide some evidence that a similar semantic idea can be successfully used for testing for AC matching, which is another NP-complete problem whose millions of instances are being solved by theorem provers.