A Toy Memory-less Abstract Machine
Due to the current technology of the computers we can use, we have chosen
an extremely abridged emulation of the machine that will effectively
run the programs, instead of more proper languages, such as l -calculus
(or LISP). We have adapted the "toy RISC" machine of [Hernández-Orallo
and Hernández-Orallo 1993] with two remarkable features inherited
from its object-oriented coding in C++: it is easily tunable for our needs,
and it is efficient. We have made it even more reduced, removing
any operand in the instruction set, even for the loop operations. We have
only three registers, which are AX (the accumulator), BX and CX. The operations
Q b we have used for our experiment are in Table 1:
|
Decrements CX. If it is not equal to the first element jump to the program top. |
|
Same as LOOPTOP but it jumps n (for the tests n=4) instructions backward. |
|
Same as LOOPTOP but it jumps m (for the tests m=7) instructions backward. |
|
Increments the accumulator. |
|
Decrements the accumulator. |
|
Writes into the output and moves fwd. |
|
Moves back and reads from the output. |
|
Moves fwd and reads from the output. |
|
Copy register BX into AX |
|
Copy register AX into BX |
|
Copy register CX into AX |
|
Copy register AX into CX |
|
Rotates 45° to the right. |
|
Rotates 45° to the left. |
Table 1. Instruction Set
The operations with no superscript are present in all the subsets. Operations marked with (1) are present in the professional version of the machine, the operations with (2) are present in the Turing-like version and those with (3) are present in the Logo version where the output is bidimensional. This sparseness of only 10 operations will be clearly justified later. We have essayed with many different alphabets but for this test we will use the professional version and a circular alphabet W b = {a,b,c,d,...,z}, i.e., incrementing z yields a and decrementing a yields z. Since the first element is an inflexion point for the loops, it is presented to the subjects as "a critical element".
This configuration still produces many programs that are not robust (intensional) because programs can be often split into subprograms. The solution for these cases comes from another restriction: the programs must be comprised wholly inside a loop. This leaves a good approximation to explanatory programs. The rest to do is to avoid repetitions of patterns such as "abcabcabcabc" (for sake of gain and plausibility) and take apart the strings where an important part is explained by a shorter program (for sake of intensionality). We think that the bias is not all the expressible we would like but it allows the generation of strings of certain complexity. Also we think it is fair because it does not relate on arithmetic (such as cryptarithmetic tests) or any other preceding knowledge, except the order of the alphabet.
The Generation of k-Hard Strings
The algorithm we have used to generate a set of different k-incomprehensible strings is very similar to the one we presented in section 5.4 (of [Hernández-Orallo and Minaya-Collado 1998]). Having 10 operations, we have that usually only about a 20% of the programs of any size are explanatory. This means that trying to know whether a randomly generated program of, say, size 15, is valid, will need the checking of more than 2,222,222,222,222 programs. And this is the case if the computational cost of x* is slow, contrariwise (if x* is a costly program) we will have to check longer programs.
We have used some optimisations and heuristics in order to make the great amount of programs to check more tractable. Some examples of questions are:
Prediction style:
k12 a, a, z, c, y, e, x, Answer: g
k14: c, a, b, d, b, c, c, e, c, d, Answer: d
k10: a, x, _, v, w, t, u, Answer: y
k13: a, y, w, _, w, u, w, u, s, Answer: y
Four tests were devised to measure prediction, abduction, g-factor and similarity. The prediction test is composed of 19 exercises generated with the following k-hardness distribution (2 k7, 1 k8, 2 k9, 3 k10, 3 k11, 3 k12, 2 k13 and 1 k14), redundancy r = 2 and the less akin as possible. The abduction test is composed of 15 exercises using the same generator and redundancy. The distribution was (2 k7, 2 k8, 1 k9, 2 k10, 1 k11, 3 k12 and 4 k13). In these two tests, the incorrect options were generated randomly but relative near to the solution and the letters appearing in the string. The IQ test we used was the European IQ test simply because it is a culture-fair test, devised for 20 minutes, ensuring a reasonable range (75-174) of values and available on the Internet.
Return to the Computational Measurement of Intelligence Page