Tuesday, June 6, 2017

Vampisoft

I have started a software company of my own - www.vampisoft.com . I will do subcontracting, so let me know if you have any code to write (C/C++, Java, Perl, Python). I will also try to popularize my Open Source projects, especially Perkun.

Monday, May 22, 2017

To believe vs. to see

The input variables (in perkun/wlodkowic/zubr) represent what is directly perceived. The hidden variables used to construct the agent's state represent what the agent believes. What we believe may be more important than what we can see. The hidden variables can be used to model unknown parameters of the world - the hidden processes running beneath the surface of the visible. All in all, the hidden variables are a tool for representing the past, the history, in a compact way.


Thursday, March 9, 2017

Is zubr better than perkun or wlodkowic?

Just to recall: perkun and wlodkowic are interpreters. Zubr, by contrast, is a code generator. Is it better? It has one advantage: the code it produces does not enumerate all the visible states, i.e. all the situations possible in the game. Instead it works much like the chess playing programs do - it builds the game tree dynamically.

Both the perkun/wlodkowic interpreters and the zubr-generated code contain my optimization algorithm - the same algorithm that maximizes the expected value of the payoff function.

Zubr generates Java code, which I consider an advantage.

All the three tools come in the perkun package: https://sourceforge.net/projects/perkun/

If you have a C++ program that needs my optimization algorithm then it is better to link it against libperkun or libwlodkowic. I have written two small games demonstrating how to do it: https://sourceforge.net/projects/perkunwars/ (for perkun) and https://sourceforge.net/projects/thragos/ (for wlodkowic). They both run the perkun/wlodkowic interpreters as separate processes that communicate with the parent process through pipes. Feel free to take a look at their source code.
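The same pattern - spawning the interpreter as a child process and talking to it over its standard input and output - can be sketched in a few lines of Java. This is only an illustration of the pipe mechanism, not code from Perkun Wars or Thragos (those are C++); since the actual perkun message protocol is not shown here, the `cat` command stands in for the interpreter binary:

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;

public class PipeDemo {
    // Spawn a child process and exchange one line over its stdin/stdout pipes.
    public static String roundTrip(String line) throws Exception {
        // "cat" stands in for the perkun/wlodkowic interpreter binary.
        Process child = new ProcessBuilder("cat").start();
        BufferedWriter toChild =
            new BufferedWriter(new OutputStreamWriter(child.getOutputStream()));
        BufferedReader fromChild =
            new BufferedReader(new InputStreamReader(child.getInputStream()));
        toChild.write(line);
        toChild.newLine();
        toChild.flush();                 // push the message through the pipe
        String reply = fromChild.readLine(); // read the child's answer
        toChild.close();                 // closing stdin lets the child exit
        child.waitFor();
        return reply;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("alpha=TRUE"));
    }
}
```

In the real games the child would be the interpreter and the exchanged lines would carry the input variable values and the chosen actions.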

There are, however, some things you might consider disadvantages of zubr. For example the model: you have to hardcode it in the getModelProbability method, since there is no zubr syntax for a model. The same holds for the payoff (the getPayoff method). Wlodkowic offers an extra section for the a priori belief - in zubr this, again, requires an extra method.
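To make concrete what "hardcoding the model" means, here is a minimal plain-Java sketch of a transition model for the FALSE, FALSE, TRUE, TRUE world used in the examples below. The class, the method name and its signature are mine, not the exact ones zubr generates:

```java
public class ModelSketch {
    // Hidden state: phase 0..3 of the cycle FALSE, FALSE, TRUE, TRUE.
    // P(nextPhase | phase, move) - the world is deterministic, so every
    // probability is either 0 or 1.
    public static double modelProbability(int phase, boolean move, int nextPhase) {
        int expected = move ? (phase + 1) % 4 : phase;
        return nextPhase == expected ? 1.0 : 0.0;
    }

    // The alpha visible in a given phase: TRUE in the second half of the cycle.
    public static boolean alpha(int phase) {
        return phase == 2 || phase == 3;
    }

    public static void main(String[] args) {
        System.out.println(modelProbability(1, true, 2)); // MOVE advances the phase
        System.out.println(modelProbability(1, true, 1)); // staying put is impossible
    }
}
```

In zubr the analogous method would be filled with exactly this kind of hardcoded table or formula.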

Zubr also has no syntax to inform the optimizer about impossible states or illegal actions. This is resolved with an extra feature - the iterators - which I hope to explain later. You may also take a look at the zubr man page and the code it generates.

In the recent posts I walked through the zubr examples stored in the "examples" folder of the perkun package, trying to demonstrate that a state based on hidden variables improves the quality of prediction/optimization. I think it is time for a major example using zubr - something like Perkun Wars for perkun or Thragos for wlodkowic.


Wednesday, March 8, 2017

example22_hidden_variables_based_predictor.zubr

You want proof that hidden variables allow better optimization? Here you are.

Imagine an optimizer that takes two input variables instead of one. The Perkun section of the zubr specification looks as follows:

values
{
    value FALSE, TRUE;
}

variables
{
    input variable alpha:{FALSE, TRUE}, reward:{FALSE, TRUE};
    hidden variable gamma:{FALSE, TRUE};
    output variable action:{FALSE, TRUE};
}


There are two input variables now: alpha and reward. What is their semantics? Alpha follows the sequence FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, ... and so on, independently of the agent's action, but the agent does not know where in the sequence it begins. The action is a bet - an attempt to predict the next alpha. Depending on the action the agent receives a reward - immediate information about whether the prediction was correct. Reward TRUE means the prediction was right, FALSE means no reward.
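This environment can be sketched in a few lines of plain Java (my own illustration, not the zubr-generated code; the class and method names are invented):

```java
public class BetWorld {
    private int phase;           // position in the cycle FALSE, FALSE, TRUE, TRUE
    private boolean lastBetWon;  // becomes the next reward signal

    public BetWorld(int startPhase) { phase = startPhase % 4; }

    public boolean alpha()  { return phase == 2 || phase == 3; }
    public boolean reward() { return lastBetWon; }

    // The agent bets on the next alpha; the world advances and pays out.
    public void step(boolean bet) {
        phase = (phase + 1) % 4;       // alpha advances regardless of the action
        lastBetWon = (bet == alpha()); // reward TRUE iff the bet was right
    }
}
```

The agent only sees alpha() and reward(); the phase stays hidden, which is exactly what the hidden variable gamma has to make up for.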

You can execute the program directly from my server:

http://www.pawelbiernacki.net/hiddenVariableBasedPredictor.jnlp

For example let us start with FALSE, FALSE. The program sets its initial belief to gamma=>FALSE at 50% and gamma=>TRUE at 50%. The chosen action is FALSE (he bets that the next alpha will be FALSE). Let us assume he was wrong and the next alpha is TRUE. So there will be no reward - enter TRUE, FALSE.

Now he knows that gamma is FALSE (the belief reflects this). The action will be TRUE - he thinks the next alpha will be TRUE. Let's confirm his expectations: enter TRUE, TRUE. Now gamma=>TRUE and the action is FALSE.

In short: thanks to the state based on hidden variables his predictions will always be correct after the first two signals. He will always get reward TRUE. Only in the beginning is there uncertainty (reflected by the belief).

When you compare this optimizer (in fact - this predictor) with functions based merely on the input variables, you will see that no function can beat him. I found two functions that come pretty close:

f1(FALSE, FALSE) = FALSE
f1(FALSE, TRUE) = FALSE
f1(TRUE, FALSE) = TRUE
f1(TRUE, TRUE) = FALSE

f2(FALSE, FALSE) = FALSE
f2(FALSE, TRUE) = TRUE
f2(TRUE, FALSE) = TRUE
f2(TRUE, TRUE) = TRUE

I tested all 16 possible functions - only f1 and f2 come close, but even they keep making mistakes after the first two signals. Our predictor generated by zubr, on the contrary, can make at most one mistake - after the first two signals he makes no more.
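This claim is easy to check mechanically. The sketch below (plain Java, independent of the zubr code; all names are mine) enumerates all 16 functions of the pair (alpha, reward), simulates the cyclic world, and compares them with a predictor that remembers the last two alphas - the very information the hidden variable gamma encodes:

```java
public class PredictorComparison {
    // alpha at time t in the cycle FALSE, FALSE, TRUE, TRUE
    static boolean alpha(int t) { return t % 4 == 2 || t % 4 == 3; }

    // Mistakes made on steps 2..steps-1 by the memoryless function number f:
    // bit (2*alpha + reward) of f encodes its prediction for that input pair.
    public static int memorylessMistakes(int f, int steps) {
        int mistakes = 0;
        boolean prevBet = false; // arbitrary initial bet
        for (int t = 0; t < steps; t++) {
            boolean reward = (t > 0) && (prevBet == alpha(t));
            int idx = (alpha(t) ? 2 : 0) + (reward ? 1 : 0);
            boolean bet = ((f >> idx) & 1) == 1;
            if (t >= 2 && bet != alpha(t + 1)) mistakes++;
            prevBet = bet;
        }
        return mistakes;
    }

    // Stateful predictor: two consecutive alphas pin down the phase -
    // if the last two alphas are equal the next one flips, otherwise it repeats.
    public static int statefulMistakes(int steps) {
        int mistakes = 0;
        for (int t = 2; t < steps; t++) {
            boolean bet = (alpha(t - 1) == alpha(t)) ? !alpha(t) : alpha(t);
            if (bet != alpha(t + 1)) mistakes++;
        }
        return mistakes;
    }

    public static void main(String[] args) {
        for (int f = 0; f < 16; f++)
            System.out.println("f" + f + ": " + memorylessMistakes(f, 42));
        System.out.println("stateful: " + statefulMistakes(42));
    }
}
```

Running it shows that every one of the 16 memoryless functions keeps making mistakes indefinitely, while the stateful predictor makes none once it has seen two signals.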

If you take a look at the file example22_hidden_variables_based_predictor.zubr (unpack perkun and see the "examples" folder) you will see that we use a custom dialog (extending JDialog) in the method getInput. This was necessary because we have two input variables here. You may process the example file with zubr:

zubr example22_hidden_variables_based_predictor.zubr > MyOptimizer.java

The resulting Java code can be compiled (remember to place it in a package "optimizer").

What is the conclusion? For the case discussed here, the optimizer/predictor with a state is much better than any function based on the input variables. The state should be based on the hidden variables (not the only possibility, but the most natural one). This was the problem with AI - we tried to achieve this with IF THEN, and IF THEN can only see the current input. The hidden variables are a natural way to compress our knowledge about the past - the history.










Tuesday, March 7, 2017

example21_set_apriori_belief.zubr

In the examples I assume we have a good world model (for example we know the sequence is FALSE, FALSE, TRUE, TRUE on MOVE) but we do not know exactly where we begin. If we initially get FALSE then MOVE could lead to another FALSE or to TRUE. This implies that the initial belief (a probability distribution) must reflect this uncertainty. But even though we do not know the hidden variables initially, we may know more than nothing about them. For instance, if we are a doctor talking with a patient, we may introduce a hidden variable "patient_has_cancer". We should not assume 50% for TRUE and 50% for FALSE, as zubr does by default. Instead we should apply the natural probability distribution of cancer in the population, i.e. use a so-called a priori belief.
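The difference such a prior makes can be shown with a toy Bayes computation in plain Java. The numbers here are invented for illustration (a "test" that is right 80% of the time); only the 50%/50% versus 30%/70% contrast comes from the example itself:

```java
public class APrioriDemo {
    // Posterior P(gamma=TRUE | evidence says TRUE) for evidence that is
    // correct 80% of the time, starting from the given prior.
    public static double posterior(double priorTrue) {
        double weightTrue = 0.8 * priorTrue;          // evidence TRUE, gamma really TRUE
        double weightFalse = 0.2 * (1.0 - priorTrue); // evidence TRUE, gamma really FALSE
        return weightTrue / (weightTrue + weightFalse);
    }

    public static void main(String[] args) {
        System.out.println(posterior(0.5)); // uniform prior: 0.8
        System.out.println(posterior(0.7)); // a priori belief 70%: about 0.903
    }
}
```

The same evidence produces a noticeably different belief depending on the prior - which is why setAPrioriBelief is worth implementing.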

This requires us to tell zubr we will define the method setAPrioriBelief:

%option setaprioribelief own

Then in the definition section we provide the implementation:

protected void setAPrioriBelief(Belief target) {
    // Enumerate all states consistent with the current visible state...
    for (StateIterator i = createNewStateIterator(target.getVisibleState()); !i.getFinished(); i.increment()) {
        State si = i.createState();
        target.addState(si);

        // ...and assign each one its a priori probability (they must sum to 1).
        if (si.getVariableValue("gamma") == Value.FALSE)
            target.setProbability(si, 0.3f);
        else
            target.setProbability(si, 0.7f);
    }
}


As you can see, we iterate over all possible states using a StateIterator, create the states and add them to the target (a Belief). We will talk about the iterators later - take them for granted for now. Once we have populated the belief with states we may query them for the hidden variable values and set the probabilities. Note that we chose 30% for gamma=>FALSE and 70% for gamma=>TRUE.

Now process the example with zubr and compile the java outcome:

zubr example21_set_apriori_belief.zubr > MyOptimizer.java

You can also execute the program directly from my server:

http://www.pawelbiernacki.net/aprioriOptimizer.jnlp

Have you noticed the small change after the first signal? The belief is no longer 50%/50%, but 30%/70%! This can matter a great deal in more realistic examples.

Download zubr from https://sourceforge.net/projects/perkun/.

Monday, March 6, 2017

example20_hidden_variables.zubr

This is our first example with the hidden variables. The Perkun section of the zubr specification looks as follows:

values
{
    value FALSE, TRUE;
    value MOVE, DONT_MOVE;
}

variables
{
    input variable alpha:{FALSE, TRUE}; // alpha may have value FALSE or TRUE   
    hidden variable gamma:{FALSE, TRUE};
    output variable beta:{MOVE, DONT_MOVE};
}


What is it good for? Imagine an automaton that can do either MOVE or DONT_MOVE. When it constantly does MOVE, the input will be:

FALSE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE,...

But it is not known where in the sequence we begin. So even though the automaton knows that two FALSEs follow each other, when it gets a FALSE it does not know whether that was the first FALSE in the sequence or the second one.
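The automaton can be sketched in plain Java as a cycle of four hidden phases (the names are mine; that DONT_MOVE leaves the automaton where it is, is my assumption based on the scenarios listed below):

```java
public class MoveWorld {
    private int phase; // hidden position in the cycle FALSE, FALSE, TRUE, TRUE

    public MoveWorld(int startPhase) { phase = startPhase % 4; }

    // The visible input: TRUE in the second half of the cycle.
    public boolean alpha() { return phase == 2 || phase == 3; }

    public void act(boolean move) {
        if (move) phase = (phase + 1) % 4; // MOVE advances the cycle
        // DONT_MOVE (assumed): the automaton stays put, alpha is unchanged
    }
}
```

The agent never sees the phase, only alpha - which is exactly the gap the hidden variable gamma fills.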

The payoff function makes the program "like" TRUE as input and "dislike" FALSE.

You may process the example with zubr to obtain the Java optimizer code:

zubr example20_hidden_variables.zubr > MyOptimizer.java

Here is the link to the program (you can run it directly from my server):

http://www.pawelbiernacki.net/hiddenVariablesOptimizer.jnlp

There are three scenarios possible:

1. TRUE -> DONT_MOVE
   TRUE -> DONT_MOVE
   TRUE -> DONT_MOVE
   ...

2. FALSE -> MOVE
   FALSE -> MOVE
   TRUE -> MOVE
   TRUE -> DONT_MOVE
   TRUE -> DONT_MOVE
   ...

3. FALSE -> MOVE
   TRUE -> MOVE
   TRUE -> DONT_MOVE
   TRUE -> DONT_MOVE
   ...

You can see that the program created by zubr behaves a little as if it were nondeterministic: sometimes it responds to TRUE with MOVE, sometimes with DONT_MOVE.

In fact it is completely deterministic, but it has a state - a belief, i.e. a probability distribution over the two possible facts gamma => FALSE and gamma => TRUE. This belief changes depending on the performed actions and the obtained results. Thanks to this additional knowledge (the belief) the optimizer can permit itself, for example, to choose MOVE when it knows that a TRUE will still follow the MOVE. In the first scenario, on the contrary, it does not know after a TRUE whether another TRUE will follow, therefore it chooses DONT_MOVE.
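The belief update itself is simple enough to sketch in plain Java. Here the belief is kept as a distribution over the four phases of the cycle rather than over zubr's (alpha, gamma) pairs - an equivalent but, I think, easier to read encoding; all names are mine:

```java
import java.util.Arrays;

public class BeliefDemo {
    static boolean alpha(int phase) { return phase == 2 || phase == 3; }

    // Keep only the phases consistent with the observed alpha, renormalize.
    public static double[] observe(double[] belief, boolean observedAlpha) {
        double[] b = belief.clone();
        double sum = 0.0;
        for (int p = 0; p < 4; p++) {
            if (alpha(p) != observedAlpha) b[p] = 0.0;
            sum += b[p];
        }
        for (int p = 0; p < 4; p++) b[p] /= sum;
        return b;
    }

    // Shift the belief one phase forward (MOVE is deterministic).
    public static double[] move(double[] belief) {
        double[] b = new double[4];
        for (int p = 0; p < 4; p++) b[(p + 1) % 4] = belief[p];
        return b;
    }

    public static void main(String[] args) {
        double[] b = {0.25, 0.25, 0.25, 0.25}; // nothing known yet
        b = observe(b, false);                 // first FALSE: phase 0 or 1
        System.out.println(Arrays.toString(b)); // [0.5, 0.5, 0.0, 0.0]
        b = observe(move(b), true);            // MOVE, then TRUE: phase must be 2
        System.out.println(Arrays.toString(b)); // [0.0, 0.0, 1.0, 0.0]
    }
}
```

After FALSE followed by TRUE the belief collapses to a single phase, and from then on the automaton's future is fully predictable - the behavior seen in scenarios 2 and 3.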

This is the important point I want to make: first, a state is essential for successful optimization, and the hidden variables are a natural way to construct such states. Second, the optimizers can be deterministic and still better than functions based on the input variables. In the case discussed here it is easy to construct a function that performs just as well as the zubr-generated optimizer:

f(FALSE) = MOVE
f(TRUE) = DONT_MOVE

So in this case a function is just as good as the zubr optimizer, but in more complex cases the functions simply cannot beat the optimizers. We will discuss such an example later. The hidden-variable-based optimizers differ from the functions in that they have a deeper "understanding" of the outer world.

Download zubr from https://sourceforge.net/projects/perkun/.





Sunday, March 5, 2017

example19_get_payoff.zubr

What is the purpose of an optimizer? It attempts to maximize the expected value of the so-called payoff function. In this example we are finally implementing a method specifying the payoff function. First we have to tell zubr about it:

%option getpayoff own // method getPayoff

Then in the definition section we provide the implementation:


protected float getPayoff(VisibleState vs) {

    switch (vs.getVariableValue("alpha"))
    {
        case FALSE:
            return 0.0f;
       
        case TRUE:
            return 100.0f; // TRUE is "better" than FALSE
    }
    return 0.0f;
}


This way we make our optimizer prefer alpha=TRUE and dislike alpha=FALSE. The example can be processed as usual with zubr:

zubr example19_get_payoff.zubr > MyOptimizer.java

There are two possible decisions: MOVE and DONT_MOVE. The Perkun section in the zubr specification looks as follows:

values
{
    value FALSE, TRUE;
    value MOVE, DONT_MOVE;
}

variables
{
    input variable alpha:{FALSE, TRUE}; // alpha may have value FALSE or TRUE   
    output variable beta:{MOVE, DONT_MOVE};
}


You can execute the final program directly from my server:
http://www.pawelbiernacki.net/getPayoffOptimizer.jnlp

As is easy to anticipate, the optimizer will do MOVE after FALSE and DONT_MOVE after TRUE. We still don't have hidden variables here, but the example is sufficient to introduce the getPayoff method.

An interesting observation: if there are no hidden variables then the optimizer can be replaced by a simple function - but only then. The optimizers with hidden variables can be much better than any function mapping input to output, as demonstrated in the posts above.

The next example will be based on hidden variables, which is what makes zubr interesting.