Friday, January 21, 2011

Stacks and Queues in Prolog

Stacks and queues are basic bread and butter data structures, so I figured why not implement them in Prolog.

First, the stack.  A lot of the time you can get away with using the call stack as a stack.  Of course, this might not be the best idea if the stack is large, and depending on the problem it might be awkward.  Stacks are particularly easy to implement in Prolog, requiring just two predicates:

% item to add; original stack; new stack
add_stack( Item, Stack, [Item|Stack] ).
% original stack; item to take; new stack
remove_stack( [Item|RestStack], Item, RestStack ).

Pretty simple.  By using a plain list and manipulating only its first element, one gets a stack.  "remove_stack" is particularly nice: it simply fails on an empty stack, which can lead to more natural solutions than ones that must explicitly check for emptiness.
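
To see these in action, a quick session should give something like the following (the last item pushed is the first one popped):
?- add_stack( a, [], S1 ), add_stack( b, S1, S2 ), remove_stack( S2, Item, S3 ).
S1 = [a],
S2 = [b, a],
Item = b,
S3 = [a].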

Now for queues.  This is also pretty straightforward:
add_queue( Item, Queue, NewQueue ) :-
   append( Queue, [Item], NewQueue ).
remove_queue( [Item|RestQueue], Item, RestQueue ).
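
A quick session might look something like this (items come out in the order they went in):
?- add_queue( 1, [], Q1 ), add_queue( 2, Q1, Q2 ), remove_queue( Q2, Item, Q3 ).
Q1 = [1],
Q2 = [1, 2],
Item = 1,
Q3 = [2].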

Again, a simple list is used.  The only difference is that added elements go to the end of the list, instead of the front.  However, there is a problem with this.  Generally, queue implementations are assumed to offer enqueue and dequeue operations in O(1).  This one achieves an O(1) dequeue, but an O(n) enqueue because of the append operation.  We want to put an item at the end of the list, but in order to do that, we need to traverse the entire list.

For trivial things, or where it can be guaranteed that n is small, this is probably not a problem.  However, for real applications, this is definitely an issue.

We want to be able to add things to the back of the list, while taking things from the front of the list, both in constant time.  The problem is that we can't do that with just a plain old list.

Think outside of Prolog a bit.  Imagine this were C++ code.  Each element of the list would likely be implemented as a node in a linked list.  The way to get O(1) access for both the first and last nodes would be to store pointers to both the front and back.  Prolog gives you the front for free, but not the back.  You would like to store a pointer to the back, but Prolog doesn't let you do that.

At least not as a pointer.  Prolog lets you do something much cooler: variables.  Logical variables have seemingly magical properties to someone who is used to more traditional languages.  The most magical bit is that they can be uninstantiated, to be instantiated at some later time (or not).  With that in mind, consider the following:
?- Test = [1, 2, X, 3, 4], X = 10.
Test = [1, 2, 10, 3, 4],
X = 10.

The variable "X" is instantiated after it was put in the list.  Now, normally X = 10 would be a constant time operation, but in this case...it's a constant time operation!  With this simple example, one has a pointer to the middle of the list.  So why not put the variable X at the end?

This is exactly how one can accomplish the trick of adding to the end in O(1).  It does, however, require a more complicated queue definition.  Queue operations must deal with both the queue itself and the variable at the back of the queue.  Additionally, since the two are very closely related, it's better not to pass them around as two separate arguments; keeping them together in a single term is cleaner.  Fortunately, Prolog's pattern matching makes this trivial.  The power of pattern matching can be seen with the following example:
math( X-Y, Z ) :-
        Z is X - Y.
math( X+Y, Z ) :-
        Z is X + Y.
math( X*Y, Z ) :-
        Z is X * Y.
math( X/Y, Z ) :-
        Z is X / Y.
?- math( 2 + 5, X ).
X = 7.
?- math( 2 - 5, X ).
X = -3.
?- math( 2 * 5, X ).
X = 10.
?- math( 2 / 5, X ).
X = 0.4.

For this example, one essentially provides 3 parameters in one term: two numbers and an infix operator.  Pattern matching allows this to be specified in a far more natural way.

Using this same technique, we can define a queue of the form [x1, x2, x3, ..., xn|X]-X, where X is the logical variable serving as the tail at the end of the queue.  We can then define operations that manipulate queues of this form, like so:
add_queue2( Item, Queue-X, Queue-Y ) :-
        X = [Item|Y].
remove_queue2( [Item|Queue]-X, Item, Queue-X ).

It is assumed that X is the tail at the end of Queue in these predicates.  The whole "-X" part is just a way to keep the end of the queue in the same term, as described above.  When we add to the queue, we bind the old tail variable X to [Item|Y], which appends the item and leaves a fresh variable Y as the new tail.  This achieves constant time for both operations.

Note that an empty queue is represented as X-X, not [X]-X.  In the definition of add_queue2, starting from X-X, the list gets built directly as items are added.  If there is already a list in place, as in [X]-X, then the added elements end up nested inside that initial list.  To illustrate this, here is an example with the two different base cases:
% empty queue: X-X
?- add_queue2( 1, X-X, NewQ ), add_queue2( 2, NewQ, NewQ2), add_queue2( 3, NewQ2, NewQ3 ), remove_queue2( NewQ3, Item, NewQ4 ).
X = [1, 2, 3|_G1416],
NewQ = [1, 2, 3|_G1416]-[2, 3|_G1416],
NewQ2 = [1, 2, 3|_G1416]-[3|_G1416],
NewQ3 = [1, 2, 3|_G1416]-_G1416,
Item = 1,
NewQ4 = [2, 3|_G1416]-_G1416.

% empty queue: [X]-X
?- add_queue2( 1, [X]-X, NewQ ), add_queue2( 2, NewQ, NewQ2), add_queue2( 3, NewQ2, NewQ3 ), remove_queue2( NewQ3, Item, NewQ4 ).
X = [1, 2, 3|_G1422],
NewQ = [[1, 2, 3|_G1422]]-[2, 3|_G1422],
NewQ2 = [[1, 2, 3|_G1422]]-[3|_G1422],
NewQ3 = [[1, 2, 3|_G1422]]-_G1422,
Item = [1, 2, 3|_G1422],
NewQ4 = []-_G1422.

It *almost* works with [X]-X, and the other definitions could be tweaked so that it will work, but doing that skirts around the problem.  X-X is just cleaner.

There is, however, one problem.  In the original queue definition (the one with the O(n) enqueue), remove_queue simply failed on an empty queue.  In this definition, that doesn't happen:
?- remove_queue2( X-X, Item, Q ).
X = [Item|_G1221],
Q = _G1221-[Item|_G1221].

Not only did this call succeed, it ended up placing the variable Item within the new queue.  The easiest way to fix this is with an additional clause.  One might be tempted to add the following before the current definition of remove_queue2:
remove_queue2( X-X, _, _ ) :- !, fail.

Simple idea.  When we explicitly see the form of an empty queue, fail with no chance of backtracking.  At first, this appears to work:
?- remove_queue2( X-X, Item, Q ).
false.

However, a little extra testing reveals that something horribly wrong is occurring:
?- add_queue2( 1, X-X, Q ), remove_queue2( Q, Item, NewQ ).
false.

The new predicate is matching everything that takes the form of this queue, empty or not!  The reason for this is unfortunately quite low-level for my taste.  It has to do with how Prolog unifies terms.  The short answer is that, likely for performance reasons, a certain check (the occurs check) that would fix this is skipped by ordinary unification.  A more detailed explanation can be found in John Wickerson's explanation of difference lists at www.cl.cam.ac.uk/~jpw48/difflists.pdf (note that a great deal of this post is based on that document!)

The solution is to explicitly perform this check, like so:
remove_queue2( X-Y, _, _ ) :-
        unify_with_occurs_check( X, Y ), !, fail.

This clause detects the case where the front of the queue has caught up with the variable at the back (i.e., the queue is empty) and fails with no chance of backtracking.  As long as this clause is listed before the clause that actually removes items, this will work.  A quick demonstration:
?- add_queue2( 1, X-X, NewQ ), add_queue2( 2, NewQ, NewQ2), add_queue2( 3, NewQ2, NewQ3 ), remove_queue2( NewQ3, Item, NewQ4 ).
X = [1, 2, 3|_G1419],
NewQ = [1, 2, 3|_G1419]-[2, 3|_G1419],
NewQ2 = [1, 2, 3|_G1419]-[3|_G1419],
NewQ3 = [1, 2, 3|_G1419]-_G1419,
Item = 1,
NewQ4 = [2, 3|_G1419]-_G1419.
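
And the query from before that incorrectly failed on a one-element queue now succeeds; the output should look something like this:
?- add_queue2( 1, X-X, Q ), remove_queue2( Q, Item, NewQ ).
X = [1|_G1221],
Q = [1|_G1221]-_G1221,
Item = 1,
NewQ = _G1221-_G1221.

Note that NewQ comes back in the X-X form of an empty queue, which is exactly what we want.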

We have a queue with O(1) enqueue and dequeue.
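
For reference, here is the whole definition in one place (remember that an empty queue is just X-X):
% enqueue: replace the variable at the back with [Item|NewBack]
add_queue2( Item, Queue-X, Queue-Y ) :-
        X = [Item|Y].

% fail on an empty queue; this clause must come first
remove_queue2( X-Y, _, _ ) :-
        unify_with_occurs_check( X, Y ), !, fail.
% dequeue: take the item at the front
remove_queue2( [Item|Queue]-X, Item, Queue-X ).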

Wednesday, January 5, 2011

BFS in Prolog (AI)

I've mentioned elsewhere that I recently had to write a BFS in LISP.  Well, this is kinda a follow up to that.  Kinda.


I thought the problem from that assignment made for a nice toy problem, and I wanted to learn some real Prolog, so I put the two together.  It worked, but the result was a tad quirky, mostly in the way it stored the solution paths.

In an odd twist of fate, someone on Stack Overflow wanted to see an AI search like BFS implemented in Prolog.  I didn't want to post the solution to what originated as an assignment, so I changed the problem.  Like Knuth's conjecture, which is what the original solver solved, it has a given starting integer and a given goal integer.  However, the operations are simpler: increment, decrement, and multiply by +2 (no negative 2).  Additionally, I'm allowing for any starting integer, instead of positive four as in Knuth's conjecture.  The quirkiness is gone as well (which is why my bfs definition is slightly different from the original).

Without further ado, I post the code.  I'm sure that there are better ways to do certain things, especially using some of SWI-Prolog's higher-order predicates, but I think I learned more with the given definition.

% given a goal integer, it tries to determine the shortest
% series of actions needed to get to this integer given any other
% integer.  The actions allowed are increment, decrement, and
% multiply by two

% states are represented as two element lists
% the first is a number, and the second is a path

% gets the successors of the given state
% note that it must be redone via backtracking in order to
% get all of the successors
successors( [N,Path], [NewN, [Function|Path]] ) :-
        ( Function = increment, NewN is N + 1 ;
            Function = decrement, NewN is N - 1 ;
            Function = multiply, NewN is N * 2 ).

% gets all successors as a list
successors_list( State, Result ) :-
        findall( X, successors( State, X ), Result ).
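% for example, successors_list( [4,[4]], R ) should give
% R = [[5,[increment,4]], [3,[decrement,4]], [8,[multiply,4]]]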

% records results that have already been seen
:- dynamic seen/1.

% given a list of states, it will add each state to the table
% of states that have already been seen
add_to_seen( [] ).
add_to_seen( [[N|_]|Rest] ) :-
        assertz( seen( N ) ),
        add_to_seen( Rest ).

% removes all states that have already been seen
% returns a new list
remove_seen( [], [] ).
remove_seen( [[N|_]|Rest], Result ) :-
        seen( N ), !,
        remove_seen( Rest, Result ).
remove_seen( [State|Rest], [State|Result] ) :-
        !, remove_seen( Rest, Result ).

% performs a BFS, with the given goal and queue
bfs( Goal, [[Goal|[Path]]|_], FinalPath ) :-
        % note that operations are added from the front, and it's
        % more natural to read them left to right
        !, reverse( Path, FinalPath ).
bfs( Goal, [State|Rest], Result ) :-
        successors_list( State, Successors ),
        remove_seen( Successors, NewStates ),
        add_to_seen( NewStates ),
        append( Rest, NewStates, Queue ),
        bfs( Goal, Queue, Result ).

% runs the BFS for the given start integer and goal integer
% returns the path to reach the goal in "Path"
go( Start, Goal, Path ) :-
        retractall( seen( _ ) ),
        bfs( Goal, [[Start,[Start]]], Path ).



?- go( 4, 7, X ).
X = [4, multiply, decrement].


4 * 2 = 8; 8 - 1 = 7.  Cool.
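
One more run, just as a sanity check; the shortest route here happens to be unique, so the result should come out something like:
?- go( 2, 9, X ).
X = [2, multiply, multiply, increment].

2 * 2 = 4; 4 * 2 = 8; 8 + 1 = 9.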

Glow in the Dark Scorpions

Under UV light, scorpions are known to fluoresce. A few compounds have been identified in scorpions that are the root of this fluorescence, but it is not known why exactly the scorpions produce these compounds. Presumably, there is some sort of selective advantage to this fluorescence, or it is at least a byproduct of something else that confers a selective advantage.

Based on the work of Kloock et al. (link to paper), it appears that this fluorescence has something to do with detecting low-level UV light. Scorpions that underwent a treatment to remove the fluorescence seemed unable to detect the light. Treated individuals were observed to behave differently from untreated scorpions under the same low UV light conditions, although curiously enough the type of behavior wasn't consistent between scorpions. In other words, treated scorpions did different things than untreated ones, but those things also differed from one treated individual to the next (whew!).

Personally, I'm not 100% convinced of the results. The paper noted that the presented results were different from those of another paper, in which the amount of UV light was so high that it likely caused the scorpions to avoid it altogether. Additionally, although it is completely possible that treated individuals would behave differently from untreated ones, it seems odd to me that the behaviors also differed among the treated individuals themselves.

Most importantly, the treatment itself, which batters the scorpions with light for weeks at a time, could be to blame for the differences. The authors acknowledged this, and presented an experiment to show that the eyesight of the scorpions was unaffected. This experiment showed there were no differences between treated and untreated scorpions in response to normal light as opposed to UV light. However, it still seems possible to me that there was retinal damage, just damage specific to the detection of UV. If a different protein or set of proteins is specifically responsible for detecting UV in the eye as opposed to normal light, then this is feasible. One of these proteins could have been degraded by the treatment, which would produce the same kind of results seen by the authors.

If time and money were not an issue, I think that a gene-knockout approach would be superior to the method used by the authors. By genetically preventing the scorpions from ever fluorescing, instead of relying on a post-fluorescence treatment, it would be possible to make stronger conclusions about the results. However, this assumes that we understand at least part of the metabolic pathway involved in the production of the fluorescing compounds. Given that these compounds were all discovered within the last decade (according to the paper), I might be getting ahead of myself here. It's one thing to know that an organism produces a given compound, and it's a whole other thing to know exactly how that organism produces the compound.

Even with all my criticism, I do think that the authors are on to something. They have provided enough evidence for me to think that their hypothesis is correct, though I would like to see additional experiments to help bolster it.

Sunday, January 2, 2011

State

State is one of those funny things in programming. It is both wonderful and a curse, depending on the context. In my experience, it is more often a curse.

First, a quick definition. By state, I mean that some portion of a program changes over time, usually by assignment. A very simple example in C:

for( int x = 0; x < 10; x++ );

This has state, because the value of x changes as the loop executes.

The imperative paradigm views state as almost the be-all and end-all. If you've ever worked in assembly language, which is essentially purely imperative, you know what I'm talking about. Technically, there are no functions, only subroutines, since functions return something and machine languages typically have no way of "returning" anything. Instead, there is a programming convention that certain changes to state indicate a function returning something. In MIPS, the convention is to put return values in certain globally available registers, or on the stack.

Assuming the programmer knows EXACTLY what he or she is doing, this is great. This is typically the fastest way of doing something in terms of execution speed, and it can be efficient on memory as well. The problem is that most programmers don't completely know what they are doing. This isn't a bad thing, and I don't mean it offensively. In a good abstraction, the programmer shouldn't have to worry about everything. Do you really know and understand ALL of the software between a print statement and your monitor? I hope not. (If you do...well I tip my hat to you).

Things can get complicated fast in a purely imperative model. This is why there are all sorts of assembly conventions which attempt to limit the number of state changes one can make. One could probably write marginally faster code without these conventions, but it would take an enormous amount of time just to read the resulting nightmare. Humans can only really understand a few things at once, and a completely imperative model works against that.

At the opposite end of the spectrum of the imperative paradigm is the purely functional paradigm. In a purely functional paradigm, there is no state. Everything is represented as a series of function calls. What may be surprising to the typically imperative programmer is that it's possible to achieve quite a bit in this model. Programs tend to be shorter, and easier to understand. Debugging time tends to be shorter, as well. Individual functions can be tested independently. Without state, the same function is guaranteed to return the same outputs on the same inputs under all conditions, barring some catastrophic hardware failure.

There is an added bonus of removing state: parallelism. In parallel code, the areas that need locks typically modify state. These areas can introduce sequential dependencies. Taken together, these regions result in code that is harder to produce, less clear, more prone to deadlock, and slower. Without state, such regions don't exist. It's much easier to write parallel code. In fact, since purely functional programs can be broken down into a single expression with a multitude of subexpressions, it's possible to automatically achieve parallelization!

There is a downside of completely removing state, however. For one, all I/O is based on state. A program without I/O capabilities cannot take in any inputs or give any outputs, which is fairly worthless. Purely functional languages perform tricks to do I/O, which is technically impure by its very nature.

There is another downside: algorithmic complexity. Without state, certain problems cannot be represented efficiently. I've started to read Chris Okasaki's "Purely Functional Data Structures", which makes clear that even on a purely theoretical basis it's impossible for certain purely functional algorithms to be as efficient as equivalent imperative implementations. (Note that "certain" is an important word here; you can usually do as well.)

State just plain makes certain things easier to represent. Arguably, this is the main reason why many functional languages are impure, meaning they do allow for state.

Between these two extremes of purely imperative and purely functional programming are object oriented programming and logical programming. In the object oriented model, computation is represented in terms of objects. These objects have specific state and behavior. State is modified through well-defined channels which are typically integrated into the behavior of objects.

This is an interesting middle ground. It allows for state, but in a way that hides most of the complexity. In a well-written object-oriented program, changing the state in one object will not affect the state or behavior of a completely unrelated object. In the imperative model, there is no such enforcement. (Note that the aforementioned assembly conventions bear a resemblance to the object-oriented model, by allowing only certain side effects to occur. However, these are merely conventions; nothing in the code enforces that they are followed.)

Like object-oriented programming, logical programming has an interesting, partially restricted way of allowing state. The entire database, which holds both the program and the data, can be modified. Items can be added or removed, and the behavior of the program can change based on this. This is extremely powerful, but it can also be just as dangerous as in the imperative paradigm. Typically, programs change the database as little as possible, but this is more of a convention than anything else. It's possible to truly have a program write and execute itself with this model, in a way that is unparalleled in other paradigms.
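
As a tiny, hypothetical sketch of what this looks like in Prolog (the predicate names here are made up for illustration), a counter whose value lives in the database might be written as:
% the current count is stored as a fact in the database
:- dynamic count/1.
count( 0 ).

% bump the counter: retract the old fact and assert a new one
increment_count :-
        retract( count( N ) ),
        N1 is N + 1,
        assertz( count( N1 ) ).

Calling increment_count twice and then querying count( X ) gives X = 2. The "state" here is a change to the program's own database, not to any variable.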

A major downside of this is that changes are usually global. There is only one database, accessible to everything. I suspect that there are logical programming systems that allow more restrictions to be placed on modifications, but in ISO-standard Prolog, at least, the database is global.

So why did I just blab on about state? The code I'm looking at uses it...a lot. It's Java, but it's written in more of an imperative style. Oh boy...