A Parallel Prolog System for Distributed Memory.
Lourdes Araujo, José J. Ruz.
Journal of Logic Programming (Elsevier Science), 33(1), (1997), pp. 49-79.

This paper presents a parallel execution system (PDP: Prolog Distributed
Processor) that efficiently supports both Independent_AND and OR
parallelism on distributed memory multiprocessors. The system is composed
of a set of workers with a hierarchically structured scheduler. Each worker
operates on its own private memory, and interprocessor communication is
performed solely by message passing. The execution model follows a
multisequential approach in order to maintain the sequential optimizations.
Independent AND_parallelism is exploited following a fork-join
approach and OR_parallelism is exploited following a recomputation
approach. PDP deals with OR_under_AND parallelism by producing the
solutions of a set of parallel goals in a distributed way, that is, by
creating a new task for each element of the cross product. This approach
has the advantage of avoiding both the storage of partial solutions and
the synchronization of workers, which substantially improves performance.
Different scheduling policies have been studied, and granularity controls
have been introduced for each kind of parallelism. PDP has been implemented
on a network of transputers, and performance results show that PDP adds
very little overhead to sequential programs and provides a high speedup
for coarse-grain parallel programs.
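
The cross-product idea described for OR_under_AND parallelism can be
illustrated with a minimal Python sketch. It is not PDP's implementation:
the goals p(X) and q(Y), their alternatives, and the helper names
make_tasks and run_task are all hypothetical. The sketch only shows that
creating one task per element of the cross product lets each task
recompute its own combination, so no partial solutions are stored and no
join between workers is needed.

```python
from itertools import product

# Hypothetical stand-ins for two independent goals, p(X) and q(Y).
# Each callable represents one alternative (clause) that yields bindings.
P_ALTERNATIVES = [lambda: {"X": 1}, lambda: {"X": 2}]
Q_ALTERNATIVES = [lambda: {"Y": "a"}, lambda: {"Y": "b"}, lambda: {"Y": "c"}]


def make_tasks(*goal_alternatives):
    """Create one task per element of the cross product of alternatives."""
    return list(product(*goal_alternatives))


def run_task(task):
    """Recompute every goal of this combination inside a single task.

    Nothing is shared with other tasks, so no partial solution store
    and no synchronization are involved.
    """
    bindings = {}
    for alternative in task:
        bindings.update(alternative())
    return bindings


tasks = make_tasks(P_ALTERNATIVES, Q_ALTERNATIVES)
# In a distributed setting each task could be sent to a different worker;
# here they simply run one after another.
solutions = [run_task(task) for task in tasks]
```

With two alternatives for p and three for q, the sketch creates six
independent tasks, one per combined solution.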