
      For the illustration of rules and queries, we use the Mathematics Genealogy data [MathGen 2000] that contains basic facts about people, dissertations, and advisor relationships between people and dissertations. We assume the following EDB predicates, which are slightly different from those in the introductory example. In particular, an advisor ID (AID) is connected to a dissertation ID (DID), and a dissertation has a candidate (CID) who wrote it.

Image

      The data set contains 198,962 people, 202,505 dissertations, and 211,107 facts of the advised predicate. In each of the systems, we answer the following five queries to illustrate its performance.

      1. Who are the grand-advisors of David Scott Warren?

      2. Which candidates got their degrees from the same university as their advisor?

      3. Which candidates worked in a different area than at least one of their advisors?

      4. Who are the academic ancestors of David Scott Warren?

      5. How many academic ancestors does David Scott Warren have?

      These queries make use of simple joins, recursive rules, and aggregations that facilitate assessing the performance of the systems in various aspects. In the next subsections, we introduce four systems and how these queries are expressed in each.
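      To make the join structure of Queries 1 to 3 concrete before turning to the individual systems, the following Python sketch evaluates them over a tiny in-memory data set. The schema (person, dissertation, advised), its reduced argument lists, and all names and IDs are assumptions made for illustration only; they are not the Mathematics Genealogy data or the exact EDB predicates of the text.

```python
# Toy EDB. Schema and contents are hypothetical, chosen only for illustration.
# person(PID, Name)
person = {1: "Ann", 2: "Bob", 3: "Cy", 4: "Dee"}
# dissertation(DID, CID, Univ, Area)
dissertation = {
    (10, 2, "MIT", "logic"),
    (11, 3, "MIT", "logic"),
    (12, 4, "CMU", "databases"),
}
# advised(AID, DID)
advised = {(1, 10), (2, 11), (3, 12)}


def adv():
    """adv(A, C): A advised the dissertation written by candidate C."""
    return {(a, c)
            for (a, d) in advised
            for (d2, c, _u, _ar) in dissertation
            if d == d2}


def grand_advisors(name):
    """Query 1: advisors of the advisors of the named person."""
    pairs = adv()
    targets = {p for p, n in person.items() if n == name}
    return {person[g]
            for (g, a) in pairs
            for (a2, c) in pairs
            if a == a2 and c in targets}


def same_university():
    """Query 2: candidates whose university matches that of an advisor."""
    univ = {(c, u) for (_d, c, u, _ar) in dissertation}
    return {person[c]
            for (a, c) in adv()
            for (c2, u) in univ
            if c2 == c and (a, u) in univ}


def different_area():
    """Query 3: candidates whose area differs from at least one advisor's."""
    area = {(c, ar) for (_d, c, _u, ar) in dissertation}
    return {person[c]
            for (a, c) in adv()
            for (c2, ar) in area if c2 == c
            for (a2, ar2) in area if a2 == a and ar2 != ar}
```

      Each function is a plain nested-loop join; the systems discussed next differ mainly in how they order, index, and memoize such joins.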

       1.8.2 XSB

      XSB [Sagonas et al. 1994] is a top-down implementation of Prolog extended with tabled resolution as described in Section 1.5.2.1. XSB allows fine-grained control over which tabling strategy and indexes to use for each predicate, and the choices for tabling and indexing affect asymptotic behavior. Consider Query 1, which is written in XSB as follows:

Image

      Here the rule defining enumallq1/0 uses a Prolog idiom of a fail-loop, which has the effect of generating all results in the most efficient way.

      By default, XSB indexes each predicate on its first argument. However, that may not be the most effective choice. For example, in the rule defining q1, after the first subgoal binds D, top-down evaluation will need to obtain values for the first argument of adv given a binding for the second argument. The following directive creates an index on the second argument:

Image
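      The effect of indexing on a chosen argument can be sketched independently of XSB. The Python fragment below is only an analogy, with hypothetical facts and helper names: it builds a hash index on one argument position so that a lookup with that argument bound no longer scans every fact.

```python
from collections import defaultdict

# Hypothetical adv(Advisor, Candidate) facts, for illustration only.
adv_facts = [("ann", "bob"), ("bob", "cy"), ("ann", "dan"), ("cy", "dee")]


def build_index(facts, position):
    """Group facts by the value found at the given argument position."""
    index = defaultdict(list)
    for fact in facts:
        index[fact[position]].append(fact)
    return index


by_first = build_index(adv_facts, 0)    # analogue of first-argument indexing
by_second = build_index(adv_facts, 1)   # analogue of an index on argument 2


def advisors_of(candidate):
    """Second argument bound, first argument free: use the second index."""
    return [a for (a, _c) in by_second.get(candidate, [])]
```

      With only the first-argument index, answering advisors_of would require scanning all facts, which is the asymptotic difference the text refers to.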

      To illustrate tabling, consider Query 4:

Image

      Evaluating this query can generate repeated subgoals for the predicate anc. The following tabling directive avoids reevaluating such repeated subgoals and therefore avoids the infinite loop that an ordinary Prolog system would enter:

Image
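      The idea behind tabling, namely recording which subgoals have already been called so that each is expanded only once, can be illustrated with a much-simplified Python analogue. This is not XSB's tabled resolution; the data, which deliberately contains a cycle, and the function name are assumptions for illustration.

```python
# Hypothetical data: maps a person to that person's advisors.
# The entry for "ann" closes an artificial cycle on purpose.
advisors_of = {
    "dee": ["bob", "cy"],
    "bob": ["ann"],
    "cy": ["ann"],
    "ann": ["dee"],
}


def ancestors(start, advisors_of):
    """Collect all academic ancestors of start.

    called plays the role of the table: a subgoal (a person whose
    advisors are requested) is expanded at most once, so repeated
    subgoals cost nothing extra and cycles cannot cause non-termination.
    """
    called = set()
    answers = set()
    stack = [start]
    while stack:
        person = stack.pop()
        if person in called:
            continue            # repeated subgoal: reuse, do not re-expand
        called.add(person)
        for advisor in advisors_of.get(person, ()):
            answers.add(advisor)
            stack.append(advisor)
    return answers, called
```

      Here the subgoal for "ann" is reached through both "bob" and "cy" but expanded once, and the cycle back to "dee" does not cause a loop.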

      Rules for Queries 2 and 3 can be written in XSB as follows:

Image

      Query 5 can be implemented using the aggregate construct findall in XSB, which returns all answers to a query as a list via the last argument:

Image

      The same query can also be implemented using the table aggregation construct in XSB as follows:

Image

      The special tabling directive in the first line works as follows: whenever an answer acc for q4 is generated, XSB calls cnt(acc, _, NewAcc), removes acc from the table, and adds NewAcc. For the first answer found, the first argument in the call to cnt would be 0, as indicated by the tabling directive. So q5_2 returns the number of facts found for q4.
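      The two ways of counting described above can be contrasted in a short Python sketch. It is an analogy only: q4_answers is a hypothetical stand-in that yields distinct answers, and FoldTable is an invented helper that mimics a table holding a single accumulated value, starting from 0 and updated by cnt on every new answer.

```python
def q4_answers():
    """Hypothetical stand-in for the distinct answers of q4."""
    yield from ("ann", "bob", "cy")


def q5_findall():
    """Analogue of findall: materialize every answer, then count the list."""
    answers = list(q4_answers())
    return len(answers)


def cnt(acc, _answer):
    """Ignore the answer itself and increment the accumulator."""
    return acc + 1


class FoldTable:
    """A table that keeps one accumulated value instead of all answers."""

    def __init__(self, combine, initial):
        self.combine = combine
        self.value = initial

    def add_answer(self, answer):
        # Replace the stored accumulator with the combined value.
        self.value = self.combine(self.value, answer)


def q5_fold():
    """Analogue of table aggregation: fold answers as they are produced."""
    table = FoldTable(cnt, 0)
    for answer in q4_answers():
        table.add_answer(answer)
    return table.value
```

      Both functions return the same count; the fold variant never stores the full answer list, only the running total.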

       1.8.3 LogicBlox

      LogicBlox [Aref et al. 2015] is a commercial system that unifies the programming model for enterprise software development, combining transactions with analytics through a flavor of Datalog called LogiQL. LogiQL is a strongly typed, extended form of Datalog that allows coding of entire enterprise applications, including business logic, workflows, user interfaces, statistical modeling, and optimization tasks. LogicBlox evaluates LogiQL rules in a bottom-up fashion.
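      As a reminder of what bottom-up evaluation means for a recursive predicate, the following Python sketch computes the ancestor relation with a generic semi-naive fixpoint. It is a textbook-style illustration over hypothetical facts and makes no claim about how LogicBlox implements its evaluation internally.

```python
def semi_naive_anc(adv):
    """Compute anc bottom-up from adv(X, Y) facts.

    Rules assumed for this sketch:
        anc(X, Y) <- adv(X, Y).
        anc(X, Y) <- adv(X, Z), anc(Z, Y).
    Only facts derived in the previous round (delta) are joined again.
    """
    anc = set(adv)
    delta = set(adv)
    while delta:
        derived = {(x, y)
                   for (x, z) in adv
                   for (z2, y) in delta
                   if z == z2}
        delta = derived - anc   # keep only genuinely new facts
        anc |= delta
    return anc


# Hypothetical advisor chain: ann -> bob -> cy -> dee.
adv = {("ann", "bob"), ("bob", "cy"), ("cy", "dee")}
anc = semi_naive_anc(adv)
ancestors_of_dee = {x for (x, y) in anc if y == "dee"}
```

      In contrast to top-down evaluation, the whole anc relation is derived first and the query for one person is answered by selecting from it.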

      In terms of the language, in addition to pure Datalog, LogiQL has functional predicates that map keys to values, and various aggregation operators. In LogiQL, the arguments of each EDB predicate need to be typed. For our queries, these types can be specified as follows:

Image

      In the specification above, person is a functional predicate (shown by the bracket notation), mapping person IDs to names. Using these specifications, Query 1 can be answered with the following rules:

Image Image
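      A functional predicate such as person can be pictured as a mapping in which each key has at most one value. The Python class below is a rough sketch of that constraint; the class name is invented, and raising an error on a conflicting fact is an assumption of the sketch rather than a description of LogicBlox behavior.

```python
class FunctionalPredicate:
    """Maps each key to exactly one value, like person[id] = name."""

    def __init__(self):
        self._map = {}

    def assert_fact(self, key, value):
        # A second, different value for the same key violates functionality.
        if key in self._map and self._map[key] != value:
            raise ValueError(f"key {key!r} already maps to {self._map[key]!r}")
        self._map[key] = value

    def __getitem__(self, key):
        return self._map[key]


# Hypothetical person IDs and names.
person = FunctionalPredicate()
person.assert_fact(1, "Ann")
person.assert_fact(2, "Bob")
```

      Because the value is determined by the key, a rule can look up person[1] directly instead of joining on a two-argument relation.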

      Apart from the use of the functional predicates shown above, Queries 2–4 can be implemented similarly to XSB:

Image

      Query 5 can be specified using the aggregation construct in LogiQL as follows.

Image

       1.8.4 Datomic

      Datomic [Anderson et al. 2016] is a distributed NoSQL database that uses Datalog as its rule and query language. Since the system is closed-source and no academic paper has been published by the implementation team, one can only rely on the documentation for its details. Datalog evaluation in Datomic is performed bottom-up, and indexing directives can be specified.

      The syntax of a Datalog rule in Datomic is quite distinct from that used in the logic programming literature. It is akin to the RDF query language SPARQL [Prud’hommeaux et al. 2008]. Datomic operates on object-like predicates. For example, for our EDB predicate dissertation, rather than representing it as a predicate with arguments, each dissertation is considered an object with an identifier and arguments that can be accessed as fields. Therefore, all EDB predicates are represented by binary predicates written in infix form, and they use square brackets. For example, a dissertation’s candidate ID can be thought of as a predicate :dissertation/cid whose form in the body is [?d :dissertation/cid ?c], where variables are preceded by ?. In contrast, IDB predicates are written in prefix notation and use round parentheses, as seen in the rules below.

      A rule is written as a list whose first element is the conclusion of the rule and the rest of the elements in the list are the body. For example, a new predicate author whose arguments are the dissertation ID and the candidate ID can be defined via the following rule:

Image
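      The object-like view of the data can be mimicked in Python by storing facts as (entity, attribute, value) triples and matching body patterns such as [?d :dissertation/cid ?c] against them. The sketch below is not Datomic code: the matcher, the sample triples, and the attribute :dissertation/univ are assumptions for illustration; only :dissertation/cid comes from the text.

```python
# Hypothetical facts as (entity, attribute, value) triples.
datoms = [
    (10, ":dissertation/cid", 2),
    (10, ":dissertation/univ", "MIT"),
    (11, ":dissertation/cid", 3),
    (11, ":dissertation/univ", "CMU"),
]


def match(pattern, datom, binding):
    """Extend binding so that pattern equals datom, or return None."""
    extended = dict(binding)
    for term, value in zip(pattern, datom):
        if isinstance(term, str) and term.startswith("?"):
            # Variable: bind it, or check consistency with an earlier binding.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, datoms):
    """Join the body patterns left to right; return all variable bindings."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for datom in datoms:
                result = match(pattern, datom, binding)
                if result is not None:
                    extended.append(result)
        bindings = extended
    return bindings


# A rule as a list: head first, then the body patterns.
author_rule = [("author", "?d", "?c"), ("?d", ":dissertation/cid", "?c")]
head, body = author_rule[0], author_rule[1:]
author = {(b[head[1]], b[head[2]]) for b in query(body, datoms)}
```

      The list author_rule mirrors the head-then-body layout described above; the derived set author pairs each dissertation ID with its candidate ID.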

      Once a new predicate is defined, it can be used in the body of a rule to define more predicates, as shown in the definition of the advisor relationship below.

Image
