Concepts

During the CDCL design effort, it became clear, fairly early in the process, that the requirement for Boolean aggregation of atomic rule particles would cause a lot of grief for semitechnical users. (Face it: Boolean logic has caused confusion even among professional programmers. Not even the least logically challenged among programmers wouldn't not be reluctant to deny that there isn't doubt about likelihood of logical reversals of sense when one is not careful not to avoid non-confusing Boolean constructs.)

There are several features of formal Boolean expressions that are likely to cause a semitechnical user to feel lost, but the most forbidding is probably the parentheses that delimit logical blocks. Parentheses are visually confusing, dense, and, worst of all, they look like math. It's also kind of difficult to convey, to a nontechnical user, the topology of parenthetic notation:

"See, you have this concept of a matched pair of parentheses, which means one opening parenthesis, followed by some content, followed by the closing parenthesis. Okay?"

"Uh, sure. I guess."

"Now. That content inside? It can only contain other matched pairs, or no parentheses at all. If it contains any parenthesis without a matching partner, we're only looking at some kind of mismatched fragment, and we can't say that our two parentheses on the right and left ends are a matched pair, because they're not."

"Whatever you say."

"Yeah now this is important, because the content of every matched pair has to resolve to true or false. And you have to figure out which it is before you apply any adjacent operator."

"...
"...
"...Look, is this math? It looks like algebra."

"It's not algebra, it's boolean algebra. That's not the same --"

"AAAAAAAAAAHHHHHHH! It's math! It's math! Get it off me get it off me! Math!"

Kidding aside: Boolean Algebra, which is a terse, analytically rigorous form of notation convenient for mathematicians and logicians, simply doesn't play well amongst nontechnical users. We need to find an alternative notation form that can suit our needs.

So: What would be the characteristics of a "suitable notation form"?

Well, well. What do you know?

That bulleted list can be represented as Boolean Algebra: A (B (C + D)). Could that possibly work in reverse? That is, could a bulleted list express any Boolean Algebra expression, in general?

OK, since the title of the page (as well as the name of the notation system itself) really gives it away, there's little use trying to maintain dramatic tension. Booliette is the name of a system of notation, designed to be usable and understandable by nontechnical people, that is premised on the idea that

Given sufficient conventions, a bulleted list, with indented sublists, can be used to represent any Boolean expression.

In other words, appropriate use of bulleted lists with indented sublists satisfies A in the requirements list above. Proof of the soundness of that central idea awaits rigorous mathematical treatment. But, we're not going to let that slow us down.

Let's retry our earlier thought experiment, but with Booliette instead of parenthesized Boolean Algebra.

"In Booliette there is one simple rule: every bullet point comes up true or false."

"What does that mean, 'comes up true or false'?"

"It means that at every bullet point, you have some information; and a statement about that information; and you can tell if the statement is true or false."

"So... no matter what bullet point it is, you can say it's true or it's false."

"Right."

"Okay, got it. Every bullet is true or false."

"OK, now: there are two kinds of bullets. There's the kind with a sublist, and the kind without a sublist."

"Wait wait. How can I tell what's a sublist?"

"It's, um, indented."

"Oh. D'oh. I use those all the time. Sorry, go on."

"Now, bullets without a sublist have to be true or false entirely on their own. They don't need any information from any other bullet. Whatever they say, is whatever it is.

"But for the bullets with a sublist, it's the other way around: Whether they come up true or false depends on the bullets in their sublist. The bullet with a sublist can only state conditions about its sublist."

"Hey, you're losing me here...conditions about its sublist? Like what?"

"Like, 'all of my sublist's bullets must be true', or 'at least one of them must be true', or 'none of them can be true', or --"

"Ah, ok, I get it. Because if what it's saying about its sublist is true, then the bullet itself comes up true."

"Yes yes, right, that's it exactly!"

"And... this does what for me, exactly?

"Well, every Booliette expression has just one bullet at the top level: everything else is in a sublist, or a sub-sublist, or whatever. All of those bullets underneath represent logic statements - rules, if you want - and then it all feeds up to whether that top bullet comes up true or false. Even if the rules are complicated, you can always work it out."

"And that's all there is to it?"

"Well, yeah, basically. See, bullets without sublists are the equivalent of Boolean variables, and bullets with sublists are the equivalent of Boolean operators, and the sublists themselves are like parenthesized--"

"AAAAAAAAAAHHHHHHH! It's math! It's math! Get it off me get it off me! Math!"

These examples prove nothing, of course. They are only rhetorical demonstrations of the very real advantage in comprehensibility of Booliette over Boolean Algebra as an expression of complex Boolean propositions.


Booliette lexical analysis

character set

Booliette uses characters from the 7-bit ASCII set. Source code written or generated by any Booliette editor may contain only the following characters:

dec      hex      description
9        9        tab
10       A        linefeed
13       D        carriage return
32-126   20-7E    printing ASCII characters

line structure and indentation

Booliette uses indentation levels, rather than block delimiters, to demarcate sublists. In the spirit of trying not to reinvent any unnecessary wheels, Booliette follows the lexical conventions from Python (a similarly indentation-oriented language) with respect to defining lines and standardizing indentation treatment. In particular, the following topics from the Python specification should be considered definitive for Booliette:

2.1.1 Logical lines
2.1.2 Physical lines (except for the description of "embedding Python" which is inapplicable)
2.1.5 Explicit line joining
2.1.6 Implicit line joining (although we are not yet certain whether this applies to block delimiters () and {}; we haven't worked out comments yet; and the reference to a "triple-quoted line" is not applicable.)
2.1.8 Indentation (except for the references to "Formfeed characters", i.e. ASCII Decimal 12, which are prohibited in Booliette code)
2.1.9 Whitespace between tokens (again, except for the references to "Formfeed characters")

Further parts of the Python specification may also be adopted as part of the Booliette lexical definition. If so, they will be posted here.

Note: as of this writing, the current Python Release is 2.4.3; it seems unlikely that these links or their content would change in any substantive way, because they represent fundamental low-level characteristics of the language.

bullet structure

Every logical line in Booliette is a bullet. A bullet takes the general lexical form


	(indent-whitespace) * (whitespace) (proper content) 
	

Where indent-whitespace establishes level of indentation, the asterisk character literal * (ASCII decimal 42, hex 2A) represents the bullet itself, and proper content is all characters, from the first non-whitespace character after the bullet, up to the end of the line of code (i.e., the first unescaped end-of-line sequence).
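As a rough illustration of the lexical form above, the following Python sketch splits one logical line into its indent whitespace and its proper content. The names `Bullet` and `parse_bullet` are hypothetical. The sketch keeps the indent as a raw string rather than applying Python's tab-expansion rules, and it does not handle escaped end-of-line sequences.

```python
import re
from typing import NamedTuple, Optional

# indent-whitespace, the literal asterisk, at least one whitespace character,
# then proper content starting at the first non-whitespace character.
BULLET_RE = re.compile(r"^([ \t]*)\*[ \t]+(\S.*)$")


class Bullet(NamedTuple):
    indent: str   # raw indent whitespace, not yet normalized to a level
    content: str  # proper content, up to the end of the line


def parse_bullet(line: str) -> Optional[Bullet]:
    """Return the parts of a bullet line, or None if the line is not a bullet."""
    match = BULLET_RE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return Bullet(indent=match.group(1), content=match.group(2))


print(parse_bullet("    * today is-later-than '7/4/1776'\n"))
```

A line with no whitespace between the asterisk and the content is rejected here, on the assumption that the "(whitespace)" in the general form is mandatory.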

Note that the bullet representation is not a logical necessity, since a bullet and a line are the same thing in Booliette. It's included as a syntactic requirement because:


Booliette syntax and semantics

bullet components

Every bullet point has two semantic characteristics:

Booliette recognizes two kinds of bullet point:

common name: standalone
logical name: particle
description: The standalone is the kind of bullet without a sublist. It evaluates as true or false based on information stated within the bullet's proper content.
Boolean equivalent: a Boolean variable

common name: sublist
logical name: operator
description: A sublist is the kind of bullet that does have a sublist. Its proper content is a statement about the occurrence of truth or falsity concerning information based on its sublist bullets. It evaluates as true if that statement is determined to be a truthful expression.
Boolean equivalent: a Boolean operator

A standalone bullet's proper content can take one of three general forms:

  1. Static Assertion Form, which declares a constant Boolean assertion of TRUE or of FALSE
    		
    		* key word
    		
    	
  2. Proposition Form
    		
    		* left-hand value (whitespace) argument (whitespace) right-hand value
    		
    	
  3. Protocol Form
    		
    		* protocol [ protocol-specific part ] 
    		
    	

And a sublist bullet's proper content is a "sublist Boolean assertion", which is often equivalent to a Boolean operator. A sublist bullet precedes an indented list of bullets. It takes the general form

		
		* sublist boolean assertion
			* bullet
			* bullet
			* bullet
			* ...
		
	
standalone bullets in proposition form

Propositions are dynamically evaluated to either true or false or, under exceptional circumstances, may generate error(s). Multiple propositions or complex propositions may be organized by conjunctions (from a grammar model) that are, in fact, n-place Boolean logical operators (from a mathematical/logical model) through use of "sub-lists" and the various kinds of bullet points.

A proposition appears like natural language and consists of an argument relating two subjects, where one subject exists in the proposition's left-hand value and the other in the right-hand value.

A simple proposition might look like:

			
    * today is-later-than '7/4/1776'
    			
		
where "today" is the left-hand value, "7/4/1776" is the right-hand value, and "is-later-than" is the argument relating the two. This proposition, were it to be included in policy, would be dynamically evaluated to determine the applicability of the disclosure rule in which it appears.

A more complex example, illustrating the use of a sublist bullet together with proposition form standalone bullets (i.e. a proposition containing a Boolean operator -- a "conjunction"), might look like:
			
    * all-true
        * today is-later-than '7/4/1776'
        * today has-semantic weekday
			
		
This is to say "Today is later than 7/4/1776, AND today has semantic weekday (i.e. today is a weekday)."

A left-hand value or right-hand value may be a literal, quoted string, such as '7/4/1776', using matched pairs of either apostrophes (i.e. single quotation marks) or double quotation marks. Nothing within the matched pair of marks is prohibited as long as it is a part of the supported character set for literal strings. Alternatively, a left-hand value or right-hand value may be a dynamically evaluated reference.
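One way to picture how a proposition's proper content breaks into its three parts is the Python sketch below. It assumes the processor already knows the set of argument keywords (passed in as `arguments`), treats the first such keyword as the argument, and keeps quoted literals intact, quotation marks included. The function name and the tokenizing rule are illustrative assumptions, not part of Booliette.

```python
import re
from typing import List, Set, Tuple

# A token is a single-quoted literal, a double-quoted literal, or a run of
# non-whitespace characters (which also covers words such as any-user's).
TOKEN_RE = re.compile(r"""'[^']*'|"[^"]*"|\S+""")


def split_proposition(
    content: str, arguments: Set[str]
) -> Tuple[List[str], str, List[str]]:
    """Split proper content into (left-hand tokens, argument, right-hand tokens)."""
    tokens = TOKEN_RE.findall(content)
    for index, token in enumerate(tokens):
        if token in arguments:
            left, right = tokens[:index], tokens[index + 1:]
            if not left or not right:
                raise ValueError("argument needs a value on both sides")
            return left, token, right
    raise ValueError("no known argument in proposition")


print(split_proposition("today is-later-than '7/4/1776'", {"is-later-than"}))
```

Keeping the quotation marks on the literal lets a later stage tell a literal string apart from a dynamically evaluated reference.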

dynamic references in propositions

A dynamic reference takes the form (independent of sequence as long as whitespace is properly handled)

			
    (optional possessive determiner and whitespace) subject (optional whitespace and manipulator, repeated)
			
		

The reserved words for possessive determiners and for manipulators may be included in the dynamic reference, but whether they are permitted depends on the nature of the subject. For example, a possessive determiner may be included if the subject derives from user context, and a manipulator may be included if the subject is of a type that corresponds to the manipulator. If both a possessive determiner and one or more manipulators are present in either a left-hand value or right-hand value, then the possessive determiner acts first upon the subject, followed by the manipulators acting in any order upon the subject. For example, "any-user's years-of-service plus-4" would be evaluated by first considering any-user's years-of-service. Before the value is fed to the (unstated) argument of the proposition, the plus-4 manipulator is applied. So, once an individual's years-of-service has been found, four is added to it via the manipulator, and then the result is fed to the argument. This recurs until the argument and the possessive determiner are satisfied or until the inspection exhausts all possibilities. Next we discuss the expected commutative behavior of manipulators that are applied together.
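The evaluation order described above (possessive determiner first, then manipulators, then the argument) can be sketched in Python as follows. Everything here is an assumption made for illustration: users are plain dictionaries, only `plus-N` and `minus-N` manipulators on integers are understood, and the `at_least` comparison stands in for the unstated argument.

```python
import re
from typing import Callable, Dict, Iterable, List


def parse_manipulator(word: str) -> Callable[[int], int]:
    """Hypothetical: only plus-N and minus-N on integer values are supported."""
    match = re.fullmatch(r"(plus|minus)-(\d+)", word)
    if match is None:
        raise ValueError(f"unsupported manipulator: {word}")
    amount = int(match.group(2))
    sign = 1 if match.group(1) == "plus" else -1
    return lambda value: value + sign * amount


def any_user_satisfies(
    users: Iterable[Dict[str, int]],
    subject: str,
    manipulators: List[str],
    argument: Callable[[int, int], bool],
    right_value: int,
) -> bool:
    steps = [parse_manipulator(word) for word in manipulators]
    for user in users:              # the possessive determiner acts first
        value = user[subject]
        for step in steps:          # then the manipulators are applied
            value = step(value)
        if argument(value, right_value):
            return True             # any-user's is satisfied, so stop here
    return False                    # every possibility has been exhausted


users = [{"years-of-service": 3}, {"years-of-service": 8}]
at_least = lambda left, right: left >= right  # stand-in for the argument
print(any_user_satisfies(users, "years-of-service", ["plus-4"], at_least, 10))
```

With these sample users the first yields 7 and fails, the second yields 12 and succeeds, so the inspection stops at the second user.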

No more than one manipulator of a given, specific word is permitted to exist in either a left-hand value or right-hand value. Yet multiple, different manipulator words may be used, and if so, the entire set of manipulator words used must be commutative because there is no guarantee of the order of processing. For example, one should refrain from mixing multiplication manipulators with addition manipulators. Also, manipulators may be subject-specific. For example,
prohibited: identity-full-name plus-3-minutes
    (the subject and manipulator do not correspond in type)
prohibited: now plus-3-minutes plus-7-minutes
    (we seek to avoid unlimited occurrences of the same manipulator word)
permitted: now plus-12-minutes minus-2-minutes
    (OK, since two different manipulator words are used)
permitted: 3-days-ahead-from-now plus-30-minutes
    (should be obvious)
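The "no repeated manipulator word" rule can be checked mechanically. The Python sketch below assumes that two manipulators count as the same word when they differ only in their numeric part (so `plus-3-minutes` and `plus-7-minutes` collide); that reading is inferred from the examples above and is not stated outright. It does not attempt the type-correspondence or commutativity checks.

```python
import re
from collections import Counter
from typing import List


def manipulator_word(token: str) -> str:
    """Assumed notion of 'word': the token with each number replaced by N."""
    return re.sub(r"\d+", "N", token)


def repeated_manipulator_words(manipulators: List[str]) -> List[str]:
    """Return the manipulator words that occur more than once."""
    counts = Counter(manipulator_word(token) for token in manipulators)
    return sorted(word for word, count in counts.items() if count > 1)


print(repeated_manipulator_words(["plus-3-minutes", "plus-7-minutes"]))
print(repeated_manipulator_words(["plus-12-minutes", "minus-2-minutes"]))
```

An empty result means the reference passes this particular check.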

A subject should have a value, a semantic meaning for that value, and a value type. A subject value has a scalar value part and an optionally defined unit part. The scalar value part is a byte-stream, and the unit part is a string. When the unit part is defined, the subject as a whole is considered a "measure", such as fifteen dollars (US$15) or three kilometers (3 km). When the unit part is not defined, the subject as a whole is considered a byte-stream.
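A minimal Python sketch of that value model follows. The class and field names are hypothetical; the only things taken from the text are that the scalar part is a byte-stream, the unit part is an optional string, and a defined unit makes the subject a measure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SubjectValue:
    scalar: bytes               # scalar value part: an uninterpreted byte-stream
    unit: Optional[str] = None  # unit part: a string, when defined

    @property
    def is_measure(self) -> bool:
        """A subject with a defined unit is a measure; otherwise a byte-stream."""
        return self.unit is not None


fifteen_dollars = SubjectValue(scalar=b"15", unit="US$")
plain_bytes = SubjectValue(scalar=b"Level_4")
print(fifteen_dollars.is_measure, plain_bytes.is_measure)
```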

The proposition relies on its particular argument's implementation to interpret the proposition's subjects. For example, one argument may use and interpret a subject's semantics while a different argument may use and interpret a subject's measure.

Irrespective of whether the subject is a measure or a byte-stream, a subject may or may not be a container for other subjects. Because the subject's scalar value is atomic and is not interpreted as anything more than a character set encoded byte-stream, the values and scalars of subjects are never natively considered as multivalued (although a non-native, third party implemented call-out of IsRelatedTo may do whatever it wishes). Certainly, multiple instances of a subject may exist within a given context. A hypothetical example is multiple subjects of "authenticated-session-start", one for each of several authenticated users possibly found within the user context. In other words, a subject may contain multiple values as independent subjects but does not exhibit multiple values or multiple scalars itself. For example, a flag does not exhibit a color of three values (red, white, blue), but rather the flag ought to contain a collection of its own "color subjects": color with value (red), color with value (white), and color with value (blue). In order to handle sets of data, where one may treat multiple contained subjects as a single entity, such as treating a flag as though it exhibits color that is a set of three values (red, white, blue), refer to the result set bullet. In such a circumstance, this hypothetical flag's set of colors, were it to be dynamically referenced within a proposition bullet, would exhibit a value of [red, white, blue] (independent of order), which may then be compared with other sets. So, the result set bullet will often be preferable for data like this.

If the Gatepoint discovered in the present document a flag exhibiting a single color subject with value of the literal string "red, white, blue" (actually, in whatever order as presented in the document), then this single value is preserved and made available whenever it may be referenced within a proposition. It is not considered a set of three colors; it is a single color, in this case called "red, white, blue". The party that generated the document is responsible for ensuring data is properly formatted or characterized. A rulesheet author is responsible for ensuring that written policy is comprehensive enough to cover disclosure control under many circumstances, including prerogatives exercised by parties generating documents in selection of formatting or characterization schemes for their data.
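The difference between three contained color subjects and one subject whose value happens to read "red, white, blue" can be made concrete with a small Python sketch. Representing a subject as a (name, value) pair and the helper name `value_set` are assumptions made only for this illustration.

```python
from typing import FrozenSet, Iterable, Tuple

Subject = Tuple[str, str]  # hypothetical (name, value) pair


def value_set(subjects: Iterable[Subject], name: str) -> FrozenSet[str]:
    """Collect the values of every contained subject with the given name."""
    return frozenset(value for subject_name, value in subjects if subject_name == name)


flag = [("color", "red"), ("color", "white"), ("color", "blue")]
same_flag_reordered = [("color", "blue"), ("color", "red"), ("color", "white")]
single_value_flag = [("color", "red, white, blue")]

# Order of appearance does not matter once the values are treated as a set.
print(value_set(flag, "color") == value_set(same_flag_reordered, "color"))
# A single literal value stays a single value; it is never split on commas.
print(len(value_set(single_value_flag, "color")))
```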

To illustrate what has been covered thus far, here is an example of a subject from the user context:

    authentication-level-of-assurance

And here is an example of using a possessive determiner with that same subject:

    every-recipient-user's authentication-level-of-assurance

And now a full proposition with this subject:

			
    * every-recipient-user's authentication-level-of-assurance has-value "Level_4"
    			
		
Possessive determiners allow the proposition to specify how multiple instances of a subject within a given context shall be interpreted.

Another example of a subject:

    client-date-and-time

This same subject with a manipulator:

    client-date-and-time minus-5-minutes

And a full proposition:
			
    * client-date-and-time minus-5-minutes not-is-earlier-than now
    			
		
or alternatively as:
			
    * now not-is-later-than client-date-and-time minus-5-minutes
    			
		

standalone bullets in protocol form

These standalone bullets are assertions formulated in two parts:

We've got plenty of unresolved questions about the particular nature of the protocol-specific part. In particular: the distinction between opaque and semantically structured types of entities. (This may end up being resolved by something similar to the syntax defined in the URI Generic Syntax reference from the Network Working Group: there are several possible formulations for parts of URIs, of varying degrees of opacity to Booliette's Syntax.)

sublist bullets

A sublist bullet must be accompanied by an indented list of one or more bullets. The constraints on the quantity and kind of bullets in the sublist depend on the containing sublist bullet's implementation. A sublist bullet takes one form, and that is an assertion expression:

		
		* assertion key word
		
	
An example is the equivalent of the logical operator AND.
			
    * all-true
        * bullet
        * ...
			
		
Another example is the equivalent of the logical operator OR.
			
    * any-true
        * bullet
        * ...
			
		
For more examples, see the next section on result set assertions or see sublist boolean assertions.
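To show how a whole expression "feeds up" to its single top-level bullet, here is a Python sketch that builds a tree from indented bullet lines and evaluates `all-true` and `any-true`. It is a sketch under stated assumptions: indentation is compared by raw character count, standalone bullets are handed to a caller-supplied `leaf` function, and the sample leaves `yes` and `no` are placeholders rather than Booliette keywords.

```python
from typing import Callable, List, Optional, Tuple


class Node:
    def __init__(self, content: str) -> None:
        self.content = content
        self.children: List["Node"] = []


def build_tree(source: str) -> Node:
    """Turn indented '* ...' lines into a tree with one top-level bullet."""
    root: Optional[Node] = None
    stack: List[Tuple[int, Node]] = []  # (indent width, node) of open ancestors
    for line in source.splitlines():
        if not line.strip():
            continue  # skip blank lines
        body = line.lstrip(" \t")
        indent = len(line) - len(body)  # assumes spaces and tabs are not mixed
        if not body.startswith("*"):
            raise ValueError(f"not a bullet: {line!r}")
        node = Node(body[1:].strip())
        while stack and stack[-1][0] >= indent:
            stack.pop()  # close bullets indented at least as deeply
        if stack:
            stack[-1][1].children.append(node)
        elif root is None:
            root = node
        else:
            raise ValueError("more than one top-level bullet")
        stack.append((indent, node))
    if root is None:
        raise ValueError("empty expression")
    return root


def evaluate(node: Node, leaf: Callable[[str], bool]) -> bool:
    """Sublist bullets depend on their children; standalone bullets on `leaf`."""
    if not node.children:
        return leaf(node.content)
    results = [evaluate(child, leaf) for child in node.children]
    keyword = node.content.rstrip(":")
    if keyword == "all-true":
        return all(results)
    if keyword == "any-true":
        return any(results)
    raise ValueError(f"unknown sublist assertion: {keyword}")


source = """
* all-true
    * yes
    * any-true
        * no
        * yes
"""
print(evaluate(build_tree(source), leaf=lambda content: content == "yes"))
```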

result set assertions and the result set bullet

Result set assertions come in two groups: set builders and set operators.

Set builders come in two varieties: intensional and extensional. Extensional set builders are a topic reserved for the future, but it's currently thought they may take a form similar to that of a result set bullet (see historic note below). Herein, the discussion will concern exclusively the intensional set builder. Some intensional set builders can be expressed as a valid kind of proposition form stand-alone bullet.

Set operators come in two varieties: order-independent and order-dependent. Order-dependent set operators are a topic reserved for the future. Herein, the discussion will concern exclusively the order-independent set operators. Some set operators can be expressed as a valid kind of sublist bullet.

Intensional Set Builder
An intensional set builder is a way to create a set by means of satisfying membership criteria. This is opposed to an extensional set builder, which by definition creates a set by means of explicitly specifying each member (i.e. by listing the members). An intensional set builder kind of result set assertion bullet follows the proposition form of a stand-alone bullet. To illustrate:

			
			* left-hand value (whitespace) argument (whitespace) right-hand value
			
		
As is the case of any proposition form stand-alone bullet, the constraints on the left-hand and right-hand values are dependent on the implementation of the given argument in use. In the case of intensional set builders, the constraints are that one side's value be a literal string or a dynamic reference node and that the other side's value be only a dynamic reference node. The following example intensionally builds a set of all the nodes found within the present document where the nodes satisfy set membership by having type currency:
			
	* "currency" being-type-of-elements-within present-document
			
		
If the argument's implementation fails to satisfy the assertion (in our example, that there are elements within the present document of type "currency"), then the bullet returns false. In that case the argument's implementation has still created a set, albeit an empty one. Conversely, if a non-empty set is created, then the bullet returns true.
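The dual role of an intensional set builder (it yields a truth value and, as a side product, a set) might be sketched in Python like this. The `Element` record and the function name, which mirrors the argument keyword, are hypothetical, and a document is modelled as a plain list of elements.

```python
from typing import Iterable, NamedTuple, Set, Tuple


class Element(NamedTuple):
    type_name: str  # e.g. "currency"
    value: str


def being_type_of_elements_within(
    type_name: str, document: Iterable[Element]
) -> Tuple[bool, Set[Element]]:
    """Build the set of matching elements; the bullet is true if it is non-empty."""
    members = {element for element in document if element.type_name == type_name}
    return bool(members), members


present_document = [
    Element("currency", "US$15"),
    Element("date", "7/4/1776"),
    Element("currency", "US$20"),
]
result, members = being_type_of_elements_within("currency", present_document)
print(result, len(members))
```

Asking for a type that is absent from the document returns false together with an empty set, matching the behavior described above.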

Order-independent Set Operator
An order-independent set operator is one which cares not for any declared order in which its operands exist (e.g. the top-down appearance in authoring form of bullets in its sublist). There are three order-independent operator families:

Each family of order-independent set operator can be expressed in variants that declare what characteristic of the sets' members shall be operated upon. For example,
			
	* these-have-no-value-in-common
			
		
Another example,
			
	* these-have-no-type-in-common
			
		
And yet another,
			
	* these-have-a-meaning-in-common
			
		
And still another,
			
	* these-have-every-caption-in-common
			
		
And here is an example of the use of a set operator and two set builders:
			
	* these-have-no-value-in-common
		* check-kiter being-meaning-of-elements-within present-document
		* "fullName" being-caption-of-elements-within identity-attributes
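
For two operand sets, the "no value in common" and "a value in common" variants reduce to a set intersection. The Python sketch below assumes that, with more than two operands, "in common" means common to all of them; the text does not settle that point, and the function names simply mirror the keywords.

```python
from typing import List, Set


def common_values(operands: List[Set[str]]) -> Set[str]:
    """Values present in every operand set (order of operands is irrelevant)."""
    if not operands:
        return set()
    return set.intersection(*operands)


def these_have_no_value_in_common(operands: List[Set[str]]) -> bool:
    return not common_values(operands)


def these_have_a_value_in_common(operands: List[Set[str]]) -> bool:
    return bool(common_values(operands))


document_names = {"Pat Smith", "Lee Jones"}
identity_names = {"Lee Jones"}
print(these_have_no_value_in_common([document_names, identity_names]))
```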
			
		

historic notes

possible notational convenience: the "resultset" bullet

At this point, we're not sure whether this third, supplemental bullet type is a good idea or not. It may not be a logical necessity, because any proposition expressible by this kind of bullet is also expressible by the other kinds. However: expressing it those other ways is cumbersome and error-prone, so we're considering this as a user-friendliness measure. It's possible that references to a set, say when a data node exhibits a set as its value, may require this kind of bullet.

The basic concept of a resultset bullet is that it is like a specialized standalone bullet, with an assertion in its proper content. And it is also like a sublist bullet, because that assertion describes a condition applying to a nested list. However, this nested list is not a bullet list (i.e., not a "sublist" as such). It's a list of non-bullet data representations. Ordinarily, these data items would be compared to elements in the result of a query against a resource; hence the name "resultset" for this type of bullet.

Lexically, the resultset is represented one item per logical line. Each line is prefaced by indent whitespace and then (instead of an asterisk, as would be the case for a bullet), a single minus sign "-" (ASCII decimal 45, hex 2D), and another whitespace character. Everything from the first subsequent non-whitespace character to the first unescaped end-of-line sequence is the value of the item.

A resultset bullet would therefore take the general form

			
			* (generation of the resultset) (assertion about the resultset):
				- item
				- item
				- item
				- ...
			
		

boolean keywords and syntax

sublist boolean assertions
	
	* at-least-one-true:
		* bullet
		* bullet
		* ...
	
	
	* all-true:
		* bullet
		* bullet
		* ...
	
	
	* exactly-one-true:
		* bullet
		* bullet
		* ...
	
	
	* none-true:
		* bullet
		* bullet
		* ...
	
Boolean operator Booliette sublist assertion keywords

One design characteristic we are leaning towards is to provide multiple idiomatic expressions that carry the exact same logical significance. In other words Booliette is probably going to have a lot of redundant keywords.

The downside of this is that it diminishes uniformity of expression. We justify that cost by asserting that the readability of a Booliette expression is probably improved, because the intuitive connotations of the various formulations are likely to be more appropriate for a given line of thought.

Yeah, we're probably on shaky ground here... Anyway, the keywords listed are rough-draft lists of possible equivalent ways of writing the given sublist truth constraint.
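One way a processor could cope with redundant keywords is to collapse each synonym to a canonical form before evaluating. The Python sketch below does that for an abbreviated selection of the draft keywords in this section; the table name `GROUPS`, the choice of canonical spellings, and the function name are assumptions.

```python
from typing import Dict, List

# Abbreviated synonym groups; a real table would carry the full draft lists.
GROUPS: Dict[str, List[str]] = {
    "all-true": ["all-true", "all-are-true", "all-of-these-must-be-true"],
    "at-least-one-true": ["at-least-one-true", "any-true", "any-of-these-is-true"],
    "exactly-one-true": ["exactly-one-true", "exactly-one-is-true", "exactly-one-of-these"],
    "not-all-true": ["not-all-true"],
    "none-true": ["none-true", "none-of-these", "all-false", "all-must-be-false"],
}

CANONICAL: Dict[str, str] = {
    synonym: canonical
    for canonical, synonyms in GROUPS.items()
    for synonym in synonyms
}


def apply_assertion(keyword: str, results: List[bool]) -> bool:
    """Evaluate a sublist assertion keyword against its sublist's results."""
    canonical = CANONICAL[keyword.rstrip(":")]
    true_count = sum(results)
    if canonical == "all-true":
        return true_count == len(results)
    if canonical == "at-least-one-true":
        return true_count >= 1
    if canonical == "exactly-one-true":
        return true_count == 1
    if canonical == "not-all-true":
        return true_count < len(results)
    return true_count == 0  # none-true


print(apply_assertion("all-of-these-must-be-true:", [True, True]))
print(apply_assertion("all-false:", [False, True]))
```

Counting the true results keeps every operator n-place, so a sublist may hold any number of bullets.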

AND
    all-true:
    all-are-true:
    all-must-be-true:
    all-of-these-are-true:
    all-of-these-must-be-true:
OR
    at-least-one-true:
    at-least-one-is-true:
    at-least-one-must-be-true:
    at-least-one-of-these-is-true:
    at-least-one-of-these-must-be-true:
    any-true:
    any-of-these-is-true:
XOR
    exactly-one-true:
    exactly-one-is-true:
    exactly-one-of-these:
    exactly-one-must-be-true:
    exactly-one-of-these-is-true:
    exactly-one-of-these-must-be-true:
NOT
    not-all-true:    (!(A and B)) equivalent to (!A or !B)

    none-true:    (!(A or B))
    none-is-true:
    none-can-be-true:
    none-of-these:
    none-of-these-can-be-true:
    all-false:    (!A and !B) equivalent to (!(A or B))
    all-must-be-false:
    all-of-these-are-false:
    all-of-these-must-be-false:

    any-except-these:
    anything-except-these:

higher-cardinality operators (proposed)

Boolean logic is excellent for making statements about set membership, but it's not really usable for dealing with cardinalities other than 1 and 0. However, Booliette is intended to be useful for specifying rules. When dealing with rules, especially rules related to policy, there is often a "figure-of-merit" consideration that leads to specifying a number that represents some kind of satisfactory threshold. (One of the most familiar examples of this is n-factor authentication for users, where n >= 2.)

The possible logic problems inherent in adding these operators to Booliette are unknown, which is why this remains "proposed". Effectively, adding these makes Booliette into a superset of Boolean logic, and that may be too ambitious for the first iteration of the Booliette notation specification.

In this, as in other issues, we await rigorous mathematical treatment before finalizing any design decision.
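For what it is worth, the proposed threshold, boundary, and specific-count forms are straightforward to evaluate once the sublist results are counted. The Python sketch below parses the keyword with a regular expression and compares the count of true bullets; the function name is hypothetical, and the sketch says nothing about the open logical questions mentioned above.

```python
import re
from typing import List

CARDINALITY_RE = re.compile(
    r"^(at-least|at-most|exactly)-(\d+)-"
    r"(?:true|are-true|must-be-true|of-these-are-true|of-these-must-be-true):?$"
)


def apply_cardinality(keyword: str, results: List[bool]) -> bool:
    """Compare the number of true sublist bullets with the stated posint."""
    match = CARDINALITY_RE.match(keyword)
    if match is None:
        raise ValueError(f"not a cardinality keyword: {keyword}")
    kind, number = match.group(1), int(match.group(2))
    if number < 1:
        raise ValueError("posint must be at least 1")
    true_count = sum(results)
    if kind == "at-least":
        return true_count >= number
    if kind == "at-most":
        return true_count <= number
    return true_count == number  # exactly


# Two-factor authentication: at least two of three factors must check out.
print(apply_cardinality("at-least-2-of-these-must-be-true:", [True, False, True]))
```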

A threshold specification, in some sense comparable to extending the Boolean OR operation to cardinalities > 1:
    at-least-(posint)-true:
    at-least-(posint)-are-true:
    at-least-(posint)-must-be-true:
    at-least-(posint)-of-these-are-true:
    at-least-(posint)-of-these-must-be-true:
A boundary specification, in some sense comparable to extending the Boolean XOR operation to cardinalities > 1:
    at-most-(posint)-true:
    at-most-(posint)-are-true:
    at-most-(posint)-must-be-true:
    at-most-(posint)-of-these-are-true:
    at-most-(posint)-of-these-must-be-true:
A specific count specification, a convenient shorthand keyword representing the compound use of both a threshold specification and a boundary specification with each sharing identical cardinalities, resulting in a range of 1:
    exactly-(posint)-true:
    exactly-(posint)-are-true:
    exactly-(posint)-must-be-true:
    exactly-(posint)-of-these-are-true:
    exactly-(posint)-of-these-must-be-true:

resultset bullet operators (proposed)

If we decide in favor of using resultset bullets in this version of Booliette, we would be similarly permitting multiple synonymous keywords.

AND
    finds-all:
    finds-all-of:
    contains-all:
    contains-all-of:
    query-finds-all:
    query-finds-all-of:
    resultset-contains-all:
    resultset-contains-all-of:
    all-items-present:
OR
    finds-at-least-one-of:
    contains-at-least-one-of:
    query-finds-at-least-one-of:
    resultset-contains-at-least-one-of:
    at-least-one-item-present:
XOR
    finds-exactly-one-of:
    contains-exactly-one-of:
    query-finds-exactly-one-of:
    resultset-contains-exactly-one-of:
    exactly-one-item-present:
NOT
    finds-none-of:
    contains-none-of:
    query-finds-none-of:
    resultset-contains-none-of:
    none-of-these-items-present:

Note that if higher-cardinality expressions do make it into the Booliette specification, we would also define such syntax for resultset bullets.
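If resultset bullets are adopted, the four operator groups could be read as comparisons between the listed items and the result of the query. The Python sketch below is one possible reading, not a settled definition: it treats both sides as sets of strings and counts how many listed items appear in the query result.

```python
from typing import Iterable


def found_count(query_result: Iterable[str], items: Iterable[str]) -> int:
    """Number of listed items that appear in the query result."""
    return len(set(query_result) & set(items))


def finds_all_of(query_result: Iterable[str], items: Iterable[str]) -> bool:
    return set(items) <= set(query_result)


def finds_at_least_one_of(query_result: Iterable[str], items: Iterable[str]) -> bool:
    return found_count(query_result, items) >= 1


def finds_exactly_one_of(query_result: Iterable[str], items: Iterable[str]) -> bool:
    return found_count(query_result, items) == 1


def finds_none_of(query_result: Iterable[str], items: Iterable[str]) -> bool:
    return found_count(query_result, items) == 0


query_result = ["red", "white", "blue"]
print(finds_all_of(query_result, ["red", "blue"]))
print(finds_exactly_one_of(query_result, ["red", "green"]))
```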

aliasing

One of the goals of Booliette notation is to provide easily readable and understandable statements of logical requirements. One impediment to achieving this is the permissibility of including visually awkward content within the body of standalone or resultset bullets. For example:

It's very much in our interest to create a mechanism to move these elsewhere, so that the structure of bullets is clearly understood. The most straightforward mechanism is aliasing.

An alias is a statement that a short, convenient, visually unobtrusive identifier is to be substituted for something less tractable. An alias is not a "variable" assignment in any sense. (Note, though, that programmers working in languages that do permit variable assignment sometimes use the alias mechanism to achieve some of the same advantages. Within Booliette, the proposition form of bullet points includes the concept of a variable known as a subject pronoun; the keywords "that-item", "whilst", and "ibid" are synonyms for it. But back to the topic at hand.)

An alias declaration takes the general form

	
	alias:(alias identifier){(referent)}
	

in which
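
The declaration form above can be sketched in Python as a parse step plus a substitution step. Several assumptions are made: an alias identifier contains no whitespace or braces, the referent is everything between the first "{" and the last "}", and substitution replaces whole whitespace-delimited tokens (so it would also touch a matching token inside a quoted literal). The sample referent is an invented placeholder.

```python
import re
from typing import Dict, Tuple

ALIAS_RE = re.compile(r"^alias:([^\s{}]+)\{(.*)\}$")


def parse_alias(line: str) -> Tuple[str, str]:
    """Return (alias identifier, referent) from one alias declaration."""
    match = ALIAS_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not an alias declaration: {line!r}")
    return match.group(1), match.group(2)


def expand_aliases(content: str, aliases: Dict[str, str]) -> str:
    """Substitute referents for alias identifiers, leaving whitespace untouched."""
    return re.sub(r"\S+", lambda m: aliases.get(m.group(0), m.group(0)), content)


identifier, referent = parse_alias("alias:staff-directory{ldap://directory.example/ou=people}")
aliases = {identifier: referent}
print(expand_aliases("staff-directory finds-all-of:", aliases))
```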


evaluation of Booliette statements


quasi-logical states: void and null

In order to deal with certain authoring and processing situations, Booliette defines two supplemental, or quasi-logical, states a bullet may assume, besides the pure logical states of true and false. These states are void and null.

void

A bullet can be assigned a value of void. This is a comparatively simple state: it simply means "this bullet is ignored when evaluating the Booliette expression of which it is a part".

Examples of void bullets:

null

null is a more difficult (and possibly logically flawed - we await rigorous mathematical treatment on this) state. Conceptually, it is the situation that occurs during processing, when the order of logical evaluation of bullets prescribed for Booliette requires evaluation of a bullet that cannot, for whatever reason, be resolved to true or false at the moment.

In such a case, the bullet may be assigned a logical value of null, and processing may continue, if possible, in the evaluation of other bullets. There are two immediately obvious cases in which this would be useful:

null may not prove to be rigorously usable, in which case it would need to be discarded. However, superficial analysis does yield a plausible truth table of Boolean operations defined for null values. It's not overwhelmingly useful, but it does appear to be logically consistent:

expression containing null    resolves to
null AND true                 null
null AND false                false (this is a logically demonstrable case in which the value of the null variable can make no difference to the value of the overall expression)
null OR true                  true (this is a logically demonstrable case in which the value of the null variable can make no difference to the value of the overall expression)
null OR false                 null
null XOR true                 null
null XOR false                null
NOT null                      null
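
The table can be checked mechanically. The Python sketch below uses `None` to stand for null and implements the four operations so that they reproduce every row; the function names are hypothetical. The resulting behavior coincides with what is usually called Kleene's strong three-valued logic, which may be a useful starting point for the rigorous treatment still awaited.

```python
from typing import Optional

Tri = Optional[bool]  # True, False, or None standing for Booliette's null


def tri_and(a: Tri, b: Tri) -> Tri:
    if a is False or b is False:
        return False  # a false operand decides the result regardless of null
    if a is None or b is None:
        return None
    return True


def tri_or(a: Tri, b: Tri) -> Tri:
    if a is True or b is True:
        return True  # a true operand decides the result regardless of null
    if a is None or b is None:
        return None
    return False


def tri_xor(a: Tri, b: Tri) -> Tri:
    if a is None or b is None:
        return None  # XOR always depends on both operands
    return a != b


def tri_not(a: Tri) -> Tri:
    return None if a is None else not a


for name, op in (("AND", tri_and), ("OR", tri_or), ("XOR", tri_xor)):
    for known in (True, False):
        print("null", name, known, "->", op(None, known))
print("NOT null ->", tri_not(None))
```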