I Want My Policy Included for Enforcement

Can anyone just appear on the scene and claim some sort of redaction authority over any ol' type of document or over any certain primary custodian's messages?

Answer: No. Doing so would raise the specter of denial of service attacks by some black hat-wearing prankster or enemy whose aim would be to include a policy that redacted all data all the time. Stakeholder Directories play a critical role here in allowing communities of stakeholders to govern community membership.

In addition to Stakeholder Directories, Rulesheet Repositories also have a role. On the Rulesheet Repository side, an author may be permitted to claim that a rulesheet is applicable to any chosen Problem Space and to any Primary Custodian. However, the legitimacy of claims will be defined and persisted in Stakeholder Directories.

For this discussion, it's important to remember that a document (a file of data) submitted for disclosure control to a CDCL Gatepoint is characterized by two pieces of metadata: the document's primary custodian and the document's "problem space", which is the type of document being exchanged between parties.

Problem Space versus Document Type
"Document type" can be an overloaded term. Therefore, within the discussion of CDCL, we try to avoid the term. Instead, documents utilize file types (structures) for information processing purposes, and documents represent problem spaces - or information exchange spaces - for business domain purposes. Any existing reference within CDCL documentation to "document type" shall mean "problem space" unless there is obvious context to the contrary.
A document has a single problem space, and this is synonymous with the business purpose or meaning of the document's existence, such as for the explicit exchange of information between parties. For example, a problem space can be "Wisconsin Incident Report IEPD". Often, the problem space can be represented by the document's schema, particularly for documents of XML file type.

For example, assume an author has been given the privilege to manage rulesheets for the fictitious Acme Accountants Inc., and assume the author created "rulesheet A". This author may claim that rulesheet A for stakeholder Acme Accountants Inc. governs disclosure over "Canadian Inter-Provincial Monetary Transfers" (this is the problem space) for both Quebec (a "primary custodian") and Ontario (a "primary custodian"). The rulesheet and this claim would be persisted in Rulesheet Repository 888. Yet, a Gatepoint, which conducts policy collation, evaluation, and enforcement, will reference Stakeholder Directories to determine legitimate claims of stakeholder interest for disclosure control. In our example, Gatepoint IceCube will reference Stakeholder Directory XYZ, and the response could be:

For the problem space of Canadian Inter-Provincial Monetary Transfers and for the primary custodian of Quebec, the stakeholders are a list that includes Acme Accountants Inc.
For the problem space of Canadian Inter-Provincial Monetary Transfers and for the primary custodian of Ontario, the stakeholders are a list that does not include Acme Accountants Inc.

So, Gatepoint IceCube shall collate rulesheet A into its rulesheet deck when the primary custodian is Quebec. But it won't when the primary custodian is Ontario, despite the claim the rulesheet author made.
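
The Quebec/Ontario example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Claim` record, the `collate` function, and the contents of `DIRECTORY_XYZ` are hypothetical and are not part of any CDCL specification. The directory is modeled as a plain mapping from (problem space, primary custodian) to the stakeholders it considers legitimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """A rulesheet author's claim, as persisted in a Rulesheet Repository."""
    rulesheet: str
    stakeholder: str
    problem_space: str
    primary_custodian: str


PS = "Canadian Inter-Provincial Monetary Transfers"

# Assumed contents of Stakeholder Directory XYZ (hypothetical):
# (problem space, primary custodian) -> stakeholders with a legitimate stake.
DIRECTORY_XYZ = {
    (PS, "Quebec"): {"Acme Accountants Inc."},
    (PS, "Ontario"): set(),
}


def collate(claims, directory, problem_space, primary_custodian):
    """Return the rulesheets whose claims the directory confirms."""
    legitimate = directory.get((problem_space, primary_custodian), set())
    return [
        c.rulesheet
        for c in claims
        if c.problem_space == problem_space
        and c.primary_custodian == primary_custodian
        and c.stakeholder in legitimate
    ]


# The author claims rulesheet A applies to both Quebec and Ontario.
claims = [
    Claim("rulesheet A", "Acme Accountants Inc.", PS, "Quebec"),
    Claim("rulesheet A", "Acme Accountants Inc.", PS, "Ontario"),
]

quebec_deck = collate(claims, DIRECTORY_XYZ, PS, "Quebec")
ontario_deck = collate(claims, DIRECTORY_XYZ, PS, "Ontario")
```

The author's claim is recorded for both custodians, but only the claim that the directory confirms makes it into a deck.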

In summary, there are three points at which legitimacy is determined when collating all stakeholders' rulesheets during a Gatepoint's disclosure control event. They are:

  1. The Gatepoint is configured by its owner/operator to know of one to many Stakeholder Directories. It considers these directories to be the authoritative sources for stakeholder community information. This dictates which Stakeholder Directories are trusted by the Gatepoint. Gatepoint operators and Stakeholder Directory operators, who represent community members, must coordinate with each other in order to have data exchanges rely on those Gatepoints that have been configured to reference appropriate Stakeholder Directories. Operators of endpoints (origination and destination) in data exchanges will need to understand that there is coordination between these Gatepoints and Directories. Operators of endpoints in data exchanges may very well be CDCL stakeholders themselves.
  2. As members of a stakeholder community, stakeholders work with Stakeholder Directory operators to govern additions and changes to directory information. This dictates which stakeholders have a legitimate stake in the disclosure control event at hand.
  3. Stakeholders grant authorization to specified rulesheet authors at each given Rulesheet Repository. With this, an author may act on the stakeholder's behalf for rulesheet management. This dictates which rulesheets are legitimate out of the entire global population of existing rulesheets.
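
The three points can be read as three successive checks. The following Python sketch is one possible way to chain them; the `StakeholderDirectory` class, its field names, the function `rulesheet_is_legitimate`, and the author URIs under `repo888.example` are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StakeholderDirectory:
    name: str
    # (problem space, primary custodian) -> stakeholders with a validated stake
    stakes: dict = field(default_factory=dict)
    # stakeholder -> author URIs authorized to act on its behalf
    authorized_authors: dict = field(default_factory=dict)


def rulesheet_is_legitimate(trusted_directories, directory, stakeholder,
                            author_uri, problem_space, primary_custodian):
    # Point 1: the Gatepoint consults only directories it is configured to trust.
    if directory.name not in trusted_directories:
        return False
    # Point 2: the directory must record the stakeholder's stake.
    key = (problem_space, primary_custodian)
    if stakeholder not in directory.stakes.get(key, set()):
        return False
    # Point 3: the stakeholder must have authorized this rulesheet author.
    return author_uri in directory.authorized_authors.get(stakeholder, set())


PS = "Canadian Inter-Provincial Monetary Transfers"
ALICE = "https://repo888.example/authors/alice"  # hypothetical author URI

xyz = StakeholderDirectory(
    name="XYZ",
    stakes={(PS, "Quebec"): {"Acme Accountants Inc."}},
    authorized_authors={"Acme Accountants Inc.": {ALICE}},
)
trusted = {"XYZ"}

ok = rulesheet_is_legitimate(
    trusted, xyz, "Acme Accountants Inc.", ALICE, PS, "Quebec")
```

A failure at any one point is enough to keep a rulesheet out of the deck, regardless of what the author claimed in the repository.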

Stakeholder Directories

The primary purpose of a Stakeholder Directory is to persist validated associations between parties claiming to have a stake in the disclosure of a given information exchange and to provide information pertaining to these associations to Gatepoints that query for it. A Stakeholder Directory is a system envisioned (not required) to behave like an X.500 or LDAP directory.

Validating Stakeholder Directory Entries
Creation of associations between a Primary Custodian and a claiming stakeholder or between a Problem Space and a claiming stakeholder is subject to business requirements that may differ from one stakeholder directory to another. This is internal to each directory, and no universal requirements exist other than that each directory administrator must publish for public view the rules by which primary custodians are registered, by which stakeholders are registered, and by which stakeholders' claims are deemed legitimate.

Conveying Authorizations of Rulesheet Authors
Each rulesheet author is identified by a URI that describes both the author and the Rulesheet Repository in which the author has account access for rulesheet management. The Stakeholder Directory sends authorization information via a message to Rulesheet Repositories, which are determined from the author URIs associated to each stakeholder. The association itself between the URI and the stakeholder represents the authorization. This authorization information sent to the Rulesheet Repositories shall be:

The message to a given Rulesheet Repository will contain only the author URIs pertinent to that repository. The Stakeholder Directory may be configured by its administrator to automatically send these messages on an interval, and when this interval is undefined, no messages are sent at all.
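
As a rough illustration of how a directory might split author URIs into one message per repository, consider the sketch below. It assumes, purely for the sake of example, that an author URI is an ordinary URL whose scheme and host identify the Rulesheet Repository; the actual URI layout and message format are not defined here. The function name, the stakeholder "Beta Bank", and the `example` hosts are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def messages_by_repository(stakeholder_authors):
    """Group author URIs so each repository receives only its own.

    stakeholder_authors: stakeholder -> iterable of author URIs.
    Returns: repository -> {stakeholder: sorted list of author URIs}.
    """
    messages = defaultdict(lambda: defaultdict(list))
    for stakeholder, uris in stakeholder_authors.items():
        for uri in uris:
            parts = urlsplit(uri)
            # Assumption: scheme + host identify the repository.
            repository = f"{parts.scheme}://{parts.netloc}"
            messages[repository][stakeholder].append(uri)
    return {
        repo: {s: sorted(u) for s, u in by_stakeholder.items()}
        for repo, by_stakeholder in messages.items()
    }


stakeholder_authors = {
    "Acme Accountants Inc.": [
        "https://repo888.example/authors/alice",
        "https://repo999.example/authors/bob",
    ],
    "Beta Bank": ["https://repo888.example/authors/carol"],
}

messages = messages_by_repository(stakeholder_authors)
```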

Responding to Gatepoint Queries

  1. "confirm URI", where the directory shall either honestly refute the URI's validity within the directory's scope or honestly respond with the stakeholder's attributes, subject to authorization
  2. "divulge stakeholders", given arguments of
    1. Primary Custodian (required)
    2. Problem Space (required)
    3. stakeholder pattern {ZZZ} (optional, but this is permitted to be required if an 'all patterns' choice is allowed)
For "divulge stakeholders", the directory shall return the comprehensive collection (which may be empty) containing the following data for each stakeholder matching pattern {ZZZ} where the stakeholders have been associated with the specified document Problem Space or associated with the specified Primary Custodian:

Consider the ordered collection of Rulesheet Repository URIs for a given stakeholder URI. Within this collection, each Rulesheet Repository URI contains a locator (URL) for a Rulesheet Repository and arguments of (entity's stake {Primary Custodian or Stakeholder}, Problem Space, entity URI, and Primary Custodian URI), where the entity URI is the value of this stakeholder URI whose collection is being considered.

The entity's stake shall not be misrepresented by the Stakeholder Directory; there shall be no more than one entity within the group of divulged stakeholders with a stake of Primary Custodian.

The Primary Custodian URI is the value of the URI for the entity that was acknowledged by the Stakeholder Directory to be the Primary Custodian, and this shall be dependent on the input argument provided by the Gatepoint (this input argument's value may differ from the Primary Custodian URI value determined by the directory). The Problem Space is echoed from the input argument provided by the Gatepoint.
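
A minimal Python sketch of a "divulge stakeholders" response follows. Everything named here is an assumption made for illustration: the `DirectoryEntry` record, the `aliases` map used to resolve the Gatepoint's input to the Primary Custodian URI acknowledged by the directory, the `urn:example:` URIs, and the choice of shell-style wildcards (Python's `fnmatch`) for the stakeholder pattern. The real pattern syntax and wire format are not specified in this text.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class DirectoryEntry:
    uri: str
    problem_spaces: frozenset       # problem spaces the entity has a stake in
    primary_custodians: frozenset   # custodian URIs the entity has a stake in
    repository_urls: tuple          # ordered Rulesheet Repository locators


def divulge_stakeholders(entries, aliases, primary_custodian,
                         problem_space, pattern="*"):
    # The directory decides which URI it acknowledges as Primary Custodian;
    # this may differ from the value the Gatepoint supplied.
    custodian_uri = aliases.get(primary_custodian, primary_custodian)
    divulged = []
    for entry in entries:
        if not fnmatchcase(entry.uri, pattern):
            continue
        if (problem_space not in entry.problem_spaces
                and custodian_uri not in entry.primary_custodians):
            continue
        # Only the acknowledged custodian can carry the Primary Custodian
        # stake, so at most one divulged entity has it (URIs being unique).
        stake = ("Primary Custodian" if entry.uri == custodian_uri
                 else "Stakeholder")
        repositories = [
            {"url": url, "stake": stake, "problem_space": problem_space,
             "entity_uri": entry.uri, "primary_custodian_uri": custodian_uri}
            for url in entry.repository_urls
        ]
        divulged.append({"stakeholder_uri": entry.uri,
                         "repositories": repositories})
    return divulged


PS = "Canadian Inter-Provincial Monetary Transfers"
QUEBEC = "urn:example:quebec"
entries = [
    DirectoryEntry(QUEBEC, frozenset({PS}), frozenset({QUEBEC}),
                   ("https://repo111.example",)),
    DirectoryEntry("urn:example:acme", frozenset({PS}), frozenset({QUEBEC}),
                   ("https://repo888.example",)),
    DirectoryEntry("urn:example:unrelated", frozenset(), frozenset(), ()),
]
aliases = {"Quebec": QUEBEC}

result = divulge_stakeholders(entries, aliases, "Quebec", PS)
```

Deriving the stake from the acknowledged custodian URI, rather than storing it per entry, is one way to guarantee the "no more than one Primary Custodian" constraint by construction.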

For more information, see this schematic of the landscape of the working parts of CDCL.

Composition of a Stakeholder Registry Entry

To reiterate: a Gatepoint is configured to be familiar with a finite quantity of stakeholder directories. Each stakeholder directory can be considered a community of stakeholders. Each community is responsible for managing itself and for verifying the legitimacy of individual member stakeholders' claims of stake in both problem spaces and primary custodians, even when the primary custodian is not a member of the community. The stakeholder community (via the management and administration of the stakeholder directory) is in control of the definitions of "interest" or "stake". This approach allows a directory to be visited only once during a redaction request.

Elaboration of Primary Custodian versus Stakeholder

Concerning Data Documents
There is one and only one primary custodian for any document. Multiple primary custodians may not exist within a document. If multiple parties wish to claim joint custodianship, then those parties ought to form a joint entity, and this one joint entity may become the sole primary custodian of the document.

When a document is a mixture or conglomeration (i.e. a mash-up) of information from disparate primary custodians, the sole primary custodian of the document explicitly assumes primary custodianship of, and responsibility for disclosure of, the information taken from the other parties.

Concerning Rulesheets
How do primary custodians differ from other stakeholders within CDCL? One and only one primary custodian rulesheet must exist within the rulesheet deck after its assembly in order for redaction to proceed. Zero to many other stakeholders' rulesheets may exist within the deck following assembly. For each stakeholder that is identified, there is no more than one stakeholder rulesheet allowed to exist within the rulesheet deck following assembly.
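
These deck constraints lend themselves to a simple check after assembly. The sketch below is illustrative only; the representation of a deck as a list of (stake, entity) pairs and the function name `deck_problems` are assumptions.

```python
from collections import Counter


def deck_problems(deck):
    """Check an assembled rulesheet deck against the constraints above.

    deck: list of (stake, entity) pairs, one pair per rulesheet, where
    stake is "Primary Custodian" or "Stakeholder".
    Returns a list of human-readable problems; empty means redaction
    may proceed.
    """
    problems = []
    custodian_count = sum(
        1 for stake, _ in deck if stake == "Primary Custodian")
    if custodian_count != 1:
        problems.append(
            "expected exactly 1 primary custodian rulesheet, "
            f"found {custodian_count}")
    per_stakeholder = Counter(
        entity for stake, entity in deck if stake == "Stakeholder")
    for entity, count in sorted(per_stakeholder.items()):
        if count > 1:
            problems.append(
                f"{entity} has {count} stakeholder rulesheets, "
                "at most 1 allowed")
    return problems


good_deck = [
    ("Primary Custodian", "Quebec"),
    ("Stakeholder", "Acme Accountants Inc."),
]
bad_deck = [
    ("Stakeholder", "Acme Accountants Inc."),
    ("Stakeholder", "Acme Accountants Inc."),
]
```

Here `good_deck` passes, while `bad_deck` fails twice: it has no primary custodian rulesheet and two rulesheets for the same stakeholder.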

One implication of this difference between Primary Custodians and Stakeholders is that documents do not contain other documents, from CDCL's standpoint.

Document Containment

As noted above, documents do not contain other documents. However, documents are not prohibited from being attached together, whereby each document remains independent from the others and maintains its own primary custodian. For example, "document A" and "document B" are attached together in "information exchange E" and comprise all of its data. The disclosure policy written by primary custodian A must exist in the rulesheet deck for redaction of document A, and likewise, the policy by primary custodian B must exist in the separate rulesheet deck for redaction of document B. This does not preclude the entity serving as primary custodian A from drafting a stakeholder disclosure policy, which might be included in the rulesheet deck for redaction of document B (and vice versa concerning entity B serving as stakeholder in the disclosure of document A).

In the hypothetical example above, how would a rulesheet author specify a dependence on the individual document and a dependence on the business information exchange in which the document is being passed? This can be viewed in two parts.

For the first part: within Stakeholder Directories and within the Rulesheet Repository, the author's rulesheet may be associated with a document's Problem Space (such as relevant to "A-like documents" or to "B-like documents"), subject to business validation by the stakeholder community. So, any rulesheet dependency on the individual document's Problem Space is handled via this association within the stakeholder directory every time the Gatepoint assembles the rulesheet deck.

For the second part: if the author has a further dependency on the fact that the document is being passed within the context of a specific information exchange, the rules are permitted to be defined with dependencies on the client context. The client context is all the information known by the Gatepoint's client, which is making this request for disclosure control. This client context is passed by the client to the Gatepoint in the transaction for disclosure control, and therefore, this information is available to be referenced from any rule. For example, this author's rulesheet may have a rule with a dependency on "information exchange E", which might be found in the client context.
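
A rule that depends on the client context might look something like the following Python sketch. The representation of the client context as a dictionary, the key `information_exchange`, and the outcome names `"redact"` and `"no opinion"` are all hypothetical; the text here does not define how rules or outcomes are actually expressed.

```python
def exchange_e_rule(client_context):
    """Hypothetical rule that only has an effect inside information exchange E.

    client_context: whatever the Gatepoint's client passed along with the
    disclosure control request, modeled here as a plain dictionary.
    """
    if client_context.get("information_exchange") == "information exchange E":
        return "redact"
    return "no opinion"


inside = exchange_e_rule({"information_exchange": "information exchange E"})
outside = exchange_e_rule({"information_exchange": "information exchange F"})
missing = exchange_e_rule({})
```

The rulesheet containing such a rule would still be selected by Problem Space and Primary Custodian; only the individual rule looks at the exchange.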

From CDCL's standpoint, the ramification of this is that both "outcome inheritance" and "inter-outcome dependencies" (e.g. prohibiting line-item redaction) do not cross document boundaries, as they would have to if CDCL respected the concept of document containment within other documents. In a containment model, an author would be permitted to write just a single rule governing disclosure of a document node that contains another document, and that contained document would inherit the applicable outcome. But since containment is not recognized and since there is no respect for relationships between documents, the author must consider drafting a separate policy for the other document. Remember that rulesheets may be associated with document Problem Space and/or with Primary Custodian at the rulesheet level. They shall not be associated with "information exchange" at this same level. Instead, as outlined above, dependencies on "information exchange" may be part of individual rule dependencies on client context.

Since inheritance and inter-outcome dependencies are only conveniences for defining elaborate outcome logic, allowing them to cross document boundaries would not, in itself, have a negative impact; in other words, containment does not necessarily violate respect for primary custodianship. However, a problem lies in the fact that policy may differ when an entity is acting as primary custodian versus acting as a stakeholder. In an environment that permits containment, there is no economical way (perhaps no reasonable way) to assemble the rulesheet deck such that entity A's primary custodian policy is enforced for only document A content and not for document B content. At the same time, there is no economical way (perhaps no reasonable way) to assemble the deck such that entity A's stakeholder policy is enforced for only document B content and not for document A content. Containment creates scenarios where entity A's primary custodian policy and its stakeholder policy, which may be contradictory or possess logical discrepancies when taken together, are enforced simultaneously on the same data.

It is for this reason that containment (i.e. more than one primary custodian appearing within a document) is an unrecognized concept within CDCL and that the concept of document attachments is favored. This does not prohibit the entities engaged in an information exchange from moving forward with their own containment concept, but doing so necessitates that all the data in the resulting document be considered under the auspices of a single primary custodian.