RUSTPrivacy.org


 

 

 

 

What if some smart people in New Jersey figured out that under the right circumstances encryption was subject to an asymmetric assault like the one envisioned in TakingWorkHome? We will speak no more of this and we don't like to gloat, but unlike encryption, RUST is still a sure thing. Redact Unless Static Text, or RUST, is the normal default behavior of many graphical word processors when printing or reformatting files containing references in complex structures. An embedded structure is redacted in the sense that only a plain text short form or a text icon survives the printing or reformatting.

 

 

DANGER! DANGER! Data Loss Alert!

 

This is the sort of thing which drives Computer Scientists to distraction, but how dangerous is it, really? What if the author never meant for the data to be seen in document copies? What if the author never meant for the document to be copied? What if the data was too important to depend upon others respecting copyrights? What if the data was Personally Identifiable Information, where non-propagation is the whole point? When we use RUST Privacy, we have control over both what we write and how that will be repeated and categorized; there is no loss of data. Privacy follows Freedom of Speech, as it must, otherwise it is just an invitation not to speak.

 

RUST Privacy is a viable alternative to the abandonment of mobile devices in the Public Sector, and a good portion of the Private Sector, as current research suggests. Google "Crackberry" for details on how well that might work out. Yet, Computer Science never met an acronym it didn't like and WYSIWYG, "What You See Is What You Get", is no exception. When acronyms become paradigms and paradigms become dogma, it's high time the anarchist rabble asked what may hide behind the dogma. While we are positive that being anarchist rabble is the sign of a messy desk, we see no reason to be messy about the details. That gives Freedom of Speech a bad name. As long as you redact nicely, there should be far fewer events requiring you to defend your rights. The Courts are correct, no one grants you a "right" to privacy. With RUST, privacy is a freedom to be embraced.

 

RUST Privacy is the patch which fills in a gap which has existed too long between legacy paper display and electronic display. By any estimate Gutenberg had a pretty good run and it is far from over yet, but pretending that the future holds transparent interoperability for capabilities paper never had is just a little bit more marketing hyperbole than most of us are willing to swallow. The danger, all too real, is that marketing has better funding than privacy. Computer Security Professionals tell the public on a regular basis that writing personal information down is dangerous while Advertisers covet that information so they can ... write it down. Rather than debate who is right and wrong, one simple goal of RUST Privacy should be to show how "private" information can co-exist with "public" information in one source document. This simple tool changes the situation in a fundamental way, making both security and commercial imperatives confluent.

 

Stripped of the big words, basic RUST behavior comes from a simple observation: links change, reference anchors don't. It may be that once people discover how easy it is to protect information from propagation, breaches will become intolerable and be seen as negligence. We can certainly hope so.

 

Anchor*Buoy*Boat

 

As in the drawing above, an anchor can be tied to a named object (a buoy), or not. That says nothing about what is tied to the buoy. As it turns out, this alone is sufficient to redact information: the reference anchor or anchors are named after the particular buoy to which they are attached. This is not a "buoy" ontology; that would mean distinguishing deep water buoys, big ones, small ones, assigned ones and so forth. Every name in the class (buoy) bears the same hierarchical relation to every possible reference anchor.

 

Yes, there are sixteen named buoys, and three types of linkage in a full implementation. "Simple" refers to the behavior of the computer, passing through plain text and renaming a reference anchor after a buoy (class name). For certain file formats (see figure below), the behavior is automatic - a round trip redacts the anchor to a text icon with the class name. The automatic behavior can also be seen as a "print preview" where the reference is also replaced by the "buoy" name in a display.

 

As static text: 1 Main St. -> 1 Main St.

As a reference: [my home sweet home at 1 Main St.] -> [location]
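
The two cases above can be sketched in a few lines of Python. This is only an illustration of the behavior described, not the word processor's actual mechanism: the Static and Reference types, the function names, and the sample sentence are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Static:
    """Plain text typed directly into the document."""
    text: str


@dataclass
class Reference:
    """An embedded reference whose anchor is named after a buoy (class name)."""
    buoy: str      # e.g. "location"
    content: str   # the specific content seen while editing


Segment = Union[Static, Reference]


def edit_view(segments: List[Segment]) -> str:
    """What the author sees: references show their specific content."""
    return "".join(
        s.text if isinstance(s, Static) else f"[{s.content}]" for s in segments
    )


def printed_view(segments: List[Segment]) -> str:
    """What survives printing or reformatting: static text passes through,
    each reference collapses to a text icon carrying only the buoy name."""
    return "".join(
        s.text if isinstance(s, Static) else f"[{s.buoy}]" for s in segments
    )


# Hypothetical sample document mixing both kinds of text.
doc = [
    Static("Deliver to "),
    Reference("location", "my home sweet home at 1 Main St."),
    Static(". Office: 1 Main St."),
]

print(edit_view(doc))
print(printed_view(doc))
```

The static "1 Main St." is carried through untouched, while the reference leaves only [location] behind.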

 

Document Examples (Zip File)

 

With some caveats, all rich text word processors make this same mistake in the same way. The reasons are fundamental: until the "paperless office" includes, somehow, the dis-invention of paper, life will be good for users of RUST Privacy. But if one breaks the automatic behavior, one breaks the system. This should not be allowed to happen. Fortunately, the Advanced Methods allow plenty of opportunity for experimentation.

 

On 14 January 2008, the Dublin Core Metadata Initiative announced a major overhaul of their syntax. While the basic RUST behavior and syntax are not affected, the Advanced Methods rely upon the DCMI definitions and Abstract Models. Considerable rewriting of the basic documents, including the RDF Schema of the PII Namespace, was necessary. The revised namespace and PURLs should now be used.

 

In late January 2008, Sun Microsystems announced their intention to donate a new Export XSLT (from OpenDocumentFormat to XHTML 1.0) to OpenOffice.Org. RUSTPrivacy.Org will in turn donate the modifications to use Dublin Core syntax in the XHTML output and a GRDDL type transform to write an RDF Description Set based on the meta data contained in the header of the XHTML. The purpose is really quite simple - to make 1) an Archive 2) a pleasing content presentation and 3) a modern meta data presentation from an open source document format - a document "self-portrait" of sorts. The modified and supporting files are contained in a zip file. Knowing the names of the giants (Standards Organizations) on whose shoulders we stand, RUSTPrivacy.Org is happy to give meta data its due respect.

 

 

The document you are reading right now is XHTML (the yellow box above) and of course has an RDF Description Set (the light blue box above). In fact, the document which defines the namespace (pii-history.html) has a transformation to RDF as well. Just be careful you don't confuse this instance with the RDFS (schema).

 

Taken together, these protocols should make it much easier to control the flow of meta data, "data about data", between, for example, databases at different levels of government. However, this donation does not apply to RUST or the use of the PII namespace to fulfill organizational obligations either in the Private or Public Sector. We feel that information which belongs to individuals should not be brokered by third parties without cost; it is then and only then that individual privacy has value. RUSTPrivacy.Org reserves the right to be a civil thorn in the foot of Big Brother and all his commercial relatives, borrowing a page from that playbook, "because we can".

 

Redact Unless Static Text (RUST) Syntax

Linking to the namespace. (REL="Help")

A file called pii-history.html defines the tagspace, styled as HTML for human reading. The format of the (unique) IDs is (tag name)-XXX, beginning with (tag name)-001.
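
A minimal sketch of the ID format in Python follows. The assumptions are ours and are not stated in pii-history.html: XXX is taken to be exactly three decimal digits, tag names are taken to be lowercase letters and digits, and the function names are hypothetical.

```python
import re

# Assumed shape: lowercase tag name, a hyphen, then exactly three digits.
ID_PATTERN = re.compile(r"^([a-z][a-z0-9]*)-(\d{3})$")


def make_id(tag: str, number: int) -> str:
    """Build an ID such as 'car-003'. Numbering begins at 001."""
    if not 1 <= number <= 999:
        raise ValueError("number must be between 1 and 999")
    return f"{tag}-{number:03d}"


def parse_id(fragment: str):
    """Split an ID back into (tag name, number), rejecting malformed IDs."""
    match = ID_PATTERN.match(fragment)
    if match is None or int(match.group(2)) == 0:
        raise ValueError(f"not a valid ID: {fragment!r}")
    return match.group(1), int(match.group(2))


print(make_id("car", 3))
print(parse_id("car-003"))
```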

For example: the tag 'car' has the link
<A
REL="help"
TITLE="Personally Identifiable Information"
TYPE="text/html"
CHARSET="UTF-8"
LANG="en"
HREF="http://www.RustPrivacy.org/2008/pii-history.html#car-003">[car]</A>
in an HTML page. In a browser, it will look like this: [car].

"help" links do not appear in RDF Description Sets. They would not be a "resource" a computer could use.

Linking to Resource Description Framework (RDF). (REL="Glossary")

Every term in the PII namespace is a Dublin Core (DC) member of a set of labels, concepts, which denote a class of Personally Identifiable Information. The labels constitute a vocabulary scheme or framework which can be used in mark-up (XML, XHTML, HTML, PDF etc.) or even plain text to propagate meta data without the propagation of the specific content. This explicit separation of content from meta data is of particular importance for Personally Identifiable Information (PII).

The RDF counterpart to this file defines the tagspace in a computer processing friendly format. Just as HTML can be valid (against a DTD), or invalid, a resource can use RDF where the tagspace is well-defined by a schema (RDFS) or not. Failure to define a tag in a tagspace or trying to link to a tag not defined in a tagspace is an error, not a "bug" to be fixed as time permits.
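
One way to treat an undefined tag as a hard error rather than a deferred "bug" is sketched below in Python. The set of defined terms here is only an illustrative subset (the three terms used as examples on this page), and the exception and function names are hypothetical; a real check would read the terms from the schema.

```python
TAGSPACE = "http://purl.org/pii/terms/"

# Illustrative subset only: the terms used as examples on this page.
DEFINED_TERMS = frozenset({"car", "dna", "location"})


class UndefinedTermError(ValueError):
    """Raised when a link targets a tag the tagspace does not define."""


def term_uri(term: str, defined=DEFINED_TERMS) -> str:
    """Return the URI for a term, refusing terms the tagspace lacks."""
    if term not in defined:
        raise UndefinedTermError(f"{term!r} is not defined in {TAGSPACE}")
    return TAGSPACE + term


print(term_uri("dna"))
```

Raising immediately keeps a misspelled or invented tag from ever reaching a published document.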

The RDF schema file is called pii-class.rdf, defining the tagspace [http://purl.org/pii/terms/] and all of the tags in the tagspace.

To link to the 'dna' term:
<A
REL="term"
TITLE="Personally Identifiable Information"
TYPE="application/rdf+xml"
CHARSET="UTF-8"
HREFLANG="en"
HREF="http://purl.org/pii/terms/dna">[dna]</A>
in an HTML page. In a browser, it will look like this: [dna].

Note the subtle difference here. In links to human readable definitions the 'HREF' is a URL (a location) and in RDF, 'HREF' is a URI (an identifier). Both anchor elements above go to the same page, the advantage being that the person or computer writing the link need not know which historical version is the latest ahead of time or update the page using the "help" relation when changes are made to the tagspace.
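
The two link forms can be generated side by side, which makes the URL versus URI difference easy to see. This Python sketch simply reproduces the attribute layout of the two examples above; the helper names are hypothetical, and the three-digit number in the "help" fragment follows the (tag name)-XXX format.

```python
from html import escape

HELP_PAGE = "http://www.RustPrivacy.org/2008/pii-history.html"
TERM_BASE = "http://purl.org/pii/terms/"
TITLE = "Personally Identifiable Information"


def _anchor(attrs, term: str) -> str:
    """Render an <A> element with one attribute per line, as in the examples."""
    rendered = "\n".join(f'{k}="{escape(v, quote=True)}"' for k, v in attrs)
    return f"<A\n{rendered}>[{escape(term)}]</A>"


def help_link(term: str, number: int) -> str:
    """Human-readable definition: HREF is a URL naming one specific entry."""
    href = f"{HELP_PAGE}#{term}-{number:03d}"
    return _anchor(
        [("REL", "help"), ("TITLE", TITLE), ("TYPE", "text/html"),
         ("CHARSET", "UTF-8"), ("LANG", "en"), ("HREF", href)],
        term,
    )


def term_link(term: str) -> str:
    """RDF term: HREF is a URI that carries no entry number at all."""
    return _anchor(
        [("REL", "term"), ("TITLE", TITLE), ("TYPE", "application/rdf+xml"),
         ("CHARSET", "UTF-8"), ("HREFLANG", "en"), ("HREF", TERM_BASE + term)],
        term,
    )


print(help_link("car", 3))
print(term_link("dna"))
```

Only help_link needs a number as input; term_link is built from the term name alone.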

Anchors. (REV="Contents")

Meta data and Personally Identifiable Information are "conserved" in the same sense as 19th Century Physics used the term. Their definitions and behavior in context come from somewhere external to where they reside. While way too abstract a concept for most people, this has an immediate, concrete consequence for the representation of PII in documents.

PII displays itself only as an anchor.

To anchor a 'location' term in an HTML page, write:
<A
ID="IDAWD1N"
TITLE="Personally Identifiable Information"
NAME="location"
REV="contents"
TYPE="text/plain"
CHARSET="UTF-8"
LANG="en">Any Location</A>
In a browser, it will look like this: Any Location.

This result does not look any different than plain text, and the reason, of course, is that it should not look any different. The "id" attribute is a unique identifier only within the HTML document; the "contents" remain generic in The-Big-Picture, no matter how specific they seem in the local document. Of the three types of anchor, this type alone has an "ID" (global only to the current document) and is without a class hyperlink (meaning the "contents" of documents can be multi-valued).
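
To see what a redacting round trip would do to such an anchor, here is a small Python sketch using the standard html.parser module. It is an assumption-laden illustration, not the word processor's own code: the class and function names are hypothetical, and all other markup is simply flattened to plain text.

```python
from html.parser import HTMLParser


class ContentsRedactor(HTMLParser):
    """Replace the body of every <A REV="contents" NAME="..."> anchor
    with a text icon holding only the anchor's NAME (the buoy)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self._buoy = None  # set while inside a contents anchor

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        a = dict(attrs)
        if tag == "a" and (a.get("rev") or "").lower() == "contents" and a.get("name"):
            self._buoy = a["name"]
            self.out.append(f"[{self._buoy}]")

    def handle_endtag(self, tag):
        if tag == "a":
            self._buoy = None

    def handle_data(self, data):
        # Static text passes through; anchored content is dropped.
        if self._buoy is None:
            self.out.append(data)


def redact(html: str) -> str:
    parser = ContentsRedactor()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


sample = (
    'Meet at <A ID="IDAWD1N" TITLE="Personally Identifiable Information" '
    'NAME="location" REV="contents" TYPE="text/plain" CHARSET="UTF-8" '
    'LANG="en">1 Main St.</A> today.'
)
print(redact(sample))
```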

Like the "help" link, these reverse links to content do not appear in RDF files either. The same argument for exclusion can be made - that no external "resource" is identified. The objection "but, but the NAME="location" ..." misses the point: the computer's resource name and location is at least a hyperlink (in HTML), [location].

Linking from a Data Base (Citations)

Given a well-defined term space and XSLT, it is possible to convert an ODF <text:bibliography-mark ... /> to an anchor with the 'rel' attribute set to either "help" or "term". To minimize typing, one can use a "biblio schema" table of OpenOffice 2.3.x to initialize the ODF element.
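
The conversion itself is done with XSLT; purely as an illustration of the idea, the Python sketch below performs a comparable mapping with the standard xml.etree.ElementTree module. Which attributes of the bibliography mark feed which parts of the anchor is our assumption (text:identifier for the term name, text:url for the target), and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from html import escape

# Namespace of the ODF "text" prefix.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def mark_to_anchor(mark_xml: str, rel: str = "term") -> str:
    """Turn one <text:bibliography-mark/> into an HTML anchor.

    Assumed mapping (not taken from the actual stylesheet):
    text:identifier -> visible term name, text:url -> HREF.
    """
    if rel not in ("help", "term"):
        raise ValueError("rel must be 'help' or 'term'")
    mark = ET.fromstring(mark_xml)
    if mark.tag != f"{{{TEXT_NS}}}bibliography-mark":
        raise ValueError("not a text:bibliography-mark element")
    ident = mark.get(f"{{{TEXT_NS}}}identifier")
    url = mark.get(f"{{{TEXT_NS}}}url")
    if not ident or not url:
        raise ValueError("mark needs text:identifier and text:url")
    return f'<A REL="{rel}" HREF="{escape(url, quote=True)}">[{escape(ident)}]</A>'


sample_mark = (
    '<text:bibliography-mark '
    'xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0" '
    'text:bibliography-type="www" text:identifier="dna" '
    'text:url="http://purl.org/pii/terms/dna"/>'
)
print(mark_to_anchor(sample_mark))
```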

The logic necessary to convert a visible HTML Element (CITE|ABBR|ACRONYM) to a <text:bibliography-mark ... /> calls for a reverse relation "contents" link structure, which is not quite the same thing as mechanically swapping the TITLE attribute and the Element Content. In the case of an HTML META Element, the SCHEME attribute can be used to specify the namespace, in effect a "Table Of Q-Names" which maps the table of "contents".

Some of the other Dublin Core terms, and all of the Vocabulary Encoding Schemes, can be inserted as visible meta data in an XHTML Body and processed by the Advanced Methods. However, because a round-trip into meta data space (uniformity in URL and URI) is not always possible, we have to cheat just a bit when it comes to tagging XHTML with terms from the other Vocabulary Encoding Schemes. We quickly add that we do not mean the use of ontologies (OWL) is wrong, just unnecessary for http://purl.org/pii/terms/ until such time as coherent semantics are needed for inference engines to resolve the relationship of similar PII with different genres.

 

 

 

 

 

Copyright © 2005-2008 Gannon J. Dick, all rights reserved.