Easy RDF For Real-Life System Modeling

Thomas B. Passin

August 7, 2007

 [next] 

 [last] 

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 2 of 26

Two Threads - Better Together

- Make RDF much easier

For "ordinary" people and programmers

- Enterprise and systems modeling

Simplify instance data collection and organization

- Hope for simplicity, flexibility, better interoperability, etc.

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 3 of 26

Why Pair RDF And Modeling?

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 4 of 26

Diagrams Look Simple, But ...

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 5 of 26

What's Wrong With A Good Old Database?

Plus ... not interoperable

Plus ... local identifiers, no semantics, ...

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 6 of 26

OK, What's Wrong With Program X?

EA programs - e.g., Metis, Rose

Expensive

Steep learning curve

Hard to share data (XMI isn't that great)

Hard to get data out in new forms

Can be hard to restructure

Emphasis here is on early modeling, data collection. See wish list on next slide.

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 7 of 26

A Wish List

1. Easy collection of instance data

2. Data looks human- e.g., indented lists

3. Instance data can be machine-processed

4. Easy restructuring

5. Ability to include undefined or undescribed entities in the dataset

i.e., can be added even when their characteristics are not known

6. Ability to reclassify entities, even after instance data has been entered

7. Ability to extract the vocabulary (ontology)

8. Ability to create data entry templates

(to guide data collection)

(more coming on next slide)

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 8 of 26

More Wish List

9. Easy to add new properties

discovered as the model evolves

or during data collection activities

10. Take data collected using the (paper) templates and get it into the data set

11. Feasible to transform the dataset into other forms

For display, import into established tools, etc.

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 9 of 26

How? RDF (Yes, Really) - The Practical RDF Subset

Start with toy model as an example:

(Compare with this earlier diagram )

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 10 of 26

How People Like To Jot Down Structured Data

Fruit
   apples
   bananas

milk
eggs

Paper products
   paper towels
   napkins

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 11 of 26

What Should Instance Data Look Like?

How a human would like it -

Organization
   org-id:OFS
   name:Office for Federated Systems
   member of:Federated Umbrella Organization
 

A small RDF subset looks similar -

<Organization
   rdf:about='OFS'>
   <rdfs:label>Office for Federated Systems</rdfs:label>
   <member-of
      rdf:resource='Federated-Umbrella-Organization'/>
</Organization>

It's already close, but we can do better

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 12 of 26

The RDF As Structured Text

A simple structured text format really does the job -

Organization::
   rdf:about::OFS
   rdfs:label::Office for Federated Systems
   member-of::
     rdf:resource::Federated-Umbrella-Organization

Compare with the original -

Organization
   org-id:OFS
   name:Office for Federated Systems
   member of:Federated Umbrella Organization

A simple Python parser converts this into the RDF-XML form.

Rules for the structured text are in the full paper.

Convenient, but can also just use the RDF-XML form, not much harder.

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 13 of 26

Layout - Whether Structured Text or Not

Key points -

"Striped" structure - alternate resource and property

Use blank nodes (bnodes) for structured properties

(rdf:parseType='Resource')

Use XML shortcuts - entities, xml:base, namespaces - for readability

Modularize using external entities

Pull parts together using DTD, skeleton RDF file.

All together, these give readable, textual data format

Any resource can come in any order.

Top level (XML) element names represent the kinds of things (classes)

Restructure by moving properties under newly inserted bnode

Extract data for presentation using XSLT.

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 14 of 26

Other Benefits

Annotations and labels can be added later

RDF processors Just Do The Right Thing.

Can use existing OWL ontology or derive one from the data

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 15 of 26

Naming, Classification, and OWL

Everything needs an identifier but not everything needs a name at first.

Can use RDFS or OWL to give labels to kinds of resources and properties

owl:Class::
   rdf:about::ObservingSystem
   rdf:label::Observing System
   @rdfs:subClassOf::
      rdf:resource::System

   ; Note that we need to use the "@" character, because
   ; rdfs:subClassOf is not one of the three special 
   ; attributes that are predefined in the translator.

RDF-XML for this fragment -

<owl:Class rdf:about='ObservingSystem'>
    <rdfs:label>Observing System</rdfs:label>
    <rdfs:subClassOf rdf:resource='System'></rdfs:subClassOf>
</owl:Class>

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 16 of 26

Example Instance Data

; Here is a bit of instance data
Org::
   rdf:about::ofs
   rdfs:label::Office for Federated Systems
   acronym::OFS
   homepage::http://www.example.com/ofs
   member-of::
      rdf:resource::fed-umbr-org
   related-org::
      rdf:parseType::Resource
      role::
         rdf:resource::fundedby
         org::
            rdf:resource::lga

Org::
   rdf:about::fed-umbr-org
   rdfs:label::Federated Umbrella Organization
   acronym::FUO

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 17 of 26

The Real World - Models Evolve

Typical kinds of evolution - MUST be able to handle

Renaming

Name defined by defined by rdfs:label

Define and change in just one place

Adding And Splitting Structure

Most common need - see example on next slide

Reclassifying

Reparented, or to be declared to be a subclass of a new class

Done in one place, by direct manual typing or by a search-and-replace operation

Accidental Duplication

Using same identifier - no harm done, RDF processor will Do The Right Thing

Using different identifier -

Can fix in text editor using global search-and-replace for identifer, or

Use OWL owl:sameAs property. Processor must understand this property.

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 18 of 26

Example Of Evolving Structure

Original funding description

   funding-profile::
      funding-fy-profile::
         date-fy::2006
         budget-amount:500k

Oh, wait, we need to capture funding agency

   funding-profile::
      funding-fy-profile::
         date-fy::2006
         budget-amount::500k
         funding-org::
             rdf:resource::big-govt-agency

We're not done, see next slide

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 19 of 26

Example Of Evolving Structure (concluded)

Oops, the client wants the funds shown in specific categories

   funding-profile::
      funding-fy-profile
         date-fy::2006
         budget-amount::
            operational:400k
            new-acquisition::100k
         funding-org::
            rdf:resource::big-govt-agency

This process will probably continue as we collect more data and learn more.

The RDF-subset/Structured Text approach works beautifully for these kinds of changes.

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 20 of 26

Example DTD

Typical DTD is clean and lean

<?xml version='1.0' encoding='utf-8'?>
<!--================ URI shortcuts =================-->
<!ENTITY model-orgs "http://tpassin.net/architecture/2007-04-10/orgs/">
<!ENTITY model-systems "http://tpassin.net/architecture/2007-04-10/systems/">
<!ENTITY model-person "http://tpassin.net/architecture/2007-04-10/person/">

<!--============= Entity for the base namespace ======-->
<!ENTITY arch-ns "http://tpassin.net/architecture/2007-04-10/">

<!--=========== External entities containing the data ===============-->
<!ENTITY base-ontology SYSTEM 'basemodel_ontology.ent'>
<!ENTITY systems SYSTEM 'systems.ent'>
<!ENTITY persons SYSTEM 'persons.ent'>
<!ENTITY orgs SYSTEM 'orgs.ent'>

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 21 of 26

Example Top-Level RDF File

Again, clean and simple

<?xml version='1.0'?>
<!DOCTYPE rdf:RDF  SYSTEM "arch_rdf.dtd">
<rdf:RDF 
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns='&arch-ns;'
xml:base='&arch-ns;'
>

&base-ontology;
&systems;
&persons;
&orgs;
</rdf:RDF>

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 22 of 26

Creating The Data Files - Typical Workflow

1. Decide on base URI, key entities (can extend later)

2. Write DTD fragment to declare them

3. Write top-level RDF file to import DTD, define namespaces

4. Write text-format files with instance data (class definitions optional)

e.g., person.txt

5. Transform structured text files to RDF-XML

e.g., person.ent

6. Add one external entity declaration to DTD for each module (RDF-XML file)

7. Add a reference to each external entity in the body of the top-level RDF-XML file.

8. (optional) Can create all-in-one RDF-XML file by running through a good XML parser, like RXP.

Step 4 is most of the work - can be done by people with minimal training.

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 23 of 26

Presenting The Model

Many purposes -

Show structure

Show classes and properties

Show all instance data

Create templates for data collection

Transform to XMI or other format used by graphical tools

How? XSLT transforms work well

Possible because of simple, regular structure

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 24 of 26

After The Model Stabilizes ...

- Stick with this format

- Pro: easy

- Con: Not many tools to work with the data as is

- Con: lots of identifiers to remember and navigate

- Import into RDF Database

- Pro: simple way to handle lots of data

- Con: few modeling tools that can use RDF data

- Convert to RDBMS

- Pro: easy, because RDF model is well-normalized already

- Con: proprietary, one-off data model

- Convert to modeling tool format (Metis, Rose, UML, etc)

- Pro: these tools have many capabilities

- Pro: Metis data format is nearly RDF already

- Pro: some clients mandate specific tool

- Con: may be difficult to transform data for tool.

- Com: expense and learning curve of tools.

 [first] 

 [prev] 

 [next] 

 [last] 

Slide 25 of 26

Limitations

Can have lots of identifiers to remember

Not unique to this approach, but no tool help for navigating and creating graphics.

Could write something ...

Good mnemonic naming conventions help a lot

How to incorporate non-text data

E.g., images, tables

Should be as simple as the basic structured text

Export to other tools

Need to roll your own

 [first] 

 [prev] 

Slide 26 of 26