Spiral Version 0.34

This minor release adds support for optional query patterns. An optional pattern attempts to match triples but does not fail the query if no match is possible. This is still quite a low-level implementation because the Sparql parser is still being written, so for the moment you have to build the underlying objects yourself.

For example, to represent a query like:


?subj <http://example.com/property> ?obj .
OPTIONAL { ?subj <http://example.com/other> ?obj2 . }

You would need to write:


  Query query = new Query();

  query.AddPattern( 
    new Pattern( 
      new Variable("subj"),  
      new UriRef("http://example.com/property"), 
      new Variable("obj") 
    ) 
  );

  query.AddOptionalPattern(
    new Pattern(
      new Variable("subj"),
      new UriRef("http://example.com/other"),
      new Variable("obj2")
    )
  );

To process it you would do the following:


  IEnumerator solutions = tripleStore.Solve( query );
  
  while (solutions.MoveNext() ) {
    QuerySolution solution = (QuerySolution)solutions.Current;
    
    Console.WriteLine("Subject is " + 
      tripleStore.GetBestDenotingNode( solution["subj"] ).GetLabel() );
      
    Console.WriteLine("Property value is " + 
      tripleStore.GetBestDenotingNode( solution["obj"] ).GetLabel() );

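    // obj2 is bound by the optional pattern, so it may have no value
    // in solutions where only the first pattern matched.
    Console.WriteLine("Other value is " + 
      tripleStore.GetBestDenotingNode( solution["obj2"] ).GetLabel() );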

  }
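
To show what "optional" means for the solutions, here is a small self-contained sketch. It does not use Spiral at all: the OptionalMatchSketch class, the string-array triples and the "?name" variable convention are all invented for illustration, and Spiral's real solver will certainly differ. It only demonstrates the rule that a solution is kept even when the optional pattern finds nothing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only: none of these types come from Spiral.
public static class OptionalMatchSketch
{
    // In this sketch a term starting with '?' is a variable; anything else is a constant.
    static bool IsVariable(string term) => term.Length > 0 && term[0] == '?';

    // Try to extend 'binding' so that 'pattern' matches 'triple'; returns null on failure.
    static Dictionary<string, string> Extend(
        Dictionary<string, string> binding, string[] pattern, string[] triple)
    {
        var result = new Dictionary<string, string>(binding);
        for (int i = 0; i < 3; i++)
        {
            if (!IsVariable(pattern[i]))
            {
                if (pattern[i] != triple[i]) return null;
            }
            else if (result.TryGetValue(pattern[i], out var bound))
            {
                if (bound != triple[i]) return null;
            }
            else
            {
                result[pattern[i]] = triple[i];
            }
        }
        return result;
    }

    public static List<Dictionary<string, string>> Solve(
        List<string[]> triples, string[] required, string[] optional)
    {
        var solutions = new List<Dictionary<string, string>>();
        foreach (var triple in triples)
        {
            var binding = Extend(new Dictionary<string, string>(), required, triple);
            if (binding == null) continue;

            // Optional part: keep every extension, or the bare binding if there is none.
            var extended = triples
                .Select(t => Extend(binding, optional, t))
                .Where(b => b != null)
                .ToList();
            if (extended.Count == 0) extended.Add(binding);
            solutions.AddRange(extended);
        }
        return solutions;
    }

    public static void Main()
    {
        var triples = new List<string[]> {
            new[] { "a", "property", "1" },
            new[] { "a", "other", "2" },
            new[] { "b", "property", "3" },
        };
        var solutions = Solve(triples,
            new[] { "?subj", "property", "?obj" },
            new[] { "?subj", "other", "?obj2" });

        foreach (var s in solutions)
        {
            string obj2 = s.TryGetValue("?obj2", out var v) ? v : "(unbound)";
            Console.WriteLine(s["?subj"] + " " + s["?obj"] + " " + obj2);
        }
    }
}
```

With the sample data this prints "a 1 2" followed by "b 3 (unbound)": the second subject has no matching optional triple but still appears in the results.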

Work is ongoing to support the Sparql syntax and cleaner result handling.

About Spiral: Spiral is an RDF processing framework targeted at the .NET and mono platforms and is made freely available under a liberal MIT-style license. It contains no unmanaged code and so should work in every environment supported by .NET or mono. It provides a set of classes to represent the RDF model, implementations of in-memory and mySQL triple stores, an RDF/XML parser, multi-pattern querying and a simple rules system. Early support for Sparql syntax has recently been added. More about Spiral…

Spiral Version 0.33

This version of Spiral includes a bugfix for the backtracking query solver, which was failing on some orderings of the query patterns. It also contains preliminary support for Sparql. This is at a very early stage but is usable for simple query patterns using SELECT. Error handling is minimal at the moment, so the query syntax has to be just right. Consult the test cases to see exactly what is supported.

Spiral mySQL Schema

I've been reading Benjamin Novack's writing on exploding triple stores with interest and took some time to speak with him about it at ISWC this year. It prompted me to take another look at the mySQL triplestore used by Spiral. This is a little write-up of the design of that triplestore database.

First, let me explain a little about the RDF representation used in Spiral which might be a little different to other frameworks. Spiral is resource-centric rather than graph-centric. In concrete terms Spiral has a Resource class which has zero or more graph nodes associated with it. A graph node can only be associated with a single resource. Without smushing or reasoning each resource always has one node. The smushing process generally finds nodes that denote the same resource and so these associations can simply be updated.

The TripleStore interface provides various methods to deal with either resources or graph nodes and their associations. For example there's a GetResourceDenotedBy method which gives you the resource that a particular graph node denotes. Conversely there's a GetBestDenotingNode method which gives you a graph node for a particular resource. Generally this will give you a URI if the store knows of one, otherwise you'll get a blank node (I'm ignoring literals here). You can also get a list of all nodes for a resource using GetNodesDenoting. Finally you can build associations by using AddDenotation which takes a graph node and a resource pairing.
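
As a rough illustration of these relationships, the sketch below models them with a plain dictionary. The method names echo the ones mentioned above, but the signatures, the integer resource ids and the "_:" test for blank nodes are assumptions made for this example, not Spiral's actual TripleStore interface.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model only; Spiral's real TripleStore interface may look quite different.
public sealed class DenotationSketch
{
    // node label -> resource id; each node denotes exactly one resource
    private readonly Dictionary<string, int> resourceOfNode = new Dictionary<string, int>();

    // Re-pointing an existing node at another resource is how smushing
    // would update an association in this sketch.
    public void AddDenotation(string node, int resource)
    {
        resourceOfNode[node] = resource;
    }

    public int GetResourceDenotedBy(string node) => resourceOfNode[node];

    public List<string> GetNodesDenoting(int resource) =>
        resourceOfNode.Where(p => p.Value == resource).Select(p => p.Key).ToList();

    // Prefer a URI; fall back to a blank node (labelled "_:..." here); null if there are no nodes.
    public string GetBestDenotingNode(int resource)
    {
        var nodes = GetNodesDenoting(resource);
        return nodes.FirstOrDefault(n => !n.StartsWith("_:", StringComparison.Ordinal))
            ?? nodes.FirstOrDefault();
    }

    public static void Main()
    {
        var map = new DenotationSketch();
        map.AddDenotation("_:b1", 1);
        map.AddDenotation("http://example.com/ian", 2);

        // Smushing decides that both nodes denote the same resource.
        map.AddDenotation("_:b1", 2);

        Console.WriteLine(map.GetBestDenotingNode(2));
        Console.WriteLine(map.GetNodesDenoting(2).Count);
    }
}
```

After the smush both nodes point at resource 2, so the best node is the URI and GetNodesDenoting returns two entries.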

This is the model we wanted to follow when designing the mySQL support in Spiral. We already had a memory based triple store working well but we knew we wanted to support much larger persistent stores.

In Spiral a triplestore is equivalent to a named graph. Each mySQL database can store multiple triplestores. We use a Graphs table to partition the space. Each graph has a unique id. I'd like to see this changed to a URI to be more consistent with other implementations of named graphs.

The core of the mySQL triplestore is the Statements table. This table holds all of the triples for all of the graphs. Key here is that the triples are in terms of the underlying resources not the graph nodes. Each resource is represented in this table by a unique hash, currently a 32-bit integer. We may decide to increase this as an option but haven't needed to yet. Each row in the Statements table is 16 bytes. Using a bigint would double this row length with the corresponding reductions in rows per disk read. There is a separate Resources table which records which resource is known in a particular graph with a unique index to ensure that each resource can appear in a graph only once. They are related to the specific nodes using a ResourceNodes table. This table associates a node with a resource in a specific graph. Every node appears in this table and those that have additional lexical information such as literals and URIs have additional tables as well. Blank nodes simply exist in the ResourceNodes table.
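
The row-size arithmetic can be checked with a few lines of code. This is a back-of-envelope sketch: the four-column layout (graph, subject, predicate, object) is my guess at how 16 bytes is reached, the 4096-byte read size is an arbitrary figure, and any per-row overhead added by the database is ignored.

```csharp
using System;

public static class RowSizeSketch
{
    // Assumed columns: graph id, subject, predicate, object (one hash each).
    public const int Columns = 4;

    public static int RowBytes(int bytesPerColumn) => Columns * bytesPerColumn;

    // How many whole rows fit in one read of 'blockBytes' bytes.
    public static int RowsPerRead(int blockBytes, int bytesPerColumn) =>
        blockBytes / RowBytes(bytesPerColumn);

    public static void Main()
    {
        const int blockBytes = 4096; // hypothetical read size, purely illustrative

        Console.WriteLine("int:    " + RowBytes(sizeof(int)) + " bytes/row, "
            + RowsPerRead(blockBytes, sizeof(int)) + " rows per read");
        Console.WriteLine("bigint: " + RowBytes(sizeof(long)) + " bytes/row, "
            + RowsPerRead(blockBytes, sizeof(long)) + " rows per read");
    }
}
```

Under those assumptions a 32-bit hash gives 16-byte rows and 256 rows per read, while a 64-bit hash gives 32-byte rows and 128 rows per read.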

Diagram of Spiral database schema

This schema is designed to work well for querying smushed graphs. Queries are rewritten in terms of the underlying resources by looking up the graph node/resource relationships at query compilation time. The query is then evaluated against the Statements table and the best nodes for each matching resource looked up at output time. Because the Statements table contains only unique triples based on the underlying resources it can actually get smaller after smushing and so should improve query performance.
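
Here is a minimal sketch of why the table can shrink. Statements are modelled as tuples of resource ids and smushing as a rewrite that maps one id onto another. The SmushSketch class and the integer ids are invented for this example and say nothing about how Spiral actually performs the update in mySQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SmushSketch
{
    // Rewrite every statement so that 'duplicate' is replaced by 'kept'.
    // The HashSet drops any statements that become identical after the rewrite.
    public static HashSet<(int S, int P, int O)> Smush(
        IEnumerable<(int S, int P, int O)> statements, int kept, int duplicate)
    {
        int Map(int id) => id == duplicate ? kept : id;
        return new HashSet<(int S, int P, int O)>(
            statements.Select(t => (S: Map(t.S), P: Map(t.P), O: Map(t.O))));
    }

    public static void Main()
    {
        var before = new List<(int S, int P, int O)> {
            (1, 10, 3),
            (2, 10, 3),   // same statement as above once 2 is merged into 1
            (2, 11, 4),
        };
        var after = Smush(before, kept: 1, duplicate: 2);
        Console.WriteLine(before.Count + " statements before, " + after.Count + " after");
    }
}
```

The sample prints "3 statements before, 2 after": two rows collapse into one because they describe the same underlying resource.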

Carp and Spiral Versions 0.3

I recently uploaded version 0.3 of Carp and Spiral, our open source RDF tools. Both are written in pure C# and are designed to run under Mono or Microsoft's .NET runtimes. Spiral provides RDF parsing, storage, output and querying functions with both in-memory and mySQL storage options. Carp is a high level object API built on Spiral that allows RDF to be incorporated into .NET applications in a natural and idiomatic way. For example, the following Carp code loads some RDF from http://foo.example.com/, creates a search pattern for all foaf:Persons with a foaf:mbox_sha1sum of xxx and uses it to search the loaded RDF, printing out the names of all the friends of each person found.

KnowledgeBase knowledge = new KnowledgeBase();
knowledge.include("http://foo.example.com/");


Foaf.Person pattern = new Foaf.Person();
pattern.mbox_sha1sum += "xxx";

foreach (Foaf.Person person in pattern.findAllMatching(knowledge)) {
  Console.WriteLine( person.name );
  foreach (Foaf.Person friend in person.knows) {
    Console.WriteLine( "  knows " + friend.name );
  }
}

Note that this version has breaking changes: the naming of methods and classes in both Carp and Spiral now conforms to standard .NET conventions. This will affect all existing code using versions prior to 0.3.

SemPlan RdfLib Version 0.21

Oops. It seems that I forgot to update the build script to include the new mySQL classes for RdfLib, so I've released version 0.21 that does contain them!

Direct downloads: binaries, source, docs, license

3 Comments

  1. Will you be fixing the broken links to the binaries, etc? You have one of the few RDF library implementations for C#, so it'd be nice to have these updated.

    Comment by Bruce Schalau — 17 May 2005 @ 2:49 pm

  2. Whoops, they were left behind when I moved to a different hosting provider yesterday! The links should be fixed now.

    Comment by Ian Davis — 17 May 2005 @ 3:01 pm

  3. Could you update the binaries? The binaries don't contain the MySql classes.
    Thanks

    Comment by Luiz Silva — 23 May 2005 @ 1:49 pm

Carp Version 0.2

Hot on the heels of SemPlan RdfLib 0.2 is Carp 0.2, our Mono/.NET RDF API. As I mentioned earlier, we've refactored the two projects and pulled a lot of stuff out of Carp down into RdfLib. This leaves Carp leaner, but still as easy to use. Because of this, if you're using Carp and you want to upgrade RdfLib then you need to upgrade Carp too. Carp is now under the less restrictive MIT license too.

The introduction of non-memory based triple stores in RdfLib has meant that we've had to rethink the way ResourceDescriptions work. There are some (slightly out of date) design notes on the wiki which help describe the problem. The summary is that because each ResourceDescription needs access to the complete subgraph relating to a Resource, the old implementation took a copy of the relevant triples. Modifications to the ResourceDescription never affected the underlying KnowledgeBase, so it was always safe to get a Foaf.Agent, add a weblog and some knows properties and then add it to another KnowledgeBase. You'd never change your original data by manipulating a ResourceDescription. However, when it comes to database backed triple stores, it's dangerous to assume that a ResourceDescription could fit into memory. Therefore we had to devise a new model that preserved the consistency we had before, but didn't require a complete copy of all the data.

So now, each ResourceDescription is read-only by default, which allows it to read triples on demand from its underlying triple store without side effects occurring. We call this an "attached" read-only ResourceDescription. You can get a writeable copy of the ResourceDescription by invoking the copy() method - you get a complete copy of the subgraph as before so you'd better be sure you know how much memory you need. This is a "detached" writeable ResourceDescription. Finally, you can specify that you want a writeable object when you query the KnowledgeBase: getDescriptionOf now takes an isWriteable parameter. When you do this you get an "attached" writeable ResourceDescription. Changes to the object will affect the underlying triple store so you need to be careful passing this thing around your application unless you like dealing with non-local side-effects. It's not an ideal solution, but it is a convenient one, in the spirit of Carp.
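
The three flavours can be sketched without Carp at all. In the toy below a plain list of strings stands in for the triple store, and the DescriptionSketch class with its Attach, Copy and Add members is invented for illustration. Carp's real ResourceDescription reads triples on demand and is considerably richer than this.

```csharp
using System;
using System.Collections.Generic;

// Toy model only: a List<string> plays the part of the underlying triple store.
public sealed class DescriptionSketch
{
    private readonly List<string> values;   // the store's own list, or a private copy

    public bool IsAttached { get; }
    public bool IsWriteable { get; }

    private DescriptionSketch(List<string> values, bool attached, bool writeable)
    {
        this.values = values;
        IsAttached = attached;
        IsWriteable = writeable;
    }

    // Attached: shares the store's data, read-only unless asked otherwise.
    public static DescriptionSketch Attach(List<string> storeValues, bool isWriteable) =>
        new DescriptionSketch(storeValues, true, isWriteable);

    // Detached: a full in-memory copy that is always writeable.
    public DescriptionSketch Copy() =>
        new DescriptionSketch(new List<string>(values), false, true);

    public IReadOnlyList<string> Values => values;

    public void Add(string value)
    {
        if (!IsWriteable) throw new InvalidOperationException("read-only description");
        values.Add(value);
    }

    public static void Main()
    {
        var store = new List<string> { "http://example.com/weblog" };

        var readOnly = Attach(store, isWriteable: false);
        var detached = readOnly.Copy();
        detached.Add("http://example.com/other");   // the store is untouched

        var live = Attach(store, isWriteable: true);
        live.Add("http://example.com/third");       // the store itself changes

        Console.WriteLine("store: " + store.Count + ", detached copy: " + detached.Values.Count);
    }
}
```

Calling Add on the read-only attached object would throw, the detached copy changes only itself, and the attached writeable object changes the store, which is the non-local side effect warned about above.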

There have been some other small API changes, mainly because we've clarified the resource/node relationships in RdfLib. There shouldn't be anything that breaks existing code though. Let me know if you have any problems.

The main new feature is full RDFS entailment. This utilises the new query and rule capabilities in RdfLib and is triggered by calling think on a KnowledgeBase. Now your foaf:Persons are foaf:Agents and the rel:fatherOf someone really does foaf:know them too!
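
For a feel of what entailment does, here is a deliberately tiny forward-chaining sketch covering just two of the RDFS rules (rdfs9 for subclasses and rdfs7 for subproperties). The EntailmentSketch class, the string-tuple triples and the sample data are made up for this example. It is not the RdfLib rule processor, and full RDFS entailment involves more rules than these.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EntailmentSketch
{
    const string RdfType = "rdf:type";
    const string SubClassOf = "rdfs:subClassOf";
    const string SubPropertyOf = "rdfs:subPropertyOf";

    // Apply the two rules repeatedly until no new triples appear.
    public static HashSet<(string S, string P, string O)> Think(
        IEnumerable<(string S, string P, string O)> triples)
    {
        var known = new HashSet<(string S, string P, string O)>(triples);
        bool changed = true;
        while (changed)
        {
            changed = false;
            var snapshot = known.ToList();
            foreach (var t in snapshot)
            {
                foreach (var rule in snapshot)
                {
                    // rdfs9: x type C, C subClassOf D  =>  x type D
                    if (t.P == RdfType && rule.P == SubClassOf && rule.S == t.O)
                        changed |= known.Add((t.S, RdfType, rule.O));

                    // rdfs7: x p y, p subPropertyOf q  =>  x q y
                    if (rule.P == SubPropertyOf && rule.S == t.P)
                        changed |= known.Add((t.S, rule.O, t.O));
                }
            }
        }
        return known;
    }

    public static void Main()
    {
        var triples = new List<(string S, string P, string O)> {
            ("ian", RdfType, "foaf:Person"),
            ("foaf:Person", SubClassOf, "foaf:Agent"),
            ("ian", "rel:fatherOf", "tom"),
            ("rel:fatherOf", SubPropertyOf, "foaf:knows"),
        };
        foreach (var t in Think(triples))
            Console.WriteLine(t.S + " " + t.P + " " + t.O);
    }
}
```

Starting from the four sample triples, the closure adds two more: ian is a foaf:Agent, and ian foaf:knows tom.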

Direct downloads: binaries, source, docs, license

2 Comments

  1. […] eems to be a bit more movement around creating semantic web tools for .net. In particular carp and semplan. Good […]

    Pingback by Bird's Eye View Blog — 23 Dec 2004 @ 11:35 pm

  2. I think the binaries for the CARP downloads are missing. Any chance we can have a link to these? Thanks

    Andy

    Comment by Andy — 23 Aug 2005 @ 7:58 am

SemPlan RdfLib Version 0.2

I've just uploaded version 0.2 of our Mono/.NET RDF library. We've improved just about everything that was included in 0.1 and added lots of new features, laying the groundwork for the future. The most important change is in the licence which is now less restrictive. We've adopted the MIT license which is the same license that the Mono project uses to license their class libraries. Our previous license was BSD based and required an attribution in the documentation for any application using RdfLib - that's all gone now.

Over the last few months we spent a lot of time examining RdfLib and Carp and we've pushed a lot of the useful classes from Carp down into RdfLib. Carp is lighter and RdfLib is more functional. One thing that emerged from Carp's KnowledgeBase class was a TripleStore interface that represents some storage for triples. We've refined this and moved it into the core of RdfLib and it's now responsible for the organisation of any RDF data that is being processed. There are two implementations at the moment: MemoryTripleStore is an in-memory store suitable for small, ad-hoc processing and MySqlTripleStore which will become capable of storing much larger quantities of data for long durations. We've had some ups and downs with MySQL, mainly around the quality of the drivers available for use with Mono. At this stage we consider the MySqlTripleStore to be alpha quality. It's usable but performance is a big issue. One of us will write more on that another time hopefully.

Feature-wise the biggest addition is the query capabilities. We have a pattern-based query implementation including a backtracking query solver that can be applied to any triple store and one optimised for mySQL. Our general principle is to provide default implementations of interfaces that work as efficiently as a general purpose algorithm can, while creating much more performant implementations for specific situations. Unfortunately there are no query language parsers yet; all queries have to be constructed out of objects. We plan to turn our attention to parsers in the next release (see the roadmap). The next exciting addition follows straight on from querying: rule processing. We have implemented a very basic rule processor capable of expressing the RDFS entailment rules. Version 0.2 of Carp uses it for that exact purpose.

There's lots of work yet to be done, but we're having fun, learning a huge amount and hopefully producing some useful software.

Direct downloads: binaries, source, docs, license

4 Comments

  1. […] RSS 1.0 (comments)
    ⇐ previous entry […]

    Pingback by Semantic Planet :: Carp Version 0.2 — 22 Dec 2004 @ 3:43 pm

  2. […] s to be a bit more movement around creating semantic web tools for .net. In particular carp and semplan. Good to see, I try and write about how semantic web applications should be […]

    Pingback by Bird's Eye View Blog — 23 Dec 2004 @ 11:35 pm

  3. Hi,

    I'm just starting on this awesome world of the semantic web. I'm interested in persistence of RDF on relational databases, so I got to your site after searching for a while.

    I have read on this topic that

    ; but I've downloaded the package but there is only the memory version. Perhaps I misunderstood the comment, but I supposed that the MySQL version was on this release. It should be or not?

    Thanks for your work.

    Comment by Jose A. — 21 Jan 2005 @ 7:51 pm

  4. Sorry for the previous post. Following "I have read on this topic that" I wanted to write:
    "There are two implementations at the moment: MemoryTripleStore is an in-memory store suitable for small, ad-hoc processing and MySqlTripleStore which will become capable of storing much larger quantities of data for long durations."

    Comment by Jose A. — 21 Jan 2005 @ 7:53 pm

Why RDF Templates Uses Macros

I'm working on a .NET version of RDF Templates using Carp. James asked why you had to use macros to get nice NodePaths like *[rdf:type = foaf:Agent] instead of QNames based off of the namespaces in the source documents or stylesheet. It's been a while since I wrote the specification and to be honest I'd forgotten why, other than just disliking QNames in content.

After a bit of thought I remembered some of the original reasoning. You couldn't base QName expansion off of the source documents because RDF Templates and their NodePaths operate over a graph, not an instance document. The graph could have been built from hundreds of individual instance documents each with their own namespace prefix mappings (or none if the RDF source wasn't XML based).

That leaves getting QName mappings from the stylesheet. One reason why this would be a bad idea is that it would force the stylesheet author to include namespace mappings for every QName they refer to in their NodePaths even if they never use an element from that namespace in their actual markup. Roundtripping the stylesheet through an XML processor then becomes problematic since many processors will not output unused namespace declarations and of course the only usage in the stylesheet might be hidden in a NodePath inside an attribute value.

The last part of the reasoning is that macros give a lot more expressive power than QNames. A QName by definition can only represent a namespace URI and local name. A macro on the other hand can represent almost any part of a NodePath. They can contain other macros too, so you can simplify very complex expressions down to a few concise macros and get a lot of reuse throughout the document. A future version of RDF Templates will also allow inclusion of stylesheets which would enable the creation of macro libraries, essentially libraries of useful RDF query expressions.
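
To illustrate the nesting point, here is a small sketch of recursive macro expansion. Everything about the syntax is invented: the "$name" notation for a macro reference and the angle-bracketed URIs are placeholders rather than the notation defined in the RDF Templates specification. Only the idea of macros containing other macros is taken from the text above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MacroSketch
{
    // Hypothetical reference syntax: a dollar sign followed by a macro name.
    static readonly Regex MacroRef = new Regex(@"\$(\w+)");

    // Replace each macro reference with its body, expanding nested macros too.
    public static string Expand(string path, IDictionary<string, string> macros, int depth = 0)
    {
        if (depth > 16)
            throw new InvalidOperationException("macro nesting too deep (possible cycle)");

        return MacroRef.Replace(path, m =>
            macros.TryGetValue(m.Groups[1].Value, out var body)
                ? Expand(body, macros, depth + 1)
                : m.Value);   // unknown macros are left untouched
    }

    public static void Main()
    {
        var macros = new Dictionary<string, string> {
            { "type", "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>" },
            { "Agent", "<http://xmlns.com/foaf/0.1/Agent>" },
            { "agents", "*[$type = $Agent]" },   // a macro built from other macros
        };
        Console.WriteLine(Expand("$agents", macros));
    }
}
```

A single short macro expands into a complete path expression, and a stylesheet of such definitions could be shared as a library.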

I knew there was a good reason why I avoided QNames…

Announcing Giblet

Giblet (Graph In-Browser Lightweight Editing Tool) allows the creation of simple RDF graphs, based on available RDF schemata. It runs as a Javascript application in a Javascript-enabled browser.

Launch Giblet.

This experimental release of Giblet is targeted at RDF savvy developers, although we will re-skin it with a less geeky look and feel for beginner tasks like FOAF or DOAP generation. There is a constant tension here between power and simplicity.

Giblet offers the following basic features:

  • Class selection and resource creation
  • Appropriate property selection, based on the domain of the class and its super-classes
  • Appropriate value input, whether as a string literal or an internal resource reference (based on the range of the property)
  • Saving and loading of simple graphs to and from local storage, via cookies
  • Export of graphs as RDF/XML, and a submission to the W3C's online image generator for RDF graphs
  • Availability of templates (initially FOAFNet only) to help a user get started

Giblet is initialised by a model described in Javascript. This is generated from various RDF schemata by XSLT, although we will use Semantic Planet's CARP library in future. Giblet has been tested in Firefox preview release 1 and Internet Explorer 6 only. If you use it with another browser, please let us know.

Future plans for Giblet, including the dynamic loading of schemata, can be found on the Giblet wiki page.

We'd love Giblet to get used. If it does not do exactly what you need, or you feel it could be improved, please let me know by e-mailing me at james dot carlyle at semanticplanet.com.

9 Comments

  1. Very nice. Should be very useful.

    I did find it confusing at first, not altogether obvious what was going on, though after a few minutes started getting the hang of it.

    I'm not sure, something like a continuously updated NTriples or even a tree rendition of what had already been entered and where you are might be helpful.

    No obvious bugs after about 10mins in Firefox 1.0 PR on Win2k.

    Don't think I've seen a "Save to Cookie" button before…

    What busy bees. Keep up the good work.

    Comment by Danny — 21 Oct 2004 @ 11:10 am

  2. PS. there may be something useful in the DOM twangling here:
    http://semtext.org/2004-02/slides/w6.html

    (seems a bit broken in latest FF, should be ok in IE6)

    Comment by Danny — 21 Oct 2004 @ 11:16 am

  3. PS. there may be something usable around the DOM-twangling here:

    http://semtext.org/2004-02/slides/w6.html

    (seems broken in latest FF, should be ok in IE6)

    Comment by Danny — 21 Oct 2004 @ 11:19 am

  4. Um, where is it? I don't see any link in the story.

    Comment by Dave Beckett — 22 Oct 2004 @ 10:51 am

  5. Danny,

    Thanks for your feedback. I've added a line or two to get people started, but clearly need to think about how to make it more user-friendly. We knew that for this to be presented as a FOAF or DOAP (or W6!) editor we would have to think about labelling, help text etc. a great deal.

    We could add an n-triples view next to the RDF/XML view with no problems - this would be continually updated and the user could toggle this on or off. I don't really want to introduce a tree view, because this is a graph and there is no notion of tree root, unless we ask the user to specify one of the resources as being the root. Even then they might pick a resource that did not link to all the other resources - in fact the tool could be used to create several trees.

    Comment by James Carlyle — 22 Oct 2004 @ 11:14 am

  6. Dave,

    Errr - you're right. The original posting should have had a link to launch it, instead of being hidden in the Wiki page. Sorry. I've fixed the posting.

    Comment by James Carlyle — 22 Oct 2004 @ 11:17 am

  7. Danny

    I really like the W6 garland approach and was thinking of putting the required classes and properties in a config file to allow Giblet to be used to create and edit W6 garlands or chains of garlands, and having a "sample W6" button alongside the "minimal FOAFnet" button. Let me know if I can do this.

    Comment by James Carlyle — 22 Oct 2004 @ 1:03 pm

  8. James - sorry, I think I was going to respond and then got distracted…

    Yep, please do what you can with W6. I'm open to suggestions for modifications too, if there's anything that don't look right.

    I'd be grateful if you could post a note somewhere about where the code does its pre-loading of classes/properties, I fancy trying it with a project vocab.

    Comment by Danny — 28 Oct 2004 @ 9:07 pm

  9. Danny

    Sorry for not getting on with W6. I have been busy with a database backend for Carp (see other email) and have some clear ideas for Giblet, not implemented yet, and somewhat contradictory. One is to allow the dynamic loading of RDFS schemas, looking at how some schemas depend on others. Another is thinking about how an absolute beginner would like to generate RDF statements. I suppose that a beginner-friendly skin would wrap everything else under the covers, but might still use a dynamic schema loading facility without the user ever being aware.

    I could post a note about how the current preloading works, but would prefer to work on the dynamic loading ability and then write about that. I'll have a much better idea after thinking about it over the weekend, if you can wait till Monday.

    James

    Comment by James Carlyle — 29 Oct 2004 @ 10:59 am

Site Redesign

If you can read this then your DNS has picked up our server change. We've also given the site a new look, retired some areas that were past their sell-by date and dusted off a few things that were lurking in the background.

Announcing Semantic Planet's RDF Lib and Carp

James and I are pleased to announce the first public release of RdfLib and Carp, which are .NET/Mono libraries for fetching, parsing, munging and writing RDF. I first mentioned the existence of these in my talk at FOAF Galway and both are used to run our FriendSpace experiment. Both libraries are released under a liberal, attribution-only, open source license. We'd love it if you wanted to use either or both in your applications. If you do use them, please let us know - we're keen to help as much as possible.

SemPlan.RdfLib (binaries / source / docs) provides foundation RDF services for other applications such as parsing and writing RDF. The SemPlan.RdfLib.Core namespace contains interfaces defining fundamental RDF concepts such as UriRef, BlankNode etc. It also contains a Parser interface with a number of associated Factory interfaces to get hold of parsers, resources and statements.

We ship three parsers: XsltParser which is based off of James’ XSL stylesheet RDF parser; DriveParser which is a wrapper around the pure .NET Drive Parser and ICalParser which wraps SemaView's iCal parser. All of these implement the same interface and so are completely interchangeable. It should be very little effort to produce other RDF parser bindings.

The SemPlan.RdfLib.Utility namespace contains some handy implementations of the RdfWriter interface (RdfXmlWriter and NTripleWriter), plus a SimpleModel and SimpleDereferencer, both of which are suitable for quick hacking.

SemPlan.Carp (binaries / source / docs) uses RdfLib and conceptually sits in a layer above it. Carp stands for Convenient API for RDF Programming and is designed to provide a simple API for programming with RDF without losing the power of the underlying model.

The heart of Carp is the KnowledgeBase. You can include RDF from arbitrary locations very simply:

KnowledgeBase knowledge = new KnowledgeBase();
knowledge.include("http://www.semanticplanet.com/index.rdf");
knowledge.include( new StreamReader("./bloggers.rdf"), "");

By default, a KnowledgeBase uses the XsltParser but you can supply any ParserFactory in its constructor to customise that behaviour. In Carp, one of the design principles is to make simple things really simple and hard things possible so most constructors and quite a few methods offer overloaded versions with sensible defaults.

The KnowledgeBase maintains a ResourceDescription for every node in the input documents. A ResourceDescription is a set of properties and values attached to a uriref or a blank node. Each value is a further ResourceDescription with its own properties and values. You can ask any KnowledgeBase for a description of a node like this:

ResourceDescription description = knowledge.getDescriptionOf("http://www.semanticplanet.com/");

Or, you can iterate through all the descriptions:

foreach( ResourceDescription description in knowledge) {
  Console.WriteLine( description.getAboutLabel() );
}

The KnowledgeBase also offers some basic, but useful, querying methods such as findByPropertyValue and findByType. You can also write out the KnowledgeBase as RDF by passing an RdfWriter to its write method. Here's a FOAF aggregator:

KnowledgeBase knowledge = new KnowledgeBase();
knowledge.include("http://iandavis.com/foaf.rdf");
knowledge.include("http://www.takepart.com/about/foaf.rdf");
    
RdfXmlWriter writer = new RdfXmlWriter( new XmlTextWriter( Console.Out ) );
knowledge.write( writer );    

The ResourceDescription class is probably the next most important after KnowledgeBase. Once you've got hold of one you can get the values of its properties using the [ ] indexer notation. Property values are always returned as a ResourceDescriptionList which you can enumerate with foreach, index into with [ ] notation or just use the first() method to get the first value. For example:

ResourceDescriptionList topics = doc["http://xmlns.com/foaf/0.1/topic"];
foreach( ResourceDescription topic in topics ) {
  Console.WriteLine( topic.getAbout().getLabel() );
}

or, more succinctly,

foreach( ResourceDescription topic in doc["http://xmlns.com/foaf/0.1/topic"] ) {
  Console.WriteLine( topic );
}

Calling WriteLine with a ResourceDescription argument like above invokes ToString on the ResourceDescription. This, in turn, returns the label of the node the ResourceDescription is about. The label is either the uri of a uriref, a generated blank node label, or the string value of any literal.

Setting properties is as easy as getting them for a ResourceDescription:

ResourceDescription person = new ResourceDescription();
person["http://xmlns.com/foaf/0.1/name"] += "Ian Davis";
person["http://xmlns.com/foaf/0.1/interest"] += new ResourceDescription( new UriRef("http://example.com"));

Note that in the spirit of RDF we're adding values to a list of values for each of the properties. Copying from one ResourceDescription to another is easy too:

ian["http://xmlns.com/foaf/0.1/interest"] += james["http://xmlns.com/foaf/0.1/interest"];

There's a lot more to core Carp than just these two classes, such as investigators that encapsulate algorithms for discovering more RDF about resources, a caching dereferencer with etag and if-modified-since support, and an early implementation of RDF Schema inference rules. However, I want to finish by describing the SemPlan.Carp.Vocabularies namespace. This, as you might guess, contains classes for using common RDF vocabularies. Carp comes with FOAF, RSS 1.0, RDFS and a partial RdfCal implementation. If you want more, there's an example application in the source distribution called VocabGenerator that can generate C# classes compatible with Carp from RDF Schemas.

To use a Carp vocabulary class, just add using SemPlan.Carp.Vocabularies; to your class. Then you can address any FOAF property by its name:

ResourceDescription person = new ResourceDescription();
person[ Foaf.name ] += "Ian Davis";
person[ Foaf.interest ] += new ResourceDescription( new UriRef("http://example.com"));

The vocabulary classes also provide a typed form of ResourceDescription for each class defined in the schema. Each typed ResourceDescription has shortcut properties inferred from the schema that make it even easier to manipulate RDF data:

foreach( Foaf.Person person in james.knows ) {
  Console.WriteLine( person.name + " (" + person.mbox + ")" );
}

Typed ResourceDescriptions have their own collections, addressed as Agent.Collection, and contain implicit casts to and from ordinary ResourceDescriptions and ResourceDescriptionLists. Creating a new typed ResourceDescription creates a pattern which you can use to query the knowledge base for all nodes of that type.

Foaf.Agent pattern = new Foaf.Agent();
foreach (Foaf.Agent agent in pattern.findAllMatching( knowledge ) ) {
  // ... do something with agent
}

I'll leave you for now with one of the examples, an RSS aggregator that uses the same format of blogroll as Planet RDF:

KnowledgeBase bloggers = new KnowledgeBase(parserFactory);
bloggers.include(new StreamReader("./bloggers.rdf"), "");
Foaf.Agent pattern = new Foaf.Agent();

foreach (Foaf.Agent agent in pattern.findAllMatching( bloggers ) ) {
  Console.WriteLine( agent[Foaf.name] );
  foreach (Foaf.Document doc in agent.weblog) {
    Console.WriteLine( "Weblog: " + doc );

    KnowledgeBase rss = new KnowledgeBase();
    rss.investigate( doc , new SeeAlsoPropertyInvestigator());

    Rss.Item itemPattern = new Rss.Item();
    foreach (Rss.Item item in itemPattern.findAllMatching( rss ) ) {
      Console.WriteLine( "  - " + item.title );
      Console.WriteLine( "    " + item.description );
    }
  }          
}

3 Comments

  1. I've been playing around with it, and it looks pretty nice! I have a couple of questions:

    Why use camelCase for public methods instead of StudlyCaps (which is the C# convention)? This makes a lot of the classes not self-consistent when they need to implement interfaces, like KnowledgeBase which has a property named "Count" (uppercase C) and another named "verbose" (lowercase v). It makes it quite a bit more difficult to learn the API when you have to keep two conflicting public conventions in mind, and you have to keep the interfaces they implement in mind to make a logical distinction.

    The other question is about the license. The first clause contains what appears to be a BSD-style advertising requirement, which might not be compatible with the GPL. No question you deserve attribution for your work, but is GPL-compatibility important to you? What constitutes a "product"? Would a non-commercial open source project which uses your software count? What if that project were included in a commercial product like a Linux distribution? A lot of the open source licenses out there cover these issues; you might want to consider relicensing under one of them for both your own legal benefit, and to make it clear to potential users of your library how exactly they may use it.

    Thanks,
    Joe

    Comment by Joe Shaw — 20 Oct 2004 @ 1:29 pm

  2. Joe, I agree the naming conventions are a little messy. The story behind it is that the intention was to provide properties and classes that follow the RDF conventions, i.e. StudlyCaps types and camelCase properties. When using Carp, the average user will rarely need to access methods on the core objects - most code will use the indexers and RDF vocabulary properties. Then I realised that fundamental properties and methods such as Count and ToString needed to follow framework conventions. So the convention is that all methods and properties are camelCase, unless they override a standard framework method. We're only at version 0.1, so if there's enough pressure to change then I'm open to making a big change across the whole API.

    Comment by Ian Davis — 21 Oct 2004 @ 2:37 pm

  3. I'm not sure I understand why the simple BSD-style license is not GPL compatible - can you provide a pointer to any discussion of this?

    Comment by Ian Davis — 21 Oct 2004 @ 2:38 pm

Topics

As part of our work on FriendSpace, we've been thinking about the relationships between agents, the things they create and their interests. What's particularly interesting to me is whether or not particular "topics" can be discovered from the information each person includes in their FOAF and other documents.

FOAF has an interest property which can be used to express an interest like this:

<foaf:interest
  rdf:resource="http://example.com/document"/>

In the FOAF schema, the range of interest is a foaf:Document, which means that the application can conclude that http://example.com/document is a document. You could additionally specify a topic for the document like this:

<foaf:interest>
  <rdf:Description rdf:about="http://example.com/document">
    <foaf:topic rdf:resource="http://example.com/topic"/>
  </rdf:Description>
</foaf:interest>
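
The range-based conclusion described above (the object of foaf:interest must be a foaf:Document) can be sketched in a few lines. This is only an illustration, not part of FOAF or of any particular toolkit: triples are modelled as plain Python tuples, and the function name infer_types and the RANGES table are assumptions made for the example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical table of property ranges; only foaf:interest is listed here.
RANGES = {FOAF + "interest": FOAF + "Document"}


def infer_types(triples, ranges):
    """For every triple whose predicate has a known range,
    conclude that the object is an instance of that range."""
    return {(o, RDF_TYPE, ranges[p]) for s, p, o in triples if p in ranges}


triples = [
    ("http://example.com/me", FOAF + "interest", "http://example.com/document"),
]
inferred = infer_types(triples, RANGES)
```

Here inferred contains a single rdf:type triple stating that http://example.com/document is a foaf:Document.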

FOAF specifies the range of foaf:topic to be a Resource, i.e. it could be anything. Conceptually this looks like:

A graph depicting a node labelled Agent, a node labelled Document and a node labelled with a question mark. The Agent node has an arrow pointing to the Document node labelled interest. The Document node has an arrow pointing to the question mark node labelled topic, and the question mark node has an arrow pointing back to the Document node labelled page.

I've also shown how the topic resource can refer back to the document, using the page property. There is an implication here that the agent is "interested" in the "topic", but this has to be a different sense of "interest" from that specified by FOAF, since the range of interest is a document and one can certainly be interested in things that aren't documents. FOAF does in fact provide the necessary property: topic_interest. However, the spec describes this property as follows: "The foaf:topic_interest property is generally found to be confusing and ill-defined and is a candidate for removal. The goal was to be link a person to some thing that is a topic of their interests (rather than, per foaf:interest to a page that is about such a topic)." Some of the confusion may be caused by the spec itself, which appears to be inconsistent, referring to both topic_interest and interest_topic. Nevertheless, I think there's a case for reintroducing this property. More on this later.

You can also specify the topic of a weblog in a similar fashion:

<foaf:weblog>
  <rdf:Description rdf:about="http://example.com/weblog">
    <foaf:topic rdf:resource="http://example.com/topic"/>
  </rdf:Description>
</foaf:weblog>

A graph depicting a node labelled Agent, a node labelled Document and a node labelled with a question mark. The Agent node has an arrow pointing to the Document node labelled weblog. The Document node has an arrow pointing to the question mark node labelled topic, and the question mark node has an arrow pointing back to the Document node labelled page.

What does this imply about the agent? There's certainly some level of "interest" in the "topic" since they're maintaining a weblog about it. However the implication is not, in my opinion, as strong as the previous one.

In his WordPress hack for RSS 1.0 output, Morten Frederiksen includes a foaf:topic for each rss:item like this:

<rss:item rdf:about="http://example.com/item">
  <foaf:topic rdf:parseType="Resource">
    <dc:title>topic name</dc:title>
    <foaf:page rdf:resource="http://example.com/document"/>
  </foaf:topic>
  <foaf:maker>
    <foaf:Person>
      <foaf:name>name</foaf:name>
    </foaf:Person>
  </foaf:maker>
</rss:item>

This results in a graph similar to this:

A graph depicting a node labelled Agent, a node labelled item, a node labelled Document and a node labelled with a question mark. The item node has an arrow pointing to the Agent node labelled maker and an arrow pointing to the question mark node labelled topic, and the question mark node has an arrow pointing to the Document node labelled page.

Again, it might be reasonable to conclude that the agent has a foaf:interest in the document, and a general abstract "interest" in the "topic".
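
As a rough sketch of that conclusion, the rule "if an item has a maker and a topic, and the topic has a page, then the maker has a foaf:interest in that page" could be written as below. The tuple representation, the blank-node labels and the function name infer_interests are all assumptions for illustration; this is not how any existing FOAF tool works.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def infer_interests(triples):
    """Apply the rule: item maker A, item topic T, T page D => A interest D."""
    inferred = set()
    for item, p1, agent in triples:
        if p1 != FOAF + "maker":
            continue
        for item2, p2, topic in triples:
            if item2 != item or p2 != FOAF + "topic":
                continue
            for topic2, p3, page in triples:
                if topic2 == topic and p3 == FOAF + "page":
                    inferred.add((agent, FOAF + "interest", page))
    return inferred


# The rss:item example above, flattened into triples (blank nodes as "_:" labels).
triples = [
    ("http://example.com/item", FOAF + "topic", "_:topic1"),
    ("_:topic1", FOAF + "page", "http://example.com/document"),
    ("http://example.com/item", FOAF + "maker", "_:person1"),
]
interests = infer_interests(triples)
```

The nested loops are deliberately naive; a real store would index triples by subject and predicate.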

Can this pattern be extended into other types of document such as calendars? The RDF calendar effort is an attempt to mirror the iCalendar specification in RDF. Potentially one could include a foaf:topic property for each calendar event and a maker for the calendar as a whole. The result would be something like:

A graph depicting a node labelled Agent, a node labelled Vcalendar, a node labelled Vevent, a node labelled Document and a node labelled with a question mark. The Agent node has an arrow pointing to the Vcalendar node labelled made. In return the Vcalendar node has an arrow pointing back to the Agent node labelled maker. The Vcalendar node also has an arrow pointing to the Vevent node labelled component. The Vevent node has an arrow pointing to the question mark node labelled topic. The question mark node has an arrow pointing to the Document node labelled page.

You can currently assign categories to events but these are text literals not URIs so they don't map cleanly to foaf:topic. A possible hack is to use URIs as category names and coerce the iCal/RDF converter to insert a foaf:topic property if the category looks like a URI. It is a hack though and there's precious little RDF calendar data out there as it is and virtually no tools to support it.
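
To make the hack concrete, here is a minimal sketch of the "category looks like a URI" test. It is not taken from any real iCal/RDF converter: the function names, the example event URI and the choice to accept only http and https URIs with a host are assumptions for the example.

```python
from urllib.parse import urlparse

FOAF_TOPIC = "http://xmlns.com/foaf/0.1/topic"


def looks_like_uri(category):
    """Heuristic (an assumption): an http(s) scheme plus a host counts as a URI."""
    parts = urlparse(category.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def topic_triples(event_uri, categories):
    """Return a foaf:topic triple for each category that looks like a URI;
    plain-text categories are ignored and left as literals elsewhere."""
    return [
        (event_uri, FOAF_TOPIC, category.strip())
        for category in categories
        if looks_like_uri(category)
    ]


triples = topic_triples(
    "http://example.com/calendar#event1",
    ["Meetings", "http://example.com/topic"],
)
```

Only the second category produces a foaf:topic triple; "Meetings" stays an ordinary text category.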

I think, in general, it is possible to use FOAF to assign topics to entities people have made or are responsible for. However, there are some holes in the FOAF interest model. How is one to declare an "interest" in a "topic"? There is the foaf:topic_interest property, but its meaning is confusing. This area is made difficult because the foaf:interest property is squatting on a broad area of semantics, which makes it hard to introduce new terms. The property has a restricted meaning but uses a very general English word for its name. The FOAF spec defines this property as "A page about a topic of interest to this person", so it should really be called interestInWhateverThisPageIsAbout - or maybe something more succinct :).

My suggestion is to rename the foaf:topic_interest property to foaf:topicOfInterest with the definition: "A topic that this person expresses an interest in." Then, for completeness, I'd also suggest the introduction of a foaf:Topic class. The domain of foaf:topicOfInterest and foaf:topic would be foaf:Topic. The range of foaf:topicOfInterest would be foaf:Agent. I'd also like to rename the foaf:interest property, but it's in widespread use so renaming might be hard. A pragmatic approach would be to deprecate it and introduce a less semantically overloaded property.

Update: This is what I suggested in a message to the rdfweb-dev list:

My suggestion is to rename the foaf:topic_interest property to foaf:topicOfInterest or foaf:interestInTopic with the definition: A topic that this person expresses an interest in. Then, for completeness I'd also suggest the introduction of a foaf:Topic class. The domain of foaf:topicOfInterest and foaf:topic would be foaf:Topic. The range of foaf:topicOfInterest would be foaf:Agent.

A better name for foaf:interest is harder; in my posting I suggest, rather tongue in cheek, foaf:interestInWhateverThisPageIsAbout. Maybe pageAboutInterest or interestPage or, to parallel a suggestion above, interestInPageTopic (i.e. bob has interest in this page's topic).


A basic RDF XML validation stylesheet

By James Carlyle.

I've put together a basic XSLT 1.0 stylesheet to validate RDF in XML. It has been tested against all the simple negative tests in the W3C RDF test suite. It produces no output for all the positive tests (i.e. it finds them all valid), and produces output for all the negative tests (i.e. it finds them all invalid).

The output from the stylesheet looks something like this:

# ERROR <rdf:ID rdf:resource="http://example.org/node2"/>
# ERROR rdfms-rdf-names-use/error013.rdf : ID is forbidden as a property element name.

There is a simple error message (taken from the test case) and a snippet of the original document to help provide some context. The checks are specified using XPath expressions. If I've misunderstood the RDF/XML syntax specifications, these will be incorrect or insufficient, so please email me if you think they could be improved.

The motivation for this work was to improve the capabilities of the XML to NTriples stylesheet that I wrote last year. At the time I said that it had no validation tests and that they could be added either in the same stylesheet or in a separate one. I decided to separate the validation into a preprocess step in order not to muddy the already complex NTriples stylesheet further, and to allow use as a standalone item. There are some further details (and a link to the stylesheet itself) on the Semantic Planet Wiki here.
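
To give a flavour of the kind of check involved, here is a small stand-in written in Python rather than XSLT. It is not the stylesheet itself and covers only one rule: element names that the RDF/XML syntax specification excludes from both node elements and property elements. The function name, the message wording and the sample document (modelled on the snippet in the output above) are assumptions for the example; unlike the real messages, it does not say whether the offending element was a node or a property element.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Core syntax terms and old terms, which may not be used as element names
# (rdf:RDF is only allowed as the document root, which is skipped below).
FORBIDDEN = {
    "RDF", "ID", "about", "parseType", "resource", "nodeID", "datatype",
    "aboutEach", "aboutEachPrefix", "bagID",
}


def find_forbidden_elements(rdf_xml):
    """Return one error message per element whose name is a forbidden RDF term."""
    root = ET.fromstring(rdf_xml)
    prefix = "{" + RDF_NS + "}"
    errors = []
    for element in root.iter():
        if element is root:
            continue  # the root is expected to be rdf:RDF
        if element.tag.startswith(prefix):
            local = element.tag[len(prefix):]
            if local in FORBIDDEN:
                errors.append("ERROR rdf:" + local + " is forbidden as an element name")
    return errors


SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="http://example.org/node1">
    <rdf:ID rdf:resource="http://example.org/node2"/>
  </rdf:Description>
</rdf:RDF>"""

errors = find_forbidden_elements(SAMPLE)
```

A valid document yields an empty list, mirroring the stylesheet's "no output means valid" behaviour.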

Comments

No comments yet.

Comment Spam

We, like many others, have been inundated with comment spam in recent weeks. Dealing with this is a complete waste of our time so we've had to make some changes. We're never going to be able to prevent spammers from posting comments. Every method I've seen suggested removes functionality that is genuinely required for comment systems - we should encourage the publishing of links, not prevent it, and anonymous posting is required in certain circumstances.

So, we have to let the spammers in and deal with the mess they make. And rather than relying on the bottleneck of James and me, we need to open up so that everyone can contribute to the quality of the content here. So, we've decided to use the wiki for commenting in future. Wikis get spammed but the spam doesn't hang around long because anyone can delete it. Problem solved and we get the benefit of integration into other comments and postings, free linking, formatting, revisions, etc. It's basic at the moment, but should be easy to extend to copy some of the posting content across to the wiki page to provide some context. We've left trackbacks alone because we're not getting any trackback spam yet…

Update October 2004: switched to WordPress and now using a security keyword approach to comments which seems to have eliminated all comment spam.

1 Comment

  1. wedding blogs and wikis
    it's a vision of the future that, frankly, brings tears to my eyes. the original idea came as a solution…

    Trackback by <dykes.d().digital> — 11 Apr 2004 @ 9:32 pm

Nearly Round-Tripping

Well, it's nearly there. The PHP implementation now supports the generation of elements given only the URI (as opposed to supplying the namespace and local parts separately) which was the major blocker on getting this working. I've extended the spec to allow NodePaths to be specified in the rt:name attribute of rt:element. They're distinguished from string values by enclosing them in curly brackets {} just like in XSLT. I'll do the same for attributes too. There's still some work to do on the actual XML generation around the namespace prefixes but it's essentially all there. The stylesheet itself would be cleaner with some conditional logic in there - that will come soon too.

Here's the stylesheet:

<rt:stylesheet xmlns:rt="http://purl.org/vocab/2003/rdft/"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
  >
  <rt:output rt:method="xml" rt:indent="yes"/>

  <rt:root-template>
    <rdf:RDF>
      <rt:for-each rt:select="~subject()">
        <rdf:Description>
          <rt:attribute rt:name="about" rt:namespace="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
            <rt:value-of rt:select="label()" />
          </rt:attribute>

          <rt:for-each rt:select="node()/resource()">
            <rt:element rt:name="{label()}">
              <rt:for-each rt:select="resource()/resource()">
                <rt:attribute rt:name="resource" rt:namespace="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
                  <rt:value-of rt:select="label()" />
                </rt:attribute>
              </rt:for-each>
              <rt:for-each rt:select="resource()/literal()">
                  <rt:value-of rt:select="label()" />
              </rt:for-each>
            </rt:element>
          </rt:for-each>
          
        </rdf:Description>
      </rt:for-each>
    </rdf:RDF>
  </rt:root-template>
</rt:stylesheet>
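
The curly-bracket convention for rt:name described above is easy to picture in code. The following is only a sketch of the distinction, not the PHP implementation: the function name parse_name and the returned tuples are assumptions for the example.

```python
def parse_name(value):
    """Classify an rt:name value: '{...}' is a NodePath, anything else a string."""
    if len(value) >= 2 and value.startswith("{") and value.endswith("}"):
        return ("nodepath", value[1:-1])
    return ("string", value)


element_name = parse_name("{label()}")   # as in <rt:element rt:name="{label()}">
attribute_name = parse_name("about")     # as in <rt:attribute rt:name="about" ...>
```

The first call yields the NodePath label() to be evaluated against the current node; the second is used verbatim as the name.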

Comments

No comments yet.

Namespace Handling

After a bit of a breather, I'm back working on RDF Templates. My current focus is getting namespace handling to work correctly. This is a very small spec change but quite a lot of work in the PHP implementation. I have rudimentary handling sorted out now, but I'm ignoring some of the finer points such as unprefixed namespaces. These will come along shortly.

Along the way I've added an rt:output element which is the RDFT analog of xsl:output. It lets you specify options such as XML or text output. I'm gradually edging towards the RDF syntax round-tripping milestone.

After namespace handling is polished off I need to look at using RDFT elements to generate the values of attributes. This should be fairly simple because the current attribute implementation uses an rt:text action for the value anyway, it's just an extension of that to make it more explicit.

Comments

No comments yet.

Launch of sw-announce Mailing List

sw-announce is a moderated announcements-only mailing list. The list is intended to efficiently communicate news relevant to the Semantic Web and related technologies. This includes, but is certainly not limited to, metadata, ontologies, RDF, knowledge management, AI, electronic agents, and semantic web services.

Comments

No comments yet.

Category RSS Feeds and RDF Versions

Each category now has its own RSS 1.0 feed linked at the top of the category archive (e.g. the RDF Templates Design Notes category). Additionally, most pages now have RDF Versions with full machine-readable metadata.

Comments

No comments yet.

Resource Condition Extension

Currently resource conditions in a nodepath restrict you to a single arc and node pattern, e.g. resource()[arc()/literal() = literal('hello')]. I've relaxed this constraint so that the selection part of the condition can be any arc-matching nodepath. This means that you can now write conditions that test the arcs on a node, e.g. resource()[arc() = resource('http://xmlns.com/foaf/0.1/weblog')] would match any resource with a foaf:weblog property, no matter what the value of that property is, or resource()[arc('http://xmlns.com/foaf/0.1/knows')/resource()/arc() = resource('http://xmlns.com/foaf/0.1/weblog')] that matches any resource that knows a resource with a foaf:weblog.

The new BNF for the spec is:

ResourceCondition            ::= ArcMatchingNodePath [ " = " NodeSpecifier ] 
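
To illustrate what the two new conditions select, here is a minimal sketch over a list of triples. It does not parse nodepaths and is not the PHP implementation; the tuple representation and the function names are assumptions, and literals and resources are both plain strings for brevity.

```python
FOAF = "http://xmlns.com/foaf/0.1/"


def resources_with_arc(triples, predicate):
    """Rough equivalent of resource()[arc() = resource(predicate)]:
    every subject with at least one arc of that predicate, whatever its value."""
    return {s for s, p, o in triples if p == predicate}


def resources_linked_to_arc(triples, link, predicate):
    """Rough equivalent of
    resource()[arc(link)/resource()/arc() = resource(predicate)]:
    every subject whose `link` arc leads to a resource that has `predicate`."""
    targets = resources_with_arc(triples, predicate)
    return {s for s, p, o in triples if p == link and o in targets}


triples = [
    ("http://example.com/alice", FOAF + "knows", "http://example.com/bob"),
    ("http://example.com/bob", FOAF + "weblog", "http://example.com/bob/blog"),
    ("http://example.com/carol", FOAF + "name", "Carol"),
]

with_weblog = resources_with_arc(triples, FOAF + "weblog")
knows_blogger = resources_linked_to_arc(triples, FOAF + "knows", FOAF + "weblog")
```

With this data, with_weblog holds only bob and knows_blogger holds only alice.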

Comments

No comments yet.

Arc Selection Syntax Reprise

OK, I implemented the proposed ArcPattern syntax change and decided that I didn't like it! However, I came up with an alternative that is even more expressive: introduce a specifier called arc() that acts in all ways like resource() but is to be used as the specifier in ArcPatterns. Here's how it would look:

<rt:root-template>
  <rt:for-each rt:select="~subject()">
    <rt:for-each rt:select="resource()/arc()">
      <rt:for-each rt:select="arc()/resource()">
        <rt:value-of rt:select="label()"/>
      </rt:for-each>
    </rt:for-each>
  </rt:for-each>
</rt:root-template>

For comparison, here's the original (i.e. current spec version):

<rt:root-template>
  <rt:for-each rt:select="~subject()">
    <rt:for-each rt:select="resource()/resource()">
      <rt:for-each rt:select="resource()/resource()">
        <rt:value-of rt:select="label()"/>
      </rt:for-each>
    </rt:for-each>
  </rt:for-each>
</rt:root-template>

The new way is clearer, I'm sure.

Comments

No comments yet.