(Comment on LtU thread)
The goal of the current semantic technologies is to enable not merely the creation of semantic information, but also the automated processing of that information.
RDF is not just a notation: it’s also a data model (strictly speaking, RDF/XML or N3 or what have you are notations; triples and graphs are the data model). Given that data model, it’s possible to do some automated processing of semantic information by algorithmic means: graph traversal, for instance. The data model allows us to say things like, “in order to make valid inferences based on these statements, perform these operations on the graph made up by the triples representing the statements.”
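To make the graph-traversal point concrete, here is a minimal sketch in Python. It is not RDF proper, just plain tuples standing in for triples, and the file names, class names and the "type" and "subClassOf" predicates are all invented for illustration. The inference rule it applies is the familiar one: if a resource has a type, it also has every superclass of that type, which we find by walking the subClassOf edges.

```python
from collections import defaultdict, deque

# Hypothetical statements, written as (subject, predicate, object) triples.
TRIPLES = {
    ("report.doc", "type", "Invoice"),
    ("Invoice", "subClassOf", "FinancialDocument"),
    ("FinancialDocument", "subClassOf", "Document"),
    ("holiday.jpg", "type", "Photo"),
}


def build_index(triples, predicate):
    """Map each subject to the set of objects it reaches via `predicate`."""
    index = defaultdict(set)
    for subject, pred, obj in triples:
        if pred == predicate:
            index[subject].add(obj)
    return index


def inferred_types(triples, resource):
    """Return the stated types of `resource` plus all their superclasses."""
    types = build_index(triples, "type")
    parents = build_index(triples, "subClassOf")
    seen = set()
    queue = deque(types.get(resource, ()))
    while queue:  # breadth-first walk up the subClassOf edges
        cls = queue.popleft()
        if cls in seen:
            continue  # already visited; also guards against cycles
        seen.add(cls)
        queue.extend(parents.get(cls, ()))
    return seen


print(sorted(inferred_types(TRIPLES, "report.doc")))
```

Nothing here knows what an invoice is. The machine can conclude that report.doc is a Document only because somebody first wrote down, in triples, that invoices are financial documents and that financial documents are documents.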
The semantics of the semantic web, as currently understood and practised, represent a highly constrained subset of the semantics of what we might call ordinary knowledge construction. There are things that I can know and say about the contents of my file system that are fairly difficult to put into RDF triples (I don’t know whether there are any things I can know and say about the contents of my file system that it would be impossible to put into RDF triples).
To be more precise, the translation of the kinds of stuff human beings think they know, and the kinds of meanings they like to bandy about, into machine-processable semantic information generally entails a degree of (re-)formalization. We have not only to discover [the] structure inherent in the data, but also to derive a representation of that structure that will fit into our data model; and this is true even if the data model is claimed to support semistructured data.
The difficulty is then of the following kind: the process of formalizing semantic information so that it can be processed by an automaton is not itself automatable (or at least not by the same process that the machine will use to process the formalized semantic information; there might be some higher-order process, but the same problem would then apply at the higher level). The person entering data still has a job to do (apart from just typing the stuff in), and it is not necessarily an easier job than the job of the old-fashioned suit-wearing person who performs domain modelling and creates relational database schemas.
The (marketing) promise of the Longhorn FS has been that ordinary users will be able to transfer the things they know about the contents of their file systems into the machine, so that the machine will be able to do a variety of smart things with that information. The creation of better and easier-to-use tools for the (re-)formalization of human knowledge is, I think, a Good Thing; but there is an unfortunate tendency for such tools to be marketed as if they did the job themselves (or magically altered reality so that the job no longer needed to be done).
I would like to have, and could see myself benefiting from the use of, semantic technology in my file system. Even a few user-definable metadata tags that could be addressed by a straightforward query language would be useful. However, Google’s desktop search (which chucks semantics out of the window and does pure syntax-crunching text processing) is currently more useful to me than any existing semantic technology, and I think this is because it places less of an onus on me as the end-user to translate myself into automatonese. Google’s search engine just gets on and does what machines are good at doing. Semantic technologies want to be your friend.