The Community Engine Blog
News, tools, and analysis for innovating in the information economy
xFolk: An xhtml microformat for folksonomy
xFolk is an open xhtml microformat that allows users to publish their own folksonomy classifications for aggregation by the services they choose. As such, it starts to give back users control of their own data. A side benefit of xFolk is that users may designate explicit semantics for their folksonomy tags, providing a link to the more formal practices of information architects.
Sections: Emerging Practice News Tools and Analytics
Topics: folksonomy SXSW xFolk
Folksonomy is an emerging practice on the Internet where people tag digital artifacts (e.g., pictures, bookmarks) with their own labels and then share them. Of course, people in the real world have been tagging and sharing collections for years; consider record and coin collections with tags like classical, rock, jazz, Merovingian, etc. One factor that makes tagging on the Internet different is that items and their tags can be shared with the world at large very easily, allowing for an emergent understanding of how common, tagged items are viewed by a large number of people. Further, folksonomy tagging with web-based tools (e.g., flickr and del.icio.us) is also very easy.
As a result of the ease of tagging and sharing, folksonomy is spreading on the web like wildfire. Flickr, the folksonomy-based photo sharing service, has over 300,000 users, and Robert Scoble has signaled folksonomy tagging to Microsoft's senior management as an area requiring strategic investment.
Currently, all folksonomy data formats are proprietary and idiosyncratic to the service providing them, likely as a result of folksonomy's ad hoc development to date. There is no easy way to transfer one's data from one service to another. Further, there is currently no format for sharing or even expressing one's explicit understanding of the meaning of his/her own folksonomy tags. Tag and object tagged mutually and implicitly define each other without an explicit anchor point.
In this post I will motivate and sketch xFolk, a microformat for specifying and publishing folksonomies using components already part of xhtml. Since xhtml is already a well-established web standard, a compelling motivation for creating the xFolk microformat is that it will enable users to independently maintain their own folksonomy data while still being able to easily share it. A byproduct is that xFolk will allow those who wish to define explicit semantics for their folksonomy tags to do so, and these semantics will also be easily shared.
xFolk is inspired by Eric Meyer's and Tantek Çelik's panels at SXSW 2005 on emergent semantics and extending xhtml. My thoughts are at a very early stage and might even be termed a pre-proposal, but I am publishing them here because I think they are sufficiently developed to profit from community interaction.
Motivation and Expected Benefits
There are three motivations for xFolk.
xFolk can make folksonomy data independent of vendor implementation
Currently, all folksonomies exist in vendor specific formats and are stored on vendor sites. Store your bookmarks in del.icio.us and you have no standard way of exporting them en masse and reimporting them somewhere else. The RSS and html feeds, the obvious export method del.icio.us provides, are limited to 30 entries. As of this writing, I have 808 bookmarks and 61 separate tags. My only export option is screen scraping or the del.icio.us REST interface (with somewhat sketchy documentation).
Even having conquered data export with some work, I have no real way of getting my folksonomy data picked up by another service. Yes, I can publish my newly reacquired folksonomy data in a personal link blog, and this is something that many high profile bloggers, such as Robert Scoble and Jason Kottke do. However, as pointed out by Thomas Vander Wal and Liz Lawley, a key part of folksonomy's value is sharing. My tagging contribution aggregated with others creates a richer information space for all to explore. Isolated, individual link blogs do not provide this experience.
It seems clear to me that as a folksonomy contributor, I have an interest in getting my data shared in places where the people I want to share with go. It also seems clear to me that start-up folksonomy services have an interest in making it easy for people to put their data in. By systematizing folksonomy data representation in a standards-compliant xhtml microformat, xFolk has the potential to make sharing across services possible.
Some people want to provide explicit semantics for their folksonomy tags
People who take the care to tag an item with something other than “miscellaneous” generally have an idea of what they mean. Earlier this year, Richard MacManus provided a high profile demonstration of this in a weblog entry where he defined what he meant by the tag “Web 2.0” and analyzed its uptake in the blogosphere. Why not give people like Richard the opportunity to more easily and explicitly share their tag definitions (tag semantics) through a microformat?
More relevant data is generally better than less
Lou Rosenfeld has expressed the opinion that folksonomy might be useful as an input to the information architect's more formal task of creating controlled vocabularies to organize information spaces. Many, including Thomas Vander Wal and the IA Summit panel on the topic, have noted the ambiguity of folksonomy tags.
Allowing users the option to add explicit semantics regarding the meaning of their tags, if they so choose, has the potential to reduce ambiguity, as Richard MacManus felt compelled to do. Further, not all users need to provide explicit semantics for this to work. If users are tagging similarly and only a few of them provide explicit semantics, then those semantics can help inform inquiry into what the whole group means by a tag, and can also indicate the diverse ways people may conceive of a resource and a tag.
Defining xFolk as an xhtml microformat
In this initial draft definition of xFolk, I considered three use cases:
- Use Case 1 — The user wants to publish his or her folksonomy contributions in a way that can be easily crawled by or imported into multiple folksonomy services.
- Use Case 2 — The user wants to publish explicit semantics (definitions) for his or her tags.
- Use Case 3 — The user wants to contribute a particular artifact (picture, blog post, etc.) he or she has created to a folksonomy.
I'll sketch the elements of the format and then show how it can be used in the three use cases.
basic elements: the rel attribute and XMDP profile
The basic premise of xFolk is that the user is tagging a URL that serves as a permanent link to the item under consideration. The approach could be extended to URIs to add a level of abstraction, but I'm sticking with the simple version for now. Further I'm assuming that the user is using the standard <a> element commonly employed for creating links in xhtml documents. The URL could be pointing at anything: a document, a picture, or some physical item somewhere.
The question then becomes how to indicate the folksonomy tags that apply to the URL. In xFolk, I propose the "rel" attribute be used for indicating how the item pointed to by the <a> element should be classified. This is, quite defensibly, a valid interpretation of how the rel attribute qualifies URLs in XFN and VoteLinks, two existing xhtml microformats. In XFN, one classifies the item indicated in terms of acquaintance with its author. In VoteLinks, one classifies the item indicated in terms of endorsement. In xFolk, one classifies the item indicated in terms of concepts.
The key difference, of course, is that, in XFN and VoteLinks, the allowed attribute values are pre-agreed by a group and static. What I'm proposing here is essentially to make it possible to use XFN's and VoteLinks' same mechanism for specifying rel attribute values but without any sort of prior mutual agreement on the allowed classifications. Specifically, I'm proposing to let the classifications emerge at the user level, be explicitly defined by the user (or not), and then, at the users' discretion, be shared with the world using the microformat mechanism developed for XFN and VoteLinks.
Let me give a concrete example of how this might work for a document I myself recently tagged in del.icio.us:
<div class="folksonomyEntry">
  <p>
    <a
      href="http://www.hyperorg.com/blogger/mtarchive/003733.html"
      rel="folksonomy popularization KM communityCreation"
      title="Authors tags and topics">
      <span class="extended">
        This essentially reflects the view that one might be able to
        marry folksonomy to the topic maps idea of Public Subject
        Indicators. How would these trickle up from the bottom? How
        about clustering or other statistical aggregation?
      </span>
    </a>
  </p>
</div>
The interpretation of each component is as follows. In the <a> element, the href attribute points to the resource. The rel attribute gives my folksonomy tags for this resource as a space delimited list of terms. The title is what the original author called the resource but could also be an abbreviated version of that. The text inside the <span> element of class extended is my extended description of the item.
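To make that interpretation concrete, here is a minimal Python sketch of how a consuming service might read one entry. It is an illustration only, not part of the proposal: the names `FolksonomyEntryParser` and `parse_entry` are hypothetical, the extended description is shortened, and the sketch assumes attribute values are delimited with plain ASCII quotes and that the extended <span> contains no nested <span> elements.

```python
from html.parser import HTMLParser


class FolksonomyEntryParser(HTMLParser):
    """Collects href, title, tags, and extended text from one entry (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.href = None
        self.title = None
        self.tags = []
        self.extended = ""
        self._in_extended = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and self.href is None:
            self.href = attrs.get("href")
            self.title = attrs.get("title")
            # rel holds the folksonomy tags as a space-delimited list
            self.tags = (attrs.get("rel") or "").split()
        elif tag == "span" and "extended" in (attrs.get("class") or "").split():
            self._in_extended = True

    def handle_endtag(self, tag):
        # Assumes no nested <span> inside the extended description
        if tag == "span":
            self._in_extended = False

    def handle_data(self, data):
        if self._in_extended:
            self.extended += data


def parse_entry(markup):
    """Return the entry's components as a plain dictionary."""
    parser = FolksonomyEntryParser()
    parser.feed(markup)
    parser.close()
    return {
        "href": parser.href,
        "title": parser.title,
        "tags": parser.tags,
        # Collapse the line breaks and indentation of the markup
        "extended": " ".join(parser.extended.split()),
    }


ENTRY = """
<div class="folksonomyEntry">
  <p>
    <a href="http://www.hyperorg.com/blogger/mtarchive/003733.html"
       rel="folksonomy popularization KM communityCreation"
       title="Authors tags and topics">
      <span class="extended">How would these trickle up
      from the bottom?</span>
    </a>
  </p>
</div>
"""

entry = parse_entry(ENTRY)
print(entry["tags"])
```

The point of the sketch is that nothing beyond an ordinary HTML parser is needed: the tags fall out of a single `split()` on the rel attribute.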
We can note a few things from the example. First, I have used <div> and <span> elements with specialized class attributes to help disambiguate context. I'll have more to say on that shortly. Perhaps more fundamentally, how do we know that items appearing in the space-delimited list in the rel attribute are folksonomy tags?
Enter XMDP, a format developed by Tantek Çelik for specifying the interpretation of rel and possibly other attribute values. XMDPs are written in xhtml and kept in separate files that are then linked to published xhtml documents. A simple example using XFN will help illustrate the process and benefits. To use XFN, one links its XMDP to the xhtml document he or she is trying to publish. Tools like rubhub can then compare rel attribute values in the <a> tag with those defined in the XMDP, determine if a relationship is classified for the link, and then create a record of the classified relationship that is shared with others. In xFolk, we will use XMDPs in a slightly different way. Instead of predefining a list of values for the rel attribute along with definitions, the XMDP will list all of the values that have been used so far as folksonomy tags in the rel attribute. Once the xhtml document is published to the web, to know if a value in one of its rel attributes is meant as a folksonomy tag, one need simply compare that value with the definition terms in the XMDP.
All that is required for this to work is that the user keep and publish an updated XMDP file as new folksonomy tags are added (i.e., the user adds definition terms to the XMDP as he or she generates new folksonomy tags). The XMDP need only contain the definition terms (i.e., attribute values) used as folksonomy tags, not the folksonomy tag definitions (i.e., attribute definitions, also part of the XMDP spec) for this to work. The folksonomy tag definitions are a bonus that the user may add at his or her discretion.
Let me make this concrete by showing the XMDP definition list as it would need to appear for the example I presented earlier. The definition list is the key component of the XMDP (see Tantek Çelik's XMDP specification for the full details).
<dl class="profile">
  <dt id="rel">rel</dt>
  <dd>
    <p>
      <a rel="help" href="http://www.w3.org/TR/html401/struct/links.html#adef-rel">
        HTML4 definition of the 'rel' attribute.</a>
      Here are some additional values.
    </p>
    <dl>
      <dt id="communityCreation">communityCreation</dt>
      <dd>Helps me think about how to create online communities.</dd>
      <dt id="folksonomy">folksonomy</dt>
      <dd></dd>
      <dt id="KM">KM</dt>
      <dd>Knowledge management</dd>
      <dt id="popularization">popularization</dt>
      <dd></dd>
    </dl>
  </dd>
</dl>
Here's how to interpret this definition list. The main list is of class “profile”, meaning simply it is an XMDP definition list. The first definition term indicates that we are talking about rel attribute values. The associated definition (<dd>) element points to the official definition of the attribute. Then another definition list embedded in this definition provides the elements of the rel attribute that are to be interpreted as part of the folksonomy. Note that I provide definitions for only two of the tags, and one of those is simply the expansion of an abbreviation.
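As a rough illustration of the comparison step, the following Python sketch reads the nested definition list into a dictionary and then keeps only those rel values that appear in it. The names `XMDPTermParser` and `folksonomy_tags` are hypothetical, the profile is abbreviated, and the sketch assumes the tag terms sit in a <dl> nested exactly one level inside the profile <dl>, as in the list above. The value `nofollow` in the usage lines is just a stand-in for a rel value that is not a folksonomy tag.

```python
from html.parser import HTMLParser


class XMDPTermParser(HTMLParser):
    """Reads the nested <dl> of a profile into {term: definition} (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.definitions = {}
        self._dl_depth = 0
        self._field = None  # "dt" or "dd" while reading a nested term
        self._term = None

    def handle_starttag(self, tag, attrs):
        if tag == "dl":
            self._dl_depth += 1
        elif self._dl_depth == 2 and tag == "dt":
            self._field, self._term = "dt", ""
        elif self._dl_depth == 2 and tag == "dd" and self._term in self.definitions:
            self._field = "dd"

    def handle_endtag(self, tag):
        if tag == "dl":
            self._dl_depth -= 1
        elif tag == "dt" and self._field == "dt":
            # Register the term even if no definition follows it
            if self._term:
                self.definitions.setdefault(self._term, "")
            self._field = None
        elif tag == "dd":
            self._field = None

    def handle_data(self, data):
        if self._dl_depth != 2 or self._field is None:
            return
        if self._field == "dt":
            self._term += data.strip()
        else:
            self.definitions[self._term] += data.strip()


def folksonomy_tags(rel_value, profile_terms):
    """Keep only the rel values that the profile lists as folksonomy tags."""
    return [value for value in rel_value.split() if value in profile_terms]


PROFILE = """
<dl class="profile">
  <dt id="rel">rel</dt>
  <dd>
    <p>Here are some additional values.</p>
    <dl>
      <dt id="communityCreation">communityCreation</dt>
      <dd>Helps me think about how to create online communities.</dd>
      <dt id="folksonomy">folksonomy</dt>
      <dd></dd>
      <dt id="KM">KM</dt>
      <dd>Knowledge management</dd>
      <dt id="popularization">popularization</dt>
      <dd></dd>
    </dl>
  </dd>
</dl>
"""

parser = XMDPTermParser()
parser.feed(PROFILE)
parser.close()
definitions = parser.definitions
recognized = folksonomy_tags("folksonomy KM nofollow", definitions)
print(recognized)
```

Terms with an empty definition are still recognized as tags, which matches the idea that definitions are a bonus rather than a requirement.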
Maintenance of the XMDP file may be accomplished with some tedium by hand. In other words, every time you add a new rel attribute value that you want to use as a folksonomy tag, you quickly edit the XMDP to reflect the addition of this new value. Alternatively, one can develop automated tools, most likely integrated into blogging environments, that do this. Even at its current pre-proposal stage, it should be apparent that xFolk will be very amenable to tool development.
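One way such a tool might start is sketched below in Python. It is only a guess at the simplest useful behavior: given the terms already in the profile and the rel values used so far, it reports which tags are missing and renders empty definition entries for them. The function names `missing_terms` and `render_terms` are hypothetical, and writing the result back into the XMDP file is left out.

```python
from html import escape


def missing_terms(profile_terms, rel_values):
    """Return tags used in rel attributes that the profile does not list yet."""
    used = {tag for rel in rel_values for tag in rel.split()}
    # Case-insensitive ordering, as an assumed convention for the profile list
    return sorted(used - set(profile_terms), key=str.lower)


def render_terms(terms):
    """Render <dt>/<dd> pairs with empty definitions, ready to paste into the profile."""
    return "\n".join(
        f'<dt id="{escape(term, quote=True)}">{escape(term)}</dt>\n<dd></dd>'
        for term in terms
    )


profile_terms = ["folksonomy", "KM"]
rel_values = [
    "folksonomy popularization KM communityCreation",
    "folksonomy xFolk",
]

new_terms = missing_terms(profile_terms, rel_values)
print(render_terms(new_terms))
```

The user, or a blogging tool acting for the user, could then fill in any definitions worth sharing and leave the rest empty.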
disambiguating context: class attributes in <span> and <div>
The profile mechanism outlined by Tantek in conformance with the W3C guidelines provides some means for disambiguating context. If more than one profile is used, they receive precedence in the order they are listed. This helps with interpretation of the rel attribute values, but as illustrated in my example, one often wishes to include more data in the folksonomy entry than just the resource and the tags applied to it. This is where the <div> and <span> elements come into play.
The <div> element of class folksonomyEntry encloses one folksonomy entry. For an entry to be complete, it must have at least an <a> tag with href and title attributes. The rel attribute and everything else are optional. They have the interpretation I provided earlier.
One may wish to include additional elements like an extended description or a picture. In this case, <span> elements may be included of class “extended” and “picture” respectively.
Summary
At this point, it seems worthwhile to summarize xFolk in terms of required and optional components.
Required:
- An XMDP file that specifies rel attribute values that are to be interpreted as folksonomy tags.
- A <div> of class folksonomyEntry that encloses the entry.
- Exactly one <a> element within the folksonomyEntry <div> that points at the resource to be tagged. Within this element, only the href and title attributes are required. The rel attribute containing folksonomy tags is not required but strongly encouraged.
Optional:
- As just mentioned, the rel attribute with folksonomy tags in the <a> element is optional but strongly encouraged.
- <span> elements of classes extended and picture.
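The per-entry requirements in this summary are simple enough to check mechanically. The Python sketch below is one possible reading of them, with hypothetical names (`validate_entry`, `_AnchorCollector`) and made-up example.org URLs. It assumes it is handed the markup of a single entry and checks only the <a> rules; it does not verify that an XMDP file exists.

```python
from html.parser import HTMLParser


class _AnchorCollector(HTMLParser):
    """Records the attributes of every <a> element it sees."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchors.append(dict(attrs))


def validate_entry(entry_markup):
    """Return a list of problems; an empty list means the entry passes."""
    collector = _AnchorCollector()
    collector.feed(entry_markup)
    collector.close()

    problems = []
    if len(collector.anchors) != 1:
        problems.append(f"expected exactly one <a>, found {len(collector.anchors)}")
    for anchor in collector.anchors:
        for required in ("href", "title"):
            if not anchor.get(required):
                problems.append(f"<a> is missing a non-empty {required} attribute")
    return problems


good = '<div class="folksonomyEntry"><a href="http://example.org/" title="Example">x</a></div>'
untitled = '<div class="folksonomyEntry"><a href="http://example.org/">x</a></div>'

print(validate_entry(good))
print(validate_entry(untitled))
```

Note that a missing rel attribute is not reported, since rel is encouraged but optional.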
Illustration with use cases
Earlier I listed three use cases. Here is how the proposed microformat would work in each.
Use Case 1 — The user wants to publish folksonomy contributions
I also call this the link blog case, and it is exactly the example I gave in explaining the format. The user would simply publish many contributions in separate <div> elements of class folksonomyEntry. Folksonomy services could then crawl the pages and update their entries for that user.
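To suggest what the service side of this use case could look like, here is a small Python sketch that merges entries from several users' pages into a shared index of tag to URL to users. Fetching the pages is omitted, so the input is a dictionary of already-downloaded markup. The user names, the example.org URLs, and the names `PageEntryParser` and `aggregate` are all hypothetical, and for brevity the sketch treats every rel value inside an entry as a tag instead of checking each one against the user's XMDP.

```python
from collections import defaultdict
from html.parser import HTMLParser


class PageEntryParser(HTMLParser):
    """Collects (href, tags) for each <a> found inside a folksonomyEntry <div>."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._div_stack = []  # one flag per open <div>: True if it is an entry

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            self._div_stack.append("folksonomyEntry" in classes)
        elif tag == "a" and any(self._div_stack) and attrs.get("href"):
            self.entries.append((attrs["href"], (attrs.get("rel") or "").split()))

    def handle_endtag(self, tag):
        if tag == "div" and self._div_stack:
            self._div_stack.pop()


def aggregate(pages):
    """pages maps user -> page markup. Returns {tag: {href: set of users}}."""
    index = defaultdict(lambda: defaultdict(set))
    for user, markup in pages.items():
        parser = PageEntryParser()
        parser.feed(markup)
        parser.close()
        for href, tags in parser.entries:
            for tag in tags:
                index[tag][href].add(user)
    return index


pages = {
    "alice": '<a href="http://example.org/nav" rel="home">Home</a>'
             '<div class="folksonomyEntry">'
             '<a href="http://example.org/a" rel="folksonomy KM" title="A">A</a></div>',
    "bud": '<div class="folksonomyEntry">'
           '<a href="http://example.org/a" rel="folksonomy" title="A">A</a></div>',
}

index = aggregate(pages)
print(sorted(index["folksonomy"]["http://example.org/a"]))
```

Links outside an entry <div>, such as the navigation link on the first page, are ignored, which is the role the class attribute plays in disambiguating context.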
Use Case 2 — The user wants to specify folksonomy tag semantics
This use case is covered by the use of the XMDP file. The user would include definitions for the terms (tags) in the folksonomy.
Use Case 3 — The user wants to tag their own resource in a folksonomy
This is currently the case Technorati covers for individual blog entries. In xFolk, tagging your own blog posts or pictures is exactly the same as cataloging another person's contributions in your folksonomy. In the case of blog posts, one might imagine enclosing the title and abstract in a <div> of class folksonomyEntry and putting a <span> of class extended around the abstract. The <a> element could be wrapped around the post title with all of the requisite attributes.
Let me provide a concrete illustration of what I just said as it might apply to this very post:
<div class="folksonomyEntry">
  <h3>
    <a
      href="http://thecommunityengine.com/home/archives/2005/03/xFolk_an_xhtml.html"
      rel="folksonomy xFolk SXSW Emerging+Practice"
      title="xFolk: An xhtml microformat for folksonomy">
      xFolk: An xhtml microformat for folksonomy
    </a>
  </h3>
  <p class="abstract">
    <span class="extended">xFolk is an open xhtml microformat that
    allows users to publish their own folksonomy classifications for
    aggregation by the services they choose. As such, it starts to give
    back users control of their own data. A side benefit of xFolk is
    that users may designate explicit semantics for their folksonomy
    tags, providing a link to the more formal practices of information
    architects.</span>
  </p>
</div>
Note that I have put the post title in an <h3> element with the <a> element wrapping it. The <p> element of class abstract is purely there for formatting purposes. All other components are as explained before.
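A blogging tool could generate this markup from data it already holds for each post. The Python sketch below shows one way that might look. The function `render_entry` is hypothetical, and the rule that a tag may not contain whitespace is my assumption, following from rel being a space-delimited list (which is presumably why the multi-word section name appears as Emerging+Practice in the example).

```python
from html import escape


def render_entry(href, title, tags, abstract):
    """Render one folksonomyEntry <div> for a blog post (hypothetical helper)."""
    for tag in tags:
        # rel is space-delimited, so a tag containing whitespace would split in two
        if not tag or any(ch.isspace() for ch in tag):
            raise ValueError(f"tag must be a single non-empty token: {tag!r}")
    rel = " ".join(tags)
    return (
        '<div class="folksonomyEntry">\n'
        "  <h3>\n"
        f'    <a href="{escape(href, quote=True)}"\n'
        f'       rel="{escape(rel, quote=True)}"\n'
        f'       title="{escape(title, quote=True)}">{escape(title)}</a>\n'
        "  </h3>\n"
        '  <p class="abstract">\n'
        f'    <span class="extended">{escape(abstract)}</span>\n'
        "  </p>\n"
        "</div>"
    )


markup = render_entry(
    "http://thecommunityengine.com/home/archives/2005/03/xFolk_an_xhtml.html",
    "xFolk: An xhtml microformat for folksonomy",
    ["folksonomy", "xFolk", "SXSW", "Emerging+Practice"],
    "xFolk is an open xhtml microformat for publishing folksonomy classifications.",
)
print(markup)
```

Escaping the attribute values and text keeps the output well-formed even when a title or abstract contains characters such as & or quotation marks.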
Next steps
I could go on at length about potential implications, but that would be premature. What I really need at this point is feedback and suggestions for how to move this microformat forward. So, please feel free to comment, trackback, or tagback (just copy this link, xFolk, to your post to tagback) to this post with relevant, constructive feedback. Tell people you know who have an interest in folksonomy to have a look and comment. If you feel you have a substantial contribution to make and want to collaborate, drop me an email.
Let me pre-empt two potential issues that people might raise:
- The rel attribute is getting stretched well beyond its original purpose. That may be, but the only interpretation I can find for the rel attribute in its current uses is that it is metadata indicating what the thing at the other end of the link is. That's essentially the use I have made of it here.
- Can't you just do this with RSS or some other standard? Yes and no. To emulate this microformat in RSS, one would need to consider at least three flavors of syndication format (RSS 1.0, RSS 2.0, and Atom 0.x) and define the semantics for each. Further, RSS is a syndication format for things that change, not an archival format for more enduring things like a folksonomic classification. xhtml is a well-accepted archival format and therefore well suited to creating a microformat for folksonomy. This further suggests that there is no reason to create a completely new format using XML or other markup.
Technorati Tags: xFolk
Bud posted this on March 22, 2005
Trackback Pings
Listed below are links to weblogs that reference xFolk: An xhtml microformat for folksonomy:
» xFolk from a weblog whose name is unreadable
[Excerpt unreadable because of character-encoding damage; the legible fragments mention Picasa, Flickr, furl.net, and del.icio.us.] [Read More]
Tracked on April 2, 2005 01:29 AM
» Xfolks from Deakialli DocuMental
Lately there has been talk of a microformat called Xfolk that tries to unify the social indexing systems known as folksonomies and give them a common basis in xhtml.
First of all, in case anyone is already lost, folksonomies are systems... [Read More]
Tracked on April 28, 2005 06:15 AM
Comments
Interesting post to dig a bit more.
Some issues:
* Internationalization: a "tag" is not a word but an identifier. How do you correlate two different tags with the same meaning in different languages? (owl:sameAs)
rel="folksonomy popularization KM communityCreation"
with
rel="folksonomie GC CréationCommunauté"
* Internationalization ambiguity:
rel="pain" English... suffering
rel="pain" French... pain (bread)
* Two related identifiers:
rel="dog"
and
rel="dogs"
* Semantics scope
rel="cow"
which meaning between those four?
n 1: female of domestic cattle: "`moo-cow' is a child's term" [syn: {moo-cow}]
2: mature female of mammals of which the male is called `bull'
3: a large unpleasant woman
v : subdue, restrain, or overcome by affecting with a feeling of awe; frighten (as with threats) [syn: {overawe}]
We could also add the sacred animal in India, and many things of this type.
* Hierarchy of concepts
What if the user wants to create a hierarchy of concepts?
Something that might help you for ideas
http://www.w3.org/2004/03/thes-tf/primer/
http://www.w3.org/TR/grddl/
Posted by: karl at March 23, 2005 11:59 AM
I have looked over this a little bit this morning, but I don't think the rel attribute is used properly in the examples. The example supplies a definition "folksonomy" then the tags. It seems like Technorati Tags, but not using the attributes properly. It seems like it would make sense for Technorati Tags to evolve into something with this intention.
The blogging tools need to drastically change to provide consistency in this area. This information needs to be portable and put into some standard for it to work properly. Personally, I like the open API model that del.icio.us provides, as I have access to the information and can extract it for my own needs, as well as port it to another tool (should I ever want to). The centrally aggregated and socially shared components of del.icio.us are what make it powerful.
Another downside is privacy issues related to having your tags in the open. del.icio.us, Upcoming, and others are working hard to add private tags (personal or closed groups) to their offerings. This next step would be very hard to accomplish when coding in a web page.
A personal aggregation would be tough to manage, just because an entry is on a page that you and only you manage. This may change over time, and when it does, how do we or the tools know who posted what, and how can that be roughly verified?
Posted by: vanderwal at March 23, 2005 01:21 PM
Karl and Thomas, thanks for the remarks.
Karl, I'll have to think about the issues you bring up in terms of multilingual thesauri.
Thomas, you're right about the rel attribute, I think. The user experience, though, is not really part of this proposal. In order to have private tags, etc., you might consider publishing alternative xhtml pages and alternative XMDPs. This would be handled behind the scenes by the tool, not the user.
Posted by: Bud Gibson at March 23, 2005 03:15 PM