Weinberger's Miscellany

Click to Listen to the Show (24 MB MP3)

[This show will record at 11:00 am Eastern so that Chris can attend an evening meeting.]

David Weinberger, one of the smartest of our many smart neighbors, has a new book about books and planets, Staples and Amazon, 20 questions and the periodic table, Carl Linnaeus and Melvil Dewey, data and metadata — about everything, in other words: Everything is Miscellaneous.

It’s hard to summarize his theory of everything in one sentence, but this is pretty close: “To get as good at browsing as we are at finding — and to take full advantage of the digital opportunity — we have to get rid of the idea that there’s a best way of organizing the world.”

Weinberger is the first to admit this is a mighty tall order. We were organizing the world (and, implicitly, privileging our particular organizing principles) long before Linnaeus and Dewey. As Weinberger explains, we’re basically hard-wired to organize all the atoms and planets we see: “We invest so much time in making sure our world isn’t miscellaneous in part because disorder is inefficient — ‘Anybody see the gas bill?’ — but also because it feels bad.” And Weinberger isn’t suggesting that we’re going to stop naming, sorting, or ordering things. In fact as “things” gallop exponentially into our lives we’ll end up doing it more. The trick is that we — not librarians, or book sellers, or photo editors, or other metadata misers — will be doing the sorting.

We at Open Source use — and celebrate — the new tagging tools on a daily basis. We’d have no photos on our site without Flickr and no way to easily share links without del.icio.us. We gaze at the Global Voices tag cloud and dream of the day when we’ll have one of our own.

But at the risk of seeming like a nostalgic prig, I wonder if anyone else out there is also fiending for the quaint numeric certainty of Dewey and his decimals. We know what we’re gaining when a photograph is tagged “beach,” “Phuket,” “galangal,” “Christmas,” and “singhabeer.” There’s a whole lot of potentially useful information in those tags, for one thing, and you can simultaneously file it under as many categories as you want. But is anything lost when it’s not called “P & P in Phuket, Christmas 2008?” When a photo has multiple names and infinite existences, and doesn’t let us pretend that, in this very 21st-century world, we can still exert 18th-century control?

David Weinberger

Fellow, The Berkman Institute

Author, Everything is Miscellaneous

Blogger, JOHO the Blog

Karen Schneider

Librarian, Florida State University

Blogger, Free Range Librarian

Tim Spalding

Founder, LibraryThing

Extra Credit Reading

David Weinberger

David Weinberger (co-author), 95 Theses, The Cluetrain Manifesto, created April 1999:

“#6. The Internet is enabling conversations among human beings that were simply not possible in the era of mass media.

#7. Hyperlinks subvert hierarchy.”

Beth Kanter, David Weinberger at NTC on Transparency, Beth’s Blog, April 6, 2007: “I asked David ‘Why is transparency important for nonprofits?’ Transparency helps with learning is the gist of the sound byte.”

Sony Cloward, Joho the Buttkicker: David Weinberger, NTEN, April 5, 2007: “Funny, insightful, academic, and down to earth, David led us through the evolution – and impact of that evolution – of content, ideas and organization from the physical (e.g. libraries, Encyclopedia Britannica, newspapers) to the digital (Amazon, Wikipedia, blogs).”

David Weinberger, Zero Tolerance for Humans, The Huffington Post, April 21, 2007: “We are dragging the process down, legitimizing the tactic, debasing understanding, and driving nuance out of the system. Frankly, taking McCain down a peg just isn’t worth it.”

Paul Gillin’s Blog, David Weinberger’s comments provoke thought and debate, Paul Gillin’s Blog, April 24, 2007: “I agree with David that this is the way the world is going. In an atmosphere in which information is freely available to everyone, the expert can no longer claim to be the final word on anything. He or she must admit to fallibility and derive influence from the ability to assimilate many facts”

The Happy Tutor, French Code of Blogger Conduct – Oui Oui!, Wealth Bondage, April 22, 2007: “One code that David implicitly observes is to be painfully literal… French code of conduct? David, did you ever read Derrida? Those French people are shifty.”

  • Christopher Tindall

    :first post ever:

    As I understand it, the metadata revolution that is the semantic web is in essence one to allow machines to recognize data for what it is. A person may see a picture and see a person and make connections as to who that person is or what that person is doing. No machine is yet sophisticated enough to perform this process adequately. Metadata, provided by humans, allows that data to become a database of searchable, significantly more useful pictures/data/connections.

    Perhaps I simply need to read the book, but I feel it should be made clearer whether what lies at the heart of this coming reorganization is the data itself, the applications we use to view or compile the data, or the interface through which we view it. Shouldn’t we focus on creating a more semantic web before we get caught up in the minutiae of the presentation of its data?

  • well, there may be a best way to organize as much of the universe as we have seen.

    if you believe in infinity, then it would be impossible for one data management system to manage all possibilities.

  • Cameron Brown

    1) I find it easy to believe that there is no “best way” to organize any non-trivial collection of objects. My professional experience with object-oriented programming has made this abundantly clear!

    2) It’s useful to consider the difference between a HIERARCHY (a collection based on inheritance) and a HETERARCHY (a collection based on tags). The web is teaching us the practical value of heterarchies over hierarchies for large collections of objects.

    Expanding on (1), object-oriented programming organizes objects into a hierarchy of ever-more-specialized objects. For example a programmer might create an object called “vehicle”. Below “vehicle” we might create multiple sibling objects that inherit any functionality that exists in “vehicle”. Let’s create “boat”, “car”, “plane”. Then below “boat” we might create “yacht”, “oil tanker”, etc.

    The problem is that the classification process becomes ever more subjective. E.g., should a “catamaran” be considered a “yacht”? Or should it perhaps be a child of “yacht” (i.e., a more specialized yacht)? Or then again maybe it should be a _sibling_ of yacht?

    The parallels to Linnaean taxonomy are obvious, with the endless debates about what does or does not constitute a new species, and the sometimes alarmingly arbitrary criteria used to sort individual creatures into buckets!

    As a result of these difficulties, computer science has explored other classification strategies, such as so-called “component systems”, which have some similarities to Flickr-style tagging strategies. Of course, these strategies come with their own set of problems… =]
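The contrast drawn in this comment between a hierarchy and a tag-based collection can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names follow the comment's vehicle example, while the tag vocabulary ("multihull", "cargo", "road") and the helper `items_tagged` are hypothetical and not taken from any real system.

```python
# Hierarchy: every class gets exactly one place in the tree.
class Vehicle:
    pass

class Boat(Vehicle):
    pass

class Yacht(Boat):
    pass

# Placing Catamaran forces a choice: child of Yacht, or sibling under Boat?
# This sketch picks "sibling", which is one arbitrary answer among several.
class Catamaran(Boat):
    pass

# Tags: an item carries any number of labels and none is privileged.
# The tag names below are hypothetical examples.
tags = {
    "catamaran": {"vehicle", "boat", "yacht", "multihull"},
    "oil tanker": {"vehicle", "boat", "cargo"},
    "car": {"vehicle", "road"},
}

def items_tagged(label):
    """Return the names of all items carrying the given tag, sorted."""
    return sorted(name for name, labels in tags.items() if label in labels)

print(issubclass(Catamaran, Yacht))  # False under this particular tree
print(items_tagged("yacht"))         # ['catamaran']
print(items_tagged("boat"))          # ['catamaran', 'oil tanker']
```

In the tree, the question "is a catamaran a yacht?" has to be settled once, for everyone. In the tag table it can be answered both ways at the same time, at the cost of having no single canonical location for the item.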

  • I’m looking forward to what David has to say on OpenSource Radio, and to reading his book. There is much to debate on the merits and demerits of various means of organizing information, and who can argue against the notion that there are many fruitful approaches, and none of them enjoy perfection.

    The social web certainly shows us that ‘normal’ people — end users — have much to offer through the ground-up approaches of tagging and content sharing. Among the contributions of more formal information management approaches, however, is the creation and curation of persistent repositories that we can count on for the long term — museums, libraries, and archives.

    As we struggle with the challenges that surge through our increasingly-electronic domains, we need to keep in mind that the new models will serve us in the future only to the extent that they are given the curatorial attention of institutions with missions rather than simply bottom-lines. Dewey, for all its faults, is useful to a large degree because of just this long term attention.

    The link to David’s Book in the post above is to a book store… a fine one, with strong credentials in the electronic domain. Why not anchor it instead to a global collective of libraries that will be around as long as there are libraries, rather than a store? Try this one instead:

    Everything is miscellaneous : the power of the new digital disorder

    A persistent identifier in a service environment designed to support the information needs of libraries and their users for the long haul.

    Stuart Weibel

    Senior Research Scientist

    OCLC Programs and Research

    OCLC is a not for profit library consortium that runs WorldCat.org, and the Dewey Decimal Classification

  • I don’t want to argue for or against a standard classification system, but I’d like to see a standard system for tagging. I blogged on this subject earlier this week: We’ve gotten to a place where users feel confident that they can use basic search language standards such as Boolean operators in most search engines; I’d like to feel the same level of confidence when tagging data online. But right now the rules vary from site to site, and if I want to get my tags right I have to remember where I am and what to do. For example, here are the tagging rules for some of the web services I use every day:

    Flickr – space-separated, double quotes can be used to join words together in a single tag.

    Del.icio.us – space-separated. Multiple-word tags need to be joined with a hyphen or underscore. Commas and quotes become part of the tag.

    Blogger – comma-separated. Quotes become part of the tag.

    I can try to keep a mental matrix of which sites use which rules (“If it’s Blogger it must be commas!”), but if (when) I make mistakes, it’s going to affect the integrity of my tags, and by extension the integrity of others’ search results.

    Please, let’s have a standard. I don’t care which one. I just want to spend more time searching, learning, and tagging, and less time going back and re-doing all the tags that sorted to the top (or got lost) because they start with “.
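To make the inconsistency concrete, here is a rough Python sketch of the three conventions as this comment describes them. The function names are hypothetical, and the parsing rules are assumptions drawn only from the descriptions above, not from the actual behavior of Flickr, Del.icio.us, or Blogger.

```python
import re

def parse_flickr_style(text):
    """Space-separated; double quotes join words into a single tag."""
    pairs = re.findall(r'"([^"]*)"|(\S+)', text)
    return [quoted or bare for quoted, bare in pairs if quoted or bare]

def parse_delicious_style(text):
    """Space-separated; commas and quotes stay part of the tag."""
    return text.split()

def parse_blogger_style(text):
    """Comma-separated; quotes stay part of the tag."""
    return [tag.strip() for tag in text.split(",") if tag.strip()]

typed = '"new york" beach'
print(parse_flickr_style(typed))     # ['new york', 'beach']
print(parse_delicious_style(typed))  # ['"new', 'york"', 'beach']
print(parse_blogger_style(typed))    # ['"new york" beach']
```

The same keystrokes yield two, three, or one tag depending on the rule set, which is exactly the kind of drift the comment worries about, including tags that begin with a stray quotation mark.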

  • nikolrb

    I work at a company that does transportation logistics. I am always trying to find new ways to better organize information that crosses multiple categories. The problem I keep running into is that ways of organizing are generally two-dimensional (think of an Excel spreadsheet or an X-Y axis graph). We don’t generally think about categorizing in more dimensions, but I feel that this is essentially what our problem is. I keep fantasizing about a computer with a three-dimensional screen that allows me to draw links between documents and emails and bits of information with a thread that tells why they are strung together. Because each bit of information often serves many purposes.

    Or perhaps real organization is more organic, molecules of information bonded together.

    I hope it is part of our evolution as a society to move beyond the two dimensions. We are finally starting to question if “good vs. evil” or “masculine vs. feminine” are valid concepts; I think the way we organize would follow that same evolution.

  • nicka

    While I think that finding ways to create user-driven organization schemes are absolutely imperative, I can’t help but have some nagging concerns. Part of the beauty of the professional organization is that one has a certain assurance that a document will always be where you left it. I worry that in a world of dynamic, user-driven organization, important information could be lost to obscurity for all time.

  • This is David Weinberger. First, thanks Chris — and Mary and David and, well, the entire RadioOpenSource gang.

    Now to respond, in chronological order…and I’m sorry for the loooong post. I hope you don’t mind.

    Chris, the great thing about what’s happening is that you can have your Dewey Decimal system and eat it too. If you have a hankerin’ for one of those glorious Victorian taxonomies (or a modern version), then go ahead and create it. It’ll be one more useful way of navigating the miscellaneous pile of leaves. Because that pile is digital, you don’t have to re-order the stuff itself. You can just layer orderings (of the metadata) on top of it. So, indulge your yen for a single, numeric, precise taxonomy. In fact, share it. That’ll just add to the richness of the pile.

    Christopher, my book talks about the Semantic Web very briefly, taking it as an attempt to reimpose a top-down order on the Web. That has value, of course, but I think it’s a small part of the overall move which is adding semantics link by link and tag by tag, bottom up.

    demarconia, the best way to order anything (including the universe) has to be best for some purpose or according to some interest. But interests are always particular to the person and task. So, I don’t think there can be one best way. IMO.

    Cameron: I think what you’re calling a “heterarchy” others call a “folksonomy.” No? Wrt object-oriented programming, as you know far better than I, it allows multiple inheritance to solve the problem you point to, but I think your overall point is right on. And I agree that we’re exploring new principles of organization. To me, that’s among the most exciting research being done right now.

    stuart, good point. All I’d add is that (as I’m sure you agree) the degree and type of curating depends on the type and purpose of the collection. Some will do fine with just an occasional Dusting by the Crowd. Others need experts worrying over every inch.

    sadalit, yes, the different conventions are annoying. Sigh.

    nikolrb, I don’t think you’re being ambitious enough 🙂 Why stop at 3 dimensions? It’s an n-dimensional Web! During the program, I put this rather badly. I talked about there being a near infinity of “attributes,” without explaining that by “attribute” I meant something very simple: A strawberry has a color, smell, shape, weight, place, expected life span, medicinal properties, etc. Which ones matter to us depends on what we’re up to. And that determines how we’ll cluster and arrange our world at that moment — things that taste like strawberries? Things to write “SOS” on a wall with? Things handy to throw at an incoherent author? (Of course, visualizing past 3 dimensions gets hairy, which was your point.)
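The idea that the same pile can be clustered differently depending on which attribute matters at the moment can be sketched as follows. Everything here is a made-up illustration: the item list, the attribute names, and the `cluster_by` helper are assumptions for the sake of the example, not anything from the book or the show.

```python
from collections import defaultdict

# A small "miscellaneous pile"; each item is just a bag of attributes.
things = [
    {"name": "strawberry", "color": "red", "taste": "sweet", "edible": True},
    {"name": "cherry", "color": "red", "taste": "sweet", "edible": True},
    {"name": "lipstick", "color": "red", "taste": None, "edible": False},
    {"name": "lemon", "color": "yellow", "taste": "sour", "edible": True},
]

def cluster_by(items, attribute):
    """Group item names by the value of one attribute, leaving items untouched."""
    groups = defaultdict(list)
    for item in items:
        groups[item[attribute]].append(item["name"])
    return dict(groups)

# Looking for something to write "SOS" on a wall with? Cluster by color.
print(cluster_by(things, "color"))
# Looking for a snack? Cluster by edibility instead.
print(cluster_by(things, "edible"))
```

Each call lays a different ordering over the same unchanged list, so adding a new way of sorting never requires moving the items themselves.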

  • TJI

    The conversation seems to be about two different activities:

    – naming and organizing things for the purpose of understanding the hidden order of the world (or universe)

    – doing the same simply so we can find them later when we need them

    On the second point, we will be at a new age in information retrieval when we can just say “Show me more like this” and get results that are meaningful.

    This is very different from a functionally organized “body of knowledge” that is intended to reflect “the way things really are.” Collapsing the two introduces all of the theoretical problems that led to the artificial intelligence debacle.

  • Two comments:

    1) I am glad to read David Weinberger’s note (April 26, 4:48 pm) that “you can have your Dewey Decimal system and eat it too. If you have a hankerin’ for one of those glorious Victorian taxonomies (or a modern version), then go ahead and create it.” In other words, that value remains in existing formal organizational models.

    While I am a fan of user-tagging, there are times when user-made tags may not be fully appropriate. Take online health information, for example. I hesitate at the idea of users tagging information themselves on sites where presenting vetted and credible health information is the site’s main goal, unless the tagging is moderated. This is because some tags might, for example, incorrectly link a disease with a cause, or a symptom with a disease, or a stigma with a disease, or might promote ideas or perceptions that are too untested to be sure enough to use or that come wrapped in language that includes stigmatizing personal value judgements.

    Usually such mismatched tagging would be inadvertent, but left unmoderated, it could lead readers to incorrect information. The tags, which are themselves usually full of meaning, could become a means of perpetuating outmoded or popular, but not quite on the mark (at least for that site), perceptions.

    2) I have noted user-generated tags and models are really useful for researching on one’s own, but they are less efficient when groups of people are attempting to locate the same thing at the same time. Terms and paths to get to content can vary so much, and the amount of similar-but-not-the-same content is so great that a group can spend an inordinate amount of time just getting on the same page! Once again, this is where existing, conventional modes of organizing material come in handy…they allow people to communicate efficiently in groups because they force the use of an agreed-upon, or mostly agreed-upon, set of terms and language.

    All this said, much enjoyed the half of the show that I was privileged to listen to this evening.

  • hurley

    I don’t have a copy here, so the quotation might not be exact, but early in William Gaddis’ extraordinary JR — my candidate for the Great American Novel — a despairing and embittered character named Jack Gibbs lays out the nature of the enterprise for his sixth-grade class:

    Order is just a thin, perilous veil we impose on chaos.

  • Tom Morris

    dweinberger: “my book talks about the Semantic Web very briefly, taking it as an attempt to reimpose a top-down order on the Web”

    David, I don’t think your conception of the Semantic Web ties up with how it is in reality. Tagging and linking which you see as an alternative are perfectly compatible with the Semantic Web, as is the reordering of taxonomies. Thanks to

    There is a perception that the Semantic Web is all about ‘top-down’ taxonomies, but it really isn’t. Certain central concepts have been pre-rolled – FOAF (to describe people), SIOC (for their postings) and DOAP (for their projects), but they are infinitely remixable. Using applications like Protégé, it is possible to completely remix or extend any ontology (It’s not easy, mind. The Semantic Web is currently at the level of complexity of pre-Kodak photography).

    Yes, every concept that can be expressed in Semantic Web terms can be explicitly changed. Which one wins is highly democratic – whichever one gets more use. If your OWL ontology specifies that “a Person may have only one relationship of type ‘marriage’, and the object of that relation must be of the opposite gender” and another OWL ontology specifies “you can do whatever you like”, we have basically created a technology in which it is possible to model any type of social and political reality that you like. Imagine if you had the Dewey Decimal System laid out on a big table and had the ability to drag things around to your heart’s content.

    Take a look at dbPedia and Semantic MediaWiki. dbPedia allows you to search Wikipedia using the W3C’s new SPARQL protocol. If you want to find Belgian atheists or female entrepreneurs who are related to actors or places where philosophers have been born, the Semantic Web is the way to allow that to happen.

    Take a look at Zotero – an RDF-powered bibliography database which fits in to Firefox (and is straight-forward enough for my mother to use). I can hit ‘export’ in Zotero and have my whole library up on the web in less than a minute. It can also be SPARQLed. Zotero has a ‘related’ function built-in – you can use it to specify relationships inside a bibliography. I use it to link books with critical articles and blog posts (etc.) about them.

    Take a look at DiscourseDb.org – a wiki-based database of political discourse. This is all powered by Semantic Web technology. If you find another ‘attribute’, you open the wiki page and add it.

    The Semantic Web is unfashionable among certain people, but the things which you’ve spoken about on this show and written about are problems/opportunities that SemWeb people have been pondering and developing around for the best part of a decade now. Hopefully it’s an area that I’m going to spend at least a significant chunk of my career working on.

  • jtb313

    Not only do all things connect; all things inter-connect.

    Further, there is no right or wrong; there just IS. (Things just are.)

    Thus, all individual perspectives & views are valid and can be substantiated with concrete evidence and logic.

    In other words, all things are possible; all versions exist endlessly.

    One hurdle to overcome is trying to make any single reason or answer or method work within multidimensional universes.

    Just let go and let it all in. It’s all good.

  • My spice rack is not organized, and it’s not just spices. So I would dispute the use of that analogy as a universal.

  • I just tried wading through the comments to this point to see if I could find something to attach my comment to, but I think I’ll just let it hang out there on its own. Sorry!

    I just finished listening to the show, and it reminded me (especially the sections regarding species) of some background noise to my job. One of the biggest discussions around the salmon world is what constitutes an evolutionarily significant unit (ESU). This matters because, generally speaking, it is the basis on which the federal government manages the Endangered Species Act in terms of salmon.

    There has been a series of lawsuits since 2000 challenging the government’s definition of what a member of a particular ESU is, implying that salmon produced in a hatchery are just as significant as salmon spawning in the wild. Similar discussions have also popped up regarding rainbow trout and steelhead, which are pretty much the same species, but one stays in freshwater for its entire life while the other migrates out to sea. It’s even assumed that two rainbow trout may spawn a steelhead and vice versa, but one is a listed species and the other is a popular game fish caught in lake tournaments.

    Not sure if it adds anything, but it’s what was on my mind during the show.

  • Dacker

    An infinity of connections, a rejection of dichotomies, using a technology that writes only in binary code, 0 or 1. Hmm.

  • Cameron Brown

    jtb313: “Further, there is no right or wrong; there just IS. (Things just are.)”

    That may be so in the most general sense, but in specific functional domains (e.g., a computer program for a specific purpose) there very much IS a right and a wrong. In the example I gave in my earlier post, there are different types of vehicles – a car is not a boat, and vice versa. It’s certainly “wrong” to program a boat such that it has wheels and drives around on roads!

    Dacker: “An infinity of connections, a rejection of dichotomies, using a technology that writes only in binary code, 0 or 1. Hmm.”

    Interesting thought, but misleading. You need to separate information from how information is encoded. The English alphabet has only 26 letters, but you can express an infinite array of thoughts with it. With just five binary digits you can encode all twenty-six letters, with room to spare (four digits, half a byte, cover only the first sixteen). Here are the first few, written with four digits:

    0000 = A

    0001 = B

    0010 = C

    0011 = D

    0100 = E

    …and so on. So to encode the word “bad” I would write “0001 0000 0011”.

    Hopefully from there it’s pretty obvious that binary encoding is just a less efficient form of spelling – it ultimately has the same expressive power as English writing!
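A short Python sketch of this letter-to-bits scheme follows. One assumption is stated up front: 26 letters need five bits per letter, since four bits give only 16 codes while five give 32, so the sketch uses five-bit groups. The codes are the four-digit ones listed above with one extra leading zero. The `encode` and `decode` names are hypothetical helpers, not a standard encoding.

```python
BITS = 5  # 2**5 = 32 codes, enough for the 26 letters A-Z

def encode(word):
    """Spell a word as space-separated fixed-width binary groups (A=0, B=1, ...)."""
    return " ".join(format(ord(ch) - ord("A"), f"0{BITS}b") for ch in word.upper())

def decode(bits):
    """Turn space-separated binary groups back into uppercase letters."""
    return "".join(chr(int(group, 2) + ord("A")) for group in bits.split())

print(encode("bad"))                 # 00001 00000 00011
print(decode("00001 00000 00011"))   # BAD
```

The round trip loses nothing, which is the point of the comment: the two-symbol alphabet is only a clumsier way of spelling, not a limit on what can be said.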

  • Dacker

    Exactly! And yet in each bit, there is one choice – 0 or 1. You can’t have both. The technology that underlies this explosion of meaning is a real space, real time technology. 0001 can’t be B and Z, but Moby Dick can be symbolic, boring, gay, etc.

    A stupid point perhaps, but I think it’s a little intriguing.

  • Cameron Brown

    Ah, I understand your point now. You’re saying that at the very lowest level, the ambiguity disappears and everything can be sorted neatly into buckets labeled “zero” or “one”.

    I suppose that’s true, but at that level all of the higher level context is irrevocably lost. A bit that is part of a Britney Spears .mp3 cannot be distinguished from a bit that is part of a Moby Dick audiobook, except by “zooming out”. I don’t think an individual bit preserves any of the essence of Britney-ness, just as you wouldn’t consider a water molecule “wet” (wetness is a property that emerges from large numbers of water molecules in the aggregate).

    I guess my point is: scale is important. Things look very different depending on how far away you are.

    And of course we have to consider , which can be both zero and one…

  • Cameron Brown

    Oops, messed up my link 🙁


  • plnelson

    Most of this discussion has been limited to conceptual things – ideas, data, O-O sw objects, etc. What about real, concrete, things that have mass and volume?

    My wife and I live in a 2600 ft^2 house where I paint, garden, design electronics, write software, run websites, do dance- and studio photography, do mat-cutting and framing, cook, and invent stuff. I also like to work on my car and play sports, with all the gear that suggests. My wife is similarly busy. We have no kids and don’t watch TV and don’t get enough sleep so we have time for all this.

    So obviously how to ORGANIZE the stuff that goes with it all so it can be quickly available and accessible and I don’t have to dig one thing out from behind another thing, or lose it altogether, is a big issue in my life.

    I’ve consulted with professional organizers who have failed utterly to come up with a taxonomical scheme for my stuff! When I check their references the references always speak glowingly about how the organizer comes back to their office or business every few months and re-organizes them! That doesn’t say much for whatever scheme they came up with!


    …assumes a taxonomical scheme that can identify what the “place” is for a given item and can easily accommodate novel items as they are introduced to the system.

    The “place” for an item should be intrinsic to the item, not arbitrary – i.e., I should be able to determine what the “right” place is based on inherent properties of the item, not on an arbitrary decision. If I arbitrarily decide that NEMA twist-lock 3 phase plugs go in the top drawer of my basement workbench, then in 6 months when I’m looking for one I might forget what I decided! I need to be able to ask, “where should a NEMA twist-lock 3 phase plug be?” so that the answer leads me to where it is. This requires a good taxonomical scheme.

  • herbert browne

    RE: ..” I need to be able to ask, “where should a NEMA twist-lock 3 phase plug be?” so that the answer leads me to where it is. This requires a good taxonomical scheme..”-

    Well, yeah… as long as it continues to be known by that nomenclature, you SHOULD be able to sort it out the way that you want. Then, somebody decides that the nomenclature can be tweaked a little, to afford even GREATER specificity (& you should appreciate THAT- right?)… and- voila! the “tre fez” is born! I have a nursery in which I’ve had to change name tags for some plants 3 times in the last 15 years, because “Professional taxonomists” have decided what a name SHOULD be… and it changes- just like that… ^..^

  • Dacker

    Cameron: re your link- touche! ( I don’t know how to make the accent over the e). And way over my head.

  • Nice open source information.
