ATProto Taxonomy Use Cases

Story time

App Developer

  • wants to advertise the existence of their app
    • uses some deployment tool to publish an "app store entry" record to the protocol
  • also not averse to promulgating lexicons they have authored to other developers
    • tool also scrapes codebase for lexicons used
      • prompts user to tag any new lexicons they have contributed

Aggregator

  • wants to create an "app store" for atproto-enabled apps
    • trawls network for "app store entry" records
  • wants to categorize for easy findability
    • has paid to research audiences and tailor top-level categories
    • can just compute the transitive closure over the tagged apps and lexicons to sort them into those categories
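
The sorting step above amounts to a reachability computation: from each concept an app or lexicon is tagged with, follow "broader" links upward until a top-level category turns up. A minimal sketch, assuming a hypothetical `broader` mapping (concept to its immediately broader concepts) and made-up app, lexicon, and concept names; none of these identifiers come from an actual lexicon.

```python
from collections import defaultdict


def ancestors(concept, broader):
    """Return every concept reachable from `concept` via broader links."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in broader.get(current, ()):
            if parent not in seen:  # also stops us looping on a cycle
                seen.add(parent)
                stack.append(parent)
    return seen


def categorize(item_tags, broader, top_level):
    """Group items under each top-level category their tags roll up to."""
    shelves = defaultdict(set)
    for item, tags in item_tags.items():
        for tag in tags:
            reachable = ancestors(tag, broader) | {tag}
            for category in reachable & top_level:
                shelves[category].add(item)
    return dict(shelves)


# Hypothetical data: concept -> immediately broader concepts.
broader = {
    "long-form-blogging": ["publishing"],
    "publishing": ["writing"],
    "bookmarking": ["research"],
}
# Hypothetical apps and lexicons, with the concepts they were tagged with.
item_tags = {
    "app:example-blog": ["long-form-blogging"],
    "lexicon:com.example.bookmark": ["bookmarking"],
}
top_level = {"writing", "research"}

shelves = categorize(item_tags, broader, top_level)
```

Here `shelves` ends up with the blog app under "writing" and the bookmark lexicon under "research". An item can land on more than one shelf if its tags roll up to several top-level categories.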

Leaflet/Semble/etc user

  • has just written a post; wants to tag it to categorize it
    • one of the tags couldn't be resolved, so a new one with that label is created and added to their PDS
    • tag is added to their personal concept scheme (?)
      • user is invited to connect the tag to others
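
The resolve-or-create flow above could look roughly like this. It is only a sketch: `PersonalScheme`, `resolve_or_mint`, the label normalization, and the identifier format are all assumptions made for illustration, not anything defined by ATProto or by these apps.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalScheme:
    owner: str  # hypothetical identifier for the user
    concepts: dict = field(default_factory=dict)  # normalized label -> concept id


def normalize(label):
    """Fold case and collapse whitespace so near-identical labels match."""
    return " ".join(label.casefold().split())


def resolve_or_mint(label, known, scheme):
    """Return (concept_id, was_minted) for a tag label.

    `known` maps normalized labels to concepts that already exist elsewhere.
    If nothing matches, a new concept is created in the user's own scheme.
    """
    key = normalize(label)
    if key in known:
        return known[key], False
    if key in scheme.concepts:
        return scheme.concepts[key], False
    # Placeholder identifier format, not a real record address.
    concept_id = f"{scheme.owner}/concept/{len(scheme.concepts) + 1}"
    scheme.concepts[key] = concept_id
    return concept_id, True


known = {"knitting": "did:example:curator/concept/1"}
scheme = PersonalScheme(owner="did:example:alice")

resolved = resolve_or_mint("Knitting", known, scheme)
minted = resolve_or_mint("Brioche  Stitch", known, scheme)
again = resolve_or_mint("brioche stitch", known, scheme)
```

The `was_minted` flag is the hook for the follow-up prompt: when it is true, the UI can invite the user to connect the new tag to existing ones.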

Aggregator/Leaflet/Semble/etc admin

  • wants to make their content more findable; better paired to audiences
  • manages their own concept scheme for UX (and QA) purposes
    • integrates entire concept schemes from trusted curators wholesale
    • also vetoes/deltas against them
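
Importing wholesale and then applying vetoes/deltas is, at its simplest, set arithmetic. A minimal sketch, assuming concepts are identified by plain strings and that the admin keeps three hypothetical inputs: the imported schemes keyed by curator, a veto set, and a set of local additions.

```python
def effective_scheme(imported, vetoed, added):
    """Combine trusted curators' schemes, then apply local vetoes and additions."""
    inherited = set().union(*imported.values())
    return (inherited - vetoed) | added


def provenance(imported):
    """Map each concept to the curators it came from (handy for QA)."""
    sources = {}
    for curator, concepts in imported.items():
        for concept in concepts:
            sources.setdefault(concept, set()).add(curator)
    return sources


# Hypothetical curators and concepts.
imported = {
    "curator-a": {"knitting", "crochet", "macrame"},
    "curator-b": {"crochet", "weaving"},
}
vetoed = {"macrame"}
added = {"spinning"}

scheme = effective_scheme(imported, vetoed, added)
sources = provenance(imported)
```

Keeping the vetoes and additions as separate deltas, rather than editing a copy of the imported schemes, means upstream changes can be picked up again without redoing the local decisions.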
  • this is somebody's job at the company

Random curator/librarian/lexicographer

  • runs a website (or is a moderator on some community)
  • has cultivated a reputation as the definitive authority for a given niche (e.g. a fandom)
  • ultimately views it as a social duty but would be lying if they said they didn't enjoy the cred
  • plus has Opinions™ about it
    • uses some heretofore-unseen tool to create a concept scheme
      • pulls in concepts from all over atproto in addition to writing their own
  • this is a non-trivial volunteer effort
    • evenings and weekends tweaking the entries and topology
      • (perhaps via some out-of-band input from their own community)

Drive-by contributor

  • wants to help the community, feel useful
  • or not necessarily even that; may just be motivated by the affordance
    • uses whatever tool is handy to add value to an individual concept
      • add synonym
      • add translation
      • add definition
      • add example
      • add semantic relation to another concept
        • propose removal of semantic relation
          • (destructive changes should not be unilateral outside your own scope of influence)
      • propose split
      • propose merge
  • these are structured micro-interventions, analogous to fixing a typo or adding a link in Wikipedia
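
One way to picture these micro-interventions is as small typed records, with the "no unilateral destructive changes" rule applied when deciding what happens to each one. The sketch below is an assumption throughout: the kind names, the `Intervention` shape, and the apply/propose split are illustrative, and treating additive changes as directly applicable is a simplification.

```python
from dataclasses import dataclass

ADDITIVE = {"add_synonym", "add_translation", "add_definition",
            "add_example", "add_relation"}
PROPOSAL_ONLY = {"remove_relation", "split", "merge"}


@dataclass(frozen=True)
class Intervention:
    kind: str
    concept: str       # identifier of the concept being touched
    author: str        # who is making the change
    payload: str = ""  # e.g. the synonym text, or the other concept's identifier


def disposition(intervention, concept_owner):
    """Decide whether a change takes effect directly or is queued for review."""
    if intervention.kind not in ADDITIVE | PROPOSAL_ONLY:
        raise ValueError(f"unknown intervention kind: {intervention.kind}")
    if intervention.author == concept_owner:
        return "apply"    # inside your own scope of influence
    if intervention.kind in PROPOSAL_ONLY:
        return "propose"  # destructive change outside your scope: owner decides
    return "apply"        # additive change, treated as safe in this sketch


owner = "did:example:curator"
synonym = Intervention("add_synonym", "concept:knitting", "did:example:passerby", "knit")
merge = Intervention("merge", "concept:knitting", "did:example:passerby", "concept:knit")
own_merge = Intervention("merge", "concept:knitting", owner, "concept:knit")

outcomes = [disposition(i, owner) for i in (synonym, merge, own_merge)]
```

With these inputs, the passerby's synonym is applied, their merge becomes a proposal, and the owner's own merge is applied.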

Record types

App store entry

  • this I think might actually be better off considered separately
    • in particular, things like ratings and testimonials that aren't part of this record per se but will attach to it
    • plus all the attendant social dynamics around that
  • again, look at schema.org/SoftwareApplication for inspo

Concept

  • the whole idea of having a concept as an addressable record is to create a durable identifier off of which one can hang spongier things like text labels and definitions
    • again: synonyms, translations, acronyms/initialisms, other alternate labels…
  • moreover, attach via semantic relation to (the durable addresses of) other concepts
    • again: broader, narrower, related…
      • I would also add "not to be confused with", ie when two concepts are related only by the fact that they are often mistaken for one another
    • (although to do this in ATProto, I would be inclined to make the semantic relations part of the concept scheme/collection record rather than a property of concepts)
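
To make the split concrete: the concept carries only a durable identifier plus its "spongy" labels, while the semantic relations live on the scheme. A minimal sketch under that assumption; the class names, field names, relation kind names, and identifiers are hypothetical and not taken from SKOS tooling or any published lexicon.

```python
from dataclasses import dataclass, field

RELATION_KINDS = {"broader", "narrower", "related", "notToBeConfusedWith"}


@dataclass
class Concept:
    uri: str                                         # the durable identifier
    pref_label: dict = field(default_factory=dict)   # language -> preferred label
    alt_labels: dict = field(default_factory=dict)   # language -> alternate labels
    definitions: dict = field(default_factory=dict)  # language -> definition text


@dataclass(frozen=True)
class Relation:
    subject: str  # concept identifier
    kind: str     # one of RELATION_KINDS
    object: str   # concept identifier


@dataclass
class ConceptScheme:
    uri: str
    members: set = field(default_factory=set)
    relations: set = field(default_factory=set)  # relations belong to the scheme

    def relate(self, subject, kind, obj):
        if kind not in RELATION_KINDS:
            raise ValueError(f"unknown relation kind: {kind}")
        self.members.update({subject, obj})
        self.relations.add(Relation(subject, kind, obj))

    def objects(self, subject, kind):
        """Concepts that `subject` points at via `kind` within this scheme."""
        return {r.object for r in self.relations
                if r.subject == subject and r.kind == kind}


knitting = Concept(
    uri="concept:knitting",
    pref_label={"en": "knitting", "fr": "tricot"},
    alt_labels={"en": ["knit"]},
)
scheme = ConceptScheme(uri="scheme:fibre-arts")
scheme.relate(knitting.uri, "broader", "concept:fibre-arts")
scheme.relate(knitting.uri, "notToBeConfusedWith", "concept:crochet")

# Labels can change without disturbing the identifier others point at.
knitting.alt_labels["en"].append("hand knitting")
```

Because the relations sit on the scheme, two curators can disagree about where "knitting" belongs without either of them touching the concept record itself.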

Concept scheme/collection

  • SKOS distinguishes between a concept scheme and an arbitrary collection of concepts
    • also ordered collections
    • not clear if this distinction is strictly necessary (modulo ordering) given some kind of "tributary" mechanism
    • one weird thing I have noticed, though, is that collections (in SKOS) are orthogonal to concepts
      • ie, a collection with concepts as members is a different kind of thing and set of relationships than a concept and a bunch of immediately narrower concepts
      • a collection would be useful to yoke together a bunch of concepts that were found in a particular domain but not otherwise related to one another
  • in ATProto, any concept (or collection thereof) will implicitly have an author attached to it
    • it isn't clear then what the distinction would be between a concept scheme and an arbitrary collection, other than for parity with SKOS
    • SKOS concept schemes do have a notion of "top concepts", which people tend to interpret to mean hierarchically broadest, though arguably they would be better used as genus-level entry points (after Lakoff)

Obvious issues

  • general abuse
    • spamming
    • trolling
    • brigading
  • astroturfing/manipulation
  • even people just mechanizing parts of this process to be "helpful"
    • even bigger issue now with LLMs
    • look at GitHub PR spam for a template
      • accounts write scripts to crawl repositories and issue thousands of trivial pull requests
      • they ostensibly do this to light up their "green grid", in turn, presumably, to impress prospective employers

Analysis

  • if anybody can add a concept to the protocol, arbitrarily many will
    • concepts are invariably going to be more durable/reusable than posts
    • this makes them an attractive target for:
      • abuse and hate,
      • spam links,
      • trolling,
      • other garbage
    • that said, a concept can only really get meaningfully surfaced two ways:
      • tagged by a post (that itself gets seen)
      • being included in a concept scheme/collection (that gets currency)
  • there are invariably going to be fewer concept schemes/collections than individual concepts
    • (at least, one would hope?)
  • the A in AT stands for "authenticated", ie every record has a progenitor
    • knowing who minted a concept is a useful (inverse?) reputation signal
      • (inverse ie a minter's known bad reputation is a signal to curators to ignore the concept)
  • the (labeled) indegree of a given concept will tell you:
    • how many times it is being used to tag things
    • how many concept schemes/collections it belongs to
    • (these numbers can of course both be juked, so the inputs would have to be cleaned)
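
The labeled indegree is simple to compute once inbound references are available as triples. A minimal sketch, assuming a hypothetical edge list of `(author, label, target)` where the label is either "tag" or "member". The one cleaning step shown (counting each author once per label) is only an example of cleaning the inputs and does nothing against sockpuppet accounts.

```python
from collections import Counter


def labeled_indegree(edges, target, distinct_authors=True):
    """Count inbound edges to `target`, grouped by edge label.

    `edges` is an iterable of (author, label, target) triples.
    """
    inbound = [(author, label) for author, label, tgt in edges if tgt == target]
    if distinct_authors:
        inbound = set(inbound)  # one vote per author per label
    return Counter(label for _, label in inbound)


# Hypothetical edges: alice tags the same concept twice.
edges = [
    ("did:example:alice", "tag", "concept:knitting"),
    ("did:example:alice", "tag", "concept:knitting"),
    ("did:example:bob", "tag", "concept:knitting"),
    ("did:example:carol", "member", "concept:knitting"),
    ("did:example:bob", "tag", "concept:crochet"),
]

raw = labeled_indegree(edges, "concept:knitting", distinct_authors=False)
cleaned = labeled_indegree(edges, "concept:knitting")
```

Here the raw count is 3 tags and 1 membership, while the cleaned count drops to 2 tags because alice's repeat is collapsed.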

Mitigations

  • individual concepts get legitimated by being included in concept schemes
  • if you were curating a composite concept scheme, you could curate your tributaries and effectively delegate to them
    • if they in turn use tributaries, you could veto those relationships wholesale
    • (there may need to be some negotiation process among curators for eliminating circularities)
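
Delegating to tributaries while vetoing some of their upstream relationships can be sketched as a graph walk. The names here are assumptions: `tributaries` maps a scheme to the schemes it pulls from, and `vetoed_edges` is a set of (downstream, upstream) pairs the root curator refuses to inherit. Circular references are simply skipped rather than negotiated.

```python
def resolve_tributaries(root, tributaries, vetoed_edges=frozenset()):
    """Return every scheme `root` effectively delegates to."""
    accepted = set()
    stack = [root]
    while stack:
        scheme = stack.pop()
        for upstream in tributaries.get(scheme, ()):
            if (scheme, upstream) in vetoed_edges:
                continue  # relationship vetoed wholesale
            if upstream == root or upstream in accepted:
                continue  # circular or already visited
            accepted.add(upstream)
            stack.append(upstream)
    return accepted


# Hypothetical schemes: "fandom" pulls from "spammy" and points back at "mine".
tributaries = {
    "mine": ["fandom", "general"],
    "fandom": ["spammy", "mine"],
    "general": ["fandom"],
}

unfiltered = resolve_tributaries("mine", tributaries)
filtered = resolve_tributaries("mine", tributaries, {("fandom", "spammy")})
```

Without the veto, "spammy" is inherited through "fandom"; with it, only "fandom" and "general" remain. Note that the veto is per relationship, so a scheme vetoed along one path would still come in if another accepted tributary also pulled from it.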