Bonus Drop #78 (2025-03-09): Knot Your Parents’ Social-Enabled Git Collaboration Platform
Oh What A Tangled Code We Weave
Just one topic today since it took a bit more time to dissect than I anticipated.
Hit me up if you have any issues getting your own “knot” untangled.
Oh What A Tangled Code We Weave
Tangled (Blog | Bsky) is a new platform for Git-based collaboration that blends the benefits of decentralized systems with human-friendly social features. It’s built on top of the AT Protocol (i.e., the thing that Bsky invented), seeking to give developers ownership of their code while fostering open-source community governance and a more social coding experience, free from billionaire and empire controls.
If U.S. and global Drop readers haven’t yet received the memo that you should be avoiding U.S.-hosted services and billionaire/global-mega-corp controlled platforms, consider said memo now delivered.
Instead of choosing between a fully federated model like Forgejo (which uses ActivityPub) or a purely peer-to-peer approach like Radicle, Tangled provides a decentralized social networking framework with a central identity system. If you have a looksie into the AT Protocol documentation, you’ll find key concepts like:
- Repositories: These are self-authenticating storage units for our content.
- Lexicon: This is a schema language used to define the structure of data within the AT Protocol.
- App Views: The documentation refers to these as consolidated views into the network. In Tangled’s case, the app view at tangled.sh provides a unified interface for accessing and contributing to repositories hosted across different “knots.”
To put that last bullet a different way, Tangled “knots” are lightweight servers that host Git repositories. These knots can be single-tenant (think self-hosting on a Raspberry Pi) or multi-tenant (for larger community servers). Tangled provides (free) managed knots as a default, lowering the barrier to entry.
The project is still in its early stages, with the Tangled team actively developing core features while “dogfooding” the platform themselves. Their design decisions are guided by three principles: data ownership, low barrier to entry, and a strong interactive experience. The goal is to make collaboration feel natural and intuitive, even within a decentralized environment.
For readers who just want to try it out, head on over to tangled.sh and you can see the “firehose”:
To do much beyond being a voyeur and/or cloning repos over HTTPS, you’ll need to authenticate to the network via your Bluesky handle and a generated app password (I’m confident OAuth is coming at some point for Tangled). Once you log in, tap your handle (upper-right corner) and add some SSH public keys (via “settings”).
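If you need a fresh keypair for that settings page, something like this works — the file name and comment below are arbitrary choices on my part, not anything Tangled requires:

```shell
# Generate a dedicated ed25519 keypair for Tangled; you paste the
# contents of the .pub file (the PUBLIC key) into the "settings" page.
mkdir -p "$HOME/.ssh"
[ -f "$HOME/.ssh/tangled" ] || ssh-keygen -t ed25519 -f "$HOME/.ssh/tangled" -C "tangled" -N "" -q
cat "$HOME/.ssh/tangled.pub"
```

Keep the private half (`~/.ssh/tangled`) local; you can point Git at it via an `IdentityFile` entry in `~/.ssh/config`.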
I made a repo for this Bonus Drop to test it out and give y’all something to poke at.
The point of Tangled is to foster decentralized social coding, which means if you’re going to be serious about using this service, you should create your own “knot” (a cute word for a Tangled instance that will broker access to the repos via the Tangled AppView and git SSH/HTTPS ops).
The core project Readme has all you need to walk through the creation of an instance. There are three Golang binaries (well, four, really, but you’re likely going to rely on Tangled for the AppView) in the mix.
keyfetch is a program designed to run as an SSH AuthorizedKeysCommand. It fetches SSH public keys from an internal API endpoint and formats them for use with SSH authentication. When a user attempts to connect via SSH, keyfetch:
- retrieves a list of authorized keys from a specified internal API endpoint
- formats these keys with specific command restrictions
- outputs the formatted keys to be used by the SSH server for authentication
This lets the Tangled platform dynamically manage SSH access to Git repositories based on user credentials stored in its system. If you did load your PUBLIC keys into Tangled, you can run keyfetch manually (after you successfully bootstrap your knot) to see the output.
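To make that concrete, here’s the general shape of what an AuthorizedKeysCommand hands back to sshd: standard authorized_keys lines with a forced command, so every connection gets funneled through a Git wrapper. The specific options and repoguard arguments below are my illustration, not Tangled’s documented output:

```shell
# One authorized_keys-format line per registered public key. The
# command="..." prefix is sshd's standard forced-command mechanism;
# the exact repoguard invocation here is a guess for illustration.
line='command="repoguard -user did:plc:example",no-port-forwarding,no-X11-forwarding ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAA... someuser'
echo "$line"
```

The forced command is what lets the server ignore whatever command the client asked for and substitute its own access-controlled Git handler.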
knotserver/knot is a core component of the architecture that manages the “knots” — those lightweight, headless servers that host Git repositories. The knotserver:
- sets up and manages the database for storing repository information
- implements role-based access control (RBAC) for repository permissions
- integrates with Jetstream for event handling and communication
- runs both a main server and an internal server on different ports
The knotserver essentially provides the backend infrastructure for hosting and managing repositories within a knot, handling authentication, authorization, and API endpoints.
repoguard
acts as a security layer between SSH connections and Git operations. It:
- validates incoming Git commands from users connecting via SSH
- resolves user identities (handles or DIDs) to their proper DID format
- verifies that users have appropriate permissions for the requested operations
- executes the Git commands in a controlled environment
- logs all access attempts and operations for security purposes
repoguard ensures that only authorized users can perform specific Git operations on repositories, providing security and access control at the Git command level.
Together, these three components form part of the infrastructure that enables Tangled’s decentralized Git collaboration platform, with keyfetch handling SSH key management, knotserver providing the repository hosting backend, and repoguard securing Git operations.
I ran it on one of my public internet-facing servers, but since the recommended config is to use a reverse proxy to the localhost port 5555 service (which I used Caddy for), there is nothing stopping you from running this instance on your local network and reverse proxying to a Tailscale interface.
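For reference, the Caddy side of that can be as small as this — the hostname is mine (swap in yours), and this is a sketch, not the project’s documented config:

```shell
# Write a minimal Caddyfile that fronts the knotserver on localhost:5555
# (Caddy provisions TLS for the vhost automatically)
cat > Caddyfile <<'EOF'
knot.hrbrmstr.app {
	reverse_proxy localhost:5555
}
EOF
cat Caddyfile
```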
NOTE: if you use their recommended systemd setup, you’ll need to modify knotserver.service and change ExecStart=/usr/local/bin/knotserver to ExecStart=/usr/local/bin/knot, and make sure to restart sshd.
If you hit up knot.hrbrmstr.app you’ll get:
This is a knot server. More info at tangled.sh
instead of that minimalist web interface you saw on tangled.sh.
The reason for this is that Tangled provides the AppView to the federated AT protocol brokered knot instances (much in the same way Bluesky is the AppView for all Bluesky-compatible PDS instances and WhiteWind is the blog hub AppView for all of the WhiteWind-compatible PDS instances).
Since this is all happening on the AT protocol, we can even explore things from the protocol perspective.
If you start at my did (or yours): https://pdsls.dev/at://did:plc:hgyzg2hn6zxpqokmp5c2xrdo, you can see all of the available collections:
app.bsky.actor.profile
app.bsky.feed.like
app.bsky.feed.post
app.bsky.feed.postgate
app.bsky.feed.repost
app.bsky.feed.threadgate
app.bsky.graph.block
app.bsky.graph.follow
app.bsky.graph.list
app.bsky.graph.listblock
app.bsky.graph.listitem
blue.zio.atfile.upload
chat.bsky.actor.declaration
com.whtwnd.blog.entry
sh.tangled.feed.star
sh.tangled.graph.follow
sh.tangled.publicKey
sh.tangled.repo
sh.tangled.repo.issue
sh.tangled.repo.issue.comment
Well, look at all the sh.tangled.* ones on the PDS!
If we follow one repo path to the end: https://pdsls.dev/at://did:plc:hgyzg2hn6zxpqokmp5c2xrdo/sh.tangled.repo/3ljx2j3twex22, we can get the metadata:
{
  "knot": "knot.hrbrmstr.app",
  "name": "my-first-self-hosted-knot-repo",
  "$type": "sh.tangled.repo",
  "owner": "did:plc:hgyzg2hn6zxpqokmp5c2xrdo",
  "addedAt": "2025-03-09T12:37:24Z"
}
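You don’t even need pdsls.dev for this spelunking — com.atproto.repo.listRecords is a bog-standard XRPC endpoint, so you can hit the PDS directly. I’m hardcoding the Bluesky-hosted PDS below as an example host; resolve the DID document (e.g., at https://plc.directory/<did>) to find the real serviceEndpoint for any given DID:

```shell
# List sh.tangled.repo records for a DID straight from its PDS.
# The PDS host below is an example; look up the actual serviceEndpoint
# in the DID document served by plc.directory.
did="did:plc:hgyzg2hn6zxpqokmp5c2xrdo"
collection="sh.tangled.repo"
pds="https://bsky.social"
url="$pds/xrpc/com.atproto.repo.listRecords?repo=$did&collection=$collection"
echo "$url"
# curl -s "$url" | jq '.records[].value'
```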
This also means there’s nothing stopping anyone from building off of this new ecosystem.
I’m not sure why I find Tangled easier to grok (thanks to a certain individual and service I think I need to find a new word to use) than Radicle, but it seems to fit my mental model better.
You can check out both the example repo for this post, or the uncreatively-named one on my knot.
If you join Tangled drop me a note so I can follow you there, and send me your handle if you’d like to see how it works when you hit up my instance.
FIN
Remember, you can follow and interact with the full text of The Daily Drop’s free posts on:
- 🐘 Mastodon via
@dailydrop.hrbrmstr.dev@dailydrop.hrbrmstr.dev
- 🦋 Bluesky via
https://bsky.app/profile/dailydrop.hrbrmstr.dev.web.brid.gy
Also, refer to:
to see how to access a regularly updated database of all the Drops with extracted links, and full-text search capability. ☮️
Bonus Drop #69 (2024-11-24): Blue Skies, Four Squares, And Federated Blogs
Oh The (Foursquare) Places You’ll Go (In Maine); WhiteWind; My Own Private PDS
Today’s Bonus Drop explores spatial data visualization using R and Foursquare’s recent data drop, decentralized blogging over ATproto, and setting up a personal data server.
TL;DR
(This is an AI-generated summary of today’s Drop using Ollama + llama 3.2 and a custom prompt.)
- Foursquare’s open-source POI dataset (docs.foursquare.com/data-produ…)
- WhiteWind, a decentralized blogging platform built on AT Protocol for Bluesky integration (whtwnd.com/hrbrmstr.dev/3lbp5x…)
- Set up and run your own Personal Data Server (PDS) for Bluesky with practical implementation steps (github.com/bluesky-social/pds)
Oh The (Foursquare) Places You’ll Go (In Maine)
Foursquare recently open-sourced over 100 million places of interest (POI). It’s a great gesture, but be warned the data is not fully normalized or validated.
I needed to make at least one map for the 30-Day Map Challenge, and wanted to see what this data looked like (especially since I helped build it back in the day when I was foolish enough to “check in” at locations via the Foursquare app).
Getting the data is easy. You will need the AWS CLI, and can make an unsigned request to retrieve the Parquet files:
$ aws s3 sync s3://fsq-os-places-us-east-1/release/dt=2024-11-19/places/parquet/ --no-sign-request ./places/
$ tree -h places
[4.0K] places
├── [434M] places-00000.snappy.parquet
├── [434M] places-00001.snappy.parquet
├── [434M] places-00002.snappy.parquet
├── [434M] places-00003.snappy.parquet
├── [434M] places-00004.snappy.parquet
├── [434M] places-00005.snappy.parquet
├── [434M] places-00006.snappy.parquet
├── [434M] places-00007.snappy.parquet
├── [434M] places-00008.snappy.parquet
├── [434M] places-00009.snappy.parquet
├── [434M] places-00010.snappy.parquet
├── [434M] places-00011.snappy.parquet
├── [434M] places-00012.snappy.parquet
├── [434M] places-00013.snappy.parquet
├── [434M] places-00014.snappy.parquet
├── [434M] places-00015.snappy.parquet
├── [434M] places-00016.snappy.parquet
├── [434M] places-00017.snappy.parquet
├── [434M] places-00018.snappy.parquet
├── [434M] places-00019.snappy.parquet
├── [434M] places-00020.snappy.parquet
├── [434M] places-00021.snappy.parquet
├── [434M] places-00022.snappy.parquet
├── [434M] places-00023.snappy.parquet
└── [434M] places-00024.snappy.parquet
I did this on my home server, and I’m being very lazy today and working from a comfy chair three floors up and many square feet removed from said box. While I would normally use {duckdbfs} to do data ops on those files, I refuse to suffer the extra few seconds of query delay, so you get SQL:

COPY (
  FROM read_parquet('./foursquare/places/*.parquet')
  SELECT name, latitude, longitude, post_town, fsq_category_labels
  WHERE (lower(region) = 'maine' OR upper(region) = 'ME')
    AND country = 'US'
) TO '4sqme.json' (FORMAT JSON);
A quick scp to my laptop and we can now get down to bidnez. I always have Maine geo data handy, so we’ll pull in the border and counties:
me_counties <- read_sf("~/Data/me-counties.geojson")
me_border <- read_sf("~/Data/me-border.geojson")
Now we’ll read in the Foursquare points from Maine:
jsonlite::stream_in(file("~/Data/4sqme.json")) |>
  filter(
    !is.na(longitude),
    !is.na(latitude)
  ) |>
  st_as_sf(
    coords = c("longitude", "latitude"),
    crs = st_crs(me_counties)
  ) -> me4sq
As noted, the data is a bit janky, so we’ll need to make sure the points are in my state, also remove any POI without a category, and only grab the top-level category so I can use a decent palette:
me4sq |>
  st_filter(
    me_border,
    .predicate = st_within
  ) |>
  filter(
    lengths(fsq_category_labels) > 0
  ) |>
  mutate(
    top_level = fsq_category_labels |>
      map_chr(\(.x) .x[[1]]) |>
      stri_replace_all_regex("> .*", "") |>
      stri_trim_both()
  ) -> actually_in_maine
Finally, we plot it all (ref. section header):
ggplot() +
  with_shadow(
    geom_sf(
      data = me_border,
      fill = "white"
    ),
    x_offset = -2,
    y_offset = -2
  ) +
  geom_sf(
    data = me_counties,
    fill = NA,
    size = 1/4
  ) +
  geom_sf(
    data = actually_in_maine,
    aes(color = top_level),
    size = 1/4,
    alpha = 1/6,
    show.legend = FALSE
  ) +
  scale_fill_tableau() +
  coord_sf(datum = NA) +
  facet_wrap(~top_level, ncol = 5) +
  labs(
    title = "Oh The Places You'll Go (In Maine)!",
    subtitle = "Locations plotted from top-level categories in the Foursquare Places open data."
  ) +
  theme_ipsum_gs(grid = "") +
  theme(
    strip.text.x.top = element_text(hjust = 0.5)
  )
I’m pretty sure Foursquare released this to gin up sales for their higher-quality premium data available via their API. But, it’s hard to complain about having some free geo data to play with.
WhiteWind
WhiteWind is a free blogging platform built on the AT Protocol (atproto) that integrates with Bluesky accounts. The platform enables users to publish markdown-formatted content while maintaining complete control over their data through personal data servers (PDS).
The project uses a mixed technology stack with Go and TypeScript. The backend implements XRPC API functionality in Go, while the frontend utilizes Next.js. Development environments are containerized using a Go-based devcontainer configured for TypeScript development.
A primary attribute of WhiteWind is its decentralized approach to data storage and user management. Content is stored on assigned personal data servers, preventing the WhiteWind service from having direct control over user content modification, visibility, or deletion. This tracks with ATproto’s core principles of decentralized user account management and user-controlled data storage.
The project is under active development with rapid architectural changes. While formal contribution guidelines and documentation are still in development, the project welcomes both pull requests and bug reports from the community.
You can read this section on WhiteWind.
My Own Private PDS
I’ve been holding off pointing to the Bluesky repo that lets you run your own Personal Data Server (PDS) until I had a chance to set it up (to make sure it was straightforward enough).
Gosh, they made it pretty much painless. Just grab the installer.sh as they tell you to in that repo, run it the way they said to, and you’ll be up and running in no time. You just need a domain, an IP, and some scant system resources.
You can use a cheap VPS if you want to only run a PDS, but I’ve got some beefy cloud boxes and ended up co-hosting it with some other apps. I’m also running Ubuntu 24.04. Both of those items changed up a few things.
First, I had to modify the script to think 24.04 was OK to use (it is). That’s just changing an if statement high up in the script.
Then, I had to exit the script at the point where it generated the Caddy config and pulled down the Docker Compose YAML (ugh) file. I added the Caddy config to my own Caddy setup and deleted the Caddy portion from the Docker Compose file.
I ran the remainder of the script, and it worked great!
========================================================================
PDS installation successful!
------------------------------------------------------------------------
Check service status : sudo systemctl status pds
Watch service logs   : sudo docker logs -f pds
Backup service data  : /pds
PDS Admin command    : pdsadmin

Required Firewall Ports
------------------------------------------------------------------------
Service                Direction  Port  Protocol  Source
-------                ---------  ----  --------  ------
HTTP TLS verification  Inbound    80    TCP       Any
HTTP Control Panel     Inbound    443   TCP       Any

Required DNS entries
------------------------------------------------------------------------
Name             Type  Value
-------          ----  -----
pds.rudis.dev    A     104.225.216.74
*.pds.rudis.dev  A     104.225.216.74

Detected public IP of this server: 104.225.216.74

To see pdsadmin commands, run "pdsadmin help"
========================================================================
I did a test post from Bash:
ACCESS_JWT=$(curl -s -X POST "https://pds.rudis.dev/xrpc/com.atproto.server.createSession" \
  -H "Content-Type: application/json" \
  -d '{"identifier": "bob.pds.rudis.dev", "password": "yes-there-is-a-password"}' | jq -r .accessJwt)

curl -X POST "https://pds.rudis.dev/xrpc/com.atproto.repo.createRecord" \
  -H "Authorization: Bearer ${ACCESS_JWT}" \
  -H "Content-Type: application/json" \
  -d '{
    "repo": "bob.pds.rudis.dev",
    "collection": "app.bsky.feed.post",
    "record": {
      "text": "Hello from my self-hosted PDS!",
      "createdAt": "2024-11-24T13:43:10Z"
    }
  }'
and, it worked (sort of):
I still need to figure out that “invalid handle” message, and have the Bluesky network crawl the PDS (if I want it federated… not sure I do).
I gave ATFile a go with it, too:
$ atfile upload bw-shield.png
Uploading '/Users/hrbrmstr/Documents/bw-shield.png'...
---
Uploaded: 🖼️ bw-shield.png
↳ Blob: pds.rudis.dev/xrpc/com.atproto…
  Key: 3lbp6kujvqk2a
↳ URI: atfile://did:plc:ktycg4pjzupqr5755su5mz6j/3lbp6kujvqk2a
And, that also worked: pds.rudis.dev/xrpc/com.atproto…
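Since ATFile writes those uploads as ordinary records (in its blue.zio.atfile.upload collection), you can also pull the record back with the stock com.atproto.repo.getRecord XRPC endpoint, using the key printed by the upload:

```shell
# Fetch the ATFile upload record over plain XRPC using the rkey
# ("Key:") from the atfile upload output above
pds="https://pds.rudis.dev"
did="did:plc:ktycg4pjzupqr5755su5mz6j"
url="$pds/xrpc/com.atproto.repo.getRecord?repo=$did&collection=blue.zio.atfile.upload&rkey=3lbp6kujvqk2a"
echo "$url"
# curl -s "$url" | jq .
```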
The PDS supports arbitrary Blob storage, is authenticated, and has a well-defined protocol for interacting with these cryptographically signed records. Sounds like a great new data toy to play with! I may try to put the Markdown for all the Drops into it and provide a way to list them and view them.
FIN
We all will need to get much, much better at sensitive comms, and Signal is one of the only ways to do that in modern times. You should absolutely use that if you are doing any kind of community organizing (etc.). Ping me on Mastodon or Bluesky with a “🦇?” request (public or faux-private) and I’ll provide a one-time use link to connect us on Signal.
Remember, you can follow and interact with the full text of The Daily Drop’s free posts on Mastodon via
@dailydrop.hrbrmstr.dev@dailydrop.hrbrmstr.dev
☮️
GitHub - bluesky-social/pds: Bluesky PDS (Personal Data Server) container image, compose file, and documentation