Ciao a tutti,
I’m excited to announce the Tuscolo Static Certificate Transparency log, operated by Geomys.
Tuscolo is a production log, hosted on a single dedicated server by Port 179 LTD (better known as the operator of bgp.tools, AS206924, and Behind the Sofa). It is backed by Sunlight and a new local filesystem backend.
We wish for this log to be both public good infrastructure for the WebPKI, and a learning resource for the CT community. To that end, we are happy to share administrative and operational details.
The total yearly amortized cost to Geomys is £7,760 ($10,350): £5,250 of hardware price, and £500/month of colocation and operational support. This includes 500Mbps of P95 bandwidth (~160TB of traffic per month), which we estimate should be 2-3x what we’ll need, based on data from other operators and accounting for Static CT savings of up to 78%. We can easily scale bandwidth at increasingly cheap marginal costs.
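As a rough check of the figures above, here is a minimal sketch of the arithmetic. It treats the 500Mbps P95 commitment as if it were a sustained rate over a 30-day month, which is a simplifying assumption (P95 billing ignores the top 5% of samples), and `monthlyTB` is a hypothetical helper name.

```go
package main

import "fmt"

// monthlyTB converts a sustained rate in megabits per second into decimal
// terabytes transferred over a month of the given length in days.
func monthlyTB(mbps float64, days int) float64 {
	bytesPerSecond := mbps * 1e6 / 8
	seconds := float64(days) * 24 * 60 * 60
	return bytesPerSecond * seconds / 1e12
}

func main() {
	// Assumption: the link is driven at exactly 500 Mbps for 30 days.
	fmt.Printf("500 Mbps for 30 days: ~%.0f TB\n", monthlyTB(500, 30))

	// Yearly cost split, using only the figures quoted in the post (GBP).
	const totalYearly, colocationMonthly = 7760.0, 500.0
	colocationYearly := colocationMonthly * 12
	fmt.Printf("colocation: £%.0f/year, hardware share: £%.0f/year\n",
		colocationYearly, totalYearly-colocationYearly)
}
```

The result is about 162TB per month, in line with the ~160TB quoted. The hardware share works out to £1,760 per year, which would be consistent with amortizing the £5,250 purchase over roughly three years, although the amortization period is not stated.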
The hardware is an overprovisioned refurbished Supermicro H12SSW-AN6 with new enterprise SSDs for the main storage pool.
- 24-core Zen 3 (Milan) AMD EPYC 7443P
- 128GB of ECC DDR4 2666 MHz
- 25Gb/s Mellanox ConnectX-5
- 2x 512GB SK hynix PC801 NVMe SSDs (md raid1, rootfs)
- 4x 7.68TB Samsung PM9A3 U.2 SSDs (ZFS raidz2)
- Redundant power / PDUs
- Colocated at Digital Realty's LON1
Sunlight is configured with a pool size of 750 certificates per shard, selected somewhat arbitrarily, and sequences the pool every second. Tuscolo is currently receiving all Let’s Encrypt pre- and final certificates, with 1.05 system load, 41ms P50 / 101ms P99 sequencing time, and 1.04s P99 add-chain duration (including up to 1s of pooling).
Since the local backend lays out tiles according to the Static CT hierarchy, the read path could have been any HTTP file server. However, we implemented a custom read path, Skylight, to get better observability and rate limits.
In particular, the Tuscolo logs apply a global rate limit of 75 requests / second from clients that don’t include an email address in the User-Agent, and an additional global rate limit of 75 requests / second for partial data tile requests. A well-behaved client that identifies itself and opportunistically waits for tiles to fill up will not incur any rate limit. We think this will be an important tool to shape and iterate on client behavior, but we’re open to feedback from the community.
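To illustrate what a well-behaved client might look like, here is a minimal sketch. The User-Agent string, the email address, and the URL are placeholders, the exact User-Agent format Skylight expects is an assumption, and `newLogRequest` and `fetchableEntries` are hypothetical helper names. The only fact it relies on is that Static CT tiles are 256 entries wide.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// userAgent is a hypothetical identifier. Including a contact email address
// is what keeps a client out of the anonymous rate limit.
const userAgent = "example-monitor/0.1 (+mailto:ct-ops@example.com)"

// client is shared by all requests so that connections are reused.
// Requests built below would be sent with client.Do(req).
var client = &http.Client{Timeout: 30 * time.Second}

// newLogRequest builds a GET request that identifies the client.
func newLogRequest(url string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("User-Agent", userAgent)
	return req, nil
}

// fetchableEntries returns how many entries can be read using only full data
// tiles, leaving the partial tail for a later poll once it has filled up.
func fetchableEntries(treeSize int64) int64 {
	const tileWidth = 256 // entries per full Static CT tile
	return treeSize - treeSize%tileWidth
}

func main() {
	req, err := newLogRequest("https://log.example/checkpoint") // placeholder URL
	if err != nil {
		panic(err)
	}
	fmt.Println("User-Agent:", req.Header.Get("User-Agent"))
	fmt.Println("entries covered by full tiles:", fetchableEntries(1000))
}
```

A client that only needs to stay roughly current can poll the checkpoint, fetch up to `fetchableEntries(size)`, and pick up the remainder on a later poll rather than requesting the partial tile right away.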
A couple fun Skylight details:
- Rate limits are implemented with the GCRA, taking a page out of Let’s Encrypt’s book.
- Heavy hitter User-Agent strings are tracked with the Space Saving algorithm, and exposed internally, as they include private email addresses. We might publish redacted stats occasionally.
- Copious metrics are exposed publicly, just like Sunlight’s, including connection reuse stats. (Please reuse connections!) Feel free to scrape them, but please use an interval of at least 60s.
- The /health endpoint that’s linked to our pagers loads the checkpoint of each log, verifies their signatures, and returns a 500 status code if any checkpoint is older than 5s.
- All endpoints allow Cross-Origin Resource Sharing. I am not sure what this is for, but maybe someone will come up with something interesting!
We are internally targeting 99.9% uptime across endpoints, and are confident we can exceed the 99% threshold. You can track our uptime on the automated status page.
Currently, we accept the same roots as Let’s Encrypt’s Oak log, but we plan to ensure we accept all roots in the Google, Apple, and Mozilla root programs before we submit this log for inclusion.
We also set up a staging Sunlight instance on the same box, called Navigli. It only accepts staging and testing roots, and we’re happy to add more. Let’s Encrypt’s staging certificates include SCTs from it. We will test Sunlight development builds on this log, and we offer no SLO. With the introduction of Navigli, I will be decommissioning the Rome development log.
A selection of configuration files and system details is available here. You can also see our (short) playbook.
As you can imagine, I am running this log out of pocket in part to prove a point about how cheap operating a Static CT log can be. I am fully committed to running this log long-term regardless of funding, but it would be great to spin up another instance in a different location for redundancy, potentially on behalf of a generous CA or UA. ;)
Looking forward to any feedback,
Filippo
P.S.: Fun fact, the first commit of Sunlight was written at Tuscolo, on this red bench, which is pictured in the Sunlight logo.