PG&E, ERCOT, and Engineering Black Swans
Over the last few days I’ve seen a lot of comparisons between the power outages occurring in Texas and the power outages that are an annual event for parts of California. These events aren’t particularly comparable due to the wildly different circumstances at play, and I want to talk about why. This post will get into what engineers look at when designing a system, the concept of calculated risk, and how black swans can really come from nowhere.
Read More...
Awards are Bogus Metrics
I recently had the baffling experience of reaching out to a company to ask about commercial support for their product, as I was having a lot of trouble getting it installed and working. Those familiar with FOSS can probably already see the red flag: needing to engage commercial support to get even a tech demo working is usually a sign of poor engineering quality and a fragile solution.
Read More...
The confluence of cheapness and dubious design: AT&T FTTH
For annoying reasons that I won’t get into here, I’m finally building out a home office. For me this meant getting another IKEA desk, and then making sure that the network path from my main network rack out to the garage where my office will be is built well and reliably installed. This has so far been a case of pulling wire through the attic, and then putting in a network terminal in the closet where all the network gear lives.
Read More...
Early Config Binding
Early and late binding are often discussed in terms of symbol resolution in programs that have symbols loaded from shared objects and static libraries, so what does this have to do with configuration? It turns out that a lot of the pitfalls and concepts that have to do with symbol resolution also apply to configuration management. In a traditional systems management environment, configuration binding is typically performed very late. The binding happens either by a tool such as Ansible writing config files into place, or a package containing configuration files being installed, or even an admin logging into a machine and writing the config data.
Read More...
Alpine Hashistack 6 Months On
Just over 8 months ago I wrote about running the complete HashiCorp stack on top of Alpine Linux. Since then, the entire production workload of my work has moved over to this cluster, and through a handful of upgrades we’ve learned a lot about how it works and how to maintain it. This article is a followup to the original; if you haven’t read that one, you should take a break and do so.