Writing

Adventures in S3 Cost Optimization

I recently worked on a project to dial down the burn on an AWS bill. There was the standard stuff like purchasing savings plans/RIs and getting rid of orphaned EBS volumes (all those lonely EBS orphans wandering the streets), but S3 was where a big chunk of savings lived.

There were three main levers: finding a bunch of uncompressed logs, setting up storage tiering for things that could roll off of the Standard storage class, and adding an S3 gateway endpoint to the VPC.

.gz was a lie

The biggest bucket was a log bucket — many terabytes. Spot-checking the log files, I noticed they all had .gz extensions but were larger than I’d expect given the logging config (dozens of MB vs. single digits). When I pulled one down to open it, gzip choked on it.

Long story short: I looked at the encoding and figured out it was actually raw JSON. Womp womp. The compress gzip setting was missing from the application’s fluentd config, but .gz was specified in the file name template.

The fix? Caveman brain said “we could run a local script to loop through all the files and zip them,” but my wiser angels knew there were millions of files and that would take forever. S3 Batch Operations to the rescue. I Lambda-ized the script I would have written to run locally and had the Batch Operation process the bucket: find unzipped files, gzip them, and write new objects; a day later, a lifecycle rule expired the originals.
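Roughly what that Lambda looks like (a minimal sketch, not the exact function I ran; the destination prefix is illustrative, and a real version needs better error handling):

import gzip
import urllib.parse

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    # S3 Batch Operations invokes the function with one or more tasks,
    # each pointing at an object from the job's manifest.
    results = []
    for task in event["tasks"]:
        bucket = task["s3BucketArn"].split(":::")[-1]
        key = urllib.parse.unquote_plus(task["s3Key"])
        try:
            body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
            if body[:2] != b"\x1f\x8b":  # not actually gzipped, despite the .gz name
                s3.put_object(
                    Bucket=bucket,
                    Key="compressed/" + key,  # illustrative; the originals expire via lifecycle
                    Body=gzip.compress(body),
                    ContentEncoding="gzip",
                )
            code, msg = "Succeeded", ""
        except Exception as exc:
            code, msg = "PermanentFailure", str(exc)
        results.append({"taskId": task["taskId"], "resultCode": code, "resultString": msg})

    # Batch Operations expects this response shape back.
    return {
        "invocationSchemaVersion": event["invocationSchemaVersion"],
        "treatMissingKeysAs": "PermanentFailure",
        "invocationId": event["invocationId"],
        "results": results,
    }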

Result: ~$10k in annual savings

What do we use this stuff for?

The next thing I looked at was Lifecycle rules and storage tiering. Everything (logs, FE assets, customer uploads) was in the Standard tier. I looked at turning on Intelligent-Tiering, but given the number of objects in the buckets, its per-object monitoring fee would have eaten a big chunk of the cost savings I’d just found. So I did a one-time Storage Class Analysis instead and waited a day for the results.
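Turning the analysis on is one call per bucket. A boto3 sketch (the bucket names and config ID are placeholders):

import boto3

s3 = boto3.client("s3")

s3.put_bucket_analytics_configuration(
    Bucket="example-assets-bucket",  # placeholder
    Id="whole-bucket-analysis",
    AnalyticsConfiguration={
        "Id": "whole-bucket-analysis",
        # No Filter means the whole bucket gets analyzed.
        "StorageClassAnalysis": {
            "DataExport": {
                "OutputSchemaVersion": "V_1",
                "Destination": {
                    "S3BucketDestination": {
                        "Format": "CSV",
                        "Bucket": "arn:aws:s3:::example-analytics-exports",  # placeholder
                        "Prefix": "storage-class-analysis/",
                    }
                },
            }
        },
    },
)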

As I expected, the log bucket was never accessed (or we’d have known the logs weren’t gzipped), and the other large buckets had a predictable drop-off: after 180 days, access fell to almost zero for most objects.

I knew we were going to want to work with the logs more, so I didn’t want to toss them into Standard-IA (Infrequent Access) by default, even though actual use would still likely be infrequent. I opted for a hedge: everything older than a year goes to IA.

In writing the Lifecycle rule to handle rotation, I came across a little tidbit I hadn’t known about before: the IA tiers have a 128 KB minimum billable object size, so anything smaller is charged as if it were 128 KB, and buckets with lots of small objects get hit hardest. For the log buckets, this was fine; they were all big boys. But it didn’t make sense for the other buckets.

For the non-log buckets, I implemented a Lifecycle rule that rotated objects to IA after 180 days, but only if they were larger than 128 KB; otherwise, in most cases it was cheaper for them to stay in Standard tier.
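Expressed as a Lifecycle configuration, it looks roughly like this (a boto3 sketch; the bucket name and rule ID are placeholders, and 131072 bytes is 128 KB):

import boto3

s3 = boto3.client("s3")

s3.put_bucket_lifecycle_configuration(
    Bucket="example-assets-bucket",  # placeholder
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "standard-ia-after-180-days",
                "Status": "Enabled",
                # Only transition objects big enough to clear the 128 KB minimum.
                "Filter": {"ObjectSizeGreaterThan": 131072},
                "Transitions": [{"Days": 180, "StorageClass": "STANDARD_IA"}],
            }
        ]
    },
)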

Result: Another ~$15k in annual savings

Finally, I dug into my least favorite AWS billing line item: EC2-Other. It’s just a grab bag of often nearly invisible costs that can sometimes add up to quite a lot. No one ever really owns it and the line item descriptions tend to be… not helpful.

Hmm… why is Amazon Elastic Compute Cloud NatGateway nearly $2k a month? That seemed high relative to the overall compute bill. I suspected most of it was S3 traffic, so I used VPC Flow Logs via Athena to confirm. If you’ve got flow logs in S3, you can point Athena at them (or use “Create table from Flow Logs” in the VPC console) and run something like this to see where the NAT bytes went, filtering on your NAT gateway’s ENI as interface_id:

SELECT dstaddr, SUM(bytes) AS total_bytes
FROM vpc_flow_logs
WHERE interface_id = 'eni-xxxxxxxx'   -- NAT gateway ENI from the VPC console
  AND date BETWEEN '2025-01-01' AND '2025-12-31'
GROUP BY dstaddr
ORDER BY total_bytes DESC
LIMIT 20;

That gives you the top destinations by volume; cross-reference the IPs against the published AWS IP ranges (or just look up the big ones) and you’ll see the S3/ECR prefixes.
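If you don’t want to eyeball IPs, AWS publishes its ranges as JSON. A quick sketch of matching a destination against them (the example IP is illustrative):

import json
import urllib.request
from ipaddress import ip_address, ip_network

RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"

def aws_services_for(addr):
    # Return the AWS service labels whose published IPv4 prefixes contain this IP.
    prefixes = json.load(urllib.request.urlopen(RANGES_URL))["prefixes"]
    ip = ip_address(addr)
    return sorted({p["service"] for p in prefixes if ip in ip_network(p["ip_prefix"])})

print(aws_services_for("52.216.0.1"))  # illustrative IP; prints something like ['AMAZON', 'S3']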

Sure enough, most of the NAT traffic was hairpinning right back into AWS for S3 and ECR.

The fix here is simple: VPC endpoints. I added an S3 gateway endpoint and ECR interface endpoints and associated them with the route tables (S3) or subnets (ECR) for the app. The ECR proportion was only 10% of the overall NAT traffic, but at that volume the interface endpoint paid for itself. (Gateway endpoints are free.)
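Creating them is a couple of API calls. A sketch (the region and all the IDs are placeholders; ECR wants both the api and dkr endpoints):

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")  # placeholder region

# S3 gateway endpoint: attaches to route tables, no hourly charge.
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],
)

# ECR interface endpoints: ecr.api for the ECR API, ecr.dkr for Docker image pulls.
for service in ("com.amazonaws.us-east-1.ecr.api", "com.amazonaws.us-east-1.ecr.dkr"):
    ec2.create_vpc_endpoint(
        VpcEndpointType="Interface",
        VpcId="vpc-0123456789abcdef0",
        ServiceName=service,
        SubnetIds=["subnet-0123456789abcdef0"],
        SecurityGroupIds=["sg-0123456789abcdef0"],
        PrivateDnsEnabled=True,
    )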

Result: ~$18k in annual savings

The tally

$43k in savings isn’t bad for a few hours’ work spread across a couple of days. I think that’s part of the reason I’ve always enjoyed cost optimization. The impact is usually really quick and meaningful. There’s often a decent amount of low-hanging (if somewhat obscure) fruit that is satisfying to pluck. As Bluey’s mum says: sometimes boring stuff is important too.


Fish Stick: A Stateless Incident Management Bot for Slack

I just released Fish Stick, a stateless incident management bot for Slack. I’ve built this bot six+ times at different jobs, so I figured it was time to stop recreating it and share it with the world.

The pitch: Most incident management tools are either too simple (basic Slack workflows) or too complex (enterprise platforms with a million knobs). Fish Stick sits in the middle—more powerful than workflows, simpler than enterprise tools.

Fish Stick Demo

Key features:

  • Random channel names (incident_furious_chicken, incident_brave_penguin)
  • Timeline logging with timestamps
  • Threaded team updates to stakeholders
  • Incident commander tracking and handoffs
  • Auto-generated timeline reports from channel history
  • Private incident support
  • Test mode for game days

The interesting part: it’s completely stateless. No database. No web interface. No OAuth flow. For this use case and this niche, Slack as DB is good enough.

All incident data lives in Slack:

  • Incident metadata → channel properties and pinned messages
  • Timeline → messages in the incident channel
  • Summary → pinned message

You can restart the bot anytime without losing anything. This keeps deployment dead simple.

Built it in TypeScript with the Slack Bolt framework. Supports Socket Mode for local dev (no public URL needed) and HTTP mode for production. Takes about 5 minutes to set up if you use the app manifest.

I’ve pared down the features quite a bit from what I have in other versions of the bot to keep things simple, but will be adding back some things like webhooks over time.

It’s MIT licensed. If you run incidents in Slack and want something between “too basic” and “too much,” check it out: github.com/chrisdodds/fishstick


AI Layoffs

Paycom, OKC’s biggest “tech” company, announced layoffs today that it blamed on AI. Lots of other companies have done the same recently (Microsoft, Salesforce, etc.).

The cynical (but true!) counter-narrative is that AI is just an excuse. Companies have always limped along with pointless inefficiencies.

After a couple years of LLM work, my read is that most “AI optimization” is just someone finally asking obvious questions about bad processes. It’s in-house consulting dressed up as innovation. AI mostly works as a permission structure to kill dumb workflows that could’ve been fixed 20 years ago.

Example: many moons ago I worked at a company where field offices entered truck weights in a spreadsheet, printed and faxed them, and someone at corporate re-typed the numbers into Access. One hour of questioning turned that job into linked Excel sheets. I guess we could’ve slapped an “AI transformation” label on it.

That feels like what’s happening now. “Rub some AI on everything” creates the cover story, but the real driver is people finally saying: “Why are we doing this?”

Using LLMs for meaningful work in a consistent/deterministic way is hard. These companies didn’t all become AI experts overnight.

None of this makes the layoffs less brutal for the people caught up in them, but it does punch some more holes in the AI-is-eating-the-world story.


Monitor Available IPs with Lambda and CloudWatch

I ran into a situation where I needed to keep track of available IPs related to an AWS EKS cluster and couldn’t find any off-the-shelf tooling in AWS or otherwise to do so.

Tangential gripe: the reason I needed the monitor is that EKS doesn’t support adding subnets to a cluster without re-creating it, and the initial subnets were a little too small due to reasons. I wanted a sense of how much runway I had pending AWS fixing the gap or me implementing a workaround.

So, I cobbled together a Lambda function to pull the info and pipe it into CloudWatch.

Gist here

I’m using tags to scope the subnets I want to track, rather than piping in everything – since CloudWatch custom metrics cost money. But you could use whatever filters you wanted.
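The shape of the approach, if you don’t want to click through to the gist (the tag key, namespace, and metric name here are placeholders, not necessarily what the gist uses):

import boto3

ec2 = boto3.client("ec2")
cloudwatch = boto3.client("cloudwatch")

def handler(event, context):
    # Only look at subnets explicitly tagged for monitoring,
    # since each custom metric costs money.
    subnets = ec2.describe_subnets(
        Filters=[{"Name": "tag:ip-monitor", "Values": ["true"]}]  # placeholder tag
    )["Subnets"]

    cloudwatch.put_metric_data(
        Namespace="Custom/VPC",  # placeholder namespace
        MetricData=[
            {
                "MetricName": "AvailableIpAddresses",
                "Dimensions": [{"Name": "SubnetId", "Value": s["SubnetId"]}],
                "Value": s["AvailableIpAddressCount"],
                "Unit": "Count",
            }
            for s in subnets
        ],
    )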

After getting the data into CloudWatch, I was quickly reminded that you can’t alarm on multiple metrics directly, so I used a metric math expression (MIN) to group them instead. This works for up to 10 metrics (this post should really be titled “The numerous, random limitations of AWS”), which luckily was enough in this case.

Then I set up an alarm for the threshold I wanted and tested it – it worked. Fun times.
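For reference, here’s roughly what the grouped alarm ends up looking like in boto3, assuming two tagged subnets and the placeholder namespace from the sketch above; the subnet IDs and threshold are made up:

import boto3

cloudwatch = boto3.client("cloudwatch")

def subnet_metric(metric_id, subnet_id):
    # One entry per subnet; ReturnData=False because only the MIN expression alarms.
    return {
        "Id": metric_id,
        "ReturnData": False,
        "MetricStat": {
            "Metric": {
                "Namespace": "Custom/VPC",
                "MetricName": "AvailableIpAddresses",
                "Dimensions": [{"Name": "SubnetId", "Value": subnet_id}],
            },
            "Period": 300,
            "Stat": "Minimum",
        },
    }

cloudwatch.put_metric_alarm(
    AlarmName="eks-subnets-available-ips-low",  # placeholder
    Metrics=[
        subnet_metric("m1", "subnet-0123456789abcdef0"),
        subnet_metric("m2", "subnet-0fedcba9876543210"),
        {"Id": "e1", "Expression": "MIN([m1, m2])", "Label": "Lowest available IPs", "ReturnData": True},
    ],
    ComparisonOperator="LessThanThreshold",
    Threshold=50,  # placeholder runway threshold
    EvaluationPeriods=1,
)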


How to Live-Rotate PostgreSQL Credentials

OK, I didn’t actually learn this today, but it wasn’t that long ago.

Postgres creds rotation is straightforward, with the exception of the PG maintainers deciding that words don’t mean anything when designing their identity model. “Users” and “Groups” used to exist in PG, but they were replaced in version 8.1 with the “Role” construct.

In Postgres, everything is a “Role.” A user is a role. A group is a role. A role is a role. If you’re familiar with literally any other identity system, just mentally translate “Role” to whatever makes sense in context.

Now that we’ve established this nonsense, here’s a way of handling live creds rotation.

CREATE ROLE user_group; -- create a group role, give it the appropriate grants.

CREATE ROLE user_blue WITH LOGIN ENCRYPTED PASSWORD 'REPLACE ME' IN ROLE user_group;

-- This one isn't being used yet, so login stays disabled.
CREATE ROLE user_green WITH ENCRYPTED PASSWORD 'REPLACE ME AS WELL' IN ROLE user_group NOLOGIN;

That gets you prepped. When you’re ready to flip things:

ALTER USER user_green WITH PASSWORD 'new_password' LOGIN;

Update the creds wherever else they need updating, restart processes, and confirm everything is using the new credentials.
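A quick way to check for stragglers before you cut over (a sketch; assumes clients connect directly as user_blue rather than through a pooler):

SELECT usename, count(*)
FROM pg_stat_activity
WHERE usename = 'user_blue'
GROUP BY usename;

Once that comes back empty, disable the old role: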

ALTER USER user_blue WITH PASSWORD 'new_password_2' NOLOGIN;

Easy, peasy.