Automation · August 22, 2025 · 6 min read

Building Internal Automation That Actually Gets Used

Most internal tools get built, shown to the team, and quietly abandoned. The failure is almost never technical.

Tags: Automation, Python, Internal Tools, Operations

The Pattern I Keep Seeing

Someone builds a script that genuinely saves time. It gets announced in Slack. A few people use it. Three months later, the original author is still the only person running it, and even they've started doing it manually again because they can't remember the exact flags.

The engineering usually isn't the problem. The problem is that the tool was designed for the person who built it, not for the person who needs to run it.

Before You Write a Line

[Figure: Automation build lifecycle]

Map the manual process in detail. Watch someone actually do it, not just describe it. Ask what they forget every time. Ask what the consequences are when it goes wrong. Ask which parts are annoying vs. which parts are just slow.

The two questions that matter most:

How often does this happen? Automation for something that happens twice a month is very different from automation for something that happens fifty times a day. The first one needs to be reliable and well-documented. The second one needs to be fast and require zero thought.

What does failure look like? If the manual process fails, someone notices immediately. If the automated version fails, when does someone notice? That gap needs to be planned for.

Design for the Person Running It

# This is the right version
python onboard.py user@company.com

# This requires the runner to hold context that lives in your head
python onboard.py --email user@company.com --env prod --notify true --template standard

The common case should require the fewest arguments. Optional parameters should have defaults that are right 90% of the time. Everything else is friction that will eventually cause someone to do it manually instead.
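One way to get there with the standard library's argparse is sketched below. It is a minimal illustration, not the real onboard.py: the default values and the --no-notify flag are assumptions chosen to mirror the long-form command above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="onboard.py",
        description="Onboard a new user.",
    )
    # The only required input is the one that changes on every run.
    parser.add_argument("email", help="address of the user to onboard")
    # Everything else is optional. These defaults are illustrative assumptions.
    parser.add_argument("--env", default="prod",
                        help="target environment (default: %(default)s)")
    parser.add_argument("--template", default="standard",
                        help="onboarding template (default: %(default)s)")
    # Hypothetical flag: notification is on unless the runner opts out.
    parser.add_argument("--no-notify", dest="notify", action="store_false",
                        help="skip the notification (sent by default)")
    return parser


# The common case: one argument, every other value filled in by a default.
args = build_parser().parse_args(["user@company.com"])
print(f"Onboarding {args.email} "
      f"(env={args.env}, template={args.template}, notify={args.notify})")
```

Putting the defaults in the help strings means --help documents the common case on its own, which also helps with the handoff later.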

Output matters as much as input. A script that runs silently and exits zero isn't helping anyone. Print what you're doing, print the result, and print something actionable when things go wrong.

# Bad: fails silently
try:
    provision_user(email)
except Exception:
    pass

# Good: tells you what happened and what to do about it
import sys

try:
    provision_user(email)
    print(f"Provisioned {email} successfully")
except ProvisioningError as e:
    # Errors go to stderr so they stay visible when stdout is redirected
    print(f"Failed to provision {email}: {e}", file=sys.stderr)
    print("Check that the account exists in Entra ID and try again.", file=sys.stderr)
    sys.exit(1)

The Handoff Problem

Automation that only works when the author is available is a liability, not an asset. The handoff test I use: ask someone who wasn't involved in the build to run it cold, using only the --help output and the runbook. If they need to call you, it's not ready.

The runbook should cover three things: how to run it normally, what the common error states look like, and who to contact when something goes wrong that isn't covered. Not a novel. A reference document someone can scan in 60 seconds at 2am.

Measure Whether It's Being Used

You won't know if the automation is actually running unless you instrument it. Log every invocation with a timestamp. If it processes records, log the volume. Set up a simple alert that fires when it hasn't run for longer than expected.
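A rough sketch of that instrumentation, assuming a JSON-lines file is good enough as the log. The function names, the field names (ts, records, ok), and the file name are all hypothetical; a real setup might write to whatever logging or metrics system the team already has.

```python
import json
import tempfile
import time
from pathlib import Path
from typing import Optional


def record_invocation(log_path: Path, records: int, ok: bool,
                      now: Optional[float] = None) -> dict:
    """Append one JSON line per run: when it ran, how much it processed, whether it succeeded."""
    entry = {"ts": time.time() if now is None else now, "records": records, "ok": ok}
    with log_path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return entry


def is_stale(log_path: Path, max_age_seconds: float,
             now: Optional[float] = None) -> bool:
    """True if the tool has never run, or its last run is older than max_age_seconds."""
    if not log_path.exists():
        return True
    lines = log_path.read_text(encoding="utf-8").splitlines()
    if not lines:
        return True
    last = json.loads(lines[-1])
    current = time.time() if now is None else now
    return current - last["ts"] > max_age_seconds


# Usage: a throwaway directory stands in for wherever the log would really live.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "onboard_runs.jsonl"
    record_invocation(log, records=12, ok=True)
    print("stale?", is_stale(log, max_age_seconds=24 * 3600))
```

The alert itself can be as simple as a scheduled job that calls is_stale and posts a message when it returns True.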

Six months after you ship something, you should be able to answer: is this running? How often? Are there errors? If you can't answer those questions, you've built something that could be silently broken and you wouldn't know.
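Those three questions can be answered from very little data. The sketch below assumes each run was recorded as a dict with a Unix timestamp and a success flag; the field names ts and ok and the summarize function are hypothetical, not part of any particular tool.

```python
from datetime import datetime, timezone
from typing import Dict, List


def summarize(runs: List[Dict], now: float) -> Dict:
    """Answer: is it running, how often, and are there errors?"""
    if not runs:
        return {"runs": 0, "errors": 0, "last_run": None, "runs_per_week": 0.0}
    timestamps = [r["ts"] for r in runs]
    errors = sum(1 for r in runs if not r["ok"])
    # Measure frequency over the time since the first recorded run,
    # with a one-day floor so a brand-new log doesn't inflate the rate.
    span_days = max((now - min(timestamps)) / 86400, 1.0)
    last_run = datetime.fromtimestamp(max(timestamps), tz=timezone.utc)
    return {
        "runs": len(runs),
        "errors": errors,
        "last_run": last_run.isoformat(),
        "runs_per_week": round(len(runs) / span_days * 7, 2),
    }
```

If a report like this shows zero runs, or a last run from months ago, that is the signal to go find out whether people have drifted back to doing it by hand.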

The best internal automation is the kind nobody thinks about because it's just always worked. That doesn't happen by accident.