Simple but Effective SSH Rate Limiting with PAM and nftables

Anyone who operates an SSH server somewhere on the Internet is bound to suffer a relentless torrent of inbound connections, probably from some botnet or another, trying to log in with the myriad credentials that leaked from other systems and networks.

While the security aspects of such attempts can be mitigated rather easily using, for example, SSH public key authentication, another consequence of this problem can be more annoying to deal with: system logs overflowing with messages from failed login attempts, pertaining to all these illegitimate clients trying their luck against your server.

To deal with this nuisance, this article presents an approach that combines an example nftables firewall policy with a helper script that PAM executes after successful authentication.

Prior Art: fail2ban, sshguard

Of course, since this problem has existed for decades by now, there already are effective solutions to it: software like sshguard or the more generally suitable fail2ban will dutifully monitor arbitrary sources of log data for you, and sanction (by dynamically adding firewall rules to drop traffic from such sources, for example) offending network peers.

The log processing approach has, as so many things in life, its pros and cons: It is very flexible, and you can map pretty much arbitrary conditions and combinations of factors to outcomes and consequences of your choosing. A definite weakness, on the other hand, is all the ingesting and parsing and correlating of log data that has to be going on all the time. Also, in the face of logrotate(8) and other often messy ways to deal with an abundance of logs, the machinery performing this can turn out a bit brittle — and, given sufficient (bad) luck, could cause your mitigations to fail. I’ve seen it happen on very busy machines with suboptimal choices made in their logging configuration. So why not try to avoid the problem by design, which implies we also can’t incur it by accident?

How about “succeed2unban” instead?

Recently, I once again felt the need to address this problem on one of my own tiny islands within the vast seas of the Internet, but I wanted a more lightweight approach to it. After some thinking, it dawned upon me that nftables/netfilter and PAM already provided me with all the devices I needed to build something better, at least according to my tastes.

You see, with the existing approaches, the system will usually tally violations (i.e., matches of an expression of some kind against your log content) of the “you shall not fail to have your SSH session authorized”-commandment, and punish those who trip up too often.

I decided to turn that operational principle on its head, at least somewhat: Every source address (or, in the case of non-legacy Internet: IPv6 network prefix) gets an attempt to establish its first new TCP connection to my precious sshd listening socket. While that single connection is accepted and processed, each such peer is recorded in a dynamic nftables set (with an expiration timer attached to each entry). And while any given peer is a member of this set, subsequent new connection attempts from that peer will be disallowed to arrive at the SSH listener.

This alone would already cover a wide range of (ab)use cases, but it breaks an important legitimate one: spawning a number of SSH connections in quick succession, as happens when someone runs ansible (without ControlMaster in effect, at least). To deal with that, on the host that implements this firewalling policy, and behind its potent sshd- and PAM-provided authentication and authorization schemes, we work a tiny bit of scripting magic: Involving pam_exec(8) in PAM’s session phase causes a small helper program to find out the source address associated with the already established and authorized SSH session, and to remove that specific record from the nftables rate-limiting set.
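To make the principle concrete, here is a toy model of it in plain shell. It is only a sketch for illustration: the names ratelimited, new_connection, login_succeeded and verdict are made up for this example, the list variable merely stands in for the kernel-side nftables set, and the per-entry expiration timer is not modelled at all.

```shell
# Toy model of the scheme, NOT the real mechanism: a space-delimited
# list stands in for the dynamic nftables set (no timeout handling).
ratelimited=" "

# new_connection PEER
# Sets $verdict to "accept" for a peer's first attempt and to "drop"
# for as long as the peer is still recorded in the list.
new_connection() {
  case $ratelimited in
    *" $1 "*) verdict=drop ;;
    *) ratelimited="$ratelimited$1 "; verdict=accept ;;
  esac
}

# login_succeeded PEER
# Mirrors what the PAM helper does: forget the peer again, so that its
# next connection attempt is treated like a first one.
login_succeeded() {
  local kept=" " peer
  for peer in $ratelimited; do
    [ "$peer" = "$1" ] || kept="$kept$peer "
  done
  ratelimited=$kept
}
```

Calling new_connection 192.0.2.7 twice in a row yields accept and then drop; after login_succeeded 192.0.2.7, the next attempt from that address is accepted again, which is exactly what a legitimate ansible run needs.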

The beauty of it is that this has only very few moving parts — or maybe, more correctly put, most of these moving parts are already in motion anyway, and are a given on most systems. On Debian GNU/Linux, OpenSSHd already uses PAM by default, and the integration with nftables/nft(8) takes but a little sprinkling of magic across /etc.

Setup on Debian

To implement this scheme on a Debian 12 host, we are going to ensure a few crucial things:

sshd must actually use PAM (the default openssh-server configuration provides this)
You must have a suitable firewall policy, an example of which I will provide, that implements set-based rate limiting for inbound TCP connections on the SSH service port
You must have a helper script in place to remove clients from your rate-limiting sets once they have proven their innocence
sshd’s PAM stack must be wired up to execute that helper script under specific circumstances

The minimal set of Debian packages you need is just one execution of apt-get install openssh-server nftables away.

The first item from the above checklist is easily accounted for: Have root execute sshd -T | grep ^usepam — if there’s a yes at the end of the output, you’re OK. If sshd was configured to not use PAM to perform authentication, it’s time to ask the person calling the shots on this host what’s the reason for that particular configuration choice, and maybe discuss/consider changing it.
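If you would rather script this check than eyeball it, a small filter can do the reading for you. This is a minimal sketch; the function name uses_pam is made up for this example, and it assumes the "usepam yes" style of output that sshd -T prints.

```shell
# uses_pam: reads "sshd -T" style output on stdin and succeeds (exit
# status 0) only if it contains a "usepam yes" line.
uses_pam() {
  awk 'tolower($1) == "usepam" { found = (tolower($2) == "yes") }
       END { exit !found }'
}

# Typical use, as root:
#   sshd -T | uses_pam && echo "sshd uses PAM"
```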

Part two could get a little more hairy, but those who already have a netfilter policy in place are expected to know how and where exactly to adapt it, and those who don’t will not suffer from Debian’s benign “allow everything” default policy that establishes itself upon first installation of the nftables package. With that package installed, your persistent nftables firewalling policy, encoded in nft(8)-consumable statements, will be stored in a human-readable file at /etc/nftables.conf. To make sure the associated systemd service unit takes care of loading these rules while the system starts up, all you have to do is execute systemctl enable --now nftables.service once, and that, as they say, is that.

Please be aware that the system does not automatically update or overwrite an existing /etc/nftables.conf on your behalf after you have altered the nftables ruleset at runtime. Also, altering the content of the configuration file will not magically affect the runtime ruleset of your Linux kernel. You are responsible for keeping the policy (persisted and at runtime) up to date and in sync yourself. The one exception is the start of the nftables.service unit (usually) during system startup mentioned above.

While crafting your nft ruleset, work with a temporary copy of it from a location other than /etc/nftables.conf, and only persist changes to that particular file when you are sure you have validated that its new content will match your actual intentions.
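One way to put a guard rail around that last step is to let nft validate the working copy before it is allowed to replace the persisted file. The following is a sketch under a few assumptions: persist_ruleset is a name made up for this example, the NFT variable only exists so that the function can be tried out with a stub, and nft's -c (--check) option is relied upon to parse the file without applying it. A successful check does not prove that the policy matches your intentions, only that nft accepts it.

```shell
NFT=${NFT:-nft}   # overridable, so the function can be tried with a stub

# persist_ruleset CANDIDATE [TARGET]
# Copies CANDIDATE over TARGET (default: /etc/nftables.conf), but only
# if "nft -c" accepts it. The running ruleset is not touched.
persist_ruleset() {
  local candidate="$1" target="${2:-/etc/nftables.conf}"
  if ! "$NFT" -c -f "$candidate"; then
    echo "refusing to persist $candidate: validation failed" >&2
    return 1
  fi
  cp -- "$candidate" "$target"
}
```

As root, persist_ruleset /root/nftables.conf.new would then only overwrite /etc/nftables.conf if the candidate at least parses cleanly (the candidate path is just an example).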

nftables “firewall” Policy

As far as firewalling rules go, whenever possible, I use the most essential filtering policy that I can produce myself and without assistance of automated tools. This maximizes my chances that I will understand what is going on and quickly can get on top of things if (or when) something goes awry involving the system’s firewall. In this example case here, the system in question provides only a select few services to the outside world, which I will enable/allowlist explicitly: DNS, DNS-over-TLS, HTTP, and HTTPS. Oh, and SSH, of course!

As a result, this is my server’s nftables.conf, which you may be able to follow along without too much trouble. It was based on the nftables’ wiki example of a “Simple ruleset for a server”. If you need help interpreting what is going on, the chapter right after this one will hopefully be able to assist (and you’re invited to skip over it, in case the policy’s intention seems clear enough).

#!/usr/sbin/nft -f

flush ruleset

table inet firewall {
  set ssh_ratelimit_v6 {
    type ipv6_addr
    size 65535
    flags dynamic,timeout
    timeout 1m
  }

  set ssh_ratelimit_v4 {
    type ipv4_addr
    size 65535
    flags dynamic,timeout
    timeout 1m
  }

  chain inbound_ipv4 {
    icmp type echo-request limit rate 15/second accept
    ct state new tcp dport 22 add @ssh_ratelimit_v4 \
      { ip saddr limit rate 1/hour burst 1 packets } counter accept
  }

  chain inbound_ipv6 {
    icmpv6 type { nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert } accept
    icmpv6 type echo-request limit rate 15/second accept
    ct state new tcp dport 22 add @ssh_ratelimit_v6 \
      { ip6 saddr & ffff:ffff:ffff:ffff:: limit rate 1/hour burst 1 packets } counter accept
  }

  chain inbound {
    type filter hook input priority filter; policy drop;
    iifname "lo" accept
    meta protocol vmap { ip : jump inbound_ipv4, ip6 : jump inbound_ipv6 }
    ct state vmap { invalid : drop, established : accept, related : accept }
    udp dport 53 counter accept
    tcp dport { 53, 853 } counter accept
    tcp dport { 80, 443 } counter accept
  }

  chain forward {
    type filter hook forward priority filter; policy drop;
  }
}

If you would rather download than copy-and-paste the ruleset as a starting point for your own nftables adventures, you can fetch it using this link here.

To load a new nftables configuration into the kernel, execute a command like nft -f /path/to/your/nftables.conf. As already said above, while you are implementing this proposed scheme (or any other kind of nftables firewall policy change), I highly recommend not keeping your working copy of the nft configuration at /etc/nftables.conf, since whatever is stored there gets loaded automatically when the system boots.

Now imagine that you were to add some new rule that locks you out of your (remote) system for good — by inadvertently dropping all SSH traffic, for example — and after you manage to somehow reboot the box, you’re now locked out from the get-go. So make a temporary copy elsewhere, edit it and (re)load it from there, and only once you are 100% certain it’s fine, have its contents overwrite your system’s /etc/nftables.conf.
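A common safety net while experimenting is to load the candidate ruleset and have the known-good one reloaded automatically unless you step in. Below is a minimal sketch of that idea, not a polished tool: trial_load, NFT and KNOWN_GOOD are names invented for this example, and it assumes that the known-good file starts with flush ruleset (as the example policy does), so that reloading it fully replaces the experiment. Run it from an interactive shell on the server, ideally inside a terminal multiplexer.

```shell
NFT=${NFT:-nft}                               # overridable for dry runs
KNOWN_GOOD=${KNOWN_GOOD:-/etc/nftables.conf}  # assumed to begin with "flush ruleset"

# trial_load CANDIDATE [GRACE_SECONDS]
# Loads CANDIDATE, waits, then reloads the known-good policy. Interrupt
# the wait with Ctrl-C to keep the candidate ruleset active.
trial_load() {
  local candidate="$1" grace="${2:-60}"
  "$NFT" -f "$candidate" || return 1
  echo "Loaded $candidate. Try a NEW ssh login now; Ctrl-C within ${grace}s keeps it."
  sleep "$grace"
  echo "Not interrupted, reloading $KNOWN_GOOD."
  "$NFT" -f "$KNOWN_GOOD"
}
```

If the candidate locks you out, your terminal simply hangs, nobody presses Ctrl-C, and the previous policy comes back once the grace period is over.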

Tables and Chains Explained

In plain English, the implemented mechanisms in the proposed example ruleset above are designed to work as follows:

We have an nftables table for the inet address families, and it is named firewall. At the top of this table, we define two sets: Their type indicates that one will record IPv4 addresses, the other IPv6 addresses. Both are dynamic (i.e., can have elements added and removed after they have been defined in the system) and will “forget” entries that are older than one minute via their timeout configuration. The sets’ names reflect their intentions: they will eventually be used to rate-limit new, inbound SSH connections.

In the firewall table, we also have four chains. Each chain organizes and groups individual rules, a bit like functions in an imperative programming language group and organize statements and expressions (and also calls to other functions). We will ignore the last and, frankly, least interesting chain (named forward, and preventing the routing of packets), and instead look at the other three, from the bottom up:

Chain “inbound”

The chain named inbound is defined as hooking the packet path as a filter for the inbound direction of packet flow, and has a policy (i.e., a “default action” that will be taken for all packets that were not assigned any other explicit action while traversing this chain) of drop. This implies that we will have to carefully specify all the kinds of traffic that we actually want to make it through our firewall, and end up arriving at application sockets to do something with.

The second line of the chain definition, iifname "lo" accept, takes care of all traffic directed at the lo (aka loopback) interface in your Linux host. This makes TCP and UDP traffic over the host-local address ranges work (localhost, 127.0.0.0/8, ::1). It’s actually hard for me to come up with an example of circumstances in which you would not want that, and you can create all kinds of funky effects if you forget to establish a rule like this one.

Line number three is where it gets interesting (and more relevant to this writeup, I promise!), because that’s where we “split” all traffic into the two variants of the Internet Protocol we support: All IPv4 traffic gets processed by the inbound_ipv4 chain, while all IPv6 traffic will have to traverse inbound_ipv6 instead.

Before we concern ourselves with these IP-specific chains, let’s quickly finish up with the “general” inbound chain: The next line, which starts with ct state vmap, deals with allowlisting TCP and UDP flows that the kernel’s connection tracking considers part of previously accepted connections (in netfilter’s view, that even applies to the “connectionless” UDP transport). When configuring a stateful firewalling policy, you will generally see something like this somewhere in your packet path’s rules.

Furthermore, the host this policy was designed for offers DNS (both “vanilla” DNS and also DNS-over-TLS on port 853), as well as HTTP and HTTPS services to the public, which is why we accept these protocols’ well-known ports for the UDP/TCP transport layer protocols, where appropriate. Also, since keeping stats is cheap, we make nftables do that for these rules using the counter keyword.

Any other UDP or TCP service bound to any other ports than these is out of luck, and will have traffic sent towards it dropped/ignored by the kernel.

With all that said and done, let’s look at the rules we have to use to deal with the differences in IP address families, and the reasons for this as well.

Chain “inbound_ipv4”

  chain inbound_ipv4 {
    icmp type echo-request limit rate 15/second accept
    ct state new tcp dport 22 add @ssh_ratelimit_v4 \
      { ip saddr limit rate 1/hour burst 1 packets } counter accept
  }

This chain deals with traffic arriving at our host over the IPv4 protocol only; we ensured that by how we refer to it in its parent chain. The first line in the inbound_ipv4 chain will cause our system to respond to ICMP ECHO REQUEST messages (“pings”), no matter who hurled them at us, at most 15 times a second. Remember that with the default drop policy, all traffic that we do not actively condone somewhere in our ruleset will get the axe. So if we want (some) pings to be responded to, this has to be handled somewhere in the ruleset, and this is happening right here.

The next two lines are more interesting, as we’re getting closer to the meat of the subject: We use the ct (again, connection tracking) support of nftables to look only at inbound TCP connections in state new for dport 22, which is SSH’s well-known port. The state new match applies to packets that connection tracking has not yet associated with a known flow, which for TCP normally means the initial SYN segment a client sends to open a connection. If all these conditions hold true, we execute some nifty nft magic to add the saddr (source address) of the peer that triggered this rule to our rate-limiting set of IPv4 addresses. The object placed in the set makes sure that, for as long as it exists in this set, only one new connection attempt per hour of wallclock time is allowed for that peer.

It is important not to miss the burst 1 packets part of this rule, because otherwise, nft rules that limit packet rates will assume an implicit default burst setting of 5 (packets) during the specified observation period. This would cause each peer to be able to attempt up to five (instead of just one) SSH connections to your server before having its SSH traffic dropped, which is a lot more than what we actually want.

With the IPv4 part explained, let’s look at how exactly we’re dealing with IPv6 peers, and why.

Chain “inbound_ipv6”

  chain inbound_ipv6 {
    icmpv6 type { nd-router-advert, nd-neighbor-solicit, nd-neighbor-advert } accept
    icmpv6 type echo-request limit rate 15/second accept
    ct state new tcp dport 22 add @ssh_ratelimit_v6 \
      { ip6 saddr & ffff:ffff:ffff:ffff:: limit rate 1/hour burst 1 packets } counter accept
  }

These IPv6-specific rules need to be a bit more involved, mainly because IPv6 requires certain ICMPv6-specific flows to succeed to be able to function at all, which is what the first line takes care of. Other than that, the chain behaves functionally identically to its legacy-Internet IPv4 sibling, but uses a rate-limiting set of a different name. The second, IPv6-specific set is necessary because each set can only store elements of a single data type, that is, of one address family (amounting to 32 vs. 128 bits per address entry).

Attentive readers will notice that there’s something else going on in the last line of the last rule of this chain: Some funny business with ‘&’ (a bitwise AND operator) and a lot of ‘ffff’s. The constructed expression, saddr & ffff:ffff:ffff:ffff::, sets the lower 64 bits (the host bits) of the triggering IPv6 source address to 0. This essentially converts the source address into an IPv6 network prefix (a /64 prefix, the conventional size of a single IPv6 subnet), and ensures that all possible IPv6 addresses from the given prefix are treated the same as the specific source address that triggered the match in the first place. I think this approach is useful, or even necessary, to deal with IPv6’s vast address space efficiently.

Please note that the helper script we’re going to look at next also has to take this little trick into account, lest it would try to remove an entry from the IPv6 rate-limiting set that was never actually put into it.
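To illustrate what “taking this trick into account” means in practice, here is one way to reduce an IPv6 address to its /64 prefix in a shell script. This is my own sketch and not taken from the linked helper script; the function name ipv6_prefix64 is made up, and the approach (expanding a "::" shorthand with awk and keeping only the first four groups) is just one of several that would work.

```shell
# ipv6_prefix64 ADDRESS
# Prints the /64 prefix of ADDRESS in a form like "2001:db8:0:0::",
# mirroring the effect of "ip6 saddr & ffff:ffff:ffff:ffff::".
# Returns non-zero if ADDRESS does not look like an IPv6 address.
ipv6_prefix64() {
  printf '%s\n' "$1" | awk '
    {
      sub(/%.*$/, "")                  # drop a zone id such as %eth0
      n = split($0, part, "::")        # a valid address has at most one "::"
      if (n > 2) exit 1
      nh = (part[1] == "") ? 0 : split(part[1], head, ":")
      nt = (n < 2 || part[2] == "") ? 0 : split(part[2], tail, ":")
      if (n == 2 ? nh + nt > 7 : nh != 8) exit 1
      out = ""
      for (i = 1; i <= 4; i++) {
        if (i <= nh)          g = head[i]
        else if (i <= 8 - nt) g = "0"  # inside the run that "::" stands for
        else                  g = tail[i - (8 - nt)]
        g = tolower(g)
        sub(/^0+/, "", g)              # normalize, e.g. 0db8 becomes db8
        if (g == "") g = "0"
        out = out g ":"
      }
      print out ":"
    }'
}
```

With the set and table names from the example policy, the result could then be used along the lines of nft delete element inet firewall ssh_ratelimit_v6 "{ $(ipv6_prefix64 "$addr") }", where addr is a hypothetical variable holding the peer’s address.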

PAM Helper Script

The PAM-related bits I haven’t touched upon yet, so let’s check out the helper script that we will need to have ready to execute on the host whenever a connection’s login attempt was validated by sshd. Its task is to undo the rate-limiting imposed by the nftables/netfilter machinery implemented and explained above.

The bash script we use to do so will check if it’s executing under PAM, and if it can find a proper-looking source address in the SSH_CONNECTION variable that sshd populates after it has authenticated any of its clients. If these pre-flight checks work out, it will invoke the nft executable with specific arguments to remove the successfully authenticated peer from either the IPv4- or the IPv6-specific rate-limiting set.
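For readers who want a feel for such pre-flight checks without opening the script, the following sketch shows one possible shape. It is not the linked script: classify_peer is a hypothetical name, the reliance on PAM_TYPE (which pam_exec(8) exports, with the value open_session when a session is being opened) is my assumption about how “executing under PAM” could be detected, and the address checks are deliberately coarse.

```shell
# classify_peer PAM_TYPE SSH_CONNECTION
# Prints "v4 ADDRESS" or "v6 ADDRESS" for the client side of an SSH
# connection, or returns non-zero if anything looks off.
# SSH_CONNECTION has the form "client_ip client_port server_ip server_port".
classify_peer() {
  local pam_type="$1" src="${2%% *}"
  # Only act while a session is being opened, not when it is closed.
  [ "$pam_type" = open_session ] || return 1
  case $src in
    "") return 1 ;;
    *:*)
      case $src in *[!0-9A-Fa-f:.]*) return 1 ;; esac
      echo "v6 $src" ;;
    *.*.*.*)
      case $src in *[!0-9.]*) return 1 ;; esac
      echo "v4 $src" ;;
    *) return 1 ;;
  esac
}
```

Inside a script run by pam_exec, this would be called as classify_peer "$PAM_TYPE" "$SSH_CONNECTION", and its output decides which of the two rate-limiting sets the nft delete element call has to target.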

In the interest of (relative) brevity, I will neither reproduce the mentioned pam_clear_nft_ratelimits script inline in this article, nor discuss it line by line — but merely link to it for your downloading pleasure.

What is important to know about it is that if your nftables setup differs from my proposed policy, especially in terms of chain and/or set name(s), you will need to adapt the script before being able to use it.

Apart from that, there are two things you will need to get right:

Store it in the proper place with sensible permissions, as PAM will execute this script as root. I recommend (and will assume) /usr/local/lib/pam_clear_nft_ratelimits, with the file owned by root:staff and its octal DAC permission bits set to 0540.
Make sure it gets invoked at the proper place in your PAM module stack.

Setting up the sshd PAM Stack

The second task is the harder part: Properly massaging sshd’s PAM stack configuration to do what we want. Since PAM is a delicate matter and I cannot foresee all possible configurations out there, I will resort to describing the principle:

Use your preferred editor to locate the last line in /etc/pam.d/sshd that deals with the session phase. Add a new line right after it that reads (assuming the helper script location mentioned above) like so:

session    optional     pam_exec.so /usr/local/lib/pam_clear_nft_ratelimits

Once you save the buffer, the updated /etc/pam.d/sshd takes immediate effect on your sshd’s PAM configuration, so you must validate that it does not have any unforeseen consequences. Logging in once or twice with all the authentication methods that you actually use on the system while having a watchful eye on journalctl -f will go a long way towards avoiding any harm caused by unintended breakage (such as syntax errors) in this vital configuration file.

Observing Impact

Once this final piece of the puzzle is in place, you should see the numbers of repeat offenders knocking at your SSH-secured doors in mindless futility drop by a lot.

You can verify that by taking a look at the authentication log of your system, nowadays most readily found in the ssh-unit specific part of the systemd journal, using journalctl -f -u ssh.service.

If you execute nft list sets, you can get an overview of all offenders your rate-limiting setup has placed on the naughty list, and who won’t be able to bother your ssh server again until their associated timer expires.

Finally, with nft monitor, you can watch the helper script removing records from the rate-limiting sets whenever an SSH login actually succeeds.

Tightening the Screws

If you want to further reduce the amount of logging possibly triggered by any connecting SSH peer, maybe consider decreasing sshd’s MaxAuthTries setting (but keep in mind this could also impact legitimate users, especially in case they have more than one private key/SSH identity ready to try authentication with loaded into their ssh-agent).

In case you need even more quiet than the described setup provides, you may want to consider increasing the timeout 1m definition in the two nftables policy sets. I actually use timeout 60m on my systems, and find that one hour of penalty time very comfy, but I also happen to run a setup in which my own SSH authentication attempts never fail, and the risk of placing myself on the bench for an hour by accident is very close to zero.

Copyright and License, Version History, Misc.

Initially published on 2025-02-15

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.