Recommendations

There are a few recommendations that can make it easier to manage and monitor SQL IP pools.

Additional configuration

Once the initial configuration is working, there are some additional recommended configuration steps to consider.

The basic configuration arrived at in the previous section will be functional, but it is likely to exhibit poor performance under concurrent load. It also implements a simple, dynamic IP allocation policy that you may want to change.

Enable the IP allocation stored procedure

The stored procedure is provided as an efficient means to both find and allocate an IP address with a single SQL call so that no locks are held across rounds trips between FreeRADIUS and the database. Depending on the database software and configuration the stored procedure may increase the IP allocation performance by as much as 100x and ensure that the system doesn’t fail due to excessive locking under high load.

Load the IP allocation stored procedure in the procedure.sql file corresponding to the database dialect.

Methods vary between databases, but a typical example would be:

mysql radius < /etc/raddb/mods-config/sql/ippool/mysql/procedure.sql

The above command assumes that passwordless login has been configured via the user’s ~/.my.cnf file, or otherwise.

Read the comments in the procedure.sql file which explain how to use the stored procedure, then amend the alloc_find, and allocate_update queries (as well as the corresponding start/end transaction query-parts) in the dialect’s queries.conf file, exactly as described.

Example 1. Example queries.conf configuration to use the stored procedure with MySQL

[raddb]/mods-config/sql/ipool/<dialect>/queries.conf

...
allocate_begin = ""

alloc_find = "\
      CALL fr_ippool_allocate_previous_or_new_address( \
              '%{control.${pool_name}}', \
              '${gateway}', \
              '${owner}', \
              ${lease_duration}, \
	      '%{${requested_address} || 0.0.0.0}' \
      )"

allocate_update = ""

allocate_commit = ""

...

Now that the IP allocation stored procedure has been enabled you should re-test your configuration.

Customize the IP allocation policy

The IP allocation policy is mainly configured by modifying the alloc_find query, however it is likely that the other queries will also need to be modified to get the results you want.

By default each user is likely to get a different IP address each time they authenticate, unless they re-authenticate before their existing session has terminated (as is the case with many EAP-based implementations during "fast re-authentication").

Often it is required that upon reconnection a device obtains its previous IP address, sometimes referred to as a "sticky IPs" policy.

If you are using the base queries (rather than the recommended stored procedure) then the queries.conf file for your database dialect contains several example alloc_find queries for choosing either a dynamic or sticky IP policy.

If you are using the recommended stored procedure then the procedure.sql file which contains the definition of the stored procedure for your database dialect has comments that explain how to amend the procedure to choose either a dynamic or sticky IP policy.

With a sticky IP policy it is necessary to amend the default actions of the release_clear and bulk_release_clear queries. By default, each of these queries clobbers the owner column when sessions expire therefore erasing the affinity information.

A performance benefit of sticky IP addresses derived from the fact that reallocation of an existing IP address is generally much quicker than allocating a new IP address since it generally involves an indexed lookup of a single IP address based on the device’s request attributes rather than a walk of the table’s indexes.

Example 2. Example queries for use with a sticky IP policy

[raddb]/mods-config/sql/ipool/<dialect>/queries.conf

...
alloc_begin = ""
alloc_find = "\
        CALL fr_allocate_previous_or_new_framedipaddress( \
                '%{control.${pool_name}}', \
                '${gateway}', \
                '${owner}', \
                ${lease_duration}, \
		'%{${requested_address} || 0.0.0.0}' \
        )"
alloc_update = ""
alloc_commit = ""
...
release_clear = "\
        UPDATE ${ippool_table} \
        SET \
                expiry_time = NOW() \
        WHERE pool_name = '%{control.${pool_name}}' \
	AND owner = '${owner}' \
	AND address = '${requested_address}'"

bulk_release_clear = "\
        UPDATE ${ippool_table} \
        SET \
                expiry_time = NOW() \
        WHERE gateway = '${gateway}'"

...

[raddb]/mods-config/sql/ipool/<dialect>/procedure.sql

...
        -- Reissue an user's previous IP address, provided that the lease
        -- is available (i.e. enable sticky IPs)
        --
        SELECT address INTO r_address
        FROM fr_ippool
        WHERE pool_name = v_pool_name
                AND owner = v_owner
        LIMIT 1
        FOR UPDATE SKIP LOCKED;

        -- If we didn't reallocate a previous address then pick the least
        -- recently used address from the pool which maximises the
        -- likelihood of re-assigning the other addresses to their recent
        -- user
        --
        IF r_address IS NULL THEN
                SELECT address INTO r_address
                FROM fr_ippool
                WHERE pool_name = v_pool_name
                        AND ( expiry_time < NOW() OR expiry_time IS NULL )
                ORDER BY
                        expiry_time
                LIMIT 1
                FOR UPDATE SKIP LOCKED;
        END IF;
...

Now that you have amended your IP allocation policy you should re-test your configuration carefully to validate each of the amended queries and exercise all of the features of your policy.

If your testing passes then FreeRADIUS should now be ready to allocate IP addresses.

Design Considerations

Choose a database server

The performance of the sqlippool module is much more sensitive to the choice of backend database than is typical for other FreeRADIUS database workloads.

The sqlippool module allocates IP addresses by using a relational database to implement the "accept" part of a "worker queue" pattern, where multiple workers attempt to select a single, unique work item from an unordered queue. In our case the queue is the IP pool and each worker is an authentication request attempting to reserve a unique IP address.

The overall performance largely depends on how directly and optimally the database server software supports this pattern.

In particular, the SQL standard introduces two pragmas that facilitate an optimised implementation of a worker queue:

SELECT FOR UPDATE …: Selects rows while exclusively locking them to prevent simultaneous locking reads and writes of those same rows. When connections are simultaneously selecting an IP address from a pool this pragma is what provides the guarantee of uniqueness of the IP selected for each authentication request.
… SKIP LOCKED: This is a modifier for SELECT FOR UPDATE that optimises the select by allowing each reader to disregard any locked rows in order to avoid blocking. This pragma avoids the "thundering herd" problem that can cripple performance during highly concurrent IP allocation: Several authentication requests wait a lock to be release on a single IP address that is in the process of being allocated to a device. When the lock is release all but one of these previously blocked connections block waiting for the next free IP address to be allocated. And so on until they have all been serviced or their query times out… In the absence of this pragma concurrent authentication requests are unnecessarily serialised while their IP addresses are allocated resulting in poor performance.

Here are some comments regarding the main database engines in relation to this issue:

PostgreSQL

Since version 9.5 provides a direct implementation of the SELECT … FOR UPDATE SKIP LOCKED statement which means that it is a very efficient backend for sqlippool.

MySQL::

Ensure that you use a storage engine what supports transactions such as InnoDB. Do not use MyISAM. Since version 8 the InnoDB storage engine provides a direct implementation of the SELECT … FOR UPDATE SKIP LOCKED statement which means that it is a very efficient backend for sqlippool. Previous versions do not implement the SKIP LOCKED pragma and will therefore have poor performance under concurrent load, but see the tip concerning the alternative stored procedure below.

MariaDB

Ensure that you use a storage engine what supports transactions such as InnoDB. Do not use MyISAM or Aria. All versions of MariaDB lack SKIP LOCKED and therefore will have poor performance under concurrent load, but see the tip concerning the alternative stored procedure below. Given the choice between MariaDB and a recent version of MySQL you should use MySQL for SQL IP Pools.

MS SQL Server

Does not provide a SELECT … FOR UPDATE implementation nor a direct means to provide a row-level, exclusive lock that ensures uniqueness of the selected row. However FreeRADIUS is able to use an artificial UPDATE of a selected row along with non-standard "hints" to obtain the required SELECT … FOR UPDATE SKIP LOCKED semantics, but we pay a performance penalty by doing so. The overall performance is moderate but it does not collapse under concurreent load.

Oracle

Provides a direct implementation of the SELECT … FOR UPDATE SKIP LOCKED statement. It should be very efficient but limitations of the query syntax result in the need to use multiple subqueries for which pay a performance penalty. The overall performance is moderate but it does not collapse under concurreent load.

SQLite

Does not provide a SELECT … FOR UPDATE implementation due to its simplified locking characteristics. The only locking that is available for this purpose is an exclusive lock taken over the entire database. This makes it unsuitable for anything other than managing a small number of devices. In addition the locking model used by SQLite means that when a lock exists on the database, not even SELECT queries can be initiated. This means that use of SQLite as the backend database for an sqlippool module should only be done with FreeRADIUS running with a single worker thread.

Alternative stored procedure for MariaDB and MySQL < 8.0

The file procedure_no_skip_locked.sql contains an alternative stored procedure that is intended exclusively for MySQL variants that do not support the SKIP LOCKED pragma, i.e. all MariaDB versions and versions of MySQL prior to 8.0.

It should be a lot faster than using the default stored procedure under highly concurrent workloads and should not result in thread starvation. It can best be described as a useful hack that should not be used except when the better-performing SKIP LOCKED method is unavailable.

This procedure works by using "user locks" to provide skippable, row-level locking that facilitates non-blocking selection of unique IP addresses from the pool. We pay a performance penalty for managing the locks, however it is nowhere near as bad as serialising all concurrent pool access.

Since user locks are local to the database server instance this procedure is not suitable for a multi-master cluster.

Database clusters (e.g. MaxScale, Galera)

It is usually the case that row-level locks issued by SELECT … FOR UPDATE do not propagate around database clusters which means that unique IP address selection cannot always be guaranteed.

Ingress nodes (relays, proxies, etc.) typically balance reads across all cluster nodes (master and slave) while sending writes to one or more masters. For many workloads this guarantees an appropriate level of data coherency, however for an IP pool allocation implementation this is insufficient since we must guarantee that no two nodes may concurrently SELECT the same IP address from the pool of free addresses.

With clusters the behaviour is normally that when multiple connections initially select the same IP address, one of them will succeed and when the remaining transactions attempt to finalise the lease of the already-allocated IP address only then will the conflict be detected.

When this occurs the sane and correct behaviour of most clusters is for deadlocks to be generated for all but the first transaction and for them to be rolled back. However this will result in IP allocation failures, which is likely to be reported as an Access-Reject.

If however the cluster does not abort the transactions but goes on to erroneously commit them, duplicate IP addresses will be allocated! It is therefore critical to understand how your database cluster manages this case.

Incorrectly configured clusters are likely to have generally good performance under high volumes of serialised load but may exhibit a performance cliff-edge once a moderate level of concurrency is reached. The main symptom of this would be the sudden onset of deadlock. Test your cluster under conditions that represent the maximum load on the system, such as all devices simultaneously re-authenticating after some form of failure.

In some cases it may be possible to alleviate these kinds of issue by using a redundant policy for the sql module instances rather than a load-balance policy, or by adding routing hints to the SQL queries to direct all SQL IP Pool related queries to a single cluster node.

Choose a device identifier

Prior to issuing IP addresses to devices it is necessary to determine how to uniquely identify devices to which IP addresses will be assigned on the basis of the available RADIUS attributes.

In many fixed-line configurations a unique identifier for a device may be the NAS-Port or NAS-Port-Id. In other scenarios identifiers based on the NAS port may be shared by several devices and are unsuitable for uniquely identifying devices. In some cases a circuit-specific identifier such as Vendor-Specific.ADSL-Forum.Agent-Circuit-ID may be present.

However, a circuit-specific identifier is unsuitable if there are multiple hosts on the circuit with which you issue individual addresses (for example when using PPPoE passthrough to provide unique sessions to firewalls behind a separate CPE modem.) In this case the "device" must be thought of not as the circuit but the individual hosts behind it.

The Calling-Station-Id attribute may contain host specific information (such as a hardware address). You should consider the provenance of the data and whether you trust it, e.g. does it report a MAC address that is under an untrusted user’s control. Sometimes a combination of attributes including one that you trust (such as "%{Vendor-Specific.ADSL-Forum.Agent-Circuit-ID}:%{Calling-Station-Id}") may be appropriate.

The owner is used to issue indexed lookups into the fr_ippool table so for performance reasons it should have differentiation in the leading characters. You need to understand your network configuration and the available RADIUS attributes to determine what is an acceptable owner. You may need to reconfigure your NAS to provide a suitable unique device identifier.

Most NASs allow the content of the Calling-Station-Id to the customised so that in includes device specific information and network-related information (such as SSID in the case of wireless networks).

When accounting is enabled it is essential that any owner attribute is included in both authentication and accounting requests. Otherwise lease renewal will fail and IP addresses will be prematurely returned to the pool before sessions have disconnected, resulting in duplicate IP allocations.

Choose a lease duration

sqlippools manages the status of the IP addresses (either leased or free) using the expiry_time of each entry in the fr_ippool table. If it is in the future then the IP address is considered to be actively leased (in use by a device), otherwise it is free to be re-allocated to a device.

The lease_duration configuration item is used to set the expiry_time for an IP address lease both when it is allocated at the start of a session and renewed during an ongoing session. The expiry_time is set to lease_duration seconds after the current time. expiry_time and lease_duration are used as a means to manage the IP address pools. They are not communicated to the NAS or device.

In access networks where the IP address pool is plentiful and sessions have a defined maximum length (for example when Session-Timeout RADIUS attribute is being honoured), setting a lease_duration that is a little longer than the maximum permitted session length is acceptable.

Often either the session duration is not restricted or the capacity of the pool is small (relative to the number of devices) so that it needs careful management to continuously free IP addresses that are no longer in active use.

Normally accounting must be enabled on the NAS and an accounting interim interval set for two reasons:

Accounting Interim-Update requests indicate that an IP address is still in use by an ongoing session. This causes the lease to be periodically extended before it expires and prevents an IP address from being prematurely freed as this would result in active sessions having duplicate IP addresses.
Accounting Stop (and Accounting On/Off) requests indicate that an IP address is no longer in use so that it can be immediately released. This ensures that a IP address is not leases longer than necessary and avoids wasted pool space when a group of devices reconnect after their sessions are interrupted.

It is essential that Accounting Interim-Update requests are received reliably. Many NASs will repeat accounting requests that are not acknowledged by a RADIUS server or forward them elsewhere. You should determine whether accounting requests are delivered reliably in your network.

If accounting requests are received reliably then lease_duration could be set a little longer than twice the accounting interim interval to handle the occasional lost interim-update. Otherwise the lease duration might need to be set several times longer.