High Availability Installation
- 1.1. Description
- 1.1.1. DHCP
- 1.1.2. TFTP
- 1.1.2.1. Theory
- 1.1.2.2. Practice
- 1.1.3. Time
- 1.1.4. DNS
- 1.2. Preparation
- 1.2.1. Master
- 1.2.2. Slave
- 1.2.3. Database
- 1.2.4. CMTSs
- 1.3. Installation
1.1. Description
The ProvHA module provides functionality for failover setups. This includes the following services:
- DHCP
- TFTP
- time
- DNS
Typically the load is balanced – master and slave share the provisioning load. If one instance goes down, all work is done by the other machine, keeping your network alive. Your customers don't notice the incident.
The configuration files needed for provisioning are updated as follows:
- master: same as in a standalone installation (every time a relevant database value changes)
- slave:
  - rebuilt every n seconds (configurable via the master GUI: Global config ⇒ ProvHA)
  - rebuilt on changes to several database tables (cmts, ippool)
1.1.1. DHCP
We use the failover functionality of ISC DHCP, which supports a setup with one master (primary) and one slave (secondary) instance, called peers. By default each server handles 50% of each IP pool; the load balance is configurable. The servers inform each other about leases – if one instance goes down, the failover peer takes over the complete pools. When both servers are active again, the pools are rebalanced automatically.
Configuration is done in /etc/dhcp-nmsprime/failover.conf; the pools in /etc/dhcp-nmsprime/cmts_gws/*.conf are configured with a failover statement.
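For orientation, a minimal ISC DHCP failover declaration could look like the following sketch. The peer name, IP addresses and timing values are placeholders, not the defaults shipped by ProvHA:

```
# Illustrative sketch of /etc/dhcp-nmsprime/failover.conf (placeholder values)
failover peer "nmsprime-failover" {
    primary;                        # "secondary;" on the slave
    address 10.0.0.1;               # this server's IP
    peer address 10.0.0.2;          # the failover peer's IP
    port 647;
    peer port 647;
    max-response-delay 30;
    max-unacked-updates 10;
    load balance max seconds 3;
    mclt 1800;                      # primary only
    split 128;                      # primary only: 128/256 = 50/50 load split
}

# each pool in /etc/dhcp-nmsprime/cmts_gws/*.conf then references the peer:
pool {
    failover peer "nmsprime-failover";
    range 10.100.0.10 10.100.0.250;
}
```

The `split 128` statement is what yields the 50% load share per server mentioned above; other values (0–256) shift the balance between the peers.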
1.1.2. TFTP
1.1.2.1. Theory
For TFTP we have to distinguish between DOCSIS versions:
- For DOCSIS versions below 3, only one TFTP server can be provided, realized via the `next-server` statement in global.conf. In our setup each of the two DHCP servers sets this to its own IP address.
- For higher versions the value in `next-server` can be overridden using `option vivso` (125.2: CL_V4OPTION_TFTPSERVERS) – there can be multiple TFTP servers configured. In our setup each DHCP server provides its own IP address first and the peer's IP address second.
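A sketch of the two mechanisms in dhcpd.conf syntax (the IP addresses are placeholders; the option-space declaration needed for 125.2 is omitted here, as the exact syntax used in the shipped global.conf may differ):

```
# DOCSIS < 3.0: exactly one TFTP server, announced via next-server;
# each of the two DHCP servers inserts its own IP here.
next-server 10.0.0.1;

# DOCSIS >= 3.0: option 125 (vivso), sub-option 2 (CL_V4OPTION_TFTPSERVERS)
# carries a list of TFTP servers – own IP first, peer's IP second.
# The required CableLabs option-space declaration is not shown here;
# refer to the generated global.conf for the exact statements.
```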
1.1.2.2. Practice
It looks like – at least the Cisco 7225 – is not able to make use of this information. Attached are two tcpdumps of the same DHCP ACK; one sent from the slave to the CMTS, the other sent from the CMTS to the modem: dhcp_ack_slave_to_cmts.txt dhcp_ack_slave_to_cmts.txt
And here a screenshot of the diff:
- The CMTS replaces the IPs of "next server" and "tftp server" with its own.
- No matter in which order the IPs appear in option 125.2, every single TFTP request of a modem is directed to the "Next server IP address".
That means:
- If on an HA NMS the DHCP server is working, all ACKs of this server contain its own IP address.
- If the TFTP server on this machine is dead, or a config file is missing, the CM will be stuck in init(o) and reboot endlessly.
1.1.3. Time
option time-servers accepts a comma-separated list of IP addresses. Each DHCP server provides its own IP address first and the peer's IP address second.
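In dhcpd.conf syntax this is a single statement; on the master (placeholder IPs) it could read:

```
# master announces itself first, then the slave (placeholder IPs)
option time-servers 10.0.0.1, 10.0.0.2;
```

On the slave the order of the two addresses would be reversed.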
1.1.4. DNS
option domain-name-servers accepts a comma-separated list of IP addresses. Each DHCP server provides its own IP address first and the peer's IP address second.
What about the zone sections in global.conf? Should the peer IP be given as secondary?
Open question: does the DNS server configuration need to be changed, too?
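Analogous to the time servers, on the master this could read (placeholder IPs):

```
# master announces itself first, then the slave (placeholder IPs)
option domain-name-servers 10.0.0.1, 10.0.0.2;
```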
1.2. Preparation
You will need two NMSPrime installations to provide failover functionality.
1.2.1. Master
One is defined as MASTER and is comparable to an installation without failover (a "classical" NMSPrime). This instance is the only one with write access to the databases – all tasks changing the database are done here:
- GUI actions (like adding/changing/deleting elements such as contracts, modems, CMTS, etc.)
- API
- Cacti polling
- communication with external APIs like "envia TEL" (module ProvVoipEnvia)
1.2.2. Slave
The other installation (currently only one slave is supported due to restrictions of ISC DHCPd) has read-only access to the database:
- the database cannot be changed at the slave
- provisioning of CM/MTA/CPE is done completely by the slave if the master fails
1.2.3. Database
The database is set up as a Galera cluster; this way all data is duplicated and stored on both the master and the slave machine.
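For orientation, the wsrep-related part of a two-node Galera setup in my.cnf typically looks like the following sketch. Paths, cluster name and IPs are placeholders; the ProvHA installation scripts generate the real values:

```
[mysqld]
# Galera replication (placeholder values)
wsrep_on                 = ON
wsrep_provider           = /usr/lib64/galera/libgalera_smm.so
wsrep_cluster_name       = "nmsprime"
wsrep_cluster_address    = "gcomm://10.0.0.1,10.0.0.2"
wsrep_node_address       = 10.0.0.1

# settings Galera requires
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2
```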
1.2.4. CMTSs
CMTS configs need to be extended. For each of the two instances we need:
- a cable helper-address statement (interface Bundle x)
- an ntp server statement
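On a Cisco-style CMTS these additions could look like the following IOS sketch (10.0.0.1 and 10.0.0.2 are placeholders for the master and slave instances):

```
! Illustrative Cisco IOS sketch – placeholder IPs for master and slave
interface Bundle1
 cable helper-address 10.0.0.1
 cable helper-address 10.0.0.2
!
ntp server 10.0.0.1
ntp server 10.0.0.2
```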
1.3. Installation
Most of the work is done by our installation scripts deployed with the ProvHA module. Details are given for your understanding of the whole process and for later configuration changes.