
Review and discussion welcome!

Overview

First of all, we distinguish between critical and non-critical services. Everything customer-facing is "critical"; everything that only concerns service staff is "non-critical" for now.

Critical Services (failover possible): everything customer-facing, i.e. DHCP, DNS / BIND, TFTP and the database (see the layers below).

Non Critical Services (no failover considerations at this time): all GUI related stuff

  • NMS PRIME GUI
  • Apache
  • Monitoring (Cacti)
  • Icinga / Nagios

Failover Layers

1. NMS Prime GUI

No failover at this time.

2. Apache

No failover at this time.

3. Database

A standard MySQL / MariaDB failover cluster with N nodes. Possible solutions:

  1. MaxScale
  2. MariaDB Galera Cluster (not tested; a sketch follows below)
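
As an untested sketch, a minimal Galera section in /etc/my.cnf.d/server.cnf could look like this (node addresses and the cluster name are assumptions, not a tested setup):

[galera]
# enable Galera synchronous multi-master replication
wsrep_on=ON
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
# all cluster nodes (addresses are placeholders)
wsrep_cluster_address="gcomm://10.0.0.10,10.0.0.11,10.0.0.12"
wsrep_cluster_name="nmsprime-db"
# Galera requires row-based binlog and InnoDB
binlog_format=row
default_storage_engine=InnoDB
innodb_autoinc_lock_mode=2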

4. NMS Prime Lower Layers

We distinguish between the master and N slave NMS PRIME instances. The master instance also runs the NMS PRIME GUI. Any change in the GUI triggers real-time changes in the master's configs, such as DHCP and TFTP. This is done via Laravel Observers or Jobs (e.g. the Modem Observer); a rough sketch follows below.
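
For illustration only, the trigger mechanism could look roughly like this; the real observer in the NMS PRIME codebase may differ, and the helper method names here are assumptions:

// hypothetical sketch of an observer regenerating configs on GUI changes
class ModemObserver
{
    // called by Laravel whenever a Modem model is updated via the GUI
    public function updated(Modem $modem)
    {
        // regenerate the DHCP entry and TFTP configfile in real time
        // (make_dhcp_cm() / make_configfile() are assumed helper names)
        $modem->make_dhcp_cm();
        $modem->make_configfile();
    }
}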

The slaves run on separate machines without a GUI. They rebuild the DHCP, BIND, and TFTP config files on a regular basis (e.g. every hour), e.g. via a cronjob. The slaves are independent of the master and are only connected to the MariaDB SQL cluster via a read-only SQL connection (see the sketch below). Any change on the master is therefore written directly to the SQL cluster and later fetched automatically by the slaves.
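
A minimal sketch of such a read-only account on the SQL cluster (user name, host range, and database name are assumptions):

-- slaves only need SELECT on the NMS PRIME database
CREATE USER 'nmsprime_ro'@'10.0.0.%' IDENTIFIED BY 'changeme';
GRANT SELECT ON nmsprime.* TO 'nmsprime_ro'@'10.0.0.%';
FLUSH PRIVILEGES;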

This concept offers:

  1. a master with real-time changes to all critical configs
  2. redundant slaves that are independent of the master
  3. a redundant database with load-sharing capability
  4. load sharing for DHCP, DNS, and TFTP across all modems

5. Critical Services

ISC-DHCP

Standard ISC DHCP failover with a master-slave (primary/secondary) concept; a configuration sketch follows below.

Slaves rebuild their DHCP configs by themselves after a defined time (see above).
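
A minimal dhcpd.conf failover sketch for the primary node; peer addresses, subnet, and timing values are examples, not tested here:

failover peer "nmsprime" {
    primary;                      # the secondary uses "secondary;" and omits mclt/split
    address 10.0.0.1;             # this node
    port 647;
    peer address 10.0.0.2;        # failover partner
    peer port 647;
    max-response-delay 60;
    max-unacked-updates 10;
    mclt 3600;
    split 128;                    # balance leases 50/50 between the peers
    load balance max seconds 3;
}

# every failover-managed pool must reference the peer
subnet 10.1.0.0 netmask 255.255.255.0 {
    pool {
        failover peer "nmsprime";
        range 10.1.0.10 10.1.0.250;
    }
}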

DNS / BIND

Slaves rebuild their configs by themselves after a defined time (see above).

More research is required, but a good starting point could be here:
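
In the meantime, one possible direction: since every slave rebuilds identical zone files from the database, each slave could simply act as an authoritative server and be listed as an additional NS record. A minimal named.conf zone block (zone name and file path are assumptions):

zone "customer.example.net" IN {
    type master;
    file "/var/named/customer.example.net.zone";
    allow-transfer { none; };   // each slave rebuilds its own copy, no zone transfer needed
};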

TFTP

A cronjob on the slave rebuilds all config files on a recurring basis (e.g. every hour). In NMS PRIME this could simply be done by running the commands below.


Possible cronjob(s) for slaves

e.g. a possible cronjob:
php artisan nms:dhcp && systemctl restart dhcpd

e.g. a possible cronjob:
php artisan nms:configfile
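
Put together as real crontab entries, e.g. in /etc/cron.d, this could look as follows (the file name and the installation path /var/www/nmsprime are assumptions):

# /etc/cron.d/nmsprime-slave
# rebuild the DHCP config hourly and restart dhcpd
0 * * * * root cd /var/www/nmsprime && php artisan nms:dhcp && systemctl restart dhcpd
# rebuild all device configfiles hourly, offset by 30 minutes
30 * * * * root cd /var/www/nmsprime && php artisan nms:configfile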


Github TODO: #687

Implementing this in the Laravel scheduling framework (for slaves only!) would be an improvement, especially if building all config files could take longer than the rebuild loop, since overlapping runs can easily be avoided using ->withoutOverlapping():

See: https://github.com/nmsprime/nmsprime/blob/dev/app/Console/Kernel.php#L35
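
A sketch of such a schedule entry in app/Console/Kernel.php, assuming a (hypothetical) env flag that marks slave instances:

// in app/Console/Kernel.php
// use Illuminate\Console\Scheduling\Schedule;
protected function schedule(Schedule $schedule)
{
    // only slave instances rebuild their configs on a timer
    // (IS_SLAVE is an assumed env key, see the env wish below)
    if (env('IS_SLAVE', false)) {
        $schedule->command('nms:dhcp')->hourly()->withoutOverlapping();
        $schedule->command('nms:configfile')->hourly()->withoutOverlapping();
    }
}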

I would love to see an /etc/nmsprime/env statement for a possible slave configuration, like:

SLAVE_CONFIG_REBUILD_INTERVALL=3600 # time in seconds



Workflow



Considerations on failover from 22 May 2019

Participants:

  • Ole Ernst
  • Torsten Schmidt
  • (Christian Schramm)

