- Network maintenance is an inherent component of a network administrator’s responsibilities.
- Interrupt-driven maintenance tasks can be reduced by proactively performing regularly scheduled maintenance tasks.
Interrupt-driven tasks are help desk tasks (fixing problems as users report them), while structured tasks are performed according to a predefined plan. Advantages of a structured network maintenance model over an interrupt-driven one include:
+ Proactive vs. reactive
+ Reduced network downtime
+ More cost effective
+ Better alignment with business objectives
+ Improved network security
Network maintenance, at its essence, is doing whatever is required to keep the network functioning and meeting the business needs of an organization.
Tasks that fall under the umbrella of network maintenance are as follows:
■ Hardware and software installation and configuration
■ Troubleshooting problem reports
■ Monitoring and tuning network performance
■ Planning for network expansion
■ Documenting the network and any changes made to the network
■ Ensuring compliance with legal regulations and corporate policies
■ Securing the network against internal and external threats
Some of the More Well-Known Network Maintenance Models
■ FCAPS: FCAPS (which stands for Fault management, Configuration management, Accounting management, Performance management, and Security management) is a network maintenance model defined by the International Organization for Standardization (ISO).
■ ITIL: The IT Infrastructure Library (ITIL) defines a collection of best-practice recommendations that work together to meet business goals.
■ TMN: The Telecommunications Management Network (TMN) network management model is the ITU Telecommunication Standardization Sector’s (ITU-T) variation of the FCAPS model. Specifically, TMN targets the management of telecommunications networks.
■ Cisco Lifecycle Services: The Cisco Lifecycle Services maintenance model defines distinct phases in the life of a Cisco technology in a network. These phases are Prepare, Plan, Design, Implement, Operate, and Optimize. As a result, the Cisco Lifecycle Services model is often referred to as the PPDIOO model.
FCAPS Management Tasks
Fault management - Use network management software to collect information from routers and switches. Send an e-mail alert when processor utilization or bandwidth utilization exceeds a threshold of 80 percent. Respond to incoming trouble tickets from the help desk.
Configuration management - Require logging of any changes made to network hardware or software configurations. Implement a change management system to alert relevant personnel of planned network changes.
Accounting management - Invoice IP telephony users for their long distance and international calls.
Performance management - Monitor network performance metrics for both LAN and WAN links. Deploy appropriate quality of service (QoS) solutions to make the most efficient use of relatively limited WAN bandwidth, while prioritizing mission-critical traffic.
Security management - Deploy firewall, virtual private network (VPN), and intrusion prevention system (IPS) technologies to defend against malicious traffic. Create a security policy dictating rules of acceptable network use. Use an AAA server to validate user credentials, assign appropriate user privileges, and log user activity.
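The 80 percent threshold in the fault management example can be sketched as a simple check. This is a minimal Python sketch under stated assumptions: the function name, the device names, and the sample readings are hypothetical, and a real deployment would take its readings from network management software and send an e-mail instead of printing.

```python
THRESHOLD_PERCENT = 80.0  # threshold from the fault management example


def find_alerts(readings, threshold=THRESHOLD_PERCENT):
    """Return one alert string per reading that exceeds the threshold.

    `readings` maps a (device, metric) pair to a utilization percentage.
    """
    alerts = []
    for (device, metric), percent in sorted(readings.items()):
        if percent > threshold:
            alerts.append(
                f"{device}: {metric} utilization at {percent:.0f}% "
                f"exceeds {threshold:.0f}%"
            )
    return alerts


# Hypothetical sample data, for illustration only.
sample = {
    ("R1", "cpu"): 91.0,
    ("R1", "bandwidth"): 42.0,
    ("SW1", "cpu"): 80.0,  # exactly at the threshold, so no alert
}
for line in find_alerts(sample):
    print(line)
```

Only readings strictly above the threshold raise an alert here; whether a reading equal to the threshold should also alert is a policy choice.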
Routine Maintenance Tasks
- Configuration changes
- Replacement of older or failed hardware
- Scheduled backups
- Updating software
- Monitoring network performance
Maintaining Network Documentation
Network documentation typically gets created as part of a network’s initial design and installation.
Keeping that documentation current, reflecting all changes made since the network’s installation, should be part of any network maintenance model.
Network documentation could consist of:
- Logical topology diagram: shows the interconnection of network segments, the protocols used, and how end users interface with the network.
- Physical topology diagram: shows how different geographical areas (for example, floors within a building, buildings, or entire sites) interconnect.
- Listing of interconnections: could be a spreadsheet that lists which ports on which devices are used to interconnect network components, or connect out to service provider networks.
- Inventory of network equipment: includes such information as the equipment’s manufacturer, model number, version of software, information about the licensing of the software, serial number, and an organization’s asset tag number.
- IP address assignments.
- Configuration information: when a configuration change is made, the current configuration should be backed up.
- Original design documents: documents created during the initial design of a network might provide insight into why certain design decisions were made.
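The inventory fields listed above map naturally onto a small record type. The following Python sketch is illustrative only: the class name, field names, and every value in the sample entry are placeholders, not a prescribed schema.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class InventoryRecord:
    """One row of a network equipment inventory (hypothetical layout)."""
    manufacturer: str
    model_number: str
    software_version: str
    license_info: str
    serial_number: str
    asset_tag: str


# Hypothetical entry; every value is a placeholder.
record = InventoryRecord(
    manufacturer="Cisco",
    model_number="EXAMPLE-MODEL",
    software_version="EXAMPLE-VERSION",
    license_info="EXAMPLE-LICENSE",
    serial_number="EXAMPLE-SERIAL",
    asset_tag="EXAMPLE-TAG",
)
print(asdict(record))  # a plain dict, convenient for CSV or wiki export
```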
Planning for outages and provisioning hardware and software before they occur can accelerate recovery time. The following should be on hand:
- Duplicate hardware
- Operating system and application software (along with any applicable licensing) for the device
- Backup of device configuration information.
The Network Maintenance Toolkit
Network maintenance tools often range in expense from free to tens of thousands of dollars.
Cisco Tools & Resources http://www.cisco.com/c/en/us/support/web/tools-catalog.html
Cisco IOS offers a wealth of CLI commands, which can be invaluable when troubleshooting a network issue.
A newer Cisco IOS feature, which allows a router to monitor events and automatically respond to a specific event (such as a defined threshold being reached) with a predefined action, is called Cisco IOS Embedded Event Manager (EEM). EEM policies can be created using Tool Command Language (Tcl).
Although Cisco has some GUI tools, such as CiscoWorks, that can manage large enterprise networks, several device-based GUI tools are freely available.
Examples of these free tools from Cisco are the following:
■ Cisco Configuration Professional (CCP)
■ Cisco Configuration Assistant (CCA)
■ Cisco Network Assistant (CNA)
■ Cisco Security Device Manager (SDM)
External servers are often used to store archival backups of a device’s operating system (for example, a Cisco IOS image) and configuration information.
Depending on your network device, you might be able to back up your operating system and configuration information to a TFTP, FTP, HTTP, or SCP server.
R1# copy startup-config ftp://kevin:email@example.com
Address or name of remote host [192.168.1.74]?
Destination filename [r1-confg]?
Writing r1-confg !
1446 bytes copied in 3.349 secs (432 bytes/sec)
FTP username and password credentials can be added to the router’s configuration, so that they do not have to be explicitly specified in the copy command:
R1(config)# ip ftp username kevin
R1(config)# ip ftp password cisco
R1# copy startup-config ftp://192.168.1.2
The process of backing up a router’s configuration can be automated using an archiving feature, which is part of the Cisco IOS Configuration Replace and Configuration Rollback feature:
- You can configure a Cisco IOS router to periodically back up a copy of the startup config to a specified location (for example, the router’s flash or an FTP server).
- The router can also be configured to create an archive every time you copy the running configuration to the startup configuration.
Example: Back up the startup configuration every day (that is, every 1440 minutes) to an FTP server (with an IP address of 192.168.1.74, where the login credentials have already been configured in the router’s configuration):
R1(config)# ip ftp username kevin
R1(config)# ip ftp password cisco
R1(config)# archive
R1(config-archive)# path ftp://192.168.1.74/R1-config
R1(config-archive)# time-period 1440
View archived files:
R1# show archive
The next archive file will be named ftp://192.168.1.74/R1-config-3
Archive # Name
2 ftp://192.168.1.74/R1-config-2 <- Most Recent
You can restore a previously archived configuration using the configure replace command. This command does not merge the archived configuration with the running configuration, but rather completely replaces the running configuration with the archived configuration.
Router# configure replace ftp://192.168.1.74/R1-config-3
HTTP
- Configure a default username before a file is copied to or from a remote web server using the copy http:// or copy https:// command.
- The default username will be overridden by a username specified in the URL of the copy command.
Router(config)# ip http client username User1
Router(config)# ip http client password Secret
http://www.cisco.com/c/en/us/td/docs/ios/netmgmt/configuration/guide/12_2sx/nm_12_2sx_book/nm_http_web.html
The copy command copies a file from any supported remote location to a local file system, from a local file system to a remote location, or from one local file system to another.
Device logs often offer valuable information when troubleshooting a network issue.
If you are connected to a router via Telnet and want to see console messages, you can enter the command terminal monitor.
To log messages to a router’s buffer (that is, in the router’s RAM), you can issue the logging buffered command. You can specify how much of the router’s RAM can be dedicated to logging. After the buffer fills to capacity, older entries will be deleted to make room for newer entries.
This buffer can be viewed by issuing the show logging command.
You might only want to log messages that have a certain level of severity:
logging console <severity_level>
logging buffered <severity_level>
Logging config
logging buffered 4096 warnings <---router can use a maximum of 4096 bytes of RAM for the buffered logging
logging console warnings <---messages with a severity level of warnings (that is, 4) or lower (that is, 0–3) are logged to the console
logging 192.168.1.50 <---router is configured to log messages to a syslog server with an IP address 192.168.1.50
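The severity filtering used by the logging commands can be illustrated with a short Python sketch. The level names and numbers follow the standard syslog severity scale used by Cisco IOS; the function name is hypothetical. A message is logged when its numeric level is less than or equal to the configured level, because lower numbers are more severe.

```python
# Syslog severity levels as named in Cisco IOS (0 is most severe).
SEVERITY_LEVELS = {
    "emergencies": 0,
    "alerts": 1,
    "critical": 2,
    "errors": 3,
    "warnings": 4,
    "notifications": 5,
    "informational": 6,
    "debugging": 7,
}


def is_logged(message_level, configured_level):
    """Return True if a message at message_level passes the configured filter."""
    return SEVERITY_LEVELS[message_level] <= SEVERITY_LEVELS[configured_level]


# With a configured level of warnings (4), levels 0 through 4 are logged.
for name in SEVERITY_LEVELS:
    print(name, is_logged(name, "warnings"))
```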
Because the NTP server might be referenced by devices in different time zones, each device has its own time zone configuration, which indicates how many hours its time zone differs from Greenwich Mean Time (GMT).
R1(config)# clock timezone EST -5
R1(config)# clock summer-time EDT recurring 2 Sun Mar 2:00 1 Sun Nov 2:00
R1(config)# ntp server 192.168.1.150
R1(config)# ntp server 192.168.1.250 prefer <--- use this IP as its NTP-server before falling back to 192.168.1.150
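The effect of a time zone offset such as EST -5 can be checked with a few lines of Python. This is only a sketch of the arithmetic: the sample timestamp is arbitrary, and a fixed offset is used here, so the daylight saving rule from the clock summer-time command is not modeled.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset matching "clock timezone EST -5" (five hours behind GMT/UTC).
EST = timezone(timedelta(hours=-5), "EST")

# Arbitrary sample time as it would be received from an NTP server (UTC).
utc_time = datetime(2024, 1, 15, 17, 0, tzinfo=timezone.utc)

# The device applies its own offset to display local time.
local_time = utc_time.astimezone(EST)
print(local_time.strftime("%Y-%m-%d %H:%M %Z"))
```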
Network Documentation Tools
A couple of documentation management system examples are as follows:
■ Trouble ticket reporting system - recording, tracking, and archiving trouble reports (that is, trouble tickets).
■ Wiki - web-based collaborative documentation platform
Monitoring and Measuring Tools
Keeping an eye on network traffic patterns and performance metrics can help you anticipate problems before they occur.
SNMP allows a monitored device (for example, a router or a switch) to run an SNMP agent, which an SNMP manager (a network management system) can query for statistics about the device.
Reasons to monitor network performance include the following:
- Assuring compliance with an SLA
- Trend monitoring
- Troubleshooting performance issues - you have a reference point (that is, a baseline) against which you can compare performance metrics collected after a user reports a performance issue.
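Comparing a current measurement against a baseline can be sketched as follows. This is one possible approach, not a prescribed method: the function name, the sample utilization values, and the choice of two standard deviations as the tolerance are all assumptions made for illustration.

```python
from statistics import mean, stdev


def deviates_from_baseline(baseline_samples, current_value, tolerance=2.0):
    """Return True if current_value lies more than `tolerance` standard
    deviations away from the mean of the baseline samples.

    Requires at least two baseline samples.
    """
    center = mean(baseline_samples)
    spread = stdev(baseline_samples)
    return abs(current_value - center) > tolerance * spread


# Hypothetical link utilization samples (percent) collected during normal operation.
baseline = [30.0, 32.0, 28.0, 31.0, 29.0]

print(deviates_from_baseline(baseline, 31.0))  # within the normal range
print(deviates_from_baseline(baseline, 75.0))  # well outside the normal range
```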
Issues could arise as a result of human error (for example, a misconfiguration), equipment failure, a software bug, or traffic patterns (for example, high utilization or a network being under attack by malicious traffic).
Troubleshooting skills vary from administrator to administrator.
Simplified Model of the Troubleshooting Process
Problem report: After an issue is reported, the first step toward resolution is clearly defining the issue.
Problem diagnosis: When you have a clearly defined troubleshooting target, you can begin gathering information related to that issue.
Problem resolution: After you identify a suspected underlying cause, you next define approaches to resolving the issue and select what you consider to be the best approach.
Structured Troubleshooting Approach
(1) Problem Report - clear problem report
(2) Collect Information - show or debug commands, or performing packet captures
(3) Examine Information
(4) Eliminate Potential Causes
(5) Hypothesize Underlying Cause
(6) Verify Hypothesis
Some experienced troubleshooters, however, might have seen similar issues before and might be extremely familiar with the subtleties of the network they are working on. They can use the shoot-from-the-hip method, immediately hypothesizing a cause after they collect information about the problem.
(1) Problem Report -> (2) Collect Information -> (3) Hypothesize Underlying Cause -> (4) Verify Hypothesis
Popular Troubleshooting Methods:
1) The Top-Down Method - Begins at the top layer of the OSI seven-layer model - first checks the application residing at the application layer and moves down from there
2) The Bottom-Up Method - Seeks to narrow the field of potential causes by eliminating OSI layers beginning at Layer 1 (physical). Might not be efficient in larger networks because of the time required to fully test lower layers of the OSI model.
3) The Divide and Conquer Method - Begins in the middle of the OSI stack (for example, at Layer 3 with ping 10.1.2.3) and then moves up or down the stack depending on the result
4) Following the Traffic Path - follow the path of the traffic experiencing a problem
5) Comparing Configurations - often an appropriate approach for a less experienced troubleshooter not well versed in the specifics of the network
6) Component Swapping - physically swap out components. If a problem’s symptoms disappear after swapping out a particular component (for example, a cable or a switch), you can conclude that the old component was faulty (either in its hardware or its configuration).
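The divide and conquer method can be read as a binary search over the OSI layers. The Python sketch below is one interpretation under a simplifying assumption: if a layer works, every layer beneath it works too. The function name and the layer_works callback (a stand-in for a real test such as a ping at Layer 3) are hypothetical.

```python
def lowest_failing_layer(layer_works, start_layer=3, top_layer=7):
    """Return the lowest OSI layer that fails, or None if all layers work.

    layer_works(n) is a caller-supplied test for layer n. The search starts
    in the middle of the stack and halves the remaining range each time.
    """
    low, high = 1, top_layer  # the lowest failing layer, if any, is in [low, high]
    probe = start_layer
    failing = None
    while low <= high:
        if layer_works(probe):
            low = probe + 1   # this layer and those below are fine; look higher
        else:
            failing = probe   # candidate; a lower layer may still be at fault
            high = probe - 1
        probe = (low + high) // 2
    return failing


# Hypothetical fault at Layer 2: Layer 1 works, Layer 2 and above do not.
print(lowest_failing_layer(lambda layer: layer < 2))
```

Starting at Layer 3 mirrors the ping example: a successful ping rules out Layers 1 through 3, while a failed ping points the search downward.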
Including Troubleshooting in Routine Network Maintenance
Maintaining Current Network Documentation
Following are a few suggestions to help troubleshooters keep in mind the need to document their steps:
■ Require documentation
■ Schedule documentation checks
■ Automate documentation
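One way to automate part of the documentation is to record the difference between a device's configuration before and after a change. The following Python sketch uses the standard difflib module; the function name and the two configuration snippets are made up for illustration, and retrieving the configurations from a device is outside its scope.

```python
import difflib


def config_diff(old_config, new_config, old_name="before", new_name="after"):
    """Return a unified diff of two configurations as a list of lines."""
    return list(
        difflib.unified_diff(
            old_config.splitlines(),
            new_config.splitlines(),
            fromfile=old_name,
            tofile=new_name,
            lineterm="",
        )
    )


# Hypothetical configuration snippets.
before = "hostname R1\nlogging console warnings\n"
after = "hostname R1\nlogging console warnings\nlogging 192.168.1.50\n"

for line in config_diff(before, after):
    print(line)
```

Lines starting with + or - show what was added or removed, which can be attached to a trouble ticket or wiki page as a record of the change.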
Establishing a Baseline
Troubleshooting involves knowing what should be happening on the network, observing what is currently happening on the network, and determining the difference between the two.
The process of change management includes using policies that dictate rules regarding how and when a change can be made and how that change is documented.