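Whether hardware or software, defects can still slip through even when strict quality control is applied throughout design, testing, manufacturing, and inspection. The hardware and software bugs described below come up often in practice; they are listed here for reference so that you can recognize and deal with them quickly when you run into them.
1. A hardware bug in the I/O controller module of the Cisco 7200 router can cause the following errors at system startup: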
Warning: monitor nvram area is corrupt ... using default values
environment checksum in NVRAM failed
C7200 platform with 262144 Kbytes of main memory
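The router then drops into ROMmon mode. The probability of hitting this problem is reportedly about 10^-19; the fix is to return the I/O controller module for repair.
2. The following messages on the Cisco 7200, 7500, and GSR platforms: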
*Nov 30 00:00:40.771:
WARNING: Enviro Monitor Reference Voltage was ZERO !
*Nov 30 00:00:41.771:
WARNING: Enviro Monitor Reference Voltage was ZERO !
*Nov 30 00:00:42.771:
WARNING: Enviro Monitor Reference Voltage was ZERO !
*Nov 30 00:00:43.771:
WARNING: Enviro Monitor Reference Voltage was ZERO !
*Nov 30 00:00:44.771: %ENVM-0-SHUT: Environmental Monitor initiated shutdown
Buffered messages:
System Bootstrap, Version 12.2(4r)B2, RELEASE SOFTWARE (fc2)
TAC Support: http://www.cisco.com/tac
Copyright (c) 2002 by cisco Systems, Inc.
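These messages look like a power supply fault, but this is also a hardware bug. The router boots normally; once startup completes, the warning repeats and the router then powers itself off (even with both power supplies switched on), after which it can only be powered back on by hand. The show env output is: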
Router#sh env
All measured values are normal
Router#sh env last
 I/O Cont Inlet    previously measured at 24C/75F
 I/O Cont Outlet   previously measured at 24C/75F
 NPE Inlet         previously measured at 25C/77F
 NPE Outlet        previously measured at 25C/77F
 +3.45 V           is unmeasured
 +5.15 V           is unmeasured
 +12.15 V          is unmeasured
 -11.95 V          is unmeasured
last shutdown reason - critical voltage
Router#sh env all
Power Supplies:
Power Supply 1 is unmeasured.
Power Supply 2 is unmeasured.
Temperature readings:
 I/O Cont Inlet    measured at 24C/75F
 I/O Cont Outlet   measured at 25C/77F
 NPE Inlet         measured at 26C/78F
 NPE Outlet        measured at 26C/78F
Voltage readings:
 +3.45 V           is unmeasured
 +5.15 V           is unmeasured
 +12.15 V          is unmeasured
 -11.95 V          is unmeasured
Envm stats saved 0 time(s) since reload
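Fix: return the I/O controller for repair.
3. On a Cisco Catalyst 6500 with a SUP720 supervisor engine running IOS 12.2.14, with NAT configured, the following messages appear once the number of NAT translation entries exceeds roughly 6000: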
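The symptoms in item 2 (a critical-voltage shutdown combined with readings reported as unmeasured) can be checked mechanically against pasted console output. What follows is only a minimal Python sketch: the function names `summarize_env` and `looks_like_envm_fault` are hypothetical, and the rule that these symptoms together point at the I/O controller is a heuristic taken from this one case, not a Cisco-documented diagnostic.

```python
import re


def summarize_env(text):
    """Summarize pasted 'show env last' / 'show env all' console output."""
    reason = re.search(r"last shutdown reason - (.+)", text)
    return {
        # Text after "last shutdown reason - ", or None if the line is absent.
        "shutdown_reason": reason.group(1).strip() if reason else None,
        # Counts every "is unmeasured" phrase (voltages and power supplies).
        "unmeasured_readings": len(re.findall(r"is unmeasured", text)),
        # Counts the repeated Enviro Monitor warnings, if the log includes them.
        "zero_reference_warnings": len(
            re.findall(r"Reference Voltage was ZERO", text)
        ),
    }


def looks_like_envm_fault(summary):
    # Heuristic (assumption): a critical-voltage shutdown while readings are
    # unmeasured matches the case described above.
    return (
        summary["shutdown_reason"] == "critical voltage"
        and summary["unmeasured_readings"] > 0
    )


SAMPLE = """\
Router#sh env last
+3.45 V
is unmeasured
+5.15 V
is unmeasured
last shutdown reason - critical voltage
"""

if __name__ == "__main__":
    summary = summarize_env(SAMPLE)
    print(summary["shutdown_reason"], summary["unmeasured_readings"])
    print(looks_like_envm_fault(summary))
```

A match here is only a hint to compare the unit against the case above; it does not rule out a genuine power problem.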
*Mar  2 01:19:56.738: %SYS-3-CPUHOG: Task ran for 2048 msec (39/2), process = IP NAT Ager, PC = 4021C300.
-Traceback= 4021C308 40EFB5CC 40EFBA64 40EFBF44
*Mar  2 01:20:06.974: %SYS-3-CPUHOG: Task ran for 2172 msec (44/2), process = IP NAT Ager, PC = 4021C300.
-Traceback= 4021C308 40EFB5CC 40EFBA64 40EFBF44
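When many such messages pile up in a log, it helps to group them by process to see which one keeps hogging the CPU. The following is a minimal Python sketch under stated assumptions: the regular expression is modeled only on the two messages shown above, and the helper name `cpuhog_by_process` is hypothetical.

```python
import re
from collections import defaultdict

# Pattern modeled on the CPUHOG lines quoted above (an assumption; other IOS
# releases may format the message differently).
CPUHOG_RE = re.compile(
    r"%SYS-3-CPUHOG: Task ran for (\d+) msec \((\d+)/(\d+)\), "
    r"process = (.+?), PC = ([0-9A-Fa-f]+)"
)


def cpuhog_by_process(log_text):
    """Return {process name: [run times in msec]} for each CPUHOG message."""
    hogs = defaultdict(list)
    for match in CPUHOG_RE.finditer(log_text):
        hogs[match.group(4)].append(int(match.group(1)))
    return dict(hogs)


SAMPLE = (
    "*Mar  2 01:19:56.738: %SYS-3-CPUHOG: Task ran for 2048 msec (39/2), "
    "process = IP NAT Ager, PC = 4021C300.\n"
    "*Mar  2 01:20:06.974: %SYS-3-CPUHOG: Task ran for 2172 msec (44/2), "
    "process = IP NAT Ager, PC = 4021C300.\n"
)

if __name__ == "__main__":
    print(cpuhog_by_process(SAMPLE))  # {'IP NAT Ager': [2048, 2172]}
```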
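The system reboots every so often and records the crash information in bootflash. At that point, show process cpu shows the IP NAT process (IP NAT Ager) taking a large share of the CPU: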
------------------ show process cpu ------------------
CPU utilization for five seconds: 44%/8%; one minute: 43%; five minutes: 44%
 PID Runtime(ms)   Invoked  uSecs    5Sec    1Min    5Min TTY Process
  72         200      6178     32   0.00%   0.00%   0.00%   0 Spanning Tree
  73           0         2      0   0.00%   0.00%   0.00%   0 Const MPLS RP pr
  74     1465884   7504888    195   3.18%   3.25%   3.65%   0 IP Input
  75        2480      6141    403   0.00%   0.00%   0.00%   0 CDP Protocol
  76           0         1      0   0.00%   0.00%   0.00%   0 PPPATM Session d
  77           0         2      0   0.00%   0.00%   0.00%   0 PASVC create VA
  78     7856368   4520305   1738  28.58%  30.23%  30.71%   0 IP NAT Ager
  79          32       393     81   0.00%   0.00%   0.00%   0 HWIF QoS Process
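It later turned out to be a software bug, and an internal Cisco one at that: it cannot be looked up even with a CCO account. Upgrading to 12.2.17 fixed it; with the same number of NAT entries, the IP NAT Ager process now takes only 0.16% of the CPU.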
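To compare CPU load before and after an upgrade like this one, the show process cpu output can be filtered for processes above a chosen 5-minute load. This is a minimal Python sketch, assuming each process row sits on a single line as the CLI prints it; the function name `busy_processes` and the 10% default threshold are hypothetical choices, not Cisco guidance.

```python
import re

# One row: PID, Runtime(ms), Invoked, uSecs, 5Sec, 1Min, 5Min, TTY, Process.
ROW_RE = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+"
    r"([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+(\d+)\s+(.+?)\s*$"
)


def busy_processes(output, threshold=10.0):
    """Return (process, 5-minute CPU %) pairs at or above threshold, busiest first."""
    rows = []
    for line in output.splitlines():
        match = ROW_RE.match(line)
        # Header and summary lines do not match the row pattern and are skipped.
        if match and float(match.group(7)) >= threshold:
            rows.append((match.group(9), float(match.group(7))))
    return sorted(rows, key=lambda row: row[1], reverse=True)


SAMPLE = """\
CPU utilization for five seconds: 44%/8%; one minute: 43%; five minutes: 44%
 PID Runtime(ms)   Invoked  uSecs    5Sec    1Min    5Min TTY Process
  74     1465884   7504888    195   3.18%   3.25%   3.65%   0 IP Input
  78     7856368   4520305   1738  28.58%  30.23%  30.71%   0 IP NAT Ager
"""

if __name__ == "__main__":
    print(busy_processes(SAMPLE))  # [('IP NAT Ager', 30.71)]
```

Running the same filter on output captured after the upgrade should return an empty list if the process has dropped to the 0.16% reported above.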