Updated
Jan 2nd, 2013
First Posted
Jan 2nd, 2013

Troubleshooting (v5)

Can't Log In to GUI or Missing Buttons

If you enter your login credentials in the GUI but, instead of seeing the admin panel, you are returned to the login page with the text "Logged in as" and no username, the likely cause is either that your browser is not storing cookies from the appliance, or that the time and/or time zone differs between the appliance and your desktop. The same problem can appear after a successful login if the cookie later times out, in which case the GUI will be missing action buttons such as "Add Rule" and "Save Rule". If this happens, log out and then log back in, then double-check your time and time zone settings as described below.
  • Make sure that cookies are allowed for your appliance IP or hostname in your browser
  • Double-check that both your desktop machine and the appliance show the proper time AND the correct time zone. On the appliance, you can run the "date" command to see the current time and time zone setting.
# date
Thu Jul 18 11:51:14 EDT 2013
Note that on new appliances, you must set the time zone for both the OS and PHP. See the "System Configuration" section of the ET/BWMGR V5.0 User Guide for more information on how to do this. If the time zone is not set up properly, you may be unable to log in to the GUI, and graphs and reports may not show the right information.
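A quick way to see both settings side by side is sketched below. It is only an illustration: the function name is made up, and it assumes a php command-line binary is installed on the appliance, which may not be the case.

```shell
#!/bin/sh
# Print the clock and time zone settings that matter for GUI logins.
# show_time_settings is a hypothetical helper, not part of the ET/BWMGR tools.
show_time_settings() {
    echo "Local time : $(date)"
    echo "UTC time   : $(date -u)"
    echo "OS zone    : $(date +%Z)"
    # PHP keeps its own time zone setting; report it only if a php CLI exists.
    if command -v php >/dev/null 2>&1; then
        echo "PHP zone   : $(php -r 'echo date_default_timezone_get();')"
    else
        echo "PHP zone   : (php command-line binary not found)"
    fi
}
```

Run show_time_settings on the appliance and compare the result with the clock and time zone shown on your desktop.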

File System Full

If you get the message "File System Full", any number of programs can stop running correctly, most notably the MySQL database. If you haven't maintained your system in a long time, you could have log files or leftover .img or .tgz files that you downloaded and never removed. To find large files, use the following command:
# find / -type f -size +2G -exec ls -lh {} \; | awk '{ print $9 ": " $5 }'
This command will find all files over 2GB on the system. Make sure you only have your primary disk mounted. You'll get output something like this:
/usr/local/images/bigfile.img: 3.7G
/var/log/somelog.log: 3.1G
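The same search can be wrapped in a small function so the starting directory and size threshold are easy to change. This is a sketch, not a vendor tool: the function name is hypothetical, and the -xdev option is added here so that find stays on the file system it starts in, even if other disks are mounted.

```shell
#!/bin/sh
# List files larger than a threshold without crossing into other
# mounted file systems (-xdev).
# Usage: find_large_files <start-dir> <min-size>   e.g. find_large_files / 2G
# find_large_files is a hypothetical helper name.
find_large_files() {
    find "$1" -xdev -type f -size +"$2" -exec ls -lh {} \; 2>/dev/null |
        awk '{ print $9 ": " $5 }'
}
```

For example, find_large_files /var 500M would list files over 500MB under /var only. As with the original command, file names containing spaces will be cut off at the first space.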
Use the "df" command to see the status of your file systems. Note that when you delete a file, the freed space will NOT be immediately shown in the df output.
etbwmgr# df
Filesystem   1K-blocks     Used    Avail Capacity  Mounted on
/dev/ada0s1a   2031132   403652  1464992    22%    /
devfs                1        1        0   100%    /dev
/dev/ada0s1e   2031132   692800  1175844    37%    /var
/dev/ada0s1f 144424956 71843644 61027316    54%    /usr
For large log files, you should add an entry to the /etc/newsyslog.conf file so that they will be automatically rotated when they get too large or at specified time intervals (see "man newsyslog.conf").
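As an illustration, the sketch below appends a rotation entry for one log file unless an entry is already there. The function name and all of the values (mode 644, 5 archives, rotation at roughly 100MB, bzip2 compression) are assumptions chosen for the example, so adjust them to your own needs.

```shell
#!/bin/sh
# Append a rotation entry for a log file to a newsyslog.conf-style file,
# unless an entry for that log is already present.
# Usage: add_rotation_entry <conf-file> <log-file>
# add_rotation_entry is a hypothetical helper; the values are illustrative.
add_rotation_entry() {
    conf=$1
    log=$2
    if grep -q "^$log[[:space:]]" "$conf" 2>/dev/null; then
        echo "entry for $log already present in $conf"
        return 0
    fi
    # Fields: logfile  mode  count  size(KB)  when  flags
    # 644 = file mode, 5 = archives to keep, 100000 = rotate at ~100MB,
    # * = no time-based rotation, J = compress archives with bzip2.
    printf '%s\t644\t5\t100000\t*\tJ\n' "$log" >> "$conf"
    echo "added entry for $log to $conf"
}
```

Called as add_rotation_entry /etc/newsyslog.conf /var/log/somelog.log, it adds one line to the file. Running "newsyslog -nv" afterwards prints what newsyslog would do without actually rotating anything.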

Rules Missing

If you have rules "missing" from your /etc/rc.bwmgr file (or you see "Can't Allocate Stats Structure" in your system log), make sure you have the latest code. As problems are reported, we find some variations of rebuilt startup rules that don't work. There is a default limit of 8000 named protocols or rules with stats enabled (stats are also maintained for Protocols). If you have over 2000 rules with stats enabled, you'll have to increase the table size. In /boot/loader.conf:
bwmgr.protocol_stats_table=10000
Add this line, or increase the existing value. The setting pre-allocates a chunk of memory, so keep the number only a bit larger than what you'll need.
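To check what is currently configured before editing, something like the following can be used. It is a minimal sketch with a hypothetical function name. It reads any loader.conf-style file, skips commented-out lines, and prints the last value assigned to the key.

```shell
#!/bin/sh
# Print the value assigned to a key in a loader.conf-style file,
# ignoring commented-out lines. Prints nothing if the key is absent.
# Usage: loader_value <file> <key>
# loader_value is a hypothetical helper name.
loader_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" |
        tr -d '"' | tail -n 1
}
```

For example, pass the path of your loader.conf file and bwmgr.protocol_stats_table as the key. An empty result means the setting has not been added yet.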

Missing Some/All Rules After Reboot

To begin, it's important to know how the rules are added at boot time: the startup script, /etc/rc.bwmgr, stores all of your ET/BWMGR rules, groups, and settings. If the rule in question is not in this file, it will not be added at boot time.
Make sure the rule exists
First, check that the rule is in the startup file. You can either open /etc/rc.bwmgr in an editor (e.g. vi), or you can search for it with the grep utility:
# grep text /etc/rc.bwmgr
where text is unique text within the rule, such as the rule name or index.
If the rule has an entry in the startup file, but doesn't exist in the ruleset following a boot, run the rule manually from the command line to see why. If there's a syntax error or bad option, an error code should be displayed, along with the usage guide for the bwmgr utility.
If the rule is not in the file, it may be because:
  • You didn't manually rebuild your ruleset after adding the rule
  • You don't have auto-rebuild enabled
To enable auto-rebuild, make sure that Auto Rebuild is set to 1 in the Settings tab in the GUI. Note that the ruleset is updated once every 5 minutes, so if you reboot right after adding rules they may not have been added to your ruleset.
If you have lots of rules that aren't being added, you may have general syntax errors. This is only likely to happen when a large ruleset is imported from an older appliance and has not been properly updated to match v5 syntax. This situation can be harder to track down, since some of the rules will load and others will not. To simplify the task of identifying rules that are not loading due to syntax errors, run the startup script with the -x option, which causes each command to be printed as it is run.
# sh -x /etc/rc.bwmgr
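With a large ruleset the trace scrolls by quickly, so it can help to save it to a file and read it afterwards. The sketch below does that. The function name and the log path in the usage note are hypothetical.

```shell
#!/bin/sh
# Run a startup script with tracing (-x) and keep the full trace in a file,
# so the command printed just before each error message can be found later.
# Usage: trace_startup <script> <trace-log>
# trace_startup is a hypothetical helper name.
trace_startup() {
    sh -x "$1" > "$2" 2>&1
    echo "exit status: $?  (trace saved in $2)"
}
```

For example, trace_startup /etc/rc.bwmgr /tmp/rc.bwmgr.trace. In the saved file, each traced command starts with "+", and any error text appears right after the command that caused it.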
Another check is to use
# bwmgr rebuild
# bwmgr rebuild userules
"rebuild" shows the rules in the database. "rebuild userules" shows the rules actually loaded into the system. If a rule is missing, check to see if the rule is or isn't in both of those outputs. Sometimes, the database can become corrupt or out of sync. To rebuild the database from the rules in your system:
# bwmgr flushdb
# bwmgr rebuild userules | sh
Note that the rules shown in the GUI are rules in the database.
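To compare the database rules with the loaded rules, the two listings can be saved to files and compared line by line. The sketch below assumes that each rule appears on a single line and that the same rule is printed identically by both commands, which may not hold for every ruleset. The function name is hypothetical.

```shell
#!/bin/sh
# Show rules that appear in one saved listing but not in the other.
# Usage: compare_rule_listings <database-listing> <loaded-listing>
# compare_rule_listings is a hypothetical helper name.
compare_rule_listings() (
    tmpdir=$(mktemp -d) || exit 1
    # comm needs sorted input, so sort copies of both listings first.
    sort "$1" > "$tmpdir/db"
    sort "$2" > "$tmpdir/live"
    echo "Only in database:"
    comm -23 "$tmpdir/db" "$tmpdir/live"
    echo "Only loaded in system:"
    comm -13 "$tmpdir/db" "$tmpdir/live"
    rm -rf "$tmpdir"
)
```

To use it, save the two outputs with "bwmgr rebuild > /tmp/db.txt" and "bwmgr rebuild userules > /tmp/live.txt", then run compare_rule_listings /tmp/db.txt /tmp/live.txt.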

WARNING: Outside interface not set. Limiting Disabled

The Outside Interface is a setting that tells the ET/BWMGR which of your bridge ports is the "outside," that is, connected to your upstream network. It's a required setting - as the error message indicates, no bandwidth limiting can be performed until the software can identify the outside interface. If you've used bwmgr_setup to set up networking, you should not see this message; but if you have done the setup manually, or modified it after the fact, you may have inadvertently cleared this setting, in which case you will see this on the console and also in /var/log/messages.
To set the "outside" flag:
  • Go to the "Interfaces" tab in the ET/BWMGR
  • Select the interface using the check-box to the left of the interface name
  • Click "Edit"
  • Select the first check-box, "Outside"
  • Click "Save" to apply the setting

High Latency Through Bridge / Bridge Not Passing Traffic

Check the Hardware
First, check the basic hardware connections. A bad Ethernet cable or a loose connection can cause these symptoms. Check the link status at both ends of each side of the bridge (both bypass ports, and the equipment that they are connected to). Next, check for errors on your bridge interfaces. From the command line, run the following command to report on network interface statistics:
bwmgr# netstat -in
Name    Mtu Network       Address              Ipkts Ierrs Idrop    Opkts Oerrs  Coll
igb1   1500 Link#5        00:e0:ed:2b:05:28     3487     0     0        0     0     0
igb2   1500 Link#6        00:e0:ed:2b:05:29        0     0     0     1743     0     0
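The Ierrs and Oerrs columns are the ones to watch. As a rough sketch, the filter below reduces the netstat output to just the interface name and those two counters, which makes it easier to compare two samples. It assumes the 10-column layout shown above and only looks at the Link# rows. The function name is hypothetical.

```shell
#!/bin/sh
# Read `netstat -in` output on stdin and print "name Ierrs Oerrs" for each
# link-level row. Assumes the 10-column layout:
# Name Mtu Network Address Ipkts Ierrs Idrop Opkts Oerrs Coll
# link_errors is a hypothetical helper name.
link_errors() {
    awk 'NF == 10 && $3 ~ /^Link#/ { print $1, $6, $9 }'
}
```

For example, run "netstat -in | link_errors > /tmp/errs.1", wait a minute, run it again into /tmp/errs.2, and then "diff /tmp/errs.1 /tmp/errs.2". No diff output means the counters did not change.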
If you do see input or output errors, check again after a short time to see whether the error counters are increasing as more data is passed. If so, look at the link settings on both sides of the connection. Occasionally, N-way/auto negotiation with an individual piece of equipment may fail (one common example is using a crossover cable directly to some Cisco router equipment). If this happens, and the two sides disagree on link speed or duplex, you will see many errors and a significant drop in performance. Manually setting the link speed and duplex on each side of the link may solve the problem. If not, another solution is to put a small, inexpensive switch between the failover port and the problem equipment.
Manually Setting Link Speed & Duplex
Using the ifconfig command, you can manually set the link options, which disables auto-negotiation. To see the available options for a given port, you can read the man page for the driver, in this case em:
# man em
Examples:
Set the link speed to 100Mb/s, full duplex:
# ifconfig em3 media 100baseTX mediaopt full-duplex
Set the link speed to 1000Mb/s (Gigabit), full duplex:
# ifconfig em3 media 1000baseTX mediaopt full-duplex
Check the Software
If you do not have a hardware issue or a link negotiation conflict, then the only likely cause of high latency or poor performance is the active ET/BWMGR ruleset. Here, in order, are 3 steps for identifying whether your rules are the cause of the observed poor throughput.
  • Use the failover bypass feature to bypass the appliance by "Closing" the bypass ports, which takes your ruleset out of the equation. Make sure traffic flows normally, then re-enable your rules by "Opening" the bypass ports.
  • Disable your rules by issuing the command bwmgr stop, and see if traffic flows normally. Re-enable your rules by running sh /etc/rc.bwmgr
  • Check your rules for problems. A mistake or misunderstanding about the scope of a rule can affect all traffic going through the bridge - for example, neglecting to add the "IP Address" field when adding a bandwidth limit will affect all traffic that hasn't matched a prior rule.

Loop Messages on console and/or in /var/log/messages

A bridge configuration depends on each MAC address on your network being accessible via only one port of the bridge. A loop occurs when any MAC address can be reached on both sides of the bridge. This is not necessarily a problem if you get one or two isolated messages - especially during testing when you may be moving machines around or plugging them into different ports. If you see a screen full of these messages, this means that two or more bridged ports on the appliance are plugged into the same switch or hub. Specifically, the message tells you that the MAC address was received on both of the listed interfaces. Constant looping can either halt your system or make it extremely slow, and must be resolved. It indicates a serious flaw in your network setup. Make sure your system is set up as described in the Quick-Start Guide.

ET/BWMGR is Limiting Too Much

Adjust your Shaping Settings
Some causes of slower than expected connections:
  • Packet Loss: Check your interface for errors and check for drops on any rule(s) that would match the traffic in question.
  • TCP Window Settings: A combination of your bandwidth setting and the default TCP Window settings will slow down an individual session more than desired in many cases. Try using different settings for tcpwindow to keep the window from being set too low. Try 5000 to start. A setting of 64000 effectively disables window shaping. (There's also a setting in v5 to disable shaping for a rule)
If you have very high settings, it's possible that the server you're testing with (or some intermediate network) won't give you that much bandwidth. Shaping isn't designed to work with 1 connection. It's designed to manage many connections, so 1 connection may not be able to get the full amount with normal settings.
Check your Interrupts Per Second
For acceptable latency, your max_ints setting shouldn't be less than 6000. If your system is loaded, you can see your interrupts in action using:
# systat -vmstat 1
This will show you a lot of statistics, with interrupts shown on the right side of the page. If your system isn't under significant load, the numbers won't be significant or useful for tuning purposes. If you think that you need to increase your interrupts, contact ET Support, as the procedure is different for different network adapters.

Graph Problems

If you see a broken icon instead of your graph, you can right-click on the icon and "open image in new window". This should print out any error messages. Usually the error is fairly self-explanatory. If you see an error message but don't know how to go about fixing the underlying issue, send the problem report along with the exact error message(s) to our support staff using the ticket system.
If the graphs load properly, but do not show any data points, then the data is not being stored. If this is the case, the first thing to do is check the log file for bwmgrd, the program responsible for storing the stats in the databases. From the console:
# tail /var/log/bwmgrd.log
will show the last 10 lines in the log. Each time that bwmgrd is run, it will print an entry to this log. If the entry reads something other than "Running", that is noteworthy.
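In a long log it can be quicker to filter out the normal entries. The sketch below assumes that each entry is one line and that normal entries contain the word "Running". The function name is hypothetical.

```shell
#!/bin/sh
# Print the most recent log lines that do not contain "Running".
# Usage: unusual_entries <log-file> [max-lines]   (default: 20 lines)
# unusual_entries is a hypothetical helper name.
unusual_entries() {
    grep -v 'Running' "$1" | tail -n "${2:-20}"
}
```

For example, unusual_entries /var/log/bwmgrd.log prints up to 20 of the latest entries that are worth a closer look. No output means every entry contained "Running".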

If you have no data in your graphs

Check Permissions for /usr/local/etc/bwmgr/graphs
You can fix permissions with:
# chmod 666 /usr/local/etc/bwmgr/graphs/*
The files should also be owned by daemon:
# chown daemon /usr/local/etc/bwmgr/graphs/*
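Before changing anything, you may want to see which files are actually wrong. The following is a minimal sketch using find. Both function names are hypothetical.

```shell
#!/bin/sh
# List regular files in a directory whose mode is not exactly 666.
# Usage: wrong_mode_files <dir>
wrong_mode_files() {
    find "$1" -type f ! -perm 666
}

# List regular files in a directory that are not owned by the given user.
# Usage: wrong_owner_files <dir> <user>
wrong_owner_files() {
    find "$1" -type f ! -user "$2"
}
```

For example, wrong_mode_files /usr/local/etc/bwmgr/graphs and wrong_owner_files /usr/local/etc/bwmgr/graphs daemon. If both print nothing, permissions and ownership are not the cause of the empty graphs.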

General Troubleshooting

Disaster Recovery

This section deals with a situation wherein your appliance does not boot, either due to a crash that fsck (the UNIX "chkdsk" or "scandisk" equivalent) cannot deal with gracefully, or a panic during the boot process. In either case, you can either use the USB Demo to fix the problem, or take manual control of the appliance at boot time. If you do not have a USB boot image, then you must follow the step-by-step instructions below. If you do have a USB Demo or backup, boot from that, and you can check & mount your appliance disk using the following command:
# diskutil mount ada0
If the appliance was panicking at boot, this will enable you to make the necessary changes, since the appliance filesystem will be accessible in the "/ada0" directory. If you know exactly what is causing the problem, then you can take specific action to fix it. For example, if you suspect a BWMGR rule is causing problems, but don't know which one, then you can bypass starting the ET/BWMGR like this:
# cd /ada0/etc
# mv rc.bwmgr rc.bwmgr.sav
# halt
Remove the USB and then reboot.
If you are unsure of how to proceed, you can open a support ticket and describe the problem.

Manual Instructions:

Power on the appliance, wait for the boot menu to appear, press "s" to select single-user mode, then hit ENTER to begin booting. Because you are entering single-user mode, you will be prompted for the shell to use for root. Simply hit ENTER to accept the default of /bin/sh. Now you should have a root prompt. First, you should run fsck to check the drives:
# fsck -y
This command should take a few minutes to complete, at which time you can either continue the boot, or you can make appropriate changes to your startup files. If you need to make any changes, you must first enable read/write access to your filesystems:
# mount -a
If you know exactly what is causing the problem, then you can take specific action to fix it. If you suspect a BWMGR rule is causing problems, but don't know which one, then you can bypass starting the ET/BWMGR like this:
# mv /etc/rc.bwmgr /etc/rc.bwmgr.sav
# exit
The boot will now continue.