Tag Archives: “resource exhaustion”

Watchguard XTM Firewall and UTM Appliance – High CPU Usage in the GAV (gateway anti-virus) scand process causes lag and typing delay in Remote Desktop Sessions (RDP) and SIP or VoIP latency issues

Remote users may report lag in Remote Desktop (RDP) sessions, including typing delay, freezing sessions, black screens and random disconnections.  At around the same time, you may find that CPU usage on your Watchguard has risen to 100% and that the majority of the activity is attributed to the scand process.  You may be able to recreate this issue by browsing websites that use lots of Adobe Flash or media content, as GAV needs to scan all of these elements of the web page.

To check, log in to Watchguard System Manager, open Firebox System Manager, click on Status Report and scroll down the report until you find the Process List (screenshot below).  This information updates automatically every 30 seconds, so you will see the %CPU column change at that interval.  The top value, system, shows the overall CPU utilisation; further down you can see which sub-processes are actually occupying the CPU time and making up the overall system usage.  In the screenshot below, system is showing 100% CPU usage and the scand process is accounting for 90.99% of this.

When CPU usage reaches 100% on the Watchguard unit it may stop forwarding other traffic, and this accounts for the lag and jitter seen within the Remote Desktop session.  Other time-sensitive traffic such as VoIP or SIP may also be affected, as packets are delayed whilst the firewall recovers from the resource exhaustion.  Users may also report that web pages are slow to load at the time these issues occur, while the GAV process is still dealing with other requests.
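If you copy the %CPU figures from the Process List by hand, the check described above can be sketched in a few lines of Python.  This is only an illustration: the function names, the 80% threshold and the proc_a/proc_b entries are assumptions made for the example, and only the scand figure of 90.99% comes from the screenshot.

```python
from typing import Dict, Optional, Tuple


def top_cpu_process(cpu_by_process: Dict[str, float]) -> Optional[Tuple[str, float]]:
    """Return the (name, %CPU) pair with the highest CPU share, or None if empty."""
    if not cpu_by_process:
        return None
    name = max(cpu_by_process, key=cpu_by_process.get)
    return name, cpu_by_process[name]


def scand_is_saturating(cpu_by_process: Dict[str, float], threshold: float = 80.0) -> bool:
    """True when scand is the busiest process and is at or above the threshold.

    The 80% default is an assumed cut-off, not a Watchguard recommendation.
    """
    top = top_cpu_process(cpu_by_process)
    return top is not None and top[0] == "scand" and top[1] >= threshold


# Values typed in from a Status Report; proc_a and proc_b are hypothetical names.
sample = {"scand": 90.99, "proc_a": 2.1, "proc_b": 1.4}

if __name__ == "__main__":
    print(top_cpu_process(sample))
    print(scand_is_saturating(sample))
```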

Resolution/Workaround:

To confirm that GAV is the actual cause of your issues, try disabling GAV (gateway anti-virus) for the HTTP and FTP proxies.  If the problem subsides, consider updating the XTM OS to the latest release (i.e. 11.5.2) and/or adjusting the GAV policy so that it does not scan some content (e.g. images and text within websites).  You may also want to open a support case with Watchguard to make them aware of this issue.  If you have a large number of users, you may even need to upgrade your XTM appliance to a larger unit (e.g. XTM 23 to XTM 505, or XTM 22 to XTM 330) to provide the additional processing power (CPU) and system resources needed for anti-virus scanning.

Watchguard XTM High CPU Usage scand

[RESOLVED] Your computer may stop responding when you run an application, Software Firewall or anti-virus package that uses the Windows Filtering Platform API

Your computer may stop responding when you run an application, Software Firewall or anti-virus package that uses the Windows Filtering Platform API

This issue affects the following operating systems:

  • Windows 7 – Service Pack 1
  • Windows Small Business Server 2011 – Service Pack 1
  • Windows SBS 2011 – Service Pack 1
  • Windows Server 2008 R2 – Service Pack 1
  • Windows Small Business Server 2008 – Service Pack 2
  • Windows SBS 2008 – Service Pack 2
  • Windows Server 2008 – Service Pack 2
  • Windows Vista – Service Pack 2

In this situation, the computer may perform slowly or stop responding, and network activity may be affected.  You may find that a system restart resolves this issue in some instances.

This issue occurs because the FwpsStreamInjectAsync0 function causes the interrupt request level (IRQL) to leak.  You can resolve the issue by updating to the latest Netio.sys driver.  The download link can be found within Microsoft KB 2664888 http://support.microsoft.com/kb/2664888

 

Windows Filtering Platform (WFP) General Description

Windows Filtering Platform (WFP) is a set of API and system services that provide a platform for creating network filtering applications. The WFP API allows developers to write code that interacts with the packet processing that takes place at several layers in the networking stack of the operating system. Network data can be filtered and also modified before it reaches its destination.

By providing a simpler development platform, WFP is designed to replace previous packet filtering technologies such as Transport Driver Interface (TDI) filters, Network Driver Interface Specification (NDIS) filters, and Winsock Layered Service Providers (LSP). Starting in Windows Server 2008 and Windows Vista, the firewall hook and the filter hook drivers are not available; applications that were using these drivers should use WFP instead.

With the WFP API, developers can implement firewalls, intrusion detection systems, antivirus programs, network monitoring tools, and parental controls. WFP integrates with and provides support for firewall features such as authenticated communication and dynamic firewall configuration based on applications’ use of sockets API (application-based policy). WFP also provides infrastructure for IPsec policy management, change notifications, network diagnostics, and stateful filtering.

More info can be found here http://msdn.microsoft.com/en-us/library/windows/desktop/aa366510(v=vs.85).aspx

 

HTTP and HTTPS requests or traffic to a Windows Vista, Windows 7, Windows Server 2008, SBS 2008, Windows Server 2008 R2 or SBS 2011 machine may exhibit increased latency if the connection is through a network load balancer

If Microsoft Internet Information Services (IIS), or an application that uses the System.Net.HttpListener class, is installed or running on one of the operating systems below, and the connection passes through a network load balancer, then you may find that increased latency occurs on HTTP and HTTPS requests and traffic.

This issue occurs because HTTP and HTTPS requests from clients can include zero-length data in the SSL records.  When this happens, certain server-side variables do not update correctly and Http.sys leaves the connection in the CLOSE_WAIT state.  This in turn exhausts the open connection limit, which can introduce latency, timeouts and connection problems.
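One way to see whether connections are piling up in CLOSE_WAIT is to count TCP states in the output of netstat -an on the affected server.  The sketch below assumes the usual Windows column layout (Proto, Local Address, Foreign Address, State); the function name and the sample addresses are made up for illustration, and a high CLOSE_WAIT count on its own does not prove that this particular Http.sys issue is the cause.

```python
from collections import Counter


def count_tcp_states(netstat_output: str) -> Counter:
    """Count TCP connection states in text captured from `netstat -an`."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows look like: TCP <local address> <foreign address> <state>
        # UDP rows have no state column and header rows do not start with TCP.
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


# Hypothetical sample output using documentation-only IP addresses.
sample_output = """
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    0.0.0.0:443            0.0.0.0:0              LISTENING
  TCP    192.0.2.10:443         198.51.100.7:50211     CLOSE_WAIT
  TCP    192.0.2.10:443         198.51.100.7:50212     CLOSE_WAIT
  TCP    192.0.2.10:443         198.51.100.9:61034     ESTABLISHED
  UDP    0.0.0.0:53             *:*
"""

if __name__ == "__main__":
    print(count_tcp_states(sample_output)["CLOSE_WAIT"])
```

Running the count a few minutes apart shows whether the number of CLOSE_WAIT connections keeps growing rather than draining.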

Affected Operating Systems:

  • Microsoft Windows Vista
  • Microsoft Windows 7
  • Microsoft Windows Server 2008
  • Microsoft Small Business Server 2008 – SBS 2008
  • Microsoft Windows Server 2008 R2
  • Microsoft Small Business Server 2011 – SBS 2011

The Microsoft Knowledge Base Article KB 2634328 includes further information on this issue and provides an updated version of Http.sys that corrects the issue http://support.microsoft.com/kb/2634328

Remote Desktop Sessions Pause Or Exhibit Unresponsiveness – Lag Whilst Typing And Session Will Not Accept Mouse Inputs

Remote Desktop Services can be extremely useful, allowing users to access a terminal server or their company desktop computer from another location.  One very common complaint with RDP sessions is screen refresh delays and a delay when typing or trying to click on items with the mouse.  To most users it will appear that the session has become unresponsive for 5-20 seconds; after this delay the session returns to normal for several minutes before once again becoming unresponsive.  You may find that this issue becomes more apparent as more users connect to the terminal server in question, particularly if these users run several applications (e.g. Outlook, Word and Excel) together.

Causes for poor user experience when connected via RDP are varied but one of the most common is resource exhaustion or contention.  This in turn causes a delay in processing that appears as a pause or unresponsiveness.

Check that your computer or terminal server has sufficient Memory to cope with the current load.

Next, use Performance Monitor to verify that the PhysicalDisk\% Idle Time counter is consistently high.  That is correct: this value should be 90-100% when the server is not very busy.

It is worth running Performance Monitor with the PhysicalDisk\% Idle Time counter whilst you are seeing the slowdowns.  This will help identify whether your hard disk or controller is causing contention and, in turn, the pausing or unresponsiveness.
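If you would rather log the counter than watch it live, the Windows typeperf tool can write samples to CSV, for example typeperf "\PhysicalDisk(_Total)\% Idle Time" -si 1 -sc 60 -o idle.csv.  The sketch below picks out the samples where idle time dropped below a threshold.  The 20% threshold, the function name and the SERVER host name in the sample are assumptions for illustration; choose a threshold that suits your own baseline.

```python
import csv
import io
from typing import List, Tuple


def low_idle_samples(typeperf_csv: str, threshold: float = 20.0) -> List[Tuple[str, float]]:
    """Return (timestamp, idle %) pairs where idle time fell below the threshold."""
    low = []
    for row in csv.reader(io.StringIO(typeperf_csv)):
        if len(row) != 2:
            continue  # blank lines or rows with an unexpected shape
        timestamp, value = row
        try:
            idle = float(value)
        except ValueError:
            continue  # header row, or a sample with no numeric value
        if idle < threshold:
            low.append((timestamp, idle))
    return low


# Hypothetical capture in the PDH-CSV layout that typeperf writes.
sample_csv = r'''
"(PDH-CSV 4.0)","\\SERVER\PhysicalDisk(_Total)\% Idle Time"
"03/12/2012 10:15:01.000","97.500000"
"03/12/2012 10:15:02.000","12.300000"
"03/12/2012 10:15:03.000","4.000000"
"03/12/2012 10:15:04.000","88.100000"
'''

if __name__ == "__main__":
    for timestamp, idle in low_idle_samples(sample_csv):
        print(timestamp, idle)
```

Comparing the timestamps of the low-idle samples with the times users reported pauses helps confirm or rule out disk contention.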

If you do find that “% Idle Time” keeps dropping very low, then it’s time to consider some of the options below to help resolve the issue:

  • Install a second drive or mirror set and move the Windows page file to this second disk/array to reduce the load on the drive/array holding your operating system
  • Install additional memory in the computer or server; this will reduce paging to disk and will generally improve overall system performance
  • Migrate to, or upgrade your existing RAID controller to, a unit that has a battery-backed cache (fast) or flash-backed cache (newer and faster) to significantly improve performance and relieve the load on the disks
  • Migrate to faster hard disk drives, for example from 7,200 RPM to 10,000 or 15,000 RPM drives.  The SATA interface is slower than the SAS interface but is cheaper.  Try to invest in the fastest drives that you can to future-proof the system and avoid performance issues if you have to scale for more users.
  • Ensure that you have at least 20-25% free disk space on all partitions/drives
  • Defragment all drives on a regular basis to optimise read and write operations
  • A cheap way to improve disk performance may be to turn on the hard disk cache using “Device Manager”.  If you are using a RAID controller without a battery-backed cache module, this feature will not be available within “Device Manager”; instead, open the RAID array management software and enable the disk cache there.  Please note that this carries some risk and should be used with caution: you may lose data in the event of a sudden or unexpected loss of power to the system.  Consider using this option with a UPS and redundant power supplies to reduce the risk of power loss.  As always, ensure you have a reliable backup that is carried out at regular intervals.
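The free-space guideline in the list above (at least 20-25% free on every partition) is easy to check with a script.  The following is a minimal sketch using Python's standard shutil.disk_usage; the function names and the 20% default are assumptions taken from the lower end of that guideline.

```python
import shutil


def free_fraction(total_bytes: int, free_bytes: int) -> float:
    """Return free space as a fraction of total capacity (0.0 to 1.0)."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def has_enough_free_space(path: str, minimum: float = 0.20) -> bool:
    """True when the volume holding `path` has at least `minimum` free space."""
    usage = shutil.disk_usage(path)
    return free_fraction(usage.total, usage.free) >= minimum


if __name__ == "__main__":
    # "." checks the volume holding the current directory; on a Windows
    # server you would pass drive roots such as "C:\\" instead.
    print(has_enough_free_space("."))
```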

Example – HP RAID Array Configuration Utility:

Example – Windows Device Manager: