Category: Performance

Windows Performance through VSS Cleanup and Preassociation
article #1004, updated 74 days ago

These steps can improve Windows performance a great deal. They work because a vast array of applications and services in Windows use VSS on their back ends. Start an administrative CMD, and then…

Step 1:

First we run the following in an administrative CMD:

vssadmin Delete Shadows /All

If there are orphan shadows, you will be asked whether you want to delete them. If you delete them, you will see an immediate performance benefit. Reportedly, Windows autodeletes them only after there are 64 per volume. We prefer to see zero!

Step 2:

We now improve any existing preassociation of disk space for VSS. On some machines, this will increase performance very impressively, immediately. In general it keeps them smooth and stable and prevents hesitations. This does not reserve or take up the space; it just “associates” it, making it ready for use, so that whenever Windows wants to do any of the bajillions of things it does with VSS, ranging from tiny to enormous, it can skip that step.

It is worthwhile to know that C:, on all workstation installs and many server installs, has a minimal preassociation already set up. So the first thing to do is resize the existing association.

Do the below in administrative CMD:

vssadmin Resize ShadowStorage /For=C: /On=C: /MaxSize=20%

Repeat for any other active hard drive partitions (D:, E:, et cetera). Don’t worry if you get an error; the next step covers it.
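Repeating the resize for several partitions can be sketched as a small generator of command lines. This is only an illustration: the function name `resize_commands` and the drive list are made up here, and the output is meant to be reviewed and pasted into the administrative CMD rather than run blindly.

```python
def resize_commands(volumes, max_size="20%"):
    """Build one 'vssadmin Resize ShadowStorage' line per volume.

    The /For, /On and /MaxSize switches are the ones used in Step 2;
    the list of volumes is whatever the caller supplies.
    """
    commands = []
    for vol in volumes:
        commands.append(
            f"vssadmin Resize ShadowStorage /For={vol} /On={vol} /MaxSize={max_size}"
        )
    return commands


if __name__ == "__main__":
    # Example drive letters only; substitute the partitions that actually exist.
    for line in resize_commands(["C:", "D:", "E:"]):
        print(line)
```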

Step 3:

The resize may well throw an error, saying there is no such association. On a workstation OS, vssadmin lacks the two commands (Add ShadowStorage and Delete ShadowStorage) which we need to improve further, so in that case we are done. But on any Windows Server OS from 2003 onward, if the error was thrown, we do an Add for every volume that threw it:

vssadmin Add ShadowStorage /For=E: /On=E: /MaxSize=20%

Step 4:

And finally (server only), one more thing which can help if, for instance, C: is almost full but E: has plenty of space:

vssadmin Delete ShadowStorage /For=C: /On=C:
vssadmin Add ShadowStorage /For=C: /On=E: /MaxSize=20%

This maximizes overall performance, and also prevents possible backup failures and other issues due to insufficient disk space on C:.

Note:

On some machines, the volumes do not have letters. For these you will need to use the volume GUID path. In vssadmin list shadowstorage, they look like this:

Shadow Copy Storage association
   For volume: (\\?\Volume{99ac05c7-c06b-11e0-b883-806e6f6e6963}\)\\?\Volume{99ac05c7-c06b-11e0-b883-806e6f6e6963}\
   Shadow Copy Storage volume: (\\?\Volume{99ac05c7-c06b-11e0-b883-806e6f6e6963}\)\\?\Volume{99ac05c7-c06b-11e0-b883-806e6f6e6963}\
   Used Shadow Copy Storage space: 0 B (0%)
   Allocated Shadow Copy Storage space: 0 B (0%)
   Maximum Shadow Copy Storage space: 32 MB (32%)

For such a situation, substitute \\?\Volume{99ac05c7-c06b-11e0-b883-806e6f6e6963} (the whole long string) for C: in the above command lines.
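Picking the long string out of a listing by eye is error-prone, so here is a rough sketch that extracts volume GUID paths from saved vssadmin list shadowstorage output. The function name `volume_guid_paths` is hypothetical, and the sketch assumes the console may have wrapped long lines, so it removes line breaks before searching.

```python
import re

# Matches \\?\Volume{xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}\
VOLUME_GUID = re.compile(r"\\\\\?\\Volume\{[0-9a-fA-F-]{36}\}\\")


def volume_guid_paths(vssadmin_output):
    """Return the distinct volume GUID paths found in the text, in order."""
    # Undo console line wrapping so a GUID split across two lines still matches.
    flat = vssadmin_output.replace("\r", "").replace("\n", "")
    seen = []
    for match in VOLUME_GUID.findall(flat):
        if match not in seen:
            seen.append(match)
    return seen


# Sample text taken from the listing above, wrapped the way a console shows it.
SAMPLE = r"""Shadow Copy Storage association
   For volume: (\\?\Volume{99ac05c7-c06b-11e0-b883-806e6f6e6963}\)\\?\Volume{99a
c05c7-c06b-11e0-b883-806e6f6e6963}\
   Shadow Copy Storage volume: (\\?\Volume{99ac05c7-c06b-11e0-b883-806e6f6e6963}
\)\\?\Volume{99ac05c7-c06b-11e0-b883-806e6f6e6963}\
   Used Shadow Copy Storage space: 0 B (0%)
"""

if __name__ == "__main__":
    for path in volume_guid_paths(SAMPLE):
        print(path)
```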

PowerShell will give GUID paths for all volumes thusly:

GWMI -namespace root\cimv2 -class win32_volume

References are here:

https://technet.microsoft.com/en-us/library/cc788050.aspx

https://www.storagecraft.com/support/kb/article/289

http://backupchain.com/i/how-to-delete-all-vss-shadows-and-orphaned-shadows

http://www.tech-no.org/?p=898

Categories:   VSS   Performance

==============

Linux Speed, Responsiveness, and Latency Reduction with 'sysctl' Settings
article #892, updated 75 days ago

These items help a lot in any application, including desktop, web server, or terminal server. The end of this post has two large compilations of these settings, one for wired (“non-lossy”) networking, one for wireless (“lossy”).

On the vast majority of Linux distributions, one can just add these changes to /etc/sysctl.conf, and then run sysctl -p to apply them without reboot. However, recent additions to standards have enabled us to place custom settings in our own configuration files, so that we don’t take /etc/sysctl.conf out of distro control.

On recent Debian and Ubuntu, we may best put them in /etc/sysctl.d/60-custom.conf (or replace the word “custom” to your liking), and then run sysctl --system to load both /etc/sysctl.conf and everything under /etc/sysctl.d.

On some other recent distros, it’s /etc/sysctl.d/custom.conf (the word “custom” is still arbitrary), and then run systemctl restart systemd-sysctl.

You can check your results with sysctl -A.
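For a more targeted check than reading through sysctl -A, something like the following sketch can compare a settings file against the live values. The function names are made up for illustration, and the key-to-path mapping assumes plain dotted keys (no dots inside interface names).

```python
def parse_sysctl_conf(text):
    """Parse 'key = value' lines; skip blank lines and '#' or ';' comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        # Collapse runs of whitespace so "8192  87380" and "8192 87380" compare equal.
        settings[key.strip()] = " ".join(value.split())
    return settings


def proc_path(key):
    """Map a sysctl key to the file that exposes it under /proc/sys."""
    return "/proc/sys/" + key.replace(".", "/")


def read_live(key):
    """Read the running kernel's value, normalized the same way as the file."""
    with open(proc_path(key)) as handle:
        return " ".join(handle.read().split())
```

A typical use would be to loop over `parse_sysctl_conf(open(path).read()).items()` and print any key whose `read_live(key)` differs from the configured value.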

The first selection is for wired networking performance:

net.ipv4.tcp_window_scaling=1
net.ipv4.tcp_workaround_signed_windows=1
net.ipv4.tcp_sack=1
net.ipv4.tcp_fack=1
net.ipv4.tcp_low_latency=1
net.ipv4.ip_no_pmtu_disc=0
net.ipv4.tcp_mtu_probing=1
net.ipv4.tcp_frto=2
net.ipv4.tcp_frto_response=2
net.ipv4.tcp_congestion_control=illinois

A slightly different first group is recommended for anything involving wireless, i.e., “lossy” networks:

net.ipv4.tcp_window_scaling=1
net.ipv4.tcp_workaround_signed_windows=1
net.ipv4.tcp_sack=1
net.ipv4.tcp_fack=1
net.ipv4.tcp_low_latency=1
net.ipv4.ip_no_pmtu_disc=0
net.ipv4.tcp_mtu_probing=1
net.ipv4.tcp_frto=2
net.ipv4.tcp_frto_response=2
net.ipv4.tcp_congestion_control = hybla
net.ipv4.tcp_allowed_congestion_control = hybla cubic

And then some general networking performance items:

net.core.rmem_default = 31457280
net.core.rmem_max = 12582912
net.core.wmem_default = 31457280
net.core.wmem_max = 12582912
net.core.somaxconn = 4096
net.core.netdev_max_backlog = 65536
net.core.optmem_max = 25165824
net.ipv4.tcp_mem = 65536 131072 262144
net.ipv4.udp_mem = 65536 131072 262144
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.udp_rmem_min = 16384
net.ipv4.tcp_wmem = 8192 65536 16777216
net.ipv4.udp_wmem_min = 16384
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1

And some for network security enhancement:

net.ipv4.tcp_synack_retries = 2
net.ipv4.ip_local_port_range = 2000 65535
net.ipv4.tcp_rfc1337 = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 15

And now a few to keep virtual memory usage under good control:

vm.swappiness=20
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2

And two to increase the maximum number of open and watched files, very helpful indeed for servers, file synchronization of all sorts, and many other functions:

fs.file-max = 2097152
fs.inotify.max_user_watches = 524288

The above was compiled from these two excellent articles:

http://www.networkworld.com/article/2227856/opensource-subnet/best-networking-tweaks-for-linux.html
https://easyengine.io/tutorials/linux/sysctl-conf/

and other sources. Here is the whole set for wired (non-lossy) networking:

net.ipv4.tcp_window_scaling=1
net.ipv4.tcp_workaround_signed_windows=1
net.ipv4.tcp_sack=1
net.ipv4.tcp_fack=1
net.ipv4.tcp_low_latency=1
net.ipv4.ip_no_pmtu_disc=0
net.ipv4.tcp_mtu_probing=1
net.ipv4.tcp_frto=2
net.ipv4.tcp_frto_response=2
net.ipv4.tcp_congestion_control=illinois
net.core.rmem_default = 31457280
net.core.rmem_max = 12582912
net.core.wmem_default = 31457280
net.core.wmem_max = 12582912
net.core.somaxconn = 4096
net.core.netdev_max_backlog = 65536
net.core.optmem_max = 25165824
net.ipv4.tcp_mem = 65536 131072 262144
net.ipv4.udp_mem = 65536 131072 262144
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.udp_rmem_min = 16384
net.ipv4.tcp_wmem = 8192 65536 16777216
net.ipv4.udp_wmem_min = 16384
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_synack_retries = 2
net.ipv4.ip_local_port_range = 2000 65535
net.ipv4.tcp_rfc1337 = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 15
vm.swappiness=20
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2
fs.file-max = 2097152
fs.inotify.max_user_watches = 524288

and another full set for wireless / lossy networking:

net.ipv4.tcp_window_scaling=1
net.ipv4.tcp_workaround_signed_windows=1
net.ipv4.tcp_sack=1
net.ipv4.tcp_fack=1
net.ipv4.tcp_low_latency=1
net.ipv4.ip_no_pmtu_disc=0
net.ipv4.tcp_mtu_probing=1
net.ipv4.tcp_frto=2
net.ipv4.tcp_frto_response=2
net.ipv4.tcp_congestion_control = hybla
net.ipv4.tcp_allowed_congestion_control = hybla cubic
net.core.rmem_default = 31457280
net.core.rmem_max = 12582912
net.core.wmem_default = 31457280
net.core.wmem_max = 12582912
net.core.somaxconn = 4096
net.core.netdev_max_backlog = 65536
net.core.optmem_max = 25165824
net.ipv4.tcp_mem = 65536 131072 262144
net.ipv4.udp_mem = 65536 131072 262144
net.ipv4.tcp_rmem = 8192 87380 16777216
net.ipv4.udp_rmem_min = 16384
net.ipv4.tcp_wmem = 8192 65536 16777216
net.ipv4.udp_wmem_min = 16384
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_synack_retries = 2
net.ipv4.ip_local_port_range = 2000 65535
net.ipv4.tcp_rfc1337 = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_intvl = 15
vm.swappiness=20
vm.dirty_ratio = 60
vm.dirty_background_ratio = 2
fs.file-max = 2097152
fs.inotify.max_user_watches = 524288
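Which of these keys exist depends on the kernel version; net.ipv4.tcp_tw_recycle, for example, was removed in Linux 4.12, and sysctl reports an error for keys it cannot find. The sketch below lists the keys a given machine does not expose. The function name is hypothetical and the `root` parameter is only there so it can be pointed at a test directory.

```python
import os


def unavailable_keys(keys, root="/proc/sys"):
    """Return the sysctl keys that have no matching file under root."""
    missing = []
    for key in keys:
        # net.ipv4.tcp_sack -> <root>/net/ipv4/tcp_sack
        path = os.path.join(root, *key.split("."))
        if not os.path.exists(path):
            missing.append(key)
    return missing


if __name__ == "__main__":
    # A few keys from the sets above; extend the list as needed.
    sample = [
        "net.ipv4.tcp_sack",
        "net.ipv4.tcp_tw_recycle",
        "net.ipv4.tcp_frto_response",
        "vm.swappiness",
    ]
    for key in unavailable_keys(sample):
        print("not available on this kernel:", key)
```

Any key it reports can be dropped from the custom file before loading it.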

Categories:   Performance   Linux OS-level Issues

==============

Hardware performance info from Microsoft
article #1010, updated 173 days ago

Some very interesting data:

https://technet.microsoft.com/en-us/windows-server-docs/networking/technologies/network-subsystem/net-sub-performance-tuning-nics

Categories:   Performance   Hardware

==============

Disable 8.3 Filename Generation
article #978, updated 300 days ago

If your software is all new, let’s say 2013 and after, it probably makes sense to disable 8.3 filename generation, for a nice kick of speed.

To do it once for all drives, just do this:

fsutil behavior set Disable8dot3 1

If you want to do it for just one drive, say E:, first do a registry edit. In

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem

change NtfsDisable8dot3NameCreation to 2. Then reboot, and in an administrative command prompt:

fsutil behavior set Disable8dot3 E: 1

and reboot again, and it’s done.
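Why the registry value is 2 becomes clearer with the meanings of the four values laid out. The table below is a small sketch based on the documented fsutil settings; the names `DISABLE_8DOT3_MODES` and `describe_8dot3_mode` are made up for this illustration.

```python
# Meanings of NtfsDisable8dot3NameCreation (the same numbers fsutil accepts).
DISABLE_8DOT3_MODES = {
    0: "8.3 names are created on all volumes",
    1: "8.3 names are disabled on all volumes",
    2: "8.3 name creation is set per volume",
    3: "8.3 names are disabled on all volumes except the system volume",
}


def describe_8dot3_mode(value):
    """Return a readable description for a registry value, or 'unknown value'."""
    return DISABLE_8DOT3_MODES.get(value, "unknown value")


if __name__ == "__main__":
    for number in sorted(DISABLE_8DOT3_MODES):
        print(number, "=", describe_8dot3_mode(number))
```

Setting the value to 2 hands the decision to each volume, which is what allows the per-drive command to take effect.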

Categories:   Performance   Windows OS-Level Issues

==============

Internal QoS on Linux for General Speed and Reliability
article #972, updated 313 days ago

Prioritizing certain kinds of data can help a lot in general on Linux. Here’s the FireQOS configuration I just set up on this 802.11g-wireless laptop:

DEVICE=wlan0
INPUT_SPEED=54000kbit
OUTPUT_SPEED=54000kbit
LINKTYPE="ethernet"
interface $DEVICE world-in input rate $INPUT_SPEED
interface $DEVICE world-out output rate $OUTPUT_SPEED

interface $DEVICE world-in input rate $INPUT_SPEED $LINKTYPE balanced
	class priority commit 10%
		match tcp port 22,3389,53,444  # SSH, RDP, DNS, SSL VPN
		match proto GRE
		match icmp
		match tcp syn
		match tcp ack

interface $DEVICE world-out output rate $OUTPUT_SPEED $LINKTYPE balanced
	class priority commit 10%
		match tcp port 22,3389,53,444
		match proto GRE
		match icmp
		match tcp syn
		match tcp ack
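For a sense of scale, the commit 10% on the priority class can be worked out as below. This is a sketch that assumes the percentage is taken of the interface rate declared in the configuration (54000kbit); the function name is made up.

```python
def commit_kbit(interface_rate_kbit, commit_percent):
    """Guaranteed rate for a class, as a percentage of the interface rate."""
    return interface_rate_kbit * commit_percent // 100


if __name__ == "__main__":
    # 10% of the 54000kbit declared for wlan0
    print(commit_kbit(54000, 10), "kbit")
```

So under that assumption, SSH, RDP, DNS, VPN, ICMP, and TCP SYN/ACK traffic together are guaranteed 5400 kbit in each direction.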

Categories:   Performance   

==============

Additional Critical and Delayed Worker Threads in Windows - speed tweak
article #422, updated 473 days ago

At this registry location:

HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Executive

create or modify “AdditionalCriticalWorkerThreads” and also “AdditionalDelayedWorkerThreads”, both DWORDs.

For a 32-bit system, I have come to prefer decimal values of 6 times the number of gigs of RAM. So a 1/2G RAM system gets 3, a 4G system gets 24. For a 64-bit system, I use 3 times the number of gigs of RAM.

The above include statements that the maximum value actually used by Windows for these entries is 16. However, the pages are old enough that they do not discuss 64-bit environments; and the second page, the more recent of the two, states that the registry entries exist by default, which they do not.

A major software vendor (which shall now go unnamed) had a page, now taken down, which seemed to at least leave the possibility open that higher numbers may be viable, 64 being recommended by them for a third item otherwise apparently undocumented:

HKLM\SYSTEM\CurrentControlSet\Services\RpcXdr\Parameters\DefaultNumberofWorkerThreads

But just today, I learned of this page:

https://blogs.technet.microsoft.com/josebda/2010/08/27/performance-tuning-guidelines-for-windows-server-2008-r2/

which references this page at Microsoft:

https://msdn.microsoft.com/en-us/library/windows/hardware/dn529134

which contains references for all current Windows Server versions. These touch all of the above and a lot more! I haven’t read them all yet, but will, and expect to learn lots of interesting things. A quick perusal did reveal that the old hard max of 16 no longer applies; 64 is mentioned, at least in the Server 2008 R2 document.

Below is a VBScript, which has not been updated in a while, that attempts to give an all-around tweak set for the above. I will be working on this soon from that Microsoft page.

' ***************************************
' ******* Optimize Worker Threads *******
' ****************** 2.0 ****************
' ***************************************
' ********* Jonathan E. Brickman ********
' ********* jeb@ponderworthy.com ********
' ***************************************
' ***************************************

' ********** Set up environment *********

Option Explicit

Dim HKEY_LOCAL_MACHINE, strComputer, CPUarch
Dim Return
Dim AddCriticalWorkerThreads, AddDelayedWorkerThreads, DefaultWorkerThreads

HKEY_LOCAL_MACHINE = &H80000002
strComputer = "."

' ********** Find out how much RAM is in machine *******

Dim RAMobj, i, RAMobj2, memTmp1, TotalRAM

Set RAMobj = GetObject("winmgmts:").InstancesOf("Win32_PhysicalMemory")
i = 1
For Each RAMobj2 In RAMobj
	memTmp1 = CDbl(RAMobj2.capacity) / CDbl(1024) / CDbl(1024) / CDbl(1024)
	TotalRAM = TotalRAM + memTmp1
	i = i + 1
Next

Set RAMobj = Nothing
Set RAMobj2 = Nothing

' ******** Get ready for registry operations **********

Dim ObjRegistry, strPath, strValue

Set ObjRegistry = _
    GetObject("winmgmts:{impersonationLevel = impersonate}!\\" _
    & strComputer & "\root\default:StdRegProv")

' ********** Find out whether OS is 32-bit or 64-bit **********

' HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Environment\PROCESSOR_ARCHITECTURE
' contains either 'AMD64' or 'x86' or other, if not AMD64 presume 32-bit

' ObjRegistry.GetStringValue?...
' http://msdn.microsoft.com/en-us/library/windows/desktop/aa390788%28v=vs.85%29.aspx

strPath = "SYSTEM\CurrentControlSet\Control\Session Manager\Environment"

ObjRegistry.GetSTRINGValue HKEY_LOCAL_MACHINE, strPath, "PROCESSOR_ARCHITECTURE", CPUarch

' CPUarch will be AMD64 or IA64 if 64-bit, otherwise 32-bit

' ********** Calculate values to be used **********

' CLng converts the RAM-based figure to a whole number, as a DWORD requires
If CPUarch = "AMD64" Or CPUarch = "IA64" Then
	AddCriticalWorkerThreads = CLng(TotalRAM * 3)
	AddDelayedWorkerThreads = CLng(TotalRAM * 3)
	DefaultWorkerThreads = 64
Else
	AddCriticalWorkerThreads = CLng(TotalRAM * 6)
	AddDelayedWorkerThreads = CLng(TotalRAM * 6)
	DefaultWorkerThreads = 64
End If

' WScript.echo "Total RAM: " & TotalRAM
' WScript.echo "Critical Worker Threads: " & AddCriticalWorkerThreads
' WScript.echo "Delayed Worker Threads: " & AddDelayedWorkerThreads
' WScript.echo "Default Worker Threads: " & DefaultWorkerThreads

' WScript.Quit

' ********** Set Additional Critical and Delayed Worker Threads ***********

strPath = "SYSTEM\CurrentControlSet\Control\Session Manager\Executive"

' Create key in case it doesn't exist yet
Return = objRegistry.CreateKey(HKEY_LOCAL_MACHINE, strPath)

' SetDWORDValue reports failure through its return code (0 = success), not through Err
Return = ObjRegistry.SetDWORDValue(HKEY_LOCAL_MACHINE, strPath, "AdditionalCriticalWorkerThreads", AddCriticalWorkerThreads)
If Return <> 0 Then
	WScript.Echo "Could not set AdditionalCriticalWorkerThreads."
End If

Return = ObjRegistry.SetDWORDValue(HKEY_LOCAL_MACHINE, strPath, "AdditionalDelayedWorkerThreads", AddDelayedWorkerThreads)
If Return <> 0 Then
	WScript.Echo "Could not set AdditionalDelayedWorkerThreads."
End If

' ********** Set Default Number of Worker Threads ***********

strPath = "SYSTEM\CurrentControlSet\Services\RpcXdr\Parameters"

' Create second key in case it doesn't exist yet
Return = objRegistry.CreateKey(HKEY_LOCAL_MACHINE, strPath)

' Check the return code (0 = success) rather than Err
Return = ObjRegistry.SetDWORDValue(HKEY_LOCAL_MACHINE, strPath, "DefaultNumberOfWorkerThreads", DefaultWorkerThreads)
If Return <> 0 Then
	WScript.Echo "Could not set DefaultNumberOfWorkerThreads."
End If

' ********* End! **********

Set ObjRegistry = Nothing

' Wscript.echo "Done!"
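The sizing rule the script applies (6 per gig of RAM on 32-bit, 3 per gig on 64-bit) can be restated compactly for checking values by hand. This is a sketch only; the function name is made up, and rounding to the nearest whole number is an assumption for RAM sizes that do not come out even.

```python
def additional_worker_threads(ram_gb, is_64bit):
    """Value for both AdditionalCriticalWorkerThreads and AdditionalDelayedWorkerThreads."""
    factor = 3 if is_64bit else 6
    # Registry DWORDs are whole numbers, so round the result (assumption).
    return int(round(ram_gb * factor))


if __name__ == "__main__":
    print(additional_worker_threads(0.5, is_64bit=False))  # the 1/2G example
    print(additional_worker_threads(4, is_64bit=False))    # the 4G example
    print(additional_worker_threads(8, is_64bit=True))
```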

Categories:   Performance   

==============

Compiling a Debian Kernel
article #900, updated 528 days ago

Here is a good synopsis:

http://www.debian.org/releases/stable/i386/ch08s06.html.en

Categories:   Performance   

==============

Caching web proxies for speed on the WWW
article #894, updated 536 days ago

A great way to increase speed! For Windows, CC Proxy is highly recommended:

http://www.youngzsoft.net/ccproxy/proxy-server-download.htm

and for Linux, Polipo, which is in many distros:

http://www.pps.univ-paris-diderot.fr/~jch/software/polipo/

Categories:   Performance   

==============

Rebuild glibc optimized for your CPU in Debian Testing as a Local Package Version
article #755, updated 574 days ago

I just rebuilt my glibc, optimizing the build for my particular CPU. I was amazed at how much more speed it brought me on this >5-year-old laptop, 2G RAM, dual-core 2 GHz Intel. Here’s what I did. If you’re following this, bear in mind that version numbers will have to be changed as development goes on!

  1. Get everything you need to build glibc. You may very well discover more packages to install if errors show up further down, depending on how you installed Debian to begin with.
apt-get build-dep glibc
  2. Create a folder for your build, and get the current source.
cd ~/Downloads; mkdir glibc-recompile; cd glibc-recompile; apt-get source glibc
  3. Edit a few files to set the optimization.

First change directory here: cd ~/Downloads/glibc-recompile/glibc-2.19/debian

Now edit the file named rules, and look for these two lines:

BUILD_CFLAGS = -O2 -g
HOST_CFLAGS = -pipe -O2 -g $(call xx,extra_cflags)

Change them as follows:

BUILD_CFLAGS = -O2 -march=native -mtune=native
HOST_CFLAGS = -pipe -O2 $(call xx,extra_cflags) -march=native -mtune=native

Now change to here: cd ~/Downloads/glibc-recompile/glibc-2.19/debian/sysdeps

You’ll now want to edit the file x32.mk, find this line:

i386_extra_cflags = -march=pentium4 -mtune=generic

and change it to:

i386_extra_cflags = -march=native -mtune=native

Then, if your CPU is Intel/AMD-compatible and your OS is 32-bit, you’ll want to edit i386.mk, find this:

i686_extra_cflags = -march=i686 -mtune=generic

and change it to this:

i686_extra_cflags = -march=native -mtune=native

and also find this:

xen_extra_cflags = -march=i686 -mtune=generic -mno-tls-direct-seg-refs

and change it to:

xen_extra_cflags = -march=native -mtune=native -mno-tls-direct-seg-refs

and if you’re Intel-compatible but your OS is 64-bit, edit amd64.mk, find this:

i386_extra_cflags = -march=pentium4 -mtune=generic

and change it to this:

i386_extra_cflags = -march=native -mtune=native

If you are running outside of the Intel/AMD world, you’ll want to find the correct file at this point for your CPU and make the same sort of setting; the idea is that “native” refers to whatever CPU the compiler finds itself running on.

  4. Use Debian packaging tools to set a local package version. The last command in the string below will load the appropriate file in an editor:
cd ~/Downloads/glibc-recompile/glibc-2.19/debian ; dch

At this writing the original is “2.19-13”, and dch has already added this to the top:

glibc (2.19-13.1) UNRELEASED; urgency=medium

and I changed that top line to this:

glibc (2.19-13+local-native.1) UNRELEASED; urgency=medium

and then we save and close. dch then takes care of telling the other files that this local version is legit, and renames the package directory to match, which prepares us for the next step.

  5. Create a .tar.gz of the new source tree.
cd ~/Downloads/glibc-recompile ; tar czvf glibc_2.19-13+local.orig.tar.gz glibc-2.19-13+local
  6. Begin the build.
cd ~/Downloads/glibc-recompile/glibc-2.19-13+local ; debuild -us -uc

At this point you may discover additional packages which are needed. Install them, and begin #6 again. Otherwise it will generate several .deb files one level up, in ~/Downloads/glibc-recompile.

  7. Install the .deb files.

First traverse here:

cd ~/Downloads/glibc-recompile/

Then try to install them all:

sudo dpkg -i *.deb

You may get errors related to the presence of libc6 or libc6_2.19-13+local-native.1_i386.deb. If you do, install this one individually:

sudo dpkg -i libc6_2.19-13+local-native.1_i386.deb

and then do them all again:

sudo dpkg -i *.deb

Then reboot, and see!
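The flag edits in the sysdeps files all follow one pattern: whatever -march and -mtune were set to becomes native. A rough sketch of that substitution is below; the function name `to_native` is made up, and it deliberately covers only the `*_extra_cflags` lines, not the BUILD_CFLAGS and HOST_CFLAGS lines in rules, where flags are added and -g is removed by hand.

```python
import re

_MARCH = re.compile(r"-march=\S+")
_MTUNE = re.compile(r"-mtune=\S+")


def to_native(line):
    """Rewrite any -march=... and -mtune=... on the line to use 'native'."""
    line = _MARCH.sub("-march=native", line)
    return _MTUNE.sub("-mtune=native", line)


if __name__ == "__main__":
    for original in (
        "i386_extra_cflags = -march=pentium4 -mtune=generic",
        "xen_extra_cflags = -march=i686 -mtune=generic -mno-tls-direct-seg-refs",
    ):
        print(to_native(original))
```

Other flags on the line, such as -mno-tls-direct-seg-refs, are left untouched.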

Categories:   Performance   

==============

Recreate SBS monitoring database by PowerShell
article #866, updated 586 days ago

Really good article here:

http://www.itquibbles.com/sql-sbsmonitoring-high-disk-usage/

Solves the problem of the database reaching max capacity, and also speeds things up in general.

Short version:

In SBS 2008, run the contents of this zip file in an administrative PowerShell window.

In SBS 2011, run this as administrator:

C:\Program Files\Windows Small Business Server\Bin\MoveDataPowerShellHost.exe

and then run the contents of this zip file within it.

If it says “1 row affected”, it’s done, and the messages will point out old MDF and LDF files to remove.

You may notice that the script linked here is just a tad different than the one on the itquibbles page; this one just adds the -force items mentioned as an option on that page.

Categories:   Windows OS-Level Issues   Performance