This document contains only my personal opinions and calls of judgement, and where any comment is made as to the quality of anybody's work, the comment is an opinion, in my judgement.
In the previous entry about IPsec I mentioned my astonishment that GNU/Linux based IPsec implementations seem to be designed to configure conveniently only bilateral traffic flows. One of the ironies is that these are often called Virtual Private Networks when they are actually Virtual Private Links.
I also reported my satisfaction that it is possible, in a somewhat awkward way, to configure multilateral traffic among several systems with the same configuration files on all of them, that is to actually configure virtual private networks.
Part of my satisfaction is also due to the fact that proper IPsec multilateral virtual private networks can replace and improve on one of the alleged advantages of VLANs, which is traffic segregation for confidentiality.
With VLANs traffic can be segregated by tagging each Ethernet frame with a unique VLAN id, and then tagging each switch port with the ids of the VLANs the attached node is allowed to receive.
By filtering traffic on a network port according to VLAN id it is indeed possible to provide a degree of confidentiality with traffic separation, but it is rather weak, as it relies absolutely on the network infrastructure being flawlessly set up and access controlled, and it requires the configuration of VLANs with all their dangers and complications.
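For reference, the host side of such a scheme on GNU/Linux is just frame tagging; a minimal sketch (the interface name, VLAN id and address are hypothetical):

# Create a tagged sub-interface so frames on it carry VLAN id 42
ip link add link eth0 name eth0.42 type vlan id 42
ip addr add 192.168.42.10/24 dev eth0.42
ip link set dev eth0.42 up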
An IPsec session id
also defines a virtual
private network, where only those nodes that have negotiated
the session can exchange traffic, and with some huge
advantages:
The difficulty with IPsec used to be that encryption consumed a lot of CPU time, so something much weaker and cruder like VLANs was accepted as a cheaper traffic segregation method, just as in part VLANs spanning multiple switches were used as a replacement for IP routing when, many years ago, routing was much slower than switching.
But as already noted AES hardware acceleration finally makes AES very cheap, and many typical systems have several CPUs, so I think that there is no reason to use VLANs for segregation instead of IPsec Virtual Private Networks, just as there is no reason to restrict IPsec to Virtual Private Links between two gateway hosts.
Most of the original research that resulted in today's Internet protocols ignored issues of non-trivial security such as authentication, authorization and confidentiality. This was in part because the early usage was for relatively trivial purposes, in part because it happened on dedicated systems and networks that were presumed safe, and in large part because the goal was to pick the low-hanging fruit and produce proof-of-concept implementations, as designing and implementing security techniques is difficult and time-consuming, and getting the Internet done at all was a higher priority.
The result is that most major Internet technologies have had security oriented features retrofitted quite late and in somewhat extemporaneous ways, including the IP protocol itself. Because of that somewhat peculiar technologies have been developed, like the SSL and TLS plus the SSH protocols, which secure individual connections, replicating mechanisms in several ways that were usually badly implemented.
There was a bit of hesitation in adding some encryption based
security
to IP too, but eventually the
mostly obvious way was implemented with these protocols:
ESP: each packet carries an encrypted payload prefixed with a session tag (more precisely the security association identifier) which defines which cryptographic key and algorithm to use to decrypt the payload.
IKEv2: a protocol allowing nodes to communicate for allocating session tag (security association) identifiers and defining the related cryptographic algorithms and keys. This is a specialized variant of the ISAKMP protocol family.
Note: there are some variants like AH in addition to or extending ESP; and IKEv1 which is a more complicated version of IKEv2. These are still in use but far less preferable.
ESP is very simple indeed in its basics: when an ESP packet is sent, the destination address is looked up in a table that associates the destination address with a session tag, and the payload is encrypted with the associated cryptographic algorithm and encryption key and prefixed with the session tag; when it is received, the session tag is looked up in a table, and the data payload is then decrypted with the cryptographic algorithm and decryption key associated with that session tag.
Note: since as a rule symmetric encryption algorithms are used the decryption and encryption keys are usually the same.
Note: there could be a single table per host, but the standard requires a table per host per protocol for unicast protocols, and per destination address for multicast protocols. What this really means is that session ids can be reused in those cases. But really, as long as session ids are unique there can be a table indexed by any combination of header fields. In practice what matters is that session ids be unique by destination address.
As to sending ESP packets, there must be some logic that indicates two vital details:
Finally the inter-node session tag definition protocol must have a way to define patterns with which to associate session identifiers and the related session keys.
The common implementations under GNU/Linux and similar systems have these details:
The session table is kept in the kernel, and is indexed by source and destination address and optionally by protocol and its source and destination ports.
Session table entries can be listed and manipulated from user level with the ip xfrm state object.
Pattern table entries, which select the packets to be encrypted, can be listed and manipulated with the ip xfrm policy object.
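To give an idea of how low-level these objects are, here is a minimal sketch of creating by hand one transport-mode ESP session and a matching pattern with ip xfrm; the addresses, SPI and keys are made up, and normally an IKE daemon generates these entries automatically:

# Session (security association): destination, session tag (SPI), algorithms and keys
ip xfrm state add src 192.168.6.10 dst 192.168.7.12 proto esp spi 0x4001 \
    mode transport enc 'cbc(aes)' 0x00112233445566778899aabbccddeeff \
    auth 'hmac(sha1)' 0x00112233445566778899aabbccddeeff00112233
# Pattern (policy): which outgoing packets should be encrypted with such a session
ip xfrm policy add src 192.168.6.10 dst 192.168.7.12 dir out \
    tmpl src 192.168.6.10 dst 192.168.7.12 proto esp mode transport
# The reverse direction needs its own state entry, and an "in" policy on the other node.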
Therefore the obvious place to configure IPsec is the user level daemon, by listing all nodes or sets of nodes that may exchange IPsec packets among each other, with attributes such as the long-term key they use, the encryption algorithms that they support, and which packets (protocol, ports, ...) they can decrypt.
Because after all what really matters is that destination nodes should receive packets with encrypted data tagged with a session id that they expect, so IPsec configuration should be about how destination nodes can decrypt.
Given this table the user level daemon can generate, in the kernel table of each potential source node, pattern entries for the packets that should be encrypted, and when these get matched, generate more detailed entries in the table of sessions to encrypt, and then remove the latter after some time.
Now finally, after this description of how things work at low level and how they ought to work at high level, the punch-line: most of the popular high-level daemon implementations are not configured as described above, but rather strangely they are configured in the same way as the low-level ip xfrm objects.
So for example Libreswan and strongSwan manage the kernel session table and pattern table based on the opposite logic: they allow defining specific session table entries, and then allow them with some awkwardness to be generalized into pattern table entries, and then with further awkwardness into destination node descriptions.
So for both packages there are two or three basic configurations, described below with SSH RSA key based authentication, in strongSwan syntax for the ipsec.conf file. The following examples are complemented by an ipsec.secrets file that is the same for all nodes because of the uniform convention for the SSH RSA private key location:
%any %any6 : RSA /etc/ssh/ssh_host_rsa_key
This describes with a minimum of abstraction the underlying ip xfrm state configuration:
conn pair
    auto=route
    type=transport
    left=192.168.6.10
    leftsigkey=ssh:ssh_host_192.168.6.10_rsa_key.pub
    right=192.168.7.12
    rightsigkey=ssh:ssh_host_192.168.7.12_rsa_key.pub
It looks very simple, and will work without modification on either node, because the daemon will find the IP addresses for the node it is running on, and will use whichever of the left or right lines it matches.
Which is amazing considering that instead of just two, left and right, it could equally easily have had N of them, called something like node, to describe a multitude of source and destination addresses.
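With the classic stroke-based strongSwan front end a configuration like this can be checked on either node roughly as follows (a sketch, not specific to this example):

ipsec reload          # re-read ipsec.conf and install the trap/route policies
ipsec statusall pair  # show the state of the "pair" connection
ip xfrm policy        # the kernel pattern table generated by the daemon
ip xfrm state         # the kernel session table, populated on the first matching packet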
conn gateways
    auto=route
    type=transport
    left=192.168.1.22
    leftsubnet=192.168.6.0/24
    leftsigkey=ssh:ssh_host_192.168.1.22_rsa_key.pub
    right=192.168.1.41
    rightsubnet=192.168.7.0/24
    rightsigkey=ssh:ssh_host_192.168.1.41_rsa_key.pub
This describes an encrypted link between two nodes that act as encryption gateways between two subnets, to each of which one is dual homed. Again the logic is purely bilateral, and the description translates closely to ip xfrm state configuration plus ip xfrm policy configuration.
Note that traffic among subnet nodes or between each and
the subnet IPsec gateway is not secured
,
only the traffic that goes from one subnet to the other
via the respective IPsec gateways.
This is still a bilateral session between a client node which is its own IPsec gateway as in the first example, and a server node that is an IPsec gateway to a local subnet, typically used for VPN style access by a remote node to campus nodes behind the IPsec gateway.
For this we switch authentication to EAP (as used for example by 802.1x) with MSCHAPv2, which is often used for remote authorization.
We also switch to type=tunnel as often the subnet to which an IPsec gateway like this gives access uses private addresses or some form of NAT.
conn vpn-server
    auto=route
    type=tunnel
    eap=mschapv2
    left=192.168.1.22
    leftsubnet=192.168.6.0/24
    leftauth=eap
    right=%any
    rightauth=eap
Here left is designed to match the server, and %any is a special notation to indicate that the other node in the connection can have any address.
conn vpn-client
    auto=add
    type=tunnel
    eap=mschapv2
    left=%defaultroute
    leftauth=eap
    right=192.168.1.22
    rightsubnet=192.168.6.0/24
    rightauth=eap
Here the special notation %defaultroute stands for the main address of the interface through which packets following the default route go, and auto=add means to activate the connection only on request.
It is possible to use bilateral configuration syntax to describe multilateral situations in some more or less awkward ways, for example:
This is to have end-to-end IPsec sessions between any pair of nodes belonging to different subnets, with some limitations.
conn subnets
    auto=route
    type=transport
    left=%192.168.1.0/24
    leftsigkey=ssh:ssh_host_192.168.1._rsa_key.pub
    right=%192.168.7.0/24
    rightsigkey=ssh:ssh_host_192.168.7._rsa_key.pub
Here %192.168.1.0/24 matches any address within the given subnet.
Extending this to a number of subnets means creating configurations for every possible pair of subnets, like the configurations above.
One limitation is that sessions between pairs of nodes within the same subnet are not covered. To cover them two additional configurations with left and right within the same subnet need to be added.
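Following the same %-prefixed subnet matching and the hypothetical per-subnet key naming of the example above, those two additional configurations might look roughly like this sketch:

conn subnet1
    auto=route
    type=transport
    left=%192.168.1.0/24
    leftsigkey=ssh:ssh_host_192.168.1._rsa_key.pub
    right=%192.168.1.0/24
    rightsigkey=ssh:ssh_host_192.168.1._rsa_key.pub

conn subnet7
    auto=route
    type=transport
    left=%192.168.7.0/24
    leftsigkey=ssh:ssh_host_192.168.7._rsa_key.pub
    right=%192.168.7.0/24
    rightsigkey=ssh:ssh_host_192.168.7._rsa_key.pub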
Another limitation is that all nodes on either subnet share the same authentication and encryption secret, which may or may not be desirable.
By using some of the pattern matching techniques shown previously in a somewhat logical way, which is however awkward as it completely subverts the bilateral logic of the configuration to make it effectively multilateral.
This awkward style of configuration can be applied to most of the situations described previously, but here for brevity and generality a configuration suitable for multilateral end-to-end IPsec among many individual nodes is described, which as remarked previously was the basic intent of IPsec.
The following configuration is presented in a few sections and with more details which make it more flexible and suitable for realistic deployment.
conn nodes
    auto=ignore
    type=transport
    keyexchange=ikev2
    # https://wiki.strongswan.org/projects/strongswan/wiki/IKEv2CipherSuites#Diffie-Hellman-Groups
    # https://blog.cryptographyengineering.com/2012/05/19/how-to-choose-authenticated-encryption/
    ike=aes128gcm16-aesxcbc-modp2048,aes128-sha1-modp2048
    esp=aes128gcm16-aesxcbc-modp2048,aes128-sha1-modp2048
    leftauth=pubkey
    rightauth=pubkey
    # Only encrypt TCP and UDP
    leftsubnet=%dynamic[tcp],%dynamic[udp]
    rightsubnet=%dynamic[tcp],%dynamic[udp]
    left=%defaultroute
    leftsigkey=ssh:/etc/ssh/ssh_host_rsa_key.pub
The first section is meant to define common attributes inheritable by other sections, and thus auto=ignore means to read this conn configuration without activating it.
The ike and esp values indicate encryption and authentication based on a particularly efficient combination of the popular AES symmetric key system with the DH system. It is particularly efficient because if the CPU model has the common encryption acceleration instructions these can be applied to maximum effect, enormously reducing the cost of encryption and authentication.
Then there is a request to apply IPsec only to TCP and UDP traffic between the nodes, as protocols like ICMP and ARP presumably don't need to be encrypted in most situations, and having them not depend upon encryption can make problem solving a lot easier.
Finally, the first major bit of awkwardness: left=%defaultroute means effectively that the configuration applies to whichever node it is running on, and only the destination node matters. That is, the presence of the configuration file indicates whether the configuration is applicable.
conn node-192.168.6.31
    also=nodes
    auto=route
    right=192.168.6.31
    rightsigkey=ssh:ssh_host_192.168.6.31_rsa.key.pub

conn node-192.168.7
    also=nodes
    auto=route
    right=%192.168.7.0/24
    rightsigkey=ssh:ssh_host_192.168.7_rsa.key.pub

conn node-192.168.8.104
    also=nodes
    auto=route
    right=192.168.8.104/24
    rightsigkey=ssh:ssh_host_192.168.8.104_rsa.key.pub
This section contains a number of configurations, one for every node set up to communicate via IPsec with other nodes. Each configuration in effect represents both a source and a destination node (depending on whether it is read on the receiving or the sending node), and they all inherit common IPsec attributes from the configuration in the first section.
But actually this is done by the contortion of implicitly defining a pair connection between any local source address and that specific destination node address, so defining bilateral configurations, but where each is from any source to a specific destination.
The destinations need to be specified as the addresses of the other nodes to or from which IPsec packets can be exchanged, plus for each of them their (public) SSH RSA encryption key, because each can have a separate one.
If this is not required more generic configurations are possible, in this example all nodes in subnet 192.168.7.0/24 are assumed to have the same encryption key, which means that a single configuration can describe all of them.
As an extension it would be possible to define a single configuration defining many bilateral sessions encrypted with different secrets, if the association between them could be specified as a pattern or a local or remote database lookup, in a form similar to one of:
rightsigkey =ssh:ssh_${right}_rsa.key.pub
rightsigkey =sshsearch:/etc/ssh/ssh_known_hosts
rightsigkey =download:192.168.7.1
In the last case the setting is meant to indicate the dynamic download of the relevant encryption key from a server with address 192.168.7.1, which has the additional advantage of not requiring the encryption keys of all nodes to be copied onto each of them, but only that of the server itself.
Note: in doing the download each node can authenticate itself to the encryption key server using its decryption key specified with leftsigkey.
But looking at that way of working, it is in effect very similar to how Kerberos works, and indeed it would be very useful to use as leftauth and rightauth something like Kerberos or more generally GSSAPI, which is not currently available but used to be implemented in earlier versions of some ISAKMP daemons (1, 2).
Overall it would have been easier if the configurations had been expressed naturally in terms of end-to-end multilateral combinations, and the high level authentication and encryption protocols had been written to work with popularly available secret distribution services like Kerberos too.
But the awkward configuration styles described above are fairly usable already for node counts in the tens, and probably one or two hundred with a suitable configuration system.
A previous post presents the suggestion to slice a large set of storage devices into subsets to avoid creating nefarious large single pools of storage capacity.
Of course partitions
have been a long
standing method to slice individual smaller storage devices
into smaller virtual storage devices for similar purposes, but
in general I have been skeptical of the value of doing so,
because it is broadly preferable to manage stored data
tree-wise via subtrees of filesystems than via virtual storage
devices. In UNIX and Linux terminology, it is usually better
to organize multiple sets of data as distinct directory trees
in the same block device than by each being a distinct
directory tree in its own virtual block device.
Note: I have also previously mentioned and criticized slicing a large single RAID set into sub-volumes, but that's very different from having multiple RAID volumes.
Of course the two guidelines above are somewhat at odds, and
the dividing line depends on technology and tradeoffs: the
first guideline is about avoiding single block devices that
are too large
and the second is about
avoiding those that are too small
.
Where the boundary lies depends on the capacity of single physical storage devices, their (bulk sequential) transfer rates, and crucially on a metric that is often disregarded, that is (random) IOPS per terabyte.
For example 145-300GB 15k RPM or 500GB-1TB disks have been popular for a few years for mass storage, and they tend to be capable of (bulk sequential) transfer rates around 50-150MB/s and (random) IOPS per terabyte of around 100-600.
At the same time filesystem technology seems to make it awkward to maintain filetrees containing more than a few TB of data over more than a few million or tens of millions of files, largely because of the lack of parallelism in most tools that scan whole filetrees, like RAID resync, fsck and backup.
This has resulted in my usually suggesting block device sizes of 2TB to 4TB, exceptionally as large as 8TB, realized over RAID10 sets consisting of many small and fast disks, or of a handful of larger and slower ones, exceptionally the latter as RAID5 or even RAID6. For example for a 2TB block device have:
The problem is that individual storage devices are becoming as large as 4TB or 6TB, with a very low IOPS-per-TB ratio, and the traditional way to raise absolute IOPS is to have many devices in a low correlation RAID set like RAID10, but that ends up creating block devices that are too large for current filesystem and RAID techniques (and that still have poor IOPS per TB, but that is intrinsic).
For example consider a 6× set of 4TB drives: arranged as a RAID10 the resulting block device has a capacity of 12TB, and 16TB as a RAID6, which I think are a bit too large to be comfortable, especially the latter.
One way to reduce the bad consequences a bit is to slice the large storage capacity, but creating RAID sets smaller than 4-6 drives might reduce the achievable absolute IOPS too much.
A somewhat uncommon but obvious technique is to create RAID sets of partitions on those 6× disks: for example to slice each 4TB disk into 4 partitions of 1TB each, and then create across all drives 4 RAID10 (or RAID6) sets, resulting in four virtual block devices each with a capacity of 3TB (or 4TB for RAID6).
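A minimal sketch of this with GPT partitions and Linux MD, assuming 6 hypothetical 4TB drives sdb to sdg:

# Slice each drive into 4 equal partitions (repeat for sdb..sdg)
parted --script /dev/sdb mklabel gpt \
    mkpart p1 0% 25% mkpart p2 25% 50% mkpart p3 50% 75% mkpart p4 75% 100%
# One RAID10 set per partition "slot" across all 6 drives: four ~3TB block devices
mdadm --create /dev/md1 --level=10 --raid-devices=6 /dev/sd[b-g]1
mdadm --create /dev/md2 --level=10 --raid-devices=6 /dev/sd[b-g]2
mdadm --create /dev/md3 --level=10 --raid-devices=6 /dev/sd[b-g]3
mdadm --create /dev/md4 --level=10 --raid-devices=6 /dev/sd[b-g]4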
Of course this does not improve directly the fundamental
issue that multi-TB disk drives still only have one
arm
, but has some advantages:
If a disk fails only locally, for example because of a bad sector, only one of the RAID sets will lose a member, the one containing the partition of which that sector is part; localized disk failures of this sort are common. When the disk gets replaced all the RAID sets of which its partitions are members will become incomplete, but delaying the impact can be an advantage, and alternative fixes are possible in many cases, such as when there are still spare sectors on the disk (and the sector was being read); after the fix only the relevant partition needs to be resynchronized.
The above is largely about making do with second best, as slicing across large disks is likely to give fewer advantages than slicing into distinct disk subsets. Of course the numbers involved matter: the original discussion about slicing happened about 360× 3.5in 500GB disks, and for that slicing them into wholly independent subsets, for example RAID10 sets of 8 (or even 16) members each looked quite desirable. This post was prompted by looking at recent 4TB and 6TB disks, and considering fileservers for archives of largish documents and media files.
I have seen a lot of rack and multi-rack layouts and most often I have been disappointed to see layouts inspired by the usual syntactic attitude that all valid combinations are equally useful and plausible.
But most are not, and for example the popular layouts organized by type of equipment, for example racks for switches, racks for servers, racks for storage sets, are particularly nefarious.
As usual what matters to me is maintainability, and in particular designs that minimize the impact of issues rather than syntactically pleasing ones.
The best guideline for that is to minimize cable length and cable crossing: long cables are difficult to follow, long cables get easily tangled, and tangled cables are both difficult to follow and accident prone. As usual all is fun and giggles until a change is needed or there is an issue, and when there is an issue its impact is minimized and maintainability maximized by the ability to pull out and replace stuff precisely, without risking pulling the wrong stuff or more stuff than intended.
Note: I have seen situations where important maintenance was not performed because nobody involved wanted to take the risk of stating which stuff was where in a messy rack situation.
Minimizing cable lengths also minimizes the impact of changes or issues, because in practice it means putting together equipment that is strongly related by connectivity rather than weakly by type.
The further guideline therefore is to put upstream stuff (boxes to which several other boxes are connected) in the middle of their downstream stuff.
So for example a good rack layout puts together related front-ends, back-ends and network equipment in the same rack, for example as:
When two switches per stack are too many the switch two
thirds down can be omitted. Also when I say third
I really mean section
, as they don't need to be of the
same size. Also in the case of many front-ends I would put
some in the middle third so they can be more conveniently
connected to the second switch, or to the middle switch from
below.
Conversely I would put front-ends, switches-routers, management boxes and back-ends for different applications in different racks, and the same for redundant sets of clusters, the latter ideally in different locations of course.
Note: to my horror I have seen cases where, in an application with two redundant sets of front-end, switch and back-end boxes, the two front-end ones were both in the one rack for all front-ends, both switches were in the rack for switches, and the two back-end ones were both in the next rack.
The advantages of a layout like the above are many:
As to the layout of a row of racks, the same guidelines apply: for example to put the rack with the main network switches or routers and their patch panels in the middle of the relevant row of racks, to minimize the thickness of cable bundles and their length; and if most of the racks at the sides have their own switches or routers, as they ought to (in the middle of the rack of course), then the cable bundles are not going to be especially terrible either.
Note: there is an interesting special case when a row of racks is relatively short, which is to configure the switches or routers in each rack not in a hierarchical relationship with the one in the rack in the middle, but stacked with it into a single "logical" switch or router. Stacking cables can be as long as 5m for many enterprise switches or routers.
I think that there are few cases where different layouts are useful, and I remember only one: racks that contain only network switches or routers and patch panels. In that case while patch panels and related network equipment should be interleaved so that they are near to each other, the network cabling should be done in the front simply because that's where network equipment has cable sockets, while computing equipment has them in the rear.
Another good thing to have in racks is supporting uprights (the vertical rails onto which equipment is bolted) not just at the front and rear but also in the middle, to avoid the temptation to leave shorter equipment hanging from the front or rear uprights only.
Finally there is a design for cable management that occurred to me (and others) as advantageous even if uncommon. Traditional cable management is mostly absent in computing equipment and mixed racks, and extends network racks with trays or other conduits on the sides, making them significantly wider.
I much prefer to have cable trays with cable guides along the depth of the rack, typically at the rear of computing or mixed racks, and at the front of network racks. This at worst extends the depth of a rack, not its width. The advantages are:
I have been reading with amazement an interesting blog post comparing the Samsung mobile phones Galaxy Note 3 and Galaxy Note 4, where one has a 1920×1080 display and the other a 2560×1440 display, both of them in the excellent AMOLED technology.
That is pretty amazing, as 2560×1440 OLED monitors for desktops or even laptops don't exist yet. Also because that's a 5.7in display, which for a mobile phone is huge. Obviously either it is difficult to build larger AMOLED panels, or LCD manufacturers want to fully depreciate their enormously expensive large LCD panel factories before building large OLED panel factories. Besides they probably regard the external display market for desktops and even the internal display market for laptops as a distraction compared to the enormous market for mobile phones.
I found again today the box where I had stored the L1710B LCD monitor that I bought in 2003 and that I have not used for some years. I took it out and tried it again and it worked well. Compared side by side with my recent U2412M it is rather smaller but still quite usable, and it has a nice 1280×1024 pixel size, with a better aspect ratio and more vertical space than more recent monitors.
The LCD display is of course still very sharp, legible and usable, and despite not being IPS or VA it still has pretty wide viewing angles: horizontally I can't see a contrast or color shift, and vertically I can't see a color shift even if there is a contrast shift, but a fairly mild one. I think that the viewing angles are much better than many contemporary TN displays with claimed viewing angles of 170°/160°, even if the specification reports viewing angles of 160°/140°, which seems to be a bit pessimistic; or perhaps there has been some specification inflation.
Display visual quality is reasonable, even if the 18-bit colors and much lower contrast ratio are noticeable. The greatest signs of the passage of time are borne by the backlight (four CCFL tubes), as it has a noticeable yellow tint, it is dim, and there is a fair degree of backlight bleed.
The physical size of 340mm×270mm is usable, even if it is not possible to have two regular size windows side by side as with the U2412M's physical size of 518mm×324mm. The smallest monitors currently for sale have a 474mm×296mm (pixel size 1680×1050) display size, which is significantly larger.
Over ten years of display development have brought many improvements, and my LG 21.5in 1920×1080 IPS225V is clearly better, even if the 10 year old display is still quite usable, even with a yellowed backlight; but this L1710B was in turn rather better than my previous 15in Hansol H530 and the even older Samsung 570S, which were a bit too small and with too narrow viewing angles compared to it.
I think that the biggest improvements in a decade have been in order of decreasing importance:
KDE SC 4 has a number of ambitious abstraction libraries and one of these is Solid, which attempts to provide an idealized view of hardware capabilities, including network devices.
Among other things it can provide KDE applications with a notification as to whether network connectivity is available. Unfortunately it sometimes gets it wrong, in particular when using, as I do, PPP based links.
There is no obvious manual way to set the network connectivity status, but using its API exposed via D-Bus it is possible to do so, where $V is 1 for connectivity not available and 4 for connectivity available:
qdbus org.kde.kded /modules/networkstatus setNetworkStatus ntrack $V
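A tiny wrapper makes this less error prone; a sketch (the script name is mine, the D-Bus service, path and method are as in the command above):

#!/bin/sh
# kde-netstatus: force KDE Solid's idea of network connectivity
# usage: kde-netstatus online|offline
case "$1" in
    online)  V=4 ;;  # 4: connectivity available
    offline) V=1 ;;  # 1: connectivity not available
    *) echo "usage: $0 online|offline" >&2; exit 1 ;;
esac
exec qdbus org.kde.kded /modules/networkstatus setNetworkStatus ntrack "$V"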
There have been more examples of storage madness but a recent one made me laugh out loud:
I am using xfs on a raid 5 (~100TB) and put log on external ssd device, the mount information is: /dev/sdc on /data/fhgfs/fhgfs_storage type xfs (rw,relatime,attr2,delaylog,logdev=/dev/sdb1,sunit=512,swidth=15872,noquota). when doing only reading / only writing , the speed is very fast(~1.5G), but when do both the speed is very slow (100M), and high r_await(160) and w_await(200000).
Apart from the bad idea of having a single 100TB filetree, which is well beyond what I think is comfortable to maintain, it is amazing that it seems to be a single RAID5 with a 100TB capacity.
This seems confirmed by the large value of swidth=15872, which is 31 times the sunit=512: if this is indicative of the physical geometry the RAID5 set is made of 32 drives, probably high capacity and low IOPS-per-GB 3TB ones.
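For the record, the arithmetic, and a way to check the geometry directly on such a filesystem (the mount point is the one from the quote; note that xfs_info reports sunit and swidth in filesystem blocks rather than in the 512-byte units used in the mount options):

# Mount option units are 512-byte sectors:
#   sunit=512    -> 512 × 512B   = 256KiB chunk per member
#   swidth=15872 -> 15872 × 512B = 7.75MiB data stripe = 31 × 256KiB
xfs_info /data/fhgfs/fhgfs_storage | grep -E 'sunit|swidth'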
This is an admirable level of storage madness, both as to the very thin degree of redundancy, and as to the consequences of RMW with writes smaller than or not aligned to 7.75MiB (31 × 256KiB), which is the data stripe width.
The resulting speeds are actually pretty good, for example for sequential access 1.5GB/s over 31 drives means about 50MB/s per drive, which is not bad (even if the drives can do 100MB/s and higher on the outer tracks).
For what is essentially random access, due both to the concurrent reads and writes from the applications layer, and to the consequences of RMW in the RAID5, those 100MB/s are quite good, as per drive that is about 3MB/s, which is well above the transfer rate of around 0.5MB/s for a disk drive doing purely random 4KiB operations.
There is another detail that may impact the concurrent read-write rates: that the mounted device name is /dev/sdc suggests that the RAID5 set is attached to a hardware-RAID host adapter. Many brands and models of hardware-RAID host adapters have buggy or misdesigned firmware; a typical case is scheduling reads ahead of writes or vice versa.
This may be happening here, as the average read completion time r_await is reported to be merely high at 160ms but the average write completion time w_await is much bigger at 200,000ms, or 200 seconds.
There is an extra layer of madness in the situation: from the name of the mount-point /data/fhgfs/fhgfs_storage it can be guessed that this 100TB is supposed to be an object-storage pool for the BeeGFS (formerly FhGFS) parallel distributed meta-filesystem. If this is true it has two entertaining implications:
The latter point begs a question, which is why the 32 drive set was configured as a single 32 wide RAID5 set when it was obviously possible to do something else.
In part this is likely to be not knowing much about storage systems, as revealed by the surprise about the consequences of a very wide stripe in parity RAID.
But my guess, based on the attitudes of so many clever
people, is that the designer
of this
storage system wanted to achieve the lowest possible up-front
cost, boasting to their boss that:
Only the capacity of one drive out of the 32 is wasted on parity, while still keeping the safety of having a parity member (probably based on the mad assumption that the probability of failure is independent of set size).
All the other 31 drives contribute capacity and parallel IO instead of being useless parity members.
A potential alternative would have been six 5 drive RAID5 sets, plus 2 hot spares, or one or two RAID10 sets, plus perhaps the same 2 hot spares. But with the former the capacity of only 24 of the 32 drives is used for data, and only 4 data drives at most can be used for parallel IO per RAID set; and with the latter only the capacity of 16 of the 32 drives can be used for data.
All of the above is indeed correct in itself, except that:
However as usual if what really matters is up-front cost, then insane designs that minimize it at the expense of longer term speed and risk are attractive. In this case however even initial speed is impacted, because the design seems to me excessively targeted at the lowest possible up-front cost per unit of capacity.
I am still often surprised by the absurdity of certain situations and expectations, for example the case where umount takes a long time to complete even if there are only 3GB of uncommitted updates (also called dirty data) to be committed, also over the network between two DRBD instances:
Now, for a moment, assume
- you don’t have DRBD in the stack, and
- a moderately capable IO backend that writes, say, 300 MByte/s, and
- around 3 GiB of dirty data around at the time you trigger the umount, and
- you are not seek-bound, so your backend can actually reach that 300 MB/s,
you get a umount time of around 10 seconds.
The first reason is that it ought to be well known that for good reasons umount is a barrier operation with respect to uncommitted updates, and that it can take quite a bit of time to write 3GB of updates to probably randomish locations on two disks, one of which requires network traffic, as explained next:
Still with me?
Ok. Now, introduce DRBD to your IO stack, and add a long distance replication link. Just for the sake of me trying to explain it here, assume that because it is long distance and you have a limited budget, you can only afford 100 MBit/s. And “long distance” implies larger round trip times, so lets assume we have a RTT of 100 ms.
Of course that would introduce a single IO request latency of > 100 ms for anything but DRBD protocol A, so you opt for protocol A. (In other words, using protocol A “masks” the RTT of the replication link from the application-visible latency.)
That was latency.
But, the limited bandwidth of that replication link also limits your average sustained write throughput, in the given example to about 11MiByte/s.
The same 3 GByte of dirty data would now drain much slower, in fact that same umount would now take not 10 seconds, but 5 minutes.
But the bigger reason is how common is the idea that having a lot of uncommitted writes in memory is good, or at least unremarkable. It is instead a very bad situation that should be avoided, because usually the benefits are not that significant:
Repeated updates to the same block can be coalesced into a single write of that block over the delay period. But the most common patterns generating updates are randomish writes and sequential writes, and when in-place updates do occur they usually happen within a short time.
For a filesystem with delayed allocation that benefits from delaying updates there is a limit to that benefit, even in the case of purely sequential writes to a single file, as the goal of delayed allocation is to achieve contiguity, and some contiguity is good enough.
So there are some potential benefits to delaying the commit of updates for a long time, but they are limited. But the delay can have large costs:
On a disk drive it means a long burst of mostly scattered writes, and on a flash SSD it can trigger a lot of scattered erasures at the same time; either can cause a single long surge in latencies instead of a less noticeable series of small ones.
The surge of writes interacts badly with IO schedulers (elevators) that can schedule writes ahead of reads too.
The Linux page (block) cache subsystem does not really have high and low water marks, but a cruder mechanism that tends to be more erratic in the amount of traffic it generates.
The result therefore can be that when a lot of uncommitted blocks get written out most Linux processes can seemingly freeze for dozens of seconds, as those that are reading get their reads queued behind writes, and those that are writing get crowded out by the sudden mass of page cache writes.
The best policy therefore usually is to have relatively few uncommitted updates in memory, and my guideline is for at most a few seconds' worth of write time, and not more than 1GiB even when there is very fast IO and lots of memory in the system.
So for example for a typical desktop with Linux I would not want more than 100MiB of uncommitted updates, or perhaps 200MiB with a flash SSD, and to achieve this I use parameters like:
# Writes queued up to 100MB and synchronous after 900MB
sysctl vm/dirty_background_bytes=100000000
sysctl vm/dirty_bytes=900000000
# 6s before flushing a page anyhow, scan all pages every 2s
sysctl vm/dirty_expire_centisecs=600
sysctl vm/dirty_writeback_centisecs=200
The above is to write uncommitted pages when more than 100MB of them have accumulated, or earlier when they have been uncommitted for more than 6 seconds.
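To see how far from those limits a system currently is, the relevant counters and settings can be inspected directly:

# Current amount of dirty (uncommitted) and in-flight page cache data
grep -E '^(Dirty|Writeback):' /proc/meminfo
# Currently effective thresholds
sysctl vm/dirty_bytes vm/dirty_background_bytes vm/dirty_expire_centisecs vm/dirty_writeback_centisecs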
Traditional UNIX would write out all uncommitted blocks every 30s but that was on systems that typically had 256KiB of main memory and disks with a speed of a few MiB/s.
It has become time for another small upgrade for my main desktop system, (some previous system configurations: September 2005, January 2006, March 2006, June 2009) and I have upgraded not long ago the desktop I use for games and some software and configuration testing, with the resulting main components for the main desktop PC:
Type | Product | Notes |
---|---|---|
Motherboard | ASUS M5A97 LE 2.0 | Supports ECC memory natively. |
Memory | Kingston 2× 8GiB ECC DDR3 at 1333MHz | ECC is good, 16GiB total is a lot for this system. |
CPU | AMD Phenom II X3 720 | Not the fastest or coolest anymore, but still pretty good for a system used mostly for office tasks. |
Cooler | Arctic Freezer Pro 7 | Recently installed, an enormous improvement in cooling and a significant one in noise over the fairly lame (just a block of aluminum still, no heatpipes) one included with the CPU. |
Graphics card | Sapphire Radeon HD 4770 1GiB | Not the fastest or coolest anymore, but still pretty good for a system used mostly for office tasks, and tested with oldish games like Team Fortress 2 it can still cope, with average quality settings. It also has a particularly quiet cooler, especially at low loads. |
Disks | Various brands, 2× 2TB 7200RPM + 2× 2TB for nightly backup; 4× 2TB on a shelf for periodic backup. | I am not using much of that space, but currently 2TB is the size to get, as for 1TB or smaller the price is not that much lower, and for larger capacities the ratio between capacity and IOPS is even more ridiculous. |
Power supply | Corsair HX520W | It is fairly quiet and apparently quite efficient. It also has modular cabling, but that does not matter much because with all the disks and cards it is nearly maxed out anyhow. |
My gaming and experimental PC has instead:
Type | Product | Notes |
---|---|---|
Motherboard | ASUS M5A97 PRO | Supports ECC memory natively. |
Memory | Corsair 2× 4GiB DDR3 at 1333MHz | This does not have ECC, and for a gaming and test PC I do not mind. I should have gotten ECC anyhow, because of the small cost difference, but I was at a computer fair and these were immediately available... |
CPU | AMD FX-6100 | Not the fastest or coolest anymore, but still pretty good for a system used mostly for office tasks. |
Cooler | Arctic Freezer Pro 13 | Recently installed, a large improvement in cooling and a significant one in noise over the barely sufficient one (small, single heatpipe) included with the CPU. |
Graphics card | Sapphire Radeon HD 7850 2GiB | Still a pretty good midrange card, capable of running contemporary games fairly well, and it runs oldish games like Half-Life 2 at full quality and very high frame rates. This model also has a Dual-X cooler which is much less noisy than most others. |
Disks | Various brands, 1× 500GB 7200RPM + 6× 1TB 7200RPM. | The disks are all old ones that I had previously for various reasons. The 500GB one has the system and games, the others are for storage setup testing. |
Power supply | Corsair TX650 | It is fairly quiet and apparently quite efficient. It is not modular, but that does not matter because with all the disks and cards it is maxed out anyhow. |
Some general comments:
As previously remarked web browsers consume resources that are cost-free to web site designers, so they tend to overuse them or be careless about the impact their web designs have on client systems, perhaps because what matters usually is running a good demo to whoever commissioned the web site design.
Note: I had previously noted with outrage that some web browser in 2005 was using over 200MiB; while I am talking about several GiB here...
This has become very common with the use of AJaX where data can be added incrementally using XMLHttpRequest to an initial page.
A good example is Tumblr blog archive
pages which can grow very long as they
are scrolled forward; for example this one for
a blog of kitten photographs
that is so addictive that one may be tempted to scroll forward
all the way.
The first deleterious consequence is that memory usage goes up dramatically as the page grows by adding new rows of blog thumbnails. The second and even worse is that Firefox won't release the memory thus allocated even when the page is closed and I have tried in several ways:
Minimize memory usage.
Some memory is reclaimed, but in about:memory's
Measure...
page the heap of allocated objects remains
sometimes very large (several GiB after a while)...
Saving the session, terminating the browser and restarting it obviously shrinks the memory back, but that means losing window positions and it is a bit cumbersome.
Note: the whole session has to be closed because currently Firefox runs a single process with multiple threads, while some other web browsers instantiate a new process for every window the user creates.
However when Firefox changes some aspects of its setup it
does a very convenient in-place
restart.
This can be invoked by using the Developer Toolbar which allows the user to type commands in a command line, and one of them is restart.
A quicker alternative has been developed by someone with probably the same issue, as a tiny and very convenient extension to invoke the same functionality from a menu at any time, called Restartless Restart and that works quite well.
Note: I also use the extension Session Manager which has the very beneficial effect of not loading (if so configured) any page in a newly recreated GUI tab until it is actually accessed.
I have mentioned previously that a good gamma calibration of a monitor's display can be significant, and I have now found a page showing that handling gamma properly is also significant when writing a program to resize a picture: because resizing a picture does not shift the gamma curve only if it is done in a linear color space, and apparently most image editing applications don't do that right.
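As an aside, with ImageMagick the linear-space resize can be approximated by explicitly converting to linear RGB before scaling and back to sRGB afterwards; a sketch with hypothetical file names:

# Resize in linear light to avoid the darkening/colour shift of gamma-space scaling
convert photo.png -colorspace RGB -resize 50% -colorspace sRGB photo-half.png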
Incidentally the sample images in the article (for example here but the others too) are also pretty good for checking the gamma calibration and color range quality of the display they are shown on.
As previously mentioned (1, 2) I am fond of using ECC RAM whenever possible because it is both cheap and gives some protection against undetected corruption of data.
While essentially all enterprise servers support and usually require memory with ECC, most desktop systems (whether for business or personal use) do not.
This may be because of market segmentation
strategies by suppliers, to ensure that lower margin desktops
would not be used in the same role as higher margin servers.
But in part I think it is because people using desktops do not care about potential undetected data corruption, or only care about getting the cheapest price. In some cases, like for gaming or media oriented desktops, it does not matter a lot: the very occasional undetected data corruption is insignificant. But for most other cases the additional cost of ECC is or should be quite small, after all it is a physical overhead of 1 over 8 (memory chips, bus widths) and a small time overhead.
Anyhow, fortunately it is possible to get desktop components that support ECC memory, and ECC memory itself, and currently that means:
Overall the cost of going with AMD CPUs and ASUS motherboards is insignificant: the pricing of those is entirely in line with market averages. AMD CPUs are priced for higher ratios of performance/price, and ASUS brand motherboards are priced as midrange products, and that's fine.
The price difference for ECC memory is also not large currently; for example a 4GiB 1600MHz DIMM from Crucial costs $42+taxes without ECC and $57+taxes with ECC or from Kingston $40+taxes without ECC and $49+taxes with ECC. That difference multiplied by a few DIMMs is rather small in absolute terms, and anyhow compared to the cost of the system.
I have long been using the delightful recoll text indexing system, which uses Xapian as its database backend, and I have only recently discovered that it comes with a tool that can compact databases, and on a freshly filled database it is already quite effective:
# xapian-compact /var/data/search/recoll/xapiandb /var/data/search/recoll/xapiandb2
postlist: Reduced by 49% 1218024K (2469208K -> 1251184K)
record: Reduced by 2% 10848K (484160K -> 473312K)
termlist: Reduced by 28% 542928K (1899720K -> 1356792K)
position: Reduced by 0% 17832K (7047320K -> 7029488K)
spelling: doesn't exist
synonym: doesn't exist
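To put the compacted copy into service it is enough to swap the directories while recoll is not indexing; a sketch assuming the paths above:

# Replace the live Xapian database with the compacted copy (keep the old one until verified)
mv /var/data/search/recoll/xapiandb /var/data/search/recoll/xapiandb.old
mv /var/data/search/recoll/xapiandb2 /var/data/search/recoll/xapiandb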
How can it be so effective on a freshly filled
database? It (wisely) uses mostly
B-trees
and since they grow and autobalance dynamically they tend to
have 2/3 full blocks. Presumably the compactor merges together
blocks, for example 3 blocks into 2, where possible.
Which reminds me of the point made by Michael Stonebraker about automatic tuning of data structures, particularly the B-tree and static trees. The point was that the B-tree has the advantage that its default layout is not optimal but still fairly decent when freshly created, and it can then be compacted; while static trees by default when freshly created are very suboptimal, and only become pretty optimal when explicitly compacted.
I have continued to use daily my new-ish Dell U2412M monitor with a 24in 1920×1200 IPS display and the Acer B326HUL monitor with a 32in 2560×1440 AMVA display, and I am still very impressed with both.
They don't quite have the exquisite colour range of the Philips 240PW9ES but the colours are still pretty good and their LED backlights power up a lot faster than the W-CCFL backlight of the latter.
Also both the Acer and the Dell LED backlights don't use PWM for changing brightness, which is good for people who notice the flicker that often accompanies PWM.
By the way I did not mention one peculiar aspect of the B326HUL which is that it does not have a VGA analogue video input, only digital ones. That is the first no-VGA monitor that I have used.
As to whether I like better the 24in or 32in display size, obviously 32in are somewhat better, but the 24in are adequate too and cover enough of the viewing area too at the viewing distance I use either of them, around 80cm.
For a shorter viewing distance I would probably prefer the 24in monitor, and for a somewhat longer one I would prefer the 32in one.
I have previously discussed how terribly inappropriate are (1, 2) (ordinary) filesystems for storing what are in effect individual records as files, instead of using files to store collections of records.
Having recently had to use MHonArc, there are some additional considerations that occurred to me.
The MH/Maildir style of mail archive is based not just on the laziness of not being bothered to use a container layer for records other than the filesystem API, but also on the example of mail spool implementations.
Mail spool implementations are almost always directories with individual files for the individual messages being spooled, that is MH/Maildir style, and usually for acceptable reasons:
The last point is the most important, because a mail spool is essentially transient, it is a (random access?) queue, while a mail archive is essentially permanent, it is mostly a one-way stack: very rarely are mail messages removed from a mail archive.
In particular individual messages are essentially never removed from a mail archive indexed by MHonArc. Therefore the MHonArc implementation of mail archives as a set of directories with individual mail messages as files in them, and the index as a set of HTML files, one per mail message, is particularly demented, because it in effect doubles the number of files in the archive, whose structure is optimized for being a transient mail spool, making it very easy to add and in particular to remove random messages; but that never happens, as messages are never removed and always added at the top.
Looking at it more generally: mail archives are in effect
logs
, that is timewise collections, and
mail spools are in effect
inversions
,
that is spacewise (keyword indexed) collections.
As to implementation, directories as in MH/Maildir archives may be up to some (low) size limit a lazy yet still acceptable implementation of spacewise collections, especially those with random access and deletions, but they are a terrible way to implement timewise collections, especially accumulative, log-like ones.
Directories are terrible implementations for large spacewise collections or for timewise collections because:
Files and their metadata are heavyweight entities with many attributes and giving pretty strong guarantees about them, for example as to ACID properties.
The latter point about the difference between peak and average load is particularly important for mail archives: the average load on a mail archive is minuscule, because most accesses are to a small number of relatively recent small mail messages, but backups and searches (such as index building) involve scanning all or most or many members of the collection, triggering high volumes and rates of very expensive metadata accesses; expensive because they are typically random, and filesystem API guarantees make those metadata accesses expensive to implement.
Anyhow, it turns out that a lot of popular data happens to have natural timewise, log-like structure, where older data is essentially read-only and newer data gets appended: more or less any data related to people, amusingly enough. That means mail, photos, music, personal documents and blogs... At most older data gets deleted, usually in a timewise way too, rather than deleted in random order or updated.
Almost the only data that is not naturally timewise is source and object code, which gets created and deleted fairly randomly during development. However there are some really huge qualifications to that: if the source code is version-controlled the version-control system effectively handles a lot of it timewise, and when it gets packaged and installed most of it also becomes mostly readonly, with new data appended (except for whole-package updates).
So-called search engines include a search component, an indexing component and a query component; the searcher is largely a producer of logs and the indexer is largely a consumer of logs, and perhaps because of that Google developed a special purpose filesystem that is implemented in a way that favors archiving and appending; it would be unsurprising if it were quite suitable also to store Google's mail, videos, blogs and social histories.
It is for me a huge disappointment that so many developers and people are tempted by laziness to implement archives in the same way as spools, like MHonArc.
Note: it ought to be clearer now why this very blog is not organized as a page-per-post, as each page contains several posts.
The past month of July has been quite warm and as a result I have often kept my doors and window open. This has resulted in some flies coming into the house and I have noticed that:
I have previously reported some impressions of high end keyboard devices (1, 2), but I also bought some time ago some high end pointer devices (mice) and accessories. In part this was to see if their usability and robustness was better than that of low-end mice, which tend to break early (especially the left button and the wheel) and have somewhat clumsy handling.
In part this was to see if they improved my scoring in
first person shooter
games, as they are
claimed to do thanks to higher resolution pointing and faster
and steadier handling. I chose these pointer devices from
favourable reviews and because they are fairly different.
I have thus tried three pointer devices and two pointer surfaces (mousepads); the pointer devices are all relatively mid-to-high-end mice:
Black, rough body with a somewhat fancy shape, with the usual 2 buttons and wheel-button on the front, plus two extra buttons on the left hand side and designed for right handed use; the body is shaped also to be much wider and longer than most ordinary mice.
Note: some users report that the rough matte finish of the sides wears out fairly quickly and becomes smooth.
Two extra buttons on top, in line with the wheel, allow changing the resolution during use in three steps, from 3200DPI down to 800DPI; the settings are programmable. The event rate can go from 1000 per second down to 500, 250 or 125 per second, but only via reprogramming.
It has 3 optional metal weights at the bottom that by default make it a fair bit heavier than the others, which helps with stability.
The M40 is programmable via some sort of MS-Windows only tool, but it does not require the tool to work as a normal mouse, or to switch DPI.
To me it feels pretty good, and usually I use the middle resolution during desktop use and the higher resolution during games; the lower resolution is useful for image editing. It costs around £33 (incl. VAT).
Shaped like a rather ordinary mouse, smaller than the other two, with a glossy white body (the other two are black and matte), the usual two buttons and wheel-button on the front, and two side buttons. There is a single button on the bottom to switch resolution among 450DPI, 1150DPI and 2300DPI, and the mouse wheel light changes color to indicate which one is active. The mouse by default has a motion event rate of 1000 per second, but in case this is not compatible with the rest of the system it can be reduced to 500 or 125 events per second.
The manufacturer claims that the wheel has a fully optical encoder, and that makes it more precise and robust.
It handles very well, it has especially large teflon pads on the bottom, and it is quite light. It seems a bit less robust than the other two, being of more conventional mechanical design. It costs around £52 (incl. VAT).
Shaped to be large and squat, with a definite right handed shape like the Raptor M40, it has the usual 2 buttons and wheel-button on the front, plus two side buttons. There are two extra buttons on top, in line with the wheel, to change DPI, and when the DPI is changed the lighting of the wheel changes color.
It has fairly wide pads on the bottom, not quite as wide as those of the Zowie EC1. It is fairly comfortable to handle and changing DPI works well.
It handles very well and very reliably, and seems fairly robust too. I like the handling and precision; I don't like that it seems to have a very sensitive wheel-button which is very easy to press inadvertently. It costs around £40 (incl. VAT).
This is an old product, designed for electro-mechanical mice, with a sensor tied to a rotating ball. The surface is made of rubber coated with a very rough patterned plastic on top.
It worked pretty well with mechanical mice, and works pretty well with optical mice too, where the roughness is non uniform and that suits the optical sensor. It is fairly small, which is good for me, and the rubber back adheres very well to my desk, making the surface very stable. It also seems durable, as I have used one for several years and it is not much worn out.
What I like is that it is relatively small and quite stable and works well, what I don't like is that the surface is very rough and somewhat abrasive. It costs around £10.
This mousing surface is completely different from the 3M one: it is made of aluminum metal, with a dark black smooth burnish on top, and it is huge. It has some pads underneath that give it some grip, but sometimes during game play it moves.
It works very well with the Mionix mouse, and also mostly with the Zowie one, but the Corsair mouse does not work on it; I suspect that it is too smooth for its sensor.
What I like is that it is very durable as it is made of aluminum and works well with some mice, and what I don't like is that it is huge, slips sometimes and some mice don't work with it. It costs around £14.
All the mice above are pretty good and they can be used without any special driver with GNU/Linux. They are USB2 mice. Their mouse protocol is however not one of the more common ones, and I have found that they don't work with some KVM switches as a result, except in USB pass-through mode, which is a bit inconvenient but mostly works.
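Since these mice present themselves as plain USB HID devices, it is easy to check from a shell how they look to the system; this is only a sketch, and the usbhid parameter shown is a generic kernel knob rather than anything specific to these mice:

  # List pointer devices as seen by the X server; these mice show up
  # as ordinary "slave pointer" devices, with no special driver needed.
  xinput list

  # Show how the mouse enumerates on the USB bus; the vendor:product id
  # is useful when a KVM switch refuses to pass the device through.
  lsusb

  # Kernel-side polling interval for USB mice, in milliseconds; 0 normally
  # means "use whatever interval the device itself requests", for example
  # 1ms for a mouse reporting 1000 events per second.
  cat /sys/module/usbhid/parameters/mousepoll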
Overall I like the Zowie EC1 for desktop use, as it is white and nice and looks more conventional and less garish, yet it is very precise and can change resolution with a hardware button, even if one has to lift it and turn it to do so. I am not sure however that £52 is worth paying for a much better desktop mouse (which is also a pretty good gaming mouse), and I haven't had it long enough to see whether it is more durable than the low-end ones.
I like the Raptor M40 and the NAOS 3200 more for gaming, and currently I use the NAOS for gaming. All three are much, much better than low-end mice for gaming, and my first-person shooter game aiming has been improved quite significantly by them. I have been surprised how much aiming is improved by a higher resolution, higher event rate mouse. The price for either seems more reasonable than that for the Zowie, and I think they are good value. Overall I like the Raptor M40 more than the NAOS 3200 (customizable weights, less sensitive wheel-button), and it seems cheaper too.
As to the pads, the smoother metal pad of course lets the mouse glide much faster, so I use the ENSIS 320 for gaming with the NAOS 3200 mouse, and they work well together; it seems very tough, but it is huge and slips occasionally. I use the 3M Precise Mousing Surface for desktop use to protect the wood of the desk, even if its surface is a bit too rough for resting my hand on, which however happens rarely. Both seem very durable, while low-end mousepads made of cloth or soft rubber tend to fray after a few months of use.
Having reported my impressions (1, 2) of some relatively expensive premium keyboards and related products, there is a question that I have never seen sensibly and explicitly answered, as to why it is worth paying 3-10 times more for a premium keyboard than for a low-end one.
The primary answer given by many, that premium keyboards have mechanical key switches, and mechanical key switches just have a better key-pressing feel, is quite wrong in its premise.
Premium keyboards as a rule do have a better, much better in some cases, feel than cheap ones, but while most premium keyboards have electro-mechanical key switches, some of the best have membrane key switches, for example Topre and buckling spring IBM/Unicomp ones.
What is common to all keyboards that have a better typing feel is that the keys have springs, and that is what gives the better, crisper typing feel, and also makes them more expensive. Rubber dome keys have the keycaps supported directly on the rubber dome of the membrane, and pressing the key presses down on a rubber dome, a feeling that is fairly mushy and non-linear. Pressing the keycap on a premium keyboard means pressing down on a well calibrated metal spring, a rather different, more definite feel.
Note: that's why this and previous writings use the term premium keyboard instead of mechanical keyboard, which is used (misleadingly) by most related publications.
Some people then, as a secondary answer, like mechanical keyboards in particular not just because they have spring-supported keys, but because the electro-mechanical switch can be constructed so that it gives a positive mechanical feel and/or sound signal when it gets switched; but some premium keyboards with mechanical key switches have neither, and still pressing on them feels better than with a rubber dome supported keycap.
Note: key switches that give a bump on registering a press are called tactile, those that give a sound are called clicky, and those that just have a spring with neither are called linear. The obviousness of the bump, the loudness of the sound, and the stiffness of the spring are also parameters, plus some others.
A third answer is that mechanical switches in particular are as a rule designed to switch midway through their travel extent rather than near the end, therefore a typist can learn to touch-type with short strokes that require very little effort, which is rather fast and also accurate, and makes little noise as the key does not get pressed fully, banging against the bottom. As a rule this requires tactile or clicky mechanical switches, but can be learned also with linear ones, even if it is rather more difficult.
Note: spring based keys tend to have a rather longer full travel extent than rubber dome ones.
Another answer is that, given that putting a spring in each key switch is already expensive, many premium keyboards also have additional features, such as better build quality, often including better keycap quality, and some extras such as key backlighting or a detachable cable; it is also easier to find them in custom layouts than low-end ones, which are produced for the mass market in one-size-fits-all fashion. Among the better features that usually accompany spring-based keys:
Keycaps in light colours that look cool rather than black ones.
Keycap legends that are dye-sublimated or even double-shot.
So while the main draw of a premium keyboard is the better feel given by a spring under each keycap, the extra money often also buys useful extra aspects.
As for me, I use keyboards for many hours a day, so a difference in price of a few dozen UK pounds is not going to stop me, and I particularly like the feel of the spring, the smaller layouts, in particular the TKL one, and the options for more durable keycap legends.
But I also like the better build quality, the availability of less slippery keycap plastics, and of light colored or back-lighted keycaps. I am not particularly fond of clicky key switches; I sort of like tactile ones, but also linear ones with a stiffer spring; and I am not much interested in macros or animated back-lighting modes.
I have previously mentioned that I could play recent games well with AMD/ATi based cards like a model 7850 with the AMD/ATi proprietary fglrx driver, on Ubuntu 12.04 and on Debian 7.
By upgrading my Ubuntu 12.04 system to Xorg and kernel packages backported from the more recent 14.04 release I am now able to enjoy quite high rendering speed for TF2 on both my AMD/ATi 4770 and 7850 cards. How high? With the classic size of 1920×1080 pixels:
The key details are:
modesetting is enabled.
This is the list of relevant packages I have installed, as listed by Aptitude:
  libegl1-mesa-drivers-lts-trusty          10.1.3-0ubuntu   precise-updates
  libegl1-mesa-drivers-lts-trusty:i386     10.1.3-0ubuntu   precise-updates
  libegl1-mesa-lts-trusty                  10.1.3-0ubuntu   precise-updates
  libegl1-mesa-lts-trusty:i386             10.1.3-0ubuntu   precise-updates
  libgbm1-lts-trusty                       10.1.3-0ubuntu   precise-updates
  libgbm1-lts-trusty:i386                  10.1.3-0ubuntu   precise-updates
  libgegl-0.0-0                            0.0.22-2ubuntu   precise
  libgl1-mesa-dri-lts-trusty               10.1.3-0ubuntu   precise-updates
  libgl1-mesa-dri-lts-trusty:i386          10.1.3-0ubuntu   precise-updates
  libgl1-mesa-glx-lts-trusty               10.1.3-0ubuntu   precise-updates
  libgl1-mesa-glx-lts-trusty:i386          10.1.3-0ubuntu   precise-updates
  libglapi-mesa-lts-trusty                 10.1.3-0ubuntu   precise-updates
  libglapi-mesa-lts-trusty:i386            10.1.3-0ubuntu   precise-updates
  libgles1-mesa-lts-trusty                 10.1.3-0ubuntu   precise-updates
  libgles1-mesa-lts-trusty:i386            10.1.3-0ubuntu   precise-updates
  libgles2-mesa-lts-trusty                 10.1.3-0ubuntu   precise-updates
  libgles2-mesa-lts-trusty:i386            10.1.3-0ubuntu   precise-updates
  libllvm3.4                               1:3.4-1ubuntu3   precise-updates
  libllvm3.4:i386                          1:3.4-1ubuntu3   precise-updates
  libopenvg1-mesa-lts-trusty               10.1.3-0ubuntu   precise-updates
  libopenvg1-mesa-lts-trusty:i386          10.1.3-0ubuntu   precise-updates
  libxatracker2-lts-trusty                 10.1.3-0ubuntu   precise-updates
  linux-generic-lts-trusty                 3.13.0.30.26     precise-security,precise-upda
  linux-headers-generic-lts-trusty         3.13.0.30.26     precise-security,precise-upda
  linux-image-generic-lts-trusty           3.13.0.30.26     precise-security,precise-upda
  linux-lts-trusty-tools-3.13.0-30         3.13.0-30.55~p   precise-security,precise-upda
  linux-signed-image-generic-lts-trusty    3.13.0.30.26     precise-security,precise-upda
  linux-tools-generic-lts-trusty           3.13.0.30.26     precise-security,precise-upda
  linux-tools-lts-trusty                   3.13.0.30.26     precise-security,precise-upda
  llvm-3.4                                 1:3.4-1ubuntu3   precise-updates
  llvm-3.4-runtime                         1:3.4-1ubuntu3   precise-updates
  mesa-vdpau-drivers-lts-trusty            10.1.3-0ubuntu   precise-updates
  x11-xserver-utils-lts-trusty             7.7+2ubuntu1~p   precise-updates
  xserver-common-lts-trusty                2:1.15.1-0ubun   precise-updates
  xserver-xorg-core-lts-trusty             2:1.15.1-0ubun   precise-updates
  xserver-xorg-input-all-lts-trusty        1:7.7+1ubuntu8   precise-updates
  xserver-xorg-input-evdev-lts-trusty      1:2.8.2-1ubunt   precise-updates
  xserver-xorg-input-joystick-lts-trusty   1:1.6.2-1build   precise-updates
  xserver-xorg-input-kbd-lts-trusty        1:1.8.0-1build   precise-updates
  xserver-xorg-input-mouse-lts-trusty      1:1.9.0-1build   precise-updates
  xserver-xorg-input-mtrack-lts-trusty     0.3.0-1build2~   precise-updates
  xserver-xorg-input-synaptics-lts-trust   1.7.4-0ubuntu1   precise-updates
  xserver-xorg-input-vmmouse-lts-trusty    1:13.0.0-1buil   precise-updates
  xserver-xorg-input-void-lts-trusty       1:1.4.0-1build   precise-updates
  xserver-xorg-input-wacom-lts-trusty      1:0.23.0-0ubun   precise-updates
  xserver-xorg-lts-trusty                  1:7.7+1ubuntu8   precise-updates
  xserver-xorg-video-dummy-lts-trusty      1:0.3.7-1build   precise-updates
  xserver-xorg-video-fbdev-lts-trusty      1:0.4.4-1build   precise-updates
  xserver-xorg-video-glamoregl-lts-trust   0.6.0-0ubuntu4   precise-updates
  xserver-xorg-video-intel-lts-trusty      2:2.99.910-0ub   precise-updates
  xserver-xorg-video-modesetting-lts-tru   0.8.1-1build1~   precise-updates
  xserver-xorg-video-nouveau-lts-trusty    1:1.0.10-1ubun   precise-updates
  xserver-xorg-video-radeon-lts-trusty     1:7.3.0-1ubunt   precise-updates
  xserver-xorg-video-vesa-lts-trusty       1:2.3.3-1build   precise-updates
For Ubuntu 14.04 all these package versions are the standard ones, so it is quite easy to install them.
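For reference, something along these lines should be enough to pull in the backported stack on a 12.04 system and then check that it is actually in use; this is only a sketch, with the meta-package names taken from the list above (they drag in most of the rest as dependencies), and glxinfo assumed to be available from the mesa-utils package:

  # Backported ("lts-trusty") kernel and Xorg/Mesa stack on Ubuntu 12.04.
  sudo apt-get update
  sudo apt-get install linux-generic-lts-trusty xserver-xorg-lts-trusty \
      libgl1-mesa-glx-lts-trusty libgl1-mesa-dri-lts-trusty

  # 32-bit GL libraries, needed for 32-bit games such as TF2 under Steam.
  sudo apt-get install libgl1-mesa-glx-lts-trusty:i386 libgl1-mesa-dri-lts-trusty:i386

  # After a reboot, check that the radeon KMS driver and the newer Mesa
  # are the ones actually in use.
  dmesg | grep -i "radeon kernel modesetting"
  glxinfo | grep -iE "opengl (renderer|version)"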
After using for a while two nice, very different premium keyboards in the lower-end price band for mechanical keyboards, I was pleased enough with both of them to try and extend the experience, so I bought another, fancier keyboard and a set of specialty keycaps, and these are my impressions:
Keycaps from Ducky Channel for Cherry MX (1, 2) key switches, with main key keycaps in white and special key keycaps in pink (other color combinations available), with these features:
Legends that are engraved but not infilled (colored).
What I like:
What I do not like:
I have been considering infilling the engraved legends by hand with black nail polish. Since the keycaps are made of PBT a solvent like acetone can be used later to remove the nail polish if desired (while ABS keycaps dissolve in acetone).
Overall perhaps I should have bought a Cherry keyboard with light grey PBT keycaps and just scavenged those. Apparently the Cherry G81 series keyboards have PBT keycaps (only the light gray models) and cost less than this one (but their PBT keycaps are less thick).
Note: I have discovered that I have an old Cherry G83 keyboard which also has light grey colored PBT (or possibly POM) keycaps, a lucky find.
I had bought this keycap set for the Corsair K65 mentioned previously, as that has printed legends, but for now I have actually put them on the QuickFire TK Stealth, which worked out well, both as to improved visibility with front illumination and as to a better typing feel than with the default ABS ones. The total cost was still within the range of many equivalent products at around £100 (VAT incl.), and one gets two alternative sets of keycaps, for potentially quite a long life.
As to sticker legends glued on keycaps, I have been surprised that they seem fairly durable, while laser-etched legends in my experience are abraded pretty soon (usually 1-2 years for the first legends to disappear), and it is even worse for the printed and lacquered legends, which often last less than a year on heavily used keys (and not just on keyboards).
Overall these are very good quality keycaps, and the price is high but not out of line with that; however, that the keycaps are engraved but not infilled is for me only marginally useful. With infilled engravings, either as purchased or added later, they would be much better.
This is from Ducky Channel and has MX Black key switches, with these notable features:
What I like:
The MX Black switches are linear, that is without any clicky actuation feedback. Perhaps I would prefer the MX Clear variant, which is as stiff as these MX Black but with the soft actuation feedback of the MX Brown.
What I do not like:
Overall this is a very good quality keyboard, and the orange per-key back illumination is very pleasant and useful. But the price and the lack of compatibility with older USB sometimes make me think I should have got the QuickFire TK instead, even if the latter is less compact and less cool.
However, of the three keyboards I recently bought it is the one I like best, even if by a small margin.
Overall I am very pleased with the keyboards I have bought, and even with the PBT keycaps, even if I will infill their engraved legends with nail polish. But as to the latter perhaps I should have just used the PBT keycaps from my old Cherry G83 (or could have bought a sacrificial Cherry G81 just for the keycaps).
I very much like that they are available in different layouts, and in particular in the TKL layout, as that is ideal for my usage (I have been tempted by compact layouts without special and arrow keys though), and that there are several flavours of key switches and of keycaps.
Note: If I had been interested in full-size keyboards I would probably have bought Cherry G80 ones, as they are available in a light grey colour, with PBT keycaps, at prices that are the lowest among mechanical keyboards. Unfortunately they are usually only available with MX Blue switches, but some rare shops have them with MX Black ones too.
Their price is 4-5 times that of average keyboards, but the absolute difference is relatively small and the typing quality and product durability seem much better: I have gone through a number of average keyboards and most were defective, or something broke fairly quickly, or they wore out in less than 1-2 years, and the quality of typing was usually fairly dire.
Since I spend a lot of time working on a computer, mostly desktops but also laptops, I am fairly keen to have a healthy and comfortable setup; visually, for example, with good monitors (1, 2, 3) with legible displays and good fonts.
But I have also been interested in finding good keyboards (and mice), even if that has not been quite as important as the visual aspect.
So I have tried several keyboards in the past, with a strong preference for light-colored keyboards, even if currently dark-colored ones are more commonly found.
I have also tried to find shorter ones, in part because my computer desk at home is a somewhat narrow one on rollers, which is very convenient, but also because I prefer the main key block of the keyboard to be centered on the monitor, and longer keyboards extend exclusively to the right.
So I have gone through a number of average keyboards of various types, and have not been happy with them, usually because they were light (lacking stability), fragile, with mushy key action, keycaps with easily abraded or difficult to read legends, and poor quality construction; usually with a life of 1-2 years, which is inconvenient, even if not expensive given the low cost of each.
With the much greater diffusion of computer use over the past decade the market for all computer accessories has expanded, and this has supported both a shallower range of products for average items and a wider range for premium items.
So while average keyboards are all small variations on a single theme of shiny black, mushy key action, very cheap designs, premium products are easier to find and there is a greater variety of them. So I have looked for premium keyboards with:
In general, as for many other products, this means gamer-oriented products, because the marketing prejudice of many manufacturers is that only gamers are enthusiasts who are prepared to pay higher prices for better products, if only to show off, and quality keyboards (and mice) actually have an impact on game performance.
Premium keyboards are almost always built around mechanical (actually electro-mechanical) key switches (also 1, 2, 3), as they are longer lasting and give a better typing feel than the rubber dome ones used in average keyboards; premium keyboards also often have better keycaps (better-feeling plastic, more durable legends, sometimes back illumination) too. So I bought two rather different ones, in part because they are for two different desktops, in part because of wanting to try two different approaches:
This is from Corsair, which is a brand of mostly gamer-oriented products, and it has these notable features:
ISO layout with a TKL set of keys with Cherry MX Red key switches that stand directly off their base plate, without being set off by the kind of frame that is common on other keyboards. Also, detachable mini-USB cable.
A switch for different scan rates and for a compatibility mode with older USB equipment, in particular KVMs.
Keys for locking the Windows keys, and for media player operation.
What I like:
What I do not like:
This is a pretty good, basic, relatively cheap, good quality mechanical-switch keyboard. Probably it is the cheapest TKL one, and only Cherry G80 full-size keyboards are cheaper.
It would be better for me with light-colored keycaps, and ideally with illumination, even if not per-keycap, just background illumination, but that is not available in its price range.
Given however that the keyboard is a fair bit cheaper than several others, it is feasible to buy it and then replace the keycaps immediately (if one really objects to the feel of ABS plastic) or later when they wear out.
Also the switch that enables the compatibility-mode USB protocol can be a really useful feature.
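A rough way to see which mode the keyboard is actually presenting to the host is to look at its USB descriptors from a shell; this is only a sketch, using generic USB tools rather than anything specific to this keyboard:

  # Dump the descriptors of all USB devices and look at the keyboard's
  # HID interfaces: bInterfaceProtocol 1 ("Keyboard") marks the
  # boot-protocol interface that old KVMs and BIOSes expect, and
  # bInterval is the polling interval (for a full-speed device, in
  # milliseconds, so 1 means up to 1000 reports per second).
  sudo lsusb -v 2>/dev/null | grep -E "^Bus|bInterfaceProtocol|bInterval"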
Overall I quite like it.
This is from CM Storm, which is a sub-brand of Cooler Master, and it has these major features:
ISO layout with a TKL keyboard with Cherry MX Brown key switches. Also, detachable mini-USB cable.
What I like:
What I do not like:
Overall it is a quality keyboard, but perhaps I should instead have bought the equivalent model with top-printed keycaps, or the similar model with top-printed, back-illuminated keycaps, even if more expensive.
So far I have been fairly happy with both. The MX Brown switches of the QuickFire seem slightly preferable for typing to the MX Red ones, but the MX Red ones are very quick to press, so probably better for gaming. I find the tallness of the QuickFire a bit too much sometimes; neither has back-lighting, which would make the back-face captions of the QuickFire in particular far more visible; however I use it far more often than the other.