Archive for February, 2010


Kernel Samepage Merging (KSM) in the Linux Operating System

February 16, 2010

Kernel Samepage Merging is a recent Linux kernel feature that combines identical memory pages from multiple processes into a single copy-on-write page. Because KVM guest virtual machines run as processes under Linux, this feature gives KVM the memory overcommit capability that hypervisors rely on for more efficient use of memory. The motivation for KSM comes from the observation that on a large machine running several virtual machines (potentially running the same OS), there are a number of duplicate pages, each consuming a page of memory in the hypervisor. If pages with exactly the same content don’t change often, then sharing them across their various users frees up memory for other uses in the system. Memory overcommit is the ability of the hypervisor to provide more memory than is really available (via virtual memory or other means). The scope for sharing text regions (compiled code and instructions) is enormous. In a traditional operating system, text sharing happens automatically between processes that run the same binary; across virtual machines it does not, because each guest loads its own copy of the same code into its own memory.

An overview of the design.

KSM is currently designed to work only on anonymous pages. This works very well for a KVM-based virtualisation environment, since all of the memory used by the virtual machine / OS is mapped into the qemu process. For each anonymous page scanned, the algorithm looks for a match in the stable tree. If a match is found, the page is merged with the page in the stable tree.

Kernel thread: KSM creates a kernel thread that merges pages with identical content in the background. This might change in the future, and KSM might create several threads depending on the topology of the system and the total available memory.
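The background thread can be observed from user space through the files the kernel exposes under /sys/kernel/mm/ksm (for example run, pages_to_scan, sleep_millisecs, pages_shared and pages_sharing). The following is a minimal sketch that reads whatever integer-valued files it finds there; the helper name read_ksm_counters and its base parameter are made up for this illustration, and the set of files present depends on the kernel version.

```python
import os

KSM_SYSFS = "/sys/kernel/mm/ksm"  # present only on kernels built with KSM


def read_ksm_counters(base=KSM_SYSFS):
    """Return {file name: integer value} for the KSM sysfs directory.

    Hypothetical helper for illustration. Returns an empty dict when
    the directory does not exist (KSM not available).
    """
    counters = {}
    if not os.path.isdir(base):
        return counters
    for name in sorted(os.listdir(base)):
        path = os.path.join(base, name)
        try:
            with open(path) as f:
                counters[name] = int(f.read().strip())
        except (OSError, ValueError):
            continue  # skip unreadable or non-integer entries
    return counters


if __name__ == "__main__":
    for name, value in read_ksm_counters().items():
        print(f"{name}: {value}")
```

On a host where KSM is active, pages_sharing gives a rough idea of how many page mappings are currently being served by shared copies; on a host without KSM the script simply prints nothing.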

Stability: KSM maintains two data structures: the stable tree and the unstable tree. KSM works in multiple passes, checking whether pages with identical content can be merged. At the end of each pass, the unstable tree is reinitialised. A pass is a full scan of the areas registered via the madvise(2) API.
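The interplay of the two trees can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the kernel's implementation: Python dicts keyed by page content stand in for the trees, byte strings stand in for pages, and the class and method names (ToyKsm, scan_page, full_pass, pages_saved) are invented for this example. It also assumes that a page matching an entry in the unstable tree is merged with it and the result moved to the stable tree.

```python
class ToyKsm:
    """Toy model of KSM's two trees; dicts stand in for the kernel's trees."""

    def __init__(self):
        self.stable = {}    # content -> set of page ids sharing one merged copy
        self.unstable = {}  # content -> id of a page seen once in this pass

    def scan_page(self, page_id, content):
        """Process one page and report what happened to it."""
        if content in self.stable:
            # Match in the stable tree: merge with the already shared page.
            self.stable[content].add(page_id)
            return "merged with stable page"
        if content in self.unstable:
            # Match in the unstable tree: merge both, promote to stable.
            first_id = self.unstable.pop(content)
            self.stable[content] = {first_id, page_id}
            return "merged with unstable page"
        # No match yet: remember the page for the rest of this pass.
        self.unstable[content] = page_id
        return "added to unstable tree"

    def full_pass(self, pages):
        """Scan every registered page once, then reset the unstable tree."""
        results = {pid: self.scan_page(pid, data) for pid, data in pages.items()}
        self.unstable = {}  # reinitialised at the end of each pass
        return results

    def pages_saved(self):
        """Each group of n identical pages needs one copy, saving n - 1."""
        return sum(len(ids) - 1 for ids in self.stable.values())


pages = {
    "a": b"\0" * 4096,
    "b": b"\0" * 4096,
    "c": b"x" * 4096,
    "d": b"\0" * 4096,
}
ksm = ToyKsm()
outcome = ksm.full_pass(pages)
print(outcome)
print("pages saved:", ksm.pages_saved())
```

In this run page "a" first lands in the unstable tree, "b" merges with it and both move to the stable tree, "c" stays unmatched, and "d" merges directly with the stable copy, so two pages' worth of memory is saved.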

User hints: The user provides hints about the address ranges that can be considered ‘mergeable’ using the madvise(2) system call. The KSM subsystem uses these hints to decide which parts of the virtual address space to scan for pages with duplicate content.
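As a rough illustration of such a hint, the sketch below creates a private anonymous mapping and asks the kernel to treat it as mergeable. It assumes Python 3.8 or later on Linux, where the mmap module exposes madvise() and the MADV_MERGEABLE constant; the function name mark_mergeable is made up for this example. The call can still fail on a kernel built without KSM, so the result is reported as a boolean rather than taken for granted.

```python
import mmap


def mark_mergeable(length=mmap.PAGESIZE * 4):
    """Create a private anonymous mapping and try to register it with KSM.

    Hypothetical helper for illustration. Returns (mapping, registered),
    where registered is False if the hint is unavailable or rejected.
    """
    # KSM works on anonymous pages, so ask for a private anonymous mapping.
    region = mmap.mmap(-1, length, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    region.write(b"\0" * length)  # fill every page with identical content
    advice = getattr(mmap, "MADV_MERGEABLE", None)
    if advice is None or not hasattr(region, "madvise"):
        return region, False  # platform or Python version lacks the hint
    try:
        region.madvise(advice)  # madvise(2) with MADV_MERGEABLE
    except OSError:
        return region, False  # e.g. kernel built without KSM support
    return region, True


region, registered = mark_mergeable()
print("registered with KSM:", registered)
region.close()
```

Registering a range only makes it a candidate: whether the pages are actually merged depends on KSM being switched on and on the background scan reaching them.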

In conclusion, the primary users of KSM are expected to be virtualisation solutions, and KVM in particular. However, KSM is designed to be generic. In short, KSM is suited to applications beyond virtualisation, specifically wherever there is a chance of a large amount of content being the same.


Bootstrapping Fedora 12: A simple review

February 12, 2010

A GNU/Linux system, in general, has three phases of booting: kernel and initrd load, system initialisation, and then services initialisation. In the first phase, the bootloader (GRUB) loads the kernel and the initial RAM disk (initrd) into memory. In the second phase, the essential initialisation tasks are done: a file system check, udev device node generation, filesystem mounts, network initialisation, setting the hostname, date and time, and other basic stuff. In the third phase, as the system enters a run level, daemons and services like cups, httpd and the like are started.

Dracut – The current incarnation of mkinitrd sucks for a variety of reasons. Fedora developers took up the charge of creating “initramfs infrastructure” that has “…as little as possible hard coded into the initramfs”. And it looks like they made it big. The Dracut architecture is event-driven and makes use of the Bash shell, eliminating Nash from the initrd’s picture. Using Bash has a threefold advantage over using a minimal shell or a home-grown replacement. The end result is that Nash isn’t really required any more, and it will be dropped in a future release. However, in Fedora 12, Nash is still used in a few places, including Anaconda, the system installer.

KMS and Plymouth – In simple terms, with KMS (Kernel Modesetting), setting the screen resolution and colour depth, as well as managing the video memory, is done by the kernel itself. The advantage? Here’s an easy one: the transition from the splash screen to the display manager becomes flicker free. Since the kernel itself manages the video memory instead of the X11 video driver, if a kernel oops or panic occurs, the kernel can wipe out the X11 display and then show an error message. Compare this with the pre-KMS era, where the system would simply freeze up and require a manual reset. Plymouth has already tasted mainstream success, being included with the current Mandriva release. It is slated to replace Ubuntu’s USplash, and with all the inherent problems of Bootsplash and the fact that Plymouth can work with both UMS and KMS, we should see most, if not all, distros switch to Plymouth to provide a graphical boot.

Well, the infrastructure is now in place. All it needs is a bit of tweaking to get to where the system boots in less than 15 seconds. There’s one unresolved bug, which causes Plymouth to be the first job to get killed when the system is shutting down, creating a blank screen (but it’s still flicker free) and dampening the shutdown experience. This will be corrected once UED’s make an appearance in Fedora. That’s the only way to fix it without using what we call ugly hacks.