Disable Hardware Prefetching in Intel Microprocessors

Prefetchers predict the data that a program is about to access and load it into the cache before the program actually needs it. Sometimes the predictions are woefully wrong, and the wasted fetches hurt the program's performance. So programmers need to be able to test their programs with prefetching disabled.

There are two ways that prefetching can be disabled:

1. From the BIOS: On certain systems (mostly servers) the BIOS allows prefetching to be disabled. This is the easiest way to get the deed done.

2. Using MSRs: Model-specific registers (MSRs) in Intel processors allow vital micro-architectural settings of the processor to be viewed and modified.
A previous post shows how MSRs can be accessed (http://arbidprobramming.blogspot.in/2010/04/programming-intels-model-specific.html).

On an Intel i3, MSR 0x1a0 has 4 bits that deal with prefetching:
  • Bit 9: Hardware Prefetcher Disable
  • Bit 19: Adjacent Cache Line Prefetch Disable
  • Bit 37: DCU Prefetcher Disable
  • Bit 39: IP Prefetcher Disable
Setting any of these bits to 1 will disable that prefetcher.

On AMD Family 10h processors, bit 13 of MSR 0xC0011022 needs to be set to disable prefetching.
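Here is a rough sketch in C of how the four Intel bits listed above combine into a single value. The function and macro names are my own and purely illustrative. read_msr assumes the Linux msr driver is loaded (modprobe msr) and that the program runs as root; writing the modified value back is deliberately left out.

```c
#define _XOPEN_SOURCE 700   /* for pread() */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/* Prefetch-disable bit positions in MSR 0x1a0, as listed in the post. */
#define MSR_PREFETCH_CTL      0x1a0
#define BIT_HW_PREFETCH       9
#define BIT_ADJ_LINE_PREFETCH 19
#define BIT_DCU_PREFETCH      37
#define BIT_IP_PREFETCH       39

/* Mask with all four prefetch-disable bits set. */
uint64_t prefetch_disable_mask(void)
{
    return (1ULL << BIT_HW_PREFETCH) | (1ULL << BIT_ADJ_LINE_PREFETCH) |
           (1ULL << BIT_DCU_PREFETCH) | (1ULL << BIT_IP_PREFETCH);
}

/* Return the MSR value with all four prefetchers disabled;
 * every other bit is left exactly as it was. */
uint64_t disable_prefetchers(uint64_t msr_value)
{
    return msr_value | prefetch_disable_mask();
}

/* Read one MSR through the Linux msr driver (/dev/cpu/<cpu>/msr),
 * where the file offset selects the register.
 * Returns 0 on success, -1 if the device is missing or unreadable. */
int read_msr(int cpu, uint32_t reg, uint64_t *value)
{
    char path[64];
    int fd;
    ssize_t n;

    snprintf(path, sizeof(path), "/dev/cpu/%d/msr", cpu);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = pread(fd, value, sizeof(*value), reg);
    close(fd);
    return n == (ssize_t)sizeof(*value) ? 0 : -1;
}
```

A caller would read MSR_PREFETCH_CTL with read_msr, pass the result through disable_prefetchers, and write it back the same way. OR-ing the mask into the old value, rather than writing the mask alone, keeps the unrelated bits of the register untouched.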

However, from Nehalem onwards, Intel has removed the option of controlling prefetching from these MSR bits, so the BIOS seems to be the only way. I'm not sure how prefetching can be disabled if the BIOS does not provide an option :(






Keeping Your Ubuntu Up-to-date

To keep your Ubuntu automatically up-to-date with the latest upgrades, it's a nice idea to run apt-get from crontab.

This can be done as follows:


1. sudo crontab -e
and copy this line into the file:

30 22   *   *   *    /update-script


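2. sudo vim /update-script
and paste this into the file:

#!/bin/sh
# Append a dated record of every update run to /tmp/update.log
echo "***************************************" >> /tmp/update.log
date >> /tmp/update.log
echo "......................................." >> /tmp/update.log
apt-get update >> /tmp/update.log 2>&1
apt-get -y -q --force-yes upgrade >> /tmp/update.log 2>&1

Then make the script executable so that cron can run it:

sudo chmod +x /update-script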



The first step adds an entry to root's crontab, which runs a script called update-script every day at 10:30 PM.

The second step creates the update-script, which upgrades all the software on your system. The log of the updates is appended to /tmp/update.log.



Getting Started with Fiasco-UX on Ubuntu 10.04

Fiasco-UX is a port of Fiasco which can run as a user-mode application in Ubuntu 10.04. That is to say, it does not need to run in privileged mode. It is an ideal way to start working with L4/Fiasco. This post shows how to get it running on an Ubuntu machine with an Intel Core 2 Duo.

1. Download l4re-core from this site: http://os.inf.tu-dresden.de/download/snapshots-oc/, and untar it of course. I downloaded l4re-core-2012081219.

2. Run make setup and then make. Assuming you have a reasonably up-to-date Ubuntu 10.04 with gcc and the usual build tools, there should not be any problem here. However, the build takes an awfully long time, because a lot of other stuff that is not needed for our work gets compiled too.

3. In the directory obj/fiasco/ux, run the following command:
./fiasco -I ./irq0 -S ../../l4/x86/pkg/sigma0/server/src/OBJ-x86_586-l4f/sigma0 -R '../../l4/x86/pkg/moe/server/src/OBJ-x86_586-l4f/moe --init=rom/hello' -l ../../l4/x86/bin/x86_586/l4f/l4re -l ../../l4/x86/bin/x86_586/l4f/hello 
This should get your Fiasco-UX rolling.


All About the Cache Memory in Linux

For Linux users, a lot of information about the cache memory is available in
the directory '/sys/devices/system/cpu/cpuX/cache', where 'X' is the core number (takes values 0,1,2,...).

Here I discuss the entries present in this directory and its subdirectories.

1. indexN
    - There can be several directories of the form index0, index1, index2, and so on. Each directory represents one cache that is visible from the CPU core (cpuX).

2. indexN/coherency_line_size
    - Contains the cache line size in bytes (generally 64)

3. indexN/level
    - Contains the cache level (i.e. 1 for L1, 2 for L2, or 3 for L3)

4. indexN/number_of_sets
    - Contains the number of sets in the cache

5. indexN/shared_cpu_list
    - Contains the list of CPUs that share this cache memory

6. indexN/shared_cpu_map
    - Contains a bitmap of the CPUs that can access this cache memory

7. indexN/size
    - Contains the total size of the cache (for example 32K)

8. indexN/type
    - Can be Data, Instruction, or Unified

9. indexN/ways_of_associativity
    - Contains the number of ways (cache lines) in each set
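As a small illustration, the entries above can be read from a C program and cross-checked against each other, since the size of a cache is its number of sets times its ways times its line size. This is only a sketch: the function names are made up for this example, and the size formula assumes one line per tag (a physical_line_partition of 1, which is the usual case).

```c
#include <stdio.h>

/* Read a single integer attribute such as "level" or "number_of_sets"
 * from a cache directory. Returns -1 if the file cannot be read. */
long read_cache_attr(const char *dir, const char *name)
{
    char path[256];
    long value = -1;
    FILE *fp;

    snprintf(path, sizeof(path), "%s/%s", dir, name);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &value) != 1)
        value = -1;
    fclose(fp);
    return value;
}

/* Cache capacity in bytes: sets x ways x line size. */
long cache_size_bytes(long sets, long ways, long line_size)
{
    return sets * ways * line_size;
}

/* Print what the kernel reports for one cache and the size derived from it. */
void print_cache_info(const char *dir)
{
    long sets = read_cache_attr(dir, "number_of_sets");
    long ways = read_cache_attr(dir, "ways_of_associativity");
    long line = read_cache_attr(dir, "coherency_line_size");

    if (sets < 0 || ways < 0 || line < 0) {
        printf("%s: cache attributes not available\n", dir);
        return;
    }
    printf("%s: level %ld, %ld sets x %ld ways x %ld bytes = %ld bytes\n",
           dir, read_cache_attr(dir, "level"), sets, ways, line,
           cache_size_bytes(sets, ways, line));
}
```

Calling print_cache_info("/sys/devices/system/cpu/cpu0/cache/index0") prints the geometry of the first cache of core 0, and the computed value should agree with the number in indexN/size. A cache with 64 sets, 8 ways and 64-byte lines, for instance, works out to 32768 bytes, that is 32K.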
     

Get Your Program to Execute on a Specific CPU

Most systems these days have multiple cores.
In Linux you can determine how many cores are present from
the following file:

$ cat /proc/cpuinfo

Each core has an entry, and the entries are numbered starting from 0.
(On my quad core the cores are
processor 0, processor 1, processor 2, processor 3.)
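The same count can be obtained from a program by counting the lines of /proc/cpuinfo that start with "processor". The sketch below is one way to do it; the function names are my own, and it assumes the "processor : N" layout shown by cat on an x86 machine.

```c
#include <stdio.h>
#include <string.h>

/* Count the "processor" entries in a /proc/cpuinfo-style stream. */
int count_processors(FILE *fp)
{
    char line[1024];
    int at_line_start = 1;   /* some lines (e.g. flags) can exceed the buffer */
    int count = 0;

    while (fgets(line, sizeof(line), fp) != NULL) {
        if (at_line_start && strncmp(line, "processor", 9) == 0)
            count++;
        /* The next chunk starts a new line only if this one ended with '\n'. */
        at_line_start = (strchr(line, '\n') != NULL);
    }
    return count;
}

/* Open /proc/cpuinfo and count its entries; returns -1 if it cannot be opened. */
int count_processors_in_cpuinfo(void)
{
    FILE *fp = fopen("/proc/cpuinfo", "r");
    int count;

    if (fp == NULL)
        return -1;
    count = count_processors(fp);
    fclose(fp);
    return count;
}
```

On the quad core mentioned above, count_processors_in_cpuinfo() would be expected to return 4. Keeping the parsing in a function that takes a FILE * makes it easy to try on a saved copy of the file as well.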

When you execute a program, it can get scheduled onto any one of these
processors. If you want to specify which processor your program runs
on, use the sched_setaffinity call.

A simple example for using this call is shown below:



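#define _GNU_SOURCE   /* exposes sched_setaffinity() and the CPU_* macros */
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);      /* start with an empty set of processors */
    CPU_SET(1, &mask);    /* allow processor 1 only (bitmask 0x2) */
    if (sched_setaffinity(0, sizeof(mask), &mask) < 0) {   /* pid 0 = this process */
        perror("sched_setaffinity");
        return 1;
    }
    sleep(1000);          /* stay alive long enough to be watched in top */
    return 0;
}

_GNU_SOURCE has to be defined before 'sched.h' is included to get it to compile.

The mask tells the kernel which processors the program can execute on.
Each bit stands for one processor: CPU_SET(1, &mask) sets bit 1, which
corresponds to a mask of 0x2 (in binary: 0010) and means processor 1.
Similarly, CPU_SET(0, &mask) corresponds to a mask of 0x1 (0001) and means processor 0.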

You can check which cpu is executing your program by
running top and then pressing 1.
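The check can also be done from inside the program with sched_getcpu(), which returns the processor the caller is currently running on. The sketch below wraps the affinity calls in small helpers; the helper names are hypothetical, and picking the lowest allowed processor is just a convenient choice for the example.

```c
#define _GNU_SOURCE   /* for sched_setaffinity(), sched_getcpu(), CPU_* macros */
#include <sched.h>
#include <stdio.h>

/* Return the lowest-numbered CPU this process may run on, or -1 on error. */
int first_allowed_cpu(void)
{
    cpu_set_t set;
    int cpu;

    if (sched_getaffinity(0, sizeof(set), &set) < 0)
        return -1;
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}

/* Restrict the calling process to a single CPU. Returns 0 on success. */
int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Pin to the given CPU and report where the process now runs. */
int pin_and_report(int cpu)
{
    if (pin_to_cpu(cpu) < 0) {
        perror("sched_setaffinity");
        return -1;
    }
    printf("asked for cpu %d, running on cpu %d\n", cpu, sched_getcpu());
    return 0;
}
```

Calling pin_and_report(first_allowed_cpu()) from main should print the same processor number twice, because the kernel migrates the process to an allowed processor before sched_setaffinity returns.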







