Not A Paper - Research Snippets that were not published.

About my topic: Co-Processors in DBMS

My topic is how we can use co-processors such as graphics cards in a database management system. I'm going to discuss some background and the challenges of this approach.

The CPU is a programmable, universal processing unit in our computers. It can process images, filter large amounts of data by certain criteria, evaluate scientific formulas on different number representations (integer, and therefore fixed-point, as well as floating-point), analyse texts, and do everything else that is done with computers these days. Since the CPU's purpose is so generic, it cannot be efficiently optimized for any single task. At the same time, not every workload needs that generality: looking at the TPC-H benchmark for database systems, for example, there are no floating-point numbers to process.
In contrast, a co-processor is usually optimized for one purpose, or at least a smaller range of tasks. When I mention co-processors to somebody in their 30s or 40s, they usually think of the floating-point unit that supported CPUs with floating-point calculations before the Pentium processor. Nowadays the most popular co-processor is the GPU, which is optimized for image processing. Because its functionality is limited, a co-processor is usually much better at the problems it can solve. Sometimes we "misused" the co-processor to solve a task that it was not designed for, but which is very similar. In the case of the GPU, the industry has adapted, and now there are frameworks that allow us to use GPUs for general-purpose computations (hence the term GPGPU). However, GPUs can still solve only those problems (efficiently) that fit into their domain or are at least similar to it. Therefore it makes sense to use GPUs to encode/decode videos and to render scenes with Blender [1]. Sometimes it is not so easy to see why certain processes perform better on the GPU than on the CPU, e.g., password cracking. We'll get to that later. For now, one should keep in mind that a co-processor can only solve a subset of the problems a CPU can solve.
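The kind of problem that fits the GPU's domain is the data-parallel kind: the same simple operation applied independently to many elements, such as a selection predicate scanned over a database column. A minimal sketch of that pattern in Python (the function name, predicate, and values are illustrative; on a GPU, each element would be handled by its own thread):

```python
# Data-parallel selection: the same predicate is applied independently
# to every element. Expressed sequentially here; a GPU kernel would run
# one "iteration" per thread across thousands of threads.

def selection_scan(values, predicate):
    """Return the indices of all elements that satisfy the predicate."""
    return [i for i, v in enumerate(values) if predicate(v)]

# Example: a TPC-H-style filter "quantity < 24" over a column of integers.
quantities = [17, 36, 8, 24, 2, 28, 23]
matches = selection_scan(quantities, lambda q: q < 24)
print(matches)  # -> [0, 2, 4, 6]
```

Because no element depends on any other, the work can be split across as many processing units as are available, which is exactly what a GPU provides.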
We can differentiate classes of co-processors by how tightly they are integrated with the CPU.

  1. Co-processors such as most graphics cards today are connected via the PCI-Express bus. This means that data has to be transferred to the graphics card's memory, where the GPU can access it. Usually this transfer is the bottleneck [2]. FPGAs also usually work this way.
  2. The second class of co-processors is tightly integrated into the system bus. The GPU part of AMD's APUs fits into this category. It still operates on its own, but it shares memory with the CPU [3], so data can be transferred much faster.
  3. Modern CPUs contain a collection of co-processors of the third class: FPUs are now integrated and have special registers on the CPU, and cryptographic processors also fall into this category. There are instructions in the CPU's instruction set that use these co-processors. A software developer usually does not have to care, since the compiler or interpreter of their programming language automatically uses these instructions when possible.
When I talk about co-processors in my posts, I usually mean the first class. Most of the problems also apply to the second class. The third is a different story; I'll leave it to compiler designers. In fact, most of the time I'm writing about GPUs, and sometimes about Intel's Many Integrated Core architecture [4], because they share a lot of characteristics. However, I am certain that most thoughts can be applied to other and future co-processor architectures as well.
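For the first class, the PCI-Express transfer decides whether offloading pays off at all: the data must cross the bus before the co-processor can even start computing. A back-of-the-envelope model of that trade-off (the bandwidth and speedup figures below are illustrative assumptions, not measurements):

```python
def offload_pays_off(data_bytes, cpu_seconds, gpu_speedup, bus_bytes_per_s):
    """Compare CPU-only time against bus transfer plus faster GPU compute.

    Returns True if shipping the data to the co-processor and computing
    there is faster than staying on the CPU.
    """
    transfer = data_bytes / bus_bytes_per_s   # PCIe copy, host -> device
    gpu_compute = cpu_seconds / gpu_speedup   # assumed kernel speedup
    return transfer + gpu_compute < cpu_seconds

# 1 GiB of data, a 2 s CPU scan, a 10x faster GPU kernel, and roughly
# 8 GB/s of effective PCIe bandwidth (all assumed numbers):
print(offload_pays_off(2**30, 2.0, 10.0, 8e9))  # -> True
# The same speedup loses when the CPU task is already cheap:
print(offload_pays_off(2**30, 0.1, 10.0, 8e9))  # -> False
```

This is why a large kernel speedup alone is not a sufficient argument for using a first-class co-processor: the transfer cost grows with the data while the saved compute time may not.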
[1] OpenCL development can be found under the term "Cycles" at the moment.
[2] Chris Gregg, Kim Hazelwood: "Where is the data? Why you cannot debate CPU vs. GPU performance without the answer"
[3] Mayank Daga, Ashwin M. Aji and Wu-Chun Feng: "On the Efficacy of a Fused CPU+GPU Processor (or APU) for Parallel Computing"
Thanks to Alban Bridonneau for his report on the topic of co-processors.