Related work: GPUs for Processing Analytical Workload
The paper gives a nice and short overview over the current research related to GPU-assisted query processing. (I had the impression that everything related to MapReduce was just mentioned because it fills a lot of space) The author comes to the only possible conclusion after reading the few papers published: GPUs are fast at query execution, but the data transfer might be a problem. Therefore he proposes a general DBMS, that stores everything in the GPU's memory. He wants to port MonetDB [2], because Column Stores are better suited for GPUs (I agree). For a first shot, this system shall be read-only.
First I like to say, that the data transfer problem is always an understatement in recent research. In my experiments transfer usually takes longer than the actual processing on the GPU or CPU. The reasons are:
- a large amount of data is transferred but not processed (rows are filtered because one column does not fulfill the condition; the other columns of this row are untouched)
- often large result sets have to be transferred back
- in query execution data is in many cases processed faster than it is copied. A CPU evaluates a simple WHERE condition faster than the PCIe bus can transfer it.
However, there is a problem with this proposal. Although the amount of memory available on the GPU rises, it is much more expensive than ordinary RAM. Also, most applications for the GPU do not need very large amounts of memory. Therefore I don't think that the capacity will grow as fast as main memory's has in the last years. Enlarging the capacity by using more than one card is not a real solution because it again requires transfer, which we want to avoid.
But in my opinion a much bigger problem is the computing ability. Nobody said, that GPUs are able to execute every possible SQL statement - let alone faster or more efficient than the CPU. Some operators were implemented on the GPU, but never compared to a state-of-the-art DBMS. It looks as if it has never been tried, but I guess nobody was able to build a successful prototype. Researchers seldomly publish negativ results because it is hard to prove that something can't be done (better). But the absence of convincing results after a decade of research indicates that GPUs aren't perfect for the job.
After all, GPUs are only Coprocessors that assist the CPU. In my opinion, we should carfully pick parts of the DBMS that can profit from the GPUs architecture. Preferably number crunching processes, e.g., in the field of query optimization. Heimel and Markl for instance suggest selectivity estimation and kernel density estimation [3]. I did some work on exectuing certain types of queries on the GPU, which hopefully will be published soon. However, I'm sure, that a graphics card cannot be an efficient and general (SQL-)query processor today or in the near future.
[2]MonetDB
[3] Heimel, Max and Markl, Volker, "A First Step Towards GPU-assisted Query Optimization", ADMS 2012