XXHighEnd

Ultimate Audio Playback => Chatter and forum related stuff => Topic started by: AUDIODIDAKT on May 03, 2012, 01:35:45 pm



Title: OpenCL -- GPU acceleration
Post by: AUDIODIDAKT on May 03, 2012, 01:35:45 pm
Hey Peter,

Is it possible to use OpenCL (or CUDA)
in the preprocess to speed things up dramatically?

For example, the GPU (my NVIDIA 9800 GTX) can convert FLAC to WAV
2-3 times faster than my overclocked quad-core Q9550.

So use CPU and GPU together in the preprocess.

????

Roy


Title: Re: OpenCL -- GPU acceleration
Post by: PeterSt on May 06, 2012, 09:18:41 am
Hi Roy,

I guess it can. But it won't be as simple as just using the GPU, because this is all (explicitly) about multithreading, and everything is already set up in a multithreaded fashion. It is also not about converting one FLAC track, but about a whole album (or more). So, in the current setup one GPU core would need to be faster than one normal CPU core, and although I didn't look it up, I don't think that will be the case.
Using numerous threads/cores for one FLAC will make that conversion faster all right, but what about the other tracks of the album?

But I really don't know. So yes, I have thought about it for sure, but no, I haven't dived into it properly yet.
Btw, it seems that there's a working example out there somewhere (regarding your post). If you can point me to that (by email is better I think) I can start looking there ...

Thanks !
Peter


Title: Re: OpenCL -- GPU acceleration
Post by: CoenP on May 06, 2012, 09:59:06 am
Perhaps you are looking for this: http://www.cuetools.net/wiki/FLACCL ?

The CPU is easily outperformed (on single FLACs).

Regards, Coen


Title: Re: OpenCL -- GPU acceleration
Post by: PeterSt on May 06, 2012, 10:53:13 am
Thanks Coen. But that is exactly what I meant ...
Undoubtedly (but I didn't check) the conversion of one FLAC file is optimized to use all the GPU cores available, and I could have done the same with CPU cores and one file. But I did it the other way around and optimized it all for as many files in parallel as possible. This is how a 12-core (hyperthreaded) CPU converts 12 files in parallel, and any album of 12 tracks or fewer loads in 1-2 seconds. So ...
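
For readers who want to picture the "many files in parallel" approach, here is a minimal Python sketch. It is not XXHighEnd's code; `decode_track` and `convert_album` are hypothetical names, and the decoder is a placeholder that only derives the output file name. Threads are used just to keep the sketch short; a real CPU-bound decoder would more likely use separate processes or a native library.

```python
from concurrent.futures import ThreadPoolExecutor
import os

def decode_track(path):
    # Hypothetical stand-in for a real FLAC-to-WAV decoder: it only
    # derives the output file name so the sketch stays runnable.
    root, _ = os.path.splitext(path)
    return root + ".wav"

def convert_album(paths, workers=None):
    # One task per track; up to `workers` tracks are in flight at once.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the input paths.
        return list(pool.map(decode_track, paths))

if __name__ == "__main__":
    print(convert_album(["01.flac", "02.flac", "03.flac"], workers=3))
```

The point of the design is that each track stays single-threaded, and the speed-up comes from running as many tracks at once as there are cores.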

When one track is optimized to use all the GPU cores, those cores are no longer available to convert a next track in parallel. That one track may be 10 times faster all right, but for the whole album the net result wouldn't differ much (compared to an 8 or 12 core CPU).
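
The reasoning above can be put into rough numbers. The sketch below is a back-of-the-envelope model only: the function names are made up, the 10x figure is the guess from this post, and disk I/O and GPU transfer overhead are ignored.

```python
import math

def album_time_file_parallel(tracks, cores, t_track):
    # Each core converts one whole track; tracks beyond `cores`
    # have to wait for a further round.
    return math.ceil(tracks / cores) * t_track

def album_time_single_track_accel(tracks, speedup, t_track):
    # Tracks are converted one after another, each `speedup` times faster.
    return tracks * t_track / speedup

if __name__ == "__main__":
    # 12 tracks, one time unit per track on a single CPU core (assumed).
    print(album_time_file_parallel(12, 12, 1.0))       # 12 cores, one track each
    print(album_time_single_track_accel(12, 10, 1.0))  # one track at a time, 10x faster
```

Under these assumptions a 12-track album takes 1.0 time units on 12 cores and 1.2 units with a 10x faster single-track converter, so the two approaches end up close to each other.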

Maybe it turns out to be very different, but at least the theory (ok, mine) keeps me from really attempting it at this moment.
But the least I could do is try what happens with your given example (assuming my nVidia in there is the right one).

Ok, we'll see ...
Thanks,
Peter