XXHighEnd - The Ultra HighEnd Audio Player
Author Topic: OpenCL -- GPU acceleration  (Read 4036 times)
AUDIODIDAKT
Audio Addict
« on: May 03, 2012, 01:35:45 pm »

Hey Peter,

Is it possible to use OpenCL (or CUDA)
in the preprocess to speed things up dramatically?

For example,
it can convert FLAC to WAV (with my Nvidia 9800 GTX)
2-3 times faster than my overclocked quad-core Q9550.

So, could the CPU and GPU be used together in the preprocess?

Roy

(Sept 30, 2010)                                                
W7 Ultimate x64 Tweaked/60 GB SSD OCZ Vertex (1.50)/Gigabyte GA-EP45-EXTREME/Intel Q9550 2.83GHz/OCZ Reaper 2x2GB/
Esi Juli@ soundcard (KS)(x2v-v0_978)(Tweaked Coaxial)/Nvidia GeForce 9800 GTX+/750 Watt Zalman ZM-750-HP/100 MB Fiber-Optical Internet/
(XXHighEnd 0.9z-2)
#4Engine, Special Mode, 48 samples, SFS 12MB, DAP, Scheme=3, Q1=1, Q2/Q3/Q4/Q5=30,30,0,0, PlayerPrio=Low, ThreadPrio=Realtime
x-Allow Format Change, x-Stop Services, x-Copy to XX-drive by Standard, x-Start Engine3 During Conversion
PeterSt
Administrator
High Grade Audiophile
« Reply #1 on: May 06, 2012, 09:18:41 am »

Hi Roy,

I guess it can. But it won't be as simple as just using the GPU, because this is all (explicitly) about multithreading, and everything is already set up in a multithreaded fashion. It is not about converting one FLAC track, but about a whole album (or more). So, in the current setup, one GPU core would need to be faster than one normal CPU core, and although I didn't look it up, I don't think that will be the case.
Using numerous threads/cores for one FLAC will make that one convert faster all right, but what about the other tracks of the album ?
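
The per-file approach described here (one track per worker, many tracks at once) can be sketched roughly as follows. This is only an illustration in Python, not XXHighEnd's actual code: `decode_track` and `convert_album` are hypothetical names, and the decode itself is a placeholder that merely derives the output file name so the sketch stays runnable.

```python
from concurrent.futures import ThreadPoolExecutor
import os


def decode_track(path):
    """Hypothetical stand-in for a single-threaded FLAC-to-WAV decode.

    A real player would call its decoder here; this placeholder only
    derives the output file name.
    """
    root, _ext = os.path.splitext(path)
    return root + ".wav"


def convert_album(paths, workers=None):
    """Convert all tracks of an album, one track per worker."""
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns the results in the original track order.
        return list(pool.map(decode_track, paths))


# Assumed example: a 12-track album on 12 workers.
album = ["track%02d.flac" % n for n in range(1, 13)]
wav_files = convert_album(album, workers=12)
```

Note that in CPython a decoder written in pure Python would not run in parallel on threads; there one would swap in `ProcessPoolExecutor` or a native decoder that releases the interpreter lock.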

But I really don't know. So yes, I have thought about it for sure, but no, I haven't dug into it properly yet.
Btw, it seems that there's a working example out there somewhere (regarding your post). If you can point me to that (by email is better, I think) I can start looking there ...

Thanks !
Peter

For the Stealth III LPS PC :
W10-14393.0 - August 18, 2018 (2.10)
XXHighEnd Mach *III* Stealth LPS PC -> *Xeon Scalable 14/28 core* with Hyperthreading On (*set to 10/20 cores*) @~660MHz, 48GB, Windows 10 Pro 64 bit build 14393.0 from RAM, music on LAN / Engine#4 Adaptive Mode / Q1/-/3/4/5 = 30/-/1/1/1/ Q1Factor = 10 / Dev.Buffer = 4096 / ClockRes = 15ms / Memory = Straight Contiguous / Include Garbage Collect / SFS = *140.19*  (max 140.19) / not Invert / Phase Alignment Off / Playerprio = Low / ThreadPrio = Realtime / Scheme = Core 3-5 / Not Switch Processors during Playback = Off/ Playback Drive none (see OS from RAM) / UnAttended (Just Start) / Always Copy to XX Drive (see OS from RAM) / Stop Desktop, Remaining, WASAPI and W10 services / Use Remote Desktop / Keep LAN - Not Persist / WallPaper On / OSD Off (!) / Running Time Off / Minimize OS / XTweaks : Balanced Load = *35* / Nervous Rate = 10 / Cool when Idle = n.a / Provide Stable Power = 0 / Utilize Cores always = 1 / Time Performance Index = Optimal / Time Stability = Stable / *Arc Prediction Filtering (16x)* / Always Clear Proxy before Playback = On -> USB3 from MoBo -> *Lush^2 A: B-W & Y-R, B: B-W* USB 1m00 -> Phisolator 24/768 Phasure NOS1a/G3 75B (BNC Out) async USB DAC, Driver v1.0.4b (16ms) -> B'ASS Current Amplifier -> Blaxius Interlink -> Orelo MKII Active Open Baffle Horn Speakers.
Removed Switching Supplies from everywhere (also from the PC).

For a general PC :
W10-10586.0 - May 2016 (2.05+)
*XXHighEnd PC -> I7 3930k with Hyperthreading On (12 cores)* @~500MHz, 16GB, Windows 10 Pro 64 bit build 10586.0 from RAM, music on LAN / Engine#4 Adaptive Mode / Q1/-/3/4/5 = 14/-/1/1/1 / Q1Factor = 1 / Dev.Buffer = 4096 / ClockRes = 1ms / Memory = Straight Contiguous / Include Garbage Collect / SFS = 0.10  (max 60) / not Invert / Phase Alignment Off / Playerprio = Low / ThreadPrio = Realtime / Scheme = Core 3-5 / Not Switch Processors during Playback = Off/ Playback Drive none (see OS from RAM) / UnAttended (Just Start) / Always Copy to XX Drive (see OS from RAM) / All Services Off / Keep LAN - Not Persist / WallPaper On / OSD On / Running Time Off / Minimize OS / XTweaks : Balanced Load = *43* / Nervous Rate = 1 / Cool when Idle = 1 / Provide Stable Power = 1 / Utilize Cores always = 1 / Time Performance Index = *Optimal* / Time Stability = *Stable* / Custom Filter *Low* 705600 / -> USB3 *from MoBo* -> Clairixa USB 15cm -> Intona Isolator -> Clairixa USB 1m80 -> 24/768 Phasure NOS1a 75B (BNC Out) async USB DAC, Driver v1.0.4b (4ms) -> Blaxius BNC interlink *-> B'ASS Current Amplifier /w Level4 -> Blaxius Interlink* -> Orelo MKII Active Open Baffle Horn Speakers.
Removed Switching Supplies from everywhere.

CoenP
Audio Addict
« Reply #2 on: May 06, 2012, 09:59:06 am »

Perhaps you are looking for this: http://www.cuetools.net/wiki/FLACCL ?

The CPU is easily outperformed (on single FLACs).

Regards, Coen

Audio PC: XXHE PC v1 with RAMdisk w.o. videocard and 1 of 2 cpu fans + BRIX/USB3 storage musicserver. Power cable PE not connected, together with nos1 and poweramp in separate "audio" powerstrip.

Lush 1m, Phasure NOS1a-75B G3 USB (buf 16 ms)-> Blaxius ->SE EL95 (0,8W triode) + cheap link to Abaqus plateamps> biwired QED cable-> Bastanis Mandala Duo (upgraded).

[other sources: TD124/3009SII-i/Grace F9/lounge LCR phono; Rega Planet 1997 vintage]
PeterSt
Administrator
High Grade Audiophile
« Reply #3 on: May 06, 2012, 10:53:13 am »

Thanks Coen. But that is exactly what I meant ...
Undoubtedly (but I didn't check) the conversion of one FLAC file is optimized to use all the available GPU cores, and I could have done the same with CPU cores and one file. But I did it the other way around and optimized everything for as many files in parallel as possible. This is how a 12-core (hyperthreaded) CPU converts 12 files in parallel, and any album of 12 tracks or fewer loads in 1-2 seconds. So ...

When one track is optimized to use all the GPU cores, those cores are no longer available to convert the next track in parallel; that one track may be 10 times faster all right, but the net result wouldn't differ much (for an 8 or 12 core CPU).
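
To put rough numbers on that reasoning, here is a back-of-the-envelope model in Python. Everything in it is an assumption for illustration: the function names are hypothetical, 1.5 seconds per track on one CPU core is simply picked from the "1-2 seconds" above, the GPU is assumed to handle one track at a time at 10 times the speed of a CPU core, and transfer overhead is ignored.

```python
import math


def cpu_album_time(tracks, cores, secs_per_track):
    """One track per core; tracks run in batches of `cores`."""
    return math.ceil(tracks / cores) * secs_per_track


def gpu_album_time(tracks, speedup, secs_per_track):
    """One track at a time, each `speedup` times faster than a CPU core."""
    return tracks * secs_per_track / speedup


# Assumed figures: 12 tracks, 1.5 s per track on a single CPU core.
cpu = cpu_album_time(tracks=12, cores=12, secs_per_track=1.5)
gpu = gpu_album_time(tracks=12, speedup=10, secs_per_track=1.5)
print("12-core CPU: %.1f s, GPU: %.1f s" % (cpu, gpu))
```

Under these assumptions the 12-core CPU needs about 1.5 s for the album and the GPU about 1.8 s, while an 8-core CPU would need two batches (about 3 s). The outcome depends entirely on the assumed speedup and core count, so it only shows why the net difference may be small, not what a real measurement would give.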

Maybe it turns out to be very different, but at least the theory (OK, mine) keeps me from really attempting it at this moment.
But the least I could do is try what happens with your given example (assuming the nVidia card I have in there is the right one).

Ok, we'll see ...
Thanks,
Peter

