The latest nightlies contain experimental dxva2 hw decoding through libavcodec
Only h264/h265 **8** bits are supported
It is only partial, i.e. decoding is done on the video card but data are copied back to main memory for display, which is slow
As a result, seeking is way slower than with VDPAU or LIBVA, where everything happens on the video card
Preferences=> HW decoding => DXVA2
If your PC is fast enough, and the resolution is low, it might not be faster than doing it in software
For 2k H265, it is ~ 1.5 times faster than doing it in software
2nd part (still experimental) is also done : D3D/DXVA display driver
It still has issues: refresh is not working correctly, it does not support cards with restrictions on surface size ...
And it is not very fast
It is due to the fact that decoded video frames are copied back and forth between video card and main memory
The next step is to bridge them, like vdpau does, so the video stays as long as possible in the video card memory
That will speed up seeking a lot and, potentially later on, DirectX resize
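To get a feel for what the copy between video card and main memory costs, here is a rough back-of-the-envelope sketch. It assumes 8-bit 4:2:0 frames (1.5 bytes per pixel, as in NV12 or YV12); the resolution and frame rate are illustrative values, not measurements from Avidemux, and the function names are made up for this example.

```python
def frame_bytes(width: int, height: int) -> int:
    """Size of one 8-bit 4:2:0 frame: full-res luma plus two quarter-res chroma planes."""
    luma = width * height
    chroma = 2 * (width // 2) * (height // 2)
    return luma + chroma


def copy_rate_mb_per_s(width: int, height: int, fps: float, copies: int = 1) -> float:
    """Data moved per second between video card and main memory, in MB/s (10^6 bytes)."""
    return frame_bytes(width, height) * fps * copies / 1e6


# Assumed example: 1920x1080 at 25 fps.
# copies=1: decoded frame read back once; copies=2: read back, then uploaded again for display.
one_way = copy_rate_mb_per_s(1920, 1080, 25, copies=1)
round_trip = copy_rate_mb_per_s(1920, 1080, 25, copies=2)
print(f"{one_way:.1f} MB/s one way, {round_trip:.1f} MB/s round trip")
```

The raw byte count is only part of the cost (read-back from video memory tends to stall the pipeline as well), so treat this as a lower bound on the overhead rather than a prediction.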
The naming is a bit confusing, it should be D3D
But to keep it the same as decoding (i.e. dxva2) it is set to DXVA2 too
Zooming is done in hw when it supports it
Rebuild in progress
I don't know much about this but would it be good to have Avidemux look at the loaded video resolution and video card memory size then set this feature on or off accordingly? I don't know if that would keep this feature off if it's going to be slower with it on.
Perhaps on the Display tab, it would be worthwhile to include a simple benchmark Test button?
Dan.
@ mean,
Dxva2, init: I'm submitting a few PRs, step by step...
NB: Please, leave debug output on for the time being...
Normally, it will try first to do hw accel path, and fallback to software if it fails
If the failure happens later, you might need to disable it manually (and restart avidemux)
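The "try hw accel first, fall back to software" behaviour can be sketched roughly as below. This is not Avidemux code: `open_decoder`, `init_dxva2` and `init_software` are hypothetical names, and the hardware init is a stand-in that always fails so the fallback path is visible.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical decoder factories: each returns a decoder object or raises on failure.
DecoderFactory = Callable[[], object]


def open_decoder(candidates: Sequence[Tuple[str, DecoderFactory]]) -> Tuple[str, object]:
    """Try each (name, factory) in order and return the first one that initialises."""
    errors = []
    for name, factory in candidates:
        try:
            return name, factory()
        except Exception as exc:  # init failed: remember why and try the next path
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no decoder could be initialised: " + "; ".join(errors))


def init_dxva2() -> object:
    # Stand-in for a hardware init that fails (e.g. unsupported codec or card).
    raise OSError("DXVA2 not available")


def init_software() -> object:
    # Stand-in for the software decoder, assumed to always be available.
    return object()


name, decoder = open_decoder([("dxva2", init_dxva2), ("software", init_software)])
print("using", name)
```

Note that this only covers failures at init time. A failure that shows up later, once frames are already flowing, is not caught by such a scheme, which matches the advice above to disable the option manually in that case.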
The build in progress is mostly ok, except :
* refresh is not working correctly. If you get a black frame, go left/right to force a new display
* The bridge is not working, there is an extra copy
Meanwhile, i've spotted that the audio part is consuming a lot of CPU on windows
i.e.
playing a 720p h264 video with all DXVA/DXVA, no sound => 4% cpu
playing a 720p h264 video with all DXVA/DXVA, with sound => 30% cpu => ????
Stupid mistake: us vs ms (microseconds vs milliseconds)
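The thread does not say where exactly the units were mixed up, so the following is only an illustration of why this kind of mistake is expensive. It assumes a polling loop that was meant to sleep 10 ms per pass but passes the same number to a microsecond-based sleep; the 10 ms figure and all names are made up.

```python
US_PER_MS = 1_000
US_PER_SECOND = 1_000_000


def polls_per_second(sleep_us: int) -> int:
    """How often per second a loop wakes up if it sleeps sleep_us microseconds per pass."""
    return US_PER_SECOND // sleep_us


wait_ms = 10                                      # assumed intent: poll every 10 ms
correct = polls_per_second(wait_ms * US_PER_MS)   # value converted to microseconds first
buggy = polls_per_second(wait_ms)                 # same number passed unconverted

print(correct, buggy, buggy // correct)
```

A loop that wakes up a thousand times more often than intended is close to a busy loop, which would be consistent with the jump from 4% to 30% CPU reported above.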
Now with dxva2 + fixed sound, the cpu consumption is 5% playing a small h264 mkv instead of 30%
Much better
Win32 available, win64 in progress
With CPU consumption fixed, some figures :
All done with 2k H265 video on a Core i5. It is a simple-to-decode video.
* Software display, software decoding : ~ 18% cpu
* Dxva2 display , Dxva2 decoding : ~ 4% cpu
Not bad :)
Thank You for all of the hard work you put in.
Quote from: mean on November 14, 2016, 07:53:20 PM
* Software display, software decoding : ~ 18% cpu
* Dxva2 display , Dxva2 decoding : ~ 4% cpu
These values are amazing. They are similar to the CPU load while playing a 720p h264 video in mpv with VDPAU on Linux, whereas playing the same video with VDPAU in Avidemux gives ~30% CPU load on my hardware. Does this happen because Avidemux copies decoded images back and forth from the graphics card even if none of the post-processing options or filters is enabled? Is keeping all the data for decoding and displaying with VDPAU in the graphics card memory off-limits?
Thank you for your hard work on Avidemux too.
Normally no, vdpau keeps the video on the video card as long as possible (which dxva does not, i think i'll need to go D3D11 to do that)
It could be something else.
For example it was really bad on windows, due to the audio plugin that was gobbling all the resources it could.
Try to play without sound to see if the cpu consumption goes down
(i.e. remove the audio track)
The 5.1 => dolby filter is very demanding cpu wise
Would suffice to just disable ac3 in Avidemux menu: Audio Select track. [ ] for ac3 track
Playing with AC3 one thread is showing 30%, without ac3 selected it would be around 11%
AC3 is indeed heavy on CPU with no dedicated hardware decoding it.
(just back, still need to catch up a lot of other stuff)
Just did a quick test with full HD H264 video + libva on a core i7
Dolby + pulse audio => 23% CPU
Stereo + pulse audio => 12 % cpu
stereo + dummy audio => 6 % cpu
Points to an audio problem (difference between 2nd and 3rd should be very very small)
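The reasoning behind that conclusion is simply the difference between successive measurements. A small sketch using the three figures quoted above (the dictionary keys are labels chosen for this example):

```python
# CPU figures from the post above (percent), keyed by (audio filter, audio device).
measurements = {
    ("dolby", "pulse"): 23,
    ("stereo", "pulse"): 12,
    ("stereo", "dummy"): 6,
}

# Cost attributed to the 5.1 => dolby filter: same device, different filter.
downmix_cost = measurements[("dolby", "pulse")] - measurements[("stereo", "pulse")]
# Cost attributed to the audio output: same filter, real device vs dummy device.
output_cost = measurements[("stereo", "pulse")] - measurements[("stereo", "dummy")]

print(f"Dolby filter: {downmix_cost} points, PulseAudio output: {output_cost} points")
```

The second delta is the suspicious one: pushing stereo samples to the audio device costs as much CPU as everything else in the stereo + dummy case combined.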
Downmixing setting related? Disabling downmixing improves things here
Quote from: mean on November 15, 2016, 10:22:44 AM
Just did a quick test with full HD H264 video + libva on a core i7
Dolby + pulse audio => 23% CPU
Stereo + pulse audio => 12 % cpu
stereo + dummy audio => 6 % cpu
Points to an audio problem (difference between 2nd and 3rd should be very very small)
Not on my hardware (AMD CPU + NVIDIA graphics card) with VDPAU. The CPU load is the same no matter which audio device is selected, with and without downmixing.
and with libva ?
(you need the libva / vdpau wrapper installed)
Done: the same CPU load (~28% – ~30%) for 720p h264 25fps videos. Both decoder and display driver are set to LIBVA, which requires disabling VDPAU in the preference settings to prevent it from taking over.
Could it be the window manager messing with performance again?
Hard to tell: libva seems to spread the load more evenly over the threads.
Website is very slow to respond.
"Service Temporarily Unavailable
The server is temporarily unable to service your request due to maintenance downtime or capacity problems. Please try again later."
Under attack?
Quote from: mean on November 15, 2016, 01:45:51 PM
Could it be the window manager messing with performance again?
Unlikely, because turning hwaccel off adds only ~10% to the CPU load (it grows to ~40%) and dedicated media players like MPlayer and mpv consume only ~5% of CPU when playing 720p h264 videos (in a window, not fullscreen) via VDPAU in the same gnome-shell setup.
Redraw problem should be fixed
Just the mm patch to look at and it is ready for testing
Quote from: mean on November 16, 2016, 06:49:24 PM
Just the mm patch to look at and it is ready for testing
After the current PR, I'll do at least 1 more patch wrt Dxva init...
For the time being, init succeeds but then decoding doesn't seem to succeed on my computer:
Quote
Avidemux v2.6.14 (161115_eb280212ed3) .
Operating System: Microsoft Windows Vista Home Premium Service Pack 2 (6.0.6002; 32-bit)
[uncompress] [DXVA] --No picture
It's maybe normal
There is a delay of a few frames at the beginning
Quote from: mean on November 17, 2016, 04:41:10 PM
It's maybe normal
There is a delay of a few frames at the beginning
Agreed: on second look, decoder seems to (then) succeed.
Quote
Surface to admImage = 095232C0
Retrieving image pitch=768 width=720 height=576
Align 576,16 => 576
Paint event
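The `Align 576,16 => 576` line in the log above looks like a height being rounded up to a multiple of 16. Assuming that reading is right, the usual way to do it is shown below; `align_up` is a name chosen for this sketch, not the function Avidemux uses.

```python
def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment (alignment must be a power of two)."""
    return (value + alignment - 1) & ~(alignment - 1)


print(align_up(576, 16))    # 576 is already a multiple of 16, so it is unchanged
print(align_up(1080, 16))   # 1080 is not, so it is rounded up to the next multiple
```

Since 576 is already 36 x 16, the log shows no change. A 1080-line video would come out as 1088 under the same rule.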
Yet, while Audio was fine, Video seemed as if (very) slowed down, and cpu may have been (much) higher than without Dxva.
But I'll try it again, with a newer build...
Decoder and Renderer can be used independently, can't they?
(Maybe we could add a few (raw) statistics: decoded frames, displayed frames, ...?)
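Such raw statistics could be as simple as two counters. The sketch below is purely hypothetical (the `PlaybackStats` class and the simulated loop are made up to show the idea) and does not reflect anything that exists in Avidemux.

```python
from dataclasses import dataclass


@dataclass
class PlaybackStats:
    """Hypothetical raw counters for diagnosing slow playback."""
    decoded_frames: int = 0
    displayed_frames: int = 0

    def on_decoded(self) -> None:
        self.decoded_frames += 1

    def on_displayed(self) -> None:
        self.displayed_frames += 1

    @property
    def dropped_frames(self) -> int:
        # Frames that were decoded but never reached the screen.
        return self.decoded_frames - self.displayed_frames


stats = PlaybackStats()
for frame_index in range(100):
    stats.on_decoded()
    if frame_index % 4 != 3:      # pretend the renderer skips every fourth frame
        stats.on_displayed()

print(stats)
print("dropped:", stats.dropped_frames)
```

Comparing the two counters over a fixed playback time would show whether the decoder or the renderer is the side that cannot keep up.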
With the current D3D9 they are always independent, with an extra copy as a consequence
And what about NVRESIZE? Is it supported?
on linux , with vdpau