Aforge only working for certain videos, when extracting frames as bitmaps

Guitarmonster

I am designing a simple C# application that uses the AForge.NET library, and I already have all of the required FFDSHOW files in my build folder. The application scans a video and extracts each frame as a Bitmap object, then samples several pixels with GetPixel and analyzes their color for brightness. It detects drops in brightness where the video fades to black, which is usually an indicator of an upcoming commercial break.
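The fade detection described above could be sketched roughly like this. It is only an illustration: `FadeDetector`, `FindDarkRuns`, the 0.05 threshold and the three-frame minimum are made-up names and values, and the input is assumed to be one brightness value per analyzed frame in the 0 to 1 range that `Color.GetBrightness()` returns.

```csharp
using System;
using System.Collections.Generic;

public static class FadeDetector
{
    // Returns the index of the first frame of every run of at least
    // minRunLength consecutive frames darker than darkThreshold.
    // Both defaults are illustrative assumptions, not values from the post.
    public static List<int> FindDarkRuns(IReadOnlyList<float> brightness,
                                         float darkThreshold = 0.05f,
                                         int minRunLength = 3)
    {
        var starts = new List<int>();
        int runStart = -1; // -1 means "not currently inside a dark run"

        // Iterate one step past the end so a run that reaches the last
        // frame is still closed and recorded.
        for (int i = 0; i <= brightness.Count; i++)
        {
            bool dark = i < brightness.Count && brightness[i] < darkThreshold;

            if (dark)
            {
                if (runStart < 0) runStart = i;
            }
            else if (runStart >= 0)
            {
                if (i - runStart >= minRunLength) starts.Add(runStart);
                runStart = -1;
            }
        }
        return starts;
    }
}
```

Requiring a minimum run length is one way to avoid flagging a single dark frame inside a scene; whether three frames is a sensible minimum depends on the material.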

In short, my app needs to be able to load a video and extract one or several frames as a Bitmap object. So far it works well, except it won't work for any H.264 videos. I started noticing that MKV files didn't get me anything other than basic stream data, no frames. I tried MP4 and it gave me the same result, which tells me it's an H.264 issue and not a container issue.

I don't think this is a code issue, since my code is nearly identical to the sample code released by AForge.NET; it looks more like an FFDSHOW configuration problem. Unfortunately, AForge.NET is no longer supported and its forums are locked, so I have to ask here.

I am looking for a way to make Aforge work for me, or an alternative to what I am trying to do.

This is the code in its most basic form:
C#:
// create instance of video reader
VideoFileReader reader = new VideoFileReader( );
// open video file
reader.Open( "test.avi" );
// check some of its attributes
Console.WriteLine( "width:  " + reader.Width );
Console.WriteLine( "height: " + reader.Height );
Console.WriteLine( "fps:    " + reader.FrameRate );
Console.WriteLine( "codec:  " + reader.CodecName );
// read 100 video frames out of it
for ( int i = 0; i < 100; i++ )
{
    Bitmap videoFrame = reader.ReadVideoFrame( );
    // process the frame somehow
    // ...

    // dispose the frame when it is no longer required
    videoFrame.Dispose( );
}
reader.Close( );

When it runs, it DOES extract the stream information including width, height, and codec name. After that there are zero frames returned. I have been searching the net like crazy and cannot seem to find anyone else with this issue, and I'm seeing a lot of coders using Aforge with FFDSHOW. So I am fairly sure this is a codec configuration issue of some sort. Any direction as to what settings to check, or for a better alternative would be greatly appreciated.
 
OK!!!

I just discovered something strange when I found another thread online talking about this issue. Firstly, AForge.NET has been absorbed by Accord, which I will be upgrading to. What I found is that although the reported frame count is zero for H.264, the frames are still there.

What I was doing was first pulling the number of frames using this code (the reader object is the VideoFileReader from the code in my first post):
C#:
long totalFrames = reader.FrameCount - 1;

The next snippet of code iterated through each frame with a for loop, using totalFrames as the limit. The problem is that with H.264 the frame count came back as zero, so the loop body never ran.

I temporarily changed totalFrames to 5000 just to see what happens and it returned 5000 frames successfully and detected the first commercial break as intended.

The solution I found online was to run an endless loop and break out of it when the frame returned is null, indicating the end of the video file.
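A loop of that shape might look like the sketch below. `FrameLoop` and `ReadUntilNull` are hypothetical helper names; the frame source is passed in as a delegate so the loop itself does not depend on a particular reader class, and it assumes, as described above, that the reader returns null at the end of the file.

```csharp
using System;

public static class FrameLoop
{
    // Calls readNextFrame until it returns null, hands each frame and its
    // zero-based index to processFrame, disposes the frame, and returns the
    // number of frames that were read.
    public static long ReadUntilNull<TFrame>(Func<TFrame> readNextFrame,
                                             Action<TFrame, long> processFrame)
        where TFrame : class, IDisposable
    {
        long index = 0;
        while (true)
        {
            TFrame frame = readNextFrame();
            if (frame == null)
                break; // assumed end-of-file signal

            using (frame) // dispose even if processFrame throws
            {
                processFrame(frame, index);
            }
            index++;
        }
        return index;
    }
}
```

With the reader from the first post it could be called as `FrameLoop.ReadUntilNull<Bitmap>(() => reader.ReadVideoFrame(), (frame, i) => { /* pixel analysis */ });`, and the return value gives the real frame count that FrameCount did not report.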

Although this issue is generally solved, I am definitely looking for a newer and possibly faster alternative to what I am doing here. I know I read something about extracting thumbnails using C#; that would work fine, as I don't need high-resolution images to test them for brightness.

The video scanning is being done during off times and the bookmarks added to the video database for future use. Once a video has been analyzed, its values are permanently recorded and there is no need to scan them again. So even though it's not the fastest due to FFDSHOW having to decode each frame, it's not too bad seeing that the processing will take place in the background.

However, I am absolutely open to something newer and faster if it exists. If anyone has any suggestions feel free to let me know.
 
Since you know the frame rate, you can sample just one frame out of every (frameRate / 2) and test those frames for brightness. E.g., if it's 24 FPS, you can get frames 12, 24, 36, etc.
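As a small sketch of that arithmetic: `FrameSampling`, `SampleIndices` and the `frameLimit` parameter are hypothetical names, and the frame rate is taken as a plain double, which may not match the type the reader actually exposes.

```csharp
using System;
using System.Collections.Generic;

public static class FrameSampling
{
    // Yields the frame indices to inspect when only samplesPerSecond frames
    // per second of video are wanted (2 per second by default, i.e. every
    // frameRate / 2 frames). frameLimit is an exclusive upper bound that the
    // caller has to supply.
    public static IEnumerable<long> SampleIndices(double frameRate,
                                                  long frameLimit,
                                                  double samplesPerSecond = 2.0)
    {
        // Round to the nearest whole frame and never step by less than one.
        long step = Math.Max(1, (long)Math.Round(frameRate / samplesPerSecond));

        for (long index = step; index < frameLimit; index += step)
            yield return index;
    }
}
```

For a 24 FPS file this steps by 12 frames, matching the 12, 24, 36 example. Since the reported frame count can be zero for H.264, the upper bound would have to come from somewhere else, for example a first pass that counts frames.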
 
Also, GetPixel() is slow if you are trying to read each pixel of a frame. You can get the raw bytes of the bitmap and read those very quickly. You can even apply some kind of transform to them to quickly compute the average brightness.
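A minimal sketch of the raw-bytes approach, using `Bitmap.LockBits` and `Marshal.Copy` from the .NET Framework. `FrameBrightness` and `AverageBrightness` are hypothetical names, and the brightness here is a plain mean of the three color channels, which is an assumption and not the same formula as `Color.GetBrightness()`.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

public static class FrameBrightness
{
    // Mean brightness in [0, 1] of a 24-bpp buffer laid out row by row.
    // stride is the number of bytes per row, including any padding.
    public static double AverageBrightness(byte[] pixels, int width, int height, int stride)
    {
        long sum = 0;
        for (int y = 0; y < height; y++)
        {
            int rowStart = y * stride;
            for (int x = 0; x < width; x++)
            {
                int p = rowStart + x * 3; // 3 bytes per pixel: B, G, R
                sum += pixels[p] + pixels[p + 1] + pixels[p + 2];
            }
        }
        return sum / (3.0 * 255.0 * width * height);
    }

    // Copies the frame's pixels into a managed array once, then analyzes it.
    public static double AverageBrightness(Bitmap frame)
    {
        var rect = new Rectangle(0, 0, frame.Width, frame.Height);
        BitmapData data = frame.LockBits(rect, ImageLockMode.ReadOnly,
                                         PixelFormat.Format24bppRgb);
        try
        {
            // This sketch only handles top-down bitmaps (positive stride).
            if (data.Stride < 0)
                throw new NotSupportedException("Bottom-up bitmaps are not handled.");

            byte[] pixels = new byte[data.Stride * data.Height];
            Marshal.Copy(data.Scan0, pixels, 0, pixels.Length);
            return AverageBrightness(pixels, data.Width, data.Height, data.Stride);
        }
        finally
        {
            frame.UnlockBits(data); // always release the lock
        }
    }
}
```

The byte-array overload does the arithmetic on its own, so it can be checked with a hand-built buffer without decoding any video.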
 
I assumed that and I am open to doing it differently. What I have now is I establish a grid where pixels are sampled instead of testing every pixel.

In the following code, RowSpacing is set by dividing the frame height by a fixed number; so far I'm using 10. So in effect it tests a grid of about 10x10 pixels across the frame. This is sufficient to check the entire screen.

C#:
// Iterate through each column using the set column spacing
for (int col = ColSpacing; col < reader.Width - ColSpacing + 1; col += ColSpacing)
{
    // Iterate through each row using the set row spacing
    for (int row = RowSpacing; row < reader.Height - RowSpacing + 1; row += RowSpacing)
    {
        analyzedPixels.Add(videoFrame.GetPixel(col, row).GetBrightness());
    }
}

This is still pretty slow though and I want to do something faster. I want to try what you are suggesting but am unsure how to do it. Can you share a code example with me? Below is a stripped down and simplified version of what I am working with. I would love to try your suggestion.

C#:
for (int i = 0; i < totalFrames; i++)
{

    // Converting the frame to a bitmap
    Bitmap videoFrame = reader.ReadVideoFrame();

    // Pixel analysis code here   
    
    // dispose the frame when it is no longer required
    videoFrame.Dispose();

}
reader.Close();
 
Also, just FYI, I removed AForge and installed Accord.NET, which is basically the same but currently supported. So I'm no longer working with outdated code.
 
And this API lets you get the bitmap data of a specific frame number:

So you could do the following pseudocode:
C#:
frameNumber = 0
while(true)
{
    videoFileReader.ReadVideoFrame(frameNumber, bitmapData);

    if no bitmap data received
        break

    // analyze bitmapData

    frameNumber += frameRate / 2;
}

Sorry, the documentation doesn't say how it signals that there are no more frames, as you described in post #2.
 
I tried this simple code which only returned empty BitmapData objects:
C#:
BitmapData videoFrameData = new BitmapData();

reader.ReadVideoFrame(videoFrameData);

// FYI, I also tried:
reader.ReadVideoFrame(1000, videoFrameData);

You're right, there isn't much information on pulling frames using this method; I'm still searching. I'm assuming that bitmapData in your pseudocode is System.Drawing.Imaging.BitmapData, so our code is similar, but it's not pulling anything.
 
So all the values inside the videoFrameData are zeroes or nulls after calling ReadVideoFrame()?
 
Yeah, nothing. The object's values remain the same before and after. Also, it seems that the documentation refers to an output parameter scheme I'm not familiar with. From what I've seen, C# uses the out keyword for output parameters, like so:

C#:
// Correct usage of the out keyword for output parameters.
int x;
sampleObject.ReturnData(out x);

// Code sample from Accord; this throws the error "the name 'output' does not exist in the current context":
BitmapData videoFrameData = new BitmapData();
reader.ReadVideoFrame(1000, videoFrameData output);

// This code, although it uses the proper out keyword, returns the error "Argument 1 may not be passed with the 'out' keyword":

reader.ReadVideoFrame(1000, out videoFrameData);
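Judging by those two errors, the `output` in the documentation's signature is most likely just the parameter's name, not a keyword, so nothing extra is written at the call site. The difference between the two styles can be sketched as follows; `FrameInfo`, `ParameterDemo`, `GetWidth` and `FillInfo` are hypothetical names used only for illustration.

```csharp
using System;

// A stand-in for any reference type that a method fills in for the caller.
public sealed class FrameInfo
{
    public int Width;
}

public static class ParameterDemo
{
    // An 'out' parameter: the method assigns the caller's variable, and the
    // caller must write 'out' at the call site.
    public static void GetWidth(out int width)
    {
        width = 640;
    }

    // An ordinary parameter that happens to be named 'output': the caller
    // creates the object and passes it without any keyword; the method
    // writes into the object it was given.
    public static void FillInfo(FrameInfo output)
    {
        output.Width = 640;
    }
}
```

The first would be called as `ParameterDemo.GetWidth(out w);` and the second as `ParameterDemo.FillInfo(info);` with an `info` object created beforehand.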
 
Where did you download Accord from?
 
Well, let's take a look at the source code.
The entry point is here:

which then calls readVideoFrame():

which then calls DecodeVideoFrame():
DecodeVideoFrame():
// Decodes video frame into managed Bitmap
Bitmap^ VideoFileReader::DecodeVideoFrame(BitmapData^ bitmapData)
{
    Bitmap^ bitmap = nullptr;

    if (bitmapData == nullptr)
    {
        // create a new Bitmap with format 24-bpp RGB
        bitmap = gcnew Bitmap(data->VideoCodecContext->width, data->VideoCodecContext->height, PixelFormat::Format24bppRgb);

        // lock the bitmap
        bitmapData = bitmap->LockBits(
            System::Drawing::Rectangle(0, 0, data->VideoCodecContext->width, data->VideoCodecContext->height),
            ImageLockMode::WriteOnly, PixelFormat::Format24bppRgb);
    }

    uint8_t* srcData[4] = { static_cast<uint8_t*>(static_cast<void*>(bitmapData->Scan0)),
                           nullptr, nullptr, nullptr };
    int srcLinesize[4] = { bitmapData->Stride, 0, 0, 0 };

    // convert video frame to the RGB bitmap
    sws_scale(data->sws_ctx, data->VideoFrame->data, data->VideoFrame->linesize, 0,
              data->VideoCodecContext->height, srcData, srcLinesize);

    if (bitmap != nullptr)
        bitmap->UnlockBits(bitmapData); // unlock only if we have created the bitmap ourselves
    return bitmap;
}

Looking at that code, if a bitmapData is passed in, it assumes that the Scan0 member is already set and points to the beginning of the bitmap data. So in my view that entry point is pretty much pointless if you want to get a fresh frame. Looks like a design bug.

So it looks like you'll need to call the entry point which only takes the frame number and you'll need to call LockBits()/UnlockBits() yourself.
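Something along these lines might work as a starting point. `IndexedFrameReader`, `TryAnalyzeFrame` and `SampleGrid` are hypothetical names, and the reader call is passed in as a delegate (for example `i => reader.ReadVideoFrame(i)`), on the assumption that the installed Accord version exposes an overload that takes only the frame index and returns a Bitmap. Treating a null result as "no more frames" is also an assumption.

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

public static class IndexedFrameReader
{
    // Reads one frame by index, locks it, samples a grid of pixels straight
    // from the locked memory, and unlocks it again.
    // Returns false if the reader hands back null.
    public static bool TryAnalyzeFrame(Func<int, Bitmap> readFrameAt, int frameIndex,
                                       int gridSize, out double brightness)
    {
        brightness = 0.0;
        using (Bitmap frame = readFrameAt(frameIndex))
        {
            if (frame == null)
                return false;

            var rect = new Rectangle(0, 0, frame.Width, frame.Height);
            BitmapData data = frame.LockBits(rect, ImageLockMode.ReadOnly,
                                             PixelFormat.Format24bppRgb);
            try
            {
                brightness = SampleGrid(data.Scan0, data.Stride,
                                        data.Width, data.Height, gridSize);
            }
            finally
            {
                frame.UnlockBits(data);
            }
            return true;
        }
    }

    // Mean channel value in [0, 1] over roughly gridSize x gridSize pixels
    // of a 24-bpp image that starts at scan0.
    public static double SampleGrid(IntPtr scan0, int stride, int width, int height, int gridSize)
    {
        int colSpacing = Math.Max(1, width / gridSize);
        int rowSpacing = Math.Max(1, height / gridSize);
        long sum = 0;
        int count = 0;

        for (int row = rowSpacing / 2; row < height; row += rowSpacing)
        {
            for (int col = colSpacing / 2; col < width; col += colSpacing)
            {
                int offset = row * stride + col * 3; // 3 bytes per pixel
                sum += Marshal.ReadByte(scan0, offset)
                     + Marshal.ReadByte(scan0, offset + 1)
                     + Marshal.ReadByte(scan0, offset + 2);
                count++;
            }
        }
        return count == 0 ? 0.0 : sum / (3.0 * 255.0 * count);
    }
}
```

Reading only the grid points directly from the locked memory avoids both GetPixel and a full copy of the frame, which fits the sampling approach already in use.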
 