Appending Directory.GetFiles Output to String[]

Status
Not open for further replies.

moderator99

Member
Joined
Mar 9, 2021
Messages
8
Programming Experience
Beginner
I'm not sure where to post this, so I'll post it here. Let me know if there's a better forum for me to post in.

I'm using Visual Studio 2019, creating a Windows Forms App under the .NET Framework in C#.

Here is my code at the moment:

C#:
        private void button1_Click(object sender, EventArgs e)
        {
            FolderBrowserDialog fbd = new FolderBrowserDialog();
            if (fbd.ShowDialog() == DialogResult.OK)
            {
                string dir = fbd.SelectedPath;
                string[] dirPaths = Directory.GetDirectories(dir, "*.*", SearchOption.AllDirectories).Where(d => !d.EndsWith("_files") && !d.Contains("_files\\")).ToArray();
                foreach (string d in dirPaths)
                {
                    string[] filePaths = Directory.GetFiles(d, "*.*", SearchOption.TopDirectoryOnly);
                    foreach (string f in filePaths)
                    {
                        listBox1.Items.Add(f);
                        listBox1.Update();
                    }
                }
            }
        }

This is just part of a test program I made as I try to create a faster directory search routine that avoids reading through certain directories.

What I'm wanting to do is instead of adding the output from Directory.GetFiles to a list and creating a new string array for each iteration of the first foreach loop, I want to append the output to the same string array (filePaths) during each iteration of the first foreach loop. The listbox is only being used to view the output while I develop the directory search routine. What I need is a string array as the final result, to be used in a different program I've already pretty much finished developing that requires the output to be in a string array.

Please forgive my lack of understanding of the basic language protocols. I'm relatively new to programming. I've looked online for help and checked through the documentation, but I fail to find a solution to my problem.
 
You can't append to an array. Arrays are fixed-size so, once you have created one, you can't change the number of elements. You can change the value of an element but you cannot add more elements.

What you should be doing is using a List<string> from start to finish. That way, you can call Add or AddRange as required to append new items. Once you're done, call ToArray on the collection to create an array if that's what you need to pass on to the next stage.
 
Your code should look much like this:
C#:
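// rootFolderPath is the folder chosen by the user, e.g. fbd.SelectedPath.
var filePaths = new List<string>();

// Skip any folder with a path segment ending in "_files", as the original filter did.
var folderPaths = Directory.EnumerateDirectories(rootFolderPath, "*", SearchOption.AllDirectories).Where(s => !s.Split('\\').Any(ss => ss.EndsWith("_files")));

foreach (var folderPath in folderPaths)
{
    // Append this folder's files to the one list.
    filePaths.AddRange(Directory.EnumerateFiles(folderPath));
}

// Convert to an array once, at the end.
var filePathArray = filePaths.ToArray();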
I've taken the liberty of improving it in a number of ways. You should be calling EnumerateDirectories and EnumerateFiles as the default and only calling GetDirectories and GetFiles where you specifically need an array or you would enumerate the results more than once. You also should not be calling ToArray before your foreach loop because you can enumerate the list you already have.
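To illustrate the difference between the Enumerate and Get methods, here is a minimal console sketch. The scratch folder and file names are made up for the demonstration. GetFiles returns only after the whole array has been built, whereas EnumerateFiles yields paths as they are found, so a consumer can stop early.

```csharp
using System;
using System.IO;
using System.Linq;

class EnumerateVersusGet
{
    public static void Main()
    {
        // Hypothetical scratch folder, created only for this demonstration.
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        Directory.CreateDirectory(root);
        for (int i = 0; i < 5; i++)
            File.WriteAllText(Path.Combine(root, $"file{i}.txt"), "");

        // GetFiles returns only after the complete string[] has been built.
        string[] all = Directory.GetFiles(root);

        // EnumerateFiles is lazy: Take(2) stops reading the folder after two entries.
        var firstTwo = Directory.EnumerateFiles(root).Take(2).ToList();

        Console.WriteLine($"{all.Length} files via GetFiles, {firstTwo.Count} taken via EnumerateFiles");

        // Clean up the scratch folder.
        Directory.Delete(root, true);
    }
}
```

With AddRange, as in the code above, every path is consumed either way, so the practical gain there is mainly avoiding the intermediate arrays.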
 
I appreciate your quick response. I'll try out that code instead.
 
I just tried this, adding the following at the end to test the output, but it hung the program.

C#:
                foreach (string f in filePathArray)
                {
                    listBox1.Items.Add(f);
                    listBox1.Update();
                }

Obviously I'm not doing something correctly.
 
Think about what you're actually doing in that code. For every file path in the array, you add that one single file path to the ListBox and then you force the ListBox to repaint on the screen. Does that really make sense? No, it doesn't. All you need is:
C#:
listBox1.Items.AddRange(filePathArray);
or:
C#:
listBox1.DataSource = filePathArray;
That adds every item in one go and the control will repaint afterwards as a matter of course.
 
Neither of those approaches works. The program still hangs.

Is there a listbox parameter I need to set for either of them to work?
 
How many files are there in the list? If it's a lot then it's going to take a long time to load them.

There are over 16,000 files that I'm using to test with, plus about 10,000 folders that I'm excluding.

My original code that I posted is very quick to load all the file paths into the listbox (only a few seconds). But of course, I don't intend to use a listbox in the actual program I'm developing. That's why I wanted to put them all into a single array.
 
Check to see where it is "hanging". Is it before adding to the listbox, or during? I suspect that it is during because you have too many items to be added.

Anyway, if you don't really intend to put things in a list box, then don't. During development, just set a breakpoint and inspect the list to see if the contents look correct.
 
Out of curiosity, I wanted to find out how long it would take to add 16,000 items to a listbox. I've not had to do this test since 1995 when I was first working on 32-bit Win95, and as I recall back then we had to do several "tricks" to trim down the 3-5 second load time on an average machine of that era. Anyway, fast forward to 2021, running on 64-bit Win10, and no special tricks are needed. On my machine, built around 2012, it takes just over 1/10th of a second:

C#:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Drawing;
using System.Threading;
using System.Windows.Forms;

class Program : Form
{
    ListBox _lstbox;

    public Program()
    {
        _lstbox = new ListBox()
        {
            Dock = DockStyle.Fill
        };

        Controls.Add(_lstbox);
    }

    protected override void OnShown(EventArgs e)
    {
        base.OnShown(e);

        var list = new List<string>();

        for (int i = 0; i < 16000; i++)
            list.Add($"Item {i}");

        var stopwatch = new Stopwatch();
        stopwatch.Start();
        _lstbox.Items.AddRange(list.ToArray());
        stopwatch.Stop();
        MessageBox.Show(stopwatch.Elapsed.ToString());
    }

    [STAThread]
    static void Main()
    {
        Application.EnableVisualStyles();
        Application.SetCompatibleTextRenderingDefault(false);
        Application.Run(new Program());
    }
}

So maybe there is something slow with the directory/file enumeration. In my past experience it's never been slow. Most of my file and directory enumerations run across the network, so I don't always get SSD-like speed, but it's never gotten to the point that I've had to consider adding a progress bar; it has been consistently under 3 seconds. I'll have to play around with some code later to see if the slowdown is due to the enumeration. Personally, though, I think @jmcilhinney's code is spot on because I would have written things similarly. Only if perf testing shows that the "_files" folders tend to be deep would I consider changing the code to use recursion and not bother descending into the "_files" folders. If they are shallow, the code as is should be good enough.
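For reference, the recursive alternative could look something like the sketch below. The names PrunedSearch, GetFilePaths and CollectFiles are made up for illustration, and it assumes a folder should be skipped whenever its name ends in "_files", as in the original post's filter. Because an excluded folder is never opened, nothing beneath it is read at all. It does no error handling, so an inaccessible folder would still throw.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class PrunedSearch
{
    // Returns the files in every subfolder of root, never descending into
    // folders whose name ends in "_files" (assumed exclusion rule).
    public static string[] GetFilePaths(string root)
    {
        var filePaths = new List<string>();

        // The code in this thread lists files in subfolders only,
        // so the root's own files are skipped here too.
        CollectFiles(root, filePaths, includeOwnFiles: false);
        return filePaths.ToArray();
    }

    static void CollectFiles(string folder, List<string> filePaths, bool includeOwnFiles)
    {
        if (includeOwnFiles)
            filePaths.AddRange(Directory.EnumerateFiles(folder));

        foreach (string subfolder in Directory.EnumerateDirectories(folder))
        {
            // Prune here: an excluded folder and everything below it is skipped.
            if (Path.GetFileName(subfolder).EndsWith("_files", StringComparison.Ordinal))
                continue;

            CollectFiles(subfolder, filePaths, includeOwnFiles: true);
        }
    }
}
```

In the button handler from the first post, this would be called as `string[] filePathArray = PrunedSearch.GetFilePaths(fbd.SelectedPath);`.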
 
You need to read the original post to understand my problem.

It's not the listbox that's causing it to hang. Using the listbox works just fine.

It hangs when I try the code that jmcilhinney offered.
 
I did read your previous post, and it said that it was hanging when you added the code to put the items into the listbox:

I just tried this, adding the following at the end to test the output, but it hung the program.

Anyway, that is what is puzzling me. In my experience, EnumerateDirectories() and EnumerateFiles() were faster than the original .NET Framework 1.1 GetDirectories() and GetFiles() variants, so I would have guessed that it was the listbox insertion that was slow. But if the listbox insertion is not slow, then it's gotta be the directory/file enumeration. That contradicts past experience, so now I'm curious to do some perf testing to see whether the Get or Enumerate variants are faster.
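A rough way to run that comparison is sketched below. The helper names are made up, and the folder to scan is taken from the command line. A single Stopwatch run like this is only indicative: the second pass usually benefits from the file system cache, so the order should be swapped and the runs repeated before drawing any conclusion.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

static class EnumerationTiming
{
    // Array-returning variants: each call builds a complete string[] first.
    public static List<string> WithGet(string root)
    {
        var files = new List<string>();
        foreach (string dir in Directory.GetDirectories(root, "*", SearchOption.AllDirectories))
            files.AddRange(Directory.GetFiles(dir));
        return files;
    }

    // Lazy variants: paths are yielded as the file system returns them.
    public static List<string> WithEnumerate(string root)
    {
        var files = new List<string>();
        foreach (string dir in Directory.EnumerateDirectories(root, "*", SearchOption.AllDirectories))
            files.AddRange(Directory.EnumerateFiles(dir));
        return files;
    }

    static void Time(string label, Func<string, List<string>> search, string root)
    {
        var stopwatch = Stopwatch.StartNew();
        int count = search(root).Count;
        stopwatch.Stop();
        Console.WriteLine($"{label}: {count} files in {stopwatch.ElapsedMilliseconds} ms");
    }

    static void Main(string[] args)
    {
        // Folder to scan; defaults to the current directory if none is given.
        string root = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();
        Time("Get", WithGet, root);
        Time("Enumerate", WithEnumerate, root);
    }
}
```

Both methods should report the same file count; only the elapsed times are expected to differ.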
 
Post #5 implies that it is adding the final data to the ListBox that hangs. If that is not the case then you should take more care with how you word things and you should debug your code and tell us exactly where it does hang. Basically, you need to provide us with ALL the relevant information. Are you saying that the whole process never completes or just that it takes a long time or that you don't see the output in the ListBox until the very end or something else? BE SPECIFIC.
 
It hangs on both of those lines of code you offered:

C#:
listBox1.Items.AddRange(filePathArray);

and:

C#:
listBox1.DataSource = filePathArray;
 