couldn't scan folder with whitespace and special chars like $

fihovi

Hello,

I want to scan a drive in my app, but I can't scan $Recycle.Bin or folders with whitespace in their names.

I can't find a fix for these errors. I also looked for NuGet packages that might solve this for me.
I'm using System.IO.

Program.cs:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

namespace FileHandler
{
    class Program
    {
        private static void Main(string[] args)
        {
            GetAllFilesFromFolder(@"E:\", true);
        }

        private static List<string> GetAllFilesFromFolder(string root, bool searchSubfolders)
        {
            Queue<string> folders = new Queue<string>();
            List<string> folderCount = new List<string>();
            List<string> files = new List<string>();
            folders.Enqueue(root);
            while (folders.Count != 0)
            {
                string currentFolder = folders.Dequeue();
                try
                {
                    // Collect the files directly inside the current folder.
                    string[] filesInCurrent = Directory.GetFiles(currentFolder, "*.*", SearchOption.TopDirectoryOnly);
                    files.AddRange(filesInCurrent);
                }
                catch (Exception ex)
                {
                    // Report the failure instead of swallowing it silently.
                    Console.WriteLine("Error reading files in " + currentFolder + ": " + ex.Message);
                }
                try
                {
                    if (searchSubfolders)
                    {
                        string[] foldersInCurrent = Directory.GetDirectories(currentFolder, "*.*", SearchOption.TopDirectoryOnly);
                        // Record the subfolders once (not once per loop iteration), then queue them for scanning.
                        folderCount.AddRange(foldersInCurrent);
                        foreach (string current in foldersInCurrent)
                        {
                            folders.Enqueue(current);
                        }
                    }
                }
                catch (Exception ex)
                {
                    Console.WriteLine("Error reading folders in " + currentFolder + ": " + ex.Message);
                }
            }

            Console.WriteLine("Number of folders: " + folderCount.Count);
            Console.WriteLine("Number of files is: " + files.Count);

            Console.ReadLine();
            return files;
        }
    }
}

Thanks a lot
 
Directory.GetDirectories() and Directory.GetFiles() return files and folders with spaces in their names. They also find "C:\$Recycle.Bin":
C#:
using System;
using System.IO;

public class Test
{
    static void Main()
    {
        foreach(var f in Directory.GetDirectories(@"C:\"))
        {
            Console.WriteLine(f);
        }
    }
}
Capture.png
 
One thing you seem to not understand is that the Recycle Bin you see in Explorer is not an ordinary folder. It is a virtual shell location, and on disk it is backed by the hidden system directory $Recycle.Bin on the drive. So when you delete a file, it's never really deleted from your PC. Instead, it is moved into that directory under a generated name, and its original name and location are recorded alongside it so the file can be restored later.

A file in that state is hidden from its original location but still visible in the bin's view until we clear out the bin. Clearing the bin does not wipe anything either: the drive simply marks the space the file occupied as free, to be reused and eventually overwritten. And that is why some data can still be recovered from a hard disk even after something is deleted. The data stays there until the space is reused at a later time. Anyway...

This is what the files look like as I iterate over them in the recycle bin, after they've been moved there. The long S-1-5-21-... part of the path is a security identifier (SID) rather than a GUID. As far as I know it identifies the user account that owns that part of the bin, but I may stand corrected on that one:
C:\$Recycle.Bin\S-1-5-21-3775181533-2454628510-745607798-1001\$RYWMCQR.xml
C:\$Recycle.Bin\S-1-5-21-3775181533-2454628510-745607798-1001\$RZWSJHT.txt
If you look up the directory info for the location of the recycle bin and step into the debugger, you will find out more about it.

Screenshot_48.jpg

Take note that it's a hidden directory, and it requires elevated permissions because it's a system "folder". Lastly, whatever you plan on doing with the files in the recycle bin will likely be very hard to do. Also, look up KNOWNFOLDERID on MSDN, as it clarifies some of what I've said above.
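If you'd rather check those flags from code than from the debugger, here is a minimal sketch. It assumes Windows and that the bin lives at C:\$Recycle.Bin (adjust the drive letter). `AttributeCheck` and `Describe` are names made up for this illustration, not part of any API.

```csharp
using System;
using System.IO;

public static class AttributeCheck
{
    // Hypothetical helper: summarizes the attribute flags of a directory.
    public static string Describe(string path)
    {
        var info = new DirectoryInfo(path);
        if (!info.Exists)
            return path + " (not found)";

        FileAttributes attrs = info.Attributes;
        return string.Format("{0}: Hidden={1}, System={2}, Attributes=[{3}]",
            path,
            attrs.HasFlag(FileAttributes.Hidden),
            attrs.HasFlag(FileAttributes.System),
            attrs);
    }

    public static void Main()
    {
        // Assumed path; change the drive letter to match your machine.
        Console.WriteLine(Describe(@"C:\$Recycle.Bin"));
    }
}
```

On a machine without that directory, the helper just reports "(not found)" instead of throwing.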

Edit
Fixed a typo
 
Thank you for your insight. Why can't my code do the same? It does the following instead.
This output is from the catch block. All of these folders are directly under E:\, not under "E:\ \" (tested with my code).
wtf.png

Then the same with more folders..
wtf2.png

For instance, as you'd guess, this one is in the folder "E:\[Alt+0160]\", not in "E:\[Alt+0160]\[Alt+0160]\" (sorry for the messy whitespace notation).

When I recreate it with your code, the problem is still there.
For E:\test\[Alt+0160]\tt (the no-break space is the folder name itself)
I get the output: "E:\test\[Alt+0160]\[Alt+0160]\tt"
Thank you for your insight as well. I'm aware of how the recycle bin works. I was confused by the no-break space and thought my code couldn't handle any special characters.
So my code doesn't work at all, although I can't find any reason why an extra no-break space gets inserted between E:\ and the folder itself on the drive.
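Since a no-break space looks exactly like a regular space in console output, it can help to print paths with the invisible characters spelled out. A minimal sketch, assuming nothing beyond the base class library: `PathDebug.ShowHidden` is a made-up helper name, and the `<U+XXXX>` notation is just one possible choice (angle brackets can't occur in Windows file names, so it stays unambiguous).

```csharp
using System;
using System.Text;

public static class PathDebug
{
    // Hypothetical helper: replaces spaces, control characters and anything
    // outside printable ASCII with a <U+XXXX> marker so it shows up in logs.
    public static string ShowHidden(string path)
    {
        var sb = new StringBuilder();
        foreach (char c in path)
        {
            if (c <= ' ' || c > '~')
                sb.Append("<U+").Append(((int)c).ToString("X4")).Append('>');
            else
                sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // A path whose middle segment is a single no-break space (U+00A0).
        string path = "E:\\test\\\u00A0\\tt";
        Console.WriteLine(ShowHidden(path)); // prints E:\test\<U+00A0>\tt
    }
}
```

Running the scan output through something like this makes it obvious whether a path segment is a regular space (U+0020), a no-break space (U+00A0), or has been duplicated.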
 
This is from catch block
Say what? If that's in the catch block, then that means an exception is being thrown. What is the exception? Perhaps the message text in the exception will tell you what is failing.
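To make that concrete, here is a rough sketch of a catch block that keeps the exception details instead of discarding them. `SafeScan.TryGetDirectories` is a hypothetical helper written for this post, not the OP's code. It only uses `Directory.GetDirectories` and the standard exception types.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SafeScan
{
    // Hypothetical helper: returns the subdirectories of 'folder', or an empty
    // array if they cannot be read, appending the reason to 'errors'.
    public static string[] TryGetDirectories(string folder, List<string> errors)
    {
        try
        {
            return Directory.GetDirectories(folder);
        }
        catch (UnauthorizedAccessException ex)
        {
            errors.Add(folder + " | access denied: " + ex.Message);
        }
        catch (DirectoryNotFoundException ex)
        {
            // Must come before IOException, which is its base class.
            errors.Add(folder + " | not found: " + ex.Message);
        }
        catch (IOException ex)
        {
            errors.Add(folder + " | I/O error: " + ex.Message);
        }
        return new string[0];
    }

    public static void Main()
    {
        var errors = new List<string>();
        // Deliberately nonexistent path, to trigger the catch block.
        string missing = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        string[] dirs = TryGetDirectories(missing, errors);
        Console.WriteLine("Subdirectories found: " + dirs.Length);
        foreach (string e in errors)
            Console.WriteLine(e);
    }
}
```

Separating the exception types tells you right away whether a folder was skipped for lack of permissions or because the path itself is wrong.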
 
E:\test\ \ \tt|System.IO.DirectoryNotFoundException: Could not find a part of the path 'E:\test\ \ \tt'.
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.FileSystemEnumerableIterator`1.CommonInit()
at System.IO.FileSystemEnumerableIterator`1..ctor(String path, String originalUserPath, String searchPattern, SearchOption searchOption, SearchResultHandler`1 resultHandler, Boolean checkHost)
at System.IO.Directory.GetDirectories(String path, String searchPattern, SearchOption searchOption)
at FileHandler.Program.GetAllFilesFromFolder(String root, Boolean searchSubfolders) in D:\Projects\FileScan\FileHandler\Program.cs:line 34
 
Looks like the .NET Framework has a bug when dealing with the non-breaking space... It's supposedly fixed in .NET Core, but I'm guessing that you are using .NET Framework 4.x.

See the comments below: Get-ChildItem and non-breaking space
 
You're right. I tried it with .NET Framework 4.7.2 and it did not work.
OUTPUT: E:\test\[Alt+0160]\[Alt+0160]\tt
When I moved it to .NET Core,
I got the following output:
E:\test\[Alt+0160]\tt --> which is the one I needed.

Thanks a lot!
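For anyone comparing the two runtimes like this, it can be worth printing which one the program is actually running on. A small sketch, assuming .NET Framework 4.7.1 or later (where `RuntimeInformation` is available) or any .NET Core version. `RuntimeCheck` is a made-up class name.

```csharp
using System;
using System.Runtime.InteropServices;

public static class RuntimeCheck
{
    // Hypothetical helper: returns the runtime's own description of itself.
    public static string Describe()
    {
        return RuntimeInformation.FrameworkDescription;
    }

    public static void Main()
    {
        // The exact text depends on the installed runtime.
        Console.WriteLine(Describe());
        Console.WriteLine(Environment.Version);
    }
}
```

Printing this at the start of the scan rules out any doubt about which runtime produced a given output.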
 
Btw, it must be something in your code that isn't right. I tried it in 4.7.2 using code I wrote myself and experienced no problems. I tested further on 4.8 and also experienced no problems.
 
I couldn't identify the problem... My code is all up there, nothing more, nothing less. If you could post your code here so I can compare, that'd be nice. When I copied mine into .NET Core with no edits, it worked.

Well, the only change I made to the .NET Framework project was the ClickOnce security setting (or something like that), to get around the problems and scan folders and files with an administrative account. But that makes no sense to me.
 
to bypass problems and scan folders and files with administrative account.
Administrative rights are required for that, as I already said above, so that's likely why it wasn't working. The bin is owned by the system.

Can you first try checking if it runs in 4.7.2 since that change?

The only difference from yours is that I am using the Shell API from system32 through an interface instead.
 
E:\ \ \test|System.IO.DirectoryNotFoundException: Could not find a part of the path 'E:\ \ \test'.
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.FileSystemEnumerableIterator`1.CommonInit()
at System.IO.FileSystemEnumerableIterator`1..ctor(String path, String originalUserPath, String searchPattern, SearchOption searchOption, SearchResultHandler`1 resultHandler, Boolean checkHost)
at System.IO.Directory.GetDirectories(String path, String searchPattern, SearchOption searchOption)
at frameworktest.Program.GetAllFilesFromFolder(String root, Boolean searchSubfolders) in C:\Users\krogi\source\repos\frameworktest\frameworktest\Program.cs:line 55
instead of E:\ \test (\ \ => \[Alt+0160]\)

Error line
Program.cs:
                        string[] foldersInCurrent = Directory.GetDirectories(currentFolder, "*.*", SearchOption.TopDirectoryOnly);

The full block of code is in the main (first) post.

So it's still not working. This is a plain new project: I copied the WORKING code from .NET Core to .NET Framework 4.7.2 and commented out Npgsql (the PostgreSQL driver for C#), because I neither need nor use a database for scanning.
 
Well, that's weird. I will need to run your code and debug it myself when I find a little more time later tonight. I'd love to do it now, but I'm currently under pressure to finish something for work that I was meant to finish last week and completely forgot about. Anyway, I will update you once I find my feet and delve into this. At a glance, it does actually look like you have a problem with how the paths are iterated over, but I will check this for you later. But thanks for trying it. ;)
 
It's not just the code. The environment with the directory whose only character is a non-breaking space is needed to replicate the problem being seen by the OP.
 
I reproduced the problem with .NET Framework 4.8:
C#:
using System;
using System.Collections.Generic;
using System.IO;

class Program
{
    const string RootTestDir = "C:\\TestingNBSPDir";
    // "\u00A0" is the no-break space; the directory name consists of that single character.
    static readonly string ChildTestDir = Path.Combine(RootTestDir, "\u00A0\\Leaf");

    static bool EnsureDirectoryExists(string path)
    {
        try
        {
            Console.WriteLine($"Ensuring directory exists: {path}");
            if (!Directory.Exists(path))
                Directory.CreateDirectory(path);
            return true;
        }

        catch (Exception ex)
        {
            Console.Error.WriteLine($"Couldn't access or create {path}");
            Console.Error.WriteLine(ex);
        }
        return false;
    }

    static void RecursivelyListDirectories(string root)
    {
        Console.WriteLine($"Enumerating directories starting at: {root}");
        var queue = new Queue<string>();
        queue.Enqueue(root);
        while (queue.Count != 0)
        {
            var current = queue.Dequeue();
            Console.WriteLine($"Working on '{current}'");

            try
            {
                foreach (var dir in Directory.EnumerateDirectories(current, "*", SearchOption.TopDirectoryOnly))
                    queue.Enqueue(dir);
            }

            catch (Exception ex)
            {
                Console.Error.WriteLine(ex);
            }
        }
    }

    static void Main()
    {
        if (EnsureDirectoryExists(ChildTestDir))
            RecursivelyListDirectories(RootTestDir);
    }
}

Results in:
Capture.PNG.png


And .NET Core 3.0 works fine:
Capture2.PNG.png
 