HtmlDecode

Gloops

Hello everybody,

If I have a window whose path is "file:///D:/T%E9l%E9chargements" and I want it displayed as "D:\Téléchargements", I have to use HttpUtility.HtmlDecode.

For that, I must reference System.Web, and that assembly is stored in several places.

Which one is preferable for a WinForms project on .NET Framework 4.7.2 with a reference to PowerShell?

If I do not choose the right one, the ListBox can display the name of the class instead of the DisplayMember, or the project can even fail to load, complaining about an incorrect DLL format.
 
That looks more like URL encoding than HTML encoding. That said, if I URL-encode "D:\Téléchargements" I get "%C3%A9" (the two UTF-8 bytes) for those accented characters, not "%E9" (the single ISO-8859-1 byte).

Also, if you're not creating a web app, I'd tend to use System.Net.WebUtility, if it has what you need. It is defined in System.dll and would thus require no extra reference.
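To illustrate the "%C3%A9" versus "%E9" point, here is a minimal sketch that dumps the bytes of "é" in both encodings and decodes a UTF-8 escaped name with System.Net.WebUtility. The class name EncodingDemo and the helper Hex are made up for this example; only the framework calls are real.

```csharp
using System;
using System.Net;
using System.Text;

static class EncodingDemo
{
    // Hypothetical helper: hex dump of the bytes an encoding produces for a string.
    public static string Hex(string s, Encoding enc)
    {
        return BitConverter.ToString(enc.GetBytes(s));
    }

    static void Main()
    {
        var latin1 = Encoding.GetEncoding("ISO-8859-1");

        // "\u00E9" is 'é'. UTF-8 needs two bytes, ISO-8859-1 only one.
        Console.WriteLine(Hex("\u00E9", Encoding.UTF8)); // C3-A9
        Console.WriteLine(Hex("\u00E9", latin1));        // E9

        // WebUtility.UrlDecode interprets the escaped bytes as UTF-8,
        // so a UTF-8 escaped name round-trips without any extra step.
        string decoded = WebUtility.UrlDecode("T%C3%A9l%C3%A9chargements");
        Console.WriteLine(decoded == "T\u00E9l\u00E9chargements"); // True
    }
}
```

So a string escaped as "%E9" was most likely produced from single-byte (ISO-8859-1 style) text, which is why a UTF-8 based decoder cannot make sense of it.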
 
Well, some searching led me to the same conclusion:
Decode a URL:
string strDec = System.Net.WebUtility.UrlDecode(strDesc);

But an encoding problem remains:
file:///D:/T�l�chargements

I am not sure whether a control has a default encoding.
I see there is also UrlDecodeToBytes; perhaps it can help.

With something like
Decode bytes to chars:
// Decode the bytes with the target encoding (same result as trgtEnc.GetString(tgtBytes)).
char[] tgtChars = new char[trgtEnc.GetCharCount(tgtBytes, 0, tgtBytes.Length)];
trgtEnc.GetChars(tgtBytes, 0, tgtBytes.Length, tgtChars, 0);
return new string(tgtChars);
 
Url-decoding to bytes and reading those bytes as ISO-8859-1 encoding works:
C#:
var input = "file:///D:/T%E9l%E9chargements";
var enc = System.Text.Encoding.GetEncoding("ISO-8859-1");
var bytes = enc.GetBytes(input);
bytes = System.Net.WebUtility.UrlDecodeToBytes(bytes, 0, bytes.Length);
var output = enc.GetString(bytes);
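If the goal is the display form "D:\Téléchargements" from the first post, the snippet above could be wrapped roughly like this. It is only a sketch: FileUrlDecoder and ToDisplayPath are hypothetical names, and it assumes the input always uses ISO-8859-1 style escapes and a "file:///" prefix.

```csharp
using System;
using System.Net;
using System.Text;

static class FileUrlDecoder
{
    // Hypothetical helper: turns "file:///D:/T%E9l%E9chargements" into "D:\Téléchargements".
    // Assumption: the percent escapes are single ISO-8859-1 bytes, not UTF-8.
    public static string ToDisplayPath(string fileUrl)
    {
        var enc = Encoding.GetEncoding("ISO-8859-1");

        // Same steps as the snippet above: string -> bytes -> unescaped bytes -> string.
        var bytes = enc.GetBytes(fileUrl);
        bytes = WebUtility.UrlDecodeToBytes(bytes, 0, bytes.Length);
        var decoded = enc.GetString(bytes);

        // Strip the scheme prefix, if present, and switch to Windows separators.
        const string prefix = "file:///";
        if (decoded.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            decoded = decoded.Substring(prefix.Length);

        return decoded.Replace('/', '\\');
    }

    static void Main()
    {
        string path = ToDisplayPath("file:///D:/T%E9l%E9chargements");
        Console.WriteLine(path == "D:\\T\u00E9l\u00E9chargements"); // True
    }
}
```

One caveat, as far as I know: the WebUtility decoding methods also turn a literal "+" into a space, so a path that really contains "+" would need extra care.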
 
Solution
You were quicker than I was, thank you.
I went with ISO-8859-15, although it is unlikely that a path contains a Euro sign.
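For what it is worth, here is a small sketch of where the two encodings differ for this use case. Byte 0xA4 is the currency sign (U+00A4) in ISO-8859-1 and the Euro sign (U+20AC) in ISO-8859-15, while "%E9" gives "é" in both. The names Latin9Demo and Decode are made up for the example. It assumes .NET Framework, as in this thread; on .NET Core and later, "ISO-8859-15" is only available after registering the code pages encoding provider.

```csharp
using System;
using System.Net;
using System.Text;

static class Latin9Demo
{
    // Hypothetical helper: percent-decodes an ASCII string and reads
    // the resulting bytes with the named single-byte encoding.
    public static string Decode(string escaped, string encodingName)
    {
        var enc = Encoding.GetEncoding(encodingName);
        var bytes = Encoding.ASCII.GetBytes(escaped);
        bytes = WebUtility.UrlDecodeToBytes(bytes, 0, bytes.Length);
        return enc.GetString(bytes);
    }

    static void Main()
    {
        // Print code points rather than the characters themselves,
        // so the result does not depend on the console encoding.
        Console.WriteLine(((int)Decode("%A4", "ISO-8859-1")[0]).ToString("X4"));  // 00A4
        Console.WriteLine(((int)Decode("%A4", "ISO-8859-15")[0]).ToString("X4")); // 20AC

        // The accented letters from the thread decode identically in both.
        Console.WriteLine(Decode("T%E9l%E9chargements", "ISO-8859-1")
                          == Decode("T%E9l%E9chargements", "ISO-8859-15")); // True
    }
}
```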
 