Tip creating strings

monkeyPlus · Sep 29, 2023

Hey, anyone interested in strings, check this URL.

Thanks

andre

Skydiver · Sep 29, 2023

I don't think your string idea works very efficiently for C#. Recall that in C#, string characters are typically 16 bits wide. So each of your "character" positions will have a radix of 0x10000.

Also, if you look at the actual string comparison implementations in the C++ library, as well as the .NET Framework, they go down to assembly code implementations that compare runs of characters as integers, which is exactly what you are trying to advertise as your "more efficient" integer comparison of strings.
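To make that concrete, here is a minimal sketch of the built-in ordinal comparisons, which need no custom integer encoding. The sample strings are made up for illustration, and how the runtime implements these calls internally is not shown here.

```csharp
using System;

// Illustrative inputs only; they differ in the last character.
string a = "monkey";
string b = "monkex";

// Ordinal comparison works on the raw 16-bit code units of each string.
int ordinal = string.CompareOrdinal(a, b);

// Compare a run of characters (the first five) without allocating substrings.
bool sameRun = a.AsSpan(0, 5).SequenceEqual(b.AsSpan(0, 5));

Console.WriteLine(ordinal > 0);  // True: 'y' sorts after 'x'
Console.WriteLine(sameRun);      // True: both start with "monke"
```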

Also, have you considered that most CPUs can only do integer comparisons of up to 64 bits? So for C# strings, you'll only be comparing 4 characters at a time.
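A rough sketch of what that limit looks like in practice. `PackFour` is a hypothetical helper written for this post, not something from the linked article: it treats each char as one base-0x10000 digit and refuses anything that will not fit in a 64-bit integer.

```csharp
using System;

// Hypothetical helper: pack up to four UTF-16 code units into one ulong,
// treating each char as a single base-0x10000 digit.
static ulong PackFour(string s)
{
    if (s.Length > 4)
        throw new ArgumentException("Only 4 chars fit in 64 bits.", nameof(s));

    ulong packed = 0;
    foreach (char c in s)
        packed = (packed << 16) | c;   // shift in one 16-bit digit at a time
    return packed;
}

Console.WriteLine(sizeof(char));                          // 2 (bytes per char)
Console.WriteLine(PackFour("abcd") < PackFour("abce"));   // True
```

Anything longer than four characters needs either several integers or a big-integer type, at which point you are back to comparing runs of characters.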

How does your system store strings that need more than 128 bits?

C# also stores strings like Pascal does, with a known length. They are not like C/C++ strings, which are zero-terminated. So there is no real benefit in your use of logarithms or bit counting to determine the length of a string.

Also, regarding your bit counting, how do you measure the length of a string that is all NUL characters? All the bits will be set to zero.
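A small sketch of both points: the length is stored with the string, so it is available even when every character is NUL. The variable names are just for illustration.

```csharp
using System;

// Five NUL characters: every bit of the character data is zero.
string nuls = new string('\0', 5);

// An embedded NUL does not terminate a C# string.
string mixed = "ab\0cd";

Console.WriteLine(nuls.Length);   // 5
Console.WriteLine(mixed.Length);  // 5
```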

In C#, strings are immutable. So your idea of doing C/C++-like mutation of strings using bit operations and multiplication and/or division is actually pointless.

And in case you wanted to use some kind of mutable string, that is what the new C# Span<char> can do efficiently where you can treat a range of char like a simple array.
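For example, a minimal sketch of editing characters through a Span<char> and then building a new string from the result. `WithFirstChar` and the 256-char stack limit are my own assumptions for illustration.

```csharp
using System;

// Hypothetical helper: copy the chars into a scratch buffer, change one
// in place by index, then build a new string. The source is never modified.
static string WithFirstChar(string source, char first)
{
    if (source.Length == 0) return source;

    // Small buffers live on the stack; larger ones fall back to a heap array.
    Span<char> buffer = source.Length <= 256
        ? stackalloc char[source.Length]
        : new char[source.Length];

    source.AsSpan().CopyTo(buffer);
    buffer[0] = first;              // plain array-style indexing
    return new string(buffer);
}

string original = "hello";
Console.WriteLine(WithFirstChar(original, 'J'));  // Jello
Console.WriteLine(original);                      // hello
```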

Additionally accessing characters in a string with indexing like an array is more efficient than computing positions and bit masks. Recall that array indexing is faster than multiplication and division (assuming that the data doesn't have to be pulled from main memory).

Skydiver · Sep 29, 2023

Some good reading from Eric Lippert explaining the choice of UTF-16 for C#:

ATBG: Why UTF-16?

NOTE: This post was originally a link to my post on the Coverity blog, which has been taken down. An archive of the original article is here. Today on Ask The Bug Guys we have a language design que…

ericlippert.com

(I recall that when I was working on "Chicago", Eric was already a rockstar developer, even as an intern. I was working on the first version of the RichEdit control. My main area was the RTF converter and OLE object storage, but I was also helping with the Far East enablement of the RichEdit control. I also got to work with Larry Osterman, since he was working on the Exchange backend data storage. Since the RichEdit control was for the "Chicago" email client, and I was also working on some parts of the email forms, there were times when I would need help from Larry with issues.)

jmcilhinney · Sep 29, 2023

For future reference, how about you provide some explanation of what it is you're suggesting, rather than just providing a link and sending us off to check it out essentially blind? That's the sort of thing that spammers do. So, if you don't want to look like a spammer, tell us why following your link would be of interest.
