Question Cast string to byte array

rkb771

Member
Joined
Jun 21, 2022
Messages
6
Programming Experience
Beginner
I am using .NET 6 for a project where I am reading some data from a serial port. SerialPort.ReadExisting is my method of choice for reading the serial data. This method returns the data as a string, but I need the data as a byte array. I know of System.Text.Encoding.ASCII.GetBytes, but I don't understand whether it just casts the data to a new type or converts it. I don't want any conversion; I need a simple reinterpretation of the received raw data (like a pointer cast in C/C++). The reason is that the received data is binary and may include any mix of ASCII text, integers, and floats (raw 4-byte binary), which will be post-processed later. Any wrongly assumed conversion on, say, the float data will just mess things up.

Can anyone please give me some pointers on how the GetBytes method behaves, and whether it's appropriate for my case? If not, what are my alternatives other than using a small C DLL to pointer-cast the memory?

Thank you very much for your time. Any help is appreciated.

Note: I am using the ReadExisting method to queue all received data so it can be processed in a fixed (custom) format. I know that I can set [URL='https://learn.microsoft.com/en-us/dotnet/api/system.io.ports.serialport.receivedbytesthreshold?view=dotnet-plat-ext-8.0']ReceivedBytesThreshold[/URL] to get a fixed chunk size for easy formatting. But the docs say that the DataReceived event is also fired for an EOF character, which I couldn't find much information on. So, if this "EOF" is a particular character, it might appear in the binary data and set off an untimely event.
 
I assume you were referring to this paragraph:
The DataReceived event is also raised if an Eof character is received, regardless of the number of bytes in the internal input buffer and the value of the ReceivedBytesThreshold property.

Just my opinion, but I think the person who wrote that paragraph of documentation was an old school C programmer (like me) where we expect to be reading characters from a stream using code that looks like:
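C:
#include <stdio.h>

int main(void)
{
    // read characters until end-of-file
    int ch; // int, not char: EOF (-1) must stay distinct from every byte value
    while ((ch = getc(stdin)) != EOF)
    {
        putchar(ch); // do something with the character ch
    }
    return 0;
}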
where it looks like EOF is a real character that might be returned by the OS. But in reality, EOF is a negative value (typically -1), which is outside the range of valid byte values (0..255).

I suggest reading that paragraph as:
The DataReceived event is also raised if an end-of-file signal is received, regardless of the number of bytes in the internal input buffer and the value of the ReceivedBytesThreshold property.

In other words, you might get the DataReceived event at any time.
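
If the event can fire at any time, one way to cope is to stop relying on chunk sizes and instead accumulate whatever arrives, cutting out frames only once enough bytes are present. The following is a minimal sketch of that idea, not a definitive implementation. `FrameSize`, `pending`, `frames`, and `OnBytes` are hypothetical names, the 4-byte frame length is an assumption, and the three calls at the end merely simulate events arriving with awkward chunk sizes.

```csharp
using System;
using System.Collections.Generic;

const int FrameSize = 4;             // hypothetical fixed frame length
var pending = new List<byte>();      // bytes received but not yet framed
var frames = new List<byte[]>();     // completed frames, ready for parsing

// Call this with whatever bytes each DataReceived event delivers.
void OnBytes(byte[] chunk)
{
    pending.AddRange(chunk);
    while (pending.Count >= FrameSize)
    {
        frames.Add(pending.GetRange(0, FrameSize).ToArray());
        pending.RemoveRange(0, FrameSize);
    }
}

// Simulate events firing at awkward moments: 1 byte, then 5, then 2.
OnBytes(new byte[] { 1 });
OnBytes(new byte[] { 2, 3, 4, 5, 6 });
OnBytes(new byte[] { 7, 8 });

Console.WriteLine(frames.Count);  // 2
Console.WriteLine(pending.Count); // 0
```

Since DataReceived is raised on a secondary thread, a real handler would probably need a lock (or a concurrent queue) around the shared buffer; the sketch leaves that out for brevity.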
 
SerialPort.ReadExisting is my method of choice for reading the serial data. This method returns the data as a string, but I need the data as a byte array. I know of System.Text.Encoding.ASCII.GetBytes, but I don't understand whether it just casts the data to a new type or converts it.

Every string in C# is allocated. Every string in C# is a Unicode string. So the string returned to you by ReadExisting() is not a window into the buffer used by the serial port, but rather the string representation of the bytes in that buffer, with those bytes treated as having been encoded according to the Encoding property. In other words, when you call the method, it takes the bytes in the buffer and passes them to XXXEncoding.GetString(), where XXX is the encoding type. E.g., if your Encoding were set to ASCIIEncoding, then ASCIIEncoding.GetString() would be used.
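
To make the consequence concrete, here is a small sketch of what happens when arbitrary binary bytes take that decode step and are then passed back through GetBytes(). It assumes the port's Encoding is ASCII and uses a made-up three-byte input; no serial hardware is involved.

```csharp
using System;
using System.Text;

// Raw bytes as they might arrive from the port: 'A', then two non-ASCII values.
byte[] raw = { 0x41, 0x80, 0xFF };

// Decode the way ReadExisting() would with Encoding set to ASCII...
string text = Encoding.ASCII.GetString(raw);

// ...then try to "undo" it with GetBytes().
byte[] roundTripped = Encoding.ASCII.GetBytes(text);

Console.WriteLine(BitConverter.ToString(raw));          // 41-80-FF
Console.WriteLine(BitConverter.ToString(roundTripped)); // 41-3F-3F
```

The bytes above 0x7F come back as 0x3F ('?'), so GetBytes() is a conversion, not a cast, and the original values are already lost by the time the string exists.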

Since the bytes in the buffer are decoded through the port's Encoding (ASCIIEncoding by default, which is a single-byte encoding), while all C# strings are composed of 2-byte chars, simply casting a pointer to the buffer will not work. Also, unlike C strings, C# strings are not null-terminated. It would also go completely against the ethos of C# to pass pointers around or cast them.

And as far as I know, the SerialPort class has not yet been converted over to offer Span<byte> interfaces that would give you direct access to the bytes in the buffer.
 
I don't want any conversion; I need a simple reinterpretation of the received raw data (like a pointer cast in C/C++). The reason is that the received data is binary and may include any mix of ASCII text, integers, and floats (raw 4-byte binary), which will be post-processed later. Any wrongly assumed conversion on, say, the float data will just mess things up.

As per the ReadExisting() documentation:
If it is necessary to switch between reading text and reading binary data from the stream, select a protocol that carefully defines the boundary between text and binary data, such as manually reading bytes and decoding the data.
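
As a rough illustration of "manually reading bytes and decoding the data", the sketch below parses one frame from a byte array. The layout is purely hypothetical (2 ASCII tag bytes, a little-endian int32, a little-endian float32), and the sample bytes are made up; the real format and byte order depend on the device.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

// Hypothetical 10-byte frame: 2 ASCII tag bytes, int32 (LE), float32 (LE).
byte[] frame = { 0x41, 0x42, 0x2A, 0x00, 0x00, 0x00, 0x00, 0x00, 0xC0, 0x3F };

// Only the text field goes through an Encoding.
string tag = Encoding.ASCII.GetString(frame, 0, 2);

// The numeric fields are reinterpreted bit-for-bit, with no text decoding.
int count = BinaryPrimitives.ReadInt32LittleEndian(frame.AsSpan(2, 4));
float value = BitConverter.Int32BitsToSingle(
    BinaryPrimitives.ReadInt32LittleEndian(frame.AsSpan(6, 4)));

Console.WriteLine($"{tag} {count} {value}");
```

Here the tag decodes to "AB", the integer to 42, and the float to 1.5. BinaryPrimitives is used rather than BitConverter.ToSingle so that the byte order is stated explicitly instead of depending on the machine.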
 
SerialPort.ReadExisting is my method of choice for reading the serial data. This method returns the data as a string, but I need the data as a byte array.

So use one of the methods that reads bytes, such as Read(byte[], int, int)
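
A minimal sketch of that approach follows. `ReadExact` is a hypothetical helper, not part of SerialPort; it takes any method with the Read(byte[], int, int) shape, so a MemoryStream stands in for the port here and the example runs without hardware. With a real port, the call would presumably look like `ReadExact(port.Read, 4)`.

```csharp
using System;
using System.IO;

// Reads exactly `count` bytes using any Read(byte[], int, int)-shaped method.
// A single Read call may return fewer bytes than requested, so loop until done.
static byte[] ReadExact(Func<byte[], int, int, int> read, int count)
{
    var buffer = new byte[count];
    int filled = 0;
    while (filled < count)
    {
        int n = read(buffer, filled, count - filled);
        if (n <= 0)
            throw new EndOfStreamException("Source ended before the frame was complete.");
        filled += n;
    }
    return buffer;
}

// Stand-in for the serial port so the sketch runs without hardware.
var fake = new MemoryStream(new byte[] { 0x00, 0x00, 0xC0, 0x3F, 0x99 });
byte[] frame = ReadExact(fake.Read, 4);
Console.WriteLine(BitConverter.ToString(frame)); // 00-00-C0-3F
```

Note that SerialPort.Read blocks until data arrives or ReadTimeout expires (throwing TimeoutException in the latter case), so a real caller would need to decide how to handle timeouts.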

You're using the wrong tool for the job: a tool that interprets the data, and now you're trying to un-interpret it. Don't try to unbake a cake; avoid baking it in the first place.
 
Thanks for the clarification. So I might get an untimely DataReceived event just because 4 sequential bytes resemble -1?
 
No. Recall that the SerialPort class doesn't interpret any of the byte stream it is getting (unless you call ReadExisting() -- but that's interpreting data already in the buffer).

Also, you are welcome to peruse the .NET Framework source code for SerialPort at:

The .NET source code can also be found on GitHub, but unfortunately not in Source Browser.
 
