Deobfuscate String

Oh So Sick

Hello everyone,

I need your help again because I’m looking to deobfuscate a string.

First I have a few questions for you experts regarding my code that you will find below:
- Is it possible to make it simpler?
- Do you see mistakes or illogical things?
- Is it possible to extract the desired parts directly as bytes, to avoid unnecessary conversions that weigh down the code?

So, to my problem: I have a file that is encoded in two steps.
The first step (each byte multiplied by 2) seems solved: my program opens the file and deobfuscates it by dividing each byte by 2.
But there is always a "but"! After running this code on a file, I noticed a second level of encoding that behaves in two different ways depending on the start and end delimiters. Let me explain.
For text enclosed in the delimiters 5E54/545E (5E54 "HEX CODE" 545E) or 5E5E/1A14: in this example C2 (hex) = a (ASCII), and the same scheme applies to any character that is repeated consecutively:
- a = C2
- aa = 3C643AC23E
- aaa = 3C643AC23E
- aaaa = 3C663AC23E
- aaaaa = 3C683AC23E
- aaaaaa = 3C6A3AC23E
- aaaaaaa = 3C6C3AC23E
- aaaaaaaa = 3C6E3AC23E
- aaaaaaaaa = 3C703AC23E
- aaaaaaaaaa = 3C723AC23E
- aaaaaaaaaaa (10) = 3C62603AC23E
- aaaaaaaaaaaa = 3C62623AC23E
- aaaaaaaaaaaaaaaaaaaaaa (20) = 3C64603AC23E
- aaa... (100) = 3C6260603AC23E (guess)
And for text not enclosed in these delimiters:
- a = C2
- aa = C2C2
- aaa = 3C643AC23E
- ...

Note that these examples show the bytes before dividing by 2; after division every hex value is halved (C2 = 61; 3C = 1E; ...).

Since I am just starting out in C#, this part is extremely complicated for me! So let me ask a few questions before I set off without knowing where I'm going: how should I approach this? Do you have any leads on what I should use, and what is the simplest method in your opinion?
My goal is to find, analyze, and replace the parts of the string encoded with this second level, for example replacing 3C643AC23E with C2C2.

Thank you in advance for your contribution!

My program:

I open the file and divide each byte by 2. This is the first level of encoding:
Read file:
var fileContent = string.Empty;
var filePath = string.Empty;
using (OpenFileDialog openFileDialog = new OpenFileDialog())
{
    openFileDialog.InitialDirectory = "%userprofile%";
    openFileDialog.Filter = "bin files (*.bin)|*.bin";
    openFileDialog.FilterIndex = 2;
    openFileDialog.RestoreDirectory = true;
    if (openFileDialog.ShowDialog() == DialogResult.OK)
    {
        filePath = openFileDialog.FileName;
        var fileStream = openFileDialog.OpenFile();
        byte[] readBytes = File.ReadAllBytes(filePath);
        for (var i = 0; i < readBytes.Length; i++)
        {
            readBytes[i] /= 2;
        }
        foreach (byte s in readBytes)
        {
            Console.WriteLine(s);
        }
        byte[] data = readBytes;
        string hex = BitConverter.ToString(data).Replace("-", string.Empty);

Then I modify the code to extract only the parts that interest me:
Extract parts:
int pFrom = hex.IndexOf("2A60602A");
int pStart = hex.IndexOf("23322931342B391E18181805") + "23322931342B391E18181805".Length;
int pTo = hex.LastIndexOf("2A4560602A") + "2A4560602A".Length;
string startCode = hex.Substring(pStart, pFrom - pStart);
string endCode = hex.Substring(pTo);
string newCode = startCode + endCode;

Then I convert the result back to bytes and save it to a new file:
Save new file:
int numberCharsCode = newCode.Length;
byte[] newCodeBytes = new byte[numberCharsCode / 2];
for (int w = 0; w < numberCharsCode; w += 2)
{
     newCodeBytes[w / 2] = Convert.ToByte(newCode.Substring(w, 2), 16);
}
string newFilePath = filePath.Replace(".bin", "_decrypted.bin");
File.WriteAllBytes(newFilePath, newCodeBytes);
 
What game or program are you trying to hack that you need to "deobfuscate" the data? Wouldn't it be better to contact the game developers?
 
As an aside, it looks like you failed to divide 0x1A, 0x14 by two to get 0x0D, 0x0A, which are the ASCII characters for CR, LF.
 
Anyway, divide the bytes by 2 and then look up the ASCII equivalents. The pattern becomes obvious.
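
To make that concrete, here is a minimal sketch that halves the sample 3C62603AC23E from the first post and prints each resulting byte with a readable name. The Halver class and its Describe helper are hypothetical names chosen for illustration; they are not part of the original program.

```csharp
using System;
using System.Linq;

static class Halver
{
    // First level of the obfuscation: every byte is halved.
    public static byte[] Halve(byte[] raw) => raw.Select(b => (byte)(b / 2)).ToArray();

    // Readable name for a halved byte: ASCII separators get their standard names,
    // printable characters are shown as-is, anything else as hex.
    public static string Describe(byte b) => b switch
    {
        0x1E => "RS",
        0x1D => "GS",
        0x1F => "US",
        0x0D => "CR",
        0x0A => "LF",
        >= 0x20 and < 0x7F => ((char)b).ToString(),
        _ => $"0x{b:X2}"
    };

    public static string DescribeAll(byte[] raw) =>
        string.Join(" ", Halve(raw).Select(Describe));
}
```

Calling Halver.DescribeAll(new byte[] { 0x3C, 0x62, 0x60, 0x3A, 0xC2, 0x3E }) yields "RS 1 0 GS a US": a separator, the decimal repeat count 10, a second separator, the repeated character, and a closing separator.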
 
Regarding your question about efficiency and better ways to do this: if you keep your data in bytes throughout (i.e., do not convert to strings), then the division will be faster, as will scanning for the offsets and writing back to a file, because there are fewer bytes for the computer to look at. Each character in .NET takes 2 bytes, so 2 hex digits in a string take up 4 bytes, whereas the original data is just a single byte.
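
A small sketch of that arithmetic, counting payload bytes only and ignoring object headers. SizeDemo is a hypothetical name used only for this illustration.

```csharp
using System;

static class SizeDemo
{
    // Payload size of n data bytes kept as a byte[].
    public static int AsBytes(int n) => n * sizeof(byte);

    // Payload size of the same data as a hex string:
    // two hex digits per byte, and each .NET char is a 2-byte UTF-16 code unit.
    public static int AsHexString(int n)
    {
        string hex = BitConverter.ToString(new byte[n]).Replace("-", string.Empty);
        return hex.Length * sizeof(char);
    }
}
```

For a 1000-byte file, SizeDemo.AsBytes(1000) is 1000 while SizeDemo.AsHexString(1000) is 4000, so every scan over the hex string touches four times as much memory.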
 
Hello Skydiver!

Not a game, but a text editor that locks the text with a password and obfuscates the content!
I have already managed to crack the password, and I can unlock the text by deactivating the password from within the application, but I want to improve my program for my own learning! To be clear, this is not sensitive data: I encrypted a text file myself a long time ago and was able to recover it. Since I'm learning C#, I thought it would be a good challenge to get started!

Thank you for your answers, but I already halve the bytes directly after reading the file! What could be improved is extracting part of the data without converting it to a string, but it doesn't seem to me that this is possible in C#. In any case, I couldn't find anything! Is there a function you could suggest for this?

Regarding the obfuscation, I do manage to turn 0x1A/0x14 into the ASCII CR/LF characters, which is as expected! They render correctly in the text file.
What I can't work out is the second level of obfuscation in my example, which applies only to consecutive repeated characters.
I can see the delimiters that are used.
For example, for a run of 10 a's between the delimiters 5E54/545E:
Obfuscated code for the run:
3C62603AC23E
//3C is the start delimiter and 3E is the end delimiter of the whole token
//3C is the start delimiter and 3A is the end delimiter of the repeat count
//3A is the start delimiter and 3E is the end delimiter of the character
//So C2 = a for the character, 62 = 1 for the tens and 60 = 0 for the units, which gives 10 x "a" = aaaaaaaaaa

I can't work out how to handle this encoding in C# because, as you can see, there is also another encoding for the characters between the delimiters 5E5E/1A14. Moreover, for 100 identical consecutive characters there is a third hex byte in the count. I would just like to be pointed toward the best way to write this set of conditionals in C#. Do you have a suggestion?

Sorry for my broken English, I'm not a native English speaker :)
 
There is no magic in C# or any other language. The problem is language agnostic. It's just a matter of processing the data.
 
Moving to C# General since nothing here is IDE specific. For that matter, this isn't really C# specific other than asking for code to do things more efficiently in C#.
 
Just doodling at the keyboard. Untested code that should be equivalent to the relevant code that reads, scans, and writes out the file data:
C#:
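var readBytes = File.ReadAllBytes(filePath);

// First level: halve every byte. The patterns searched for below are in halved form.
for (var i = 0; i < readBytes.Length; i++)
    readBytes[i] /= 2;

var newCodeBytes = GetInterestingBytes(readBytes);
File.WriteAllBytes(newFilePath, newCodeBytes);

byte[] GetInterestingBytes(ReadOnlySpan<byte> original)
{
    int start = FindEndOfPattern(original, new byte[] { 0x23, 0x32, 0x29, 0x31, 0x34, 0x2B, 0x39, 0x1E, 0x18, 0x18, 0x18, 0x05 });
    int from = FindStartOfPattern(original, new byte[] { 0x2A, 0x60, 0x60, 0x2A });
    // The string version used LastIndexOf for this marker, so take its last occurrence.
    int to = FindEndOfLastPattern(original, new byte[] { 0x2A, 0x45, 0x60, 0x60, 0x2A });

    var startBytes = original[start..from];
    var endBytes = original[to..];

    var interestingBytes = new byte[startBytes.Length + endBytes.Length];
    startBytes.CopyTo(interestingBytes);
    // Copy into a span over the same array; interestingBytes[n..] would create a separate copy.
    endBytes.CopyTo(interestingBytes.AsSpan(startBytes.Length));
    return interestingBytes;

    static int FindStartOfPattern(ReadOnlySpan<byte> haystack, ReadOnlySpan<byte> needle)
    {
        var index = haystack.IndexOf(needle);
        if (index < 0)
            throw new ArgumentOutOfRangeException(nameof(needle));
        return index;
    }

    static int FindEndOfPattern(ReadOnlySpan<byte> haystack, ReadOnlySpan<byte> needle)
        => FindStartOfPattern(haystack, needle) + needle.Length;

    static int FindEndOfLastPattern(ReadOnlySpan<byte> haystack, ReadOnlySpan<byte> needle)
    {
        var index = haystack.LastIndexOf(needle);
        if (index < 0)
            throw new ArgumentOutOfRangeException(nameof(needle));
        return index + needle.Length;
    }
}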

The code works with bytes directly instead of converting to a string and then back to bytes again.
Using spans minimizes the amount of memory that has to be allocated and copied around.
 
Oh So Sick said: "I can't work out how to handle this encoding in C# because, as you can see, there is also another encoding for the characters between the delimiters 5E5E/1A14. Moreover, for 100 identical consecutive characters there is a third hex byte in the count. I would just like to be pointed toward the best way to write this set of conditionals in C#. Do you have a suggestion?"
Yes. Break the big problem down into smaller problems and tackle them one at a time. Here's some pseudo-code:
Pseudo-code:
collect the bytes between the 5E54/545E or 5E5E/1A14 delimiters.
divide those bytes by two
if it's a single byte
    make a Unicode string containing that single ASCII character
    return that string

verify that the first byte is the ASCII Record Separator (RS) character (0x3C / 2 = 0x1E)
if it's not an RS character
    declare an error
    done

prepare a byte buffer
skip that first byte
while current byte is not ASCII Group Separator (GS) character
    append the current byte to the byte buffer and advance to the next byte
convert the bytes in the buffer from an ASCII encoded string to a Unicode encoded string
parse the Unicode encoded string to get the count of repeats

prepare a byte buffer
skip the GS character
while current byte is not ASCII Unit Separator (US) character
    append the current byte to the byte buffer and advance to the next byte
convert the bytes in the buffer from an ASCII encoded string to a Unicode encoded string

prepare a string builder
for count times
    append that Unicode encoded string into the string builder

return the contents of the string builder
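
As a rough C# rendering of the pseudo-code above, here is a minimal sketch that decodes one token. It assumes the token has already been cut out of the file and halved, and that it is either a single character or has the shape RS, decimal count, GS, text, US. The RunToken class name is hypothetical.

```csharp
using System;
using System.Text;

static class RunToken
{
    const byte RS = 0x1E, GS = 0x1D, US = 0x1F;

    // Decodes one already-halved token: either a single character,
    // or RS <decimal count> GS <text> US.
    public static string Decode(ReadOnlySpan<byte> token)
    {
        if (token.Length == 1)
            return Encoding.ASCII.GetString(token);

        if (token.Length == 0 || token[0] != RS)
            throw new FormatException("Token does not start with RS.");

        int gs = token.IndexOf(GS);
        int us = token.IndexOf(US);
        if (gs < 0 || us < gs)
            throw new FormatException("Missing GS or US separator.");

        // Digits between RS and GS are the repeat count in ASCII decimal.
        int count = int.Parse(Encoding.ASCII.GetString(token[1..gs]));
        // Bytes between GS and US are the text to repeat.
        string text = Encoding.ASCII.GetString(token[(gs + 1)..us]);

        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
            sb.Append(text);
        return sb.ToString();
    }
}
```

For the halved sample 1E 31 30 1D 61 1F, RunToken.Decode returns ten a's. ASCII decoding is safe here because a halved byte can never exceed 0x7F.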
 
Lines 12-17 of the pseudo-code above can also be done without parsing a Unicode string, by accumulating the digits directly:
Pseudo-code:
let count = 0
skip the RS character
while current byte is not GS
    value = byte value - ASCII '0'
    count = count * 10 + value
    advance to the next byte
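
Putting the digit arithmetic to work, the following sketch expands every run token in a whole buffer while staying in bytes. It rests on several assumptions that the thread does not confirm: the buffer has already been halved, the byte 0x1E never appears as ordinary data, tokens are not nested, and the differences between the 5E54/545E and 5E5E/1A14 regions are ignored. RunExpander is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

static class RunExpander
{
    const byte RS = 0x1E, GS = 0x1D, US = 0x1F;

    // Expands every RS <count> GS <bytes> US token in an already-halved buffer.
    // Bytes outside tokens are copied through unchanged.
    public static byte[] Expand(ReadOnlySpan<byte> data)
    {
        var output = new List<byte>(data.Length);
        int i = 0;
        while (i < data.Length)
        {
            if (data[i] != RS)
            {
                output.Add(data[i++]);
                continue;
            }

            i++; // skip RS
            int count = 0;
            while (i < data.Length && data[i] != GS)
            {
                int value = data[i] - (byte)'0';
                if (value < 0 || value > 9)
                    throw new FormatException("Non-digit byte in repeat count.");
                count = count * 10 + value;
                i++;
            }
            i++; // skip GS

            int start = i;
            while (i < data.Length && data[i] != US)
                i++;
            if (i >= data.Length)
                throw new FormatException("Token is missing its US terminator.");
            byte[] repeated = data[start..i].ToArray();
            i++; // skip US

            for (int n = 0; n < count; n++)
                output.AddRange(repeated);
        }
        return output.ToArray();
    }
}
```

For example, the halved bytes 62 1E 33 1D 61 1F 63 expand to the ASCII text "baaac". If the expanded data has to go back into the obfuscated format, each output byte would need to be doubled again.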
 
Thanks for everything, Skydiver!! You're a big help!! I'm going to work on all of this over the weekend! I would love to have nothing else to do because this is exciting, but unfortunately work takes up a lot of my time...

Thanks again for your clear explanations!!
 