Help understanding some C# code

John Franklin

Member
Hello all. I dabbled in C# a couple of years ago, but didn't really progress much past the beginner phase. The code below is a C# answer I found to a challenge I solved via Excel/VBA, but whereas my code takes hours to find the solution, the C# code below takes around 3 minutes(!). I was blown away by that and since my C# skills are pretty minimal at this point, I don't completely understand how the code works. So, to that end, I was hoping someone could help give me an understanding of what's going on in this code and how it is determining the solution. I don't know how involved that is, but if someone has the time and inclination, I'd really like to learn/understand how this code finds the solution. Many thanks to whomever is willing to help me with this!

The challenge this code solves is based off the old game Battleship and finds the answer of how many unique starting boards are there for the game Battleship. Meaning, how many ways are there to place all the ships of various cell lengths (5,4,3,3,2) vertically or horizontally (no diagonals) in a 10x10 grid with no overlap/intersection of ships. I was a bit shocked at the answer, but the code below confirmed my answer, and did it WAY WAY faster than my code, which is why I'd like to understand how the C# works.

C#:
``````using System;
using System.Collections.Generic;

namespace test4
{
class BShip
{
class BPat
{

public BPat(long m1, long m2)
{
}
public bool canMerge(long m1, long m2)
{
return (m1 & mask1) == 0 && (m2 & mask2) == 0;
}
public void mergeWith(ref long m1, ref long m2)
{
}
}

static private readonly IList<BPat>[] ship = new List<BPat>[5];

static void Main()
{
for( int ln = 2; ln<=5; ln++ )
{
IList<BPat> list = new List<BPat>();
// by symmetry, we can restrict the len5 ship to being vertical and in the top left quarter of the field.
int mxx = ln == 5 ? 5 : 10;
int mxy = ln == 5 ? 3 : 10;
for (int x = 0; x < mxx; x++)
{
for (int y = 0; y < mxy; y++)
{
placeShip(list, ln, x, y, 0, 1);
if( ln!=5) placeShip(list, ln, x, y, 1, 0);
}
}
ship[5-ln] = list;
}
ship[4] = ship[3];   // extra len 3 ship
ship[3] = ship[2];
long  count = search(0, 0L, 0L);
// 8 symmetries
Console.WriteLine(count * 8);
}
static long search(int depth, long m1, long m2)
{
if (depth == 5) return 1;
long count = 0;
foreach (BPat p in ship[depth])
{
if( depth == 0 ) Console.WriteLine(count);
if( p.canMerge(m1,m2))
{
long oldm1 = m1, oldm2 = m2;
p.mergeWith(ref m1,ref m2);
count += search(depth + 1, m1, m2);
m1 = oldm1;
m2 = oldm2;
}
}
return count;
}

static void placeShip(IList<BPat> list, int ln, int x, int y, int dx, int dy)
{
if (x + dx*(ln-1) >= 10 || y + dy*(ln-1) >= 10) return;
int loc = 10*y + x;
long m1 = 0, m2 = 0;
for (int i = 0; i < ln; i++)
{
if (loc >= 50) m2 |= 1L << (loc - 50);
else m1 |= 1L << loc;
loc += 10*dy + dx;
}
BPat p = new BPat(m1, m2);
}
}
}``````

Last edited:

John Franklin

Member
Lol. Sorry for the Columbo like "Just one more thing before I go...." routine, but from a technical standpoint, it sounds like those 120 individual graph sheets from above are actually 120 individual arrays, which themselves are stored in an array, basically an array of arrays containing our 120 sub arrays. That seems correct, but then I thought about it more and that would mean 760 individual arrays (120+140+160+160+180) and that didn't seem right.

Skydiver

Staff member
That's correct 760 individual arrays.

John Franklin

Member
Well, after grappling with the concept of 700+ arrays, I have another question for you. How does the C# code deal with the symmetry? For example, you really only need 15 of the 120 5 cell length arrays, then can multiply each individual result by 8, which C# seems to do as there is a line of code that multiplies a value by 8. In VBA, I ran the 120 through a filtering process that grouped the 120 into 15 groups of 8, then just picked the first one in each group to be representative of that group to run through the process. What does the C# code do relative to this?

Skydiver

Staff member
See lines 45-47:
C#:
``````// by symmetry, we can restrict the len5 ship to being vertical and in the top left quarter of the field.
static IEnumerable<Field> GenerateCarrierPositions()
=> GenerateShipPositions(5, 5, 3, Orientation.Vertical);``````

I misspoke above about accepting the 120 number for the carriers. There are less positions stored for the carrier. Only the vertical positions of the carriers in the 10x8 top left corner of the field.

John Franklin

Member
Ok. I think I'm almost there. Here's the question for today. Let's say we're comparing one of our 4 length ship arrays with one of the 5 length ones. How does the C# code determine that an element is occupied in both arrays? For example, let's say the element at 3,5 in both arrays is occupied. How is that determined? In my VBA research, it seemed like the only solution to this was to loop through one of the arrays and see if the position of an occupied element is occupied in the other array- there isn't any sort of native function that does that type of thing.

Skydiver

Staff member
See lines 91-98, and lines 70-71.

Lines 91-98 turn turn on a bit in a pair of 64-bit integers. Lines 70-71 perform an AND operation between two pairs of 64-bit integers. If the AND operation returns non-zero that means that bits in the first pair have corresponding bits on the second pair turned on.

As you recall from your Boolean logic, here the truth table for AND
C#:
``````  | F T
--+----
F | F F
T | F T``````

If false is 0, and true is 1, then you have:
C#:
``````  | 0 1
--+----
0 | 0 0
1 | 0 1``````

A 10x10 two dimensional array can be mapped into a one dimensional array by using the formula x + 10 * y (where x and y are 0 based). So the 10x10 two dimensional boolean array where false indicates an empty space, and true indicates a filled space can be mapped to a 100 element one dimensional boolean array.

That boolean array can be mapped into the bits of an integer. Using the same formula, the bit index that needs to be turned on will be the (x + 10 * y)-th bit.

Unfortunately, the current longest natural type built into the .NET Framework (and .NET Core), and also supports bit operators, and lets as quickly check the results is the 64-bit integer. So the code in lines 91-98 also knows how to overflow past the 64th bit into the next integer.

.NET Framework and .NET Core have the `BitArray` class that supports bit operations, but it has no quick way to check for results. One would have to iterate over all the bit results and would be quite slow. It would be almost as slow as checking pairs of booleans in separate arrays.

Skydiver

Staff member
there isn't any sort of native function that does that type of thing.
VBA supposedly supports bitwise operators:

John Franklin

Member
VBA supposedly supports bitwise operators:
I was speaking relative to comparing one array to another at a higher level, not at that the bit level, but you're right. I didn't know Logical/Bitwise Operators existed for VBA until I did more research. How is the boolean array mapped to the bits of the integer? Assuming I'm thinking right, that 1 dimensional 100 element array (converted from a 10x10 array) is basically a 100 bit binary number, just a sequence of 1s and 0s. Is that sequence of array elements converted to a 64 bit number? Is that the mapping you speak of? (You'll have to help me out with how the overflow code deals with the values greater than 64)

And i should mention that your explanation above regarding lines 91-98 and lines 70-71 was outstanding, as have been all your explanations. It totally made sense and I immediately set out to discover if this was translatable to VBA, which is how I came across the Logical/Bitwise Operators. It looks like the same type of mapping to a 64 bit number type is possible, but i didn't fully understand how the array values were being mapped to the 64 bit number.

Just out of curiosity, how complex is the original code i posted? What C# level of expertise could come up with that? Beginner? Intermediate? Advanced? Thanks again for all your help!

​

Skydiver

Staff member
Just out of curiosity, how complex is the original code i posted? What C# level of expertise could come up with that? Beginner? Intermediate? Advanced?
That original code looked and felt like C code that was just translated into C#. It's about intermediate level of C programming because C likes to keep you close to the metal. Being close to the metal involves knowing how bits, bytes, and arrays work in memory. That is how C is traditionally taught: start off with the basics of how things are represented in binary. The intermediate kicks in when recursion is introduced.

As for how that maps to C# level, it would be closer to the high end of intermediate in terms of concepts because of the recursion and dealing with bits, but at the same time it is also at the beginner level because that is not how you write C# code. C# tends to be more object oriented, and less procedural, while C code is traditionally written procedurally, and only more advanced C programmers will apply object oriented approaches.

Skydiver

Staff member
How is the boolean array mapped to the bits of the integer? Assuming I'm thinking right, that 1 dimensional 100 element array (converted from a 10x10 array) is basically a 100 bit binary number, just a sequence of 1s and 0s. Is that sequence of array elements converted to a 64 bit number? Is that the mapping you speak of?
Yes, that sequence of 1s and 0s becomes a 100-bit number. But since there is no natural 100 bit number in C# (yet), I used two 64-bit numbers instead. The first one, `_low` holds the lower 64-bits, and the second one, `_high` holds the remainder. In the original code you posted, the author of that code opted to put the lower 50 bits in `m1` and the upper 50 bits in `m2`.

Crash course on how binary numbers work. Let's start off with just 4 bits (called a nibble), and their corresponding values:
C#:
``````binary => decimal = computed as
0000 =>  0 = 0 * 8 + 0 * 4 + 0 * 2 + 0 * 1
0001 =>  1 = 0 * 8 + 0 * 4 + 0 * 2 + 1 * 1
0010 =>  2 = 0 * 8 + 0 * 4 + 1 * 2 + 0 * 1
0011 =>  3 = 0 * 8 + 0 * 4 + 1 * 2 + 1 * 1
0100 =>  4 = 0 * 8 + 1 * 4 + 0 * 2 + 0 * 1
0101 =>  5 = 0 * 8 + 1 * 4 + 0 * 2 + 1 * 1
0110 =>  6 = 0 * 8 + 1 * 4 + 1 * 2 + 0 * 1
0111 =>  7 = 0 * 8 + 1 * 4 + 1 * 2 + 1 * 1
1000 =>  8 = 1 * 8 + 0 * 4 + 0 * 2 + 0 * 1
1001 =>  9 = 1 * 8 + 0 * 4 + 0 * 2 + 1 * 1
1010 => 10 = 1 * 8 + 0 * 4 + 1 * 2 + 0 * 1
1011 => 11 = 1 * 8 + 0 * 4 + 1 * 2 + 1 * 1
1100 => 12 = 1 * 8 + 1 * 4 + 0 * 2 + 0 * 1
1101 => 13 = 1 * 8 + 1 * 4 + 0 * 2 + 1 * 1
1110 => 14 = 1 * 8 + 1 * 4 + 1 * 2 + 0 * 1
1111 => 15 = 1 * 8 + 1 * 4 + 1 * 2 + 1 * 1``````

But binary is simply base 2 (just like decimal is base 10). So we can look at a binary digit as:
C#:
``d c b a``
where
`a` as the value * 1 == value * (2 raised to the 0th power)
`b` as the value * 2 == value * (2 raised to the 1st power)
`c` as the value * 4 == value * (2 raised to the 2nd power)
`d` as the value * 8 == value * (2 raised to the 3rd power)

This places: `a`, `b`, `c`, `d` are also called bit positions where the exponent of 2 describes the position.

So let's start off with a value of 1: 0001 binary. To get from 1 to 2, we multiply 1 by 2 and get 0010 binary. And then when we multiply 2 by 2 again we get 4: 0100 binary. Again multiply by 4 by 2 to get 8: 1000 binary. Makes sense, right since 1, 2, 4, 8 are just powers of 2.

But also notice how the 1 simply shifted over from one place digit over to the next place digit to the left. The number of places shifted over to the left corresponds to the exponent of 2. (This also works out for any binary number be it the binary number has 8 bits, 12 bits, 16 bits, 24 bits, 32 bits, 64 bits, 128 bits, etc. as long as the language has a way of representing a number with that number of digits.)

The shift operator `<<` is used to do the shifting. See lines 95 and 97 in my code and line 82 and 83 of the code you posted.

By using the OR bitwise operator (`|`, we can turn on the corresponding bit. So for example, we start off with zero, and we want to turn on the 3rd bit: we would do:
C#:
``````   0000
|  1000
-------
=  1000``````
and then we go on to turn on the 0th bit:
C#:
``````   1000
|  0001
-------
=  1001``````
or in code:
C#:
``````int value = 0;
value = value | (1 << 3);
value = value | (1 << 0);``````
or using the OR assignment operator syntatic sugar:
C#:
``````int value = 0;
value |= 1 << 3;
value |= 1 << 0;``````

So how to deal with the overflow of going over 64-bits? Notice in my code's line 94 a check is first made to see if the bit to be turned on is greater than or equal to 64. If it is, line 95 is executed were the `bit - 64`-th bit of `_high` is turned on. Similarly, in the original code you posted, if the bit is greater than or equal to 50, the `loc - 50`-th bit of `m2` is turned on.

As noted previously the bit position is determined by the x,y position. Bits 0 to 9 correspond to the 1st row. Bits 10-19 correspond to the second row. Bits 20-29 correspond to the third row. Etc. Notice how the formula works out to be x + 10 * y where x and y are zero based. This is the mapping of the grid position to the bit position.

John Franklin

Member
Good stuff. Another excellent tutorial. So, a couple of questions for today. Does the code iterate/loop through the 1d array and assign the bit values as you described above that way? Secondly, what is the recursion section of the code responsible for doing, creating, etc.?

Skydiver

Staff member
Does the code iterate/loop through the 1d array and assign the bit values as you described above that way?
Yes. See lines 76-89

Secondly, what is the recursion section of the code responsible for doing, creating, etc.?
It's the process of grabbing a sheet from the folder and putting it into the left hand, and then looking at the papers through the light to see if there are any collisions. If no collision, then the next level recursion repeats the process of picking up a sheet from the next folder. The recursion keeps happening until there are no more folders, and a configuration is counted. Then you bubble up one level of recursion and put the topmost sheet back into the folder it came from and pickup the next sheet in the folder, check for collisions, and work your way down again. If you've exhausted the current folder, then bubble up again.

Instead of just reading my description. Try it out for yourself with. Just do 3 sheets for each ship so that you are not bogged down.

John Franklin

Member
Thanks. My end game is to try to replicate the concepts you're helping me with here into VBA, much like the C code was modified for C# use. To that end, I'm trying understand both the concepts and the code mechanics that carry out those concepts. This uses concepts (like recursion and bit level comparisons, for example) that I've never used myself or really even seen used when looking at the VBA code of others. Take recursion for example. When I researched it relative to VBA, there just wasn't a lot out there that I found useful. The mechanics of it was often described via the same example- a Sub procedure makes a Function call and passes a value to the Function. The Function ends up calling itself and modifies that original passed value to iterate through until the escape/exit value triggers an exit. It didn't help that most of these recursion descriptions seemed to indicate that you could achieve the same type thing via a series of nested FOR loops, so that was a bit confusing to me, as in what would the benefit of using recursion be then?

What you described above for how recursion works here is identical in concept to what my VBA code does, but my code doesn't use recursion in the manner described above regarding passing a value to a function which then calls itself. Mine just uses a series of nested FOR loops to grab the first 5 element item and compare it to the first 4 element item. If no intersect/overlap, get the first 3 item element. If intersect or overlap, go to the next 4 item and continue until no intersect/overlap. This just repeats and works its way down to the 2 item level. Once you've found a 2 item that doesn't intersect any of the other four, that's counted as one configuration, then we bubble up to the next 3 item and keep going. This just continues until we've bubbled back up through all the 5 element items. Maybe what I'm doing is technically recursion, but it doesn't seem like it based on the descriptions/examples I've seen of it, namely the function calling itself aspect.

With your latest tutorial, I'm thinking I'm finally at level where I can start translating to VBA. The bit mapping/comparison aspect should be interesting. Let me ask you this. You shared why you thought this started out as C code and that C# code wouldn't be written like that. As someone who's obviously proficient in C#, what would you do differently? If you were starting this from scratch, how would you write the code? Would it just look like the modified code of the original that you posted (which I believe was done that way for my benefit), or would it look different?

Skydiver

Staff member
The recursive code lets you add more ships simply by adding more lists into the array at line 36. With your nested for loops, you'll need to add more levels of nesting.

In computer science, there is an axiom that any recursive code can be re-written to be iterative. One approach of doing that iterative implementation is by hardcoding the levels of recursion just like you have done with your nested for loops. Another approach is to use queues and stacks to keep track of the levels of recursion.

What I would have done differently is that my initial implementation would have had a 10x10 two dimensional array of booleans in a `Field` class to first make sure that I code that was working correctly, albeit slowly. Then I would have used a profiler to determine where the bottlenecks and/or most used areas of code are at. Although intuitively it would have been the array comparison and merging, but it's good to have data to back that up, as well as an objective measure to compare against when refactoring the code. Only then would I start refactoring that area of code to either use parallelism (e.g. multiple threads) or use bit fields. Only once I've gone down the bit field path would I have converted the `class Field` into a `struct Field`, due to performance improvements that can be gained when not using an object reference to 100+ bytes and instead using a value type that is only 128 bits (16 bytes).

John Franklin

Member
Would your way translate into speed gain? Every time I ran the original code it completed in under four minutes, so improving on that impressive speed would be...even more impressive.

What would need to be added to my nested loops to beef up the levels nesting?

Last edited:

Skydiver

Staff member
See post #13. ~90 seconds for the code in post #10.

John Franklin

Member
When the 2D arrays are originally created, meaning the 15 for the 5 element arrays, 140 for the 4 element arrays, etc., why aren't they just created as 1D arrays right then? If they just end up being 1D anyway, why not just create them that way from the get go? We may just be talking about milliseconds here in terms of conversion time, but it does seem like it would save a step.

Skydiver

Staff member
Programs must be written for people to read, and only incidentally for machines to execute.

- Harold Abelson

Fixed size 2D arrays are implemented by modern compilers to be 1D in memory anyway. So it just comes down to what is easier to understand (and maintain).

C#:
``````field[x + 10 * y] = true;
:
if (field[x + 10 * y])
{
}``````

vs.

C#:
``````field[x, y] = true;
:
if (field[x, y])
{
}``````

Replies
17
Views
419
Replies
1
Views
230
Replies
3
Views
156
Replies
3
Views
298
Replies
1
Views
407