Help understanding some C# code

John Franklin · Jan 22, 2022

Hello all. I dabbled in C# a couple of years ago, but didn't really progress much past the beginner phase. The code below is a C# answer I found to a challenge I solved via Excel/VBA, but whereas my code takes hours to find the solution, the C# code below takes around 3 minutes(!). I was blown away by that and since my C# skills are pretty minimal at this point, I don't completely understand how the code works. So, to that end, I was hoping someone could help give me an understanding of what's going on in this code and how it is determining the solution. I don't know how involved that is, but if someone has the time and inclination, I'd really like to learn/understand how this code finds the solution. Many thanks to whomever is willing to help me with this!

The challenge this code solves is based off the old game Battleship and finds the answer of how many unique starting boards are there for the game Battleship. Meaning, how many ways are there to place all the ships of various cell lengths (5,4,3,3,2) vertically or horizontally (no diagonals) in a 10x10 grid with no overlap/intersection of ships. I was a bit shocked at the answer, but the code below confirmed my answer, and did it WAY WAY faster than my code, which is why I'd like to understand how the C# works.

C#:

using System;
using System.Collections.Generic;

namespace test4
{
   class BShip
   {
      class BPat
      {
         private readonly long mask1;
         private readonly long mask2;

         public BPat(long m1, long m2)
         {
            mask1 = m1;
            mask2 = m2;
         }
         public bool canMerge(long m1, long m2)
         {
            return (m1 & mask1) == 0 && (m2 & mask2) == 0;
         }
         public void mergeWith(ref long m1, ref long m2)
         {
            m1 |= mask1;
            m2 |= mask2;
         }
      }


      static private readonly IList<BPat>[] ship = new List<BPat>[5];
  
      static void Main()
      {
         for( int ln = 2; ln<=5; ln++ )
         {
            IList<BPat> list = new List<BPat>();
            // by symmetry, we can restrict the len5 ship to being vertical and in the top left quarter of the field.
            int mxx = ln == 5 ? 5 : 10;
            int mxy = ln == 5 ? 3 : 10;
            for (int x = 0; x < mxx; x++)
            {
               for (int y = 0; y < mxy; y++)
               {
                  placeShip(list, ln, x, y, 0, 1);
                  if( ln!=5) placeShip(list, ln, x, y, 1, 0);
               }
            }
            ship[5-ln] = list;
         }
         ship[4] = ship[3];   // extra len 3 ship
         ship[3] = ship[2];
         long  count = search(0, 0L, 0L);
         // 8 symmetries
         Console.WriteLine(count * 8);
      }
      static long search(int depth, long m1, long m2)
      {
         if (depth == 5) return 1;
         long count = 0;
         foreach (BPat p in ship[depth])
         {
            if( depth == 0 ) Console.WriteLine(count);
            if( p.canMerge(m1,m2))
            {
               long oldm1 = m1, oldm2 = m2;
               p.mergeWith(ref m1,ref m2);
               count += search(depth + 1, m1, m2);
               m1 = oldm1;
               m2 = oldm2;
            }
         }
         return count;
      }

      static void placeShip(IList<BPat> list, int ln, int x, int y, int dx, int dy)
      {
         if (x + dx*(ln-1) >= 10 || y + dy*(ln-1) >= 10) return;
         int loc = 10*y + x;
         long m1 = 0, m2 = 0;
         for (int i = 0; i < ln; i++)
         {
            if (loc >= 50) m2 |= 1L << (loc - 50);
            else m1 |= 1L << loc;
            loc += 10*dy + dx;
         }
         BPat p = new BPat(m1, m2);
         list.Add(p);
      }
   }
}

Skydiver · Jan 22, 2022

So that code is what you want to learn about, but how can we compare it to your code which you said ran slower without seeing your code. How will we know if you two took the same approach to the problem, or you each separately had different approaches? Perhaps their algorithm was better than your algorithm. Or perhaps the two of you had the same algorithm, but their implementation was just more efficient than yours.

John Franklin · Jan 23, 2022

Skydiver said:
So that code is what you want to learn about, but how can we compare it to your code which you said ran slower without seeing your code. How will we know if you two took the same approach to the problem, or you each separately had different approaches? Perhaps their algorithm was better than your algorithm. Or perhaps the two of you had the same algorithm, but their implementation was just more efficient than yours.

My code in VBA is basically brute force. Start with a 5 cell range, say A1:E1. Find a 4 cell range that doesn't intersect that. Find a 3 cell range that doesn't intersect the 5 or 4. Find another 3 cell that doesn't intersect the 5.4. or 3. Find a 2 cell that doesn't intersect the 5,4,3, or 3, and so forth down the line. To do this, all ranges for each size range are stored in individual Collections. For example, there are 120 5 cell sized ranges that will fit in that 10x10 grid. 140 4 cell sized ranges. 160 3 cell sized ranges. 180 2 cell sized ranges. All those ranges for each range size are stored in separate Collections, meaning 5 Collections total. The basic logic is exactly that- basic. Start with the first 5 cell range value in the 5 Collection. Compare that to the first value in the 4 cell Range collection. Do they intersect? If yes, go to the next 4 cell range. Repeat until a non intersecting 4 cell range is found. Now go to the first value in the first 3 cell range Collection. Does that value intersect either the 5 cell or 4 cell range values? If yes, etc., etc., through all the 2 cell range values. This all done via a series of nested loops. I wrote it that way because I didn't know how to write it in a way that produced faster results either in a non brute force way, or in a faster brute force way. So, when I came across that C# code and it found the same answer my code did but many many times faster, I wanted to understand how the C# code worked as it is clearly either a way more efficient version of what I did, or something entirely different.

As my C# skills are sketchy at best, I don't know if that code is doing what I did but better, or is using a completely different method to get the results. That's why I'm here and posted the code. I'm trying to get an understanding of how the code works as I don't fully understand it. My ultimate end game is to try and do something similar in VBA, but I need to fully understand how the C# code works before I attempt that.

Does that help?

cbreemer · Jan 23, 2022

Somehow I would not be at all surprised if a C# program is much faster than a VB script in Excel

But as Skydiver said, it's possible (I think even likely) that the C# code uses a faster algorithm than your own code. In my opinion that is for you to work out. You hardly need any C# skills to study and understand the algorithm used here. The logic would be the same in any language.

Skydiver · Jan 23, 2022

Also recall that VBA is compiled to P-code, and then that P-code is interpreted each time it is run. Compare this to C# which gets compiled to IL, and then that IL is compiled to native assembly the first time it is run. For the same algorithm, native assembly is faster than interpreted.

Skydiver · Jan 23, 2022

Now that I've started looking at the code posted in #1. Notice in the comments that the code only does 1/8th of the work because it applies the concept of symmetries where the board can be rotated and or mirrored. In your code in post #8, it doesn't sound like you are applying this principle and so you are effectively doing 8 times the amount of work as you try out the other board rotations and mirrors.

John Franklin · Jan 23, 2022

cbreemer said:
Somehow I would not be at all surprised if a C# program is much faster than a VB script in Excel But as Skydiver said, it's possible (I think even likely) that the C# code uses a faster algorithm than your own code. In my opinion that is for you to work out. You hardly need any C# skills to study and understand the algorithm used here. The logic would be the same in any language.

Well, if I understood how to convert the C# syntax to VBA I wouldn't be here, would I?

John Franklin · Jan 23, 2022

Skydiver said:
Now that I've started looking at the code posted in #1. Notice in the comments that the code only does 1/8th of the work because it applies the concept of symmetries where the board can be rotated and or mirrored. In your code in post #8, it doesn't sound like you are applying this principle and so you are effectively doing 8 times the amount of work as you try out the other board rotations and mirrors.

Yes and no to my code doing more work. I came to see the symmetry along the way, so that 120 5 cell ranges became 15 due to symmetry. But, I still used brute force to find the solution and even by using 1/8 of the 5 cell ranges, it still took hours for the solution to arrive compared to minutes for the C# code. Perhaps differences in how the programs compile can be explained for part of the speed difference, but to me it's the algorithms. Clearly the C# code is finding a solution in a more efficient manner than my code, whether we're both using the same concepts or not.

I'm not looking for an hours long training session on C#. I'm just looking for some help in understanding how the C# is finding all those combinations so quickly. As I said to cbreemer above, I don't have the knowledge or depth of understanding of C# code/syntax to be able to say "oh, lines 36-47 do xyz" and then immediately know how to translate that to VBA. Perhaps somebody with minimal C# skills could study the code and understand the algorithm, but I'm not one of those somebodies, which is why I came here for help in the first place!

Skydiver · Jan 23, 2022

The code above also uses brute force by generating all the possible ship positions, and then recursively checking to see if the ship will collide with another ship, putting the ship on the board, and then trying with that new board.

Skydiver · Jan 23, 2022

Hopefully this refactoring makes the code more readable to understand what is happening:

C#:

using System;
using System.Collections;
using System.Collections.Generic;

namespace BattleShipConfig
{
    enum Orientation { Horizontal, Vertical };

    class ConfigSearch
    {
        const long SymmetryCount = 8;

        static void Main()
        {
            long count = Search(0, new Field());
            Console.WriteLine(count * SymmetryCount);
        }

        static long Search(int depth, Field field)
        {
            if (depth == 5)
                return 1;

            long count = 0;
            foreach (Field p in _positions[depth])
            {
                if (depth == 0)
                    Console.WriteLine(count);

                if (p.CanMerge(field))
                    count += Search(depth + 1, p.MergeWith(field));
            }
            return count;
        }

        static IList<Field>[] _positions = new IList<Field>[]
        {
            GenerateCarrierPositions().ToList(),
            GenerateShipPositions(4).ToList(),
            GenerateShipPositions(3).ToList(),
            GenerateShipPositions(3).ToList(),
            GenerateShipPositions(2).ToList()
        };

        // by symmetry, we can restrict the len5 ship to being vertical and in the top left quarter of the field.
        static IEnumerable<Field> GenerateCarrierPositions()
            => GenerateShipPositions(5, 5, 3, Orientation.Vertical);

        static IEnumerable<Field> GenerateShipPositions(int length)
        {
            return GenerateShipPositions(length, 10, 10 - length + 1, Orientation.Vertical)
                    .Concat(GenerateShipPositions(length, 10 - length + 1, 10, Orientation.Horizontal));
        }

        static IEnumerable<Field> GenerateShipPositions(int length, int maxX, int maxY, Orientation orientation)
        {
            for (int x = 0; x < maxX; x++)
            {
                for (int y = 0; y < maxY; y++)
                    yield return Field.PlaceShip(length, x, y, orientation);
            }
        }
    }

    struct Field
    {
        Int64 _low = 0;
        Int64 _high = 0;

        public bool CanMerge(Field other)
            => ((_low & other._low) == 0) && ((_high & other._high) == 0);

        public Field MergeWith(Field other)
            => new Field { _low = _low | other._low, _high = _high | other._high };

        public static Field PlaceShip(int length, int x, int y, Orientation orientation)
        {
            int dx = orientation == Orientation.Horizontal ? 1 : 0;
            int dy = orientation == Orientation.Vertical ? 1 : 0;

            var field = new Field();
            for (int i = 0; i < length; i++)
            {
                field.Mark(x, y);
                x += dx;
                y += dy;
            }
            return field;
        }

        void Mark(int x, int y)
        {
            int bit = y * 10 + x;
            if (bit >= 64)
                _high |= 1L << (bit - 64);
            else
                _low |= 1L << bit;
        }
    }
}

Skydiver · Jan 23, 2022

The original C# code as well as my refactoring use 2 64-bit integers (long or Int64) to keep track of where the ships are. It takes advantage of the AND bit operation to check for collisions, and the OR bit operation to merge the positions in preparation for the next collision check. Bit operations make things fast because you get parallelization without having to write for multiple threads: 64 spots are checked or modified in a single operation that can be done in the CPU registers. Compare this to code that has to check or modify 200 or more memory locations. To make matters worse, if that memory is not in the CPU cache, the CPU has to wait to for it to get into the cache from main memory. I don't know what the current multipliers are now, but several years ago it used to be that registers were 8x faster to than the CPU cache, and the CPU cache were 10x faster than main memory.

The manual handling of the bits in the code above could have been replaced with using a BitArray which would still be relatively fast. To try to make something kind of similar to using Excel cells maybe to use 2 dimensional arrays of bools.

Skydiver · Jan 23, 2022

And for the fun of it, to get faster results, take advantage of all the processors on your machine by using parallelism:

C#:

static void Main()
{
    ulong total = 0;
    Parallel.ForEach(
        _positions[0],
        () => 0UL,
        (position, state, count) => count + (ulong)Search(1, position),
        count => Console.WriteLine(Interlocked.Add(ref total, count)));

    Console.WriteLine($"Actual: {total * SymmetryCount}    Expected: {ExpectedResult}");
}

Skydiver · Jan 26, 2022

Just wanted to share some benchmark results:

C#:

|                  Method |    Mean |    Error |  StdDev |
|------------------------ |--------:|---------:|--------:|
|                  Search | 91.70 s |  0.427 s | 0.023 s |
|          MemoizedSearch | 38.98 s |  3.255 s | 0.178 s |
|         ParallellSearch | 23.62 s | 12.117 s | 0.664 s |
| ParallellMemoizedSearch | 10.54 s |  0.646 s | 0.035 s |

The Search and ParallelSearch are the same methods I previously shared above.

The "memoized" searches are my attempts to try to cut down on doing work that has already been done by using more memory instead of more CPU. I put "memoized" in quotes because, ideally, I should really be remembering the results of whole branches of the recursion but my rough computations was that I could potentially need more than 12GB. Since my lowest level machine (on which I also do benchmarks) has only 8GB, I precomputed the all the possible formations by the submarine and patrol boat. That's about ~450MB. And then I also precompute the possible formations of the battleship and destroyer for another ~350MB for the parallel case. In the non-parallel case, I also include the carrier for ~5.25GB. Then I simply do a cross product of the two sets of formations and only count the ones without collisions. I suspect that if I had more RAM, the non-parallel version would be faster because it would avoid swapping to disk -- at least that what I interpreted the increased disk activity to be.

John Franklin · Mar 1, 2022

Skydiver said:
The code above also uses brute force by generating all the possible ship positions, and then recursively checking to see if the ship will collide with another ship, putting the ship on the board, and then trying with that new board.

Thanks for your help here. I got so frustrated and burned out on this one that I only came back today to check it out again. I didn't see any comments past #10 until today. I appreciate your attempt to simplify things for me in comment #10, but I'm still struggling a bit to understand the code.

Since I'm apparently slow on the uptake here, perhaps if you could distill the code (either the original, or your simplified version) by associated line numbers for the following components I hopefully can pick it up a bit easier.

Code lines for how all the possible ship positions are generated
Code lines that recursively check to see if the ship will collide with another ship

My main thing is trying to understand the recursion component so I can see how to apply the concept in VBA for Excel.

Thank you again for your help.

Skydiver · Mar 1, 2022

In post #10:

Position generation for the individual ships happen on lines 36-62, and lines 76-98.

The recursive placement of ships happens on lines 19-34. Recursion is kicked off by line 15.

I'll write more later when I'm on a PC.

Help understanding some C# code

John Franklin

Member

Skydiver

John Franklin

Member

cbreemer

Well-known member

Skydiver

Skydiver

John Franklin

Member

John Franklin

Member

Skydiver

Skydiver

Skydiver

Skydiver

Skydiver

John Franklin

Member

Skydiver

Similar threads

Share this page

Latest posts