I have a problem and I already solved it, but I am searching for enhancement.

pythonatSheriff

Full Stack Web Developer
Joined
Dec 15, 2021
Messages
25
Location
Egypt
Programming Experience
1-3
Twitter
shiccorama
The problem is :

((Have the function MathChallenge(str) take the str parameter being passed and determine if there is some substring K that can be repeated N > 1 times to produce the input string exactly as it appears. Your program should return the longest substring K, and if there is none it should return the string -1.

For example: if str is "abcababcababcab" then your program should return abcab because that is the longest substring that is repeated 3 times to create the final string. Another example: if str is "abababababab" then your program should return ababab because it is the longest substring. If the input string contains only a single character, your program should return the string -1. ))

My solutions is :
my solution to math problem:
using System;
using System.Text;
using System.Collections.Generic;
using System.Linq;


class MainClass
{

  public static void Main(string[] arg1)
  {
        //abcababcababcab
        string str = "abababababab";

        int count = 0;
        for(var i=0; i<str.Length; i++)
        {
            count = count + 1;
        }
        Console.WriteLine(count);

        int division = 0;
        if(count%2 == 0)
        {
            //Console.WriteLine("can divide on 2");
            division = 2;
            
        }
        else if(count%3 == 0)
        {
            //Console.WriteLine("can divide on 3");
            division = 3;
        }


        char[] LooseLetters = str.ToCharArray();
        int LettersCount = (LooseLetters.Length)/(division);
        StringBuilder part1 = new StringBuilder(str);
        if(division == 2)
        {
            part1.Insert(LettersCount, ",");
            Console.WriteLine(part1);
        }
        else if(division == 3)
        {
            part1.Insert(LettersCount, ",");
            part1.Insert((2*LettersCount)+1, ",");
            Console.WriteLine(part1);
        }

        string strConverter = part1.ToString();

        List<string> DuplicateString = strConverter.Split(",").ToList();
        // foreach(var item in DuplicateString)
        // {
        //     Console.WriteLine(item);
        // }

        string result = "";

        if(division == 2)
        {
            if(DuplicateString[0] == DuplicateString[1])
            result = DuplicateString[0];
            else
            result = "-1";
        }
        else if(division == 3)
        {
            if(DuplicateString[0] == DuplicateString[1] && DuplicateString[0] == DuplicateString[2])
            result = DuplicateString[0];
            else
            result = "-1";
        }

        Console.WriteLine(result);

  }

}


Now, I feel like it can be done using (Linq) methods, although I read most of the methods, or there is a simpler solution, so I would like to know to increase my knowledge.
Thanks in advance.
 

JohnH

C# Forum Moderator
Staff member
Joined
Apr 23, 2011
Messages
1,422
Location
Norway
Programming Experience
10+
Even if your code works that's technically a fail, since you're required to write a function named MathChallenge that take a single string parameter.
function MathChallenge(str) take the str parameter
 

JohnH

C# Forum Moderator
Staff member
Joined
Apr 23, 2011
Messages
1,422
Location
Norway
Programming Experience
10+
Another way to do it:
C#:
string MathChallenge(string input)
{
    for (int i = 2; i <= input.Length / 2; i++) //possible parts 2 to length/2
    {
        if (input.Length % i == 0) //string divisible by i
        {
            var partlength = input.Length / i;
            var part = input.Substring(0, partlength);
            if (AllPartsEqual(input, part))
                return part;       
        }
    }
    return "-1";
}

bool AllPartsEqual(string input, string part)
{
    for (int i = 1; i < input.Length / part.Length; i++)
    {
        if (input.Substring(i * part.Length, part.Length) != part)
            return false;
    }
    return true;
}
 
Last edited:

JohnH

C# Forum Moderator
Staff member
Joined
Apr 23, 2011
Messages
1,422
Location
Norway
Programming Experience
10+
Updated the code in post 3, there was a bug or three.
 

JohnH

C# Forum Moderator
Staff member
Joined
Apr 23, 2011
Messages
1,422
Location
Norway
Programming Experience
10+
Here are three alternatives to if (AllPartsEqual(input, part)) (line 9 in post 3).
After string split by part there should be nothing left:
C#:
if (input.Split(new string[] { part }, StringSplitOptions.RemoveEmptyEntries).Length == 0)
A regex expression "(part){i}" meaning part repeated i times
C#:
if (Regex.IsMatch(input, $"({part}){{{i}}}"))
Linq repeated string
C#:
if (string.Concat(Enumerable.Repeat(part, i)) == input)
 

JohnH

C# Forum Moderator
Staff member
Joined
Apr 23, 2011
Messages
1,422
Location
Norway
Programming Experience
10+
Then a pure regex version. Logic is (group of one or more chars) repeated 1 or more times (\1 is backreference, + is repeat, greedy). Match must equal input string (same length) and not just a part of it. Return group value.
C#:
string MathChallenge(string input)
{
    var m = Regex.Match(input, "(.+)\\1+");
    if (m.Success && m.Length == input.Length)
        return m.Groups[1].Value;
    return "-1";
}
MatchChallenge would probably be a more suitable function name in my opinion.
 

Skydiver

Staff member
Joined
Apr 6, 2019
Messages
5,251
Location
Chesapeake, VA
Programming Experience
10+
Interesting, but apparently that interview question has another dimension to it. Solve the same problem in O(n) instead of O(n^2) time complexity.
 

pythonatSheriff

Full Stack Web Developer
Joined
Dec 15, 2021
Messages
25
Location
Egypt
Programming Experience
1-3
Twitter
shiccorama
Many thanks for you JohnH for enlightening me with all these aspects that I should revisit to refresh my memory. And thank you SkyDiver for the hint. Apparently, I studied a lot of things THEORITICALLY, and of course I did not need them to built CMS as the steps in DotNetCore MVC don't enforce you to use algorithm, so NO practical training on algorithm had happened. But let me ask you naive question, do I really want to know all of these although the real work (as building website or CMS) is a conventional work? I know that maybe some clients want to add feature, but I feel like I will not invent the wheel, I will go see how this feature built into the system and try to make my way through. Don't get me wrong, I am not saying that algorithm and data structure is not important, I know they are, I am just curious about how I will use them in my daily work. Thanks.
 
Last edited:

Skydiver

Staff member
Joined
Apr 6, 2019
Messages
5,251
Location
Chesapeake, VA
Programming Experience
10+
When your website or CMS is running slow, you need to be able to analyze why it is running slow. Often a better algorithm or data structure gets your more bang for your buck than better hardware or more hardware.
 

JohnH

C# Forum Moderator
Staff member
Joined
Apr 23, 2011
Messages
1,422
Location
Norway
Programming Experience
10+
Interesting, but apparently that interview question has another dimension to it. Solve the same problem in O(n) instead of O(n^2) time complexity.
Like this?
C#:
if (AllPartsEqual(input, partlength))
    return input.Substring(0, partlength);
C#:
bool AllPartsEqual(string input, int partlength)
{
    for (int i = 0; i < input.Length / partlength - 1; i++) // loop parts (0 to parts-1)
    {
        for (int j = 0; j < partlength; j++) // loop chars in part
        {
            int index = i * partlength + j;
            if (input[index] != input[index + partlength])
                return false;
        }
    }
    return true;
}
 

Skydiver

Staff member
Joined
Apr 6, 2019
Messages
5,251
Location
Chesapeake, VA
Programming Experience
10+
Typically, nested loops suggest O(n ^ nestingDepth) behavior unless it can be shown that the nesting only processes part of input just once.
 

JohnH

C# Forum Moderator
Staff member
Joined
Apr 23, 2011
Messages
1,422
Location
Norway
Programming Experience
10+
So how to compare three or more parts with a single loop? I thought O was also about memory used.
 

Skydiver

Staff member
Joined
Apr 6, 2019
Messages
5,251
Location
Chesapeake, VA
Programming Experience
10+
Computer science leaning interview questions from Amazon and Google expect their new hires to know Knuth-Morris-Pratt by heart, I suppose. The idea is that KMP will build a table that makes it obvious whether there is a leading substring that also matches a trailing substring and that table cam be built in O(n). The explanation in GeeksForGeeks was a bit over my head with current brain fog from chemo.

For me dynamic programming questions like these throw me off. I prefer the more understandable brute force approaches.
 

JohnH

C# Forum Moderator
Staff member
Joined
Apr 23, 2011
Messages
1,422
Location
Norway
Programming Experience
10+
A combination of the previous two AllPartsEqual, it scans remaining string only once (but must compare chars of first part each time) and doesn't create new strings. For all practical purposes it does the same as the version in post 11, only with a single loop. It looks much simple though. I have no better idea.
C#:
bool AllPartsEqual(string input, int partlength)
{
    for (int i = partlength; i < input.Length; i++)
    {
        if (input[i] != input[i % partlength])
            return false;
    }
    return true;
}
 

Skydiver

Staff member
Joined
Apr 6, 2019
Messages
5,251
Location
Chesapeake, VA
Programming Experience
10+
Yes, but isn't AllPartsEqual() called in a loop?
 

JohnH

C# Forum Moderator
Staff member
Joined
Apr 23, 2011
Messages
1,422
Location
Norway
Programming Experience
10+
That loop just identifies the search terms, ie the part, where input string is divisible by part.

I don't think KMP applies, this is not a single wild search for a pattern anywhere in a string. Here each possible part is a new search term, and the search only examines specific segments of the input string (multiples of part). If using KMP the whole input string have to be LPS computed for each different possible part, and then doing searches that doesn't even apply to the problem, and finally check if all multiples of the part length are included in returned match indexes.
Found a KMP code here: Knuth–Morris–Pratt Algorithm Implementation in C# - Programming Algorithms

Anyway, these are the results I get with simple Stopwatch testing, time in ticks:
48 (string.Split)
646 (Enumerable.Repeat)
764 (AllPartsEqual3)
1185 (AllPartsEqual2)
2497 (pure Regex)
4444 (Regex match repeat)
10123 (AllPartsEqual1)
13297 (KMP)
 

Skydiver

Staff member
Joined
Apr 6, 2019
Messages
5,251
Location
Chesapeake, VA
Programming Experience
10+
Part of the solution is here:

Only the LPS part of KMP needs to be done.

Unfortunately, that generates the shortest string that can be repeated to build the original string. The problem above asks for the longest string.

If n % (n - len) != 0 then there is no such string.
If n / (n - len) is odd, then the first n - len characters of the pattern is the longest string.
if n / (n - len) is even then the first n / 2 characters is the longest string.
 
Last edited:

Skydiver

Staff member
Joined
Apr 6, 2019
Messages
5,251
Location
Chesapeake, VA
Programming Experience
10+
BenchmarkDotNet results for me:
C#:
|           Method |         Mean |       Error |      StdDev |
|----------------- |-------------:|------------:|------------:|
| SubstringMatcher |     466.3 us |     2.95 us |     2.61 us |
|     SplitMatcher |     749.4 us |     7.83 us |     7.32 us |
|     RegexMatcher |  23,371.9 us |    70.32 us |    58.72 us |
|      LinqMatcher |  24,440.0 us |   355.01 us |   314.71 us |
| PureRegexMatcher | 616,405.8 us | 3,398.95 us | 3,013.08 us |
|    OffsetMatcher |     500.0 us |     0.30 us |     0.25 us |
|    ModuloMatcher |   1,215.6 us |     2.90 us |     2.72 us |
|       KmpMatcher |     523.1 us |     8.42 us |     7.88 us |

I was using the following code to test:
C#:
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

namespace MatchChallenge
{
    public interface IMatcher
    {
        string MatchChallenge(string input);
    }

    public abstract class SubstringPartsMatcher : IMatcher
    {
        public string MatchChallenge(string input)
        {
            for (int i = 2; i <= input.Length / 2; i++)
            {
                if (input.Length % i == 0)
                {
                    var partLength = input.Length / i;
                    var part = input.Substring(0, partLength);
                    if (AllPartsEqual(input, part))
                        return part;
                }
            }
            return "-1";
        }

        protected abstract bool AllPartsEqual(string input, string part);
    }

    public class SubstringMatcher : SubstringPartsMatcher
    {
        protected override bool AllPartsEqual(string input, string part)
        {
            for (int i = 2; i < input.Length / part.Length; i++)
            {
                if (input.Substring(i * part.Length, part.Length) != part)
                    return false;
            }
            return true;
        }
    }

    public class SplitMatcher : SubstringPartsMatcher
    {
        protected override bool AllPartsEqual(string input, string part)
            => input.Split(new string[] { part }, StringSplitOptions.RemoveEmptyEntries).Length == 0;
    }

    public class RegexMatcher : SubstringPartsMatcher
    {
        protected override bool AllPartsEqual(string input, string part)
        {
            for (int i = 2; i < input.Length / part.Length; i++)
            {
                if (Regex.IsMatch(input, $"({part}){{{i}}}"))
                    return false;
            }
            return true;
        }
    }

    public class LinqMatcher : SubstringPartsMatcher
    {
        protected override bool AllPartsEqual(string input, string part)
        {
            for (int i = 2; i < input.Length / part.Length; i++)
            {
                if (string.Concat(Enumerable.Repeat(part, i)) == input)
                    return false;
            }
            return true;
        }
    }

    public class PureRegexMatcher : IMatcher
    {
        public string MatchChallenge(string input)
        {
            var m = Regex.Match(input, "(.+)\\1+");
            if (m.Success && m.Length == input.Length)
                return m.Groups[1].Value;
            return "-1";
        }
    }

    public abstract class StringIndexMatcher : IMatcher
    {
        public string MatchChallenge(string input)
        {
            for (int i = 2; i <= input.Length / 2; i++)
            {
                if (input.Length % i == 0)
                {
                    var partLength = input.Length / i;
                    if (AllPartsEqual(input, partLength))
                        return input.Substring(0, partLength);
                }
            }
            return "-1";
        }

        protected abstract bool AllPartsEqual(string input, int partLength);
    }

    public class OffsetMatcher : StringIndexMatcher
    {
        protected override bool AllPartsEqual(string input, int partLength)
        {
            for (int i = 0; i < input.Length / partLength - 1; i++)
            {
                for (int j = 0; j < partLength; j++)
                {
                    int index = i * partLength + j;
                    if (input[index] != input[index + partLength])
                        return false;
                }
            }
            return true;
        }
    }

    public class ModuloMatcher : StringIndexMatcher
    {
        protected override bool AllPartsEqual(string input, int partLength)
        {
            for (int i = partLength; i < input.Length; i++)
            {
                if (input[i] != input[i % partLength])
                    return false;
            }
            return true;
        }
    }

    public class KmpMatcher : IMatcher
    {
        public string MatchChallenge(string input)
        {
            if (string.IsNullOrEmpty(input))
                return "-1";

            var lps = ComputeLpsArray(input);
            int suffixLength = lps[input.Length - 1];
            if (suffixLength == 0)
                return "-1";

            int partLength = input.Length - suffixLength;
            if (input.Length % partLength != 0)
                return "-1";

            if (partLength % 2 == 0)
                partLength = input.Length / 2;
            return input.Substring(0, partLength);
        }

        int [] ComputeLpsArray(ReadOnlySpan<char> input)
        {
            var lps = new int[input.Length];
            int len = 0;
            int i = 1;

            while (i < input.Length)
            {
                if (input[i] == input[len])
                {
                    lps[i++] = ++len;
                }
                else
                {
                    if (len != 0)
                        len = lps[len - 1];
                    else
                        i++;
                }
            }

            return lps;
        }
    }

    public class MatcherBenchmarks
    {
        const string alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
        static Random _rnd = new Random();

        string _good;
        string _bad;
        IMatcher _substringMatcher = new SubstringMatcher();
        IMatcher _splitMatcher = new SplitMatcher();
        IMatcher _regexMatcher = new RegexMatcher();
        IMatcher _linqMatcher = new LinqMatcher();
        IMatcher _pureRegexMatcher = new PureRegexMatcher();
        IMatcher _offsetMatcher = new OffsetMatcher();
        IMatcher _moduloMatcher = new ModuloMatcher();
        IMatcher _kmpMatcher = new KmpMatcher();

        public MatcherBenchmarks()
        {
            var sb = new StringBuilder();
            for (int i = 0; i < 257; i++)
                sb.Append(alphabet[_rnd.Next(alphabet.Length)]);
            var value = sb.ToString();

            sb.Clear();
            for (int i = 0; i < 257; i++)
                sb.Append(value);

            _good = sb.ToString();

            sb.Clear();
            for (int i = 0; i < 257 * 257; i++)
                sb.Append(alphabet[_rnd.Next(alphabet.Length)]);
            _bad = sb.ToString();
        }

        string Test(IMatcher matcher)
        {
            matcher.MatchChallenge(_bad);
            return matcher.MatchChallenge(_good);
        }

        [Benchmark]
        public string SubstringMatcher() => Test(_substringMatcher);

        [Benchmark]
        public string SplitMatcher() => Test(_splitMatcher);

        [Benchmark]
        public string RegexMatcher() => Test(_regexMatcher);

        [Benchmark]
        public string LinqMatcher() => Test(_linqMatcher);

        [Benchmark]
        public string PureRegexMatcher() => Test(_pureRegexMatcher);

        [Benchmark]
        public string OffsetMatcher() => Test(_offsetMatcher);

        [Benchmark]
        public string ModuloMatcher() => Test(_moduloMatcher);

        [Benchmark]
        public string KmpMatcher() => Test(_kmpMatcher);
    }

    public class Program
    {
        public static void Main()
        {
            BenchmarkRunner.Run<MatcherBenchmarks>();
            // Console.WriteLine(new MatcherBenchmarks().KmpMatcher());
        }
    }
}

As a side note, the regex matchers should have a preparation step to escape regex special characters in the search strings prior to trying to find a match, but I didn't implement that in the code above. I just happened to run across a crash when my original alphabet included the '+' and '/' characters and my random test string happened to have a "++" in it.
 

JohnH

C# Forum Moderator
Staff member
Joined
Apr 23, 2011
Messages
1,422
Location
Norway
Programming Experience
10+
Excellent testing, I got similar results. Both Regex versions is very slow, and SplitMatcher not quite as fast as I measured.
SubstringMatcher = AllPartsEqual1
OffsetMatcher = AllPartsEqual2
ModuloMatcher = AllPartsEqual3
I also added [MemoryDiagnoser]
C#:
|           Method |         Mean |     Error |    StdDev |     Gen 0 |     Gen 1 |     Gen 2 |    Allocated |
|----------------- |-------------:|----------:|----------:|----------:|----------:|----------:|-------------:|
| SubstringMatcher |     132.1 us |   0.57 us |   0.48 us |   16.3574 |         - |         - |    138,288 B |
|     SplitMatcher |     262.1 us |   2.26 us |   2.01 us |   41.5039 |   41.5039 |   41.5039 |    134,956 B |
|     RegexMatcher |  10,276.8 us |  56.00 us |  49.64 us |  125.0000 |   15.6250 |         - |  1,088,624 B |
|      LinqMatcher |   9,972.0 us | 187.22 us | 222.87 us | 6171.8750 | 6171.8750 | 6171.8750 | 33,852,374 B |
| PureRegexMatcher | 159,565.9 us | 485.07 us | 430.01 us |         - |         - |         - |        786 B |
|    OffsetMatcher |     161.9 us |   1.12 us |   1.05 us |         - |         - |         - |        536 B |
|    ModuloMatcher |     374.7 us |   1.19 us |   1.12 us |         - |         - |         - |        536 B |
|       KmpMatcher |     266.9 us |   2.64 us |   2.47 us |  166.5039 |  166.5039 |  166.5039 |    529,040 B |
For the longer test strings memory use gets a lot worse for RegexMatcher and LinqMatcher. PureRegexMatcher surprisingly does well memory-wise, but has very bad performance.
Interesting to see ModuloMatcher vs OffsetMatcher, the % operation obviously adds more computation than multiply index.
Also tested with one of the short sample strings (and a short bad one) got these:
C#:
|           Method |        Mean |    Error |   StdDev |  Gen 0 | Allocated |
|----------------- |------------:|---------:|---------:|-------:|----------:|
| SubstringMatcher |    40.93 ns | 0.759 ns | 0.874 ns | 0.0124 |     104 B |
|     SplitMatcher |   409.65 ns | 1.360 ns | 1.272 ns | 0.1040 |     872 B |
|     RegexMatcher | 1,659.53 ns | 6.118 ns | 5.424 ns | 0.0324 |     272 B |
|      LinqMatcher |    81.48 ns | 1.635 ns | 1.883 ns | 0.0191 |     160 B |
| PureRegexMatcher | 1,827.29 ns | 9.028 ns | 8.445 ns | 0.0782 |     664 B |
|    OffsetMatcher |    69.78 ns | 0.576 ns | 0.481 ns | 0.0038 |      32 B |
|    ModuloMatcher |    93.41 ns | 0.272 ns | 0.213 ns | 0.0038 |      32 B |
|       KmpMatcher |    60.40 ns | 0.980 ns | 0.917 ns | 0.0248 |     208 B |
So one have to choose between best performance or memory use, or a combination. Maybe also expected input strings.

I have to look into KMP again, I definitely didn't apply that right. (the code I copied had a different layout)
 
Top Bottom