DataTable

patrick · Sep 2, 2023

Hello.
The conversion must be done as shown in the picture below.
I programmed the code below.
I don't like the code below.

How can I convert DataTable efficiently?

C#:

DataRow[] rowKPS = originalDt.Select("subject = 'KPS'");
DataRow[] rowDPA = originalDt.Select("subject = 'DPA'");
DataRow[] rowESG = originalDt.Select("subject = 'ESG'");

DataTable dtConvertKPS = rowKPS.CopyToDataTable();
DataTable retKPS = GenerateTransposedTable(dtConvertKPS);

DataTable dtConvertDPA = rowDPA.CopyToDataTable();
DataTable retDPA = GenerateTransposedTable(dtConvertDPA);

retKPS.Merge(retDPA);

DataTable dtConvertESG = rowESG.CopyToDataTable();
DataTable retESG = GenerateTransposedTable(dtConvertESG);

retKPS.Merge(retESG);

private DataTable GenerateTransposedTable(DataTable inputTable)
{
     DataTable outputTable = new DataTable();
     outputTable.Columns.Add(inputTable.Columns[0].ColumnName.ToString());

     // Header row's second column onwards, 'inputTable's first column taken
     foreach (DataRow inRow in inputTable.Rows)
     {
         string newColName = inRow[0].ToString();
         outputTable.Columns.Add(newColName);
     }

     // Add rows by looping columns
     for (int rCount = 1; rCount <= inputTable.Columns.Count - 1; rCount++)
     {
         DataRow newRow = outputTable.NewRow();

         // First column is inputTable's Header row's second column
         newRow[0] = inputTable.Columns[rCount].ColumnName.ToString();
         for (int cCount = 0; cCount <= inputTable.Rows.Count - 1; cCount++)
         {
             string colValue = inputTable.Rows[cCount][rCount].ToString();
             newRow[cCount + 1] = colValue;
         }
         outputTable.Rows.Add(newRow);
     }

     return outputTable;
}

cjard · Sep 5, 2023

C#:

using System.Data;

namespace Example1
{
    static class Program
    {
        static void Main()
        {
            var original = new DataTable();
            original.Columns.Add("Day"); //string is default. If not using string, passing second arg as typeof(int) is cleaner than type.gettype("system.int32")
            original.Columns.Add("Subject");
            original.Columns.Add("English", typeof(int));
            original.Columns.Add("Chinese", typeof(int));
            original.Columns.Add("Japanese", typeof(int));
            original.Columns.Add("Russian", typeof(int));
            original.Rows.Add(new object[] { "202309", "KPS", 10, 20, 30, 50 });
            original.Rows.Add(new object[] { "202311", "ESG", 90, 260, -50, -270 });
            original.Rows.Add(new object[] { "202309", "DPA", 20, 50, 20, 10 });
            original.Rows.Add(new object[] { "202310", "DPA", 50, 140, -10, -110 });
            original.Rows.Add(new object[] { "202310", "KPS", 40, 110, 0, -70 });
            original.Rows.Add(new object[] { "202311", "DPA", 80, 230, -40, -230 });
            original.Rows.Add(new object[] { "202309", "ESG", 30, 80, 10, -30 });
            original.Rows.Add(new object[] { "202310", "ESG", 60, 170, -20, -150 });
            original.Rows.Add(new object[] { "202311", "KPS", 70, 200, -30, -190 });

            var transposed = Transpose(
                sourceTable: original,
                columnWhoseValuesBecomeColumns: original.Columns["Day"]!,
                nameOfPivotColumn: "Language",
                typeOfPivotedColumns: typeof(int),
                columnsNotPivoted: new[] { original.Columns["Subject"]!}.ToHashSet()
            );
        }

        static DataTable Transpose(DataTable sourceTable, DataColumn columnWhoseValuesBecomeColumns, string nameOfPivotColumn, Type typeOfPivotedColumns, HashSet<DataColumn> columnsNotPivoted )
        {
            //set up the basic table with the columns we know will exist. Other columns will be added as we go
            var dest = new DataTable();
            dest.Columns.Add(nameOfPivotColumn);
            foreach (var cnp in columnsNotPivoted)
            {
                dest.Columns.Add(cnp.ColumnName);
            }

            //we'll be processing rows out of order, so keep an index to already created rows that need more values adding in
            var idx = new Dictionary<string, DataRow>();

            foreach (DataRow sourceRow in sourceTable.Rows)
            {
                //the value of the pivot column for this row
                var pivotingValue = (string)sourceRow[columnWhoseValuesBecomeColumns];

                //is there a column the row value can go into?
                if (!dest.Columns.Contains(pivotingValue)) dest.Columns.Add(pivotingValue, typeOfPivotedColumns);

                //a row is indexed by the non pivoted values. For every row in the source, we build the index key for that row in the dest
                //the key here isn't complete yet; it will need the pivoting column name added to it
                var keyStub = "";
                foreach (var dc in columnsNotPivoted)
                    keyStub += sourceRow[dc] + "\0";

                //now we copy the values in the current row, into the pivoted columns of the destination row. If there is no destination row we create it
                foreach (DataColumn sourceColumn in sourceTable.Columns)
                {
                    //we don't act on the value in the pivoting column or the non pivoted columns; they're copied when the new dest row is created
                    if (sourceColumn == columnWhoseValuesBecomeColumns) continue;
                    if (columnsNotPivoted.Contains(sourceColumn)) continue;

                    //for each pivoted col we add the source column name (which is part of the data now) to the indexing key stub to form a new key
                    var key = keyStub + sourceColumn.ColumnName;

                    //have we created this row before? if so retrieve it by key
                    if (idx.TryGetValue(key, out var row))
                    {
                        row[pivotingValue] = sourceRow[sourceColumn];
                    }
                    else
                    {
                        //we haven't yet had any row created in the destination to take the values. Create one, copy the values that don't pivot
                        //and index it so that next time we can go straight to it
                        var destRow = dest.NewRow();
                        destRow[nameOfPivotColumn] = sourceColumn.ColumnName;
                        foreach (var cnp in columnsNotPivoted)
                        {
                            destRow[cnp.ColumnName] = sourceRow[cnp];
                        }
                        destRow[pivotingValue] = sourceRow[sourceColumn];
                        dest.Rows.Add(destRow);
                        idx[key] = destRow;
                    }
                }
            }

            return dest;
        }
    }
}

patrick · Sep 5, 2023

cjard said:

C#:

using System.Data;

namespace Example1
{
    static class Program
    {
        static void Main()
        {
            var original = new DataTable();
            original.Columns.Add("Day"); //string is default. If not using string, passing second arg as typeof(int) is cleaner than type.gettype("system.int32")
            original.Columns.Add("Subject");
            original.Columns.Add("English", typeof(int));
            original.Columns.Add("Chinese", typeof(int));
            original.Columns.Add("Japanese", typeof(int));
            original.Columns.Add("Russian", typeof(int));
            original.Rows.Add(new object[] { "202309", "KPS", 10, 20, 30, 50 });
            original.Rows.Add(new object[] { "202311", "ESG", 90, 260, -50, -270 });
            original.Rows.Add(new object[] { "202309", "DPA", 20, 50, 20, 10 });
            original.Rows.Add(new object[] { "202310", "DPA", 50, 140, -10, -110 });
            original.Rows.Add(new object[] { "202310", "KPS", 40, 110, 0, -70 });
            original.Rows.Add(new object[] { "202311", "DPA", 80, 230, -40, -230 });
            original.Rows.Add(new object[] { "202309", "ESG", 30, 80, 10, -30 });
            original.Rows.Add(new object[] { "202310", "ESG", 60, 170, -20, -150 });
            original.Rows.Add(new object[] { "202311", "KPS", 70, 200, -30, -190 });

            var transposed = Transpose(
                sourceTable: original,
                columnWhoseValuesBecomeColumns: original.Columns["Day"]!,
                nameOfPivotColumn: "Language",
                typeOfPivotedColumns: typeof(int),
                columnsNotPivoted: new[] { original.Columns["Subject"]!}.ToHashSet()
            );
        }

        static DataTable Transpose(DataTable sourceTable, DataColumn columnWhoseValuesBecomeColumns, string nameOfPivotColumn, Type typeOfPivotedColumns, HashSet<DataColumn> columnsNotPivoted )
        {
            //set up the basic table with the columns we know will exist. Other columns will be added as we go
            var dest = new DataTable();
            dest.Columns.Add(nameOfPivotColumn);
            foreach (var cnp in columnsNotPivoted)
            {
                dest.Columns.Add(cnp.ColumnName);
            }

            //we'll be processing rows out of order, so keep an index to already created rows that need more values adding in
            var idx = new Dictionary<string, DataRow>();

            foreach (DataRow sourceRow in sourceTable.Rows)
            {
                //the value of the pivot column for this row
                var pivotingValue = (string)sourceRow[columnWhoseValuesBecomeColumns];

                //is there a column the row value can go into?
                if (!dest.Columns.Contains(pivotingValue)) dest.Columns.Add(pivotingValue, typeOfPivotedColumns);

                //a row is indexed by the non pivoted values. For every row in the source, we build the index key for that row in the dest
                //the key here isn't complete yet; it will need the pivoting column name added to it
                var keyStub = "";
                foreach (var dc in columnsNotPivoted)
                    keyStub += sourceRow[dc] + "\0";

                //now we copy the values in the current row, into the pivoted columns of the destination row. If there is no destination row we create it
                foreach (DataColumn sourceColumn in sourceTable.Columns)
                {
                    //we don't act on the value in the pivoting column or the non pivoted columns; they're copied when the new dest row is created
                    if (sourceColumn == columnWhoseValuesBecomeColumns) continue;
                    if (columnsNotPivoted.Contains(sourceColumn)) continue;

                    //for each pivoted col we add the source column name (which is part of the data now) to the indexing key stub to form a new key
                    var key = keyStub + sourceColumn.ColumnName;

                    //have we created this row before? if so retrieve it by key
                    if (idx.TryGetValue(key, out var row))
                    {
                        row[pivotingValue] = sourceRow[sourceColumn];
                    }
                    else
                    {
                        //we haven't yet had any row created in the destination to take the values. Create one, copy the values that don't pivot
                        //and index it so that next time we can go straight to it
                        var destRow = dest.NewRow();
                        destRow[nameOfPivotColumn] = sourceColumn.ColumnName;
                        foreach (var cnp in columnsNotPivoted)
                        {
                            destRow[cnp.ColumnName] = sourceRow[cnp];
                        }
                        destRow[pivotingValue] = sourceRow[sourceColumn];
                        dest.Rows.Add(destRow);
                        idx[key] = destRow;
                    }
                }
            }

            return dest;
        }
    }
}

I am studying the code you taught me.

Pablo · Sep 5, 2023

Hi, @patrick
You have 3 groups (KPS, DPA, ESG) of 4 categories (English, Chinese, Japanese, Russian)
I'll try to explain one by one the 5 columns of each row of the converted DataTable (remember that indices start from 0):
1) 1st Column: subject = i % 4 == 0 ? original.Rows[i / 4][1].ToString() : "";
If i is a multiple of 4 (i % 4 == 0, or what is the same, the remainder of the integer division by i and 4 is 0), that is, i is 0, 4, or 8, then on the rows whose indices are those (the 1st ones of each group), it prints the contents of:
original.Rows[i / 4][1].ToString() : ""
which is the 1st row, 2nd column (KPS) for i = 0 to 3, but just for i = 0 as we saw first, for the other 3 it prints "" (empty string),
the 2nd row and 2nd column (DPA) for i = 4 to 7, but just for i = 4 (remember i % 4 == 0), for the other 3 it prints "",
the 3rd row and 2nd column (ESG) for i = 8 to 11, but just for i = 8
2) 2nd Column: category = original.Columns[2 + i % 4].ColumnName;
Here it will print, for i = 0, the Column name at index 2 (2 is the 1st Category/Language Column index, so we go from it), English, for i = 1, Chinese, for i = 2, Japanese, for i = 3, Russian, and the same for i = 4 to 7, and the same for i = 8 to 11. This way we print the 4 Categories in order, 3 times (for each Subject KPS, DPA, ESG), that is 4 * 3 = 12, from 0 to 11.
3) 3rd, 4th, and 5th Column (which are the integers, the real values):
day202309 = original.Rows[i / 4][2 + i % 4].ToString();
day202310 = original.Rows[i / 4 + 3][2 + i % 4].ToString();
day202311 = original.Rows[i / 4 + 6][2 + i % 4].ToString();
You can see that the 1st sub row, from Column index 2 (10, 20, 30, 50) becomes the first sub column, by the formula:
original.Rows[i / 4][2 + i % 4].ToString();
that for i = 0 to 3 (i / 4 is 0, the row index in the original, and 2 + i % 4 is from 2, 3, 4, 5, the column index)
To get the 2nd sub column (for i = 0 to 3, that is KPS group), now we access the 4th row in the original, by the formula

because in the original you have the rows KPS, DPA, ESG, KPS, DPA, ESG, ... then to get groups you would have the indices (0, 3, 6), (1, 4, 7), (2, 5, 8))
original.Rows[i / 4 + 3][2 + i % 4].ToString();
that for i = 0 to 3 (i / 4 + 3 is 3, the index for the 4th row in the original, the 2nd KPS (remember we are grouping KPS from i = 0 to 3 (first 4 rows in the converted DT -> that are 4 Languages in the original, 3 columns from index 2 -> that are "Days"/Subject(KPS, DPA, ESG) in the original)
The 3rd sub column in the converted DT from index 2, is the same than above but with + 6, which gets the 7th sub row in the original, and complets the KPS (first) group.
For getting the 3 sub columns for DPA (rows from index 4 to 7 in the converted DT, and row indices 1, 4, and 7 in the original), the formulas are the same, but being i = 4 to 7, the (i / 4), the (i / 4 + 3), and (i / 4 + 6), will access the 2nd, 5th, and 8th rows in the original (the DPA group) - (the 2 + i % 4 references the Columns English, etc. in the original that become rows in the converted DT.
For the last 4 rows (ESG group), it is indices 2, 5, 8 (3rd, 6th, and 9th rows in the original).
I hope you find it useful.
Pablo

DataTable

patrick

Well-known member

cjard

Well-known member

patrick

Well-known member

Pablo

Well-known member

Share this page

Latest posts