read csv file but it only reads the first line

ouss_wess

Member
Joined
Jan 21, 2022
Messages
9
Programming Experience
Beginner

Am trying to read this csv file and make a header of all the distinct elements in it and put it in a datatable but it only reads the first line​


Am trying to read this current csv file and convert it into a truth table that show if an element exists in a row I out 1 if not I put 0 but it only reads the first line

I also have an error in the for loop since the lines aren't of the same length I get an index out of range, any suggestion in how I can read all the lines?
This is the csv file I have been trying to read

lines of csv file:
chicken,other vegetables,packaged fruit/vegetables,condensed milk,frozen vegetables,fruit/vegetable juice
vinegar,oil
rolls/buns,soda,specialty bar
whole milk
pork,root vegetables,whole milk,whipped/sour cream
rolls/buns,newspapers
grapes,other vegetables,zwieback,Instant food products,dishes
frankfurter,citrus fruit,whipped/sour cream,cream cheese ,rolls/buns,baking powder,seasonal products,napkins
finished products,root vegetables,packaged fruit/vegetables,yogurt,specialty cheese,frozen vegetables,ice cream,soda,bottled beer
onions,other vegetables,flour
tropical fruit,whole milk,rolls/buns

and this is the code am using to read csv files

read csv file and put it into a datatable:
static void Main(string[] args)
        {

         string filepath = @"C:\Downloads\groceriess.csv";
         DataTable res = ConvertCSVtoDataTable(filepath);
         DataTable ConvertCSVtoDataTable(string strFilePath)
         {
            StreamReader sr = new StreamReader(strFilePath);
            string[] headers = sr.ReadLine().Split(',');
            DataTable dt = new DataTable();
            foreach (string header in headers)
            {
                dt.Columns.Add(header);
            }
            while (!sr.EndOfStream)
            {
                string[] rows = Regex.Split(sr.ReadLine(), ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");
                DataRow dr = dt.NewRow();
                for (int i = 0; i < headers.Length; i++)
                {
                    dr[i] = rows[i]; //this is the line that is causing the error
                }
                dt.Rows.Add(dr);
            }
            return dt;
         }


        }
 
Last edited:
Solution
For further clarity here's an example to compute the headers (as a List) and similarly the binaries (as a List)
C#:
var path = "../groceriess.csv";
var rows = File.ReadAllLines(path).Select(s => s.Split(",").ToList());
var headers = rows
    .SelectMany(s => s)
    .Distinct()
    .OrderBy(s => s) // sorts the items
    .ToList();

Notes(for OP, to aid comprehension):
  • SelectMany when used in this context with an id operation s => s; flattens List<List<string>> to List<string>
  • Distinct removes duplicates
  • OrderBysorts the items, alphabetically
That'll produce:
C#:
System.Collections.Generic.List<System.String>[
"baking powder","bottled beer","chicken","citrus...
So again, I ask, how do you determine that O, X, and Y shouldn't be headers, but A, B, C, and D should be headers without doing a second pass?
That's not what the OP describes with his example in post #10 ?:

C#:
bread, vegetables, fruits
bread, butter, oil
oil, water, wheat

..becomes...

C#:
bread vegetables fruits butter oil water wheat
No items were left out?
 
I guess we have different interpretations of what a CSV file is, as well as what the OP is asking for.
 
I guess we have different interpretations of what a CSV file is, as well as what the OP is asking for.
Quite possible...

My interpretation is as follows:

read this csv file and make a header of all the distinct elements
This for me implies that ALL distinct elements should be used for headers. No exclusions.

The second operation is then to compute binaries as you described.
 
For further clarity here's an example to compute the headers (as a List) and similarly the binaries (as a List)
C#:
var path = "../groceriess.csv";
var rows = File.ReadAllLines(path).Select(s => s.Split(",").ToList());
var headers = rows
    .SelectMany(s => s)
    .Distinct()
    .OrderBy(s => s) // sorts the items
    .ToList();

Notes(for OP, to aid comprehension):
  • SelectMany when used in this context with an id operation s => s; flattens List<List<string>> to List<string>
  • Distinct removes duplicates
  • OrderBysorts the items, alphabetically
That'll produce:
C#:
System.Collections.Generic.List<System.String>[
"baking powder","bottled beer","chicken","citrus fruit","condensed milk","cream cheese ","dishes","finished products","flour","frankfurter",
"frozen vegetables","fruit\/vegetable juice","grapes","ice cream","Instant food products","napkins","newspapers","oil","onions","other vegetables",
"packaged fruit\/vegetables","pork","rolls\/buns","root vegetables","seasonal products","soda","specialty bar","specialty cheese","tropical fruit","vinegar",
"whipped\/sour cream","whole milk","yogurt","zwieback"]

Next to compute the binaries:
C#:
public static class BoolExtension {
  public static string ToBinary(this bool x) => x ? "1" : "0";
}

var binary = rows
  .Select(xs => headers
    .Select(h => xs.Contains(h).ToBinary())
    .ToList())
  .ToList();

Notes(for OP, to aid comprehension):
  • xs is the List<string> containing 1 csv row of items to match to the header
  • Contains returns True or False if the row (xs) contains the header item (h), ToBinary is Method Extension to convert the bool value to a string "1" representing True or "0" representing False

That'll produce:
C#:
System.Collections.Generic.List<System.Collections.Generic.List<System.String>>[
    ["0","0","1","0","1","0","0","0","0","0","1","1","0","0","0","0","0","0","0","1","1","0","0","0","0","0","0","0","0","0","0","0","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","0","1","1","0","0","0","0","0","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","1","0","0","0","0","0","0","1","1","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","1","0","0","0","0","0","0","0","0","0","0","0"],
    ["0","0","0","0","0","0","1","0","0","0","0","0","1","0","1","0","0","0","0","1","0","0","0","0","0","0","0","0","0","0","0","0","0","1"],
    ["1","0","0","1","0","1","0","0","0","1","0","0","0","0","0","1","0","0","0","0","0","0","1","0","1","0","0","0","0","0","1","0","0","0"],
    ["0","1","0","0","0","0","0","1","0","0","1","0","0","1","0","0","0","0","0","0","1","0","0","1","0","1","0","1","0","0","0","0","1","0"],
    ["0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","0","1","1","0","0","0","0","0","0","0","0","0","0","0","0","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","1","0","0","1","0","0"]]

The conversion at this point to DataTable would be fairly simplistic.

Notes: (for OP to aid comprehension)
A few helper methods will make the conversion easier, for example:
  • From a List<string> to DataColumn / DataColumnCollection
  • From List<List<string>> to DataRow / DataRowCollection
 
Last edited:
Solution
For further clarity here's an example to compute the headers (as a List) and similarly the binaries (as a List)
C#:
var path = "../groceriess.csv";
var rows = File.ReadAllLines(path).Select(s => s.Split(",").ToList());
var headers = rows
    .SelectMany(s => s)
    .Distinct()
    .OrderBy(s => s) // sorts the items
    .ToList();

Notes(for OP, to aid comprehension):
  • SelectMany when used in this context with an id operation s => s; flattens List<List<string>> to List<string>
  • Distinct removes duplicates
  • OrderBysorts the items, alphabetically
That'll produce:
C#:
System.Collections.Generic.List<System.String>[
"baking powder","bottled beer","chicken","citrus fruit","condensed milk","cream cheese ","dishes","finished products","flour","frankfurter",
"frozen vegetables","fruit\/vegetable juice","grapes","ice cream","Instant food products","napkins","newspapers","oil","onions","other vegetables",
"packaged fruit\/vegetables","pork","rolls\/buns","root vegetables","seasonal products","soda","specialty bar","specialty cheese","tropical fruit","vinegar",
"whipped\/sour cream","whole milk","yogurt","zwieback"]

Next to compute the binaries:
C#:
public static class BoolExtension {
  public static string ToBinary(this bool x) => x ? "1" : "0";
}

var binary = rows
  .Select(xs => headers
    .Select(h => xs.Contains(h).ToBinary())
    .ToList())
  .ToList();

Notes(for OP, to aid comprehension):
  • xs is the List<string> containing 1 csv row of items to match to the header
  • Contains returns True or False if the row (xs) contains the header item (h), ToBinary is Method Extension to convert the bool value to a string "1" representing True or "0" representing False

That'll produce:
C#:
System.Collections.Generic.List<System.Collections.Generic.List<System.String>>[
    ["0","0","1","0","1","0","0","0","0","0","1","1","0","0","0","0","0","0","0","1","1","0","0","0","0","0","0","0","0","0","0","0","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","0","1","1","0","0","0","0","0","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","1","0","0","0","0","0","0","1","1","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","1","0","0","0","0","0","0","0","0","0","0","0"],
    ["0","0","0","0","0","0","1","0","0","0","0","0","1","0","1","0","0","0","0","1","0","0","0","0","0","0","0","0","0","0","0","0","0","1"],
    ["1","0","0","1","0","1","0","0","0","1","0","0","0","0","0","1","0","0","0","0","0","0","1","0","1","0","0","0","0","0","1","0","0","0"],
    ["0","1","0","0","0","0","0","1","0","0","1","0","0","1","0","0","0","0","0","0","1","0","0","1","0","1","0","1","0","0","0","0","1","0"],
    ["0","0","0","0","0","0","0","0","1","0","0","0","0","0","0","0","0","0","1","1","0","0","0","0","0","0","0","0","0","0","0","0","0","0"],
    ["0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","0","1","0","0","0","0","0","1","0","0","1","0","0"]]

The conversion at this point to DataTable would be fairly simplistic.

Notes: (for OP to aid comprehension)
A few helper methods will make the conversion easier, for example:
  • From a List<string> to DataColumn / DataColumnCollection
  • From List<List<string>> to DataRow / DataRowCollection

Thank you, that was pretty similar to what I was looking for. I will try to implement your method in my code but when I tried debugging it as a single code the ToBinary() method returned a problem which is

C#:
bool doesn't contain a definition for ToBinary and no accessible extension method 'ToBinary' accepting a first argument of type 'bool' could be found

I tried searching online but I wanted to know if you did something different that I didn't that's why it showed this error

the ToBinary extension method returned a problem too but when I removed the 'this' variable from (this bool x) it worked
 
Look at the Binary() extension method he wrote in the BoolExtension class.
 
Thank you, that was pretty similar to what I was looking for. I will try to implement your method in my code but when I tried debugging it as a single code the ToBinary() method returned a problem which is

C#:
bool doesn't contain a definition for ToBinary and no accessible extension method 'ToBinary' accepting a first argument of type 'bool' could be found

I tried searching online but I wanted to know if you did something different that I didn't that's why it showed this error

the ToBinary extension method returned a problem too but when I removed the 'this' variable from (this bool x) it worked
Extension Methods like ToBinary are defined in a static class; and you can identify it as an Extension Method because it uses the this keyword on the 1st parameter; which is the data type that you are extending with this dot (extension) method.

So for example; you can define it similar to this:

C#:
namespace .. {
 
  public static class BoolExtensions {  // You pick this class name; just must be a static class and be unique
    public static string ToBinary(this bool x) => x ? "1" : "0";
  }
 
  class Program {
    static public void Main() {
      var a = True;
      var aBinary = a.ToBinary();  // "1"
    }
  }
}
The Extension Method class can be defined in a separate class file; and it can contain many methods. Typically we store all the methods related to that data type together. So BoolExtensions would contain all the extension methods for the bool data type.

Microsoft C# Reference:

"rabbit hole" note:
Linq was created using extension methods. That's a rabbit hole to explore once you've found your footing.
 
Last edited:
Extension Methods like ToBinary are defined in a static class; and you can identify it as an Extension Method because it uses the this keyword on the 1st parameter; which is the data type that you are extending with this dot (extension) method.

So for example; you can define it similar to this:

C#:
namespace .. {
 
  public static class BoolExtensions {  // You pick this class name; just must be a static class and be unique
    public static string ToBinary(this bool x) => x ? "1" : "0";
  }
 
  class Program {
    static public void Main() {
      var a = True;
      var aBinary = a.ToBinary();  // "1"
    }
  }
}
The Extension Method class can be defined in a separate class file; and it can contain many methods. Typically we store all the methods related to that data type together. So BoolExtensions would contain all the extension methods for the bool data type.

Microsoft C# Reference:

"rabbit hole" note:
Linq was created using extension methods. That's a rabbit hole to explore once you've found your footing.

Thank you, that was very helpful like you said Linq is a rabbit hole that I have on hold before I dive into it. This was enlightening you have my humble thanks
 
Back
Top Bottom