read csv file but it only reads the first line

ouss_wess · Jan 21, 2022

Am trying to read this csv file and make a header of all the distinct elements in it and put it in a datatable but it only reads the first line

Am trying to read this current csv file and convert it into a truth table that show if an element exists in a row I out 1 if not I put 0 but it only reads the first line

I also have an error in the for loop since the lines aren't of the same length I get an index out of range, any suggestion in how I can read all the lines?
This is the csv file I have been trying to read

lines of csv file:

chicken,other vegetables,packaged fruit/vegetables,condensed milk,frozen vegetables,fruit/vegetable juice
vinegar,oil
rolls/buns,soda,specialty bar
whole milk
pork,root vegetables,whole milk,whipped/sour cream
rolls/buns,newspapers
grapes,other vegetables,zwieback,Instant food products,dishes
frankfurter,citrus fruit,whipped/sour cream,cream cheese ,rolls/buns,baking powder,seasonal products,napkins
finished products,root vegetables,packaged fruit/vegetables,yogurt,specialty cheese,frozen vegetables,ice cream,soda,bottled beer
onions,other vegetables,flour
tropical fruit,whole milk,rolls/buns

and this is the code am using to read csv files

read csv file and put it into a datatable:

static void Main(string[] args)
        {

         string filepath = @"C:\Downloads\groceriess.csv";
         DataTable res = ConvertCSVtoDataTable(filepath);
         DataTable ConvertCSVtoDataTable(string strFilePath)
         {
            StreamReader sr = new StreamReader(strFilePath);
            string[] headers = sr.ReadLine().Split(',');
            DataTable dt = new DataTable();
            foreach (string header in headers)
            {
                dt.Columns.Add(header);
            }
            while (!sr.EndOfStream)
            {
                string[] rows = Regex.Split(sr.ReadLine(), ",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");
                DataRow dr = dt.NewRow();
                for (int i = 0; i < headers.Length; i++)
                {
                    dr[i] = rows[i]; //this is the line that is causing the error
                }
                dt.Rows.Add(dr);
            }
            return dt;
         }


        }

Skydiver · Jan 21, 2022

Your title says that your problem is that you are only reading the first line, but the code you posted has a comment that you are getting an error on line 21.

Anyway, the reason why you are getting that error is because the header line has 6 items, but when you read the second line, it only has 2 items. The for loop on line 19 goes through the index values of 0 to 5, but only index values 0 to 1 will be valid for second line.

ouss_wess · Jan 21, 2022

Skydiver said:
Your title says that your problem is that you are only reading the first line, but the code you posted has a comment that you are getting an error on line 21.

Anyway, the reason why you are getting that error is because the header line has 6 items, but when you read the second line, it only has 2 items. The for loop on line 19 goes through the index values of 0 to 5, but only index values 0 to 1 will be valid for second line.

I thought if I added the for loop It will add more lines that's why I didn't mention the error message because what I want to know is how to read all the lines but you are right I should have mentioned it, do you have a suggestion on how to resolve it?

JohnH · Jan 21, 2022

Instead of looping/indexing headers.Length do rows.Length.

ouss_wess · Jan 21, 2022

JohnH said:
Instead of looping/indexing headers.Length do rows.Length.

I did try that and even added it with it using && but it still showed a similar error

Skydiver · Jan 21, 2022

Unless you have some lines with more than 6 items, I highly doubt that you would have a similar error.

endofunk · Jan 21, 2022

Alternative method:

Read all text from file
Replace end of line characters with a comma
Spilt string on comma

Skydiver · Jan 21, 2022

That is not going to work because how would you determine where the header ends and the row data starts. And then where would you determine where each row ends.

Our OP's objective to take input that looks like this:

C#:

a, b, c, d
a, d
b, o, x
c, a, d, y

And turn it into:

C#:

If you follow your procedure and read all the lines of the text file and replace new lines with commas, our OP would have to deal with:

C#:

a, b, c, d, a, d, b, o, x, c, a, d, y,

So how do you get the desired truth table from that?

ouss_wess · Jan 22, 2022

Skydiver said:
That is not going to work because how would you determine where the header ends and the row data starts. And then where would you determine where each row ends.

Our OP's objective to take input that looks like this:

C#:

a, b, c, d a, d b, o, x c, a, d, y

And turn it into:

C#:

a b c d 1 0 0 1 0 1 0 0 1 0 1 0

If you follow your procedure and read all the lines of the text file and replace new lines with commas, our OP would have to deal with:

C#:

a, b, c, d, a, d, b, o, x, c, a, d, y,

So how do you get the desired truth table from that?

after I read it (successfully), am going to compare them and take the distinct elements and put them into a datatable and then read the the csv file again and compare the elements with the rows if yes I put 1 if not I put 0

ouss_wess · Jan 22, 2022

Skydiver said:
Unless you have some lines with more than 6 items, I highly doubt that you would have a similar error.

that I do, am working on it without having an exact idea of the line length, I tried giving a big number as a limit but that also didn't work

C#:

bread, vegetables, fruits
bread, butter, oil
oil, water, wheat

I would then choose the distinct elements

C#:

bread vegetables fruits butter oil water wheat

ouss_wess · Jan 22, 2022

endofunk said:
Alternative method:

Read all text from file

Replace end of line characters with a comma

Spilt string on comma

thank you but I need to keep using this current method as I tested with other csv files and it worked

Skydiver · Jan 22, 2022

ouss_wess said:
after I read it (successfully), am going to compare them and take the distinct elements and put them into a datatable and then read the the csv file again and compare the elements with the rows if yes I put 1 if not I put 0

How do you determine given:

C#:

a, b, c, d, a, d, b, o, x, c, a, d, y,

that O, X, and Y should not be column headers when you first build the data table before your second pass?

The OP had the right approach of doing one pass and treating the first line differently.

ouss_wess · Jan 22, 2022

Skydiver said:
How do you determine given:

C#:

a, b, c, d, a, d, b, o, x, c, a, d, y,

that O, X, and Y should not be column headers when you first build the data table before your second pass?

The OP had the right approach of doing one pass and treating the first line differently.

They should be column headers cause they are distinct, it's the first time they appeared (If I undertstood your question clearly) . I posted this in how to be able to read all the lines and the rest I will figure out after I fix this problem but until now am stuck in it so any additional information I cannot give until I fix this, Thank you.

endofunk · Jan 22, 2022

Skydiver said:
That is not going to work because how would you determine where the header ends and the row data starts. And then where would you determine where each row ends.

Our OP's objective to take input that looks like this:

C#:

a, b, c, d a, d b, o, x c, a, d, y

And turn it into:

C#:

a b c d 1 0 0 1 0 1 0 0 1 0 1 0

If you follow your procedure and read all the lines of the text file and replace new lines with commas, our OP would have to deal with:

C#:

a, b, c, d, a, d, b, o, x, c, a, d, y,

So how do you get the desired truth table from that?

It was never meant to be a complete solution... merely a pointer to the 1st challenge, namely:

ouss_wess said:
read this csv file and make a header of all the distinct elements in it

All he'd have to do is to compute the distinct elements in the returned list; and he'd have his header.

The binary match up part is rather easy once you have your header...
...naturally for that; keeping track of the row offsets would be important.

Also to avoid more than 1 file IO operation; one could naturally ReadAllLines as opposed to ReadAllText followed by split on comma; which would cover the row offsets, and then for the header; flatten the List<List<string>> to a List<string> and compute a distinct list of header elements.

Skydiver · Jan 22, 2022

ouss_wess said:
They should be column headers cause they are distinct, it's the first time they appeared (If I undertstood your question clearly) . I posted this in how to be able to read all the lines and the rest I will figure out after I fix this problem but until now am stuck in it so any additional information I cannot give until I fix this, Thank you.

If you get the distinct values from:

C#:

a, b, c, d, a, d, b, o, x, c, a, d, y,

then you will get these values as the distinct values:

C#:

a, b, c, d, o, x, y

Or better yet, what if the input file was:

C#:

a, b, c
d, 1,
e, 2, f
g, 3, h
i, 4, j

which should result in:

C#:

but will give you a distinct list of:

C#:

a, b, c, d, 1, e, 2, f, g, 3, h, i, 4, j

So again, I ask, how do you determine that O, X, and Y shouldn't be headers, but A, B, C, and D should be headers without doing a second pass?

read csv file but it only reads the first line

ouss_wess

Member

Am trying to read this csv file and make a header of all the distinct elements in it and put it in a datatable but it only reads the first line

endofunk

Skydiver

ouss_wess

Member

JohnH

C# Forum Moderator

ouss_wess

Member

Skydiver

endofunk

Active member

Skydiver

ouss_wess

Member

ouss_wess

Member

ouss_wess

Member

Skydiver

ouss_wess

Member

endofunk

Active member

Skydiver

Share this page

Latest posts

read csv file but it only reads the first line

Member

Am trying to read this csv file and make a header of all the distinct elements in it and put it in a datatable but it only reads the first line​

Member

C# Forum Moderator

Member

Active member

Member

Member

Member

Member

Active member

Share this page

Am trying to read this csv file and make a header of all the distinct elements in it and put it in a datatable but it only reads the first line