Reading CSV file into OLEDB vs SQLite into DbDataReader(System.Data.Common)

Anwind

Well-known member
Joined
Nov 6, 2021
Messages
48
Programming Experience
1-3
Hello

I have a question about memory efficiency in C#.NET (Visual Studio 2019).

I need to read data either from a CSV file via OLEDB, or from SQLite via a DbDataReader (System.Data.Common).

Which of the two is more efficient in terms of PC memory: reading a CSV file through OLEDB, or reading SQLite through a DbDataReader?

Please help me.
 
Test it out.
 
It's hard to make a test that matches your workload since we don't know what kind of workload you have for those two. Does the CSV file contain the same information as the SQLite database? If not then you cannot even do an apples to apples comparison.

In the end, any kind of profiling needs to be done by you, on your own data, for the results to have any relevance.

Furthermore, you don't even need OLEDB to read a CSV file. The framework has the TextFieldParser class, which is even more lightweight than either the OLEDB + ACE driver or the SQLite driver's DbDataReader.
 
I haven't coded yet.
Then the next step is obvious. Test the two options and see which performs better. It will only be a handful of lines for each option.
 
As a quick aside. If your code will eventually be running on a server, Microsoft does not support using the ACE driver in a server environment. Read the documentation on the driver download page if you don't believe me. That may force the choice for you as to whether to use OLEDB+ACE driver or SQLite.
 
I need to use Query.
What kind of query? If you are going to select all rows anyway, the TextFieldParser is easy to wrap with a DbDataReader.

If you have a relatively simple WHERE clause, this can also be implemented within that wrapper.
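As a rough illustration of that wrapper idea, here is a minimal sketch (not a full DbDataReader implementation) that reads a CSV with TextFieldParser and applies a simple WHERE-style filter. The file path, sample data, and column layout are made up for the example:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO; // TextFieldParser lives here

class CsvFilterDemo
{
    // Reads a delimited file and yields only the rows whose first
    // column matches the given value -- a hand-rolled WHERE clause.
    public static IEnumerable<string[]> ReadWhere(string path, string column1Value)
    {
        using var parser = new TextFieldParser(path);
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");
        parser.ReadFields(); // skip the header row

        while (!parser.EndOfData)
        {
            string[] fields = parser.ReadFields();
            if (fields != null && fields[0] == column1Value)
                yield return fields;
        }
    }

    static void Main()
    {
        // Hypothetical sample data, written to a temp file for the demo.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "Column1,Column2\nABC,2\nXYZ,1\nABC,1\n");

        foreach (var row in ReadWhere(path, "ABC"))
            Console.WriteLine(string.Join("|", row));

        File.Delete(path);
    }
}
```

An ORDER BY would need the matching rows buffered into a list and sorted, which is exactly the kind of work a real database engine does for you.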
 
My teacher Skydiver,

Following your teaching, I am coding with TextFieldParser (C#.NET, Visual Studio 2019).

1) << Move Next >> in DbDataReader (System.Data.Common) ===> How do you wrap this with TextFieldParser?

2) Query: SELECT *, (rowname - 1) rowname FROM Table WHERE Column1 = 'ABC' ORDER BY Column2
I don't know the query syntax in TextFieldParser.

Please teach me.
 
Just use SQLite

It will be more beneficial for your education to focus on the notion that data should be kept in a database, not in a CSV file that is parsed every time you want to read something and rewritten every time you want to persist something.

Once you adopt a more database-centric view, reading and writing data becomes very easy, and you'll come to see CSV files as a (worse) way to achieve data transfer between databases; you'll parse your CSV once, store it in a DB, then use all of the nice DB facilities to work with the data.

About the only time I would write an app that processed data but didn't use a DB is if it was doing a transform from one format to another. As soon as there is a requirement to consume a large amount of data and selectively query or update a small part of it, I start looking to a DB, especially if those read/write ops take place over an extended period of time (the program is stopped and started between ops).
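To make that workflow concrete, here is a minimal sketch using the Microsoft.Data.Sqlite NuGet package (an assumption; System.Data.SQLite would look similar): load the parsed CSV rows into a table once, then let the database do the querying. The table name, columns, and sample rows are made up for the example:

```csharp
using System;
using Microsoft.Data.Sqlite; // NuGet: Microsoft.Data.Sqlite

class SqliteImportDemo
{
    static void Main()
    {
        // In-memory database for the demo; use a file path in real code.
        using var conn = new SqliteConnection("Data Source=:memory:");
        conn.Open();

        using (var create = conn.CreateCommand())
        {
            create.CommandText =
                "CREATE TABLE People (Column1 TEXT, Column2 INTEGER)";
            create.ExecuteNonQuery();
        }

        // Rows as they might come out of a CSV parser.
        var rows = new[] { ("ABC", 2), ("XYZ", 1), ("ABC", 1) };
        foreach (var (c1, c2) in rows)
        {
            using var insert = conn.CreateCommand();
            insert.CommandText =
                "INSERT INTO People (Column1, Column2) VALUES ($c1, $c2)";
            insert.Parameters.AddWithValue("$c1", c1);
            insert.Parameters.AddWithValue("$c2", c2);
            insert.ExecuteNonQuery();
        }

        // Now the database does the WHERE and ORDER BY work for us.
        using var select = conn.CreateCommand();
        select.CommandText =
            "SELECT Column1, Column2 FROM People " +
            "WHERE Column1 = 'ABC' ORDER BY Column2";
        using var reader = select.ExecuteReader(); // this is a DbDataReader
        while (reader.Read())
            Console.WriteLine($"{reader.GetString(0)}|{reader.GetInt32(1)}");
    }
}
```

Note that ExecuteReader() already hands you a DbDataReader, so the rest of your code doesn't care whether the data originally came from a CSV.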
 
1) << Move Next >> in DbDataReader (System.Data.Common) ===> How do you wrap this with TextFieldParser?
That just maps to the ReadFields() call on the TextFieldParser.
 
2) Query: SELECT *, (rowname - 1) rowname FROM Table WHERE Column1 = 'ABC' ORDER BY Column2
I don't know the query syntax in TextFieldParser.
You will have to implement your own filtering and column selection with your TextFieldParser wrapper, or by using LINQ-to-objects after the wrapper returns all the rows.
 
With a Select() extension method, the lambda or function you pass in can return any kind of object. So you can either create an anonymous type with all the column values as fields, plus an additional field where you compute the column value - 1; or, alternatively, you can declare a class with all the fields plus an additional field to hold that column value - 1.
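As a sketch of that LINQ-to-objects approach, assuming the wrapper has already returned each row as a string array and using hypothetical columns matching the earlier query:

```csharp
using System;
using System.Linq;

class LinqQueryDemo
{
    static void Main()
    {
        // Hypothetical rows, as a TextFieldParser wrapper might return them:
        // { rowname, Column1, Column2 }
        var rows = new[]
        {
            new[] { "10", "ABC", "beta" },
            new[] { "20", "XYZ", "alpha" },
            new[] { "30", "ABC", "alpha" },
        };

        // Equivalent of:
        // SELECT *, (rowname - 1) rowname FROM Table
        // WHERE Column1 = 'ABC' ORDER BY Column2
        var result = rows
            .Where(r => r[1] == "ABC")            // the WHERE clause
            .OrderBy(r => r[2])                   // the ORDER BY clause
            .Select(r => new                      // the SELECT projection
            {
                Rowname = int.Parse(r[0]) - 1,    // the computed (rowname - 1)
                Column1 = r[1],
                Column2 = r[2],
            });

        foreach (var row in result)
            Console.WriteLine($"{row.Rowname},{row.Column1},{row.Column2}");
    }
}
```

Keep in mind that OrderBy() has to buffer all the matching rows in memory before it can sort them, which is part of why a real database can be the more memory-efficient choice.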
 
You do realize that with the amount of time you've invested into trying to get the TextFieldParser to work, you could have already tested to see what the difference is between using the OLEDB+ACE driver and using the SQLite driver.
 
If I had to read a CSV like this:

CSV:
Name,Height,Birthday,IsActive
John,180,1970-01-01,True
Jane,170,1980-01-01,False

I would add a reference to Sylvan.Data.Csv from nuget and have a code like:

C#:
using System;
using System.Linq;
using Sylvan.Data;
using Sylvan.Data.Csv;

using var csv = CsvDataReader.Create("path/to/csv.csv");
var people = csv.GetRecords<CsvPerson>().ToList();

record CsvPerson(string? Name, int Height, DateTime Birthday, bool IsActive)
{
    public CsvPerson() : this(default, default, default, default) { }
}

And then I'd take my people list and (probably) run it through AutoMapper to map it to my database Person entity, add those people to an EF context and save them - another couple of lines of code. The Person DB entity and context would have been created by quickly putting together a database in some query tool (I tend to use SQL Server and SSMS, but SQLite has its own ecosystem of such tools) and then telling EF to analyse the database and generate code from it; the EF Core Power Tools VS extension makes this easy too.

We end up with a set of code, mostly written by tools in a few seconds, that we can just use like:

C#:
var newDbEntities = mapper.Map<List<Person>>(people); //Person is the DB entity type, CsvPerson is the record read from CSV. mapper is the configured AutoMapper that knows how to translate a CsvPerson to a Person object
context.People.AddRange(newDbEntities);
await context.SaveChangesAsync();

If I had millions of CSV records to process, I'd use a loop rather than ToList() so that I wasn't loading the whole file at once, and I'd look at the EF Core Bulk Extensions for batch loading of the data for a useful speed boost.
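A sketch of that streaming/batching idea, independent of EF (the batch size and the chunking helper are made up for illustration; the point is that only one batch of records is ever held in memory):

```csharp
using System;
using System.Collections.Generic;

class BatchingDemo
{
    // Groups a (potentially huge, lazily produced) sequence into
    // fixed-size batches so the whole file never sits in memory at once.
    public static IEnumerable<List<T>> InBatches<T>(IEnumerable<T> source, int batchSize)
    {
        var batch = new List<T>(batchSize);
        foreach (var item in source)
        {
            batch.Add(item);
            if (batch.Count == batchSize)
            {
                yield return batch;
                batch = new List<T>(batchSize);
            }
        }
        if (batch.Count > 0)
            yield return batch; // the final, possibly partial, batch
    }

    static void Main()
    {
        // Pretend each int is a parsed CSV record; save one batch at a time.
        IEnumerable<int> records = System.Linq.Enumerable.Range(1, 10);
        foreach (var batch in InBatches(records, 4))
            Console.WriteLine($"saving batch of {batch.Count}");
    }
}
```

On .NET 6 and later, the built-in Enumerable.Chunk() does this same grouping for you.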
 