oledb is slow.

Anwind

Well-known member
Joined
Nov 6, 2021
Messages
48
Programming Experience
1-3
hello

CSV File size : 500MB
The CSV file is read using the above code method.
oledb is slow.
Is there any way to speed it up?
Please teach me.
 
Last edited:
Your Answer) I still don't understand why doing a code migration is forcing a change from using SQLite to OLEDB.
===> Original Source data is a CSV file. However, the csv file is converted to a sqlite db and is being used. Except that sqlite automatically returns the rowid to the query. Here This is the reference site for row number of sqlite. click here ===> (https://renenyffenegger.ch/notes/development/databases/SQLite/tables/rowid)

Terefore, the csv file and the sqlite db have the same data.
sqlite db can actually be removed.
Given your current coding dilemna, about not being able to follow our advice about computing the row number yourself, as well as the poor performance of using the ACE driver, it looks like you cannot actually remove the SQLite database.
 
Sylvan CSV ==> Can a be MIGRATED to system.data.oledb.dll, system.data.common?
No. It will be Sylvan's assembly and system.data.common. Recall that System.Data.Common is what provides the DbDataReader class which their data reader services from.
 
Is there any way to migrate system.data.oledb & system.data.common to PARSER?
Only for a DbDataReader that has all text columns and the query in the comand is just selects columns and has very simple filtering. No sorting will be supported in the query.

Anything involving filtering, sorting, transforms, etc. will have to be done after the rows have been read from the CSV. See the pseudo code I posted earlier in one of your threads revolving around this topic.
 
in Sylvan CSV, this is no
System.Data.OleDb.OleDbCommand
System.Data.OleDb.OleDbCommandBuilder
System.Data.OleDb.OleDbConnection
System.Data.OleDb.OleDbDataAdapter
System.Data.OleDb.OleDbDataReader
System.Data.OleDb.OleDbParameter
System.Data.OleDb.OleDbParameterCollection
System.Data.OleDb.OleDbTransaction.
That is correct. It is purely a CSV data reader. It is not meant to replace a full database.
 
I have mapped(migration).

From System.Data.Common & System.Data.Oledb.dll <=== mapped System.Data.Common & System.Data.SQLite.dll


Is Sylvan CSV also mapped as above?
Sort of. The System.Data.Common will still be there but only the DbDataReadet part is use, and the only functionality it will have is equivalent to the data reader part of System.Data.Oledb.dll or System.Data.SQLite.dll.
 
I just read the csv file. Write function is not required
No you don't. In your other thread you were sorting by an "ID" column and also expecting the query to add a column that has the row number after sorting. So you are not just reading. You were reading and transforming.
 
That is correct. It is purely a CSV data reader. It is not meant to replace a full database.

Is there any way to migrate system.data.oledb & system.data.common to PARSER?

As per your answer, there is Sylvan CSV.

in Sylvan CSV, this is no

System.Data.OleDb.OleDbCommand <--Connection-->System.Data.Common.DbCommand ---> System.Data.SQLite.SQLiteCommand

System.Data.OleDb.OleDbConnection <-- Connection-->System.Data.Common.DbConnection ---> System.Data.SQLite.SQLiteConnection

System.Data.OleDb.OleDbDataReader <-- Connection-->System.Data.Common.DbDataReader --> System.Data.SQLite.SQLiteDataReader

System.Data.OleDb.OleDbDataAdapter <-- Connection-->System.Data.Command.DbAdapter --> System.Data.SQLite.SQLiteDbDataAdapter

System.Data.OleDb.OleDbCommandBuilder
System.Data.OleDb.OleDbParameter
System.Data.OleDb.OleDbParameterCollection
System.Data.OleDb.OleDbTransaction.

I have mapped(Connection).

From System.Data.Common & System.Data.Oledb.dll <=== mapped System.Data.Common & System.Data.SQLite.dll

If mapping is possible as you say, I will try with Sylvan CSV.
 
Last edited:
I'm migrating this code, it's read only.
If that is all you are doing, the Sylvan or the TextFieldParser would be a good fit for you. You don't even need the OLEDB or Sylvan if the speed for the TextFieldParser is good enough for you.

If your .CSV doesn't have any quoted text, and is truly purely comma separated, you don't even need the TextFieldParser. You could simply TextReader.ReadLine() and use Split() line by line.
 
@annayun: DO NOT DELETE POSTS OR BACK EDIT THEM!!! Your edit and delete privileges maybe take away if you keep doing that.
 
The source below is the structure of the code I want to migrate.
C#:
var command = new Sysrem.Data.OleDbCommand("SELECT * FROM Table", connection);
System.Data.Common.DbDataReader reader = command.ExecuteReader();

String.Format("Row Number : {0}", rowNum);
String.Format("CSV File Column1 : {0}", Col1);
String.Format("CSV File Column2 : {0}", Col2);

int rowNum = 1;
while (reader.Read())
{
rowNum++;
}
System.Data.Sylvan / System.Data.TextFieldParser <- Connection-->System.Data.Common.DbCommand

I converted it to the source below.

C#:
var command = new Sylvan_Command("SELECT * FROM Table", connection);

System.Data.Common.DbDataReader reader = Sylvan_Command.ExecuteReader();

String.Format("Row Number : {0}", rowNum);
String.Format("CSV File Column1 : {0}", Col1);
String.Format("CSV File Column2 : {0}", Col2);

int rowNum = 1;
while (reader.Read())
{
rowNum++;
}



Looking at the code, SystemData.DbDataReader(System.Data.Common) and System.Dara.OleDbCommand(System.Data.Oledb)are connected and coded together.
First of all, this is incorrect:
C#:
System.Data.Common.DbDataReader reader = command.ExecuteReader();

String.Format("Row Number : {0}", rowNum);
String.Format("CSV File Column1 : {0}", Col1);
String.Format("CSV File Column2 : {0}", Col2);
You have not yet read any data in yet. So there is nothing to format.

Anyway to migrate your code using Sylvan, all you would need is:
C#:
var csv = CsvDataReader.Create("data.csv");
int rowNum = 1
while(csv.Read()) 
{
    rowNum++;
}

To migrate your code using TextFieldParser, all you would need is:
C#:
var parser = new TextFieldParser("data.csv");
parser.TextFieldType = FieldType.Delimited;
parser.Delimiters = new[] { "," };
parser.CommentTokens = new[] { "#" };
int rowNum = 1;
var headers = parser.ReadFields();
while (!parser.EndOfData)
{
    parser.ReadFields();
    rowNum++;
}

To actually do something useful within the while loops, like accessing the rows would be:
C#:
var csv = CsvDataReader.Create("data.csv");
int rowNum = 1
while(csv.Read()) 
{
    Console.Write($"Row {rowNum}: ");
    for(int i = 0; i < csv.FieldCount; i++)
        Console.Write($"{csv[i]}, ");
    Console.WriteLine();
    rowNum++;
}

or

C#:
var parser = new TextFieldParser("data.csv");
parser.TextFieldType = FieldType.Delimited;
parser.Delimiters = new[] { "," };
parser.CommentTokens = new[] { "#" };
int rowNum = 1;
var headers = parser.ReadFields();
while (!parser.EndOfData)
{
    var cols = parser.ReadFields();
    Console.Write($"Row {rowNum}: ");
    for(int i = 0; i < cols.Length; i++)
        Console.Write($"{cols[i]}, ");
    Console.WriteLine();
    rowNum++;
}
 
OLEDB is just a standard interface for different databases. As a standard, there various native database engines needs to convert data to data that confirms to the interface. It is not OLEDB that is slow but rather the database engine that is implementing the OLEDB interface that is slow.

What database are you using? The ACE or the JET engine?

What non-OLEDB database driver are you comparing with?

I hope you realize that anything backed by a CSV, the plaintext file needs to parsed to determine the rows and columns values, then try to detect the type of data for that column, then convert the column your to that massive type for each row, as the rows are being returned. Compare that to a real database where the column values are already in native types, and already stored in rows.

Also, by chance are you writing it to the console within your while loop in the code above? If so, the console is slow. Write out to a log file using a real logging system. Do not kludge together an ad hoc logging system where a file keeps on being opened and closed per write -- you would only barely be faster than the console.

It is not OLEDB that is slow. It is the ACE database driver exposing an OLEDB interface that is slow. You are basically at the mercy of Microsoft. Considering that they abandoned ACE and went with JET, you'll likely not get much help there. The ACE database driver was written primarily for use with Access databases. It was only tangentially, expanded by the Excel development team to also accept a few other data formats so that data imports could be performed. It (as well as JET) was not meant to be a way for you to use a CSV files (or Excel files) as a full blown database.

My PC Install : MS Office2016
Windows10(64bit)
VS2019 compile buile 32bit, AnyCPU

site click : Access on Excel2007 via ACE.OLEDB 12.0 is VERY VERY SLOW

MSSQL.png


C#:
public System.Data.Common.DbDataReader ExecutDataReader(sQuery)
{
        Sysrem.Data.SqlClient.Command.ExecuteReader(Default);
}

String.Format("Row Number : {0}", rowNum);
String.Format("CSV File Column1 : {0}", Col1);

int rowNum = 1;
while (reader.Read())
{
rowNum++;
}

*********************** Read the csv file (My PC Install : MS Office2016) into System.Data.Sqlclient.dll.
Will reading the csv file into System.Data.SqlClient.dll make it faster?
It is possible??
 

Attachments

  • MSSQL.png
    MSSQL.png
    21.6 KB · Views: 5
Last edited:
I keep trying to drive to work on a big block of cheese, but it's really hard to get it to move. I put wheels on it but the mounting points just ripped through the cheese. I fitted rocket boosters to the back which did get it moving but then it wore away to nothing on the road so I had to buy another block of cheese. I'm looking for some way to fit more rocket boosters so I can get it to levitate, which means it won't wear away on the road, and hopefully it will not melt either

My neighbour says I should just buy a car because it's designed for the purpose I need, but I don't listen to him because I think I can make the block of cheese work out
 
Will reading the csv file into System.Data.SqlClient.dll make it faster?
The SqlClient cannot read .CSV files directly. (That is why you also see a lot of questions on the Internet about how to import CSV files into Microsoft SQL".)

So you are basically back where you started but with a different database. Originally, you imported the CSV into SQLite, and then read from the SQLite database. Now it looks like you are planning on importing the CSV in Microsoft SQL and then read from the Microsoft SQL database. Both SQLite and MSSQL will be much faster than ACE or JET.
 
Back
Top Bottom