Skip to content

Query Performance #294

@willbush

Description

@willbush

I've been testing the query performance tiberius and odbc-api vs C#.

These are the numbers I'm getting more or less:

number of rows: 3,838,164
C# dotnet = ~11 sec
Rust odbc-api = ~14 sec
Rust tiberius = ~20 sec

I'm seeing nearly double the speed in C# compared to tiberius. I apologize this is not very rigorous testing or reproducible. I'm working with proprietary data / sprocs. I did three runs per library on a computer running windows with an on-prem production database.

I suppose, to make this reproducible, a test could be written to fill a local DB with a bunch of data and test how long it takes to query all the rows. I'm willing to help with that.

Anyway, I was wondering if I'm doing anything obviously wrong:

tiberius

use anyhow::Result;
use async_std::net::TcpStream;
use once_cell::sync::Lazy;
use std::{env, str::FromStr};
use std::time::Instant;
use tiberius::{Client, Config, Query, SqlBrowser, Uuid};

static CONN_STR: Lazy<String> = Lazy::new(|| {
    env::var("TIBERIUS_TEST_CONNECTION_STRING").unwrap_or_else(|_| {
        "Data Source=Prod.xyz.local\\Prod;Database=Prod;Integrated Security=true;Connection Timeout=200;Max Pool Size=250;Encrypt=False;".to_owned()
    })
});

#[async_std::main]
async fn main() -> Result<()> {
    let config = Config::from_ado_string(&CONN_STR)?;
    let tcp = TcpStream::connect_named(&config).await?;

    tcp.set_nodelay(true)?;

    let mut client = Client::connect(config, tcp).await?;

    let select = Query::new("[dbo].[usp_GetLotsOfThings]");

    let start_time = Instant::now();

    let rows = select.query(&mut client).await?.into_first_result().await?;

    let elapsed_time = start_time.elapsed();

    println!("Elapsed time (secs): {:?}", elapsed_time.as_secs());

    println!("number of rows: {}", rows.len());
}

odbc-api

use std::time::Instant;

use anyhow::Error;
use odbc_api::{buffers::TextRowSet, Cursor, Environment};

/// Maximum number of rows fetched with one row set. Fetching batches of rows is usually much
/// faster than fetching individual rows.
const BATCH_SIZE: usize = 5000;

fn main() -> Result<(), Error> {
    // Production
    let connection_string = "
        Driver={ODBC Driver 17 for SQL Server};\
        Server=Prod.xyz.local\\Prod;\
        Database=Prod;\
        Trusted_Connection=Yes;\
        Connection Timeout=200;\
        Max Pool Size=250;\
        Encrypt=No;\
    ";

    let env = Environment::new()?;
    let conn = env.connect_with_connection_string(&connection_string)?;

    let mut prepared = conn.prepare("[dbo].[usp_GetLotsOfThings]")?;
    let mut count = 0;

    let start_time = Instant::now();

    match prepared.execute(()) {
        Err(e) => println!("{}", e),
        Ok(None) => println!("No result set generated."),
        Ok(Some(mut cursor)) => {
            // Use schema in cursor to initialize a text buffer large enough to hold the largest
            // possible strings for each column up to an upper limit of 4KiB.
            let mut buffers = TextRowSet::for_cursor(BATCH_SIZE, &mut cursor, Some(4096))?;
            // Bind the buffer to the cursor. It is now being filled with every call to fetch.
            let mut row_set_cursor = cursor.bind_buffer(&mut buffers)?;

            // Iterate over batches
            while let Some(batch) = row_set_cursor.fetch()? {
                // Within a batch, iterate over every row
                for row_index in 0..batch.num_rows() {
                    // Within a row iterate over every column
                    let _record = (0..batch.num_cols())
                        .map(|col_index| batch.at(col_index, row_index).unwrap_or(&[]));

                    count += 1;
                }
            }
        }
    }
    println!("Count: {}", count);

    let elapsed_time = start_time.elapsed();

    println!("Elapsed time (secs): {:?}", elapsed_time.as_secs());

    Ok(())
}

ping @pacman82

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions