Building sqlcsv-exporter: a Python CLI to Stream SQL Server Data into CSV
I recently built a small Python package called sqlcsv-exporter to solve a very practical problem: export the results of a SQL Server query into a CSV file, from the command line, without dragging an entire dataset into memory. The tool is packaged as a Poetry project, designed to be publishable to PyPI, and focused on a workflow that is common in reporting and data operations teams:
- store the SQL in a .sql file
- pass connection details at runtime
- stream rows in chunks
- write the CSV incrementally
- show visible progress while the job runs
This post walks through why I built it, how it works, and what I learned while testing it against a local SQL Server instance.
The Problem
The starting point was familiar: a one-off Python script that connected to SQL Server, ran a query, and wrote a CSV. That works for quick experiments, but it gets messy fast.
Typical issues show up immediately:
- query text is embedded inside Python instead of living in a .sql file
- authentication logic is mixed into export logic
- the script is hard to reuse across environments
- large result sets become memory-heavy
- the UX is poor when a query runs for several seconds or several minutes
The goal was to replace that with a proper CLI package that feels like a real tool rather than a disposable script.
The Goal
I wanted a command like this:
sqlcsv-exporter `
--sql .\queries\report.sql `
--output .\exports\report.csv `
--server "DESKTOP-TTUSQLJ\SQLEXPRESS" `
--database QuantDevTest
The requirements were straightforward:
- read SQL from file
- connect to SQL Server with trusted auth or SQL auth
- stream the result set with fetchmany()
- write CSV rows incrementally
- avoid pandas for the write path
- provide useful progress output
- keep the package testable and publishable
The Package Structure
I split the project into small modules so each concern is isolated:
- cli.py: argument parsing and process exit behavior
- config.py: config model and validation
- connection.py: ODBC connection string and connection creation
- sql_rewriter.py: SQL file loading and optional variable rewriting
- exporter.py: query execution, chunked fetch, CSV writing, and progress rendering
That separation matters because it makes the code easier to test. The CSV writer can be tested without a database. The SQL rewriting logic can be tested with simple strings. The export workflow can be tested with mocked connections and cursors.
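To make the testability claim concrete, here is a minimal sketch of testing a CSV-writing helper entirely in memory. The names (`write_chunk`, the header columns) are illustrative, not the package's actual API:

```python
import csv
import io

def write_chunk(writer, rows):
    """Write one chunk of database rows to an open csv.writer."""
    writer.writerows(rows)

# The writer can be exercised against an in-memory buffer; no database needed.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Id", "Stock"])                  # header, e.g. from cursor.description
write_chunk(writer, [(1, "AAPL"), (2, "MSFT")])

print(buffer.getvalue())
# Id,Stock
# 1,AAPL
# 2,MSFT
```

The same pattern covers edge cases (empty chunks, odd characters in values) without ever opening a connection.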
Why Streaming Matters
The core design choice was to stream results instead of loading everything into memory.
The exporter:
- opens the SQL file
- optionally rewrites a declared date variable such as @InAsOfDate
- executes the query with pyodbc
- fetches rows in chunks
- writes each chunk directly to a CSV file
That keeps the memory profile stable even when the result set gets large.
Instead of doing something like this:
rows = cursor.fetchall()
writer.writerows(rows)
the tool does this conceptually:
while True:
    rows = cursor.fetchmany(chunk_size)
    if not rows:
        break
    writer.writerows(rows)
That small change is the difference between a toy exporter and something you can trust on real datasets.
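Putting the whole streaming loop together, a runnable sketch looks like the following. It uses sqlite3 as a stand-in so the example runs anywhere; the real tool uses pyodbc, and the function name and arguments here are illustrative:

```python
import csv
import sqlite3
import tempfile
from pathlib import Path

def export_query(conn, sql, output_path, chunk_size=1000):
    """Stream query results into a CSV file without holding them all in memory."""
    cursor = conn.cursor()
    cursor.execute(sql)
    header = [col[0] for col in cursor.description]
    total = 0
    with open(output_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        while True:
            rows = cursor.fetchmany(chunk_size)
            if not rows:
                break
            writer.writerows(rows)
            total += len(rows)
    return total

# Demo against an in-memory SQLite database standing in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StockMetrics (Id INTEGER, Stock TEXT)")
conn.executemany("INSERT INTO StockMetrics VALUES (?, ?)",
                 [(i, f"SYM{i}") for i in range(2500)])
out = Path(tempfile.mkdtemp()) / "report.csv"
rows_written = export_query(conn, "SELECT Id, Stock FROM StockMetrics", out,
                            chunk_size=500)
print(rows_written)  # 2500
```

At any moment, only `chunk_size` rows live in memory, regardless of how large the result set grows.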
A Better CLI Experience
Command-line tools that do database work have one recurring UX problem: silence. If a query takes ten seconds, users start wondering whether the command is stuck, blocked, or dead.
So the exporter prints:
- a short run summary before execution
- detected column count after the query starts returning metadata
- a live progress display while rows are written
- a final completion summary with rows, columns, file size, and elapsed time
I used rich for the terminal output instead of hand-rolled print() calls. It gives the tool a more deliberate interface without turning it into a full TUI.
Handling SQL as a File
One design decision I liked immediately was keeping SQL in a separate file. That brings a few benefits:
- the query is easier to inspect and review
- analysts or DBAs can edit SQL without touching Python code
- the same CLI can be reused for many exports
- long or multi-step SQL scripts remain readable
For example, this simple test query lives in its own file:
SELECT TOP (1000)
    [Id],
    [Stock],
    [LowPrice],
    [MaxPrice],
    [AvgPrice]
FROM [QuantDevTest].[dbo].[StockMetrics];
That file can be passed directly to the CLI without changing the Python package at all.
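Loading the file is a one-liner plus a little defensiveness. Something like this hypothetical helper works; reading with utf-8-sig tolerates the BOM that SSMS often writes when saving .sql files:

```python
from pathlib import Path

def load_sql(path):
    """Read a .sql file, tolerating a UTF-8 BOM and rejecting empty files."""
    text = Path(path).read_text(encoding="utf-8-sig").strip()
    if not text:
        raise ValueError(f"SQL file is empty: {path}")
    return text

# Example: write a BOM-prefixed query to a temp file and load it back.
import os
import tempfile
fd, tmp = tempfile.mkstemp(suffix=".sql")
os.close(fd)
Path(tmp).write_text("\ufeffSELECT 1;", encoding="utf-8")
print(load_sql(tmp))  # SELECT 1;
```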
Optional SQL Rewriting
Some teams keep parameter values in SQL scripts using a declared variable like:
DECLARE @InAsOfDate DATE = '2026-03-01';
To support that style, the tool includes a lightweight SQL rewriting step. If the variable exists, the CLI can replace its value based on --date. If the variable does not exist, the SQL runs unchanged.
This keeps the interface flexible:
- SQL files can remain mostly static
- operators can adjust date-driven exports without manually editing the file
It is intentionally narrow in scope. It is not trying to become a generic SQL templating engine.
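A regex-based rewrite in that narrow spirit might look like this sketch (not the package's exact implementation):

```python
import re

def rewrite_date_variable(sql, new_date, variable="@InAsOfDate"):
    """Replace the literal in `DECLARE @InAsOfDate DATE = '...'` if present.

    If the declaration is absent, the SQL is returned unchanged.
    """
    pattern = re.compile(
        r"(DECLARE\s+" + re.escape(variable) + r"\s+DATE\s*=\s*')[^']*(')",
        re.IGNORECASE,
    )
    # A lambda replacement avoids backreference ambiguity with digit-heavy dates.
    return pattern.sub(lambda m: m.group(1) + new_date + m.group(2), sql)

sql = "DECLARE @InAsOfDate DATE = '2026-03-01';\nSELECT * FROM StockMetrics;"
print(rewrite_date_variable(sql, "2026-06-30"))
# DECLARE @InAsOfDate DATE = '2026-06-30';
# SELECT * FROM StockMetrics;
```

SQL without the declaration passes through untouched, which keeps the feature opt-in per file.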
A Real-World Bug Caught by Live Testing
The first end-to-end run against my local SQL Server instance did not succeed.
The CLI connected successfully, but then failed with:
AttributeError: 'pyodbc.Cursor' object has no attribute 'timeout'
That was useful. The mocked tests passed, but the live run exposed a compatibility assumption in the code. Some pyodbc environments expose timeout behavior differently.
The fix was simple:
- try setting the timeout on the cursor if that attribute exists
- otherwise fall back to the connection object
That is exactly why live testing matters. A tool can look clean in unit tests and still fail in a real environment for reasons that only surface with the actual driver and actual database connection.
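The fix amounts to a few lines of duck typing; roughly this, with illustrative names and tiny stand-in classes simulating the two driver environments:

```python
def apply_timeout(connection, cursor, seconds):
    """Set the query timeout on the cursor if the driver exposes it,
    otherwise fall back to the connection-level timeout."""
    try:
        cursor.timeout = seconds        # not settable in every pyodbc environment
    except AttributeError:
        connection.timeout = seconds    # connection-level fallback

# Simulate an environment where the cursor rejects the attribute.
class Cursor:
    __slots__ = ()                      # assigning .timeout raises AttributeError

class Connection:
    timeout = 0

conn, cur = Connection(), Cursor()
apply_timeout(conn, cur, 30)
print(conn.timeout)  # 30
```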
Stress Testing with a Slower Query
A fast export is not a great demo for progress reporting, so I added a dedicated slow test query.
It:
- copies base rows into a temp table
- creates two multiplier temp tables
- cross joins them to expand the result set
- adds a short WAITFOR DELAY
That makes it easy to simulate a longer-running export without needing a large production-sized table.
Conceptually, the query looks like this:
SELECT TOP (1000) ... INTO #BaseStockMetrics ...
SELECT TOP (200) ... INTO #MultiplierA ...
SELECT TOP (200) ... INTO #MultiplierB ...
WAITFOR DELAY '00:00:03';
SELECT ...
FROM #BaseStockMetrics AS b
CROSS JOIN #MultiplierA AS a
CROSS JOIN #MultiplierB AS c;
With a small base table, this is enough to create a large CSV and make the progress output visible. With a large base table, it can scale very aggressively, so it is the kind of test query you use carefully.
Testing Strategy
The package includes pytest coverage around the parts that are most likely to break:
- config validation
- date resolution
- SQL file loading
- SQL variable rewriting
- connection string generation
- CSV writing
- mocked end-to-end export behavior
That balance is important. Not every code path needs a deep integration test, but the components that define correctness should have coverage.
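Connection string generation is a good example: it is pure string logic, so it can be covered without any driver installed. A hypothetical builder and the kind of assertion the suite makes:

```python
def build_connection_string(server, database,
                            driver="ODBC Driver 17 for SQL Server",
                            username=None, password=None):
    """Build an ODBC connection string for trusted or SQL authentication."""
    parts = [f"DRIVER={{{driver}}}", f"SERVER={server}", f"DATABASE={database}"]
    if username is not None:
        parts += [f"UID={username}", f"PWD={password}"]
    else:
        parts.append("Trusted_Connection=yes")
    return ";".join(parts)

trusted = build_connection_string(r"DESKTOP-TTUSQLJ\SQLEXPRESS", "QuantDevTest")
print(trusted)
# DRIVER={ODBC Driver 17 for SQL Server};SERVER=DESKTOP-TTUSQLJ\SQLEXPRESS;DATABASE=QuantDevTest;Trusted_Connection=yes
```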
The live SQL Server run complements those tests by exercising:
- the actual ODBC driver
- the actual SQL Server instance
- actual filesystem writes
- real CLI behavior
Packaging for PyPI
The project is set up as a Poetry package and versioned as 1.0.1.
Build artifacts are created with:
poetry build
That produces:
- a source distribution
- a wheel
Those are the exact artifacts needed for publishing to PyPI.
Example Usage
Trusted connection:
poetry run sqlcsv-exporter `
--sql .\queries\stock_metrics_top_1000.sql `
--output .\exports\stock_metrics_top_1000.csv `
--server "DESKTOP-TTUSQLJ\SQLEXPRESS" `
--database QuantDevTest
SQL authentication:
poetry run sqlcsv-exporter `
--sql .\queries\stock_metrics_top_1000.sql `
--output .\exports\stock_metrics_top_1000.csv `
--server "my-server" `
--database "Reporting" `
--sql-auth `
--username "report_user" `
--password "secret"
Final Thoughts
This is not a huge project, but it is a useful example of turning a script into a tool.
The interesting part was not the CSV writing itself. It was the engineering around it:
- separating concerns into modules
- choosing streaming over eager loading
- designing a usable CLI
- writing tests around the core behavior
- validating the tool against a real SQL Server environment
That is usually the difference between “something that works on my machine” and “something I would actually hand to another team.”
If I extend it further, the next areas I would look at are:
- environment-variable support for connection defaults
- optional gzip output
- row count estimates before execution when possible
- better handling for very wide result sets
- structured logging for scheduled runs
For now, sqlcsv-exporter does the job it was meant to do: run a SQL Server query from a file and stream the results into CSV in a way that is practical, testable, and ready to package.