String right truncation pyodbc.ProgrammingError when using `fast_executemany`

Environment

To diagnose, we usually need to know the following, including version numbers. On Windows, be sure to specify 32-bit Python or 64-bit:

Python: 3.6.3
pyodbc: 4.0.23
OS: Debian Stretch
DB: SQL Server
driver: Microsoft ODBC Driver 17 for SQL Server libmsodbcsql-17.0.so.1.1

Issue

When executing a parameterized “UPDATE” statement with fast_executemany, the following exception is thrown.

ERROR - encoding with 'utf-16le' codec failed (SystemError: <built-in function utf_16_le_encode> returned a result with an error set)              
pyodbc.ProgrammingError: ('String data, right truncation: length 666 buffer 510', 'HY000')

The column widths in the target database have more than enough size to hold the data. This seems equivalent to Issue #337 except that in this case, I’m using 4.0.23 of pyodbc and MS ODBC Driver 17, which fixed the issue in that post but continue to be a problem for me.

I’ve trimmed down the data to just two rows and can consistently reproduce the problem with just those two rows.

I’m using SQL Alchemy to construct the statements and am using the connection.execute(stmt, [row1, row2]) method to execute.

Turning fast_executemany off fixes the problem, but unacceptably impacts performance.

Issue Analytics

State:
Created 5 years ago
Comments:9 (2 by maintainers)

Top GitHub Comments

4reactions

seanfdnncommented, Jun 8, 2018

I’ve narrowed this down to be a variation of #280

When the SQL statement contains more than one statement (i.e. more complex transactions, try catch blocks, etc…) and fast_executemany=True, the SQLDescribeParam call doesn’t work and thus VARCHARs are allocated a default buffer size of 510 bytes. This seems to be a known behaviour of SQLDescribeParam:

When processing a batch of Transact-SQL statements, the driver also does not support calling SQLDescribeParam for parameter markers in statements after the first statement in the batch.

The following code will reproduce the error (note the “print” statement in the SQL, but this could be TRY/CATCH BLOCKS, etc…

import pyodbc
import sys

print('Python version ' + sys.version)
print('pyodbc version ' + pyodbc.version)

driver = 'ODBC Driver 17 for SQL Server'
server = 'localhost'
database = 'db'
user = 'sa'
password = 'password'

conn_string = f'DRIVER={{{driver}}};' \
                   f'PORT=1433;SERVER={server};' \
                   f'DATABASE={database};' \
                   f'UID={user};' \
                   f'PWD={password};' \
                   f'ColumnEncryption=Enabled;' \
                   f'TrustServerCertificate=yes;'

cnxn = pyodbc.connect(conn_string, autocommit=True)

crsr = cnxn.cursor()

crsr.fast_executemany = True

crsr.execute('DROP TABLE IF EXISTS tmp')
crsr.execute('CREATE TABLE tmp (txt VARCHAR(1024))')

value_to_insert = 'a' * 256

params = [(value_to_insert,), (value_to_insert,)]

sql = """
print 'do nothing';
INSERT INTO tmp (txt) VALUES (?);
"""

crsr.setinputsizes((512,))
crsr.executemany(sql, params)

print(crsr.execute('SELECT txt from tmp').fetchall())

(note that in the actual application, the SQL is significantly more complicated; this is the simplest case that will reproduce the error)

This produces the output with errors:

Python version 3.6.3 (default, Nov  6 2017, 10:55:10)
[GCC 6.3.0 20170516]
pyodbc version 4.0.23
pyodbc.ProgrammingError: ('String data, right truncation: length 512 buffer 510', 'HY000')

The above exception was the direct cause of the following exception:

SystemError: <built-in function utf_16_le_encode> returned a result with an error set

The above exception was the direct cause of the following exception:

SystemError: encoding with 'utf-16le' codec failed (SystemError: <built-in function utf_16_le_encode> returned a result with an error set)

The above exception was the direct cause of the following exception:

SystemError: <built-in function utf_16_le_encode> returned a result with an error set

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "pypirls/memerrortest.py", line 40, in <module>
    crsr.executemany(sql, params)
SystemError: encoding with 'utf-16le' codec failed (SystemError: <built-in function utf_16_le_encode> returned a result with an error set)

However, changing any one of the below will avoid the exception:

Removing the “print” statement from the SQL
Setting the number of ‘a’ characters in the inserted varchar to 255 (fits in buffer size)
Setting fast_executemany to False

Note that unlike the solution in the wiki, a temporary table is not being used, and ColumnEncryption=Enabled

It’s unclear to me why this problem happens when fast_executemany=True but not when False as I would think that SQLDescribeParameter would have the same behaviour in either case, but I’m going to guess it’s because in the latter case, a memory array doesn’t need to be pre-allocated.

Looking at @v-chojas 's setinputsizes PR – but as far as I can tell, it hasn’t been merged into master, and it doesn’t look like it is impacting the array allocation in params.cpp::ExecuteMulti? Any ideas for a workaround?

2reactions

seanfdnncommented, Jun 7, 2018

A common thread seems to be the buffer is capped at 510, which is 2 x 255 – which suggests fast_executemany sometimes assumes a maximum string length of 255 characters (510 bytes)

Top Results From Across the Web

('String data, right truncation: length 8 buffer 4', 'HY000')" error ...

I am using pyodbc 4.0.26 version. Inserting result-set (containing a decimal result) with "fastexecutemany" set to TRUE results in "pyodbc.

How can I solve the issue: pyodbc.ProgrammingError: ('String ...

Is there a way to increase buffer? The issue: pyodbc.ProgrammingError: ('String data, right truncation: length 162 buffer 160', 'HY000'). My ...

String right truncation pyodbc.ProgrammingError when using ...

The column widths in the target database have more than enough size to hold the data. This seems equivalent to Issue #337 except...

How to Make Inserts Into SQL Server 100x faster with Pyodbc

I've been recently trying to load large datasets to a SQL Server database with Python. Usually, to speed up the inserts with pyodbc, ......

Re: String data, right truncation - Microsoft Power BI Community

While Uploading dataset to SQL Server using R getting following error: "nanodbc/nanodbc.cpp:1655: 22001: [Microsoft][ODBC SQL Server Driver]String data, ...