pyreadstat not honoring format for strings longer than 255 bytes #267

@maver1ck

Describe the issue

I'd like to set the format of string columns explicitly when writing a SAV file.
The problem occurs with the format 'A500' (or any width greater than 255): the requested width is not honored.
However, the format is applied correctly if the column contains a value of that length.
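
To make the condition above concrete, here is a minimal sketch in plain Python that flags the columns likely to be affected before writing. The helper names `requested_width` and `columns_at_risk` are hypothetical and not part of pyreadstat, and the rule itself (requested width above 255 bytes while no value reaches that width) is an assumption drawn from the behavior described in this report, not from the library's source.

```python
import re


def requested_width(fmt):
    """Return N for an SPSS string format such as 'A500', or None for other formats."""
    match = re.fullmatch(r"A(\d+)", fmt.strip().upper())
    return int(match.group(1)) if match else None


def columns_at_risk(columns, variable_format, limit=255):
    """List columns whose requested width exceeds `limit` bytes while no value is that long.

    Assumption: this mirrors the observation in the report; the exact
    threshold used by the underlying writer is not confirmed here.
    """
    at_risk = []
    for name, fmt in variable_format.items():
        width = requested_width(fmt)
        if width is None or width <= limit:
            continue
        # Length is measured in UTF-8 bytes, since the limit in question is a byte count.
        longest = max(
            (len(str(value).encode("utf-8")) for value in columns.get(name, [])),
            default=0,
        )
        if longest < width:
            at_risk.append(name)
    return at_risk


# Same data as in the reproduction, as a plain dict of column values.
columns = {
    "Text5": ["0123456789"],
    "Text50": ["0123456789"],
    "Text500": ["0123456789"],
    "Text500A": ["0" * 500],
}
variable_format = {"Text5": "A5", "Text50": "A50", "Text500": "A500", "Text500A": "A500"}

print(columns_at_risk(columns, variable_format))  # ['Text500']
```

Only Text500 is flagged: Text500A requests the same width but holds a 500-byte value, and the other two columns request widths of 255 or less.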

To Reproduce

import pandas as pd
import pyreadstat

# Every column holds a 10-character value except Text500A, which holds 500 characters.
df = pd.DataFrame({
    'Text5': ['0123456789'],
    'Text50': ['0123456789'],
    'Text500': ['0123456789'],
    'Text500A': ['0' * 500],
})
# Requested SPSS string format (A<width>) for each column.
variable_format = {'Text5': 'A5', 'Text50': 'A50', 'Text500': 'A500', 'Text500A': 'A500'}
pyreadstat.write_sav(df, 'bug.sav', variable_format=variable_format)

Result in SPSS: (screenshot attached to the issue)

File example

File created by the code above

Expected behavior

Column Text500 should have width 500.
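
One way to check this expectation without opening SPSS is to read the file's metadata back with pyreadstat and compare it with the requested formats. The sketch below assumes pyreadstat is installed and that `bug.sav` was produced by the reproduction; `report_string_widths` and `mismatches` are hypothetical helper names. Whether the metadata read back matches what SPSS displays is also an assumption.

```python
import os


def report_string_widths(path):
    """Read only the metadata of a .sav file and return {column: (format, storage_width)}."""
    import pyreadstat  # imported lazily so mismatches() can be used without it

    _, meta = pyreadstat.read_sav(path, metadataonly=True)
    return {
        name: (
            meta.original_variable_types.get(name),
            meta.variable_storage_width.get(name),
        )
        for name in meta.column_names
    }


def mismatches(report, variable_format):
    """Return the columns whose format in the file differs from the requested one."""
    return sorted(
        name
        for name, requested in variable_format.items()
        if report.get(name, (None, None))[0] != requested
    )


if __name__ == "__main__" and os.path.exists("bug.sav"):
    variable_format = {"Text5": "A5", "Text50": "A50", "Text500": "A500", "Text500A": "A500"}
    report = report_string_widths("bug.sav")
    for name, (fmt, width) in report.items():
        print(name, fmt, width)
    print("Columns not matching the requested format:", mismatches(report, variable_format))
```

If the expected behavior held, `mismatches` would return an empty list for this file.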

Setup Information:

How did you install pyreadstat? pip
Platform: macOS
Python Version: 3.12
Python Distribution: brew
Using Virtualenv or condaenv? venv

Labels: bug (Something isn't working); requires changes in Readstat (waiting for changes in the C library Readstat)
