Functions

Reading

read_sav(spss_file[, row_offset, row_limit, ...])

Read data and metadata from SPSS file

read_metadata(spss_file[, usecols, locale])

Reads metadata attributes from SPSS file

pyspssio.read_sav(spss_file, row_offset=0, row_limit=None, usecols=None, convert_datetimes=True, include_user_missing=True, chunksize=None, locale=None, string_nan='')

Read data and metadata from SPSS file

Parameters:
  • spss_file (str) – SPSS filename (.sav or .zsav)

  • row_offset (int (default: 0)) – Number of rows to skip

  • row_limit (int (default: None)) – Maximum number of rows to return

  • usecols (Union[list, tuple, str, callable, None] (default: None)) – Columns to use (None for all columns)

  • convert_datetimes (bool (default: True)) – Convert SPSS datetimes to Python/Pandas datetime columns; False returns the raw values as seconds since October 14, 1582 (SPSS start date)

  • include_user_missing (bool (default: True)) – If True, keep user-missing values; if False, replace them with NaN (numeric) and "" (strings)

  • chunksize (int (default: None)) – Number of rows to return per chunk; when set, a generator of DataFrames is returned instead of a (DataFrame, metadata) tuple

  • locale (str (default: None)) – Locale to use when I/O module is operating in codepage mode

  • string_nan (Any (default: '')) – Value to return for empty strings
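When convert_datetimes=False, date and datetime columns come back as raw second counts. The sketch below shows one way such a value could be converted by hand, assuming the SPSS origin of midnight on October 14, 1582. The names SPSS_EPOCH and spss_seconds_to_datetime are illustrative and not part of pyspssio.

```python
from datetime import datetime, timedelta

# Assumed SPSS origin: midnight on 14 October 1582 (proleptic Gregorian calendar).
SPSS_EPOCH = datetime(1582, 10, 14)


def spss_seconds_to_datetime(seconds):
    """Convert a raw SPSS date value (seconds since the SPSS origin) to a datetime."""
    return SPSS_EPOCH + timedelta(seconds=seconds)


# 12219379200 seconds separate the SPSS origin from the Unix epoch (1970-01-01).
print(spss_seconds_to_datetime(12219379200))
```

In practice convert_datetimes=True is usually simpler; a manual conversion like this is mainly useful when the raw values need to be inspected or stored unchanged.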

Return type:

Union[Tuple[DataFrame, dict], Generator[DataFrame, None, None]]

Returns:

  • tuple – DataFrame, metadata

  • generator – DataFrame(s) with chunksize number of rows (only if chunksize is specified)
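Because the chunked form yields DataFrames only, a running aggregate can be built without holding the whole file in memory. The helper below is a minimal sketch of that pattern; count_rows is a hypothetical name, and it accepts any iterable of objects that support len(), so it can be tried without an SPSS file.

```python
def count_rows(chunks):
    """Sum the number of rows across an iterable of chunks (anything supporting len())."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
    return total


# Hypothetical usage with a real file (requires pyspssio and the SPSS I/O module):
# import pyspssio
# total = count_rows(pyspssio.read_sav("spss_file.sav", chunksize=1000))
```

Since no metadata is returned in chunked mode, it can be fetched separately with read_metadata when needed.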

Examples

Read data and metadata:

import pyspssio
df, meta = pyspssio.read_sav("spss_file.sav")

Read metadata only:

meta = pyspssio.read_metadata("spss_file.sav")

Read data in chunks of chunksize (number of rows/records):

for df in pyspssio.read_sav("spss_file.sav", chunksize=1000):
    print(len(df))  # replace with per-chunk processing

pyspssio.read_metadata(spss_file, usecols=None, locale=None)

Reads metadata attributes from SPSS file

Parameters:
  • spss_file (str) – SPSS filename (.sav or .zsav)

  • usecols (Union[list, tuple, str, callable, None] (default: None)) – Columns to use (None for all columns)

  • locale (str (default: None)) – Locale to use when I/O module is operating in codepage mode
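Since usecols accepts a callable, columns can be selected by rule rather than by an explicit list. The sketch below assumes the callable is applied to each variable name and that a truthy result keeps the column; this behavior is not spelled out above and should be checked against the library. Both function names are hypothetical.

```python
def is_question_column(name):
    """Hypothetical filter: keep variables whose names start with "Q"."""
    return name.startswith("Q")


def read_question_metadata(path):
    """Read metadata for the filtered columns only (pyspssio is imported lazily)."""
    import pyspssio

    return pyspssio.read_metadata(path, usecols=is_question_column)
```

The same callable could be passed to read_sav, whose usecols parameter has the same type.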

Returns:

Header properties (see Header class for more detail)

Return type:

dict

Examples

>>> meta = pyspssio.read_metadata("spss_file.sav")

Writing

write_sav(spss_file, df[, metadata, ...])

Write SPSS file (.sav or .zsav) from DataFrame

append_sav(spss_file, df[, locale])

Append additional records to an existing SPSS file (.sav or .zsav)

pyspssio.write_sav(spss_file, df, metadata=None, unicode=True, locale=None, **kwargs)

Write SPSS file (.sav or .zsav) from DataFrame

Parameters:
  • spss_file (str) – SPSS filename (.sav or .zsav)

  • df (DataFrame) – DataFrame

  • metadata (dict (default: None)) – Dictionary of Header attributes to use (see Header class for more detail)

  • unicode (bool (default: True)) – Whether to write the file in unicode (True) or codepage (False) mode

  • locale (str (default: None)) – Locale to use when I/O module is operating in codepage mode

  • **kwargs – Additional arguments, including individual metadata attributes. Metadata attributes supplied here take precedence over the same attributes in metadata.
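The precedence rule for metadata and **kwargs can be pictured as a dictionary merge in which keyword values win. The following is only a sketch of that rule, not the library's implementation. merge_metadata is a hypothetical helper, and the attribute names var_labels and var_formats are assumed to be Header attributes.

```python
def merge_metadata(metadata=None, **kwargs):
    """Combine a metadata dict with keyword overrides; keyword values win."""
    merged = dict(metadata or {})  # copy so the caller's dict is left untouched
    merged.update(kwargs)  # attributes passed as keywords replace dict entries
    return merged


meta = {"var_labels": {"age": "Age"}, "var_formats": {"age": "F8.2"}}
merged = merge_metadata(meta, var_labels={"age": "Age in years"})
print(merged["var_labels"])
```

The merge here is shallow: a keyword replaces the whole attribute. Whether the library merges at a finer level is not stated above.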

Return type:

None

Examples

>>> pyspssio.write_sav("spss_file.sav", df, metadata=meta)

pyspssio.append_sav(spss_file, df, locale=None, **kwargs)

Append additional records to an existing SPSS file (.sav or .zsav)

Parameters:
  • spss_file (str) – SPSS filename (.sav or .zsav)

  • df (DataFrame) – DataFrame

  • locale (str (default: None)) – Locale to use when I/O module is operating in codepage mode

  • **kwargs – Additional arguments

Return type:

None

Notes

Metadata cannot be modified when appending records. Be careful with strings that might be longer than the width defined for their variable in the existing file.

Setting locale manually may not be necessary, because file encoding information is obtained from the SPSS file header.
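One way to act on the string-width caution is to check the new records before appending. The sketch below is a hypothetical pre-check: it assumes the existing widths are available as a mapping of column name to width in bytes, with 0 meaning numeric (for example, taken from the file's metadata), and that the new data is given as a mapping of column name to a list of values.

```python
def find_overlong_strings(columns, widths, encoding="utf-8"):
    """Return {column: [values]} for strings whose encoded size exceeds the width."""
    problems = {}
    for name, values in columns.items():
        width = widths.get(name, 0)
        if width <= 0:  # 0 or missing is treated as a non-string column
            continue
        too_long = [
            value
            for value in values
            if isinstance(value, str) and len(value.encode(encoding)) > width
        ]
        if too_long:
            problems[name] = too_long
    return problems


new_rows = {"city": ["Oslo", "Reykjavik"], "age": [31, 47]}
existing_widths = {"city": 5, "age": 0}
print(find_overlong_strings(new_rows, existing_widths))
```

With a DataFrame, df.to_dict("list") produces a mapping of this shape. Lengths are measured in encoded bytes rather than characters, which matters for non-ASCII text.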

Examples

>>> pyspssio.append_sav("spss_file.sav", df)