pyspssio.Header
- class pyspssio.Header(*args, **kwargs)
Bases: SPSSFile
Class for getting and setting metadata attributes
Methods
__init__(*args, **kwargs)
close() : Close file
Finalize metadata
open() : Open file
set_locale(locale) : Set I/O module to a specific locale
Attributes
case_count : Number of cases
case_size : Record case size (in bytes)
case_weight_var : Case weight variable
compression : Compression level
file_attributes : Arbitrary user-defined file attributes
file_encoding : File encoding reported by I/O module
interface_encoding : I/O interface mode (Unicode or code page)
is_compatible_encoding : Check encoding compatibility
mrsets : Multi response set definitions
mrsets_count : Number of multi response set definitions
release_info : Basic file information
var_alignments : Variable alignments
var_attributes : Variable attributes
var_column_widths : Column display widths
var_compat_names : Short (8-byte) variable names
var_count : Number of variables
var_formats : Variable formats as strings
var_formats_tuple : Variable formats as tuples in the form (type, width, decimals)
var_handles : Variable handles references
var_labels : Variable labels
var_measure_levels : Variable measure levels
var_missing_values : Missing values
var_names : Variable names
var_roles : Variable roles
var_sets : Variable sets
var_types : Variable types
var_value_labels : Variable value labels
- property file_attributes: dict
Arbitrary user-defined file attributes
- property var_names: list
Variable names
May return a filtered list when returned as part of a metadata object if only a subset of variables is selected (e.g., usecols in read_sav).
- property var_types: dict
Variable types
May return a filtered dictionary when returned as part of a metadata object if only a subset of variables is selected (e.g., usecols in read_sav).
- property var_handles: dict
Variable handles references
Used when calling I/O module procedures that use variable handles instead of variable names as arguments
- property var_formats_tuple: dict
Variable formats as tuples in the form (type, width, decimals)
e.g., (5, 8, 2) instead of F8.2
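As a rough illustration of how the tuple and string forms relate, the sketch below converts a format tuple into a format string. The helper `format_tuple_to_str` and the `FORMAT_NAMES` table are hypothetical and not part of pyspssio. Only code 5 (F) comes from the example above; code 1 (A, string) is an assumption based on the SPSS I/O module's format constants, and dropping a zero decimals part is likewise an assumed convention.

```python
# Hypothetical lookup table covering only a small subset of format type codes.
# 5 -> "F" is taken from the example above; 1 -> "A" is an assumption.
FORMAT_NAMES = {1: "A", 5: "F"}


def format_tuple_to_str(fmt):
    """Convert a (type, width, decimals) tuple to a string such as 'F8.2'."""
    fmt_type, width, decimals = fmt
    name = FORMAT_NAMES[fmt_type]  # raises KeyError for codes not in the table
    # Assumed convention: leave out the decimals part when it is zero.
    if decimals:
        return f"{name}{width}.{decimals}"
    return f"{name}{width}"


print(format_tuple_to_str((5, 8, 2)))  # F8.2
```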
- property var_formats: dict
Variable formats as strings
Use var_formats_tuple property for formats as tuples
- property var_measure_levels: dict
Variable measure levels
Measure levels are returned as strings. When setting, input accepts either strings or numerics.
0 = unknown
1 = nominal
2 = ordinal
3 = scale
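Because input may mix strings and numeric codes, it can help to normalize a dictionary before comparing it with what is read back. The following is a minimal sketch using the code table above; `normalize_measure_level` and the variable names are hypothetical and not part of pyspssio.

```python
# Code table copied from the list above.
MEASURE_LEVELS = {0: "unknown", 1: "nominal", 2: "ordinal", 3: "scale"}


def normalize_measure_level(value):
    """Return the string name for a measure level given as a string or a number."""
    if isinstance(value, str):
        name = value.lower()
        if name not in MEASURE_LEVELS.values():
            raise ValueError(f"unknown measure level: {value!r}")
        return name
    return MEASURE_LEVELS[int(value)]


# Hypothetical input mixing numeric codes and strings.
levels = {"age": 3, "gender": "nominal", "rating": 2}
normalized = {var: normalize_measure_level(v) for var, v in levels.items()}
print(normalized)  # {'age': 'scale', 'gender': 'nominal', 'rating': 'ordinal'}
```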
- property var_alignments: dict
Variable alignments
Alignments are returned as strings. When setting, input accepts either strings or numerics.
0 = left
1 = right
2 = center
- property var_column_widths: dict
Column display widths
Manually set column widths or specify 0 to use SPSS’ algorithm to assign a width
- property var_labels: dict
Variable labels
- property var_roles: dict
Variable roles
Roles are returned as strings. When setting, input accepts either strings or numerics.
0 = input
1 = target
2 = both
3 = none
4 = partition
5 = split
6 = frequency
7 = recordid
- property var_value_labels: dict
Variable value labels
Nested dictionary of variables with their value labels (if defined) as sub-dictionaries
Note: value labels only work for numeric and short string variables (length <= 8)
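The nested structure and the length restriction can be sketched as follows. The variable names are hypothetical, and the `var_types` convention used here (0 for numeric, otherwise the string length in bytes) is an assumption rather than something stated on this page.

```python
# Hypothetical value labels: one sub-dictionary per labelled variable.
var_value_labels = {
    "gender": {1: "Male", 2: "Female"},
    "region": {"N": "North", "S": "South"},
}

# Assumed convention: 0 marks a numeric variable, any other value is a
# string variable's length in bytes.
var_types = {"gender": 0, "region": 1, "comment": 200}


def can_have_value_labels(var_type):
    """Numeric (0) and short string (length <= 8) variables qualify."""
    return var_type <= 8


# Variables that carry labels but do not qualify for them.
unsupported = [v for v in var_value_labels if not can_have_value_labels(var_types[v])]
print(unsupported)  # []
```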
- property mrsets_count: int
Number of multi response set definitions
Needed if using spssGetMultRespDefByIndex. Otherwise, len(mrsets) should be equivalent.
- property mrsets: dict
Multi response set definitions
- Multi response sets contain the following attributes
label : set label
is_dichotomy : whether set is dichotomous (True) or Category (False)
counted_value : counted value for dichotomous sets
use_category_labels : whether to use counted value labels instead of variable labels
use_first_var_label : whether to use first var label as set label
variable_list : list of variables in the set
Notes
mrset name must begin with a “$”.
variable_list is the only required attribute. However, if this is the only included attribute, then is_dichotomy is assumed to be False.
If is_dichotomy is True, counted_value must be specified. If is_dichotomy is None and counted_value is not None, is_dichotomy is assumed to be True.
Numeric dichotomous sets only accept integers for a counted value.
use_category_labels is only applicable for dichotomous sets. Setting this to True turns the set into an “extended” mrset definition.
use_first_var_label is only applicable when use_category_labels is True. Specifying a set label when use_first_var_label is True might result in an invalid mrset definition.
Examples
Category (C) Set:
{"$mc_mrset": { "label": "This is an MC set", "variable_list": ["var1", "var2", "var3"] }}
Dichotomous (D) Set:
{"$md_mrset": { "label": "This is an MD set", "counted_value": 1, "variable_list": ["resp1", "resp2", "resp3"] }}
Dichotomous (E - Extended) Set:
{"$md_mrset": { "counted_value": 1, "use_category_labels": True, "use_first_var_label": True, "variable_list": ["cat1", "cat2", "cat3"] }}
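The inference rules in the notes can be expressed as a small pre-check before assigning to mrsets. This is only a sketch that mirrors the notes, not the library's own validation; `resolve_mrset` is a hypothetical helper.

```python
def resolve_mrset(name, definition):
    """Apply the rules from the notes and return a completed copy of the definition."""
    if not name.startswith("$"):
        raise ValueError("mrset name must begin with '$'")
    if not definition.get("variable_list"):
        raise ValueError("variable_list is required")

    resolved = dict(definition)
    is_dichotomy = resolved.get("is_dichotomy")
    counted_value = resolved.get("counted_value")

    # Unspecified is_dichotomy: True only when a counted value is present.
    if is_dichotomy is None:
        is_dichotomy = counted_value is not None
    if is_dichotomy and counted_value is None:
        raise ValueError("counted_value must be specified for dichotomous sets")

    resolved["is_dichotomy"] = is_dichotomy
    return resolved


mc = resolve_mrset(
    "$mc_mrset",
    {"label": "This is an MC set", "variable_list": ["var1", "var2", "var3"]},
)
md = resolve_mrset(
    "$md_mrset",
    {"label": "This is an MD set", "counted_value": 1,
     "variable_list": ["resp1", "resp2", "resp3"]},
)
print(mc["is_dichotomy"], md["is_dichotomy"])  # False True
```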
- property case_size: int
Record case size (in bytes)
Raw number of bytes for a single case record. It can be calculated manually by adding all variable types rounded up to the nearest multiple of 8.
This is the buffer size used to read a whole case record at once. It is not necessarily the number of bytes used to store a case record on disk (depending on compression).
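The manual calculation might look like the sketch below. It assumes that numeric variables have type 0 and occupy 8 bytes, and that a string variable's type equals its length in bytes; neither convention is stated on this page, and very long strings may be laid out differently. `estimate_case_size` is a hypothetical helper.

```python
def estimate_case_size(var_types):
    """Sum each variable's byte width, rounded up to a multiple of 8."""
    total = 0
    for var_type in var_types.values():
        # Assumption: type 0 is numeric and takes 8 bytes;
        # any other value is a string length in bytes.
        width = 8 if var_type == 0 else var_type
        total += -(-width // 8) * 8  # ceiling to the next multiple of 8
    return total


# Hypothetical variables: one numeric, a 10-byte string, a 3-byte string.
print(estimate_case_size({"age": 0, "name": 10, "code": 3}))  # 8 + 16 + 8 = 32
```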
- property case_weight_var: str
Case weight variable
Variable set as the “weight” variable in SPSS. Must be a scale numeric variable.
- property var_missing_values: dict
Missing values
- Missing value definitions may contain three keys
lo = low value used in missing range
hi = high value used in missing range
values = list of discrete values set as user missing
- For missing ranges, the following keywords can be used in place of numeric values
low = -inf, lo, low, lowest
high = inf, hi, high, highest
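A missing value dictionary and the keyword handling can be sketched like this. The variable names are hypothetical, and `resolve_bound` only illustrates the keyword list above; it is not how pyspssio resolves bounds internally.

```python
import math

LOW_KEYWORDS = {"lo", "low", "lowest"}
HIGH_KEYWORDS = {"hi", "high", "highest"}


def resolve_bound(value):
    """Map a range keyword or number to a float bound."""
    if isinstance(value, str):
        keyword = value.lower()
        if keyword in LOW_KEYWORDS:
            return -math.inf
        if keyword in HIGH_KEYWORDS:
            return math.inf
    # Numbers, and the strings "inf" / "-inf", are handled by float().
    return float(value)


# Hypothetical definitions: a range plus one discrete value, and discrete values only.
var_missing_values = {
    "income": {"lo": "lowest", "hi": 0, "values": [99999]},
    "rating": {"values": [98, 99]},
}

income = var_missing_values["income"]
bounds = (resolve_bound(income["lo"]), resolve_bound(income["hi"]))
print(bounds)  # (-inf, 0.0)
```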
- property var_attributes: dict
Variable attributes
These are arbitrary variable properties, analogous to file attributes.
- property var_compat_names: dict
Short (8-byte) variable names
Dictionary of variable names with their “compatible” short 8-byte counterparts
- property var_sets: dict
Variable sets
These are NOT multi response sets. These variable sets are groupings of variables that can be selected in the SPSS application as a sort of view filter.
SPSS apparently may use the 8-byte compatible variable names for this property. It’s currently not possible to obtain the auto-generated compatible names until the dictionary is committed, which means setting this property potentially requires first committing a dictionary with all variables, and then rewriting it after obtaining the compatible variable names.
Set names created in the normal SPSS application allow spaces and special characters. However, the I/O module returns an SPSS_INVALID_VARSETDEF error when these are included. When an “=” sign is included in the set name, the set name is truncated.