Any table makes assumptions about its data, but these are rarely recorded explicitly in common table formats. Examples are the symbol(s) that signal "not available" values and the symbol used as the decimal separator.

setFormat(
  schema = NULL,
  header = 0,
  decimal = NULL,
  thousand = NULL,
  na_values = NULL,
  flags = NULL
)

Arguments

schema

[schema(1)]
To add this information to an already existing schema, provide that schema here (previous format information is overwritten).

header

[integerish(1)]
The number of header rows. Ideally, a table is read so that column names are ignored (for example readr::read_csv(file = ..., col_names = FALSE)). When processing relatively well-defined tables whose header is always a single row, the table can be read with the defaults, and the header is spliced back into the table by specifying the number of header rows here.

decimal

[character(1)]
The symbol that should be interpreted as the decimal separator.

thousand

[character(1)]
The symbol that should be interpreted as the thousands separator.

na_values

[character(.)]
The symbols that should be interpreted as NA.

flags

[data.frame(2)]
The typically character-based flags that should be stripped from observed variables so that the remaining values can be identified as numeric. This must be a data.frame with two columns named flag and value.
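
To make the roles of decimal, thousand, na_values and flags concrete, the following base R sketch shows one way such symbols could be applied to a raw character column. It is only an illustration under stated assumptions: parse_values() is a hypothetical helper and not part of the package, the sample data are invented, and treating the value column as a description of each flag is an assumption. The package's internal handling may differ.

```r
# Hypothetical raw column, as it might appear in a spreadsheet export.
raw <- c("1.234,5", "12,0*", "-", "7,25", "n/a")

# Assumed format description; all symbols here are illustrative.
decimal   <- ","
thousand  <- "."
na_values <- c("-", "n/a")
flags     <- data.frame(flag = "*", value = "provisional",
                        stringsAsFactors = FALSE)

# Hypothetical helper (not part of the package): applies the format
# description to a character vector and returns numeric values.
parse_values <- function(x, decimal, thousand, na_values, flags) {
  # 1. Replace every "not available" symbol with a real NA.
  x[x %in% na_values] <- NA_character_
  # 2. Strip each flag symbol, matching it literally.
  for (f in flags$flag) {
    x <- gsub(f, "", x, fixed = TRUE)
  }
  # 3. Drop thousands separators, then normalise the decimal sign.
  x <- gsub(thousand, "", x, fixed = TRUE)
  x <- sub(decimal, ".", x, fixed = TRUE)
  as.numeric(x)
}

parsed <- parse_values(raw, decimal, thousand, na_values, flags)
print(parsed)
```

The order of the steps matters in this sketch: NA symbols are replaced first so that a symbol such as "-" is not mistaken for part of a number, and the thousands separator is removed before the decimal sign is converted to ".".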

Value

An object of class schema.

Details

Please also take a look at the currently suggested strategy to set up a schema description.

See also

Other functions to describe table arrangement: setCluster(), setFilter(), setGroups(), setIDVar(), setObsVar()

Examples

# please check the vignette for examples
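
# A minimal call, sketched from the usage shown above. It assumes that
# setFormat() is provided by the tabshiftr package (an assumption based on
# the function family listed under "See also") and that the package is
# installed. All argument values are illustrative.

```r
library(tabshiftr)  # assumed package name

# Describe a table with one header row, European number formatting,
# two "not available" symbols and one flag symbol.
fmt <- setFormat(
  header    = 1L,
  decimal   = ",",
  thousand  = ".",
  na_values = c("-", "n/a"),
  flags     = data.frame(flag = "*", value = "provisional")
)

# According to the Value section, the result is an object of class schema.
methods::is(fmt, "schema")
```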