phenopacket_mapper.utils.parsing package

This module contains utility functions for parsing strings into Python values.

phenopacket_mapper.utils.parsing.parse_data_type(type_str: str, resources: List[CodeSystem], compliance: Literal['soft', 'hard'] = 'soft') → List[Any | CodeSystem | type | str]

Parses a string representing one or more data types or code systems into a list of Python types.

The purpose of this method is to parse the DataField.data_type entries in a Data Model tabular file. In the tabular file, the user can list typical primitive data types such as string, int, etc., or date as a data type. It is also possible to list the namespace prefix (e.g., "SCT" for SNOMED CT) of a specific resource to indicate that codes or terms from that resource are permitted, provided the resource is included in the list passed as the resources parameter.

When compliance is set to 'soft' (default), this method only issues a warning if a data type is unrecognized and adds a literal to the list of allowed data types. When compliance is set to 'hard', it raises a ValueError for an unrecognized data type instead.

E.g.

>>> parse_data_type("integer, str, Boolean", [])
[<class 'int'>, <class 'str'>, <class 'bool'>]
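The soft and hard compliance behaviour described above can be illustrated with a minimal sketch. It is not the package's implementation: CodeSystemStub, the alias table _PRIMITIVES, and sketch_parse_data_type are hypothetical names introduced here, and the accepted spellings, the comma separator, and the omission of date handling are all assumptions.

```python
import warnings
from typing import Any, Dict, List


class CodeSystemStub:
    """Hypothetical stand-in for CodeSystem; only the namespace prefix is used."""

    def __init__(self, name: str, namespace_prefix: str):
        self.name = name
        self.namespace_prefix = namespace_prefix


# Assumed alias table; the real function may accept other spellings.
_PRIMITIVES: Dict[str, type] = {
    "str": str, "string": str,
    "int": int, "integer": int,
    "float": float,
    "bool": bool, "boolean": bool,
}


def sketch_parse_data_type(
    type_str: str,
    resources: List[CodeSystemStub],
    compliance: str = "soft",
) -> List[Any]:
    # Index the given resources by lower-cased namespace prefix.
    by_prefix = {r.namespace_prefix.lower(): r for r in resources}
    parsed: List[Any] = []
    for token in type_str.split(","):
        token = token.strip()
        if not token:
            continue
        key = token.lower()
        if key in _PRIMITIVES:
            parsed.append(_PRIMITIVES[key])
        elif key in by_prefix:
            parsed.append(by_prefix[key])
        elif compliance == "hard":
            # Hard compliance: an unrecognized entry is an error.
            raise ValueError(f"Unrecognized data type: {token!r}")
        else:
            # Soft compliance: warn and keep the entry as a literal string.
            warnings.warn(f"Unrecognized data type {token!r}; keeping it as a literal")
            parsed.append(token)
    return parsed
```

Under these assumptions, passing a stub with the prefix "SCT" in resources lets "SCT, int" resolve to that stub followed by int, while an unknown entry either triggers a warning or a ValueError depending on compliance.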

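Parameters:
  • type_str – string listing one or more data types or resource namespace prefixes, separated by commas

  • resources – list of CodeSystem objects whose namespace prefixes may appear in type_str

  • compliance – 'soft' (default) to warn about unrecognized data types, 'hard' to raise a ValueError

Returns:

a list of the parsed data types and code systems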

phenopacket_mapper.utils.parsing.parse_ordinal(field_name_str: str) → Tuple[str, str]

Parses a DataField.name string into separate strings containing the ordinal and the name, respectively.

This method is meant to be used when reading in a DataModel from a file, where data model fields might have an ordinal attached to them (e.g., "1.1. Pseudonym"). It separates such a string into ordinal="1.1" and name="Pseudonym".

>>> parse_ordinal("1.1. Pseudonym")
('1.1', 'Pseudonym')
>>> parse_ordinal("1. Pseudonym")
('1', 'Pseudonym')
>>> parse_ordinal("I.a. Pseudonym")
('I.a', 'Pseudonym')
>>> parse_ordinal("ii. Pseudonym")
('ii', 'Pseudonym')
Parameters:

field_name_str – name of the field, containing an ordinal, to parse

Returns:

a tuple containing the ordinal and the name
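One way to obtain the splits shown in the examples above is a regular expression. The following is a sketch only, not the package's implementation: sketch_parse_ordinal and _ORDINAL_RE are hypothetical names, the accepted ordinal forms (digits, Roman numerals, single letters) are an assumption, and returning an empty ordinal when none is found is likewise an assumption.

```python
import re
from typing import Tuple

# One or more segments (digits, Roman numerals, or a single letter), each
# followed by a dot, then whitespace, then the field name.
_ORDINAL_RE = re.compile(
    r"^\s*((?:(?:\d+|[ivxlcdm]+|[a-z])\.)+)\s+(.+)$",
    re.IGNORECASE,
)


def sketch_parse_ordinal(field_name_str: str) -> Tuple[str, str]:
    match = _ORDINAL_RE.match(field_name_str)
    if match is None:
        # Assumption: no ordinal found, so return an empty ordinal.
        return "", field_name_str.strip()
    ordinal, name = match.groups()
    # Drop the trailing dot so "1.1." becomes "1.1".
    return ordinal.rstrip("."), name.strip()
```

Restricting the segments to numerals and single letters keeps an ordinary word that happens to end in a dot from being read as an ordinal.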

Submodules