phenopacket_mapper.pipeline.input module

phenopacket_mapper.pipeline.input.read_file(path: str | Path, data_model: DataModel = DataModel(data_model_name='ERDRI_CDS', fields=[], resources=[]), file_type: Literal['csv', 'excel', 'unknown'] = 'unknown') List[DataModelInstance][source]

Reads a CSV or Excel file using a DataModel definition and returns a list of DataModelInstances

Parameters:
  • path – Path to the formatted CSV or Excel file

  • file_type – Type of file to read: ‘csv’, ‘excel’, or ‘unknown’ (the default)

  • data_model – DataModel to use for reading the file

Returns:

List of DataModelInstances
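
The signature allows file_type=‘unknown’, but this page does not say how that value is handled. One plausible approach, shown here only as a minimal sketch and not as the library's implementation, is to infer the type from the file suffix. The helper name resolve_file_type and the suffix table are hypothetical.

```python
from pathlib import Path
from typing import Literal, Union

FileType = Literal["csv", "excel", "unknown"]

# Hypothetical suffix table: the extensions the library actually accepts
# are not documented on this page.
_SUFFIX_TO_TYPE = {".csv": "csv", ".xlsx": "excel", ".xls": "excel"}


def resolve_file_type(path: Union[str, Path], file_type: FileType = "unknown") -> str:
    """Return 'csv' or 'excel', guessing from the suffix when file_type is 'unknown'."""
    if file_type != "unknown":
        # An explicit choice always wins over the suffix.
        return file_type
    suffix = Path(path).suffix.lower()
    try:
        return _SUFFIX_TO_TYPE[suffix]
    except KeyError:
        raise ValueError(f"Cannot infer file type from suffix {suffix!r}") from None
```

Passing file_type explicitly sidesteps any guessing, which is the safer choice when a file has an unusual extension.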

phenopacket_mapper.pipeline.input.read_data_model(data_model_name: str, resources: List[CodeSystem], path: str | Path, file_type: Literal['csv', 'excel', 'unknown'] = 'unknown', column_names: Dict[str, str] = mappingproxy({'name': 'data_field_name', 'section': 'data_model_section', 'description': 'description', 'data_type': 'data_type', 'required': 'required', 'specification': 'specification', 'ordinal': 'ordinal'}), parse_data_types: bool = False, compliance: Literal['soft', 'hard'] = 'soft', remove_line_breaks: bool = False, parse_ordinals: bool = True) DataModel[source]

Reads a Data Model from a file

Parameters:
  • data_model_name – Name to be given to the DataModel object

  • resources – List of CodeSystem objects to be used as resources in the DataModel

  • path – Path to Data Model file

  • file_type – Type of file to read: ‘csv’, ‘excel’, or ‘unknown’ (the default)

  • column_names – A dictionary mapping each field of the DataField class (key) to a column of the file (value). Setting a value to the empty string (‘’) leaves that field empty in the DataModel definition.

  • parse_data_types – If True, parses the data type string into a list of CodeSystems and types, which can later be used to check the validity of the data. Optional, but highly recommended.

  • compliance – Only applicable if parse_data_types=True, otherwise does nothing. ‘soft’ raises warnings upon encountering invalid data types, ‘hard’ raises ValueError.

  • remove_line_breaks – Whether to remove line breaks from string values

  • parse_ordinals – Whether to extract the ordinal number from the field name. Warning: this can overwrite values. Ordinals could look like “1.1.”, “1.”, “I.a.”, or “ii.”.
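
To make the parse_ordinals warning concrete, the following is a minimal sketch of how an ordinal prefix might be split from a field name. The regular expression and the helper split_ordinal are assumptions made for illustration; the library's actual parsing rules are not described on this page.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: one or more dot-terminated tokens (digits, or up to four
# letters) at the start of the name, followed by whitespace and the rest.
_ORDINAL = re.compile(
    r"^\s*((?:\d+|[A-Za-z]{1,4})\.(?:(?:\d+|[A-Za-z]{1,4})\.)*)\s+(.*)$"
)


def split_ordinal(field_name: str) -> Tuple[Optional[str], str]:
    """Split 'I.a. Sex' into ('I.a.', 'Sex'); return (None, name) if no ordinal is found."""
    match = _ORDINAL.match(field_name)
    if match is None:
        return None, field_name.strip()
    return match.group(1), match.group(2).strip()


print(split_ordinal("1.1. Pseudonym"))      # ('1.1.', 'Pseudonym')
print(split_ordinal("ii. Age"))             # ('ii.', 'Age')
print(split_ordinal("Date of birth"))       # (None, 'Date of birth')
print(split_ordinal("Dr. Smith referral"))  # ('Dr.', 'Smith referral')
```

The last call shows why the option carries a warning: any pattern loose enough to accept “I.a.” or “ii.” will also treat an abbreviation such as “Dr.” as an ordinal and strip it from the name. Setting parse_ordinals=False avoids this when field names are not numbered.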

phenopacket_mapper.pipeline.input.read_redcap_api(data_model: DataModel) List[DataModelInstance][source]

Reads data from the REDCap API and returns a list of DataModelInstances

Parameters:

data_model – DataModel to use for reading the data

Returns:

List of DataModelInstances

phenopacket_mapper.pipeline.input.read_phenopackets(path: Path) List[Phenopacket][source]

Reads Phenopackets from a file

Parameters:

path – Path to Phenopackets file

Returns:

List of Phenopackets