phenopacket_mapper.pipeline.mapper module

class phenopacket_mapper.pipeline.mapper.PhenopacketMapper(datamodel: DataModel)[source]

Bases: object

Class to map data using a DataModel to Phenopackets

This class is central to the pipeline for mapping data from a DataModel to Phenopackets. A dataset can be mapped from its tabular format to the Phenopacket schema in a few simple steps:

1. Define the DataModel for the dataset, if it does not exist yet
2. Load the data from the dataset
3. Define the mapping from the DataModel to the Phenopacket schema
4. Perform the mapping
5. Write the Phenopackets to a file
6. Optionally validate the Phenopackets
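The order of the load, map, and write steps can be sketched with a small stand-in. Everything below is an assumption made for illustration: `StandInMapper`, `run_pipeline`, the column names, and the dictionary-based mapping are hypothetical and are not the library's implementation. Only the method names `load_data`, `map`, and `write` and their argument order are taken from the signatures documented here.

```python
import csv
import json
import tempfile
from pathlib import Path
from typing import Dict, List, Union


class StandInMapper:
    """Hypothetical stand-in that mirrors the PhenopacketMapper method names."""

    def __init__(self, datamodel: List[str]):
        # Assumption: the "data model" here is just the expected column names.
        self.datamodel = datamodel

    def load_data(self, path: Union[str, Path]) -> List[Dict[str, str]]:
        path = Path(path)
        if path.suffix.lower() != ".csv":
            raise ValueError(f"Unrecognized file type: {path.suffix}")
        with path.open(newline="") as handle:
            reader = csv.DictReader(handle)
            if reader.fieldnames is None or set(reader.fieldnames) != set(self.datamodel):
                raise ValueError("File does not follow the data model")
            return list(reader)

    def map(self, mapping_: Dict[str, str], data: List[Dict[str, str]]) -> List[Dict[str, str]]:
        # Rename every source column to its target key.
        return [{mapping_[key]: value for key, value in row.items()} for row in data]

    def write(self, phenopackets: List[Dict[str, str]], output_path: Union[str, Path]) -> bool:
        try:
            Path(output_path).write_text(json.dumps(phenopackets, indent=2))
            return True
        except OSError:
            return False


def run_pipeline(workdir: Path) -> List[Dict[str, str]]:
    source = workdir / "patients.csv"
    source.write_text("patient_id,sex\nP1,FEMALE\nP2,MALE\n")

    mapper = StandInMapper(datamodel=["patient_id", "sex"])  # step 1: define the model
    data = mapper.load_data(source)                          # step 2: load the data
    mapping_ = {"patient_id": "id", "sex": "subject_sex"}    # step 3: define the mapping
    phenopackets = mapper.map(mapping_, data)                # step 4: perform the mapping
    if not mapper.write(phenopackets, workdir / "phenopackets.json"):  # step 5: write
        raise OSError("Writing the output file failed")
    return phenopackets


with tempfile.TemporaryDirectory() as tmp:
    result = run_pipeline(Path(tmp))
```

With the real classes, the DataModel and the DataModel2PhenopacketSchema objects would take the place of the column list and the dictionary used here. Step 6, validation, is not part of this sketch.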

load_data(path: str | Path) → List[DataModelInstance][source]

Load data from a file using the DataModel

Raises an error if the file type is not recognized or if the file does not follow the DataModel

Parameters: path – Path to the file to load
Returns: List of DataModelInstances

map(mapping_: DataModel2PhenopacketSchema, data: List[DataModelInstance]) → List[Phenopacket][source]

Map data from the DataModel to Phenopackets

The mapping is based on the definition of the DataModel and the DataModel2PhenopacketSchema mapping.

If successful, a list of Phenopackets will be returned

Parameters:

mapping_ – Mapping from the DataModel to the Phenopacket schema, defined in DataModel2PhenopacketSchema
data – List of DataModelInstances created from the data using the DataModel

Returns:

List of Phenopackets

write(phenopackets: List[Phenopacket], output_path: str | Path) → bool[source]

Write Phenopackets to a file

Parameters:

phenopackets – List of Phenopackets to write
output_path – Path to write the Phenopackets to

Returns:

True if successful, False otherwise
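Because `write` reports failure through its return value rather than an exception, a caller that ignores the result can miss a failed write. A minimal sketch of a wrapper that turns `False` into an exception follows. `SupportsWrite`, `write_or_raise`, and `FakeMapper` are hypothetical names introduced for this example; only the `write(phenopackets, output_path) -> bool` shape comes from the signature above.

```python
from pathlib import Path
from typing import Any, List, Protocol, Union


class SupportsWrite(Protocol):
    """Any object exposing write(phenopackets, output_path) -> bool."""

    def write(self, phenopackets: List[Any], output_path: Union[str, Path]) -> bool: ...


def write_or_raise(
    mapper: SupportsWrite,
    phenopackets: List[Any],
    output_path: Union[str, Path],
) -> Path:
    """Call mapper.write and raise OSError if it returns False."""
    output_path = Path(output_path)
    if not mapper.write(phenopackets, output_path):
        raise OSError(
            f"Could not write {len(phenopackets)} phenopacket(s) to {output_path}"
        )
    return output_path


class FakeMapper:
    """Hypothetical test double whose write result is fixed in advance."""

    def __init__(self, succeed: bool):
        self.succeed = succeed

    def write(self, phenopackets: List[Any], output_path: Union[str, Path]) -> bool:
        return self.succeed
```

Any object with a matching `write` method can be passed as `mapper`, so the wrapper does not depend on how the mapper was constructed.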

phenopacket_mapper.pipeline.mapper.mapping(path: Path, output: Path, validate_: bool, datamodel: DataModel = DataModel(data_model_name='ERDRI_CDS', fields=[], resources=[]))[source]

Executes the pipeline that maps a dataset in the format described by datamodel to the Phenopacket schema

Parameters:

path – Path to the formatted CSV or Excel file
output – Path to write Phenopackets to
validate_ – Validate the Phenopackets using phenopacket-tools after creation
datamodel – DataModel to use for the mapping, defaults to the ERDRI_CDS DataModel shown in the signature