skripts.FIA package

Submodules

skripts.FIA.FIA module

Methods for Flow-injection-analysis.

Assigns metabolites to consensus map masses.

Parameters:
  • consensus_map (pyopenms.ConsensusMap) – Input consensus map

  • database_dir (str) – Database directory

  • tmp_dir (str) – Directory for temporary saves

  • positive_adducts_file (str) – File with possible positive adducts

  • negative_adducts_file (str) – File with possible negative adducts

  • HMDBMapping_file (str) – HMDBMapping file

  • HMDB2StructMapping_file (str) – HMDB2StructMapping file

  • ionization_mode (str, optional) – Ionization mode, defaults to “auto”

Returns:

Annotated dataframe

Return type:

pandas.DataFrame

skripts.FIA.FIA.align_retention_times(feature_maps: list, max_num_peaks_considered: int = -1, max_mz_difference: float = 10.0, mz_unit: str = 'ppm', superimposer_max_scaling: float = 2.0) list[source]

Aligns retention times, using the file with the largest number of features as the reference for alignment. This works well if you have a pooled QC, for example. Returns the aligned map at the first position.

Parameters:
  • feature_maps (list) – List of feature maps

  • max_num_peaks_considered (int, optional) – Maximum number of considered peaks, defaults to -1

  • max_mz_difference (float, optional) – Maximum m/z difference, defaults to 10.0

  • mz_unit (str, optional) – Unit for m/z values, defaults to “ppm”

  • superimposer_max_scaling (float, optional) – Maximum scaling during superimposition, defaults to 2.0

Returns:

List of feature maps with aligned retention times

Return type:

list
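A minimal sketch of the reference selection described above. It assumes that feature maps can be compared with len() and that "at the first position" means the reference leads the returned list. The helper name order_reference_first is hypothetical, plain lists stand in for feature maps, and the alignment step itself is not shown.

```python
def order_reference_first(feature_maps):
    """Return the maps with the largest one (the assumed reference) in front."""
    # Index of the map holding the most features; ties resolve to the first one.
    ref_index = max(range(len(feature_maps)), key=lambda i: len(feature_maps[i]))
    reference = feature_maps[ref_index]
    others = [fm for i, fm in enumerate(feature_maps) if i != ref_index]
    return [reference] + others


# Plain lists stand in for feature maps here.
ordered = order_reference_first([["f1"], ["f1", "f2", "f3"], ["f1", "f2"]])
```

A pooled QC sample usually contains most features seen across the individual samples, which is why it tends to be picked as the reference by this criterion.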

skripts.FIA.FIA.annotate_consensus_map_df(consensus_map_df: DataFrame, mass_search_df: DataFrame, result_path: str = '.', mz_tolerance: float = 1e-05) DataFrame[source]

Annotate consensus map DataFrame.

Parameters:
  • consensus_map_df (pd.DataFrame) – Input Consensus map DataFrame

  • mass_search_df (pd.DataFrame) – Mass search DataFrame

  • result_path (str, optional) – Path to output results, defaults to “.”

  • mz_tolerance (float, optional) – Tolerance of m/z deviation, defaults to 1e-05

Returns:

Identified Metabolites DataFrame

Return type:

pd.DataFrame

skripts.FIA.FIA.assign_feature_maps_polarity(feature_maps: list, scan_polarity: str | None = None) list[source]

Assigns the polarity to a list of feature maps, depending on “pos”/“neg” in the file name.

Parameters:
  • feature_maps (list) – List of feature maps

  • scan_polarity (Optional[str], optional) – Scan polarity, defaults to None

Returns:

List of feature maps with annotated polarity

Return type:

list
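The file-name rule above can be sketched as follows. The helper name polarity_from_name and the returned labels "positive"/"negative" are assumptions for illustration; the labels the module actually stores on a feature map are not stated in this reference.

```python
def polarity_from_name(file_name, scan_polarity=None):
    """Guess the polarity from a file name unless one is given explicitly."""
    if scan_polarity is not None:
        # An explicit polarity overrides the file-name heuristic.
        return scan_polarity
    lowered = file_name.lower()
    if "pos" in lowered:
        return "positive"
    if "neg" in lowered:
        return "negative"
    # Neither marker found: leave the polarity undecided.
    return None


guessed = polarity_from_name("sample_01_neg.mzML")
```

Substring matching is simple but can misfire on names that contain "pos" or "neg" for other reasons, so passing scan_polarity explicitly may be the safer choice in such cases.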

skripts.FIA.FIA.batch_download(base_url: str, file_urls: list, save_directory: str) None[source]

Download files from a list into a directory.

Parameters:
  • base_url (str) – Base URL

  • file_urls (list) – Individual file URLs which are appended to the base

  • save_directory (str) – Directory to save files to.
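As a rough illustration of how the three arguments relate, the sketch below only plans the downloads: it appends each file URL to the base and derives a target path in the save directory. The helper name plan_downloads, the plain string concatenation, and the use of the URL's last path segment as the file name are all assumptions; no network access happens here.

```python
import os


def plan_downloads(base_url, file_urls, save_directory):
    """Pair every full URL with the local path it would be saved to."""
    plan = []
    for file_url in file_urls:
        full_url = base_url + file_url  # "appended to the base"
        # Assumed naming: keep only the last path segment as the file name.
        target = os.path.join(save_directory, file_url.rsplit("/", 1)[-1])
        plan.append((full_url, target))
    return plan


plan = plan_downloads("https://example.org/files/", ["a.csv", "sub/b.csv"], "out")
```

Each pair could then be passed to urllib.request.urlretrieve(full_url, target) from the standard library to perform the actual transfer.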

skripts.FIA.FIA.bin_df_stepwise(df: pandas.core.frame.DataFrame | polars.dataframe.frame.DataFrame, binning_var='mz', binned_var='inty', statistic='sum', start: float = 0.0, stop: float = 2000.0, step: float = 0.001, backend=<module 'pandas'>) DataFrame | DataFrame[source]

Stepwise binning of a dataframe into discrete boxes.

Parameters:
  • df (Union[pd.DataFrame, pl.DataFrame]) – Input dataframe

  • binning_var (str, optional) – Binning variable, i.e. distance defining value (column in dataframe), defaults to “mz”

  • binned_var (str, optional) – Binned value, i.e. combined value (column in dataframe), defaults to “inty”

  • statistic (str, optional) – Operation to perform on binned values in a window. May take common parameters defined by scipy.stats.binned_statistic, defaults to “sum”

  • start (float, optional) – Starting binning variable point, defaults to 0.0

  • stop (float, optional) – Stopping binning variable point, defaults to 2000.0

  • step (float, optional) – Step distance along binning variable, defaults to 0.001

  • backend (module, optional) – Backend (pandas or polars), defaults to pd

Returns:

Binned dataframe

Return type:

Union[pd.DataFrame, pl.DataFrame]
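To make the idea of stepwise binning concrete, here is a dependency-free sketch that sums intensities into fixed-width m/z bins. It is not the module's implementation: the helper name bin_stepwise_sketch is hypothetical, only the "sum" statistic is covered, and plain lists replace the dataframe columns.

```python
import math
from collections import defaultdict


def bin_stepwise_sketch(mzs, intys, start=0.0, stop=2000.0, step=0.001):
    """Sum intensities into bins [start + k*step, start + (k+1)*step)."""
    sums = defaultdict(float)
    for mz, inty in zip(mzs, intys):
        if start <= mz < stop:
            # Index of the bin whose left edge lies at or below this m/z.
            sums[math.floor((mz - start) / step)] += inty
    # Report each occupied bin by its left edge.
    return {start + k * step: total for k, total in sorted(sums.items())}


binned = bin_stepwise_sketch([100.2, 100.7, 101.1], [1.0, 2.0, 4.0], step=1.0)
# binned == {100.0: 3.0, 101.0: 4.0}
```

With the documented defaults (0.0 to 2000.0 in steps of 0.001) the grid has two million bins, so keeping only occupied bins, as this sketch does, is one way to limit memory use.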

skripts.FIA.FIA.bin_df_stepwise_batch(experiments: pandas.core.frame.DataFrame | polars.dataframe.frame.DataFrame, sample_var: str = 'sample', experiment_var: str = 'experiment', binning_var='mz', binned_var='inty', statistic='sum', start: float = 0.0, stop: float = 2000.0, step: float = 0.001, backend=<module 'pandas'>) DataFrame | DataFrame[source]

Stepwise binning of multiple dataframes into discrete boxes.

Parameters:
  • experiments (Union[pd.DataFrame, pl.DataFrame]) – Input experiments

  • binning_var (str, optional) – Binning variable, i.e. distance defining value (column in dataframe), defaults to “mz”

  • binned_var (str, optional) – Binned value, i.e. combined value (column in dataframe), defaults to “inty”

  • statistic (str, optional) – Operation to perform on binned values in a window. May take common parameters defined by scipy.stats.binned_statistic, defaults to “sum”

  • start (float, optional) – Starting binning variable point, defaults to 0.0

  • stop (float, optional) – Stopping binning variable point, defaults to 2000.0

  • step (float, optional) – Step distance along binning variable, defaults to 0.001

  • backend (module, optional) – Backend (pandas or polars), defaults to pd

Returns:

Binned dataframe

Return type:

Union[pd.DataFrame, pl.DataFrame]

skripts.FIA.FIA.bits_to_bytes(bits: float | int, factor: float | int) float[source]

Converts a number of bits to a number of bytes for readability.

Parameters:
  • bits (Union[float, int]) – Number of bits to be converted

  • factor (Union[float, int]) – Power of ten to divide by, i.e. / 10**factor (e.g. use 9 for GB)

Returns:

Number of bytes

Return type:

float
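A small worked example of the conversion, assuming the usual 8 bits per byte followed by the division by 10**factor described above. The helper name bits_to_bytes_sketch is hypothetical, and the order of operations in the module itself is not confirmed by this reference.

```python
def bits_to_bytes_sketch(bits, factor):
    """Convert bits to bytes, scaled down by a power of ten."""
    # 8 bits per byte, then scale: factor=3 for kB, 6 for MB, 9 for GB.
    return bits / 8 / 10**factor


gigabytes = bits_to_bytes_sketch(8e9, 9)
# gigabytes == 1.0
```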

skripts.FIA.FIA.build_directory(dir_path: str) None[source]

Build a new directory in the given path.

Parameters:

dir_path (str) – Directory path

skripts.FIA.FIA.centroid_batch(experiments: Sequence[_MSExperimentDF | str] | str, run_dir: str, file_ending: str = '.mzML', instrument: str = 'TOF', signal_to_noise: float = 1.0, spacing_difference_gap: float = 4.0, spacing_difference: float = 1.5, missing: int = 1, ms_levels: List[int] = [], report_FWHM: str = 'true', report_FWHM_unit: str = 'relative', max_intensity: float = -1, auto_max_stdev_factor: float = 3.0, auto_max_percentile: int = 95, auto_mode: int = 0, win_len: float = 200.0, bin_count: int = 30, min_required_elements: int = 10, noise_for_empty_window: float = 1e+20, write_log_messages: str = 'true', peak_width: float = 0.0, sn_bin_count: int = 30, nr_iterations: int = 5, sn_win_len: float = 20.0, check_width_internally: str = 'false', ms1_only: str = 'true', clear_meta_data: str = 'false', deepcopy: bool = False) str[source]

Centroids a batch of experiments, extracted from files in a given directory with a given file ending (i.e. .mzML or .mzXML). Returns the new directory as path/centroids.

Parameters:
  • experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Input experiments

  • run_dir (str) – Run directory

  • file_ending (str, optional) – File ending, defaults to “.mzML”

  • instrument (str, optional) – Instrument type (TOF or FT-ICR, Orbitrap), defaults to “TOF”

  • signal_to_noise (float, optional) – Signal to noise ratio, defaults to 1.0

  • spacing_difference_gap (float, optional) – Spacing difference gap, defaults to 4.0

  • spacing_difference (float, optional) – Spacing difference, defaults to 1.5

  • missing (int, optional) – Number of allowed missing values, defaults to 1

  • ms_levels (List[int], optional) – MS levels to consider, defaults to []

  • report_FWHM (str, optional) – Report full width at half maximum, defaults to “true”

  • report_FWHM_unit (str, optional) – Report full width at half maximum unit, defaults to “relative”

  • max_intensity (float, optional) – Maximum intensity, defaults to -1

  • auto_max_stdev_factor (float, optional) – Automatic maximal standard deviation factor, defaults to 3.0

  • auto_max_percentile (int, optional) – Automatic maximal percentile to consider, defaults to 95

  • auto_mode (int, optional) – Automatic mode (0/1), defaults to 0

  • win_len (float, optional) – Window length, defaults to 200.0

  • bin_count (int, optional) – Number of bins, defaults to 30

  • min_required_elements (int, optional) – Minimum required elements for a peak, defaults to 10

  • noise_for_empty_window (float, optional) – Noise value for an empty window, defaults to 1e+20

  • write_log_messages (str, optional) – Write log messages (true/false), defaults to “true”

  • peak_width (float, optional) – Expected peak width, defaults to 0.0

  • sn_bin_count (int, optional) – Signal bin count, defaults to 30

  • nr_iterations (int, optional) – Iterations to recenter peaks, defaults to 5

  • sn_win_len (float, optional) – Signal window length, defaults to 20.0

  • check_width_internally (str, optional) – Check width internally, defaults to “false”

  • ms1_only (str, optional) – Only MS1 spectrum, defaults to “true”

  • clear_meta_data (str, optional) – Clear meta data, defaults to “false”

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Returns:

Directory with centroids

Return type:

str

skripts.FIA.FIA.centroid_experiment(experiment: _MSExperimentDF | str, instrument: str = 'TOF', signal_to_noise: float = 1.0, spacing_difference_gap: float = 4.0, spacing_difference: float = 1.5, missing: int = 1, ms_levels: List[int] = [], report_FWHM: str = 'true', report_FWHM_unit: str = 'relative', max_intensity: float = -1, auto_max_stdev_factor: float = 3.0, auto_max_percentile: int = 95, auto_mode: int = 0, win_len: float = 200.0, bin_count: int = 30, min_required_elements: int = 10, noise_for_empty_window: float = 1e+20, write_log_messages: str = 'true', peak_width: float = 0.0, sn_bin_count: int = 30, nr_iterations: int = 5, sn_win_len: float = 20.0, check_width_internally: str = 'false', ms1_only: str = 'true', clear_meta_data: str = 'false', deepcopy: bool = False) _MSExperimentDF[source]

Reduce dataset to centroids.

Usecase:

    fia_df["cent_experiment"] = fia_df["experiment"].apply(
        lambda experiment: centroid_experiment(
            experiment, instrument="TOF",  # For All
            signal_to_noise=2.0, spacing_difference=1.5,
            spacing_difference_gap=4.0, missing=1, ms_levels=[1],  # For Orbitrap
            report_FWHM="true", report_FWHM_unit="relative", max_intensity=-1,
            auto_max_stdev_factor=3.0, auto_max_percentile=95, auto_mode=0,
            win_len=200.0, bin_count=30, min_required_elements=10,
            noise_for_empty_window=1e+20, write_log_messages="true",
            peak_width=0.0, sn_bin_count=30, nr_iterations=5, sn_win_len=20.0,  # For TOF
            check_width_internally="false", ms1_only="true", clear_meta_data="false",
            deepcopy=False))

Parameters:
  • experiment (Union[pyopenms.MSExperiment, str]) – Input experiment

  • instrument (str, optional) – Instrument type (TOF or FT-ICR, Orbitrap), defaults to “TOF”

  • signal_to_noise (float, optional) – Signal to noise ratio, defaults to 1.0

  • spacing_difference_gap (float, optional) – Spacing difference gap, defaults to 4.0

  • spacing_difference (float, optional) – Spacing difference, defaults to 1.5

  • missing (int, optional) – Number of allowed missing values, defaults to 1

  • ms_levels (List[int], optional) – MS levels to consider, defaults to []

  • report_FWHM (str, optional) – Report full width at half maximum, defaults to “true”

  • report_FWHM_unit (str, optional) – Report full width at half maximum unit, defaults to “relative”

  • max_intensity (float, optional) – Maximum intensity, defaults to -1

  • auto_max_stdev_factor (float, optional) – Automatic maximal standard deviation factor, defaults to 3.0

  • auto_max_percentile (int, optional) – Automatic maximal percentile to consider, defaults to 95

  • auto_mode (int, optional) – Automatic mode (0/1), defaults to 0

  • win_len (float, optional) – Window length, defaults to 200.0

  • bin_count (int, optional) – Number of bins, defaults to 30

  • min_required_elements (int, optional) – Minimum required elements for a peak, defaults to 10

  • noise_for_empty_window (float, optional) – Noise value for an empty window, defaults to 1e+20

  • write_log_messages (str, optional) – Write log messages (true/false), defaults to “true”

  • peak_width (float, optional) – Expected peak width, defaults to 0.0

  • sn_bin_count (int, optional) – Signal bin count, defaults to 30

  • nr_iterations (int, optional) – Iterations to recenter peaks, defaults to 5

  • sn_win_len (float, optional) – Signal window length, defaults to 20.0

  • check_width_internally (str, optional) – Check width internally, defaults to “false”

  • ms1_only (str, optional) – Only MS1 spectrum, defaults to “true”

  • clear_meta_data (str, optional) – Clear meta data, defaults to “false”

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Returns:

Centroided experiment

Return type:

pyopenms.MSExperiment

skripts.FIA.FIA.check_ending_experiment(file: str) bool[source]

Check whether the file has a mzML or mzXML ending.

Parameters:

file (str) – Path to file

Returns:

Ending is mzML or mzXML

Return type:

bool
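The check can be sketched with str.endswith, which accepts a tuple of suffixes. The helper name has_experiment_ending is hypothetical, and whether the module compares case-sensitively is an assumption made here.

```python
def has_experiment_ending(file):
    """True if the path ends in .mzML or .mzXML (case-sensitive in this sketch)."""
    return file.endswith((".mzML", ".mzXML"))


is_experiment = has_experiment_ending("runs/sample_01.mzML")
```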

skripts.FIA.FIA.clean_dir(dir_path: str, subfolder: str | None = None) str[source]

Delete a directory or its subfolder and reconstruct the directory.

Parameters:
  • dir_path (str) – Directory path

  • subfolder (Optional[str], optional) – Subfolder path, defaults to None

Returns:

Directory path

Return type:

str
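A sketch of the delete-and-rebuild behaviour using only the standard library. The helper name clean_dir_sketch is hypothetical, and returning the rebuilt (sub)directory is an assumption: the reference only states that a directory path is returned.

```python
import os
import shutil
import tempfile


def clean_dir_sketch(dir_path, subfolder=None):
    """Remove the target directory with its contents, then recreate it empty."""
    target = os.path.join(dir_path, subfolder) if subfolder else dir_path
    if os.path.isdir(target):
        shutil.rmtree(target)
    os.makedirs(target)
    return target


# Demonstration inside a throwaway directory.
base = tempfile.mkdtemp()
os.makedirs(os.path.join(base, "run"))
open(os.path.join(base, "run", "old.txt"), "w").close()
cleaned = clean_dir_sketch(base, "run")
remaining = os.listdir(cleaned)  # the stale file is gone
shutil.rmtree(base)
```

Because everything below the target is deleted, it is worth double-checking the path before calling a function of this kind on real data.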

skripts.FIA.FIA.cluster_matlab(df: DataFrame, height_lim: int = 1000, prominence_lim: int = 1000, threshold: float = 0.004900000000000001)[source]

Clusters according to the FIA MATLAB routine.

Parameters:
  • df (pd.DataFrame) – Input dataframe

  • height_lim (int, optional) – height limit, defaults to 1000

  • prominence_lim (int, optional) – prominence limit, defaults to 1000

  • threshold (float, optional) – threshold to cut off values, defaults to (7e-2)**2

Returns:

m/z + inty as paired array

Return type:

np.ndarray

skripts.FIA.FIA.cluster_sliding_window(comb_experiment: _MSExperimentDF, height_lim: int = 1000, prominence_lim: int = 1000, window_len: int = 2000, window_shift=1000, threshold: float = 0.004900000000000001)[source]

Applies clustering over sliding window in an experiment. The result may contain duplicates or close to duplicates.

Parameters:

  • comb_experiment (pyopenms.MSExperiment) – Input experiment

  • height_lim (int, optional) – Height limit, defaults to 1000

  • prominence_lim (int, optional) – Prominence limit, defaults to 1000

  • window_len (int, optional) – Window length, defaults to 2000

  • window_shift (int, optional) – Window shift, defaults to 1000

  • threshold (float, optional) – Threshold to cut off values, defaults to (7e-2)**2

Returns:

Clustered experiment

Return type:

pyopenms.MSExperiment
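The note about duplicates follows from the window geometry: with the default shift of 1000 and length of 2000, consecutive windows overlap by half, so a peak in the overlap can be reported twice. The sketch below only computes window bounds. The helper name sliding_windows is hypothetical, and treating the window parameters as index positions is an assumption.

```python
def sliding_windows(n_points, window_len=2000, window_shift=1000):
    """Return (start, end) index pairs of windows covering n_points values."""
    return [
        (start, min(start + window_len, n_points))
        for start in range(0, n_points, window_shift)
    ]


windows = sliding_windows(3500)
# windows == [(0, 2000), (1000, 3000), (2000, 3500), (3000, 3500)]
```

Any index between 1000 and 2000 falls into both the first and the second window, so results from the two windows may need to be merged or deduplicated afterwards.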

skripts.FIA.FIA.combine_spectra_experiments(spectra_container: Sequence[_MSExperimentDF | MSSpectrum]) _MSExperimentDF[source]

Combines all spectra/experiments into different spectra in one experiment.

Parameters:

spectra_container (Sequence[Union[oms.MSExperiment,oms.MSSpectrum]]) – Input experiments

Returns:

Experiment with summed intensities along m/z axis

Return type:

oms.MSExperiment

skripts.FIA.FIA.consensus_features_linking(feature_maps: list, feature_grouper_type: str = 'QT') _ConsensusMapDF[source]

Linking features by consensus voting.

Parameters:
  • feature_maps (list) – List of feature maps

  • feature_grouper_type (str, optional) – Quality threshold clustering (QT) or k-dimensional tree clustering (KD), defaults to “QT”

Raises:

ValueError – Use QT or KD for feature groupers.

Returns:

Consensus map of features

Return type:

pyopenms.ConsensusMap

skripts.FIA.FIA.consensus_map_to_df(consensus_map: _ConsensusMapDF) DataFrame[source]

Transforms a consensus map into a dataframe.

Parameters:

consensus_map (pyopenms.ConsensusMap) – Input consensus map

Returns:

Dataframe from consensus map

Return type:

pandas.DataFrame

skripts.FIA.FIA.copy_experiment(experiment: _MSExperimentDF) _MSExperimentDF[source]

Makes a complete (recursive) copy of an experiment.

Parameters:

experiment (pyopenms.MSExperiment) – Experiment

Returns:

Copy of experiment.

Return type:

pyopenms.MSExperiment

skripts.FIA.FIA.define_metabolite_table(path_to_library_file: str, mass_range: list) list[source]

Read tsv file and create list of FeatureFinderMetaboIdentCompound.

Parameters:
  • path_to_library_file (str) – Path to library file

  • mass_range (list) – Range of m/z values

Returns:

Metabolite table

Return type:

list

skripts.FIA.FIA.deisotope_experiment(experiment: _MSExperimentDF | str, fragment_tolerance: float = 0.1, fragment_unit_ppm: bool = False, min_charge: int = 1, max_charge: int = 3, keep_only_deisotoped: bool = True, min_isopeaks: int = 2, max_isopeaks: int = 10, make_single_charged: bool = True, annotate_charge: bool = True, annotate_iso_peak_count: bool = True, use_decreasing_model: bool = True, start_intensity_check: bool = False, add_up_intensity: bool = False, deepcopy: bool = False)[source]

Attempt to combine isotopes in an experiment through exhaustive calculation.

Parameters:
  • experiment (Union[oms.MSExperiment, str]) – Input experiment

  • fragment_tolerance (float, optional) – Tolerance for fragments, defaults to 0.1

  • fragment_unit_ppm (bool, optional) – Use ppm as fragmentation unit, defaults to False

  • min_charge (int, optional) – Minimal charge, defaults to 1

  • max_charge (int, optional) – Maximum charge, defaults to 3

  • keep_only_deisotoped (bool, optional) – Keep only deisotoped signals, defaults to True

  • min_isopeaks (int, optional) – Minimum amount of isotopes for a signal, defaults to 2

  • max_isopeaks (int, optional) – Maximum amount of isotopes in a signal, defaults to 10

  • make_single_charged (bool, optional) – Adapt isotopes to single hydrogen adducts/deducts, defaults to True

  • annotate_charge (bool, optional) – Annotate the charge, defaults to True

  • annotate_iso_peak_count (bool, optional) – Annotate isotopic peak count, defaults to True

  • use_decreasing_model (bool, optional) – Use decreasing model (decreased chance of isotopes with further changes), defaults to True

  • start_intensity_check (bool, optional) – Intensity check at the start, defaults to False

  • add_up_intensity (bool, optional) – Add intensity of isotopes, defaults to False

Returns:

Deisotoped experiment

Return type:

oms.MSExperiment

skripts.FIA.FIA.deisotope_spectrum(spectrum: MSSpectrum, fragment_tolerance: float = 0.1, fragment_unit_ppm: bool = False, min_charge: int = 1, max_charge: int = 3, keep_only_deisotoped: bool = True, min_isopeaks: int = 2, max_isopeaks: int = 10, make_single_charged: bool = True, annotate_charge: bool = True, annotate_iso_peak_count: bool = True, use_decreasing_model: bool = True, start_intensity_check: bool = False, add_up_intensity: bool = False) MSSpectrum[source]

Attempt to combine isotopes in a spectrum through exhaustive calculation.

Parameters:
  • spectrum (pyopenms.MSSpectrum) – Input spectrum

  • fragment_tolerance (float, optional) – Tolerance for fragments, defaults to 0.1

  • fragment_unit_ppm (bool, optional) – Use ppm as fragmentation unit, defaults to False

  • min_charge (int, optional) – Minimal charge, defaults to 1

  • max_charge (int, optional) – Maximum charge, defaults to 3

  • keep_only_deisotoped (bool, optional) – Keep only deisotoped signals, defaults to True

  • min_isopeaks (int, optional) – Minimum amount of isotopes for a signal, defaults to 2

  • max_isopeaks (int, optional) – Maximum amount of isotopes in a signal, defaults to 10

  • make_single_charged (bool, optional) – Adapt isotopes to single hydrogen adducts/deducts, defaults to True

  • annotate_charge (bool, optional) – Annotate the charge, defaults to True

  • annotate_iso_peak_count (bool, optional) – Annotate isotopic peak count, defaults to True

  • use_decreasing_model (bool, optional) – Use decreasing model (decreased chance of isotopes with further changes), defaults to True

  • start_intensity_check (bool, optional) – Intensity check at the start, defaults to False

  • add_up_intensity (bool, optional) – Add intensity of isotopes, defaults to False

Returns:

Deisotoped spectrum

Return type:

pyopenms.MSSpectrum

skripts.FIA.FIA.detect_adducts(feature_maps: list, potential_adducts: str | bytes = '[]', q_try: str = 'feature', mass_max_diff: float = 10.0, unit: str = 'ppm', max_minority_bound: int = 3, verbose_level: int = 0) list[source]

Attempt adduct detection through exhaustive calculations.

Parameters:
  • feature_maps (list) – List of feature maps

  • potential_adducts (Union[str, bytes], optional) – Potential adducts to consider, defaults to “[]”

  • q_try (str, optional) – Charge discovery dimension, defaults to “feature”

  • mass_max_diff (float, optional) – Maximum mass difference, defaults to 10.0

  • unit (str, optional) – Unit of mass difference, defaults to “ppm”

  • max_minority_bound (int, optional) – Maximum minority bound, defaults to 3

  • verbose_level (int, optional) – Verbosity level, defaults to 0

Returns:

Feature maps with removed adducts (deconvoluted)

Return type:

list
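Several tolerances in this module default to the unit “ppm” (parts per million). As a reminder of what a 10 ppm window means, the sketch below computes the relative deviation between two m/z values. The helper name ppm_difference is hypothetical and is not part of the module.

```python
def ppm_difference(observed_mz, reference_mz):
    """Relative deviation of an observed m/z from a reference, in ppm."""
    return abs(observed_mz - reference_mz) / reference_mz * 1e6


# At m/z 500, a deviation of 0.005 corresponds to about 10 ppm.
deviation = ppm_difference(500.005, 500.0)
```

Because the unit is relative, the same ppm tolerance allows a wider absolute window at high m/z than at low m/z.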

skripts.FIA.FIA.dynamic_plot(experiment: _MSExperimentDF, mode: str = 'lines', log: List[str] = ['x']) None[source]

Shows an interactive plot of all spectra in the experiment. May take a long time for large datasets. Recommended after centroiding, or data reduction.

Parameters:
  • experiment (pyopenms.MSExperiment) – Input experiment

  • mode (str, optional) – Mode of display [“lines” | “markers” | “lines+markers” | other plotly.graph_objects options], defaults to “lines”

  • log (List[str], optional) – Axes to log-transform [x,y], defaults to [“x”]

skripts.FIA.FIA.elution_peak_detection(mass_traces: list, chrom_fwhm: float = 10.0, chrom_peak_snr: float = 2.0, width_filtering: str = 'fixed', min_fwhm: float = 1.0, max_fwhm: float = 60.0, masstrace_snr_filtering: str = 'false') list[source]

Elution peak detection along mass traces. Relevant for chromatographic data.

Parameters:
  • mass_traces (list) – List of mass traces

  • chrom_fwhm (float, optional) – Chromatographic full width at half maximum, defaults to 10.0

  • chrom_peak_snr (float, optional) – Minimum signal-to-noise a mass trace should have, defaults to 2.0

  • width_filtering (str, optional) – Type of width filtering, defaults to “fixed”

  • min_fwhm (float, optional) – Minimal full width at half maximum, defaults to 1.0

  • max_fwhm (float, optional) – Maximal full width at half maximum, defaults to 60.0

  • masstrace_snr_filtering (str, optional) – Filtering by signal to noise ratio, defaults to “false”

Returns:

List of final mass traces

Return type:

list

skripts.FIA.FIA.elution_peak_detection_batch(mass_traces_all: list[list], chrom_fwhm: float = 10.0, chrom_peak_snr: float = 2.0, width_filtering: str = 'fixed', min_fwhm: float = 1.0, max_fwhm: float = 60.0, masstrace_snr_filtering: str = 'false') list[list][source]

Elution peak detection along list of mass traces. Relevant for chromatographic data.

Parameters:
  • mass_traces_all (list[list]) – List of list of all mass traces

  • chrom_fwhm (float, optional) – Chromatographic full width at half maximum, defaults to 10.0

  • chrom_peak_snr (float, optional) – Minimum signal-to-noise a mass trace should have, defaults to 2.0

  • width_filtering (str, optional) – Type of width filtering, defaults to “fixed”

  • min_fwhm (float, optional) – Minimal full width at half maximum, defaults to 1.0

  • max_fwhm (float, optional) – Maximal full width at half maximum, defaults to 60.0

  • masstrace_snr_filtering (str, optional) – Filtering by signal to noise ratio, defaults to “false”

Returns:

List of list of final mass traces

Return type:

list[list]

skripts.FIA.FIA.extract_feature_coord(feature: Feature, mzs: ndarray, retention_times: ndarray, intensities: ndarray, labels: ndarray, sub_feat: Feature | None = None) list[source]

Extract feature coordinates for plots

Parameters:
  • feature (oms.Feature) – Input feature

  • mzs (np.ndarray) – m/z values

  • retention_times (np.ndarray) – Retention times

  • intensities (np.ndarray) – Intensity values

  • labels (np.ndarray) – Labels

  • sub_feat (Optional[oms.Feature], optional) – Sub-features, defaults to None

Returns:

List of mzs, retention times, intensities and matching labels for plots

Return type:

list

skripts.FIA.FIA.extract_from_clustering(df: DataFrame, clustering) ndarray[source]

Extract mzs and intensities from clustering

Parameters:
  • df (pandas.DataFrame) – Input dataframe

  • clustering (np.ndarray) – Clustering

Returns:

m/z + inty as paired array

Return type:

np.ndarray

skripts.FIA.FIA.feature_detection_targeted(experiment: _MSExperimentDF | str, metab_table: list, feature_filepath: str | None = None, mz_window: float = 5.0, rt_window: float | None = None, n_isotopes: int = 2, isotope_pmin: float = 0.01, peak_width: float = 60.0) _FeatureMapDF[source]

Feature detection with a given metabolic table.

Parameters:
  • experiment (Union[oms.MSExperiment, str]) – Input experiment

  • metab_table (list) – Metabolites table

  • feature_filepath (Optional[str], optional) – Filepath to featureXML, defaults to None

  • mz_window (float, optional) – m/z window width, defaults to 5.0

  • rt_window (Optional[float], optional) – Retention time window width, defaults to None

  • n_isotopes (int, optional) – Number of considered isotopes, defaults to 2

  • isotope_pmin (float, optional) – Minimal probability of an isotope to be considered, defaults to 0.01

  • peak_width (float, optional) – Standard peak width, defaults to 60.0

Returns:

Feature map

Return type:

oms.FeatureMap

skripts.FIA.FIA.feature_detection_untargeted(experiment: _MSExperimentDF | str, mass_traces_deconvol: list = [], isotope_filtering_model='metabolites (2% RMS)', local_rt_range: float = 3.0, local_mz_range: float = 5.0, charge_lower_bound: int = 1, charge_upper_bound: int = 3, chrom_fwhm: float = 10.0, report_summed_ints: str = 'true', enable_RT_filtering: str = 'false', mz_scoring_13C: str = 'false', use_smoothed_intensities: str = 'false', report_convex_hulls: str = 'true', report_chromatograms: str = 'false', remove_single_traces: str = 'true', mz_scoring_by_elements: str = 'false', elements: str = 'CHNOPS') _FeatureMapDF[source]

Untargeted feature detection in an experiment.

Parameters:
  • experiment (Union[oms.MSExperiment, str]) – Input experiment

  • mass_traces_deconvol (list, optional) – Deconvoluted mass traces, defaults to []

  • isotope_filtering_model (str, optional) – Isotope filtering model, defaults to “metabolites (2% RMS)”

  • local_rt_range (float, optional) – Local retention time range, defaults to 3.0

  • local_mz_range (float, optional) – Local m/z range, defaults to 5.0

  • charge_lower_bound (int, optional) – Lower charge bound, defaults to 1

  • charge_upper_bound (int, optional) – Upper charge bound, defaults to 3

  • chrom_fwhm (float, optional) – Chromatographic full width at half maximum, defaults to 10.0

  • report_summed_ints (str, optional) – Report summed intensities, defaults to “true”

  • enable_RT_filtering (str, optional) – Enable retention time filtering, defaults to “false”

  • mz_scoring_13C (str, optional) – Score m/z by looking at expected Carbon13 peaks, defaults to “false”

  • use_smoothed_intensities (str, optional) – Use smoothed intensities (if smoothed before), defaults to “false”

  • report_convex_hulls (str, optional) – Report convex hulls, defaults to “true”

  • report_chromatograms (str, optional) – Report chromatograms, defaults to “false”

  • remove_single_traces (str, optional) – Remove single traces (only appear at one retention time), defaults to “true”

  • mz_scoring_by_elements (str, optional) – Score m/z by present elements, defaults to “false”

  • elements (str, optional) – Elements to consider, defaults to “CHNOPS”

Returns:

Feature Map

Return type:

pyopenms.FeatureMap

skripts.FIA.FIA.feature_detection_untargeted_batch(experiments: Sequence[_MSExperimentDF | str] | str, file_ending: str = '.mzML', mass_traces_deconvol_all: list[list] = [], isotope_filtering_model='metabolites (2% RMS)', local_rt_range: float = 3.0, local_mz_range: float = 5.0, charge_lower_bound: int = 1, charge_upper_bound: int = 3, chrom_fwhm: float = 10.0, report_summed_ints: str = 'true', enable_RT_filtering: str = 'false', mz_scoring_13C: str = 'false', use_smoothed_intensities: str = 'false', report_convex_hulls: str = 'true', report_chromatograms: str = 'false', remove_single_traces: str = 'true', mz_scoring_by_elements: str = 'false', elements: str = 'CHNOPS') list[_FeatureMapDF][source]

Untargeted feature detection in experiments.

Parameters:
  • experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Input experiments

  • file_ending (str, optional) – File ending, defaults to “.mzML”

  • mass_traces_deconvol_all (list[list], optional) – Deconvoluted mass traces, defaults to []

  • isotope_filtering_model (str, optional) – Isotope filtering model, defaults to “metabolites (2% RMS)”

  • local_rt_range (float, optional) – Local retention time range, defaults to 3.0

  • local_mz_range (float, optional) – Local m/z range, defaults to 5.0

  • charge_lower_bound (int, optional) – Lower charge bound, defaults to 1

  • charge_upper_bound (int, optional) – Upper charge bound, defaults to 3

  • chrom_fwhm (float, optional) – Chromatographic full width at half maximum, defaults to 10.0

  • report_summed_ints (str, optional) – Report summed intensities, defaults to “true”

  • enable_RT_filtering (str, optional) – Enable retention time filtering, defaults to “false”

  • mz_scoring_13C (str, optional) – Score m/z by looking at expected Carbon13 peaks, defaults to “false”

  • use_smoothed_intensities (str, optional) – Use smoothed intensities (if smoothed before), defaults to “false”

  • report_convex_hulls (str, optional) – Report convex hulls, defaults to “true”

  • report_chromatograms (str, optional) – Report chromatograms, defaults to “false”

  • remove_single_traces (str, optional) – Remove single traces (only appear at one retention time), defaults to “true”

  • mz_scoring_by_elements (str, optional) – Score m/z by present elements, defaults to “false”

  • elements (str, optional) – Elements to consider, defaults to “CHNOPS”

Returns:

List of Feature Maps

Return type:

list[pyopenms.FeatureMap]

skripts.FIA.FIA.filter_consensus_map_df(consensus_map_df: DataFrame, max_missing_values: int = 1, min_feature_quality: float | None = 0.8) DataFrame[source]

Filter consensus map DataFrame according to missing values and feature quality.

Parameters:
  • consensus_map_df (pandas.DataFrame) – Input consensus map dataframe

  • max_missing_values (int, optional) – Maximum number of missing values, defaults to 1

  • min_feature_quality (Optional[float], optional) – Minimal quality of feature, defaults to 0.8

Returns:

Filtered consensus map dataframe

Return type:

pandas.DataFrame
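The two filter criteria can be illustrated without pandas. In the sketch below each row is a dict with a "quality" value and a list of per-sample "intensities", where None marks a missing value. These field names and the helper name filter_rows are assumptions for illustration; the column layout of the real consensus map dataframe is not given in this reference.

```python
def filter_rows(rows, max_missing_values=1, min_feature_quality=0.8):
    """Keep rows with few missing intensities and sufficient quality."""
    kept = []
    for row in rows:
        n_missing = sum(value is None for value in row["intensities"])
        if n_missing > max_missing_values:
            continue  # too many samples lack this feature
        if min_feature_quality is not None and row["quality"] < min_feature_quality:
            continue  # quality filter is skipped when the threshold is None
        kept.append(row)
    return kept


rows = [
    {"quality": 0.90, "intensities": [1.0, None, 3.0]},
    {"quality": 0.50, "intensities": [1.0, 2.0, 3.0]},
    {"quality": 0.95, "intensities": [None, None, 3.0]},
]
kept = filter_rows(rows)
```

Only the first row survives with the defaults: the second fails the quality threshold and the third has two missing values.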

skripts.FIA.FIA.find_close(df1: DataFrame, df1_col, df2: DataFrame, df2_col, tolerance=0.001)[source]

Find close values in two dataframes.

Parameters:
  • df1 (pandas.DataFrame) – Input dataframe 1

  • df1_col (all) – Input dataframe 1 matched column

  • df2 (pandas.DataFrame) – Input dataframe 2

  • df2_col (all) – Input dataframe 2 matched column

  • tolerance (float, optional) – Absolute tolerance in difference, defaults to 0.001

Yield:

Dataframe of close values

Return type:

pandas.DataFrame
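Matching with an absolute tolerance can be sketched on plain lists. The helper below is hypothetical and not the implementation of `find_close`; it assumes the two matched columns are numeric and yields index pairs rather than dataframes.

```python
from bisect import bisect_left, bisect_right


def find_close_pairs(values_1, values_2, tolerance=0.001):
    """Yield (i, j) index pairs with |values_1[i] - values_2[j]| <= tolerance."""
    # Sort the second list once so that each lookup is a binary search.
    order = sorted(range(len(values_2)), key=lambda j: values_2[j])
    sorted_2 = [values_2[j] for j in order]
    for i, value in enumerate(values_1):
        low = bisect_left(sorted_2, value - tolerance)
        high = bisect_right(sorted_2, value + tolerance)
        for k in range(low, high):
            yield i, order[k]


observed = [180.0634, 146.0453, 89.0240]
theoretical = [89.0244, 180.0639, 250.0000]
pairs = list(find_close_pairs(observed, theoretical, tolerance=0.001))
```

Sorting one side first keeps the search at O(n log n) instead of comparing every pair of values.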

skripts.FIA.FIA.impute_consensus_map_df(consensus_map_df: DataFrame, n_nearest_neighbours: int = 2) DataFrame[source]

Data imputation with k-nearest-neighbours (kNN).

Parameters:
  • consensus_map_df (pandas.DataFrame) – Input consensus map dataframe

  • n_nearest_neighbours (int, optional) – K nearest neighbours, defaults to 2

Returns:

Consensus map dataframe with imputed values

Return type:

pandas.DataFrame
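The idea behind kNN imputation can be shown with a small pure-Python sketch. It assumes rows are lists of intensities with `None` for missing values and fills each gap with the unweighted mean of the nearest rows. Which kNN implementation `impute_consensus_map_df` uses, and whether it imputes along rows or columns, is not stated in this documentation, so this is a simplified illustration only.

```python
import math


def knn_impute(rows, n_nearest_neighbours=2):
    """Fill None entries with the mean of the k nearest rows that have a value."""

    def distance(a, b):
        # Root-mean-square difference over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [list(row) for row in rows]
    for i, row in enumerate(rows):
        for col, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows with an observed value in this column.
            donors = [
                (distance(row, other), other[col])
                for j, other in enumerate(rows)
                if j != i and other[col] is not None
            ]
            donors = [d for d in donors if d[0] != math.inf]
            donors.sort(key=lambda d: d[0])
            nearest = donors[:n_nearest_neighbours]
            if nearest:
                imputed[i][col] = sum(v for _, v in nearest) / len(nearest)
    return imputed


rows = [
    [1.0, 2.0, None],
    [1.0, 2.0, 4.0],
    [1.0, 2.0, 6.0],
    [9.0, 9.0, 100.0],
]
filled = knn_impute(rows, n_nearest_neighbours=2)
```

The missing value in the first row is replaced by the mean of its two closest neighbours, while the distant fourth row is ignored.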

skripts.FIA.FIA.join_df_by(df: DataFrame, joiner: str, combiner: str) DataFrame[source]

Combines rows of the dataframe that share the same <joiner>, while combining the values of <combiner> as the new index.

Parameters:
  • df (pd.DataFrame) – Input DataFrame

  • joiner (str) – Indicates the column that is the criterion for joining the rows

  • combiner (str) – Indicates the column that should be combined as an identifier

Returns:

Combined DataFrame

Return type:

pd.DataFrame

skripts.FIA.FIA.limit_experiment(experiment: _MSExperimentDF | str, mz_lower_limit: int | float = 0, mz_upper_limit: int | float = 10000, sample_size: int = 100000, statistic: str = 'sum', deepcopy: bool = False) _MSExperimentDF[source]

Limits the range of all spectra in an experiment to <mz_lower_limit> and <mz_upper_limit>. Uniformly samples <sample_size> number of peaks from the spectrum (without replacement).

Parameters:
  • experiment (Union[pyopenms.MSExperiment, str]) – Input experiment

  • mz_lower_limit (Union[int, float], optional) – Lower m/z value limit, defaults to 0

  • mz_upper_limit (Union[int, float], optional) – Upper m/z value limit, defaults to 10000

  • sample_size (int, optional) – Number of sampled peaks from the spectrum, defaults to 100000

  • statistic (str, optional) – Operation to perform on binned values in a window. May take common parameters defined by scipy.stats.binned_statistic, defaults to “sum”

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Returns:

Limited and sampled experiment

Return type:

oms.MSExperiment

skripts.FIA.FIA.limit_spectrum(spectrum: MSSpectrum, mz_lower_limit: int | float, mz_upper_limit: int | float, sample_size: int, statistic: str = 'sum') MSSpectrum[source]

Limits the range of the Spectrum to <mz_lower_limit> and <mz_upper_limit>. Uniformly samples <sample_size> number of peaks from the spectrum (without replacement).

Parameters:
  • spectrum (pyopenms.MSSpectrum) – Input spectrum

  • mz_lower_limit (Union[int, float]) – Lower m/z value limit

  • mz_upper_limit (Union[int, float]) – Upper m/z value limit

  • sample_size (int) – Number of sampled peaks from the spectrum

  • statistic (str, optional) – Operation to perform on binned values in a window. May take common parameters defined by scipy.stats.binned_statistic, defaults to “sum”

Returns:

Limited and sampled spectrum

Return type:

pyopenms.MSSpectrum
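Restricting an m/z range and reducing it to a fixed number of points can be sketched as equal-width binning with a sum statistic. This is a plain-Python stand-in for a binned statistic, under the assumption that `sample_size` corresponds to the number of bins. How `limit_spectrum` maps its arguments to bins internally is not documented here, and `limit_and_bin` is a hypothetical name.

```python
def limit_and_bin(mz, intensity, mz_lower_limit, mz_upper_limit, sample_size):
    """Restrict peaks to an m/z window and sum intensities in equal-width bins."""
    bin_width = (mz_upper_limit - mz_lower_limit) / sample_size
    binned = [0.0] * sample_size
    for m, i in zip(mz, intensity):
        if not (mz_lower_limit <= m <= mz_upper_limit):
            continue  # outside the requested m/z range
        # The upper limit itself is assigned to the last bin.
        index = min(int((m - mz_lower_limit) / bin_width), sample_size - 1)
        binned[index] += i
    # Report each bin by its centre.
    centres = [mz_lower_limit + (k + 0.5) * bin_width for k in range(sample_size)]
    return centres, binned


mz = [49.0, 100.2, 100.4, 150.0, 200.0, 250.0]
intensity = [5.0, 10.0, 20.0, 7.0, 3.0, 9.0]
centres, binned = limit_and_bin(mz, intensity, 100.0, 200.0, sample_size=4)
```

Peaks at 49.0 and 250.0 fall outside the window and are dropped; the two peaks near 100 land in the same bin and are summed.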

skripts.FIA.FIA.load_experiment(experiment: _MSExperimentDF | str, separator: str = '\t') _MSExperimentDF[source]

If a path is given instead of an experiment, loads and returns the experiment from either an .mzML or .mzXML file. Collects garbage with gc.collect() to free memory.

Parameters:
  • experiment (Union[oms.MSExperiment, str]) – Experiment, or Path to experiment

  • separator (str, optional) – Separator of data, defaults to “\t”

Returns:

Experiment

Return type:

pyopenms.MSExperiment

skripts.FIA.FIA.load_experiments(experiments: Sequence[_MSExperimentDF | str] | str, file_ending: str | None = None, separator: str = '\t', data_load: bool = True) Sequence[_MSExperimentDF | str][source]

Load a batch of experiments.

Parameters:

  • experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Experiments, either described by a list of paths or one path as base directory, or an existing experiment

  • file_ending (Optional[str], optional) – Ending of experiment file, defaults to None

  • separator (str, optional) – Separator of data, defaults to “\t”

  • data_load (bool, optional) – Load the data or just combine the base string to a list of full filepaths, defaults to True

Returns:

Experiments

Return type:

Sequence[Union[oms.MSExperiment,str]]

skripts.FIA.FIA.load_fia_df(data_dir: str, file_ending: str, separator: str = '\t', data_load: bool = True, backend=<module 'pandas'>) pandas.DataFrame | polars.DataFrame[source]

Load a Flow injection analysis dataframe, defining important properties.

Parameters:
  • data_dir (str) – Data directory

  • file_ending (str) – Ending of file

  • separator (str, optional) – Separator for file, defaults to “\t”

  • data_load (bool, optional) – Load data or only return list of experiments, defaults to True

  • backend (module, optional) – Use pandas or polars as backend, defaults to pandas

Returns:

Flow injection analysis dataframe

Return type:

Union[pandas.DataFrame, polars.DataFrame]

skripts.FIA.FIA.load_name(experiment: _MSExperimentDF | str, alt_name: str | None = None, file_ending: str | None = None) str[source]

Load the name of an experiment.

Parameters:
  • experiment (Union[oms.MSExperiment, str]) – Experiment

  • alt_name (Optional[str], optional) – Alternative Name if none is found, defaults to None

  • file_ending (Optional[str], optional) – Ending of experiment file, defaults to None

Raises:

ValueError – Raises error if no file name is found and no alt_name is given.

Returns:

Name of experiment or alternative name

Return type:

str

skripts.FIA.FIA.load_names_batch(experiments: Sequence[_MSExperimentDF | str] | str, file_ending: str = '.mzML') List[str][source]

Loads the names of a batch of experiments.

Parameters:
  • experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Experiments

  • file_ending (str, optional) – Ending of experiment file, defaults to “.mzML”

Returns:

List of experiment names

Return type:

List[str]

skripts.FIA.FIA.make_merge_dict(dir: str, file_ending: str = '.mzML') dict[source]

Create a dictionary for merging.

Parameters:
  • dir (str) – Directory to extract samples from.

  • file_ending (str, optional) – File ending, defaults to “.mzML”

Returns:

Dictionary with sample names as keys and paths for merging as values

Return type:

dict

skripts.FIA.FIA.mass_trace_detection(experiment: _MSExperimentDF | str, mass_error_ppm: float = 10.0, noise_threshold_int: float = 1000.0, reestimate_mt_sd: str = 'true', quant_method: str = 'median', trace_termination_criterion: str = 'outlier', trace_termination_outliers: int = 3, min_trace_length: float = 5.0, max_trace_length: float = -1.0) list[source]

Detection of mass traces in experiment over several spectra.

Parameters:
  • experiment (Union[oms.MSExperiment, str]) – Input experiment

  • mass_error_ppm (float, optional) – Mass error in ppm, defaults to 10.0

  • noise_threshold_int (float, optional) – Noise threshold intensity, defaults to 1000.0

  • reestimate_mt_sd (str, optional) – Reestimate mass trace standard deviation during run, defaults to “true”

  • quant_method (str, optional) – Quantification method, defaults to “median”

  • trace_termination_criterion (str, optional) – Criterion to terminate a trace, defaults to “outlier”

  • trace_termination_outliers (int, optional) – Number of cases that fulfil criterion/outliers to break trace, defaults to 3

  • min_trace_length (float, optional) – Minimal trace length, defaults to 5.0

  • max_trace_length (float, optional) – Maximum trace length, defaults to -1.0

Returns:

Mass traces

Return type:

list
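Since `mass_error_ppm` is a relative tolerance, the absolute m/z window grows with the mass. The following hypothetical helpers, which are not part of this module, show the conversion under the usual definition of parts per million.

```python
def ppm_window(mz, mass_error_ppm=10.0):
    """Return the (lower, upper) m/z bounds for a relative tolerance in ppm."""
    delta = mz * mass_error_ppm * 1e-6
    return mz - delta, mz + delta


def within_ppm(mz_observed, mz_reference, mass_error_ppm=10.0):
    """True if the observed m/z deviates by at most mass_error_ppm from the reference."""
    return abs(mz_observed - mz_reference) <= mz_reference * mass_error_ppm * 1e-6


# At m/z 500, 10 ppm corresponds to an absolute window of +/- 0.005.
lower, upper = ppm_window(500.0, mass_error_ppm=10.0)
```

A peak at m/z 500.004 would therefore fall within the default tolerance of a trace at m/z 500, while one at 500.006 would not.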

skripts.FIA.FIA.mass_trace_detection_batch(experiments: Sequence[_MSExperimentDF | str] | str, file_ending: str = '.mzML', mass_error_ppm: float = 10.0, noise_threshold_int: float = 1000.0, reestimate_mt_sd: str = 'true', quant_method: str = 'median', trace_termination_criterion: str = 'outlier', trace_termination_outliers: int = 3, min_trace_length: float = 5.0, max_trace_length: float = -1.0) list[source]

Detection of mass traces over several experiments.

Parameters:
  • experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Input experiments

  • file_ending (str, optional) – File ending, defaults to “.mzML”

  • mass_error_ppm (float, optional) – Mass error in ppm, defaults to 10.0

  • noise_threshold_int (float, optional) – Noise threshold intensity, defaults to 1000.0

  • reestimate_mt_sd (str, optional) – Reestimate mass trace standard deviation during run, defaults to “true”

  • quant_method (str, optional) – Quantification method, defaults to “median”

  • trace_termination_criterion (str, optional) – Criterion to terminate a trace, defaults to “outlier”

  • trace_termination_outliers (int, optional) – Number of cases that fulfil criterion/outliers to break trace, defaults to 3

  • min_trace_length (float, optional) – Minimal trace length, defaults to 5.0

  • max_trace_length (float, optional) – Maximum trace length, defaults to -1.0

Returns:

Mass traces

Return type:

list

skripts.FIA.FIA.merge_batch(experiments: Sequence[_MSExperimentDF | str] | str, run_dir: str, file_ending: str = '.mzML', method: str = 'block_method', mz_binning_width: float = 1.0, mz_binning_width_unit: str = 'ppm', ms_levels: List[int] = [1], sort_blocks: str = 'RT_ascending', rt_block_size: int | None = None, rt_max_length: float = 0.0, spectrum_type: str = 'automatic', rt_range: float | None = 5.0, rt_unit: str = 'scans', rt_FWHM: float = 5.0, cutoff: float = 0.01, precursor_mass_tol: float = 0.0, precursor_max_charge: int = 1, deepcopy: bool = False) str[source]

Merge several spectra into one spectrum (useful for MS1 spectra to amplify signals along near retention times)

Parameters:
  • experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Input experiments

  • run_dir (str) – Run directory

  • file_ending (str, optional) – File ending, defaults to “.mzML”

  • method (str, optional) – Method to perform merging, defaults to “block_method”

  • mz_binning_width (float, optional) – m/z binning width, defaults to 1.0

  • mz_binning_width_unit (str, optional) – m/z binning width unit (ppm/Da), defaults to “ppm”

  • ms_levels (List[int], optional) – MS levels to consider, defaults to [1]

  • sort_blocks (str, optional) – Sort blocks by retention time, defaults to “RT_ascending”

  • rt_block_size (Optional[int], optional) – Block size along retention time, defaults to None

  • rt_max_length (float, optional) – Maximal length of Retention time, defaults to 0.0

  • spectrum_type (str, optional) – Spectrum type determination, defaults to “automatic”

  • rt_range (Optional[float], optional) – Retention time range to merge over, defaults to 5.0

  • rt_unit (str, optional) – Unit to merge over, defaults to “scans”

  • rt_FWHM (float, optional) – Full width at half maximum, defaults to 5.0

  • cutoff (float, optional) – Cutoff value during merging, defaults to 0.01

  • precursor_mass_tol (float, optional) – Precursor mass tolerance, defaults to 0.0

  • precursor_max_charge (int, optional) – Maximal precursor charge, defaults to 1

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Raises:

ValueError – Method needs to be in [block_method|average_tophat|average_gaussian]

Returns:

Directory with merged experiments

Return type:

str

skripts.FIA.FIA.merge_by_mz(id_df_1: DataFrame, id_df_2: DataFrame, mz_tolerance: float = 0.0001) DataFrame[source]

Merge dataframes by mz column.

Parameters:
  • id_df_1 (pandas.DataFrame) – Input dataframe 1

  • id_df_2 (pandas.DataFrame) – Input dataframe 2

  • mz_tolerance (float, optional) – Tolerance of m/z deviation, defaults to 1e-04

Returns:

Merged dataframe

Return type:

pandas.DataFrame

skripts.FIA.FIA.merge_compounds(path_to_tsv: str) DataFrame[source]

Joins entries with equal Mass and SumFormula. CompoundName values are joined with “;”, all other columns with “,”.

Parameters:

path_to_tsv (str) – Path to TSV

Returns:

Merged compounds

Return type:

pd.DataFrame
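The joining rule can be sketched on a list of dicts instead of a TSV file. The `Mass`, `SumFormula` and `CompoundName` columns come from the description above; the `CompoundID` column and its values are hypothetical, and `merge_compound_rows` is an illustrative name rather than part of this module.

```python
KEY_COLUMNS = ("Mass", "SumFormula")


def merge_compound_rows(rows):
    """Merge rows sharing Mass and SumFormula into one row per key."""
    grouped = {}
    for row in rows:
        key = tuple(row[col] for col in KEY_COLUMNS)
        columns = grouped.setdefault(key, {})
        for col, value in row.items():
            if col not in KEY_COLUMNS:
                columns.setdefault(col, []).append(str(value))
    merged = []
    for key, columns in grouped.items():
        out = dict(zip(KEY_COLUMNS, key))
        for col, values in columns.items():
            # Compound names are joined with ";", every other column with ",".
            separator = ";" if col == "CompoundName" else ","
            out[col] = separator.join(values)
        merged.append(out)
    return merged


rows = [
    {"Mass": 180.0634, "SumFormula": "C6H12O6", "CompoundName": "Glucose", "CompoundID": "id_1"},
    {"Mass": 180.0634, "SumFormula": "C6H12O6", "CompoundName": "Fructose", "CompoundID": "id_2"},
    {"Mass": 89.0477, "SumFormula": "C3H7NO2", "CompoundName": "Alanine", "CompoundID": "id_3"},
]
merged_rows = merge_compound_rows(rows)
```

Isomers such as glucose and fructose collapse into a single entry, which reflects that they cannot be told apart by mass alone.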

skripts.FIA.FIA.merge_experiment(experiment: _MSExperimentDF | str, method: str = 'block_method', mz_binning_width: float = 1.0, mz_binning_width_unit: str = 'ppm', ms_levels: List[int] = [1], sort_blocks: str = 'RT_ascending', rt_block_size: int | None = None, rt_max_length: float = 0.0, spectrum_type: str = 'automatic', rt_range: float | None = 5.0, rt_unit: str = 'scans', rt_FWHM: float = 5.0, cutoff: float = 0.01, precursor_mass_tol: float = 0.0, precursor_max_charge: int = 1, deepcopy: bool = False) _MSExperimentDF[source]

Merge several spectra into one spectrum (useful for MS1 spectra to amplify signals)

Parameters:
  • experiment (Union[oms.MSExperiment, str]) – Input experiment

  • method (str, optional) – Method to perform merging, defaults to “block_method”

  • mz_binning_width (float, optional) – m/z binning width, defaults to 1.0

  • mz_binning_width_unit (str, optional) – m/z binning width unit (ppm/Da), defaults to “ppm”

  • ms_levels (List[int], optional) – MS levels to consider, defaults to [1]

  • sort_blocks (str, optional) – Sort blocks by retention time, defaults to “RT_ascending”

  • rt_block_size (Optional[int], optional) – Block size along retention time, defaults to None

  • rt_max_length (float, optional) – Maximal length of Retention time, defaults to 0.0

  • spectrum_type (str, optional) – Spectrum type determination, defaults to “automatic”

  • rt_range (Optional[float], optional) – Retention time range to merge over, defaults to 5.0

  • rt_unit (str, optional) – Unit to merge over, defaults to “scans”

  • rt_FWHM (float, optional) – Full width at half maximum, defaults to 5.0

  • cutoff (float, optional) – Cutoff value during merging, defaults to 0.01

  • precursor_mass_tol (float, optional) – Precursor mass tolerance, defaults to 0.0

  • precursor_max_charge (int, optional) – Maximal precursor charge, defaults to 1

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Raises:

ValueError – Method needs to be in [block_method|average_tophat|average_gaussian]

Returns:

Merged experiment

Return type:

pyopenms.MSExperiment

skripts.FIA.FIA.merge_experiments(experiments: Sequence[_MSExperimentDF | str] | str, run_dir: str, file_ending: str = '.mzML', method: str = 'block_method', mz_binning_width: float = 1.0, mz_binning_width_unit: str = 'ppm', ms_levels: List[int] = [1], sort_blocks: str = 'RT_ascending', rt_block_size: int | None = None, rt_max_length: float = 0.0, spectrum_type: str = 'automatic', rt_range: float | None = 5.0, rt_unit: str = 'scans', rt_FWHM: float = 5.0, cutoff: float = 0.01, precursor_mass_tol: float = 0.0, precursor_max_charge: int = 1, deepcopy: bool = False) _MSExperimentDF[source]

Merge several spectra into one spectrum (useful for MS1 spectra to amplify signals along near retention times). Combines all spectra of given experiments into one experiment to merge over.

Parameters:
  • experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Input experiments

  • run_dir (str) – Run directory

  • file_ending (str, optional) – File ending, defaults to “.mzML”

  • method (str, optional) – Method to perform merging, defaults to “block_method”

  • mz_binning_width (float, optional) – m/z binning width, defaults to 1.0

  • mz_binning_width_unit (str, optional) – m/z binning width unit (ppm/Da), defaults to “ppm”

  • ms_levels (List[int], optional) – MS levels to consider, defaults to [1]

  • sort_blocks (str, optional) – Sort blocks by retention time, defaults to “RT_ascending”

  • rt_block_size (Optional[int], optional) – Block size along retention time, defaults to None

  • rt_max_length (float, optional) – Maximal length of Retention time, defaults to 0.0

  • spectrum_type (str, optional) – Spectrum type determination, defaults to “automatic”

  • rt_range (Optional[float], optional) – Retention time range to merge over, defaults to 5.0

  • rt_unit (str, optional) – Unit to merge over, defaults to “scans”

  • rt_FWHM (float, optional) – Full width at half maximum, defaults to 5.0

  • cutoff (float, optional) – Cutoff value during merging, defaults to 0.01

  • precursor_mass_tol (float, optional) – Precursor mass tolerance, defaults to 0.0

  • precursor_max_charge (int, optional) – Maximal precursor charge, defaults to 1

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Raises:

ValueError – Method needs to be in [block_method|average_tophat|average_gaussian]

Returns:

Merged experiment

Return type:

pyopenms.MSExperiment

skripts.FIA.FIA.merge_mz_tolerance(comb_df: DataFrame, charge: int = 1, tolerance: float = 0.001, binned: bool = False) DataFrame[source]

Weighted average of m/z values that are within absolute tolerance of a row in the primary dataframe.

Parameters:
  • comb_df (pandas.DataFrame) – Dataframe for merging

  • charge (int, optional) – Theoretical charge, defaults to 1

  • tolerance (float, optional) – Tolerance of m/z difference, defaults to 1e-3

  • binned (bool, optional) – Values already binned (simpler process), defaults to False

Returns:

Merged Dataframe

Return type:

pandas.DataFrame
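An intensity-weighted average of neighbouring m/z values can be sketched as follows. The example assumes peaks are `(mz, intensity)` tuples with positive intensities and groups sorted peaks whose gap to the previous peak is within the tolerance. The `charge` and `binned` options of `merge_mz_tolerance` are not modelled, and its actual grouping rule may differ.

```python
def collapse(group):
    """Intensity-weighted mean m/z and summed intensity of one group."""
    total = sum(inty for _, inty in group)
    mean_mz = sum(mz * inty for mz, inty in group) / total
    return mean_mz, total


def merge_close_mz(peaks, tolerance=0.001):
    """Merge (mz, intensity) peaks closer than `tolerance` into weighted averages."""
    merged = []
    group = []
    for mz, inty in sorted(peaks):
        # Start a new group when the gap to the previous peak exceeds the tolerance.
        if group and mz - group[-1][0] > tolerance:
            merged.append(collapse(group))
            group = []
        group.append((mz, inty))
    if group:
        merged.append(collapse(group))
    return merged


peaks = [(100.0000, 1.0), (200.0, 2.0), (100.0004, 3.0)]
merged = merge_close_mz(peaks, tolerance=0.001)
```

The two peaks near m/z 100 merge into one whose position is pulled towards the more intense peak, while the peak at m/z 200 is left unchanged.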

skripts.FIA.FIA.mnx_to_oms(df: DataFrame) DataFrame[source]

Turns a dataframe from MetaNetX into the required format by pyopenms for feature detection.

Parameters:

df (pandas.DataFrame) – Input dataframe from MetaNetX

Returns:

DataFrame with essential information.

Return type:

pandas.DataFrame

skripts.FIA.FIA.normalize_spectra(experiment: _MSExperimentDF | str, normalization_method: str = 'to_one', deepcopy: bool = False) _MSExperimentDF[source]

Normalizes spectra by specified method.

Parameters:
  • experiment (Union[oms.MSExperiment, str]) – Input experiment

  • normalization_method (str, optional) – Normalization method in [to_TIC|to_one], defaults to “to_one”

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Returns:

Normalized experiment along spectra

Return type:

oms.MSExperiment
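The two normalization methods differ only in the divisor. The sketch below works on a plain list of intensities and assumes the common meaning of the two options: “to_one” scales the most intense peak to 1, and “to_TIC” scales the intensities so that they sum to 1 (total ion current). `normalize_intensities` is a hypothetical helper, not the function documented above.

```python
def normalize_intensities(intensities, normalization_method="to_one"):
    """Scale intensities by their maximum ("to_one") or their sum ("to_TIC")."""
    if normalization_method not in ("to_one", "to_TIC"):
        raise ValueError("normalization_method must be 'to_one' or 'to_TIC'")
    if not intensities:
        return []
    if normalization_method == "to_one":
        divisor = max(intensities)
    else:
        divisor = sum(intensities)
    if divisor == 0:
        # An all-zero spectrum cannot be scaled; return it unchanged.
        return list(intensities)
    return [value / divisor for value in intensities]


spectrum = [2.0, 6.0, 8.0]
scaled_to_one = normalize_intensities(spectrum, "to_one")
scaled_to_tic = normalize_intensities(spectrum, "to_TIC")
```

Normalizing to one preserves the base peak as the reference, whereas normalizing to the TIC makes spectra with different total signal comparable.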

skripts.FIA.FIA.plot_feature_map_rt_alignment(ordered_feature_maps: list, legend: bool = False) None[source]

Plot feature map retention time alignment

Parameters:
  • ordered_feature_maps (list) – List of feature maps

  • legend (bool, optional) – Display legend, defaults to False

skripts.FIA.FIA.plot_features_3D(feature_map: _FeatureMapDF, plottype: str = 'scatter') DataFrame[source]

Represents found features in 3D

Parameters:
  • feature_map (oms.FeatureMap) – Input feature map

  • plottype (str, optional) – Type of plot [scatter|surface|line], defaults to “scatter”

Raises:

ValueError – Use [‘surface’, ‘scatter’, ‘line’]

Returns:

Dataframe with 3d features

Return type:

pandas.DataFrame

skripts.FIA.FIA.plot_id_df(id_df: DataFrame, x: str = 'RT', y: str = 'mz') None[source]

Scatterplot dataframe with identified metabolites.

Parameters:
  • id_df (pd.DataFrame) – Input dataframe

  • x (str, optional) – x-axis, defaults to “RT”

  • y (str, optional) – y-axis, defaults to “mz”

skripts.FIA.FIA.plot_mass_traces(mass_traces, sel=[0, 100], x: str = 'rt', y: str = 'mz', z: str = 'int', threed: bool = True)[source]

Plot mass traces in a 3D plot.

Parameters:
  • mass_traces (list) – List of mass traces

  • sel (list, optional) – Selection of convex hull points, defaults to [0,100]

  • x (str, optional) – x-axis column, defaults to “rt”

  • y (str, optional) – y-axis column, defaults to “mz”

  • z (str, optional) – z-axis column, defaults to “int”

  • threed (bool, optional) – 3D plot, defaults to True

Returns:

Plot

Return type:

plotly-express plot

skripts.FIA.FIA.print_params(p)[source]

Print parameters of pyopenms class.

Parameters:

p (dict-like) – Parameters

skripts.FIA.FIA.quick_plot(spectrum: MSSpectrum, xlim: List[float] | None = None, ylim: List[float] | None = None, plottype: str = 'line', log: List[str] = []) Figure[source]

Shows a plot of a spectrum between the defined borders

Parameters:
  • spectrum (pyopenms.MSSpectrum) – Input spectrum

  • xlim (Optional[List[float]], optional) – x-axis limits, defaults to None

  • ylim (Optional[List[float]], optional) – y-axis limits, defaults to None

  • plottype (str, optional) – Type of plot [line|scatter], defaults to “line”

  • log (List[str], optional) – Axes to log-transform [x,y], defaults to []

Returns:

Figure

Return type:

Figure

skripts.FIA.FIA.read_experiment(experiment_path: str, separator: str = '\t') _MSExperimentDF[source]

Read in an mzXML or mzML file as a pyopenms experiment. If the file is in tabular format, assumes that it is in long form with two columns [“mz”, “inty”].

Parameters:
  • experiment_path (str) – Path to experiment

  • separator (str, optional) – Separator of data, defaults to “\t”

Raises:

ValueError – The experiment path must end with a valid file ending.

Returns:

Experiment

Return type:

pyopenms.MSExperiment

skripts.FIA.FIA.read_feature_map_XML(path_to_featureXML: str) _FeatureMapDF[source]

Reads in feature Map from .featureXML file.

Parameters:

path_to_featureXML (str) – Path to featureXML file.

Returns:

Feature map

Return type:

pyopenms.FeatureMap

skripts.FIA.FIA.read_feature_maps_XML(path_to_featureXMLs: str) list[source]

Reads in feature Maps from file

Parameters:

path_to_featureXMLs (str) – Path to featureXML file directory.

Returns:

List of feature maps

Return type:

list

skripts.FIA.FIA.read_mnx(filepath: str) DataFrame[source]

Read in chem_prop.tsv file from MetaNetX

Parameters:

filepath (str) – Path to file

Returns:

DataFrame

Return type:

pd.DataFrame

skripts.FIA.FIA.separate_feature_maps_pos_neg(feature_maps: list) list[source]

Separate the feature maps into positively and negatively charged feature maps.

Parameters:

feature_maps (list) – Input list of feature maps

Returns:

List of lists of feature maps, separated into positively and negatively charged

Return type:

list

skripts.FIA.FIA.smooth_spectra(experiment: _MSExperimentDF | str, gaussian_width: float, deepcopy: bool = False) _MSExperimentDF[source]

Apply a Gaussian filter to all spectra in an experiment.

Parameters:
  • experiment (Union[pyopenms.MSExperiment, str]) – Input experiment

  • gaussian_width (float) – Window width to apply smoothing to

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Returns:

Smoothed experiment

Return type:

pyopenms.MSExperiment

skripts.FIA.FIA.sns_plot(x, y, hue=None, size=None, xlim: List[float] | None = None, ylim: List[float] | None = None, plottype: str = 'line', log: List[str] = [], sizes: Tuple[int, int] | None = (20, 20), palette: str = 'hls', figsize: Tuple[int, int] | None = (18, 5)) None[source]

Shows a plot of a spectrum between the defined borders.

Parameters:
  • x (array-like) – x-values

  • y (array-like) – y-values

  • hue (all, optional) – Column to group x and y values, defaults to None

  • size (float, optional) – Size of elements, defaults to None

  • xlim (Optional[List[float]], optional) – x-axis limits, defaults to None

  • ylim (Optional[List[float]], optional) – y-axis limits, defaults to None

  • plottype (str, optional) – Type of plot [line|scatter], defaults to “line”

  • log (List[str], optional) – Axes to log-transform [x,y], defaults to []

  • sizes (Optional[Tuple[int, int]], optional) – Sizes of points in relation to y-value, defaults to (20,20)

  • palette (str, optional) – Color palette, defaults to “hls”

  • figsize (Optional[Tuple[int,int]], optional) – Figure size, defaults to (18, 5)

skripts.FIA.FIA.store_experiment(experiment_path: str, experiment: _MSExperimentDF) None[source]

Stores the experiment.

Parameters:
  • experiment_path (str) – Path to be stored at (must end with .mzML or .mzXML)

  • experiment (pyopenms.MSExperiment) – Experiment

skripts.FIA.FIA.store_feature_maps(feature_maps: list, out_dir: str, names: list[str] | str = [], file_ending: str = '.mzML') None[source]

Stores the feature maps as featureXML files.

Parameters:
  • feature_maps (list) – Feature Maps

  • out_dir (str) – Output directory

  • names (Union[list[str], str], optional) – Names of feature maps, defaults to []

  • file_ending (str, optional) – Ending of file, defaults to “.mzML”

skripts.FIA.FIA.sum_spectra(experiment: _MSExperimentDF) MSSpectrum[source]

Sum up spectra in one experiment.

Parameters:

experiment (pyopenms.MSExperiment) – Input experiment

Returns:

Spectrum with summed intensities along m/z axis

Return type:

pyopenms.MSSpectrum
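Summing along the m/z axis can be sketched with plain tuples. The example assumes all spectra share exactly the same m/z grid (for instance after binning), so that intensities at identical m/z values can be added directly. How `sum_spectra` treats slightly differing m/z values is not documented here.

```python
from collections import defaultdict


def sum_spectra_points(spectra):
    """Add up intensities per m/z value over a list of (mz, intensity) spectra."""
    totals = defaultdict(float)
    for spectrum in spectra:
        for mz, inty in spectrum:
            totals[mz] += inty
    # Return the summed spectrum sorted by m/z.
    return sorted(totals.items())


spectra = [
    [(100.0, 1.0), (101.0, 2.0)],
    [(100.0, 3.0), (102.0, 4.0)],
]
summed = sum_spectra_points(spectra)
```

In flow-injection data, where there is no chromatographic separation, collapsing all scans into one summed spectrum is a common way to raise the signal of each m/z value.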

skripts.FIA.FIA.targeted_feature_detection(experiment: _MSExperimentDF | str, compound_library_file: str, mz_window: float = 5.0, rt_window: float | None = None, n_isotopes: int = 2, isotope_pmin: float = 0.01, peak_width: float = 60.0, mass_range: list = [50.0, 10000.0]) _FeatureMapDF[source]

Feature detection with a given metabolic table from a compound library file.

Parameters:
  • experiment (Union[oms.MSExperiment, str]) – Input experiment

  • compound_library_file (str) – Path to compound library file to define metabolic table from.

  • metab_table (list) – Metabolites table

  • feature_filepath (Optional[str], optional) – Filepath to featureXML, defaults to None

  • mz_window (float, optional) – m/z window width, defaults to 5.0

  • rt_window (Optional[float], optional) – Retention time window width, defaults to None

  • n_isotopes (int, optional) – Number of considered isotopes, defaults to 2

  • isotope_pmin (float, optional) – Minimal probability of an isotope to be considered, defaults to 0.01

  • peak_width (float, optional) – Standard peak width, defaults to 60.0

Returns:

Feature map

Return type:

pyopenms.FeatureMap

skripts.FIA.FIA.targeted_features_detection(in_dir: str, run_dir: str, file_ending: str, compound_library_file: str, mz_window: float = 5.0, rt_window: float = 20.0, n_isotopes: int = 2, isotope_pmin: float = 0.01, peak_width: float = 60.0, mass_range: list = [50.0, 10000.0]) list[_FeatureMapDF][source]

Feature detection with a given metabolic table from a compound library file.

Parameters:
  • in_dir (str) – Input directory

  • run_dir (str) – Run directory

  • file_ending (str) – File ending

  • compound_library_file (str) – Path to compound library file to define metabolic table from.

  • metab_table (list) – Metabolites table

  • feature_filepath (Optional[str], optional) – Filepath to featureXML, defaults to None

  • mz_window (float, optional) – m/z window width, defaults to 5.0

  • rt_window (float, optional) – Retention time window width, defaults to 20.0

  • n_isotopes (int, optional) – Number of considered isotopes, defaults to 2

  • isotope_pmin (float, optional) – Minimal probability of an isotope to be considered, defaults to 0.01

  • peak_width (float, optional) – Standard peak width, defaults to 60.0

Returns:

List of feature maps

Return type:

list[pyopenms.FeatureMap]

skripts.FIA.FIA.trim_threshold(experiment: _MSExperimentDF, threshold: float = 0.05) _MSExperimentDF[source]

Removes points below an absolute intensity threshold.

Parameters:
  • experiment (pyopenms.MSExperiment) – Input Experiment

  • threshold (float, optional) – Threshold for values to be excluded, defaults to 0.05

Returns:

Trimmed experiment

Return type:

oms.MSExperiment
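Threshold trimming amounts to a simple filter on intensities. The sketch below uses `(mz, intensity)` tuples and assumes that points exactly at the threshold are kept, which the description above does not specify. `trim_below_threshold` is a hypothetical name.

```python
def trim_below_threshold(peaks, threshold=0.05):
    """Drop (mz, intensity) points whose intensity is below the threshold."""
    return [(mz, inty) for mz, inty in peaks if inty >= threshold]


peaks = [(100.0, 0.01), (150.0, 0.05), (200.0, 1.0)]
trimmed = trim_below_threshold(peaks, threshold=0.05)
```

Because the threshold is absolute, its effect depends on the intensity scale of the data; a small default such as 0.05 is only meaningful if spectra were scaled beforehand, for example by normalization.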

skripts.FIA.FIA.trim_threshold_batch(experiments: Sequence[_MSExperimentDF | str] | str, run_dir: str, file_ending: str = '.mzML', threshold: float = 0.05, deepcopy: bool = False) str[source]

Removes points below an absolute intensity threshold in a batch of experiments.

Parameters:
  • experiments (Union[Sequence[Union[pyopenms.MSExperiment,str]], str]) – Input experiments (as a list of experiments, a directory, or paths to experiments)

  • run_dir (str) – Run directory

  • file_ending (str, optional) – File ending, defaults to “.mzML”

  • threshold (float, optional) – Threshold for values to be excluded, defaults to 0.05

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Returns:

Folder with trimmed experiments

Return type:

str

skripts.FIA.FIA.untargeted_feature_detection(experiment: _MSExperimentDF | str, feature_filepath: str | None = None, mass_error_ppm: float = 5.0, noise_threshold_int: float = 3000.0, charge_lower_bound: int = 1, charge_upper_bound: int = 3, width_filtering: str = 'fixed', isotope_filtering_model='none', remove_single_traces='true', mz_scoring_by_elements='false', report_convex_hulls='true', deepcopy: bool = False) _FeatureMapDF[source]

Untargeted detection of features. Combines mass trace detection, elution peak detection and feature finding.

Parameters:
  • experiment (Union[oms.MSExperiment, str]) – Input experiment

  • feature_filepath (Optional[str], optional) – Path to featureXML file, defaults to None

  • mass_error_ppm (float, optional) – Mass error in ppm, defaults to 5.0

  • noise_threshold_int (float, optional) – Noise threshold intensity, defaults to 3000.0

  • charge_lower_bound (int, optional) – Lower charge bound, defaults to 1

  • charge_upper_bound (int, optional) – Upper charge bound, defaults to 3

  • width_filtering (str, optional) – Type of width filtering, defaults to “fixed”

  • isotope_filtering_model (str, optional) – Isotope filtering model, defaults to “none”

  • remove_single_traces (str, optional) – Remove single traces, defaults to “true”

  • mz_scoring_by_elements (str, optional) – Score m/z by present elements, defaults to “false”

  • report_convex_hulls (str, optional) – Report convex hulls, defaults to “true”

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Returns:

Feature map

Return type:

pyopenms.FeatureMap

skripts.FIA.FIA.untargeted_features_detection(in_dir: str, run_dir: str, file_ending: str = '.mzML', mass_error_ppm: float = 10.0, noise_threshold_int: float = 1000.0, charge_lower_bound: int = 1, charge_upper_bound: int = 3, width_filtering: str = 'fixed', isotope_filtering_model: str = 'none', remove_single_traces: str = 'true', mz_scoring_by_elements: str = 'false', report_convex_hulls: str = 'true', deepcopy: bool = False) list[source]

Untargeted detection of features. Combines mass trace detection, elution peak detection and feature finding.

Parameters:
  • in_dir (str) – Input directory

  • run_dir (str) – Run directory

  • file_ending (str, optional) – File ending, defaults to “.mzML”

  • mass_error_ppm (float, optional) – Mass error in ppm, defaults to 10.0

  • noise_threshold_int (float, optional) – Noise threshold intensity, defaults to 1000.0

  • charge_lower_bound (int, optional) – Lower charge bound, defaults to 1

  • charge_upper_bound (int, optional) – Upper charge bound, defaults to 3

  • width_filtering (str, optional) – Type of width filtering, defaults to “fixed”

  • isotope_filtering_model (str, optional) – Isotope filtering model, defaults to “none”

  • remove_single_traces (str, optional) – Remove single traces, defaults to “true”

  • mz_scoring_by_elements (str, optional) – Score m/z by present elements, defaults to “false”

  • report_convex_hulls (str, optional) – Report convex hulls, defaults to “true”

  • deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False

Returns:

List of feature maps

Return type:

list

skripts.FIA.FIA_oms module

Module contents