skripts.FIA package
Submodules
skripts.FIA.FIA module
Methods for Flow-injection-analysis.
- skripts.FIA.FIA.accurate_mass_search(consensus_map: _ConsensusMapDF, database_dir: str, tmp_dir: str, positive_adducts_file: str, negative_adducts_file: str, HMDBMapping_file: str, HMDB2StructMapping_file: str, ionization_mode: str = 'auto') DataFrame [source]
Assigns metabolites to consensus map masses.
- Parameters:
consensus_map (pyopenms.ConsensusMap) – Input consensus map
database_dir (str) – Database directory
tmp_dir (str) – Directory for temporary saves
positive_adducts_file (str) – File with possible positive adducts
negative_adducts_file (str) – File with possible negative adducts
HMDBMapping_file (str) – HMDBMapping file
HMDB2StructMapping_file (str) – HMDB2StructMapping file
ionization_mode (str, optional) – Ionization mode, defaults to “auto”
- Returns:
Annotated dataframe
- Return type:
pandas.DataFrame
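As a rough illustration of what an accurate mass search does, the sketch below matches observed m/z values against neutral monoisotopic masses plus an adduct shift, within a ppm tolerance. It is a dependency-free toy and not the routine behind accurate_mass_search: the names ADDUCT_SHIFTS, DATABASE, ppm_error and search_masses, the two-compound database, and the 10 ppm tolerance are all assumptions made for this example.

```python
# Hypothetical toy example; not the implementation behind accurate_mass_search.
PROTON_MASS = 1.007276  # Da

# Adduct name -> m/z shift for singly charged ions (assumed minimal set).
ADDUCT_SHIFTS = {"[M+H]+": PROTON_MASS, "[M-H]-": -PROTON_MASS}

# Tiny illustrative database: compound name -> neutral monoisotopic mass in Da.
DATABASE = {"glucose": 180.06339, "alanine": 89.04768}


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative deviation of an observed m/z in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def search_masses(observed_mzs, tolerance_ppm=10.0):
    """Return (observed m/z, compound, adduct, ppm error) for every match."""
    hits = []
    for mz in observed_mzs:
        for name, mass in DATABASE.items():
            for adduct, shift in ADDUCT_SHIFTS.items():
                error = ppm_error(mz, mass + shift)
                if abs(error) <= tolerance_ppm:
                    hits.append((mz, name, adduct, round(error, 2)))
    return hits


hits = search_masses([181.0707, 88.0404, 500.0])
for hit in hits:
    print(hit)
```

The last value (500.0) matches nothing in the toy database and is therefore left unannotated.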
- skripts.FIA.FIA.align_retention_times(feature_maps: list, max_num_peaks_considered: int = -1, max_mz_difference: float = 10.0, mz_unit: str = 'ppm', superimposer_max_scaling: float = 2.0) list [source]
Aligns retention times, using the file with the largest number of features as the reference for alignment. This works well if you have a pooled QC, for example. Returns the aligned map at the first position.
- Parameters:
feature_maps (list) – List of feature maps
max_num_peaks_considered (int, optional) – Maximum number of considered peaks, defaults to -1
max_mz_difference (float, optional) – Maximum m/z difference, defaults to 10.0
mz_unit (str, optional) – Unit for m/z values, defaults to “ppm”
superimposer_max_scaling (float, optional) – Maximum scaling during superimposition, defaults to 2.0
- Returns:
List of feature maps with aligned retention times
- Return type:
list
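The reference selection described above can be sketched without pyopenms. Plain lists stand in for feature maps here, and reference_first is a hypothetical helper name; the actual alignment step is not shown.

```python
def reference_first(feature_maps):
    """Reorder the maps so the one with the most features comes first.

    Assumption for this sketch: len(feature_map) gives the number of features.
    """
    ref_index = max(range(len(feature_maps)), key=lambda i: len(feature_maps[i]))
    reference = feature_maps[ref_index]
    others = feature_maps[:ref_index] + feature_maps[ref_index + 1:]
    return [reference] + others


# Plain lists stand in for feature maps.
maps = [["f1"], ["f1", "f2", "f3"], ["f1", "f2"]]
ordered = reference_first(maps)
print([len(m) for m in ordered])
```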
- skripts.FIA.FIA.annotate_consensus_map_df(consensus_map_df: DataFrame, mass_search_df: DataFrame, result_path: str = '.', mz_tolerance: float = 1e-05) DataFrame [source]
Annotate consensus map DataFrame.
- Parameters:
consensus_map_df (pd.DataFrame) – Input Consensus map DataFrame
mass_search_df (pd.DataFrame) – Mass search DataFrame
result_path (str, optional) – Path to output results, defaults to “.”
mz_tolerance (float, optional) – Tolerance of m/z deviation, defaults to 1e-05
- Returns:
Identified Metabolites DataFrame
- Return type:
pd.DataFrame
- skripts.FIA.FIA.assign_feature_maps_polarity(feature_maps: list, scan_polarity: str | None = None) list [source]
Assigns the polarity to a list of feature maps, depending on “pos”/”neg” in file name.
- Parameters:
feature_maps (list) – List of feature maps
scan_polarity (Optional[str], optional) – Scan polarity, defaults to None
- Returns:
List of feature maps with annotated polarity
- Return type:
list
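A minimal sketch of the file-name rule, assuming that an explicit scan_polarity overrides the file name and that the match is case-insensitive. The helper name polarity_from_filename and the labels "positive"/"negative" are assumptions for this example, not taken from the package.

```python
import os


def polarity_from_filename(path, scan_polarity=None):
    """Guess the polarity from 'pos'/'neg' in the file name (hypothetical helper)."""
    if scan_polarity is not None:
        # Assumed behaviour: an explicit polarity wins over the file name.
        return scan_polarity
    name = os.path.basename(path).lower()
    if "pos" in name:
        return "positive"
    if "neg" in name:
        return "negative"
    return None


print(polarity_from_filename("runs/QC_pos_01.mzML"))
print(polarity_from_filename("runs/QC_NEG_02.mzML"))
```

A plain substring test like this would also fire on names such as "composite.mzML", so consistent file naming matters.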
- skripts.FIA.FIA.batch_download(base_url: str, file_urls: list, save_directory: str) None [source]
Download files from a list into a directory.
- Parameters:
base_url (str) – Base URL
file_urls (list) – Individual file URLs which are appended to the base
save_directory (str) – Directory to save files to.
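How the three arguments could fit together is sketched below. The helper plan_downloads is hypothetical and only builds (URL, target path) pairs without touching the network; the slash handling and the use of the file's base name as the target name are assumptions.

```python
import os


def plan_downloads(base_url, file_urls, save_directory):
    """Pair every full URL with the local path it would be saved to."""
    plan = []
    for file_url in file_urls:
        # Append the file URL to the base, avoiding a doubled or missing slash.
        full_url = base_url.rstrip("/") + "/" + file_url.lstrip("/")
        target = os.path.join(save_directory, os.path.basename(file_url))
        plan.append((full_url, target))
    return plan


plan = plan_downloads("https://example.org/data/", ["a.mzML", "sub/b.mzML"], "downloads")
for full_url, target in plan:
    # An actual download could then call urllib.request.urlretrieve(full_url, target).
    print(full_url, "->", target)
```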
- skripts.FIA.FIA.bin_df_stepwise(df: pandas.DataFrame | polars.DataFrame, binning_var='mz', binned_var='inty', statistic='sum', start: float = 0.0, stop: float = 2000.0, step: float = 0.001, backend=<module 'pandas'>) pandas.DataFrame | polars.DataFrame [source]
Stepwise binning of a dataframe into discrete boxes.
- Parameters:
df (Union[pd.DataFrame, pl.DataFrame]) – Input dataframe
binning_var (str, optional) – Binning variable, i.e. distance defining value (column in dataframe), defaults to “mz”
binned_var (str, optional) – Binned value, i.e. combined value (column in dataframe), defaults to “inty”
statistic (str, optional) – Operation to perform on binned values in a window. May take the common statistics defined by scipy.stats.binned_statistic, defaults to “sum”
start (float, optional) – Starting binning variable point, defaults to 0.0
stop (float, optional) – Stopping binning variable point, defaults to 2000.0
step (float, optional) – Step distance along binning variable, defaults to 0.001
backend (module, optional) – Backend (pandas or polars), defaults to pd
- Returns:
Binned dataframe
- Return type:
Union[pd.DataFrame, pl.DataFrame]
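The idea of stepwise binning can be sketched in plain Python. This is not the pandas/polars implementation used by bin_df_stepwise: the helper name bin_stepwise is hypothetical, only the “sum” statistic is shown, and the half-open bin convention is an assumption.

```python
import math


def bin_stepwise(mzs, intensities, start=0.0, stop=2000.0, step=0.001):
    """Sum intensities into bins [start + i*step, start + (i+1)*step).

    Returns a list of (left bin edge, summed intensity) for non-empty bins.
    """
    totals = {}
    for mz, inty in zip(mzs, intensities):
        if not (start <= mz < stop):
            continue  # values outside the binning range are dropped
        index = math.floor((mz - start) / step)
        totals[index] = totals.get(index, 0.0) + inty
    return [(start + index * step, total) for index, total in sorted(totals.items())]


# Two peaks fall into the same 0.1-wide bin, one into a later bin, one is out of range.
binned = bin_stepwise([100.02, 100.04, 100.31, 2500.0], [10.0, 5.0, 7.0, 1.0], step=0.1)
for left_edge, total in binned:
    print(left_edge, total)
```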
- skripts.FIA.FIA.bin_df_stepwise_batch(experiments: pandas.DataFrame | polars.DataFrame, sample_var: str = 'sample', experiment_var: str = 'experiment', binning_var='mz', binned_var='inty', statistic='sum', start: float = 0.0, stop: float = 2000.0, step: float = 0.001, backend=<module 'pandas'>) pandas.DataFrame | polars.DataFrame [source]
Stepwise binning of multiple dataframes into discrete boxes.
- Parameters:
experiments (Union[pd.DataFrame, pl.DataFrame]) – Input experiments
binning_var (str, optional) – Binning variable, i.e. distance defining value (column in dataframe), defaults to “mz”
binned_var (str, optional) – Binned value, i.e. combined value (column in dataframe), defaults to “inty”
statistic (str, optional) – Operation to perform on binned values in a window. May take the common statistics defined by scipy.stats.binned_statistic, defaults to “sum”
start (float, optional) – Starting binning variable point, defaults to 0.0
stop (float, optional) – Stopping binning variable point, defaults to 2000.0
step (float, optional) – Step distance along binning variable, defaults to 0.001
backend (module, optional) – Backend (pandas or polars), defaults to pd
- Returns:
Binned dataframe
- Return type:
Union[pd.DataFrame, pl.DataFrame]
- skripts.FIA.FIA.bits_to_bytes(bits: float | int, factor: float | int) float [source]
Converts a number of bits to a number of bytes for readability.
- Parameters:
bits (Union[float, int]) – Number of bits to be converted
factor (Union[float, int]) – Exponent of the divisor: the value is divided by 10**factor (e.g. use 9 for GB)
- Returns:
Number of bytes
- Return type:
float
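A one-line sketch of the conversion, under the assumption that the function divides by 8 bits per byte before scaling by 10**factor. The name bits_to_bytes_sketch is used to make clear that this is not the package function itself.

```python
def bits_to_bytes_sketch(bits, factor):
    """Assumed behaviour: 8 bits per byte, then scale down by 10**factor."""
    return bits / 8 / 10**factor


# 16 billion bits expressed in GB (factor 9).
size_gb = bits_to_bytes_sketch(16_000_000_000, 9)
print(size_gb)
```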
- skripts.FIA.FIA.build_directory(dir_path: str) None [source]
Build a new directory in the given path.
- Parameters:
dir_path (str) – Directory path
- skripts.FIA.FIA.centroid_batch(experiments: Sequence[_MSExperimentDF | str] | str, run_dir: str, file_ending: str = '.mzML', instrument: str = 'TOF', signal_to_noise: float = 1.0, spacing_difference_gap: float = 4.0, spacing_difference: float = 1.5, missing: int = 1, ms_levels: List[int] = [], report_FWHM: str = 'true', report_FWHM_unit: str = 'relative', max_intensity: float = -1, auto_max_stdev_factor: float = 3.0, auto_max_percentile: int = 95, auto_mode: int = 0, win_len: float = 200.0, bin_count: int = 30, min_required_elements: int = 10, noise_for_empty_window: float = 1e+20, write_log_messages: str = 'true', peak_width: float = 0.0, sn_bin_count: int = 30, nr_iterations: int = 5, sn_win_len: float = 20.0, check_width_internally: str = 'false', ms1_only: str = 'true', clear_meta_data: str = 'false', deepcopy: bool = False) str [source]
Centroids a batch of experiments, extracted from files in a given directory with a given file ending (i.e. .mzML or .mzXML). Returns the new directory as path/centroids.
- Parameters:
experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Input experiments
run_dir (str) – Run directory
file_ending (str, optional) – File ending, defaults to “.mzML”
instrument (str, optional) – Instrument type (TOF or FT-ICR, Orbitrap), defaults to “TOF”
signal_to_noise (float, optional) – Signal to noise ratio, defaults to 1.0
spacing_difference_gap (float, optional) – Spacing difference gap, defaults to 4.0
spacing_difference (float, optional) – Spacing difference, defaults to 1.5
missing (int, optional) – Number of allowed missing values, defaults to 1
ms_levels (List[int], optional) – MS levels to consider, defaults to []
report_FWHM (str, optional) – Report full width at half maximum, defaults to “true”
report_FWHM_unit (str, optional) – Report full width at half maximum unit, defaults to “relative”
max_intensity (float, optional) – Maximum intensity, defaults to -1
auto_max_stdev_factor (float, optional) – Automatic maximal standard deviation factor, defaults to 3.0
auto_max_percentile (int, optional) – Automatic maximal percentile to consider, defaults to 95
auto_mode (int, optional) – Automatic mode (0/1), defaults to 0
win_len (float, optional) – Window length, defaults to 200.0
bin_count (int, optional) – Number of bins, defaults to 30
min_required_elements (int, optional) – Minimum required elements for a peak, defaults to 10
noise_for_empty_window (float, optional) – Noise value for an empty window, defaults to 1e+20
write_log_messages (str, optional) – Write log messages (true/false), defaults to “true”
peak_width (float, optional) – Expected peak width, defaults to 0.0
sn_bin_count (int, optional) – Signal bin count, defaults to 30
nr_iterations (int, optional) – Iterations to recenter peaks, defaults to 5
sn_win_len (float, optional) – Signal window length, defaults to 20.0
check_width_internally (str, optional) – Check width internally, defaults to “false”
ms1_only (str, optional) – Only MS1 spectrum, defaults to “true”
clear_meta_data (str, optional) – Clear meta data, defaults to “false”
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Returns:
Directory with centroids
- Return type:
str
- skripts.FIA.FIA.centroid_experiment(experiment: _MSExperimentDF | str, instrument: str = 'TOF', signal_to_noise: float = 1.0, spacing_difference_gap: float = 4.0, spacing_difference: float = 1.5, missing: int = 1, ms_levels: List[int] = [], report_FWHM: str = 'true', report_FWHM_unit: str = 'relative', max_intensity: float = -1, auto_max_stdev_factor: float = 3.0, auto_max_percentile: int = 95, auto_mode: int = 0, win_len: float = 200.0, bin_count: int = 30, min_required_elements: int = 10, noise_for_empty_window: float = 1e+20, write_log_messages: str = 'true', peak_width: float = 0.0, sn_bin_count: int = 30, nr_iterations: int = 5, sn_win_len: float = 20.0, check_width_internally: str = 'false', ms1_only: str = 'true', clear_meta_data: str = 'false', deepcopy: bool = False) _MSExperimentDF [source]
Reduce dataset to centroids.
Use case:
    fia_df["cent_experiment"] = fia_df["experiment"].apply(
        lambda experiment: centroid_experiment(
            experiment, instrument="TOF",
            # For All
            signal_to_noise=2.0, spacing_difference=1.5,
            spacing_difference_gap=4.0, missing=1, ms_levels=[1],
            # For Orbitrap
            report_FWHM="true", report_FWHM_unit="relative", max_intensity=-1,
            auto_max_stdev_factor=3.0, auto_max_percentile=95, auto_mode=0,
            win_len=200.0, bin_count=30, min_required_elements=10,
            noise_for_empty_window=1e+20, write_log_messages="true",
            peak_width=0.0, sn_bin_count=30, nr_iterations=5, sn_win_len=20.0,
            # For TOF
            check_width_internally="false", ms1_only="true", clear_meta_data="false",
            deepcopy=False,
        )
    )
- Parameters:
experiment (Union[pyopenms.MSExperiment, str]) – Input experiment
instrument (str, optional) – Instrument type (TOF or FT-ICR, Orbitrap), defaults to “TOF”
signal_to_noise (float, optional) – Signal to noise ratio, defaults to 1.0
spacing_difference_gap (float, optional) – Spacing difference gap, defaults to 4.0
spacing_difference (float, optional) – Spacing difference, defaults to 1.5
missing (int, optional) – Number of allowed missing values, defaults to 1
ms_levels (List[int], optional) – MS levels to consider, defaults to []
report_FWHM (str, optional) – Report full width at half maximum, defaults to “true”
report_FWHM_unit (str, optional) – Report full width at half maximum unit, defaults to “relative”
max_intensity (float, optional) – Maximum intensity, defaults to -1
auto_max_stdev_factor (float, optional) – Automatic maximal standard deviation factor, defaults to 3.0
auto_max_percentile (int, optional) – Automatic maximal percentile to consider, defaults to 95
auto_mode (int, optional) – Automatic mode (0/1), defaults to 0
win_len (float, optional) – Window length, defaults to 200.0
bin_count (int, optional) – Number of bins, defaults to 30
min_required_elements (int, optional) – Minimum required elements for a peak, defaults to 10
noise_for_empty_window (float, optional) – Noise value for an empty window, defaults to 1e+20
write_log_messages (str, optional) – Write log messages (true/false), defaults to “true”
peak_width (float, optional) – Expected peak width, defaults to 0.0
sn_bin_count (int, optional) – Signal bin count, defaults to 30
nr_iterations (int, optional) – Iterations to recenter peaks, defaults to 5
sn_win_len (float, optional) – Signal window length, defaults to 20.0
check_width_internally (str, optional) – Check width internally, defaults to “false”
ms1_only (str, optional) – Only MS1 spectrum, defaults to “true”
clear_meta_data (str, optional) – Clear meta data, defaults to “false”
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Returns:
Centroided experiment
- Return type:
pyopenms.MSExperiment
- skripts.FIA.FIA.check_ending_experiment(file: str) bool [source]
Check whether the file has a mzML or mzXML ending.
- Parameters:
file (str) – Path to file
- Returns:
Ending is mzML or mzXML
- Return type:
bool
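Such a check can be written with str.endswith and a tuple of suffixes. The sketch below is case-sensitive, which is an assumption; the helper name has_experiment_ending is hypothetical.

```python
def has_experiment_ending(file):
    """True if the path ends in .mzML or .mzXML (case-sensitive in this sketch)."""
    return file.endswith((".mzML", ".mzXML"))


print(has_experiment_ending("data/sample_01.mzML"))
print(has_experiment_ending("data/sample_01.raw"))
```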
- skripts.FIA.FIA.clean_dir(dir_path: str, subfolder: str | None = None) str [source]
Delete a directory or its subfolder and reconstruct the directory.
- Parameters:
dir_path (str) – Directory path
subfolder (Optional[str], optional) – Subfolder path, defaults to None
- Returns:
Directory path
- Return type:
str
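The delete-and-rebuild pattern can be sketched with the standard library. Here clean_dir_sketch is a hypothetical stand-in, and returning the rebuilt path (including the subfolder, if given) is an assumption about the behaviour.

```python
import os
import shutil
import tempfile


def clean_dir_sketch(dir_path, subfolder=None):
    """Remove the target directory with all its contents, then recreate it empty."""
    target = os.path.join(dir_path, subfolder) if subfolder else dir_path
    if os.path.isdir(target):
        shutil.rmtree(target)
    os.makedirs(target)
    return target


# Demonstration inside a throwaway temporary directory.
with tempfile.TemporaryDirectory() as root:
    run_dir = os.path.join(root, "run")
    os.makedirs(run_dir)
    open(os.path.join(run_dir, "old.txt"), "w").close()
    cleaned = clean_dir_sketch(run_dir)
    remaining = os.listdir(cleaned)

print(remaining)
```

Because everything below the target is deleted, a function of this kind should only be pointed at directories that hold regenerable output.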
- skripts.FIA.FIA.cluster_matlab(df: DataFrame, height_lim: int = 1000, prominence_lim: int = 1000, threshold: float = 0.004900000000000001)[source]
Clusters according to the FIA MATLAB routine.
- Parameters:
df (pd.DataFrame) – Input dataframe
height_lim (int, optional) – height limit, defaults to 1000
prominence_lim (int, optional) – prominence limit, defaults to 1000
threshold (float, optional) – threshold to cut off values, defaults to (7e-2)**2
- Returns:
m/z + inty as paired array
- Return type:
np.ndarray
- skripts.FIA.FIA.cluster_sliding_window(comb_experiment: _MSExperimentDF, height_lim: int = 1000, prominence_lim: int = 1000, window_len: int = 2000, window_shift=1000, threshold: float = 0.004900000000000001)[source]
Applies clustering over sliding window in an experiment. The result may contain duplicates or close to duplicates.
- Parameters:
comb_experiment (pyopenms.MSExperiment) – Input experiment
height_lim (int, optional) – height limit, defaults to 1000
prominence_lim (int, optional) – prominence limit, defaults to 1000
window_len (int, optional) – Window length, defaults to 2000
window_shift (int, optional) – Window shift, defaults to 1000
threshold (float, optional) – threshold to cut off values, defaults to (7e-2)**2
- Returns:
Clustered experiment
- Return type:
pyopenms.MSExperiment
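Why the result may contain duplicates becomes clearer when the windows are written out. The sketch below only generates window index ranges and does no clustering; sliding_windows is a hypothetical helper, and applying the windows to point indices is an assumption.

```python
def sliding_windows(n_points, window_len=2000, window_shift=1000):
    """Yield (start, end) index pairs of overlapping windows covering n_points."""
    start = 0
    while start < n_points:
        yield start, min(start + window_len, n_points)
        if start + window_len >= n_points:
            break  # the last window already reaches the end of the data
        start += window_shift


windows = list(sliding_windows(4500))
print(windows)
```

With window_shift smaller than window_len, most points fall into two consecutive windows, so a peak found in both windows is reported twice unless the results are merged afterwards.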
- skripts.FIA.FIA.combine_spectra_experiments(spectra_container: Sequence[_MSExperimentDF | MSSpectrum]) _MSExperimentDF [source]
Combines all spectra/experiments into separate spectra in one experiment.
- Parameters:
spectra_container (Sequence[Union[oms.MSExperiment,oms.MSSpectrum]]) – Input experiments
- Returns:
Experiment with summed intensities along m/z axis
- Return type:
oms.MSExperiment
- skripts.FIA.FIA.consensus_features_linking(feature_maps: list, feature_grouper_type: str = 'QT') _ConsensusMapDF [source]
Linking features by consensus voting.
- Parameters:
feature_maps (list) – List of feature maps
feature_grouper_type (str, optional) – Quality threshold clustering (QT) or k-dimensional tree clustering (KD), defaults to “QT”
- Raises:
ValueError – Use QT or KD for feature groupers.
- Returns:
Consensus map of features
- Return type:
pyopenms.ConsensusMap
- skripts.FIA.FIA.consensus_map_to_df(consensus_map: _ConsensusMapDF) DataFrame [source]
Transforms a consensus map into a dataframe.
- Parameters:
consensus_map (pyopenms.ConsensusMap) – Input consensus map
- Returns:
Dataframe from consensus map
- Return type:
pandas.DataFrame
- skripts.FIA.FIA.copy_experiment(experiment: _MSExperimentDF) _MSExperimentDF [source]
Makes a complete (recursive) copy of an experiment.
- Parameters:
experiment (pyopenms.MSExperiment) – Experiment
- Returns:
Copy of experiment.
- Return type:
pyopenms.MSExperiment
- skripts.FIA.FIA.define_metabolite_table(path_to_library_file: str, mass_range: list) list [source]
Read tsv file and create list of FeatureFinderMetaboIdentCompound.
- Parameters:
path_to_library_file (str) – Path to library file
mass_range (list) – Range of m/z values
- Returns:
Metabolite table
- Return type:
list
- skripts.FIA.FIA.deisotope_experiment(experiment: _MSExperimentDF | str, fragment_tolerance: float = 0.1, fragment_unit_ppm: bool = False, min_charge: int = 1, max_charge: int = 3, keep_only_deisotoped: bool = True, min_isopeaks: int = 2, max_isopeaks: int = 10, make_single_charged: bool = True, annotate_charge: bool = True, annotate_iso_peak_count: bool = True, use_decreasing_model: bool = True, start_intensity_check: bool = False, add_up_intensity: bool = False, deepcopy: bool = False)[source]
Attempt to combine isotopes in an experiment through exhaustive calculation.
- Parameters:
experiment (Union[oms.MSExperiment, str]) – Input experiment
fragment_tolerance (float, optional) – Tolerance for fragments, defaults to 0.1
fragment_unit_ppm (bool, optional) – Use ppm as fragmentation unit, defaults to False
min_charge (int, optional) – Minimal charge, defaults to 1
max_charge (int, optional) – Maximum charge, defaults to 3
keep_only_deisotoped (bool, optional) – Keep only deisotoped signals, defaults to True
min_isopeaks (int, optional) – Minimum amount of isotopes for a signal, defaults to 2
max_isopeaks (int, optional) – Maximum amount of isotopes in a signal, defaults to 10
make_single_charged (bool, optional) – Adapt isotopes to single hydrogen adducts/deducts, defaults to True
annotate_charge (bool, optional) – Annotate the charge, defaults to True
annotate_iso_peak_count (bool, optional) – Annotate isotopic peak count, defaults to True
use_decreasing_model (bool, optional) – Use decreasing model (decreased chance of isotopes with further changes), defaults to True
start_intensity_check (bool, optional) – Intensity check at the start, defaults to False
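add_up_intensity (bool, optional) – Add intensity of isotopes, defaults to False
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False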
- Returns:
Deisotoped experiment
- Return type:
oms.MSExperiment
- skripts.FIA.FIA.deisotope_spectrum(spectrum: MSSpectrum, fragment_tolerance: float = 0.1, fragment_unit_ppm: bool = False, min_charge: int = 1, max_charge: int = 3, keep_only_deisotoped: bool = True, min_isopeaks: int = 2, max_isopeaks: int = 10, make_single_charged: bool = True, annotate_charge: bool = True, annotate_iso_peak_count: bool = True, use_decreasing_model: bool = True, start_intensity_check: bool = False, add_up_intensity: bool = False) MSSpectrum [source]
Attempt to combine isotopes in a spectrum through exhaustive calculation.
- Parameters:
spectrum (pyopenms.MSSpectrum) – Input spectrum
fragment_tolerance (float, optional) – Tolerance for fragments, defaults to 0.1
fragment_unit_ppm (bool, optional) – Use ppm as fragmentation unit, defaults to False
min_charge (int, optional) – Minimal charge, defaults to 1
max_charge (int, optional) – Maximum charge, defaults to 3
keep_only_deisotoped (bool, optional) – Keep only deisotoped signals, defaults to True
min_isopeaks (int, optional) – Minimum amount of isotopes for a signal, defaults to 2
max_isopeaks (int, optional) – Maximum amount of isotopes in a signal, defaults to 10
make_single_charged (bool, optional) – Adapt isotopes to single hydrogen adducts/deducts, defaults to True
annotate_charge (bool, optional) – Annotate the charge, defaults to True
annotate_iso_peak_count (bool, optional) – Annotate isotopic peak count, defaults to True
use_decreasing_model (bool, optional) – Use decreasing model (decreased chance of isotopes with further changes), defaults to True
start_intensity_check (bool, optional) – Intensity check at the start, defaults to False
add_up_intensity (bool, optional) – Add intensity of isotopes, defaults to False
- Returns:
Deisotoped spectrum
- Return type:
pyopenms.MSSpectrum
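The role of min_charge and max_charge can be illustrated with the spacing rule that deisotoping relies on: isotopic peaks of an ion with charge z lie about 1.00335/z apart in m/z. The sketch below only tests a single pair of peaks and is not the deisotoping algorithm itself; isotope_charge is a hypothetical helper, and the 0.01 tolerance is chosen for this example.

```python
C13_C12_DIFF = 1.0033548  # mass difference between 13C and 12C in Da


def isotope_charge(mz_mono, mz_other, min_charge=1, max_charge=3, tolerance=0.01):
    """Return the charge for which mz_other is the next isotopic peak, else None."""
    spacing = mz_other - mz_mono
    for charge in range(min_charge, max_charge + 1):
        if abs(spacing - C13_C12_DIFF / charge) <= tolerance:
            return charge
    return None


print(isotope_charge(181.0707, 182.0741))  # spacing of about 1.003
print(isotope_charge(500.0000, 500.5017))  # spacing of about 0.502
print(isotope_charge(500.0000, 500.8000))  # no isotopic spacing
```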
- skripts.FIA.FIA.detect_adducts(feature_maps: list, potential_adducts: str | bytes = '[]', q_try: str = 'feature', mass_max_diff: float = 10.0, unit: str = 'ppm', max_minority_bound: int = 3, verbose_level: int = 0) list [source]
Attempt adduct detection through exhaustive calculations.
- Parameters:
feature_maps (list) – List of feature maps
potential_adducts (Union[str, bytes], optional) – Potential adducts to consider, defaults to “[]”
q_try (str, optional) – Charge discovery dimension, defaults to “feature”
mass_max_diff (float, optional) – Maximum mass difference, defaults to 10.0
unit (str, optional) – Unit of mass difference, defaults to “ppm”
max_minority_bound (int, optional) – Maximum minority bound, defaults to 3
verbose_level (int, optional) – Verbosity level, defaults to 0
- Returns:
Feature maps with removed adducts (deconvoluted)
- Return type:
list
- skripts.FIA.FIA.dynamic_plot(experiment: _MSExperimentDF, mode: str = 'lines', log: List[str] = ['x']) None [source]
Shows an interactive plot of all spectra in the experiment. May take a long time for large datasets. Recommended after centroiding, or data reduction.
- Parameters:
experiment (pyopenms.MSExperiment) – Input experiment
mode (str, optional) – Mode of display [“lines” | “markers” | “lines+markers” | other plotly.graph_objects options], defaults to “lines”
log (List[str], optional) – Axes to log-transform [x,y], defaults to [“x”]
- skripts.FIA.FIA.elution_peak_detection(mass_traces: list, chrom_fwhm: float = 10.0, chrom_peak_snr: float = 2.0, width_filtering: str = 'fixed', min_fwhm: float = 1.0, max_fwhm: float = 60.0, masstrace_snr_filtering: str = 'false') list [source]
Elution peak detection along mass traces. Relevant for chromatographic data.
- Parameters:
mass_traces (list) – List of mass traces
chrom_fwhm (float, optional) – Chromatographic full width at half maximum, defaults to 10.0
chrom_peak_snr (float, optional) – Minimum signal-to-noise a mass trace should have, defaults to 2.0
width_filtering (str, optional) – Type of width filtering, defaults to “fixed”
min_fwhm (float, optional) – Minimal full width at half maximum, defaults to 1.0
max_fwhm (float, optional) – Maximal full width at half maximum, defaults to 60.0
masstrace_snr_filtering (str, optional) – Filtering by signal to noise ratio, defaults to “false”
- Returns:
List of final mass traces
- Return type:
list
- skripts.FIA.FIA.elution_peak_detection_batch(mass_traces_all: list[list], chrom_fwhm: float = 10.0, chrom_peak_snr: float = 2.0, width_filtering: str = 'fixed', min_fwhm: float = 1.0, max_fwhm: float = 60.0, masstrace_snr_filtering: str = 'false') list[list] [source]
Elution peak detection along list of mass traces. Relevant for chromatographic data.
- Parameters:
mass_traces_all (list[list]) – List of list of all mass traces
chrom_fwhm (float, optional) – Chromatographic full width at half maximum, defaults to 10.0
chrom_peak_snr (float, optional) – Minimum signal-to-noise a mass trace should have, defaults to 2.0
width_filtering (str, optional) – Type of width filtering, defaults to “fixed”
min_fwhm (float, optional) – Minimal full width at half maximum, defaults to 1.0
max_fwhm (float, optional) – Maximal full width at half maximum, defaults to 60.0
masstrace_snr_filtering (str, optional) – Filtering by signal to noise ratio, defaults to “false”
- Returns:
List of list of final mass traces
- Return type:
list[list]
- skripts.FIA.FIA.extract_feature_coord(feature: Feature, mzs: ndarray, retention_times: ndarray, intensities: ndarray, labels: ndarray, sub_feat: Feature | None = None) list [source]
Extract feature coordinates for plots
- Parameters:
feature (oms.Feature) – Input feature
mzs (np.ndarray) – m/z values
retention_times (np.ndarray) – Retention times
intensities (np.ndarray) – Intensity values
labels (np.ndarray) – Labels
sub_feat (Optional[oms.Feature], optional) – Sub-features, defaults to None
- Returns:
List of mzs, retention times, intensities and matching labels for plots
- Return type:
list
- skripts.FIA.FIA.extract_from_clustering(df: DataFrame, clustering) ndarray [source]
Extract mzs and intensities from clustering
- Parameters:
df (pandas.DataFrame) – Input dataframe
clustering (np.ndarray) – Clustering
- Returns:
m/z + inty as paired array
- Return type:
np.ndarray
- skripts.FIA.FIA.feature_detection_targeted(experiment: _MSExperimentDF | str, metab_table: list, feature_filepath: str | None = None, mz_window: float = 5.0, rt_window: float | None = None, n_isotopes: int = 2, isotope_pmin: float = 0.01, peak_width: float = 60.0) _FeatureMapDF [source]
Feature detection with a given metabolic table.
- Parameters:
experiment (Union[oms.MSExperiment, str]) – Input experiment
metab_table (list) – Metabolites table
feature_filepath (Optional[str], optional) – File path to featureXML, defaults to None
mz_window (float, optional) – m/z window width, defaults to 5.0
rt_window (Optional[float], optional) – Retention time window width, defaults to None
n_isotopes (int, optional) – Number of considered isotopes, defaults to 2
isotope_pmin (float, optional) – Minimal probability of an isotope to be considered, defaults to 0.01
peak_width (float, optional) – Standard peak width, defaults to 60.0
- Returns:
Feature map
- Return type:
oms.FeatureMap
- skripts.FIA.FIA.feature_detection_untargeted(experiment: _MSExperimentDF | str, mass_traces_deconvol: list = [], isotope_filtering_model='metabolites (2% RMS)', local_rt_range: float = 3.0, local_mz_range: float = 5.0, charge_lower_bound: int = 1, charge_upper_bound: int = 3, chrom_fwhm: float = 10.0, report_summed_ints: str = 'true', enable_RT_filtering: str = 'false', mz_scoring_13C: str = 'false', use_smoothed_intensities: str = 'false', report_convex_hulls: str = 'true', report_chromatograms: str = 'false', remove_single_traces: str = 'true', mz_scoring_by_elements: str = 'false', elements: str = 'CHNOPS') _FeatureMapDF [source]
Untargeted feature detection in an experiment.
- Parameters:
experiment (Union[oms.MSExperiment, str]) – Input experiment
mass_traces_deconvol (list, optional) – Deconvoluted mass traces, defaults to []
isotope_filtering_model (str, optional) – Isotope filtering model, defaults to “metabolites (2% RMS)”
local_rt_range (float, optional) – Local retention time range, defaults to 3.0
local_mz_range (float, optional) – Local m/z range, defaults to 5.0
charge_lower_bound (int, optional) – Lower charge bound, defaults to 1
charge_upper_bound (int, optional) – Upper charge bound, defaults to 3
chrom_fwhm (float, optional) – Chromatographic full width at half maximum, defaults to 10.0
report_summed_ints (str, optional) – Report summed intensities, defaults to “true”
enable_RT_filtering (str, optional) – Enable retention time filtering, defaults to “false”
mz_scoring_13C (str, optional) – Score m/z by looking at expected Carbon13 peaks, defaults to “false”
use_smoothed_intensities (str, optional) – Use smoothed intensities (if smoothed before), defaults to “false”
report_convex_hulls (str, optional) – Report convex hulls, defaults to “true”
report_chromatograms (str, optional) – Report chromatograms, defaults to “false”
remove_single_traces (str, optional) – Remove single traces (only appear at one retention time), defaults to “true”
mz_scoring_by_elements (str, optional) – Score m/z by present elements, defaults to “false”
elements (str, optional) – Elements to consider, defaults to “CHNOPS”
- Returns:
Feature Map
- Return type:
pyopenms.FeatureMap
- skripts.FIA.FIA.feature_detection_untargeted_batch(experiments: Sequence[_MSExperimentDF | str] | str, file_ending: str = '.mzML', mass_traces_deconvol_all: list[list] = [], isotope_filtering_model='metabolites (2% RMS)', local_rt_range: float = 3.0, local_mz_range: float = 5.0, charge_lower_bound: int = 1, charge_upper_bound: int = 3, chrom_fwhm: float = 10.0, report_summed_ints: str = 'true', enable_RT_filtering: str = 'false', mz_scoring_13C: str = 'false', use_smoothed_intensities: str = 'false', report_convex_hulls: str = 'true', report_chromatograms: str = 'false', remove_single_traces: str = 'true', mz_scoring_by_elements: str = 'false', elements: str = 'CHNOPS') list[_FeatureMapDF] [source]
Untargeted feature detection in experiments.
- Parameters:
experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Input experiments
file_ending (str, optional) – File ending, defaults to “.mzML”
mass_traces_deconvol_all (list[list], optional) – Deconvoluted mass traces for all experiments, defaults to []
isotope_filtering_model (str, optional) – Isotope filtering model, defaults to “metabolites (2% RMS)”
local_rt_range (float, optional) – Local retention time range, defaults to 3.0
local_mz_range (float, optional) – Local m/z range, defaults to 5.0
charge_lower_bound (int, optional) – Lower charge bound, defaults to 1
charge_upper_bound (int, optional) – Upper charge bound, defaults to 3
chrom_fwhm (float, optional) – Chromatographic full width at half maximum, defaults to 10.0
report_summed_ints (str, optional) – Report summed intensities, defaults to “true”
enable_RT_filtering (str, optional) – Enable retention time filtering, defaults to “false”
mz_scoring_13C (str, optional) – Score m/z by looking at expected Carbon13 peaks, defaults to “false”
use_smoothed_intensities (str, optional) – Use smoothed intensities (if smoothed before), defaults to “false”
report_convex_hulls (str, optional) – Report convex hulls, defaults to “true”
report_chromatograms (str, optional) – Report chromatograms, defaults to “false”
remove_single_traces (str, optional) – Remove single traces (only appear at one retention time), defaults to “true”
mz_scoring_by_elements (str, optional) – Score m/z by present elements, defaults to “false”
elements (str, optional) – Elements to consider, defaults to “CHNOPS”
- Returns:
List of Feature Maps
- Return type:
list[pyopenms.FeatureMap]
- skripts.FIA.FIA.filter_consensus_map_df(consensus_map_df: DataFrame, max_missing_values: int = 1, min_feature_quality: float | None = 0.8) DataFrame [source]
Filter consensus map DataFrame according to missing values and feature quality.
- Parameters:
consensus_map_df (pandas.DataFrame) – Input consensus map dataframe
max_missing_values (int, optional) – Maximum number of missing values, defaults to 1
min_feature_quality (Optional[float], optional) – Minimal quality of feature, defaults to 0.8
- Returns:
Filtered consensus map dataframe
- Return type:
pandas.DataFrame
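The two filter criteria can be sketched on plain dictionaries instead of a DataFrame. Several things here are assumptions for illustration: missing values are encoded as None, the quality is stored under a "quality" key, and filter_rows is a hypothetical helper rather than the package function.

```python
def filter_rows(rows, sample_columns, max_missing_values=1, min_feature_quality=0.8):
    """Keep rows with few missing sample values and sufficient feature quality."""
    kept = []
    for row in rows:
        missing = sum(1 for col in sample_columns if row[col] is None)
        if missing > max_missing_values:
            continue
        # A quality threshold of None disables the quality filter.
        if min_feature_quality is not None and row["quality"] < min_feature_quality:
            continue
        kept.append(row)
    return kept


rows = [
    {"mz": 100.0, "quality": 0.90, "s1": 10.0, "s2": None, "s3": 12.0},
    {"mz": 200.0, "quality": 0.95, "s1": None, "s2": None, "s3": 5.0},
    {"mz": 300.0, "quality": 0.50, "s1": 1.0, "s2": 2.0, "s3": 3.0},
]
kept = filter_rows(rows, sample_columns=["s1", "s2", "s3"])
print([row["mz"] for row in kept])
```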
- skripts.FIA.FIA.find_close(df1: DataFrame, df1_col, df2: DataFrame, df2_col, tolerance=0.001)[source]
Find close values in two dataframes.
- Parameters:
df1 (pandas.DataFrame) – Input dataframe 1
df1_col (all) – Input dataframe 1 matched column
df2 (pandas.DataFrame) – Input dataframe 2
df2_col (all) – Input dataframe 2 matched column
tolerance (float, optional) – Absolute tolerance in difference, defaults to 0.001
- Yield:
Dataframe of close values
- Return type:
pandas.DataFrame
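Matching by absolute tolerance can be sketched as a generator over two plain lists. The real function works on DataFrame columns and yields dataframes; find_close_sketch is a hypothetical simplification that yields index pairs instead.

```python
def find_close_sketch(values_1, values_2, tolerance=0.001):
    """Yield (i, j) index pairs whose values differ by at most `tolerance`."""
    for i, a in enumerate(values_1):
        for j, b in enumerate(values_2):
            if abs(a - b) <= tolerance:
                yield i, j


pairs = list(find_close_sketch([100.0000, 150.2500], [100.0004, 150.2600, 99.9995]))
print(pairs)
```

The nested loop is quadratic in the input sizes; for long m/z lists, sorting both inputs first would allow a linear sweep instead.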
- skripts.FIA.FIA.impute_consensus_map_df(consensus_map_df: DataFrame, n_nearest_neighbours: int = 2) DataFrame [source]
Data imputation with k-nearest-neighbours (kNN).
- Parameters:
consensus_map_df (pandas.DataFrame) – Input consensus map dataframe
n_nearest_neighbours (int, optional) – K nearest neighbours, defaults to 2
- Returns:
Consensus map dataframe with imputed values
- Return type:
pandas.DataFrame
- skripts.FIA.FIA.join_df_by(df: DataFrame, joiner: str, combiner: str) DataFrame [source]
Combines dataframe rows with the same <joiner>, using the combined values of <combiner> as the new index.
- Parameters:
df (pd.DataFrame) – Input DataFrame
joiner (str) – Indicates the column that is the criterion for joining the rows
combiner (str) – Indicates the column that should be combined as an identifier
- Returns:
Combined DataFrame
- Return type:
pd.DataFrame
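One way to picture this operation is a pandas group-by on the <joiner> column. This is a sketch under assumptions: the helper `join_rows_sketch`, the `;` separator and the toy column names are hypothetical, and the actual function may treat the remaining columns differently.

```python
import pandas as pd


def join_rows_sketch(df, joiner, combiner, sep=";"):
    """Collapse rows sharing `joiner`; joined `combiner` values become the index (illustrative only)."""
    return (
        df.groupby(joiner, sort=False)
        .agg({combiner: sep.join})  # concatenate the identifiers of each group
        .reset_index()
        .set_index(combiner)
    )


toy = pd.DataFrame(
    {
        "formula": ["C6H12O6", "C6H12O6", "C3H6O3"],
        "name": ["glucose", "fructose", "lactate"],
    }
)
joined = join_rows_sketch(toy, "formula", "name")
```

Here the two rows with the formula C6H12O6 collapse into one row indexed by the combined name.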
- skripts.FIA.FIA.limit_experiment(experiment: _MSExperimentDF | str, mz_lower_limit: int | float = 0, mz_upper_limit: int | float = 10000, sample_size: int = 100000, statistic: str = 'sum', deepcopy: bool = False) _MSExperimentDF [source]
Limits the range of all spectra in an experiment to <mz_lower_limit> and <mz_upper_limit>. Uniformly samples <sample_size> number of peaks from the spectrum (without replacement).
- Parameters:
experiment (Union[pyopenms.MSExperiment, str]) – Input experiment
mz_lower_limit (Union[int, float], optional) – Lower m/z value limit, defaults to 0
mz_upper_limit (Union[int, float], optional) – Upper m/z value limit, defaults to 10000
sample_size (int, optional) – Number of sampled peaks from the spectrum, defaults to 100000
statistic (str, optional) – Operation to perform on binned values in a window. May take common parameters defined by scipy.stats.binned_statistic, defaults to “sum”
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Returns:
Limited and sampled experiment
- Return type:
oms.MSExperiment
- skripts.FIA.FIA.limit_spectrum(spectrum: MSSpectrum, mz_lower_limit: int | float, mz_upper_limit: int | float, sample_size: int, statistic: str = 'sum') MSSpectrum [source]
Limits the range of the Spectrum to <mz_lower_limit> and <mz_upper_limit>. Uniformly samples <sample_size> number of peaks from the spectrum (without replacement).
- Parameters:
spectrum (pyopenms.MSSpectrum) – Input spectrum
mz_lower_limit (Union[int, float]) – Lower m/z value limit
mz_upper_limit (Union[int, float]) – Upper m/z value limit
sample_size (int) – Number of sampled peaks from the spectrum
statistic (str, optional) – Operation to perform on binned values in a window. May take common parameters defined by scipy.stats.binned_statistic, defaults to “sum”
- Returns:
Limited and sampled spectrum
- Return type:
pyopenms.MSSpectrum
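One plausible reading of range limiting with a binned statistic is sketched below in NumPy. It is an interpretation, not the package's code: the helper `limit_and_bin` is hypothetical, the spectrum is represented as two plain arrays, and `np.histogram` with weights stands in for the “sum” statistic.

```python
import numpy as np


def limit_and_bin(mz, inty, mz_lower_limit, mz_upper_limit, sample_size):
    """Restrict peaks to an m/z range and sum intensities on a uniform grid (illustrative only)."""
    mz = np.asarray(mz, dtype=float)
    inty = np.asarray(inty, dtype=float)
    # sample_size equally wide bins between the two limits.
    edges = np.linspace(mz_lower_limit, mz_upper_limit, sample_size + 1)
    # Peaks outside the edges are ignored; weights sum the intensities per bin.
    binned, _ = np.histogram(mz, bins=edges, weights=inty)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers, binned


centers, binned = limit_and_bin(
    mz=[50.2, 50.4, 120.0, 999.0],
    inty=[10.0, 5.0, 7.0, 3.0],
    mz_lower_limit=0,
    mz_upper_limit=200,
    sample_size=4,
)
```

The peak at m/z 999 lies outside the limits and is dropped, while the two peaks near m/z 50 fall into the same bin and are summed.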
- skripts.FIA.FIA.load_experiment(experiment: _MSExperimentDF | str, separator: str = '\t') _MSExperimentDF [source]
If no experiment is given, loads and returns it from either .mzML or .mzXML file. Collects garbage with gc.collect() to ensure space in the RAM.
- Parameters:
experiment (Union[oms.MSExperiment, str]) – Experiment, or Path to experiment
separator (str, optional) – Separator of data, defaults to “\t”
- Returns:
Experiment
- Return type:
pyopenms.MSExperiment
- skripts.FIA.FIA.load_experiments(experiments: Sequence[_MSExperimentDF | str] | str, file_ending: str | None = None, separator: str = '\t', data_load: bool = True) Sequence[_MSExperimentDF | str] [source]
Load a batch of experiments.
- Parameters:
experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Experiments, either described by a list of paths, one path as base directory, or an existing experiment
file_ending (Optional[str], optional) – Ending of experiment file, defaults to None
separator (str, optional) – Separator of data, defaults to “\t”
data_load (bool, optional) – Load the data or just combine the base string to a list of full filepaths, defaults to True
- Returns:
Experiments
- Return type:
Sequence[Union[oms.MSExperiment,str]]
- skripts.FIA.FIA.load_fia_df(data_dir: str, file_ending: str, separator: str = '\t', data_load: bool = True, backend=<module 'pandas'>) DataFrame | DataFrame [source]
Load a Flow injection analysis dataframe, defining important properties.
- Parameters:
data_dir (str) – Data directory
file_ending (str) – Ending of file
separator (str, optional) – Separator for file, defaults to “\t”
data_load (bool, optional) – Load data or only return list of experiments, defaults to True
backend (module, optional) – Use pandas or polars as backend, defaults to pd
- Returns:
Flow injection analysis dataframe
- Return type:
Union[pandas.DataFrame, polars.DataFrame]
- skripts.FIA.FIA.load_name(experiment: _MSExperimentDF | str, alt_name: str | None = None, file_ending: str | None = None) str [source]
Load the name of an experiment.
- Parameters:
experiment (Union[oms.MSExperiment, str]) – Experiment
alt_name (Optional[str], optional) – Alternative Name if none is found, defaults to None
file_ending (Optional[str], optional) – Ending of experiment file, defaults to None
- Raises:
ValueError – Raises error if no file name is found and no alt_name is given.
- Returns:
Name of experiment or alternative name
- Return type:
str
- skripts.FIA.FIA.load_names_batch(experiments: Sequence[_MSExperimentDF | str] | str, file_ending: str = '.mzML') List[str] [source]
Load the names of a batch of experiments.
- Parameters:
experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Experiments
file_ending (str, optional) – Ending of experiment file, defaults to “.mzML”
- Returns:
List of experiment names
- Return type:
List[str]
- skripts.FIA.FIA.make_merge_dict(dir: str, file_ending: str = '.mzML') dict [source]
Create a dictionary for merging.
- Parameters:
dir (str) – Directory to extract samples from.
file_ending (str, optional) – File ending, defaults to “.mzML”
- Returns:
Dictionary with sample names as keys and paths for merging as values
- Return type:
dict
- skripts.FIA.FIA.mass_trace_detection(experiment: _MSExperimentDF | str, mass_error_ppm: float = 10.0, noise_threshold_int: float = 1000.0, reestimate_mt_sd: str = 'true', quant_method: str = 'median', trace_termination_criterion: str = 'outlier', trace_termination_outliers: int = 3, min_trace_length: float = 5.0, max_trace_length: float = -1.0) list [source]
Detection of mass traces in experiment over several spectra.
- Parameters:
experiment (Union[oms.MSExperiment, str]) – Input experiment
mass_error_ppm (float, optional) – Mass error in ppm, defaults to 10.0
noise_threshold_int (float, optional) – Noise threshold intensity, defaults to 1000.0
reestimate_mt_sd (str, optional) – Reestimate mass trace standard deviation during run, defaults to “true”
quant_method (str, optional) – Quantification method, defaults to “median”
trace_termination_criterion (str, optional) – Criterion to terminate a trace, defaults to “outlier”
trace_termination_outliers (int, optional) – Number of cases that fulfil criterion/outliers to break trace, defaults to 3
min_trace_length (float, optional) – Minimal trace length, defaults to 5.0
max_trace_length (float, optional) – Maximum trace length, defaults to -1.0
- Returns:
Mass traces
- Return type:
list
- skripts.FIA.FIA.mass_trace_detection_batch(experiments: Sequence[_MSExperimentDF | str] | str, file_ending: str = '.mzML', mass_error_ppm: float = 10.0, noise_threshold_int: float = 1000.0, reestimate_mt_sd: str = 'true', quant_method: str = 'median', trace_termination_criterion: str = 'outlier', trace_termination_outliers: int = 3, min_trace_length: float = 5.0, max_trace_length: float = -1.0) list [source]
Detection of mass traces over several experiments.
- Parameters:
experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Input experiments
file_ending (str, optional) – File ending, defaults to “.mzML”
mass_error_ppm (float, optional) – Mass error in ppm, defaults to 10.0
noise_threshold_int (float, optional) – Noise threshold intensity, defaults to 1000.0
reestimate_mt_sd (str, optional) – Reestimate mass trace standard deviation during run, defaults to “true”
quant_method (str, optional) – Quantification method, defaults to “median”
trace_termination_criterion (str, optional) – Criterion to terminate a trace, defaults to “outlier”
trace_termination_outliers (int, optional) – Number of cases that fulfil criterion/outliers to break trace, defaults to 3
min_trace_length (float, optional) – Minimal trace length, defaults to 5.0
max_trace_length (float, optional) – Maximum trace length, defaults to -1.0
- Returns:
Mass traces
- Return type:
list
- skripts.FIA.FIA.merge_batch(experiments: Sequence[_MSExperimentDF | str] | str, run_dir: str, file_ending: str = '.mzML', method: str = 'block_method', mz_binning_width: float = 1.0, mz_binning_width_unit: str = 'ppm', ms_levels: List[int] = [1], sort_blocks: str = 'RT_ascending', rt_block_size: int | None = None, rt_max_length: float = 0.0, spectrum_type: str = 'automatic', rt_range: float | None = 5.0, rt_unit: str = 'scans', rt_FWHM: float = 5.0, cutoff: float = 0.01, precursor_mass_tol: float = 0.0, precursor_max_charge: int = 1, deepcopy: bool = False) str [source]
Merge several spectra into one spectrum (useful for MS1 spectra to amplify signals along near retention times)
- Parameters:
experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Input experiments
run_dir (str) – Run directory
file_ending (str, optional) – File ending, defaults to “.mzML”
method (str, optional) – Method to perform merging, defaults to “block_method”
mz_binning_width (float, optional) – m/z binning width, defaults to 1.0
mz_binning_width_unit (str, optional) – m/z binning width unit (ppm/Da), defaults to “ppm”
ms_levels (List[int], optional) – MS levels to consider, defaults to [1]
sort_blocks (str, optional) – Sort blocks by retention time, defaults to “RT_ascending”
rt_block_size (Optional[int], optional) – Block size along retention time, defaults to None
rt_max_length (float, optional) – Maximal length of Retention time, defaults to 0.0
spectrum_type (str, optional) – Spectrum type determination, defaults to “automatic”
rt_range (Optional[float], optional) – Retention time range to merge over, defaults to 5.0
rt_unit (str, optional) – Unit to merge over, defaults to “scans”
rt_FWHM (float, optional) – Full width at half maximum, defaults to 5.0
cutoff (float, optional) – Cutoff value during merging, defaults to 0.01
precursor_mass_tol (float, optional) – Precursor mass tolerance, defaults to 0.0
precursor_max_charge (int, optional) – Maximal precursor charge, defaults to 1
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Raises:
ValueError – Method needs to be in [block_method|average_tophat|average_gaussian]
- Returns:
Directory with merged experiments
- Return type:
str
- skripts.FIA.FIA.merge_by_mz(id_df_1: DataFrame, id_df_2: DataFrame, mz_tolerance: float = 0.0001) DataFrame [source]
Merge dataframes by mz column.
- Parameters:
id_df_1 (pandas.DataFrame) – Input dataframe 1
id_df_2 (pandas.DataFrame) – Input dataframe 2
mz_tolerance (float, optional) – Tolerance of m/z deviation, defaults to 1e-04
- Returns:
Merged dataframe
- Return type:
pandas.DataFrame
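A tolerance-based merge on an m/z column can be sketched with `pandas.merge_asof`. This is an illustration only and not necessarily how the function is implemented; the toy columns `a` and `b` are hypothetical, and both inputs must be sorted by the key.

```python
import pandas as pd

id_df_1 = pd.DataFrame({"mz": [100.00001, 200.0], "a": ["x", "y"]}).sort_values("mz")
id_df_2 = pd.DataFrame({"mz": [100.00004, 300.0], "b": ["p", "q"]}).sort_values("mz")

# Each row of the left frame is paired with the nearest right row,
# but only if the m/z difference does not exceed the tolerance.
merged = pd.merge_asof(
    id_df_1,
    id_df_2,
    on="mz",
    tolerance=1e-4,
    direction="nearest",
)
```

Rows of the left frame without a partner inside the tolerance keep missing values in the right-hand columns, and the resulting `mz` column holds the left frame's values.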
- skripts.FIA.FIA.merge_compounds(path_to_tsv: str) DataFrame [source]
Joins entries with equal Mass and SumFormula. CompoundName values are linked with “;”, all other columns with “,”.
- Parameters:
path_to_tsv (str) – Path to TSV
- Returns:
Merged compounds
- Return type:
pd.DataFrame
- skripts.FIA.FIA.merge_experiment(experiment: _MSExperimentDF | str, method: str = 'block_method', mz_binning_width: float = 1.0, mz_binning_width_unit: str = 'ppm', ms_levels: List[int] = [1], sort_blocks: str = 'RT_ascending', rt_block_size: int | None = None, rt_max_length: float = 0.0, spectrum_type: str = 'automatic', rt_range: float | None = 5.0, rt_unit: str = 'scans', rt_FWHM: float = 5.0, cutoff: float = 0.01, precursor_mass_tol: float = 0.0, precursor_max_charge: int = 1, deepcopy: bool = False) _MSExperimentDF [source]
Merge several spectra into one spectrum (useful for MS1 spectra to amplify signals)
- Parameters:
experiment (Union[oms.MSExperiment, str]) – Input experiment
method (str, optional) – Method to perform merging, defaults to “block_method”
mz_binning_width (float, optional) – m/z binning width, defaults to 1.0
mz_binning_width_unit (str, optional) – m/z binning width unit (ppm/Da), defaults to “ppm”
ms_levels (List[int], optional) – MS levels to consider, defaults to [1]
sort_blocks (str, optional) – Sort blocks by retention time, defaults to “RT_ascending”
rt_block_size (Optional[int], optional) – Block size along retention time, defaults to None
rt_max_length (float, optional) – Maximal length of Retention time, defaults to 0.0
spectrum_type (str, optional) – Spectrum type determination, defaults to “automatic”
rt_range (Optional[float], optional) – Retention time range to merge over, defaults to 5.0
rt_unit (str, optional) – Unit to merge over, defaults to “scans”
rt_FWHM (float, optional) – Full width at half maximum, defaults to 5.0
cutoff (float, optional) – Cutoff value during merging, defaults to 0.01
precursor_mass_tol (float, optional) – Precursor mass tolerance, defaults to 0.0
precursor_max_charge (int, optional) – Maximal precursor charge, defaults to 1
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Raises:
ValueError – Method needs to be in [block_method|average_tophat|average_gaussian]
- Returns:
Merged experiment
- Return type:
pyopenms.MSExperiment
- skripts.FIA.FIA.merge_experiments(experiments: Sequence[_MSExperimentDF | str] | str, run_dir: str, file_ending: str = '.mzML', method: str = 'block_method', mz_binning_width: float = 1.0, mz_binning_width_unit: str = 'ppm', ms_levels: List[int] = [1], sort_blocks: str = 'RT_ascending', rt_block_size: int | None = None, rt_max_length: float = 0.0, spectrum_type: str = 'automatic', rt_range: float | None = 5.0, rt_unit: str = 'scans', rt_FWHM: float = 5.0, cutoff: float = 0.01, precursor_mass_tol: float = 0.0, precursor_max_charge: int = 1, deepcopy: bool = False) _MSExperimentDF [source]
Merge several spectra into one spectrum (useful for MS1 spectra to amplify signals along near retention times). Combines all spectra of given experiments into one experiment to merge over.
- Parameters:
experiments (Union[Sequence[Union[oms.MSExperiment,str]], str]) – Input experiments
run_dir (str) – Run directory
file_ending (str, optional) – File ending, defaults to “.mzML”
method (str, optional) – Method to perform merging, defaults to “block_method”
mz_binning_width (float, optional) – m/z binning width, defaults to 1.0
mz_binning_width_unit (str, optional) – m/z binning width unit (ppm/Da), defaults to “ppm”
ms_levels (List[int], optional) – MS levels to consider, defaults to [1]
sort_blocks (str, optional) – Sort blocks by retention time, defaults to “RT_ascending”
rt_block_size (Optional[int], optional) – Block size along retention time, defaults to None
rt_max_length (float, optional) – Maximal length of Retention time, defaults to 0.0
spectrum_type (str, optional) – Spectrum type determination, defaults to “automatic”
rt_range (Optional[float], optional) – Retention time range to merge over, defaults to 5.0
rt_unit (str, optional) – Unit to merge over, defaults to “scans”
rt_FWHM (float, optional) – Full width at half maximum, defaults to 5.0
cutoff (float, optional) – Cutoff value during merging, defaults to 0.01
precursor_mass_tol (float, optional) – Precursor mass tolerance, defaults to 0.0
precursor_max_charge (int, optional) – Maximal precursor charge, defaults to 1
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Raises:
ValueError – Method needs to be in [block_method|average_tophat|average_gaussian]
- Returns:
Merged experiment
- Return type:
pyopenms.MSExperiment
- skripts.FIA.FIA.merge_mz_tolerance(comb_df: DataFrame, charge: int = 1, tolerance: float = 0.001, binned: bool = False) DataFrame [source]
Weighted average of m/z values that are within absolute tolerance of a row in the primary dataframe.
- Parameters:
comb_df (pandas.DataFrame) – Dataframe for merging
charge (int, optional) – Theoretical charge, defaults to 1
tolerance (float, optional) – Tolerance of m/z difference, defaults to 1e-3
binned (bool, optional) – Values already binned (simpler process), defaults to False
- Returns:
Merged Dataframe
- Return type:
pandas.DataFrame
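The idea of a weighted average over nearby m/z values can be sketched in NumPy. Everything here is an assumption made for illustration: the helper `merge_close_mz` is hypothetical, intensities are used as weights, peaks are grouped greedily along the sorted m/z axis, and all intensities are taken to be positive.

```python
import numpy as np


def merge_close_mz(mz, inty, tolerance=0.001):
    """Merge neighbouring peaks into intensity-weighted m/z averages (illustrative only)."""
    order = np.argsort(mz)
    mz = np.asarray(mz, dtype=float)[order]
    inty = np.asarray(inty, dtype=float)[order]
    # A new group starts wherever the gap to the previous peak exceeds the tolerance.
    group = np.concatenate([[0], np.cumsum(np.diff(mz) > tolerance)])
    total = np.bincount(group, weights=inty)
    merged_mz = np.bincount(group, weights=mz * inty) / total
    return merged_mz, total


merged_mz, total = merge_close_mz([100.0, 100.0005, 200.0], [1.0, 3.0, 2.0])
```

The first two peaks are merged into one at an m/z pulled towards the more intense peak; the third stays unchanged.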
- skripts.FIA.FIA.mnx_to_oms(df: DataFrame) DataFrame [source]
Turns a dataframe from MetaNetX into the required format by pyopenms for feature detection.
- Parameters:
df (pandas.DataFrame) – Parameters Dataframe
- Returns:
DataFrame with essential information.
- Return type:
pandas.DataFrame
- skripts.FIA.FIA.normalize_spectra(experiment: _MSExperimentDF | str, normalization_method: str = 'to_one', deepcopy: bool = False) _MSExperimentDF [source]
Normalizes spectra by specified method.
- Parameters:
experiment (Union[oms.MSExperiment, str]) – Input experiment
normalization_method (str, optional) – Normalization method in [to_TIC|to_one], defaults to “to_one”
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Returns:
Normalized experiment along spectra
- Return type:
oms.MSExperiment
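The two method names suggest the following behaviour, sketched here on a plain intensity array. This is an assumption, not confirmed by the documentation: “to_TIC” is read as division by the total ion current (the sum of intensities) and “to_one” as scaling the most intense peak to 1. The helper `normalize_sketch` is hypothetical.

```python
import numpy as np


def normalize_sketch(inty, normalization_method="to_one"):
    """Normalize one spectrum's intensities (illustrative only)."""
    inty = np.asarray(inty, dtype=float)
    if normalization_method == "to_TIC":
        # Assumed meaning: intensities sum to 1 afterwards.
        return inty / inty.sum()
    if normalization_method == "to_one":
        # Assumed meaning: the highest peak becomes 1.
        return inty / inty.max()
    raise ValueError("normalization_method must be 'to_TIC' or 'to_one'")


to_one = normalize_sketch([1.0, 3.0, 4.0], "to_one")
to_tic = normalize_sketch([1.0, 3.0, 4.0], "to_TIC")
```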
- skripts.FIA.FIA.plot_feature_map_rt_alignment(ordered_feature_maps: list, legend: bool = False) None [source]
Plot feature map retention time alignment
- Parameters:
ordered_feature_maps (list) – List of feature maps
legend (bool, optional) – Display legend, defaults to False
- skripts.FIA.FIA.plot_features_3D(feature_map: _FeatureMapDF, plottype: str = 'scatter') DataFrame [source]
Represents found features in 3D
- Parameters:
feature_map (oms.FeatureMap) – Input feature map
plottype (str, optional) – Type of plot [scatter|surface|line], defaults to “scatter”
- Raises:
ValueError – Use [‘surface’,’scatter’,’line’]
- Returns:
Dataframe with 3d features
- Return type:
pandas.DataFrame
- skripts.FIA.FIA.plot_id_df(id_df: DataFrame, x: str = 'RT', y: str = 'mz') None [source]
Scatterplot dataframe with identified metabolites.
- Parameters:
id_df (pd.DataFrame) – Input dataframe
x (str, optional) – x-axis, defaults to “RT”
y (str, optional) – y-axis, defaults to “mz”
- skripts.FIA.FIA.plot_mass_traces(mass_traces, sel=[0, 100], x: str = 'rt', y: str = 'mz', z: str = 'int', threed: bool = True)[source]
Plot mass traces along 3d plot.
- Parameters:
mass_traces (list) – List of mass traces
sel (list, optional) – Selection of convex hull points, defaults to [0,100]
x (str, optional) – x-axis column, defaults to “rt”
y (str, optional) – y-axis column, defaults to “mz”
z (str, optional) – z-axis column, defaults to “int”
threed (bool, optional) – 3D plot, defaults to True
- Returns:
Plot
- Return type:
plotly-express plot
- skripts.FIA.FIA.print_params(p)[source]
Print parameters of pyopenms class.
- Parameters:
p (dict-like) – Parameters
- skripts.FIA.FIA.quick_plot(spectrum: MSSpectrum, xlim: List[float] | None = None, ylim: List[float] | None = None, plottype: str = 'line', log: List[str] = []) Figure [source]
Shows a plot of a spectrum between the defined borders
- Parameters:
spectrum (pyopenms.MSSpectrum) – Input spectrum
xlim (Optional[List[float]], optional) – x-axis limits, defaults to None
ylim (Optional[List[float]], optional) – y-axis limits, defaults to None
plottype (str, optional) – Type of plot [line|scatter], defaults to “line”
log (List[str], optional) – Axes to log-transform [x,y], defaults to []
- Returns:
Figure
- Return type:
Figure
- skripts.FIA.FIA.read_experiment(experiment_path: str, separator: str = '\t') _MSExperimentDF [source]
Read in MzXML or MzML File as a pyopenms experiment. If the file is in tabular format, assumes that it is in long form with two columns [“mz”, “inty”]
- Parameters:
experiment_path (str) – Path to experiment
separator (str, optional) – Separator of data, defaults to “\t”
- Raises:
ValueError – The experiment must end with a valid ending.
- Returns:
Experiment
- Return type:
pyopenms.MSExperiment
- skripts.FIA.FIA.read_feature_map_XML(path_to_featureXML: str) _FeatureMapDF [source]
Reads in feature Map from .featureXML file.
- Parameters:
path_to_featureXML (str) – Path to featureXML file.
- Returns:
Feature map
- Return type:
pyopenms.FeatureMap
- skripts.FIA.FIA.read_feature_maps_XML(path_to_featureXMLs: str) list [source]
Reads in feature Maps from file
- Parameters:
path_to_featureXMLs (str) – Path to featureXML file directory.
- Returns:
List of feature maps
- Return type:
list
- skripts.FIA.FIA.read_mnx(filepath: str) DataFrame [source]
Read in chem_prop.tsv file from MetaNetX
- Parameters:
filepath (str) – Path to file
- Returns:
DataFrame
- Return type:
pd.DataFrame
- skripts.FIA.FIA.separate_feature_maps_pos_neg(feature_maps: list) list [source]
Separate the feature maps into positively and negatively charged feature maps.
- Parameters:
feature_maps (list) – Input list of feature maps
- Returns:
List of list of feature maps, separates into positive and negatively charged
- Return type:
list
- skripts.FIA.FIA.smooth_spectra(experiment: _MSExperimentDF | str, gaussian_width: float, deepcopy: bool = False) _MSExperimentDF [source]
Apply a Gaussian filter to all spectra in an experiment.
- Parameters:
experiment (Union[pyopenms.MSExperiment, str]) – Input experiment
gaussian_width (float) – Window width to apply smoothing to
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Returns:
Smoothed experiment
- Return type:
pyopenms.MSExperiment
- skripts.FIA.FIA.sns_plot(x, y, hue=None, size=None, xlim: List[float] | None = None, ylim: List[float] | None = None, plottype: str = 'line', log: List[str] = [], sizes: Tuple[int, int] | None = (20, 20), palette: str = 'hls', figsize: Tuple[int, int] | None = (18, 5)) None [source]
Shows a plot of a spectrum between the defined borders.
- Parameters:
x (array-like) – x-values
y (array-like) – y-values
hue (all, optional) – Column to group x and y values, defaults to None
size (float, optional) – Size of elements, defaults to None
xlim (Optional[List[float]], optional) – x-axis limits, defaults to None
ylim (Optional[List[float]], optional) – y-axis limits, defaults to None
plottype (str, optional) – Type of plot [line|scatter], defaults to “line”
log (List[str], optional) – Axes to log-transform [x,y], defaults to []
sizes (Optional[Tuple[int, int]], optional) – Sizes of points in relation to y-value, defaults to (20,20)
palette (str, optional) – Color palette, defaults to “hls”
figsize (Optional[Tuple[int,int]], optional) – Figure size, defaults to (18, 5)
- skripts.FIA.FIA.store_experiment(experiment_path: str, experiment: _MSExperimentDF) None [source]
Stores the experiment.
- Parameters:
experiment_path (str) – Path to be stored at (must end with .mzML or .mzXML)
experiment (pyopenms.MSExperiment) – Experiment
- skripts.FIA.FIA.store_feature_maps(feature_maps: list, out_dir: str, names: list[str] | str = [], file_ending: str = '.mzML') None [source]
Stores the feature maps as featureXML files.
- Parameters:
feature_maps (list) – Feature Maps
out_dir (str) – Output directory
names (Union[list[str], str], optional) – Names of feature maps, defaults to []
file_ending (str, optional) – Ending of file, defaults to “.mzML”
- skripts.FIA.FIA.sum_spectra(experiment: _MSExperimentDF) MSSpectrum [source]
Sum up spectra in one experiment.
- Parameters:
experiment (pyopenms.MSExperiment) – Input experiment
- Returns:
Spectrum with summed intensities along m/z axis
- Return type:
pyopenms.MSSpectrum
- skripts.FIA.FIA.targeted_feature_detection(experiment: _MSExperimentDF | str, compound_library_file: str, mz_window: float = 5.0, rt_window: float | None = None, n_isotopes: int = 2, isotope_pmin: float = 0.01, peak_width: float = 60.0, mass_range: list = [50.0, 10000.0]) _FeatureMapDF [source]
Feature detection with a given metabolic table from a compound library file.
- Parameters:
experiment (Union[oms.MSExperiment, str]) – Input experiment
compound_library_file (str) – Path to compound library file to define metabolic table from.
metab_table (list) – Metabolites table
feature_filepath (Optional[str], optional) – Filepath to featureXML, defaults to None
mz_window (float, optional) – m/z window width, defaults to 5.0
rt_window (Optional[float], optional) – Retention time window width, defaults to None
n_isotopes (int, optional) – Number of considered isotopes, defaults to 2
isotope_pmin (float, optional) – Minimal probability of an isotope to be considered, defaults to 0.01
peak_width (float, optional) – Standard peak width, defaults to 60.0
- Returns:
Feature map
- Return type:
pyopenms.FeatureMap
- skripts.FIA.FIA.targeted_features_detection(in_dir: str, run_dir: str, file_ending: str, compound_library_file: str, mz_window: float = 5.0, rt_window: float = 20.0, n_isotopes: int = 2, isotope_pmin: float = 0.01, peak_width: float = 60.0, mass_range: list = [50.0, 10000.0]) list[_FeatureMapDF] [source]
Feature detection with a given metabolic table from a compound library file.
- Parameters:
in_dir (str) – Input directory
run_dir (str) – Run directory
file_ending (str) – File ending
compound_library_file (str) – Path to compound library file to define metabolic table from.
metab_table (list) – Metabolites table
feature_filepath (Optional[str], optional) – Filepath to featureXML, defaults to None
mz_window (float, optional) – m/z window width, defaults to 5.0
rt_window (float, optional) – Retention time window width, defaults to 20.0
n_isotopes (int, optional) – Number of considered isotopes, defaults to 2
isotope_pmin (float, optional) – Minimal probability of an isotope to be considered, defaults to 0.01
peak_width (float, optional) – Standard peak width, defaults to 60.0
- Returns:
List of feature maps
- Return type:
list[pyopenms.FeatureMap]
- skripts.FIA.FIA.trim_threshold(experiment: _MSExperimentDF, threshold: float = 0.05) _MSExperimentDF [source]
Removes points below an absolute intensity threshold.
- Parameters:
experiment (pyopenms.MSExperiment) – Input Experiment
threshold (float, optional) – Threshold for values to be excluded, defaults to 0.05
- Returns:
Trimmed experiment
- Return type:
oms.MSExperiment
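On a single spectrum, thresholding amounts to a boolean mask over the intensity array. The sketch below uses plain NumPy arrays instead of pyopenms objects; the helper `trim_sketch` is hypothetical, and keeping points exactly at the threshold is an assumption.

```python
import numpy as np


def trim_sketch(mz, inty, threshold=0.05):
    """Drop peaks whose intensity lies below `threshold` (illustrative only)."""
    mz = np.asarray(mz, dtype=float)
    inty = np.asarray(inty, dtype=float)
    mask = inty >= threshold  # points exactly at the threshold are kept
    return mz[mask], inty[mask]


trimmed_mz, trimmed_inty = trim_sketch([100.0, 101.0, 102.0], [0.01, 0.05, 0.2])
```

A default of 0.05 is only meaningful as an absolute threshold on rescaled intensities, for example after normalization.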
- skripts.FIA.FIA.trim_threshold_batch(experiments: Sequence[_MSExperimentDF | str] | str, run_dir: str, file_ending: str = '.mzML', threshold: float = 0.05, deepcopy: bool = False) str [source]
Removes points below an absolute intensity threshold in a batch of experiments.
- Parameters:
experiments (Union[Sequence[Union[pyopenms.MSExperiment,str]], str]) – Input Experiments (as a list of experiments, a directory, or paths to experiments)
threshold (float, optional) – Threshold for values to be excluded, defaults to 0.05
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Returns:
Folder with trimmed experiments
- Return type:
str
- skripts.FIA.FIA.untargeted_feature_detection(experiment: _MSExperimentDF | str, feature_filepath: str | None = None, mass_error_ppm: float = 5.0, noise_threshold_int: float = 3000.0, charge_lower_bound: int = 1, charge_upper_bound: int = 3, width_filtering: str = 'fixed', isotope_filtering_model='none', remove_single_traces='true', mz_scoring_by_elements='false', report_convex_hulls='true', deepcopy: bool = False) _FeatureMapDF [source]
Untargeted detection of features. Combines mass trace detection, elution peak detection and feature finding.
- Parameters:
experiment (Union[oms.MSExperiment, str]) – Input experiment
feature_filepath (Optional[str], optional) – Path to featureXML file, defaults to None
mass_error_ppm (float, optional) – Mass error in ppm, defaults to 5.0
noise_threshold_int (float, optional) – Noise threshold intensity, defaults to 3000.0
charge_lower_bound (int, optional) – Lower charge bound, defaults to 1
charge_upper_bound (int, optional) – Upper charge bound, defaults to 3
width_filtering (str, optional) – Type of width filtering, defaults to “fixed”
isotope_filtering_model (str, optional) – Isotope filtering model, defaults to “none”
remove_single_traces (str, optional) – Remove single traces, defaults to “true”
mz_scoring_by_elements (str, optional) – Score m/z by present elements, defaults to “false”
report_convex_hulls (str, optional) – Report convex hulls, defaults to “true”
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Returns:
Feature map
- Return type:
pyopenms.FeatureMap
- skripts.FIA.FIA.untargeted_features_detection(in_dir: str, run_dir: str, file_ending: str = '.mzML', mass_error_ppm: float = 10.0, noise_threshold_int: float = 1000.0, charge_lower_bound: int = 1, charge_upper_bound: int = 3, width_filtering: str = 'fixed', isotope_filtering_model: str = 'none', remove_single_traces: str = 'true', mz_scoring_by_elements: str = 'false', report_convex_hulls: str = 'true', deepcopy: bool = False) list [source]
Untargeted detection of features. Combines mass trace detection, elution peak detection and feature finding.
- Parameters:
in_dir (str) – Input directory
run_dir (str) – Run directory
file_ending (str, optional) – File ending, defaults to “.mzML”
mass_error_ppm (float, optional) – Mass error in ppm, defaults to 10.0
noise_threshold_int (float, optional) – Noise threshold intensity, defaults to 1000.0
charge_lower_bound (int, optional) – Lower charge bound, defaults to 1
charge_upper_bound (int, optional) – Upper charge bound, defaults to 3
width_filtering (str, optional) – Type of width filtering, defaults to “fixed”
isotope_filtering_model (str, optional) – Isotope filtering model, defaults to “none”
remove_single_traces (str, optional) – Remove single traces, defaults to “true”
mz_scoring_by_elements (str, optional) – Score m/z by present elements, defaults to “false”
report_convex_hulls (str, optional) – Report convex hulls, defaults to “true”
deepcopy (bool, optional) – Perform a deepcopy to reliably unlink the experiment from the input, defaults to False
- Returns:
List of feature maps
- Return type:
list