Step-by-step description
[Work in progress]
Data import
If no input files are found in resources/1.1-fluo-ceq8000 or resources/1.2-fluo-ceq8000, IPANEMAP Suite will import them using a path constructed as follows: raw_data:prefix_path from config.yaml, concatenated with the values of the probe_file and control_file columns from samples.tsv.
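The path construction described above can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: it assumes that raw_data:prefix_path denotes a prefix_path key nested under raw_data in config.yaml, that "concatenated" means a path join, and that the config and the samples.tsv row are already loaded as dictionaries. The helper name `raw_input_paths` and the file names are hypothetical.

```python
from pathlib import Path

def raw_input_paths(config, sample):
    """Build the probe and control input paths for one samples.tsv row.

    Hypothetical helper: joins raw_data:prefix_path with the file names
    found in the probe_file and control_file columns.
    """
    prefix = Path(config["raw_data"]["prefix_path"])
    return {
        "probe": prefix / sample["probe_file"],
        "control": prefix / sample["control_file"],
    }

# Made-up values standing in for config.yaml and one row of samples.tsv.
config = {"raw_data": {"prefix_path": "data/run1"}}
sample = {"probe_file": "probe_A.txt", "control_file": "control_A.txt"}

paths = raw_input_paths(config, sample)
print(paths["probe"].as_posix())
print(paths["control"].as_posix())
```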
If samples.tsv specifies a QuShape file (qushape_file column) and no QuShape file is already present in results/2.1-qushape, IPANEMAP Suite will import it. The file name is interpreted relative to prefix_path as defined in the configuration.
If samples.tsv specifies a Map file (map_file column), IPANEMAP Suite will import it to the normalized reactivity folder; all previous data import steps are overridden. The file name is interpreted relative to prefix_path as defined in the configuration.
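The import rules above amount to a small precedence order, sketched below. This is only an illustration of the described behaviour: the function name `import_actions`, the action labels and the two boolean flags are hypothetical and do not come from the IPANEMAP Suite code base.

```python
def import_actions(sample, fluo_present, qushape_present):
    """Return the import actions for one samples.tsv row (hypothetical).

    sample          -- dict holding the columns of one samples.tsv row
    fluo_present    -- True if input files already exist in resources/
    qushape_present -- True if a QuShape file already exists in results/
    """
    # A Map file overrides every earlier import step.
    if sample.get("map_file"):
        return ["import_map"]

    actions = []
    # Raw sequencer files are imported only when none are found.
    if not fluo_present:
        actions.append("import_fluo")
    # A QuShape file is imported only if specified and not yet present.
    if sample.get("qushape_file") and not qushape_present:
        actions.append("import_qushape")
    return actions

print(import_actions({"map_file": "s1.map", "qushape_file": "s1.qushape"}, False, False))
print(import_actions({"qushape_file": "s1.qushape"}, False, False))
print(import_actions({"qushape_file": "s1.qushape"}, True, True))
```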
:::{warning} Importing QuShape files and Map files created outside of IPANEMAP Suite should be done with special care.
If you want to import such files, you must ensure that you used the exact same sequence file as the one provided in config.yaml and that the data are consistent between the qushape section of your config.yaml and the columns of samples.tsv (e.g. the ddNTP column, rt_start and rt_stop).
IPANEMAP Suite avoids overwriting existing QuShape project files. It creates a new project from sequencer data only if no QuShape project file exists yet in the corresponding results folder. If a QuShape project exists, the pipeline extracts its reactivities (except when reactivities are imported directly from Map files).
Data conversion
Files from the CEQ8000 sequencer must be converted before they can be used with QuShape. Headers are removed to obtain a tabular file.
QuShape project generation
To simplify the use of QuShape (saving many manual configuration steps), the pipeline generates a complete QuShape project from probe_file, control_file, sequence and ddNTP. Optionally, it uses a reference project, if one is specified in the sample file.
All QuShape options used for generation can be controlled through the configuration dialog or the config.yaml
file.
QuShape treatment of SHAPE-CE data
Treatment of the SHAPE-CE data in QuShape is the only remaining truly manual step in IPANEMAP Suite, even though the pipeline facilitates it by preparing the project files. The user must open each QuShape project and perform the treatment needed to calculate reactivities.
Once the file is treated, the pipeline will be able to extract the reactivities, which will be used in downstream steps.
QuShape reactivity extraction
Reactivity is retrieved from fully treated QuShape projects.
Reactivity Normalization
Normalized reactivity is computed from the raw reactivity extracted from QuShape.
Parameters:
- reactive_nucleotides
Nucleotides which are affected by the SHAPE probe used
- low_norm_reactivity_threshold
Normalized reactivity threshold below which reactivity is not considered significant and is clipped to 0
- stop_percentile
(default: 90) Threshold above which background is estimated to be too high; data above this threshold are discarded
- simple_outlier_percentile
(default) simple method only - threshold (in percent) above which reactivity is considered too high
- simple_norm_term_avg_percentile
simple method only - threshold (in percent) above which reactivities are used to calculate the normalization term
Steps:
Subtracting background luminescence.
Removing the top 10%, considered as reverse transcriptase stop sites.
Selecting reactive nucleotides.
Computing the normalization term (simple normalization / interquartile method).
Dividing each nucleotide's reactivity by the normalization term.
Clipping reactivity below low_norm_reactivity_threshold to 0.
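The steps above can be sketched in plain Python as follows. This is a rough illustration under several assumptions, not the pipeline's implementation: the top 10% is assumed to be taken over the background (control) signal using a nearest-rank percentile, discarded positions are represented by `None`, and the normalization term is delegated to a caller-supplied function. All function and argument names other than the documented parameter names are hypothetical.

```python
import math

def percentile_cutoff(values, percentile):
    """Nearest-rank percentile of a non-empty list (assumed definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]

def normalize(probe, control, sequence, norm_term_fn,
              reactive_nucleotides="ACGU", stop_percentile=90,
              low_norm_reactivity_threshold=0.0):
    # 1. Subtract the background (control) signal from the probe signal.
    raw = [p - c for p, c in zip(probe, control)]
    # 2. Discard positions whose background is above the stop percentile
    #    (assumption: the percentile is computed over the control signal).
    cutoff = percentile_cutoff(control, stop_percentile)
    kept = [r if c <= cutoff else None for r, c in zip(raw, control)]
    # 3. Keep only nucleotides that the probe can react with.
    kept = [r if r is not None and nt in reactive_nucleotides else None
            for r, nt in zip(kept, sequence)]
    # 4. Compute the normalization term from the remaining values.
    term = norm_term_fn([r for r in kept if r is not None])
    if term == 0:
        raise ValueError("normalization term is zero")
    # 5. Divide by the term, then 6. clip values below the threshold to 0.
    result = []
    for r in kept:
        if r is None:
            result.append(None)
        else:
            n = r / term
            result.append(n if n >= low_norm_reactivity_threshold else 0.0)
    return result

# Made-up signals; max() is only a placeholder normalization term.
probe = [3, 5, 1, 2, 9, 1, 1, 1, 1, 20]
control = [1, 1, 1, 1, 1, 1, 1, 1, 1, 9]
normalized = normalize(probe, control, "ACGUACGUAC", max,
                       reactive_nucleotides="AC",
                       low_norm_reactivity_threshold=0.3)
print(normalized)
```

In this toy run, the last position is dropped because of its high background, G and U positions are ignored, and the value 0.25 falls below the threshold and is clipped to 0.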
Simple normalization method
Remove the top 2% (simple_outlier_percentile) of reactivities, which are considered outliers.
Take the following top 8% (simple_norm_term_avg_percentile). The average of those values constitutes the normalization term.
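A minimal sketch of this simple normalization term is given below. It assumes that the 2% and 8% fractions are counted on the sorted list of reactivities and rounded down, which the text does not specify. The function name and its `outlier_percent` and `average_percent` arguments are hypothetical and are not the configuration parameters themselves.

```python
def simple_norm_term(values, outlier_percent=2, average_percent=8):
    """Average of the values ranked just below the top outliers.

    Hypothetical helper: the percentages are applied to the number of
    values and rounded down (an assumption).
    """
    if not values:
        raise ValueError("no reactivity values")
    ordered = sorted(values, reverse=True)
    # Number of top values discarded as outliers.
    n_outliers = len(ordered) * outlier_percent // 100
    # Number of following values averaged into the term (at least one).
    n_average = max(1, len(ordered) * average_percent // 100)
    window = ordered[n_outliers:n_outliers + n_average]
    return sum(window) / len(window)

# With the values 1..100, the outliers are 100 and 99,
# and the term is the average of 98 down to 91.
print(simple_norm_term(list(range(1, 101))))
```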
Interquartile method
Compute the interquartile threshold, defined as $Ir = 1.5 \times (Q_3 - Q_1)$. All values above $Q_3 + Ir$ are considered outliers.
The average of the top 10% of the remaining values is the normalization term.
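The interquartile method can be sketched as follows. The quartile definition is an assumption (the default "exclusive" method of Python's `statistics.quantiles`), as is rounding the 10% count down. The function name `interquartile_norm_term` and its `top_percent` argument are hypothetical.

```python
import statistics

def interquartile_norm_term(values, top_percent=10):
    """Average of the top values once interquartile outliers are removed.

    Hypothetical helper: quartiles come from statistics.quantiles with
    its default method, which may differ from the pipeline's definition.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    ir = 1.5 * (q3 - q1)
    # Values above Q3 + Ir are treated as outliers and dropped.
    remaining = sorted((v for v in values if v <= q3 + ir), reverse=True)
    # Average the top 10% of what remains (at least one value).
    n_top = max(1, len(remaining) * top_percent // 100)
    top = remaining[:n_top]
    return sum(top) / len(top)

# 1..19 plus one extreme value: 100 lies above Q3 + Ir and is excluded.
values = list(range(1, 20)) + [100]
print(interquartile_norm_term(values))
```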
Reactivity Aggregation
Structure generation - IPANEMAP
Added reactivity