Error in running MAGeCK-VISPR on test data

Issue #5 resolved
Former user created an issue

Hi,

I was trying to run MAGeCK-VISPR on the test data available as "esc-testdata". I successfully initialized the workflow and got "README.txt", "Snakefile" and "config.yaml" in the working directory. After this I executed the dry run (snakemake -n) and got this error:

AmbiguousRuleException:
Rules mageck_rra and mageck_mle are ambiguous for the file results/test/myexperiment1.gene_summary.txt.
Expected input files:
    mageck_rra: results/count/all.count.txt
    mageck_mle: esc-testdata/designmatrix.txt annotation/sgrnas.bed results/count/all.count.txt

Now, I have no clue how to sort out this error. Could you please help me with this? Thanks, Rahul

Comments (31)

  1. Rahul Kumar

    I tried to run from step 4 with the configured workflow as given in the example, but I am still getting the same error:

    =======

    AmbiguousRuleException:
    Rules mageck_rra and mageck_mle are ambiguous for the file results/test/ESC-MLE.gene_summary.txt.
    Expected input files:
        mageck_rra: results/count/all.count.txt
        mageck_mle: results/count/all.count.txt annotation/sgrnas.bed designmatrix.txt

    ========

    Is there something I am missing?

    Thanks

  2. Johannes Köster

    This should be fixed with release 0.5.3. Thanks for reporting, and feel free to reopen if the error persists.

  3. Rahul Kumar

    Thanks Johannes Köster

    I tried version 0.5.4 from step 4 with the provided test data (downloaded from the download section) and I got this error:

    Screen Shot 2016-09-21 at 14.56.31.png

    Can you please help me with this?

    Thanks, Rahul

  4. Johannes Köster

    Indeed, the Snakefile in the example was outdated. I just uploaded an updated archive. If you re-download the step4 archive, I believe it will work now. Thanks for the hint!

  5. Trent_Frisbie

    I'm having similar problems. After re-downloading the step4 archive and updating to 0.5.3 I get the following error: Screen Shot 2016-09-22 at 5.22.47 PM.png

  6. Johannes Köster

    Bug in mageck. @davidliwei, can you have a look? Maybe the same as in the other open issue?

  7. Rahul Kumar

    Dear Wei,

    Many thanks for your reply. I successfully ran VISPR.

    I have another question. Before using MAGeCK, I was using the shALIGN script (PMID: 22018332) to count the gRNAs in fastq files. I noticed that the gRNA read counts from shALIGN and MAGeCK are significantly different (e.g. for one of the ATM gRNAs, MAGeCK gives 122 reads and shALIGN gives 18672 for the same fastq file).

    Then I used a simple Linux grep command to count the ATM gRNA, and the result was close to the count given by shALIGN.

    Can you please let me know why this large difference arises and how MAGeCK counts the gRNA reads?

    Many Thanks,

    Rahul

  8. Wei Li

    What percentage of reads can be mapped in your fastq file, based on the output of mageck? What parameters did you use for the mageck count command? For the ATM gRNA, what is its relative location in the reads? I guess the cause is that the sgRNA location differs between reads. In that case, you need to cut the 5' adapter using cutadapt.

    Best,

  9. Rahul Kumar

    Mapped read percentage: 0.01628

    mageck command:

    "/home/breakthr/rkumar/miniconda3/bin/mageck count --output-prefix all --list-seq library.csv --fastq reads/VX_2.fastq reads/DMSO_2.fastq --sample-label VX_2,DMSO_2 --sgrna-len 19 --trim-5 23 --pdf-report"

    I have an adaptor of length 23 right at the beginning of the reads, so I trimmed the first 23 bases.

    The ATM gRNA and the others are located just after the adaptor in the reads.

    Please let me know if I am doing something wrong.

  10. Wei Li

    Can you just send me the output of grep results of the problematic ATM gRNA (you mentioned there are over 1k hits), as well as that particular gRNA sequence? That should be enough instead of the fastq file.

    Best,

  11. Wei Li

    I checked the record, and it seems that the 5' sequence is not exactly 23nt. The majority of them are 24nt, and some are 25nt. That's the problem. Can you use cutadapt to remove the 5' flanking sequence? That will solve the problem.

    Best,
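
A minimal sketch of the effect described in comment 11, assuming a simplified model in which a read counts only if the sgRNA starts exactly at the trimmed position. This is not MAGeCK's actual counting code; the function names, the sgRNA sequence and the reads are all made up for illustration.

```python
from collections import Counter


def sgrna_offsets(reads, sgrna):
    """Tally the 0-based position at which `sgrna` first occurs in each read."""
    offsets = Counter()
    for read in reads:
        pos = read.find(sgrna)
        if pos >= 0:
            offsets[pos] += 1
    return offsets


def count_at_fixed_offset(reads, sgrna, trim5):
    """Count reads in which `sgrna` starts exactly `trim5` bases into the read."""
    return sum(read.startswith(sgrna, trim5) for read in reads)


# Hypothetical 19 nt sgRNA and reads whose 5' flank is 23, 24 or 25 nt long.
SGRNA = "ACGTACGTACGTACGTACG"
reads = [
    "T" * 23 + SGRNA + "GGG",
    "T" * 24 + SGRNA + "GGG",
    "T" * 24 + SGRNA + "GGG",
    "T" * 24 + SGRNA + "GGG",
    "T" * 25 + SGRNA + "GGG",
]

# A grep-like search finds the sgRNA in every read, at varying offsets.
print(sgrna_offsets(reads, SGRNA))
# A fixed trim of 23 only sees the reads whose flank is exactly 23 nt.
print(count_at_fixed_offset(reads, SGRNA, 23))
print(count_at_fixed_offset(reads, SGRNA, 24))
```

Tallying the offsets on a sample of real reads is a quick way to check whether one fixed --trim-5 value can capture most reads, or whether adapter trimming with cutadapt is the better option.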

  12. Rahul Kumar

    Thanks for your reply.

    First, I tried --trim-5 with 24nt and it worked; the counts are quite close to shALIGN.

    Second, I tried cutadapt by giving the 24nt adaptor sequence in the config.yaml file, and I got zero counts everywhere. In the newly created trimmed directory, where the trimmed reads should be stored, there are no sequences; only the read headers are present. Is there anything I am missing with cutadapt?

  13. Wei Li

    Can you try running cutadapt manually first, generate fastq files with the adapter removed, and then use mageck to collect the read counts?

    Best,

  14. Rahul Kumar

    Hi Wei,

    Thanks for your suggestion. I found what I was doing wrong. In the Snakefile, cutadapt uses the -a option by default, which trims a 3' adaptor, but mine was a 5' adaptor, so I used the -g option instead and it worked.
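
The difference between the two options can be sketched as follows. This is a simplified exact-match model, not cutadapt's real algorithm (which tolerates mismatches and partial adapter occurrences); the function names and sequences are hypothetical. With -a, cutadapt removes the adapter and everything after it; with -g, it removes the adapter and everything before it.

```python
def trim_3prime(read, adapter):
    """Model of 3' trimming (-a): drop the adapter and everything after it."""
    pos = read.find(adapter)
    return read if pos < 0 else read[:pos]


def trim_5prime(read, adapter):
    """Model of 5' trimming (-g): drop the adapter and everything before it."""
    pos = read.find(adapter)
    return read if pos < 0 else read[pos + len(adapter):]


# Made-up adapter and sgRNA; the adapter sits at the very start of the read.
ADAPTER = "TTGTGGAAAGGACGAAACACCGGT"
SGRNA = "ACGTACGTACGTACGTACG"
read = ADAPTER + SGRNA

# Treating a leading adapter as a 3' adapter leaves nothing of the read.
print(repr(trim_3prime(read, ADAPTER)))
# Treating it as a 5' adapter leaves the sgRNA.
print(repr(trim_5prime(read, ADAPTER)))
```

Under this model, a 5' adapter passed with -a yields empty reads, which matches the header-only fastq files and zero counts reported in comment 12.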

  15. Sky

    Hi,

    I'm having the same error as described at the top here. (AmbiguousRuleException...)

    My version is 0.4.7, so I'm trying to install the new 0.5.3 version. However, Anaconda can't find the package with the following command:

    conda install -c bioconda mageck-vispr=0.5.3

    Is there an alternative way to install or update to the new version?

    Thanks

  16. Sky

    This is the error/message that comes up:

    Fetching package metadata .........
    Solving package specifications: .
    PackageNotFoundError: Package not found: '' Dependencies missing in current osx-64 channels:
      - mageck-vispr 0.5.3 -> bioconductor-sva >=3.15.0 -> bioconductor-genefilter -> bioconductor-annotate -> bioconductor-annotationdbi >=1.27.5 -> bioconductor-biobase >=1.17.0 -> bioconductor-biocgenerics >=0.3.2 -> r 3.2.2
      - mageck-vispr 0.5.3 -> bioconductor-sva >=3.15.0 -> bioconductor-genefilter -> bioconductor-annotate -> bioconductor-annotationdbi >=1.27.5 -> r-dbi
      - mageck-vispr 0.5.3 -> bioconductor-sva >=3.15.0 -> bioconductor-genefilter -> bioconductor-annotate -> bioconductor-annotationdbi >=1.27.5 -> r-rsqlite
      - mageck-vispr 0.5.3 -> bioconductor-sva >=3.15.0 -> bioconductor-genefilter -> bioconductor-annotate -> r-xml
      - mageck-vispr 0.5.3 -> bioconductor-sva >=3.15.0 -> bioconductor-genefilter -> bioconductor-annotate -> r-xtable

    Close matches found; did you mean one of these?

    r-dbi: r-bit, perl-dbi, r-biom
    r-rsqlite: sqlite, r-rlist, r-jsonlite
    r-xml: raxml, r-xmlrpc, r-yaml
    

    You can search for packages on anaconda.org with

    anaconda search -t conda r-xtable
    

    (and similarly for the other packages)

    You may need to install the anaconda-client command line client with

    conda install anaconda-client
    
  17. Rahul Kumar

    Hi Sky,

    This is a bit tricky. I haven't installed mageck via bioconda (I tried, but got stuck like you).

    I installed it by command:

    $ python setup.py install --user (python should be >=3.5)

    Before this, I installed a few packages that mageck requires, such as fastqc and cutadapt, using the command:

    $ conda install -c bioconda <package>

    After this setup, mageck did its magic.

    Hope this helps.

    Cheers

  18. Johannes Köster

    I have updated the installation instructions on the main page. Bioconda now has additional channel dependencies (e.g. for R packages). Hence, you need to update your setup according to the new instructions (basically adding the r and the conda-forge channels). Afterwards, the bioconda-based installation will work again.

  19. heming wang

    I tried to run from step 4 with the configured workflow as given in the example, but I am getting this error:

    ConfigError in line 23 of /home/cs/Desktop/cs/data/workflow/Snakefile:
    Error in configuration file (key=trim-5, entry=23): Expecting a string.
      File "/home/cs/Desktop/cs/data/workflow/Snakefile", line 23, in <module>
      File "/home/cs/Desktop/cs/tools/miniconda3/lib/python3.6/site-packages/mageck_vispr/__init__.py", line 58, in postprocess_config
      File "/home/cs/Desktop/cs/tools/miniconda3/lib/python3.6/site-packages/mageck_vispr/check_config.py", line 117, in check_config
      File "/home/cs/Desktop/cs/tools/miniconda3/lib/python3.6/site-packages/mageck_vispr/check_config.py", line 113, in _check_config
      File "/home/cs/Desktop/cs/tools/miniconda3/lib/python3.6/site-packages/mageck_vispr/check_config.py", line 111, in _check_config
      File "/home/cs/Desktop/cs/tools/miniconda3/lib/python3.6/site-packages/mageck_vispr/check_config.py", line 23, in is_str

    ========

    I put all files from "esc.testdata.step4" in my workflow directory and ran "snakemake -n" in the terminal. Could you please give me some advice on this? Thanks, Heming

  20. Michael Apostolides

    Hello, I am also getting the exact same error as heming wang

    UPDATE: In the config.yaml file, the following is stated:

    # if a number (instead of AUTO) is specified, use quotes; for example: 
    # trim-5: "0"
    

    Put quotes around the number so that the value is read as a string.
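
Why the quotes matter can be sketched as follows. YAML loaders read an unquoted 23 as an integer and a quoted "23" as a string, so a type check that expects a string rejects the unquoted form. The helper below is a hypothetical stand-in, not the actual check_config code of mageck_vispr; the dictionaries represent what a loader would produce for each spelling.

```python
def non_string_entries(config, keys):
    """Return {key: value} for the listed keys whose value is not a str."""
    return {
        key: config[key]
        for key in keys
        if key in config and not isinstance(config[key], str)
    }


# What a YAML loader yields for `trim-5: 23` and for `trim-5: "23"`.
unquoted = {"trim-5": 23}
quoted = {"trim-5": "23"}

# The unquoted entry is reported; the quoted one passes.
print(non_string_entries(unquoted, ["trim-5"]))
print(non_string_entries(quoted, ["trim-5"]))
```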
