Error when reading GFF files from GenBank
Hi,
I got this issue when trying to process a bunch of genome assemblies, including some that were directly downloaded from the NCBI.
It appears that Panaroo cannot digest GFF files coming from GenBank, as I get this error:
Traceback (most recent call last):
  File "/usr/local/bin/panaroo", line 10, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.7/site-packages/panaroo/__main__.py", line 273, in main
    args.n_cpu)
  File "/usr/local/lib/python3.7/site-packages/panaroo/prokka.py", line 229, in process_prokka_input
    raise RuntimeError("Error reading prokka input!")
RuntimeError: Error reading prokka input!
(The error was actually a bit longer, as it included the report from the master process's error handling, but I don’t think that’s relevant here.)
This was obtained using Panaroo version 1.2.3 (from the module installed on the Sanger farm).
One example of a GFF file that fails can be found in this assembly: https://www.ncbi.nlm.nih.gov/assembly/GCA_009746685.1
Do you have any suggestion for how this could be addressed? Is Panaroo only expecting GFF files as produced by Prokka? I used the GenBank version of the assembly; do you think it would work better with the RefSeq annotation?
Thank you in advance for the help.
Florent
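
For anyone hitting the same error, a quick pre-check may help narrow things down. Prokka writes GFF3 files that embed the contig sequences after a ##FASTA line, while GFF files downloaded from the NCBI normally ship the sequences separately. Whether that difference is what triggers "Error reading prokka input!" here is an assumption, not something confirmed in this thread. The sketch below only tests for the ##FASTA line; the function names are made up for illustration and are not part of Panaroo.

```python
def has_embedded_fasta(lines):
    """Return True if any line is exactly the GFF3 '##FASTA' directive."""
    return any(line.strip() == "##FASTA" for line in lines)


def gff_file_has_embedded_fasta(path):
    """Open a GFF3 file and report whether it has a '##FASTA' section.

    Hypothetical helper: it does not validate the rest of the file.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        return has_embedded_fasta(handle)
```

Running gff_file_has_embedded_fasta on each input before launching the pipeline would flag files that likely need converting first.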
I confirm that processing the GFF files with the convert_refseq_to_prokka_gff.py script allowed me to circumvent the issue. Thanks again.

Ah sorry, I should have thought of that myself! I was confused by the fact that the script is provided as an executable file, so I assumed it would know the right interpreter by itself. I would suggest adding a shebang like #!/usr/bin/env python3 at the beginning of the script, or not making the script executable, to avoid that mistake. When I ran it with the Python interpreter, I still got errors about missing packages in the environment; I solved that by installing the Conda environment specific to panaroo (https://anaconda.org/bioconda/panaroo). The script now works fine.
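
To illustrate the shebang point: a file can carry the executable bit and still have no #! line, in which case the shell does not know which interpreter to use. Here is a minimal sketch of a check for that situation. The function name is hypothetical, and the check only looks at the first two bytes of the file and at the current user's execute permission.

```python
import os


def executable_without_shebang(path):
    """Return True if the file is executable by the current user
    but does not start with a '#!' interpreter line."""
    with open(path, "rb") as handle:
        starts_with_shebang = handle.read(2) == b"#!"
    return os.access(path, os.X_OK) and not starts_with_shebang
```

A script flagged by this check can still be run by passing it to the interpreter explicitly, which is what worked above once the Conda environment was installed.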