pandas.errors.EmptyDataError: No columns to parse from file
Hi Ben,
I just installed the latest version of vContact2 and get this error when running it on the test data:
------------------------Contig Clustering & Affiliation-------------------------
INFO:vcontact2.contig_clusters: Exporting for ClusterONE
INFO:vcontact2.contig_clusters: Clustering the PC Similarity-Network using ClusterONE
INFO:vcontact2.contig_clusters: Running clusterONE: java -jar /root/miniconda3/bin/cluster_one-1.0.jar VirSorted_Outputs/c1.ntw --input-format edge_list --output-format csv --min-density 0.3 --min-size 2 --max-overlap 0.9 --penalty 2.0 --haircut 0.55 --merge-method single --similarity match --seed-method nodes > VirSorted_Outputs/c1.clusters
Exception in thread "main" java.lang.NumberFormatException: empty String
at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1842)
at sun.misc.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
at java.lang.Double.parseDouble(Double.java:538)
at uk.ac.rhul.cs.cl1.io.EdgeListReader.readGraph(EdgeListReader.java:46)
at uk.ac.rhul.cs.cl1.ui.cmdline.CommandLineApplication.loadGraph(CommandLineApplication.java:311)
at uk.ac.rhul.cs.cl1.ui.cmdline.CommandLineApplication.run(CommandLineApplication.java:147)
at uk.ac.rhul.cs.cl1.ui.cmdline.CommandLineApplication.main(CommandLineApplication.java:321)
ERROR:vcontact2: Error in contig clustering
ERROR:vcontact2: No columns to parse from file
Traceback (most recent call last):
File "/root/miniconda3/envs/vContact2/bin/vcontact2", line 618, in main
mode=args.vc_mode)
File "/root/miniconda3/envs/vContact2/lib/python3.7/site-packages/vcontact2/contig_clusters.py", line 92, in init
self.cluster_one, self.one_opts)
File "/root/miniconda3/envs/vContact2/lib/python3.7/site-packages/vcontact2/contig_clusters.py", line 227, in one_cluster
return self.load_one_clusters(fi_clusters)
File "/root/miniconda3/envs/vContact2/lib/python3.7/site-packages/vcontact2/contig_clusters.py", line 318, in load_one_clusters
clusters_df = pd.read_csv(one_fn, header=0)
File "/root/miniconda3/envs/vContact2/lib/python3.7/site-packages/pandas/io/parsers.py", line 685, in parser_f
return _read(filepath_or_buffer, kwds)
File "/root/miniconda3/envs/vContact2/lib/python3.7/site-packages/pandas/io/parsers.py", line 457, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/root/miniconda3/envs/vContact2/lib/python3.7/site-packages/pandas/io/parsers.py", line 895, in init
self._make_engine(self.engine)
File "/root/miniconda3/envs/vContact2/lib/python3.7/site-packages/pandas/io/parsers.py", line 1135, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/root/miniconda3/envs/vContact2/lib/python3.7/site-packages/pandas/io/parsers.py", line 1917, in init
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 545, in pandas._libs.parsers.TextReader.cinit
pandas.errors.EmptyDataError: No columns to parse from file
The command is "vcontact2 --raw-proteins test_data/VIRSorter_genome.faa --rel-mode 'Diamond' --proteins-fp test_data/VIRSorter_genome_g2g.csv --db 'ProkaryoticViralRefSeq97-Merged' --pcs-mode MCL --vcs-mode ClusterONE --c1-bin /root/miniconda3/bin/cluster_one-1.0.jar --output-dir VirSorted_Outputs -t 20".
I also reinstalled pandas 0.25.3 using the command "conda install -y -c conda-forge pandas=0.25.3".
That did not fix the problem. Can you suggest how to fix it?
Best,
Jun Liu
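The Java stack trace above shows ClusterONE failing in EdgeListReader.readGraph while parsing a number from an empty string, which suggests that at least one line of c1.ntw has a missing or empty weight field. A minimal sketch for locating such lines follows. It assumes the edge list has three single-space-separated columns (node, node, weight); that layout, the function name find_bad_edges, and the sep parameter are assumptions for illustration, not something confirmed by vContact2 or ClusterONE.

```python
import sys


def find_bad_edges(path, sep=" "):
    """Return (line_number, line) pairs whose weight column is missing or not numeric.

    Assumes three columns per line: node, node, weight (hypothetical layout).
    """
    bad = []
    with open(path) as handle:
        for number, raw in enumerate(handle, start=1):
            line = raw.rstrip("\r\n")
            fields = line.split(sep)
            # Exactly three non-empty fields are expected.
            ok = len(fields) == 3 and all(fields)
            if ok:
                try:
                    float(fields[2])  # the weight must parse as a number
                except ValueError:
                    ok = False
            if not ok:
                bad.append((number, line))
    return bad


if __name__ == "__main__":
    # Example: python find_bad_edges.py VirSorted_Outputs/c1.ntw
    for number, line in find_bad_edges(sys.argv[1]):
        print(f"line {number}: {line!r}")
```

If the file uses tabs instead of spaces, pass sep="\t". Any line it reports is a candidate for the input that makes ClusterONE stop before writing c1.clusters.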
Comments (7)
-
Thank you Jun Liu and Giesela,
Have either of you tried v201? It’s a relatively new DB that should be included in the more recent versions of vContact2.
If ClusterONE does work with other DBs, then I strongly suspect it's an issue with v97.
I will double-check v97 and get back to you.
Cheers,
Ben
-
I'm helpless!!
I have the same problem as you!
Did you solve it in the end?
Hope to get your reply!
Best wishes
wyx
-
Wyx,
Specify the full path to the ClusterONE java file, so basically add “cluster_one-1.0.jar” to the nearly full path you already have for “--c1-bin”. Optionally, you can place the ClusterONE jarfile in your $PATH.
-Ben
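The advice above (give the full path including the jar file name, or put the jar on $PATH) can be sketched as a small resolver. This is only an illustration: resolve_c1_bin is a hypothetical helper, not part of vContact2, and how vContact2 itself searches $PATH is not confirmed here.

```python
import os

JAR_NAME = "cluster_one-1.0.jar"


def resolve_c1_bin(value, jar_name=JAR_NAME):
    """Turn a --c1-bin style value into an absolute path to the jar file.

    Hypothetical helper: accepts a full path, a directory, or an empty value,
    and falls back to searching the directories listed in $PATH.
    """
    candidates = []
    if value:
        if os.path.isdir(value):
            # Only a directory was given: append the jar file name.
            candidates.append(os.path.join(value, jar_name))
        else:
            candidates.append(value)
    # Fall back to every directory on $PATH.
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        if directory:
            candidates.append(os.path.join(directory, jar_name))
    for candidate in candidates:
        if os.path.isfile(candidate):
            return os.path.abspath(candidate)
    raise FileNotFoundError(f"{jar_name} not found via {value!r} or $PATH")
```

For example, resolve_c1_bin("/root/miniconda3/bin") would return /root/miniconda3/bin/cluster_one-1.0.jar if that file exists, and raise FileNotFoundError otherwise.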
-
Hi all, adding the full path in the command line didn't work for me. However, adding the ClusterONE jarfile to the $PATH worked. Also, I couldn't get it working with ProkaryoticViralRefSeq97-Merged, but it worked with ProkaryoticViralRefSeq94-Merged.
Thanks
-
Thank you for the report. I rebuilt RefSeq97 not too long ago because a number of users also had issues with that version specifically, but it seems that may have been in vain.
The 0.9.22 version includes a ClusterONE check to ensure vContact2 can 1) find and 2) use ClusterONE. Hopefully, that identifies issues before spending all that time/energy towards the end of the run.
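A pre-flight check of the kind described above could look roughly like the following. This is a sketch under stated assumptions, not the check that ships in 0.9.22: clusterone_preflight and its java_cmd parameter are hypothetical names, and "can use ClusterONE" is approximated here by confirming that the jar file exists and that "java -version" runs successfully.

```python
import os
import subprocess


def clusterone_preflight(jar_path, java_cmd=("java", "-version")):
    """Return a list of problems found; an empty list means the check passed.

    Hypothetical helper: verifies 1) the jar can be found and 2) Java can be run.
    """
    problems = []
    if not os.path.isfile(jar_path):
        problems.append(f"ClusterONE jar not found: {jar_path}")
    try:
        result = subprocess.run(
            list(java_cmd),
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            problems.append(f"{java_cmd[0]} exited with code {result.returncode}")
    except OSError as exc:
        # Raised, for example, when the executable is not on $PATH.
        problems.append(f"could not run {java_cmd[0]}: {exc}")
    return problems
```

Running this before the protein clustering step would report a missing jar or a missing Java runtime within seconds instead of at the end of the run. It would not catch a malformed c1.ntw, which only exists once the earlier steps have finished.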
-
- changed status to resolved
Closing due to inactivity; hopefully this issue has been solved. Please re-open if necessary.
-
Hi Ben and Jun Liu,
I have the same issue as Jun Liu.
When I run vContact2 with --db 'ProkaryoticViralRefSeq94-Merged', everything works fine. If I change to 'ProkaryoticViralRefSeq97-Merged', I get the same error as above:
------------------------Contig Clustering & Affiliation-------------------------
INFO:vcontact2.contig_clusters: Exporting for ClusterONE
INFO:vcontact2.contig_clusters: Clustering the PC Similarity-Network using ClusterONE
INFO:vcontact2.contig_clusters: Running clusterONE: java -jar /home/xx/miniconda3/envs/vContact2/lib/python3.8/site-packages/vcontact2/cluster_one-1.0.jar /mnt/xio/xx/xxx/c1.ntw --input-format edge_list --output-format csv --min-density 0.3 --min-size 2 --max-overlap 0.9 --penalty 2.0 --haircut 0.55 --merge-method single --similarity match --seed-method nodes > /mnt/xio/xx/xxx/c1.clusters
Exception in thread "main" java.lang.NumberFormatException: empty String
at java.base/jdk.internal.math.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1842)
at java.base/jdk.internal.math.FloatingDecimal.parseDouble(FloatingDecimal.java:110)
at java.base/java.lang.Double.parseDouble(Double.java:543)
at uk.ac.rhul.cs.cl1.io.EdgeListReader.readGraph(EdgeListReader.java:46)
at uk.ac.rhul.cs.cl1.ui.cmdline.CommandLineApplication.loadGraph(CommandLineApplication.java:311)
at uk.ac.rhul.cs.cl1.ui.cmdline.CommandLineApplication.run(CommandLineApplication.java:147)
at uk.ac.rhul.cs.cl1.ui.cmdline.CommandLineApplication.main(CommandLineApplication.java:321)
ERROR:vcontact2: Error in contig clustering
ERROR:vcontact2: No columns to parse from file
Traceback (most recent call last):
File "/home/lk/miniconda3/envs/vContact2/bin/vcontact2", line 615, in main
gc = vcontact2.contig_clusters.ContigCluster(pcp, output_dir, cluster_one_fp, cluster_one_args,
File "/home/lk/miniconda3/envs/vContact2/lib/python3.8/site-packages/vcontact2/contig_clusters.py", line 91, in init
self.clusters, self.cluster_results = self.one_cluster(os.path.join(self.folder, self.name),
File "/home/lk/miniconda3/envs/vContact2/lib/python3.8/site-packages/vcontact2/contig_clusters.py", line 227, in one_cluster
return self.load_one_clusters(fi_clusters)
File "/home/lk/miniconda3/envs/vContact2/lib/python3.8/site-packages/vcontact2/contig_clusters.py", line 318, in load_one_clusters
clusters_df = pd.read_csv(one_fn, header=0)
File "/home/lk/miniconda3/envs/vContact2/lib/python3.8/site-packages/pandas/io/parsers.py", line 685, in parser_f
return _read(filepath_or_buffer, kwds)
File "/home/lk/miniconda3/envs/vContact2/lib/python3.8/site-packages/pandas/io/parsers.py", line 457, in _read
parser = TextFileReader(fp_or_buf, **kwds)
File "/home/lk/miniconda3/envs/vContact2/lib/python3.8/site-packages/pandas/io/parsers.py", line 895, in init
self._make_engine(self.engine)
File "/home/lk/miniconda3/envs/vContact2/lib/python3.8/site-packages/pandas/io/parsers.py", line 1135, in _make_engine
self._engine = CParserWrapper(self.f, **self.options)
File "/home/lk/miniconda3/envs/vContact2/lib/python3.8/site-packages/pandas/io/parsers.py", line 1917, in init
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 545, in pandas._libs.parsers.TextReader.cinit
pandas.errors.EmptyDataError: No columns to parse from file
It seems that there is a problem with the ClusterONE .jar file or with RefSeq97?
Thank you for your help
Best wishes
Giesela
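In both logs the pandas error comes after the Java exception: ClusterONE's output is redirected into c1.clusters, the Java process stops before writing any clusters, and pd.read_csv then raises EmptyDataError because the file has no content. A minimal sketch for confirming that on disk is below; describe_cluster_output is a hypothetical helper, and the zero-byte assumption follows from the shell redirect in the logged command rather than from the vContact2 source.

```python
import os


def describe_cluster_output(path):
    """Classify a ClusterONE output file as 'missing', 'empty', or 'ok'."""
    if not os.path.exists(path):
        return "missing"
    if os.path.getsize(path) == 0:
        # An empty file is what makes pandas report "No columns to parse from file".
        return "empty"
    return "ok"
```

If this returns "empty" for c1.clusters, the pandas version is not the cause, and the thing to investigate is why ClusterONE could not read c1.ntw.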