splitCsv includes the byte order mark as part of the first column of the first line. #3596

Open
sclamons opened this issue Feb 1, 2023 · 1 comment

sclamons commented Feb 1, 2023

Bug report

Expected behavior and actual behavior

Expected: When Nextflow reads a CSV file that begins with a byte order mark (BOM) using splitCsv, the BOM should be removed. Excel writes a BOM at the start of files saved in its "CSV UTF-8" format.

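Actual: Nextflow keeps the byte order mark as (invisible) text at the start of the first column of the first line. With header:true, the BOM becomes part of the first column name, so looking up that column by its visible name fails.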

Steps to reproduce the problem

Save the following CSV text using Excel, in CSV UTF-8 format, as "my_csv.csv":

param_1,param_2
val_1,val_2
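
For anyone without Excel, the same kind of file can probably be produced with a few lines of Python. This is only a sketch: the helper name write_bom_csv is made up for illustration, and it assumes that plain "\n" line endings (rather than whatever Excel writes) make no difference to the bug.

```python
from pathlib import Path

# The two CSV lines from the report. Assumption: "\n" line endings are
# close enough to Excel's output for the purpose of reproducing the issue.
CSV_TEXT = "param_1,param_2\nval_1,val_2\n"


def write_bom_csv(path="my_csv.csv"):
    """Write CSV_TEXT preceded by the UTF-8 byte order mark (EF BB BF)."""
    # The "utf-8-sig" codec prepends the BOM when encoding.
    data = CSV_TEXT.encode("utf-8-sig")
    target = Path(path)
    target.write_bytes(data)
    return target


if __name__ == "__main__":
    out = write_bom_csv()
    # Show the first three bytes so the BOM is visible.
    print(out.read_bytes()[:3])
```

Running the script in the directory that holds the Nextflow file should leave a my_csv.csv that starts with the three BOM bytes.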

Nextflow file:

nextflow.enable.dsl = 2

workflow {
  Channel.fromPath('my_csv.csv')
    | splitCsv(header:true, strip:true)
    | view {"row contains parameter 'param_1'?: ${it.containsKey('param_1')}; row contains parameter 'param_2'?: ${it.containsKey('param_2')}"}
}

Expected output:
"row contains parameter 'param_1'?: true; row contains parameter 'param_2'?: true"

Actual output:
"row contains parameter 'param_1'?: false; row contains parameter 'param_2'?: true"

Program output

N E X T F L O W ~ version 21.10.6
Launching bug_test_main.nf [serene_sax] - revision: 281274f014
row contains parameter 'param_1'?: false; row contains parameter 'param_2'?: true
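
The false result for 'param_1' above is consistent with the BOM (U+FEFF) being glued onto the first header name. The Python sketch below illustrates that mechanism outside of Nextflow; it says nothing about how splitCsv is implemented, and the helper first_row is hypothetical. It parses the same bytes twice, once keeping the BOM and once dropping it.

```python
import csv
import io

# Bytes of the example file: UTF-8 BOM followed by the two CSV lines.
RAW = b"\xef\xbb\xbfparam_1,param_2\nval_1,val_2\n"


def first_row(raw, encoding):
    """Decode raw bytes and return the first data row keyed by header name."""
    text = raw.decode(encoding)
    return next(csv.DictReader(io.StringIO(text)))


# Plain "utf-8" keeps the BOM as U+FEFF at the start of the first header.
with_bom = first_row(RAW, "utf-8")
# "utf-8-sig" drops a leading BOM while decoding.
without_bom = first_row(RAW, "utf-8-sig")

print("param_1" in with_bom)     # False
print("param_2" in with_bom)     # True
print("param_1" in without_bom)  # True
```

The first two results match the actual output reported above, and the third matches the expected output, which suggests that dropping a leading U+FEFF before the header is parsed would be enough to fix the lookup.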

Environment

  • Nextflow version: 21.10.6
  • Java version: openjdk 11.0.1 2018-10-16 LTS
  • Operating system: "CentOS Linux release 7.9.2009 (Core)" (3.10.0-1160.76.1.el7.x86_64)
  • Bash version: GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)

Additional context

nextflow.log

stale bot commented Aug 12, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Aug 12, 2023