- changed status to resolved
Column header (field names) returned with data
I'm sure this is intentional, but I find it unhelpful: the first row returned by the SAS7BDAT object is not data, but metadata: the column header. I'm transferring data from a SAS dataset to a SQLite database, and I would like to write

    with SAS7BDAT(datasetFileName, extra_date_format_strings=['YYMMDDS', 'MMDDYYS']) as ds:
        self.connection.executemany(insertStatement, ds)

but instead I must write

    with SAS7BDAT(datasetFileName, extra_date_format_strings=['YYMMDDS', 'MMDDYYS']) as ds:
        for i, row in enumerate(ds):
            if i == 0:
                continue  # skip the column header row
            self.connection.execute(insertStatement, row)

to prevent the column header from being written to my table. My proposal is to remove the "yield [x.name for x in self.columns]" from SAS7BDAT.readlines(). My argument is that it shouldn't be there: metadata is not the same kind of thing as data and should not be part of the result set.
Two notes: 1. I'm a python noob, so perhaps there is a simple way to get the ds object to skip the first row. I tried iter(ds), but it did not help. 2. I'm trying to move away from SAS toward python + SQLite, and your SAS7BDAT is a big help. Thanks for the effort!
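For anyone hitting the same problem, one way to skip the first row without an explicit index check is `itertools.islice`. The sketch below is only an illustration: `fake_readlines` is a hypothetical stand-in for iterating a SAS7BDAT object (header row first, then data rows, as described above), and the `demo` table and its columns are made up.

```python
import itertools
import sqlite3


def fake_readlines():
    """Hypothetical stand-in for a SAS7BDAT iterator: header first, then data."""
    yield ["id", "name"]   # column header (metadata)
    yield [1, "alpha"]     # data rows
    yield [2, "beta"]


def load_without_header(connection, rows):
    """Insert every row except the first one (the column header)."""
    data_rows = itertools.islice(rows, 1, None)  # lazily drop row 0
    connection.executemany(
        "INSERT INTO demo (id, name) VALUES (?, ?)", data_rows
    )


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE demo (id INTEGER, name TEXT)")
load_without_header(connection, fake_readlines())
stored = connection.execute("SELECT id, name FROM demo ORDER BY id").fetchall()
```

Because `islice` is lazy, the rows are still streamed into `executemany` one at a time rather than being collected into a list first.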
Comments (3)
-
repo owner Thanks for the feedback. Although you could grab the generator using

    generator = ds.readlines()

and then just do

    next(generator)

to skip the header, I went ahead and added a skip_header kwarg to the constructor. So, in your case, you can install the latest code (1.0.5) and do this:

    from sas7bdat import SAS7BDAT

    with SAS7BDAT(datasetFileName, extra_date_format_strings=['YYMMDDS', 'MMDDYYS'], skip_header=True) as ds:
        self.connection.executemany(insertStatement, ds)
-
reporter Thanks: it worked as advertised.
add a "skip_header" kwarg to constructor: fixes #6 for those who don't want the header included with the data. Simply pass in the skip_header=True kwarg to the constructor and the readlines() method will skip the header → <<cset 006621a25699>>
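As a rough illustration of the behaviour this commit describes, a generator-based reader can make the header row optional with a constructor flag. This is a minimal sketch and not the sas7bdat implementation: the `HeaderedReader` class, its arguments, and the sample rows are all hypothetical.

```python
class HeaderedReader:
    """Hypothetical reader; NOT the actual sas7bdat code."""

    def __init__(self, column_names, rows, skip_header=False):
        self.column_names = list(column_names)
        self.rows = list(rows)
        self.skip_header = skip_header

    def readlines(self):
        # Yield the header row only when the caller has not opted out.
        if not self.skip_header:
            yield self.column_names
        for row in self.rows:
            yield row

    def __iter__(self):
        # Iterating the reader delegates to readlines().
        return self.readlines()


with_header = list(HeaderedReader(["id", "name"], [[1, "alpha"]]))
without_header = list(
    HeaderedReader(["id", "name"], [[1, "alpha"]], skip_header=True)
)
```

Defaulting the flag to `False` in this sketch keeps existing callers, who expect the header as the first row, working unchanged.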