When you come from R, you may be used to row numbers not being stored in the file when you save to a CSV or an Excel file. In Python’s Pandas, it is a different story: by default, the index is included in your export. Let’s get rid of that.
Turning a Pandas data frame into a CSV is done using the to_csv method. If we don’t want to include the index (or “row names”), two arguments are relevant:
- index: which defines if we want indices to be included.
- index_label: which defines the column header of the included index
In brief, simply setting the index argument to False will omit the index, as follows:
df.to_csv(index=False)
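To make the difference concrete, here is a minimal sketch that compares the default export with index=False. The data frame and its column names are made up for illustration, and to_csv is called without a file path so that it returns the CSV as a string instead of writing a file.

```python
import pandas as pd

# Small illustrative data frame (contents are an assumption, not real data).
df = pd.DataFrame({
    "first_column": ["2019-12-21", "2019-10-21", "2019-09-21"],
    "second_column": ["apple", "pineapple", "pear"],
})

# Without a path argument, to_csv returns the CSV text as a string.
with_index = df.to_csv()                # default: index=True
without_index = df.to_csv(index=False)  # index column omitted

print(with_index)
print(without_index)
```

In the default export, the header starts with an empty field and every row starts with its index value. With index=False, only the two data columns remain.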
There is a particular thing to be noted in the Pandas documentation. The index_label description contains the following lines.
If False do not print fields for index names. Use index_label=False for easier importing in R.
This is likely a mistake, because setting index to True but index_label to False makes loading the CSV into R even harder: the indices are printed, but they have no header.
first_column,second_column
0,2019-12-21,apple
1,2019-10-21,pineapple
2,2019-09-21,pear
3,2019-02-22,mango
4,2019-03-23,kiwi
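The header mismatch can be reproduced with a short sketch. The data frame is again illustrative, and the label "row_id" is a hypothetical name chosen only to show what index_label does when it is given a string.

```python
import pandas as pd

# Illustrative data frame (contents are an assumption).
df = pd.DataFrame({
    "first_column": ["2019-12-21", "2019-10-21"],
    "second_column": ["apple", "pineapple"],
})

# Index values are written, but the header gets no field for them.
headerless_index = df.to_csv(index=True, index_label=False)
lines = headerless_index.splitlines()
header_fields = lines[0].split(",")
row_fields = lines[1].split(",")
print(len(header_fields), len(row_fields))

# Giving the index a header instead ("row_id" is a hypothetical label).
labelled_index = df.to_csv(index=True, index_label="row_id")
print(labelled_index.splitlines()[0])
```

The header line has one field fewer than each data row when index_label is False, while passing a string label keeps the header and the rows aligned.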