Sunday, January 28, 2018

Convert Excel into CSV using pandas

Donghuas-MacBook-Air:Downloads donghua$ python
Python 3.6.3 |Anaconda, Inc.| (default, Oct  6 2017, 12:04:38) 
[GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.read_excel("/Users/donghua/Downloads/LN University.xls",sheet_name="Sheet1",header=None, skiprows=3)
>>> df.head(1)
   0       1             2            3       4    5       6   7   \
0   1  鞍山师范学院  201310169001  花楸果实中花青素的提取  创新训练项目  侯文锋  110604   4   

                                          8   9   10     11    12     13   14  \
0  杨晓龙 110607 \n王博  110505 \n陈中意 110629       辛广  教授  15000  5000  10000  550   

                                                  15  
0  本项目以花楸为原材料,通过用表面活性剂结合酸化的常规提取剂辅助超声波法提取花楸果实中花青素,...  
>>> df.to_csv("/Users/donghua/Downloads/LN_University_20180125.csv",sep='\t',header=False, encoding='utf-8') 
>>> exit()

Donghuas-MacBook-Air:Downloads donghua$ 
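
The to_csv call in the session above does not pass index=False, so pandas also writes the row index as the first column of the file. The sketch below shows the difference on a small made-up frame; the data and variable names are hypothetical and not taken from the spreadsheet.

```python
import pandas as pd

# Hypothetical stand-in data, not the real spreadsheet contents.
df = pd.DataFrame([[1, "x"], [2, "y"]])

# With no path argument, to_csv returns the CSV text as a string.
with_index = df.to_csv(sep='\t', header=False)
without_index = df.to_csv(sep='\t', header=False, index=False)

# splitlines() is used so the comparison does not depend on the line terminator.
print(with_index.splitlines())     # each row starts with the row index
print(without_index.splitlines())  # only the data columns are written
```

If the leading index column is not wanted in the exported file, adding index=False to the to_csv call should drop it.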


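import pandas as pd

# Skip the first 3 rows of the sheet and treat no row as a header.
df = pd.read_excel("/Users/donghua/Downloads/LN University.xls", sheet_name="Sheet1", header=None, skiprows=3)
# Some cells contain embedded newlines (see column 8 in the output above),
# which would spread a single record over several lines of the CSV file.
# Option 1: replace newlines column by column
#df[8] = df[8].str.replace('\n', ' ')
#df[15] = df[15].str.replace('\n', ' ')
# Option 2: replace newlines across the whole data frame
df = df.replace({r'\n': ' '}, regex=True)
# Note: pandas 1.5 renamed line_terminator to lineterminator, and pandas 2.0 removed the old name.
df.to_csv("/Users/donghua/Downloads/LN_University_20180125.csv", sep=',', line_terminator='\n', escapechar='\\', header=False, encoding='utf-8')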
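
To check that the newline replacement does what is intended, here is a minimal sketch on a small made-up frame. The sample values are hypothetical stand-ins for the multi-line cells in column 8. It compares the number of physical lines in the CSV text before and after the replace.

```python
import pandas as pd

# Hypothetical stand-in data: the second column of the first row
# holds two entries separated by an embedded newline.
df = pd.DataFrame({0: [1, 2], 1: ["a 110607 \nb 110505", "c"]})

# Without cleaning, the quoted multi-line cell spans two physical lines.
raw_lines = df.to_csv(header=False, index=False).splitlines()

# Same whole-frame replace as in the script above.
cleaned = df.replace({r'\n': ' '}, regex=True)
clean_lines = cleaned.to_csv(header=False, index=False).splitlines()

print(len(raw_lines))    # more physical lines than rows
print(len(clean_lines))  # one physical line per row
```

A CSV parser that honours quoting can still read the uncleaned output, so the replacement mainly helps tools that process the file line by line.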