python - Merge two csv files based on key and secondary key
I want to merge two CSV files as follows:
csv1:
    formula,solver,runtime,conflicts
    cbs_k3_n100_m403_b30_13.cnf,swdia5by,0.001842,318
    cbs_k3_n100_m403_b30_13.cnf,glucose,0.001842,318
csv2:
    formula,entropy,num sols
    cbs_k3_n100_m403_b30_13.cnf,0.202,707286
desired output:
    formula,solver,runtime,conflicts,entropy,solutions
    cbs_k3_n100_m403_b30_13.cnf,swdia5by,0.001842,318,0.202,707286
    cbs_k3_n100_m403_b30_13.cnf,glucose,0.001842,318,0.202,707286
So I took the intersection of the keys of the two dictionaries (one per CSV) and used a list comprehension:
    keysa = set(dict1.keys())
    keysb = set(dict2.keys())
    keys = keysa & keysb
    ...
    [[key] + dict1.get(key, []) + dict2.get(key, []) for key in keys]
But there are 'duplicate' rows (which I need) where the formula field is the same but the solver field isn't, and the output is:
    formula,solver,runtime,conflicts,entropy,solutions
    cbs_k3_n100_m403_b30_13.cnf,swdia5by,0.001842,318,0.202,707286
How can I keep those rows using a list comprehension, or in some other way?
I'd appreciate any help.
Edit: added an example.
Why don't you use pandas? This is pretty easy in pandas:
    import pandas as pd
    df1 = pd.read_csv("1.csv")
    df = pd.read_csv("2.csv")
    result = df1.merge(df, on="formula")
    result.to_csv("result.csv")
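As a rough check of this merge on the sample rows from the question, here is a minimal sketch. It reads both CSVs from in-memory strings (io.StringIO stands in for 1.csv and 2.csv), and it assumes you want "num sols" renamed to "solutions" to match the desired output. Passing index=False is also an addition here; it keeps the DataFrame index from being written as an extra first column.

```python
import io

import pandas as pd

# Sample data copied from the question; StringIO stands in for the real files.
csv1 = io.StringIO(
    "formula,solver,runtime,conflicts\n"
    "cbs_k3_n100_m403_b30_13.cnf,swdia5by,0.001842,318\n"
    "cbs_k3_n100_m403_b30_13.cnf,glucose,0.001842,318\n"
)
csv2 = io.StringIO(
    "formula,entropy,num sols\n"
    "cbs_k3_n100_m403_b30_13.cnf,0.202,707286\n"
)

df1 = pd.read_csv(csv1)
# Assumption: rename the column so the header matches the desired output.
df2 = pd.read_csv(csv2).rename(columns={"num sols": "solutions"})

# Inner merge on "formula": every row of df1 is paired with the matching
# row of df2, so both solver rows survive.
result = df1.merge(df2, on="formula")

# index=False leaves the DataFrame index out of the CSV text.
merged_csv = result.to_csv(index=False)
print(merged_csv)
```

Because the merge is row-based rather than dictionary-based, the two rows that share a formula but differ in solver are both kept.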
You can also use
    result = df1.merge(df, on="formula", how="outer")
to keep formulas that one CSV has and the other doesn't.
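If pandas is not an option, the list-comprehension approach from the question can probably be made to work by iterating over the rows of the first CSV instead of over a set of keys, so rows that share a formula are not collapsed. The sketch below uses only the standard csv module; the helper name merge_on_formula is hypothetical, and it assumes each formula appears at most once in the second CSV.

```python
import csv
import io

def merge_on_formula(rows1, rows2, key="formula"):
    """Join each row of rows1 with the rows2 row that has the same key.

    Hypothetical helper: assumes the key is unique in rows2. Rows of rows1
    with no match in rows2 are dropped (inner-join behaviour).
    """
    lookup = {row[key]: row for row in rows2}
    # Iterate over rows, not keys, so duplicates of the key in rows1 are kept.
    return [{**row, **lookup[row[key]]} for row in rows1 if row[key] in lookup]

# Sample data from the question; StringIO stands in for the real files.
csv1 = io.StringIO(
    "formula,solver,runtime,conflicts\n"
    "cbs_k3_n100_m403_b30_13.cnf,swdia5by,0.001842,318\n"
    "cbs_k3_n100_m403_b30_13.cnf,glucose,0.001842,318\n"
)
csv2 = io.StringIO(
    "formula,entropy,num sols\n"
    "cbs_k3_n100_m403_b30_13.cnf,0.202,707286\n"
)

rows1 = list(csv.DictReader(csv1))
rows2 = list(csv.DictReader(csv2))
merged = merge_on_formula(rows1, rows2)

# Write the merged rows back out as CSV text.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=list(merged[0].keys()), lineterminator="\n")
writer.writeheader()
writer.writerows(merged)
print(out.getvalue())
```

The original attempt lost rows because a dictionary keyed on formula can hold only one entry per formula; here only the second CSV is turned into a lookup table.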