Merge dataset without duplicates in R


I'm trying to merge two data frames in R.

    df1 = data.frame(customerid = c(1:5, 5), product = c(rep("toaster", 3), rep("radio", 3)))
    df2 = data.frame(customerid = c(2, 4, 4, 6, 7), state = c(rep("alabama", 2), rep("ohio", 3)))

    loj = merge(x = df1, y = df2, by = "customerid", all.x = TRUE)

Actual result:

      customerid product   state
    1          1 toaster    <NA>
    2          2 toaster alabama
    3          3 toaster    <NA>
    4          4   radio alabama
    5          4   radio    ohio
    6          5   radio    <NA>
    7          5   radio    <NA>

Expected result:

      customerid product   state
    1          1 toaster    <NA>
    2          2 toaster alabama
    3          3 toaster    <NA>
    4          4   radio alabama
    5          5   radio    <NA>
    6          5   radio    <NA>

However, if you look at rows 4 and 5 of the actual result, the entry for customerid 4 is repeated. How can I prevent that? I only want the first match and do not care about any other matches that may occur in df2. Essentially, the merged result should have the same row count as df1.

Thanks

One way is to create an index vector of the duplicates you want to remove, and then subset loj based on ind:

    # Positions of the extra duplicated rows; assumes loj has more rows than df1
    ind <- which(duplicated(loj$customerid))[1:abs(nrow(df1) - nrow(loj))]
    loj[-ind, ]
    #  customerid product   state
    #1          1 toaster    <NA>
    #2          2 toaster alabama
    #3          3 toaster    <NA>
    #4          4   radio alabama
    #6          5   radio    <NA>
    #7          5   radio    <NA>
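
Another option, offered as a sketch rather than a general solution, is to avoid creating the extra rows in the first place. Since only the first match per customerid is wanted, you can either drop the repeated keys from df2 before merging, or look up the state with match(), which returns only the position of the first match. The names df2_first and loj2 are introduced here purely for illustration; the data is the sample from the question.

```r
# Same sample data as in the question
df1 <- data.frame(customerid = c(1:5, 5),
                  product = c(rep("toaster", 3), rep("radio", 3)))
df2 <- data.frame(customerid = c(2, 4, 4, 6, 7),
                  state = c(rep("alabama", 2), rep("ohio", 3)))

# Option 1: keep only the first row per customerid in df2, then left-join
df2_first <- df2[!duplicated(df2$customerid), ]
loj <- merge(x = df1, y = df2_first, by = "customerid", all.x = TRUE)

# Option 2: match() gives the index of the first match in df2 (NA if none),
# so the result keeps df1's row count and row order
loj2 <- df1
loj2$state <- df2$state[match(df1$customerid, df2$customerid)]
```

Both variants keep the duplicated customerid 5 rows from df1, because only df2 is de-duplicated. Note that merge() sorts the result by the key column by default, while the match() version leaves df1's order untouched.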
