Merge datasets without duplicates in R
I'm trying to merge two data frames in R.
df1 = data.frame(customerid = c(1:5, 5), product = c(rep("toaster", 3), rep("radio", 3)))
df2 = data.frame(customerid = c(2, 4, 4, 6, 7), state = c(rep("alabama", 2), rep("ohio", 3)))
# left outer join on customerid: keep every row of df1
loj = merge(x = df1, y = df2, by = "customerid", all.x = TRUE)
Actual result:
  customerid product   state
1          1 toaster    <NA>
2          2 toaster alabama
3          3 toaster    <NA>
4          4   radio alabama
5          4   radio    ohio
6          5   radio    <NA>
7          5   radio    <NA>
Expected result:
  customerid product   state
1          1 toaster    <NA>
2          2 toaster alabama
3          3 toaster    <NA>
4          4   radio alabama
5          5   radio    <NA>
6          5   radio    <NA>
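However, in rows 4 and 5 of the actual result the entry for customerid 4 is repeated, because df2 contains two rows for that customer. How can I prevent that? I want only the first match to be used and do not care about any further matches that may occur in df2. Essentially, the merged result should have the same row count as df1.
Thanks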
One way is to create an index vector of the duplicates you want to remove and then subset loj based on ind:
# positions of repeated customerid values, limited to the number of extra rows the merge added
ind <- which(duplicated(loj$customerid))[1:abs(nrow(df1) - nrow(loj))]
loj[-ind, ]
#  customerid product   state
#1          1 toaster    <NA>
#2          2 toaster alabama
#3          3 toaster    <NA>
#4          4   radio alabama
#6          5   radio    <NA>
#7          5   radio    <NA>
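Another option, offered only as a sketch, is to drop the repeated keys from df2 before merging rather than cleaning up loj afterwards. It assumes that "first match" means the first row for each customerid in df2's current row order; the name df2_first is introduced here for illustration and is not part of the original post.

```r
# Same example data as in the question
df1 <- data.frame(customerid = c(1:5, 5),
                  product = c(rep("toaster", 3), rep("radio", 3)))
df2 <- data.frame(customerid = c(2, 4, 4, 6, 7),
                  state = c(rep("alabama", 2), rep("ohio", 3)))

# Keep only the first row per customerid in df2
# (df2_first is a hypothetical helper name)
df2_first <- df2[!duplicated(df2$customerid), ]

# Left outer join: each df1 row can now match at most one df2 row,
# so the result has exactly nrow(df1) rows
loj <- merge(x = df1, y = df2_first, by = "customerid", all.x = TRUE)
loj
```

Because every key is unique in df2_first, the row count cannot grow, no matter where duplicates sit in df1. Note that merge() sorts the result by the key column by default, so the row order may differ from df1 if df1 is not already sorted by customerid.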