I am relatively new to R, so any help is appreciated.
Below is a sample dataset; I have also attached an image of the same sample dataset.
    A            B            C     D
    12.97221     64.78909     1     2
    69.64817     321.90037    2     28
    318.87946    259.29946    3     5
    326.17622    94.7089      9     8
    137.54006    325.34917    5     88
    258.06002    94.77531     6     63
    258.92824    322.20164    7     64
    98.57514     12.96828     8     34
    98.46303     139.27264    9     21
    317.22764    261.25563    10    97
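To make the example reproducible, the same sample data could be entered in R roughly as follows. The object name `df` is just a name chosen for illustration; the column names A to D follow the table above.

```r
# Sample data from the table above.
# `df` is a hypothetical name; columns A-D mirror the table headers.
df <- data.frame(
  A = c(12.97221, 69.64817, 318.87946, 326.17622, 137.54006,
        258.06002, 258.92824, 98.57514, 98.46303, 317.22764),
  B = c(64.78909, 321.90037, 259.29946, 94.7089, 325.34917,
        94.77531, 322.20164, 12.96828, 139.27264, 261.25563),
  C = c(1, 2, 3, 9, 5, 6, 7, 8, 9, 10),
  D = c(2, 28, 5, 8, 88, 63, 64, 34, 21, 97)
)

# Quick structural check: 10 rows, 4 numeric columns
str(df)
```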
My goal: for each row, I need to
1) look at the value in column A;
2) find the nearest number in column B;
3) check whether that value in column B has already been selected;
4) if it has already been selected, ignore it and take the next-closest value;
5) once a new, non-duplicated value from column B has been found,
6) check whether the value in column C on the same row as the column A value differs from the value in column D on the same row as the chosen column B value;
7) if the values in columns C and D are NOT the same,
8) return the value from column B in a new column;
9) if the values in columns C and D are the same, repeat steps 4-7 until a) a new, non-duplicated value is chosen and b) the values in C and D are not equal.
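Steps 2 and 4 ("nearest" and "next closest") amount to ranking column B by its distance from the current column A value. A minimal sketch of that ranking, using the second column A value from the sample data; the names `a_i`, `b` and `candidates` are chosen here for illustration only:

```r
# One value from column A and the whole of column B (sample data)
a_i <- 69.64817
b <- c(64.78909, 321.90037, 259.29946, 94.7089, 325.34917,
       94.77531, 322.20164, 12.96828, 139.27264, 261.25563)

# order() returns the indices of b sorted from nearest to farthest,
# so candidates[1] is the closest value, candidates[2] the next closest, etc.
candidates <- order(abs(b - a_i))

# The three closest values, nearest first: 64.78909, 94.7089, 94.77531
b[candidates][1:3]
```

Walking through `candidates` in order, and skipping any entry that fails a test, gives the "ignore and choose the next closest value" behaviour without having to delete elements from `b`.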
Here is the code I have so far. It solves the problem of finding the nearest number "without replacement", but it does not check the values in columns C and D before a value from column B is chosen. It was developed by "Chase" in an answer to: How to get the closest element in a vector for every element in another vector without duplicates?
    foo <- function(a, b) {
      out <- cbind(a, bval = NA)
      for (i in seq_along(a)) {
        # Which value of b is closest to a[i]?
        whichB <- which.min(abs(b - a[i]))
        # Assign that value to the bval column
        out[i, "bval"] <- b[whichB]
        # Remove that value of b so it cannot be chosen again
        b <- b[-whichB]
      }
      return(out)
    }
Hopefully the following is a better description and example of my problem; see the adjusted table above.

Start with the first value in column A, 12.97221. The closest value in column B is 12.96828. The value in column C on the row of 12.97221 is 1, and the value in column D on the row of 12.96828 is 34. Since 12.96828 has not been selected before and the values in columns C and D do not match, I would expect 12.96828 to be returned in column E.

Next, take the second value in column A, 69.64817. The closest value in column B is 64.78909, which has not been chosen yet. However, the value in column C on the row of 69.64817 is 2, and the value in column D on the row of 64.78909 is also 2. Although this is the first time 64.78909 is considered, the values in columns C and D are the same, so I need to pick the next-closest value in column B, 94.7089, and check whether it has been chosen; it has not. The value in column C on the row of 69.64817 is 2, and the value in column D on the row of 94.7089 is 8 in the table above. Since 94.7089 has not been chosen yet and the values in columns C (2) and D (8) are not the same, 94.7089 should be returned in column E.
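The procedure described above could be sketched roughly as follows. This is only one possible reading of the steps, not a tested solution: it is greedy in row order, it assumes that a column B value rejected because C equals D stays available for later rows, and it returns NA when no column B value qualifies. The function name `match_nearest` and the names `used`, `c_val`, `d_val` and `e` are made up for illustration.

```r
# Sample data (columns A, B, C, D from the table)
a <- c(12.97221, 69.64817, 318.87946, 326.17622, 137.54006,
       258.06002, 258.92824, 98.57514, 98.46303, 317.22764)
b <- c(64.78909, 321.90037, 259.29946, 94.7089, 325.34917,
       94.77531, 322.20164, 12.96828, 139.27264, 261.25563)
c_val <- c(1, 2, 3, 9, 5, 6, 7, 8, 9, 10)
d_val <- c(2, 28, 5, 8, 88, 63, 64, 34, 21, 97)

# Hypothetical helper: nearest unused b for each a, subject to C != D
match_nearest <- function(a, b, c_val, d_val) {
  used <- rep(FALSE, length(b))   # TRUE once a b value has been selected
  e <- rep(NA_real_, length(a))   # result column E; NA if nothing qualifies
  for (i in seq_along(a)) {
    # Try the b values from nearest to farthest from a[i]
    for (j in order(abs(b - a[i]))) {
      # Accept the first b value that is unused and whose D differs from C[i]
      if (!used[j] && c_val[i] != d_val[j]) {
        e[i] <- b[j]
        used[j] <- TRUE
        break
      }
    }
  }
  e
}

E <- match_nearest(a, b, c_val, d_val)
E[1:2]  # first two rows: 12.96828 and 94.7089, as in the walk-through
```

Because rows are processed top to bottom, an early row can take a value that a later row is closer to, which is the same trade-off the original `foo` function makes.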
Again, thanks in advance; hopefully I have described my problem sufficiently.