I have the code running as a daemon process on two servers on a shared NFS mount.
So either process can find the files and put the list into its own @tmparray.
Each process then locks the files for further processing, so only one process is able to lock and process a given file.
When a file is removed by the other process between the finding and the sorting, I get the error.
The code continues to work, but it occasionally writes these errors to my logs.
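The locking step described above could be sketched roughly as follows (the function name and processing step are placeholders; note also that flock over NFS only works reliably when the NFS lock manager is functional):

```perl
use strict;
use warnings;
use Fcntl qw(:flock);

# Sketch: try to take an exclusive, non-blocking lock before processing,
# so whichever daemon locks a file first wins and the other skips it.
# NB: flock() over NFS relies on a working lock manager (lockd / NFSv4).
sub try_to_process {
    my ($file) = @_;
    open my $fh, '<', $file or return 0;       # file may already be gone
    unless (flock($fh, LOCK_EX | LOCK_NB)) {   # other daemon holds the lock
        close $fh;
        return 0;
    }
    # ... process $file here ...
    close $fh;                                 # closing releases the lock
    return 1;
}
```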
If you have two concurrent processes, then I'm afraid there isn't much you can do to prevent this from happening (at least within the context of what you explained). The best you can do is reduce the probability of it happening by filtering the list with a grep.
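As a sketch of that filtering idea (the function name is made up; @tmparray stands for the list you already collect), re-checking existence with grep just before the sort drops files the other daemon has deleted in the meantime:

```perl
use strict;
use warnings;

# Sketch: keep only files that still exist, then sort by age (-M).
# This narrows the race window but cannot close it completely: a file
# can still vanish between the grep and the sort.
sub sort_existing_by_age {
    my @tmparray = @_;                          # paths found earlier
    my @existing = grep { -f $_ } @tmparray;    # drop deleted files
    return sort { -M $a <=> -M $b } @existing;  # smallest age (newest) first
}
```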
However, you might ask yourself the following questions: is it really necessary (or useful) to have two concurrent processes running on the same set of files? If you really want two concurrent processes, can't you "specialize" them, i.e. tell them to work on different file sets (based, for example, on the file names, file owner, age, or some other property of the files)? I cannot help thinking that something is likely wrong in your design if two concurrent processes work on the same files, process them, and delete some of them.
Another solution would be, when you build your list, to record in an AoA or an AoH not only the file names but also their age. Your sort could then operate on the filename/age pairs you've collected, and you would no longer have a problem when sorting them. You would, however, still be processing names of files that no longer exist; whether that makes sense depends on the bigger picture, which we don't know.
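A sketch of that AoA variant (the function name is made up): record each file's age once, at scan time, and sort on those cached pairs so a later deletion can no longer make the comparison warn:

```perl
use strict;
use warnings;

# Sketch: snapshot [name, age] pairs when building the list, then sort on
# the cached age rather than re-stat()ing the file inside the sort block.
sub snapshot_and_sort {
    my @tmparray = @_;
    my @pairs;
    for my $file (@tmparray) {
        my $age = -M $file;                    # undef if already deleted
        push @pairs, [ $file, $age ] if defined $age;
    }
    # The comparison uses stored numbers only, so it cannot warn even if
    # some of these files are deleted before (or during) the sort.
    return map { $_->[0] } sort { $a->[1] <=> $b->[1] } @pairs;
}
```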
BTW, these are warnings, not errors. I hate to say it, but if there is no consequence for your process, you might as well decide to ignore them or even silence them (although I am very reluctant about this type of decision; it is not what I would do in such a case).
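If you do decide to silence them, a $SIG{__WARN__} hook can drop just the one pattern while letting everything else through. The message text matched below is an assumption, since the thread never shows the exact warning; adjust it to what actually appears in your STDERR log:

```perl
use strict;
use warnings;

# Sketch: swallow only the known-harmless warnings caused by files
# vanishing mid-sort; pass every other warning through unchanged.
# (Inside a __WARN__ hook, calling warn() does not re-trigger the hook.)
my $harmless = qr/Use of uninitialized value/;   # assumed message pattern
$SIG{__WARN__} = sub {
    my ($msg) = @_;
    return if $msg =~ $harmless;
    warn $msg;
};
```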
The point of having two concurrent processes is high availability. Both machines run exactly the same workload, providing an Active/Active HA solution.
You're right, these are warnings. The overall process is working, but I see these messages in my STDERR log file, which is why I want to address it.
I'm going to try filtering the list with a grep.