Using multiple threads makes very little sense for a depth-first search, as both tree traversal approaches (depth-first, breadth-first) impose a fixed order on when the nodes are visited.
Maybe you can show us some examples of how you think the nodes should be visited if multiple threads are iterating over a tree.
Usually, the "best" algorithmic approach is to have a worker pool of trees threads and as long as there are multiple nodes at the current level of the current tree and workers available in the pool, spawn send off a worker thread for every child node until the pool is exhausted and process the remaining nodes using the current thread.
Update: Corrected some words, as per ikegamis reply