I have read the distributed TensorFlow documentation and this answer.
According to this, in the data parallelism approach (see the sketch after this list):
- The algorithm distributes the data between various cores.
- Each core independently tries to estimate the same parameter(s).
- Cores then exchange their estimate(s) with each other to come up with the right estimate for the step.
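To make sure I understand that, here is how I would sketch a single data-parallel step in plain NumPy (a toy illustration I wrote, not TensorFlow code; the linear model, learning rate, and core count are all made up):

```python
import numpy as np

# Toy data-parallel step for a linear model y ~ X @ w (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))   # one global batch of data
y = rng.normal(size=(8,))
w = np.zeros(3)               # the *same* parameters live on every core
num_cores = 4

def local_gradient(X_shard, y_shard, w):
    # Each core computes the mean-squared-error gradient on its own shard.
    err = X_shard @ w - y_shard
    return 2.0 * X_shard.T @ err / len(y_shard)

# 1. The data is split between the cores.
X_shards = np.array_split(X, num_cores)
y_shards = np.array_split(y, num_cores)

# 2. Each core independently estimates the gradient of the same parameters.
grads = [local_gradient(Xs, ys, w) for Xs, ys in zip(X_shards, y_shards)]

# 3. The cores exchange their estimates (averaged here) to get this step's update.
w -= 0.1 * np.mean(grads, axis=0)
```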
And in the model parallelism approach (again, see the sketch after the list):
- The algorithm sends the same data to all the cores.
- Each core is responsible for estimating different parameter(s).
- Cores then exchange their estimate(s) with each other to come up with the right estimate for all the parameters.
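And here is how I would sketch model parallelism, again as a toy NumPy illustration (the two-layer net and its shapes are made up): the same batch goes to both "cores", but each core owns only its own layer's parameters, and they exchange the intermediate signals the other core needs.

```python
import numpy as np

# Toy model-parallel step for a two-layer net split across two cores (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))          # the *same* data is sent to all cores
y = rng.normal(size=(8, 2))

W1 = 0.1 * rng.normal(size=(4, 5))   # parameters owned by core 0 (first layer)
W2 = 0.1 * rng.normal(size=(5, 2))   # parameters owned by core 1 (second layer)

# Core 0 computes the first layer's activations and ships them to core 1.
h = np.tanh(X @ W1)

# Core 1 computes the output and the loss signal.
err = h @ W2 - y

# Each core estimates the gradient for *its own* parameters; the cores only
# exchange the intermediate quantities the other one needs.
grad_W2 = h.T @ err / len(X)                        # stays on core 1
grad_h = err @ W2.T                                 # sent back to core 0
grad_W1 = X.T @ (grad_h * (1.0 - h ** 2)) / len(X)  # stays on core 0

W1 -= 0.1 * grad_W1
W2 -= 0.1 * grad_W2
```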
How do In-graph replication and Between-graph replication relate to these approaches?

This article says:
For example, different layers in a network may be trained in parallel on different GPUs. This training procedure is commonly known as "model parallelism" (or "in-graph replication" in the TensorFlow documentation).

And:
In "data parallelism" (or “between-graph replication” in the TensorFlow documentation), you use the same model for every device, but train the model in each device using different training samples.Is that accurate?
From the TensorFlow DevSummit video linked on the TensorFlow documentation page, it looks like the data is split and distributed to each worker. So isn't In-graph replication following the data parallelism approach?
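To make my confusion concrete, this is how I picture in-graph replication after watching the video (a pure-Python toy; the devices list is a hypothetical stand-in for real GPU placement, not TensorFlow code): one client builds a single "graph" containing a model copy per device, feeds each copy a different shard of the batch, and averages the gradients, so the replication is in-graph while the parallelism is over data.

```python
import numpy as np

# How I picture in-graph replication: ONE client/"graph" enumerates the devices,
# builds a tower per device over the same parameters, feeds each tower a
# different data shard, and averages the gradients inside that same graph.
# The device names below are hypothetical stand-ins, not real TensorFlow placement.
devices = ["/gpu:0", "/gpu:1"]

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))
y = rng.normal(size=(8,))
w = np.zeros(3)                       # shared parameters

def tower_gradient(X_shard, y_shard, w):
    # The per-device "tower": identical model, different slice of the batch.
    err = X_shard @ w - y_shard
    return 2.0 * X_shard.T @ err / len(y_shard)

shards = zip(np.array_split(X, len(devices)), np.array_split(y, len(devices)))
grads = [tower_gradient(Xs, ys, w) for Xs, ys in shards]

w -= 0.1 * np.mean(grads, axis=0)
```

Between-graph replication, as I understand it, would instead have each worker process run its own copy of roughly this loop for its own shard, with the shared parameters hosted on parameter servers. But the data is still what gets split in both cases, which is why the terminology confuses me.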