TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems
This paper describes TensorFlow, a machine learning system developed by Google that operates at large scale and in heterogeneous environments. TensorFlow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state, and it maps the nodes of a graph across many machines and across devices within a machine, including multicore CPUs, GPUs, and TPUs. This gives developers a flexible programming model: where traditional parameter server designs build the management of shared state into the system, TensorFlow lets them experiment with novel optimizations and training algorithms. The authors discuss TensorFlow's architecture and implementation and evaluate its performance on several real-world applications, reporting on its scalability and efficiency relative to other systems. The system is open source and is used in both research and industry. Finally, they outline future directions, including better support for dynamic computation.
https://www.usenix.org/system/files/conference/osdi16/osdi16-abadi.pdf
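To make the dataflow idea in the summary concrete, here is a minimal sketch in plain Python. It is not TensorFlow's API or implementation: the `Graph` and `Node` classes, the `assign_sub` helper, and the device strings are all hypothetical names chosen for illustration, and the device labels are only annotations (nothing is dispatched to real hardware). It shows a graph whose nodes are operations, one of which mutates shared state that persists between runs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class Node:
    """One vertex of the toy graph: an operation plus the names of its inputs."""
    name: str
    op: Callable[..., float]
    inputs: List[str] = field(default_factory=list)
    device: str = "/cpu:0"  # placement label only (assumption: purely illustrative)


class Graph:
    def __init__(self) -> None:
        self.nodes: Dict[str, Node] = {}
        self.state: Dict[str, float] = {}  # mutable variables that outlive a single run

    def add(self, name: str, op: Callable[..., float],
            inputs: Sequence[str] = (), device: str = "/cpu:0") -> None:
        self.nodes[name] = Node(name, op, list(inputs), device)

    def run(self, target: str, feeds: Optional[Dict[str, float]] = None) -> float:
        """Evaluate `target`, computing each node it depends on once per run."""
        cache: Dict[str, float] = dict(feeds or {})

        def evaluate(name: str) -> float:
            if name not in cache:
                node = self.nodes[name]
                cache[name] = node.op(*(evaluate(i) for i in node.inputs))
            return cache[name]

        return evaluate(target)


g = Graph()
g.state["w"] = 0.0  # shared state: a single scalar weight


def assign_sub(delta: float) -> float:
    """Stateful operation: subtract `delta` from the variable in place."""
    g.state["w"] -= delta
    return g.state["w"]


# Fit pred = w * x to a target y by gradient descent on (pred - y)^2.
g.add("w", lambda: g.state["w"])
g.add("pred", lambda w, x: w * x, ["w", "x"], device="/gpu:0")
g.add("grad", lambda pred, x, y: 2.0 * (pred - y) * x, ["pred", "x", "y"], device="/gpu:0")
g.add("train", lambda grad: assign_sub(0.1 * grad), ["grad"])

for _ in range(50):
    g.run("train", feeds={"x": 1.0, "y": 3.0})
```

Running the `train` node repeatedly moves `g.state["w"]` toward 3.0. The update rule is just another node in the graph rather than logic built into a dedicated server, which is the kind of flexibility the summary contrasts with parameter server designs.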