From TensorFlow to PyTorch

Grokking PyTorch in 1 week | What I liked, and what I wish were different

Stavros Niafas
6 min read · Apr 2, 2023
Photo by Stavros Niafas on Unsplash

Here’s a small motivation story about PyTorch. Learning a new framework takes time and effort, and it is essential to find and invest that time to keep up with technology trends, tools and industry standards. In my experience, it usually takes weeks, months, even years to learn a framework and become proficient with it. Something I had wished for quite some time was to dig up the PyTorch lectures and code tutorials[1] sitting in my bookmarks. So that’s how it starts.

I started learning TensorFlow back in 2018[2], when it was still at v1, and ever since, all of my experience, machine learning projects and work have been built with TensorFlow and Keras.

Frankly speaking, TensorFlow in its first version was really tough. The learning curve was ridiculously steep, as you had to implement everything from scratch, including the training loop with all the calculations needed to monitor training. The way TensorFlow handled data loading and tensor transformations was opaque to the developer: eager mode was not an option, and the only way to debug was during the training runtime. Limited documentation, few working examples, and a GPU that was hard to install and play with did not make for the best starter experience. The Keras integration in TensorFlow v2 literally changed the game of developing a machine learning model, and v2 introduced eager mode as the default runtime, interchangeable with graph mode. Improved APIs for performant data loading and training, such as tf.data, GradientTape, Keras, and TFX for model deployment, are the ingredients that made it a mature library and an industry standard.

Despite its popularity in industry, TensorFlow started to lose ground in the research community around the period when Transformer models emerged. According to the annual State of AI report, in 2021 PyTorch was the most popular deep learning framework, with 53% of respondents reporting that they primarily used PyTorch, compared to 47% who primarily used TensorFlow. Quite noticeable is the number of open source models on Hugging Face at that time: of the 14,500 models available, 11,500 were made with PyTorch and only 1,200 with TensorFlow[3].

So it goes. As I reflect on my experience using PyTorch, I must say that it has been a truly enjoyable and rewarding experience, especially for someone like me who started with TensorFlow v1. PyTorch has been a vital tool for deep learning researchers and practitioners around the world, and for good reason. Below I discuss some of the things I appreciated about PyTorch, as well as areas where I wish it were different. Let’s get started.

Things I liked

Documentation

Something I appreciate the most in any project that comes my way is the documentation. Well-structured documentation is an essential component of any successful software project. It reduces the learning curve for new users, improves the overall quality of the software, and in general lends credibility to the engineering community behind the project. The PyTorch docs are exceptionally well written, with eye-pleasing typography, and they not only provide details about the API but also offer a deep dive into the framework’s overall design.

Task specific libraries

PyTorch is governed like a decentralized project, and its task-specific APIs are organized into individual libraries (torchaudio, torchvision, torchtext, etc.). Even though I haven’t had the opportunity to get my hands dirty with them yet, it is a brilliant idea. For a project that is not managed around a single core repository, it actually makes more sense for maintainers and users to focus on specialized libraries/binaries.

Tensor handling like NumPy arrays

PyTorch is deeply integrated with Python; it is not just a wrapper around a monolithic C++ framework, as TensorFlow is. It is therefore more effective and efficient, leaving room to focus on developing the ML pipeline rather than googling random warnings and errors.
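To make this concrete, here is a small sketch of my own (a toy example, not from the docs) showing how tensors interoperate with NumPy arrays, with the familiar broadcasting semantics:

```python
import numpy as np
import torch

# A plain NumPy array, converted to a tensor without copying:
a = np.arange(6.0).reshape(2, 3)
t = torch.from_numpy(a)      # zero-copy view: shares memory with `a`

t2 = t * 2 + 1               # NumPy-like broadcasting and arithmetic
back = t2.numpy()            # and back to a NumPy array
print(back)
```

The round trip through `torch.from_numpy` and `.numpy()` is what makes moving existing NumPy code onto tensors feel so frictionless.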

Dynamic Computational Graph

In model definition or data operations, every line of code represents a node in the computational graph, which is built on the fly at runtime. Compared to static-graph TensorFlow, this offers a fluent debugging experience and helps developers identify and address errors more quickly and effectively.
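A tiny sketch of what "built on the fly" means in practice (my own toy example): the graph is recorded as ordinary Python executes, so plain control flow, breakpoints and print() all work inside it.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2
if y > 5:            # plain Python branching, recorded as it runs
    z = y * 2
else:
    z = y + 1
z.backward()         # dz/dx = 2 * 2x = 12 when x = 3
print(x.grad)
```

In a static graph you would need special control-flow ops for that `if`; here it is just Python.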

Community

The moving force behind every open source project is always the community. Every piece of information I found on the PyTorch page is what I wished I had on the TensorFlow one. Well-defined rules, strong governance and a clear design philosophy[4] are some key elements that caught my eye and make the community appealing. Some extra toppings that define a good software community, which I also found:

  • A get-up-and-running landing page with installation commands for popular development environments and cloud partners
  • An ecosystem of fellow libraries
  • A blog, tutorials and a handful of resources to dig into

Things that bugged me

Package naming

If a developer is confident enough to try installing PyTorch as a regular pip package, they will find that the Python package is not named after the project, but is called torch!

GPU binaries as the default

The same fellow developer will run `pip install torch`, but will soon realize that something is not going well with the installation.

$ pip install torch
Collecting torch
  Downloading torch-2.0.0-cp39-cp39-manylinux1_x86_64.whl (619.9 MB)
     |████████████████████████████████| 619.9 MB 12 kB/s
Collecting nvidia-cusparse-cu11==11.7.4.91
  Downloading nvidia_cusparse_cu11-11.7.4.91-py3-none-manylinux1_x86_64.whl (173.2 MB)
     |████████████████████████████████| 173.2 MB 33 kB/s
Collecting nvidia-cublas-cu11==11.10.3.66
  Downloading nvidia_cublas_cu11-11.10.3.66-py3-none-manylinux1_x86_64.whl (317.1 MB)

Be careful: on PyPI, PyTorch ships the GPU binary as the standard default!
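For what it’s worth, there is a way around the multi-gigabyte CUDA wheels on a CPU-only machine: pointing pip at PyTorch’s own CPU package index instead of PyPI.

```shell
# Install the much smaller CPU-only build from PyTorch's own index:
pip install torch --index-url https://download.pytorch.org/whl/cpu
```

This avoids pulling in the nvidia-* dependency wheels shown above entirely.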

GPU instead of cuda in device support

In two words: cuda is just not a device. I came across a particular paragraph in the docs.

The torch.device contains a device type ('cpu', 'cuda' or 'mps') [5]

For the OCD people, this alone is a serious reason to stop reading after this line… I’m kidding, but gpu would make more sense as the device name, since it refers to a distinct hardware unit. In addition, in multi-GPU environments, one has to set the following:

torch.device('cuda:0')
torch.device('cuda:1')

(Well, too much OCD red flags for today)
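Naming gripes aside, here is a sketch of the device-selection idiom as I understand it: fall back to the CPU when CUDA is unavailable, and address individual GPUs as cuda:0, cuda:1, and so on.

```python
import torch

# Pick the first GPU if CUDA is available, otherwise the CPU:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
x = torch.randn(4, 3).to(device)   # move a tensor to the chosen device
print(device.type)                 # 'cuda' or 'cpu'
```

One line at the top of a script keeps the rest of the pipeline device-agnostic.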

Poetry support

This might not fall solely on the PyTorch project, but if a CPU binary were the standard default, there wouldn’t be so much confusion around installing torch via poetry[6].

The workaround to install PyTorch with poetry is to explicitly add the wheel link in pyproject.toml:

torch = [
    {url = "https://download.pytorch.org/whl/cpu/torch-2.0.0%2Bcpu-cp39-cp39-linux_x86_64.whl", markers = "sys_platform == 'linux'"},
]

Syntax and coding style

Realistically, it is very pythonic, which suits me. Still, it resembles TensorFlow v1 quite a lot, and if I put myself in the position of a newcomer or entry-level engineer, I find the default TensorFlow-Keras style miles more user friendly. PyTorch is a completely different approach to writing a machine learning pipeline and finally training a model. Keras is awesome in every way: it does everything for you, with loss functions, optimizers, callbacks, training monitors and so on only a knob away, while also letting you write custom training loops with tf.GradientTape.

Because TensorFlow offers both worlds of model definition style, I find it a slightly more mature project.

In two words, I would say that PyTorch appeals to more advanced engineers and researchers who have a good grasp of the essential elements of machine learning and want to get their hands really dirty. But don’t worry, the documentation has your back :)
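To show what "hands really dirty" means, here is a minimal sketch of the explicit training loop PyTorch expects, on a toy linear-regression task of my own invention: you wire the loss, backward pass and optimizer step yourself, where Keras’s model.fit() would do it for you.

```python
import torch
import torch.nn as nn

model = nn.Linear(1, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

x = torch.tensor([[0.0], [1.0], [2.0], [3.0]])
y = 2 * x + 1                      # toy target: y = 2x + 1

for epoch in range(200):
    optimizer.zero_grad()          # clear gradients from the last step
    loss = loss_fn(model(x), y)    # forward pass + loss
    loss.backward()                # backprop through the dynamic graph
    optimizer.step()               # update the parameters

print(loss.item())
```

Verbose next to a one-line fit(), but every step is visible and hackable, which is exactly the point.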

Verdict

Whew… that’s enough of the bits and bytes I came across during a week of fiddling with PyTorch basics.

In conclusion, PyTorch, with its dynamic graph and user-friendly interface, offers researchers and developers an intuitive platform for experimentation and iteration. On the other side, TensorFlow, with its solid industrial backing and robust deployment capabilities, provides a compelling choice for larger-scale applications. Ultimately, the decision to use either framework depends on your specific needs and preferences. Regardless of your choice, both PyTorch and TensorFlow have thriving ecosystems of resources and support to aid you in your work. I hope this article has provided an informative overview of these two frameworks and their key features.

Next, we are going for a dive into the code. Stay tuned.

Stavros Niafas is an ML engineer interested in the broader domain of AI, focused on data-centric AI techniques, active learning, MLOps and computer vision.
He aims to democratize R&D and fill the gap between research and production.
