Getting Data Into PyTorch Tensors

So you’ve got some data and you want to get it into PyTorch for training a model or doing some deep learning magic. Great! PyTorch tensors are basically the building blocks of everything you’ll do, so let’s walk through the most common ways to get your data in there.
Starting Simple: From Python Lists and Arrays
The easiest way is probably starting with data you already have in Python:
import torch
import numpy as np

# From a Python list
my_list = [1, 2, 3, 4, 5]
tensor_from_list = torch.tensor(my_list)
print(tensor_from_list)  # tensor([1, 2, 3, 4, 5])

# From a NumPy array
my_array = np.array([[1, 2], [3, 4]])
tensor_from_numpy = torch.from_numpy(my_array)
print(tensor_from_numpy)
Quick note: torch.tensor() always copies your data, while torch.from_numpy() shares memory with the original NumPy array, so an in-place change to one shows up in the other. Reach for torch.tensor() when you want an independent copy.
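To make the copy-versus-share distinction concrete, here is a minimal sketch (the variable names are just for illustration). It changes the NumPy array in place and then checks which tensor sees the change:

```python
import numpy as np
import torch

source = np.array([1, 2, 3])

copied = torch.tensor(source)      # independent copy of the data
shared = torch.from_numpy(source)  # tensor backed by the same memory

source[0] = 100  # modify the NumPy array in place

print(copied[0].item())  # 1   (the copy is unaffected)
print(shared[0].item())  # 100 (the shared tensor sees the change)
```

If that kind of aliasing would surprise you later, the copying version is the safer default.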
Working with Different Data Types
PyTorch infers the data type from your input (Python ints become torch.int64, Python floats become torch.float32), but sometimes you need to be explicit:

# Specifying the data type
float_tensor = torch.tensor([1, 2, 3], dtype=torch.float32)
int_tensor = torch.tensor([1.5, 2.7, 3.9], dtype=torch.int64)  # fractional parts are dropped
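If you're not sure what dtype you ended up with, it's easy to check. The sketch below assumes PyTorch's default float dtype hasn't been changed; it shows the inferred types and what casting an existing tensor does:

```python
import torch

ints = torch.tensor([1, 2, 3])          # Python ints   -> torch.int64
floats = torch.tensor([1.5, 2.7, 3.9])  # Python floats -> torch.float32 (default float dtype)

print(ints.dtype)    # torch.int64
print(floats.dtype)  # torch.float32

# Casting float -> int drops the fractional part (no rounding)
as_int = floats.to(torch.int64)
print(as_int)  # tensor([1, 2, 3])

# Casting int -> float keeps the values and only changes the dtype
as_float = ints.float()
print(as_float.dtype)  # torch.float32
```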

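Loading Data from Files
Most real datasets live in files rather than Python lists. For tabular data, pandas does the parsing:
import pandas as pd

# Load from CSV using pandas (assumes every column is numeric)
df = pd.read_csv('your_data.csv')
tensor_from_df = torch.tensor(df.values)

# Or if you want specific columns
features = torch.tensor(df[['feature1', 'feature2', 'feature3']].values)
labels = torch.tensor(df['target'].values)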
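One thing to watch with pandas: float columns are typically stored as float64, while PyTorch layers default to float32. Here is a small sketch that sets the dtypes explicitly. It uses a made-up in-memory DataFrame in place of a real CSV, and the column names are placeholders:

```python
import pandas as pd
import torch

# Hypothetical stand-in for a DataFrame loaded with pd.read_csv
df = pd.DataFrame({
    "feature1": [0.5, 1.5, 2.5],
    "feature2": [10, 20, 30],
    "target": [0, 1, 0],
})

# Features as float32, class labels as int64 (torch.long)
features = torch.tensor(df[["feature1", "feature2"]].to_numpy(), dtype=torch.float32)
labels = torch.tensor(df["target"].to_numpy(), dtype=torch.long)

print(features.shape, features.dtype)  # torch.Size([3, 2]) torch.float32
print(labels.shape, labels.dtype)      # torch.Size([3]) torch.int64
```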
For images, you’ll probably want to use something like PIL or OpenCV first:
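from PIL import Image
import torchvision.transforms as transforms

# Load the image (placeholder path)
image = Image.open('your_image.jpg')

# Convert to tensor (this also scales pixel values to [0, 1] and puts channels first)
transform = transforms.ToTensor()
image_tensor = transform(image)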
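If you're curious what that conversion amounts to for an 8-bit image, here is a rough hand-rolled equivalent. It's only a sketch: a small NumPy array stands in for a decoded image, and it isn't torchvision's actual implementation.

```python
import numpy as np
import torch

# Stand-in for a decoded image: height 2, width 3, 3 color channels, values 0-255
hwc = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)

# Reorder (H, W, C) -> (C, H, W), convert to float32, and scale to [0, 1]
chw = torch.from_numpy(hwc).permute(2, 0, 1).float() / 255.0

print(chw.shape)  # torch.Size([3, 2, 3])
print(chw.dtype)  # torch.float32
```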
Creating Tensors from Scratch
Sometimes you just need to make tensors with specific properties:
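Here are a few of the standard creation functions, as a quick sketch. The shapes and values are arbitrary:

```python
import torch

zeros = torch.zeros(2, 3)           # 2x3 tensor filled with 0.0
ones = torch.ones(2, 3)             # 2x3 tensor filled with 1.0
uniform = torch.rand(2, 3)          # uniform random values in [0, 1)
normal = torch.randn(2, 3)          # samples from a standard normal distribution
steps = torch.arange(0, 10, 2)      # tensor([0, 2, 4, 6, 8])
grid = torch.linspace(0.0, 1.0, 5)  # 5 evenly spaced values from 0 to 1
like = torch.zeros_like(uniform)    # zeros with the same shape and dtype as `uniform`
```

All of these accept dtype= and device= arguments as well, so you can create the tensor exactly where and how you need it.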

Device: Tensors live on the CPU by default, so move them to the GPU when one is available:

# Move to GPU if available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
tensor = torch.tensor([1, 2, 3]).to(device)

# Or create directly on the target device
gpu_tensor = torch.tensor([1, 2, 3], device=device)
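Going the other way matters too. NumPy only understands CPU memory, so a tensor on the GPU has to come back to the CPU before you convert it. A minimal sketch that runs with or without a GPU:

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor([1.0, 2.0, 3.0], device=device)
print(x.device)  # cpu, or cuda:0 on a machine with a GPU

# .cpu() is a no-op if the tensor is already on the CPU
x_np = x.cpu().numpy()
print(x_np)  # [1. 2. 3.]
```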
Gradients: If you're training, any tensor you want gradients for has to be created with requires_grad=True (model parameters already have it set).
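As a small sketch of what gradient tracking looks like in practice (the values and the toy loss are arbitrary):

```python
import torch

# requires_grad=True tells autograd to record operations on w
w = torch.tensor([2.0, 3.0], requires_grad=True)

loss = (w ** 2).sum()  # loss = w0^2 + w1^2
loss.backward()        # fills w.grad with d(loss)/dw = 2 * w

print(w.grad)  # tensor([4., 6.])
```

Note that only floating-point (and complex) tensors can require gradients, which is one more reason to get your dtypes right early.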

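Memory layout: Most of the time PyTorch handles this for you, but occasionally an operation such as view() needs the tensor stored contiguously in memory:

# some_tensor stands for any existing tensor, e.g. the result of a transpose
contiguous_tensor = some_tensor.contiguous()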
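Here is one way you might run into this, sketched with a tiny tensor. A transpose reuses the same memory with different strides, so the result isn't contiguous and view() refuses to flatten it:

```python
import torch

x = torch.arange(6).reshape(2, 3)
xt = x.t()  # transpose: same memory, different strides

print(xt.is_contiguous())  # False

try:
    xt.view(6)  # view() needs a compatible memory layout
except RuntimeError as err:
    print("view failed:", type(err).__name__)

# Copy into contiguous memory first, then view() works
flat = xt.contiguous().view(6)
print(flat)  # tensor([0, 3, 1, 4, 2, 5])
```

reshape() is a handy alternative here, since it copies only when it has to.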
Real-World Example
Here’s how you might load and preprocess some tabular data for a neural network:
import numpy as np
import pandas as pd
import torch
from sklearn.preprocessing import StandardScaler

# Load your data
df = pd.read_csv('house_prices.csv')

# Separate features and target (numeric feature columns only)
features = df.drop('price', axis=1).select_dtypes(include=[np.number])
target = df['price']

# Normalize features
scaler = StandardScaler()
features_scaled = scaler.fit_transform(features)

# Convert to tensors
X = torch.tensor(features_scaled, dtype=torch.float32)
y = torch.tensor(target.values, dtype=torch.float32)

print(f"Features shape: {X.shape}")
print(f"Target shape: {y.shape}")
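One shape detail is worth checking at this point. The target tensor built this way has shape (N,), while a regression model that ends in a single output unit produces (N, 1). The sketch below uses random stand-in data and a hypothetical one-layer model, purely to show the fix:

```python
import torch
import torch.nn as nn

# Stand-ins for the X and y tensors: 8 samples, 4 features
X = torch.randn(8, 4)
y = torch.randn(8)

model = nn.Linear(4, 1)  # hypothetical one-layer regression model
pred = model(X)          # shape: (8, 1)

# (8,) -> (8, 1) so the target lines up with the prediction
y_col = y.unsqueeze(1)
loss = nn.functional.mse_loss(pred, y_col)

print(pred.shape, y_col.shape)  # torch.Size([8, 1]) torch.Size([8, 1])
```

Without the unsqueeze, the loss would broadcast (8, 1) against (8,) into an (8, 8) comparison, which PyTorch warns about but does not stop.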


Wrapping Up


Getting data into PyTorch tensors is usually pretty straightforward once you know the basic patterns. The key is understanding your data format and choosing the right conversion method. Whether you’re starting with lists, NumPy arrays, pandas DataFrames, or files, there’s usually a clean way to get everything into tensor form.
The most important thing is making sure your data types, shapes, and device placement are all correct before you start training. PyTorch will usually give you helpful error messages when dtypes or devices don't match, so don't worry too much about getting everything perfect on the first try. Shapes deserve a second look, though, because a mismatch can broadcast silently instead of raising an error.
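If you'd like to automate those checks, something like the helper below could run once before training. It's a hypothetical sketch (check_batch is not a PyTorch function), and the float32 expectation is an assumption that fits most models:

```python
import torch

def check_batch(X, y, device):
    """Hypothetical pre-training check for dtype, shape, and device."""
    assert X.dtype == torch.float32, f"expected float32 features, got {X.dtype}"
    assert X.shape[0] == y.shape[0], f"{X.shape[0]} feature rows vs {y.shape[0]} targets"
    assert X.device == y.device, f"X is on {X.device}, y is on {y.device}"
    assert X.device.type == device.type, f"expected data on {device}, found {X.device}"
    return True

# Usage with small stand-in tensors on the CPU
device = torch.device("cpu")
X = torch.zeros(4, 3)
y = torch.zeros(4)
ok = check_batch(X, y, device)
print(ok)  # True
```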