Loading Data
VectorScope supports multiple data formats and sources.
Supported File Formats
CSV Files
Comma-separated values with a header row. VectorScope automatically detects:
Numeric columns - Used as vector dimensions
String columns - Used as labels
Example CSV:
id,species,sepal_length,sepal_width,petal_length,petal_width
1,setosa,5.1,3.5,1.4,0.2
2,setosa,4.9,3.0,1.4,0.2
...
When you load a CSV, VectorScope:
Parses the header row
Detects which columns are numeric
Defaults to using all numeric columns as features
Uses the first string column as labels
You can reconfigure columns in the Config Panel after loading.
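The detection steps above can be sketched in plain Python. This is an illustration only, not VectorScope's actual loader: the function names `is_number` and `detect_columns` are hypothetical, and the rule "a column is numeric if every cell parses as a float" is an assumption about how detection might work.

```python
import csv
import io


def is_number(value: str) -> bool:
    """Return True if the cell parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False


def detect_columns(csv_text: str):
    """Return (feature_columns, label_column) for CSV text with a header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)                      # parse the header row
    rows = [row for row in reader if row]      # skip blank lines

    numeric, strings = [], []
    for i, name in enumerate(header):
        # Assumption: a column counts as numeric only if every cell is a float.
        if rows and all(is_number(row[i]) for row in rows):
            numeric.append(name)
        else:
            strings.append(name)

    label = strings[0] if strings else None    # first string column as labels
    return numeric, label                      # all numeric columns as features


sample = """id,species,sepal_length,sepal_width,petal_length,petal_width
1,setosa,5.1,3.5,1.4,0.2
2,setosa,4.9,3.0,1.4,0.2
"""

features, label = detect_columns(sample)
print(features)  # ['id', 'sepal_length', 'sepal_width', 'petal_length', 'petal_width']
print(label)     # species
```

Under these rules the `id` column is numeric, so it would be picked up as a feature by default. That is the kind of column you would typically exclude in the Config Panel.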
NumPy Files (.npy, .npz)
NPY files contain a single 2D array of shape (n_points, n_dimensions).
NPZ files should contain one of these array names:
vectors, data, embeddings, X, or x
If none of these exist, the first array is used.
For NumPy files, columns are named dim_0, dim_1, etc., and you can configure them after loading.
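The NPZ lookup rule can be sketched with NumPy as follows. The helper names `pick_array` and `column_names` are hypothetical, and checking the names in the order they are listed above is an assumption; the documentation only states that one of them should be present.

```python
import io

import numpy as np

# Array names listed in the documentation (priority order is an assumption).
PREFERRED_KEYS = ("vectors", "data", "embeddings", "X", "x")


def pick_array(npz):
    """Return (key, array): a preferred name if present, else the first array."""
    for key in PREFERRED_KEYS:
        if key in npz.files:
            return key, npz[key]
    first = npz.files[0]
    return first, npz[first]


def column_names(array):
    """Name columns dim_0, dim_1, ... for a 2D (n_points, n_dimensions) array."""
    return [f"dim_{i}" for i in range(array.shape[1])]


# Build a small in-memory NPZ archive so the example is self-contained.
buffer = io.BytesIO()
np.savez(buffer, metadata=np.zeros((2, 2)), embeddings=np.ones((5, 3)))
buffer.seek(0)

with np.load(buffer) as npz:
    key, array = pick_array(npz)

names = column_names(array)
print(key, array.shape, names)
```

Here the archive also contains a `metadata` array, but `embeddings` is selected because it matches one of the recognized names.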
Loading Methods
Create Synthetic
Generates random clustered data for testing:
1000 points by default
30 dimensions
5 clusters
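For reference, comparable test data can be generated outside VectorScope. The sketch below uses Gaussian blobs around random centers; the generator VectorScope actually uses is not documented here, so the function name `make_clusters` and the `center_scale` and `spread` parameters are assumptions. Only the default sizes come from the list above.

```python
import numpy as np


def make_clusters(n_points=1000, n_dims=30, n_clusters=5,
                  center_scale=10.0, spread=1.0, seed=0):
    """Generate clustered points: random centers plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    # One random center per cluster, spaced far apart relative to the noise.
    centers = rng.normal(scale=center_scale, size=(n_clusters, n_dims))
    # Assign each point to a cluster, then jitter it around that center.
    labels = rng.integers(0, n_clusters, size=n_points)
    points = centers[labels] + rng.normal(scale=spread, size=(n_points, n_dims))
    return points, labels


points, labels = make_clusters()
print(points.shape)  # (1000, 30)
```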
Load Dataset
Built-in sklearn datasets:
Iris - 150 samples, 4 features, 3 classes
Wine - 178 samples, 13 features, 3 classes
Breast Cancer - 569 samples, 30 features, 2 classes
Digits - 1797 samples, 64 features, 10 classes
Diabetes - 442 samples, 10 features, regression target
Linnerud - 20 samples, 3 features
These datasets load with their original feature names, and the classification datasets also include class labels.
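The sizes listed above match what scikit-learn's own loaders return, so you can check them directly. This sketch uses only the public `sklearn.datasets` loaders; how VectorScope wraps them internally is not shown here, and the `LOADERS` mapping is just an illustration.

```python
from sklearn.datasets import (
    load_breast_cancer,
    load_diabetes,
    load_digits,
    load_iris,
    load_linnerud,
    load_wine,
)

# Map the names used in this guide to the scikit-learn loader functions.
LOADERS = {
    "Iris": load_iris,
    "Wine": load_wine,
    "Breast Cancer": load_breast_cancer,
    "Digits": load_digits,
    "Diabetes": load_diabetes,
    "Linnerud": load_linnerud,
}

# Each loader returns a Bunch whose .data is an (n_samples, n_features) array.
shapes = {name: loader().data.shape for name, loader in LOADERS.items()}

for name, (n_samples, n_features) in shapes.items():
    print(f"{name}: {n_samples} samples, {n_features} features")
```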
Open Session
Load a previously saved VectorScope session (JSON + NPZ files).
Column Configuration
For source layers (not derived), you can configure which columns to use:
Select the layer in the Graph Editor
In the Config Panel, you’ll see:
Label Column dropdown - Which column provides point labels
Feature Columns checkboxes - Which columns to use as vector dimensions
Click Apply Column Configuration to update
This is especially useful for CSV files where you may want to:
Exclude certain columns (like IDs)
Use a specific column for labels/coloring
Remove non-feature columns from the vector
Note
Changing column configuration recomputes all downstream projections and transformations.
Data Limits
VectorScope stores all data in memory. Consider these practical limits:
Points: Tens of thousands work well; hundreds of thousands may be slow
Dimensions: Hundreds is fine; thousands may slow down projections
t-SNE: Especially slow on large datasets; consider reducing the dimensionality with PCA first
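The PCA-before-t-SNE suggestion looks roughly like this in scikit-learn. This is a sketch on random stand-in data, not VectorScope's pipeline: the choice of 50 intermediate components and a perplexity of 30 are common starting points, not settings taken from VectorScope.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# Stand-in data: 300 points in 200 dimensions (kept small so this runs quickly).
rng = np.random.default_rng(0)
data = rng.normal(size=(300, 200))

# Step 1: reduce to a moderate number of dimensions with PCA (cheap, linear).
reduced = PCA(n_components=50, random_state=0).fit_transform(data)

# Step 2: run t-SNE on the reduced data instead of the full-width vectors.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(reduced)

print(reduced.shape, embedding.shape)  # (300, 50) (300, 2)
```

Within VectorScope, the equivalent would be to add a PCA step and feed its output into the t-SNE projection, rather than running t-SNE on the source layer directly.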