Version: 1.6.0

Similarity Map

The Similarity Map projects the data points onto a 2D map based on the similarity of the columns selected for placement. Similarity is measured using either precomputed descriptions (e.g. embeddings retrieved from an ML workflow) or one or more scalar columns in the dataset.

Placement

Simply select one or more columns to ‘placeBy’ in the settings and watch the dimension reduction arrange the data on the map.

To compute the dimension reduction, we offer two algorithms: UMAP and PCA. If the columns' values have different distributions, the reduction might run into problems and overestimate the impact of some columns. To counteract this, a robust distance metric is available in the advanced settings.
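To illustrate why differently distributed columns can distort a distance-based reduction, here is a minimal sketch of robust scaling (centering on the median and dividing by the interquartile range). This is an assumption made for illustration only: the source does not specify how Spotlight's robust distance metric is computed, and the column names below are hypothetical.

```python
from statistics import median, quantiles


def robust_scale(values):
    """Center values on their median and divide by the interquartile range."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    center = median(values)
    if iqr == 0:
        # A constant (or near-constant) column carries no spread to scale by.
        return [0.0] * len(values)
    return [(v - center) / iqr for v in values]


def scale_columns(columns):
    """Apply robust scaling to every column of a name -> values mapping."""
    return {name: robust_scale(vals) for name, vals in columns.items()}


# Hypothetical columns with the same shape but very different magnitudes.
columns = {
    "duration_s": [1.0, 2.0, 3.0, 4.0, 5.0],
    "sample_rate_hz": [1000.0, 2000.0, 3000.0, 4000.0, 5000.0],
}
scaled = scale_columns(columns)
```

Without scaling, Euclidean distances between rows would be dominated by `sample_rate_hz`; after scaling, both columns contribute on a comparable scale.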

By default, only a reduced set of settings is available to control UMAP. These help you configure whether the reduction places the samples with more weight on local or on global similarities. Advanced settings are also available to give more control over the dimension reduction.

At the moment, samples can be placed by number columns and Embedding/Array columns.

Spotlight CE fsd50k - place datapoints on the similarity map based on a column

Controls

The map can be moved and zoomed using the mouse.

Zoom with the mouse wheel. To reset the zoom, use the Fit points button.

Move the map using the middle mouse button, or the left mouse button while the Alt key is pressed.

Spotlight CE fsd50k - navigate in the similarity map

Selection

Similar to the data table, a single row can be added to or removed from the selection.

To select a single point, simply click on it. This resets any existing selection and selects only the clicked point.

To add a point to the selection, click on it while pressing Shift. To remove it, click on it while pressing Ctrl.

These steps also apply to multiple points. Pressing the left mouse button and moving the mouse shows a selection rectangle. On releasing the left mouse button, the selection is applied in the same way as for a single point.
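The selection rules above can be sketched as plain set operations. This is a hypothetical model for illustration, not Spotlight's implementation: the function names, the `modifier` strings, and the point coordinates are all assumptions.

```python
def apply_selection(current, hit, modifier=None):
    """Return the new selection after clicking or rectangle-selecting `hit`.

    No modifier replaces the selection, "shift" adds to it, "ctrl" removes from it.
    """
    hit = set(hit)
    if modifier == "shift":
        return current | hit
    if modifier == "ctrl":
        return current - hit
    return hit


def points_in_rectangle(points, x_min, y_min, x_max, y_max):
    """Return the ids of all points lying inside the selection rectangle."""
    return {
        idx
        for idx, (x, y) in points.items()
        if x_min <= x <= x_max and y_min <= y <= y_max
    }


# Hypothetical 2D positions of four points on the map.
points = {0: (0.1, 0.2), 1: (0.5, 0.5), 2: (0.9, 0.8), 3: (0.4, 0.6)}

selection = apply_selection(set(), {0})                  # plain click
selection = apply_selection(selection, {2}, "shift")     # Shift + click
box = points_in_rectangle(points, 0.3, 0.4, 0.6, 0.7)    # drag a rectangle
selection = apply_selection(selection, box, "shift")     # Shift + rectangle
selection = apply_selection(selection, {0}, "ctrl")      # Ctrl + click
```

Modelling a rectangle as just another set of hit points is what lets the single-point and multi-point cases share the same rules.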

Spotlight CE fsd50k - select and deselect points in the similarity map

Coloring and Sizing

Coloring and sizing can be used to represent additional columns on the map. The color palette can be changed in the global coloring settings.

Spotlight CE fsd50k - colorize and scale datapoints on the similarity map based on a column

Filtering

By default, the similarity map also accounts for the similarities of filtered-out points. To compute the similarities only on the filtered points, the hide unfiltered option has to be checked.
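As a rough sketch of the difference between the two modes, the following hypothetical helper chooses which rows would be passed to the dimension reduction. The function name, the `passes_filter` mask, and the `hide_unfiltered` flag are assumptions used for illustration and do not reflect Spotlight's internal API.

```python
def rows_for_reduction(rows, passes_filter, hide_unfiltered=False):
    """Return the rows the dimension reduction would be computed on.

    By default all rows are used, including filtered-out ones. With
    `hide_unfiltered` set, only rows that pass the filter are used.
    """
    if not hide_unfiltered:
        return list(rows)
    return [row for row, keep in zip(rows, passes_filter) if keep]


# Hypothetical embedding rows and a filter mask (True = row passes the filter).
rows = [[0.1, 0.9], [0.4, 0.4], [0.8, 0.2], [0.7, 0.3]]
passes_filter = [True, False, True, True]

all_rows = rows_for_reduction(rows, passes_filter)
visible_rows = rows_for_reduction(rows, passes_filter, hide_unfiltered=True)
```

Because the layout depends on which rows take part in the reduction, switching modes can change the positions of the remaining points, not just their visibility.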

Spotlight CE fsd50k - show/hide filtered datapoints on the similarity map
