|
2\text{-}D model |
The fitted representation of the NLDR layout obtained after hexagonal binning and centroid aggregation, used as the basis for lifting into high-dimensional space. |
|
a_1 (Hexagon width) |
The horizontal distance between adjacent hexagon centroids in the hexagonal grid used for binning a 2\text{-}D NLDR layout. |
|
a_2 (Hexagon height) |
The vertical distance between rows of hexagon centroids in the hexagonal grid, related to a_1 by a_2 = \sqrt{3}a_1/2. |
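As a small illustration of the relationship a_2 = \sqrt{3}a_1/2, the sketch below computes the row spacing from the hexagon width and lays out centroids on a staggered grid. The function names and the convention of shifting odd rows by a_1/2 are assumptions made for this example, not the quollr API.

```python
import math

def hex_height(a1: float) -> float:
    """Vertical spacing a_2 between centroid rows for hexagon width a_1."""
    return math.sqrt(3) * a1 / 2

def hex_centroids(a1, b1, b2, x0=0.0, y0=0.0):
    """Return b1 * b2 centroids; odd rows are shifted right by a1 / 2 (assumed convention)."""
    a2 = hex_height(a1)
    return [
        (x0 + col * a1 + (a1 / 2 if row % 2 else 0.0), y0 + row * a2)
        for row in range(b2)
        for col in range(b1)
    ]

# Three columns and two rows of centroids for unit-width hexagons.
centroids = hex_centroids(a1=1.0, b1=3, b2=2)
```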
|
b_1 (Number of bins along x-axis) |
The number of hexagon columns spanning the horizontal range of a 2\text{-}D embedding. |
|
b_2 (Number of bins along y-axis) |
The number of hexagon rows spanning the vertical range of a 2\text{-}D embedding. |
|
k-means clustering |
A clustering algorithm that partitions data into a fixed number of clusters by minimizing within-cluster variance. |
| 2NC7 data |
A simulated 7\text{-}D dataset consisting of two nonlinear clusters with different intrinsic dimensions and added noise variables. |
| Active (binding) constraint |
In optimization, a constraint that holds with equality at the optimal solution and determines the final parameter values. |
| Alt-text |
Textual descriptions of figures and visual content that support accessibility, particularly for screen readers and users with visual impairments. |
| Anchor point |
The center of a hole that is removed from a dataset. Often the data mean, but can be user-defined. |
| Apex |
The tip or pointy end of a shape (like the top of a cone or pyramid) where many points can be concentrated. |
| Application (example study) |
A worked example demonstrating how cardinalR can be used to generate data, apply dimension reduction and clustering methods, and evaluate their performance. |
| Archimedean spiral |
A spiral where the distance from the center increases steadily as it winds outward. |
| Area under the RNX curve (ARNX) |
A summary measure of neighborhood preservation, balancing local and global structure. |
| Aspect ratio |
The ratio of the ranges of the NLDR axes, preserved when constructing hexagonal binning grids. |
| Attention check/ trial |
A trial included to verify that participants are paying attention. These trials use very clear data structures where the correct answer should be obvious. |
| Azimuthal angle |
The angle that controls rotation around an axis (like longitude on a globe). |
| Background noise |
Additional points drawn from a distribution that do not belong to any specific geometric structure or cluster, used to simulate unstructured variation in the data. |
| Ball |
A filled-in sphere. Points occupy the whole volume, not just the surface. |
| Barycentric coordinates |
A way of picking points uniformly inside a triangle by mixing the triangle's corners with random weights. |
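One way to obtain weights that give a uniform sample is sketched below: three exponential draws are normalised to sum to one, which yields Dirichlet(1, 1, 1) weights. The function name and the choice of this particular weighting scheme are assumptions for illustration; the source does not say which scheme is used.

```python
import random

def sample_in_triangle(a, b, c, rng=random):
    """Draw one point uniformly inside the triangle with corners a, b, c."""
    # Normalised exponential draws give Dirichlet(1, 1, 1) weights,
    # which are uniform over the set of valid barycentric coordinates.
    e = [rng.expovariate(1.0) for _ in range(3)]
    total = sum(e)
    w = [v / total for v in e]
    point = tuple(w[0] * pa + w[1] * pb + w[2] * pc for pa, pb, pc in zip(a, b, c))
    return point, w

rng = random.Random(1)  # fixed seed so the example is repeatable
point, weights = sample_in_triangle((0.0, 0.0), (1.0, 0.0), (0.0, 1.0), rng)
```

Drawing the weights independently from a uniform distribution and normalising them would not give a uniform sample, which is why the exponential draws are used here.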
| Baseline probability |
The estimated probability of a correct response under a reference condition, used as the starting point in power analysis. In this study, it corresponds to performance at the smallest distance factor (0.1). |
| Benchmark datasets/ structure |
Standard or reference datasets used to evaluate, compare, and validate the performance of analytical algorithms (e.g. clustering or dimension reduction). |
| Between-cluster distance |
A measure of separation between clusters, typically based on distances between cluster centroids or boundaries. |
| Between-to-within (BW) ratio |
A measure of cluster separability comparing how far clusters are from each other (between-cluster variation) relative to how spread out points are within clusters. Higher values indicate better-separated clusters. |
| Bin (hexagon) |
A spatial region in a hexagonal grid used to aggregate points from a 2\text{-}D NLDR embedding. |
| Bin centroid |
The center point representing a hexagonal bin, either defined geometrically or computed as the mean of observations within the bin. |
| Binheight (a_2) |
The vertical height of a hexagon, determined by the geometry of hexagonal tiling. |
| Binning function |
A mapping that assigns each observation in the 2\text{-}D layout to its nearest hexagon centroid. |
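The nearest-centroid assignment can be sketched as follows. This is a brute-force illustration with a hypothetical function name, not the implementation used in quollr.

```python
import math

def assign_to_bins(points, centroids):
    """Return, for each 2-D point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
        for p in points
    ]

# Three centroids from a small staggered grid and three points to assign.
centroids = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3) / 2)]
points = [(0.1, 0.1), (0.9, -0.2), (0.5, 0.8)]
bin_ids = assign_to_bins(points, centroids)
```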
| Binwidth/ Bin width (a_1) |
The horizontal width of a hexagon, controlling the resolution of the hexagonal grid. |
| Blunted apex |
A tip that's flattened or rounded instead of sharp. |
| Branching structure |
A connected geometric structure consisting of multiple arms or trajectories that diverge from a common origin, often used to represent bifurcation or developmental processes. |
| Browser memory limits |
Constraints imposed by web browsers on the amount of memory available for client-side rendering and interaction, which can affect performance when visualizing large datasets in web-based applications. |
| Brushing / Linked brushing |
An interactive technique where selecting points or regions in one view (e.g., a 2\text{-}D layout) highlights the corresponding points in another view (e.g., high-dimensional model or tour). |
| Buffer parameter (q) |
A proportional margin added around the data range to ensure the hexagon grid fully covers the NLDR layout. |
| C-shaped cluster |
A nonlinear cluster with observations arranged along a curved manifold resembling the letter C. |
| Centroid |
The mean position of all points in a cluster. Centroids are used to control and measure distances between clusters in the simulations. |
| Cluster |
A group of data points generated from the same underlying geometric shape or distribution, representing a coherent structure in the dataset. |
| Cluster separability/ separation |
The degree to which clusters are distinct from one another in the data space, quantified using distance-based metrics. |
| Cluster validity statistic |
A numerical measure used to assess the quality of a clustering solution, often balancing within-cluster compactness and between-cluster separation. |
| Clustered spheres |
A structure made of one large sphere plus several smaller spheres placed around it, each treated as a separate group. |
| Clustering |
The task of grouping observations so that points within the same group are more similar to each other than to points in other groups. |
| Clustering algorithm |
A method that groups observations into clusters based on similarity, without using class labels. |
| Cone |
A shape that narrows toward one end. In high dimensions, it's formed by shrinking hyperspherical cross-sections along one axis. |
| Cone cluster |
A cluster whose density varies along one axis, typically denser near the apex and more diffuse toward the base. |
| Confidence rating |
A self-reported measure indicating how confident a participant is in their judgment for a given trial. |
| Conic spiral |
A spiral that expands outward and upward, forming a cone-like helix. |
| Convex hull |
The smallest convex set that contains all points in a dataset; referenced as inspiration for extending binning to higher dimensions. |
| Correct identification rate / proportion correct |
The proportion of trials in which participants correctly judged whether the 2\text{-}D NLDR plot and the tour represented the same data. |
| Covariance structure (EII) |
A model-based clustering assumption where clusters are spherical, have equal volume, equal shape, and no orientation differences. |
| Crescent |
A curved, moon-shaped arc formed by points along part of a circle. |
| Cube / Hypercube |
Points filling a square (2\text{-}D), cube (3\text{-}D), or higher-dimensional box, either on a grid or randomly. |
| Curvy cycle |
A closed loop that isn't a simple circle, with extra folds or oscillations. |
| Cylinder (curvy) |
A cylindrical shape with circular cross-sections, extended with a nonlinear bending dimension. |
| DIFFERENT trials |
Trials in which the 2\text{-}D NLDR embedding and the tour are generated from different (but related) high-dimensional datasets. These trials act as controls to prevent trivial response strategies. |
| Data availability |
The practice of making datasets publicly accessible to support transparency, reproducibility, and independent validation of results. |
| Data set |
A simulated configuration of points in high-dimensional space with predefined geometric shapes, densities, and cluster arrangements. |
| Data set variability |
Variation in performance attributable to differences among the simulated data sets, such as cluster shape, size, or arrangement. In this experiment, data sets act as replicates and contribute random noise to the response. |
| Delaunay triangulation |
A geometric method that connects points into triangles such that no point lies inside the circumcircle of any triangle, used to define neighborhood structure. |
| Demographics questionnaire |
A set of questions collecting background information about participants, such as age range, education level, and prior experience with dimension reduction methods. |
| Diffusion process |
A mathematical process modeling how information spreads across a graph or manifold, used in NLDR to capture intrinsic geometry. |
| Dimension reduction (DR) |
A technique that maps high-dimensional data into a lower-dimensional space while attempting to preserve important structure. |
| Distance scale factor |
A multiplier applied to centroid distances to control how close or far apart clusters are. Levels include small, small-medium, medium, medium-large, and large. |
| Dynamic visualization |
Visualization techniques that use motion or interaction (e.g., tours, brushing, animation) to explore high-dimensional or complex data structures. |
| Effect size |
The magnitude of a difference in performance between two experimental conditions. Here, it is defined as a difference in proportions, with an effect size of approximately 0.22 corresponding to a 20 percentage-point increase in correct identification. |
| Embedding |
A low-dimensional representation (typically 2\text{-}D) of high-dimensional data produced by an NLDR method such as tSNE or UMAP. |
| Exploratory planning |
Early-stage conceptual work involving sketches, diagrams, and working notes that guide the development of methods and software. |
| Exponential distribution (truncated) |
A distribution that produces many small values and few large ones, here limited to a fixed range. |
| Exponential scaled minimum inter-cluster distance |
A transformed version of the smallest distance between any two points from different clusters, used to emphasize differences in close cluster proximity. |
| Fitted values |
The high-dimensional bin centroids associated with observations, treated as model-predicted values. |
| Fractal (Sierpinski-like) |
A self-similar pattern with repeating holes or gaps, created using a recursive rule. |
| Gaussian cloud/ cluster |
A cluster of points generated from a multivariate normal distribution, typically dense in the center and sparse at the edges. |
| Gaussian noise |
Random variation drawn from a normal distribution, often added to make data more realistic. |
| Generalized linear mixed-effects model (GLMM) |
A statistical model that accounts for both fixed effects (e.g., NLDR method, distance) and random effects (e.g., participant variability). Used here to model correct identification probabilities. |
| Geometric relationships |
Spatial relationships among data points (such as distances and angles) in the original high-dimensional space. |
| Geometric shape |
A mathematically defined structure (e.g., Gaussian, cone, sphere, cube, spiral) used as a building block for generating synthetic data. |
| Git |
A distributed version control system used to track changes in code, documents, and research materials. |
| GitHub |
A web-based platform for hosting Git repositories, enabling collaboration, version control, and public dissemination of software and research materials. |
| Global Score (GS) |
A metric measuring preservation of overall geometry relative to a PCA baseline. |
| Global structure |
Large-scale relationships in the data, such as relative positions and distances between clusters. |
| Grid-based structure |
Points placed in a regular, evenly spaced pattern instead of randomly. |
| Hallucinated structure |
Patterns observed in an NLDR layout that do not correspond to true structure in the original high-dimensional data. |
| Helical spiral |
A twisted, elongated structure that winds around an axis while progressing forward. |
| Hemisphere |
Half of a sphere, created by restricting angles so points lie on only one side. |
| Hexagon (bin) |
A single hexagonal cell in the tessellation of the 2\text{-}D NLDR layout. |
| Hexagon grid |
A tessellation of the 2\text{-}D embedding space into regular hexagons used for binning and model fitting. |
| Hexagonal binning (hexbin/ hexbinning) |
A spatial aggregation method that partitions a 2\text{-}D layout into hexagonal cells to summarize local structure and density. |
| Hexbin Error (HBE) |
A diagnostic metric that measures how well a 2\text{-}D NLDR layout represents the underlying high-dimensional data by comparing bin centroids lifted back into high-dimensional space. Lower HBE indicates a better representation. |
| Hierarchical clustering |
A clustering method that builds a tree of nested clusters by successively merging or splitting groups. |
| High-dimensional centroid |
The mean of the high-dimensional observations assigned to a given hexagonal bin. |
| High-dimensional data (p\text{-}D data) |
Data in which each observation is described by a large number of features (dimensions), often making direct visualization and interpretation difficult. |
| High-dimensional noise |
Extra dimensions added to data that introduce variability without changing the main structure. |
| High-dimensional space (p\text{-}D) |
The original data space where each observation is represented by p variables. |
| Hole (spherical / hyperspherical) |
A region removed from the data in the shape of a circle, sphere, or higher-dimensional sphere. |
| Hyper-parameter(s) |
A method-specific tuning parameter (e.g., perplexity in tSNE, number of neighbors in UMAP) that influences the resulting NLDR embedding. |
| Hypersphere |
The higher-dimensional equivalent of a circle (2\text{-}D) or sphere (3\text{-}D). |
| Interdisciplinary users |
Researchers or students from diverse disciplinary backgrounds who may have varying levels of programming or statistical expertise. |
| Intrinsic dimensionality |
The effective dimensionality of a data structure, independent of the ambient number of variables. |
| Inverse transformation |
A nonlinear operation involving division (e.g., 1/x) that creates sharp curvature. |
| Isotropic distribution |
A distribution with equal variability in all directions, such as points uniformly sampled in a cube. |
| Latent parameter |
An underlying variable (like an angle or index) that drives the shape of the data. |
| Latent variable |
An underlying variable (like an angle or time index) that drives the observed structure but isn't directly observed. |
| Lifted model (p\text{-}D model) |
The representation of the 2\text{-}D wireframe model mapped back into the original high-dimensional space by averaging observations within each bin. |
| Linear optimization problem |
An optimization problem where the objective and constraints are linear functions, implying solutions occur at vertices of the feasible region. |
| Linear projection |
A mapping from high- to low-dimensional space using linear combinations of the original variables. |
| Linear structure |
Points arranged roughly along a straight line, possibly with noise and different scales across dimensions. |
| Linked plots/views |
Interactive visualizations where selections in one view (e.g., 2\text{-}D layout) are reflected in other views (e.g., tours or error plots). |
| Local structure |
Relationships among nearby points in high-dimensional space, such as nearest neighbors within a cluster. |
| Local-global trade-off |
The balance between preserving small-scale neighborhood structure and large-scale relationships in NLDR embeddings, controlled by hyper-parameters. |
| Low-count bin removal |
The process of excluding bins with few observations to sharpen the wireframe representation. |
| Low-density hexagon |
A bin containing few points and whose neighboring bins also have low density, indicating weak local support for structure. |
| Low-dimensional representation/ embedding |
A reduced-dimensional embedding (typically 2\text{-}D) of high-dimensional data used for visualization and interpretation. |
| Manifold |
A low-dimensional shape (curve or surface) embedded inside a higher-dimensional space. |
| Match-a-roo |
A Shiny-based web application developed to collect participant responses, confidence ratings, and demographic information for the user study. |
| Minimum inter-cluster distance |
The smallest distance between any point in one cluster and any point in another cluster. It captures the closest approach of clusters. |
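A direct reading of this definition is a minimum over all cross-cluster pairs, as in the brute-force sketch below. The function name is hypothetical and Euclidean distance is assumed.

```python
import math
from itertools import product

def min_inter_cluster_distance(cluster_a, cluster_b):
    """Smallest Euclidean distance between a point in cluster_a and a point in cluster_b."""
    return min(math.dist(p, q) for p, q in product(cluster_a, cluster_b))

# Two tiny 3-D clusters; the closest cross-cluster pairs are 3 units apart.
cluster_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
cluster_b = [(4.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
d_min = min_inter_cluster_distance(cluster_a, cluster_b)
```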
| Misidentification |
A participant response indicating that two displays show different data when they actually show the same data, or vice versa. |
| Model diagnostics |
Visual and quantitative tools used to assess how well an NLDR-based model fits the high-dimensional data, including error views, tours, and linked brushing. |
| Model fitting pipeline |
The sequence of steps in quollr, including scaling, hexagonal binning, centroid extraction, triangulation, lifting into p\text{-}D, and diagnostic computation. |
| Model-based clustering |
A probabilistic clustering approach that assumes data are generated from a mixture of distributions. |
| Model-in-the-data-space |
A visualization principle in which a fitted model is overlaid directly on the observed data in the original high-dimensional space to assess model fit. |
| Multicluster dataset |
A dataset composed of multiple clusters, each potentially generated from a different geometric shape or distribution. |
| Multidimensional scaling (MDS) |
A family of methods that create low-dimensional representations by preserving pairwise distances from high-dimensional space. |
| Möbius strip |
A twisted surface with only one side and one edge, often used to test how algorithms handle non-orientable geometry. |
| NLDR (Nonlinear Dimension Reduction) |
A class of methods that project high-dimensional data into lower dimensions while preserving nonlinear structure. |
| NLDR layout |
The two-dimensional embedding produced by an NLDR method (e.g., tSNE, UMAP), where each point represents a high-dimensional observation. |
| NLDR selection |
The process of choosing one or more nonlinear dimension reduction layouts that best represent the structure of high-dimensional data, based on visual and quantitative diagnostics. |
| Nearest-neighbor ordering |
A possible effect of NLDR methods where points are arranged in a way that preserves local neighborhoods but may impose unintended global ordering. |
| Neighborhood preservation |
The extent to which local proximity relationships among observations are maintained between high- and low-dimensional spaces. |
| Neighborhood structure |
The pattern of local adjacency relationships among bins or points in the 2\text{-}D layout. |
| Noise dimensions |
Additional variables added to a dataset, typically drawn from random distributions, to increase dimensionality without adding structure. |
| Non-empty bin |
A hexagonal bin containing at least one observation. |
| Nonlinear dimension reduction (NLDR) |
A class of methods that map high-dimensional data into a lower-dimensional space using nonlinear transformations, often to reveal structure not visible through linear projections. |
| Nonlinear geometry |
Data structures that cannot be adequately represented by linear projections, such as spirals or curved surfaces. |
| Nonlinear surface |
A warped 2\text{-}D surface embedded in higher dimensions, showing bends, waves, or sharp changes. |
| Orthogonal rotation |
A transformation that rotates data while preserving distances and overall shape. |
| PBMC dataset |
A single-cell RNA-seq dataset of peripheral blood mononuclear cells commonly used to benchmark dimension reduction and clustering methods. |
| PHATE (Potential of Heat-diffusion for Affinity-based Trajectory Embedding) |
An NLDR method based on diffusion processes, designed to capture both global geometry and continuous transitions in the data. |
| PaCMAP (Pairwise Controlled Manifold Approximation) |
An NLDR method that uses different types of point pairs to control local, mid-range, and global structure preservation. |
| Pancake effect |
An observed artifact where a fitted model collapses into a near-flat structure in high-dimensional space, indicating loss of intrinsic dimensionality. |
| Participant (subject) |
An individual recruited to take part in the user study and complete the evaluation trials. |
| Participant-level random effect |
A model component capturing individual differences in accuracy or response behavior across participants. |
| Perception and misperception |
The ways in which viewers correctly or incorrectly interpret visual patterns in data visualizations. |
| Perceptual identification/ accuracy |
The task performed by participants: deciding whether a 2\text{-}D NLDR plot and a tour represent the same underlying data. |
| Piling |
A phenomenon in linear projections where many points overlap or concentrate near the center of the display, potentially obscuring important structure. |
| Pilot data |
Preliminary experimental data used to estimate performance levels and inform the design of the main experiment, including sample size and effect size selection. |
| Point-level diagnostics |
Residual and error measures computed for individual observations to identify local regions of poor model fit. |
| Polar angle |
The angle controlling vertical position on a sphere (like latitude). |
| Polynomial structure |
A curved pattern (quadratic or cubic) defined by polynomial relationships between variables. |
| Power analysis |
A procedure used to determine the number of responses required per treatment to reliably detect a specified effect size with high probability. |
| Prediction |
The process of assigning a new high-dimensional observation to a location in the 2\text{-}D layout based on nearest neighbors in the lifted model. |
| Principal Component Analysis (PCA) |
A linear dimension reduction method that identifies orthogonal directions (principal components) capturing the maximum variance in the data. |
| Prolific |
An online crowd-sourcing platform used to recruit participants for the experiment. |
| Pyramid |
A shape with a broad base that narrows toward an apex, with different possible base shapes (rectangular, triangular, star-shaped). |
| Quarto |
A scientific and technical publishing system used to create reproducible documents, presentations, and websites combining text, code, and visualizations. |
| Radius scaling |
Changing the size of a cross-section as you move along an axis (e.g. shrinking toward a tip). |
| Random Neighborhood Preservation (RNX) curve |
A metric that quantifies neighborhood agreement across scales between high- and low-dimensional spaces. |
| Random Triplet Accuracy (RTA) |
A metric assessing how well relative distances among random triplets of points are preserved in the embedding. |
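A minimal sketch of this idea, assuming the common formulation in which a triplet (i, j, k) counts as preserved when the ordering of the distances d(i, j) and d(i, k) is the same in both spaces. The function name, the number of triplets, and the seed are illustrative assumptions.

```python
import math
import random

def random_triplet_accuracy(high, low, n_triplets=1000, seed=0):
    """Fraction of random triplets whose distance ordering agrees in both spaces."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n_triplets):
        i, j, k = rng.sample(range(len(high)), 3)
        high_order = math.dist(high[i], high[j]) < math.dist(high[i], high[k])
        low_order = math.dist(low[i], low[j]) < math.dist(low[i], low[k])
        agree += high_order == low_order
    return agree / n_triplets

# Points on a straight line in 3-D and a 2-D layout that keeps their order.
high = [(float(t), 2.0 * t, 0.0) for t in range(10)]
low = [(float(t), 0.0) for t in range(10)]
rta = random_triplet_accuracy(high, low)
```

Because the 2-D layout here preserves every distance ordering, the score is 1; layouts that distort relative distances would score lower.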
| Random effect |
A component of a statistical model that captures variability attributable to random differences across experimental units, such as subjects or data sets. |
| Reliability of NLDR representations |
The degree to which a low-dimensional embedding faithfully reflects the structure present in the high-dimensional data. |
| Residual |
The Euclidean distance between an observation and its corresponding fitted bin centroid in high-dimensional space. |
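The sketch below illustrates this definition: high-dimensional centroids are computed as the mean of the observations in each bin, and each residual is the distance from an observation to its bin's centroid. The function names and the toy data are assumptions for illustration only.

```python
import math
from collections import defaultdict

def bin_centroids(data, bin_ids):
    """Mean of the high-dimensional observations assigned to each bin."""
    groups = defaultdict(list)
    for row, h in zip(data, bin_ids):
        groups[h].append(row)
    return {
        h: tuple(sum(col) / len(rows) for col in zip(*rows))
        for h, rows in groups.items()
    }

def residuals(data, bin_ids):
    """Euclidean distance from each observation to its bin's centroid."""
    cents = bin_centroids(data, bin_ids)
    return [math.dist(row, cents[h]) for row, h in zip(data, bin_ids)]

# Two observations share bin 0; the third is alone in bin 1.
data = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (5.0, 5.0, 5.0)]
bin_ids = [0, 0, 1]
res = residuals(data, bin_ids)
```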
| Residual-based evaluation |
An approach to assessing NLDR quality by examining differences between fitted model values and observed high-dimensional data. |
| Resolution parameter |
Any parameter (such as binwidth or neighborhood size) that controls the granularity at which structure is modeled or visualized. |
| Rotation |
A transformation applied to data that changes its orientation in space while preserving distances and geometric relationships. |
| Rotation matrix |
A matrix that changes the orientation of data without changing distances or variance. |
| Run sheets |
Step-by-step task instructions provided to participants during a usability study to ensure consistency in task execution and to facilitate systematic feedback collection. |
| S-curve |
A smooth, bent surface in 3\text{-}D often used to test whether algorithms can unfold nonlinear structure. |
| S-curve with hole |
An S-shaped manifold with a missing spherical region, creating topological complexity. |
| Same (SAME) trials |
Trials in which the 2\text{-}D NLDR embedding and the tour are generated from the same high-dimensional dataset. These trials form the primary basis for analysis. |
| Scaled distance |
A distance measure adjusted to account for differences in scale across simulated data structures. |
| Scaling |
A transformation that adjusts the spread or magnitude of a geometric shape along one or more dimensions. |
| Scaling (data scaling) |
Standardizing data before analysis so that variables have comparable ranges. |
| Scaling of NLDR data |
Rescaling the 2\text{-}D embedding to a standardized range to preserve aspect ratio and ensure consistent binning. |
| Server-side computation |
A computational model in which data processing and analysis are performed on a remote server rather than on the user's local machine. In menuraR, NLDR generation and diagnostics are executed on the Shiny server. |
| Shape generator |
A function that produces synthetic data points according to a specified geometric form and set of parameters. |
| Shepard diagram |
A scatterplot comparing pairwise distances in the original space to distances in the embedded space. |
| Simulation-based power analysis |
A power analysis approach that uses simulated data, rather than closed-form formulas, to assess the ability to detect effects under realistic experimental conditions. |
| Single-cell RNA sequencing (scRNA-seq) |
A technology that measures gene expression at the level of individual cells, producing extremely high-dimensional datasets. |
| Sparse sampling |
Uneven sampling density across different regions of a manifold, which can distort NLDR layouts by contracting low-density regions. |
| Spearman correlation (SC) |
A rank-based correlation used to assess monotonic agreement between high- and low-dimensional distances. |
| Sphere |
The surface of a ball. Points lie only on the boundary, not inside. |
| Spherical spiral |
A spiral path that wraps around the surface of a sphere. |
| Spins |
The number of turns or revolutions in a spiral structure. |
| Standardization |
A preprocessing step in which variables are rescaled to have comparable ranges (typically mean zero and unit variance) before applying NLDR methods. |
| Standardized bin count (w_h) |
The proportion of total observations contained in a bin, used to identify low-density regions. |
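Following the definition above, the standardized count for bin h can be sketched as the bin's count divided by the total number of observations. The function name and the 0.15 cut-off used to flag low-density bins are hypothetical choices for this example.

```python
from collections import Counter

def standardized_bin_counts(bin_ids):
    """Map each bin id to the proportion of all observations it contains."""
    counts = Counter(bin_ids)
    n = len(bin_ids)
    return {h: c / n for h, c in counts.items()}

# Ten observations spread over four bins.
bin_ids = [0, 0, 0, 1, 1, 2, 2, 2, 2, 3]
w = standardized_bin_counts(bin_ids)

# Flag bins below an illustrative threshold (0.15 is an assumed value).
low_density = sorted(h for h, w_h in w.items() if w_h < 0.15)
```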
| Static plot |
A single, fixed 2\text{-}D visualization of an NLDR embedding. |
| Stress function |
An objective function used in MDS to quantify the mismatch between distances in the original space and the embedded space. |
| Structure exaggeration |
The tendency of NLDR methods to amplify apparent patterns, such as clusters or separations, in low-dimensional embeddings. |
| Structured noise |
Noise that follows smooth or patterned trends instead of being purely random. |
| Subject variability |
Differences in performance attributable to individual participants' perceptual abilities, attention, and strategies. This variability is modeled using a subject-level random effect. |
| Swiss roll |
A flat surface rolled up into a spiral in 3\text{-}D, commonly used as a nonlinear manifold example. |
| Synthetic dataset |
An artificially generated dataset designed to exhibit specific structural or statistical properties. |
| Technical barrier |
Practical or knowledge-based obstacles (e.g., software installation, programming requirements) that can limit access to advanced analytical tools. |
| Topological complexity |
Features like holes, loops, or twists that affect connectivity but not local smoothness. |
| Tour |
A dynamic visualization technique that presents a continuous sequence of 2\text{-}D linear projections of high-dimensional data, allowing structure to be explored from multiple viewing angles. |
| Treatment |
A distinct experimental condition formed by a specific combination of factors, here defined as an NLDR method paired with a distance scale factor. |
| Trefoil knot |
A closed loop that passes over and under itself three times without intersecting itself, forming the simplest nontrivial knot; used to test preservation of topology. |
| TriMAP |
An NLDR method that preserves relative distances among triplets of points to balance local and global structure. |
| Trial |
A single evaluation task in which a participant compares two visual displays and judges whether they represent the same data. |
| Triangular mesh (triangulation) |
A network of edges connecting neighboring bin centroids, derived using Delaunay triangulation to encode local neighborhood relationships. |
| Trigonometric structure |
A geometric pattern generated using sine and cosine functions. |
| True model (geometric structure) |
The subset of variables and relationships that define the underlying data-generating manifold, which NLDR aims to capture. |
| Twisting |
A distortion in NLDR where the fitted manifold rotates or bends excessively in high-dimensional space. |
| UMAP (Uniform Manifold Approximation and Projection) |
An NLDR method that balances local and global structure preservation using manifold learning principles. |
| Uniform distribution |
All values within a range are equally likely. |
| Unimodal distribution |
A probability distribution with a single dominant peak. The distribution of subject-level proportions correct is approximately unimodal, supporting the use of normally distributed random effects. |
| Unsupervised learning |
A class of methods that identify patterns or structures in data without using labeled outcomes. |
| Usability survey |
A structured evaluation method used to assess how easily users can learn, navigate, and complete tasks within an application. In this study, it was used to inform interface design and feature refinement in menuraR. |
| User interaction logs |
Records of user actions within an application, such as layout generation and comparison steps, used to evaluate usability and identify design improvements. |
| User study |
An empirical experiment involving human participants, designed to assess how people interact with and interpret visual representations. |
| Version control |
A system for tracking and managing changes to files, enabling collaboration, rollback, and transparent development history. |
| Visual conceptualization |
The mental interpretation formed by a viewer when observing a visualization, such as perceiving the number of clusters or their relationships. |
| Visualization technique |
A method for graphically representing data to aid interpretation, exploration, or comparison of structures. |
| Wavy noise dimensions |
Noise variables that oscillate smoothly, often following sine or polynomial patterns. |
| Web-based interface |
A graphical user interface accessed through a web browser that enables interaction with computational tools without requiring local software installation. |
| Wireframe model |
A geometric representation of an NLDR layout constructed by connecting neighboring bin centroids with line segments, forming a mesh that approximates the layout's structure. |
| Within-cluster dispersion/ distance |
A measure of how spread out points are within each cluster. |
| Wrapper function |
A helper function that calls another function but simplifies inputs or outputs. |
cardinalR |
An R package designed to generate high-dimensional clustering data structures with controlled geometric and noise properties. |
detourr |
An R package for dynamic visualization of high-dimensional data using tour methods, supporting interactive exploration through projection sequences. |
knitr |
An R package that supports dynamic report generation by integrating R code with documents written in LaTeX, Markdown, or Quarto. |
langevitour |
An R package used to generate smooth, continuous tours for exploring high-dimensional data and fitted models. |
menuraR |
A Shiny web application for comparing, diagnosing, and selecting the most reasonable NLDR layouts. |
quollr |
An R package developed to diagnose and evaluate NLDR layouts using visual and quantitative methods. |
shiny |
An R framework for building interactive web applications directly from R code. |
| autoAlt |
An R package developed by the NUMBATs group that provides automated suggestions for generating alt-text for data visualizations. |
| scDEED |
A method for evaluating NLDR embeddings by assigning reliability scores based on neighborhood preservation before and after embedding. |
| shinyapps.io |
A cloud-based hosting platform for R Shiny applications that allows users to access interactive tools such as menuraR through a web browser without local installation. |
| tSNE (t-distributed Stochastic Neighbor Embedding) |
An NLDR method that prioritizes preservation of local neighborhoods, often at the expense of global structure. |