Statistical functions
Introduction
This package, stats, provides a set of functions for statistical analysis, data processing, and geospatial operations. It includes functions for calculating percentiles, clustering data, normalizing values, generating charts, and creating styles for thematic maps. These utilities are aimed at data scientists, analysts, and developers who work with geospatial data.
Installation
Install the package with npm (or the equivalent yarn add command):
npm install @icgcat/stats
Usage
Import and Basic Example
Here's how to use the stats functions in your application:
<script>
import { stats } from '@icgcat/stats';
const percentiles50 = stats.calculatePercentiles([1, 2, 3, 4, 5], [50]);
console.log(percentiles50);
const percentiles255075 = stats.calculatePercentiles([1, 2, 3, 4, 5], [25,50,75]);
console.log(percentiles255075);
</script>
Function Descriptions
calculatePercentiles(numericArray,targetPercentile)
Description: Calculates specific percentiles for an array of numeric values.
Parameters:
- numericArray: Array of numeric values.
- targetPercentile: Percentile or percentiles to calculate, passed as an array such as [50] or [25, 50, 75].
Returns: An array with the calculated percentiles.
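To illustrate what a percentile calculation involves, here is a minimal sketch that interpolates linearly between the closest ranks. The helper name percentilesSketch is hypothetical and the interpolation rule is an assumption; the package may use a different method, so its results can differ when a percentile falls between two data points. The sketch also assumes a non-empty input array.

```javascript
// Hypothetical helper for illustration only; not part of @icgcat/stats.
function percentilesSketch(values, targets) {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...values].sort((a, b) => a - b);
  return targets.map((p) => {
    // Fractional position of the percentile within the sorted data.
    const rank = (p / 100) * (sorted.length - 1);
    const lower = Math.floor(rank);
    const upper = Math.ceil(rank);
    const weight = rank - lower;
    // Interpolate between the two neighbouring values.
    return sorted[lower] + (sorted[upper] - sorted[lower]) * weight;
  });
}

console.log(percentilesSketch([1, 2, 3, 4, 5], [25, 50, 75])); // [ 2, 3, 4 ]
```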
sortNumericArray(numericArray)
Description: Sorts an array of numeric values in ascending order.
Parameters:
- numericArray: Array of numeric values.
Returns: Sorted array.
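A dedicated numeric sort matters because JavaScript's default Array.prototype.sort compares elements as strings. The snippet below shows the difference using plain language features; it does not call the package, and whether sortNumericArray sorts in place or returns a copy is not stated here.

```javascript
const readings = [10, 9, 1, 100];

// Default sort converts elements to strings, so "100" sorts before "9".
const lexicographic = [...readings].sort();

// A numeric comparator gives true ascending order.
const numeric = [...readings].sort((a, b) => a - b);

console.log(lexicographic); // [ 1, 10, 100, 9 ]
console.log(numeric); // [ 1, 9, 10, 100 ]
```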
performKMeansClustering(numericArray,numClusters)
Description: Performs k-means clustering on numeric data and returns cluster boundaries.
Parameters:
- numericArray: Array of numeric values.
- numClusters: Number of clusters.
Returns: Array of cluster boundaries.
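For readers unfamiliar with the technique, the following is a rough sketch of one-dimensional k-means that reports boundaries as the overall minimum followed by the maximum of each cluster. The function name kMeans1DSketch, the seeding strategy, and the boundary format are all assumptions made for illustration; the package's own initialization and output shape may differ. A non-empty input is assumed.

```javascript
// Hypothetical illustration of 1-D k-means; not the package's implementation.
function kMeans1DSketch(values, k, maxIterations = 100) {
  const sorted = [...values].sort((a, b) => a - b);
  // Seed centroids at evenly spaced positions in the sorted data (an assumption).
  let centroids = Array.from({ length: k }, (_, i) =>
    sorted[Math.floor(((i + 0.5) * sorted.length) / k)]
  );
  let clusters = [];
  for (let iter = 0; iter < maxIterations; iter++) {
    // Assignment step: each value joins its nearest centroid.
    clusters = Array.from({ length: k }, () => []);
    for (const v of sorted) {
      let best = 0;
      for (let c = 1; c < k; c++) {
        if (Math.abs(v - centroids[c]) < Math.abs(v - centroids[best])) best = c;
      }
      clusters[best].push(v);
    }
    // Update step: move each centroid to the mean of its members.
    const next = clusters.map((members, c) =>
      members.length
        ? members.reduce((sum, v) => sum + v, 0) / members.length
        : centroids[c]
    );
    const converged = next.every((m, c) => m === centroids[c]);
    centroids = next;
    if (converged) break;
  }
  // Report the overall minimum, then the upper bound of each non-empty cluster.
  const nonEmpty = clusters.filter((members) => members.length > 0);
  return [nonEmpty[0][0], ...nonEmpty.map((m) => m[m.length - 1])];
}

console.log(kMeans1DSketch([1, 2, 3, 10, 11, 12], 2)); // [ 1, 3, 12 ]
```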
getColumnTitleFromCSV(columnIndex,csvData)
Description: Retrieves a column title from a CSV dataset.
Parameters:
- columnIndex: Column index.
- csvData: CSV dataset.
Returns: Title of the column.
computeNormalDistribution(standardDeviation,meanValue,dataValues)
Description: Computes normal distribution values for a dataset.
Parameters:
- standardDeviation: Standard deviation of the dataset.
- meanValue: Mean value of the dataset.
- dataValues: Array of data values.
Returns: Object with labels, data, and sample data.
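The curve behind a normal distribution chart comes from the Gaussian probability density function. The sketch below evaluates that density at evenly spaced points within three standard deviations of the mean. The names normalPdf and normalCurveSketch, the number of points, and the { labels, data } shape are assumptions for illustration; the package's actual sampling and return fields may differ.

```javascript
// Gaussian probability density at x for the given mean and standard deviation.
function normalPdf(x, mean, sd) {
  const z = (x - mean) / sd;
  return Math.exp(-0.5 * z * z) / (sd * Math.sqrt(2 * Math.PI));
}

// Hypothetical helper: samples the curve from mean - 3sd to mean + 3sd.
function normalCurveSketch(standardDeviation, meanValue, points = 7) {
  const start = meanValue - 3 * standardDeviation;
  const step = (6 * standardDeviation) / (points - 1);
  const labels = Array.from({ length: points }, (_, i) => start + i * step);
  const data = labels.map((x) => normalPdf(x, meanValue, standardDeviation));
  return { labels, data };
}

const curve = normalCurveSketch(1, 0);
console.log(curve.labels); // [ -3, -2, -1, 0, 1, 2, 3 ]
console.log(curve.data[3].toFixed(4)); // "0.3989", the peak at the mean
```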
generateQuantilesChart(dataColumn,numRanges,colorArray,decimalPrecision)
Description: Generates a chart with quantiles-based data ranges.
Parameters:
- dataColumn: Data array.
- numRanges: Number of ranges.
- colorArray: Colors for ranges.
- decimalPrecision: Decimal precision for calculations.
Returns: Object with chart data, labels, colors, and summary statistics.
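As a sketch of the idea behind quantile-based ranges, the code below splits a dataset into ranges that hold roughly equal numbers of values and counts the members of each range. The name quantileRangesSketch, the interpolation rule, and the { bounds, counts } return shape are hypothetical; the colors, labels, and summary statistics produced by the package are not reproduced here.

```javascript
// Hypothetical illustration of quantile classification; not the package's code.
function quantileRangesSketch(values, numRanges) {
  const sorted = [...values].sort((a, b) => a - b);
  // Boundaries at 0%, 1/n, 2/n, ... 100%, with linear interpolation.
  const bounds = Array.from({ length: numRanges + 1 }, (_, i) => {
    const rank = (i / numRanges) * (sorted.length - 1);
    const lower = Math.floor(rank);
    const upper = Math.ceil(rank);
    return sorted[lower] + (sorted[upper] - sorted[lower]) * (rank - lower);
  });
  // Each range includes its lower bound; the last range also includes its upper bound.
  const counts = new Array(numRanges).fill(0);
  for (const v of sorted) {
    let index = numRanges - 1;
    for (let i = 0; i < numRanges - 1; i++) {
      if (v < bounds[i + 1]) {
        index = i;
        break;
      }
    }
    counts[index]++;
  }
  return { bounds, counts };
}

console.log(quantileRangesSketch([1, 2, 3, 4, 5, 6, 7, 8], 4));
// { bounds: [ 1, 2.75, 4.5, 6.25, 8 ], counts: [ 2, 2, 2, 2 ] }
```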
normalizeBounds(rangeBounds, precision)
Description: Normalizes data range boundaries with specified precision.
Parameters:
- rangeBounds: Array of boundaries.
- precision: Number of decimal places for normalization.
Returns: Array of normalized boundaries.
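One plausible reading of "normalizing" boundaries is rounding each one to a fixed number of decimal places so that legend labels stay readable. The sketch below does only that; the name roundBoundsSketch is hypothetical, and the package may apply additional rules (for example, de-duplicating boundaries) that are not shown here.

```javascript
// Hypothetical helper: rounds every boundary to the requested precision.
function roundBoundsSketch(rangeBounds, precision) {
  // toFixed returns a string, so convert back to a number.
  return rangeBounds.map((bound) => Number(bound.toFixed(precision)));
}

console.log(roundBoundsSketch([1.23456, 2.98765, 10], 2)); // [ 1.23, 2.99, 10 ]
```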
countUniqueValues(arrayColumn, geoStatsMap, colorArray)
Description: Counts unique values in a dataset.
Parameters:
- arrayColumn: Data array.
- geoStatsMap: GeoStats object for classification.
- colorArray: Array of colors for visual representation.
Returns: Object with labels, data, colors, and ticks.
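The counting step can be sketched with a Map, which keeps labels in first-seen order. The helper countUniqueSketch is hypothetical and covers only the labels and data fields; the colors and ticks that the package returns depend on the GeoStats object and color array and are left out.

```javascript
// Hypothetical helper: tallies how often each distinct value occurs.
function countUniqueSketch(values) {
  const counts = new Map();
  for (const value of values) {
    counts.set(value, (counts.get(value) || 0) + 1);
  }
  return { labels: [...counts.keys()], data: [...counts.values()] };
}

console.log(countUniqueSketch(["urban", "rural", "urban", "urban"]));
// { labels: [ 'urban', 'rural' ], data: [ 3, 1 ] }
```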
getRawValueByCode(code, jsonStatData, parseAsInteger, geoField)
Description: Retrieves raw values by a specific code from JSON data.
Parameters:
- code: The lookup code.
- jsonStatData: JSON dataset.
- parseAsInteger: Boolean to parse the code as an integer.
- geoField: Geographic identifier field.
Returns: Value corresponding to the code.
getZscoreValueByCode(code, jsonStatData, parseAsInteger, geoField)
Description: Gets Z-score for a specific code in the dataset.
Parameters:
- code: The lookup code.
- jsonStatData: JSON dataset.
- parseAsInteger: Boolean to parse the code as an integer.
- geoField: Geographic identifier field.
Returns: Z-score value.
getNameZscoreValueByCode(code, jsonStatData, parseAsInteger, geoField)
Description: Retrieves the Z-score value for a specific code from a JSON dataset.
Parameters:
- code: The lookup code.
- jsonStatData: Array of JSON objects containing the data.
- parseAsInteger: Boolean to parse the code as an integer.
- geoField: Geographic field used to match the code.
Returns: Z-score value corresponding to the code, or "sense dades" (Catalan for "no data") if not found.
findOutlierScore(dataArray, geoStatsObj)
Description: Calculates Z-scores for a dataset and identifies outliers.
Parameters:
- dataArray: Array of data objects, each containing a value field.
- geoStatsObj: GeoStats object for statistical calculations.
Returns: An object containing arrays of positive and negative outliers.
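A Z-score expresses how many standard deviations a value lies from the mean, z = (value - mean) / standardDeviation. The sketch below computes it without a GeoStats object and flags values beyond a cutoff. The name outlierSketch, the population standard deviation, the cutoff of 2, and the code field in the sample data are assumptions; the threshold the package applies is not documented here.

```javascript
// Hypothetical illustration of Z-score outlier detection; not the package's code.
function outlierSketch(dataArray, threshold = 2) {
  const values = dataArray.map((d) => d.value);
  const mean = values.reduce((sum, v) => sum + v, 0) / values.length;
  // Population variance (divides by n); an assumption for this sketch.
  const variance =
    values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / values.length;
  const sd = Math.sqrt(variance);
  // Attach a zscore to each item; a constant dataset gets zscore 0.
  const scored = dataArray.map((d) => ({
    ...d,
    zscore: sd === 0 ? 0 : (d.value - mean) / sd,
  }));
  return {
    positive: scored.filter((d) => d.zscore > threshold),
    negative: scored.filter((d) => d.zscore < -threshold),
  };
}

// Nine items with value 1 and one item with value 10.
const sample = Array.from({ length: 10 }, (_, i) => ({
  code: "c" + i,
  value: i === 9 ? 10 : 1,
}));
const outliers = outlierSketch(sample);
console.log(outliers.positive.map((d) => d.code)); // [ 'c9' ]
console.log(outliers.negative.length); // 0
```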
calculateStatsFromJSONSTAT(activeConcept, enforceNumeric, jsonStatData, numRanges, colorArray, thematicType, layerKeyField, geoField, language, defaultDecimals, decimalsArray, embedLegend)
Description: Processes statistical data from a JSON-STAT dataset and generates a detailed statistical summary.
Parameters:
- activeConcept: Active concept containing attributes and title.
- enforceNumeric: Whether to force numerical comparison.
- jsonStatData: Array of JSON objects with statistical data.
- numRanges: Number of ranges for classification (default is 4).
- colorArray: Array of colors for thematic representation.
- thematicType: Thematic type (e.g., "Quantils", "Jenks").
- layerKeyField: Key field in the layer for mapping values.
- geoField: Geographic field for statistical analysis.
- language: Current language for labels and messages.
- defaultDecimals: Default number of decimals for rounding values.
- decimalsArray: Array of decimal configurations for specific attributes.
- embedLegend: Whether to generate embedded elements in the legend.
Returns: An object containing statistical results, legend HTML, and chart data.
calculateStatsFromCSV(columnIndex, enforceNumeric, csvData, numRanges, colorArray, thematicType)
Description: Calculates statistical data and generates a thematic representation from CSV input.
Parameters:
- columnIndex: Column index in the CSV for analysis.
- enforceNumeric: Whether to force values to be treated as numbers.
- csvData: Array of CSV data.
- numRanges: Number of classification ranges (default is 4).
- colorArray: Array of colors for thematic mapping.
- thematicType: Type of thematic classification (e.g., "intervals", "jenks").
Returns: An object containing statistical results, legend, style, and chart data.
generateCustomHTMLLegend(normalizedTitle, rangeBounds, colorArray, countArray, language, embedLegend, activeRanges)
Description: Generates a custom HTML legend for visualizations.
Parameters:
- normalizedTitle: The title for the legend.
- rangeBounds: The range boundaries for the legend.
- colorArray: Colors for each range.
- countArray: Counts for each range.
- language: Language for formatting.
- embedLegend: Whether the legend is embedded.
- activeRanges: Active ranges for display.
Returns: HTML content of the legend.
convertAndValidateValue(value)
Description: Converts and validates a value for data processing.
Parameters:
- value: The value to validate and convert.
Returns: The converted or default value.
extractColumnAndKeysFromJSONSTAT(jsonStatData, geoFieldName)
Description: Extracts column data and keys from a JSON-STAT dataset.
Parameters:
- jsonStatData: JSON-STAT data array.
- geoFieldName: The field name for geolocation.
Returns: An object with arrays data and code.
extractColumnFromCSV(columnIndex, enforceNumeric, csvData)
Description: Extracts a specific column from a CSV dataset.
Parameters:
- columnIndex: The index of the column.
- enforceNumeric: Whether to parse values as numbers.
- csvData: CSV data as a 2D array.
Returns: An array of column values.
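With CSV data held as a 2D array, pulling out a column is a matter of mapping over the rows. The sketch below assumes the first row is a header and skips it when extracting values; that assumption, and the helper names columnTitleSketch and extractColumnSketch, are illustrative and may not match how the package treats the header row.

```javascript
// Hypothetical helpers for illustration; not part of @icgcat/stats.
function columnTitleSketch(columnIndex, csvData) {
  // Assumes the first row holds the column titles.
  return csvData[0][columnIndex];
}

function extractColumnSketch(columnIndex, enforceNumeric, csvData) {
  // Skip the assumed header row, then optionally convert each cell to a number.
  return csvData
    .slice(1)
    .map((row) => (enforceNumeric ? Number(row[columnIndex]) : row[columnIndex]));
}

const csvRows = [
  ["name", "population"],
  ["A", "10"],
  ["B", "20"],
];
console.log(columnTitleSketch(1, csvRows)); // population
console.log(extractColumnSketch(1, true, csvRows)); // [ 10, 20 ]
console.log(extractColumnSketch(0, false, csvRows)); // [ 'A', 'B' ]
```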
createGeoStatsObject(dataSeries)
Description: Creates a GeoStats object from a numeric data series.
Parameters:
- dataSeries: An array of numeric values.
Returns: A GeoStats object initialized with the series.
getColorArrayfromBrewer(paletteName, rangeCount, initialRangeCount)
Description: Retrieves a color array from a Brewer palette based on range count.
Parameters:
- paletteName: The name of the Brewer palette.
- rangeCount: The number of ranges required.
- initialRangeCount: The initial number of ranges for the palette.
Returns: An array of colors for the specified range count.
brewerToColorScaleArray(paletteName, rangeCount)
Description: Converts a Brewer palette to a color scale array.
Parameters:
- paletteName: The name of the Brewer palette.
- rangeCount: The number of ranges required.
Returns: An array of interpolated colors.
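Interpolating a palette means stretching a short list of color stops to however many ranges are needed. The sketch below does this with linear interpolation in RGB space. The helper names and the two-stop grayscale input are hypothetical, and the package may rely on a color library and a different color space, so its output is likely to differ.

```javascript
// Parses "#rrggbb" into [r, g, b].
function hexToRgb(hex) {
  const n = parseInt(hex.slice(1), 16);
  return [(n >> 16) & 255, (n >> 8) & 255, n & 255];
}

// Formats [r, g, b] back into "#rrggbb", rounding each channel.
function rgbToHex(rgb) {
  return (
    "#" + rgb.map((c) => Math.round(c).toString(16).padStart(2, "0")).join("")
  );
}

// Hypothetical helper: spreads `count` colors evenly across the given stops.
function interpolatePaletteSketch(stops, count) {
  if (count === 1) return [stops[0]];
  return Array.from({ length: count }, (_, i) => {
    // Fractional position of this color along the list of stops.
    const position = (i / (count - 1)) * (stops.length - 1);
    const lower = Math.floor(position);
    const upper = Math.min(lower + 1, stops.length - 1);
    const t = position - lower;
    const from = hexToRgb(stops[lower]);
    const to = hexToRgb(stops[upper]);
    return rgbToHex(from.map((c, channel) => c + (to[channel] - c) * t));
  });
}

console.log(interpolatePaletteSketch(["#000000", "#ffffff"], 3));
// [ '#000000', '#808080', '#ffffff' ]
```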
insertPaletteColor(activePaletteIndex, rangeCount)
Description: Selects a palette from a predefined list, generates a color ramp with the specified number of ranges, and builds an HTML representation of it in which the active palette is highlighted.
Parameters:
- activePaletteIndex: The index of the active palette.
- rangeCount: The number of ranges for the color ramp.
Returns: An object containing the HTML string, a default color example (colorExample), and the initial color array (initialColors).
Dependencies
@icgcat/stats integrates the following libraries:
Developed by:
License
This project is licensed under the MIT License.