modern-diacritics

Mitsunee · 9.4k · MIT · 2.3.1

A modern way to latinize/ascii-fold strings and normalize symbols.

diacritics, remove, removal, search

Modern Diacritics

A modern way to latinize/ascii-fold strings and normalize symbols. Particularly useful for writing search filters.

  • Modern fork of @andrewrk's node-diacritics with many new features
  • Dual-published as ESM and CJS modules
  • Normalizes similar symbols like quotation marks
  • Diacritic removal and symbol normalization are also available as separate functions
  • Provides slugify function with built-in latinization!

latinize("Iлtèrnåtïonɑlíƶatï߀ԉ");
// => "Internationalizati0n"
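
Since the package is aimed at search filters, here is a minimal sketch of what such a filter could look like. makeSearchFilter and standInNormalize are hypothetical names that are not part of the package. The stand-in normalizer relies only on built-in string methods so the snippet runs without dependencies, and it handles nothing beyond combining marks. In an application you would presumably pass something like (s) => latinize(s, { lowerCase: true, trim: true }) instead.

```javascript
// Hypothetical helper (not part of modern-diacritics): builds a filter that
// compares items and queries after both went through the same normalizer.
function makeSearchFilter(items, normalize) {
  // Normalize every item once up front; each query then only normalizes itself.
  const index = items.map((item) => ({ item, key: normalize(item) }));
  return (query) => {
    const needle = normalize(query);
    return index
      .filter((entry) => entry.key.includes(needle))
      .map((entry) => entry.item);
  };
}

// Stand-in for this demo only (assumption): strips combining marks via NFD,
// lowercases and trims. It is far less thorough than latinize.
function standInNormalize(text) {
  return text
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase()
    .trim();
}

const search = makeSearchFilter(
  ["Crêpes", "Café au lait", "Croissant"],
  standInNormalize
);

console.log(search("crepe")); // one match: "Crêpes"
console.log(search(" CAFE ")); // one match: "Café au lait"
```

Normalizing the items once and the query on every call keeps both sides of the comparison in the same form, which is the point of using a single normalizer for filtering.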

Installation

npm install modern-diacritics
# or
yarn add modern-diacritics

Usage

latinize

The supplied string is latinized and its symbols are normalized.

import { latinize } from "modern-diacritics";

latinize("Hêƚƚó, ’worƚd‘!"); // => "Hello, 'world'!"

latinize uses removeDiacritics and normalizeSymbols internally. They are available separately for applications where you may not want to fully latinize strings. Options are passed along internally where applicable.

Options:

// Symbols option: on by default, disable to preserve symbols
latinize("Hêƚƚó, ’worƚd‘!", { symbols: false });
// => "Hello, ’world‘!"

// Lowercase option: off by default, enable to transform to all lowercase characters
latinize("Hêƚƚó, ’worƚd‘!", { lowerCase: true });
// => "hello, 'world'!"

// Trim option: off by default, enable to trim whitespace at start and end of string
latinize(" Hêƚƚó, ’worƚd‘!  ", { trim: true });
// => "Hello, 'world'!"

normalizeSymbols

Normalizes symbols in the supplied string and trims whitespace at the start and end of the string (can be disabled, see Options).

import { normalizeSymbols } from "modern-diacritics";

normalizeSymbols(" “Hauptstraße” ");
// => '"Hauptstraße"'

Options:

// Trim option: on by default, disable to preserve all whitespace characters as spaces
normalizeSymbols(" “Hauptstraße” ", { trim: false });
// => ' "Hauptstraße" '

// Force Single Space option: off by default, enable to replace consecutive whitespace characters with a single space
normalizeSymbols(" “Hauptstraße   42” ", { forceSingleSpace: true });
// => '"Hauptstraße 42"'

// Replace Whitespace option: off by default, set to any string to use it as the replacement for whitespace
normalizeSymbols(" “Hauptstraße   42” ", { replaceWhiteSpace: "_" });
// => '"Hauptstraße_42"'
normalizeSymbols(" “Hauptstraße   42” ", {
  replaceWhiteSpace: "_",
  trim: false
});
// => '_"Hauptstraße_42"_'
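
To make the interplay of the three whitespace options easier to follow, here is a rough sketch that reproduces only the whitespace part of the examples above. whitespaceSketch is a hypothetical stand-in, not the package's code: it does no symbol normalization, and the assumption that a run of whitespace collapses into one replacement string is inferred from the sample outputs.

```javascript
// Hypothetical stand-in, NOT the implementation of normalizeSymbols.
// It mimics only the documented whitespace behavior; quotes are left alone.
function whitespaceSketch(input, options = {}) {
  const { trim = true, forceSingleSpace = false, replaceWhiteSpace } = options;

  // trim is on by default and runs before any replacement.
  const text = trim ? input.trim() : input;

  if (typeof replaceWhiteSpace === "string") {
    // Assumption taken from the examples: each run of whitespace
    // becomes a single copy of the replacement string.
    return text.replace(/\s+/g, replaceWhiteSpace);
  }
  if (forceSingleSpace) {
    // Collapse each run of whitespace into one space.
    return text.replace(/\s+/g, " ");
  }
  // Default: keep the amount of whitespace, but as plain spaces.
  return text.replace(/\s/g, " ");
}

console.log(whitespaceSketch(' "Hauptstraße   42" ', { forceSingleSpace: true }));
// => '"Hauptstraße 42"'
console.log(whitespaceSketch(' "Hauptstraße   42" ', { replaceWhiteSpace: "_" }));
// => '"Hauptstraße_42"'
console.log(whitespaceSketch(' "Hauptstraße   42" ', { replaceWhiteSpace: "_", trim: false }));
// => '_"Hauptstraße_42"_'
```

Because trimming happens first in this sketch, the leading and trailing underscores appear only when trim is disabled, which matches the last documented example.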

removeDiacritics

Provides simplified diacritic removal, which does not further latinize strings or normalize symbols.

import { removeDiacritics } from "modern-diacritics";

removeDiacritics("Crêpes");
// => "Crepes"

Options:

// Lowercase option: off by default, enable to transform to all lowercase characters
removeDiacritics("Crêpes", { lowerCase: true });
// => "crepes"
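
For comparison, the snippet below shows the generic Unicode technique for stripping diacritics with built-in string methods. It is not a description of how removeDiacritics works internally. stripCombiningMarks is a hypothetical name, and the sketch only illustrates where decomposition alone stops: letters such as ƚ, which latinize maps to l in the example further up, have no Unicode decomposition and pass through unchanged.

```javascript
// Generic technique, NOT the package's implementation (assumption-free JS built-ins):
// decompose characters into base letter + combining marks, then drop the marks.
function stripCombiningMarks(text) {
  return text.normalize("NFD").replace(/\p{M}/gu, "");
}

console.log(stripCombiningMarks("Crêpes"));
// => "Crepes"

// ƚ (l with bar) is a letter of its own without a decomposition,
// so normalization leaves it in place.
console.log(stripCombiningMarks("Hêƚƚó"));
// => "Heƚƚo"
```

Closing that gap takes an explicit character mapping rather than decomposition alone.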

slugify

The supplied string is latinized and then turned into a slug:

import { slugify } from "modern-diacritics";

slugify("HêLLó, worLd!"); // => "hello-world"

Whitespace, underscores, and parentheses are replaced with dashes. All other symbols are removed! Internally, slugify sets the lowerCase and replaceWhiteSpace options. trim is not applied by default, so spaces are transformed into dashes.

Options:

slugify supports normalizeSymbols's trim and forceSingleSpace options. For backwards compatibility these two options use false as their default value.
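
The replacement rules described in this section can be sketched roughly as follows. slugSketch is a hypothetical helper, not the package's slugify: it expects text that is already latinized, and how the real function treats runs of separators or trailing separators is an open assumption here.

```javascript
// Hypothetical sketch of the slug step only (NOT the package's slugify).
// Assumes the input has already been latinized.
function slugSketch(latinized) {
  return latinized
    .toLowerCase()
    // Whitespace, underscores and parentheses each become a dash.
    .replace(/[\s_()]/g, "-")
    // Every other symbol is dropped.
    .replace(/[^a-z0-9-]/g, "");
}

console.log(slugSketch("HeLLo, worLd!"));
// => "hello-world"

// This sketch neither trims nor collapses separators, so they stay as dashes.
console.log(slugSketch("foo_bar (baz)"));
// => "foo-bar--baz-"
```

The second call shows why trim and forceSingleSpace matter for slugs: without them, every separator character in the input leaves its own dash behind in this sketch.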

Special Thanks

Contributors

Changelog

See CHANGELOG.md

Planned features

  • Custom replacer lists/maps
  • Adding more symbols to normalize (feel free to submit suggestions)

Changelog

v2.3.0

  • Refactored project as TypeScript

v2.2.1

  • Fixed missing dev dependency rollup

v2.2.0

  • Normalize dashes to ASCII hyphen-minus #4
  • Upgraded dependencies
  • Migrated CI to nano-staged and uvu
  • Added engines field (should support Node.js 14, 16 and 17; Node.js 12 was never supported and failed testing, so this should not be a breaking change)

v2.1.0

  • Added forceSingleSpace option to normalizeSymbols, latinize and slugify
  • Added replaceWhiteSpace option to normalizeSymbols and latinize
  • Rewrote types to use merged interfaces for latinize

v2.0.0

  • Added dual-publishing
  • New API:
    • sanitize => latinize:
      • diacritic option removed
      • new trim option (off by default!)
    • new separate function removeDiacritics for simple removal of diacritics (with lowerCase option, off by default)
    • new separate function normalizeSymbols for handling only symbols (with trim option, on by default)
  • Added corepack config and updated publish configs

v1.2.1

  • Added docs for previous release

v1.2.0

  • Added lowerCase option to sanitize
  • Refactored slugify to use new lowerCase option

v1.1.0

  • Added options to sanitize
  • Rewrote tests for sanitize

v1.0.0

  • Initial Release