Package details

hyphen

ytiurin · 3.2m · ISC · 1.10.6

Text hyphenation in JavaScript.

hyphen, hyphenate, hyphenation, hyphenator

README

hyphen

Demo page

This is a text hyphenation library based on Franklin M. Liang's hyphenation algorithm. At the core of the algorithm lies a set of hyphenation patterns extracted from hand-hyphenated dictionaries. The patterns for this library were taken from ctan.org and ported to JavaScript.
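To make the idea of hyphenation patterns concrete, here is a minimal sketch of pattern matching in the style of Liang's algorithm. It is not the library's implementation: the function names `parsePattern` and `hyphenateWord` and the tiny pattern list are assumptions chosen for illustration. Digits in a pattern weight the gaps between letters, the highest weight wins at each gap, and odd weights mark allowed break points.

```javascript
// Parse a TeX-style pattern such as "hy3ph" into its letters ("hyph") and the
// weights of the gaps around them ([0, 0, 3, 0, 0]; 0 where no digit is written).
function parsePattern(pattern) {
  const letters = [];
  const weights = [0];
  for (const ch of pattern) {
    if (ch >= "0" && ch <= "9") {
      weights[weights.length - 1] = Number(ch);
    } else {
      letters.push(ch);
      weights.push(0);
    }
  }
  return { letters: letters.join(""), weights };
}

// Illustrative pattern subset only, not the library's pattern data.
const patterns = [
  "hy3ph", "he2n", "hena4", "hen5at", "1na", "n2at", "1tio", "2io", "o2n",
].map(parsePattern);

function hyphenateWord(word, parsedPatterns, hyphenChar = "\u00AD") {
  // Dots mark the word boundaries so patterns can anchor to them.
  const dotted = "." + word.toLowerCase() + ".";
  // points[j] is the weight of the gap in front of dotted[j].
  const points = new Array(dotted.length + 1).fill(0);
  for (const { letters, weights } of parsedPatterns) {
    let start = dotted.indexOf(letters);
    while (start !== -1) {
      weights.forEach((weight, i) => {
        points[start + i] = Math.max(points[start + i], weight);
      });
      start = dotted.indexOf(letters, start + 1);
    }
  }
  let out = "";
  for (let i = 0; i < word.length; i++) {
    // word[i] sits at dotted[i + 1]; an odd weight allows a break before it.
    if (i > 0 && points[i + 1] % 2 === 1) out += hyphenChar;
    out += word[i];
  }
  return out;
}

const demo = hyphenateWord("hyphenation", patterns, "-");
// demo is "hy-phen-ation"
```

A production implementation would typically look patterns up in a trie and enforce minimum fragment lengths at the start and end of a word; this sketch scans every pattern linearly to keep the logic visible.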

import { hyphenate } from "hyphen/en";

(async () => {
  const text = "A certain king had a beautiful garden";

  const result = await hyphenate(text);
  // result is "A cer\u00ADtain king had a beau\u00ADti\u00ADful garden"
})();
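The `\u00AD` in the result is the Unicode soft hyphen (U+00AD), which stays invisible unless a line actually breaks at that position. When debugging or comparing strings, it can help to make it visible or to strip it again. The helpers below are a small sketch for that purpose; `showSoftHyphens` and `stripSoftHyphens` are hypothetical names and not part of the library.

```javascript
const SOFT_HYPHEN = "\u00AD";

// Replace every soft hyphen with a visible marker, for logging or tests.
function showSoftHyphens(text, marker = "|") {
  return text.split(SOFT_HYPHEN).join(marker);
}

// Remove soft hyphens, e.g. before searching or comparing text.
function stripSoftHyphens(text) {
  return text.split(SOFT_HYPHEN).join("");
}

const hyphenated = "A cer\u00ADtain king had a beau\u00ADti\u00ADful garden";
const visible = showSoftHyphens(hyphenated);
// visible is "A cer|tain king had a beau|ti|ful garden"
const plain = stripSoftHyphens(hyphenated);
// plain is "A certain king had a beautiful garden"
```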

Hyphenate HTML

The processor automatically skips HTML tags, so only the text between them is hyphenated.

import { hyphenate } from "hyphen/en";

(async () => {
  const text = "<blockquote>A certain king had a beautiful garden</blockquote>";

  const result = await hyphenate(text);
  // result is "<blockquote>A cer\u00ADtain king had a beau\u00ADti\u00ADful garden</blockquote>"
})();
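The idea of leaving tags untouched can be sketched independently of the library. The helper below, `mapTextOutsideTags`, is a hypothetical name and a deliberately naive approach: it splits on anything that looks like a tag and transforms only the text in between. It does not handle comments, script content, or a `>` inside an attribute value, and it is not how the library parses text.

```javascript
// Apply `transform` to text segments only, leaving tags as they are.
function mapTextOutsideTags(html, transform) {
  // Splitting on a capturing group keeps the tags in the result:
  // even indexes are text, odd indexes are the captured tags.
  return html
    .split(/(<[^>]*>)/)
    .map((part, index) => (index % 2 === 1 ? part : transform(part)))
    .join("");
}

const shouted = mapTextOutsideTags(
  '<a href="garden">garden</a>',
  (text) => text.toUpperCase()
);
// shouted is '<a href="garden">GARDEN</a>'
```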

Multilingual hyphenation

To hyphenate text in any other supported language, just change the import source. For example, for German, import the hyphenation function from "hyphen/de".

import { hyphenate } from "hyphen/de";

(async () => {
  const text = "Ein gewisser König hatte einen wunderschönen Garten";

  const result = await hyphenate(text);
  // result is "Ein ge\u00ADwis\u00ADser Kö\u00ADnig hat\u00ADte einen wun\u00ADder\u00ADschö\u00ADnen Gar\u00ADten"
})();

It is possible to use several languages on the same page.

import { hyphenate as hyphenateEn } from "hyphen/en";
import { hyphenate as hyphenateDe } from "hyphen/de";

(async () => {
  const english = "A certain king had a beautiful garden";

  const englishResult = await hyphenateEn(english);
  // result is "A cer\u00ADtain king had a beau\u00ADti\u00ADful garden"

  const german = "Ein gewisser König hatte einen wunderschönen Garten";

  const germanResult = await hyphenateDe(german);
  // result is "Ein ge\u00ADwis\u00ADser Kö\u00ADnig hat\u00ADte einen wun\u00ADder\u00ADschö\u00ADnen Gar\u00ADten"
})();
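When several languages are in use, it can be convenient to pick the function from a language tag, such as the value of an element's `lang` attribute. The sketch below assumes a registry built by hand; the stub functions stand in for hyphenators imported from the language packs, and `pickHyphenator` is a hypothetical helper, not a library export.

```javascript
// Stubs standing in for functions imported from "hyphen/en" and "hyphen/de".
const registry = new Map([
  ["en", (text) => "[en] " + text],
  ["de", (text) => "[de] " + text],
]);

// Look up the exact tag first, then its base language, then the fallback.
function pickHyphenator(lang, fallback = "en") {
  const tag = String(lang || "").toLowerCase();
  const base = tag.split("-")[0];
  return registry.get(tag) || registry.get(base) || registry.get(fallback);
}

const austrian = pickHyphenator("de-AT")("Garten");
// austrian is "[de] Garten"
```

A `Map` is used instead of a plain object so that tags such as "constructor" cannot accidentally match an inherited property.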

Sync version

The hyphenate function returns a Promise. The sync version, hyphenateSync, returns a string directly.

import { hyphenateSync as hyphenate } from "hyphen/en";

const text = "A certain king had a beautiful garden";

const result = hyphenate(text);
// result is "A cer\u00ADtain king had a beau\u00ADti\u00ADful garden"

Install

npm install hyphen

Install type definitions for TypeScript usage.

npm install --save-dev @types/hyphen

Type definitions are created and maintained by Krisztián Balla.

Options

  • exceptions

    An Array of words with predefined hyphenation that overrides the patterns. Use the hard hyphen character - to mark each position where the configured hyphenation character should be inserted. Default value is [].

  • hyphenChar

    A String that sets the character inserted at hyphenation points. Default value is \u00AD (soft hyphen).

  • minWordLength

    A Number that sets the minimum length a word must have to be hyphenated. Default value is 5.
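As a rough illustration of the exceptions format, the sketch below turns a list such as ["present", "ta-ble"] into a lookup table keyed by the plain word. `buildExceptionMap` is a hypothetical helper, and treating an entry without a hard hyphen as "never break this word" is an assumption about the option's intent, not documented behavior.

```javascript
// Map each plain word to its hyphenated form, replacing the hard hyphen "-"
// with the configured hyphenation character.
function buildExceptionMap(exceptions, hyphenChar = "\u00AD") {
  const map = new Map();
  for (const entry of exceptions) {
    const parts = entry.split("-");
    map.set(parts.join("").toLowerCase(), parts.join(hyphenChar));
  }
  return map;
}

const exceptionMap = buildExceptionMap(["present", "ta-ble"], "|");
// exceptionMap.get("table") is "ta|ble"
// exceptionMap.get("present") is "present" (no break points, by assumption)
```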

Example of using options

import { hyphenate } from "hyphen/en";

(async () => {
  const text = "A certain king had a beautiful garden";

  const result = await hyphenate(text, {
    hyphenChar: "-"
  });
  // result is "A cer-tain king had a beau-ti-ful garden"
})();
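The effect of minWordLength can be pictured with a small word filter. This is a sketch under assumptions: `mapLongWords` is a hypothetical helper, words are taken to be runs of Unicode letters, and a word qualifies when its length is at least the minimum. The library's own word detection may differ.

```javascript
// Transform only the words that reach the minimum length.
function mapLongWords(text, minWordLength, transform) {
  return text.replace(/\p{L}+/gu, (word) =>
    word.length >= minWordLength ? transform(word) : word
  );
}

const marked = mapLongWords(
  "A certain king had a beautiful garden",
  5,
  (word) => word.toUpperCase()
);
// marked is "A CERTAIN king had a BEAUTIFUL GARDEN"
```

With the default of 5, short words such as "king" are left alone, which matches the hyphenated results shown in the examples above.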

List of available languages

  • Afrikaans language: import { hyphenate } from "hyphen/af";
  • Assamese language: import { hyphenate } from "hyphen/as";
  • Belarusian language: import { hyphenate } from "hyphen/be";
  • Bulgarian language: import { hyphenate } from "hyphen/bg";
  • Bengali language: import { hyphenate } from "hyphen/bn";
  • Catalan language: import { hyphenate } from "hyphen/ca";
  • Coptic language: import { hyphenate } from "hyphen/cop";
  • Czech language: import { hyphenate } from "hyphen/cs";
  • Welsh language: import { hyphenate } from "hyphen/cy";
  • Church Slavonic language: import { hyphenate } from "hyphen/cu";
  • Danish language: import { hyphenate } from "hyphen/da";
  • German, traditional spelling: import { hyphenate } from "hyphen/de-1901";
  • German, reformed spelling: import { hyphenate } from "hyphen/de-1996";
  • German, traditional Swiss spelling: import { hyphenate } from "hyphen/de-CH-1901";
  • Modern Greek, monotonic spelling: import { hyphenate } from "hyphen/el-monoton";
  • Modern Greek, polytonic spelling: import { hyphenate } from "hyphen/el-polyton";
  • English, British spelling language: import { hyphenate } from "hyphen/en-gb";
  • English, American spelling language: import { hyphenate } from "hyphen/en-us";
  • Spanish language: import { hyphenate } from "hyphen/es";
  • Estonian language: import { hyphenate } from "hyphen/et";
  • Basque language: import { hyphenate } from "hyphen/eu";
  • Finnish language: import { hyphenate } from "hyphen/fi";
  • French language: import { hyphenate } from "hyphen/fr";
  • Friulan language: import { hyphenate } from "hyphen/fur";
  • Irish language: import { hyphenate } from "hyphen/ga";
  • Galician language: import { hyphenate } from "hyphen/gl";
  • Ancient Greek language: import { hyphenate } from "hyphen/grc";
  • Gujarati language: import { hyphenate } from "hyphen/gu";
  • Hindi language: import { hyphenate } from "hyphen/hi";
  • Croatian language: import { hyphenate } from "hyphen/hr";
  • Upper Sorbian language: import { hyphenate } from "hyphen/hsb";
  • Hungarian language: import { hyphenate } from "hyphen/hu";
  • Armenian language: import { hyphenate } from "hyphen/hy";
  • Interlingua language: import { hyphenate } from "hyphen/ia";
  • Bahasa Indonesia, Indonesian language: import { hyphenate } from "hyphen/id";
  • Icelandic language: import { hyphenate } from "hyphen/is";
  • Italian language: import { hyphenate } from "hyphen/it";
  • Georgian language: import { hyphenate } from "hyphen/ka";
  • Kurmanji, Northern Kurdish language: import { hyphenate } from "hyphen/kmr";
  • Kannada language: import { hyphenate } from "hyphen/kn";
  • Classical Latin language: import { hyphenate } from "hyphen/la-x-classic";
  • Liturgical Latin language: import { hyphenate } from "hyphen/la-x-liturgic";
  • Latin language: import { hyphenate } from "hyphen/la";
  • Lithuanian language: import { hyphenate } from "hyphen/lt";
  • Latvian language: import { hyphenate } from "hyphen/lv";
  • Malayalam language: import { hyphenate } from "hyphen/ml";
  • Mongolian, Cyrillic script, alternative patterns: import { hyphenate } from "hyphen/mn-cyrl-x-lmc";
  • Mongolian, Cyrillic script: import { hyphenate } from "hyphen/mn-cyrl";
  • Marathi language: import { hyphenate } from "hyphen/mr";
  • Multiple languages using the Ethiopic scripts: import { hyphenate } from "hyphen/mul-ethi";
  • Norwegian Bokmål, bokmål, norsk bokmål language: import { hyphenate } from "hyphen/nb";
  • Dutch language: import { hyphenate } from "hyphen/nl";
  • Norwegian Nynorsk, nynorsk language: import { hyphenate } from "hyphen/nn";
  • Norwegian, norsk language: import { hyphenate } from "hyphen/no";
  • Occitan language: import { hyphenate } from "hyphen/oc";
  • Odia, Oriya language: import { hyphenate } from "hyphen/or";
  • Panjabi, Punjabi language: import { hyphenate } from "hyphen/pa";
  • Pāli language: import { hyphenate } from "hyphen/pi";
  • Polish language: import { hyphenate } from "hyphen/pl";
  • Piedmontese language: import { hyphenate } from "hyphen/pms";
  • Portuguese language: import { hyphenate } from "hyphen/pt";
  • Romansh language: import { hyphenate } from "hyphen/rm";
  • Romanian language: import { hyphenate } from "hyphen/ro";
  • Russian language: import { hyphenate } from "hyphen/ru";
  • Sanskrit language: import { hyphenate } from "hyphen/sa";
  • Serbocroatian, Cyrillic script: import { hyphenate } from "hyphen/sh-cyrl";
  • Serbocroatian, Latin script: import { hyphenate } from "hyphen/sh-latn";
  • Slovak language: import { hyphenate } from "hyphen/sk";
  • Slovenian language: import { hyphenate } from "hyphen/sl";
  • Serbian, Cyrillic script: import { hyphenate } from "hyphen/sr-cyrl";
  • Swedish language: import { hyphenate } from "hyphen/sv";
  • Tamil language: import { hyphenate } from "hyphen/ta";
  • Telugu language: import { hyphenate } from "hyphen/te";
  • Thai language: import { hyphenate } from "hyphen/th";
  • Turkmen language: import { hyphenate } from "hyphen/tk";
  • Turkish language: import { hyphenate } from "hyphen/tr";
  • Ukrainian language: import { hyphenate } from "hyphen/uk";
  • Mandarin Chinese, pinyin transliteration: import { hyphenate } from "hyphen/zh-latn-pinyin";

Aliases for specific languages

  • Alias for hyphen/de-1996: import { hyphenate } from "hyphen/de";
  • Alias for hyphen/el-monoton: import { hyphenate } from "hyphen/el";
  • Alias for hyphen/en-us: import { hyphenate } from "hyphen/en";
  • Alias for hyphen/mul-ethi: import { hyphenate } from "hyphen/ethi";
  • Alias for hyphen/mn-cyrl: import { hyphenate } from "hyphen/mn";
  • Alias for hyphen/sh-cyrl: import { hyphenate } from "hyphen/sh";
  • Alias for hyphen/sr-cyrl: import { hyphenate } from "hyphen/sr";
  • Alias for hyphen/zh-latn-pinyin: import { hyphenate } from "hyphen/zh";
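The alias table can be mirrored in application code, for instance to log which pattern set a short code refers to. The sketch below copies the aliases listed above into a `Map`; `resolvePatternName` and `patternImportPath` are hypothetical helpers, and the "hyphen/patterns/" prefix is taken from the factory example in this README on the assumption that every language follows the same layout.

```javascript
// Aliases as listed in this README: short code -> full pattern name.
const ALIASES = new Map([
  ["de", "de-1996"],
  ["el", "el-monoton"],
  ["en", "en-us"],
  ["ethi", "mul-ethi"],
  ["mn", "mn-cyrl"],
  ["sh", "sh-cyrl"],
  ["sr", "sr-cyrl"],
  ["zh", "zh-latn-pinyin"],
]);

// Return the full pattern name for a code, or the code itself if it is no alias.
function resolvePatternName(code) {
  return ALIASES.get(code) || code;
}

// Build the module path of the pattern file (layout assumed, see above).
function patternImportPath(code) {
  return "hyphen/patterns/" + resolvePatternName(code);
}

const germanPath = patternImportPath("de");
// germanPath is "hyphen/patterns/de-1996"
```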

Factory function

The factory function can be used to create a hyphenate function with different default options.

Create a hyphenation function with a predefined exception list

import createHyphenator from "hyphen";
import patterns from "hyphen/patterns/en-us";

const hyphenate = createHyphenator(patterns, {
  // return the result as a Promise
  async: true,
  // words with predefined hyphenation
  exceptions: ["present", "ta-ble"]
});

Predefined functions

The predefined hyphenate functions are created as follows.

import createHyphenator from "hyphen";
import patterns from "hyphen/patterns/en-us";

const hyphenate = createHyphenator(patterns, {
  async: true
});

const hyphenateSync = createHyphenator(patterns);

Predefined hyphenate functions are set in every language pack.
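The factory pattern itself, defaults fixed at creation time and optionally overridden per call, can be sketched without the library. In the example below, `createWordMapper` and `splitAfterTwo` are hypothetical names, the default values are copied from the Options section, and the word transform is a placeholder rather than real hyphenation.

```javascript
// Create a text-processing function whose defaults are fixed up front.
function createWordMapper(transformWord, defaults = {}) {
  const base = {
    async: false,
    hyphenChar: "\u00AD",
    minWordLength: 5,
    ...defaults,
  };
  return function mapText(text, overrides = {}) {
    // Per-call options win over the defaults given to the factory.
    const options = { ...base, ...overrides };
    const result = text.replace(/\p{L}+/gu, (word) =>
      word.length >= options.minWordLength ? transformWord(word, options) : word
    );
    // The async flag only changes how the result is delivered.
    return base.async ? Promise.resolve(result) : result;
  };
}

// Placeholder transform: insert the hyphen character after two letters.
const splitAfterTwo = (word, { hyphenChar }) =>
  word.slice(0, 2) + hyphenChar + word.slice(2);

const dashed = createWordMapper(splitAfterTwo, { hyphenChar: "-" });
const dashedAsync = createWordMapper(splitAfterTwo, {
  hyphenChar: "-",
  async: true,
});

const sample = dashed("a garden");
// sample is "a ga-rden"
```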

jsDelivr CDN for older websites

It is possible to use hyphen on older websites via the jsDelivr CDN. Check the package page on their website.

<script src="https://cdn.jsdelivr.net/npm/hyphen@1.10.6/patterns/en-us.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/hyphen@1.10.6/hyphen.min.js"></script>

After the scripts are added to your page, use createHyphenator to create a hyphenate function.

var hyphenate = createHyphenator(hyphenationPatternsEnUs, {
  async: true
});
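Once a hyphenate function exists, applying it to page content is a short loop. The sketch below assumes a synchronous function, such as one created by calling createHyphenator without async: true, and only touches `textContent`, so it suits elements that contain plain text. `hyphenateElements` is a hypothetical helper, and the stand-in objects at the end replace real DOM nodes so the example runs outside a browser.

```javascript
// Hyphenate the text of each element in place.
// `elements` can be a NodeList, e.g. document.querySelectorAll("p").
function hyphenateElements(elements, hyphenateSync) {
  for (const element of Array.from(elements)) {
    element.textContent = hyphenateSync(element.textContent);
  }
}

// Stand-ins for DOM elements and for a real hyphenation function.
const fakeParagraphs = [{ textContent: "garden" }, { textContent: "king" }];
const fakeHyphenate = (text) => text.replace("gar", "gar\u00AD");

hyphenateElements(fakeParagraphs, fakeHyphenate);
// fakeParagraphs[0].textContent is "gar\u00ADden"
```

With an asynchronous hyphenate function, the assignment would move into a `then` callback or follow an `await`.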

Alternatives

Check other great hyphenation libraries:

  • Hyphenopoly: client-side hyphenation of HTML documents.
  • Hypher: a fast and small hyphenation engine.

Text hyphenation in CSS

The CSS hyphens property adds hyphenation support to modern browsers without JavaScript:

p {
  hyphens: auto;
}

It is part of the CSS Text Level 3 specification. The browser compatibility list can be found on the related MDN page.

DEPRECATED

  • The debug option will be deprecated in future versions.

Migration

from 1.9.1 to 1.10.0

Option html default value changed from false to true

In cases when the text parser should not skip HTML tags, apply the following code changes.

Default exported hyphenate function

// Code before 1.10.0
hyphenate(text);

// Code after 1.10.0
hyphenate(text, { html: false });

Create a hyphenate function with pre-1.10.0 behavior using the factory function:

// Code after 1.10.0
const hyphenate = createHyphenator(patterns, { async: true, html: false });
hyphenate(text);

Contributors ✨

Thanks go to these wonderful people (emoji key):

  • Eugene Tiurin 🤔 💻 🚧
  • Krisztián Balla 🐛 🧑‍🏫 📣
  • Robin Millette 💻 🐛
  • Asko Soukka 💻 🐛
  • Nicolas Sierra 💻 🐛
  • Jaume Ortolà 💻 🐛
  • Simon Osterlehner 💻
  • Jason Wohlgemuth 📖
  • Kamil Mielnik 💻 🐛
  • Oskar Köök 💻 🐛

This project follows the all-contributors specification. Contributions of any kind welcome!

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

Unreleased

[1.10.6]

Changed

  • Optimized patterns dictionary size

[1.10.5]

Fixed

  • Fixed the bad hyphenation issue that started in version 1.7.0

[1.10.4]

Fixed

  • Fixed problem with en-us.cjs.js in package build

[1.10.3]

Changed

  • Optimized text parsing when html option is false

[1.10.2]

Changed

  • Destructured text reader into composable verifiers

  • Optimized patterns storage to reduce patterns file size

[1.10.1]

Changed

  • Optimized patterns storage to reduce patterns file size

Fixed

  • Fixed hyphenation exceptions for Norwegian-bokmål, Norwegian-nynorsk language patterns; minor patterns fixes

[1.10.0]

Changed

  • BREAKING CHANGE: Option html default value changed to true

[1.9.1] - 2023-11-17

Changed

  • Simplified patterns export container

Fixed

  • Fixed broken patterns for Norwegian-bokmål, Norwegian-nynorsk language patterns

[1.9.0] - 2023-11-13

Changed

  • Moved patterns trie generation step to build phase

[1.8.0] - 2023-11-08

Added

  • Added exceptions option

Changed

  • Exclude debug code from production build

Fixed

  • Fixed broken patterns exceptions

[1.7.2] - 2023-11-07

Changed

  • Reworked text reader in favour of if statements

[1.7.1] - 2023-10-29

Fixed

  • Fixed bad hyphenation of words with apostrophe symbol

[1.7.0] - 2023-10-11

Changed

  • Prebuild a dictionary trie to improve performance of a hyphenation algorithm

[1.6.6] - 2023-06-04

Fixed

  • Minor performance fix

[1.6.5] - 2023-03-11

Added

  • Add async web worker support

[1.6.4] - 2021-04-04

Fixed

  • Handle the edge case of a word “constructor” being interpreted as a JavaScript keyword. - by @arseni-mourzenko

[1.6.3] - 2021-04-04

Fixed

  • Change undesired behavior of option minWordLength: value now can be set to less than 5 - by @kamilmielnik

[1.6.2] - 2020-06-03

Fixed

  • Prevent hyphenation of HTML attributes in hyphenateHTML

[1.6.0] - 2020-05-08

Added

  • Add option minWordLength

[1.5.0] - 2020-04-03

Added

  • Add ability to configure initialized hyphenator function
  • Add exports for nodejs and webpack environments

Changed

  • Skip HTML syntax only when html option is set to true
  • License changed to ISC

Fixed

  • Fix bad script behavior in async mode

[1.3.1] - 2020-03-28

Changed

  • Split source code into several files that are bundled afterwards

Fixed

  • Replaced template literals for ES3 compatibility

[1.3.0] - 2020-03-25

Added

  • Add HTML sections exclusion from hyphenation
  • Add async mode
  • Protect hyphenated text from repeated hyphenation

[1.2.1] - 2020-03-18

Changed

  • Reduced NPM package size

[1.2.0] - 2020-03-18

Fixed

  • Fixed badly processed repeated patterns in a word by @jaumeortola
  • Fixed a case when a special character is considered a letter

[1.1.1] - 2019-03-20

Changed

  • Replaced a unicode literal char with a corresponding code by @oskarkook

Fixed

[1.1.0] - 2019-02-10

Changed

  • Rebuilt patterns with tex2js translator

Fixed

1.0.2 - 2016-08-28

Fixed

1.0.1 - 2016-08-18

Added

  • Output stats in debug mode

Fixed

1.0.0 - 2016-08-07

Added

  • First working version with 75 language patterns