A module for Node.js and the browser that takes in text and returns it stripped of stopwords. It has predefined stopword lists for 62 languages and also accepts custom stopword lists as input.
The Retrieval-Augmented Generation (RAG) module contains document processing and embedding utilities.
JavaScript SDK for Sensible, the developer-first platform for extracting structured data from documents, so that you can build document-automation features into your SaaS products
n8n node to extract text, images and tables from PDF files, with multilingual support, language detection and a comprehensive test suite
n8n nodes for processing PDF and Excel files
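
To illustrate the stopword removal described in the first entry, here is a minimal sketch of the general technique. It is not the API of any package listed above: the `removeStopwords` function and the `SAMPLE_STOPWORDS` list are hypothetical names introduced only for this example, and the tiny English word list stands in for a real per-language list.

```typescript
// Hypothetical sample list; a real module would ship far larger per-language lists.
const SAMPLE_STOPWORDS: ReadonlySet<string> = new Set([
  "a", "an", "and", "is", "of", "the", "to",
]);

// Hypothetical helper: splits on whitespace and drops any token found in the
// stopword set (compared case-insensitively). Punctuation is not stripped, so
// a token such as "the," would not match "the" in this simple sketch.
function removeStopwords(
  text: string,
  stopwords: ReadonlySet<string> = SAMPLE_STOPWORDS,
): string {
  return text
    .split(/\s+/)
    .filter((token) => token.length > 0 && !stopwords.has(token.toLowerCase()))
    .join(" ");
}

// Using the built-in sample list.
console.log(removeStopwords("The quick fox is a friend of the lazy dog"));
// prints "quick fox friend lazy dog"

// Using a custom stopword list instead.
console.log(removeStopwords("hello big world", new Set(["big"])));
// prints "hello world"
```

Passing the stopword set as a parameter keeps the function independent of any one language, which mirrors the idea of supporting both predefined and custom lists.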