Package details

google-img-scrap

yoannchb-pro · 7.4k · MIT · 1.1.4

Scrape images from Google Images with custom, pre-filled Google dork options

google, image, scrap, options

readme

Google-img-scrap

Scrape images from Google Images with custom, pre-filled dorking options

Update

Found a bug?

  • Report it in my GitHub issues, don't be afraid :)

Installation

npm i google-img-scrap

Import

const {
  GOOGLE_IMG_SCRAP,
  GOOGLE_IMG_INVERSE_ENGINE_URL,
  GOOGLE_IMG_INVERSE_ENGINE_UPLOAD,
  GOOGLE_QUERY
} = require('google-img-scrap');
// OR
import {
  GOOGLE_IMG_SCRAP,
  GOOGLE_IMG_INVERSE_ENGINE_URL,
  GOOGLE_IMG_INVERSE_ENGINE_UPLOAD,
  GOOGLE_QUERY
} from 'google-img-scrap';

Options definition

  • "search" string what you want to search for
  • "proxy" AxiosProxyConfig configure a proxy using the axios proxy config
  • "excludeWords" string[] exclude some words from the search
  • "domains" string[] restrict the search to these domains
  • "excludeDomains" string[] exclude some domains
  • "safeSearch" boolean enable or disable safe search (to filter NSFW results, for example)
  • "custom" string add extra query parameters
  • "urlMatch" string[][] keep an image when its URL matches a string (example: "cdn") | example below
  • "filterByTitles" string[][] filter images by title | example below
  • "query" GoogleQuery set a query (can be [TYPE, DATE, COLOR, SIZE, LICENCE, EXTENSION]) (use GOOGLE_QUERY items) | example below
  • "limit" number limit the number of results

Result

{
  url: 'https://images.google.com/search?tbm=isch&tbs=&q=cats',
  search: "cats",
  result: [
    {
      id: 'K6Qd9XWnQFQCoM',
      title: 'Domestic cat',
      url: 'https://i.natgeofe.com/n/548467d8-c5f1-4551-9f58-6817a8d2c45e/NationalGeographic_2572187_2x1.jpg',
      originalUrl: 'https://www.nationalgeographic.com/animals/mammals/facts/domestic-cat',
      height: 1536,
      width: 3072
    },
    {
      id: 'HkevFQZ5DYu7oM',
      title: 'Cat - Wikipedia',
      url: 'https://upload.wikimedia.org/wikipedia/commons/1/15/Cat_August_2010-4.jpg',
      originalUrl: 'https://en.wikipedia.org/wiki/Cat',
      height: 2226,
      width: 3640
    },
    ...
  ]
}
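
Every entry in result carries url, width and height, so a response can be post-processed with plain array methods. The sketch below hard-codes the sample response shown above instead of calling the package; pickLargeImages is a hypothetical helper name, not part of google-img-scrap.

```javascript
// Sample data copied from the result shown above.
const sampleResponse = {
  url: 'https://images.google.com/search?tbm=isch&tbs=&q=cats',
  search: 'cats',
  result: [
    {
      id: 'K6Qd9XWnQFQCoM',
      title: 'Domestic cat',
      url: 'https://i.natgeofe.com/n/548467d8-c5f1-4551-9f58-6817a8d2c45e/NationalGeographic_2572187_2x1.jpg',
      originalUrl: 'https://www.nationalgeographic.com/animals/mammals/facts/domestic-cat',
      height: 1536,
      width: 3072
    },
    {
      id: 'HkevFQZ5DYu7oM',
      title: 'Cat - Wikipedia',
      url: 'https://upload.wikimedia.org/wikipedia/commons/1/15/Cat_August_2010-4.jpg',
      originalUrl: 'https://en.wikipedia.org/wiki/Cat',
      height: 2226,
      width: 3640
    }
  ]
};

// Hypothetical helper: keep only images that are at least minWidth x minHeight.
function pickLargeImages(response, minWidth, minHeight) {
  return response.result.filter(
    (image) => image.width >= minWidth && image.height >= minHeight
  );
}

const large = pickLargeImages(sampleResponse, 3000, 2000);
console.log(large.map((image) => image.title)); // [ 'Cat - Wikipedia' ]
```

The same function should work on a real response from GOOGLE_IMG_SCRAP, as long as it has the shape shown above.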

How to use?

Simple example

Search for cat images

const test = await GOOGLE_IMG_SCRAP({
  search: 'cats'
});

console.log(test);

Reverse search engine

The second parameter accepts the same options as GOOGLE_IMG_SCRAP, except search (Omit<Config, "search">).

With a URL (cost: 2 requests)

const test = await GOOGLE_IMG_INVERSE_ENGINE_URL(
  'https://upload.wikimedia.org/wikipedia/commons/1/15/Cat_August_2010-4.jpg',
  { limit: 5 }
);

console.log(test);

With a local image (cost: 3 requests)

const fs = require('fs');

const imageBuffer = fs.readFileSync('demonSlayer.png');
const test = await GOOGLE_IMG_INVERSE_ENGINE_UPLOAD(imageBuffer, {
  limit: 5
});

console.log(test);

Custom query

All query options are optional and their keys must be in uppercase. You can combine as many as you want. All possible query options are listed in the "Google query" section below.

const test = await GOOGLE_IMG_SCRAP({
  search: 'cats',
  query: {
    TYPE: GOOGLE_QUERY.TYPE.CLIPART,
    LICENCE: GOOGLE_QUERY.LICENCE.COMMERCIAL_AND_OTHER,
    EXTENSION: GOOGLE_QUERY.EXTENSION.JPG
  }
});

console.log(test);

Limit result size

const test = await GOOGLE_IMG_SCRAP({
  search: 'cats',
  limit: 5
});

console.log(test);

Proxy

See the axios documentation to set up the proxy

const test = await GOOGLE_IMG_SCRAP({
  search: 'cats',
  proxy: {
    protocol: 'https',
    host: 'example.com',
    port: 8080
  }
});

console.log(test);
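
An axios proxy config can also carry credentials through an auth field. Here is a minimal sketch of such a config, assuming the package forwards the proxy object to axios unchanged (the option is typed as AxiosProxyConfig, but the forwarding itself is an assumption). The host, port and credentials are placeholders.

```javascript
// Placeholder values: replace host, port and credentials with your own.
const proxyWithAuth = {
  protocol: 'http',
  host: '127.0.0.1',
  port: 8080,
  // "auth" is the axios field for proxy credentials (sent as HTTP Basic auth).
  auth: {
    username: 'proxy-user',
    password: 'proxy-password'
  }
};

// Intended usage, assuming google-img-scrap is installed:
// const test = await GOOGLE_IMG_SCRAP({ search: 'cats', proxy: proxyWithAuth });
console.log(proxyWithAuth);
```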

Domains

Only scrape from specific domains

const test = await GOOGLE_IMG_SCRAP({
  search: 'cats',
  domains: ['alamy.com', 'istockphoto.com', 'vecteezy.com']
});

console.log(test);

Exclude domains

const test = await GOOGLE_IMG_SCRAP({
  search: 'cats',
  excludeDomains: ['istockphoto.com', 'alamy.com']
});

console.log(test);

Exclude words

If you don't like black cats and white cats

const test = await GOOGLE_IMG_SCRAP({
  search: 'cats',
  excludeWords: ['black', 'white'] //If you don't like black cats and white cats
});

console.log(test);
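
The domains, excludeDomains and excludeWords options line up with well-known Google search operators (site:, -site: and -word). The sketch below shows how such options could be folded into one query string. buildDorkQuery is a hypothetical helper, and the exact string the package sends to Google is an assumption that may well differ.

```javascript
// Hypothetical helper: fold the options into a single Google dork string.
// This is an illustration, not the package's actual implementation.
function buildDorkQuery({ search, domains = [], excludeDomains = [], excludeWords = [] }) {
  const parts = [search];

  // Restrict to any of the given domains: (site:a.com OR site:b.com)
  if (domains.length > 0) {
    parts.push('(' + domains.map((domain) => `site:${domain}`).join(' OR ') + ')');
  }

  // Exclude whole domains: -site:a.com
  excludeDomains.forEach((domain) => parts.push(`-site:${domain}`));

  // Exclude single words: -black
  excludeWords.forEach((word) => parts.push(`-${word}`));

  return parts.join(' ');
}

const dork = buildDorkQuery({
  search: 'cats',
  excludeDomains: ['alamy.com'],
  excludeWords: ['black', 'white']
});
console.log(dork); // cats -site:alamy.com -black -white
```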

Safe search (no nsfw)

const test = await GOOGLE_IMG_SCRAP({
  search: 'cats',
  safeSearch: false // false disables safe search; set it to true to filter out NSFW results
});

console.log(test);

Custom query params

const test = await GOOGLE_IMG_SCRAP({
  search: 'cats',
  custom: 'name=content&name2=content2'
});

console.log(test);

How do urlMatch and filterByTitles work?

const test = await GOOGLE_IMG_SCRAP({
  search: 'cats',
  //will build something like this "(draw and white) or (albino and white)"
  filterByTitles: [
    ['draw', 'white'],
    ['albino', 'white']
  ],
  //will build something like this "(cdn and wikipedia) or (cdn and istockphoto)"
  urlMatch: [
    ['cdn', 'wikipedia'],
    ['cdn', 'istockphoto']
  ]
});

console.log(test);
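
The "(a and b) or (c and d)" logic described in the comments above can be sketched as a small predicate: an item passes when every term of at least one inner array is found. matchesAnyGroup is a hypothetical name, and the case-insensitive substring matching is an assumption, the package may compare differently.

```javascript
// Hypothetical helper: true when all terms of at least one group occur in text.
// Matching is done on lowercase substrings (an assumption for this sketch).
function matchesAnyGroup(text, groups) {
  const haystack = text.toLowerCase();
  return groups.some((group) =>
    group.every((term) => haystack.includes(term.toLowerCase()))
  );
}

const titleGroups = [
  ['draw', 'white'],
  ['albino', 'white']
];

console.log(matchesAnyGroup('Albino white cat', titleGroups)); // true
console.log(matchesAnyGroup('White cat', titleGroups)); // false
```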

Google query

{
  SIZE: {
    LARGE,
    MEDIUM,
    ICON
  },
  COLOR: {
    BLACK_AND_WHITE,
    TRANSPARENT,
    RED,
    BLUE,
    PURPLE,
    ORANGE,
    YELLOW,
    GREEN,
    TEAL,
    PINK,
    WHITE,
    GRAY,
    BLACK,
    BROWN
  },
  TYPE: {
    CLIPART,
    DRAW,
    GIF
  },
  EXTENSION: {
    JPG,
    GIF,
    BMP,
    PNG,
    SVG,
    WEBP,
    ICO,
    RAW
  },
  DATE: {
    DAY,
    WEEK,
    MONTH,
    YEAR
  },
  LICENCE: {
    CREATIVE_COMMONS,
    COMMERCIAL_AND_OTHER
  }
}
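
Since query keys must be uppercase, a quick check before sending the request can catch typos such as "licence" instead of "LICENCE". A minimal sketch: findInvalidQueryKeys is a hypothetical helper, the category names are copied from the list above, and the string values in the example are placeholders standing in for GOOGLE_QUERY items.

```javascript
// Category names copied from the list above.
const QUERY_CATEGORIES = ['SIZE', 'COLOR', 'TYPE', 'EXTENSION', 'DATE', 'LICENCE'];

// Hypothetical helper: return the keys of a query object that are not known categories.
function findInvalidQueryKeys(query) {
  return Object.keys(query).filter((key) => !QUERY_CATEGORIES.includes(key));
}

// Placeholder values; in real code these would be GOOGLE_QUERY items.
const invalidKeys = findInvalidQueryKeys({
  TYPE: 'placeholder',
  licence: 'placeholder'
});
console.log(invalidKeys); // [ 'licence' ]
```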

changelog

Changelog

1.1.4

  • Fixed user agent to avoid bad image quality, errors and captcha (gohoski)

1.1.3

  • Some fixes

1.1.2

  • Fixed empty result
  • Removed average color

1.1.1

  • Fixed empty result

1.1.0

  • Added Google Images reverse search engine. You can now search images with a local image or with an image URL.

1.0.9

  • Fixed many bugs
  • filterByTitles is now working
  • urlMatch added in types
  • All the code has been rewritten in TypeScript with a new structure
  • Removed execute
  • Added proxy configuration
  • Rewrote all tests with Jest

1.0.8

  • Fixed "ERROR: Cannot assign to "queryName" because it is a constant" (by GaspardCulis)
  • Removed gstatic url
  • Added average color, id, title and originalUrl

1.0.7

  • Readme update

1.0.6

  • Fixed types
  • Added limit to limit the size of the results

1.0.5

  • Added types (by christophe77)

1.0.4

  • New option urlMatch. You now get an image when its URL matches a string (example: "cdn")
  • New option filterByTitles. Filter images by titles

1.0.3

  • New option execute. Allows you to execute a function to remove "gstatic.com" domains, for example

1.0.2

  • Cannot set 'domains' and 'excludeDomains' at the same time
  • Fixed some bugs
  • New option excludeWords

1.0.1

  • Added the missing dependency