pdf-gold-digger

PDF information extraction library based on pdf.js and Node.js, with various output formats.

Install

npm install -g pdf-gold-digger

Usage

pdfdig -i some_file.pdf

Available commands

pdfdig -h
ex. pdfdig -i input-file -o output_directory -f json

  --input  or  -i   pdf file location (required)
  --output or  -o   output directory (optional - default "out")
  --debug  or  -d   show debug information (optional - default "false")
  --format or  -f   output format: text, json, xml or html (optional - default "text")
  --font   or  -t   extract fonts as ttf files (optional)
  --help   or  -h   display this help message
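
The options above can be combined to process several files in one go. The following is a minimal sketch, not part of pdf-gold-digger itself: the function names pdfdig_cmd and convert_all, the DRY_RUN variable and the one-subdirectory-per-file layout are assumptions made for illustration. Only the -i, -o and -f flags from the help text are used.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with pdf-gold-digger).

# Print the pdfdig command line for one input file (used for dry runs).
# $1 = input PDF, $2 = output directory, $3 = format (text, json, xml, html)
pdfdig_cmd() {
  printf 'pdfdig -i %s -o %s -f %s\n' "$1" "$2" "$3"
}

# Run pdfdig on every *.pdf in a directory.
# $1 = source directory, $2 = output root, $3 = format (defaults to json here)
# Assumption: each PDF gets its own subdirectory under the output root.
convert_all() {
  src_dir=$1
  out_root=$2
  format=${3:-json}
  for pdf in "$src_dir"/*.pdf; do
    [ -e "$pdf" ] || continue            # skip when the glob matches nothing
    name=$(basename "$pdf" .pdf)
    if [ "${DRY_RUN:-0}" = 1 ]; then
      # Only show what would be executed.
      pdfdig_cmd "$pdf" "$out_root/$name" "$format"
    else
      mkdir -p "$out_root/$name"
      pdfdig -i "$pdf" -o "$out_root/$name" -f "$format"
    fi
  done
}
```

For example, after sourcing the file, DRY_RUN=1 convert_all ./pdfs out json prints the commands without running them; dropping DRY_RUN=1 runs pdfdig for each file.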

Advanced usage

git clone https://github.com/vane/pdf-gold-digger
sh demo.sh

Then see the results in the out directory.

Documentation

pdf-gold-digger

Features:

TODO: