
This recipe shows how to use selpg, a Linux command-line utility written in C, together with xtopdf, a Python toolkit for PDF creation, to write only a selected range of pages from a text file to a PDF file, for display or printing. To do this, run selpg at the Linux command line with options specifying the start and end pages of the range, and pipe its output to StdinToPDF.py, a program that is part of the xtopdf toolkit.

The steps for this recipe are as follows:

1. Download the selpg utility from its repository on Bitbucket:

https://bitbucket.org/vasudevram/selpg/src

including all these files: makefile, mk, selpg.c and showsyserr.c

2. Build the selpg utility by running the shell script called mk. It calls the make command which uses the C compiler on your Linux system to compile and link the source code.

This will result in a binary called selpg.

3. Install ReportLab v1.21, which xtopdf depends on. Download either the .zip or the .tar.gz file from http://www.reportlab.com/ftp, extract its contents into a new folder, and follow the instructions in the README or INSTALL file.

4. Install xtopdf in some new folder, following the instructions given here:

http://jugad2.blogspot.in/2012/07/guide-to-installing-and-using-xtopdf.html

(The instructions are for Windows, but anyone with basic Linux experience can easily adapt them for Linux, since the task only requires uncompressing the xtopdf zip or tar.gz file and setting an environment variable or two.)

The above steps are a one-time task.

5. After that, you can run this pipeline whenever you wish, with appropriate values for the name of the input text file and the output PDF file, to select a range of pages from any text file and print them to PDF:

$ ./selpg -s2 -e4 input_file.txt | python StdinToPDF.py output_file.pdf

where you have to replace the 2 in the -s2 with the start page number, and the 4 in the -e4 with the end page number, of the range of pages that you wish to print to the PDF file.
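For readers who would rather launch the same pipeline from a script, here is a minimal Python sketch. It only wraps the command line shown above. The function names build_pipeline and run_pipeline are hypothetical, and it assumes that the selpg binary and StdinToPDF.py are both in the current directory and that the interpreter is invoked as python, as in the command above.

```python
import subprocess


def build_pipeline(start_page, end_page, input_file, output_pdf,
                   selpg="./selpg", stdin_to_pdf="StdinToPDF.py"):
    """Return the two argument lists for: selpg ... | python StdinToPDF.py ...

    The default paths are assumptions; adjust them to your installation.
    """
    selpg_cmd = [selpg, "-s%d" % start_page, "-e%d" % end_page, input_file]
    pdf_cmd = ["python", stdin_to_pdf, output_pdf]
    return selpg_cmd, pdf_cmd


def run_pipeline(start_page, end_page, input_file, output_pdf):
    """Run the pipeline and return the exit codes of both processes."""
    selpg_cmd, pdf_cmd = build_pipeline(start_page, end_page,
                                        input_file, output_pdf)
    # Connect selpg's standard output to StdinToPDF.py's standard input.
    producer = subprocess.Popen(selpg_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(pdf_cmd, stdin=producer.stdout)
    # Close our copy of the pipe so only the consumer holds the read end.
    producer.stdout.close()
    consumer.wait()
    producer.wait()
    return producer.returncode, consumer.returncode
```

Calling run_pipeline(2, 4, "input_file.txt", "output_file.pdf") should be equivalent to the shell command above, provided both tools are installed as described in the earlier steps.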

More details are available at this blog post:

http://jugad2.blogspot.in/2014/10/print-selected-text-pages-to-pdf-with.html

This recipe is useful when you have a text file and want to print only a range of its pages to PDF. For example, suppose you printed a large text file earlier, but the printer jammed in the middle of the job, so some pages were not printed. The recipe can then be used to print only the missing pages.

For a file of 10 pages, the range could be 1-6, 3-7 or 4-10, for example. That is, a range can start at the beginning or somewhere in the middle of the original file, and end somewhere in the middle or at the end. The only requirements are that the ending page number be greater than or equal to the starting page number, and that both page numbers be valid, i.e., pages with those page numbers exist. The selpg utility used in the recipe checks for many kinds of invalid input, and rejects them with an error message.
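The two requirements on a page range can be written down as a small check. The following Python sketch is only an illustration of those rules, not selpg's actual validation code. The function name validate_page_range and the total_pages parameter are hypothetical.

```python
def validate_page_range(start_page, end_page, total_pages):
    """Raise ValueError if the range breaks the rules described above.

    Illustrative only: this mirrors the stated requirements, not the
    checks that selpg itself performs.
    """
    if start_page < 1:
        raise ValueError("start page must be at least 1")
    if end_page < start_page:
        raise ValueError("end page must be >= start page")
    if end_page > total_pages:
        raise ValueError("end page does not exist in the file")
    return start_page, end_page


# The example ranges for a 10-page file all pass the check.
for first, last in [(1, 6), (3, 7), (4, 10)]:
    validate_page_range(first, last, total_pages=10)
```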

Here, pages can be defined in one of two ways:

  1. Pages defined by a fixed number of lines of text per page. This is common for reporting programs that generate reports simply as lines of text. A traditional default for such reports is 72 lines per logical page, so this is the default that the selpg utility uses. You can change that default to any other reasonable value with the appropriate command-line option (-llines_per_page) of selpg. Check the usage of the selpg command by running it without any arguments.

  2. Pages defined by a form feed character as the delimiter between pages. This format is also used by many reporting programs, and it dates from when line printers were common. When printed, the form feed character (ASCII 12) makes the printer move the print head to the top of the next page, regardless of its position on the current page when the form feed was encountered. This is also known as the printer throwing a page. To use the recipe with pages defined by form feeds, use the -f option of selpg. Again, consult the usage message by running selpg without command-line options.
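The two page definitions can be illustrated with a short Python sketch. This is not how selpg is implemented (selpg is written in C and streams its input). It only shows, under simplifying assumptions, what "page N" means in each mode. All function names here are hypothetical, and the whole input is assumed to fit in memory.

```python
FORM_FEED = "\f"  # ASCII 12


def pages_by_line_count(text, lines_per_page=72):
    """Split text into pages of a fixed number of lines (default 72)."""
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a trailing newline does not start another line
    return ["\n".join(lines[i:i + lines_per_page])
            for i in range(0, len(lines), lines_per_page)]


def pages_by_form_feed(text):
    """Split text into pages at each form feed character."""
    return text.split(FORM_FEED)


def select_pages(pages, start_page, end_page):
    """Return pages start_page..end_page, counting from 1, inclusive."""
    return pages[start_page - 1:end_page]


# Ten lines at 3 lines per page give 4 pages; pages 2-3 hold lines 4-9.
report = "".join("line %d\n" % n for n in range(1, 11))
print(select_pages(pages_by_line_count(report, lines_per_page=3), 2, 3))

# Form feed mode: the page boundaries come from the file itself.
print(select_pages(pages_by_form_feed("page one\fpage two\fpage three"), 2, 2))
```

Note that the line-count version splits on the newline character only, so a form feed inside a line is treated as ordinary text in that mode.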

When the pipeline is run, the selpg command reads the input text file, selects only the lines from the specified range of pages (i.e. first line of starting page through to last line of ending page), and pipes those lines of text to the StdinToPDF.py program, which writes those lines to the output PDF file.
