Creating PDF

PDF is certainly one of the most complicated formats out there. And certainly one of the formats most of the developers will be having trouble with, as there are many versions and dialects. That makes it almost impossible and highly challenging to have just one right way of creating PDF files. That’s why, creation of PDF files has been delegated to an abstraction layer of PDF generators. If you don’t like how PDF files are generated, you can create your own layer, using your favourite library.

Currently, there are three PDF generators implemented:

  • PdfkitPdfGenerator (default), built on top of the pdfkit and wkhtmltopdf.

  • ReportlabPdfGenerator, build on top of the famous reportlab.

  • PilPdfGenerator, build on top of the Pillow. Produced PDFs would contain images only (even texts are stored as images), unlike pdfkit or reportlab based solutions, where PDFs would simply contain selectable text. However, it’s the generator that will likely won’t ask for any system dependencies that you don’t yet have installed.

Building PDF with text using pdfkit

While pdfkit generator is heavier and has wkhtmltopdf as a system dependency, it produces better quality PDFs and has no issues with fonts or unicode characters.

See the following full functional snippet for generating PDF using pdfkit.

# Required imports
from faker import Faker
from faker_file.providers.pdf_file import PdfFileProvider
from faker_file.providers.pdf_file.generators.pdfkit_generator import (
    PdfkitPdfGenerator,
)

FAKER = Faker()  # Initialize Faker
FAKER.add_provider(PdfFileProvider)  # Register PdfFileProvider

# Generate PDF file using `pdfkit`
pdf_file = FAKER.pdf_file(pdf_generator_cls=PdfkitPdfGenerator)

See the full example here

The generated PDF will have 10,000 characters of text, which is about 2 pages.

If you want PDF with more pages, you could either:

  • Increase the value of max_nb_chars accordingly.

  • Set value of wrap_chars_after to 80 characters to force longer pages.

  • Insert manual page breaks and other content.


See the example below for max_nb_chars tweak:

# Generate PDF file of 20,000 characters, using `pdfkit`
pdf_file = FAKER.pdf_file(
    pdf_generator_cls=PdfkitPdfGenerator, max_nb_chars=20_000
)

See the full example here


See the example below for wrap_chars_after tweak:

# Generate PDF file, wrapping each line after 80 characters, using `pdfkit`
pdf_file = FAKER.pdf_file(
    pdf_generator_cls=PdfkitPdfGenerator, wrap_chars_after=80
)

See the full example here


As mentioned above, it’s possible to diversify the generated context with images, paragraphs, tables, manual text break and pretty much everything that is supported by PDF format specification, although currently only images, paragraphs, tables and manual text breaks are supported out of the box. In order to customise the blocks PDF file is built from, the DynamicTemplate class is used. See the example below for usage examples:

from faker_file.base import DynamicTemplate
from faker_file.contrib.pdf_file.pdfkit_snippets import (
    add_page_break,
    add_paragraph,
    add_picture,
    add_table,
)

# Create a PDF file with paragraph, picture, table and manual page breaks
# in between the mentioned elements. The ``DynamicTemplate`` simply
# accepts a list of callables (such as ``add_paragraph``,
# ``add_page_break``) and dictionary to be later on fed to the callables
# as keyword arguments for customising the default values.
pdf_file = FAKER.pdf_file(
    pdf_generator_cls=PdfkitPdfGenerator,
    content=DynamicTemplate(
        [
            (add_paragraph, {}),  # Add paragraph
            (add_page_break, {}),  # Add page break
            (add_picture, {}),  # Add picture
            (add_page_break, {}),  # Add page break
            (add_table, {}),  # Add table
            (add_page_break, {}),  # Add page break
        ]
    ),
)

# You could make the list as long as you like or simply multiply for
# easier repetition as follows:
pdf_file = FAKER.pdf_file(
    pdf_generator_cls=PdfkitPdfGenerator,
    content=DynamicTemplate(
        [
            (add_paragraph, {}),  # Add paragraph
            (add_page_break, {}),  # Add page break
            (add_picture, {}),  # Add picture
            (add_page_break, {}),  # Add page break
            (add_table, {}),  # Add table
            (add_page_break, {}),  # Add page break
        ]
        * 5  # Will repeat your config 5 times
    ),
)

See the full example here

Building PDFs with text using reportlab

While reportlab generator is much lighter than the pdfkit and does not have system dependencies, but might produce PDF files with questionable encoding when generating unicode text.

See the following full functional snippet for generating PDF using reportlab.

from faker_file.providers.pdf_file.generators.reportlab_generator import (
    ReportlabPdfGenerator,
)

# Generate PDF file using `reportlab`
pdf_file = FAKER.pdf_file(pdf_generator_cls=ReportlabPdfGenerator)

See the full example here


All examples shown for pdfkit apply for reportlab generator, however when building PDF files from blocks (paragraphs, images, tables and page breaks), the imports shall be adjusted.

As mentioned above, it’s possible to diversify the generated context with images, paragraphs, tables, manual text break and pretty much everything that is supported by PDF format specification, although currently only images, paragraphs, tables and manual text breaks are supported. In order to customise the blocks PDF file is built from, the DynamicTemplate class is used. See the example below for usage examples:

from faker_file.contrib.pdf_file.reportlab_snippets import (
    add_page_break,
    add_paragraph,
    add_picture,
    add_table,
)

# Create a PDF file with paragraph, picture, table and manual page breaks
# in between the mentioned elements. The ``DynamicTemplate`` simply
# accepts a list of callables (such as ``add_paragraph``,
# ``add_page_break``) and dictionary to be later on fed to the callables
# as keyword arguments for customising the default values.
pdf_file = FAKER.pdf_file(
    pdf_generator_cls=ReportlabPdfGenerator,
    content=DynamicTemplate(
        [
            (add_paragraph, {}),  # Add paragraph
            (add_page_break, {}),  # Add page break
            (add_picture, {}),  # Add picture
            (add_page_break, {}),  # Add page break
            (add_table, {}),  # Add table
            (add_page_break, {}),  # Add page break
        ]
    ),
)

# You could make the list as long as you like or simply multiply for
# easier repetition as follows:
pdf_file = FAKER.pdf_file(
    pdf_generator_cls=ReportlabPdfGenerator,
    content=DynamicTemplate(
        [
            (add_paragraph, {}),  # Add paragraph
            (add_page_break, {}),  # Add page break
            (add_picture, {}),  # Add picture
            (add_page_break, {}),  # Add page break
            (add_table, {}),  # Add table
            (add_page_break, {}),  # Add page break
        ]
        * 5  # Will repeat your config 5 times
    ),
)

See the full example here

Building PDFs with text using Pillow

Usage example:

from faker_file.providers.pdf_file.generators.pil_generator import (
    PilPdfGenerator,
)

pdf_file = FAKER.pdf_file(pdf_generator_cls=PilPdfGenerator)

See the full example here


With options:

pdf_file = FAKER.pdf_file(
    pdf_generator_cls=PilPdfGenerator,
    pdf_generator_kwargs={
        "encoding": "utf8",
        "font_size": 14,
        "page_width": 800,
        "page_height": 1200,
        "line_height": 16,
        "spacing": 5,
    },
    wrap_chars_after=100,
)

See the full example here


All examples shown for pdfkit and reportlab apply to Pillow generator, however when building PDF files from blocks (paragraphs, images, tables and page breaks), the imports shall be adjusted.

As mentioned above, it’s possible to diversify the generated context with images, paragraphs, tables, manual text break and pretty much everything that is supported by PDF format specification, although currently only images, paragraphs, tables and manual text breaks are supported. In order to customise the blocks PDF file is built from, the DynamicTemplate class is used. See the example below for usage examples:

from faker_file.contrib.pdf_file.pil_snippets import (
    add_page_break,
    add_paragraph,
    add_picture,
    add_table,
)

# Create a PDF file with paragraph, picture, table and manual page breaks
# in between the mentioned elements. The ``DynamicTemplate`` simply
# accepts a list of callables (such as ``add_paragraph``,
# ``add_page_break``) and dictionary to be later on fed to the callables
# as keyword arguments for customising the default values.
pdf_file = FAKER.pdf_file(
    pdf_generator_cls=PilPdfGenerator,
    content=DynamicTemplate(
        [
            (add_paragraph, {}),  # Add paragraph
            (add_page_break, {}),  # Add page break
            (add_picture, {}),  # Add picture
            (add_page_break, {}),  # Add page break
            (add_table, {}),  # Add table
            (add_page_break, {}),  # Add page break
        ]
    ),
)

# You could make the list as long as you like or simply multiply for
# easier repetition as follows:
pdf_file = FAKER.pdf_file(
    pdf_generator_cls=PilPdfGenerator,
    content=DynamicTemplate(
        [
            (add_paragraph, {}),  # Add paragraph
            (add_page_break, {}),  # Add page break
            (add_picture, {}),  # Add picture
            (add_page_break, {}),  # Add page break
            (add_table, {}),  # Add table
            (add_page_break, {}),  # Add page break
        ]
        * 5  # Will repeat your config 5 times
    ),
)

See the full example here

Creating PDFs with graphics using Pillow

There’s a so called graphic PDF file provider available. Produced PDF files would not contain text, so don’t use it when you need text based content. However, sometimes you just need a valid file in PDF format, without caring much about the content. That’s where a GraphicPdfFileProvider comes to rescue:

from faker_file.providers.pdf_file import GraphicPdfFileProvider

pdf_file = FAKER.graphic_pdf_file()

See the full example here

The generated file will contain a random graphic (consisting of lines and shapes of different colours).


One of the most useful arguments supported is size.

pdf_file = FAKER.graphic_pdf_file(size=(800, 800))

See the full example here