all2md.cli

Command-line interface for the all2md document conversion library.

This module provides a simple CLI tool for converting documents to Markdown format using the all2md library. It supports all formats handled by the library and provides convenient options for common use cases.

Environment Variable Support

All CLI options support environment variable defaults using the pattern ALL2MD_<OPTION_NAME> where option names are converted to uppercase with hyphens and dots replaced by underscores. CLI arguments always override environment variables.
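
The naming rule above can be sketched in a few lines of Python. The helper name env_var_name is hypothetical and not part of all2md; it only illustrates the uppercase-and-underscore conversion described here.

```python
def env_var_name(option: str, prefix: str = "ALL2MD_") -> str:
    """Map a CLI option name to its environment variable name (illustrative only)."""
    # Drop the leading dashes, then replace hyphens and dots with underscores.
    bare = option.lstrip("-")
    return prefix + bare.replace("-", "_").replace(".", "_").upper()


print(env_var_name("--markdown-emphasis-symbol"))  # ALL2MD_MARKDOWN_EMPHASIS_SYMBOL
print(env_var_name("--output-dir"))                # ALL2MD_OUTPUT_DIR
```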

Examples

Basic conversion:

$ all2md document.pdf

Specify output file:

$ all2md document.docx --out output.md

Save attachments:

$ all2md document.docx --attachment-mode save --attachment-output-dir ./attachments

Use underscore emphasis:

$ all2md document.html --markdown-emphasis-symbol "_"

Convert multiple files:

$ all2md *.pdf --output-dir ./converted

Use rich formatting:

$ all2md document.pdf --rich

Rich “cat” shortcut (equivalent to all2md document.pdf --rich):

$ rcat document.pdf

Process directory recursively:

$ all2md ./documents --recursive --output-dir ./markdown

Collate multiple files into one output:

$ all2md *.pdf --collate --out combined.md

Use environment variables for defaults:

$ export ALL2MD_RICH=true
$ export ALL2MD_OUTPUT_DIR=./converted
$ export ALL2MD_MARKDOWN_EMPHASIS_SYMBOL="_"
$ all2md *.pdf  # Uses environment defaults

all2md.cli.main(args: list[str] | None = None) → int

Run the main CLI entry point, delegating the work to specialized processors.

all2md.cli.rcat_main(args: list[str] | None = None) → int

Console-script entry point for rcat (rich cat).

Equivalent to all2md <args> --rich: renders documents with rich terminal formatting, automatically falling back to plain Markdown when output is piped.
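
How all2md detects piped output is not stated here. A common approach, assumed for this sketch, is to check whether the output stream is an interactive terminal; should_use_rich is a hypothetical helper, not a library function.

```python
import io
import sys
from typing import TextIO


def should_use_rich(stream: TextIO) -> bool:
    """Return True only when the stream is an interactive terminal."""
    # ASSUMPTION: piped or redirected output is detected through isatty().
    isatty = getattr(stream, "isatty", None)
    return bool(isatty and isatty())


# An in-memory buffer stands in for a pipe: it is never a terminal.
print(should_use_rich(io.StringIO()))  # False

# The same check against the real standard output.
mode = "rich" if should_use_rich(sys.stdout) else "plain"
print(mode)
```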

class all2md.cli.DynamicCLIBuilder

Bases: object

Builds CLI arguments dynamically from options dataclasses.

This class introspects converter options dataclasses and their metadata to automatically generate argparse arguments, eliminating the need for hard-coded CLI argument definitions.

__init__() → None

Initialize the CLI builder.

get_options_class_map() → Dict[str, Type[Any]]

Expose cached options classes for external introspection.

snake_to_kebab(name: str) → str

Convert snake_case to kebab-case.

Parameters:

name (str) – Snake case field name

Returns:

Kebab case CLI name

Return type:

str
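
A minimal sketch of the conversion, written as a standalone function rather than the method itself. It assumes the conversion is a plain underscore-to-hyphen replacement; the field name used is only an example.

```python
def snake_to_kebab(name: str) -> str:
    """Convert a snake_case field name to kebab-case."""
    return name.replace("_", "-")


print(snake_to_kebab("attachment_output_dir"))  # attachment-output-dir
```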

infer_cli_name(field_name: str, format_prefix: str | None = None, is_boolean_with_true_default: bool = False) → str

Infer CLI argument name from field name.

Parameters:
  • field_name (str) – Dataclass field name

  • format_prefix (str, optional) – Format prefix (e.g., ‘pdf’, ‘html’)

  • is_boolean_with_true_default (bool) – Whether this is a boolean field with default=True

Returns:

CLI argument name with -- prefix

Return type:

str
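
The sketch below shows one way such a name could be assembled. Two details are assumptions and not confirmed by this page: that the format prefix is joined with a hyphen, and that a boolean defaulting to True becomes a negated "no-" flag. The function name and the detect_columns field are hypothetical.

```python
from typing import Optional


def infer_cli_name_sketch(
    field_name: str,
    format_prefix: Optional[str] = None,
    is_boolean_with_true_default: bool = False,
) -> str:
    """Build a --flag name from a dataclass field name (illustrative only)."""
    kebab = field_name.replace("_", "-")
    # ASSUMPTION: true-by-default booleans are exposed as a negated "no-" flag.
    if is_boolean_with_true_default:
        kebab = f"no-{kebab}"
    # ASSUMPTION: the format prefix is joined to the name with a hyphen.
    if format_prefix:
        kebab = f"{format_prefix}-{kebab}"
    return f"--{kebab}"


print(infer_cli_name_sketch("emphasis_symbol", "markdown"))  # --markdown-emphasis-symbol
print(infer_cli_name_sketch("detect_columns", "pdf", True))  # --pdf-no-detect-columns
```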

get_argument_kwargs(field: Any, metadata: Dict[str, Any], cli_name: str, options_class: Type) → Dict[str, Any]

Build argparse kwargs from field metadata using robust type resolution.

This method replaces brittle string matching with proper type introspection using typing.get_type_hints and helper methods.

Parameters:
  • field (Field) – Dataclass field

  • metadata (dict) – Field metadata

  • cli_name (str) – CLI argument name

  • options_class (Type) – The dataclass containing the field

Returns:

Kwargs for argparse.add_argument()

Return type:

dict
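
To illustrate type-driven kwargs, the sketch below resolves annotations with typing.get_type_hints and derives argparse settings from the resolved type instead of matching on annotation strings. DemoOptions, argument_kwargs, and the "help" metadata key are all hypothetical; the real method's rules may differ.

```python
import argparse
from dataclasses import dataclass, field, fields
from typing import Any, Dict, Optional, Union, get_args, get_origin, get_type_hints


@dataclass
class DemoOptions:  # hypothetical options dataclass, not from all2md
    max_pages: Optional[int] = field(default=None, metadata={"help": "Page limit"})
    title: str = field(default="", metadata={"help": "Document title"})
    verbose: bool = field(default=False, metadata={"help": "Chatty output"})


def argument_kwargs(options_class: type, field_name: str) -> Dict[str, Any]:
    """Derive argparse kwargs from a resolved type hint (illustrative only)."""
    hints = get_type_hints(options_class)  # also resolves string annotations
    hint = hints[field_name]
    # Unwrap Optional[X] (Union[X, None]) to X.
    if get_origin(hint) is Union:
        non_none = [a for a in get_args(hint) if a is not type(None)]
        if len(non_none) == 1:
            hint = non_none[0]
    meta = next(f.metadata for f in fields(options_class) if f.name == field_name)
    kwargs: Dict[str, Any] = {"help": meta.get("help", "")}
    if hint is bool:
        kwargs["action"] = "store_true"
    elif hint in (int, float, str):
        kwargs["type"] = hint
    return kwargs


parser = argparse.ArgumentParser()
for f in fields(DemoOptions):
    flag = "--" + f.name.replace("_", "-")
    parser.add_argument(flag, **argument_kwargs(DemoOptions, f.name))

args = parser.parse_args(["--max-pages", "3", "--verbose"])
print(args.max_pages, args.verbose)  # 3 True
```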

add_options_class_arguments(parser: ArgumentParser, options_class: Type, format_prefix: str | None = None, group_name: str | None = None) → None

Add arguments for an options dataclass.

Parameters:
  • parser (ArgumentParser) – Parser to add arguments to

  • options_class (Type) – Options dataclass type

  • format_prefix (str, optional) – Prefix for argument names (e.g., ‘pdf’)

  • group_name (str, optional) – Name for argument group

add_format_specific_options(parser: ArgumentParser, options_class: Type, format_prefix: str | None = None, group_name: str | None = None) → None

Add arguments for format-specific options, excluding BaseOptions fields.

Parameters:
  • parser (ArgumentParser) – Parser to add arguments to

  • options_class (Type) – Options dataclass type

  • format_prefix (str, optional) – Prefix for argument names (e.g., ‘pdf’)

  • group_name (str, optional) – Name for argument group
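
Excluding inherited fields can be sketched with dataclasses.fields. The two dataclasses below are hypothetical stand-ins for BaseOptions and a format-specific options class, and the helper name is invented for illustration.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class BaseOptionsDemo:  # hypothetical stand-in for BaseOptions
    attachment_mode: str = "save"


@dataclass
class PdfOptionsDemo(BaseOptionsDemo):  # hypothetical format-specific options
    pages: str = ""
    detect_columns: bool = True


def format_specific_field_names(options_class: type, base_class: type) -> List[str]:
    """List fields declared by options_class but not inherited from base_class."""
    base_names = {f.name for f in fields(base_class)}
    return [f.name for f in fields(options_class) if f.name not in base_names]


print(format_specific_field_names(PdfOptionsDemo, BaseOptionsDemo))
# ['pages', 'detect_columns']
```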

add_renderer_options(parser: ArgumentParser, options_class: Type, format_name: str) → None

Add renderer options for a given format with dedicated prefixes.

add_transform_arguments(parser: ArgumentParser) → None

Add transform-related CLI arguments.

Adds the --transform flag and dynamic transform parameter arguments based on discovered transforms and their metadata.

Parameters:

parser (ArgumentParser) – Parser to add transform arguments to

add_global_attachment_arguments(parser: ArgumentParser) → None

Add global attachment handling arguments that apply to all formats.

These arguments are extracted from AttachmentOptionsMixin and apply to all formats unless overridden by format-specific arguments.

Parameters:

parser (ArgumentParser) – Parser to add arguments to

build_parser() → ArgumentParser

Build the complete argument parser with dynamic arguments.

Returns:

Configured parser

Return type:

ArgumentParser

resolve_option_field(dest: str) → tuple[Any, Dict[str, Any]] | None

Return the dataclass field and metadata for a CLI destination.

map_args_to_options(parsed_args: Namespace, json_options: dict | None = None) → dict

Map CLI arguments to options using dot notation parsing.

This simplified version uses dot notation in argument destinations to directly map to the nested structure of options.

Parameters:
  • parsed_args (argparse.Namespace) – Parsed command line arguments

  • json_options (dict or None) – Options loaded from JSON file

Returns:

Mapped options dictionary ready for to_markdown()

Return type:

dict
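
The dot-notation idea can be sketched as follows: each argparse destination containing dots is split and written into a nested dictionary. The flag names, the skipping of unset values, and the exact output shape are assumptions for illustration; merging with json_options is left out.

```python
import argparse
from typing import Any, Dict


def nest_dotted(parsed_args: argparse.Namespace) -> Dict[str, Any]:
    """Expand dotted destinations such as 'pdf.pages' into nested dicts."""
    result: Dict[str, Any] = {}
    for dest, value in vars(parsed_args).items():
        if value is None:  # ASSUMPTION: unset arguments are skipped
            continue
        *parents, leaf = dest.split(".")
        node = result
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return result


parser = argparse.ArgumentParser()
# Hypothetical flags; argparse accepts any string as a destination name.
parser.add_argument("--pdf-pages", dest="pdf.pages")
parser.add_argument("--markdown-emphasis-symbol", dest="markdown.emphasis_symbol")
parser.add_argument("--rich", action="store_true")

args = parser.parse_args(["--pdf-pages", "1-3", "--markdown-emphasis-symbol", "_"])
nested = nest_dotted(args)
print(nested)
```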

all2md.cli.create_parser() → ArgumentParser

Create and configure the argument parser using dynamic generation.

all2md.cli.collect_argument_problems(parsed_args: Namespace, files: List[CLIInputItem] | None = None) → list[ValidationProblem]

Collect validation problems for parsed CLI arguments.

all2md.cli.report_validation_problems(problems: Iterable[ValidationProblem], *, logger: Logger | None = None) → bool

Report validation problems via logging.

Returns True when any errors were encountered.
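
The reporting contract (log every problem, return True if any was an error) might look like the sketch below. ProblemDemo is a hypothetical stand-in, since the fields of ValidationProblem are not shown on this page, and the "error"/"warning" severity values are assumed.

```python
import logging
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class ProblemDemo:  # hypothetical stand-in for ValidationProblem
    message: str
    severity: str = "error"  # ASSUMPTION: either "error" or "warning"


def report_problems(
    problems: Iterable[ProblemDemo], *, logger: Optional[logging.Logger] = None
) -> bool:
    """Log each problem and return True if any of them was an error."""
    log = logger or logging.getLogger(__name__)
    has_errors = False
    for problem in problems:
        if problem.severity == "error":
            log.error(problem.message)
            has_errors = True
        else:
            log.warning(problem.message)
    return has_errors


only_warnings = report_problems([ProblemDemo("output directory will be created", "warning")])
with_error = report_problems([ProblemDemo("input file not found")])
print(only_warnings, with_error)  # False True
```

A caller could then turn the boolean into a non-zero exit status, keeping collection and reporting of problems separate.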

all2md.cli.validate_arguments(parsed_args: Namespace, files: List[CLIInputItem] | None = None, *, logger: Logger | None = None) → bool

Validate parsed arguments, logging any issues. Maintains legacy API.

For CLI module documentation organized by functionality, see CLI Module.