
PDF Module

Total files documented: 2


pdf/database.py

Purpose

This file provides functions for interacting with a SQL Server database through the pyodbc library: one establishes a database connection, and one inserts process status information.

Key Responsibilities

  • Establish a connection to a SQL Server database
  • Insert process status information into the database

Important Functions

get_db_connection_wizard()

This function establishes a connection to a SQL Server database using pyodbc. It takes no arguments and returns a database connection object. If the connection fails, it raises an HTTPException with status code 500.

insert_process_status()

This function records process status information in the database. It takes four arguments: project_id, instance, status_text, and final_json. Its SQL query updates the existing row if one exists and inserts a new row otherwise.
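The update-or-insert behaviour described above can be sketched as follows. This is a minimal illustration, not the module's actual code. It uses the standard-library sqlite3 module as a stand-in for pyodbc so that it runs anywhere; both drivers accept "?" placeholders and expose cursor.rowcount. The table name, the column names, the (project_id, instance) key, and the explicit conn parameter are all assumptions made for the example.

```python
import json
import sqlite3

# Hypothetical schema: the real table and column names are not shown in the source.
CREATE_SQL = """
CREATE TABLE IF NOT EXISTS process_status (
    project_id  TEXT NOT NULL,
    instance    TEXT NOT NULL,
    status_text TEXT,
    final_json  TEXT,
    PRIMARY KEY (project_id, instance)
)
"""


def insert_process_status(conn, project_id, instance, status_text, final_json):
    """Update the row for (project_id, instance), or insert it if none exists."""
    # Store the JSON payload as text; accept either a string or a Python object.
    payload = final_json if isinstance(final_json, str) else json.dumps(final_json)
    cur = conn.cursor()
    cur.execute(
        "UPDATE process_status SET status_text = ?, final_json = ? "
        "WHERE project_id = ? AND instance = ?",
        (status_text, payload, project_id, instance),
    )
    if cur.rowcount == 0:
        # No existing row matched the key, so insert a new one.
        cur.execute(
            "INSERT INTO process_status "
            "(project_id, instance, status_text, final_json) VALUES (?, ?, ?, ?)",
            (project_id, instance, status_text, payload),
        )
    conn.commit()


# Usage: the second call updates the row created by the first.
conn = sqlite3.connect(":memory:")
conn.execute(CREATE_SQL)
insert_process_status(conn, "p1", "a", "running", {"step": 1})
insert_process_status(conn, "p1", "a", "done", {"step": 2})
rows = conn.execute(
    "SELECT project_id, instance, status_text, final_json FROM process_status"
).fetchall()
```

On SQL Server the same effect is often achieved in a single statement (for example with MERGE); the two-step form is used here only because it is portable across drivers.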

Important Classes

None

System Fit

This file belongs to the data storage layer of the wider 3-Cubed Python system. It gives other components a way to connect to the SQL Server database and record process status information.


pdf/pdf_llm_one_shot.py

Purpose

This file enhances images taken from PDF files using super-resolution and denoising techniques. It then uses a Large Language Model (LLM) to extract process flow information from the enhanced images.

Key Responsibilities

  • Image enhancement using super-resolution and denoising techniques
  • LLM-based process flow extraction from enhanced images
  • PDF rasterization and image processing
  • Output file management

Important Functions

  • calculate_laplacian_variance: calculates the variance of the Laplacian of an image to assess sharpness
  • calculate_contrast: calculates the standard deviation of an image's grayscale values to assess contrast
  • process_single_image: enhances a single image using super-resolution and denoising techniques
  • convert_and_enhance: processes multiple images from PDF files or individual image files using the process_single_image function
  • invoke_llm_with_resize: behavior not clear from the code; likely invokes the LLM on resized images
  • dfs: behavior not clear from the code; likely performs a depth-first search on a graph
  • create_dfs_graph_from_data: behavior not clear from the code; likely builds a graph from data using depth-first search
  • merge_graphs: behavior not clear from the code; likely merges multiple graphs
  • main: the entry point of the script, likely orchestrating the entire process
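The two quality metrics in the list above (Laplacian variance for sharpness, grayscale standard deviation for contrast) can be illustrated with a dependency-free sketch. The real functions most likely operate on OpenCV or NumPy arrays; this version takes a grayscale image as a list of rows of numbers, applies a 4-neighbour Laplacian to interior pixels only, and is an assumption about the approach rather than the module's code.

```python
from statistics import pstdev, pvariance


def calculate_laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Higher values indicate more edges, i.e. a sharper image.
    Requires an image of at least 3x3 pixels.
    """
    height, width = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def calculate_contrast(gray):
    """Population standard deviation of all grayscale values."""
    return pstdev([value for row in gray for value in row])


# A uniform image has no edges and no contrast; a checkerboard has plenty of both.
flat = [[128] * 5 for _ in range(5)]
checker = [[255 if (x + y) % 2 else 0 for x in range(5)] for y in range(5)]

flat_sharpness = calculate_laplacian_variance(flat)
checker_sharpness = calculate_laplacian_variance(checker)
flat_contrast = calculate_contrast(flat)
checker_contrast = calculate_contrast(checker)
```

A pipeline could compare such scores before and after enhancement to decide whether an image needs super-resolution or denoising at all; whether this module does so is not stated in the source.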

Important Classes

  • GraphData: a Pydantic model representing graph data, containing nodes and edges
  • NodeData: a Pydantic model representing node data within the graph
  • EdgeData: a Pydantic model representing edge data within the graph
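To make the graph-related names above concrete, here is a hedged sketch of the three models together with one plausible reading of dfs and merge_graphs. It uses standard-library dataclasses as a stand-in for the Pydantic models so that it runs without extra dependencies. Apart from the nodes and edges fields on GraphData, every field name (id, label, source, target) and both function signatures are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NodeData:
    id: str          # hypothetical field name
    label: str = ""  # hypothetical field name


@dataclass(frozen=True)
class EdgeData:
    source: str  # hypothetical field name
    target: str  # hypothetical field name


@dataclass
class GraphData:
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)


def merge_graphs(graphs):
    """Union several graphs, keeping the first node per id and dropping duplicate edges."""
    nodes, edges = {}, {}
    for graph in graphs:
        for node in graph.nodes:
            nodes.setdefault(node.id, node)
        for edge in graph.edges:
            edges.setdefault((edge.source, edge.target), edge)
    return GraphData(list(nodes.values()), list(edges.values()))


def dfs(graph, start):
    """Return the ids reachable from start, in depth-first preorder."""
    adjacency = {}
    for edge in graph.edges:
        adjacency.setdefault(edge.source, []).append(edge.target)
    order, seen, stack = [], set(), [start]
    while stack:
        node_id = stack.pop()
        if node_id in seen:
            continue
        seen.add(node_id)
        order.append(node_id)
        # Push neighbours in reverse so they are visited in their original order.
        stack.extend(reversed(adjacency.get(node_id, [])))
    return order


# Usage: two partial flows (for example, from two pages) share the node "b".
page1 = GraphData([NodeData("a", "Start"), NodeData("b", "Review")], [EdgeData("a", "b")])
page2 = GraphData([NodeData("b", "Review"), NodeData("c", "End")], [EdgeData("b", "c")])
merged = merge_graphs([page1, page2])
order = dfs(merged, "a")
```

Merging by node id is one simple way to stitch per-page extractions into a single process flow; the actual merge rule used by the script is not documented here.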

System Fit

This file is part of the 3-Cubed Python system, which appears to process and analyze image and text data. Within that system, this script enhances images and uses an LLM to extract process flow information from them. Its output can feed further analysis or processing elsewhere in the system.

