PDF Module¶
Total files documented: 2
pdf/database.py¶
Purpose¶
This file, database.py, provides functions for interacting with a SQL Server database through the pyodbc library. One function establishes a database connection; the other inserts or updates process status information.
Key Responsibilities¶
- Establish a connection to a SQL Server database
- Insert process status information into the database
Important Functions¶
get_db_connection_wizard()¶
This function establishes a connection to a SQL Server database using the pyodbc library. It takes no arguments and returns a database connection object. If the connection fails, it raises an HTTPException with a status code of 500.
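The behaviour described above can be sketched as follows. This is a minimal illustration, not the module's actual code: the connection string, the `open_connection` helper, and the local `HTTPException` class (a stand-in for `fastapi.HTTPException`, which the real module presumably imports) are all assumptions made so the sketch runs without third-party packages.

```python
from typing import Any, Callable


class HTTPException(Exception):
    """Local stand-in for fastapi.HTTPException (assumption, keeps the sketch dependency-free)."""

    def __init__(self, status_code: int, detail: str = "") -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


# Hypothetical connection string: the real driver, server, and credentials are not documented.
CONNECTION_STRING = (
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=example_db;Trusted_Connection=yes;"
)


def open_connection(connect: Callable[[str], Any], connection_string: str) -> Any:
    """Call `connect` and translate any failure into an HTTP 500 error."""
    try:
        return connect(connection_string)
    except Exception as exc:
        raise HTTPException(
            status_code=500, detail=f"Database connection failed: {exc}"
        ) from exc


def get_db_connection_wizard() -> Any:
    """Take no arguments and return an open pyodbc connection, as the text describes."""
    import pyodbc  # imported lazily so the helper above can be exercised without the driver

    return open_connection(pyodbc.connect, CONNECTION_STRING)
```

Splitting out `open_connection` is a design choice of this sketch only: it lets the error-handling path be tested with a fake `connect` callable, without a live SQL Server instance.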
insert_process_status()¶
This function writes process status information to the database. It takes four arguments: project_id, instance, status_text, and final_json. Its SQL query updates the matching row if one exists and inserts a new row otherwise.
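The update-or-insert pattern described above might look like the sketch below. The table name `process_status`, its column names, the extra `conn` parameter, and the JSON serialisation of `final_json` are assumptions for illustration. An in-memory `sqlite3` database stands in for SQL Server here, since both `sqlite3` and `pyodbc` accept `?` placeholders; the real module may instead issue a single T-SQL statement.

```python
import json
import sqlite3

# Hypothetical table and column names.
UPDATE_SQL = (
    "UPDATE process_status SET status_text = ?, final_json = ? "
    "WHERE project_id = ? AND instance = ?"
)
INSERT_SQL = (
    "INSERT INTO process_status (project_id, instance, status_text, final_json) "
    "VALUES (?, ?, ?, ?)"
)


def insert_process_status(conn, project_id, instance, status_text, final_json):
    """Update the row for (project_id, instance); insert it if no row was updated."""
    payload = final_json if isinstance(final_json, str) else json.dumps(final_json)
    cursor = conn.cursor()
    cursor.execute(UPDATE_SQL, (status_text, payload, project_id, instance))
    if cursor.rowcount == 0:  # nothing matched, so the row does not exist yet
        cursor.execute(INSERT_SQL, (project_id, instance, status_text, payload))
    conn.commit()


# Usage with an in-memory stand-in database.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE process_status "
    "(project_id TEXT, instance TEXT, status_text TEXT, final_json TEXT)"
)
insert_process_status(demo_conn, "p1", "a", "running", {"step": 1})  # inserts
insert_process_status(demo_conn, "p1", "a", "done", {"step": 2})     # updates
rows = demo_conn.execute(
    "SELECT status_text, final_json FROM process_status"
).fetchall()
```

After both calls the table holds a single row carrying the latest status. Note that a separate UPDATE followed by INSERT is not atomic on its own; concurrent writers would need a transaction or locking strategy, which this sketch does not address.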
Important Classes¶
None
System Fit¶
This file fits into the wider 3-Cubed Python system as part of the data storage layer. It gives other components of the system a way to connect to the SQL Server database and record process status information.
pdf/pdf_llm_one_shot.py¶
Purpose¶
This file, pdf/pdf_llm_one_shot.py, is designed to process and enhance images from PDF files using super-resolution and denoising techniques. It also utilizes a Large Language Model (LLM) to extract process flow information from the enhanced images.
Key Responsibilities¶
- Image enhancement using super-resolution and denoising techniques
- LLM-based process flow extraction from enhanced images
- PDF rasterization and image processing
- Output file management
Important Functions¶
- calculate_laplacian_variance: calculates the variance of the Laplacian of an image to assess sharpness
- calculate_contrast: calculates the standard deviation of an image's grayscale values to assess contrast
- process_single_image: enhances a single image using super-resolution and denoising techniques
- convert_and_enhance: processes multiple images from PDF files or individual image files using the process_single_image function
- invoke_llm_with_resize: (Not clear from code) - likely invokes the LLM with resized images
- dfs: (Not clear from code) - likely performs a depth-first search on a graph
- create_dfs_graph_from_data: (Not clear from code) - likely creates a graph from data using depth-first search
- merge_graphs: (Not clear from code) - likely merges multiple graphs
- main: the entry point of the script, likely orchestrating the entire process
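The two quality metrics named above, Laplacian variance for sharpness and grayscale standard deviation for contrast, can be illustrated with a dependency-free sketch. It assumes a grayscale image given as a list of pixel rows and uses the 4-neighbour Laplacian kernel; the module itself may rely on an image library and a different kernel, which is not confirmed here.

```python
from statistics import pstdev, pvariance
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities (assumed layout)


def calculate_laplacian_variance(gray: Image) -> float:
    """Variance of the 4-neighbour Laplacian response over interior pixels.

    Higher values indicate more edges, i.e. a sharper image.
    """
    height, width = len(gray), len(gray[0])
    if height < 3 or width < 3:
        raise ValueError("image must be at least 3x3 pixels")
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def calculate_contrast(gray: Image) -> float:
    """Population standard deviation of all pixel intensities."""
    return pstdev([pixel for row in gray for pixel in row])


# A flat image has no edges and no contrast; a checkerboard has plenty of both.
flat = [[10] * 4 for _ in range(4)]
checkerboard = [[255 if (x + y) % 2 == 0 else 0 for x in range(4)] for y in range(4)]
```

A pipeline could compare these scores before and after enhancement to decide whether super-resolution or denoising actually improved an image; whether the module uses them that way is not stated in the source.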
Important Classes¶
- GraphData: a Pydantic model representing graph data, containing nodes and edges
- NodeData: a Pydantic model representing node data within the graph
- EdgeData: a Pydantic model representing edge data within the graph
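To make the node-and-edge structure concrete, here is a rough sketch of how such graph data could be merged and traversed. Everything in it is an assumption: plain dataclasses stand in for the Pydantic models so the example has no third-party dependency, the field names (`id`, `label`, `source`, `target`) are hypothetical, and the `merge_graphs` and `dfs` bodies only illustrate what functions with those names commonly do, since their real behaviour is not clear from the code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass(frozen=True)
class NodeData:
    id: str          # hypothetical field
    label: str = ""  # hypothetical field


@dataclass(frozen=True)
class EdgeData:
    source: str  # hypothetical field: id of the origin node
    target: str  # hypothetical field: id of the destination node


@dataclass
class GraphData:
    nodes: List[NodeData] = field(default_factory=list)
    edges: List[EdgeData] = field(default_factory=list)


def merge_graphs(graphs: List[GraphData]) -> GraphData:
    """Union of nodes (by id) and edges (by endpoints), keeping first-seen order."""
    merged = GraphData()
    seen_nodes: Set[str] = set()
    seen_edges: Set[EdgeData] = set()
    for graph in graphs:
        for node in graph.nodes:
            if node.id not in seen_nodes:
                seen_nodes.add(node.id)
                merged.nodes.append(node)
        for edge in graph.edges:
            if edge not in seen_edges:
                seen_edges.add(edge)
                merged.edges.append(edge)
    return merged


def dfs(graph: GraphData, start: str) -> List[str]:
    """Iterative depth-first traversal; returns node ids in visit order."""
    adjacency: Dict[str, List[str]] = {}
    for edge in graph.edges:
        adjacency.setdefault(edge.source, []).append(edge.target)
    visited: List[str] = []
    seen: Set[str] = set()
    stack = [start]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        visited.append(current)
        # Push in reverse so neighbours are visited in their original edge order.
        stack.extend(reversed(adjacency.get(current, [])))
    return visited
```

Merging by node id is one plausible way to combine per-page extractions of the same process flow into a single graph; the actual merge key used by the module is unknown.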
System Fit¶
This file is part of the 3-Cubed Python system, which appears to be a comprehensive tool for processing and analyzing images and text data. This script specifically focuses on enhancing images and extracting process flow information using an LLM. The output of this script can be used as input for further analysis or processing within the system.