Chatbot Module¶
Total files documented: 19
chatbot/config.py¶
Purpose¶
This file, chatbot/config.py, is responsible for loading and managing configuration settings for the 3-Cubed Python system. It uses the pydantic library to define a Settings class that holds environment-specific configuration data.
Key Responsibilities¶
- Load configuration settings from environment variables and secret managers
- Define a Settings class to hold configuration data
- Initialize the OpenAI API key and database connections
- Provide access to configuration settings through the settings object
Important Functions¶
Not clear from code.
Important Classes¶
Settings¶
The Settings class is defined using pydantic.BaseSettings. It holds environment-specific configuration data, including:
- envi: the environment (e.g., 'dev')
- config: the cloud function URL (loaded from a secret manager)
The settings object is an instance of the Settings class, providing access to the configuration settings.
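The pattern described above, a typed settings class plus one shared instance, can be sketched roughly as follows. This is a minimal sketch that uses a standard-library dataclass as a stand-in for pydantic.BaseSettings. The environment variable names (ENVI, CONFIG_URL), the defaults, the example URL, and the load_settings helper are assumptions for illustration and are not taken from config.py.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Stand-in for the pydantic-based Settings class (field names follow the docs)."""

    envi: str    # environment name, e.g. 'dev'
    config: str  # cloud function URL; the real module loads it from a secret manager


def load_settings(environ=None) -> Settings:
    """Build Settings from a mapping of environment variables (hypothetical helper)."""
    env = os.environ if environ is None else environ
    return Settings(
        envi=env.get("ENVI", "dev"),       # assumed variable name and default
        config=env.get("CONFIG_URL", ""),  # assumed variable name
    )


# Module-level instance that other modules would import.
settings = load_settings({"ENVI": "dev", "CONFIG_URL": "https://example.invalid/fn"})
print(settings.envi, settings.config)
```

Passing the mapping explicitly keeps the sketch testable; in a real deployment the helper would fall back to os.environ.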
System Fit¶
This file fits into the wider 3-Cubed Python system as a configuration management module. It loads and manages configuration settings, which are then used by other modules in the system. The Settings class provides a centralized location for accessing configuration data, making it easier to manage and update settings across the system.
chatbot/crypt_.py¶
Purpose¶
This file, crypt_.py, provides cryptographic functionality for the 3-Cubed Python system. It includes classes for encryption (Crypt) and password hashing (Hash).
Key Responsibilities¶
- Provide encryption functionality using Fernet symmetric encryption
- Provide password hashing functionality using bcrypt
- Verify password hashes
Important Functions¶
- encrypt(message: str): Encrypts a message using Fernet symmetric encryption.
- decrypt(message: bytes): Decrypts a message using Fernet symmetric encryption.
- bcrypt(password: str): Hashes a password using bcrypt.
- verify_password(plain_password, hashed_password): Verifies a password against a hashed password.
Important Classes¶
- Crypt: A class for encryption using Fernet symmetric encryption.
  - __init__(key = config_data["CRYPT_KEY"]): Initializes the Crypt class with a key.
  - encrypt(message: str): Encrypts a message using Fernet symmetric encryption.
  - decrypt(message: bytes): Decrypts a message using Fernet symmetric encryption.
- Hash: A class for password hashing using bcrypt.
  - __init__(): Initializes the Hash class.
  - bcrypt(password: str): Hashes a password using bcrypt.
  - verify_password(plain_password, hashed_password): Verifies a password against a hashed password.
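The hash-then-verify flow of the Hash class can be illustrated with the standard library alone. The sketch below deliberately swaps bcrypt for PBKDF2-HMAC-SHA256 so that it runs without third-party packages; it is not the implementation in crypt_.py. The method name hash_password, the iteration count, and the salt$digest storage format are assumptions.

```python
import hashlib
import hmac
import os


class Hash:
    """Salted password hashing with PBKDF2-HMAC-SHA256 (a stand-in for bcrypt)."""

    ITERATIONS = 100_000  # illustrative work factor, not taken from crypt_.py

    @staticmethod
    def hash_password(password: str) -> str:
        # A fresh random salt makes equal passwords hash to different values.
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, Hash.ITERATIONS)
        return f"{salt.hex()}${digest.hex()}"

    @staticmethod
    def verify_password(plain_password: str, hashed_password: str) -> bool:
        salt_hex, digest_hex = hashed_password.split("$")
        candidate = hashlib.pbkdf2_hmac(
            "sha256", plain_password.encode(), bytes.fromhex(salt_hex), Hash.ITERATIONS
        )
        # Constant-time comparison avoids leaking where the digests differ.
        return hmac.compare_digest(candidate.hex(), digest_hex)


stored = Hash.hash_password("s3cret")
print(Hash.verify_password("s3cret", stored), Hash.verify_password("wrong", stored))
```

Only the salted hash is stored; verification recomputes the hash from the submitted password and the stored salt.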
System Fit¶
This file fits into the wider 3-Cubed Python system by providing a secure way to store and verify sensitive data, such as passwords and encryption keys. The Crypt class can be used to encrypt sensitive data, while the Hash class can be used to hash and verify passwords. The config_data dictionary is used to store sensitive configuration data, such as the encryption key.
chatbot/main1.py¶
Purpose¶
This file sets up the FastAPI application for the chatbot, enabling CORS and initializing the database schema.
Key Responsibilities¶
- Set up logging configuration
- Initialize the FastAPI application
- Enable CORS middleware
- Create the database schema
- Include the generative4 router
Important Functions¶
- logging.basicConfig(level=logging.INFO): Sets up the logging configuration for the application.
- app.add_middleware(CORSMiddleware, ...): Enables CORS middleware for the application.
- models.Base.metadata.create_all(engine): Creates the database schema.
- app.include_router(generative4.router): Includes the generative4 router in the application.
Important Classes¶
- FastAPI: The FastAPI application instance.
- CORSMiddleware: The CORS middleware instance.
System Fit¶
This file is part of the wider 3-Cubed Python system, which includes the database and router modules. The chatbot application is built on top of the FastAPI framework and uses the database module to interact with the database schema. The router module is used to handle incoming requests and route them to the appropriate handlers.
chatbot/authentication/authentication.py¶
Purpose¶
This file provides authentication functionality for the chatbot API using API key authentication.
Key Responsibilities¶
- Authenticate incoming requests using an API key
- Validate the API key against a stored encrypted key
- Raise an HTTP exception if the API key is invalid
Important Functions¶
- apiKeyAuthentication: The main authentication function. It takes an API key as input and returns True if the key is valid; otherwise it raises an HTTP exception.
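The return-True-or-raise contract described above can be sketched without FastAPI. In the sketch below, InvalidApiKey is a hypothetical stand-in for fastapi.HTTPException, and the expected key is passed in directly rather than decrypted from configuration; both are assumptions made to keep the example self-contained.

```python
import hmac


class InvalidApiKey(Exception):
    """Hypothetical stand-in for an HTTP 401 error raised by the real dependency."""

    status_code = 401


def api_key_authentication(api_key: str, expected_key: str) -> bool:
    """Return True when the supplied key matches; raise InvalidApiKey otherwise."""
    # compare_digest runs in constant time, so timing does not reveal
    # how many leading characters of the key were correct.
    if not hmac.compare_digest(api_key.encode(), expected_key.encode()):
        raise InvalidApiKey("Invalid API key")
    return True


print(api_key_authentication("abc123", "abc123"))
```

Because the function raises instead of returning False, a route that depends on it never runs its handler for an unauthenticated request.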
Important Classes¶
None
System Fit¶
This file is part of the chatbot's authentication module and is used to secure API endpoints. It is designed to work with the FastAPI framework and is likely used in conjunction with other authentication mechanisms to provide a robust security system. The Crypt object is used to decrypt the stored API key, and the config_data object is used to retrieve the encrypted API key from the configuration.
chatbot/authentication/__init__.py¶
Purpose¶
Unable to read this file content.
Extracted Functions¶
None found.
Extracted Classes¶
None found.
chatbot/constants/constants1.py¶
Purpose¶
This file, constants1.py, contains various constants and templates used throughout the 3-Cubed Python system. It defines tokenization patterns, templates for chatbot responses, and lists of supported Freshdesk and ProductHelp URLs.
Key Responsibilities¶
- Define tokenization patterns for text processing
- Provide templates for chatbot responses
- Store lists of supported URLs for Freshdesk and ProductHelp
Important Functions¶
None
Important Classes¶
None
System Fit¶
This file is part of the 3-Cubed Python system, which appears to be a chatbot and machine learning (ML) system. The constants and templates defined in this file are used to support the chatbot's functionality, including tokenization, response generation, and URL validation. The supported URLs are likely used to provide relevant information to the chatbot and its users.
Notes¶
- The TOKENIZE_PATTERN is used to split text into individual tokens.
- The template variable defines a basic template for chatbot responses.
- The verify_fix_template variable defines a template for validating AI responses.
- The SUPPORTED_FRESHDESK_URLS and SUPPORTED_PRODUCTHELP_URLS lists contain URLs that are supported by the system.
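To illustrate how a tokenization pattern of this kind is typically applied, here is a small sketch. The regular expression shown is hypothetical (the actual value of TOKENIZE_PATTERN is not given in this documentation), and the tokenize helper is likewise an assumption.

```python
import re

# Hypothetical pattern: runs of word characters, or single punctuation marks.
TOKENIZE_PATTERN = re.compile(r"\w+|[^\w\s]")


def tokenize(text: str) -> list:
    """Split text into tokens using the compiled pattern."""
    return TOKENIZE_PATTERN.findall(text)


tokens = tokenize("Reset my password, please!")
print(tokens)
```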
chatbot/constants/__init__.py¶
Purpose¶
Unable to read this file content.
Extracted Functions¶
None found.
Extracted Classes¶
None found.
chatbot/database/database.py¶
Purpose¶
This file, database.py, is responsible for establishing a connection to the database and providing a way to retrieve a database session. It uses the SQLAlchemy library to interact with the database.
Key Responsibilities¶
- Establish a connection to the database using the configuration data from config.py.
- Create a scoped session maker to manage database connections.
- Provide a function to retrieve a database session (get_db).
Important Functions¶
get_db¶
This function returns a database session. It:
* Creates a new database session using the scoped session maker.
* Sets autoflush to False to prevent automatic flushing of the session.
* Yields the database session to the caller.
* Catches and handles any operational errors or exceptions that occur during the database connection.
* Rolls back the database session in case of an error and closes it in the finally block.
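The steps above follow the usual generator-based session pattern. The sketch below shows that control flow with a FakeSession class standing in for a SQLAlchemy session, so it runs without a database. FakeSession, the session_factory parameter, and the choice to re-raise after rolling back are assumptions; the real get_db may handle errors differently.

```python
class FakeSession:
    """Minimal stand-in for a SQLAlchemy session; it records the calls it receives."""

    def __init__(self):
        self.autoflush = True
        self.calls = []

    def rollback(self):
        self.calls.append("rollback")

    def close(self):
        self.calls.append("close")


def get_db(session_factory=FakeSession):
    """Yield a session, roll back on error, and always close it."""
    db = session_factory()
    db.autoflush = False  # do not flush pending changes automatically
    try:
        yield db
    except Exception:
        db.rollback()  # undo partial work before propagating the error
        raise
    finally:
        db.close()     # runs on success and on failure


gen = get_db()
session = next(gen)  # the caller receives the session here
try:
    gen.throw(RuntimeError("query failed"))  # simulate an error in the caller
except RuntimeError:
    pass
print(session.calls)
```

The finally block is what guarantees the session is closed on both the success and the error path.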
Important Classes¶
None.
System Fit¶
This file fits into the wider 3-Cubed Python system as part of the data access layer. It provides a way to interact with the database, which is used by other components of the system to store and retrieve data. The get_db function is likely used by other parts of the system to retrieve a database session, which is then used to perform database operations.
chatbot/database/models.py¶
Purpose¶
This file defines database models for storing and retrieving data related to chatbot interactions and collection records. It uses SQLAlchemy to interact with the database.
Key Responsibilities¶
- Define database tables for storing chat history, collection records, and question suggestions
- Establish relationships between tables using foreign keys
- Provide a foundation for data storage and retrieval in the chatbot system
Important Functions¶
None
Important Classes¶
- CollectionRecords: Represents a collection of records, storing information such as URL, collection name, unique ID, chunk size, and creation timestamp.
- ChatHistory: Stores chat interactions, including chat ID, input, output, source, and creation timestamp.
- QuestionSuggestion: Stores question suggestions, including sources and the suggested question.
System Fit¶
This file fits into the wider 3-Cubed Python system by providing a data storage layer for the chatbot. It interacts with the database using SQLAlchemy, allowing for efficient data retrieval and storage. The models defined in this file are used to store and retrieve data related to chatbot interactions and collection records, enabling the chatbot to function effectively.
chatbot/database/__init__.py¶
Purpose¶
Unable to read this file content.
Extracted Functions¶
None found.
Extracted Classes¶
None found.
chatbot/helpers/scraping.py¶
Purpose¶
This file, scraping.py, contains a collection of helper functions for web scraping tasks. It provides functionality for checking and normalizing URLs, extracting links and content from web pages, and cleaning scraped data.
Key Responsibilities¶
- Normalizing URLs to ensure they start with a protocol prefix
- Extracting links from web pages based on specific tags and classes
- Cleaning scraped content to remove unnecessary characters and whitespace
- Extracting text content from web pages
- Extracting text from SVG elements
- Extracting title metadata from web pages
Important Functions¶
- check_url(url: str): Normalizes a URL by adding a protocol prefix if necessary.
- get_soup_obj(url: str): Sends a GET request to the provided URL and returns a BeautifulSoup object representing the HTML content.
- get_all_links(url: str, tag_and_class_name: dict = None, limit: int = 30): Extracts all links from a web page based on specific tags and classes, with an optional limit on the number of links returned.
- clean_content(content: str): Removes unnecessary characters and whitespace from scraped content.
- get_link_content(url: str): Extracts the text content from a web page.
- svg_extraction(content: str): Extracts text from SVG elements within the provided content.
- get_link_title(url: str): Extracts the title metadata from a web page.
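Two of the helpers listed above, check_url and clean_content, are simple enough to sketch with the standard library. The exact rules are assumptions: this version defaults to https:// when no scheme is present and treats non-printable characters as the "unnecessary characters"; scraping.py may choose differently.

```python
import re


def check_url(url: str) -> str:
    """Prefix the URL with https:// when it has no http(s) scheme (assumed default)."""
    url = url.strip()
    if not re.match(r"^https?://", url, flags=re.IGNORECASE):
        url = "https://" + url
    return url


def clean_content(content: str) -> str:
    """Drop non-printable characters and collapse runs of whitespace."""
    content = "".join(ch for ch in content if ch.isprintable() or ch.isspace())
    return re.sub(r"\s+", " ", content).strip()


print(check_url("example.com/docs"))
print(clean_content("  Hello\n\n\tworld\x00 "))
```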
Important Classes¶
None
System Fit¶
This file is part of the 3-Cubed Python system, which utilizes FastAPI and ML components. The scraping.py file provides a set of reusable functions for web scraping tasks, which can be used throughout the system to extract data from web pages. The extracted data can then be processed and used for various purposes, such as training machine learning models or populating a database.
chatbot/routers/generative4.py¶
Purpose¶
This file, generative4.py, is a FastAPI router that handles API endpoints for generating and processing links, extracting link metadata, and ranking context based on similarity.
Key Responsibilities¶
- Handles API endpoints for generating and processing links
- Extracts link metadata and ranks context based on similarity
- Utilizes various libraries and services, including LangChain, Whisper, and PostgreSQL
Important Functions¶
- rerank_by_similarity(query_embedding, context_list, top_k=10): Ranks context based on similarity to a query embedding.
- get_links(db: Session = Depends(database.get_db), _ = Depends(apiKeyAuthentication)): Retrieves a list of links from the database.
- process_link(url: str, db: Session = Depends(database.get_db), _ = Depends(apiKeyAuthentication)): Processes a link by checking if it exists in the database, scraping metadata, and ranking context.
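Ranking context by similarity to a query embedding can be sketched as a sort on cosine similarity. This is a minimal sketch in pure Python. The shape of each context item (a dict with "text" and "embedding" keys) is an assumption, as is the use of cosine similarity as the metric in this router.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 for zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rerank_by_similarity(query_embedding, context_list, top_k=10):
    """Return the top_k context items, most similar to the query first."""
    ranked = sorted(
        context_list,
        key=lambda item: cosine_similarity(query_embedding, item["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


contexts = [
    {"text": "a", "embedding": [1.0, 0.0]},
    {"text": "b", "embedding": [0.0, 1.0]},
    {"text": "c", "embedding": [1.0, 1.0]},
]
top = rerank_by_similarity([1.0, 0.1], contexts, top_k=2)
print([item["text"] for item in top])
```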
Important Classes¶
None
System Fit¶
This file fits into the wider 3-Cubed Python system as a key component of the chatbot's API endpoints. It interacts with the database, utilizes various libraries and services, and provides functionality for generating and processing links, extracting link metadata, and ranking context based on similarity.
chatbot/routers/__init__.py¶
Purpose¶
Unable to read this file content.
Extracted Functions¶
None found.
Extracted Classes¶
None found.
chatbot/utilities/crawl.py¶
Purpose¶
This file, crawl.py, contains a class-based implementation for web crawling. It uses the requests and BeautifulSoup libraries to fetch and parse web pages, extracting links and content.
Key Responsibilities¶
- Fetch web pages using requests
- Parse web pages using BeautifulSoup
- Extract links and content from web pages
- Handle pagination and limit the number of links and content extracted
Important Functions¶
- get_links(url, div_class=None): Extracts links from a web page, optionally filtering by a specific div class.
- get_link_content(links): Extracts content from a list of links, cleaning and processing the text.
- get_all_site_content(): Fetches and extracts content from all links on a website, joining the content into a single string.
Important Classes¶
Crawler: The main class for web crawling, responsible for initializing the crawler with a URL, setting headers, and providing methods for link extraction and content processing.
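Link extraction with an optional div class filter can be illustrated as follows. The sketch uses the standard-library html.parser in place of requests and BeautifulSoup, and takes the HTML as a string instead of fetching it. LinkParser, the base_url and limit parameters, and the de-duplication step are assumptions rather than details of the Crawler class.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect href values, optionally only those inside <div class="...">."""

    def __init__(self, base_url, div_class=None):
        super().__init__()
        self.base_url = base_url
        self.div_class = div_class
        self.depth = 0  # nesting depth inside a matching div
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            # Start counting at a matching div and keep counting nested divs.
            if self.depth or (self.div_class and self.div_class in classes):
                self.depth += 1
        elif tag == "a" and attrs.get("href"):
            if self.div_class is None or self.depth:
                self.links.append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1


def get_links(html, base_url, div_class=None, limit=30):
    """Return absolute, de-duplicated links in document order, capped at limit."""
    parser = LinkParser(base_url, div_class)
    parser.feed(html)
    return list(dict.fromkeys(parser.links))[:limit]


page = (
    '<div class="nav"><a href="/home">Home</a></div>'
    '<div class="content main"><div><a href="/a">A</a></div>'
    '<a href="https://other.example/b">B</a><a href="/a">A again</a></div>'
)
print(get_links(page, "https://example.com", div_class="content"))
```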
System Fit¶
This file fits into the wider 3-Cubed Python system as part of the chatbot module, providing a utility for web crawling and content extraction. It can be used to fetch and process content from websites, which can then be used to train machine learning models or populate a chatbot's knowledge base.
chatbot/utilities/generic1.py¶
Purpose¶
This file, generic1.py, contains utility functions and classes for text processing, OpenAI API interactions, and database operations. It provides functionality for text splitting, chunking, embedding, and question generation.
Key Responsibilities¶
- Text processing:
  - Splitting text into chunks
  - Converting text to JSON serializable format
- OpenAI API interactions:
  - Generating embeddings for text
  - Creating chat completions
- Database operations:
  - Adding question suggestions to the database
- Utility functions:
  - Processing chunks and metadata
  - Generating background templates for question generation
Important Functions¶
- convert_to_json_serializable(data): Converts data to a JSON serializable format by replacing infinite and NaN values with None.
- process_chunks(chunks, metadata): Processes chunks and metadata by creating unique IDs and updating metadata.
- get_embedding(text_to_embed, retries=3): Generates an embedding for the provided text using the OpenAI API.
- background_template_generation(question, sources, db, unique_id): Generates a background template for question generation using the OpenAI API and adds a question suggestion to the database.
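Replacing infinite and NaN values with None matters because strict JSON has no representation for them. A minimal sketch of that conversion is shown below; the recursive handling of dicts, lists, and tuples is an assumption about which container types the real function supports.

```python
import json
import math


def convert_to_json_serializable(data):
    """Recursively replace NaN and +/-inf floats with None."""
    if isinstance(data, float) and (math.isnan(data) or math.isinf(data)):
        return None
    if isinstance(data, dict):
        return {key: convert_to_json_serializable(value) for key, value in data.items()}
    if isinstance(data, (list, tuple)):
        return [convert_to_json_serializable(item) for item in data]
    return data


payload = {"score": float("nan"), "values": [1.0, float("inf")], "name": "chunk"}
# allow_nan=False makes json.dumps raise if a NaN or infinity slipped through.
encoded = json.dumps(convert_to_json_serializable(payload), allow_nan=False)
print(encoded)
```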
Important Classes¶
TextProcessor: A class responsible for text processing tasks, including splitting text into chunks and generating embeddings.
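Splitting text into chunks is commonly done with a fixed window that slides over the text. The sketch below shows one such scheme. The split_text name, the character-based window, and the overlap parameter are assumptions; TextProcessor may split on tokens or sentences instead.

```python
def split_text(text: str, chunk_size: int = 200, overlap: int = 20) -> list:
    """Split text into chunks of chunk_size characters that share overlap characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


chunks = split_text("abcdefghij", chunk_size=4, overlap=2)
print(chunks)
```

The overlap keeps a little shared context between neighbouring chunks, which can help when each chunk is embedded independently.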
System Fit¶
This file is part of the wider 3-Cubed Python system, which appears to be a chatbot or conversational AI system. The functions and classes in this file are designed to support the generation of questions and embeddings for text, as well as database operations for storing question suggestions. The OpenAI API is used for generating embeddings and chat completions, and the TextProcessor class provides a convenient interface for text processing tasks.
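The retries=3 parameter of get_embedding suggests a retry loop around the API call. A rough sketch of that pattern follows, with the API call injected as a function so the example runs offline. The embed_fn and delay parameters, the linear backoff, and the flaky_embed test double are assumptions and do not reflect the OpenAI client.

```python
import time


def get_embedding(text_to_embed, embed_fn, retries=3, delay=0.0):
    """Call embed_fn up to `retries` times and return the first successful result."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return embed_fn(text_to_embed)
        except Exception as exc:  # a real client would catch narrower error types
            last_error = exc
            if attempt < retries:
                time.sleep(delay * attempt)  # simple linear backoff
    raise RuntimeError(f"embedding failed after {retries} attempts") from last_error


calls = {"n": 0}


def flaky_embed(text):
    """Hypothetical embedding call that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary failure")
    return [float(len(text)), 0.0]


vector = get_embedding("hello", flaky_embed)
print(vector, calls["n"])
```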
chatbot/utilities/__init__.py¶
Purpose¶
Unable to read this file content.
Extracted Functions¶
None found.
Extracted Classes¶
None found.
chatbot/vectordb/vector_database.py¶
Vector Database Module¶
Purpose¶
This module provides a vector database implementation using PostgreSQL and SQLAlchemy. It allows for inserting, deleting, and searching vectors in the database.
Key Responsibilities¶
- Connect to a PostgreSQL database using SQLAlchemy
- Define a VectorRecord class to represent a vector in the database
- Implement a PgVectorDB class to manage vector operations
- Provide methods for inserting, deleting, and searching vectors
Important Functions¶
insert(data_to_insert)¶
Inserts a new vector into the database. The data_to_insert parameter should contain the vector data and metadata.
delete_chunks(uid, chunk_size)¶
Deletes chunks of vectors from the database based on the provided uid and chunk_size. The uid is used to identify the chunks to delete.
similarity_search(question_vector, top_k=10)¶
Performs a similarity search on the stored vectors in the database. It calculates the cosine similarity between the question_vector and each stored vector, and returns the top top_k results.
Important Classes¶
VectorRecord¶
Represents a vector in the database. It has two attributes: vector_data and meta_data.
PgVectorDB¶
Manages vector operations in the database. It provides methods for inserting, deleting, and searching vectors.
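The insert, delete, and search interface can be illustrated with an in-memory stand-in that needs no PostgreSQL server. InMemoryVectorDB below is hypothetical and is not the PgVectorDB implementation. The shape of data_to_insert (a list of dicts) and the chunk ID scheme used by delete_chunks (uid followed by an index) are assumptions.

```python
import math
from dataclasses import dataclass, field


@dataclass
class VectorRecord:
    vector_data: list                             # array of floats
    meta_data: dict = field(default_factory=dict)  # JSON-like metadata


def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


class InMemoryVectorDB:
    """Hypothetical in-memory stand-in mirroring the documented method names."""

    def __init__(self):
        self.records = []

    def insert(self, data_to_insert):
        for item in data_to_insert:
            self.records.append(VectorRecord(item["vector_data"], item["meta_data"]))

    def delete_chunks(self, uid, chunk_size):
        # Assumed ID scheme: chunk i of document uid has meta_data["id"] == f"{uid}_{i}".
        doomed = {f"{uid}_{i}" for i in range(chunk_size)}
        self.records = [r for r in self.records if r.meta_data.get("id") not in doomed]

    def similarity_search(self, question_vector, top_k=10):
        ranked = sorted(
            self.records,
            key=lambda r: _cosine(question_vector, r.vector_data),
            reverse=True,
        )
        return ranked[:top_k]


db = InMemoryVectorDB()
db.insert([
    {"vector_data": [1.0, 0.0], "meta_data": {"id": "doc_0"}},
    {"vector_data": [0.0, 1.0], "meta_data": {"id": "doc_1"}},
    {"vector_data": [0.9, 0.1], "meta_data": {"id": "other_0"}},
])
hits = db.similarity_search([1.0, 0.0], top_k=2)
print([hit.meta_data["id"] for hit in hits])
```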
System Fit¶
This module fits into the wider 3-Cubed Python system as a component of the CrewAI framework. It provides a vector database implementation that can be used by other components to store and retrieve vectors.
Notes¶
- The DATABASE_URL environment variable should be set to the URL of the PostgreSQL database.
- The vector_data attribute of the VectorRecord class is stored as an array of floats.
- The meta_data attribute of the VectorRecord class is stored as a JSON object.
- The similarity_search method uses the cosine similarity metric to calculate the similarity between vectors.
chatbot/vectordb/vector_database1.py¶
Purpose¶
This file, vector_database1.py, provides a PostgreSQL database interface for storing and querying vector data using the CrewAI system. It utilizes the FastAPI framework and the SciPy library for vector similarity calculations.
Key Responsibilities¶
- Store vector data in a PostgreSQL database
- Provide methods for inserting, deleting, and querying vector data
- Perform similarity searches using cosine similarity
Important Functions¶
- insert(data_to_insert): Inserts a new vector record into the database.
- delete_chunks(uid, chunk_size): Deletes a specified number of records from the database based on the provided uid and chunk_size.
- similarity_search(question_vector, top_k=10): Performs a similarity search on the stored vector data using the provided question_vector and returns the top top_k results.
Important Classes¶
- VectorRecord: Represents a single vector record stored in the database. It contains the vector data and metadata.
- PgVectorDB: Provides an interface to the PostgreSQL database for storing and querying vector data.
System Fit¶
This file fits into the wider 3-Cubed Python system as a data storage and retrieval component. It allows the system to store and query vector data, which can be used for various machine learning and natural language processing tasks. The PgVectorDB class can be used in conjunction with other components, such as the FastAPI API, to provide a robust and scalable vector database solution.
chatbot/vectordb/__init__.py¶
Purpose¶
Unable to read this file content.
Extracted Functions¶
None found.
Extracted Classes¶
None found.