ML Module¶
Total files documented: 7
ml/cleantxt.py¶
Purpose¶
This file, cleantxt.py, contains a function to clean and preprocess text data for machine learning models. It removes punctuation, numbers, and converts text to lowercase.
Key Responsibilities¶
- Clean and preprocess text data for machine learning models
- Remove punctuation and numbers from text
- Convert text to lowercase
Important Functions¶
clean_text(text)¶
This function takes a string of text and returns the cleaned text. It performs the following operations:
- Converts the text to lowercase
- Removes punctuation using regular expressions
- Removes numbers using regular expressions
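The operations above can be sketched as follows. This is a minimal illustration, not the exact implementation in cleantxt.py, whose regular expressions may differ:

```python
import re
import string

def clean_text(text):
    """Lowercase the text, then strip punctuation and digits.

    A sketch of the behaviour described above; the exact patterns
    used in cleantxt.py may differ.
    """
    text = text.lower()
    # Remove punctuation characters.
    text = re.sub(f"[{re.escape(string.punctuation)}]", "", text)
    # Remove digits.
    text = re.sub(r"\d+", "", text)
    return text
```

For example, `clean_text("Hello, World!")` yields `"hello world"`.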
Important Classes¶
None
System Fit¶
This file is part of the wider 3-Cubed Python system, which utilizes FastAPI and CrewAI for machine learning tasks. The cleantxt.py file is likely used as a utility module to preprocess text data before it is fed into machine learning models. The cleaned text can then be used for tasks such as natural language processing, text classification, or sentiment analysis.
ml/db.py¶
Purpose¶
This file, db.py, provides a set of functions for interacting with a database using the pyodbc library. It establishes a connection to the database, fetches forms, rules, and other data based on specific process types.
Key Responsibilities¶
- Establish a connection to the database using pyodbc.
- Fetch forms from the mstRef_Forms table based on process types.
- Fetch rules from the mstRef_BusinessRules table based on process types.
- Handle database connection errors and exceptions.
Important Functions¶
get_db_connection()¶
Establishes a connection to the database using pyodbc. It takes no arguments and returns a database connection object.
fetch_forms(process_type: str) -> list¶
Fetches forms from the mstRef_Forms table based on process types. It takes a comma-separated list of process types as input and returns a list of form names.
fetch_rules(process_type: str) -> list¶
Fetches rules from the mstRef_BusinessRules table based on process types. It takes a comma-separated list of process types as input and returns a list of rule names.
System Fit¶
This file fits into the wider 3-Cubed Python system as a utility for interacting with the database. It provides a set of functions that can be used by other parts of the system to fetch data from the database. The functions are designed to be reusable and can be easily integrated into other components of the system.
Notes¶
- The database connection string is hardcoded in the get_db_connection() function. This may not be suitable for production environments, where database credentials should be kept secure.
- The fetch_forms() and fetch_rules() functions use dynamic SQL to construct the query conditions based on the input process types. This may introduce security risks if not properly sanitized.
- The functions do not handle pagination or limit the number of results returned from the database, which may lead to performance issues if the database contains a large number of records.
ml/ner.py¶
Purpose¶
This file provides a function for removing company names from input text using spaCy's Named Entity Recognition (NER) model. It also includes a custom dictionary for fallback cases where spaCy's model is unable to detect company names.
Key Responsibilities¶
- Load spaCy's NER model for English language
- Define a function to remove company names from input text
- Provide a custom dictionary for fallback company name detection
Important Functions¶
remove_company_names(text: str) -> str: Removes company names (ORG entities) from the input text using spaCy's NER. It includes case normalization and fallback to a custom dictionary.
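The combination of NER removal and dictionary fallback can be sketched as below. The entries in custom_companies here are invented examples; the real list in ner.py will differ. The nlp parameter stands in for the loaded en_core_web_sm pipeline (spacy.load("en_core_web_sm")) and is optional so the fallback path can run without the model:

```python
import re

# Hypothetical fallback entries; the real custom_companies list
# in ner.py contains its own names.
custom_companies = ["Acme Corp", "Globex"]

def remove_company_names(text: str, nlp=None) -> str:
    """Strip ORG entities found by spaCy, then apply the custom
    dictionary as a case-insensitive regex fallback."""
    if nlp is not None:
        # Remove every span the NER model labels as an organization.
        for ent in nlp(text).ents:
            if ent.label_ == "ORG":
                text = text.replace(ent.text, "")
    # Fallback: case-insensitive replacement from the custom dictionary.
    for name in custom_companies:
        text = re.sub(re.escape(name), "", text, flags=re.IGNORECASE)
    # Collapse the double spaces left behind by removals.
    return re.sub(r"\s{2,}", " ", text).strip()
```

With the fallback alone, `remove_company_names("Report from acme corp today")` yields `"Report from today"`.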
Important Classes¶
None
System Fit¶
This file is part of the wider 3-Cubed Python system, which utilizes spaCy's NER model for text processing. The remove_company_names function can be used in various applications, such as data preprocessing, text analysis, or content generation. It is designed to be a reusable component that can be easily integrated into other parts of the system.
Notes¶
- The en_core_web_sm spaCy model is used for NER; it is a small English model suitable for general-purpose text processing.
- The custom dictionary custom_companies contains a list of company names that are not detected by spaCy's model. This dictionary can be extended or modified as needed.
- The remove_company_names function preserves the original text for case-sensitive replacement and uses regular expressions to replace detected company names.
ml/suggesttools.py¶
Purpose¶
This file, suggesttools.py, provides a set of custom tools for predicting forms and rules used in process activities, considering the predicted product. These tools utilize a large language model (LLM) to generate suggestions based on input prompts.
Key Responsibilities¶
- Predict forms used in a process activity, considering the predicted product.
- Predict rules used in a process activity, considering the predicted product.
- Suggest forms used in a process activity, considering the predicted product and available forms.
- Utilize a large language model (LLM) to generate suggestions based on input prompts.
Important Functions¶
predict_forms_tool¶
- Purpose: Predict forms used in a process activity, considering the predicted product.
- Input: process_type, activity_name, product, and an LLM instance.
- Output: A list of predicted forms.
predict_rules_tool¶
- Purpose: Predict rules used in a process activity, considering the predicted product.
- Input: process_type, activity_name, product, and an LLM instance.
- Output: A list of predicted rules, each with a skill level.
suggest_forms¶
- Purpose: Suggest forms used in a process activity, considering the predicted product and available forms.
- Input: process_type, activity_name, product, and an LLM instance.
- Output: A list of suggested forms, each with a mode.
System Fit¶
This file fits into the wider 3-Cubed Python system by providing a set of custom tools for predicting forms and rules used in process activities. These tools can be integrated with other components of the system to provide a comprehensive solution for process-based form and rule classification. The file utilizes the crewai library for interacting with the LLM and the db module for fetching forms from the database.
ml/aht/cleantxt.py¶
Purpose¶
This file, cleantxt.py, contains a function to clean and preprocess text data for machine learning models. It removes punctuation, converts text to lowercase, and removes numbers.
Key Responsibilities¶
- Clean and preprocess text data
- Remove punctuation and numbers from text
- Convert text to lowercase
Important Functions¶
clean_text(text)¶
This function takes in a string of text and returns the cleaned text. It uses regular expressions to remove punctuation and numbers, and converts the text to lowercase.
System Fit¶
This file is part of the 3-Cubed Python system, specifically within the ml/aht module. It is designed to be used in conjunction with machine learning models to preprocess text data before feeding it into the models. The clean_text function can be used as a preprocessing step in various machine learning pipelines within the system.
ml/ml_notebooks/cleantxt.py¶
Purpose¶
This file, cleantxt.py, contains functions for cleaning text data. It removes punctuation, converts text to lowercase, and removes digits from the input text.
Key Responsibilities¶
- Clean text data by removing punctuation and digits
- Convert text to lowercase
Important Functions¶
clean_text(text)¶
Removes punctuation and digits from the input text and converts it to lowercase. This function is a simple text cleaning utility.
clean_text1(text)¶
This function is identical to clean_text(text). It is unclear why there are two identical functions in this file.
Important Classes¶
None
System Fit¶
This file is part of the ML (Machine Learning) component of the 3-Cubed Python system. It is likely used as a utility function to preprocess text data before it is fed into machine learning models. The cleaned text data can then be used for tasks such as text classification, sentiment analysis, or topic modeling.
ml/ml_notebooks/main.py¶
Purpose¶
This file, main.py, is the main entry point for the 3-Cubed(ML) FastAPI application. It handles API requests for making NVA (Not Vital Activity) predictions using a machine learning model.
Key Responsibilities¶
- Load and verify API keys
- Load and initialize machine learning models for NVA predictions
- Handle API requests for making NVA predictions
- Return predictions in a standardized format
Important Functions¶
make_nva_predictions(text_inputs): This function takes a list of NVATextInput objects as input and returns a list of NVAPredictionResponse objects containing the predicted NVA labels and confidence scores.
Predict_Nva(text_inputs, api_key): This is the API endpoint for making NVA predictions. It takes a list of NVATextInput objects and an API key as input, verifies the API key, and calls the make_nva_predictions function to generate the predictions.
Important Classes¶
NVATextInput: This is a Pydantic model representing the input data for making NVA predictions. It has attributes for id, industry, activity_name, and systems_and_applications.
NVAPredictionResponse: This is a Pydantic model representing the output data for NVA predictions. It has attributes for id, nva, and confidence.
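The prediction flow can be sketched as below. Plain NamedTuples stand in for the Pydantic models so the sketch has no framework dependency, and the model argument stands in for the pickled classifier loaded from NVA_Type_pkl; the predict_proba/classes_ interface is an assumed scikit-learn-style API, and the feature-string construction is illustrative:

```python
from typing import List, NamedTuple

class NVATextInput(NamedTuple):
    # Mirrors the Pydantic request model described above.
    id: int
    industry: str
    activity_name: str
    systems_and_applications: str

class NVAPredictionResponse(NamedTuple):
    # Mirrors the Pydantic response model described above.
    id: int
    nva: str
    confidence: float

def make_nva_predictions(text_inputs: List[NVATextInput], model) -> List[NVAPredictionResponse]:
    responses = []
    for item in text_inputs:
        # Combine the text fields into a single feature string (illustrative).
        text = f"{item.industry} {item.activity_name} {item.systems_and_applications}"
        # Pick the highest-probability class and report its probability.
        probs = model.predict_proba([text])[0]
        best = max(range(len(probs)), key=probs.__getitem__)
        responses.append(NVAPredictionResponse(item.id, model.classes_[best], float(probs[best])))
    return responses
```

The Predict_Nva endpoint would verify the caller's API key against api_keys.json before delegating to this function.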
System Fit¶
This file fits into the wider 3-Cubed Python system as the main entry point for the machine learning API. It relies on the api_keys.json file for API key verification and the NVA_Type_pkl directory for loading the machine learning models. The predictions generated by this file can be used as input for other components of the system.