pyexamgenerator package

Submodules

pyexamgenerator.exam_generator module

class pyexamgenerator.exam_generator.ExamGenerator

Bases: object

A class to generate exams from a question bank stored in an Excel file. It can create multiple versions of an exam, shuffle questions and answers, and export to various formats like DOCX and Moodle XML.

add_page_number(document: Document) → Document

Adds page numbers to the footer of a Word document using direct docx XML manipulation. The footer text has the form «Página X de Y» ("Page X of Y").

Parameters: document (Document) – The docx Document object to which page numbers will be added.
Returns: The Document object with the added page numbers.
Return type: Document

generate_exam_from_excel(bank_excel_path: str, output_dir: str | None = None, exam_names: list | None = None, questions_per_topic: dict | None = None, selection_method: str = 'azar', subject: str | None = None, exam: str | None = None, course: str | None = None, num_exams: int = 2, top_margin: float = 1, bottom_margin: float = 1, left_margin: float = 0.5, right_margin: float = 0.5, export_moodle_xml: bool = False, font_size: int = 9, xml_cat_additional_text: str | None = None, penalty: int = -25, check: bool = False, update_excel: bool = False, answer_sheet_instructions: str | None = None, verbose: bool = False)

Generates multiple exams in .docx format with varied question and answer orders. This is the main orchestrator method of the class.

Parameters:

bank_excel_path (str) – Path to the Excel file with the question bank.
output_dir (Optional[str]) – The directory to save the generated exam files. If None, uses the same directory as the bank_excel_path.
exam_names (Optional[list]) – A list of names for the exam versions (e.g., [“1A”, “1B”]).
questions_per_topic (Optional[dict]) – A dictionary specifying how many questions to select from each topic.
selection_method (str) – Method for selecting questions: “azar” (random), “primeras” (first N), “menos usadas” (least used).
subject (Optional[str]) – The subject name for the exam header.
exam (Optional[str]) – The exam name (e.g., “Parcial 1”).
course (Optional[str]) – The course name or year (e.g., “24-25”).
num_exams (int) – The number of different exam versions to generate if exam_names is not provided.
top_margin (float) – Top page margin in inches.
bottom_margin (float) – Bottom page margin in inches.
left_margin (float) – Left page margin in inches.
right_margin (float) – Right page margin in inches.
export_moodle_xml (bool) – If True, exports the exam to Moodle XML format.
font_size (int) – Font size for the text in the document.
xml_cat_additional_text (Optional[str]) – Additional text for the Moodle XML category.
penalty (int) – Penalty for incorrect answers in the Moodle XML export.
check (bool) – If True, generates a preview and waits for user confirmation before creating all exams.
update_excel (bool) – If True, updates the source Excel file with usage statistics.
answer_sheet_instructions (Optional[str]) – Text with instructions for the answer sheet.
verbose (bool) – If True, prints detailed progress messages to the console.
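
The three selection_method values can be pictured with a minimal, standard-library sketch. It is not the library's implementation: the function select_questions, the "uses" counter, and the tie-breaking rule for “menos usadas” are assumptions made for illustration only.

```python
import random


def select_questions(questions, n, method="azar", rng=None):
    """Pick n questions from a list of dicts using one of three strategies."""
    rng = rng or random.Random()
    if method == "azar":
        # Random sample without replacement.
        return rng.sample(questions, min(n, len(questions)))
    if method == "primeras":
        # First n questions in bank order.
        return questions[:n]
    if method == "menos usadas":
        # Least used first; sorted() is stable, so ties keep bank order.
        # The "uses" key is a hypothetical usage counter.
        return sorted(questions, key=lambda q: q["uses"])[:n]
    raise ValueError(f"unknown selection method: {method!r}")


bank = [
    {"id": 1, "uses": 3},
    {"id": 2, "uses": 0},
    {"id": 3, "uses": 1},
    {"id": 4, "uses": 0},
]
first_two = select_questions(bank, 2, "primeras")
least_used = select_questions(bank, 2, "menos usadas")
random_two = select_questions(bank, 2, "azar", rng=random.Random(0))
```

Passing a seeded random.Random instance keeps the random strategy reproducible, which can help when regenerating the same exam versions.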

generate_moodle_xml(df: DataFrame, file_name: str, num_total_questions: int, exam: str | None = None, exam_type: str | None = None, xml_cat_additional_text: str | None = None, penalty: int = -25)

Generates a Moodle XML file from a DataFrame, including category information and correctly handling shuffled answers.

Parameters:

df (pd.DataFrame) – The DataFrame containing the questions and answers.
file_name (str) – The name of the XML file to generate.
num_total_questions (int) – The total number of questions.
exam (Optional[str], optional) – The name of the exam.
exam_type (Optional[str], optional) – The type of exam.
xml_cat_additional_text (Optional[str], optional) – Additional text for the Moodle XML category.
penalty (int, optional) – The penalty percentage for an incorrect answer (e.g., -25 for 25%).
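
To show how a penalty of -25 could appear in Moodle XML, the sketch below builds one multichoice question with the standard library. The element layout follows the general Moodle XML multichoice format; the helper multichoice_question and the way the penalty is mapped onto the answer fraction are assumptions, not a description of what generate_moodle_xml actually writes.

```python
import xml.etree.ElementTree as ET


def multichoice_question(name, text, answers, correct_index, penalty=-25):
    """Build one Moodle XML multichoice <question> element (illustrative)."""
    q = ET.Element("question", type="multichoice")
    ET.SubElement(ET.SubElement(q, "name"), "text").text = name
    qtext = ET.SubElement(q, "questiontext", format="html")
    ET.SubElement(qtext, "text").text = text
    ET.SubElement(q, "single").text = "true"
    for i, answer in enumerate(answers):
        # Correct answer earns 100%; every wrong answer carries the penalty.
        fraction = 100 if i == correct_index else penalty
        a = ET.SubElement(q, "answer", fraction=str(fraction))
        ET.SubElement(a, "text").text = answer
    return q


quiz = ET.Element("quiz")
quiz.append(
    multichoice_question("Q1", "2 + 2 = ?", ["3", "4", "5", "22"], correct_index=1)
)
xml_text = ET.tostring(quiz, encoding="unicode")
fractions = [a.get("fraction") for a in quiz.iter("answer")]
```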

generate_question_text(df: DataFrame, renumber: bool = False, shuffle_answers: bool = False) → Tuple[str, str, DataFrame]

Generates the formatted text of the questions (with and without solutions) from a DataFrame. This method is primarily used for the “check” functionality to create a quick preview.

Parameters:

df (pd.DataFrame) – The DataFrame containing the questions.
renumber (bool, optional) – If True, renumbers questions sequentially. Defaults to False.
shuffle_answers (bool, optional) – If True, shuffles the order of answers. Defaults to False.

Returns:

A tuple containing the student’s exam text, the full exam text (with solutions), and the modified DataFrame.

Return type:

Tuple[str, str, pd.DataFrame]
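
Shuffling answers only works if the correct option is tracked through the shuffle. The following sketch shows one way to do that; the helper shuffle_options and the four-letter A to D layout are assumptions, and it presumes the answer texts of a question are distinct.

```python
import random

LETTERS = "ABCD"


def shuffle_options(options, correct_letter, rng):
    """Shuffle answer texts and return them with the new correct letter."""
    correct_text = options[LETTERS.index(correct_letter)]
    shuffled = options[:]  # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    # Find where the correct text ended up and translate back to a letter.
    new_letter = LETTERS[shuffled.index(correct_text)]
    return shuffled, new_letter


options = ["Madrid", "Lisbon", "Paris", "Rome"]
shuffled, new_letter = shuffle_options(options, "C", random.Random(7))
```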

read_questions_from_excel(excel_path: str) → DataFrame | None

Reads questions from an Excel file and returns a pandas DataFrame.

Parameters:

excel_path (str) – The path to the Excel file containing the questions.

Returns:

A DataFrame with the questions if the read is successful, or None in case of an error.

Return type:

Optional[pd.DataFrame]

save_questions_to_docx(question_text: str, file_name: str | None = None)

Saves the formatted question text to a .docx file.

Parameters:

question_text (str) – The text of the questions to be saved.
file_name (Optional[str], optional) – The name of the .docx file.

exception pyexamgenerator.exam_generator.NoAcceptableQuestionsError

Bases: Exception

Custom exception raised when no acceptable questions are found.

pyexamgenerator.main_app module

class pyexamgenerator.main_app.ExamApp(root: Tk)

Bases: object

Main application for the Exam Generator Suite.

This class builds the graphical user interface (GUI) using Tkinter and orchestrates the interactions between the user and the backend classes (QuestionGenerator, ExamGenerator, QuestionBankManager).

Attributes:

root (tk.Tk) – The main window of the application.
notebook (ttk.Notebook) – The widget for managing the application’s tabs.
api_key (str) – The API key for using the question generator.
question_generator (QuestionGenerator) – Instance of the question generator.
exam_generator (ExamGenerator) – Instance of the exam generator.
prompt_types (dict) – Dictionary of prompt types.
selected_prompt_type (tk.StringVar) – Variable for the selected prompt type.
custom_prompt_text (tk.StringVar) – Variable for the custom prompt text.
status_text (tk.StringVar) – Variable for the status bar text.
status_label (ttk.Label) – Label for the status bar.
excel_filepath (tk.StringVar) – Variable for the Excel file path.
subject (tk.StringVar) – Variable for the subject.
exam_name (tk.StringVar) – Variable for the exam name.
course (tk.StringVar) – Variable for the course.
num_exams (tk.StringVar) – Variable for the number of exams.
exam_names (tk.StringVar) – Variable for the names of the exams.
questions_per_topic (tk.StringVar) – Variable for questions per topic (dictionary format).
num_questions_same_topic (tk.StringVar) – Variable for the number of questions per topic (same for all).
selection_method (tk.StringVar) – Variable for the question selection method.
top_margin (tk.StringVar) – Variable for the top margin.
bottom_margin (tk.StringVar) – Variable for the bottom margin.
left_margin (tk.StringVar) – Variable for the left margin.
right_margin (tk.StringVar) – Variable for the right margin.
font_size (tk.StringVar) – Variable for the font size.
answer_sheet_instructions (tk.StringVar) – Variable for the answer sheet instructions.
penalty (tk.StringVar) – Variable for the penalty.
xml_cat_additional_text (tk.StringVar) – Variable for the additional XML category text.
export_moodle (tk.BooleanVar) – Variable for exporting to Moodle.
update_excel (tk.BooleanVar) – Variable for updating the Excel file.
num_columns_var (tk.StringVar) – Variable for the number of columns in theme selection.
selection_method_var (tk.StringVar) – Variable for the theme selection method UI control.
pdf_files_var (tk.StringVar) – Variable for the PDF files.
num_questions_var (tk.StringVar) – Variable for the number of questions to generate.
output_filename_var (tk.StringVar) – Variable for the output filename.
api_key_var (tk.StringVar) – Variable for the API key.
question_bank_manager (QuestionBankManager) – Instance of the question bank manager.
existing_bank_path (tk.StringVar) – Variable for the path to an existing question bank.
reviewed_add_path (tk.StringVar) – Variable for the path to reviewed questions to be added.

add_edit_prompt_type(existing_type: str | None = None, existing_prompt: str | None = None) → None

Adds or edits a prompt type via a new Toplevel window.

Parameters:

existing_type (Optional[str], optional) – The existing prompt type (for editing). Defaults to None.
existing_prompt (Optional[str], optional) – The existing prompt text (for editing). Defaults to None.

add_reviewed_questions_to_bank() → None

Adds questions from a reviewed Excel file to an existing question bank, avoiding duplicates based on the selected criteria.

create_exam_tab()

Creates the “Generate Exams” tab with all its widgets.

create_manage_bank_tab()

Creates the “Manage Question Bank” tab with all its widgets.

create_question_tab() → None

Creates the “Generate Questions” tab with all its widgets.

delete_prompt_type() → None

Deletes the selected prompt type, preventing deletion of default types.

generate_exams() → None

Gathers all parameters from the “Generate Exams” tab and triggers the exam generation process. It validates user input, constructs the questions_per_topic dictionary, and calls the generate_exam_from_excel method of the backend ExamGenerator class.

generate_questions() → None

Gathers all parameters from the GUI and triggers the question generation process. This function acts as a bridge between the user interface and the backend logic.

hide_tree_tooltip(event=None)

Hides the treeview tooltip.

load_api_key() → str | None

Loads the API key from a configuration file.

Returns: The API key if found, otherwise None.
Return type: Optional[str]

load_default_prompt_types() → Dict[str, str]

Defines the default prompt types in a Python dictionary.

Returns:

A dictionary where keys are prompt type names and values are the default prompt texts.

Return type:

Dict[str, str]

load_prompt_types() → Dict[str, str]

Loads custom prompt types from a JSON file.

Returns:

A dictionary where keys are prompt type names and values are the prompt texts.

Return type:

Dict[str, str]

load_themes_for_selection() → None

Loads themes from an Excel file for the “questions per theme” selection UI. It reads the “Tema” column and dynamically creates Spinbox widgets for each unique theme.

on_model_select(event=None)

Handles the selection of a model in the Treeview table. Updates the selected_model_var with the technical name of the chosen model.

on_treeview_motion(event)

Shows a tooltip with the full cell text if the mouse hovers over a cell where the content is wider than the column.

open_api_key_help()

Opens the Google AI documentation for setting up API key environment variables.

populate_models_table()

Fetches available Gemini models using the provided API key and populates the Treeview table. Adapted for the new google-genai library.

save_api_key(api_key: str) → None

Saves the API key to a configuration file.

Parameters: api_key (str) – The API key to save.
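
The save/load round trip for the API key can be sketched as follows. The JSON layout, the "api_key" field name, and the explicit config_path argument are assumptions for illustration; the documentation above does not say which file format or location the application uses.

```python
import json
import tempfile
from pathlib import Path


def save_api_key(api_key, config_path):
    """Write the key to a small JSON config file (hypothetical layout)."""
    Path(config_path).write_text(json.dumps({"api_key": api_key}), encoding="utf-8")


def load_api_key(config_path):
    """Return the stored key, or None if the file is missing or unreadable."""
    try:
        data = json.loads(Path(config_path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return None
    return data.get("api_key") if isinstance(data, dict) else None


with tempfile.TemporaryDirectory() as tmp:
    config = Path(tmp) / "config.json"
    missing = load_api_key(config)       # no file yet
    save_api_key("example-key", config)
    loaded = load_api_key(config)        # reads back what was saved
```

Returning None instead of raising mirrors the documented Optional[str] return type of load_api_key.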

save_current_api_key() → None

Saves the API key currently entered in the GUI.

This method gets the API key from the GUI variable, saves it using the save_api_key method, and updates the question generator instance (self.question_generator) with the new API key. It also shows informational or error messages to the user via messagebox.

save_prompt_types() → None

Saves custom prompt types to a JSON file.

save_revised_questions_to_xlsx() → None

Saves revised questions from a DOCX file to a new XLSX file. It uses the QuestionBankManager class to read the DOCX. If no name is provided for the XLSX file, one is generated automatically based on the DOCX name.

select_excel_file() → None

Opens a dialog to select an Excel file. The path of the selected file is saved in the self.excel_filepath variable. After selecting the file, self.load_themes_for_selection() is called to load the themes.

select_existing_bank_file() → None

Opens a dialog for the user to select an existing Excel file that contains the question bank. The path of the selected file is saved in the self.existing_bank_path variable.

select_existing_bank_for_gen_file() → None

Opens a dialog for the user to select an existing Excel file to use as a question bank when generating new questions (optional). The path of the selected file is saved in the self.existing_bank_for_gen_path variable.

select_gen_exams_output_dir() → None

Opens a dialog to select the output directory for generated exams.

select_gen_questions_output_dir() → None

Opens a dialog to select the output directory for generated questions.

select_pdf_files() → None

Opens a dialog for the user to select one or more PDF files. The paths of the selected files are inserted into the corresponding entry widget.

select_reviewed_file_to_add() → None

Opens a dialog for the user to select a revised Excel file containing questions to add to the existing question bank. The path of the selected file is saved in the self.reviewed_add_path variable.

select_revised_docx_file() → None

Opens a dialog to select a revised DOCX file. The path of the selected file is saved in the self.revised_docx_path_var variable.

select_revised_xlsx_output_dir() → None

Opens a dialog to select the output directory for the revised XLSX file.

show_about_dialog()

Displays an “About” dialog with program and license information.

toggle_pages_per_chunk_entry()

Enables or disables the “pages per chunk” entry based on the checkbox state.

update_custom_prompt_display(event: Event | None = None) → None

Updates the custom prompt text in the GUI.

This method displays the text of the selected prompt type in the corresponding text area. If no prompt type is selected or the selected type does not exist, the text area is cleared.

Parameters: event (Optional[tk.Event]) – The event that triggered this method call. Defaults to None.

update_prompt_type_options() → None

Updates the options in the prompt type dropdown menu.

update_status(message: str) → None

Updates the text of the status bar.

Parameters: message (str) – The text to display in the status bar.

update_theme_display() → None

Reloads themes when the “select by theme” option changes. This method loads the theme selection widgets when the option is enabled and clears them when it is disabled.

pyexamgenerator.main_app.main()

pyexamgenerator.question_bank_manager module

class pyexamgenerator.question_bank_manager.QuestionBankManager

Bases: object

Manages a question bank, allowing reading from DOCX files, saving to Excel, and adding new questions while avoiding duplicates.

add_questions_without_duplicates(existing_bank_path: str, reviewed_questions_path: str, duplicate_check_columns: List[str] | None = ['Pregunta', 'Respuesta A', 'Respuesta B', 'Respuesta C', 'Respuesta D'], add_only_acceptable: bool = False) → Tuple[int, DataFrame | None]

Adds questions from a reviewed Excel file to an existing bank, avoiding duplicates. It can filter questions by their “Estado” (State) column.

Parameters:

existing_bank_path (str) – Path to the existing question bank Excel file.
reviewed_questions_path (str) – Path to the Excel file with revised questions to add.
duplicate_check_columns (Optional[List[str]]) – List of column names used to identify duplicate questions. Defaults to checking question and all answers.
add_only_acceptable (bool, optional) – If True, only adds questions with “Aceptable” status. Defaults to False (adds all).

Returns:

A tuple containing the number of questions added and the updated question bank DataFrame. Returns (-1, None) if there are errors reading the files.

Return type:

Tuple[int, Optional[pd.DataFrame]]
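
The duplicate check can be understood as comparing a key built from the selected columns. The sketch below uses plain lists of dicts rather than DataFrames so that it stays self-contained; the function add_without_duplicates, the whitespace stripping, and the “Rechazada” status value are assumptions, while the column names and the “Aceptable” status come from the signature and parameter descriptions above.

```python
CHECK_COLUMNS = ["Pregunta", "Respuesta A", "Respuesta B", "Respuesta C", "Respuesta D"]


def add_without_duplicates(bank, reviewed, check_columns=CHECK_COLUMNS,
                           only_acceptable=False):
    """Return (number_added, updated_bank) without mutating the inputs."""
    def key(row):
        # A row's identity is the tuple of its values in the checked columns.
        return tuple(str(row.get(col, "")).strip() for col in check_columns)

    seen = {key(row) for row in bank}
    updated = list(bank)
    added = 0
    for row in reviewed:
        if only_acceptable and row.get("Estado") != "Aceptable":
            continue
        k = key(row)
        if k in seen:
            continue  # already in the bank or added earlier in this run
        seen.add(k)
        updated.append(row)
        added += 1
    return added, updated


def row(question, estado="Aceptable"):
    """Hypothetical helper that builds a sample question row."""
    return {"Pregunta": question, "Respuesta A": "a", "Respuesta B": "b",
            "Respuesta C": "c", "Respuesta D": "d", "Estado": estado}


bank = [row("Q1")]
reviewed = [row("Q1"), row("Q2"), row("Q3", estado="Rechazada"), row("Q2")]
added_all, bank_all = add_without_duplicates(bank, reviewed)
added_ok, bank_ok = add_without_duplicates(bank, reviewed, only_acceptable=True)
```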

add_reviewed_questions_to_existing_excel(existing_excel_path: str, reviewed_excel_path: str) → str | None

Adds questions from a reviewed XLSX file to another existing XLSX file, avoiding duplication. This is a high-level wrapper function.

Parameters:

existing_excel_path (str) – The path to the existing question bank XLSX file.
reviewed_excel_path (str) – The path to the XLSX file containing the revised questions.

Returns:

The path to the updated XLSX file if the operation is successful, otherwise None.

Return type:

Optional[str]

generate_excel_from_docx(docx_path: str, excel_output_filename: str, output_dir: str | None = None) → str | None

Generates an XLSX file from a revised DOCX file. This is a convenience wrapper around read_questions_from_docx and save_questions_to_excel.

Parameters:

docx_path (str) – The path to the input DOCX file.
excel_output_filename (str) – The filename for the output XLSX file.
output_dir (Optional[str]) – The directory to save the output file. If None, saves it in the same directory as the input docx_path.

Returns:

The path to the saved XLSX file if the operation is successful, otherwise None.

Return type:

Optional[str]

read_questions_from_docx(docx_path: str) → DataFrame | None

Reads revised questions from a DOCX file and returns a pandas DataFrame. The DOCX file must follow a specific format where each piece of information (question, topic, state, options, etc.) is prefixed with a specific keyword.

Parameters:

docx_path (str) – The path to the DOCX file containing the questions.

Returns:

A pandas DataFrame with the extracted questions. Returns None if an error occurs while reading the file.

Return type:

Optional[pd.DataFrame]

save_dataframe_to_excel(df: DataFrame, filepath: str, overwrite: bool = True, new_suffix: str = '_actualizado') → str | None

Saves a pandas DataFrame to an Excel file.

Parameters:

df (pd.DataFrame) – The DataFrame to save.
filepath (str) – The destination Excel file path.
overwrite (bool, optional) – If True, overwrites the existing file. If False, saves to a new file with a suffix. Defaults to True.
new_suffix (str, optional) – The suffix to add to the filename if overwrite is False. Defaults to “_actualizado”.

Returns:

The path of the file where the data was saved, or None if there was an error.

Return type:

Optional[str]
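
As a rough illustration of the overwrite and new_suffix options, the sketch below computes the destination path only. That the suffix is inserted between the file stem and the extension is an assumption, and target_path is a hypothetical helper, not part of the package.

```python
from pathlib import Path


def target_path(filepath, overwrite=True, new_suffix="_actualizado"):
    """Return the path the data would be written to (illustrative only)."""
    path = Path(filepath)
    if overwrite:
        return path
    # Insert the suffix before the extension: questions.xlsx -> questions_actualizado.xlsx
    return path.with_name(f"{path.stem}{new_suffix}{path.suffix}")


same = target_path("banks/questions.xlsx")
renamed = target_path("banks/questions.xlsx", overwrite=False)
```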

save_questions_to_excel(df: DataFrame, output_filename: str = 'preguntas_revisadas') → None

Saves the question DataFrame to an Excel file.

Parameters:

df (pd.DataFrame) – The pandas DataFrame containing the questions to save.
output_filename (str) – The name of the output Excel file (without the extension). Defaults to “preguntas_revisadas”.

pyexamgenerator.question_generator module

class pyexamgenerator.question_generator.QuestionGenerator(api_key: str, model_name: str)

Bases: object

Generates multiple-choice questions from PDF files using the Gemini model.

extract_pdf_text(pdf_path: str) → str | None

Extracts text from a PDF file.

Parameters: pdf_path (str) – The path to the PDF file.
Returns: The extracted text from the PDF, or None if an error occurs.
Return type: Optional[str]

extract_topic_number_from_path(pdf_path: str) → str

Extracts the topic name from the PDF filename.

Parameters: pdf_path (str) – The path to the PDF file.
Returns: The extracted topic name, or «Tema desconocido» ("unknown topic") if not found.
Return type: str
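
A filename-based topic lookup of this kind might look like the sketch below. The filename convention (the word "Tema" followed by a number) and the "Tema N" output format are assumptions; only the «Tema desconocido» fallback is taken from the description above.

```python
import re
from pathlib import Path


def topic_from_path(pdf_path):
    """Return e.g. 'Tema 3' for 'notes/Tema_03_cells.pdf', else the fallback."""
    # Look for "tema" followed by optional separators and a number in the stem.
    match = re.search(r"tema[\s_-]*(\d+)", Path(pdf_path).stem, flags=re.IGNORECASE)
    if match:
        return f"Tema {int(match.group(1))}"
    return "Tema desconocido"


found = topic_from_path("notes/Tema_03_cells.pdf")
fallback = topic_from_path("intro.pdf")
```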

generate_multiple_choice_questions(pdf_paths: List[str], prompt_type: str, num_questions_per_chunk_target: int = 5, output_filename: str | None = None, output_dir: str | None = None, custom_prompt: str | None = None, generate_docx: bool = True, existing_bank_path: str | None = None, process_by_pages: bool = False, pages_per_chunk: int = 1, similarity_threshold: float | None = 0.8, bank_prompt_scope: str | None = None, prompt_example_content_type: str = 'solo_enunciados', print_raw_gemini_answer: bool = False, max_generation_attempts_per_chunk: int = 3) → DataFrame

Main function to generate multiple-choice questions from multiple PDFs, with retries to reach the desired number of questions per chunk.

This method orchestrates the entire generation process, including:
- Loading an existing question bank for similarity checks.
- Iterating through each provided PDF.
- Splitting PDFs into chunks if required.
- Looping with multiple attempts per chunk to reach the target number of questions.
- Building prompts, generating content, and analyzing responses.
- Saving accepted and rejected questions to files.

Parameters:

pdf_paths (List[str]) – List of paths to the PDF files.
prompt_type (str) – The key for the predefined prompt type.
num_questions_per_chunk_target (int) – The target number of questions per text chunk.
output_filename (Optional[str]) – The base name for the output files.
output_dir (Optional[str]) – The directory to save the output files. If None, uses the current working directory.
custom_prompt (Optional[str]) – A user-provided custom prompt to override the default.
generate_docx (bool) – Whether to generate a DOCX file for review.
existing_bank_path (Optional[str]) – Path to an existing Excel question bank for similarity filtering.
process_by_pages (bool) – If True, process the PDF in chunks of pages.
pages_per_chunk (int) – The number of pages per chunk if processing by pages.
similarity_threshold (Optional[float]) – The threshold for the Jaccard similarity filter.
bank_prompt_scope (Optional[str]) – Scope for selecting guidance questions from the bank.
prompt_example_content_type (str) – Content type for guidance questions (“solo_enunciados” or “enunciados_y_respuestas”).
print_raw_gemini_answer (bool) – If True, prints the raw API response for debugging.
max_generation_attempts_per_chunk (int) – The maximum number of attempts to reach the target per chunk.

Returns:

A DataFrame containing all the successfully generated and filtered questions.

Return type:

pd.DataFrame
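
The similarity_threshold parameter refers to a Jaccard similarity filter. A minimal sketch of such a filter is given below; the whitespace tokenization, the ">=" comparison, and the helper names jaccard and filter_similar are assumptions, not the library's actual code.

```python
def jaccard(a, b):
    """Jaccard similarity between the lowercase word sets of two strings."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty texts are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def filter_similar(candidates, bank, threshold=0.8):
    """Split candidates into (accepted, rejected) by similarity to known questions."""
    accepted, rejected = [], []
    for question in candidates:
        # Compare against the bank and against questions accepted so far.
        too_close = any(
            jaccard(question, existing) >= threshold for existing in bank + accepted
        )
        (rejected if too_close else accepted).append(question)
    return accepted, rejected


bank = ["What is the capital of France?"]
candidates = [
    "What is the capital of France?",
    "What is the capital of Spain?",
    "Which organelle produces ATP?",
]
accepted, rejected = filter_similar(candidates, bank, threshold=0.8)
```

With whitespace tokenization the France and Spain questions share 5 of 7 distinct tokens, so they fall below the 0.8 threshold and the Spain question is kept.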

save_questions_to_docx(df: DataFrame, filepath: str) → None

Saves the questions to a DOCX file for manual review.

save_questions_to_excel(df: DataFrame, filepath: str) → None

Saves the questions to an Excel file.

save_rejected_questions_to_docx(df: DataFrame, filepath: str) → None

Saves questions discarded due to similarity to a DOCX file for inspection.

Parameters:

df (pd.DataFrame) – The DataFrame containing the discarded questions and similarity details.
filepath (str) – The filepath for the output file.

exception pyexamgenerator.question_generator.QuotaExceededError

Bases: Exception

Exception raised when the Gemini API quota limit is exceeded.

exception pyexamgenerator.question_generator.ServiceOverloadedError

Bases: Exception

Exception raised when Google's servers are overloaded (Error 503).

pyexamgenerator.tooltip module

class pyexamgenerator.tooltip.ToolTip(widget, text='')

Bases: object

Creates a tooltip (a pop-up window with text) for a given Tkinter widget. This is a standard helper class for providing hover-text functionality.

hidetip(event=None)

Hides and destroys the tooltip window. This method is called when the mouse cursor leaves the widget.

showtip(event=None)

Displays the text in the tooltip window. This method is called when the mouse cursor enters the widget.

Module contents

class pyexamgenerator.ExamGenerator

Bases: object

A class to generate exams from a question bank stored in an Excel file. It can create multiple versions of an exam, shuffle questions and answers, and export to various formats like DOCX and Moodle XML.


exception pyexamgenerator.NoAcceptableQuestionsError

Bases: Exception

Custom exception raised when no acceptable questions are found.

class pyexamgenerator.QuestionBankManager

Bases: object

Manages a question bank, allowing reading from DOCX files, saving to Excel, and adding new questions while avoiding duplicates.

add_questions_without_duplicates(existing_bank_path: str, reviewed_questions_path: str, duplicate_check_columns: List[str] | None = ['Pregunta', 'Respuesta A', 'Respuesta B', 'Respuesta C', 'Respuesta D'], add_only_acceptable: bool = False) → Tuple[int, DataFrame | None][fuente]

Adds questions from a reviewed Excel file to an existing bank, avoiding duplicates. It can filter questions by their “Estado” (State) column.

Parámetros:

existing_bank_path (str) – Path to the existing question bank Excel file.
reviewed_questions_path (str) – Path to the Excel file with revised questions to add.
duplicate_check_columns (Optional[List[str]]) – List of column names used to identify duplicate questions. Defaults to checking question and all answers.
add_only_acceptable (bool, optional) – If True, only adds questions with “Aceptable” status. Defaults to False (adds all).

Devuelve:

A tuple containing the number of questions added and the updated: question bank DataFrame. Returns (-1, None) if there are errors reading the files.

Tipo del valor devuelto:

Tuple[int, Optional[pd.DataFrame]]

add_reviewed_questions_to_existing_excel(existing_excel_path: str, reviewed_excel_path: str) → str | None[fuente]

Adds questions from a reviewed XLSX file to another existing XLSX file, avoiding duplication. This is a high-level wrapper function.

Parámetros:

existing_excel_path (str) – The path to the existing question bank XLSX file.
reviewed_excel_path (str) – The path to the XLSX file containing the revised questions.

Devuelve:

The path to the updated XLSX file if the operation is successful, otherwise None.

Tipo del valor devuelto:

Optional[str]

generate_excel_from_docx(docx_path: str, excel_output_filename: str, output_dir: str | None = None) → str | None[fuente]

Generates an XLSX file from a revised DOCX file. This is a convenience wrapper around read_questions_from_docx and save_questions_to_excel.

Parámetros:

docx_path (str) – The path to the input DOCX file.
excel_output_filename (str) – The filename for the output XLSX file.
output_dir (Optional[str]) – The directory to save the output file. If None, saves it in the same directory as the input docx_path.

Devuelve:

The path to the saved XLSX file if the operation is successful, otherwise None.

Tipo del valor devuelto:

Optional[str]

read_questions_from_docx(docx_path: str) → DataFrame | None[fuente]

Reads revised questions from a DOCX file and returns a pandas DataFrame. The DOCX file must follow a specific format where each piece of information (question, topic, state, options, etc.) is prefixed with a specific keyword.

Parámetros:

docx_path (str) – The path to the DOCX file containing the questions.

Devuelve:

A pandas DataFrame with the extracted questions. Returns None if an error occurs while reading the file.

Tipo del valor devuelto:

Optional[pd.DataFrame]
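To make the keyword-prefixed format more concrete, the sketch below parses a list of paragraph strings into one record per question. The prefixes in PREFIXES are hypothetical, since the keywords the method actually expects are not listed here, and the rule that a "Pregunta:" line opens a new record is likewise an assumption.

```python
from typing import Dict, List

# Hypothetical prefixes; the keywords expected by read_questions_from_docx
# are not given in this documentation.
PREFIXES = {
    "Pregunta:": "Pregunta",
    "Tema:": "Tema",
    "Estado:": "Estado",
    "A)": "Respuesta A",
    "B)": "Respuesta B",
    "C)": "Respuesta C",
    "D)": "Respuesta D",
}


def parse_prefixed_paragraphs(paragraphs: List[str]) -> List[Dict[str, str]]:
    """Group prefixed paragraphs into one dictionary per question."""
    records: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for raw in paragraphs:
        text = raw.strip()
        for prefix, column in PREFIXES.items():
            if text.startswith(prefix):
                # Assumption: a "Pregunta:" line starts a new record.
                if column == "Pregunta" and current:
                    records.append(current)
                    current = {}
                current[column] = text[len(prefix):].strip()
                break
        # Paragraphs without a known prefix are ignored in this sketch.
    if current:
        records.append(current)
    return records
```

A list of such dictionaries maps directly onto DataFrame rows, with one column per keyword.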

save_dataframe_to_excel(df: DataFrame, filepath: str, overwrite: bool = True, new_suffix: str = '_actualizado') → str | None[fuente]

Saves a pandas DataFrame to an Excel file.

Parámetros:

df (pd.DataFrame) – The DataFrame to save.
filepath (str) – The destination Excel file path.
overwrite (bool, optional) – If True, overwrites the existing file. If False, saves to a new file with a suffix. Defaults to True.
new_suffix (str, optional) – The suffix to add to the filename if overwrite is False. Defaults to “_actualizado”.

Devuelve:

The path of the file where the data was saved, or None if there was an error.

Tipo del valor devuelto:

Optional[str]
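The interplay of overwrite and new_suffix can be sketched with pathlib alone. The helper resolve_output_path is hypothetical, and it assumes the suffix is inserted between the file stem and its extension; the package may build the name differently.

```python
from pathlib import Path


def resolve_output_path(
    filepath: str, overwrite: bool = True, new_suffix: str = "_actualizado"
) -> Path:
    """Hypothetical helper: pick the path the DataFrame would be written to."""
    path = Path(filepath)
    if overwrite:
        # Write over the existing file.
        return path
    # Assumption: the suffix goes between the stem and the extension,
    # e.g. banco.xlsx becomes banco_actualizado.xlsx.
    return path.with_name(f"{path.stem}{new_suffix}{path.suffix}")
```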

save_questions_to_excel(df: DataFrame, output_filename: str = 'preguntas_revisadas') → None[fuente]

Saves the question DataFrame to an Excel file.

Parámetros:

df (pd.DataFrame) – The pandas DataFrame containing the questions to save.
output_filename (str) – The name of the output Excel file (without the extension). Defaults to “preguntas_revisadas”.

class pyexamgenerator.QuestionGenerator(api_key: str, model_name: str)[fuente]

Bases: object

Generates multiple-choice questions from PDF files using the Gemini model.

extract_pdf_text(pdf_path: str) → str | None[fuente]

Extracts text from a PDF file.

Parámetros:: pdf_path (str) – The path to the PDF file.
Devuelve:: The extracted text from the PDF, or None if an error occurs.
Tipo del valor devuelto:: Optional[str]

extract_topic_number_from_path(pdf_path: str) → str[fuente]

Extracts the topic name from the PDF filename.

Parámetros:: pdf_path (str) – The path to the PDF file.
Devuelve:: The extracted topic name, or «Tema desconocido» if not found.
Tipo del valor devuelto:: str
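One way such an extraction could work is a regular expression over the file stem. The filename pattern and the "Tema N" return format below are assumptions made for illustration; only the «Tema desconocido» fallback comes from the description above.

```python
import re
from pathlib import Path


def topic_from_filename(pdf_path: str) -> str:
    """Hypothetical helper: derive a topic label from a PDF filename."""
    stem = Path(pdf_path).stem
    # Assumption: filenames contain "tema" followed by a number,
    # optionally separated by spaces, underscores, or hyphens.
    match = re.search(r"tema[\s_-]*(\d+)", stem, flags=re.IGNORECASE)
    if match:
        return f"Tema {int(match.group(1))}"
    return "Tema desconocido"
```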

generate_multiple_choice_questions(pdf_paths: List[str], prompt_type: str, num_questions_per_chunk_target: int = 5, output_filename: str | None = None, output_dir: str | None = None, custom_prompt: str | None = None, generate_docx: bool = True, existing_bank_path: str | None = None, process_by_pages: bool = False, pages_per_chunk: int = 1, similarity_threshold: float | None = 0.8, bank_prompt_scope: str | None = None, prompt_example_content_type: str = 'solo_enunciados', print_raw_gemini_answer: bool = False, max_generation_attempts_per_chunk: int = 3) → DataFrame[fuente]

Main function to generate multiple-choice questions from multiple PDFs, with retries to reach the desired number of questions per chunk.

This method orchestrates the entire generation process, including:

- Loading an existing question bank for similarity checks.
- Iterating through each provided PDF.
- Splitting PDFs into chunks if required.
- Looping with multiple attempts per chunk to reach the target number of questions.
- Building prompts, generating content, and analyzing responses.
- Saving accepted and rejected questions to files.

Parámetros:

pdf_paths (List[str]) – List of paths to the PDF files.
prompt_type (str) – The key for the predefined prompt type.
num_questions_per_chunk_target (int) – The target number of questions per text chunk.
output_filename (Optional[str]) – The base name for the output files.
output_dir (Optional[str]) – The directory to save the output files. If None, uses the current working directory.
custom_prompt (Optional[str]) – A user-provided custom prompt to override the default.
generate_docx (bool) – Whether to generate a DOCX file for review.
existing_bank_path (Optional[str]) – Path to an existing Excel question bank for similarity filtering.
process_by_pages (bool) – If True, process the PDF in chunks of pages.
pages_per_chunk (int) – The number of pages per chunk if processing by pages.
similarity_threshold (Optional[float]) – The threshold for the Jaccard similarity filter.
bank_prompt_scope (Optional[str]) – Scope for selecting guidance questions from the bank.
prompt_example_content_type (str) – Content type for guidance questions (“solo_enunciados” or “enunciados_y_respuestas”).
print_raw_gemini_answer (bool) – If True, prints the raw API response for debugging.
max_generation_attempts_per_chunk (int) – The maximum number of attempts to reach the target per chunk.

Devuelve:

A DataFrame containing all the successfully generated and filtered questions.

Tipo del valor devuelto:

pd.DataFrame
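The Jaccard similarity filter controlled by similarity_threshold can be illustrated as follows. This is a minimal sketch rather than the package's code: it assumes similarity is computed over lowercase word sets and that a candidate is rejected when its score against any bank question reaches the threshold. The function names are hypothetical.

```python
import re
from typing import Iterable, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def is_too_similar(candidate: str, bank: Iterable[str], threshold: float = 0.8) -> bool:
    """Assumption: reject when any bank question scores at or above the threshold."""
    return any(jaccard(candidate, question) >= threshold for question in bank)
```

With the default threshold of 0.8, two statements that share three of five distinct words (a score of 0.6) would both be kept, while statements that differ only in punctuation would be treated as duplicates.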

save_questions_to_docx(df: DataFrame, filepath: str) → None[fuente]

Saves the questions to a DOCX file for manual review.

save_questions_to_excel(df: DataFrame, filepath: str) → None[fuente]

Saves the questions to an Excel file.

save_rejected_questions_to_docx(df: DataFrame, filepath: str) → None[fuente]

Saves questions discarded due to similarity to a DOCX file for inspection.

Parámetros:

df (pd.DataFrame) – The DataFrame containing the discarded questions and similarity details.
filepath (str) – The filepath for the output file.

exception pyexamgenerator.QuotaExceededError[fuente]

Bases: Exception

Exception raised when the Gemini API quota limit is exceeded.

exception pyexamgenerator.ServiceOverloadedError[fuente]

Bases: Exception

Exception raised when Google's servers are overloaded (Error 503).
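A caller might treat the two exceptions differently: retry after a pause when the service is overloaded, and stop immediately when the quota is exhausted. The sketch below shows that pattern under stated assumptions. The two exception classes are local stand-ins so the example runs on its own (real code would import them from pyexamgenerator), call_with_retry is a hypothetical helper, and whether the generator methods propagate these exceptions to the caller is not confirmed by this documentation.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


# Local stand-ins for the package's exceptions, defined here only so the
# sketch is self-contained.
class QuotaExceededError(Exception):
    pass


class ServiceOverloadedError(Exception):
    pass


def call_with_retry(
    task: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: retry on overload with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except ServiceOverloadedError:
            if attempt == max_attempts:
                raise
            # Wait base_delay, then twice that, and so on, before retrying.
            sleep(base_delay * 2 ** (attempt - 1))
        # QuotaExceededError is deliberately not caught: retrying right away
        # would not help, so it propagates to the caller.
    raise ValueError("max_attempts must be at least 1")
```

Passing sleep as a parameter keeps the helper easy to exercise without real delays.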