This project focuses on developing and evaluating a robust CAPTCHA recognition model using CRNN (Convolutional Recurrent Neural Network) and CTC (Connectionist Temporal Classification) loss. The main objective is to evaluate different data augmentation strategies and hyperparameter configurations to improve the model's robustness and generalization.
- PyTorch for model implementation and training
- CTC Loss for sequence-level supervision without character alignment
- Data Augmentation with custom pipelines for realistic CAPTCHA distortions
- Grid Search for hyperparameter tuning across multiple training scenarios
- Modular augmentation system with full control over geometric, color, blur, noise, and distractor-line parameters
- YAML-based configuration for tuning multiple parameters at once
- Automated logging and checkpointing per trial
- Built-in analysis suite for visualizing learning curves, error samples, confidence scores, and character distribution
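As background for how a CTC-trained model's per-frame outputs become a predicted string, here is a minimal greedy CTC decoding sketch (collapse consecutive repeats, then drop blanks). The blank index corresponds to the `ctc_blank_index` config key; the charset and frame sequence are illustrative:

```python
def ctc_greedy_decode(indices, blank=0):
    """Collapse consecutive repeats, then remove blank tokens.

    `indices` is the per-timestep argmax of the network output
    (one class index per time step); `blank` is the CTC blank index.
    """
    decoded = []
    prev = None
    for idx in indices:
        if idx != prev:          # collapse runs of the same index
            if idx != blank:     # drop the blank token
                decoded.append(idx)
        prev = idx
    return decoded

# Illustrative charset: index 0 is reserved for the CTC blank.
charset = "-abc"  # '-' stands in for the blank
frames = [1, 1, 0, 2, 2, 2, 0, 2, 3]
print("".join(charset[i] for i in ctc_greedy_decode(frames)))  # "abbc"
```

Note that the blank between the two runs of `2` is what allows the decoder to emit a doubled character, which matters for CAPTCHAs with repeated letters.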
You can launch training or hyperparameter tuning with:
```
python main.py --config configs/tuning_M.yaml
```

- Replace `tuning_M.yaml` with `tuning_S.yaml` or `tuning_L.yaml` depending on the scale of your experiment.
- The results (checkpoints, logs, final models) will be saved under `outputs/`.
- The core training loop is implemented in `trainer/train.py`.
- Hyperparameter tuning is orchestrated by `trainer/tuner.py`, which performs a grid search:
  - All combinations of the specified parameters (`batch_size`, `learning_rate`, `epochs`, `optimizer`) are iterated automatically.
  - Each trial runs independently and logs its own history, checkpoints, and final model.
  - The best model is determined by validation LER (Levenshtein Error Rate).
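The grid search amounts to iterating the Cartesian product of the tuning lists; a minimal sketch of the idea (the `run_trial` function and the parameter values are hypothetical stand-ins for what the real tuner does):

```python
import itertools

# Tuning lists as they would appear in the YAML config (illustrative values).
grid = {
    "batch_size": [4, 8, 16],
    "learning_rate": [1e-3, 5e-4, 1e-4],
    "epochs": [20],
    "optimizer": ["adam", "adamw"],
}

def run_trial(params):
    # Hypothetical stand-in: train one model and return its validation LER.
    return params["learning_rate"] / params["batch_size"]

keys = list(grid)
trials = [dict(zip(keys, combo)) for combo in itertools.product(*grid.values())]
print(len(trials))  # 3 * 3 * 1 * 2 = 18 trials

# The best model is the trial with the lowest validation LER.
best = min(trials, key=run_trial)
```

Each element of `trials` is a complete hyperparameter dictionary, so every trial can be launched, logged, and checkpointed independently.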
All experiments are configured via YAML files under `configs/`.
Key parameters include:
- **Dataset paths**
  - `train_root`, `val_root`, `test_root`
  - `label_path` → JSON file with CAPTCHA labels
  - `original_image_dir` → un-augmented training images
- **Tuning settings**
  - `batch_size` → list of batch sizes to try (e.g. `[4, 8, 16]`)
  - `learning_rate` → learning rates in scientific notation (e.g. `[1e-3, 5e-4, 1e-4]`)
  - `epochs` → training epochs per trial
  - `optimizer` → choice of optimizer (`adam`, `adamw`, `sgd`)
  - `seed` → random seed for reproducibility
  - `ctc_blank_index` → blank token index for CTC loss
- **Augmentation scenarios**
  - Multiple named augmentation configurations can be defined.
  - Parameters: `angle_range`, `shear_range`, `brightness_range`, `contrast_range`, `noise_std`, `blur_probability`, `blur_radius`, `lines_probability`, `line_count`, `line_thickness`.
- **Output directories**
  - `checkpoint_dir`, `final_model_dir`, `history_dir`
  - `log_file_path`
- **Post-analysis**
  - `analysis_scripts` → list of visualization/analysis tasks to run automatically after tuning.
With this design, you can easily add new configs or augmentation scenarios without modifying the training code.
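One detail worth getting right is that `ctc_blank_index` must not collide with any real character index. A small sketch of building the label↔index mapping with index 0 reserved for the blank (the charset here is an illustrative assumption, not the project's actual one):

```python
import string

CTC_BLANK_INDEX = 0  # matches the ctc_blank_index config key

# Illustrative CAPTCHA charset: digits plus lowercase letters.
charset = string.digits + string.ascii_lowercase

# Reserve index 0 for the CTC blank; real characters start at 1.
char_to_idx = {c: i + 1 for i, c in enumerate(charset)}
idx_to_char = {i: c for c, i in char_to_idx.items()}

def encode(label):
    """Turn a label string (e.g. from the labels JSON) into CTC target indices."""
    return [char_to_idx[c] for c in label]

print(encode("a1"))  # [11, 2]
```

With this layout, the model's output layer has `len(charset) + 1` classes, and the blank never appears in any encoded target.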
After training and hyperparameter tuning, this project provides automatic post-analysis scripts to help evaluate and visualize the results.
These scripts run either manually (via `python analysis/run_all_analysis.py ...`) or automatically if specified in the config file.
- The config file (`configs/*.yaml`) can specify which analysis scripts to run under the `analysis_scripts` section.
- After tuning, `tuner.py` will call these scripts automatically if they are listed.
All analysis utilities are stored in the `analysis/` folder. Each script serves a specific purpose:
- `plot_curves.py`: Plots training loss and validation LER curves for each trial.
- `compare_trials.py`: Compares final LER across all trials in a bar chart.
- `plot_charset_freq.py`: Analyzes and plots character frequency in the training labels.
- `plot_prediction_dist.py`: Analyzes the distribution of predicted characters across the test set.
These tools help debug model behavior, compare augmentation effects, and ensure the model generalizes well.
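The LER metric used throughout is the Levenshtein (edit) distance between prediction and label, normalized by the label length; a self-contained sketch:

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution
            ))
        prev = curr
    return prev[-1]

def ler(prediction, target):
    """Levenshtein Error Rate: edit distance normalized by target length."""
    return levenshtein(prediction, target) / max(len(target), 1)

print(ler("abed", "abcd"))  # 0.25: one substitution over four characters
```

Unlike exact-match accuracy, LER gives partial credit for mostly-correct predictions, which makes it a smoother signal for comparing trials.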
All experiment results are saved under the `outputs/` directory.
This folder is created automatically during training and organized into subfolders for clarity.
- `checkpoints/` → Per-epoch saved states for resuming training or debugging.
- `models/` → Final trained model and best-performing model per trial.
- `logs/` → Detailed training histories (`train_loss`, `val_ler`) in JSON format.
- `training_log.txt` → Human-readable log with trial hyperparameters and validation metrics.
- `predictions/` → Saved prediction results in JSON format for test evaluation.
- `plots/` → Generated analysis figures.
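Given per-trial histories like those stored as JSON under `logs/`, picking the best trial is a matter of comparing the final `val_ler` values. A sketch with in-memory histories (the trial names and numbers are illustrative assumptions, not real results):

```python
# Histories shaped like the per-trial JSON logs (illustrative values).
histories = {
    "trial_0": {"train_loss": [2.1, 1.0, 0.4], "val_ler": [0.9, 0.5, 0.21]},
    "trial_1": {"train_loss": [2.0, 0.8, 0.3], "val_ler": [0.8, 0.4, 0.14]},
}

def final_val_ler(history):
    """Validation LER after the last epoch of a trial."""
    return history["val_ler"][-1]

best_trial = min(histories, key=lambda name: final_val_ler(histories[name]))
print(best_trial)  # trial_1 has the lower final validation LER
```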
This project can also be run in Google Colab using `main.ipynb`, making it easy to experiment without setting up a local environment.
Two approaches are supported:
You can manually upload both the dataset and project folder as zip archives.
- Upload your dataset (`part2.zip`), which must contain:

  ```
  part2/
  ├── train/
  ├── val/
  └── test/
  ```

- Upload the project code (`captcha-cracker.zip`) containing:

  ```
  captcha-cracker/
  ├── main.py
  ├── configs/
  ├── trainer/
  └── ...
  ```

- Use the provided Colab setup script to extract the files:

  ```python
  import zipfile

  # Extract dataset
  with zipfile.ZipFile("part2.zip", 'r') as zip_ref:
      zip_ref.extractall("data")

  # Extract project
  with zipfile.ZipFile("captcha-cracker.zip", 'r') as zip_ref:
      zip_ref.extractall("captcha-cracker")

  %cd captcha-cracker
  ```

- Install dependencies and run training:

  ```
  !pip install -r requirements.txt
  !python main.py --config configs/tuning_M.yaml
  ```
Instead of uploading the zipped project folder, you can directly clone the repository:
```
!git clone https://github.com/captcha-cracker.git
%cd captcha-cracker
!pip install -r requirements.txt
!python main.py --config configs/tuning_M.yaml
```