42 changes: 42 additions & 0 deletions tuts/114-amazon-transcribe-gs/README.md
@@ -0,0 +1,42 @@
# Transcribe: Transcribe audio to text

## Source

https://docs.aws.amazon.com/transcribe/latest/dg/getting-started.html

## Use case

- **ID**: transcribe/getting-started
- **Level**: beginner
- **Core actions**: `transcribe:StartTranscriptionJob`

## Steps

1. Create a sample audio file (WAV with silence)
2. Upload to S3
3. Start a transcription job
4. Wait for completion
5. Get results
6. List transcription jobs

## Resources created

| Resource | Type |
|----------|------|
| `transcribe-tut-<random>-<account-id>` | S3 bucket |
| `tut-job-<random>` | Transcription job |

## Cost

Transcribe bills per second of audio transcribed, with a 15-second minimum per request. This tutorial transcribes a 1-second file, so it costs a fraction of a cent. See the pricing page for current rates.

## Duration

~16 seconds

## Related docs

- [Getting started with Amazon Transcribe](https://docs.aws.amazon.com/transcribe/latest/dg/getting-started.html)
- [Amazon Transcribe API reference](https://docs.aws.amazon.com/transcribe/latest/APIReference/Welcome.html)
- [Supported languages](https://docs.aws.amazon.com/transcribe/latest/dg/supported-languages.html)
- [Amazon Transcribe pricing](https://aws.amazon.com/transcribe/pricing/)
8 changes: 8 additions & 0 deletions tuts/114-amazon-transcribe-gs/REVISION-HISTORY.md
@@ -0,0 +1,8 @@
# Revision History: 114-amazon-transcribe-gs

## Shell (CLI script)

### 2026-04-14 v1 published
- Type: functional
- Initial version

123 changes: 123 additions & 0 deletions tuts/114-amazon-transcribe-gs/amazon-transcribe-gs.md
@@ -0,0 +1,123 @@
# Transcribe audio to text with Amazon Transcribe

This tutorial shows you how to create a sample audio file, upload it to Amazon S3, start a transcription job with Amazon Transcribe, wait for the job to complete, retrieve the results, and list recent transcription jobs.

## Prerequisites

- AWS CLI configured with credentials and a default region
- Python 3 installed (used to generate a WAV file)
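- Permissions for `transcribe:StartTranscriptionJob`, `transcribe:GetTranscriptionJob`, `transcribe:ListTranscriptionJobs`, `transcribe:DeleteTranscriptionJob`, `s3:CreateBucket`, `s3:PutObject`, `s3:GetObject`, `s3:ListBucket`, `s3:DeleteObject`, `s3:DeleteBucket` (`s3:GetObject` lets Transcribe read the uploaded audio; `s3:ListBucket` is needed by the recursive delete during cleanup)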

## Step 1: Create a sample audio file

Generate a 1-second WAV file containing silence using Python. This gives Transcribe a valid audio file to process without needing an external recording.

```bash
python3 -c "
import struct, wave
with wave.open('/tmp/sample.wav', 'w') as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(struct.pack('<' + 'h' * 16000, *([0] * 16000)))
"
```

The file is 16 kHz mono PCM, which is the recommended format for Amazon Transcribe. One second of silence produces a ~32 KB file.
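
The format claims above can be checked by reading the file back. The following is a minimal sketch that regenerates the same audio in a temporary directory (a hypothetical path, not the `/tmp/sample.wav` used by the tutorial) and inspects it with Python's standard `wave` module.

```python
import os
import struct
import tempfile
import wave

# Hypothetical location for this check; the tutorial itself writes /tmp/sample.wav.
path = os.path.join(tempfile.mkdtemp(), "sample.wav")

# Same parameters as the tutorial: mono, 16-bit samples, 16 kHz, 16000 zero samples.
with wave.open(path, "w") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(struct.pack("<" + "h" * 16000, *([0] * 16000)))

# Read the header back to confirm what was written.
with wave.open(path, "r") as r:
    channels = r.getnchannels()
    sample_width = r.getsampwidth()
    frame_rate = r.getframerate()
    duration = r.getnframes() / r.getframerate()

size_bytes = os.path.getsize(path)
print(f"{channels} channel(s), {sample_width * 8}-bit, {frame_rate} Hz, "
      f"{duration:.1f} s, {size_bytes} bytes")
```

The audio payload is 16000 samples of 2 bytes each, so the file size is 32000 bytes plus the WAV header.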

## Step 2: Upload to S3

Create an S3 bucket and upload the audio file. Transcribe reads input from S3.

```bash
BUCKET_NAME="transcribe-tut-$(openssl rand -hex 4)-$(aws sts get-caller-identity --query 'Account' --output text)"

aws s3api create-bucket --bucket "$BUCKET_NAME"
aws s3 cp /tmp/sample.wav "s3://$BUCKET_NAME/sample.wav" --quiet
```

The `create-bucket` command above works as written only in `us-east-1`. In any other region, add `--create-bucket-configuration LocationConstraint=$REGION`, as the script does.
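
The region rule can be sketched as a small helper that builds the `aws s3api create-bucket` argument list. The function name `create_bucket_command` and the example bucket name are hypothetical. The helper only assembles the command and does not call AWS.

```python
def create_bucket_command(bucket_name, region):
    """Return the AWS CLI argument list for creating a bucket in `region`."""
    cmd = ["aws", "s3api", "create-bucket", "--bucket", bucket_name]
    # us-east-1 takes no LocationConstraint; every other region needs one.
    if region != "us-east-1":
        cmd += ["--create-bucket-configuration", f"LocationConstraint={region}"]
    return cmd


# Hypothetical bucket name, for illustration only.
print(" ".join(create_bucket_command("transcribe-tut-example", "eu-west-1")))
```

To run the command rather than print it, the list could be passed to `subprocess.run(cmd, check=True)`.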

## Step 3: Start a transcription job

Start an asynchronous transcription job pointing to the uploaded audio.

```bash
JOB_NAME="tut-job-$(openssl rand -hex 4)"

aws transcribe start-transcription-job \
  --transcription-job-name "$JOB_NAME" \
  --language-code en-US \
  --media "MediaFileUri=s3://$BUCKET_NAME/sample.wav" \
  --output-bucket-name "$BUCKET_NAME" \
  --query 'TranscriptionJob.{Name:TranscriptionJobName,Status:TranscriptionJobStatus}' \
  --output table
```

`--language-code` specifies the language of the audio. `--output-bucket-name` tells Transcribe where to write the JSON result file. Without it, Transcribe uses a service-managed bucket.

## Step 4: Wait for completion

Poll the job status until it reaches `COMPLETED` or `FAILED`.

```bash
for i in $(seq 1 30); do
  STATUS=$(aws transcribe get-transcription-job \
    --transcription-job-name "$JOB_NAME" \
    --query 'TranscriptionJob.TranscriptionJobStatus' --output text)
  echo " Status: $STATUS"
  [ "$STATUS" = "COMPLETED" ] || [ "$STATUS" = "FAILED" ] && break
  sleep 5
done
```

Most short audio files complete within 15–30 seconds. The script polls every 5 seconds with a 150-second timeout.
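
The loop above leaves `STATUS` at a non-terminal value if the job never finishes. A minimal sketch of the same polling logic that makes the timeout explicit is shown here. It assumes a caller-supplied `get_status` function standing in for the `get-transcription-job` call, and `TIMED_OUT` is a hypothetical sentinel, not a Transcribe status.

```python
import time


def wait_for_job(get_status, attempts=30, delay=5.0, sleep=time.sleep):
    """Poll get_status() until it returns a terminal state or attempts run out."""
    for _ in range(attempts):
        status = get_status()
        if status in ("COMPLETED", "FAILED"):
            return status
        sleep(delay)
    # Hypothetical sentinel so callers can tell a timeout from a real status.
    return "TIMED_OUT"


# Simulated status sequence; no AWS call is made and sleeping is skipped.
fake_statuses = iter(["QUEUED", "IN_PROGRESS", "COMPLETED"])
print(wait_for_job(lambda: next(fake_statuses), sleep=lambda seconds: None))
```

With the defaults of 30 attempts and a 5-second delay, the wait matches the 150-second budget described above.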

## Step 5: Get results

Retrieve the transcript URI from the completed job.

```bash
aws transcribe get-transcription-job \
  --transcription-job-name "$JOB_NAME" \
  --query 'TranscriptionJob.Transcript.TranscriptFileUri' --output text
```

The result is a JSON file in your S3 bucket containing the transcript text, confidence scores, and word-level timestamps. Since the input was silence, the transcript will be empty or minimal.
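
To illustrate the shape of that result, the sketch below parses an abbreviated, made-up transcript document. The sample values are invented for illustration, and the field names (`results.transcripts`, `results.items`, `alternatives`) reflect the usual Transcribe output layout as an assumption. Check them against your own result file.

```python
import json

# Illustrative sample only; a real file for this tutorial's silent audio
# would typically have an empty transcript and no items.
sample = json.loads("""
{
  "jobName": "tut-job-example",
  "results": {
    "transcripts": [{"transcript": "Hello world."}],
    "items": [
      {"type": "pronunciation", "start_time": "0.04", "end_time": "0.45",
       "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
      {"type": "pronunciation", "start_time": "0.5", "end_time": "0.9",
       "alternatives": [{"confidence": "0.98", "content": "world"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": "."}]}
    ]
  },
  "status": "COMPLETED"
}
""")


def summarize(doc):
    """Return the full transcript text and (word, start_time, confidence) tuples."""
    results = doc["results"]
    text = " ".join(t["transcript"] for t in results["transcripts"])
    words = [
        (
            item["alternatives"][0]["content"],
            float(item["start_time"]),
            float(item["alternatives"][0]["confidence"]),
        )
        for item in results["items"]
        if item["type"] == "pronunciation"  # punctuation items carry no timestamps
    ]
    return text, words


text, words = summarize(sample)
print(text)
for word, start, confidence in words:
    print(f"{start:5.2f}s  {word}  (confidence {confidence})")
```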

## Step 6: List transcription jobs

List recent completed transcription jobs.

```bash
aws transcribe list-transcription-jobs --status COMPLETED \
  --query 'TranscriptionJobSummaries[:3].{Name:TranscriptionJobName,Status:TranscriptionJobStatus,Created:CreationTime}' \
  --output table
```

You can filter by `--status` (`QUEUED`, `IN_PROGRESS`, `COMPLETED`, `FAILED`) and by `--job-name-contains` to find specific jobs.

## Cleanup

Delete the transcription job and the S3 bucket:

```bash
aws transcribe delete-transcription-job --transcription-job-name "$JOB_NAME"
aws s3 rm "s3://$BUCKET_NAME" --recursive
aws s3 rb "s3://$BUCKET_NAME"
```

Amazon Transcribe charges per second of audio transcribed, with a 15-second minimum per request, so this tutorial's 1-second file costs a fraction of a cent. Deleting the bucket removes the audio file and the transcript, so no S3 storage charges accrue afterward.
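
As a rough worked example of the cost, the sketch below assumes a rate of $0.024 per minute and a 15-second minimum charge per request. Both figures are assumptions for illustration. Confirm current values on the pricing page for your region and tier.

```python
import math

# Assumed figures for illustration only; check the Amazon Transcribe pricing page.
PRICE_PER_MINUTE_USD = 0.024
MIN_BILLED_SECONDS = 15


def estimate_cost(audio_seconds):
    """Estimate the charge in USD for one transcription request."""
    # Round up to whole seconds, then apply the assumed per-request minimum.
    billed_seconds = max(math.ceil(audio_seconds), MIN_BILLED_SECONDS)
    return billed_seconds * PRICE_PER_MINUTE_USD / 60


print(f"Estimated cost for 1 second of audio: ${estimate_cost(1):.4f}")
```

Under these assumptions the 1-second file is billed as 15 seconds, which is still well under one cent.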

The script automates all steps including cleanup:

```bash
bash amazon-transcribe-gs.sh
```

## Related resources

- [Getting started with Amazon Transcribe](https://docs.aws.amazon.com/transcribe/latest/dg/getting-started.html)
- [Amazon Transcribe API reference](https://docs.aws.amazon.com/transcribe/latest/APIReference/Welcome.html)
- [Supported languages](https://docs.aws.amazon.com/transcribe/latest/dg/supported-languages.html)
- [Amazon Transcribe pricing](https://aws.amazon.com/transcribe/pricing/)
106 changes: 106 additions & 0 deletions tuts/114-amazon-transcribe-gs/amazon-transcribe-gs.sh
@@ -0,0 +1,106 @@
#!/bin/bash
# Tutorial: Transcribe audio to text with Amazon Transcribe
# Source: https://docs.aws.amazon.com/transcribe/latest/dg/getting-started.html

WORK_DIR=$(mktemp -d)
LOG_FILE="$WORK_DIR/transcribe-$(date +%Y%m%d-%H%M%S).log"
exec > >(tee -a "$LOG_FILE") 2>&1

REGION=${AWS_DEFAULT_REGION:-${AWS_REGION:-$(aws configure get region 2>/dev/null)}}
if [ -z "$REGION" ]; then
    echo "ERROR: No AWS region configured. Set one with: export AWS_DEFAULT_REGION=us-east-1"
    exit 1
fi
export AWS_DEFAULT_REGION="$REGION"
ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' --output text)
echo "Region: $REGION"

# LC_ALL=C keeps tr from rejecting raw bytes on macOS; head -c 8 keeps the first 8 characters
RANDOM_ID=$(LC_ALL=C tr -dc 'a-z0-9' < /dev/urandom | head -c 8)
BUCKET_NAME="transcribe-tut-${RANDOM_ID}-${ACCOUNT_ID}"
JOB_NAME="tut-job-${RANDOM_ID}"

handle_error() { echo "ERROR on line $1"; trap - ERR; cleanup; exit 1; }
trap 'handle_error $LINENO' ERR

cleanup() {
    echo ""
    echo "Cleaning up resources..."
    aws transcribe delete-transcription-job --transcription-job-name "$JOB_NAME" 2>/dev/null && echo " Deleted job $JOB_NAME"
    if aws s3 ls "s3://$BUCKET_NAME" > /dev/null 2>&1; then
        aws s3 rm "s3://$BUCKET_NAME" --recursive --quiet 2>/dev/null
        aws s3 rb "s3://$BUCKET_NAME" 2>/dev/null && echo " Deleted bucket $BUCKET_NAME"
    fi
    rm -rf "$WORK_DIR"
    echo "Cleanup complete."
}

# Step 1: Create a sample audio file (WAV with silence)
echo "Step 1: Creating sample audio file"
python3 -c "
import struct, wave
with wave.open('$WORK_DIR/sample.wav', 'w') as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(16000)
    w.writeframes(struct.pack('<' + 'h' * 16000, *([0] * 16000)))
"
echo " Created sample.wav (1 second, 16kHz mono)"

# Step 2: Upload to S3
echo "Step 2: Uploading to S3"
if [ "$REGION" = "us-east-1" ]; then
    aws s3api create-bucket --bucket "$BUCKET_NAME" > /dev/null
else
    aws s3api create-bucket --bucket "$BUCKET_NAME" \
        --create-bucket-configuration LocationConstraint="$REGION" > /dev/null
fi
aws s3 cp "$WORK_DIR/sample.wav" "s3://$BUCKET_NAME/sample.wav" --quiet
echo " Uploaded to s3://$BUCKET_NAME/sample.wav"

# Step 3: Start transcription job
echo "Step 3: Starting transcription job: $JOB_NAME"
aws transcribe start-transcription-job \
    --transcription-job-name "$JOB_NAME" \
    --language-code en-US \
    --media "MediaFileUri=s3://$BUCKET_NAME/sample.wav" \
    --output-bucket-name "$BUCKET_NAME" \
    --query 'TranscriptionJob.{Name:TranscriptionJobName,Status:TranscriptionJobStatus}' --output table

# Step 4: Wait for completion
echo "Step 4: Waiting for transcription to complete..."
for i in $(seq 1 30); do
    STATUS=$(aws transcribe get-transcription-job --transcription-job-name "$JOB_NAME" \
        --query 'TranscriptionJob.TranscriptionJobStatus' --output text)
    echo " Status: $STATUS"
    [ "$STATUS" = "COMPLETED" ] || [ "$STATUS" = "FAILED" ] && break
    sleep 5
done

# Step 5: Get results
echo "Step 5: Transcription results"
if [ "$STATUS" = "COMPLETED" ]; then
    RESULT_URI=$(aws transcribe get-transcription-job --transcription-job-name "$JOB_NAME" \
        --query 'TranscriptionJob.Transcript.TranscriptFileUri' --output text)
    echo " Result URI: $RESULT_URI"
    echo " (Audio was silence, so transcript will be empty or minimal)"
else
    echo " Job status: $STATUS"
fi

# Step 6: List jobs
echo "Step 6: Listing transcription jobs"
aws transcribe list-transcription-jobs --status COMPLETED \
    --query 'TranscriptionJobSummaries[:3].{Name:TranscriptionJobName,Status:TranscriptionJobStatus,Created:CreationTime}' --output table 2>/dev/null || \
    echo " No completed jobs"

echo ""
echo "Tutorial complete."
echo "Do you want to clean up all resources? (y/n): "
read -r CHOICE
if [[ "$CHOICE" =~ ^[Yy]$ ]]; then
    cleanup
else
    echo "Manual cleanup:"
    echo " aws transcribe delete-transcription-job --transcription-job-name $JOB_NAME"
    echo " aws s3 rm s3://$BUCKET_NAME --recursive && aws s3 rb s3://$BUCKET_NAME"
fi
58 changes: 58 additions & 0 deletions tuts/118-amazon-lex-gs/README.md
@@ -0,0 +1,58 @@
# Lex: Create a chatbot

Create an Amazon Lex V2 chatbot with an IAM role, English locale, and a sample intent, then build the bot locale.

## Source

https://docs.aws.amazon.com/lexv2/latest/dg/getting-started.html

## Use case

- **ID**: lex/getting-started
- **Level**: intermediate
- **Core actions**: `lexv2-models:CreateBot`, `lexv2-models:CreateIntent`, `lexv2-models:CreateBotLocale`, `lexv2-models:BuildBotLocale`

## Steps

1. Create an IAM role for the bot
2. Create a bot
3. Create an English locale
4. Create an OrderPizza intent with sample utterances
5. Build the bot locale
6. Describe the bot

## Resources created

| Resource | Type |
|----------|------|
| `tut-bot-<random>` | Lex V2 bot |
| `lex-tut-role-<random>` | IAM role (with Polly policy) |

## Duration

~40 seconds

## Cost

No charge for bot creation. Lex charges per text or speech request when the bot processes conversations. This tutorial does not send conversation requests.

## Related docs

- [Getting started with Amazon Lex V2](https://docs.aws.amazon.com/lexv2/latest/dg/getting-started.html)
- [Creating a bot](https://docs.aws.amazon.com/lexv2/latest/dg/build-create.html)
- [Adding intents](https://docs.aws.amazon.com/lexv2/latest/dg/build-intents.html)
- [Amazon Lex pricing](https://aws.amazon.com/lex/pricing/)

---

## Appendix

| Field | Value |
|-------|-------|
| Date | 2026-04-14 |
| Script lines | 106 |
| Exit code | 0 |
| Runtime | 40s |
| Steps | 6 |
| Issues | Fixed bot/locale wait timing |
| Version | v1 |
8 changes: 8 additions & 0 deletions tuts/118-amazon-lex-gs/REVISION-HISTORY.md
@@ -0,0 +1,8 @@
# Revision History: 118-amazon-lex-gs

## Shell (CLI script)

### 2026-04-14 v1 published
- Type: functional
- Initial version
