| 1 | +--- |
| 2 | +description: Generate pyatlan_v9 msgspec model files by cloning the models repo and running the Pkl code generator |
| 3 | +--- |
| 4 | + |
| 5 | +# Generate v9 Models |
| 6 | + |
| 7 | +Generates pyatlan_v9 msgspec model files from Pkl type definitions in the atlanhq/models repo. |
| 8 | + |
| 9 | +## Usage |
| 10 | + |
| 11 | +- `/generate-v9-models` — Clone models@master, generate and sync v9 models |
| 12 | +- `/generate-v9-models <branch>` — Clone models@<branch> instead of master |
| 13 | +- `/generate-v9-models <branch> test` — Also run tests after sync |
| 14 | +- `/generate-v9-models test` — Clone models@master and run tests after sync |
| 15 | + |
| 16 | +## Instructions |
| 17 | + |
| 18 | +### 1. Clone or update the models repo |
| 19 | + |
| 20 | +Parse args to determine the branch (the first argument that is not "test"; default: `master`) and whether to run tests (the args contain "test"). |
| 21 | + |
| 22 | +The models repo should be cloned as a sibling directory of this repo (atlan-python): |
| 23 | + |
| 24 | +```bash |
| 25 | +# Determine paths |
| 26 | +SDK_DIR="$(pwd)" # atlan-python root |
| 27 | +MODELS_DIR="$(cd .. && pwd)/models" |
| 28 | +BRANCH="master" # override with first non-"test" arg |
| 29 | + |
| 30 | +if [ -d "$MODELS_DIR" ]; then |
| 31 | + cd "$MODELS_DIR" && git fetch origin && git checkout "$BRANCH" && git pull origin "$BRANCH" |
| 32 | +else |
| 33 | + git clone --branch "$BRANCH" git@github.com:atlanhq/models.git "$MODELS_DIR" |
| 34 | +fi |
| 35 | +``` |
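The argument handling described at the start of this step can be sketched in Python as follows. This is a minimal illustration, not part of the command itself: the function name `parse_args` is hypothetical, and it assumes the arguments arrive as a list of whitespace-separated tokens.

```python
def parse_args(args: list[str]) -> tuple[str, bool]:
    """Return (branch, run_tests) for the /generate-v9-models arguments."""
    # Tests run whenever the literal token "test" appears anywhere in the args.
    run_tests = "test" in args
    # The branch is the first token that is not "test"; default to master.
    branch = next((arg for arg in args if arg != "test"), "master")
    return branch, run_tests


if __name__ == "__main__":
    # Covers the four invocations listed under Usage.
    for example in ([], ["feature-x"], ["feature-x", "test"], ["test"]):
        print(example, "->", parse_args(example))
```

One consequence of this rule is that a branch literally named `test` cannot be selected, because that token is always read as the test flag.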
| 36 | + |
| 37 | +### 2. Run Pkl evaluation |
| 38 | + |
| 39 | +From the models repo root, run the Pkl code generator with SDK mode enabled: |
| 40 | + |
| 41 | +```bash |
| 42 | +cd "$MODELS_DIR" |
| 43 | +OVERLAYS_PATH="${SDK_DIR}/pyatlan_v9/model/assets/_overlays/" |
| 44 | + |
| 45 | +pkl eval typedefs/*.pkl -m . -p sdk=true \ |
| 46 | + -p targetOutputDir=gen_v9/pyatlan_v9/model/assets/ \ |
| 47 | + -p internalPackage=pyatlan_v9.model \ |
| 48 | + -p sdkOverlaysBasePath="$OVERLAYS_PATH" |
| 49 | +``` |
| 50 | + |
| 51 | +- `sdkOverlaysBasePath` must be an absolute path — Pkl resolves `read?()` relative to the module file, not CWD |
| 52 | +- The `-p sdk=true` flag enables SDK code generation (search field descriptors, overlay injection, etc.) |
| 53 | +- Output goes to `models/gen_v9/pyatlan_v9/model/assets/` as a staging area |
| 54 | + |
| 55 | +### 3. Selective sync |
| 56 | + |
| 57 | +Copy generated files to the SDK, **excluding** these files that have manual patches or are hand-written: |
| 58 | + |
| 59 | +| File | Reason | |
| 60 | +|------|--------| |
| 61 | +| `__init__.py` | Hand-written init with `__all__` | |
| 62 | +| `entity.py` | Patched: `_metadata_proxies`, `type_name: Any`, `SaveSemantic` | |
| 63 | +| `referenceable.py` | Patched: `InternalKeywordField`, field descriptors, helper exports | |
| 64 | +| `atlas_glossary.py` | Patched: GTC anchor-in-attributes handling | |
| 65 | +| `atlas_glossary_term.py` | Patched: GTC anchor-in-attributes handling | |
| 66 | +| `atlas_glossary_category.py` | Patched: GTC anchor-in-attributes handling | |
| 67 | +| `quick_sight_dataset.py` | Patched: `useLocalTypeAsPrefix` field naming | |
| 68 | +| `quick_sight_dataset_field.py` | Patched: `useLocalTypeAsPrefix` field naming | |
| 69 | +| `quick_sight_folder.py` | Patched: `useLocalTypeAsPrefix` field naming | |
| 70 | +| `data_quality_rule.py` | Hand-written, not yet generated correctly | |
| 71 | + |
| 72 | +```bash |
| 73 | +rsync -av \ |
| 74 | + --exclude='__init__.py' \ |
| 75 | + --exclude='entity.py' \ |
| 76 | + --exclude='referenceable.py' \ |
| 77 | + --exclude='atlas_glossary.py' \ |
| 78 | + --exclude='atlas_glossary_term.py' \ |
| 79 | + --exclude='atlas_glossary_category.py' \ |
| 80 | + --exclude='quick_sight_dataset.py' \ |
| 81 | + --exclude='quick_sight_dataset_field.py' \ |
| 82 | + --exclude='quick_sight_folder.py' \ |
| 83 | + --exclude='data_quality_rule.py' \ |
| 84 | + gen_v9/pyatlan_v9/model/assets/ \ |
| 85 | + "${SDK_DIR}/pyatlan_v9/model/assets/" |
| 86 | +``` |
| 87 | + |
| 88 | +16 additional types in pyatlan_v9 (persona.py, purpose.py, badge.py, access_control.py, etc.) are hand-written and NOT generated by Pkl. rsync leaves them untouched because they do not exist in gen_v9 and the command does not pass `--delete`. |
| 89 | + |
| 90 | +### 4. Post-sync patches |
| 91 | + |
| 92 | +**a) asset.py** — replace `list[str]` with `set[str]` for 7 fields (in both `Asset` and `AssetAttributes` classes): |
| 93 | + |
| 94 | +```bash |
| 95 | +cd "${SDK_DIR}" # note: sed -i '' below is the BSD/macOS form; GNU sed (Linux) takes -i without the empty argument |
| 96 | +sed -i '' \ |
| 97 | + 's/\(owner_users: \)list\[str\]/\1set[str]/g; |
| 98 | + s/\(owner_groups: \)list\[str\]/\1set[str]/g; |
| 99 | + s/\(admin_users: \)list\[str\]/\1set[str]/g; |
| 100 | + s/\(admin_groups: \)list\[str\]/\1set[str]/g; |
| 101 | + s/\(viewer_users: \)list\[str\]/\1set[str]/g; |
| 102 | + s/\(viewer_groups: \)list\[str\]/\1set[str]/g; |
| 103 | + s/\(admin_roles: \)list\[str\]/\1set[str]/g' \ |
| 104 | + pyatlan_v9/model/assets/asset.py |
| 105 | +``` |
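Because the in-place flag of `sed` differs between BSD and GNU, the same substitution can also be sketched in Python. This is an illustrative equivalent rather than the command's actual implementation: the names `SET_FIELDS`, `patch_set_fields`, and `patch_file` are hypothetical, and, like the sed expressions, it assumes the generated annotation reads `<field>: list[str]` with the type directly after the field name.

```python
import re
from pathlib import Path

# The seven fields named in the sed command above.
SET_FIELDS = (
    "owner_users",
    "owner_groups",
    "admin_users",
    "admin_groups",
    "viewer_users",
    "viewer_groups",
    "admin_roles",
)

# Mirrors the sed expressions: field name, ": ", then a literal list[str].
_LIST_ANNOTATION = re.compile(r"(" + "|".join(SET_FIELDS) + r"): list\[str\]")


def patch_set_fields(source: str) -> str:
    """Rewrite list[str] to set[str] for the listed fields only."""
    return _LIST_ANNOTATION.sub(r"\1: set[str]", source)


def patch_file(path: Path) -> bool:
    """Patch a file in place; return True if its content changed."""
    original = path.read_text(encoding="utf-8")
    patched = patch_set_fields(original)
    if patched != original:
        path.write_text(patched, encoding="utf-8")
    return patched != original
```

Calling `patch_file(Path("pyatlan_v9/model/assets/asset.py"))` from the SDK root would apply the patch. Running it a second time changes nothing, since no `list[str]` annotations remain on those fields.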
| 106 | + |
| 107 | +**b) related_entity.py** — ensure `relationship_attributes` field exists after `unique_attributes`. If the generated file doesn't have it, add: |
| 108 | +```python |
| 109 | + # Relationship-specific attributes |
| 110 | + relationship_attributes: Union[dict[str, Any], None, UnsetType] = UNSET |
| 111 | + """Attributes of the relationship itself (e.g., description, status, etc.).""" |
| 112 | +``` |
| 113 | + |
| 114 | +**c) process.py** — append backward-compat alias at end of file (after deferred field descriptors): |
| 115 | +```python |
| 116 | +Process.Attributes = Process # backward-compat alias for Process.Attributes.generate_qualified_name |
| 117 | +``` |
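Appending the alias on every run would duplicate the line, so a guard is useful. The sketch below shows one way to make the append idempotent. The helper name `ensure_alias` is hypothetical, and the check assumes the alias, when present, appears as the literal text `Process.Attributes = Process`.

```python
ALIAS_LINE = (
    "Process.Attributes = Process  "
    "# backward-compat alias for Process.Attributes.generate_qualified_name"
)


def ensure_alias(source: str) -> str:
    """Return source with the alias as the last line, adding it only if missing."""
    if "Process.Attributes = Process" in source:
        return source  # already patched; leave the file content unchanged
    if source and not source.endswith("\n"):
        source += "\n"  # make sure the alias starts on its own line
    return source + ALIAS_LINE + "\n"
```

Applying `ensure_alias` to the text of `process.py` and writing the result back keeps repeated runs of the command from stacking duplicate alias lines.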
| 118 | + |
| 119 | +### 5. Run tests (if args contain "test") |
| 120 | + |
| 121 | +```bash |
| 122 | +cd "${SDK_DIR}" && python -m pytest tests_v9/unit/ -x -q |
| 123 | +``` |
| 124 | + |
| 125 | +### 6. Report summary |
| 126 | + |
| 127 | +Report: how many files were generated, how many synced, how many excluded, and test results if applicable. |
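The counts for this summary can be derived from the list of generated file names and the exclusion list from step 3. The following is a rough sketch under stated assumptions: `summarize` is a hypothetical helper, "synced" is taken to mean "generated and not excluded", and names are compared by basename because the rsync `--exclude` patterns above are not anchored to a directory.

```python
from pathlib import PurePosixPath

# Files excluded from the sync in step 3.
EXCLUDED = frozenset({
    "__init__.py",
    "entity.py",
    "referenceable.py",
    "atlas_glossary.py",
    "atlas_glossary_term.py",
    "atlas_glossary_category.py",
    "quick_sight_dataset.py",
    "quick_sight_dataset_field.py",
    "quick_sight_folder.py",
    "data_quality_rule.py",
})


def summarize(generated: list[str]) -> dict[str, int]:
    """Count generated, synced, and excluded files from generated paths."""
    excluded = sum(1 for path in generated if PurePosixPath(path).name in EXCLUDED)
    return {
        "generated": len(generated),
        "synced": len(generated) - excluded,
        "excluded": excluded,
    }
```

For example, `summarize(["asset.py", "entity.py", "table.py", "__init__.py"])` counts four generated files, two of them synced and two excluded.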
| 128 | + |
| 129 | +## Notes |
| 130 | + |
| 131 | +- The models repo is cloned from `git@github.com:atlanhq/models.git` |
| 132 | +- If `../models` already exists, it fetches and checks out the requested branch instead of re-cloning |
| 133 | +- Generated files go to `models/gen_v9/` as a staging area, then are selectively synced |
| 134 | +- The `set[str]` patch is needed because Pkl's `multiValued=true` generates `list[str]` but the SDK semantically needs `set[str]` for user/group/role fields |
| 135 | +- Overlay files (custom methods like `creator()`, `updater()`, policy helpers) live at `pyatlan_v9/model/assets/_overlays/` in this repo |