Commit 17c05fa

Aryamanz29 and claude committed
[feat] Regenerate pyatlan_v9 models with updated Pkl generator
- Regenerate all asset models using updated Pkl generator:
  - set[str] fields now generated natively (no post-sync sed patch)
  - validate/minimize/relate methods removed (dead code)
  - relationship_attributes field in RelatedEntity base class
- Move overlay files to SDK repo (_overlays/ directory)
- Fix overlay imports: ai_model.py, collection.py, connection.py
- Add ruff exclusion for _overlays/ and F821 per-file-ignores
- Update generate-v9-models skill: sdkOnly mode, temp staging, ruff auto-fix+format step, removed asset.py sed patch
- Update process_test.py to use Process.generate_qualified_name

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 65ac20e commit 17c05fa

484 files changed

Lines changed: 38596 additions & 23541 deletions

.claude/skills/generate-v9-models/SKILL.md

Lines changed: 28 additions & 31 deletions
@@ -36,25 +36,28 @@ fi

 ### 2. Run Pkl evaluation

-From the models repo root, run the Pkl code generator with SDK mode enabled:
+From the models repo root, run the Pkl code generator in **SDK-only mode** (`-p sdkOnly=true`). This generates only Python SDK files — no JSON typedefs, no frontend code, no samples.
+
+Use a temp directory for staging so no generated files land in the models repo:

 ```bash
 cd "$MODELS_DIR"
+STAGING_DIR="$(mktemp -d)"
 OVERLAYS_PATH="${SDK_DIR}/pyatlan_v9/model/assets/_overlays/"

-pkl eval typedefs/*.pkl -m . -p sdk=true \
-  -p targetOutputDir=gen_v9/pyatlan_v9/model/assets/ \
+pkl eval typedefs/*.pkl -m "$STAGING_DIR" -p sdkOnly=true -p sdk=true \
+  -p targetOutputDir=pyatlan_v9/model/assets/ \
   -p internalPackage=pyatlan_v9.model \
   -p sdkOverlaysBasePath="$OVERLAYS_PATH"
 ```

+- `-p sdkOnly=true` skips JSON typedef generation — only Python SDK files are produced
 - `sdkOverlaysBasePath` must be an absolute path — Pkl resolves `read?()` relative to the module file, not CWD
-- The `-p sdk=true` flag enables SDK code generation (search field descriptors, overlay injection, etc.)
-- Output goes to `models/gen_v9/pyatlan_v9/model/assets/` as a staging area
+- Output goes to a temp staging directory (nothing written to models repo)

 ### 3. Selective sync

-Copy generated files to the SDK, **excluding** these files that have manual patches or are hand-written:
+Copy generated files from the staging dir to the SDK, **excluding** these files that have manual patches or are hand-written:

 | File | Reason |
 |------|--------|
@@ -81,55 +84,49 @@ rsync -av \
   --exclude='quick_sight_dataset_field.py' \
   --exclude='quick_sight_folder.py' \
   --exclude='data_quality_rule.py' \
-  gen_v9/pyatlan_v9/model/assets/ \
+  "${STAGING_DIR}/pyatlan_v9/model/assets/" \
   "${SDK_DIR}/pyatlan_v9/model/assets/"
+
+# Clean up staging dir
+rm -rf "$STAGING_DIR"
 ```

-16 additional types in pyatlan_v9 (persona.py, purpose.py, badge.py, access_control.py, etc.) are hand-written and NOT generated by Pkl — rsync won't touch them since they don't exist in gen_v9.
+16 additional types in pyatlan_v9 (persona.py, purpose.py, badge.py, access_control.py, etc.) are hand-written and NOT generated by Pkl — rsync won't touch them since they don't exist in staging.

 ### 4. Post-sync patches

-**a) asset.py** — replace `list[str]` with `set[str]` for 7 fields (in both `Asset` and `AssetAttributes` classes):
-
-```bash
-cd "${SDK_DIR}"
-sed -i '' \
-  's/\(owner_users: \)list\[str\]/\1set[str]/g;
-  s/\(owner_groups: \)list\[str\]/\1set[str]/g;
-  s/\(admin_users: \)list\[str\]/\1set[str]/g;
-  s/\(admin_groups: \)list\[str\]/\1set[str]/g;
-  s/\(viewer_users: \)list\[str\]/\1set[str]/g;
-  s/\(viewer_groups: \)list\[str\]/\1set[str]/g;
-  s/\(admin_roles: \)list\[str\]/\1set[str]/g' \
-  pyatlan_v9/model/assets/asset.py
-```
-
-**b) related_entity.py** — ensure `relationship_attributes` field exists after `unique_attributes`. If the generated file doesn't have it, add:
+**related_entity.py** — ensure `relationship_attributes` field exists after `unique_attributes`. This file is not generated in sdkOnly mode, so it persists across regens. If starting from scratch, add:
 ```python
 # Relationship-specific attributes
 relationship_attributes: Union[dict[str, Any], None, UnsetType] = UNSET
 """Attributes of the relationship itself (e.g., description, status, etc.)."""
 ```

-**c) process.py** — append backward-compat alias at end of file (after deferred field descriptors):
-```python
-Process.Attributes = Process # backward-compat alias for Process.Attributes.generate_qualified_name
+### 5. Run ruff auto-fix and format
+
+After syncing and patching, run ruff to fix unused imports and format the generated files:
+
+```bash
+cd "${SDK_DIR}"
+uv run ruff check --fix --select F401,F811 pyatlan_v9/
+uv run ruff format pyatlan_v9/
 ```

-### 5. Run tests (if args contain "test")
+### 6. Run tests (if args contain "test")

 ```bash
 cd "${SDK_DIR}" && python -m pytest tests_v9/unit/ -x -q
 ```

-### 6. Report summary
+### 7. Report summary

 Report: how many files were generated, how many synced, how many excluded, and test results if applicable.

 ## Notes

 - The models repo is cloned from `git@github.com:atlanhq/models.git`
 - If `../models` already exists, it fetches and checks out the requested branch instead of re-cloning
-- Generated files go to `models/gen_v9/` as a staging area, then are selectively synced
-- The `set[str]` patch is needed because Pkl's `multiValued=true` generates `list[str]` but the SDK semantically needs `set[str]` for user/group/role fields
+- `-p sdkOnly=true` ensures only Python SDK files are generated (no JSON typedefs written to models repo)
+- Generated files go to a temp staging dir, then are selectively synced to `atlan-python/pyatlan_v9/model/assets/`
+- Fields with `useSetType=true` in the Pkl typedefs generate `set[str]` instead of `list[str]` (used for user/group/role fields)
 - Overlay files (custom methods like `creator()`, `updater()`, policy helpers) live at `pyatlan_v9/model/assets/_overlays/` in this repo
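
Reviewer's sketch (not part of the commit): the staging-then-selective-sync flow described in SKILL.md can be approximated in Python for readers who want to see why the excluded and hand-written files survive a regeneration. The function name `selective_sync`, the `demo` helper, and the three-entry `EXCLUDED` set are hypothetical; the real exclusion list is the table in SKILL.md, and the skill itself uses `rsync`, not this code.

```python
from __future__ import annotations

import shutil
import tempfile
from pathlib import Path

# Hypothetical subset of the exclusion list (three names visible in this diff).
EXCLUDED = {
    "quick_sight_dataset_field.py",
    "quick_sight_folder.py",
    "data_quality_rule.py",
}


def selective_sync(staging_assets: Path, sdk_assets: Path, excluded: set[str]) -> list[str]:
    """Copy staged files into the SDK tree, skipping excluded basenames.

    Nothing is ever deleted from the destination, so files that exist only
    in the SDK (hand-written types) are left untouched.
    """
    copied: list[str] = []
    for src in sorted(staging_assets.rglob("*")):
        if not src.is_file() or src.name in excluded:
            continue
        rel = src.relative_to(staging_assets)
        dest = sdk_assets / rel
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dest)
        copied.append(rel.as_posix())
    return copied


def demo() -> dict:
    """Exercise selective_sync against throwaway directories."""
    with tempfile.TemporaryDirectory() as staging_root, tempfile.TemporaryDirectory() as sdk_root:
        staging = Path(staging_root) / "pyatlan_v9" / "model" / "assets"
        sdk = Path(sdk_root) / "pyatlan_v9" / "model" / "assets"
        staging.mkdir(parents=True)
        sdk.mkdir(parents=True)
        (staging / "adf.py").write_text("# generated\n")
        (staging / "data_quality_rule.py").write_text("# generated\n")
        (sdk / "data_quality_rule.py").write_text("# manually patched\n")
        (sdk / "persona.py").write_text("# hand-written\n")
        copied = selective_sync(staging, sdk, EXCLUDED)
        return {
            "copied": copied,
            "patched_kept": (sdk / "data_quality_rule.py").read_text() == "# manually patched\n",
            "persona_kept": (sdk / "persona.py").exists(),
            "adf_synced": (sdk / "adf.py").read_text() == "# generated\n",
        }
```

Calling `demo()` shows the intended property: only non-excluded staged files are copied, while the patched and hand-written files in the SDK tree keep their contents. The temporary directories are removed automatically, mirroring the `rm -rf "$STAGING_DIR"` cleanup step.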

pyatlan_v9/model/assets/_init_orchestration.py

Lines changed: 1 addition & 4 deletions
@@ -8,7 +8,4 @@
 This module provides convenient imports for all Orchestration types and their Related variants.
 """

-
-__all__ = [
-
-]
+__all__ = []

pyatlan_v9/model/assets/_init_sage_maker_unified_studio.py

Lines changed: 6 additions & 2 deletions
@@ -20,8 +20,12 @@
 from .sage_maker_unified_studio_asset import SageMakerUnifiedStudioAsset
 from .sage_maker_unified_studio_asset_schema import SageMakerUnifiedStudioAssetSchema
 from .sage_maker_unified_studio_project import SageMakerUnifiedStudioProject
-from .sage_maker_unified_studio_published_asset import SageMakerUnifiedStudioPublishedAsset
-from .sage_maker_unified_studio_subscribed_asset import SageMakerUnifiedStudioSubscribedAsset
+from .sage_maker_unified_studio_published_asset import (
+    SageMakerUnifiedStudioPublishedAsset,
+)
+from .sage_maker_unified_studio_subscribed_asset import (
+    SageMakerUnifiedStudioSubscribedAsset,
+)

 __all__ = [
     "RelatedSageMakerUnifiedStudio",

pyatlan_v9/model/assets/_overlays/ai_model.py

Lines changed: 3 additions & 0 deletions
@@ -1,6 +1,9 @@
+# STDLIB_IMPORT: from typing import Dict, List
 # IMPORT: from pyatlan.model.enums import AIDatasetType, AtlanConnectorType
 # IMPORT: from pyatlan.utils import to_camel_case
 # INTERNAL_IMPORT: from pyatlan.utils import init_guid, validate_required_fields
+# INTERNAL_IMPORT: from pyatlan.model.transform import get_type
+# INTERNAL_IMPORT: from pyatlan.model.assets.process import Process

 @classmethod
 @init_guid

pyatlan_v9/model/assets/_overlays/collection.py

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
+# STDLIB_IMPORT: from typing import TYPE_CHECKING
+# STDLIB_IMPORT: from uuid import uuid4
 # IMPORT: from pyatlan.errors import AtlanError
 # IMPORT: from pyatlan.errors import ErrorCode
 # INTERNAL_IMPORT: from pyatlan.utils import init_guid, validate_required_fields
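
Reviewer's sketch (not part of the commit): the overlay files touched here begin with comment headers of the form `# STDLIB_IMPORT:`, `# IMPORT:`, and `# INTERNAL_IMPORT:`. How the Pkl generator actually consumes them is not shown in this diff. The snippet below only illustrates reading such a header and grouping the import statements by kind; the helper name `parse_overlay_header` and the stop-at-first-non-directive rule are assumptions made for the example.

```python
from __future__ import annotations

import re
from collections import defaultdict

# Matches one header directive line, e.g. "# IMPORT: from x import y".
_DIRECTIVE = re.compile(r"^#\s*(STDLIB_IMPORT|IMPORT|INTERNAL_IMPORT):\s*(.+?)\s*$")


def parse_overlay_header(text: str) -> dict[str, list[str]]:
    """Group leading directive comments by kind.

    Assumption: the header ends at the first line that is not a directive.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for line in text.splitlines():
        match = _DIRECTIVE.match(line)
        if match is None:
            break
        groups[match.group(1)].append(match.group(2))
    return dict(groups)


# Header lines copied from the connection.py overlay in this commit.
HEADER = """\
# STDLIB_IMPORT: from typing import TYPE_CHECKING, List, Optional
# IMPORT: from pyatlan.model.enums import AtlanConnectorType
# INTERNAL_IMPORT: from pyatlan.utils import init_guid, validate_required_fields

@classmethod
"""

parsed = parse_overlay_header(HEADER)
```

Seen this way, the fix in this commit amounts to declaring every name the overlay body uses (for example `TYPE_CHECKING` or `uuid4`) in the header, so the generated module can import it.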

pyatlan_v9/model/assets/_overlays/connection.py

Lines changed: 1 addition & 0 deletions
@@ -1,3 +1,4 @@
+# STDLIB_IMPORT: from typing import TYPE_CHECKING, List, Optional
 # IMPORT: from pyatlan.model.enums import AtlanConnectorType
 # INTERNAL_IMPORT: from pyatlan.utils import init_guid, validate_required_fields

pyatlan_v9/model/assets/adf.py

Lines changed: 23 additions & 32 deletions
@@ -16,7 +16,6 @@

 from typing import Any, ClassVar, Union

-import msgspec
 from msgspec import UNSET, UnsetType

 from .airflow_related import RelatedAirflowTask
@@ -43,16 +42,18 @@
 from .schema_registry_related import RelatedSchemaRegistrySubject
 from .soda_related import RelatedSodaCheck
 from .spark_related import RelatedSparkJob
-from pyatlan_v9.model.conversion_utils import categorize_relationships, merge_relationships
+from pyatlan_v9.model.conversion_utils import (
+    categorize_relationships,
+    merge_relationships,
+)
 from pyatlan_v9.model.serde import Serde, get_serde
 from pyatlan_v9.model.transform import register_asset

-from .adf_related import RelatedADF
-
 # =============================================================================
 # FLAT ASSET CLASS
 # =============================================================================

+
 @register_asset
 class ADF(Asset):
     """
@@ -170,7 +171,9 @@ class ADF(Asset):
     readme: RelatedReadme | None | UnsetType = UNSET
     """README that is linked to this asset."""

-    schema_registry_subjects: list[RelatedSchemaRegistrySubject] | None | UnsetType = UNSET
+    schema_registry_subjects: list[RelatedSchemaRegistrySubject] | None | UnsetType = (
+        UNSET
+    )
     """"""

     soda_checks: list[RelatedSodaCheck] | None | UnsetType = UNSET
@@ -185,30 +188,6 @@ class ADF(Asset):
     def __post_init__(self) -> None:
         self.type_name = "ADF"

-    # =========================================================================
-    # SDK Methods
-    # =========================================================================
-
-    def validate(self, for_creation: bool = False) -> None:
-        errors: list[str] = []
-        if self.type_name is UNSET:
-            errors.append("type_name is required")
-        if self.name is UNSET:
-            errors.append("name is required")
-        if self.qualified_name is UNSET or self.qualified_name is None:
-            errors.append("qualified_name is required")
-        if errors:
-            raise ValueError(f"ADF validation failed: {errors}")
-
-    def minimize(self) -> "ADF":
-        self.validate()
-        return ADF(qualified_name=self.qualified_name, name=self.name)
-
-    def relate(self) -> "RelatedADF":
-        if self.guid is not UNSET:
-            return RelatedADF(guid=self.guid)
-        return RelatedADF(qualified_name=self.qualified_name)
-
     # =========================================================================
     # Optimized Serialization Methods (override Asset base class)
     # =========================================================================
@@ -260,6 +239,7 @@ def from_json(json_data: str | bytes, serde: Serde | None = None) -> ADF:
 # NESTED FORMAT CLASSES
 # =============================================================================

+
 class ADFAttributes(AssetAttributes):
     """ADF-specific attributes for nested API format."""

@@ -269,6 +249,7 @@ class ADFAttributes(AssetAttributes):
     adf_asset_folder_path: str | None | UnsetType = UNSET
     """Defines the folder path in which this ADF asset exists."""

+
 class ADFRelationshipAttributes(AssetRelationshipAttributes):
     """ADF-specific relationship attributes for nested API format."""

@@ -344,7 +325,9 @@ class ADFRelationshipAttributes(AssetRelationshipAttributes):
     readme: RelatedReadme | None | UnsetType = UNSET
     """README that is linked to this asset."""

-    schema_registry_subjects: list[RelatedSchemaRegistrySubject] | None | UnsetType = UNSET
+    schema_registry_subjects: list[RelatedSchemaRegistrySubject] | None | UnsetType = (
+        UNSET
+    )
     """"""

     soda_checks: list[RelatedSodaCheck] | None | UnsetType = UNSET
@@ -356,6 +339,7 @@ class ADFRelationshipAttributes(AssetRelationshipAttributes):
     output_from_spark_jobs: list[RelatedSparkJob] | None | UnsetType = UNSET
     """"""

+
 class ADFNested(AssetNested):
     """ADF in nested API format for high-performance serialization."""

@@ -364,6 +348,7 @@ class ADFNested(AssetNested):
     append_relationship_attributes: ADFRelationshipAttributes | UnsetType = UNSET
     remove_relationship_attributes: ADFRelationshipAttributes | UnsetType = UNSET

+
 # =============================================================================
 # CONVERSION HELPERS & CONSTANTS
 # =============================================================================
@@ -400,19 +385,22 @@ class ADFNested(AssetNested):
     "output_from_spark_jobs",
 ]

+
 def _populate_adf_attrs(attrs: ADFAttributes, obj: ADF) -> None:
     """Populate ADF-specific attributes on the attrs struct."""
     _populate_asset_attrs(attrs, obj)
     attrs.adf_factory_name = obj.adf_factory_name
     attrs.adf_asset_folder_path = obj.adf_asset_folder_path

+
 def _extract_adf_attrs(attrs: ADFAttributes) -> dict:
     """Extract all ADF attributes from the attrs struct into a flat dict."""
     result = _extract_asset_attrs(attrs)
     result["adf_factory_name"] = attrs.adf_factory_name
     result["adf_asset_folder_path"] = attrs.adf_asset_folder_path
     return result

+
 # =============================================================================
 # CONVERSION FUNCTIONS
 # =============================================================================
@@ -452,6 +440,7 @@ def _adf_to_nested(adf: ADF) -> ADFNested:
         remove_relationship_attributes=remove_rels,
     )

+
 def _adf_from_nested(nested: ADFNested) -> ADF:
     """Convert nested format to flat ADF."""
     attrs = nested.attributes if nested.attributes is not UNSET else ADFAttributes()
@@ -461,7 +450,7 @@ def _adf_from_nested(nested: ADFNested) -> ADF:
         nested.append_relationship_attributes,
         nested.remove_relationship_attributes,
         _ADF_REL_FIELDS,
-        ADFRelationshipAttributes
+        ADFRelationshipAttributes,
     )
     return ADF(
         guid=nested.guid,
@@ -488,6 +477,7 @@ def _adf_from_nested(nested: ADFNested) -> ADF:
         **merged_rels,
     )

+
 def _adf_to_nested_bytes(adf: ADF, serde: Serde) -> bytes:
     """Convert flat ADF to nested JSON bytes."""
     return serde.encode(_adf_to_nested(adf))
@@ -498,6 +488,7 @@ def _adf_from_nested_bytes(data: bytes, serde: Serde) -> ADF:
     nested = serde.decode(data, ADFNested)
     return _adf_from_nested(nested)

+
 # ---------------------------------------------------------------------------
 # Deferred field descriptor initialization
 # ---------------------------------------------------------------------------
@@ -535,4 +526,4 @@ def _adf_from_nested_bytes(data: bytes, serde: Serde) -> ADF:
 ADF.SCHEMA_REGISTRY_SUBJECTS = RelationField("schemaRegistrySubjects")
 ADF.SODA_CHECKS = RelationField("sodaChecks")
 ADF.INPUT_TO_SPARK_JOBS = RelationField("inputToSparkJobs")
-ADF.OUTPUT_FROM_SPARK_JOBS = RelationField("outputFromSparkJobs")
+ADF.OUTPUT_FROM_SPARK_JOBS = RelationField("outputFromSparkJobs")
