Commit 0b391d8

Merge pull request #1478 from bijay27bit/E2EgcsNewChangesSink_BT
E2E GCS Sink additional test scenarios.
2 parents 74fb78d + 9181e54 commit 0b391d8

5 files changed: 171 additions & 3 deletions

File tree

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+@GCS_Sink
+Feature: GCS sink - Verification of GCS Sink plugin macro scenarios
+
+  @BQ_SOURCE_DATATYPE_TEST @GCS_SINK_TEST
+  Scenario: Validate successful records transfer from BigQuery to GCS sink with macro fields
+    Given Open Datafusion Project to configure pipeline
+    Then Select plugin: "BigQuery" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Open BigQuery source properties
+    Then Enter BigQuery property reference name
+    Then Enter BigQuery property projectId "projectId"
+    Then Enter BigQuery property datasetProjectId "projectId"
+    Then Override Service account details if set in environment variables
+    Then Enter BigQuery property dataset "dataset"
+    Then Enter BigQuery source property table name
+    Then Validate output schema with expectedSchema "bqSourceSchemaDatatype"
+    Then Validate "BigQuery" plugin properties
+    Then Close the BigQuery properties
+    Then Open GCS sink properties
+    Then Override Service account details if set in environment variables
+    Then Enter the GCS sink mandatory properties
+    Then Enter GCS property "projectId" as macro argument "gcsProjectId"
+    Then Enter GCS property "serviceAccountType" as macro argument "serviceAccountType"
+    Then Enter GCS property "serviceAccountFilePath" as macro argument "serviceAccount"
+    Then Enter GCS property "path" as macro argument "gcsSinkPath"
+    Then Enter GCS sink property "pathSuffix" as macro argument "gcsPathSuffix"
+    Then Enter GCS property "format" as macro argument "gcsFormat"
+    Then Click on the Macro button of Property: "writeHeader" and set the value to: "WriteHeader"
+    Then Click on the Macro button of Property: "location" and set the value to: "gcsSinkLocation"
+    Then Click on the Macro button of Property: "contentType" and set the value to: "gcsContentType"
+    Then Click on the Macro button of Property: "outputFileNameBase" and set the value to: "OutFileNameBase"
+    Then Click on the Macro button of Property: "fileSystemProperties" and set the value to: "FileSystemPr"
+    Then Validate "GCS" plugin properties
+    Then Close the GCS properties
+    Then Connect source as "BigQuery" and sink as "GCS" to establish connection
+    Then Save the pipeline
+    Then Preview and run the pipeline
+    Then Enter runtime argument value "projectId" for key "gcsProjectId"
+    Then Enter runtime argument value "serviceAccountType" for key "serviceAccountType"
+    Then Enter runtime argument value "serviceAccount" for key "serviceAccount"
+    Then Enter runtime argument value for GCS sink property path key "gcsSinkPath"
+    Then Enter runtime argument value "gcsPathDateSuffix" for key "gcsPathSuffix"
+    Then Enter runtime argument value "jsonFormat" for key "gcsFormat"
+    Then Enter runtime argument value "writeHeader" for key "WriteHeader"
+    Then Enter runtime argument value "contentType" for key "gcsContentType"
+    Then Enter runtime argument value "gcsSinkBucketLocation" for key "gcsSinkLocation"
+    Then Enter runtime argument value "outputFileNameBase" for key "OutFileNameBase"
+    Then Enter runtime argument value "gcsCSVFileSysProperty" for key "FileSystemPr"
+    Then Run the preview of pipeline with runtime arguments
+    Then Wait till pipeline preview is in running state
+    Then Open and capture pipeline preview logs
+    Then Verify the preview run status of pipeline in the logs is "succeeded"
+    Then Close the pipeline logs
+    Then Click on preview data for GCS sink
+    Then Verify preview output schema matches the outputSchema captured in properties
+    Then Close the preview data
+    Then Deploy the pipeline
+    Then Run the Pipeline in Runtime
+    Then Enter runtime argument value "projectId" for key "gcsProjectId"
+    Then Enter runtime argument value "serviceAccountType" for key "serviceAccountType"
+    Then Enter runtime argument value "serviceAccount" for key "serviceAccount"
+    Then Enter runtime argument value for GCS sink property path key "gcsSinkPath"
+    Then Enter runtime argument value "gcsPathDateSuffix" for key "gcsPathSuffix"
+    Then Enter runtime argument value "jsonFormat" for key "gcsFormat"
+    Then Enter runtime argument value "writeHeader" for key "WriteHeader"
+    Then Enter runtime argument value "contentType" for key "gcsContentType"
+    Then Enter runtime argument value "gcsSinkBucketLocation" for key "gcsSinkLocation"
+    Then Enter runtime argument value "outputFileNameBase" for key "OutFileNameBase"
+    Then Enter runtime argument value "gcsCSVFileSysProperty" for key "FileSystemPr"
+    Then Run the Pipeline in Runtime with runtime arguments
+    Then Wait till pipeline is in running state
+    Then Open and capture logs
+    Then Verify the pipeline status is "Succeeded"
+    Then Verify data is transferred to target GCS bucket
+    Then Validate the values of records transferred to GCS bucket is equal to the values from source BigQuery table
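In the scenario above every GCS sink property is set as a macro (for example `${gcsSinkPath}`), and concrete values are then supplied as runtime arguments twice: once for the preview run and once after deploy. Conceptually, macro resolution is a string substitution over the plugin configuration at run time; the Python sketch below illustrates the idea only (`resolve_macros` is a hypothetical helper, not a CDAP API):

```python
import re

def resolve_macros(config: dict, runtime_args: dict) -> dict:
    """Replace ${key} placeholders in string config values with the
    matching runtime argument; unknown keys are left untouched."""
    def substitute(value):
        if isinstance(value, str):
            return re.sub(r"\$\{([^}]+)\}",
                          lambda m: runtime_args.get(m.group(1), m.group(0)),
                          value)
        return value
    return {k: substitute(v) for k, v in config.items()}

# Illustrative values mirroring the scenario's sink properties:
config = {"path": "${gcsSinkPath}", "format": "${gcsFormat}", "writeHeader": "${WriteHeader}"}
args = {"gcsSinkPath": "gs://e2e-test-bucket/out", "gcsFormat": "json", "WriteHeader": "true"}
print(resolve_macros(config, args))
```

This is why the scenario enters the runtime arguments again after deploy: macros are resolved per run, not per pipeline.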

src/e2e-test/features/gcs/sink/GCSSink.feature

Lines changed: 51 additions & 1 deletion
@@ -95,7 +95,7 @@ Feature: GCS sink - Verification of GCS Sink plugin
       | parquet | application/octet-stream |
       | orc     | application/octet-stream |
 
-  @GCS_SINK_TEST @BQ_SOURCE_TEST
+  @BQ_SOURCE_TEST @GCS_SINK_TEST
   Scenario Outline: To verify data is getting transferred successfully from BigQuery to GCS with combinations of contenttype
     Given Open Datafusion Project to configure pipeline
     When Source is BigQuery
@@ -265,3 +265,53 @@ Feature: GCS sink - Verification of GCS Sink plugin
     Then Open and capture logs
     Then Verify the pipeline status is "Succeeded"
     Then Verify data is transferred to target GCS bucket
+
+  @GCS_AVRO_FILE @GCS_SINK_TEST
+  Scenario Outline: To verify data transferred successfully from GCS Source to GCS Sink with datatypes and write header true
+    Given Open Datafusion Project to configure pipeline
+    When Select plugin: "GCS" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Connect plugins: "GCS" and "GCS2" to establish connection
+    Then Navigate to the properties page of plugin: "GCS"
+    Then Replace input plugin property: "project" with value: "projectId"
+    Then Override Service account details if set in environment variables
+    Then Enter input plugin property: "referenceName" with value: "sourceRef"
+    Then Enter GCS source property path "gcsAvroAllDataFile"
+    Then Select GCS property format "avro"
+    Then Click on the Get Schema button
+    Then Verify the Output Schema matches the Expected Schema: "gcsAvroAllTypeDataSchema"
+    Then Validate "GCS" plugin properties
+    Then Close the Plugin Properties page
+    Then Navigate to the properties page of plugin: "GCS2"
+    Then Enter GCS property projectId and reference name
+    Then Enter GCS sink property path
+    Then Select GCS property format "<FileFormat>"
+    Then Select GCS sink property contentType "<contentType>"
+    Then Enter GCS File system properties field "gcsCSVFileSysProperty"
+    Then Click on the Macro button of Property: "writeHeader" and set the value to: "WriteHeader"
+    Then Validate "GCS" plugin properties
+    Then Close the GCS properties
+    Then Save the pipeline
+    Then Preview and run the pipeline
+    Then Enter runtime argument value "writeHeader" for key "WriteHeader"
+    Then Run the preview of pipeline with runtime arguments
+    Then Wait till pipeline preview is in running state
+    Then Open and capture pipeline preview logs
+    Then Verify the preview run status of pipeline in the logs is "succeeded"
+    Then Close the pipeline logs
+    Then Close the preview
+    Then Deploy the pipeline
+    Then Run the Pipeline in Runtime
+    Then Enter runtime argument value "writeHeader" for key "WriteHeader"
+    Then Run the Pipeline in Runtime with runtime arguments
+    Then Wait till pipeline is in running state
+    Then Open and capture logs
+    Then Verify the pipeline status is "Succeeded"
+    Then Verify data is transferred to target GCS bucket
+    Then Validate the data from GCS Source to GCS Sink with expected csv file and target data in GCS bucket
+    Examples:
+      | FileFormat | contentType |
+      | csv        | text/csv    |
+      | tsv        | text/plain  |
+      | delimited  | text/csv    |
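A Scenario Outline runs once per row of its Examples table, with each `<placeholder>` in the steps replaced by that row's value, so the outline above produces three concrete runs (csv, tsv, delimited). A rough sketch of that expansion (a hypothetical helper, not Cucumber's actual implementation):

```python
def expand_outline(steps, header, rows):
    """Expand a Scenario Outline: one concrete scenario per Examples row,
    with each <name> token replaced by that row's value."""
    scenarios = []
    for row in rows:
        values = dict(zip(header, row))
        expanded = []
        for step in steps:
            for name, value in values.items():
                step = step.replace(f"<{name}>", value)
            expanded.append(step)
        scenarios.append(expanded)
    return scenarios

# The two parameterized steps from the outline above:
steps = ['Then Select GCS property format "<FileFormat>"',
         'Then Select GCS sink property contentType "<contentType>"']
rows = [["csv", "text/csv"], ["tsv", "text/plain"], ["delimited", "text/csv"]]
for scenario in expand_outline(steps, ["FileFormat", "contentType"], rows):
    print(scenario)
```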

src/e2e-test/features/gcs/sink/GCSSinkError.feature

Lines changed: 36 additions & 0 deletions
@@ -65,3 +65,39 @@ Feature: GCS sink - Verify GCS Sink plugin error scenarios
     Then Select GCS property format "csv"
     Then Click on the Validate button
     Then Verify that the Plugin Property: "format" is displaying an in-line error message: "errorMessageInvalidFormat"
+
+  @BQ_SOURCE_TEST @GCS_SINK_TEST
+  Scenario: To verify and validate the Error message in pipeline logs after deploy with invalid bucket path
+    Given Open Datafusion Project to configure pipeline
+    When Select plugin: "BigQuery" from the plugins list as: "Source"
+    When Expand Plugin group in the LHS plugins list: "Sink"
+    When Select plugin: "GCS" from the plugins list as: "Sink"
+    Then Connect source as "BigQuery" and sink as "GCS" to establish connection
+    Then Open BigQuery source properties
+    Then Enter the BigQuery source mandatory properties
+    Then Validate "BigQuery" plugin properties
+    Then Close the BigQuery properties
+    Then Open GCS sink properties
+    Then Enter GCS property projectId and reference name
+    Then Enter GCS property "path" as macro argument "gcsSinkPath"
+    Then Select GCS property format "csv"
+    Then Click on the Validate button
+    Then Close the GCS properties
+    Then Save the pipeline
+    Then Preview and run the pipeline
+    Then Enter runtime argument value "gcsInvalidBucketNameSink" for key "gcsSinkPath"
+    Then Run the preview of pipeline with runtime arguments
+    Then Wait till pipeline preview is in running state and check if any error occurs
+    Then Open and capture pipeline preview logs
+    Then Close the pipeline logs
+    Then Close the preview
+    Then Deploy the pipeline
+    Then Run the Pipeline in Runtime
+    Then Enter runtime argument value "gcsInvalidBucketNameSink" for key "gcsSinkPath"
+    Then Run the Pipeline in Runtime with runtime arguments
+    Then Wait till pipeline is in running state
+    Then Verify the pipeline status is "Failed"
+    Then Open Pipeline logs and verify Log entries having below listed Level and Message:
+      | Level | Message                           |
+      | ERROR | errorMessageInvalidBucketNameSink |
+    Then Close the pipeline logs
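The final log assertion resolves the key `errorMessageInvalidBucketNameSink` against `errorMessage.properties` and checks the deployed pipeline's logs for an ERROR entry containing that text. A minimal sketch of such a check (hypothetical helper name and an assumed log-line format; the real CDAP log layout may differ):

```python
def verify_log_entries(log_lines, expected, messages):
    """Check that each (level, message_key) pair appears in the logs.
    `messages` maps keys from errorMessage.properties to literal text."""
    for level, key in expected:
        text = messages[key]
        if not any(level in line and text in line for line in log_lines):
            return False
    return True

# Assumed log format for illustration:
logs = ["2024-05-01 12:00:01 ERROR Unable to read or access GCS bucket."]
messages = {"errorMessageInvalidBucketNameSink": "Unable to read or access GCS bucket."}
print(verify_log_entries(logs, [("ERROR", "errorMessageInvalidBucketNameSink")], messages))
```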

src/e2e-test/resources/errorMessage.properties

Lines changed: 1 addition & 1 deletion
@@ -34,4 +34,4 @@ errorMessageMultipleFileWithoutClearDefaultSchema=Found a row with 4 fields when
 errorMessageInvalidSourcePath=Invalid bucket name in path 'abc@'. Bucket name should
 errorMessageInvalidDestPath=Invalid bucket name in path 'abc@'. Bucket name should
 errorMessageInvalidEncryptionKey=CryptoKeyName.parse: formattedString not in valid format: Parameter "abc@" must be
-
+errorMessageInvalidBucketNameSink=Unable to read or access GCS bucket.
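The keys used in the feature files resolve against Java-style `.properties` files like this one. A minimal parser sketch, assuming only `key=value` lines, `#` comments, and backslash line continuations (real Java properties parsing supports more escape rules):

```python
def parse_properties(text: str) -> dict:
    """Minimal Java-style .properties reader: key=value lines, '#'
    comments, and backslash line continuations. Deliberately simplified;
    it does not handle Unicode escapes or ':' separators."""
    props, pending = {}, ""
    for raw in text.splitlines():
        line = (pending + raw).strip()
        pending = ""
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):
            pending = line[:-1]   # join with the next line
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props

print(parse_properties("errorMessageInvalidBucketNameSink=Unable to read or access GCS bucket."))
```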

src/e2e-test/resources/pluginParameters.properties

Lines changed: 7 additions & 1 deletion
@@ -109,7 +109,6 @@ gcsDataTypeTest2File=testdata/GCS_DATATYPE_TEST_2.csv
 gcsReadRecursivePath=testdata/GCS_RECURSIVE_TEST
 gcsReadWildcardPath=testdata/GCS_WILDCARD_TEST,testdata/GCS_WILDCARD_TEST/test
 gcsFileSysProperty={"textinputformat.record.delimiter": "@"}
-gcsCSVFileSysProperty={"csvinputformat.record.csv": "1"}
 gcsDatatypeChange=[{"key":"createddate","value":"datetime"},{"key":"revenue","value":"double"},\
 {"key":"points","value":"decimal"},{"key":"BytesData","value":"bytes"}]
 gcsDataTypeTestFileSchema=[{"key":"id","value":"int"},{"key":"name","value":"string"},\
@@ -175,6 +174,13 @@ encryptedMetadataSuffix=.metadata
 gcsPathFieldOutputSchema={ "type": "record", "name": "text", "fields": [ \
 { "name": "EmployeeDepartment", "type": "string" }, { "name": "Employeename", "type": "string" }, \
 { "name": "Salary", "type": "int" }, { "name": "wotkhours", "type": "int" }, { "name": "pathFieldColumn", "type": "string" } ] }
+gcsInvalidBucketNameSink=ggg
+writeHeader=true
+gcsSinkBucketLocation=US
+contentType=application/octet-stream
+outputFileNameBase=part
+gcsCSVFileSysProperty={"csvinputformat.record.csv": "1"}
+jsonFormat=json
 ## GCS-PLUGIN-PROPERTIES-END
 
 ## BIGQUERY-PLUGIN-PROPERTIES-START
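This commit moves `gcsCSVFileSysProperty` from the general section into the GCS plugin block. When entries are relocated like that, a quick duplicate-key check helps catch a copy left behind; a naive sketch (continuation lines that happen to contain `=` could be miscounted):

```python
from collections import Counter

def duplicate_keys(text: str):
    """Report keys defined more than once in a .properties file.
    Naive: treats every line containing '=' as a definition."""
    keys = [line.split("=", 1)[0].strip()
            for line in text.splitlines()
            if line and not line.startswith("#") and "=" in line]
    return [k for k, n in Counter(keys).items() if n > 1]

print(duplicate_keys("a=1\nb=2\na=3"))
```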
