Skip to content

chore: fix Dataset system tests#1209

Merged
sararob merged 24 commits intogoogleapis:mainfrom
sararob:sr-pandas-tabular-dataset
May 9, 2022
Merged

chore: fix Dataset system tests#1209
sararob merged 24 commits intogoogleapis:mainfrom
sararob:sr-pandas-tabular-dataset

Conversation

@sararob
Copy link
Contributor

@sararob sararob commented May 9, 2022

Updated Dataset system tests to remove shared_state, fixes failing system tests that were added with create_from_dataframe method.

@product-auto-label product-auto-label bot added size: l Pull request size is large. api: vertex-ai Issues related to the googleapis/python-aiplatform API. labels May 9, 2022
@sararob sararob requested a review from sasha-gitg May 9, 2022 15:13
Copy link
Member

@sasha-gitg sasha-gitg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few comments. Approving to unblock release, we can resolve the comments afterwards. Thanks!

shared_state["dataset_name"] = new_dataset.name
yield
def storage_client(self):
yield storage.Client(project=e2e_base._PROJECT)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preference to reference either _TEST_PROJECT(line 130) or e2e_base._PROJECT but not both.

assert len(list(data_items_post_import)) == 469
assert len(list(data_items_post_import)) == 469
finally:
if text_dataset is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition doesn't seem necessary. Preference to remove it. If this line is present or not present, referencing text_dataset would throw.

my_dataset.import_data(
gcs_source=_TEST_TEXT_ENTITY_EXTRACTION_GCS_SOURCE,
import_schema_uri=_TEST_TEXT_ENTITY_IMPORT_SCHEMA,
import_request_timeout=600.0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of the adding the timeout?

aiplatform.init(project=_TEST_PROJECT, location=_TEST_LOCATION)
try:
text_dataset = aiplatform.TextDataset.create(
display_name=f"temp_sdk_integration_test_create_text_dataset_{uuid.uuid4()}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new_dataset = create_lro.result()
shared_state["dataset_name"] = new_dataset.name
yield
def staging_bucket(self, storage_client):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Preference to eventually rely on e2e_base implementation for deduplicaton of code.

@sararob sararob merged commit c773539 into googleapis:main May 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

api: vertex-ai Issues related to the googleapis/python-aiplatform API. size: l Pull request size is large.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants