
Fix tests to accommodate wait for container re-download on downgrade #4368

Merged
sairon merged 1 commit into dev from tests-wait-for-cli on Oct 29, 2025
Conversation


@sairon sairon commented Oct 29, 2025

Because the OS downgrade performed in tests now triggers a change in container snapshotters, all containers need to be re-downloaded. Make sure that the CLI container exists and increase the timeout for the time being.

Summary by CodeRabbit

  • Tests
    • Enhanced OS update test reliability with improved container status verification and extended timeout handling.

@sairon sairon requested a review from agners October 29, 2025 07:41
@sairon sairon added the build Build and CI related issues label Oct 29, 2025
coderabbitai bot commented Oct 29, 2025

📝 Walkthrough

This change enhances test reliability by adding container status polling to verify that hassio_supervisor and hassio_cli containers are running after reactivation. The test decorator timeout is increased from 300 to 600 seconds to accommodate these additional checks.

Changes

Container status polling for test robustness — tests/smoke_test/test_os_update.py:
Adds a check_container_running() helper function to poll Docker container status via docker inspect. Introduces a waiting loop after target reactivation to confirm container readiness. Increases the test timeout from 300 to 600 seconds, with a TODO note.
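The polling summarized above can be sketched standalone. This is an approximation, not the exact code from tests/smoke_test/test_os_update.py: the real test runs commands through a labgrid shell fixture's run_check, so here the command runner is injected as a parameter.

```python
# Standalone sketch of the container-status polling described above.
# In the real test the command runs through the labgrid `shell` fixture;
# here `run_cmd` is injected so the logic can be exercised anywhere.

def check_container_running(run_cmd, container_name):
    """Return True if `docker inspect` reports the container as running."""
    # `|| true` keeps the command from failing while the container
    # does not exist yet (e.g. it is still being re-downloaded).
    out = run_cmd(
        f"docker container inspect -f '{{{{.State.Status}}}}' {container_name} || true"
    )
    return "running" in out
```

With the runner injected, the same helper could serve both test_init and test_os_update without duplicating the function body.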

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

  • Helper function implementation—verify polling logic and timeout handling are correct
  • Container readiness wait logic—confirm it doesn't mask underlying issues or create flaky behavior
  • Timeout increase justification—validate that 600 seconds is appropriate and the TODO is tracked

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 0.00%, below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The PR title "Fix tests to accommodate wait for container re-download on downgrade" accurately reflects the primary changes: adding a container status polling helper, introducing waiting loops to ensure containers are running before proceeding, and increasing the test timeout from 300 to 600 seconds. The title is concise and specific, making it clear to reviewers scanning commit history that this PR addresses test timing issues related to container lifecycle during downgrades.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/smoke_test/test_os_update.py (1)

44-48: Extract duplicated helper to module level.

The check_container_running function is duplicated identically in test_init (lines 13-17). Extract it to module level to eliminate duplication and improve maintainability.

Since the helper needs access to shell.run_check, accept shell as a parameter. Apply this diff to extract the helper:

 _LOGGER = logging.getLogger(__name__)
 
+
+def check_container_running(shell, container_name):
+    """Check if a container is running via docker inspect."""
+    out = shell.run_check(
+        f"docker container inspect -f '{{{{.State.Status}}}}' {container_name} || true"
+    )
+    return "running" in out
+

 @pytest.mark.dependency()
 @pytest.mark.timeout(120)
 def test_init(shell, shell_json):
-    def check_container_running(container_name):
-        out = shell.run_check(
-            f"docker container inspect -f '{{{{.State.Status}}}}' {container_name} || true"
-        )
-        return "running" in out
-
     # wait for important containers first
     while True:
-        if check_container_running("homeassistant") and check_container_running("hassio_supervisor"):
+        if check_container_running(shell, "homeassistant") and check_container_running(shell, "hassio_supervisor"):
             break

         sleep(1)

Then update test_os_update similarly to remove its local definition and call the module-level function.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 4695d10 and 2d5e4f1.

📒 Files selected for processing (1)
  • tests/smoke_test/test_os_update.py (2 hunks)
🔇 Additional comments (1)
tests/smoke_test/test_os_update.py (1)

42-42: LGTM! Timeout increase aligns with PR objectives.

The doubled timeout accommodates container re-downloads during the OS downgrade scenario. Ensure the TODO is tracked so the timeout can be reduced after the 17.0 release.

Comment on lines +72 to +77
    # temporary needed for OS 17.0 -> 16.x path, where all containers must be re-downloaded
    while True:
        if check_container_running("hassio_supervisor") and check_container_running("hassio_cli"):
            break

        sleep(1)

⚠️ Potential issue | 🟠 Major

Add explicit timeout to container wait loop.

The wait loop lacks a timeout or maximum iteration limit. While the pytest @pytest.mark.timeout(600) decorator provides an outer bound, explicit timeouts improve test reliability by providing clearer failure messages and more predictable behavior.

Apply this diff to add an explicit timeout:

     # temporary needed for OS 17.0 -> 16.x path, where all containers must be re-downloaded
+    max_wait = 120  # seconds
+    elapsed = 0
     while True:
         if check_container_running("hassio_supervisor") and check_container_running("hassio_cli"):
             break
 
+        if elapsed >= max_wait:
+            raise TimeoutError(
+                f"Containers hassio_supervisor and hassio_cli did not start within {max_wait}s"
+            )
+        elapsed += 1
         sleep(1)

Alternatively, use a more concise approach with a counter:

     # temporary needed for OS 17.0 -> 16.x path, where all containers must be re-downloaded
-    while True:
+    max_attempts = 120
+    for attempt in range(max_attempts):
         if check_container_running("hassio_supervisor") and check_container_running("hassio_cli"):
             break
 
         sleep(1)
+    else:
+        raise TimeoutError(
+            "Containers hassio_supervisor and hassio_cli did not start within 120s"
+        )
🤖 Prompt for AI Agents
In tests/smoke_test/test_os_update.py around lines 72-77, the infinite loop that
waits for hassio_supervisor and hassio_cli to run needs an explicit timeout;
replace the while True loop with a timed-wait using a max_wait (e.g. 600 seconds
or a shorter sensible value) by recording start = time.time(), looping while
elapsed < max_wait and sleeping briefly, and after the loop raise a clear
failure (pytest.fail or AssertionError) if containers never became ready; ensure
time is imported if not already present.
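The timed-wait pattern the prompt describes can be sketched generically. wait_for and its parameters are illustrative names, not code from this PR; a real fix would raise pytest.fail inside the test instead.

```python
import time

def wait_for(predicate, max_wait=120.0, interval=1.0):
    """Poll `predicate` until it returns True or `max_wait` seconds pass.

    Raises TimeoutError on expiry, so a stuck container produces a clear
    failure message instead of being killed by the outer pytest timeout.
    """
    start = time.monotonic()
    while time.monotonic() - start < max_wait:
        if predicate():
            return
        time.sleep(interval)
    raise TimeoutError(f"condition not met within {max_wait}s")
```

In the test this would be called as wait_for(lambda: check_container_running("hassio_supervisor") and check_container_running("hassio_cli")).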

@agners agners left a comment


LGTM

@sairon sairon merged commit af87ada into dev Oct 29, 2025
3 checks passed
@sairon sairon deleted the tests-wait-for-cli branch October 29, 2025 10:06
@github-actions github-actions bot locked and limited conversation to collaborators Nov 6, 2025

Labels

build (Build and CI related issues), cla-signed, hacktoberfest
