Ubuntu 1ES hosted pool images hitting No space left on device #13036

Closed
5 tasks
carlossanlop opened this issue Apr 4, 2023 · 15 comments
@carlossanlop
Member

carlossanlop commented Apr 4, 2023

Build

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=227553&view=results

Failing jobs:

eng/testing/tests.mobile.targets(117,5): error MSB4018: (NETCORE_ENGINEERING_TELEMETRY=Build) The "AndroidAppBuilderTask" task failed unexpectedly.
System.Exception: Error: Process returned non-zero exit code: zip I/O error: No space left on device
zip error: Output file write failure (write error on zip file)
   at Utils.RunProcess(TaskLoggingHelper logger, String path, String args, IDictionary`2 envVars, String workingDir, Boolean ignoreErrors, Boolean silent, MessageImportance debugMessageImportance) in /_/src/tasks/Common/Utils.cs:line 98
   at ApkBuilder.BuildApk(String abi, String mainLibraryFileName, String monoRuntimeHeaders) in /_/src/tasks/AndroidAppBuilder/ApkBuilder.cs:line 221
   at AndroidAppBuilderTask.Execute() in /_/src/tasks/AndroidAppBuilder/AndroidAppBuilder.cs:line 110
   at Microsoft.Build.BackEnd.TaskExecutionHost.Microsoft.Build.BackEnd.ITaskExecutionHost.Execute()
   at Microsoft.Build.BackEnd.TaskBuilder.ExecuteInstantiatedTask(ITaskExecutionHost taskExecutionHost, TaskLoggingContext taskLoggingContext, TaskHost taskHost, ItemBucket bucket, TaskExecutionMode howToExecuteTask)

Build leg reported

Build Android x64/x86/arm64 Release AllSubsets_Mono

Pull Request

dotnet/runtime#84315

Action required for the engineering services team

To triage this issue (First Responder / @dotnet/dnceng):

  • Open the failing build above and investigate
  • Add a comment explaining your findings

If this is an issue that is causing build breaks across multiple builds and would get benefit from being listed on the build analysis check, follow the next steps:

  1. Add the label "Known Build Error"
  2. Edit this issue and add an error string in the Json below that can help us match this issue with future build breaks. You should use the known issues documentation
{
   "ErrorMessage" : "No space left on device",
   "BuildRetry": false,
   "ErrorPattern": "",
   "ExcludeConsoleLog": false
}
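
As a rough illustration of how a known-issue entry like the one above could be matched against a failing log line, here is a minimal sketch. It assumes `ErrorMessage` is treated as a plain, case-sensitive substring and `ErrorPattern` as a regular expression; the helper name `matches_known_issue` is hypothetical, and the real matching rules are whatever the known issues documentation specifies.

```python
import json
import re


def matches_known_issue(log_line: str, issue: dict) -> bool:
    """Return True if the log line matches the issue's message or pattern.

    Assumption: ErrorMessage is a substring match and ErrorPattern is a regex.
    Empty values are ignored.
    """
    message = issue.get("ErrorMessage", "")
    pattern = issue.get("ErrorPattern", "")
    if message and message in log_line:
        return True
    if pattern and re.search(pattern, log_line):
        return True
    return False


# The JSON from the issue template, unchanged.
known_issue = json.loads("""
{
   "ErrorMessage" : "No space left on device",
   "BuildRetry": false,
   "ErrorPattern": "",
   "ExcludeConsoleLog": false
}
""")

# Log line taken from the failing job output quoted in this issue.
log_line = "zip I/O error: No space left on device"
print(matches_known_issue(log_line, known_issue))
```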

Release Note Category

  • Feature changes/additions
  • Bug fixes
  • Internal Infrastructure Improvements

Release Note Description

Additional information about the issue reported

No response

Report

| Build | Definition | Step Name | Console log | Pull Request |
| --- | --- | --- | --- | --- |
| 367456 | dotnet/sdk | Publish Test Results | Log | |

| Build | Definition | Test | Pull Request |
| --- | --- | --- | --- |
| 2241134 | dotnet-sdk | Microsoft.NET.Publish.Tests.dll.6.WorkItemExecution | #33065 |
| 2240427 | dotnet-sdk | Microsoft.NET.Publish.Tests.dll.7.WorkItemExecution | #33065 |
| 2240426 | dotnet-sdk | Microsoft.NET.Publish.Tests.dll.5.WorkItemExecution | #33064 |
| 367708 | dotnet/runtime | PayloadGroup0.WorkItemExecution | |
| 2235164 | dotnet-sdk | Microsoft.NET.Publish.Tests.dll.7.WorkItemExecution | #32948 |
| 354966 | dotnet/runtime | readytorun/GenericCycleDetection/Depth3Test/Depth3Test.sh | dotnet/runtime#89472 |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
| --- | --- | --- |
| 0 | 5 | 7 |

Known issue validation

Build: 🔎
Result validation: ⚠️ Validation could not be done without an Azure DevOps build URL on the issue. Please add it to the "Build: 🔎" line.

@steveisok
Member

This seems to be happening across the board for Android. Curious why all of a sudden?

@dkurepa
Member

dkurepa commented Apr 5, 2023

Hello, according to the logs, the machines that are building the projects are running out of space, not the Android phones. Has there been a recent increase in the number of APKs being built? Currently it looks like it's 198.

@steveisok
Member

> Hello, according to the logs, the machines that are building the projects are running out of space, not the Android phones. Has there been a recent increase in the number of APKs being built? Currently it looks like it's 198.

We've been building the same 198+ suites for a few years w/o a problem. We can investigate where the size increase may be coming from in the tests.
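
One way to look for where a size increase comes from is to rank the subdirectories of the test output by total size. The following is only a sketch: the function names are hypothetical, and the directory you point it at (for example, wherever the APK bundles are written) is an assumption, since the thread does not name the output path.

```python
import os


def dir_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            try:
                total += os.path.getsize(path)
            except OSError:
                # File vanished or is unreadable; ignore it for the estimate.
                pass
    return total


def largest_subdirs(root: str, top: int = 10) -> list[tuple[int, str]]:
    """Return up to `top` (size_in_bytes, name) pairs for the immediate
    subdirectories of root, largest first."""
    sizes = []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size_bytes(entry.path), entry.name))
    return sorted(sizes, reverse=True)[:top]
```

Calling `largest_subdirs("<test output dir>")` on two builds, one from before the failures and one from after, would show which suites grew, if any.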

@steveisok
Member

Additionally, this is failing in servicing branches where the test surface hasn't really changed. The output size of the tests is roughly 62GB. @dkurepa has disk size changed at all on the build machines?
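
To check whether a build machine can hold that output before the build starts, a pre-flight check along these lines could be run on the agent. This is a sketch only: the helper name, the headroom value, and treating the 62GB figure as GiB are all assumptions, not something the pipelines in this issue actually do.

```python
import shutil

GIB = 1024 ** 3


def has_enough_space(path: str, required_gib: float, headroom_gib: float = 5.0) -> bool:
    """Return True if the filesystem containing `path` has at least
    required_gib + headroom_gib of free space."""
    free_gib = shutil.disk_usage(path).free / GIB
    return free_gib >= required_gib + headroom_gib
```

For example, `has_enough_space("/", 62)` would report whether the root filesystem has room for roughly 62 GiB of test output plus the assumed 5 GiB of headroom.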

@dkurepa
Member

dkurepa commented Apr 6, 2023

It looks like we have changed the SKU we're using for the NetCore-Svc-Public pool from Standard_D4_v3 to Standard_D4a_v4, but from what I'm seeing, the Temp storage size on these is the same. This happened around the beginning of March.

@premun changed the title from "[6.0] Android devices have no disk space left" to "Ubuntu 1ES hosted pool images hitting No space left on device" on Apr 12, 2023
@premun
Member

premun commented Apr 12, 2023

This seems to be reported for other repositories as well. All of them seem to be on 1ES Ubuntu images.

@steveisok
Member

The disk space issues with Android have been worked around. As premek said, this definitely seems to be more widespread.

@premun
Member

premun commented Apr 17, 2023

I was advised to open an IcM but when looking through the reported pipelines in this issue, I couldn't find a recent repro. Maybe this was fixed and is no longer happening?

Let's leave it open and monitor for new occurrences.

@mthalman
Member

> I was advised to open an IcM but when looking through the reported pipelines in this issue, I couldn't find a recent repro. Maybe this was fixed and is no longer happening?
>
> Let's leave it open and monitor for new occurrences.

I got a failure right now: https://dev.azure.com/dnceng/internal/_build/results?buildId=2163350&view=logs&j=b6422d7d-1f12-5a27-5bca-0651b2848f2b&t=3e07d252-5687-5ff1-9e2c-f75b5de04ae8

@premun
Member

premun commented Apr 21, 2023

@premun
Member

premun commented May 2, 2023

Okay, so we got an answer in the IcM ticket on what happened here. It seems the pool switched over to using ephemeral disks, which is causing these issues.
The options we were given:

  • If you really don't want ephemeral, you can set the SKU tier to Premium
  • Change the SKU to a slightly bigger one to have more space
  • Attach an empty data disk for more storage, without having to pay more for the SKU

@premun assigned premun and unassigned ilyas1974 on Jun 19, 2023
@premun
Member

premun commented Jun 23, 2023

Action steps:

  • Figure out who's still using this image
  • Move them out of 1es-ubuntu-*

@premun assigned andriipatsula and unassigned premun on Jun 23, 2023
@oleksandr-didyk self-assigned this on Jul 13, 2023
@oleksandr-didyk
Contributor

List of pipelines that are still using 1es-ubuntu-*:

1es-ubuntu-2004:

  • \dotnet\dotnet\source-build-sdk-diff-tests
  • \dotnet\source-build\source-build-release
  • \dotnet\source-build\source-build-release-wip
  • \dotnet\command-line-api\dotnet-command-line-api
  • \Microsoft\go\infra\microsoft-go-infra-fuzz

1es-ubuntu-1804:

  • \dotnet\helix-machines\dotnet-helix-machines-ci
  • \Microsoft\go\microsoft-go-validation
  • \dotnet\sdk\dotnet-sdk-official-ci
  • \Microsoft\go\microsoft-go-rolling
  • \Microsoft\go\microsoft-go

@premun assigned premun and unassigned oleksandr-didyk on Aug 11, 2023
@premun
Member

premun commented Aug 14, 2023

I don't see any more occurrences of this in the past 30 days. I see one, but it's not on this VM, so it's unrelated.

I think the builds that were hitting this have already moved off of this queue, and since we don't want to pay for the premium SKU, I don't think this is further actionable.
