-
Notifications
You must be signed in to change notification settings - Fork 139
[WIP] Enable guest tracing #695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
d3ece3e
to
9de4e6d
Compare
In the future, the outb handler will need to take a Hypervisor instance in order to be able to access register and memory state of the VM, so it doesn't make sense for these interfaces to be more public than the `Hypervisor` trait. Nobody outside of Hyperlight seems to use these at the moment, so it's probably simplest to restrict these to `pub(crate)`. Signed-off-by: Lucy Menon <168595099+syntactically@users.noreply.github.com>
This adds (unused) support for creating trace files for sandboxes and passing them around to relevant sandbox event handler code. This will be used for collecting debug trace and profiling information. Signed-off-by: Lucy Menon <168595099+syntactically@users.noreply.github.com> Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
This will be useful in the near future, when it will allow transforming the exe_info into unwind information without an extra copy. Signed-off-by: Lucy Menon <168595099+syntactically@users.noreply.github.com>
This adds a new interface which tracing code can use to request the values of registers from the hypervisor supervising a sandbox. Signed-off-by: Lucy Menon <168595099+syntactically@users.noreply.github.com>
Signed-off-by: Lucy Menon <168595099+syntactically@users.noreply.github.com> Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
This allows for producing profiles of memory usage. Signed-off-by: Lucy Menon <168595099+syntactically@users.noreply.github.com> Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Lucy Menon <168595099+syntactically@users.noreply.github.com>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
…ords - `hyperlight-guest-tracing` defines a `TraceBuffer` that keeps `TraceRecord`s that the guest issues. When the buffer capacity is reached, it automatically issues an `Out` instruction with the corresponding info for the host to retrieve the buffer. - The guest can issue `TraceRecord`s by using the `hyperlight-guest-tracing-macro` crate that pushes new records to the buffer. - `hyperlight_common` contains the definitions for the frame types a guest can send to the host using the `Out` instruction. Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- When an `Out` instruction is intercepted, the Hypervisor checks for the frame Id, to verify what type of exit it is. Based on this, when a trace record type is received, we copy the array of trace records from the guest's memory, calculate the timestamp based on the cycles returned by the guest's RDTSC and write the record to the trace file. Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
- Add traces wherever we think it might be of use to help us better profile the codebase. - Add flush instructions before halting or running an `OutB` instruction to ensure the records are sent before the guest exits due to an error or a normal exit Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
Signed-off-by: Doru Blânzeanu <dblnz@pm.me>
The output data should be changed to something that other tools can open #704. |
Just to document this: it was confirmed that the host under a nested MSHV will incorrectly not see the invariant TSC bit, and this should be ignored; on the azure platforms, at least, the TSC is in fact invariant. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work! Could we add a step where the test binaries are executing in the CI pipeline? Right now it looks like none of this code is being exercised since it is all behind feature flag.
|
||
Hyperlight provides advanced observability features for guest code running inside micro virtual machines. You can enable guest-side tracing, stack unwinding, and memory profiling using the `trace_guest`, `unwind_guest`, and `mem_profile` features. This section explains how to build, run, and inspect guest traces. | ||
|
||
### Building a Guest with Tracing Support |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section is focused on how to use the development tools to test from a hyperlight developers perspective but it would be nice to think about it from a user of hyper light. For instance, if I am writing my own guest what do I need to do? I can of course look at these samples but having some docs about it would be useful here.
trace_guest = [] | ||
unwind_guest = [] | ||
mem_profile = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reasoning for three separate features? From a usability standpoint it seems like a lot of things to keep track of.
From a maintainer standpoint it also feels like a large additional matrix of things to test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think part of the confusion for me is that these seem like three related (as in use some base functionality) but different features but are wrapped up into a single PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jsturtevant They kind of are, since Doru built this on top of my old memory profiling PR from way back when. The idea with that was that the base tracing feature that lets you generate a file of fine-grained events quickly is useful for other things (and indeed it is being used it here for the performance events), while the full memory profiling build is extraordinarily slow, since it generates a trace event on every allocation/deallocation. Carrying around the registers to unwind the guest stack also seemed like something that would be useful in other contexts but a different amount of overhead & not always needed, e.g. the performance events here don't I believe use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense that you may want to turn those on/off due to overhead. I think it would make for an easier review if these were split out, but I am fine with it if others are.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend reviewing commit-by-commit; I put a lot of effort into making sure that each commit itself is small/atomic/reviewable. That means that we can have a PR like this one where we only merge things after they are useful (& consequently we know that the design is usable), but reviewers can understand and review each step along that path in isolation.
pub extern "C" fn abort() -> ! { | ||
abort_with_code(&[0, 0xFF]) | ||
} | ||
|
||
/// Exits the VM with an Abort OUT action and a specific code. | ||
#[hyperlight_guest_tracing_macro::trace_function] | ||
pub fn abort_with_code(code: &[u8]) -> ! { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you need to flush on an abort too?
pub(crate) fn outb(port: u16, data: &[u8]) { | ||
// Ensure all tracing data is flushed before sending OUT bytes | ||
hyperlight_guest_tracing_macro::flush!(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the abort section this means this gets called twice, is that expected?
@@ -60,6 +61,7 @@ pub(crate) fn call_guest_function(function_call: FunctionCall) -> Result<Vec<u8> | |||
core::mem::transmute::<usize, GuestFunc>(function_pointer) | |||
}; | |||
|
|||
hyperlight_guest_tracing_macro::trace!("Calling guest function"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you need a corresponding "guest function completed"?
If so this seems like a common pattern we could include in the macro
as is done in the io.rs:
hyperlight_guest_tracing_macro::trace!("Start copy of data");
odb[stack_ptr_rel as usize..stack_ptr_rel as usize + data.len()].copy_from_slice(&data);
hyperlight_guest_tracing_macro::trace!("Finish copy of data");
if let Ok(parsed) = syn::parse2::<TraceInput>(input2) { | ||
let trace_message = match parsed.message { | ||
Lit::Str(lit_str) => lit_str.value(), | ||
_ => "expression".to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this fail or print a message that indicates something when wrong using the marco?
@@ -616,6 +617,7 @@ impl ExclusiveSharedMemory { | |||
/// Return the address of memory at an offset to this `SharedMemory` checking | |||
/// that the memory is within the bounds of the `SharedMemory`. | |||
#[instrument(err(Debug), skip_all, parent = Span::current(), level= "Trace")] | |||
#[allow(dead_code)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why all the deadcode allows?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these are to get rid of clippy warnings when you build with the basic trace_guest infrastructure support but not with any of the things that use it. Maybe another option here is to replace the feature with a cfg item that is any(features_which_use_this)
.as_ref() | ||
.shared_mem | ||
.copy_to_slice(buffer, ptr as usize - SandboxMemoryLayout::BASE_ADDRESS) | ||
// .read::<u64>((addr - SandboxMemoryLayout::BASE_ADDRESS as u64) as usize) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove this commented line
let start = hyperlight_guest_tracing::invariant_tsc::read_tsc(); | ||
let start_time = std::time::Instant::now(); | ||
// Sleep for 1 second to get a good sample | ||
std::thread::sleep(std::time::Duration::from_secs(1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this cause a delay when running with tracing feature?
Description
This PR updates the work @syntactically has done in #103 to latest main and extends it to support tracing the guest for performance.
This effort is part of the #669 issue.
Improvements on the trace file output will be addressed in #704 .
How it works
It defines a
TraceBuffer
in the guest that holdsTraceRecord
s. These records are only sent to the host when the buffer gets full, or an exit (Out/Halt
instructions) is anticipated.The records are sent to the host using an
Out
instruction and providing the pointer to an array ofTraceRecords
.The host accesses the guest memory, reads these records and writes them to the trace file.
The guest uses the Invariant TSC and
rdtsc
instruction to read the clock cycles in the guest and calculate the timestamp on the host.The timestamp is relative to when the TraceInfo started (before creating the Hypervisor Driver in this case).
Two additional crates define the guest logic for tracing. One defines some macros meant to make the annotation of functions simpler to the user
Note: The data processing could be improved (maybe OTEL), I've just used what we had at hand.
Results
Below are some results I have generated based on our hello-world example in the hyperlight repository.
1. My host machine on WSL2 running KVM on Intel(R) Core(TM) Ultra 7 165H
Has Invariant TSC
2. Azure VM running mshv3 on AMD EPYC 7763 64-Core Processor
Does not report an invariant TSC. I am currently investigating if I can get an Azure VM that reports supporting an invariant TSC.TODO
dump_trace
functionality to visualize the traces