oneshot / as_dict can cause Process instances to return incorrect results for certain methods #1373

Closed
press-any-key-to-continue opened this issue Dec 7, 2018 · 7 comments


@press-any-key-to-continue

press-any-key-to-continue commented Dec 7, 2018

Closely related to Issue #1111, but the problem is a little more extensive, and not just a simple case of oneshot being thread-unsafe. I also thought a more formal report/investigation into the problem and its effects was due; hopefully this will be of some use to other developers.
See also: <https://issues.apache.org/jira/browse/AURORA-1939>, <https://github.com/apache/aurora/commit/cdc5b8efd5bb86d38f73cca6d91903078b120333>, and Issue #1110.

The Problem

Whilst using oneshot, and by extension as_dict, calling certain methods on multiple non-equal Process instances will return incorrect results. This typically occurs in multithreaded applications, but it can be reproduced in single-threaded code as well. The problem occurs even if oneshot/as_dict is used in only one place, in one thread, so long as the application can access different Process instances while the oneshot/as_dict is in effect. In single-threaded code, any attempt inside oneshot to access an affected method on a different Process instance will return incorrect results, including attempts made further down the call stack, in some place distant from where oneshot was entered.

Affected Methods

An incomplete list of methods that I believe are affected; I haven't delved far into platform-specific psutil code:

  • ppid
  • uids
  • parent
  • children
  • username (POSIX platforms only)
  • cpu_times
  • cpu_percent
  • memory_info
  • memory_percent

If any of these methods is called on multiple different Process instances whilst oneshot or as_dict is in effect, regardless of where in the code it was entered, it will return incorrect results.

Reproduce

Call one of the Reproduce* functions in the following script to run the corresponding experiment; each should finish by reporting a change in a process' parent PID, something that should not normally happen, and should never happen under Windows. The FindHeterogeneousProcesses function can be ignored or replaced; it just finds two suitable processes to run the experiments against.

from __future__ import unicode_literals
from __future__ import print_function
import sys
import threading
import psutil

def ReproduceA():
    print("psutil version: ", psutil.version_info)
    print("Python version: ", sys.version_info[0:4])
    ((ProcessA, ProcessAparent), (ProcessB, ProcessBparent)) = FindHeterogeneousProcesses()
    print("Process A - {0:d} {1}, parent PID is {2:d}".format( ProcessA.pid, ProcessA.name(), ProcessAparent ))
    print("Process B - {0:d} {1}, parent PID is {2:d}".format( ProcessB.pid, ProcessB.name(), ProcessBparent ))
    print("A Process' parent should never change (on Windows, and should be reasonably fixed on other platforms); the PID of Process A's parent should always be {0:d}".format( ProcessAparent ))
    print("Running experiment A . . .")
    print()
    synchronizeA = threading.Event()
    synchronizeB = threading.Event()
    ConflictThread = threading.Thread(
        name=   "Conflict Thread",
        target= RunConflictThread,
        args=   ( ProcessB, synchronizeA, synchronizeB ),
    )
    print("Main Thread: Starting Conflict Thread and passing control to")
    ConflictThread.start()
    synchronizeA.set()
    synchronizeB.wait()
    synchronizeB.clear()
    print("Main Thread: Received back control; reading details of Process A without using a oneshot cache")
    ProcessAparent2 = ProcessA.ppid()
    print("Main Thread: Process A {1:d} {2} parent PID is {0:d}".format( ProcessAparent2, ProcessA.pid, ProcessA.name() ))
    print("Main Thread: Reading details of Process A using a oneshot cache")
    with ProcessA.oneshot():
        ProcessAparent3 = ProcessA.ppid()
        print("Main Thread: Process A {1:d} {2} parent PID is {0:d}".format( ProcessAparent3, ProcessA.pid, ProcessA.name() ))
    print("Main Thread: Signalling Conflict Thread to end")
    synchronizeA.set()
    ConflictThread.join()
    print("Ended Conflict Thread, experiment complete")
    print()
    if ProcessAparent2 != ProcessAparent:
        print(" - Process A's parent changed from {1:d} to {0:d} - ".format( ProcessAparent2, ProcessAparent ))
    else:
        print("Process A's parent remained the same value of {0:d}".format( ProcessAparent2 ))

def RunConflictThread(ProcessB,synchronizeA,synchronizeB):
    synchronizeA.wait()
    synchronizeA.clear()
    print("Conflict Thread: Received control")
    print("Conflict Thread: Beginning oneshot cache for reading details of Process B")
    with ProcessB.oneshot():
        ProcessBparent = ProcessB.ppid()
        print("Conflict Thread: Read Process B's parent PID, which is {0:d}".format( ProcessBparent ))
        print("Conflict Thread: Handing back control to Main Thread -before- ending the oneshot cache; Process B's details will still be in the cache when the Main Thread resumes")
        synchronizeB.set()
        synchronizeA.wait()
        synchronizeA.clear()
        print("Conflict Thread: Received back control; ending oneshot cache")
    print("Conflict Thread: Cache ended, exiting")

def ReproduceB():
    print("psutil version: ", psutil.version_info)
    print("Python version: ", sys.version_info[0:4])
    ((ProcessA, ProcessAparent), (ProcessB, ProcessBparent), (ProcessC, ProcessCparent)) = FindHeterogeneousProcesses(3)
    print("Process A - {0:d} {1}, parent PID is {2:d}".format( ProcessA.pid, ProcessA.name(), ProcessAparent ))
    print("Process B - {0:d} {1}, parent PID is {2:d}".format( ProcessB.pid, ProcessB.name(), ProcessBparent ))
    print("Process C - {0:d} {1}, parent PID is {2:d}".format( ProcessC.pid, ProcessC.name(), ProcessCparent ))
    print("A Process' parent should never change (on Windows, and should be reasonably fixed on other platforms); the PID of Process A's parent should always be {0:d}".format( ProcessAparent ))
    print("Running experiment B . . .")
    print()
    synchronizeA = threading.Event()
    synchronizeB = threading.Event()
    OneshotThread = threading.Thread(
        name=   "Oneshot Thread",
        target= RunOneshotThread,
        args=   ( ProcessC, synchronizeA, synchronizeB ),
    )
    print("Main Thread: Starting Oneshot Thread and handing control to")
    OneshotThread.start()
    synchronizeB.wait()
    synchronizeB.clear()
    print("Main Thread: Received back control; querying Processes B and A for their details, which should be different")
    ProcessBparent2 = ProcessB.ppid()
    print("Main Thread: Process B - {0:d} {1}, parent PID is {2:d}".format( ProcessB.pid, ProcessB.name(), ProcessBparent2 ))
    ProcessAparent2 = ProcessA.ppid()
    print("Main Thread: Process A - {0:d} {1}, parent PID is {2:d}".format( ProcessA.pid, ProcessA.name(), ProcessAparent2 ))
    print("Main Thread: Signalling Oneshot Thread to end")
    synchronizeA.set()
    OneshotThread.join()
    print("Ended Oneshot Thread, experiment complete")
    print()
    if ProcessAparent2 != ProcessAparent:
        print(" - Process A's parent changed from {1:d} to {0:d} - ".format( ProcessAparent2, ProcessAparent ))
    else:
        print("Process A's parent remained the same value of {0:d}".format( ProcessAparent2 ))

def RunOneshotThread(ProcessC,synchronizeA,synchronizeB):
    print("Oneshot Thread: Started, beginning oneshot cache on Process C")
    with ProcessC.oneshot():
        print("Oneshot Thread: oneshot cache started; not reading anything from Process C - immediately handing back control to Main Thread")
        synchronizeB.set()
        synchronizeA.wait()
        synchronizeA.clear()
        print("Oneshot Thread: Received back control; ending oneshot cache")
    print("Oneshot Thread: Cache ended, exiting")

def ReproduceSingleThreaded():
    print("psutil version: ", psutil.version_info)
    print("Python version: ", sys.version_info[0:4])
    ((ProcessA, ProcessAparent), (ProcessB, ProcessBparent)) = FindHeterogeneousProcesses()
    print("Process A - {0:d} {1}, parent PID is {2:d}".format( ProcessA.pid, ProcessA.name(), ProcessAparent ))
    print("Process B - {0:d} {1}, parent PID is {2:d}".format( ProcessB.pid, ProcessB.name(), ProcessBparent ))
    print("A Process' parent should never change (on Windows, and should be reasonably fixed on other platforms); the PID of Process A's parent should always be {0:d}".format( ProcessAparent ))
    print("Running single-threaded experiment . . .")
    print()
    print("Beginning oneshot cache for Process B")
    with ProcessB.oneshot():
    #with ProcessA.oneshot(): produces same results
        ProcessBparent2 = ProcessB.ppid()
        print("Read Process B's parent as {0:d}".format( ProcessBparent2 ))
        ProcessAparent2 = ProcessA.ppid()
        print("Read Process A's parent as {0:d}".format( ProcessAparent2 ))
        print("Ending oneshot cache")
    print("Re-reading Process A's parent")
    ProcessAparent3 = ProcessA.ppid()
    print("Read Process A's parent as {0:d}".format( ProcessAparent3 ))
    print()
    if ProcessAparent2 != ProcessAparent:
        print(" - Process A's parent changed from {1:d} to {0:d} - ".format( ProcessAparent2, ProcessAparent ))
    else:
        print("Process A's parent remained the same value of {0:d}".format( ProcessAparent2 ))

def FindHeterogeneousProcesses( total=2 ):
    def ConsiderProcess(thisProcess, thisProcessParent):
        return (
            thisProcessParent > 0
            and thisProcessParent != thisProcess.pid
            and (not psutil.WINDOWS or thisProcessParent not in (4,))
        )

    Results = []
    for thisProcess in psutil.process_iter():
        thisProcessParent = thisProcess.ppid()
        if ConsiderProcess( thisProcess, thisProcessParent )\
        and len([ P for P in Results if thisProcessParent == P[1] ]) <=0:
            Result = (thisProcess, thisProcessParent)
            Results.append(Result)
            if len(Results) >= total:
                break
    if len(Results) < total:
        raise Exception("Could not find enough Processes with different parents")
    return Results

#Uncomment one of these to run the respective experiment
#ReproduceA()
#ReproduceB()
#ReproduceSingleThreaded()

I've run these tests on Windows (7) and Linux (Ubuntu), with both Python 2.7 and 3, against the latest release of psutil, 5.4.8 at time of writing.

Cause

I believe the cause of this problem is that the oneshot cache is global, rather than per-Process-instance. When an affected method, such as ppid, is called on a particular Process instance, the result is stored in the global oneshot cache. As a result, whilst the cache is active, any further call to that method on any Process instance returns the globally cached value, instead of the value for that particular instance.
I believe the “cache is active” variable is also global, which switches the cache on for all Process instances, rather than only for the one on which oneshot/as_dict was called.

Recommendations

If not planning to fix the issue, add two warnings to the documentation for oneshot and as_dict.
One warning should state that these methods are not safe for use in multithreaded code. I'd be wary of using the term “not thread safe” in such a warning, as most readers will then assume that a lock around all calls to oneshot/as_dict will solve the problem. It won't: oneshot/as_dict affects the return values of the affected methods regardless of whether they are called from within another oneshot/as_dict, as demonstrated in the reproduce script.
The other warning should state that, whilst inside oneshot, only one Process instance (or multiple instances that are all equal) should be dealt with, and that applications should take care that they are not accidentally touching another instance somewhere further down the call stack. This includes being wary of firing any event handlers / Observer-pattern code from within oneshot, especially (but not exclusively) if it may cause re-entrancy.
If planning to fix the problem, and assuming my diagnosis is correct (I haven't used Python for very long), it should simply be a matter of making the oneshot cache per-instance instead of global, along with the “cache is active” variable.
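If it helps, here is a minimal sketch of that recommendation (illustrative names only, not psutil's actual code): keeping the cache and the “cache is active” flag on each instance means one Process's cached values are invisible to every other instance.

```python
# Sketch of a per-instance memoize_when_activated, with hypothetical
# oneshot_enter/oneshot_exit hooks standing in for the context manager.
import functools

def memoize_when_activated(fun):
    """Cache fun's return value on the *instance*, only whilst activated."""
    @functools.wraps(fun)
    def wrapper(self):
        if not self._cache_active:
            return fun(self)
        try:
            return self._cache[fun.__name__]
        except KeyError:
            ret = self._cache[fun.__name__] = fun(self)
            return ret
    return wrapper

class Process(object):  # hypothetical stand-in for psutil.Process
    def __init__(self, pid, parent_pid):
        self.pid = pid
        self._parent_pid = parent_pid
        self._cache_active = False  # per-instance "cache is active" flag
        self._cache = {}            # per-instance cache

    def oneshot_enter(self):
        self._cache_active = True

    def oneshot_exit(self):
        self._cache_active = False
        self._cache.clear()

    @memoize_when_activated
    def ppid(self):
        return self._parent_pid

a = Process(848, 1944)
b = Process(1832, 772)
b.oneshot_enter()
print(b.ppid())  # 772  - cached on b, and on b only
print(a.ppid())  # 1944 - unaffected by b's cache
b.oneshot_exit()
```

A real fix would of course keep the oneshot() context-manager interface; the point is only where the cache and the flag live.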

@giampaolo
Owner

Mmm... this is interesting. It appears this is important, and it also appears I used bad judgement regarding #1111 and #1110. At first glance it appears this can be solved simply by using a lock inside oneshot(), but I am not sure how to test this. I've just created PR #1375. Could you verify whether it solves the issue?

Whilst using oneshot, and by extension as_dict, calling certain methods on multiple non-equal Process instances will return incorrect results.

What do you mean by "non-equal Process instances"?

This will typically occur in multithreaded applications, but the phenomenon can be reproduced in single-threaded code as well.

The problem I have with the code you pasted is that it's confusing. Do you think you can come up with a smaller/simpler script which reproduces the problem? Also it's not clear to me how this can be a problem on single threaded applications. Do you have a test case also for that?

I believe the cause of this problem is the oneshot cache being global, rather than per-Process-instance.

Mmmm no I think this is incorrect. The cache is not global. It is implemented as a decorator which is used to decorate those functions which are meant to be cached when entering oneshot() context manager. The cache per se is not bound to the Process instance but rather to the decorated method itself but AFAIU that should not represent a problem as it remains non-global.

@press-any-key-to-continue
Author

press-any-key-to-continue commented Dec 9, 2018

I've cherry-picked the changed __init__.py file and given it a quick test on a 5.4.8 installation; unfortunately the problem seems to remain. I reckon it's more than a thread-safety problem, but I'll get to that in a bit.

What do you mean by "non-equal Process instances"?

This is just me being a little over-precise; it basically means different instances of the psutil.Process class. More specifically, instances where the pid is different, and thus for which the equality operator == will return false. For example—

ProcessA = psutil.Process(4)
ProcessB = psutil.Process(9)
ProcessC = psutil.Process(4)

ProcessA and ProcessB are “different non-equal Process instances” — they are different instances of psutil.Process, and ProcessA != ProcessB. ProcessA and ProcessC are, however, “different but equal Process instances” — they are different instances of psutil.Process, but ProcessA == ProcessC, because they have the same pid, and thus represent the same process. To put it another way, ProcessA == ProcessC returns True, but ProcessA is ProcessC returns False.
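If it helps, the distinction can be modelled in plain Python. This is a hypothetical stand-in class, not psutil's actual implementation (psutil's real equality check, as far as I know, also factors in the process creation time, but pid alone is enough to illustrate the point):

```python
# Minimal model of psutil.Process equality: two instances compare equal
# when they refer to the same pid, yet remain distinct objects.
class Process(object):
    def __init__(self, pid):
        self.pid = pid

    def __eq__(self, other):
        return isinstance(other, Process) and self.pid == other.pid

    def __ne__(self, other):  # explicit __ne__ keeps Python 2 behaviour consistent
        return not self.__eq__(other)

ProcessA = Process(4)
ProcessB = Process(9)
ProcessC = Process(4)

print(ProcessA == ProcessC)  # True  - equal: they represent the same process
print(ProcessA is ProcessC)  # False - but they are different instances
print(ProcessA == ProcessB)  # False - non-equal: different pids
```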

The problem I have with the code you pasted is that it's confusing. Do you think you can come up with a smaller/simpler script which reproduces the problem? Also it's not clear to me how this can be a problem on single threaded applications. Do you have a test case also for that?

Hmm… tricky. I tried to keep the code short and simple; I'll see if I can simplify it further and/or explain it in some more detail. Let's start with the single-threaded reproduce code:

from __future__ import unicode_literals
from __future__ import print_function
import sys
import psutil

def ReproduceSingleThreaded():
    print("psutil version: ", psutil.version_info)
    print("Python version: ", sys.version_info[0:4])

    #To run this experiment, we need two Processes with different parents;
    #get them below, and make sure their ppid properties are different.
    #Change the following two lines to specify processes running on the current system
    ProcessA = psutil.Process(848)
    ProcessB = psutil.Process(1832)
    ProcessAparent = ProcessA.ppid()
    ProcessBparent = ProcessB.ppid()
    if ProcessAparent == ProcessBparent: raise Exception("Input processes have the same parent")
    
    #Output some information and
    #state our expected behavior - the return value of ProcessA.ppid() should not change
    print("Process A - {0:d} {1}, parent PID is {2:d}".format( ProcessA.pid, ProcessA.name(), ProcessAparent ))
    print("Process B - {0:d} {1}, parent PID is {2:d}".format( ProcessB.pid, ProcessB.name(), ProcessBparent ))
    print("A Process' parent should never change (on Windows, and should be reasonably fixed on other platforms); the PID of Process A's parent should always be {0:d}".format( ProcessAparent ))
    print("Running single-threaded experiment . . .")
    print()
    
    print("Beginning oneshot cache for Process B")
    with ProcessB.oneshot():
    #with ProcessA.oneshot(): produces same results
        ProcessBparent2 = ProcessB.ppid()
        print("Read Process B's parent as {0:d}".format( ProcessBparent2 ))
        ProcessAparent2 = ProcessA.ppid()
        print("Read Process A's parent as {0:d}".format( ProcessAparent2 ))
        print("Ending oneshot cache")
    print("Reading Process A's parent again-")
    ProcessAparent3 = ProcessA.ppid()
    print("Read Process A's parent as {0:d}".format( ProcessAparent3 ))
    print("Experiment complete")
    print()
    
    if ProcessAparent2 != ProcessAparent:
        print(" - Process A's parent changed from {1:d} to {0:d} - ".format( ProcessAparent2, ProcessAparent ))
    else:
        print("Process A's parent remained the same value of {0:d}".format( ProcessAparent2 ))

Be sure to change the assignment of ProcessA and ProcessB to processes that are running on your system. The output from the program I get is as follows:

psutil version:  (5, 5, 0)
Python version:  (2, 7, 14, 'final')
Process A - 848 swriter.exe, parent PID is 1944
Process B - 1832 taskhost.exe, parent PID is 772
A Process' parent should never change (on Windows, and should be reasonably fixed on other platforms); the PID of Process A's parent should always be 1944
Running single-threaded experiment . . .

Beginning oneshot cache for Process B
Read Process B's parent as 772
Read Process A's parent as 772
Ending oneshot cache
Reading Process A's parent again-
Read Process A's parent as 1944
Experiment complete

 - Process A's parent changed from 1944 to 772 -

This script, and the others, use the ppid method to demonstrate the problem; the value it returns should not normally change, and should not change at all under Windows. The main experiment code begins after the “Running single-threaded experiment” line. Having determined the ppid of ProcessA earlier, the experiment starts a oneshot context and reads the ppid of ProcessB. My theory is that the result of this call is stored in oneshot's cache, but that this cache is shared amongst all Process instances. Therefore, when the next lines retrieve the ppid of ProcessA, the value cached by the earlier call on ProcessB is returned; effectively, the program reports ProcessA as having the same ppid as ProcessB, which is the behavior I observe. When the program exits the oneshot context, it checks the value of ProcessA's ppid again. By my theory, it should now return the correct value for ProcessA's parent, as the oneshot cache has been cleared and is no longer in use; this is also the behavior I observe.

The multithreaded reproduce functions posted are an extension of this idea: whilst oneshot/as_dict is in effect anywhere in the code, no other part of the code may query the affected values of another, non-equal Process instance. If one thread begins a oneshot context and queries details of a Process, but is interrupted during that context by another thread that queries the details of a different Process, the results returned for that different Process will be those of the first Process. The most important thing to note is that the other thread need not start a oneshot context of its own, which means that synchronizing/locking calls to oneshot shouldn't have any effect on the problem, at least by my understanding.
Let's go through one of the multithreaded experiments, simplified a little; the following script manually creates the scenario where one thread is interrupted whilst within a oneshot context:

from __future__ import unicode_literals
from __future__ import print_function
import sys
import threading
import psutil

def ReproduceMultithreaded():
    print("psutil version: ", psutil.version_info)
    print("Python version: ", sys.version_info[0:4])
    
    #To run this experiment, we need two Processes with different parents;
    #get them below, and make sure their ppid properties are different.
    #Change the following two lines to specify processes running on the current system
    ProcessA = psutil.Process(848)
    ProcessB = psutil.Process(1832)
    ProcessAparent = ProcessA.ppid()
    ProcessBparent = ProcessB.ppid()
    if ProcessAparent == ProcessBparent: raise Exception("Input processes have the same parent")

    #Set up the second thread, passing it the Process to query and some synchronization primitives
    synchronizeA = threading.Event()
    synchronizeB = threading.Event()
    ConflictThread = threading.Thread(
        name=   "Conflict Thread",
        target= RunConflictThread,
        args=   ( ProcessB, synchronizeA, synchronizeB ),
    )
    
    #Output some information and
    #state our expected behavior - the return value of ProcessA.ppid() should not change
    print("Process A - {0:d} {1}, parent PID is {2:d}".format( ProcessA.pid, ProcessA.name(), ProcessAparent ))
    print("Process B - {0:d} {1}, parent PID is {2:d}".format( ProcessB.pid, ProcessB.name(), ProcessBparent ))
    print("A Process' parent should never change (on Windows, and should be reasonably fixed on other platforms); the PID of Process A's parent should always be {0:d}".format( ProcessAparent ))
    print("Running multithreaded experiment . . .")
    print()
    
    print("Main Thread: Starting Conflict Thread and passing control to")
    ConflictThread.start()
    synchronizeB.wait()
    synchronizeB.clear()
    print("Main Thread: Received back control; reading details of Process A, without using a oneshot cache")
    ProcessAparent2 = ProcessA.ppid()
    print("Main Thread: Process A {1:d} {2} parent PID is {0:d}".format( ProcessAparent2, ProcessA.pid, ProcessA.name() ))
    
    print("Performing cleanup")
    print("Main Thread: Signalling Conflict Thread to end")
    synchronizeA.set()
    ConflictThread.join()
    print("Ended Conflict Thread, experiment complete")
    print()
    
    if ProcessAparent2 != ProcessAparent:
        print(" - Process A's parent changed from {1:d} to {0:d} - ".format( ProcessAparent2, ProcessAparent ))
    else:
        print("Process A's parent remained the same value of {0:d}".format( ProcessAparent2 ))

def RunConflictThread(ProcessB,synchronizeA,synchronizeB):
    print("Conflict Thread: Received control")
    print("Conflict Thread: Beginning oneshot cache for reading details of Process B")
    with ProcessB.oneshot():
        ProcessBparent = ProcessB.ppid()
        print("Conflict Thread: Read Process B's parent PID, which is {0:d}".format( ProcessBparent ))
        print("Conflict Thread: Simulating a thread interrupt, and handing back control to Main Thread whilst this thread is still within oneshot")
        synchronizeB.set()
        synchronizeA.wait()
        synchronizeA.clear()
        print("Conflict Thread: Received back control; ending oneshot cache")
    print("Conflict Thread: Cache ended, exiting")

As before, be sure to change the assignments of ProcessA and ProcessB to processes running on the system. The output I get is as follows:

psutil version:  (5, 5, 0)
Python version:  (2, 7, 14, 'final')
Process A - 848 swriter.exe, parent PID is 1944
Process B - 1832 taskhost.exe, parent PID is 772
A Process' parent should never change (on Windows, and should be reasonably fixed on other platforms); the PID of Process A's parent should always be 1944
Running multithreaded experiment . . .

Main Thread: Starting Conflict Thread and passing control to
Conflict Thread: Received control
Conflict Thread: Beginning oneshot cache for reading details of Process B
Conflict Thread: Read Process B's parent PID, which is 772
Conflict Thread: Simulating a thread interrupt, and handing back control to Main Thread whilst this thread is still within oneshot
Main Thread: Received back control; reading details of Process A, without using a oneshot cache
Main Thread: Process A 848 swriter.exe parent PID is 772
Performing cleanup
Main Thread: Signalling Conflict Thread to end
Conflict Thread: Received back control; ending oneshot cache
Conflict Thread: Cache ended, exiting
Ended Conflict Thread, experiment complete

 - Process A's parent changed from 1944 to 772 -

After some setup, the code that actually runs the experiment to reproduce the problem begins after the “Running multithreaded experiment” line. Here another thread, “Conflict Thread”, enters a oneshot context, reads the ppid of ProcessB, and then simulates an interrupt, passing control back to the main thread. The main thread then reads the ppid of ProcessA, but as can be seen in the output, the ppid of ProcessB is returned. Note also that the main thread did not use oneshot when retrieving the ppid of ProcessA.

Mmmm no I think this is incorrect. The cache is not global. It is implemented as a decorator which is used to decorate those functions which are meant to be cached when entering oneshot() context manager. The cache per se is not bound to the Process instance but rather to the decorated method itself but AFAIU that should not represent a problem as it remains non-global.

Quite possibly! My detailed understanding of Python is a little limited. The cache may not be strictly global/static, but I think it behaves as if it is; in other words, I believe it is shared by all Process instances. And I agree: the cache, along with the “cache is active” variable, is not bound to the Process instance, either directly or indirectly, which I think is the root of the problem.
By my understanding of Python, methods are really all static/global, and it is only the first parameter passed to them (this/self) that makes them behave as non-static when called normally. Therefore, when a decorator is applied to a method, it is applied statically: only once, not once per instance. The result is that, regardless of which Process instance is called, the same decorator instance runs, and thus the cache the decorator contains, as well as the “cache is active” variable, become shared across all Process instances.
If I'm right, this means that the cache and the “cache is active” variable are both static (there is one and only one of each in memory, not one per instance) and both effectively global in scope: their single locations in memory can be reached regardless of the current variable scope or the current Process instance. In which case, the solution would be to make them non-static in some way, with one cache and one associated “cache is active” variable for each Process instance.
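To make that mechanism concrete, here is a self-contained sketch (hypothetical attribute names, not psutil's exact code): the decorator runs once, at class definition time, so a cache that hangs off the decorated function is effectively shared by all instances.

```python
# Model of the diagnosed bug: the wrapper function exists once per class,
# so every instance reads and writes the same cache slot on it.
import functools

def memoize_when_activated(fun):
    @functools.wraps(fun)
    def wrapper(self):
        if not wrapper.cache_activated:
            return fun(self)
        if wrapper.cached is None:
            wrapper.cached = fun(self)  # whichever instance calls first wins...
        return wrapper.cached           # ...and every other instance gets its value
    wrapper.cache_activated = False
    wrapper.cached = None
    return wrapper

class Process(object):  # hypothetical stand-in for psutil.Process
    def __init__(self, pid, parent_pid):
        self.pid = pid
        self._parent_pid = parent_pid

    @memoize_when_activated
    def ppid(self):
        return self._parent_pid

a = Process(848, 1944)
b = Process(1832, 772)
Process.ppid.cache_activated = True  # roughly what entering oneshot() does
print(b.ppid())  # 772  - b populates the shared cache
print(a.ppid())  # 772  - a reads b's cached value: the bug
Process.ppid.cache_activated = False
Process.ppid.cached = None
print(a.ppid())  # 1944 - correct again once the cache is off
```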

@giampaolo
Owner

OK, I can reproduce the problem with a single-threaded app:

import psutil
pids = psutil.pids()
a = psutil.Process(pids.pop())
b = psutil.Process(pids.pop())
print("without oneshot()")
print(a.ppid())
print(b.ppid())
with b.oneshot():
    print("with oneshot()")
    print(a.ppid())
    print(b.ppid())

It prints (on Linux):

without oneshot()
1273
2
with oneshot()
1273
1273

...so ppid() changes depending on whether oneshot() is used or not, and that is definitely a problem. Also, you're right in saying that this is independent of "thread safety". It's a different issue.

@press-any-key-to-continue
Author

Nice. Let us know how it goes, and if you'd like me to check anything.

I made a quick multithreaded version of your script, if it's useful:

from __future__ import print_function
import threading
import psutil

pids = psutil.pids()
aid = pids.pop()
bid = pids.pop()
    
def RunOtherThread():
    b = psutil.Process(bid)
    print("{0:d} {1} parent is {2:d}".format(b.pid, b.name(), b.ppid()))
    
print("Without oneshot . . .")
a = psutil.Process(aid)
OtherThread = threading.Thread(target=RunOtherThread)
print("{0:d} {1} parent is {2:d}".format(a.pid, a.name(), a.ppid()))
OtherThread.start()
OtherThread.join()
print("With oneshot . . .")
a = psutil.Process(aid)
OtherThread = threading.Thread(target=RunOtherThread)
with a.oneshot():
    print("{0:d} {1} parent is {2:d}".format(a.pid, a.name(), a.ppid()))
    OtherThread.start()
    OtherThread.join()

The output I get:

Without oneshot . . .
6668 conhost.exe parent is 668
1440 python.exe parent is 6740
With oneshot . . .
6668 conhost.exe parent is 668
1440 python.exe parent is 668

@giampaolo
Copy link
Owner

OK, after all, you were right. The problem is the memoize_when_activated decorator, which indeed uses a global cache. And it is "global" because it's bound to the decorated method instead of to the Process class instance (binding it to the instance is the right solution). Also, threads don't have anything to do with this specific (caching) problem, but thread-safety is desirable and should be implemented (which is basically what #1110 and #1111 are about). @rezmont

giampaolo added a commit that referenced this issue Dec 13, 2018
@giampaolo
Owner

OK this was addressed in 2cdf81d. I will fix the thread-safety problem in a different branch/commit.

@press-any-key-to-continue
Author

Nice. I've given 2cdf81d (cherry-picked into 5.4.8) a few tests myself and it looks like it works. All of my multithreading tests now succeed, too.
Thanks very much!

nlevitt added a commit to nlevitt/psutil that referenced this issue Apr 9, 2019
* origin/master: (182 commits)
  giampaolo#1394 / windows / process exe(): convert errno 0 into ERROR_ACCESS_DENIED; errno 0 occurs when the Python process runs in 'Virtual Secure Mode'
  pre-release
  fix win num_handles() test
  update readme
  fix giampaolo#1111: use a lock to make Process.oneshot() thread safe
  pdate HISTORY
  giampaolo#1373: different approach to oneshot() cache (pass Process instances around - which is faster)
  use PROCESS_QUERY_LIMITED_INFORMATION also for username()
  Linux: refactor _parse_stat_file() and return a dict instead of a list (+ maintainability)
  fix giampaolo#1357: do not expose Process' memory_maps() and io_counters() methods if not supported by the kernel
  giampaolo#1376 Windows: check if variable is NULL before free()ing it
  enforce lack of support for Win XP
  fix giampaolo#1370: improper usage of CloseHandle() may lead to override the original error code resulting in raising a wrong exception
  update HISTORY
  (Windows) use PROCESS_QUERY_LIMITED_INFORMATION access rights (giampaolo#1376)
  update HISTORY
  revert 5398c48; let's do it in a separate branch
  giampaolo#1111 make Process.oneshot() thread-safe
  sort HISTORY
  give CREDITS to @EccoTheFlintstone for giampaolo#1368
  fix ionice set not working on windows x64 due to LENGTH_MISMATCH  (giampaolo#1368)
  make flake8 happy
  give CREDITS to @amanusk for giampaolo#1369 / giampaolo#1352 and update doc
  Add CPU frequency support for FreeBSD (giampaolo#1369)
  giampaolo#1359: add test case for cpu_count(logical=False) against lscpu utility
  disable false positive mem test on travis + osx
  fix PEP8 style mistakes
  give credits to @koenkooi for giampaolo#1360
  Fix giampaolo#1354 [Linux] disk_io_counters() fails on Linux kernel 4.18+ (giampaolo#1360)
  giampaolo#1350: give credits to @amanusk
  FreeBSD adding temperature sensors (WIP) (giampaolo#1350)
  pre release
  sensors_temperatures() / linux: convert defaultdict to dict
  fix giampaolo#1004: Process.io_counters() may raise ValueError
  fix giampaolo#1307: [Linux] disk_partitions() does not honour PROCFS_PATH
  refactor hasattr() checks as global constants
  giampaolo#1197 / linux / cpu_freq(): parse /proc/cpuinfo in case /sys/devices/system/cpu fs is not available
  fix giampaolo#1277 / osx / virtual_memory: 'available' and 'used' memory were not calculated properly
  travis / osx: set py 3.6
  travis: disable pypy; set py 3.7 on osx
  skip test on PYPY + Travis
  fix travis
  fix giampaolo#715: do not print exception on import time in case cpu_times() fails.
  fix different travis failures
  give CREDITS for giampaolo#1320 to @truthbk
  [aix] improve compilation on AIX, better support for gcc/g++ + fix cpu metrics (giampaolo#1320)
  give credits to @alxchk for giampaolo#1346 (sunOS)
  Fix giampaolo#1346 (giampaolo#1347)
  giampaolo#1284, giampaolo#1345 - give credits to @amanusk
  Add parsing for /sys/class/thermal (giampaolo#1345)
  Fix decoding error in tests
  catch UnicodeEncodeError on print()
  use memory tolerance in occasionally failing test
  Fix random 0xC0000001 errors when querying for Connections (giampaolo#1335)
  Correct capitalization of PyPI (giampaolo#1337)
  giampaolo#1341: move open() utilities/wrappers in _common.py
  Refactored ps() function in test_posix (giampaolo#1341)
  fix giampaolo#1343: document Process.as_dict() attrs values
  giampaolo#1332 - update HISTORY
  make psutil_debug() aware of PSUTIL_DEBUG (giampaolo#1332)
  also include PYPY (or try to :P)
  travis: add python 3.7 build
  add download badge
  remove failing test assertions
  remove failing test
  make test more robust
  pre release
  pre release
  set version to 5.4.7
  OSX / SMC / sensors: revert giampaolo#1284 (giampaolo#1325)
  setup.py: add py 3.7
  fix giampaolo#1323: [Linux] sensors_temperatures() may fail with ValueError
  fix failing linux tests
  giampaolo#1321 add unit tests
  giampaolo#1321: refactoring
  make disk_io_counters more robust (giampaolo#1324)
  fix typo
  Fix DeprecationWarning: invalid escape sequence (giampaolo#1318)
  remove old test
  update is_storage_device() docstring
  fix giampaolo#1305 / disk_io_counters() / Linux: assume SECTOR_SIZE is a fixed 512
  giampaolo#1313 remove test which no longer makes sense
  disk_io_counters() - linux: mimic iostat behavior (giampaolo#1313)
  fix wrong reference link in doc
  disambiguate TESTFN for parallel testing
  fix giampaolo#1309: add STATUS_PARKED constant and fix STATUS_IDLE (both on linux)
  give CREDITS to @sylvainduchesne for giampaolo#1294
  retain GIL when querying connections table (giampaolo#1306)
  Update index.rst (giampaolo#1308)
  fix giampaolo#1279: catch and skip ENODEV in net_if_stat()
  appveyor: retire 3.5, add 3.7
  revert file renaming of macos files; get them back to 'osx' prefix
  winmake: add upload-wheels cmd
  Rename OSX to macOS (giampaolo#1298)
  appveyor: reset py 3.4 and remove 3.7 (not available yet)
  try to fix occasional children() failure on Win: https://ci.appveyor.com/project/giampaolo/psutil/build/job/je3qyldbb86ff66h
  appveyor: remove py 3.4 and add 3.7
  giampaolo#1284: give credits to @amanusk + some minor adjustments
  little refactoring
  Osx temps (giampaolo#1284)
  ...